Example: Answering Questions with Validated Citations

For the complete code example, see examples/validated_citations/index.ts

Overview

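This example shows how to use Instructor-js with Zod validators to ensure that every statement made by the language model (LM) is backed by a direct quote from the provided context, which guards against hallucinations and keeps citations accurate. It defines TypeScript functions and Zod schemas that encapsulate the information for individual facts and for the complete answer.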

Data Structures

The Fact Schema

The Fact schema encapsulates a single statement or fact. It has two properties:

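  • statement: a string containing the body of the fact or statement.
  • substring_quote: a list of strings. Each string is a direct quote from the context that supports the statement.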

Validation Method: createFactWithContext

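This method dynamically creates a Zod schema for Fact with context-dependent validation. It validates the sources (substring_quote) with a regular expression, looking up the span of each quoted substring in the given context. If no span is found, the quote is removed from the list.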

import Instructor from "@/instructor"
import { z } from "zod"


function createFactWithContext(dynamicContext: string) {
  return z.object({
    statement: z.string(),
    substring_quote: z.array(z.string()).transform((quotes) => {
      return quotes.flatMap((quote) => {
        const spans = getSpans(quote, dynamicContext);
        return spans.map(span => dynamicContext.substring(span[0], span[1]));
      });
    })
  });
}

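function getSpans(quote: string, context: string): Array<[number, number]> {
  const matches: Array<[number, number]> = [];
  // An empty quote would match at every position without advancing, so skip it
  if (quote.length === 0) return matches;
  // Example regex search for simplicity; adjust according to your actual implementation.
  // Regex metacharacters are escaped so the quote is matched literally.
  const regex = new RegExp(quote.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), "g");
  let match: RegExpExecArray | null;

  while ((match = regex.exec(context)) !== null) {
    matches.push([match.index, regex.lastIndex]);
  }
  return matches;
}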
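
Quotes returned by a model can contain characters such as `$`, `.` or `(` that have a special meaning in a regular expression. As a minimal sketch of an alternative, the span search can be done with plain `indexOf` so that the quote is always treated literally. The helper name `findLiteralSpans` and the sample strings are assumptions made for illustration; they are not part of Instructor-js or of the original example.

```typescript
// Hypothetical helper (not part of Instructor-js): finds every literal
// occurrence of `quote` in `context` without building a regular expression.
function findLiteralSpans(quote: string, context: string): Array<[number, number]> {
  const spans: Array<[number, number]> = [];
  // An empty quote would match at every position, so treat it as "no evidence".
  if (quote.length === 0) return spans;

  let from = 0;
  while (true) {
    const start = context.indexOf(quote, from);
    if (start === -1) break;
    spans.push([start, start + quote.length]);
    // Continue after this match so that reported spans do not overlap.
    from = start + quote.length;
  }
  return spans;
}

// Illustrative data only: the quote contains "$" and ".", both regex metacharacters.
const sampleContext = "I paid $5.00 (cash). Later I paid $5.00 again.";
const priceSpans = findLiteralSpans("$5.00", sampleContext);
const missingSpans = findLiteralSpans("credit card", sampleContext);
```

Here `priceSpans` holds one span per occurrence of the quote, and `missingSpans` is empty because the quote never appears in the context. A function like this could stand in for `getSpans` when exact, case-sensitive matching is all that is needed.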

The QuestionAnswer Schema

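This schema encapsulates a question and its corresponding answer. It exists to define the response structure for the OpenAI API call. It has two properties: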

  • question: the question that was asked.
  • answer: a list of Fact objects that make up the answer.

const QuestionAnswer = z.object({
  question: z.string(),
  answer: z.array(z.object({
    statement: z.string(),
    substring_quote: z.array(z.string()), // Basic structure without dynamic context validation
  }))
});
type QuestionAnswerType = z.infer<typeof QuestionAnswer>

Validation Method: createQuestionAnswerWithContext

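This method dynamically generates a Zod schema for QuestionAnswer with context-sensitive validation, ensuring that every Fact object in the answer list has at least one valid source. A Fact object with no valid source is removed from the answer list.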

function createQuestionAnswerWithContext(dynamicContext: string) {
  const FactSchemaWithContext = createFactWithContext(dynamicContext);

  return z.object({
    question: z.string(),
    answer: z.array(FactSchemaWithContext).transform((answers) => {
      // Filter out any Facts that, after validation, have no valid quotes
      return answers.filter(fact => fact.substring_quote.length > 0);
    })
  });
}
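
To make the effect of the two transforms concrete, the following dependency-free sketch mirrors them in plain TypeScript: quotes that do not occur verbatim in the context are dropped, and then facts left with no quotes are dropped. It is a simplified stand-in, not the Zod schema itself. The function name `keepSupportedFacts` and the sample data are hypothetical, and unlike the schema above it keeps each matching quote once rather than once per occurrence.

```typescript
type Fact = { statement: string; substring_quote: string[] };

// Hypothetical stand-in for the Zod transforms: keep only quotes that occur
// verbatim in the context, then drop facts that are left with no quotes.
function keepSupportedFacts(facts: Fact[], context: string): Fact[] {
  return facts
    .map((fact) => ({
      ...fact,
      substring_quote: fact.substring_quote.filter(
        (quote) => quote.length > 0 && context.includes(quote)
      ),
    }))
    .filter((fact) => fact.substring_quote.length > 0);
}

// Illustrative data only: the second fact cites text that is not in the context.
const ctx =
  "I went to an arts high school but in university I studied Computational Mathematics and physics.";
const rawFacts: Fact[] = [
  { statement: "He went to an arts high school.", substring_quote: ["arts high school"] },
  { statement: "He has a PhD from MIT.", substring_quote: ["PhD from MIT"] },
];
const supported = keepSupportedFacts(rawFacts, ctx);
```

After this runs, `supported` contains only the first fact: the second one cites a quote that is absent from the context, so it ends up with an empty quote list and is filtered out.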

Function for Asking the AI

The askAI Function

This function takes a question string and a context string and returns a QuestionAnswer object. It calls the OpenAI API and validates the result with the dynamic Zod schema.

import Instructor from "@/instructor"
import OpenAI from "openai"
import { z } from "zod"

const oai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY ?? undefined,
  organization: process.env.OPENAI_ORG_ID ?? undefined
})

const client = Instructor({
  client: oai,
  mode: "FUNCTIONS"
})

async function askAI(question: string, context: string): Promise<QuestionAnswerType> {
  const response = await client.chat.completions.create({
    model: "gpt-3.5-turbo-0613",
    temperature: 0,
    // OpenAI function names may only contain letters, digits, underscores and dashes
    response_model: { schema: QuestionAnswer, name: "QuestionAnswer" },
    messages: [
      { role: "system", content: "You are a world class algorithm to answer questions with correct and exact citations." },
      { role: "user", content: context },
      { role: "user", content: `Question: ${question}` },
    ],
  });
  const QuestionAnswerWithContext = createQuestionAnswerWithContext(context);
  const parsedResponse = QuestionAnswerWithContext.parse(response);

  return parsedResponse;
}

Example

The following example uses these schemas and functions to ask a question and validate the answer.

const question = "Where did he go to school?"
const context = `My name is Jason Liu, and I grew up in Toronto Canada but I was born in China.
I went to an arts high school but in university I studied Computational Mathematics and physics.
  As part of coop I worked at many companies including Stitchfix, Facebook.
  I also started the Data Science club at the University of Waterloo and I was the president of the club for 2 years.`

The output is a QuestionAnswer object containing the validated facts and their sources.

{
  question: "Where did Jason Liu go to school?",
  answer: [
    {
      statement: "Jason Liu went to an arts high school.",
      substring_quote: [ "arts high school" ],
    }, 
    {
      statement: "Jason Liu studied Computational Mathematics and physics in university.",
      substring_quote: [ "Computational Mathematics and physics" ],
    }
  ],
}

This ensures that every piece of information in the answer has been verified against the context.