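Example: Answering Questions with Validated Citations
For the complete code example, see examples/validated_citations/index.ts
Overview
This example shows how to combine Instructor-js with Zod validators so that every statement made by the language model (LM) is backed by a direct quote from the provided context, which guards against hallucinations and keeps citations accurate. It defines TypeScript functions and Zod schemas that encapsulate the information for individual facts and for the answer as a whole.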
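Data Structures
The Fact Schema
The Fact schema encapsulates a single statement or fact. It has two properties:
- statement: a string holding the body of the fact or statement.
- substring_quote: a list of strings. Each string is a direct quote from the context that supports the statement.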
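As a plain-TypeScript illustration of the shape described above, the sketch below writes the two properties as an interface. The code in this document builds the schema with Zod and names the claim field `statement`; the `FactShape` interface and the `exampleFact` value are hypothetical names used only for illustration, with sample text borrowed from the example context later on this page.

```typescript
// Illustrative shape of a single fact; this page defines the real schema with Zod.
interface FactShape {
  statement: string;         // the body of the claim
  substring_quote: string[]; // direct quotes from the context that support the claim
}

// A fact whose only quote is copied verbatim from the context.
const exampleFact: FactShape = {
  statement: "Jason Liu went to an arts high school.",
  substring_quote: ["arts high school"],
};
```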
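Validation Method: createFactWithContext
This method dynamically creates a Zod schema for Fact with context-dependent validation. It validates the sources (substring_quote) by using a regular expression to find the span of each quote within the given context. If no span is found, the quote is removed from the list.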
import Instructor from "@/instructor"
import { z } from "zod"

function createFactWithContext(dynamicContext: string) {
  return z.object({
    statement: z.string(),
    // Keep only the quotes that actually occur in the context
    substring_quote: z.array(z.string()).transform((quotes) => {
      return quotes.flatMap((quote) => {
        const spans = getSpans(quote, dynamicContext);
        return spans.map(span => dynamicContext.substring(span[0], span[1]));
      });
    })
  });
}

function getSpans(quote: string, context: string): Array<[number, number]> {
  const matches: Array<[number, number]> = [];
  // An empty quote would match at every position without advancing, so skip it
  if (quote.length === 0) return matches;
  // Example regex search for simplicity; adjust according to your actual implementation.
  // Regex metacharacters are escaped so the quote is matched literally.
  const regex = new RegExp(quote.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), "g");
  let match;
  while ((match = regex.exec(context)) !== null) {
    matches.push([match.index, regex.lastIndex]);
  }
  return matches;
}
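Regex-based matching needs care when a quote contains characters such as `+`, `(` or `?`, because a regular expression reads them as syntax rather than as literal text. One alternative, sketched below under the assumption that exact, case-sensitive matching is what you want, is to skip regular expressions and use plain substring search. The helper name `findLiteralSpans` and the sample sentence are hypothetical and not part of the example repository.

```typescript
// Hypothetical helper: find every [start, end) span where `quote` occurs verbatim in `context`.
function findLiteralSpans(quote: string, context: string): Array<[number, number]> {
  const spans: Array<[number, number]> = [];
  if (quote.length === 0) return spans; // an empty quote supports nothing
  let from = 0;
  while (true) {
    const start = context.indexOf(quote, from);
    if (start === -1) break;
    spans.push([start, start + quote.length]);
    from = start + quote.length; // continue after this match (non-overlapping)
  }
  return spans;
}

// The quote contains "+" and "(", which would be regex syntax if left unescaped.
const sample = "I studied C++ (and some physics). I still use C++ (daily).";
const literalSpans = findLiteralSpans("C++ (", sample);
```

Each returned pair can be passed to `substring` to recover the quoted text, in the same way the spans from getSpans are used above.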
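The QuestionAnswer Schema
This schema encapsulates the question and its corresponding answer. It exists to provide the response structure for the OpenAI API call. It has two properties:
- question: the question that was asked.
- answer: a list of Fact objects that make up the answer.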
const QuestionAnswer = z.object({
  question: z.string(),
  answer: z.array(z.object({
    statement: z.string(),
    substring_quote: z.array(z.string()), // Basic structure without dynamic context validation
  }))
});

type QuestionAnswerType = z.infer<typeof QuestionAnswer>
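Validation Method: createQuestionAnswerWithContext
This method dynamically generates a Zod schema for QuestionAnswer with context-sensitive validation, ensuring that every Fact object in the answer list has at least one valid source. A Fact object with no valid source is removed from the answer list.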
function createQuestionAnswerWithContext(dynamicContext: string) {
  // Build the context-aware Fact schema defined earlier
  const FactSchemaWithContext = createFactWithContext(dynamicContext);
  return z.object({
    question: z.string(),
    answer: z.array(FactSchemaWithContext).transform((answers) => {
      // Filter out any Facts that, after validation, have no valid quotes
      return answers.filter(fact => fact.substring_quote.length > 0);
    })
  });
}
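To make the filtering rule concrete, here is a minimal dependency-free sketch of the same two steps: first drop quotes that do not occur in the context, then drop facts left with no quotes. It uses `String.prototype.includes` instead of Zod transforms, and the names `CitedFact` and `keepSupportedFacts`, along with the second sample fact, are hypothetical and chosen only for illustration.

```typescript
interface CitedFact {
  statement: string;
  substring_quote: string[];
}

// Hypothetical helper: keep only verbatim quotes, then drop facts left with none.
function keepSupportedFacts(facts: CitedFact[], context: string): CitedFact[] {
  return facts
    .map((fact) => ({
      ...fact,
      substring_quote: fact.substring_quote.filter(
        (quote) => quote.length > 0 && context.includes(quote),
      ),
    }))
    .filter((fact) => fact.substring_quote.length > 0);
}

const ctx =
  "I went to an arts high school but in university I studied Computational Mathematics and physics.";

const kept = keepSupportedFacts(
  [
    { statement: "He went to an arts high school.", substring_quote: ["arts high school"] },
    { statement: "He has a PhD from MIT.", substring_quote: ["PhD from MIT"] }, // quote is not in the context
  ],
  ctx,
);
// `kept` holds only the first fact; the second has no supporting quote and is removed.
```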
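Function to Ask the AI
The askAI Function
This function takes a question string and a context string and returns a promise that resolves to a QuestionAnswer object. It uses the OpenAI API together with the dynamic Zod schema for validation.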
import Instructor from "@/instructor"
import OpenAI from "openai"
import { z } from "zod"

const oai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY ?? undefined,
  organization: process.env.OPENAI_ORG_ID ?? undefined
})

const client = Instructor({
  client: oai,
  mode: "FUNCTIONS"
})

async function askAI(question: string, context: string): Promise<QuestionAnswerType> {
  const response = await client.chat.completions.create({
    model: "gpt-3.5-turbo-0613",
    temperature: 0,
    response_model: { schema: QuestionAnswer, name: "Question and Answer" },
    messages: [
      { role: "system", content: "You are a world class algorithm to answer questions with correct and exact citations." },
      { role: "user", content: context },
      { role: "user", content: `Question: ${question}` },
    ],
  });
  // Re-validate the response against the context-aware schema
  const QuestionAnswerWithContext = createQuestionAnswerWithContext(context);
  const parsedResponse = QuestionAnswerWithContext.parse(response);
  return parsedResponse;
}
Example
Here is an example of using these schemas and functions to ask a question and validate the answer.
const question = "Where did he go to school?"
const context = `My name is Jason Liu, and I grew up in Toronto Canada but I was born in China.
I went to an arts high school but in university I studied Computational Mathematics and physics.
As part of coop I worked at many companies including Stitchfix, Facebook.
I also started the Data Science club at the University of Waterloo and I was the president of the club for 2 years.`

// Ask the question against the context and print the validated answer
askAI(question, context).then((result) => console.log(result))
The output will be a QuestionAnswer object containing the validated facts and their sources.
{
  question: "Where did Jason Liu go to school?",
  answer: [
    {
      statement: "Jason Liu went to an arts high school.",
      substring_quote: [ "arts high school" ],
    },
    {
      statement: "Jason Liu studied Computational Mathematics and physics in university.",
      substring_quote: [ "Computational Mathematics and physics" ],
    }
  ],
}
This ensures that every piece of information in the answer has been checked against the context.