
Segmenting Search Queries

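In this example, we demonstrate how to use the MultiTask and enum features of OpenAI function calling to segment search queries. We define the necessary schemas with Zod and show how to split a query into multiple sub-queries and execute them in parallel.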

Motivation

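Extracting a list of tasks from text is a common use case for language models. The pattern applies to many applications, such as virtual assistants like Siri or Alexa, where understanding user intent and breaking a request into actionable tasks is essential. In this example, we demonstrate how to use OpenAI function calling to segment a search query into sub-queries and execute them in parallel.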

Data Structures

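SearchSchema is a Zod schema that defines the structure of a search query object. It contains three fields: title, query, and type. The title field is the title of the request, the query field is the query used to search for relevant content, and the type field is the type of search, restricted to the values of the SearchTypeSchema enum. The executeSearch function executes a single search query.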

import Instructor from "@instructor-ai/instructor"
import OpenAI from "openai"
import { z } from "zod"

const oai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY ?? undefined,
  organization: process.env.OPENAI_ORG_ID ?? undefined
})

const client = Instructor({
  client: oai,
  mode: "FUNCTIONS"
})

const SearchTypeSchema = z
  .enum(["VIDEO", "EMAIL"])
  .describe("Enumeration representing the types of searches that can be performed")

const SearchSchema = z
  .object({
    title: z.string().describe("Title of the request"),
    query: z.string().describe("Query to search for relevant content"),
    type: SearchTypeSchema.describe("Type of search")
  })
  .describe(
    "Object representing a single search query which contains title, query, and the search type"
  )

type Search = z.infer<typeof SearchSchema>

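async function executeSearch(search: Search): Promise<void> {
  // Simulate a one-second search; the promise resolves only after the timer fires,
  // so callers that await this function really wait for the search to finish
  await new Promise<void>((resolve) => setTimeout(resolve, 1000))
  console.log(`Searching for ${search.title} with ${search.query} using ${search.type}`)
}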

const MultiSearchSchema = z
  .object({
    searches: z.array(SearchSchema).describe("List of searches")
  })
  .describe("Object representing multiple search queries")

type MultiSearch = z.infer<typeof MultiSearchSchema>

async function executeMultiSearch(multiSearch: MultiSearch) {
  // Return each promise from the callback so Promise.all waits for every search
  return Promise.all(multiSearch.searches.map((search: Search) => executeSearch(search)))
}

/**
 * Convert a string into multiple search queries
 */
async function segment(data: string): Promise<MultiSearch> {
  return await client.chat.completions.create({
    messages: [
      {
        role: "system",
        content: "You are a helpful assistant."
      },
      {
        role: "user",
        content: `Consider the data below:\n${data} and segment it into multiple search queries`
      }
    ],
    model: "gpt-4-1106-preview",
    response_model: { schema: MultiSearchSchema, name: "MultiSearch" },
    max_tokens: 1000,
    temperature: 0.1
  })
}

const queries = await segment(
  "Please send me the video from last week about the investment case study and also documents about your GDPR policy"
)
await executeMultiSearch(queries)

// Example output (the exact titles and queries depend on the model's response):
// >>> Searching for Video with investment case study using VIDEO
// >>> Searching for Documents with GDPR policy using EMAIL
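
Promise.all rejects as soon as any single search fails, which discards the results of the searches that succeeded. The sketch below shows one way to run the segmented searches in parallel while collecting failures separately. It is dependency-free and not part of the original example: the Search interface is a hand-written stand-in for the Zod-inferred type, and fakeSearch, runAll, Outcome, and SearchReport are hypothetical names introduced only for illustration.

```typescript
// Hand-written stand-in for the Zod-inferred Search type (assumption: same three fields)
type SearchType = "VIDEO" | "EMAIL"

interface Search {
  title: string
  query: string
  type: SearchType
}

// Hypothetical search backend: resolves after delayMs, rejects on an empty query
function fakeSearch(search: Search, delayMs = 50): Promise<string> {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (search.query.trim() === "") {
        reject(new Error(`Empty query for "${search.title}"`))
      } else {
        resolve(`${search.type}: ${search.query}`)
      }
    }, delayMs)
  })
}

type Outcome = { status: "ok"; value: string } | { status: "error"; message: string }

interface SearchReport {
  results: string[]
  errors: string[]
}

// Start every search at once; each promise converts its own failure into an
// Outcome, so Promise.all never rejects and the input order is preserved
async function runAll(searches: Search[]): Promise<SearchReport> {
  const outcomes: Outcome[] = await Promise.all(
    searches.map((search) =>
      fakeSearch(search).then(
        (value): Outcome => ({ status: "ok", value }),
        (err: unknown): Outcome => ({
          status: "error",
          message: err instanceof Error ? err.message : String(err)
        })
      )
    )
  )

  const report: SearchReport = { results: [], errors: [] }
  for (const outcome of outcomes) {
    if (outcome.status === "ok") {
      report.results.push(outcome.value)
    } else {
      report.errors.push(outcome.message)
    }
  }
  return report
}

// Usage: the second search has an empty query and ends up in report.errors
runAll([
  { title: "Video", query: "investment case study", type: "VIDEO" },
  { title: "Documents", query: "", type: "EMAIL" }
]).then((report) => console.log(report))
```

Whether a failed sub-query should abort the whole request or only be reported is an application decision; this sketch assumes partial results are still useful to the caller.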