A new generation of interaction, the LUI (language user interface), is arriving, and Microsoft has built a bridge to it

Published 2024-7-9 08:17


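After LLMs such as ChatGPT won broad recognition for their striking natural-language understanding, a wave of natural-language applications was put on the roadmap, and conversational human-computer interaction is once again drawing the industry's attention.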

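Five or six years ago, smart speakers set off a boom on the strength of their conversational interface, and the major vendors had high hopes for them. The technology was immature, though: comprehension and dialogue were too weak, and smart speakers never really succeeded as a mass consumer product. Now that large models have brought a qualitative leap in semantic understanding and dialogue, expectations for the LUI (language user interface) are rising again. Many are waiting for a new era, much as Apple's capacitive touchscreen and iOS displaced the resistive screens and Symbian of the Nokia era, transformed the phone experience, and opened the mobile internet age. As a result, research on and applications of the LUI as the next shift in interaction have taken off across the industry.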


Introduction to LUI (slides): https://drive.google.com/file/d/1A6ZkodDUuyQzVE3HCrI8iHs3y-9wDI1t/view

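However, building LUI applications that use a large model such as ChatGPT as the brain and connect it to downstream services exposes a new tension, the reverse of what earlier interaction styles faced: what is friendly to users is unfriendly to service integration. The language people use to talk to a system and give it instructions is varied and flexible, whereas calls between systems need to be structured and stable. A large model can understand users' varied instructions, but its own output becomes varied and unstable as well, which makes system integration hard. To address this, frameworks such as langchain rely on carefully constructed prompts and purpose-built "Output parsers" to handle the many possible outputs, as do the libraries mentioned previously, such as guidance and Guardrails. The prompt-engineering route has drawbacks, though. First, it is a new and unfamiliar species for traditional developers and comes with a learning curve. Second, model output is inherently uncertain: as the types grow more complex, the pipeline grows more fragile, and edge cases often bring the system down. For this reason OpenAI itself added function calling to make it easier to connect large models to other systems.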
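To make the fragility concrete, here is a minimal sketch of the kind of hand-written output parsing the prompt-engineering route tends to require. It is not langchain's actual parser API; the function name `extractJsonArray` and the sample replies are hypothetical and made up for illustration.

```typescript
// Hypothetical hand-rolled output parser (not a langchain API).
// It tries to pull a JSON array out of a free-form model reply.
function extractJsonArray(reply: string): unknown[] | undefined {
    // Models often wrap JSON in prose, so search for the outermost brackets.
    const start = reply.indexOf("[");
    const end = reply.lastIndexOf("]");
    if (start === -1 || end <= start) {
        return undefined;
    }
    try {
        const parsed: unknown = JSON.parse(reply.slice(start, end + 1));
        return Array.isArray(parsed) ? parsed : undefined;
    } catch {
        // Trailing commas, single quotes, and similar slips all end up here.
        return undefined;
    }
}

// Sample replies (made up for illustration).
const wrapped = 'Sure! Here you go:\n[{"venue": "Seattle Art Museum"}]\nEnjoy!';
const truncated = '[{"venue": "Seattle Art Museum"}, {"venue": "Chihu';

// The wrapped reply can be recovered; the truncated one cannot.
const okReply = extractJsonArray(wrapped);
const badReply = extractJsonArray(truncated);
```

Every new failure mode (extra prose, truncation, wrong field names) needs another special case in code like this, which is the brittleness the paragraph above describes.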


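Recently (July 20), Microsoft released a new open-source library in this space, TypeChat (https://github.com/microsoft/TypeChat), which offers developers a new way to tackle these problems. Its key idea is to replace prompt engineering with schema engineering, cleverly leaning on TypeScript: developers only need to define types that represent the intents their LUI application supports. That can be as simple as an interface for sentiment classification, or a more complex data type such as the types for a shopping cart or a music application. For example, to add more intents to a schema, a developer can add more types to a discriminated union. To make schemas hierarchical, a developer can use a "meta-schema" that picks one or more sub-schemas based on the user's input. Once the types are defined, TypeChat takes care of the rest: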

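  • Constructs the prompt to the LLM from the types.
  • Validates that the LLM response conforms to the schema. If validation fails, it repairs the non-conforming output through further interaction with the language model.
  • Summarizes the instance succinctly (without using an LLM) and confirms that it matches the user's intent.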
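As a sketch of what defining the supported intents as types can look like, the following uses a discriminated union for a small, made-up cart application. The type names (`AddItem`, `RemoveItem`, `UnknownIntent`, `CartIntent`) and the handler `applyIntent` are hypothetical and are not taken from TypeChat's own examples; supporting one more intent means adding one more member to the union.

```typescript
// Hypothetical intent schema for a tiny shopping-cart LUI (names are illustrative).
interface AddItem {
    type: "addItem";
    name: string;
    quantity: number;
}

interface RemoveItem {
    type: "removeItem";
    name: string;
}

// Used when the request does not match any supported intent.
interface UnknownIntent {
    type: "unknown";
    text: string;
}

// The discriminated union: the `type` field tells the variants apart.
type CartIntent = AddItem | RemoveItem | UnknownIntent;

// Ordinary application code consumes the typed intent and switches on the discriminant.
function applyIntent(cart: Record<string, number>, intent: CartIntent): string {
    switch (intent.type) {
        case "addItem":
            cart[intent.name] = (cart[intent.name] ?? 0) + intent.quantity;
            return `added ${intent.quantity} x ${intent.name}`;
        case "removeItem":
            delete cart[intent.name];
            return `removed ${intent.name}`;
        case "unknown":
            return `not understood: ${intent.text}`;
    }
}

// Example: an intent object as a model might return it, applied to an empty cart.
const cart: Record<string, number> = {};
const cartMessage = applyIntent(cart, { type: "addItem", name: "latte", quantity: 2 });
```

Because the downstream code only ever sees a `CartIntent`, it stays plain TypeScript regardless of how the user phrased the request.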

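Comparing the two approaches:

  • The prompt approach

The developer writes a prompt, provides examples, asks the LLM to reply in JSON, and then post-processes the output.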

User: Provide 3 suggestions for specific places to go to in Seattle on a rainy day. Respond in the form of JSON. The JSON should have the following format:

[
    { "venue": "...", "description": "..." },
    { "venue": "...", "description": "..." }
]

ChatGPT:

[
    {
        "venue": "Seattle Art Museum",
        "description": "The Seattle Art Museum (SAM) is a world-class museum with a diverse collection of art from around the globe. It offers a great indoor activity for a rainy day, allowing you to explore various art exhibitions and displays, including contemporary art, African and Oceanic art, and European painting and sculpture."
    },
    {
        "venue": "Chihuly Garden and Glass",
        "description": "Chihuly Garden and Glass is an extraordinary exhibition showcasing the breathtaking glass artwork of Dale Chihuly. Located at the Seattle Center, this indoor attraction offers a mesmerizing experience as you wander through vibrant glass sculptures and installations. The beautiful interplay of light and color will captivate your senses, making it a perfect rainy day destination."
    },
    {
        "venue": "Museum of Pop Culture",
        "description": "The Museum of Pop Culture (MoPOP) is a vibrant museum dedicated to contemporary pop culture, music, science fiction, and more. It features engaging exhibits that celebrate music icons, explore science fiction themes, and delve into the world of video games. Whether you're a fan of music, movies, or pop culture in general, MoPOP offers an immersive and entertaining experience, making it an ideal choice on a rainy day."
    }
]
  • The schema approach

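The developer works in a familiar way and defines types, and TypeChat takes care of the internals. In practice TypeChat builds on TypeScript itself, using the TypeScript compiler to handle corner cases, which smooths over some of the rough edges of language model responses and improves robustness. Developers can keep programming the way they already do and go on to process the data further, so that the whole is greater than the sum of its parts.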

User: Provide 3 suggestions for specific places to go to in Seattle on a rainy day. Respond strictly with JSON. The JSON should be compatible with the TypeScript type Response from the following:

interface Response {
    data: Array<{ venue: string, description: string }>;
}

ChatGPT:

{
    "data": [
        {
            "venue": "Museum of Pop Culture",
            "description": "Explore the fascinating world of popular culture through interactive exhibits, music memorabilia, and multimedia displays at the Museum of Pop Culture."
        },
        {
            "venue": "Seattle Art Museum",
            "description": "Indulge in the diverse collection of artworks from around the world at the Seattle Art Museum, featuring both contemporary and classical pieces."
        },
        {
            "venue": "Pike Place Market",
            "description": "Enjoy the vibrant atmosphere of Pike Place Market, where you can wander through covered stalls offering fresh produce, crafts, and delicious food from local vendors."
        }
    ]
}
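The steps TypeChat automates (build the prompt from the type, validate the reply, ask for a repair on failure) can be sketched without the library. This is a simplified illustration, not TypeChat's implementation: TypeChat validates with the TypeScript compiler, whereas the sketch uses a hand-written runtime check. The names `Suggestions`, `validate`, `translateWithRepair`, and `fakeModel`, as well as the prompt wording, are hypothetical, and the model is a synchronous stub so the example stays self-contained; a real model call would be asynchronous.

```typescript
// Simplified sketch of a validate-and-repair loop (not TypeChat's implementation).
interface Suggestions {
    data: Array<{ venue: string; description: string }>;
}

type Result<T> = { success: true; data: T } | { success: false; message: string };

// The schema text is sent to the model verbatim as part of the prompt.
const schemaText =
    "interface Suggestions {\n    data: Array<{ venue: string, description: string }>;\n}";

// Hand-written runtime check standing in for compiler-based validation.
function validate(json: string): Result<Suggestions> {
    let value: unknown;
    try {
        value = JSON.parse(json);
    } catch {
        return { success: false, message: "Reply is not valid JSON" };
    }
    const data = (value as { data?: unknown } | null)?.data;
    if (!Array.isArray(data)) {
        return { success: false, message: "Property 'data' must be an array" };
    }
    for (const item of data) {
        if (typeof item?.venue !== "string" || typeof item?.description !== "string") {
            return { success: false, message: "Each item needs string 'venue' and 'description'" };
        }
    }
    return { success: true, data: value as Suggestions };
}

// A model is "prompt in, text out" here; a real call would be asynchronous.
type Model = (prompt: string) => string;

function translateWithRepair(model: Model, request: string, maxRepairs = 1): Result<Suggestions> {
    // Step 1: build the prompt from the type definition.
    let prompt =
        "Respond strictly with JSON compatible with the TypeScript type Suggestions:\n" +
        schemaText + "\nUser request: " + request;
    let lastError = "";
    for (let attempt = 0; attempt <= maxRepairs; attempt++) {
        // Step 2: validate the reply against the schema.
        const result = validate(model(prompt));
        if (result.success === true) {
            return result;
        }
        // Step 3: feed the validation error back and ask for a corrected reply.
        lastError = result.message;
        prompt += "\nThe previous reply was invalid: " + lastError + "\nReturn corrected JSON only.";
    }
    return { success: false, message: lastError };
}

// Stub model: the first reply is missing a field, the second one is correct.
let calls = 0;
const fakeModel: Model = () => {
    calls++;
    return calls === 1
        ? '{"data": [{"venue": "Pike Place Market"}]}'
        : '{"data": [{"venue": "Pike Place Market", "description": "Covered market stalls."}]}';
};

const outcome = translateWithRepair(fakeModel, "Suggest a place in Seattle for a rainy day.");
```

The type is named `Suggestions` rather than `Response` only to avoid clashing with the built-in `Response` type in browser and Node typings.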

A sentiment analysis example using TypeChat:

// ./src/sentimentSchema.ts

// The following is a schema definition for determining the sentiment of some user input.

export interface SentimentResponse {
    /** The sentiment of the text. */
    sentiment: "negative" | "neutral" | "positive";
}

// ./src/main.ts

import * as fs from "fs";
import * as path from "path";
import dotenv from "dotenv";
import * as typechat from "typechat";
import { SentimentResponse } from "./sentimentSchema";

// Load environment variables.
dotenv.config({ path: path.join(__dirname, "../.env") });

// Create a language model based on the environment variables.
const model = typechat.createLanguageModel(process.env);

// Load up the contents of our "Response" schema.
const schema = fs.readFileSync(path.join(__dirname, "sentimentSchema.ts"), "utf8");
const translator = typechat.createJsonTranslator<SentimentResponse>(model, schema, "SentimentResponse");

// Process requests interactively.
typechat.processRequests("😀> ", /*inputFile*/ undefined, async (request) => {
    const response = await translator.translate(request);
    if (!response.success) {
        console.log(response.message);
        return;
    }
    console.log(`The sentiment is ${response.data.sentiment}`);
});

Closing thoughts:

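TypeChat is a new attempt by Microsoft at the LUI. It connects language to services well: front-end developers can use TypeChat to turn their existing software into the new LUI form of interaction and build more features on top of it, while the development experience carries over and the migration is incremental. At present the library supports the OpenAI API and Azure OpenAI Service but not other models; support for custom model services would be welcome, although this approach also places certain demands on the capability of the model itself.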

In short: Types are all you need!

Reposted from AI工程化, author: ully

