AI调用工具和编程

wdong

目前对行内的人来说一个既不幸又幸运的事情是，一眼看得到底的工程性的东西基本上不是被解决了，就是已经有很专业团队在做了。工具整合就是这么一个东西。Composio是一家做工具整合做得质量非常好的公司https://composio.dev/。常用的功能比如gmail, github, serpapi这些都整合了，非常方便。Composio提供了接口可以直接连OpenAI。我的平台因为工作在completion level，并不假设底层工具调用的接口。但是也非常方便地整合了。下面是我整合composio的代码+

#!/usr/bin/env python3
import os
import asyncio
import json
import postline
from composio import ComposioToolSet, Action

API_KEY = os.getenv('COMPOSIO_API_KEY')

toolset = ComposioToolSet(api_key=API_KEY)

def make_reply_message (self_address, message, content):
re = postline.Message()
re['From'] = self_address
re['To'] = message['From']
re['Content-Type'] = 'text/plain'
re.set_content(content)
return re

app = postline.App("composio")

self_address = "composio@realm.localdomain"

@app.receive(self_address) # 监听邮件地址
async def handle_message (app, message):
action = message['Subject'] # 邮件标题是action
content = message.get_content()
params = json.loads(content) # 邮件内容是参数
resp = toolset.execute_action( # 调用工具就是这一行
action=getattr(Action, action),
params=params,
entity_id="default",
)
msg = make_reply_message(self_address, message, json.dumps(resp))
await app.send(msg)

asyncio.run(app.run())

然后对话的时候教AI，告诉它想要用composio的功能就是给composio@realm.localdomain发邮件，标题是action，比如SERPAPI_SEARCH， GMAIL_SEND_EMAIL，参数则以json形式作为邮件内容。比如像下面这样：

From: user@localdomain
To: a100@agents.localdomain
Let's try to test the composio tool. The way to use it is to send an email to composio@realm.localdomain . The subject should be the action name, and the email should be of the MIME type application/json . The email body is a JSON dict of params.

For the first test, try to use the action (email subject) GMAIL_GET_PROFILE . The body has only one parameter: "user_id": "me" .

From: a100@agents.localdomain
To: composio@realm.localdomain
Subject: GMAIL_GET_PROFILE
Content-Type: application/json

{
"user_id": "me"
}

From: composio@realm.localdomain
To: a100@agents.localdomain

{"data": {"response_data": {"emailAddress": "wdong@wdong.org", "messagesTotal
": 93456, "threadsTotal": 66445, "historyId": "15502828"}}, "error": null, "s
uccessfull": true, "successful": true}

可以看到agent收到我的邮件后就给composio发了个邮件。这个邮件格式是正确的，然后上面贴的那个composio接口也回复了。发送邮件，google search我试了都没问题。我非要自己搞一套，是因为我发现AI和传统软件不同，非常versatile，基本上你想要什么姿势它都能用什么姿势配合你。OpenAI的工具调用接口根本就没有需要做。有时候我这边程序没写好，它也能变通一下配合我让程序跑起来。

眼前的问题是，google search结果很大，搜几次如果积累在journal里，一次inference的开销马上就涨到了一毛多。需要利用agent clone功能开发出一个调用子agent的方法，让子agent对搜索结果摘要之后再返回。

Composio里面有大量的和软件开发相关的工具。然后他们在这之上和crewAI结合搞了个swe-kit, 是一个做软件工程的agent。我看了眼目前他们的逻辑比较简单，高度依赖底层AI的自主性。我感觉用agent做软件工程这个方向有大量的事情可以做。Benchmark已经有了https://www.swebench.com/.

hellofolks

您可以试试谷歌的api。context大，而且有免费的额度。

wdong

hellofolks 写了： 2025年 2月 20日 14:41 您可以试试谷歌的api。context大，而且有免费的额度。

谢谢，我之前试过第三方的gemma，至少能follow我的protocol。我试下google的。

TheMatrix · 帖子由 **TheMatrix** » 2025年 2月 20日 17:21

wdong 写了： 2025年 2月 20日 09:23
From: user@localdomain
To: a100@agents.localdomain
Let's try to test the composio tool. The way to use it is to send an email to composio@realm.localdomain . The subject should be the action name, and the email should be of the MIME type application/json . The email body is a JSON dict of params.

For the first test, try to use the action (email subject) GMAIL_GET_PROFILE . The body has only one parameter: "user_id": "me" .

你的a100 agent是如何懂你发给它的邮件的呢？

wdong

第一个邮件解释，他是个agent只能通过邮件和外界联系。下面第二个邮件是我硬加的而不是LLM产生的，
目的是为了让LLM习惯这么交流（LLM无法判断这个记忆是它自己的还是我硬塞进去的)。
然后第三个邮件就是用户请求了。因为邮件这种格式太常见了，大部分大厂官方llm都能无痛适应。

From: system@localdomain
To: a100@agents.localdomain
Subject: Agent created

You are an AI agent who communicates with the outside world through email
messages. The incoming messages might be from human users, other AI agents, or
the system itself. You should respond to the messages as appropriate. Make
sure you generate the emails headers correctly. If you decide to add a Subject,
make it concise; or you could just leave it blank.

From: a100@agents.localdomain
To: system@localdomain
Subject: RE: Agent created

I'm ready to process messages.

TheMatrix · 帖子由 **TheMatrix** » 2025年 2月 20日 21:03

wdong 写了： 2025年 2月 20日 17:59 第一个邮件解释，他是个agent只能通过邮件和外界联系。下面第二个邮件是我硬加的而不是LLM产生的，
目的是为了让LLM习惯这么交流（LLM无法判断这个记忆是它自己的还是我硬塞进去的)。
然后第三个邮件就是用户请求了。因为邮件这种格式太常见了，大部分大厂官方llm都能无痛适应。

From: system@localdomain
To: a100@agents.localdomain
Subject: Agent created

You are an AI agent who communicates with the outside world through email
messages. The incoming messages might be from human users, other AI agents, or
the system itself. You should respond to the messages as appropriate. Make
sure you generate the emails headers correctly. If you decide to add a Subject,
make it concise; or you could just leave it blank.

From: a100@agents.localdomain
To: system@localdomain
Subject: RE: Agent created

I'm ready to process messages.

我们编一下号。如下，有三封邮件：
第一封是用户写给a100 agent的。
第二封是a100 agent写给composio的。
第三封是composio写给a100 agent的。

我的问题是：第一封信a100 agent收到之后，它是怎么弄懂邮件内容的呢？

1.

From: user@localdomain
To: a100@agents.localdomain
Let's try to test the composio tool. The way to use it is to send an email to composio@realm.localdomain . The subject should be the action name, and the email should be of the MIME type application/json . The email body is a JSON dict of params.

For the first test, try to use the action (email subject) GMAIL_GET_PROFILE . The body has only one parameter: "user_id": "me" .

2.

From: a100@agents.localdomain
To: composio@realm.localdomain
Subject: GMAIL_GET_PROFILE
Content-Type: application/json

{
"user_id": "me"
}

3.

From: composio@realm.localdomain
To: a100@agents.localdomain

{"data": {"response_data": {"emailAddress": "wdong@wdong.org", "messagesTotal
": 93456, "threadsTotal": 66445, "historyId": "15502828"}}, "error": null, "s
uccessfull": true, "successful": true}

wdong

每次agent收到邮件会触发inference，也就是把agent至今为止收发的邮件全都放到prompt里调用LLM。因为最后一个肯定是收到的，所以它知道要产生下一个回复，而且知道要回复给谁。

其实chat API的输入本身就是a list of messages. 我只是要把邮件转成字符串。

我目前干的东西还超级简单。

TheMatrix · 帖子由 **TheMatrix** » 2025年 2月 21日 17:00

wdong 写了： 2025年 2月 21日 00:38 每次agent收到邮件会触发inference，也就是把agent至今为止收发的邮件全都放到prompt里调用LLM。因为最后一个肯定是收到的，所以它知道要产生下一个回复，而且知道要回复给谁。

其实chat API的输入本身就是a list of messages. 我只是要把邮件转成字符串。

我目前干的东西还超级简单。

也就是邮件1发送给a100 agent之后，agent把它作为prompt发给了LLM。然后邮件2是LLM产生的，agent收到之后发给composio。

是这样吗？

stonesthat

你这做法的根本革新是在LLM之上又套了一层，分离了 in context learning (training) with deployment,
这样有可能藏起 in context 训练过程，可以在别人发布的LLM之上做商用服务。你paper也说分离能够
满足隐私要求，很大程度上这是能保护做 agent 的服务方的隐私。这么理解对吗？

其实目前所有的LLM其实都支持多轮对话，用所有 history 做 prompt. 问题是很多 chat thread 聊完
之后也就完了，找回来接着聊的其实不多，这些 thread 其实openai都拿去训练新模型了。
就这还不够，openai 请了>phD level 的 consultant 天天跟这些模型聊天，感觉走入歧途了。

wdong

stonesthat 写了： 2025年 3月 3日 12:47 你这做法的根本革新是在LLM之上又套了一层，分离了 in context learning (training) with deployment,
这样有可能藏起 in context 训练过程，可以在别人发布的LLM之上做商用服务。你paper也说分离能够
满足隐私要求，很大程度上这是能保护做 agent 的服务方的隐私。这么理解对吗？

其实目前所有的LLM其实都支持多轮对话，用所有 history 做 prompt. 问题是很多 chat thread 聊完
之后也就完了，找回来接着聊的其实不多，这些 thread 其实openai都拿去训练新模型了。
就这还不够，openai 请了>phD level 的 consultant 天天跟这些模型聊天，感觉走入歧途了。

你说的对，就是把目前的界面变成群聊。所以从新的东西上来说其实几乎没有革新。
我想象中的革新主要是在概念上的，就是反对那种在用户界面上画流程图的做法。就这么群聊就可以实现所有的东西。

jb · 帖子由 **jb（Joe Biden）** » 2025年 3月 6日 01:03

https://annas-archive.org/slow_download ... 4f423c/0/0

新未名空间

AI调用工具和编程

#1 AI调用工具和编程

#2 Re: AI调用工具和编程

#3 Re: AI调用工具和编程

#4 Re: AI调用工具和编程

#5 Re: AI调用工具和编程

#6 Re: AI调用工具和编程

#7 Re: AI调用工具和编程

#8 Re: AI调用工具和编程

#9 Re: AI调用工具和编程

#10 Re: AI调用工具和编程

#11 Re: AI调用工具和编程