10分钟上手AutoGen+FastAPI：构建Web智能体服务的极简指南

2026-02-04 05:15:13作者：温艾琴Wonderful

你是否还在为如何将大型语言模型(LLM)能力集成到Web应用中而烦恼？是否遇到过对话状态管理复杂、多智能体协作困难的问题？本文将带你通过AutoGen与FastAPI的无缝集成，快速构建一个具备状态持久化和多智能体协作能力的Web智能体服务，全程只需三个步骤，无需复杂配置即可实现生产级部署。

技术架构概览

AutoGen是微软开源的智能体框架，支持多智能体协作、函数调用和状态管理；FastAPI则是高性能的Python Web框架，两者结合可快速构建AI原生Web服务。本方案采用"前端-API层-智能体层"三层架构：

graph TD
    A[用户浏览器] -->|HTTP/WebSocket| B[FastAPI服务]
    B --> C[AutoGen智能体/团队]
    C --> D[LLM模型服务]
    C --> E[状态存储(JSON文件)]

核心代码结构位于python/samples/agentchat_fastapi/目录，包含两个关键实现：单智能体聊天(app_agent.py)和多智能体协作(app_team.py)。

环境准备与安装

基础依赖安装

首先克隆项目仓库并安装依赖：

git clone https://gitcode.com/GitHub_Trending/au/autogen.git
cd autogen/python/samples/agentchat_fastapi
pip install -U "autogen-agentchat" "autogen-ext[openai]" "fastapi" "uvicorn[standard]" "PyYAML"

模型配置

创建model_config.yaml文件配置LLM模型，支持OpenAI、Azure等多种后端：

type: openai
model: "gpt-3.5-turbo"
api_key: "your_api_key_here"

提示：国内用户可使用Azure OpenAI服务或其他兼容API的本地模型，配置示例可参考model_config_template.yaml

单智能体服务实现

核心代码解析

单智能体服务通过AssistantAgent处理用户请求，关键实现位于app_agent.py：

# 智能体初始化
async def get_agent() -> AssistantAgent:
    async with aiofiles.open(model_config_path, "r") as file:
        model_config = yaml.safe_load(await file.read())
    model_client = ChatCompletionClient.load_component(model_config)
    agent = AssistantAgent(
        name="assistant",
        model_client=model_client,
        system_message="You are a helpful assistant.",
    )
    # 加载状态
    if os.path.exists(state_path):
        async with aiofiles.open(state_path, "r") as file:
            state = json.loads(await file.read())
        await agent.load_state(state)
    return agent

HTTP接口设计采用RESTful风格，提供聊天和历史记录查询功能：

@app.post("/chat", response_model=TextMessage)
async def chat(request: TextMessage) -> TextMessage:
    agent = await get_agent()
    response = await agent.on_messages(messages=[request])
    # 保存状态和历史
    state = await agent.save_state()
    async with aiofiles.open(state_path, "w") as file:
        await file.write(json.dumps(state))
    # ...省略历史记录保存代码...
    return response.chat_message

启动与测试

运行服务并访问http://localhost:8001即可看到聊天界面：

python app_agent.py

服务启动后会自动创建状态文件agent_state.json和历史记录文件agent_history.json，实现对话状态的持久化存储。测试流程：

在浏览器输入问题
查看返回结果
重启服务验证状态是否保留

多智能体团队协作

团队架构设计

多智能体版本使用RoundRobinGroupChat实现智能体轮流对话，架构如下：

graph LR
    User[用户] -->|输入| FastAPI[FastAPI服务]
    FastAPI -->|任务分发| Team[RoundRobinGroupChat]
    Team --> A[AssistantAgent<br>普通助手]
    Team --> B[YodaAgent<br>尤达风格助手]
    Team --> C[UserProxyAgent<br>用户代理]
    A & B --> LLM[语言模型服务]

核心实现位于app_team.py，通过WebSocket实现实时交互：

# 创建多智能体团队
async def get_team(user_input_func):
    # ...省略模型加载代码...
    agent = AssistantAgent(name="assistant", model_client=model_client)
    yoda = AssistantAgent(
        name="yoda",
        model_client=model_client,
        system_message="Repeat the same message in the tone of Yoda."
    )
    user_proxy = UserProxyAgent(name="user", input_func=user_input_func)
    team = RoundRobinGroupChat([agent, yoda, user_proxy])
    # ...省略状态加载代码...
    return team

WebSocket实时通信

团队聊天使用WebSocket实现双向通信，支持流式响应：

@app.websocket("/ws/chat")
async def chat(websocket: WebSocket):
    await websocket.accept()
    async def _user_input(prompt, cancellation_token):
        data = await websocket.receive_json()
        return TextMessage.model_validate(data).content
    
    team = await get_team(_user_input)
    stream = team.run_stream(task=request)
    async for message in stream:
        await websocket.send_json(message.model_dump())

启动多智能体服务：

python app_team.py

访问http://localhost:8002体验多智能体协作，系统会依次显示普通助手和尤达风格助手的回复，用户可在指定轮次参与对话。

高级特性与最佳实践

状态持久化机制

AutoGen提供save_state()和load_state()方法实现智能体状态持久化，默认使用JSON文件存储：

# 保存状态
state = await agent.save_state()
async with aiofiles.open(state_path, "w") as file:
    await file.write(json.dumps(state))

# 加载状态
async with aiofiles.open(state_path, "r") as file:
    state = json.loads(await file.read())
await agent.load_state(state)

生产环境建议使用数据库存储状态，可参考AutoGen状态管理文档。

性能优化建议

模型缓存：使用autogen-ext提供的缓存组件减少重复请求
异步处理：所有文件操作和网络请求使用异步IO(aiofiles)

连接池：配置LLM客户端连接池，示例：

model_client = ChatCompletionClient(
    model="gpt-3.5-turbo",
    api_key=api_key,
    max_retries=3,
    timeout=30,
    http_client=AsyncHTTPClient(limits=ClientLimits(max_connections=10))
)

常见问题解决

CORS跨域问题

FastAPI默认配置可能导致前端跨域错误，需添加CORS中间件：

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # 生产环境指定具体域名
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

状态文件权限问题

确保应用有状态文件读写权限，建议指定绝对路径：

state_path = os.path.join(os.path.dirname(__file__), "agent_state.json")

模型连接超时

检查API密钥和网络连接，国内用户可配置代理：

export HTTP_PROXY=http://proxy.example.com:8080
export HTTPS_PROXY=https://proxy.example.com:8080

部署与扩展

Docker容器化

创建Dockerfile简化部署：

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "app_agent:app", "--host", "0.0.0.0", "--port", "8000"]