LiteLLM代理调用Azure托管Llama3模型时的参数传递问题解析

2025-05-10 16:59:02作者：俞予舒Fleming

Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, VLLM, NVIDIA NIM]

项目地址：https://gitcode.com/GitHub_Trending/li/litellm

在大型语言模型(LLM)的应用开发中，LiteLLM作为一款流行的API调用工具，能够简化不同模型API的调用流程。然而，当开发者尝试通过LiteLLM代理调用Azure托管的Llama3模型时，可能会遇到一个特定的参数传递问题。

问题现象

开发者在通过LiteLLM代理调用Azure托管的Llama3模型时，收到了错误提示："Extra parameters ['stream_options'] are not allowed when extra-parameters is not set or set to be 'error'"。这个错误表明，系统拒绝了包含stream_options参数的请求。

值得注意的是，相同的调用方式在使用Azure OpenAI模型时工作正常，直接调用Llama3部署（不使用LiteLLM代理）也没有问题。这种不一致性说明问题出在LiteLLM代理与Azure Llama3模型API的交互环节。

问题根源

经过分析，这个问题源于Azure Llama3模型API对额外参数的严格校验机制。默认情况下，Azure Llama3 API会拒绝任何未明确允许的额外参数。这与Azure OpenAI API的行为有所不同，后者对参数传递更为宽松。

当开发者通过LiteLLM代理发送包含stream_options参数的请求时，由于没有明确设置参数传递策略，Azure Llama3 API会按照默认的严格模式拒绝这些额外参数。

解决方案

要解决这个问题，需要在LiteLLM配置中明确指定参数传递策略。具体方法是在模型配置中添加headers字段，设置extra-parameters为pass-through：

model_name: llama3
litellm_params:
  model: azure_ai/Meta-Llama-3-70B-Instruct
  api_base: <base>
  api_key: os.environ/LLAMA3_API_KEY
  headers:
    extra-parameters: pass-through