Analysis and Solution for Evaluator Version Issues in Azure SDK for Python

2025-06-10 07:00:30 Author: 裴麒琰


Project URL: https://gitcode.com/GitHub_Trending/az/azure-sdk-for-python

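Background

When using Azure SDK for Python to evaluate AI models, developers may find that an evaluator cannot process the input data correctly. Specifically, some evaluation metrics (such as Groundedness and Relevance) fail to compute, while the Similarity metric works as expected.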

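Symptoms

When developers use the evaluator IDs provided by the SDK (such as GroundednessEvaluator.id), the system returns the error "Only text conversation inputs are supported". However, the same data file evaluates without problems through the UI. This indicates that the problem lies not in the data format itself but in which evaluator version is selected.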

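Root Cause

Analysis points to the following causes:

The evaluator IDs built into the SDK may point to older model versions
These older evaluator versions parse input data differently from newer versions
The evaluator interface may have undergone incompatible changes between iterations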
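
One way to check the first point is to look at which version an evaluator ID actually refers to. The sketch below parses an ID into its parts. It assumes the ID layout azureml://registries/<registry>/models/<model-name>/versions/<version>, which is inferred from the IDs shown in this article rather than from a documented specification; parse_evaluator_id and EvaluatorRef are hypothetical helper names, not part of the SDK.

```python
import re
from typing import NamedTuple, Optional

# Assumed ID layout, inferred from the IDs used in this article:
# azureml://registries/<registry>/models/<model-name>/versions/<version>
_EVALUATOR_ID_PATTERN = re.compile(
    r"^azureml://registries/(?P<registry>[^/]+)"
    r"/models/(?P<model>[^/]+)"
    r"/versions/(?P<version>\d+)$"
)


class EvaluatorRef(NamedTuple):
    """Parts of a versioned evaluator ID (hypothetical helper type)."""
    registry: str
    model: str
    version: int


def parse_evaluator_id(evaluator_id: str) -> Optional[EvaluatorRef]:
    """Return the parts of a versioned evaluator ID, or None if it does not match."""
    match = _EVALUATOR_ID_PATTERN.match(evaluator_id)
    if match is None:
        return None
    return EvaluatorRef(match["registry"], match["model"], int(match["version"]))
```

Passing the value of GroundednessEvaluator.id from the installed SDK to parse_evaluator_id would show the version it resolves to, provided the built-in IDs follow the same layout.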

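Solution

Developers can resolve this by explicitly specifying the latest version of each evaluator. The recommended version for each evaluator is:

Similarity evaluator: version 4
Groundedness evaluator: version 6
Relevance evaluator: version 6
Violence evaluator: version 4
GroundednessPro evaluator: version 2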
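
To avoid hand-typing versioned IDs, the recommended versions can be kept in one place and the IDs built from them. This is a minimal sketch, not part of the SDK: versioned_evaluator_id and RECOMMENDED_VERSIONS are hypothetical names, and the table only includes the two registry model names that appear in this article (Similarity-Evaluator and Groundedness-Evaluator). The registry names of the other evaluators are not given here and would need to be looked up before adding them.

```python
from typing import Optional

REGISTRY = "azureml"

# Recommended versions taken from the list above. Only registry model names
# that appear in this article are included; others must be confirmed first.
RECOMMENDED_VERSIONS = {
    "Similarity-Evaluator": 4,
    "Groundedness-Evaluator": 6,
}


def versioned_evaluator_id(
    model_name: str,
    version: Optional[int] = None,
    registry: str = REGISTRY,
) -> str:
    """Build an evaluator ID pinned to an explicit version.

    If no version is passed, the recommended one from RECOMMENDED_VERSIONS
    is used; unknown model names raise KeyError instead of guessing.
    """
    if version is None:
        version = RECOMMENDED_VERSIONS[model_name]
    if version < 1:
        raise ValueError("version must be a positive integer")
    return f"azureml://registries/{registry}/models/{model_name}/versions/{version}"
```

For example, versioned_evaluator_id("Groundedness-Evaluator") returns the version 6 ID, which can then be passed as the id argument of an evaluator configuration.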

Implementation Example

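# Import path as in the azure-ai-projects package.
from azure.ai.projects.models import EvaluatorConfiguration

# model_config is assumed to be defined earlier; it is the model
# configuration passed to the AI-assisted evaluators.
evaluators = {
    "Similarity": EvaluatorConfiguration(
        id="azureml://registries/azureml/models/Similarity-Evaluator/versions/4",
        init_params={"model_config": model_config},
    ),
    "Groundedness": EvaluatorConfiguration(
        id="azureml://registries/azureml/models/Groundedness-Evaluator/versions/6",
        init_params={"model_config": model_config},
    ),
    # Other evaluator configurations...
}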