Resolving the ExecuteSQLQueryNotUsed exception caused by the direct_sql setting in pandas-ai

2025-05-11 01:31:05 Author: 胡唯隽

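When using the pandas-ai library for data analysis with direct_sql: true set in the configuration, the system requires that SQL queries be executed through the execute_sql_query function. This design is intended to keep SQL queries safe and controllable, but it also brings a few usage points to watch for.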

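Symptoms

When direct_sql: true is configured in pandas-ai but the SQL query is not executed through the execute_sql_query function, the system raises an ExecuteSQLQueryNotUsed exception. The error message states the cause directly: "For Direct SQL set to true, execute_sql_query function must be used".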
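To make the error condition concrete, here is a minimal sketch of the kind of check that could decide whether a piece of analysis code calls execute_sql_query. It is an illustration only and not pandas-ai's actual implementation; the helper name calls_function and the two sample snippets are assumptions made up for this example.

```python
import ast


def calls_function(source: str, func_name: str = "execute_sql_query") -> bool:
    """Return True if `source` contains a call to a function named `func_name`.

    Hypothetical helper for illustration; not part of pandas-ai.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            target = node.func
            # Match bare calls such as f(...)
            if isinstance(target, ast.Name) and target.id == func_name:
                return True
            # Match attribute calls such as obj.f(...)
            if isinstance(target, ast.Attribute) and target.attr == func_name:
                return True
    return False


# Sample snippets (made up): one uses the required function, one does not
good = 'df = execute_sql_query("SELECT * FROM sample_table")'
bad = "df = dfs[0].head()"

print(calls_function(good))
print(calls_function(bad))
```

Code resembling the second snippet, which reads data without going through execute_sql_query, is the situation the error message describes.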

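Root cause

In direct_sql mode, pandas-ai requires the dedicated execute_sql_query function. The requirement appears to serve several purposes:

Security control: ensures every SQL query passes through a safety check
Performance: manages database connections and query execution in one place
Extensibility: reserves a single interface for future features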

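Solution

To resolve the problem, follow these steps:

Make sure direct_sql: true is set correctly in the configuration
Execute all SQL queries through the execute_sql_query function
Run a safety check on each SQL query

Example code: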

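# Imports assume the pandasai 1.x/2.x module layout; adjust for your installed version
from pandasai import SmartDatalake
from pandasai.connectors import MySQLConnector

# Create the MySQL connector
mysql_connector = MySQLConnector(
    config={
        "host": "localhost",
        "port": 3306,
        "database": "test_db",
        "username": "user",
        "password": "password",
        "table": "sample_table"
    }
)

# Create the SmartDatalake instance
smart_df = SmartDatalake(
    [mysql_connector],
    config={
        "direct_sql": True,
        # other configuration options...
    }
)

# Run the SQL query
# Note: whether execute_sql_query can be called directly on the instance may
# depend on the installed pandas-ai version; check your version's API.
try:
    result = smart_df.execute_sql_query("SELECT * FROM sample_table")
    print(result)
except Exception as e:
    print(f"Query execution failed: {e}")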
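
The solution steps recommend a safety check on each SQL query but do not say what it should look like. One possible sketch, assuming the goal is to allow only a single read-only statement, is shown below. The function name is_read_only_select and the keyword list are assumptions for illustration and are not part of pandas-ai. The check is deliberately conservative: it also rejects harmless queries that contain a forbidden keyword inside a string literal.

```python
import re

# Keywords that indicate a write or schema change (illustrative, not exhaustive)
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|create|grant|revoke)\b",
    re.IGNORECASE,
)


def is_read_only_select(sql: str) -> bool:
    """Return True only for a single SELECT/WITH statement without write keywords."""
    # Drop surrounding whitespace and any trailing semicolons
    statement = sql.strip().rstrip(";").strip()
    # A remaining semicolon means more than one statement was supplied
    if ";" in statement:
        return False
    # The statement must start with SELECT or WITH (for common table expressions)
    if not re.match(r"(?i)^(select|with)\b", statement):
        return False
    # Reject any write or DDL keyword appearing as a whole word
    return FORBIDDEN.search(statement) is None


query = "SELECT * FROM sample_table"
if is_read_only_select(query):
    print("query accepted")
else:
    print("query rejected")
```

A keyword filter like this is only a first line of defense. Connecting with a database account that has read-only privileges is the more robust safeguard.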