A Practical Guide to Configuring a Custom Local Gateway in KServe

2025-06-15 19:45:02  Author: 凤尚柏Louis

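Background

When deploying model services with KServe, many teams need to replace the default local gateway with a custom one. Based on a real case, this article walks through the problems that can arise when KServe is configured with a non-default local gateway, and how to resolve them.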

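Typical Symptoms

When the KServe configuration and the custom gateway configuration do not match, the following symptoms typically appear:

Requests to the inference service URL fail with a 504 Gateway Timeout error
Requests sent directly to the service's private endpoint work normally
Logs show activator requests timing out
Gateway logs contain route-not-found errors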

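Root Cause Analysis

These problems occur mainly because KServe's configuration spans several layers that must agree with one another:

The Knative Serving configuration must define the local gateway correctly
The KServe configuration must be consistent with the Knative configuration
The Istio Gateway must have the correct host routing rules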
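
The consistency requirement above can be checked mechanically. The following is a minimal sketch, not part of KServe or Knative: the function name check_local_gateway is hypothetical, and it assumes the three configurations have already been loaded into Python values (the Knative local-gateways entry and the Istio Gateway as dicts, the KServe ingress setting as its raw JSON string). The sample values are the ones used in this article.

```python
import json


def check_local_gateway(knative_entry, kserve_ingress_json, istio_gateway):
    """Return a list of mismatch descriptions; an empty list means the layers agree.

    Hypothetical helper for illustration only, not an API of any of the projects.
    """
    problems = []
    kserve = json.loads(kserve_ingress_json)

    # KServe refers to the gateway as "<namespace>/<name>".
    expected_ref = f"{knative_entry['namespace']}/{knative_entry['name']}"

    if kserve.get("localGateway") != expected_ref:
        problems.append(
            f"KServe localGateway is {kserve.get('localGateway')!r}, "
            f"but Knative declares {expected_ref!r}"
        )

    if kserve.get("localGatewayService") != knative_entry["service"]:
        problems.append(
            f"KServe localGatewayService is {kserve.get('localGatewayService')!r}, "
            f"but Knative declares {knative_entry['service']!r}"
        )

    meta = istio_gateway["metadata"]
    gateway_ref = f"{meta['namespace']}/{meta['name']}"
    if gateway_ref != expected_ref:
        problems.append(
            f"Istio Gateway is {gateway_ref!r}, but Knative declares {expected_ref!r}"
        )

    return problems


# Values taken from the configuration shown in this article.
knative_entry = {
    "name": "custom-local-gateway",
    "namespace": "istio-system",
    "service": "custom-local-gateway.istio-system.svc.cluster.local",
}
kserve_ingress = json.dumps({
    "ingressGateway": "knative-serving/knative-ingress-gateway",
    "localGateway": "istio-system/custom-local-gateway",
    "localGatewayService": "custom-local-gateway.istio-system.svc.cluster.local",
    "ingressClassName": "istio",
})
istio_gateway = {
    "metadata": {"name": "custom-local-gateway", "namespace": "istio-system"},
}

for problem in check_local_gateway(knative_entry, kserve_ingress, istio_gateway):
    print(problem)
```

With the values from this article the check reports nothing. If any one layer still points at the default gateway, the corresponding mismatch is printed.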

Complete Solution

1. Knative Serving Configuration

In the KnativeServing custom resource, configure the local gateway details correctly:

apiVersion: operator.knative.dev/v1beta1
kind: KnativeServing
metadata:
  name: knative-serving
  namespace: knative-serving
spec:
  config:
    istio:
      local-gateways: |
        - name: custom-local-gateway
          namespace: istio-system
          service: custom-local-gateway.istio-system.svc.cluster.local
  ingress:
    istio:
      enabled: true
      knative-local-gateway:
        selector:
          app: custom-local-gateway
          istio: custom-local-gateway

2. KServe Configuration

In KServe's inferenceservice-config ConfigMap, update the gateway settings to match:

apiVersion: v1
kind: ConfigMap
metadata:
  name: inferenceservice-config
  namespace: kserve
data:
  ingress: |-
    {
        "ingressGateway": "knative-serving/knative-ingress-gateway",
        "localGateway": "istio-system/custom-local-gateway",
        "localGatewayService": "custom-local-gateway.istio-system.svc.cluster.local",
        "ingressClassName": "istio"
    }

3. Istio Gateway Configuration

Make sure the Istio Gateway includes the required host matching rules:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: custom-local-gateway
  namespace: istio-system
spec:
  selector:
    app: custom-local-gateway
  servers:
  - hosts:
    - "*.online-serving"
    - "*.online-serving.svc"
    - "*.online-serving.svc.cluster.local"
    port:
      name: http
      number: 80
      protocol: HTTP
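
To see why the hosts list matters, the sketch below approximates how a wildcard host entry decides whether a request host is accepted. It is a simplification using Python's fnmatch, not Istio's or Envoy's actual matching code. The function name gateway_accepts and the sample host names are hypothetical, and it assumes the inference services live in a namespace called online-serving, which the host patterns suggest but the configuration does not state.

```python
from fnmatch import fnmatchcase

# Host patterns copied from the Gateway definition in this article.
GATEWAY_HOSTS = [
    "*.online-serving",
    "*.online-serving.svc",
    "*.online-serving.svc.cluster.local",
]


def gateway_accepts(host, gateway_hosts=GATEWAY_HOSTS):
    """Return True if the host matches at least one wildcard pattern.

    Approximation: "*" is treated as "any prefix", so a pattern such as
    "*.online-serving" accepts every host that ends in ".online-serving".
    """
    host = host.lower()
    return any(fnmatchcase(host, pattern.lower()) for pattern in gateway_hosts)


# Hypothetical cluster-local host names for illustration.
print(gateway_accepts("my-model-predictor.online-serving.svc.cluster.local"))
print(gateway_accepts("my-model-predictor.other-namespace.svc.cluster.local"))
```

A host in a namespace that no pattern covers is not accepted by this Gateway, which is consistent with the route-not-found errors described under the symptoms.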