RustFS Monitoring in 5 Minutes: A Hands-On Guide to Prometheus and Grafana

2026-02-04 04:12:50 Author: 翟萌耘Ralph

🚀2.3x faster than MinIO for 4KB object payloads. RustFS is an open-source, S3-compatible high-performance object storage system supporting migration and coexistence with other S3-compatible platforms such as MinIO and Ceph.

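Project repository: https://gitcode.com/GitHub_Trending/rus/rustfs

Has a storage cluster failure ever interrupted your business? Have you lacked solid data when tracking down a performance problem? This guide walks you through integrating RustFS with Prometheus and Grafana in about five minutes, so you can track the health of your distributed storage cluster in real time and get early warning of potential risks.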

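Preparation

Requirements

A server with at least 2 CPU cores and 4 GB of RAM
Docker and Docker Compose
A deployed RustFS instance, v1.0 or later

Monitoring configuration

Enable monitoring by setting the following in your deployment configuration: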

# Enable the monitoring (obs) module
export RUSTFS_OBS_ENVIRONMENT=production
# Port on which metrics are exposed
export RUSTFS_METRICS_PORT=9090
# Metric sampling interval
export RUSTFS_METRICS_INTERVAL=5s

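Deployment architecture

The RustFS monitoring setup consists of three core components:

graph LR
    A[RustFS cluster] -->|exposes metrics| B[Prometheus]
    B -->|serves stored metrics| C[Grafana]
    C -->|dashboards| D[Administrator]

Data flow: RustFS nodes → Prometheus scraping → Grafana visualization
Protocol: HTTP/HTTPS
Default ports: RustFS metrics 9090, Prometheus 9091 (host port mapped to the container's 9090), Grafana 3000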

Prometheus configuration

1. Exposing metrics

RustFS exposes metrics in Prometheus format through its obs module. The default endpoint is:

http://<rustfs-node-ip>:9090/metrics
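Before pointing Prometheus at a node, it can help to confirm that the endpoint returns parseable samples. The following is a minimal sketch, assuming the endpoint serves the standard Prometheus text exposition format; the helper names `fetch_metrics` and `parse_metrics` are hypothetical and not part of RustFS.

```python
import urllib.request


def fetch_metrics(url, timeout=5):
    """Download the raw metrics text from a /metrics endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text):
    """Parse Prometheus text-format samples into {series: value}.

    Comment lines (# HELP / # TYPE) and blank lines are skipped.
    An optional trailing timestamp is ignored.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Series with labels: split after the closing brace so that
            # spaces inside label values do not break the parse.
            idx = line.rindex("}")
            series, rest = line[: idx + 1], line[idx + 1:].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        if rest:
            samples[series] = float(rest[0])
    return samples
```

For example, `parse_metrics(fetch_metrics("http://<rustfs-node-ip>:9090/metrics"))` returns a dictionary that can be checked for the metric names you expect.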

2. Deploying Prometheus

Create a basic configuration file, prometheus.yml:

global:
  scrape_interval: 15s
  
scrape_configs:
  - job_name: 'rustfs'
    static_configs:
      - targets: ['rustfs-node1:9090', 'rustfs-node2:9090']

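3. Starting the service

Deploy Prometheus quickly with Docker. Host port 9091 is mapped to the container's port 9090 so that it does not clash with the RustFS metrics port, and the bind mount uses an absolute path because older Docker releases reject relative host paths:

docker run -d -p 9091:9090 \
  -v "$(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml" \
  --name prometheus prom/prometheus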

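Grafana configuration

Data source setup

Log in to the Grafana console (http://localhost:3000)
Add a Prometheus data source:
- Name: RustFS
- URL: http://prometheus:9090 (when Grafana and Prometheus share a Docker network; otherwise use http://<prometheus-host>:9091)
- Save and test the connection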
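The same data source can also be created without the UI through Grafana's HTTP API (`POST /api/datasources`). The sketch below assumes basic authentication is enabled and uses Grafana's default `admin`/`admin` credentials purely as placeholders; the function names are hypothetical.

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url, user, password,
                             prometheus_url, name="RustFS"):
    """Build the POST request that registers a Prometheus data source."""
    payload = {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend proxies the queries
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


def create_datasource(grafana_url, user, password, prometheus_url):
    """Send the request and return Grafana's JSON reply."""
    req = build_datasource_request(grafana_url, user, password, prometheus_url)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

Calling `create_datasource("http://localhost:3000", "admin", "admin", "http://prometheus:9090")` would register the data source; change the credentials before using this outside a test setup.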

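Key metrics explained

Metric	Type	Normal range	Meaning
rustfs_object_count	Counter	No upper bound	Total number of stored objects
rustfs_disk_usage_bytes	Gauge	<85% of capacity	Disk space used, in bytes
rustfs_request_latency_ms	Histogram	P95<100ms	Request latency distribution
rustfs_node_health	Gauge	1=healthy, 0=unhealthy	Node health status
rustfs_replication_failure	Counter	0	Number of failed data replications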
Troubleshooting

Metric collection fails

Check that the RustFS monitoring module is serving metrics:

curl http://rustfs-node:9090/metrics

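Verify network connectivity from the Prometheus host to the RustFS node, since Prometheus is the side that initiates each scrape:

telnet rustfs-node 9090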
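Where telnet is not installed, the same TCP reachability test can be sketched in Python with the standard library. The function name `can_connect` is hypothetical; run it from the machine that needs to reach the target.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, `can_connect("rustfs-node", 9090)` run on the Prometheus host shows whether the scrape target is reachable at the TCP level; a `True` result with failing scrapes points toward the metrics endpoint itself rather than the network.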

Delayed data

Shorten the Prometheus scrape interval:

scrape_interval: 10s  # shortened from 15s to 10s

Also check the profiling configuration

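Advanced configuration

Alert rules

Configure a disk usage alert in Prometheus. Save the rule group in a rules file and reference that file under rule_files in prometheus.yml: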

groups:
- name: rustfs_alerts
  rules:
  - alert: HighDiskUsage
    expr: rustfs_disk_usage_bytes / rustfs_disk_total_bytes > 0.85
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "High disk usage"
      description: "Disk usage on node {{ $labels.instance }} exceeds 85%"
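The `for: 5m` clause means the expression must hold at every rule evaluation for five minutes before the alert fires; until then it is only pending. The following is a simplified model of that behavior, assuming one sample per rule evaluation and ignoring details such as staleness handling. The function name and sample layout are hypothetical.

```python
def alert_state(samples, threshold=0.85, hold_seconds=300):
    """Model the HighDiskUsage rule over a series of evaluations.

    samples: list of (unix_ts, used_bytes, total_bytes) in time order.
    Returns "inactive", "pending", or "firing" as of the last sample.
    """
    active_since = None
    for ts, used, total in samples:
        if total > 0 and used / total > threshold:
            # Condition holds: remember when it first became true.
            if active_since is None:
                active_since = ts
        else:
            # Any evaluation below the threshold resets the timer.
            active_since = None
    if active_since is None:
        return "inactive"
    last_ts = samples[-1][0]
    return "firing" if last_ts - active_since >= hold_seconds else "pending"
```

A brief dip below 85% resets the timer, which is why a short `for` window produces noisier alerts and a long one delays them.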