pentaho-kettle Command-Line Tools in Detail: Carte Server Configuration and Remote Execution

2026-02-05 05:03:46 Author: 侯霆垣

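What Is the Carte Server

Carte is the remote server component of Pentaho Data Integration (PDI). It exposes an HTTP API for executing and monitoring transformations and jobs, so data processing tasks can be managed over the network and PDI can be deployed as a distributed data integration architecture.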

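Quick Start: Launching the Carte Server

Basic startup command

Carte is started from the command line by passing the address to bind to and the port to listen on, for example port 8080 on localhost:

./carte.sh localhost 8080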

To start it from a configuration file instead:

./carte.sh carte-config.xml

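Startup options

-l: sets the log level, for example -l DEBUG for debug logging
-x: enables XML-formatted log output

Confirm that your PDI release accepts these options by checking the usage text that carte.sh prints when it is run without arguments.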

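Configuration File in Detail

Configuration file structure

The Carte server uses an XML configuration file. A typical structure looks like this: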

<?xml version="1.0" encoding="UTF-8"?>
<slave_config>
  <slaveserver>
    <name>Carte-Server</name>
    <hostname>localhost</hostname>
    <port>8080</port>
    <username>admin</username>
    <password>password</password>
    <master>N</master>
  </slaveserver>
  <max_log_lines>10000</max_log_lines>
  <max_log_timeout_minutes>1440</max_log_timeout_minutes>
  <object_timeout_minutes>1440</object_timeout_minutes>
</slave_config>

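Key configuration parameters

Parameter	Description	Value in the example
name	Server name	Carte-Server
hostname	Host name or IP address	localhost
port	Service port	8080
username	User name for authentication	admin
password	Password for authentication	password
max_log_lines	Maximum number of log lines kept	10000
max_log_timeout_minutes	Log timeout in minutes	1440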
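
When several Carte instances differ only in name, host and port, the configuration file can be generated instead of edited by hand. The following is a minimal sketch under that assumption. The helper name write_carte_config is hypothetical and not part of PDI; the element names are the ones used in the example configuration above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with PDI): writes a Carte configuration
# file using the element names from the example configuration above.
# Values are inserted as-is, so they must not contain XML special
# characters such as < or &.
# Usage: write_carte_config FILE NAME HOSTNAME PORT USERNAME PASSWORD
write_carte_config() {
  cat > "$1" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<slave_config>
  <slaveserver>
    <name>$2</name>
    <hostname>$3</hostname>
    <port>$4</port>
    <username>$5</username>
    <password>$6</password>
    <master>N</master>
  </slaveserver>
  <max_log_lines>10000</max_log_lines>
  <max_log_timeout_minutes>1440</max_log_timeout_minutes>
  <object_timeout_minutes>1440</object_timeout_minutes>
</slave_config>
EOF
}

# Example:
#   write_carte_config carte-config.xml Carte-Server localhost 8080 admin password
#   ./carte.sh carte-config.xml
```

Keeping the three timeout and log settings fixed is a simplification for this sketch; they could be turned into further arguments in the same way.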

Server Management API

Base URL and authentication

The base URL of the Carte API has the following form:

http://{hostname}:{port}/kettle

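By default the API uses HTTP Basic authentication, so pass the credentials with curl's -u option. Most of the later examples omit -u for brevity; add it to each request when authentication is enabled: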

curl -u username:password "http://localhost:8080/kettle/status?xml=Y"
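
Because every request shares the same base URL and credentials, it can help to wrap them once. The sketch below assumes two helper functions, carte_url and carte_get, and four CARTE_* environment variables; all of these names are this example's own and are not defined by PDI.

```shell
#!/bin/sh
# Connection settings. The defaults mirror the values used in this article
# and can be overridden from the environment. The variable names are
# assumptions of this sketch.
CARTE_HOST="${CARTE_HOST:-localhost}"
CARTE_PORT="${CARTE_PORT:-8080}"
CARTE_USER="${CARTE_USER:-username}"
CARTE_PASS="${CARTE_PASS:-password}"

# Print the full URL for an endpoint path such as "status?xml=Y".
carte_url() {
  printf 'http://%s:%s/kettle/%s\n' "$CARTE_HOST" "$CARTE_PORT" "$1"
}

# Send an authenticated GET request and print the response body.
# --fail makes curl exit non-zero on HTTP errors such as 401.
carte_get() {
  curl --silent --show-error --fail \
       --user "$CARTE_USER:$CARTE_PASS" \
       "$(carte_url "$1")"
}

# Example (needs a running Carte server):
#   carte_get 'status?xml=Y'
```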

Getting the server status

Use the following commands to check whether the Carte server is running:

# Get the status as XML
curl "http://localhost:8080/kettle/status?xml=Y"

# Get the status as HTML
curl "http://localhost:8080/kettle/status"

The response includes:

Server memory usage
CPU information
Running transformations and jobs
System information
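
To use individual values from the XML status in a script, a small text filter is often enough. The sketch below assumes a helper named xml_value (hypothetical) and uses statusdesc and memory_free as element names; those names are assumptions about the response shape and should be checked against the output of your own server. The filter matches text and is not a real XML parser.

```shell
#!/bin/sh
# Print the text of the first <TAG>...</TAG> element found on stdin.
# Plain text matching only: the element must sit on one line and have
# no attributes and no nested elements.
# Usage: some-command | xml_value TAG
xml_value() {
  grep -o "<$1>[^<]*</$1>" | head -n 1 | sed 's/<[^>]*>//g'
}

# Example (needs a running Carte server; the element name is an assumption):
#   curl -s -u username:password "http://localhost:8080/kettle/status?xml=Y" \
#     | xml_value statusdesc
```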

Shutting down the server

Shut down the Carte server gracefully through the API:

curl "http://localhost:8080/kettle/stopCarte?xml=Y"

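Transformation Management

Registering a transformation

registerTrans is a POST endpoint: the transformation configuration is sent as XML in the request body rather than looked up by name. In this example my-transformation-config.xml is a placeholder for a file holding that XML:

curl -X POST -H "Content-Type: text/xml" --data-binary @my-transformation-config.xml "http://localhost:8080/kettle/registerTrans/?xml=Y"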

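Executing a transformation

Synchronous execution

executeTrans loads the transformation from a repository and returns once it has finished. The repository name, credentials and path below are placeholders:

curl "http://localhost:8080/kettle/executeTrans/?rep=my-repository&user=my-user&pass=my-password&trans=/path/to/my-transformation"

Asynchronous execution

A transformation that is already registered on the server is started by name with startTrans, which returns without waiting for it to finish:

curl "http://localhost:8080/kettle/startTrans?name=my-transformation&xml=Y"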

Monitoring transformation status

curl "http://localhost:8080/kettle/transStatus?name=my-transformation&xml=Y"

The response includes:

Execution status
Step status
Log output
Performance metrics
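
A transformation that was started asynchronously is usually monitored by polling transStatus until it reaches a final state. The following is a minimal sketch of such a loop. The function name wait_for_trans is hypothetical, and the sketch assumes that the XML response carries the state in a status_desc element with values such as Running or Finished; verify both against the output of your server.

```shell
#!/bin/sh
# Poll a status command until the transformation leaves the running state.
# $1: name of a command or shell function that prints the transStatus XML
# $2: maximum number of polls
# $3: seconds to sleep between polls
# Returns 0 on "Finished", 1 on any other final state, 2 on timeout.
# Assumption: the state is reported in a <status_desc> element.
wait_for_trans() {
  wft_attempt=0
  while [ "$wft_attempt" -lt "$2" ]; do
    wft_state=$("$1" | grep -o '<status_desc>[^<]*</status_desc>' \
                     | head -n 1 | sed 's/<[^>]*>//g')
    case "$wft_state" in
      Finished) return 0 ;;
      Running|Waiting|Initializing|Preparing*|Paused|Halting|"") ;;  # keep polling
      *) echo "transformation ended in state: $wft_state" >&2
         return 1 ;;
    esac
    wft_attempt=$((wft_attempt + 1))
    sleep "$3"
  done
  return 2
}

# Example (needs a running Carte server):
#   fetch_status() {
#     curl -s -u username:password \
#       "http://localhost:8080/kettle/transStatus?name=my-transformation&xml=Y"
#   }
#   wait_for_trans fetch_status 60 5
```

Passing the status command in as an argument keeps the loop independent of how the request is made, which also makes it easy to try out with a stub function.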

Stopping a transformation

curl "http://localhost:8080/kettle/stopTrans?name=my-transformation&xml=Y"

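Job Management

Registering a job

registerJob is a POST endpoint: the job configuration is sent as XML in the request body. Here my-job-config.xml is a placeholder file name:

curl -X POST -H "Content-Type: text/xml" --data-binary @my-job-config.xml "http://localhost:8080/kettle/registerJob/?xml=Y"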

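Executing a job

# Execute a job loaded from a repository (repository name, credentials and path are placeholders)
curl "http://localhost:8080/kettle/executeJob/?rep=my-repository&user=my-user&pass=my-password&job=/path/to/my-job"

# Start a job that is already registered on the server; the call returns without waiting for the job to finish
curl "http://localhost:8080/kettle/startJob?name=my-job&xml=Y"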

Monitoring job status

curl "http://localhost:8080/kettle/jobStatus?name=my-job&xml=Y"

Utility APIs

Getting server properties

curl "http://localhost:8080/kettle/properties?xml=Y"

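Health check

A simple health check requests the status page and prints only the HTTP status code:

curl -s -o /dev/null -w "%{http_code}\n" -u username:password "http://localhost:8080/kettle/status"

A status code of 200 indicates that the Carte server is up and accepting authenticated requests.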
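
For use in monitoring scripts, the check can be turned into a command whose exit status reflects the result. The sketch below assumes three helper functions, is_healthy_code, carte_http_code and carte_is_up, whose names are this example's own.

```shell
#!/bin/sh
# Succeed only for HTTP 200, the code the health check looks for.
is_healthy_code() {
  [ "$1" = "200" ]
}

# Request the status page and print only the HTTP status code.
# curl prints 000 when no HTTP response was received at all.
# $1: base URL such as http://localhost:8080
# $2: credentials in the form user:password
carte_http_code() {
  curl --silent --output /dev/null --write-out '%{http_code}' \
       --max-time 5 --user "$2" "$1/kettle/status"
}

# Exit status 0 when Carte answers with 200, non-zero otherwise.
carte_is_up() {
  is_healthy_code "$(carte_http_code "$1" "$2")"
}

# Example (needs a running Carte server):
#   if carte_is_up http://localhost:8080 username:password; then
#     echo "Carte is up"
#   fi
```

The five-second timeout is an arbitrary choice for this sketch; a slow or heavily loaded server may need a longer one.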

Security Configuration

Enabling authentication

Set the user name and password in the configuration file:

<slaveserver>
  <name>slave-server-name</name>
  <hostname>localhost</hostname>
  <port>8080</port>
  <username>admin</username>
  <password>password</password>
</slaveserver>

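Security best practices

Always enable authentication
Use HTTPS to encrypt traffic in production
Restrict access to the Carte port with a firewall
Change passwords regularly
Monitor server resource usage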
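
One practical way to follow these practices on the client side is to keep the password out of the curl command line, where it would be visible in the process list. The sketch below writes a netrc file that only the current user can read and lets curl pick the credentials up with --netrc-file. The function name write_carte_netrc and the file location are assumptions of this example.

```shell
#!/bin/sh
# Write a netrc file readable only by the current user, so the password
# does not have to appear on the curl command line.
# The netrc format is whitespace-separated, so the user name and password
# must not contain spaces.
# Usage: write_carte_netrc FILE HOST USER PASSWORD
write_carte_netrc() {
  ( umask 077; : > "$1" ) || return 1   # create or truncate the file
  chmod 600 "$1" || return 1            # tighten permissions if it existed
  printf 'machine %s login %s password %s\n' "$2" "$3" "$4" > "$1"
}

# Example (needs a running Carte server):
#   write_carte_netrc "$HOME/.carte-netrc" localhost admin password
#   curl --netrc-file "$HOME/.carte-netrc" "http://localhost:8080/kettle/status?xml=Y"
```

This protects the password on the client only; without HTTPS, HTTP Basic credentials still cross the network in an easily decoded form.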

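Cluster Management

Registering a slave server

registerSlave is a POST endpoint that expects an XML description of the slave server in the request body. Here slave-server.xml is a placeholder file name:

curl -X POST -H "Content-Type: text/xml" --data-binary @slave-server.xml "http://localhost:8080/kettle/registerSlave/?xml=Y"

Listing slave servers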

curl "http://localhost:8080/kettle/getSlaves?xml=Y"

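Troubleshooting

Common problems

Connection refused: check that Carte is running and that the port is reachable
Authentication failed: verify the configured user name and password
Transformation not found: make sure the transformation was registered under the name you are using
Memory problems: monitor server resources and adjust the JVM settings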
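
The first two problems can be told apart automatically from curl's exit status and the HTTP status code. The following sketch maps both to a hint; the function name carte_hint and the wording of the hints are this example's own. It relies on curl's documented exit codes 6 (could not resolve host), 7 (failed to connect) and 28 (timeout) and on HTTP status 401 for failed authentication.

```shell
#!/bin/sh
# Map a curl exit status and an HTTP status code to a troubleshooting hint.
# $1: curl exit status
# $2: HTTP status code ("000" if no response was received)
carte_hint() {
  case "$1:$2" in
    0:200) echo "ok" ;;
    6:*)   echo "host name could not be resolved: check the hostname" ;;
    7:*)   echo "connection refused: check that Carte is running and the port is reachable" ;;
    28:*)  echo "request timed out: check firewall rules and server load" ;;
    *:401) echo "authentication failed: check user name and password" ;;
    *)     echo "unexpected result (curl exit $1, HTTP $2): check the Carte log" ;;
  esac
}

# Example (needs curl; the exit status of the assignment is curl's):
#   code=$(curl -s -o /dev/null -w '%{http_code}' -u username:password \
#            "http://localhost:8080/kettle/status")
#   carte_hint "$?" "$code"
```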

Enabling debug logging

./carte.sh carte-config.xml -l DEBUG

Complete Workflow Example

The following walks through a complete data processing task:

Start the Carte server

./carte.sh carte-config.xml

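Register and execute the transformation

# Register the transformation (data-ETL-config.xml is a placeholder for the transformation configuration XML)
curl -X POST -H "Content-Type: text/xml" --data-binary @data-ETL-config.xml "http://localhost:8080/kettle/registerTrans/?xml=Y"

# Start the registered transformation
curl "http://localhost:8080/kettle/startTrans?name=data-ETL&xml=Y"

# Check its status
curl "http://localhost:8080/kettle/transStatus?name=data-ETL&xml=Y"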

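Run the follow-up job

# Register the job (report-generation-config.xml is a placeholder for the job configuration XML)
curl -X POST -H "Content-Type: text/xml" --data-binary @report-generation-config.xml "http://localhost:8080/kettle/registerJob/?xml=Y"

# Start the registered job
curl "http://localhost:8080/kettle/startJob?name=report-generation&xml=Y"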

Check the job status

curl "http://localhost:8080/kettle/jobStatus?name=report-generation&xml=Y"

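With these APIs and commands you can build automated data integration pipelines that run remotely, on a schedule, or in batches. Carte gives pentaho-kettle its distributed execution capability and is a key component of an enterprise data integration platform.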

pentaho-kettle

Pentaho Data Integration ( ETL ) a.k.a Kettle

Project address: https://gitcode.com/gh_mirrors/pe/pentaho-kettle
