Implementing an ItemReader for MongoDB Aggregation Queries in Spring Batch

2025-06-28 21:14:57 Author: 侯霆垣

Background

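Spring Batch is an enterprise batch-processing framework with strong support for large data volumes. Its MongoDB integration has traditionally relied on MongoItemReader, which covers basic queries. Real business scenarios, however, often call for more complex aggregation operations, and that is exactly what the standard MongoItemReader lacks.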

Analysis of the Existing Approach

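The MongoItemReader shipped with Spring Batch reads data using simple query criteria, so it cannot cover the following scenarios:

Joins across collections ($lookup)
Complex document reshaping ($project)
Grouping and statistics ($group)
Conditional filtering ($match)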

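These operations are precisely the core capabilities of the MongoDB aggregation framework. The community raised this requirement as early as 2020, but no official implementation has been provided so far.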

Implementing a Custom Aggregation ItemReader

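Driven by real project needs, we can add aggregation support by extending MongoItemReader. The core idea is:

Extend the MongoItemReader base class
Inject a MongoTemplate and an Aggregation object
Implement the paging logic
Map the aggregation results to the target type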

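import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import org.springframework.batch.item.data.MongoItemReader;
import org.springframework.data.mongodb.core.MongoOperations;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationOperation;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;

public class AggregationMongoItemReader<T> extends MongoItemReader<T> {
    private final MongoOperations mongoTemplate;
    private final Aggregation aggregation;
    private final Class<T> classType;
    private final String collection;

    public AggregationMongoItemReader(MongoOperations mongoTemplate, Aggregation aggregation,
            Class<T> classType, String collection) {
        this.mongoTemplate = mongoTemplate;
        this.aggregation = aggregation;
        this.classType = classType;
        this.collection = collection;
        setPageSize(5); // same default page size as before; override via setPageSize(...)
    }

    // The parent's checks require a query and a sort, which this reader does not use.
    @Override
    public void afterPropertiesSet() { }

    @Override
    protected Iterator<T> doPageRead() {
        // 'page' and 'pageSize' are inherited; the base class advances and restores 'page'
        // itself (also on restart), so no separate page counter is kept here.
        long skip = (long) page * pageSize;
        // Page a copy of the pipeline. It should contain a $sort stage so that pages are stable.
        List<AggregationOperation> stages = new ArrayList<>(aggregation.getPipeline().getOperations());
        stages.add(Aggregation.skip(skip));
        stages.add(Aggregation.limit(pageSize));
        Aggregation pagedAggregation = Aggregation.newAggregation(stages);
        AggregationResults<T> results = mongoTemplate.aggregate(pagedAggregation, collection, classType);
        return results.getMappedResults().iterator();
    }
}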
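The paging contract that doPageRead relies on can also be looked at without Spring or MongoDB. The following is a minimal sketch, not the framework's code: PagedReaderSketch, its pageFetcher function and readAll are hypothetical names, and an in-memory list stands in for the collection. It assumes, as the paginated base class does, that reading stops at the first empty page.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiFunction;

/** Hypothetical, framework-free stand-in for a paginated reader (not a Spring Batch class). */
final class PagedReaderSketch<T> {
    // (skip, limit) -> one page; plays the role of the aggregate call with $skip/$limit appended.
    private final BiFunction<Long, Integer, List<T>> pageFetcher;
    private final int pageSize;
    private int page = 0;
    private Iterator<T> current = Collections.emptyIterator();

    PagedReaderSketch(BiFunction<Long, Integer, List<T>> pageFetcher, int pageSize) {
        this.pageFetcher = pageFetcher;
        this.pageSize = pageSize;
    }

    /** Returns the next item, or null once a fetched page comes back empty. */
    T read() {
        if (!current.hasNext()) {
            long skip = (long) page * pageSize; // widen before multiplying to avoid int overflow
            current = pageFetcher.apply(skip, pageSize).iterator();
            page++;
        }
        return current.hasNext() ? current.next() : null;
    }

    /** Drains an in-memory list page by page; subList emulates skip and limit. */
    static <T> List<T> readAll(List<T> source, int pageSize) {
        PagedReaderSketch<T> reader = new PagedReaderSketch<T>((skip, limit) -> {
            int from = (int) Math.min(skip, source.size());
            int to = Math.min(from + limit, source.size());
            return source.subList(from, to);
        }, pageSize);
        List<T> out = new ArrayList<>();
        for (T item = reader.read(); item != null; item = reader.read()) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(readAll(List.of(1, 2, 3, 4, 5, 6, 7), 3));
    }
}
```

Because every page is computed from an offset, the underlying result order has to be deterministic. With a real pipeline this usually means sorting on a unique field before the skip and limit stages.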

A Real-World Example

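In a financial certification scenario, we need to join two collections, one holding simulation data and one holding certification data: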

// 'ledger' and 'certificationType' are filter values supplied by the surrounding code.
Aggregation aggregation = Aggregation.newAggregation(
    // Join each document with its certification (the result is an array field).
    Aggregation.lookup("certifications", "idCertification", "_id", "certification"),
    // Replace the joined array with its first element.
    Aggregation.addFields()
        .addField("certification")
        .withValueOf(ArrayOperators.ArrayElemAt.arrayOf("$certification").elementAt(0))
        .build(),
    Aggregation.match(Criteria.where("certification.ledger").is(ledger)
        .and("certification.certificationType").is(certificationType)),
    Aggregation.group("$idCertification")
        .sum(ConditionalOperators.Cond.when(/* condition */).then(1).otherwise(0))
        .as("ok")
        .count().as("total"),
    Aggregation.project("_id","ok","total","accounts")
);
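To make the semantics of the join, filter and group stages concrete, here is a small in-memory equivalent in plain Java. It is an illustration only: the Simulation and Certification classes, their fields and the passed flag are hypothetical, and the condition that the original elides is passed in as a Predicate rather than guessed. The final projection is left out.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** In-memory illustration of the pipeline's semantics; every type here is hypothetical. */
final class PipelineSemanticsSketch {

    static final class Certification {
        final String id, ledger, certificationType;
        Certification(String id, String ledger, String certificationType) {
            this.id = id;
            this.ledger = ledger;
            this.certificationType = certificationType;
        }
    }

    static final class Simulation {
        final String idCertification;
        final boolean passed; // hypothetical field, used only by the demo condition
        Simulation(String idCertification, boolean passed) {
            this.idCertification = idCertification;
            this.passed = passed;
        }
    }

    /** ok counts documents that satisfy the condition; total counts all documents in the group. */
    static final class Counts {
        int ok;
        int total;
    }

    static Map<String, Counts> run(List<Simulation> simulations,
                                   Map<String, Certification> certificationsById,
                                   String ledger, String certificationType,
                                   Predicate<Simulation> condition) {
        Map<String, Counts> byCertification = new LinkedHashMap<>();
        for (Simulation s : simulations) {
            // $lookup + first array element: the certification whose _id equals idCertification.
            Certification c = certificationsById.get(s.idCertification);
            // $match: keep only documents whose joined certification has the requested values.
            if (c == null || !ledger.equals(c.ledger)
                    || !certificationType.equals(c.certificationType)) {
                continue;
            }
            // $group by idCertification: conditional sum as "ok", plain count as "total".
            Counts counts = byCertification.computeIfAbsent(s.idCertification, k -> new Counts());
            if (condition.test(s)) {
                counts.ok++;
            }
            counts.total++;
        }
        return byCertification;
    }

    public static void main(String[] args) {
        Map<String, Certification> certs = new HashMap<>();
        certs.put("c1", new Certification("c1", "L1", "A"));
        List<Simulation> sims = List.of(new Simulation("c1", true), new Simulation("c1", false));
        Counts c1 = run(sims, certs, "L1", "A", s -> s.passed).get("c1");
        System.out.println("ok=" + c1.ok + ", total=" + c1.total);
    }
}
```

Passing the condition as a parameter mirrors the placeholder in the pipeline: whatever expression goes into Cond.when decides which documents add 1 to ok, while total always counts every document in the group.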