Deadlocks in Recursive Async Tasks with Cpp-TaskFlow: Problem and Solution

2025-05-21 09:25:51 Author: 胡唯隽

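Background

Code built on Cpp-TaskFlow often needs recursive async tasks: a task spawns a subtask while it runs, and that subtask may spawn further subtasks in turn. The pattern comes up naturally when processing tree structures or implementing divide-and-conquer algorithms.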

A Typical Deadlock Scenario

A simple example illustrates the problem:

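#include <taskflow/taskflow.hpp>

tf::Executor executor(2);  // executor with 2 worker threads

void bottom() {
    // leaf task body
}

void middle() {
    auto t = executor.async([]() { bottom(); });
    t.wait();  // blocks the worker thread that runs middle()
}

void top() {
    auto t = executor.async([]() { middle(); });
    t.wait();  // blocks the worker thread that runs top()
}

int main() {
    auto t = executor.async([]() { top(); });
    t.wait();  // the main thread is not a worker; this wait never returns
}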

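This program deadlocks because:

main() submits top() to the executor and waits on the main thread
top() runs on one worker, submits middle(), and blocks in wait()
middle() runs on the other worker, submits bottom(), and blocks in wait()
The executor has only 2 worker threads and both are now blocked, so no worker is free to run a third task
middle() waits for bottom() to finish, but bottom() is never scheduled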
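The worker arithmetic behind this list can be written down as a tiny model. This is only a sketch under a simplifying assumption (each task that blocks in wait() pins exactly one worker, and the innermost task needs one more); chain_deadlocks and its parameters are hypothetical names, not part of Taskflow.

```cpp
#include <cassert>

// Simplified model, not Taskflow code.
// workers:         number of worker threads in the executor
// blocking_levels: number of nested tasks that call wait() from a worker
// Returns true when the innermost task can never get a worker.
bool chain_deadlocks(int workers, int blocking_levels) {
    assert(workers > 0 && blocking_levels >= 0);
    // Every blocked task holds one worker; the leaf task needs one more.
    const int workers_needed = blocking_levels + 1;
    return workers < workers_needed;
}
```

In the example above, top() and middle() are the two blocking levels and bottom() is the leaf, so three workers are needed: chain_deadlocks(2, 2) is true, while chain_deadlocks(3, 2) is false. The wait() in main() does not count, because the main thread is not one of the executor's workers.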

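Root Cause

The core of the problem is the blocking behavior of std::future::wait(). Calling wait() inside a task blocks the current worker thread completely, so it can no longer take part in task scheduling. As a result:

The worker thread stays occupied without doing any work
Newly submitted tasks cannot be scheduled once every worker is blocked
The conditions for a deadlock are met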

Exploring Solutions

Attempt 1: Multiple Executors

The first idea was to create a separate Executor for each level of recursion:

void middle() {
    tf::Executor executor(1);  // each level gets its own executor
    executor.async([]() { bottom(); });
    executor.wait_for_all();
}

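This approach does avoid the deadlock, but it has drawbacks:

Wasted thread resources, since every level spawns its own threads
No global control over the total number of threads
Lower performance from repeatedly creating and destroying executors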

Attempt 2: corun_until

A more elegant solution is to use the executor's corun_until method:

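void middle() {
    auto t = executor.async([]() { bottom(); });
    // Run other tasks on this worker until t is ready.
    // The zero-timeout poll never blocks; it needs <chrono> and <future>.
    executor.corun_until([&t]() {
        return t.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
    });
}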

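The advantages of corun_until are:

The current worker thread is not blocked outright
It can execute other tasks while it waits
The thread pool stays efficiently utilized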
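To see why waiting cooperatively removes the deadlock, here is a minimal single-threaded sketch of the idea. It is not Taskflow's implementation: MiniExecutor, its Done flag, and run_chain are hypothetical names, and the calling thread plays the role of the only worker. Waiting means "run whatever is queued until my predicate becomes true" rather than "sleep".

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical single-threaded stand-in for an executor (not Taskflow code).
class MiniExecutor {
public:
    using Done = std::shared_ptr<bool>;  // set to true once the task has run

    // Queue fn and return a flag that reports its completion.
    Done async(std::function<void()> fn) {
        auto done = std::make_shared<bool>(false);
        queue_.push_back([fn = std::move(fn), done]() {
            fn();
            *done = true;
        });
        return done;
    }

    // Cooperative wait: run queued tasks on the caller until stop() is true.
    template <typename Predicate>
    void corun_until(Predicate&& stop) {
        while (!stop()) {
            if (queue_.empty()) {
                throw std::logic_error("no runnable task; predicate cannot become true");
            }
            auto job = std::move(queue_.front());
            queue_.pop_front();
            job();  // may itself enqueue tasks and wait for them
        }
    }

private:
    std::deque<std::function<void()>> queue_;
};

// Each level spawns the next level and waits for it cooperatively.
// finished records the order in which the levels complete.
void run_chain(MiniExecutor& ex, int depth, std::vector<int>& finished) {
    assert(depth >= 0);
    if (depth > 0) {
        auto done = ex.async([&ex, depth, &finished]() {
            run_chain(ex, depth - 1, finished);
        });
        ex.corun_until([&done]() { return *done; });
    }
    finished.push_back(depth);
}
```

Calling run_chain(ex, 3, finished) on a fresh MiniExecutor leaves finished holding 0, 1, 2, 3: three nested waits complete on a single thread, because each wait runs the pending child task instead of blocking. One consequence of this design is that nested tasks execute on the waiting thread's stack, so very deep chains cost stack space.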

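Final Solution

Combining the explorations above yields a general-purpose wait_for_task function. Because corun_until must be called from one of the executor's own worker threads, the function first checks which kind of thread it is running on: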

template <typename T>
void wait_for_task(tf::Executor &executor, std::future<T> &future) {
    if (executor.this_worker_id() >= 0) {
        // On a worker thread of this executor: wait cooperatively with corun_until
        executor.corun_until([&future]() {
            return future.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
        });
    } else {
        // Not on a worker thread: a plain blocking wait is safe
        future.wait();
    }
}

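This solution has the following properties:

Context detection: it automatically detects whether the caller is one of the executor's worker threads
Efficient scheduling: on a worker thread it waits cooperatively, running other tasks instead of blocking
Safe fallback: on a non-worker thread it falls back to a conventional blocking wait
General applicability: it works across different recursion depths and thread configurations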
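The environment check can be pictured with a small standalone sketch. It does not show how Taskflow tracks its workers; current_worker_id, WorkerScope, and choose_wait_strategy are hypothetical names that only model the convention used by the this_worker_id() check in wait_for_task, where a negative id means "not a worker".

```cpp
#include <cassert>

// Hypothetical per-thread worker id; -1 means "not a worker thread".
thread_local int current_worker_id = -1;

enum class WaitStrategy { Block, Cooperate };

// RAII guard that marks the current thread as worker `id` while in scope.
class WorkerScope {
public:
    explicit WorkerScope(int id) : previous_(current_worker_id) {
        assert(id >= 0);
        current_worker_id = id;
    }
    ~WorkerScope() { current_worker_id = previous_; }
    WorkerScope(const WorkerScope&) = delete;
    WorkerScope& operator=(const WorkerScope&) = delete;

private:
    int previous_;  // id to restore when the scope ends
};

// Same decision as in wait_for_task: cooperate on a worker, block elsewhere.
WaitStrategy choose_wait_strategy() {
    return current_worker_id >= 0 ? WaitStrategy::Cooperate
                                  : WaitStrategy::Block;
}
```

On a thread that never entered a WorkerScope, such as the main thread, choose_wait_strategy() returns WaitStrategy::Block; inside a WorkerScope it returns WaitStrategy::Cooperate. Keeping the decision in one helper means call sites do not need to know which kind of thread they run on.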

Practical Example

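// Uses the global executor and the wait_for_task helper defined above.
void recursive_task(int depth) {
    if (depth > 0) {
        // Spawn the next level and wait for it without blocking the worker
        auto t = executor.async([=]() { recursive_task(depth - 1); });
        wait_for_task(executor, t);
    } else {
        // base case
    }
}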