How to launch hundreds of asynchronous tasks in which the first two subtasks to succeed cancel the remaining subtasks?
gaochenyi opened this issue · comments
I need to compute a two-dimensional integral numerically on a grid. Because the integrand is rather exotic from the perspective of numerical integration, I need to use four numerical integration algorithms for it, and my experiments show that the relative speed of the four algorithms varies considerably from one (x, y) point to another: the fastest algorithm at point A may be the slowest at point B.
I have read the first eight steps of the Taskflow cookbook, but I still have not found a solution. Is this possible with Taskflow?
Basically, I would like to have the following task topology:
├── numerical integrations at point A
│ ├── algorithm 1
│ ├── algorithm 2
│ ├── algorithm 3
│ └── algorithm 4
├── numerical integrations at point B
│ ├── algorithm 1
│ ├── algorithm 2
│ ├── algorithm 3
│ └── algorithm 4
├── ...
All tasks (including subtasks) are asynchronous, and I only need results of the fastest two algorithms (for cross-validation) at each point.
Hi @gaochenyi, I don't fully understand your need. Could you please give some pseudocode for the algorithm? Have you also looked into dependent async?
Yes, the algorithm pseudocode is as follows. I hope it helps clarify the question.
double integral_0(double x);
double integral_1(double x);
double integral_2(double x);
double integral_3(double x);

void work_on_single_point(double x, double &res_a, double &res_b) {
    auto j0 = asynchronous_launch(integral_0(x));
    auto j1 = asynchronous_launch(integral_1(x));
    auto j2 = asynchronous_launch(integral_2(x));
    auto j3 = asynchronous_launch(integral_3(x));
    std::vector<Job> job_list = {j0, j1, j2, j3};

    // collect results
    std::vector<double> results;
    while (true) {
        for (job : job_list) {
            if (job.success) {
                results.push_back(job.result);
                remove_job_from_job_list(job, job_list);
                if (results.size() == 2) { // two fastest algorithms for this point have returned
                    cancel_other_two_jobs(); // cancel the other two still-running algorithms
                    res_a = results[0];
                    res_b = results[1];
                    return;
                }
            }
        }
    }
}

int main() {
    std::vector<double> grid_points = {-10, -10+0.01, /*...*/, 10};
    for (const double x : grid_points) {
        // presumably a task that calls work_on_single_point for this x
        asynchronous_launch(task_for_single_point(x));
    }
}
Hi @gaochenyi, since each Taskflow task is atomic, it's not possible to stop a running task in the middle of its execution. For your implementation, I think the best way is to build the cancellation into your own application-specific control flow. Does that make sense?
@tsung-wei-huang Thanks for replying.
I am a bit confused: do you mean I need to embed the application-specific control flow in each Taskflow task (i.e., in the callables supplied to Taskflow), or use the application-specific control flow to let Taskflow cancel its own tasks?
In other words, can I use Taskflow facilities to implement the application-specific control flow?
BTW, I have uploaded a visualization of my task topology to the first post.
Correct. When Taskflow runs a task, it just runs the function you gave it. The execution is atomic and cannot be preempted. What I mean by application-specific control flow is something like the following:
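[&]() {  // your algorithm task; num_finished_job is a shared atomic counter
    while (num_finished_job < 2) {  // stop early once two algorithms have finished
        // assumed to return true once this algorithm has produced its result
        if (make_some_progress_on_your_algorithm()) {
            num_finished_job++;
            break;
        }
    }
}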
Does this make sense to you?
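For readers who want something compilable, here is a minimal sketch of that cooperative-cancellation idea for a single grid point. It uses plain `std::thread` rather than Taskflow so that it stays self-contained. The names `PointState` and `integrate`, the chunked midpoint rule, and the step counts are all assumptions made for illustration; they are not part of Taskflow or of the original pseudocode. Each worker polls a shared atomic counter between chunks of work and gives up once two other algorithms have already finished.

```cpp
#include <array>
#include <atomic>
#include <cmath>
#include <functional>
#include <thread>
#include <vector>

// Shared state for one grid point (hypothetical helper type).
struct PointState {
    std::atomic<int> num_finished{0};  // number of algorithms that have completed
    std::array<double, 2> results{};   // slots for the two fastest results
};

// Hypothetical stand-in for one integration algorithm: midpoint-rule
// integration of cos(t) over [0, x]. The stop condition is polled once
// every `chunk` steps, so cancellation is cooperative, not preemptive.
void integrate(PointState& state, double x, long steps, long chunk) {
    const double h = x / static_cast<double>(steps);
    double sum = 0.0;
    for (long i = 0; i < steps; ++i) {
        // Give up once two algorithms have already delivered a result.
        if (i % chunk == 0 && state.num_finished.load() >= 2) return;
        sum += std::cos((static_cast<double>(i) + 0.5) * h) * h;
    }
    // Claim a slot; only the first two finishers store their result.
    const int slot = state.num_finished.fetch_add(1);
    if (slot < 2) state.results[slot] = sum;
}

// Runs four "algorithms" concurrently and returns the two results that
// arrived first. Here the algorithms differ only in step count, which is
// an assumption made to get different running times.
std::array<double, 2> work_on_single_point(double x) {
    PointState state;
    const long step_counts[4] = {20000, 40000, 4000000, 8000000};
    std::vector<std::thread> workers;
    for (long steps : step_counts) {
        workers.emplace_back(integrate, std::ref(state), x, steps, 1000L);
    }
    // Joining guarantees that both result slots are written before reading.
    for (auto& w : workers) w.join();
    return state.results;
}
```

Calling `work_on_single_point(1.0)` should return two values close to `sin(1.0)`. The same pattern should carry over to Taskflow by giving each grid point its own shared state and submitting the four workers as tasks; the slower tasks then return early instead of being cancelled from outside.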
Yes, now I know what to do next. Thanks for clarifying.