[API Proposal]: Making "Process asynchronous tasks as they complete" easy by using IAsyncEnumerable
Vijay-Nirmal opened this issue · comments
EDITED on 1/23/2024 by @stephentoub:
namespace System.Threading.Tasks;

public class Task
{
+    public static IAsyncEnumerable<Task> WhenEach(params Task[] tasks);
+    public static IAsyncEnumerable<Task> WhenEach(params ReadOnlySpan<Task> tasks); // params when possible
+    public static IAsyncEnumerable<Task> WhenEach(IEnumerable<Task> tasks);
+    public static IAsyncEnumerable<Task<TResult>> WhenEach(params Task<TResult>[] tasks); // params for now; move it to ReadOnlySpan overload when that syntax is possible
+    public static IAsyncEnumerable<Task<TResult>> WhenEach(params ReadOnlySpan<Task<TResult>> tasks); // params when possible
+    public static IAsyncEnumerable<Task<TResult>> WhenEach(IEnumerable<Task<TResult>> tasks);
}
Background and motivation
Currently, if we need to "Process asynchronous tasks as they complete", we have to write a lot of boilerplate code, and it's not straightforward. It looks something like this:
// Using currently available APIs
List<Task<int>> downloadTasks = downloadTasksQuery.ToList();
while (downloadTasks.Any())
{
    Task<int> finishedTask = await Task.WhenAny(downloadTasks);
    downloadTasks.Remove(finishedTask);
    Process(await finishedTask);
}
API Proposal
namespace System.Threading.Tasks
{
    public class Task : IAsyncResult, IDisposable
    {
        public static IAsyncEnumerable<Task> WhenEach(params Task[] tasks); // Please change the name, if needed
        public static IAsyncEnumerable<Task> WhenEach(IEnumerable<Task> tasks);
        public static IAsyncEnumerable<Task<TResult>> WhenEach(params Task<TResult>[] tasks);
        public static IAsyncEnumerable<Task<TResult>> WhenEach(IEnumerable<Task<TResult>> tasks);
    }
}
API Usage
// Using the newly proposed APIs
await foreach (var finishedTask in Task.WhenEach(downloadTasksQuery))
{
    Process(await finishedTask);
}
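For illustration only, here is a minimal sketch of how an operator with this shape could be layered on top of existing APIs. It is not the proposed implementation: `TaskSketch` and `WhenEachSketch` are hypothetical names, and the choice of `System.Threading.Channels` as the hand-off queue is an assumption made for brevity.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class TaskSketch
{
    // Hypothetical helper (not the proposed Task.WhenEach): yields the
    // original task objects in the order in which they complete.
    public static async IAsyncEnumerable<Task<T>> WhenEachSketch<T>(IEnumerable<Task<T>> tasks)
    {
        var list = new List<Task<T>>(tasks);
        var completed = Channel.CreateUnbounded<Task<T>>();

        // Each task pushes itself into the channel when it finishes,
        // regardless of whether it succeeded, faulted, or was canceled.
        foreach (Task<T> task in list)
        {
            _ = task.ContinueWith(
                t => { completed.Writer.TryWrite(t); },
                CancellationToken.None,
                TaskContinuationOptions.ExecuteSynchronously,
                TaskScheduler.Default);
        }

        // Exactly one item arrives per input task, so read that many.
        for (int i = 0; i < list.Count; i++)
        {
            yield return await completed.Reader.ReadAsync();
        }
    }
}
```

Unlike the `WhenAny` loop above, this registers one continuation per task up front instead of re-scanning the remaining tasks on every iteration.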
Alternative Designs
No response
Risks
No response
Updates
(Others can edit this section and add more info)
WhenEach name suggested by @theodorzoulias, #61959 (comment)
Tagging subscribers to this area: @dotnet/area-system-threading-tasks
See info in area-owners.md if you want to be subscribed.
Issue Details
API Proposal
namespace System.Threading.Tasks
{
    public class Task : IAsyncResult, IDisposable
    {
        public static IAsyncEnumerable<Task> WhenAnyAsEnumerable(params Task[] tasks); // Please change the name, this is not a good name for this method :)
        public static IAsyncEnumerable<Task> WhenAnyAsEnumerable(IEnumerable<Task> tasks);
        public static IAsyncEnumerable<Task<TResult>> WhenAnyAsEnumerable(params Task<TResult>[] tasks);
        public static IAsyncEnumerable<Task<TResult>> WhenAnyAsEnumerable(IEnumerable<Task<TResult>> tasks);
    }
}
API Usage
// Using the newly proposed APIs
await foreach (var finishedTask in Task.WhenAnyAsEnumerable(downloadTasksQuery))
{
    Process(await finishedTask);
}
| Author: | Vijay-Nirmal |
|---|---|
| Assignees: | - |
| Labels: | |
| Milestone: | - |
This method already exists in AsyncEx, where it's called OrderByCompletion. It returns a (non-async) collection of wrapper tasks, but I suspect that's because it's older than IAsyncEnumerable<T> and using IAsyncEnumerable<T> returning the original tasks is the better approach today.
@Vijay-Nirmal a name that might be more suitable for the Task.WhenAnyAsEnumerable method is Task.WhenEach. :-)
You can achieve the same behavior with TaskCompletionPipe, which has native support for IAsyncEnumerable<T> as well as the channel-like methods WaitToReadAsync and TryRead.
Looks good as proposed
namespace System.Threading.Tasks;

public class Task
{
    public static IAsyncEnumerable<Task> WhenEach(params Task[] tasks);
    public static IAsyncEnumerable<Task> WhenEach(params ReadOnlySpan<Task> tasks);
    public static IAsyncEnumerable<Task> WhenEach(IEnumerable<Task> tasks);
    public static IAsyncEnumerable<Task<TResult>> WhenEach(params Task<TResult>[] tasks);
    public static IAsyncEnumerable<Task<TResult>> WhenEach(params ReadOnlySpan<Task<TResult>> tasks);
    public static IAsyncEnumerable<Task<TResult>> WhenEach(IEnumerable<Task<TResult>> tasks);
}
Nit: missing <TResult>:
+    public static IAsyncEnumerable<Task<TResult>> WhenEach<TResult>(params Task<TResult>[] tasks); // params for now; move it to ReadOnlySpan overload when that syntax is possible
+    public static IAsyncEnumerable<Task<TResult>> WhenEach<TResult>(params ReadOnlySpan<Task<TResult>> tasks); // params when possible
+    public static IAsyncEnumerable<Task<TResult>> WhenEach<TResult>(IEnumerable<Task<TResult>> tasks);
> using IAsyncEnumerable<T> returning the original tasks is the better approach today.
I wonder what is the benefit here. Wouldn't IEnumerable be cheaper?
At first look I thought this is going to return the value instead of the task because we're awaiting that one on MoveNextAsync.
public static IAsyncEnumerable<TResult> WhenEach<TResult>(IEnumerable<Task<TResult>> tasks)
> I wonder what is the benefit here. Wouldn't IEnumerable be cheaper?
You'd block synchronously in MoveNext waiting for the next task to complete.
> At first look I thought this is going to return the value instead of the task because we're awaiting that one on MoveNextAsync.
It's returning the completed task, just like with WhenAny. That gives the consumer the ability to examine / use the completed Task however they like.
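As a rough illustration of that point, the sketch below inspects each completed task's state instead of awaiting it, so one faulted or canceled task does not end the loop. It assumes the `WhenEach<TResult>(IEnumerable<Task<TResult>>)` overload is available as proposed; `WhenEachUsage` and `DescribeAsync` are hypothetical names used only for this example.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class WhenEachUsage
{
    // Hypothetical helper: records the outcome of every task as it completes.
    // Assumes Task.WhenEach exists with the signature proposed in this issue.
    public static async Task<List<string>> DescribeAsync(IEnumerable<Task<int>> tasks)
    {
        var outcomes = new List<string>();
        await foreach (Task<int> finished in Task.WhenEach(tasks))
        {
            // The yielded task is already complete, so reading its state
            // (and Result, on success) does not block.
            if (finished.IsCompletedSuccessfully)
                outcomes.Add($"result {finished.Result}");
            else if (finished.IsFaulted)
                outcomes.Add($"failed: {finished.Exception!.InnerException!.Message}");
            else
                outcomes.Add("canceled");
        }
        return outcomes;
    }
}
```

Had the API yielded `TResult` values directly, the first failure would have surfaced as an exception from `MoveNextAsync` and the remaining outcomes would be lost to the caller.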
> You'd block synchronously in MoveNext waiting for the next task to complete.

If I'm not mistaken, OrderByCompletion returns an IEnumerable<Task<T>> whose elements you'd indeed be awaiting to examine / use however you'd like. Note there's no blocking -- MoveNext returns immediately since we're enumerating an array.
> It's returning the completed task, just like with WhenAny. That gives the consumer the ability to examine / use the completed Task however they like.

So what I'm saying is that IEnumerable<Task<T>> will give you exactly that; IAsyncEnumerable seems like an auxiliary helper when you want to stop on first exception.
async IAsyncEnumerable<T> WhenEach<T>(Task<T>[] tasks) {
    // Awaiting each wrapper task rethrows on the first faulted or canceled task.
    foreach (Task<T> task in tasks.OrderByCompletion())
        yield return await task;
}
@alrz, how would you propose for MoveNext to immediately complete such that Current could return the next Task even when no next task had yet completed? The only way to do that would be to allocate a new Task that could be returned immediately, with a different object identity than the original task, and then whenever the next task completes, marshal its results/exception/cancellation information to that proxy. At that point, you've significantly increased the cost and you've lost the ability to compare Tasks by reference.
I'm not understanding the aversion to using IAsyncEnumerable here.
Note I'm talking about the implementation in AsyncEx
> Note I'm talking about the implementation in AsyncEx
Which allocates a new TCS / Task for every input:
https://github.com/StephenCleary/AsyncEx/blob/0361015459938f2eb8f3c1ad1021d19ee01c93a4/src/Nito.AsyncEx.Tasks/TaskExtensions.cs#L211-L213
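To make the trade-off concrete, here is a minimal sketch of the proxy-task technique described above. It is not the AsyncEx source; `ProxySketch` and `OrderByCompletionSketch` are hypothetical names, and the code only illustrates the general idea of allocating one `TaskCompletionSource` per input and forwarding each outcome to the next free proxy.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ProxySketch
{
    // Hypothetical helper: returns proxy tasks that complete in completion
    // order. The proxies are new objects, not the original tasks.
    public static IReadOnlyList<Task<T>> OrderByCompletionSketch<T>(IReadOnlyList<Task<T>> tasks)
    {
        // One extra TaskCompletionSource (and Task) is allocated per input.
        var sources = new TaskCompletionSource<T>[tasks.Count];
        for (int i = 0; i < sources.Length; i++)
            sources[i] = new TaskCompletionSource<T>();

        int next = -1; // index of the most recently claimed proxy

        foreach (Task<T> task in tasks)
        {
            task.ContinueWith(t =>
            {
                // Claim the next proxy slot and copy the outcome into it.
                TaskCompletionSource<T> tcs = sources[Interlocked.Increment(ref next)];
                if (t.IsFaulted) tcs.TrySetException(t.Exception!.InnerExceptions);
                else if (t.IsCanceled) tcs.TrySetCanceled();
                else tcs.TrySetResult(t.Result);
            }, CancellationToken.None, TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);
        }

        return sources.Select(s => s.Task).ToArray();
    }
}
```

Enumerating the returned list never blocks, but each element is a proxy, so `ReferenceEquals(proxy, original)` is always false and the caller cannot map a completed proxy back to its input task by identity.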