Introduction to Tasks in TPL

What is a Task?

In the first two posts about asynchronous programming we saw what asynchronous programming is and which techniques .NET offers to perform units of work asynchronously.

Besides threads, EAP and APM, .NET 4.0 introduced yet another way to build asynchronous applications: the Task Parallel Library, or TPL for short.

A major goal of this API was to unify the asynchronous APIs behind the abstraction of a Task. A Task works as a future: it represents an asynchronous piece of work whose value is not yet known but will be at some point in the future.

This also includes an API for progress reporting and cancellation, which application developers had to implement themselves before TPL. One of the key guidelines of the unification was to simplify the async APIs so that the programming style could move closer to synchronous code. This is meant to avoid the overcomplication of previous async APIs.

What this post will cover

What is a Task essentially
Computation based and I/O based Tasks
Passing data to and from a Task
Cancellation and Progress reporting
Continuation and Nested Tasks
Error handling

What this post will not cover

Use of async and await
Advanced topics like TaskScheduler, synchronization and others

So what is a Task then?

A Task, as mentioned above, represents an asynchronous unit of work. This can be access to a network-based resource, the completion of an I/O operation, or a long-running computation. In other words, it represents an activity that is ongoing while the main thread continues, and it is therefore simply an abstraction over threads and asynchronous work in general.

A Task can represent I/O operations as well as computation based work.

Compute based task

Simple computation-based tasks run on thread pool threads. These are background threads, so they will not keep the process alive.

To start a Task you can use the following syntax:

static void Main(string[] args)
{
    var task = new Task(() => { Console.WriteLine("a wonderfully lonesome task"); });
    task.Start();
}

This task would run on a background thread, as mentioned above. To keep the application from closing immediately (remember, background threads do not keep the process alive), you need to block until the task completes. This can be done with Wait():

static void Main(string[] args)
{
    var task = new Task(() => { Console.WriteLine("a wonderful lonesome task"); });
    task.Start();
    task.Wait(); // block until the task has completed
}

// You can also start a Task in the following shorter ways:
void MyAction()
{
    /* omitted */
}

Task.Factory.StartNew(MyAction);
// or (available since .NET 4.5)
Task.Run(MyAction);

You can also specify options when creating a Task with Factory.StartNew. These are defined in the TaskCreationOptions enum:

[Flags]
public enum TaskCreationOptions
{
    AttachedToParent = 4,
    DenyChildAttach = 8,
    HideScheduler = 16,
    LongRunning = 2,
    None = 0,
    PreferFairness = 1,
    RunContinuationsAsynchronously = 64
}

The LongRunning creation option, for example, hints to the TPL that the task will run for a long time. The default scheduler then creates a dedicated thread for this task instead of using a thread pool thread.
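
To make the effect of the option visible, here is a minimal sketch that asks the executing thread whether it belongs to the thread pool. The class and method names (LongRunningSketch, RunsOnPoolThread) are made up for this illustration, and the behavior described assumes the default task scheduler.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningSketch
{
    // Hypothetical helper: starts a task with the given options and reports
    // whether its body executed on a thread pool thread.
    public static bool RunsOnPoolThread(TaskCreationOptions options)
    {
        Task<bool> task = Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            options);
        return task.Result; // blocks until the task has finished
    }

    public static void Main()
    {
        Console.WriteLine("None:        " + RunsOnPoolThread(TaskCreationOptions.None));
        Console.WriteLine("LongRunning: " + RunsOnPoolThread(TaskCreationOptions.LongRunning));
    }
}
```

With the default scheduler, the first line should report a pool thread and the second a dedicated one. A custom TaskScheduler is free to treat the hint differently.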

So creating a Task with the above methods creates a computation-based task that, by default, employs a thread pool thread. What if you want to perform an I/O-bound operation?

I/O Bound Tasks

When you perform an I/O operation synchronously, e.g. load something from disk or wait on the network, the application becomes unresponsive, even if only for a brief moment, because the calling thread blocks until the I/O completes.

This can be overcome by creating a task that represents the pending I/O, so that the CPU stays available to keep the UI responsive in the meantime. This makes sense not only for a responsive UI but also for server applications (think of two parallel WebRequests).

(No async/await here, in case you were expecting it, because I want to demonstrate the general idea.)

static void Main(string[] args)
{
    string result = DownloadWebPage("http://www.anyurl.com");
    Console.WriteLine(result);
}

// static, so that it can be called from the static Main
private static string DownloadWebPage(string url)
{
    var request = WebRequest.Create(url);
    string pageContent = "";
    using (var response = request.GetResponse())
    using (var reader = new StreamReader(response.GetResponseStream()))
    {
        pageContent = reader.ReadToEnd();
    }
    return pageContent;
}

// and asynchronous version
private static Task<string> DownloadWebPageAsync(string url)
{
    return Task.Factory.StartNew(() => DownloadWebPage(url));
}

// adjust Main
static void Main(string[] args)
{
    Task<string> downloadTask = DownloadWebPageAsync("http://www.anyurl.com");
    while (!downloadTask.IsCompleted)
    {
        Console.WriteLine("...");
        Thread.Sleep(250);
    }
    Console.WriteLine(downloadTask.Result);
}

Note the convention that asynchronous methods are suffixed with Async.

But as you can probably tell, this wastes resources: a thread pool thread is occupied merely to block until the I/O has finished. Before Tasks you needed APM for this (Begin/End, see the post on async techniques in .NET). With Tasks, one option is FromAsync, to which you supply an IAsyncResult like so:

private static Task<string> DownloadWebPageAsync(string url)
{
    var request = WebRequest.Create(url);
    IAsyncResult ar = request.BeginGetResponse(null, null); // no callback or asyncState object needed
    Task<string> downloadTask = Task.Factory.FromAsync<string>(ar, iar =>
    {
        // runs once the request has completed
        using (var response = request.EndGetResponse(iar))
        using (var reader = new StreamReader(response.GetResponseStream()))
            return reader.ReadToEnd();
    });
    return downloadTask;
}

This may look more complex than the code above, but it is a lot more efficient with respect to thread pool usage. With FromAsync no worker thread sits blocked while the request is in flight, so the threads of the thread pool remain free for other work, which in turn reduces memory consumption.

Another way to avoid wasting a thread on waiting for the completion ports is TaskCompletionSource<TResult>. It exposes a Task<TResult> through its aptly named Task property, and the lifecycle of that task is controlled by the SetResult, SetException and SetCanceled methods of the TaskCompletionSource. To wrap I/O-bound work this way, you create a TaskCompletionSource, register for the I/O completion event and return the Task right away; once the I/O completes, the handler calls SetResult. The TaskFactory.FromAsync method encapsulates this behavior, so you don't have to implement it yourself.

public static Task<string> LoadIOAsync()
{
    var tcs = new TaskCompletionSource<string>();
    // register for I/O operation and its completion
    tcs.TrySetResult(...);
    return tcs.Task;
}
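
To make the skeleton above concrete, here is a minimal sketch that wraps the APM pair FileStream.BeginRead/EndRead in a TaskCompletionSource. The helper name ReadStartAsync, the 4096 byte buffer and the temp file are assumptions made for this illustration; the helper only returns what a single read delivers, so it is not a general file reader.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class TcsSketch
{
    // Hypothetical helper: reads up to 4096 bytes from the start of a file
    // without blocking a thread while the read is pending.
    public static Task<string> ReadStartAsync(string path)
    {
        var tcs = new TaskCompletionSource<string>();
        var buffer = new byte[4096];
        var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                    FileShare.Read, buffer.Length, useAsync: true);

        // Register for completion: the callback runs when the read has finished.
        stream.BeginRead(buffer, 0, buffer.Length, ar =>
        {
            try
            {
                int read = stream.EndRead(ar);
                tcs.TrySetResult(Encoding.UTF8.GetString(buffer, 0, read));
            }
            catch (Exception ex)
            {
                tcs.TrySetException(ex); // surface I/O errors through the task
            }
            finally
            {
                stream.Dispose();
            }
        }, null);

        return tcs.Task; // returned immediately, completed later by the callback
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello from disk");
        Console.WriteLine(ReadStartAsync(path).Result);
        File.Delete(path);
    }
}
```

Note that the task is handed back before the read has finished; its state only changes once the callback calls TrySetResult or TrySetException.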

Besides being started, computation-based and I/O-based tasks usually need some data passed in, or need to return some data. We will explore this in the next sections.

Passing Data into a Task

As with threads, at some point you will want to supply the asynchronous operation performed by the Task with some data. One way to do this is the Factory.StartNew overload that takes a delegate of type Action<object> plus a state object. See below how that looks:

void DoStuff(object o)
{
    /* omitted */
}

var t = Task.Factory.StartNew(DoStuff, "any data");

Even though this works, it is kind of smelly code to look at. So the second and most common way to supply data to a task is the use of closures and lambda syntax.
For starters, a closure is a function (here a lambda, i.e. an anonymous delegate) together with its enclosing environment:

string data = "any data";
var t = Task.Factory.StartNew(() => Console.WriteLine(data));

The closure in this case is therefore the lambda that is used as the delegate in the StartNew method, together with the string variable data. You can imagine it this way: the compiler generates a class with a method that has the body of the lambda and a field for each captured variable. When the delegate is invoked, that method runs on the generated instance and reads the captured values from its fields.
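
The following sketch spells out that mental model by hand. CapturedScope is a hypothetical stand-in for the compiler-generated class (the real one gets a compiler-chosen name), and the lambda returns the string instead of printing it so that both variants can be compared.

```csharp
using System;
using System.Threading.Tasks;

public static class ClosureSketch
{
    // Hand-written stand-in for the class the compiler generates.
    private sealed class CapturedScope
    {
        public string data;                                        // the captured local becomes a field
        public string Body() { return data.ToUpperInvariant(); }   // the lambda body becomes a method
    }

    // Variant 1: the lambda closes over the local variable.
    public static string WithLambda()
    {
        string data = "any data";
        Task<string> t = Task.Factory.StartNew(() => data.ToUpperInvariant());
        return t.Result;
    }

    // Variant 2: roughly what the compiler turns variant 1 into.
    public static string WithExplicitClass()
    {
        var scope = new CapturedScope();
        scope.data = "any data";
        Task<string> t = Task.Factory.StartNew(new Func<string>(scope.Body));
        return t.Result;
    }

    public static void Main()
    {
        Console.WriteLine(WithLambda());
        Console.WriteLine(WithExplicitClass());
    }
}
```

Because the lambda shares the field rather than a copy of the value, a change to the variable made before the task body runs is visible inside the task.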

A closure is typically slightly less efficient than an Action<object> delegate, because the generated closure object has to be allocated, yet it reads much better (at least in my opinion).

Returning Data from a Task

Besides passing data to a task, you can use a Task to return a value. For this you use Task<TResult> and the following StartNew<TResult> method signature:

public Task<TResult> StartNew<TResult>(Func<TResult> function);

Task<TResult> has an additional property, Result, that can be used instead of Wait(). It yields the result only once the task has completed and therefore blocks the calling thread until then.
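
As a small illustration, the sketch below computes a sum on a thread pool thread and reads it through Result. The names ResultSketch, SumTo and SumToAsync are invented for this example.

```csharp
using System;
using System.Threading.Tasks;

public static class ResultSketch
{
    // Hypothetical CPU-bound work: adds up the numbers 1..n.
    public static int SumTo(int n)
    {
        int sum = 0;
        for (int i = 1; i <= n; i++)
        {
            sum += i;
        }
        return sum;
    }

    // Uses the StartNew<TResult>(Func<TResult>) overload; TResult is inferred as int.
    public static Task<int> SumToAsync(int n)
    {
        return Task.Factory.StartNew(() => SumTo(n));
    }

    public static void Main()
    {
        Task<int> task = SumToAsync(100);
        // Result blocks the calling thread until the task has completed.
        Console.WriteLine(task.Result); // prints 5050
    }
}
```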

Published by theiten on 05/13/2018
