This post introduces asynchronous programming with .NET. We examine how it differs from synchronous programming, what drives the need for it, and what the mechanisms of asynchronous programming are.
For .NET-specific implementations and asynchronous techniques, see: Async Techniques in .Net.
What this post will cover:
- What is async programming and what problems does it solve
- Mechanisms of asynchronous programming
- The preferred asynchronous mechanism on Windows and .NET: threads
What this post will not cover:
- Details on how to deal with async problems
- Specifics on the technical use of asynchronous programming in .NET.
To get deeper into this topic, you can also watch my Udemy course on asynchronous programming, where I bundle up these topics with examples. (It also comes at a huge discount of 70% off the normal Udemy price!)
But now back to the topic:
What is asynchronous programming?
To explain asynchronous programming we will look first at its counterpart, synchronous code.
When you look at a piece of code, in most cases it is synchronous code. That is, it runs in a deterministic fashion: it will always run the same instructions, in the same order, with the same outputs, as long as the inputs are the same.
So synchronous code is mostly straightforward; you just have to follow the sequence of execution. One can write messy synchronous code that is hard to understand, but my point is that synchronous code is by its nature deterministic.
Want an example?
```csharp
using System;

public class Program
{
    static void Main(string[] args)
    {
        foreach (var item in args)
            Console.WriteLine(item); // one statement after the other
    }
}
```
So this runs one statement after the other, the same way each time you start it…
Asynchronous programming, on the other hand, is when you write code that does several things at the same time (at least logically). This can lead to nondeterministic behavior…
The concurrently executed pieces of code might have to wait on one another, which might slow down the pace; all the concurrent code might change the state of the application simultaneously; and so on and so forth.
So if, as it seems at first view, it is more complicated, why would you want to use asynchronous code?
What problems does asynchronous programming solve?
There are three trends in modern computing that drive the need for asynchronous code, because they cannot be served by synchronous code.
1. Unresponsiveness of User Interfaces
Synchronous code can, by its very nature, only run one computation at a time.
If you want to run a long computation and expect user input at the same time, the user needs to wait for the synchronous code to finish.
The application becomes unresponsive to further input. This can be changed by asynchronous programming. In the user interface example, you would run the computation in the background while the UI thread performs the task of detecting user input.
Users expect this behavior today, because of the wide acceptance of mobile devices and web applications.
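A minimal sketch of this idea (class and method names are my own invention, not from any framework): the long computation runs on a second thread, so the thread that started it is immediately free again to handle input.

```csharp
using System;
using System.Threading;

public class ResponsiveDemo
{
    // Starts the long computation on a background thread and returns
    // immediately, so the calling ("UI") thread stays responsive.
    public static Thread StartLongComputation(Action onDone)
    {
        var worker = new Thread(() =>
        {
            Thread.Sleep(200); // stand-in for a long computation
            onDone();
        });
        worker.IsBackground = true;
        worker.Start();
        return worker;
    }

    public static void Main()
    {
        bool finished = false;
        var worker = StartLongComputation(() => finished = true);

        // The main thread is free here: it could poll for user input,
        // repaint the UI, etc., while the computation runs.
        Console.WriteLine("still responsive");

        worker.Join(); // for the demo, wait for the result at the end
        Console.WriteLine(finished);
    }
}
```

In a real UI framework you would of course marshal the result back to the UI thread instead of calling `Join`; this console version only illustrates the handoff.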
2. Take full advantage of processing power!
The second reason asynchronous programming is so important these days is that, instead of one ultra-fast processor, most processors nowadays have multiple cores. But to make use of all of the cores, you need asynchronous programming techniques.
Not doing so would be a waste of system resources, which is considered bad for obvious reasons.
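As a sketch of using several cores at once (the class name and chunking scheme are my own): each thread sums its own slice of the array on private state, and only the final merge touches shared state.

```csharp
using System;
using System.Threading;

public class ParallelSumDemo
{
    // Splits the array into one chunk per worker thread and sums
    // the chunks concurrently, one thread per (ideally) core.
    public static long Sum(int[] data, int workers)
    {
        long total = 0;
        int chunk = data.Length / workers;
        var threads = new Thread[workers];
        for (int w = 0; w < workers; w++)
        {
            int start = w * chunk;
            int end = (w == workers - 1) ? data.Length : start + chunk;
            threads[w] = new Thread(() =>
            {
                long local = 0; // private to this thread: no shared state while summing
                for (int i = start; i < end; i++) local += data[i];
                Interlocked.Add(ref total, local); // one synchronized merge at the end
            });
            threads[w].Start();
        }
        foreach (var t in threads) t.Join();
        return total;
    }

    public static void Main()
    {
        var data = new int[1_000_000];
        for (int i = 0; i < data.Length; i++) data[i] = 1;
        Console.WriteLine(Sum(data, 4));
    }
}
```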
3. The push to remote services
The last trend is the use of remote technologies, like cloud technologies, to process application inputs.
So a lot of the functionality in an application today potentially runs in a geographically remote location. But networking always comes at a cost, which in most if not all cases is much higher latency. This can degrade the feel and usability of applications (see point 1) or lead to missed performance targets.
To deal with this latency, one needs asynchronous programming.
What problems come with it?
But as always in life, there is no free lunch. Asynchronous programming, as you might have guessed, tends to be more complicated:
- Mutable shared state synchronization
- Race conditions
- Deadlocks
- Harder Debugging
- Harder to understand than sequential code
The first three are actually one, because race conditions and deadlocks are symptoms of handling mutable shared state. Mutable shared state is all the application state that can be accessed and changed by any code in the application. Obviously you cannot run an application without state, so this is always a concern for asynchronous code.
Race conditions describe a situation where two concurrently executed parts of code modify, or in some way depend on, shared state; if one arrives before the other, the state is modified differently than assumed. This can lead to very subtle bugs and can therefore be hard to spot. In essence it is a timing problem, where the expected state depends on the “correct” sequence, which cannot be guaranteed unless the necessary precautions are taken.
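The classic example of such a precaution is making a shared increment atomic. Below is a sketch (names are mine): the commented-out `counter++` is the racy version, because it is really a read, an add, and a write that another thread can interleave with; `Interlocked.Increment` does the same update atomically.

```csharp
using System;
using System.Threading;

public class RaceDemo
{
    // Several threads increment one shared counter.
    public static int CountTo(int perThread, int threadCount)
    {
        int counter = 0;
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < perThread; j++)
                {
                    // counter++; // RACY: read-modify-write is not atomic,
                    //            // so concurrent increments can get lost.
                    Interlocked.Increment(ref counter); // atomic: no lost updates
                }
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        return counter; // always perThread * threadCount with Interlocked
    }

    public static void Main() => Console.WriteLine(CountTo(100_000, 4));
}
```

If you swap in the commented line, the result is usually less than expected, and differs from run to run: exactly the nondeterminism described above.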
A deadlock ensues when two concurrent parts of code each lock a resource (locking a resource means controlling concurrent access to it, and is part of synchronization) and each waits for the other's lock to be released.
Consider for example the following scenario: a thread t1 holds a lock on resource A and another thread t2 holds a lock on resource B. Now if thread t1 needs to access resource B, it has to wait for t2 to release its lock. Should the release of that lock depend on thread t2 accessing resource A, you have a deadlock, where no further code can be executed. In networking this will in most cases lead to a timeout.
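The standard precaution against this scenario is a consistent lock ordering. A sketch (names are mine): both threads take lock A before lock B, so the circular wait from the t1/t2 example above cannot form. If one thread took A then B while the other took B then A, each could end up holding one lock and waiting forever on the other.

```csharp
using System;
using System.Threading;

public class DeadlockDemo
{
    private static readonly object LockA = new object();
    private static readonly object LockB = new object();
    private static int _completed;

    // Both threads take the locks in the SAME global order: A, then B.
    static void Work()
    {
        lock (LockA)
        {
            Thread.Sleep(10); // widen the window where a wrong order would deadlock
            lock (LockB)
            {
                Interlocked.Increment(ref _completed);
            }
        }
    }

    public static int Run()
    {
        _completed = 0;
        var t1 = new Thread(Work);
        var t2 = new Thread(Work);
        t1.Start(); t2.Start();
        t1.Join(); t2.Join(); // both finish: consistent ordering prevents the deadlock
        return _completed;
    }

    public static void Main() => Console.WriteLine(Run());
}
```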
Asynchronous code is obviously harder to debug, because it has multiple execution paths at once (that is, after all, the definition of asynchronous programming) and it is nondeterministic. This can lead to confusing states during debugging sessions. I am going to write a full series on asynchronous debugging techniques.
With this brief introduction, we will now look at common mechanisms of asynchrony.
Mechanisms of asynchronous programming
For asynchronous programming there are three possible mechanisms. These are the use of:
- Multiple machines
- Multiple processes
- Multiple threads in a single process
1. Multiple machines
The first way to perform asynchronous operations is to connect multiple machines (called nodes) by message queueing. Each machine can request an operation from another by adding an element to a queue, so the requester is only blocked for the briefest moment, while the item is enqueued. The result of the operation is then enqueued on the way back.
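To make the pattern concrete without real networking, here is an in-process simulation (my own sketch; the two "nodes" are just threads): the requester blocks only for the enqueue itself, and the result comes back on a second queue.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

public class NodeQueueDemo
{
    // Simulates two nodes exchanging work through a request queue
    // and a result queue.
    public static string[] Process(string[] requestsIn)
    {
        var requests = new BlockingCollection<string>();
        var results  = new BlockingCollection<string>();

        // The "remote node": dequeues requests, enqueues results.
        var node = new Thread(() =>
        {
            foreach (var req in requests.GetConsumingEnumerable())
                results.Add(req.ToUpperInvariant()); // the "remote" operation
            results.CompleteAdding();
        });
        node.Start();

        // The requester is blocked only for these near-instant enqueues.
        foreach (var r in requestsIn) requests.Add(r);
        requests.CompleteAdding();

        var output = results.GetConsumingEnumerable().ToArray();
        node.Join();
        return output;
    }

    public static void Main() =>
        Console.WriteLine(string.Join(",", Process(new[] { "ping", "pong" })));
}
```

With real nodes the queues would live in a message broker rather than in shared memory, but the blocking behavior of the requester is the same.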
This allows for high scalability, because those machines do not compete for system resources. But it also comes with high complexity, because you potentially need to synchronize state between machines, and you need to accommodate node failure and networking issues.
2. Multiple processes
A process is a unit of isolation in an OS, in which an application can run with its own security context and virtual memory address space. The difference from multiple machines is that processes compete for processor resources. Communication is enabled by the same technique as with multiple machines: the queueing of messages. The advantage over multiple machines is the easier handling of faults on a single machine.
Another model for using multiple processes is known from web servers that used CGI to spawn a new process to handle each request, i.e. handing off long-running operations to another process.
3. Multiple Threads
A thread is an individually schedulable set of instructions that is packaged with nonshared resources. For example, every thread gets its own stack. Each thread is bound to a process and cannot leave it. There can be one or more threads in a process at any given time. All the threads in a process share the heap memory and OS resources such as sockets, file handles, etc.
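The stack/heap split can be seen in a few lines (my own sketch): each thread's `local` variable lives on that thread's private stack, while the array, being a heap object, is visible to both.

```csharp
using System;
using System.Threading;

public class SharingDemo
{
    public static int Run()
    {
        int[] shared = new int[2]; // heap-allocated: visible to both threads

        // Each lambda's 'local' lives on its own thread's stack.
        var t1 = new Thread(() => { int local = 1; shared[0] = local; });
        var t2 = new Thread(() => { int local = 2; shared[1] = local; });

        t1.Start(); t2.Start();
        t1.Join(); t2.Join();

        // Both writes landed in the shared heap object.
        return shared[0] + shared[1];
    }

    public static void Main() => Console.WriteLine(Run());
}
```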
How to utilize Threads:
The queue pattern is actually helpful here as well, because it is a generally useful asynchronous pattern. Yet threads benefit least from it, because they already share so many resources. On the other hand, this sharing makes it much easier to handle failures, because everything happens in the same process, so there is often no actual need for a queue.
Where to use Threads in .NET
On Windows (where most .NET code still runs) a process is a very hefty construct, because of the housekeeping involved, such as checking in with the registry, which is the reason that the most appropriate way of doing asynchronous programming on Windows is the use of threads.
Yet creating and destroying threads is a complex and costly matter, so it is best avoided: each thread has its own stack, and the default stack size is 1 MB. This is a nontrivial amount of memory, hence the need to reuse threads.
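This reuse is exactly what the .NET thread pool does. A small sketch (the counting helper is my own): work items are queued and executed on recycled pool threads instead of paying thread creation, and its roughly 1 MB stack, per task.

```csharp
using System;
using System.Threading;

public class PoolDemo
{
    // Queues 'count' work items on the shared thread pool and waits
    // until all of them have run.
    public static int RunItems(int count)
    {
        int executed = 0;
        using var done = new CountdownEvent(count);
        for (int i = 0; i < count; i++)
        {
            // The pool reuses a small set of threads for all items.
            ThreadPool.QueueUserWorkItem(_ =>
            {
                Interlocked.Increment(ref executed);
                done.Signal();
            });
        }
        done.Wait();
        return executed;
    }

    public static void Main() => Console.WriteLine(RunItems(4));
}
```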
Now that we know that threads are the main mode of execution for asynchronous programming in .NET, we will look at how thread scheduling works, which is essential for efficient and effective use of threads.
As mentioned a thread is an individually schedulable set of instructions. This scheduling is done by the OS and its thread scheduling mechanism. For this each running thread has a priority, which is a combination of the process priority and the priority of the thread in question.
Normally each active thread gets a time slice from the scheduler to perform its work (if it is not in the SleepWaitJoin state), runs its instructions until this allotted time slice ends, and then the next thread gets its time slice.
This is just the ideal case; sometimes not everything is equal. This is where the priorities come into play. There are seven different priority levels, which range from idle to time critical.
Besides this priority scheduling, Windows uses preemptive thread scheduling. This means threads with lower priority are removed (preempted) in favor of threads with higher priority. This could starve lower-priority threads of their processing time, so to prevent that, those threads sometimes get their priority boosted. Possible triggers for this are user input, the application coming into focus, or another event. After the boosted thread has been allotted and used its time slice, its priority degrades over time back to its former value.
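From managed code you can set the thread part of that priority combination via `Thread.Priority`. A small sketch (the method is my own): the value is a hint to the scheduler, which combines it with the process priority class to get the effective base priority.

```csharp
using System;
using System.Threading;

public class PriorityDemo
{
    public static ThreadPriority Run()
    {
        ThreadPriority observed = ThreadPriority.Normal;
        var t = new Thread(() => observed = Thread.CurrentThread.Priority);

        // A hint to the scheduler; the effective base priority combines
        // this with the priority class of the owning process.
        t.Priority = ThreadPriority.AboveNormal;

        t.Start();
        t.Join();
        return observed;
    }

    public static void Main() => Console.WriteLine(Run());
}
```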
A thread scheduler also works out the mapping of all threads to processor cores.
Threads and resources
As mentioned above, threads share system resources and the heap, but not the stack. This is an important distinction with regard to the mutable shared state of the application, which is one of the most common sources of errors in asynchronous programming (see race conditions and deadlocks).
Thread local storage
There is a notion of storage slots, where each thread has its own entry for each defined slot in which it can store thread-specific values. These slots cannot be accessed by other threads. TLS is guaranteed to provide at least 64 and up to 1,088 slots.
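In .NET the idiomatic way to use such per-thread storage is `ThreadLocal<T>`. A sketch (names are mine): two threads write different values into the "same" slot, and neither write is visible to the other.

```csharp
using System;
using System.Threading;

public class TlsDemo
{
    // Every thread that touches this sees its own independent copy,
    // initialized to 0.
    static readonly ThreadLocal<int> Slot = new ThreadLocal<int>(() => 0);

    public static (int other, int mine) Run()
    {
        Slot.Value = 1; // this thread's copy
        int other = 0;

        var t = new Thread(() =>
        {
            Slot.Value = 2;      // an independent copy in the other thread
            other = Slot.Value;  // reads back 2
        });
        t.Start();
        t.Join();

        return (other, Slot.Value); // this thread's copy is still 1
    }

    public static void Main() => Console.WriteLine(Run());
}
```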
Each thread also keeps its own copy of the processor register values. This is important for preemption: when a preempted thread is resumed, its registers (including the instruction pointer) are restored, so it can return to its instructions.
As mentioned, all threads in a process share the heap memory, which is all the memory that can be allocated for pointers and/or object references (in the case of .NET). This is efficient, yet, as mentioned before, it can lead to bugs.
Threads also share operating system handles (files, windows, loaded DLLs). When a thread is destroyed or ends, it will not return its handles automatically. This needs to be done explicitly; otherwise it is only done when the process exits.
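In .NET that explicit release is what `Dispose` (usually via a `using` block) is for. A sketch with a temporary file of my own choosing: the file handle is returned as soon as the `using` block ends, which is why the subsequent delete succeeds.

```csharp
using System;
using System.IO;

public class HandleDemo
{
    public static bool WriteAndDelete()
    {
        // A throwaway temp file, just for illustration.
        string path = Path.Combine(Path.GetTempPath(), "handle-demo.txt");

        // 'using' guarantees Dispose, which returns the OS file handle
        // right here, instead of leaking it until the process exits.
        using (var stream = File.Create(path))
        {
            stream.WriteByte(42);
        } // handle released at this brace

        File.Delete(path); // succeeds because the handle is no longer held
        return !File.Exists(path);
    }

    public static void Main() => Console.WriteLine(WriteAndDelete());
}
```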
In this post we explored how asynchronous programming differs from synchronous programming, and what problems and trends in modern computing it addresses: namely the need for a responsive UI (and responsiveness to user input in general), taking full advantage of modern multi-core processors, and the push to cloud and remote functionality.
We also explored the different mechanisms of asynchrony, and saw that threads are the most apt for Windows and .NET technologies.
To end this post, we looked into the way threads are scheduled by Windows and what resources they share or do not share.