In this post we will explore the different mechanisms of asynchronous programming. For more topics on asynchronous programming see the overview.

Mechanisms of asynchronous programming

For asynchronous programming there are three possible mechanisms. These are the use of:

  1. Multiple machines
  2. Multiple processes
  3. Multiple threads in a single process

1. Multiple machines

The first way to run operations asynchronously is to connect multiple machines (called nodes) through message queueing. Each machine can request an operation by adding an element to a message queue. The requester is therefore blocked only for the brief moment it takes to enqueue the item.
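
To make the "blocked only while enqueueing" point concrete, here is a minimal sketch in Python (not .Net, and without a real broker). It assumes an in-process queue and a worker thread as stand-ins for the message queue and the second machine; the names requests, worker_node and results are made up for illustration.

```python
import queue
import threading
import time

requests = queue.Queue()    # stand-in for the message queue between nodes
results = []

def worker_node():
    """Plays the role of the machine that processes queued operations."""
    while True:
        item = requests.get()
        if item is None:        # sentinel value: no more work
            break
        time.sleep(0.05)        # simulate a slow operation
        results.append(item * 2)

worker = threading.Thread(target=worker_node)
worker.start()

start = time.perf_counter()
for n in range(5):
    requests.put(n)             # returns as soon as the item is enqueued
enqueue_seconds = time.perf_counter() - start

requests.put(None)              # tell the worker to stop
worker.join()
```

The requester spends only enqueue_seconds handing over the work, while the worker needs roughly a quarter of a second to process it. With a real broker the enqueue step also includes a network round trip.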

There are lots of different message queues out there, such as RabbitMQ (for which I also offer an online course and posts on this website).

There are two modes with this setup: either publish and subscribe, or an RPC call (request-response via remote procedure call).

Left: RPC with response queue | Right: publish/subscribe
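
The two modes can be sketched roughly as follows, again in Python with in-process queues instead of a real broker. The correlation id used to match a reply to its request, and the subscriber names billing and audit, are assumptions made for this illustration only.

```python
import queue
import threading
import uuid

# RPC: a request queue plus a reply queue, matched by a correlation id
request_queue = queue.Queue()
reply_queue = queue.Queue()

def rpc_server():
    correlation_id, payload = request_queue.get()
    reply_queue.put((correlation_id, payload.upper()))

server = threading.Thread(target=rpc_server)
server.start()

call_id = str(uuid.uuid4())
request_queue.put((call_id, "ping"))
reply_id, reply = reply_queue.get()    # the caller waits for the response
server.join()

# Publish/subscribe: every subscriber receives its own copy of a message
subscriber_queues = {"billing": queue.Queue(), "audit": queue.Queue()}

def publish(message):
    for subscriber_queue in subscriber_queues.values():
        subscriber_queue.put(message)

publish("order-created")
received = {name: q.get() for name, q in subscriber_queues.items()}
```

In the RPC case the caller expects exactly one answer and has to wait for it. In the publish/subscribe case the publisher does not know or care who consumes the message.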

This allows for high scalability, because the machines do not compete for system resources. But it also comes with high complexity, because you potentially need to synchronize state between machines, and you need to accommodate node failures and networking issues. You probably also need to learn the broker software you use (RabbitMQ, ActiveMQ, Apache Kafka).

2. Multiple processes

A process is a unit of isolation in an OS, in which an application runs with its own security context and virtual memory address space. The difference from multiple machines is that processes on the same machine compete for processor resources.
Communication works with the same technique as with multiple machines, namely the queueing of messages. The advantage over multiple machines is that faults are easier to handle, because everything runs on the same machine.

Another model for multiple processes is known from web servers that use CGI: the server spawns a new process to handle each incoming request, in other words it hands off long-running operations to another process.

Left: queue inside a single machine | Right: CGI-like process model
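
The CGI-like model can be sketched in a few lines of Python. This is only an illustration of the idea, not how a real web server implements CGI: the helper handle_in_new_process is hypothetical, and the "request handler" is just a child interpreter that reverses the text it reads from standard input.

```python
import subprocess
import sys

def handle_in_new_process(request_body):
    """Spawn a fresh process for one request, in the style of CGI."""
    # The child reads the request from stdin and writes the response to stdout.
    child_code = "import sys; print(sys.stdin.read().strip()[::-1])"
    completed = subprocess.run(
        [sys.executable, "-c", child_code],
        input=request_body,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()

response = handle_in_new_process("hello")
```

Every call pays the full cost of starting and tearing down a process, which is exactly the overhead that makes this model expensive under load.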

3. Multiple threads

A thread is an individually schedulable set of instructions that is packaged with nonshared resources.
One example of those nonshared resources is the stack: every thread gets its own.
Each thread is bound to a process and cannot leave it. There can be one or more threads in a process at any given time. All the threads in a process share the heap memory and OS resources such as sockets and file handles.
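
The split between "own stack" and "shared heap" can be illustrated with a small Python sketch (the same idea applies to .Net threads, but this is not .Net code). The names shared_log and work are invented for the example.

```python
import threading

shared_log = []    # one object on the heap, visible to every thread

def work(name):
    local_total = 0    # local variable: lives in this thread's own stack frame
    for i in range(1000):
        local_total += i
    # A single list.append is thread-safe in CPython, so no lock is used here.
    shared_log.append((name, local_total))

threads = [threading.Thread(target=work, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread computes its own local_total without interference, while all three write into the same shared_log object.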

Multiple threads inside of processes, with their relation to stack and heap

How to utilize threads:

The queue pattern is actually helpful here as well, because it is a generally useful asynchronous pattern.
Yet threads benefit least from it, because they share so many resources (especially the heap, where all objects live). Sharing resources also makes it easier to handle failures, because everything is in the same process.
So a queue is not strictly needed between threads.

Where to use Threads in .Net

On Windows (where most .Net code still runs) a process is a very heavyweight construct, because of the housekeeping involved, registry lookups and so on. This is why the most appropriate way of doing asynchronous programming on Windows is to use threads.
Yet creating and destroying threads is also complex and costly, so it is best avoided. Each thread has its own stack, and the default stack size is 1 MB. This is a nontrivial amount of memory, hence the need to reuse threads.
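
Reusing threads usually means handing work to a pool of long-lived worker threads instead of creating one thread per task. The following Python sketch shows the idea with the standard library's ThreadPoolExecutor; it is an illustration of the pattern, not of the .Net thread pool itself, and the pool size of 2 is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def task(n):
    # Return the result together with the name of the thread that ran it.
    return n * n, threading.current_thread().name

with ThreadPoolExecutor(max_workers=2) as pool:
    outcomes = list(pool.map(task, range(10)))

squares = [value for value, _ in outcomes]
threads_used = {name for _, name in outcomes}
```

Ten tasks are executed, but threads_used contains at most two thread names, so at most two stacks were ever allocated for them.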

Now that we know that threads are the main mode of execution for asynchronous programming in .Net, we look at how thread scheduling works, which is essential for using threads efficiently and effectively.

Thread scheduling

As mentioned, a thread is an individually schedulable set of instructions. The scheduling is done by the OS and its thread scheduling mechanism. For this, each running thread has a priority, which is a combination of the priority of its process and the priority of the thread itself.

Normally each active thread (one that is not in the SleepWaitJoin state) gets a time slice from the scheduler to perform its work. It runs its instructions until the allotted time slice ends, and then the next thread gets its time slice.
This is just the ideal case, in which all threads are equal. Often they are not, and this is where priorities come into play. There are 7 different thread priority levels, which range from idle to time critical.

Besides priority scheduling, Windows uses preemptive thread scheduling. This means that a running thread with lower priority is removed (preempted) from the processor in favor of a thread with higher priority. This could starve lower-priority threads of processing time. To prevent that, they sometimes get their priority boosted. Possible reasons for a boost are user input, the thread's window being brought into focus, or another event. After the thread has been allotted its time slice and used it, its priority decays over time back to its former value.
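
The interplay of time slices and priorities can be illustrated with a toy simulation in Python. This is a deliberately simplified model and not the Windows scheduler: it assumes a single core, threads that never block, strict "highest priority first" with round-robin among equals, and no priority boosting. The thread names and the function simulate are hypothetical.

```python
from collections import deque

def simulate(threads, time_slice=2):
    """threads maps name -> (priority, work_units); a higher number wins.
    Returns the order in which threads received a time slice."""
    remaining = {name: work for name, (_, work) in threads.items()}
    ready = {}    # priority -> queue of ready thread names
    for name, (priority, _) in threads.items():
        ready.setdefault(priority, deque()).append(name)

    timeline = []
    while any(ready.values()):
        # Always pick from the highest priority level that has a ready thread.
        top = max(p for p, waiting in ready.items() if waiting)
        name = ready[top].popleft()
        timeline.append(name)
        remaining[name] -= time_slice
        if remaining[name] > 0:
            ready[top].append(name)    # back to the end of its own queue
    return timeline

timeline = simulate({"ui": (2, 4), "worker_a": (1, 4), "worker_b": (1, 2)})
# timeline == ["ui", "ui", "worker_a", "worker_b", "worker_a"]
```

The two lower-priority threads only run once the higher-priority thread is finished. In this model a high-priority thread that never finishes would starve them completely, which is the situation priority boosts are meant to relieve.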

A thread scheduler also works out the mapping of all threads to processor cores.

Thread scheduling (simplified example)

Threads and resources

As mentioned above, threads share system resources and also the heap, but not the stack. This is an important distinction with regard to mutable shared state of the application, which is one of the most common reasons for errors in asynchronous programming (see race conditions and deadlocks).
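
A minimal sketch of why mutable shared state needs protection, written in Python for brevity. Four threads increment the same counter; the lock makes each read-modify-write step exclusive. Without the lock, updates from different threads can overwrite each other and the final value may come out too low. The names counter, counter_lock and add_many are illustrative.

```python
import threading

counter = 0                       # shared mutable state on the heap
counter_lock = threading.Lock()

def add_many(times):
    global counter
    for _ in range(times):
        with counter_lock:        # only one thread at a time may update
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After all threads have finished, counter is exactly 40000, because no increment was lost.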

Thread local storage

There is a notion of storage slots: each thread has its own entry for every defined slot, in which it can store thread-specific values. These slots cannot be accessed by other threads. On Windows, a process is guaranteed at least 64 TLS slots, and up to 1088 can be available.
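
The behavior of a thread-local slot can be sketched with Python's threading.local, which plays a similar role to the TLS slots described above (it is not the Windows TLS API). The barrier is only there to make sure both threads have written their value before either one reads it back.

```python
import threading

slot = threading.local()          # each thread sees its own attributes here
barrier = threading.Barrier(2)
seen = {}

def worker(name):
    slot.value = name             # stored in this thread's private storage
    barrier.wait()                # both threads have written at this point
    seen[name] = slot.value       # still reads this thread's own value

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

main_has_value = hasattr(slot, "value")    # the main thread never set one
```

Although both threads assign to the same slot object, neither overwrites the other's value, and the main thread sees no value at all.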

Registers

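Each thread also has its own copy of the register values. This is important for preemption: when a preempted thread is scheduled again, its registers, including the instruction pointer, are restored so that it can continue with its instructions where it left off.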

Shared Resources

As mentioned, all threads in a process share the heap memory, which is all the memory that can be allocated and reached through pointers or object references (in the case of .Net). This is efficient, yet as mentioned before it can lead to bugs.
Threads also share operating system handles (files, windows, loaded DLLs). When a thread ends or is destroyed, the handles it opened are not returned automatically. This needs to be done explicitly; otherwise they are released only when the process exits.
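
The point about handles outliving the thread that opened them can be illustrated with a file handle in Python. This is only an analogy to the OS handles described above: the sketch keeps a reference to the file object in a list, because otherwise Python's garbage collection would close it on its own.

```python
import os
import tempfile
import threading

handles = []    # keeps the file object alive after the thread has ended

def open_without_closing(path):
    handles.append(open(path, "w"))    # the thread ends without closing it

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    t = threading.Thread(target=open_without_closing, args=(path,))
    t.start()
    t.join()

    still_open_after_thread_end = not handles[0].closed
    handles[0].close()                 # explicit release of the handle
    closed_after_explicit_close = handles[0].closed
```

The handle belongs to the process, not to the thread, so ending the thread changes nothing. Only the explicit close releases it.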

Summary

In this post we explored the different mechanisms of asynchronous programming.

We learned that threads are the preferred method for asynchronous programming with .Net for a single application.

Finally, we looked into how threads are scheduled by Windows and which resources they share or do not share.

The next post will deal with an overview of asynchronous techniques in .Net.

