Asynchronous Delegates | Advanced .NET Programming

Much of the rest of this chapter will be devoted to working through some samples that illustrate most of the above techniques. The samples are as far as possible based on realistic situations, in order to give you a flavor of the kinds of situation where you'll need to use the main multithreading techniques.

We're going to jump in at the deep end concept-wise by presenting asynchronous delegates first. The reason is that this technique is often the most useful one for asynchronous processing. However, you do need to understand quite a few concepts in order to invoke delegates asynchronously, which means this next section is going to be quite top-heavy on theory, but once you've grasped the concepts, asynchronous delegate invocation is very often the easiest and most efficient way to write multithreaded applications.

In .NET, the usual pattern for implementing an asynchronous method call is for some object to expose two methods, BeginXXX() and EndXXX(), where XXX is some word that describes the method. There is usually a separate unrelated method, XXX(), which performs the operation synchronously.

BeginXXX() is the method that is called to start the operation. It returns immediately, with the method left executing - on a thread-pool thread. EndXXX() is called when the results are required. If the asynchronous operation has already completed when EndXXX() is called, it simply returns the return values from the operation. If the operation is still executing, EndXXX() waits until it is completed before returning the values. The actual parameters expected by BeginXXX() and EndXXX() will of course depend on the parameters passed on to the asynchronous operation. If an exception is thrown in the asynchronous operation, this exception will be transferred by the CLR to the EndXXX() call, which will throw this same exception on the calling thread.

This design pattern is implemented by various classes, including:

System.IO.FileStream (BeginRead()/EndRead() and BeginWrite()/EndWrite() for reading from or writing to a stream)
System.Net.WebRequest (BeginGetResponse()/EndGetResponse() for making a request to a remote server)
System.Windows.Forms.Control (BeginInvoke()/EndInvoke() for executing any method; however, the underlying implementation here is different - the message loop is used instead of the thread pool for the asynchronous operation)
System.Messaging.MessageQueue (BeginReceive()/EndReceive() for receiving a message)
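To make the BeginXXX()/EndXXX() pattern concrete before we turn to delegates, here is a minimal sketch using FileStream.BeginRead() and EndRead(). The class name BeginEndPatternSketch and the temporary file are assumptions made purely for illustration; only the FileStream, File, and Path calls come from the Framework.

```csharp
using System;
using System.IO;
using System.Text;

public static class BeginEndPatternSketch
{
    public static void Main()
    {
        // Hypothetical scratch file, created only for this illustration.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello from the Begin/End pattern");

        byte[] buffer = new byte[256];
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            // BeginRead() starts the read and hands back an IAsyncResult.
            // No callback and no state object are supplied here.
            IAsyncResult ar = fs.BeginRead(buffer, 0, buffer.Length, null, null);

            // The calling thread could do unrelated work at this point.

            // EndRead() waits if the read is still in progress, then
            // returns the number of bytes actually read.
            int bytesRead = fs.EndRead(ar);
            Console.WriteLine(Encoding.UTF8.GetString(buffer, 0, bytesRead));
        }

        File.Delete(path);
    }
}
```

The shape is the same for every class in the list above: the Begin method takes the operation's inputs plus a callback and a state object, and the End method takes the IAsyncResult and returns the operation's result.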

However, what interests us here is that this same design pattern is also implemented by delegates. This means that you can asynchronously invoke literally any method at all, just by wrapping it in a delegate and invoking the delegate asynchronously. The methods involved here are called BeginInvoke() and EndInvoke(). Paradoxically, these methods are not defined in either the System.Delegate or System.MulticastDelegate classes, but they can be defined, with native CLR support, in any delegate which derives from these classes - and you'll find that most high-level language compilers, including the C#, VB.NET, and C++ ones, add this support automatically.

In order to explain how asynchronous delegates are implemented, we first need to go over a couple of classes and interfaces provided by the .NET Framework classes to support asynchronous method calls. First off, we'll look at the IAsyncResult interface. IAsyncResult instances hold information about an asynchronous call that allows calling code to check the status of the operation. IAsyncResult exposes a number of properties:

Property	Type	Description
AsyncState	object	Extra data that is supplied by the calling method
AsyncWaitHandle	System.Threading.WaitHandle	A synchronization object that can be used to wait till the operation completes
CompletedSynchronously	bool	Whether the operation completed synchronously, that is, on the calling thread during the BeginXXX() call itself
IsCompleted	bool	Whether the operation has completed yet

We will come across this interface quite a lot in this chapter. Calls to BeginInvoke() on delegates return an IAsyncResult reference - and more generally, calls to BeginXXX() invariably do the same. This IAsyncResult is used by the calling code to check the progress of the operation.
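As a rough illustration of how calling code might use these properties, the following sketch polls IsCompleted and waits on AsyncWaitHandle between checks. The SlowSquare delegate and Square() method are hypothetical names invented for this example. It also assumes the .NET Framework: on .NET Core and later, BeginInvoke() on a delegate throws PlatformNotSupportedException.

```csharp
using System;
using System.Threading;

public static class AsyncResultSketch
{
    // Hypothetical delegate and target method, used only for illustration.
    private delegate int SlowSquare(int x);

    private static int Square(int x)
    {
        Thread.Sleep(500);   // simulate a slow operation
        return x * x;
    }

    public static void Main()
    {
        SlowSquare del = new SlowSquare(Square);

        // The last argument becomes IAsyncResult.AsyncState.
        IAsyncResult ar = del.BeginInvoke(7, null, "request #1");
        Console.WriteLine("State object: " + ar.AsyncState);

        // Poll IsCompleted, sleeping on the wait handle between checks
        // so the loop does not spin at full speed.
        while (!ar.IsCompleted)
        {
            Console.WriteLine("Still working...");
            ar.AsyncWaitHandle.WaitOne(100, false);
        }

        // The operation has finished, so EndInvoke() returns at once.
        int result = del.EndInvoke(ar);
        Console.WriteLine("Result: " + result);
    }
}
```

How many times "Still working..." appears depends on thread scheduling, so treat the loop as a demonstration of the properties rather than as a recommended waiting strategy.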

We also need to be aware of the existence of the System.Runtime.Remoting.Messaging.AsyncResult class. The significance of this class is that it implements IAsyncResult. In practice, this is the class that the CLR normally uses to provide the implementation of IAsyncResult for asynchronous delegates. Don't worry that this class is in a messaging-related namespace - AsyncResult does have uses in connection with messaging (since messaging architecture is full of asynchronous operations), but it happens to be used with delegates too. AsyncResult implements another property of interest to us, AsyncResult.AsyncDelegate, which contains a reference to the actual delegate that was originally invoked asynchronously, stored as an object reference.

There is also one other delegate we need to know about, System.AsyncCallback. This delegate is defined as follows:

 public delegate void AsyncCallback(IAsyncResult ar);

AsyncCallback is there to represent a callback method that the CLR should invoke to inform your application that the asynchronous operation has been completed - an initialized AsyncCallback can be passed to BeginInvoke() for this purpose, though you can instead pass in null if you don't want the CLR to call back any method.

That may all sound a bit much to take in. The best way to see how this pattern works in practice is to look at a specific example.

In Chapter 2, when we covered delegates, we briefly looked at the IL that is emitted when a delegate class is defined in C#, and saw that it contained two extra methods, BeginInvoke() and EndInvoke(), and we indicated in that chapter that we were going to postpone discussion of these methods until the threading chapter. For our purposes, we're not really interested in IL here. Unfortunately, the BeginInvoke() and EndInvoke() methods are only defined at the IL level - they do not exist at source-code level - so if we want to look at their definitions we'll have to examine them in IL. Suppose a delegate is defined in C# like this:

 public delegate float SomeDelegate(int input, ref int refparam, out int output);

The IL emitted by the compiler looks like this:

    .class public auto ansi sealed SomeDelegate
           extends [mscorlib]System.MulticastDelegate
    {
      .method public hidebysig specialname rtspecialname
              instance void  .ctor(object 'object',
                                   native int 'method') runtime managed
      {
      } // end of method SomeDelegate::.ctor

      .method public hidebysig virtual instance float32
              Invoke(int32 input,
                     int32& refparam,
                     [out] int32& output) runtime managed
      {
      } // end of method SomeDelegate::Invoke

      .method public hidebysig newslot virtual
              instance class [mscorlib]System.IAsyncResult
              BeginInvoke(int32 input,
                          int32& refparam,
                          [out] int32& output,
                          class [mscorlib]System.AsyncCallback callback,
                          object 'object') runtime managed
      {
      } // end of method SomeDelegate::BeginInvoke

      .method public hidebysig newslot virtual
              instance float32 EndInvoke(int32& refparam,
                                         [out] int32& output,
                                         class [mscorlib]System.IAsyncResult result) runtime managed
      {
      } // end of method SomeDelegate::EndInvoke
    } // end of class SomeDelegate

The key methods to notice are BeginInvoke() and EndInvoke(). What has been emitted is the IL equivalent of this:

    // Pseudo-code. No method implementation supplied.
    IAsyncResult BeginInvoke(
        int input, ref int refparam, out int output, AsyncCallback callback,
        object state);
    float EndInvoke(ref int refparam, out int output, IAsyncResult result);

Although these methods don't exist in any actual C# source code, the C# compiler knows they are there in the IL, and so will allow you to write code that invokes them (and although we are using C# for our discussion, the situation is exactly the same in both VB and C++). Note also that the implementations of these methods are supplied automatically by the runtime - hence the runtime flags in the IL method signatures.

Instead of invoking the delegate, the client code calls the BeginInvoke() method. BeginInvoke() has the same signature as the synchronous method, except for its return type and an extra two parameters. The return type is always an IAsyncResult interface reference, which the client code can use to monitor the progress of the operation. The first extra parameter is an AsyncCallback delegate instance, which must have been set up by the client, and which supplies details of a callback method that will be invoked automatically when the operation is completed. If you don't want any callback to be invoked, you can set this parameter to null. The second extra parameter is there in case you want to store any information about the asynchronous operation for later use. It's not used by the asynchronous operation, but will be passed on through the returned IAsyncResult interface: the AsyncState property of this interface instance will be set to refer to this object, just in case the client code needs it. In most cases, you'll pass in null for this parameter.

EndInvoke() is of course the method that is used to return the result. Its signature is a little different: it has the same return type as the original delegate (that makes sense because EndInvoke() needs to return the result), and it has any parameters that are passed to the delegate as either ref or out parameters. Again, that makes sense because these parameters will contain return values too. However, if the original delegate was expecting any parameters passed by value (whether value types or reference types), these will not be present in the EndInvoke() parameter list. There's no point because these parameters can't return any values. EndInvoke() is not called to pass values into the operation - it's there solely to retrieve return values! There is one other parameter to EndInvoke() - an IAsyncResult interface reference. The CLR's implementation of EndInvoke() will internally use this parameter to figure out which operation is the one that we want the return value from. In other words, if you have a number of asynchronous operations that you've started off using BeginInvoke(), then the IAsyncResult parameter tells the CLR which of the BeginInvoke() calls you want the return value from. Obviously, you'll need to pass in the value returned from the BeginInvoke() call here.
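The parameter rules just described can be sketched with the SomeDelegate definition from earlier. The Compute() method and the RefOutSketch class are hypothetical, added only so there is something for the delegate to wrap, and the sketch assumes the .NET Framework (delegate BeginInvoke() is not supported on .NET Core and later).

```csharp
using System;

public delegate float SomeDelegate(int input, ref int refparam, out int output);

public static class RefOutSketch
{
    // Hypothetical target method whose signature matches SomeDelegate.
    private static float Compute(int input, ref int refparam, out int output)
    {
        refparam += input;
        output = input * 2;
        return input / 2.0f;
    }

    public static void Main()
    {
        SomeDelegate del = new SomeDelegate(Compute);
        int refparam = 10;
        int output;

        // BeginInvoke() takes every parameter of the delegate,
        // followed by the callback and the state object.
        IAsyncResult ar = del.BeginInvoke(4, ref refparam, out output, null, null);

        // EndInvoke() takes only the ref and out parameters, followed by
        // the IAsyncResult. The by-value 'input' parameter is absent.
        float result = del.EndInvoke(ref refparam, out output, ar);

        Console.WriteLine("result = " + result +
                          ", refparam = " + refparam +
                          ", output = " + output);
    }
}
```

The values to rely on are those available after EndInvoke() returns, since that is the call that hands back the return value and the ref and out results.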

You might think I'm putting quite a lot of emphasis on explaining how to work out the parameter lists for BeginInvoke() and EndInvoke(). There's a good reason for this. At the time of writing, VS.NET IntelliSense for C# doesn't display their signatures - which means that the only way of finding out what parameters are going to be expected by a given BeginInvoke()/EndInvoke() method, apart from inspecting the compiled IL, is to work it out manually. However, this problem is likely to be fixed in future versions of VS.NET.

Notice that you have a clear choice in how you write your client code. You can either:

Arrange for the client thread to call BeginInvoke(), then immediately go off and do some other work, and finally call EndInvoke(). This means that EndInvoke() will be executed on the client thread.
Call BeginInvoke() on the client thread, passing in an AsyncCallback that indicates a callback method. The client thread is then free to forget all about the asynchronous operation. This means that EndInvoke() will actually be executed on a thread-pool thread, presumably (though that's not documented) the same thread used to perform the asynchronous operation.

Note that you must choose one or the other - you can't do both, since you'll get an exception if EndInvoke() is called more than once for the same asynchronous operation.

Which of these techniques you choose depends on which solution best suits your particular situation. The first option has the disadvantage that the client thread has no direct way of knowing when the operation has been completed, other than by polling - that is to say, checking the IsCompleted property of the IAsyncResult interface every so often. If it calls EndInvoke() too early, it will block the client thread until the operation is complete. On the other hand, because it's the client thread that calls EndInvoke(), this thread has direct access to the values returned. The second option may be better if the client thread has other important work to do and doesn't directly need to access the returned results (or if it's a method that doesn't return any values). However, if the client thread does need access to the return values, it will still have to poll to find out if the values are there yet. And unless those values are stored in some member fields, it will have to communicate somehow with the callback function to retrieve those values, which brings up thread synchronization issues.

In the following sample we'll demonstrate both approaches, so you can judge their relative merits.

Asynchronous Delegates Sample

Now that we have seen the theory of asynchronous delegates, we are going to put it into practice by developing a sample that illustrates the different ways that a delegate can be invoked, both synchronous and asynchronous. For this sample, we are going to assume that we have a database of names and addresses. We will define a GetAddress() method, which takes a name as a parameter and pretends to look up the addresses in a database. This is the kind of operation that could take a short while to complete if, for example, the database is located remotely. To keep things simple, GetAddress() won't actually access any database: it simply pauses for one second (to simulate the delay) and then returns one of a couple of hard-coded values. The sample will involve wrapping the GetAddress() method in a delegate. It will then invoke the delegate several times in each of three ways:

Synchronously
Asynchronously, with the result returned via a callback function
Asynchronously, with the main thread later calling EndInvoke() to retrieve the address

The namespaces we will need for this sample (and indeed, all the remaining samples in this chapter) are:

    using System;
    using System.Threading;
    using System.Runtime.Remoting;
    using System.Runtime.Remoting.Messaging;
    using System.Text;

Before we go over the code for the asynchronous calls, I want to present a quick helper utility class that will be used to display thread information:

    public class ThreadUtils
    {
       public static string ThreadDescription(Thread thread)
       {
          StringBuilder sb = new StringBuilder(100);
          if (thread.Name != null && thread.Name != "")
          {
             sb.Append(thread.Name);
             sb.Append(",  ");
          }
          sb.Append("hash: ");
          sb.Append(thread.GetHashCode());
          sb.Append(",  pool: ");
          sb.Append(thread.IsThreadPoolThread);
          sb.Append(",  backgrnd: ");
          sb.Append(thread.IsBackground);
          sb.Append(",  state: ");
          sb.Append(thread.ThreadState);
          return sb.ToString();
       }

       public static void DisplayThreadInfo(string context)
       {
          string output = "\n" + context + "\n " +
                          ThreadDescription(Thread.CurrentThread);
          Console.WriteLine(output);
       }
    }

This code shouldn't need any explanation. It simply means that we can call DisplayThreadInfo() to write out quite a bit of information about the currently executing thread so we can see exactly what the code is doing. Unfortunately, System.Threading.Thread.ToString() doesn't appear to have been implemented by Microsoft to do anything intelligent - it simply displays the type name - so I've defined the ThreadUtils class as a substitute.

Now let's look at how we use the delegates. The GetAddress() method, and the various wrapper methods that use delegates to invoke GetAddress() indirectly, are located in a class which we will call the DataRetriever class. We will start off with the GetAddress() method itself:

    public class DataRetriever
    {
       public string GetAddress(string name)
       {
          ThreadUtils.DisplayThreadInfo("In GetAddress...");

          // Simulate waiting to get results off database servers
          Thread.Sleep(1000);

          if (name == "Simon")
             return "Simon lives in Lancaster";
          else if (name == "Wrox Press")
             return "Wrox Press lives in Acocks Green";
          else
             throw new ArgumentException("The name " + name +
                                         " is not in the database", "name");
       }

As you can see, this method first displays the details of the thread it is running on. It then returns a string indicating an address of "Simon lives in Lancaster" if "Simon" is passed in as the name, "Wrox Press lives in Acocks Green" if "Wrox Press" is passed in, or throws an exception for any other name. This will give us the chance to illustrate the catching of asynchronous exceptions.

We also need to define a delegate with the appropriate signature for GetAddress():

 public delegate string GetAddressDelegate(string name);

Based on this delegate definition, we can call BeginInvoke() and EndInvoke() methods with the following signatures:

    // Pseudo-code - these definitions do not actually exist in any source code
    public IAsyncResult GetAddressDelegate.BeginInvoke(string name,
                                           AsyncCallback callback, object state);
    public string GetAddressDelegate.EndInvoke(IAsyncResult result);

Now for the wrapper methods; first, for calling the delegate synchronously:

    public void GetAddressSync(string name)
    {
       try
       {
          GetAddressDelegate dc = new GetAddressDelegate(this.GetAddress);
          string result = dc(name);
          Console.WriteLine("\nSync: " + result);
       }
       catch (Exception ex)
       {
          Console.WriteLine("\nSync: a problem occurred: " + ex.Message);
       }
    }

The delegate wrapper methods in this sample don't merely call GetAddress() - they also display the results and catch any errors arising from an incorrect name. The code for GetAddressSync() should be fairly self-explanatory. You will note that in real production code, we would not use a delegate in quite the way done here. GetAddressSync() is in the same class as GetAddress(), and it is known at compile time which method is to be called, so there is no need to use a delegate at all in this particular case! We have done it this way in order to provide a fair comparison between the synchronous and asynchronous techniques for delegate invocation.

Next, let's examine the way that a delegate is called asynchronously using EndInvoke() to get the results:

    public void GetAddressAsyncWait(string name)
    {
       GetAddressDelegate dc = new GetAddressDelegate(this.GetAddress);
       IAsyncResult ar = dc.BeginInvoke(name, null, null);

       // Main thread can in principle do other work now

       try
       {
          string result = dc.EndInvoke(ar);
          Console.WriteLine("\nAsync waiting : " + result);
       }
       catch (Exception ex)
       {
          Console.WriteLine("\nAsync waiting, a problem occurred : " +
                            ex.Message);
       }
    }

In this method, the delegate is defined in the same way, but invoked using BeginInvoke(), passing null for the callback delegate. We also pass in null for the state information. BeginInvoke() returns immediately, and at this point the main thread is free to do something else while the background request is processed. Having the main thread free to do something else is of course the main reason for invoking the delegate asynchronously. However, since this is only a sample, there's nothing else for us to do, so instead I've simply left a comment in the code reminding us of the fact. The code in the sample proceeds to call EndInvoke() immediately - something that it would not be sensible to do in production code since, if that's all you wanted to do, you'd just have called the method synchronously! EndInvoke() will block until the asynchronous operation has completed, and then return the results. Notice that I've placed the try block around the EndInvoke() call, since this is the point at which an exception will be raised on the main thread if the asynchronous operation threw an exception.

Finally, we'll look at the technique for using a callback method. First, we need actually to define a callback method. This method forms part of the same DataRetriever class in our sample, though you can define it in a different class if you wish:

    public void GetResultsOnCallback(IAsyncResult ar)
    {
       GetAddressDelegate del = (GetAddressDelegate)
                                       ((AsyncResult)ar).AsyncDelegate;
       try
       {
          string result;
          result = del.EndInvoke(ar);
          Console.WriteLine("\nOn CallBack: result is " + result);
       }
       catch (Exception ex)
       {
          Console.WriteLine("\nOn CallBack, problem occurred: " + ex.Message);
       }
    }

The first thing we need to do is to retrieve the delegate that we need to call EndInvoke() against from the IAsyncResult interface reference. To do that, we need to cast the interface reference to the AsyncResult class that is actually used to implement the interface. (Why Microsoft didn't define the callback signature so it was expecting an instance of this class in the first place - saving us an explicit cast operation - is one of those little mysteries that we'll quietly forget about.) Then we call EndInvoke() inside the usual try block. The main differences between the way the callback method and the GetAddressAsyncWait() method get the results arise from the facts that: (a) GetAddressAsyncWait() already has the delegate reference, whereas GetResultsOnCallback() has to retrieve it from the AsyncResult object; and (b) we know that if GetResultsOnCallback() has been invoked, the asynchronous operation has already finished, so we know that EndInvoke() will not block the thread.

Now for the code to start the asynchronous operation with a callback:

    public void GetAddressAsync(string name)
    {
       GetAddressDelegate dc = new GetAddressDelegate(this.GetAddress);
       AsyncCallback cb = new AsyncCallback(this.GetResultsOnCallback);
       IAsyncResult ar = dc.BeginInvoke(name, cb, null);
    }

This code simply sets up an AsyncCallback delegate with the callback method, and calls BeginInvoke().
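As an aside, the state parameter of BeginInvoke() offers another way for a callback to get hold of the delegate: pass the delegate itself as the state object and read it back from IAsyncResult.AsyncState, so no cast to AsyncResult is needed. The sketch below is a variation on the sample rather than part of it. The StateParameterSketch class, the simplified GetAddress() method, and the ManualResetEvent used to keep Main() alive are all assumptions made for illustration, and the code assumes the .NET Framework.

```csharp
using System;
using System.Threading;

public delegate string GetAddressDelegate(string name);

public static class StateParameterSketch
{
    // Signalled by the callback so Main() knows when to exit.
    private static readonly ManualResetEvent done = new ManualResetEvent(false);

    // Simplified stand-in for the sample's GetAddress() method.
    private static string GetAddress(string name)
    {
        Thread.Sleep(200);
        if (name == "Simon")
            return "Simon lives in Lancaster";
        throw new ArgumentException("The name " + name +
                                    " is not in the database", "name");
    }

    private static void OnCompleted(IAsyncResult ar)
    {
        // The delegate was passed as the state object, so it can be
        // recovered from AsyncState with a single cast.
        GetAddressDelegate del = (GetAddressDelegate)ar.AsyncState;
        try
        {
            Console.WriteLine("On callback: " + del.EndInvoke(ar));
        }
        catch (Exception ex)
        {
            Console.WriteLine("On callback, problem occurred: " + ex.Message);
        }
        finally
        {
            done.Set();
        }
    }

    public static void Main()
    {
        GetAddressDelegate del = new GetAddressDelegate(GetAddress);
        del.BeginInvoke("Simon", new AsyncCallback(OnCompleted), del);

        // Thread-pool threads are background threads, so wait here;
        // otherwise the process could exit before the callback runs.
        done.WaitOne();
    }
}
```

The trade-off is that the state slot is then used up by the delegate; if you need to pass other data as well, you would have to wrap both in a small object of your own.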

Finally, we need a Main() method which tests the above code:

    public class EntryPoint
    {
       public static void Main()
       {
          Thread.CurrentThread.Name = "Main Thread";
          DataRetriever dr = new DataRetriever();

          dr.GetAddressSync("Simon");
          dr.GetAddressSync("Wrox Press");
          dr.GetAddressSync("Julian");

          dr.GetAddressAsync("Simon");
          dr.GetAddressAsync("Julian");
          dr.GetAddressAsync("Wrox Press");

          dr.GetAddressAsyncWait("Simon");
          dr.GetAddressAsyncWait("Wrox Press");
          dr.GetAddressAsyncWait("Julian");

          Console.ReadLine();
       }
    }

Our Main() method first attaches a name to the main thread - so we can easily identify it when displaying thread information. Then we set up a new DataRetriever object, and start obtaining addresses. We have three test names to pass in: "Simon", "Wrox Press", and "Julian", and we pass each of these names in to GetAddress() using each of the three delegate techniques we've covered. The name Julian is obviously going to cause an exception. Note that, because of the way we have implemented the methods in the DataRetriever class, we will get a message displaying the thread each time GetAddress() is invoked. We will separately get the result of the operation displayed (without thread information) about a second later.

Running the AsyncDelegates sample gives us this result:

    In GetAddress...
     Main Thread, hash: 2, pool: False, backgrnd: False, state: Running
    Sync: Simon lives in Lancaster
    In GetAddress...
     Main Thread, hash: 2, pool: False, backgrnd: False, state: Running
    Sync: Wrox Press lives in Acocks Green
    In GetAddress...
     Main Thread, hash: 2, pool: False, backgrnd: False, state: Running
    Sync: a problem occurred: The name Julian is not in the database
    Parameter name: name
    In GetAddress...
     hash: 17, pool: True, backgrnd: True, state: Background
    In GetAddress...
     hash: 19, pool: True, backgrnd: True, state: Background
    In GetAddress...
     hash: 21, pool: True, backgrnd: True, state: Background
    On CallBack: result is Simon lives in Lancaster
    In GetAddress...
     hash: 17, pool: True, backgrnd: True, state: Background
    On CallBack, problem occurred: The name Julian is not in the database
    Parameter name: name
    On CallBack: result is Wrox Press lives in Acocks Green
    Async waiting : Simon lives in Lancaster
    In GetAddress...
     hash: 19, pool: True, backgrnd: True, state: Background
    Async waiting : Wrox Press lives in Acocks Green
    In GetAddress...
     hash: 21, pool: True, backgrnd: True, state: Background
    Async waiting, a problem occurred : The name Julian is not in the database
    Parameter name: name

The first few items in this output are as expected. The program makes three synchronous calls to GetAddress() in succession. If you run the sample, you'll notice the one-second delays before obtaining each result. Next, we make some asynchronous calls. Each call is delegated to a new thread. On the first asynchronous call, the CLR sees that there aren't yet any threads in the pool, so it creates one (which happens to have hash 17) and sets GetAddress() running on this thread. In fact, the CLR does more than that - the thread pool itself is only constructed when the CLR first sees that it is going to be used, so this is the point at which the thread pool itself will be created, which means that this first asynchronous call will take a little time to set up. Immediately after that, the main thread asks for another asynchronous operation. The CLR inspects the thread pool and finds that it contains one thread, but that thread is already occupied. So it creates another one - this time with hash 19. Then the same thing happens again for the third asynchronous call, and we get a new thread with hash 21. It's tempting to see the obvious pattern in the hash values, but do remember that the values themselves are irrelevant - in principle you can't deduce anything from them other than the fact that if you get the same hash value twice, you know you're looking at the same logical CLR thread.

Now something interesting happens. While we've been busy setting up these threads, the first asynchronous result comes through. Simon lives in Lancaster. Hey, I already knew that, but it's nice to get confirmation that I am currently living in the correct house! That means that the thread with hash 17 is now free - so the next asynchronous operation will get sent to this thread - as is confirmed by the next In GetAddress... message. This fourth asynchronous operation is also the first one in which the main thread waits for the result to come back before doing anything else. While we are waiting, the results for the second and third asynchronous calls come through. Notice that the Julian exception arrives just before the Wrox Press result, even though we sent the Wrox Press request off first. That emphasizes one of the main things to watch for with threading - you must allow for the fact that you can't guarantee the order that results will be returned when two or more threads are running in parallel.

Next the Simon result comes back from the fourth operation - and at this point the main thread is awake again, and can make the next request. Since all the thread pool threads are now free, the CLR will just pick any one of them - it goes for the one with hash 19. The final results arrive in sequence since the main thread has to wait for each one before sending off the next request. Remember that, although it looks like sending the request off and waiting hasn't gained us anything compared to making a synchronous request, in a real application the main thread could be doing some other useful work in this time.

What I've presented here are the results on my machine. If you run the sample on your computer, the results are likely to be different depending on how fast your computer is and therefore how long it takes the CLR to execute the code in the Main() method on your machine. You might also want to experiment with varying the delay time in GetAddress() to see what effect that has on your output.