Platform Changes | The Visual Basic .NET Programming Language


Above and beyond the changes to the type system in Visual Basic .NET, a number of changes were required by the underlying structure of the CLR itself.

Deterministic Finalization and Garbage Collection

The CLR was built from the ground up to be a garbage-collected system. That is to say, the CLR tracks all references to objects allocated on the heap. Periodically, the runtime checks to see if there are objects on the heap that no longer have any references to them. If there are any such objects, they are garbage collected, that is, freed. This scheme for freeing heap allocations is different from the scheme employed by COM. COM employed a reference-counting mechanism to track references to objects. When a reference to an object was copied, the copier would call the AddRef method on the object to let it know that there was another reference to it. When a piece of code was finished with a reference to an object, it called the Release method on the object to let it know that it was finished with the reference. When the number of references to an object reached zero, the object would free itself.

The advantage of the COM scheme is that all objects are deterministically finalized. That is, an object is freed at the very moment that the last reference to it is released. In the CLR, in comparison, an object will not be freed until sometime after the last reference is released, probably the next time the garbage collector runs. Deterministic finalization is advantageous because an object may itself be holding on to other resources, some of which may be scarce (like a database connection). In that case, as soon as you free an object, you want the resources it holds to be freed as well. Even if the garbage collector runs every few milliseconds, that may not be fast enough to prevent resources from being used up.

The advantage of the CLR scheme is that it is very hard to get the COM scheme right. Many thousands of hours have been spent collectively in the software industry tracking down AddRef/Release problems. It is very easy when you are programming against a COM object to forget to do an AddRef when you copy a reference, resulting in a crash if the object is finalized too early. It is even easier to forget to Release a reference once you are finished with it, resulting in memory leaks and objects that hang around forever and never release their resources.

Another advantage of the CLR scheme is that it easily solves the problem of circular references. A circular reference is when object A has a direct or indirect reference to object B and B has a direct or indirect reference to A; thus each object holds a reference to the other. Once a circular reference has been created, in COM the developer has to explicitly use some other mechanism to break the cycle and make the objects finalize. However, because the CLR knows about all references between objects, the garbage collector can determine when a set of objects that contain circular references is no longer referenced from outside the cycle and finalize all the objects at once.

By and large, the problems with the COM scheme listed above were hidden from Visual Basic developers by the compiler, except for circular references. The compiler would correctly insert AddRef and Release calls into the compiled Visual Basic code to manage the references. So the fact that the CLR uses garbage collection does not buy Visual Basic programmers as much as it buys programmers who write in other languages such as C++, although there are still benefits such as circular reference finalization.

Unfortunately, there is no easy way to have deterministic finalization and garbage collection coexist within the same type system. A full discussion of this issue is beyond the scope of this appendix, but the core problem can be boiled down to a single question: is the type Object deterministically finalized or garbage collected? If the answer is that it is garbage collected, then casting an instance of a deterministically finalized type to Object would have to cause the instance to lose its deterministic finalization property (because once it was typed as Object, AddRef/Release would not be called). Given that there are many situations (such as storing references in collections) where instances need to be cast to Object, this is not a workable solution. If Object is deterministically finalized, on the other hand, then effectively all types must be deterministically finalized, which means that garbage collection goes out the window. One could work around the question of Object by splitting the type system into two completely separate camps, garbage-collected types and deterministically finalized types, but then there would have to be two different root Object types, which would cause everything that takes Object (such as arrays and collections) to need two versions: one for deterministically finalized types and one for garbage-collected types.

The end result of all this is that there is no solid replacement for deterministic finalization in Visual Basic .NET. An interface, IDisposable, has been added to the .NET Framework, which a class should implement if it holds on to precious resources that should be disposed of as soon as possible, but the onus is on the developer to correctly call the Dispose method when they are finished with the instance.
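As a minimal sketch of that pattern, the following shows a class that implements IDisposable and a caller that uses Try...Finally so that Dispose runs even if an exception is thrown. The class ScarceResource and its members are hypothetical names invented for this illustration; only IDisposable and its Dispose method come from the .NET Framework.

```vb
Imports System

' Hypothetical class standing in for something that owns a scarce resource.
Public Class ScarceResource
    Implements IDisposable

    Private disposed As Boolean = False

    Public Sub Use()
        If disposed Then
            Throw New ObjectDisposedException("ScarceResource")
        End If
        Console.WriteLine("Using the resource")
    End Sub

    ' Releases the resource immediately rather than waiting for the collector.
    Public Sub Dispose() Implements IDisposable.Dispose
        If Not disposed Then
            Console.WriteLine("Resource released")
            disposed = True
        End If
    End Sub
End Class

Module DisposeExample
    Sub Main()
        Dim r As New ScarceResource()
        Try
            r.Use()
        Finally
            ' The caller, not the runtime, is responsible for this call.
            r.Dispose()
        End Try
    End Sub
End Module
```

If the caller forgets the Finally block, nothing in the runtime calls Dispose on its behalf at a predictable time, which is exactly the gap described above.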

Let and Set Assignment

In the past, Visual Basic distinguished between value assignment (Let) and reference assignment (Set). While this can be a useful distinction, one of the reasons it was necessary in Visual Basic was parameterless default properties. In previous versions of Visual Basic, classes can expose a default property that has no parameters. In that case, the default property is considered the "value" property of the object (indeed, in most cases the parameterless default property is named Value). Because of this, there have to be two forms of assignment that distinguish whether you are assigning the "value" of an object or the actual reference to an object. For example:

    Sub Main()
      Dim s As String
      Dim t1 As TextBox
      Dim t2 As TextBox
      Let s = t1    ' Assigns the value of the text box
      t2 = t1       ' Assigns the value of text box t1 to text box t2
      Set t2 = t1   ' Assigns the reference to t1 to t2
    End Sub

Because COM distinguished between the Let and Set styles of assignment, properties had to be able to be defined with both kinds of accessors.

    Property Set foo(Value As Variant)
    End Property
    Property Let foo(Value As Variant)
    End Property

The problem is that the CLR does not distinguish between Let and Set types of assignment, so properties can only define one kind of assignment accessor. This is because most programming languages do not make a distinction between types of assignment, and in this case the majority won out. It would have been possible for Visual Basic to define both types of accessors but only expose one to other languages, but this was very problematic: if you had an Object property, which can take values and references, which one should be exposed? No matter which one was chosen, it would be wrong in many cases. And the reverse problem would occur with properties defined in other languages.

After much deliberation and consideration of alternatives, we decided the simplest way was to drop the distinction between Let and Set forms of assignment. This meant that properties could only define one kind of assignment accessor. It also meant that parameterless default properties had to be removed from the language. Although parameterless default properties were useful, they could also be obscure and confusing, so this was not considered a huge loss. While the loss of the two distinct types of assignment tends not to be that significant for the Visual Basic .NET developer, it does create a headache when interoperating with COM. Just as Visual Basic .NET would have had to do, the CLR can expose only one accessor, Let or Set, of a COM property that has both. The decision was made to always expose the Set accessor of the property from the CLR, so code that wants to call the Let accessor of the property has to do so by calling the accessor directly. For a property, Foo, that exposes both a Let and a Set, the interoperability layer will expose a method called let_Foo that allows calling the Let accessor directly.
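To illustrate what dropping the distinction looks like in Visual Basic .NET code, here is a small sketch. The class Box and its Value property are hypothetical and stand in for an object that would once have had a parameterless default property.

```vb
Imports System

' Hypothetical class; Value plays the role of the old parameterless
' default property.
Public Class Box
    Private valueLocal As String

    ' Only one assignment accessor (Set) can be defined.
    Public Property Value() As String
        Get
            Return valueLocal
        End Get
        Set(ByVal NewValue As String)
            valueLocal = NewValue
        End Set
    End Property
End Class

Module AssignmentExample
    Sub Main()
        Dim b1 As New Box()
        Dim b2 As Box
        Dim s As String

        b1.Value = "hello"
        s = b1.Value   ' The "value" must be named explicitly.
        b2 = b1        ' Always a reference assignment; no Set keyword.

        b2.Value = "changed"
        Console.WriteLine(b1.Value)  ' b1 and b2 refer to the same object.
    End Sub
End Module
```

Because plain assignment always copies the reference, there is no longer any ambiguity about what `b2 = b1` means.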

Late Binding

Late binding is the mechanism by which resolution of a method call can be deferred until runtime. Essentially, when you are making a call on a variable typed as Object, the compiler instead emits a call to a helper, with all the relevant information about the method call. At runtime, the helper will resolve which method (if any) to call and then make the call itself.
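A short sketch of what this looks like from the programmer's side follows. It assumes Option Strict is off, since late-bound calls are not permitted otherwise; the module name and the method name NoSuchMethod are made up for the example.

```vb
Option Strict Off

Imports System

Module LateBindingExample
    Sub Main()
        Dim o As Object = "hello"

        ' o is typed as Object, so this call is resolved at run time
        ' by the late-binding helper rather than by the compiler.
        Console.WriteLine(o.ToUpper())

        Try
            ' Compiles, but fails at run time: String has no such method.
            o.NoSuchMethod()
        Catch e As MissingMemberException
            Console.WriteLine("Late-bound call failed: " & e.Message)
        End Try
    End Sub
End Module
```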

The first challenge in implementing late binding was adapting to the change from COM to the CLR. In COM, late binding was handled by a component called OLE Automation through an interface called IDispatch. IDispatch relied on the information specified in a file called a type library that described a class. Type libraries were either compiled into or accompanied a COM component. In contrast, late binding in the CLR is done through a component called reflection. Whereas IDispatch and type libraries were not easily accessible in previous versions of Visual Basic and were considered more of an internal implementation detail, reflection is a full-fledged part of the .NET Framework. Reflection allows inspection of the type information that is part of every assembly (i.e., the equivalent of a type library compiled into the executable). It also has some mechanisms for doing late binding, but they are extremely simple, only allowing you to inspect the methods that an object has and then invoke one of them.
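For comparison, the simple mechanism that reflection itself offers can be sketched as follows: look up a method by name and signature on the run-time type, then invoke it. This is only an illustration of the raw reflection calls, not of the Visual Basic helpers built on top of them.

```vb
Imports System
Imports System.Reflection

Module ReflectionExample
    Sub Main()
        Dim o As Object = "hello"

        ' Look up the parameterless ToUpper method on the run-time type.
        Dim m As MethodInfo = o.GetType().GetMethod("ToUpper", Type.EmptyTypes)

        ' Invoke it; Nothing means there are no arguments.
        Dim result As Object = m.Invoke(o, Nothing)
        Console.WriteLine(result)
    End Sub
End Module
```

Note that the caller has to pick the exact overload by hand; nothing here applies Visual Basic's conversion or overload resolution rules.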

Because we needed to express the full set of Visual Basic binding semantics, we had to write special helpers that sit on top of reflection to do that binding. When a late-bound helper is invoked, it first goes through much the same process as the compiler does at compile time to determine what method to call. The runtime binder considers inheritance and name hiding, and does overload resolution. It understands Visual Basic's conversion rules and knows what calls are valid and which are not. It is a relatively complex piece of code.

Another challenge had to do with reference parameters. IDispatch was built using Variants, which, as noted before, have the ability to store pointers in them. This meant that when a late-bound call was made, Visual Basic could pass ByRef Variants to IDispatch for each argument. Then, if the actual parameter of the method being called was declared ByRef, the pointer could be passed in, and the method would work as expected. Because the CLR does not have a real equivalent to ByRef Variants, we had to employ the same copy-in/copy-out mechanism that we use for ByRef Object parameters, with an additional twist.

When you are making an early-bound call to a method, it is possible for the compiler to know which parameters are ByRef Object and which are not. So it can figure out which arguments, if any, need to be passed using copy-in/copy-out semantics. But when you are making a late-bound call, it's impossible to know until runtime which parameters are ByRef. So there is no way to know at compile time whether or not to generate code to do the copy-out. The ugly, but unavoidable, solution is to pass an array of Boolean values corresponding to the argument list to the late-binding helper. Once the late-binding helper has determined which method it is going to call, it sets to True each element that corresponds to a ByRef parameter. Then, the compiler generates code to check the element for each argument that could do a copy-back (i.e., ignoring literals, read-only fields, and so on) and do a copy-back if the element has been set to True.
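The visible effect of this copy-back can be sketched with a late-bound call to a method that has a ByRef parameter. The class Counter and its Increment method are hypothetical names used only for this example, and Option Strict is assumed to be off.

```vb
Option Strict Off

Imports System

' Hypothetical class with a ByRef parameter.
Public Class Counter
    Public Sub Increment(ByRef n As Integer)
        n += 1
    End Sub
End Class

Module CopyBackExample
    Sub Main()
        Dim o As Object = New Counter()
        Dim i As Integer = 1

        ' Late-bound call: the compiler cannot know that n is ByRef, so it
        ' generates a conditional copy-back into i after the call.
        o.Increment(i)

        ' i reflects the change made inside Increment.
        Console.WriteLine(i)
    End Sub
End Module
```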

It's also worth noting here that the change in the locus of identity from interfaces in COM to classes in the CLR affects late binding too. In COM, you would late bind to the particular interface that you happened to have in your hand when you made the call. However, since the CLR only sees instances of classes, it is never possible to have an instance of an interface in hand. This means that it is not possible to late bind to interfaces in the CLR, only classes. If a class, Foo, implements an interface, IBar, using a private member, there is no way to late bind to that interface member. This is an inescapable result of the design of the CLR type system.
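The following sketch shows the situation just described. The member names Bar and BarImpl are assumptions added for the illustration; Foo and IBar are the names used in the text. Option Strict is assumed to be off so that the late-bound call compiles.

```vb
Option Strict Off

Imports System

Public Interface IBar
    Sub Bar()
End Interface

Public Class Foo
    Implements IBar

    ' Private member that implements IBar.Bar; not visible on Foo itself.
    Private Sub BarImpl() Implements IBar.Bar
        Console.WriteLine("Bar called")
    End Sub
End Class

Module InterfaceLateBindingExample
    Sub Main()
        Dim f As New Foo()

        ' Early-bound call through the interface works.
        Dim b As IBar = f
        b.Bar()

        ' The late-bound call binds against the class Foo, which has no
        ' accessible member named Bar, so it fails at run time.
        Dim o As Object = f
        Try
            o.Bar()
        Catch e As MissingMemberException
            Console.WriteLine("Late-bound call failed: " & e.Message)
        End Try
    End Sub
End Module
```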

On Error and Structured Exception Handling

The On Error style of error handling has always been built on top of exception handling mechanisms provided by the operating system. With Visual Basic .NET, we decided to expose the underlying exception handling mechanism, called structured exception handling, directly to programmers through the Try statement. Structured exception handling has two advantages over On Error-style error handling. First, its structured nature encourages more discriminating and precise use of exception handling, which can in some cases result in more optimized assembly output. Second, structured exception handling gives much finer-grained control over which exceptions are handled and when. This is not to say that On Error-style error handling does not have advantages over structured exception handling. For example, there is no equivalent to Resume or Resume Next in structured exception handling.
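As a brief sketch of the finer-grained control a Try statement offers, the example below catches one specific exception type under a When filter and leaves everything else to a general handler. The module and variable names are made up for the illustration.

```vb
Imports System

Module TryExample
    Sub Main()
        Dim divisor As Integer = 0
        Try
            ' Integer division by zero raises DivideByZeroException.
            Console.WriteLine(10 \ divisor)
        Catch e As DivideByZeroException When divisor = 0
            ' Handles only this exception type, and only when the
            ' filter expression is True.
            Console.WriteLine("Cannot divide by zero")
        Catch e As Exception
            ' Any other exception ends up here.
            Console.WriteLine("Unexpected error: " & e.Message)
        Finally
            ' Runs whether or not an exception occurred.
            Console.WriteLine("Done")
        End Try
    End Sub
End Module
```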

Implementing On Error-style exception handling on top of structured exception handling was a relatively straightforward task (except for Resume and Resume Next, which will be discussed in a moment). In any method that contains an On Error statement, all the code in the method is wrapped in a Try statement. Then, a Catch handler is added that catches all exceptions. Each On Error statement sets a value in a local variable that indicates which On Error statement is currently active. When an exception occurs, the Catch handler switches on that local variable to determine where to go. For example:

    Module Test
      Sub Main()
        Dim i As Integer
        On Error Goto Foo
        i = 20
    Foo:
        On Error Goto Bar
        i = 30
    Bar:
        On Error Goto 0
        i = 40
      End Sub
    End Module

is equivalent to the following.

    Module Test
      Sub Main()
        Dim CurrentHandler As Integer
        Try
          Dim i As Integer
          CurrentHandler = 1  ' On Error Goto Foo
          i = 20
    Foo:
          CurrentHandler = 2  ' On Error Goto Bar
          i = 30
    Bar:
          CurrentHandler = 0  ' On Error Goto 0
          i = 40
        Catch e As Exception When CurrentHandler > 0
          Select Case CurrentHandler
            Case 1
              Goto Foo
            Case 2
              Goto Bar
          End Select
          Throw  ' In case something goes wrong.
        End Try
      End Sub
    End Module

As noted earlier, Resume and Resume Next are slightly more complex cases. When compiling a method that contains a Resume or Resume Next, the compiler adds a local that tracks which statement is being processed. Code is inserted at each statement to update the local. When an exception occurs and a Resume or Resume Next statement is executed (either through an On Error statement or on its own), the local is loaded (and incremented if Resume Next), and then a Select statement is executed to jump to the statement indicated by the local. For example:

    Module Test
      Sub Main()
        Dim i As Integer
        On Error Resume Next
        i = 20
        i = 30
        i = 40
      End Sub
    End Module

is equivalent to the following.

    Module Test
      Sub Main()
        Dim CurrentHandler As Integer
        Dim CurrentLine As Integer
        Try
          Dim i As Integer
    Line1:
          CurrentLine = 1
          CurrentHandler = -1  ' On Error Resume Next
    Line2:
          CurrentLine = 2
          i = 20
    Line3:
          CurrentLine = 3
          i = 30
    Line4:
          CurrentLine = 4
          i = 40
        Catch e As Exception When CurrentHandler <> 0  ' -1 must pass the filter
          Select Case CurrentHandler
            Case -1
              CurrentLine += 1
              Select Case CurrentLine
                Case 1
                  Goto Line1
                Case 2
                  Goto Line2
                Case 3
                  Goto Line3
                Case 4
                  Goto Line4
              End Select
          End Select
          Throw  ' In case something goes wrong.
        End Try
      End Sub
    End Module

It's important to note that this can only be done by the compiler and not by the CLR. Even if the CLR had a resume mechanism, it could only work based on IL instructions, whereas the Resume statement works based on statements. There's no way for the CLR to know where the "next" statement or "current" statement begins and ends.

Events and Delegates

Events in COM and the CLR are implemented very differently even though they have virtually the same functionality. In COM, a class that sources events exposes one or more event interfaces. An event interface defines callback methods to handle each event that the class exposes. A class that wishes to handle the events implements the event interface using the event handlers that it wishes to use. The class handling the events then gives this event interface to the class that sources the events (called a connection point). The sourcing class then invokes the appropriate methods when the event is raised.

In the CLR, events are built around delegates, which are managed function pointers. A delegate contains the address of a method and, if the method is an instance method, a particular instance of the containing class. A delegate can then invoke the method that it points to, given a list of arguments. Delegates are multicast, which is to say that a single delegate can point to more than one method. When the delegate is invoked, all the methods that the delegate points to are called in order.
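A minimal sketch of a multicast delegate follows. The delegate type Notify and the methods First and Second are hypothetical names chosen for the example; Delegate.Combine is the .NET Framework method that merges two delegates into one.

```vb
Imports System

Module DelegateExample
    ' Hypothetical delegate type used only for this illustration.
    Delegate Sub Notify(ByVal message As String)

    Sub First(ByVal message As String)
        Console.WriteLine("First: " & message)
    End Sub

    Sub Second(ByVal message As String)
        Console.WriteLine("Second: " & message)
    End Sub

    Sub Main()
        Dim d1 As Notify = AddressOf First
        Dim d2 As Notify = AddressOf Second

        ' Combine the two delegates into a single multicast delegate.
        Dim both As Notify = CType([Delegate].Combine(d1, d2), Notify)

        ' Invoking the combined delegate calls First and then Second.
        both.Invoke("hello")
    End Sub
End Module
```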

It is worth noting that the semantics of the AddressOf expression were extended slightly in Visual Basic .NET to make working with delegates easier. The AddressOf operator can be applied to any method and produces, essentially, an unnamed delegate type representing that method's signature. Now, the CLR type system does not actually support unnamed delegate types, so this is merely a compiler trick: the result of an AddressOf expression must ultimately be converted to a named delegate type. The compiler then inserts an instantiation for the correct delegate type at the point of conversion. If the target delegate type is not specified (for example, by converting the AddressOf expression to Object) or is ambiguous (as is possible in overload resolution), an error will result.
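The conversion rule can be sketched as follows. The delegate type Greeter and the method SayHello are hypothetical; the commented-out line shows the kind of conversion that the compiler rejects because no delegate type is identified.

```vb
Imports System

Module AddressOfExample
    ' Hypothetical named delegate type.
    Delegate Sub Greeter()

    Sub SayHello()
        Console.WriteLine("Hello")
    End Sub

    Sub Main()
        ' The target type is known, so the compiler instantiates
        ' Greeter at the point of conversion.
        Dim g1 As Greeter = AddressOf SayHello

        ' The equivalent explicit instantiation.
        Dim g2 As New Greeter(AddressOf SayHello)

        ' Not allowed: Object does not identify a delegate type.
        ' Dim o As Object = AddressOf SayHello

        g1.Invoke()
        g2.Invoke()
    End Sub
End Module
```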

An event, then, is made up of three things in the CLR: a delegate type that defines the signature of the event, a method to add a new handler for the event, and a method that removes an existing handler for the event. When a class wishes to handle an event raised by another class, it first creates a new delegate of the type of the event on the method that will handle the event. It then passes this delegate to the add handler method of the event. The add handler method combines the delegate being passed in with all the other delegates that it has been called with to handle the event. When the event is raised, the raising class simply invokes the combined delegate with the appropriate parameters, which calls each handler in turn. If a class wishes to stop handling the event, it simply creates another delegate on the handler that it no longer wants called and calls the remove handler method with that delegate. The remove handler method then removes the particular delegate from the delegate list that it is maintaining for the event.
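In Visual Basic .NET source code, the add and remove steps described above are written with the AddHandler and RemoveHandler statements. The sketch below uses hypothetical names (Raiser, E, OnE) purely for illustration.

```vb
Imports System

Public Class Raiser
    Public Event E()

    Public Sub Raise()
        RaiseEvent E()
    End Sub
End Class

Module HandlerExample
    Sub OnE()
        Console.WriteLine("E was raised")
    End Sub

    Sub Main()
        Dim r As New Raiser()

        ' Creates a delegate on OnE and passes it to the event's add method.
        AddHandler r.E, AddressOf OnE
        r.Raise()

        ' Creates an equal delegate and passes it to the remove method.
        RemoveHandler r.E, AddressOf OnE

        ' No handlers remain, so raising the event now does nothing.
        r.Raise()
    End Sub
End Module
```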

Visual Basic .NET significantly simplifies the process of defining and handling events. Although we give you access to most of the low-level definition of events, most of the time you don't need to bother with the inner guts. For example, you can declare an event with just a signature, and the compiler will define the event delegate for you under the covers. Also, if you declare a field with the WithEvents modifier, you can then declare methods in the class with the Handles clause, and the compiler will automatically call the add and remove event methods for you. For example:

    Class Raiser
      Public Event E()
      Public Sub Raise()
        RaiseEvent E()
      End Sub
    End Class
    Class Handler
      Public WithEvents R As Raiser
      Public Sub New()
        R = New Raiser()
      End Sub
      Public Sub DoRaise()
        R.Raise()
      End Sub
      Public Sub HandleE() Handles R.E
      End Sub
    End Class

is equivalent to the following.

    Class Raiser
      Public Delegate Sub EEventHandler()
      Public Event E As EEventHandler
      Public Sub Raise()
        RaiseEvent E()
      End Sub
    End Class
    Class Handler
      Private RLocal As Raiser
      Public Property R As Raiser
        Get
          Return RLocal
        End Get
        Set (ByVal Value As Raiser)
          If Not RLocal Is Nothing Then
            R.remove_E(new EEventHandler(AddressOf HandleE))
          End If
          RLocal = Value
          If Not RLocal Is Nothing Then
            R.add_E(new EEventHandler(AddressOf HandleE))
          End If
        End Set
      End Property
      Public Sub DoRaise()
        R.Raise()
      End Sub
      Public Sub HandleE()
      End Sub
    End Class

It is worth noting that there are some details, such as the add and remove methods of the event, that Visual Basic .NET does not allow you to define for yourself, although this may change in future versions.
