Type System Additions | The Visual Basic .NET Programming Language

The CLR type system is considerably richer than the COM type system, which required a number of concepts to be added to the Visual Basic language. The most important was inheritance, but overloading, namespaces, and name hiding were also significant changes.

Classes

In previous versions of Visual Basic, a class was equivalent to a COM coclass. A coclass is a collection of supported interfaces without a distinct identity: when creating a coclass, the creator always requests one of the interfaces that the coclass implements and never requests the coclass itself. Given one interface supported by a coclass, it is possible to request other supported interfaces through the IUnknown interface that all coclasses support. In general, though, it is not possible to know what coclass an object is, given just an interface pointer.

Previous versions of Visual Basic simplified this situation slightly through the concept of default interfaces. When a class was defined in Visual Basic 6.0, a hidden interface was created that contained all the public members defined in the class. The class then implemented that default interface. Because everything had to be exposed in COM through interfaces, fields had to be exposed as properties. For example, given the following class definition in Visual Basic 6.0:

    Public Value As Integer
    Public Sub PrintValue()
      ...
    End Sub
    Public Function CalculateValue() As Integer
      ...
    End Function

the following would have been generated:

An interface named _Class1 that contained four members: a property Get method for Value, a property Let method for Value, a subroutine PrintValue, and a function CalculateValue
A coclass named Class1 that implemented _Class1

In contrast to a COM coclass, a CLR class is a first-class type with its own identity and storage managed by the runtime. Classes can implement interfaces, but a reference, even to an interface, always refers to a particular instance of a class. It is never possible to refer to just an interface on its own. There are many benefits to this scheme (not the least of which is that data members do not have to be expressed as properties), but it does make interoperability between versions of Visual Basic a little more difficult.

Because the unit of identity changed between COM and the CLR, Visual Basic's behavior changed subtly in how references to interfaces and classes work. For example, given a class, Class1, that implements an interface, Interface1, the following code will produce two different answers on COM and on the CLR.

    Sub Main()
      Dim c As Class1
      Set c = New Class1      ' VB6 syntax used
      Dim i As Interface1
      Set i = c               ' VB6 syntax used
      Debug.Print TypeName(i) ' VB6 syntax used
    End Sub

Under COM, the name printed will be Interface1 because that is the unit of identity in COM (i.e., there is no true concept of Class1). On the CLR, however, the name printed will be Class1 because the thing that holds identity is the class, not the interface. No matter what interface a class is cast to, the CLR still knows what class the instance is.
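The CLR half of this comparison can be sketched in Visual Basic .NET syntax. This is a minimal illustration, not the original listing: the member DoWork and the module IdentityDemo are hypothetical names added only to make the sample compile.

```vb
Option Strict On
Imports System
Imports Microsoft.VisualBasic

Public Interface Interface1
    Sub DoWork()   ' hypothetical member, present only so the interface is not empty
End Interface

Public Class Class1
    Implements Interface1

    Public Sub DoWork() Implements Interface1.DoWork
    End Sub
End Class

Public Module IdentityDemo
    ' Returns the type name observed through an interface-typed reference.
    Public Function NameThroughInterface() As String
        Dim c As New Class1()
        Dim i As Interface1 = c   ' class to interface is a widening conversion; no Set keyword
        Return TypeName(i)        ' reports the class of the instance, not the variable's type
    End Function

    Sub Main()
        Console.WriteLine(NameThroughInterface())   ' prints Class1
    End Sub
End Module
```

The variable i is declared as Interface1, yet TypeName reports Class1, because on the CLR the instance carries its class identity with it.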

The question then became how the COM concepts should be mapped to the CLR concepts when a COM object is being accessed from the CLR. The simplest answer would be to map a COM coclass to a CLR class and a COM interface to a CLR interface. However, in this case the tricks that Visual Basic 6.0 played in COM become troublesome on the CLR. Because the unit of identity in COM is the interface, that is what is used when a COM method is called in COM. So, given a coclass, Class1, a method taking a Class1 would actually be expressed as taking the default interface, _Class1. This was fine in Visual Basic 6.0 because the compiler hid the distinction. But since both classes and interfaces are first-class types in Visual Basic .NET, this means that using COM objects would require a lot of casting if strict type checking (new in Visual Basic .NET) is used. For example, given the following class, named Class1, defined in Visual Basic 6.0:

    Private Value As Integer
    Public Function CreateNewInstance() As Class1
      Set CreateNewInstance = New Class1
    End Function

the class would have to be used in Visual Basic .NET as follows:

    Option Strict On
    Module Test
      Sub Main()
        Dim x As Class1 = New Class1()
        ' CType is required without extra work
        Dim y As Class1 = CType(x.CreateNewInstance(), Class1)
      End Sub
    End Module

The return value of CreateNewInstance has to be explicitly cast to the class Class1 because strict type checking requires that a cast from an interface to a class that supports the interface be explicit (since the instance might be of some other class type).
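The same rule can be seen without any COM object involved. The sketch below uses hypothetical types (IShape, Circle, Square) that are not part of the original example. It shows only that the interface-to-class direction is a narrowing conversion under Option Strict On.

```vb
Option Strict On
Imports System

Public Interface IShape
    Function Area() As Double
End Interface

Public Class Circle
    Implements IShape
    Public Radius As Double = 1.0

    Public Function Area() As Double Implements IShape.Area
        Return Math.PI * Radius * Radius
    End Function
End Class

Public Class Square
    Implements IShape
    Public Side As Double = 1.0

    Public Function Area() As Double Implements IShape.Area
        Return Side * Side
    End Function
End Class

Public Module CastDemo
    ' True when the interface reference actually holds a Circle.
    Public Function IsCircle(ByVal s As IShape) As Boolean
        Return TypeOf s Is Circle
    End Function

    Sub Main()
        Dim s As IShape = New Circle()       ' class to interface: widening, implicit
        ' Dim c As Circle = s                ' interface to class: narrowing, rejected under Option Strict On
        Dim c As Circle = CType(s, Circle)   ' explicit cast compiles and is checked at run time
        Console.WriteLine(c.Radius)

        Dim other As IShape = New Square()   ' the instance might be of some other class
        Console.WriteLine(IsCircle(other))
    End Sub
End Module
```

Applying CType(other, Circle) to the Square instance would compile but fail at run time with an InvalidCastException. That possibility is why the compiler insists the cast be written out.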

To avoid forcing the distinction between interfaces and classes on users as soon as they attempt to upgrade code (since it is likely that Visual Basic .NET code will want to use COM objects), we slightly modified the language rules to make this situation more straightforward. Instead of the simple mapping, the CLR itself changes the name of the default interface to the name of the coclass and appends the word "Class" onto the end of the name of the coclass. So the coclass Class1 with the default interface _Class1 would be mapped to the class Class1Class and the interface Class1.

The result of this is that a user creating an instance of a COM class will appear to actually be instantiating an instance of the default interface. Under the covers, the compiler then maps this to an instantiation of the coclass. But now the variable is correctly typed and will not require any casts when it is assigned the result of an API call or passed to another API.

Inheritance

Adding inheritance to the Visual Basic .NET language was relatively straightforward. The main decisions were how explicit to make the various inheritance concepts and what terms to use to refer to them. Since Visual Basic has a long history that does not include inheritance, we felt it was best to make the most conservative choices in regard to defaults. We also felt that it would be better to choose more descriptive (and more verbose) keywords rather than using C++-style keywords.

When implementing inheritance, we had to choose what kind of name hiding would be done across the inheritance chain. A robust name-hiding scheme was seen as critical to avoiding many of the versioning issues that Visual Basic had with COM. In particular, it was important to be able to "drop in" a new version of a base class library that contained new methods and have code compiled against it continue to work. The canonical example of this situation would be ASP.NET: it is essential to be able to upgrade ASP.NET without requiring all the Web pages on a Web site to be recoded.

For simplicity, the default name-hiding semantic we initially chose was hide by name. In other words, a member defined with a particular name would hide all members by that name in any base classes. The Shadows keyword was added to the language, but purely for developer awareness: all members were implicitly marked shadow by name, and omitting Shadows only caused a warning rather than an error.

This choice, however, presented a problem with overloading, which we were also adding to the language. Given a hide-by-name semantic, it is not possible for a method to be overloaded across a base and derived class, because the derived class members hide the base class members by the same name. At this point, it would have been possible to decide that overloading across inheritance was not allowed, but this didn't seem like a good solution. A more desirable result would be to allow developers to explicitly state that they wished to overload a method in a base class rather than hide it. To this end, we added the Overloads keyword. The Overloads keyword specifies that a member is hide by name and signature rather than hide by name. A method that hides by name and signature will only hide a method with the same name and exact signature, thus allowing overloading across the inheritance hierarchy.

    Class Base
      Public Sub A(ByVal x As Integer)
        ...
      End Sub
      Public Sub B(ByVal x As Integer)
        ...
      End Sub
    End Class

    Class Derived
      Inherits Base
      Public Shadows Sub A(ByVal y As Double)
        ...
      End Sub
      Public Overloads Sub B(ByVal y As Double)
        ...
      End Sub
      Public Sub C()
        A(10)    ' Calls Derived.A
        B(10)    ' Calls Base.B
      End Sub
    End Class

In this example, the method Derived.A shadows the member Base.A because it specifies the keyword Shadows. It does this even though the two signatures don't match. The method Derived.B, on the other hand, overloads the member Base.B because it specifies the keyword Overloads. Keep in mind, though, that if Derived.B and Base.B had the same signature, Derived.B would still hide Base.B. It's not possible to have two methods with the same signature be visible at the same time!

Design

In retrospect, our choice of the keyword Overloads in this situation was unfortunate. Although it accurately describes its function, people are easily misled into thinking that the keyword is required when they are overloading within a single class, which isn't the case. This has confused even the most advanced Visual Basic .NET programmers.
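To make the point concrete, here is a small sketch of overloading within one class with no Overloads keyword at all. The class Formatter and its Describe methods are hypothetical names chosen for illustration.

```vb
Option Strict On
Imports System

Public Class Formatter
    ' Two methods with the same name and different signatures.
    ' Neither is marked Overloads; within a single class the keyword is optional.
    Public Function Describe(ByVal x As Integer) As String
        Return "Integer"
    End Function

    Public Function Describe(ByVal x As String) As String
        Return "String"
    End Function
End Class

Public Module OverloadsDemo
    Sub Main()
        Dim f As New Formatter()
        Console.WriteLine(f.Describe(42))        ' resolves to Describe(Integer)
        Console.WriteLine(f.Describe("hello"))   ' resolves to Describe(String)
    End Sub
End Module
```

The keyword only changes behavior when a same-named member exists in a base class, which is the situation the main text describes.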

Given that a method could have one of two separate name-hiding semantics associated with it, the question arose as to which should be the default. We had initially chosen Shadows as the default, but would it make more sense to choose Overloads as the default? Ultimately, we decided that it did not make more sense, because of the implications for overload resolution. Because overload resolution chooses among all the methods in the inheritance hierarchy with the same name (see the Overloading section for more details on overload resolution), a method marked as Overloads by default was vulnerable to changes in the base class.

For example, take the following situation: A base class vendor, Acme, produces a class, Base.

    Class Base
      ...
    End Class

Another company, MegaCorp, buys Acme's product and creates a derived class, Derived.

    Class Derived
      Inherits Base
      Public Overloads Sub Print(ByVal y As Double)
        ...
      End Sub
      Public Overloads Sub Print(ByVal y As Integer)
        ...
      End Sub
      Public Sub DoWork()
        ...
        Print(4S)
        ...
      End Sub
    End Class

In this example, the method DoWork calls Print with the Short value 4. Because there is no exact overload, the method Print(Integer) is chosen as the best overload.

Now, after some period of time, Acme releases an upgrade to class Base, adding a method called Print to the class.

    Class Base
      Public Overloads Sub Print(ByVal s As Short)
        ...
      End Sub
    End Class

MegaCorp purchases the new base class, installs it, and rebuilds Derived. The problem is that because Derived.Print was marked as Overloads, the method Base.Print will now be incorporated into the overload resolution for the name Print. In this case, Base.Print(Short) is the best overload, so now Derived.DoWork will silently start calling Base.Print, even though that method may do something radically different than Derived.Print does!

The only way to completely solve this issue would be to choose a method of overload resolution similar to the one that C# uses, as discussed in the Overloading section. Barring that, choosing Shadows as the default name-hiding semantic for methods seemed the safest choice. There is still some danger in using Overloads, but the risk can be assumed explicitly by the developer when adding the keyword.

Overloading

Because overloading is a fundamental part of the CLR, it was necessary to add it to the Visual Basic .NET language. Visual Basic already had a similar mechanism for dealing with optional parameters, but in most cases, overloading is a more robust way of accomplishing the same goals. This is especially true since the values of optional parameters are compiled into the caller rather than staying under the control of the method being called.
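The difference between the two mechanisms can be sketched side by side. The class Logger and its members are hypothetical. The sample shows only the shape of each approach, not any particular library's design.

```vb
Option Strict On
Imports System

Public Class Logger
    ' Optional parameter: the default value 1 is copied into each calling
    ' assembly at compile time, so changing it later does not affect callers
    ' until they are recompiled.
    Public Function FormatWithOptional(ByVal message As String,
                                       Optional ByVal level As Integer = 1) As String
        Return level.ToString() & ": " & message
    End Function

    ' Overload pair: the default value lives inside the method body, so it
    ' stays under the control of the class that defines it.
    Public Function FormatWithOverload(ByVal message As String) As String
        Return FormatWithOverload(message, 1)
    End Function

    Public Function FormatWithOverload(ByVal message As String,
                                       ByVal level As Integer) As String
        Return level.ToString() & ": " & message
    End Function
End Class

Public Module OptionalDemo
    Sub Main()
        Dim log As New Logger()
        Console.WriteLine(log.FormatWithOptional("started"))   ' compiler supplies level = 1 at the call site
        Console.WriteLine(log.FormatWithOverload("started"))   ' method supplies level = 1 itself
    End Sub
End Module
```

Both calls produce the same text today. They differ in what happens if a later version of Logger changes the default without the caller being rebuilt.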

As discussed in the Inheritance section, the primary question that had to be answered regarding overloading was how overloading would interact with inheritance. We took what we thought was the simplest answer to the question, figuring that it would be the most straightforward and understandable: When doing overload resolution on a method, the language considers all the methods by a particular name in the inheritance hierarchy at once. This means that the most specific overload is always guaranteed to be chosen.

The downside of this is that, as previously discussed, it creates fragility in a derived class in the face of base class changes. An alternative method was considered that was closer to C#'s method of overload resolution: Consider all the methods overloaded on a name one class at a time, starting with the most derived class and moving to the most base class. This would have solved the fragility issue, but we felt that for Visual Basic programmers the results would be nonintuitive, because a less specific overload in a derived class might get chosen over a more specific overload in a base class. For example, C# and VB will choose different overloads in the following situation.

    Class Base
      Public Overloads Sub Print(ByVal s As Short)
        ...
      End Sub
    End Class

    Class Derived
      Inherits Base
      Public Overloads Sub Print(ByVal y As Double)
        ...
      End Sub
      Public Overloads Sub Print(ByVal y As Integer)
        ...
      End Sub
      Public Sub DoWork()
        Print(4S)
      End Sub
    End Class

Because C# considers overloads one level at a time, it will choose Derived.Print(Integer) as the best overload. Because VB considers all the overloads at once, it will choose Base.Print(Short) because it most closely matches the argument.

One other interesting thing to note about overload resolution in Visual Basic .NET is the way that arguments typed as Object are treated. Overload resolution against arguments typed purely as Object will, given a straightforward set of resolution rules, result in an ambiguity. This is because Object is the root type of the type system, so a conversion from Object to any other type will be a narrowing conversion. Thus, the compiler will be unable to determine a most specific argument.

To enable calls to overloaded methods when a variable is typed as Object (especially when the As clause has been omitted), an additional rule was added. If overload resolution fails solely because of arguments typed as Object, the resolution of the call is deferred until runtime. In other words, the call is implicitly turned into a late-bound call (assuming strict semantics are not being used, because strict semantics disallow late binding). The late-bound method invocation code has the ability to perform overload resolution at runtime, allowing the resolution to be done on the actual type of the parameter at runtime.
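A minimal sketch of this rule follows, assuming Option Strict Off. The class Printer and the module LateBoundDemo are hypothetical names used only for illustration.

```vb
Option Strict Off
Imports System

Public Class Printer
    Public Function Describe(ByVal x As Integer) As String
        Return "Integer"
    End Function

    Public Function Describe(ByVal x As String) As String
        Return "String"
    End Function
End Class

Public Module LateBoundDemo
    Public Function DescribeLateBound(ByVal value As Object) As String
        Dim p As New Printer()
        ' Object narrows to both Integer and String, so no overload can be
        ' picked at compile time. Because the failure is due solely to the
        ' Object-typed argument, the call is deferred and resolved at run time
        ' against the actual type of the value.
        Return p.Describe(value)
    End Function

    Sub Main()
        Console.WriteLine(DescribeLateBound(5))        ' run-time type is Integer
        Console.WriteLine(DescribeLateBound("five"))   ' run-time type is String
    End Sub
End Module
```

With Option Strict On the same call to p.Describe(value) is rejected at compile time, since late binding is disallowed under strict semantics.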

Namespaces

Visual Basic already had an extremely rudimentary concept of a namespace: in previous versions, the name of the project functioned as a namespace for everything declared within it. This allowed for multiple projects to have something named Class1 yet still reference one another. On the CLR, the concept just had to be extended to arbitrarily complex namespace schemes. We felt that most users would not want to see Namespace statements in their source code, so we still retained the original concept of a project-wide namespace. Essentially, every project defines a namespace that all declarations in the project are implicitly wrapped in. This allows users to define their own namespace hierarchies if they wish, but by default each project gets its own namespace. (The default namespace can be overridden, if needed.)
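As a sketch of how an explicit Namespace statement combines with the project-wide namespace, consider the following. The names Utilities and NamespaceDemo are hypothetical, and the project namespace itself is whatever the build is configured with, so it does not appear in the source.

```vb
Option Strict On
Imports System

' An explicit namespace. It is nested inside the project-wide namespace,
' if the project defines one.
Namespace Utilities
    Public Class Class1
        ' Reports the full name the runtime sees for this type.
        Public Function FullName() As String
            Return Me.GetType().FullName
        End Function
    End Class
End Namespace

Public Module NamespaceDemo
    Sub Main()
        ' Code in the same project can refer to the type relative to the
        ' project namespace.
        Dim c As New Utilities.Class1()
        Console.WriteLine(c.FullName())
    End Sub
End Module
```

If the project namespace were, say, MyProject, the printed name would be MyProject.Utilities.Class1. With no project namespace configured, it would be Utilities.Class1.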
