Other Variant Subtypes

Flexibility is the fundamental reason to use Variants. But the built-in flexibility of Variants is not advertised enough, and consequently they tend to be neglected. In particular, Empty, Null, and Variant arrays (and now, in version 6, UDTs) remain underused in the Visual Basic programmer community.

Empty and Null

Any uninitialized Variant has the Empty value until something is assigned to it. This is true for all variables of type Variant, whether Public, Private, Static, or local. This is the first feature to distinguish Variants from other data types—you cannot determine whether any other data type is uninitialized.

Besides testing for a VarType of zero (vbEmpty), you can use the shorthand function IsEmpty, which does the same thing but is more readable.

In early versions of Visual Basic, once a Variant was given a value, the only way to reset it to Empty was to assign to it another Variant that was itself Empty. In Visual Basic 5 and 6, you can also set it to the keyword Empty, as follows:

v1 = Empty
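As a minimal sketch of the points above (DemoEmpty is a hypothetical routine name, not something from the original text), the following shows a Variant starting out Empty, losing that state on assignment, and being reset with the Empty keyword:

```vb
Sub DemoEmpty()
    Dim v As Variant

    ' A freshly declared Variant is Empty (VarType 0, vbEmpty).
    Debug.Print IsEmpty(v), VarType(v) = vbEmpty   ' True  True

    v = 42
    Debug.Print IsEmpty(v)                         ' False

    ' Reset the Variant to its uninitialized state.
    v = Empty
    Debug.Print IsEmpty(v)                         ' True
End Sub
```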

I like Empty, although I find it is one of those things that you forget about and sometimes miss opportunities to use. Coming from a C background, where there is no equivalent, isn't much help either. But it does have uses in odd places, so it's worth keeping it in the back of your mind. File under miscellaneous.

Of course, Null is familiar to everyone as that database "no value" value, found in all SQL databases. But as a Variant subtype it can be used to mean no value or invalid value in a more general sense—in fact, in any sense that you want to use it. Conceptually, it differs from Empty in that it implies you have intentionally set a Variant to this value for some reason, whereas Empty implies you just haven't gotten around to doing anything with the Variant yet.

As with Empty, you have an IsNull function and a Null keyword that can be used directly.

Visual Basic programmers tend to convert a variable with a Null value—read, say from a database—to something else as quickly as possible. I've seen plenty of code where Null is converted to empty strings or zeros as soon as it's pulled out of a recordset, even though this usually results in information loss and some bad assumptions. I think this stems from the fact that the tasks we want to perform with data items—such as display them in text boxes or do calculations with them—often result in the all too familiar error 94, "Invalid use of Null."

This is exacerbated by the fact that Null propagates through expressions. Any arithmetic operator (+, -, *, /, \, Mod, ^) or comparison operator (<, >, =, <>) that has a Null as one of its operands will result in a Null being the value of the overall expression, irrespective of the type or value of the other operand. This can lead to some well-known bugs, such as:

v = Null
If v = Null Then
    MsgBox "Hi"
End If

In this code, the message "Hi" will not be displayed. Because v is Null, and = is just a comparison operator here, the value of the expression v = Null is itself Null. And Null is treated as False in If...Then clauses.
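The usual way around this is to test the subtype with IsNull rather than compare values. A short sketch, with DemoNullTest as a hypothetical routine name:

```vb
Sub DemoNullTest()
    Dim v As Variant
    v = Null

    ' The comparison itself evaluates to Null, so the branch is skipped.
    If v = Null Then Debug.Print "never printed"

    ' IsNull inspects the subtype instead of comparing values.
    If IsNull(v) Then Debug.Print "v is Null"

    ' The result of the comparison is itself Null.
    Debug.Print IsNull(v = Null)   ' True
End Sub
```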

The propagation rule has some exceptions. The string concatenation operator & treats Null as an empty string "" if one of its operands is a Null. This explains, for example, the following shorthand way of removing Null when reading values from a database:

v = "" & v

This will leave v unchanged if it is a string, unless it is Null, in which case it will convert it to "".

Another set of exceptions is with the logical operators (And, Eqv, Imp, Not, Or, Xor). Here Null is treated as a third truth value, as in standard many-valued logic. Semantically, Null should be interpreted as unsure in this context, and this helps to explain the truth tables. For example:

v = True And Null

gives v the value Null, but

v = True Or Null

gives v the value True. This is because if you know A is true, but are unsure about B, then you are unsure about A and B together, but you are sure about A or B. Follow?

By the way, watch out for the Not operator. Because the truth value of Null lies halfway between True and False, Not Null must evaluate to Null in order to keep the logical model consistent. This is indeed what it does.

v = Not Null
If IsNull(v) Then MsgBox "Hi"  ' You guessed it...
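To make the three-valued behavior concrete, here is a small sketch (DemoNullLogic is a hypothetical name) that evaluates And, Or, and Not with a Null operand. IsNull is used where the result is Null, so each line prints a plain Boolean:

```vb
Sub DemoNullLogic()
    Debug.Print IsNull(True And Null)    ' True:  the result is Null
    Debug.Print (False And Null)         ' False: one False operand settles And
    Debug.Print (True Or Null)           ' True:  one True operand settles Or
    Debug.Print IsNull(False Or Null)    ' True:  the result is Null
    Debug.Print IsNull(Not Null)         ' True:  the result is Null
End Sub
```

Reading Null as "unsure" predicts every line: the result is definite only when the known operand decides the answer on its own.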

That's about all on Null—I think it is the trickiest of the Variant subtypes, but once you get to grips with how it behaves, it can add a lot of value.

Arrays

Arrays are now implemented using the OLE data type named SAFEARRAY. This is a data type that, like Variants and classes, allows arrays to be self-describing. The LBound and number of elements for each dimension of the array are stored in this structure. Within the inner workings of OLE, all access to these arrays is through an extensive set of API calls implemented in the system library file OLEAUT32.DLL. You do not get or set the array elements directly, but you use API calls. These API calls use the LBound and number of elements to make sure they always write within the allocated area. This is why they are safe arrays—attempts to write to elements outside the allowed area are trapped within the API and gracefully dealt with.²

The ability to store arrays in Variants was new to Visual Basic 4, and a number of new language elements were introduced to support them such as Array and IsArray.

To set up a Variant to be an array, you can either assign it to an already existing array or use the Array function. The first of these methods creates a Variant whose subtype is the array value (8192) added to the value of the type of the original array. The Array function, on the other hand, always creates an array of Variants—VarType 8204 (which is 8192 plus 12).
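The subtype arithmetic can be checked directly with VarType. This is only an illustrative sketch (DemoArrayVarTypes is a hypothetical name) using the built-in constants vbArray (8192), vbInteger (2), and vbVariant (12):

```vb
Sub DemoArrayVarTypes()
    Dim a(0 To 3) As Integer
    Dim v As Variant

    ' Assigning an existing Integer array: subtype is 8192 + 2 = 8194.
    v = a
    Debug.Print VarType(v) = vbArray + vbInteger   ' True

    ' The Array function always builds an array of Variants: 8192 + 12 = 8204.
    v = Array(0, 1, 2, 3)
    Debug.Print VarType(v) = vbArray + vbVariant   ' True
    Debug.Print IsArray(v)                         ' True
End Sub
```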

The following code shows three ways of creating a Variant array of the numbers 0, 1, 2, 3:

Dim v As Variant
Dim a() As Integer
Dim i As Integer

' Different ways to create Variant arrays

' 1. Use the Array function
v = Array(0, 1, 2, 3) 'of little practical use
v = Empty

' 2. Create a normal array, and assign it to a Variant.
' Iterate adding elements using a normal array...
For i = 0 To 3
    ReDim Preserve a(i) As Integer
    a(i) = i
Next i
' ...and copy array to a Variant
v = a 'or v = a()
' but not v() = a()
v = Empty

' 3. Start off with Array, and then ReDim to preferred size
' avoiding use of intermediate array.
For i = 0 To 3
    ' First time we need to create array
    If IsEmpty(v) Then
        v = Array(i)
    Else
        ' From then on, ReDim Preserve will work on v
        ReDim Preserve v(i)
    End If
    v(i) = i
Next i

Notice that the only difference between the last two arrays is that one is a Variant holding an array of integers and the other is a Variant holding an array of Variants. It is easy to get confused here. Look at the following:

ReDim a(5) As Variant

This code is creating an array of Variants, but this is not a Variant array. What consequence does this have? Not much anymore. Before version 6 you could utilize array copying only with Variant arrays, but now you can do this with any variable-sized array.
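A brief sketch of whole-array copying between ordinary dynamic arrays in version 6 (DemoArrayCopy and the variable names are hypothetical). The target must be a dynamic array of the same element type:

```vb
Sub DemoArrayCopy()
    Dim src() As Long
    Dim dst() As Long

    ReDim src(0 To 2)
    src(0) = 10: src(1) = 20: src(2) = 30

    ' Whole-array assignment: dst is resized and receives its own copy.
    dst = src

    ' Changing the copy leaves the original untouched.
    dst(0) = 99
    Debug.Print src(0), dst(0), UBound(dst)   ' 10  99  2
End Sub
```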

So what is useful about placing an array in a Variant? As Variants can contain arrays, and they can be arrays of Variants, those contained Variants can themselves be arrays, maybe of Variants, which can also be arrays, and so on and so forth.

Just how deep can these arrays be nested? I don't know if there is a theoretical limit, but in practice I have tested at least 10 levels of nesting. This odd bit of code works fine:

Dim v As Variant, i As Integer

' Make v an array of two Variants, each of which is an array
' of two Variants, each of...and so on
For i = 0 To 10
    v = Array(v, v)
Next i

' Set a value...
v(0)(0)(0)(0)(0)(0)(0)(0)(0)(0)(0) = 23

How do these compare to more standard multidimensional arrays? Well, on the positive side, they are much more flexible. The contained arrays—corresponding to the lower dimensions of a multidimensional array—do not have to have the same number of elements. Figure 4-2 explains the difference pictorially.

Figure 4-2 The difference between a standard two-dimensional array (top) and a Variant array (bottom)

These are sometimes known as ragged arrays. As you can see from the diagram, we do not have all the wasted space of a multidimensional array. However, you have to weigh that against the fact that the Variant "trees" are harder to set up.
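A ragged array can be sketched in a few lines. In this illustration (DemoRaggedArray and the sample values are hypothetical), each row has a different length, and LBound and UBound are queried per row, so the loop makes no assumption about Option Base:

```vb
Sub DemoRaggedArray()
    Dim rows As Variant
    Dim r As Long, c As Long

    ' Three rows holding 3, 1, and 2 elements respectively.
    rows = Array(Array(1, 2, 3), Array(4), Array(5, 6))

    For r = LBound(rows) To UBound(rows)
        ' Each contained array carries its own bounds.
        For c = LBound(rows(r)) To UBound(rows(r))
            Debug.Print rows(r)(c);
        Next c
        Debug.Print   ' end the current output line
    Next r
End Sub
```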

This ability of Variants to hold arrays of Variants permits some interesting new data structures in Visual Basic. One obvious example is a tree. In this piece of code, an entire directory structure is folded up and inserted in a single Variant:

Private Sub Form_Load()
    Dim v As Variant
    v = GetFiles("C:\") ' Places contents of C: into v
End Sub

Public Function GetFiles(ByVal vPath As Variant) As Variant
    ' NB cannot use recursion immediately as Dir
    ' does not support it, so get array of files first
    Dim vDir As Variant, vSubDir As Variant, i

    vDir = GetDir(vPath)

    ' Now loop through array, adding subdirectory information.
    If Not IsEmpty(vDir) Then
        For i = LBound(vDir) To UBound(vDir)
            ' If this is a dir, then...
            If (GetAttr(vDir(i)) And vbDirectory) = vbDirectory Then
                ' replace dir name with the dir contents.
                vDir(i) = GetFiles(vDir(i))
            End If
        Next i
    End If

    GetFiles = vDir
End Function

Private Function GetDir(ByVal vPath As Variant) As Variant
    ' This function returns a Variant that is an array
    ' of file and directory names (not including "." or "..")
    ' for a given directory path.
    Dim vRet As Variant, fname As Variant

    ' Add \ if necessary.
    If Right$(vPath, 1) <> "\" Then vPath = vPath & "\"

    ' Call the Dir function in a loop.
    ' Attribute flags are combined with Or, not the string operator &.
    fname = Dir(vPath, vbNormal Or vbDirectory)
    Do While fname <> ""
        If fname <> "." And fname <> ".." Then
            vRet = AddElement(vRet, vPath & fname)
        End If
        fname = Dir()
    Loop

    ' Return the array.
    GetDir = vRet
End Function

Public Function AddElement(ByVal vArray As Variant, _
    ByVal vElem As Variant) As Variant
    ' This function adds an element to a Variant array
    ' and returns an array with the element added to it.
    Dim vRet As Variant ' To be returned

    If IsEmpty(vArray) Then
        ' First time through, create an array of size 1.
        vRet = Array(vElem)
    Else
        vRet = vArray
        ' From then on, ReDim Preserve will work.
        ReDim Preserve vRet(UBound(vArray) + 1)
        vRet(UBound(vRet)) = vElem
    End If

    AddElement = vRet
End Function

Using + for String Concatenation
This misconceived experiment with operator overloading was considered bad form even back in the days of Visual Basic 2, when the string concatenation operator & was first introduced. Yet it's still supported in Visual Basic 6. In particular, since version 4 brought in extensive implicit type conversion between numerics and strings, this issue has become even more important. It's easy to find examples of how you can get tripped up. Can you honestly be confident of what the following will print?
Debug.Print "56" + 48
Debug.Print "56" + "48"
Debug.Print "56" - "48"
What should happen is that adding two strings has the same effect as subtracting, multiplying, or dividing two strings—that is, the addition operator should treat the strings as numeric if it can; otherwise, it should generate a type mismatch error. Unfortunately, this is not the case. The only argument for why the operator stays in there, causing bugs, is backward compatibility.
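For reference, this sketch annotates what each expression yields, with the & equivalent added for contrast (DemoPlusVersusAmpersand is a hypothetical name). Only the string-plus-string case concatenates:

```vb
Sub DemoPlusVersusAmpersand()
    Debug.Print "56" + 48      ' 104   (numeric addition)
    Debug.Print "56" + "48"    ' 5648  (string concatenation)
    Debug.Print "56" - "48"    ' 8     (numeric subtraction)
    Debug.Print "56" & 48      ' 5648  (& always concatenates)
End Sub
```

Using & whenever concatenation is intended removes the dependence on operand types.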

One point to note about the directory-tree code (GetFiles) is that it is an extremely efficient way of storing a tree structure. Because v is a ragged array of nested Variants, the structure contains less wasted space than an equivalent fixed-size multidimensional array. This contrasts with the accusation usually leveled at Variants, that they waste a lot of memory space.
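Once the tree is folded into a single Variant, it can be walked with the same recursive pattern that built it. The following is a sketch only; CountFiles is a hypothetical helper, and it assumes the shape produced by GetFiles: a string for each file, a nested array for each directory, and Empty for a directory with no entries.

```vb
Public Function CountFiles(ByVal vTree As Variant) As Long
    Dim i As Long
    Dim total As Long

    If IsEmpty(vTree) Then
        ' An empty directory contributes nothing.
        total = 0
    ElseIf IsArray(vTree) Then
        ' A directory: sum the counts of its entries.
        For i = LBound(vTree) To UBound(vTree)
            total = total + CountFiles(vTree(i))
        Next i
    Else
        ' A leaf: a single file path.
        total = 1
    End If

    CountFiles = total
End Function
```

For example, CountFiles(GetFiles("C:\")) would return the number of files found, not counting the directories themselves.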

User-Defined Types

The rehabilitation of UDTs was the biggest surprise for me in version 6 of Visual Basic. It had looked as if UDTs were being gradually squeezed out of the language. In particular, the new language features such as classes, properties, and methods did not seem to include UDTs. Before version 6, it was not possible to

have a UDT as a public property of a class or form.
pass a UDT as a parameter ByVal to a sub or function.
have a UDT as a parameter to a public method of a class or form.
have a UDT as the return type of a public method of a class or form.
place a UDT into a Variant.

But this has suddenly changed and now it is possible in version 6 to perform most of these to a greater or lesser extent. In this chapter, I am really only concentrating on the last point, that of placing a UDT into a Variant.

Restrictions are imposed on the sorts of UDTs that can be placed in a Variant. They must be declared within a public object module. This rules out their use within Standard EXE programs, as these do not have public object modules. This is a Microsoft ActiveX-only feature. Internally, the Data portion of the Variant structure is always a simple pointer to an area of memory where the UDT's content is sitting. The Type is always 36. This prompts the question of where and how the meta-data describing the fields of the UDT is kept. Remember that all other Variant subtypes are self-describing, so UDTs must be, too. The way it works is that from the Variant you can also obtain an IRecordInfo interface pointer. That interface has functions that return everything you want to know about the UDT.

We are able to improve substantially on the nesting ability demonstrated earlier with Variant arrays. While it is still impossible to have a member field of a UDT be that UDT itself—a hierarchy that is commonly needed—you can use a Variant and sidestep the circular reference trap. The following code shows a simple example of an employee structure (Emp) in an imaginary, not-so-progressive organization (apologies for the lack of originality). The boss and an array of workers are declared as Variant—these will all in fact be Emps themselves. GetEmp is just a function that generates Emps.

' In Class1
Public Type Emp
    Name As Variant
    Boss As Variant
    Workers() As Variant
End Type

' Anywhere Class1 is visible:
Sub main()
    Dim a As Emp
    a.Name = "Adam"
    a.Boss = GetEmp(1)
    a.Workers = Array(GetEmp(2), GetEmp(3))
End Sub

Private Function GetEmp(ByVal n) As Emp
    Dim x As Emp
    x.Name = "Fred" & n
    GetEmp = x
End Function

Note that this code uses the ability to return a UDT from a function. Also, the Array function always creates an array of Variants, so this code now works because we can convert the return value of GetEmp to a Variant.

Interface Inviolability
If you're like me, you may well have experienced the frustration of creating ActiveX components (in-process or out-of-process, it doesn't matter) and then realizing you need to make a tiny upgrade.
You don't want to change the interface definition because then your server is no longer compatible, the CLSID has changed, and you get into all the troublesome versioning complexity. Programs and components that use your component will all have problems or be unable to automatically use your upgraded version.
There isn't a lot you can do about this. Visual Basic imposes what is a very good discipline on us with its version compatibility checking, though it is sometimes a bitter pill to swallow.
In this respect, the flexibility gained by using Variants for properties and methods' parameters can be a great headache saver.
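As a hedged illustration of that flexibility, suppose a public method was first published taking a single number as a Variant. The method name TotalOf and its behavior are assumptions made up for this sketch. Because the parameter is a Variant, a later version can also accept an array without changing the interface definition:

```vb
' Hypothetical public method of an ActiveX class.
Public Function TotalOf(ByVal vItems As Variant) As Double
    Dim i As Long
    Dim total As Double

    If IsArray(vItems) Then
        ' Later upgrade: callers may now pass a whole array.
        For i = LBound(vItems) To UBound(vItems)
            total = total + CDbl(vItems(i))
        Next i
    ElseIf IsNumeric(vItems) Then
        ' Original behavior: a single numeric value.
        total = CDbl(vItems)
    End If

    TotalOf = total
End Function
```

Existing callers that pass a single value keep working, while new callers can pass Array(1, 2, 3). The cost is that invalid arguments are caught at run time, not at compile time.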

Accessing an invalid property like this will not be caught at compile time. This is analogous to the behavior of using Variants to hold objects described earlier. Similarly, the VarType of a.Workers is 8204 (vbArray + vbVariant). Visual Basic does not know what is in this array. If we rewrote the above code like this:

This time the VarType of a.Workers is 8228—vbArray + vbUserDefinedType. In other words, Visual Basic knows that Workers is an array of Emps, not an array of Variants. This has similarities to the late-bound and early-bound issue with objects and classes. (See "How Binding Affects ActiveX Component Performance" in the Visual Basic Component Tools Guide.) At compile time, however, the checking of valid methods and properties is still not possible because the underlying declaration is Variant.

The alternative way of implementing this code would be to create a class called Emp that had other Emps within it—I'm sure you've often done something similar to this. What I find interesting about the examples above is the similarity they have with this sort of class/object code—but no objects are being created here. We should find performance much improved over a class-based approach because object creation and deletion still take a relatively long time in Visual Basic. This approach differs slightly in that an assignment from one Variant containing a UDT to another Variant results in a deep copy of the UDT. So in the above examples, if you copy an Emp, you get a copy of all the fields and their contents. With objects, you are just copying the reference and there is still only one underlying object in existence. Using classes rather than UDTs for this sort of situation is still preferable given the many other advantages of classes, unless you are creating hundreds or thousands of a particular object. In this case, you might find the performance improvement of UDTs compelling.
