Using Variants Instead of Simple Data Types

In this section I'll discuss the pros and cons of using Variants in place of simple data types such as Integer, Long, Double, and String. This is an unorthodox practice—the standard approach is to avoid the use of Variants for a number of reasons. We'll look at the counterarguments first.

Performance Doesn't Matter

Every journal article on optimizing Visual Basic includes a mention of how Variants are slower than underlying first-class data types. This should come as no surprise. For example, when iterating through a sequence with a Variant of subtype Integer, the interpreted or compiled code must decode the structure of the Variant every time the code wants to use its integer value, instead of accessing an integer value directly. There is bound to be an overhead to doing this.

Plenty of authors have made a comparison using a Variant as a counter in a For loop, and yes, a Variant Integer takes about 50 percent more time than an Integer when used as a loop counter. This margin decreases as the data type gets more complex, so a Variant Double is about the same as a Double, whereas, surprisingly, a Variant Currency is quicker than a Currency. If you are compiling to native code, the proportions can be much greater in certain cases.
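If you want to check these proportions on your own machine, a rough timing harness along the following lines is enough. This is only a minimal sketch, not the benchmark any particular author used: the procedure name CompareLoopCounters and the constant PASSES are made up for illustration, and the figures it prints will vary with hardware and compile options.

```vb
' Rough timing sketch. CompareLoopCounters and PASSES are illustrative names.
Private Sub CompareLoopCounters()
    Const PASSES As Long = 1000
    Dim i As Integer, v As Variant, pass As Long
    Dim started As Single

    ' Time an Integer loop counter.
    started = Timer
    For pass = 1 To PASSES
        For i = 1 To 30000
        Next i
    Next pass
    Debug.Print "Integer counter: "; Timer - started; " seconds"

    ' Time a Variant loop counter over the same range.
    started = Timer
    For pass = 1 To PASSES
        For v = 1 To 30000
        Next v
    Next pass
    Debug.Print "Variant counter: "; Timer - started; " seconds"
End Sub
```

Debug.Print writes only to the Immediate window in the IDE, so to time a compiled executable you would need to swap it for MsgBox or write the results to a file.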

Is this significant? Almost always it is not. The amount of time that would be saved by not using Variants would be dwarfed by the amount of time spent in loading and unloading forms and controls, painting the screen, talking to databases, and so on. Of course, this depends on the details of your own application, but in most cases it is highly unlikely that converting local variables from Variants to Integers and Strings will speed up your code noticeably.

When optimizing, you benefit by looking at the bigger picture. If your program is too slow, you should reassess the whole architecture of your system, concentrating in particular on the database and network aspects. Then look at user interface and algorithms. If your program is still so locally computation-intensive and time-critical that you think significant time can be saved by using Integers rather than Variants, you should be considering writing the critical portion in C++ and placing this in a DLL.

Taking a historical perspective, machines continue to grow orders of magnitude faster, which allows software to take more liberties with performance. Nowadays, it is better to concentrate on writing your code so that it works, is robust, and is extensible. If you need to sacrifice efficiency in order to do this, so be it—your code will still run fast enough anyway.

Memory Doesn't Matter

A common argument against Variants is that they take up more memory than do other data types. In place of an Integer, which normally takes just 2 bytes of memory, a Variant of 16 bytes takes eight times as much space. The ratio is less, of course, for other underlying types, but the Variant always contains some wasted space.

The question is, as with the issue of performance in the previous section, how significant is this? Again I think not very. If your program has some extremely large arrays—say, tens of thousands of integers—an argument could be made to allow Integers to be used. But they are the exception. All your normal variables in any given program are going to make no perceptible difference whether they are Variants or not.

I'm not saying that using Variants improves performance or memory. It doesn't. What I'm saying is that the effect Variants have is not a big deal—at least, not a big enough deal to outweigh the reasons for using them.

Type Safety

A more complex argument is the belief that Variants are poor programming style—that they represent an unwelcome return to the sort of dumb macro languages that encouraged sloppy, buggy programming.

The argument maintains that restricting variables to a specific type allows various logic errors to be trapped at compile time, an obviously good thing. Variants, in theory, take away this ability.

To understand this issue fully we must first look at the way non-Variant variables behave. In the following pages I have split this behavior into four key parts of the language, and have contrasted how Variants behave compared to simple data types in each of these four cases:

  • Assignment

  • Function Calls

  • Operators and Expressions

  • Visual Basic Functions

Case 1: Assignment between incompatible variables

Consider the following code fragment (Example A):

Dim i As Integer, s As String
s = "Hello"
i = s

What happens? Well, it depends on which version of Visual Basic you run. In pre-OLE versions of Visual Basic you got a Type mismatch error at compile time. In Visual Basic 6, there are no errors at compile time, but you get the Type mismatch trappable error 13 at run time when the program encounters the i = s line of code.

NOTE


Visual Basic 4 was rewritten using the OLE architecture; thus, versions 3 and earlier are "pre-OLE."

The difference is that the error occurs at run time instead of being trapped when you compile. Instead of you finding the error, your users do. This is a bad thing.

The situation is further complicated because it is not the fact that s is a String and i is an Integer that causes the problem. It is the actual value of s that determines whether the assignment can take place.

This code succeeds, with i set to 1234 (Example B):

Dim i As Integer, s As String
s = "1234"
i = s

This code in Example C does not succeed (you might have thought that i would be set to 0, but this is not the case):

Dim i As Integer, s As String
s = ""
i = s

These examples demonstrate why you get the error only at run time. At compile time the compiler cannot know what the value of s will be, and it is the value of s that decides whether an error occurs.

The behavior is exactly the same with this piece of code (Example D):

Dim i As Integer, s As String
s = ""
i = CInt(s)

As in Example C, a type mismatch error will occur. In fact, Example C is exactly the same as Example D. In Example C, a hidden call to the CInt function takes place. The rules that determine whether CInt will succeed are the same as the rules that determine whether the plain i = s will succeed. This is known as implicit type conversion, although some call it "evil" type coercion.

The conversion functions CInt, CLng, and so on, are called implicitly whenever there is an assignment between variables of different data types. The actual functions are implemented within the system library file OLEAUT32.DLL. If you look at the exported functions in this DLL, you'll see a mass of conversion functions. For example, you'll see VarDecFromCy to convert a Currency to a Decimal, or VarBstrFromR8 to convert an 8-byte Real, such as a Double, to a string. The code in these OLE DLL functions determines the rules of the conversion within Visual Basic.

If the CInt function had worked the same way as Val does, the programming world would've been spared a few bugs (Example E).

Dim i As Integer, s As String
s = ""
i = Val(s)

This example succeeds because Val has been defined to return 0 when passed the empty string. The OLE conversion functions, being outside the mandate of Visual Basic itself, simply have different rules (Examples F and G).

Dim i As Integer, s As String
s = "1,234"
i = Val(s)

Dim i As Integer, s As String
s = "1,234"
i = CInt(s)

Examples F and G also yield different results. In Example F, i becomes 1, but in Example G, i becomes 1234. In this case the OLE conversion functions are more powerful in that they can cope with the thousands separator. Further, they also take account of the locale, or regional settings. Should your machine's regional settings be changed to German standard, Example G will yield 1 again, not 1234, because in German the comma is used as the decimal point rather than as a thousands separator. This can have both good and bad side effects.

These code fragments, on the other hand, succeed in all versions of Visual Basic (Examples H and I):

Dim i As Variant, s As Variant
s = "Hello"
i = s

Dim i As Variant, s As Variant
s = "1234"
i = s

In both the above cases, i is still a string, but why should that matter? By using Variants throughout our code, we eliminate the possibility of type mismatches during assignment. In this sense, using Variants can be even safer than using simple data types, because they reduce the number of run-time errors. Let's look now at another fundamental part of the syntax and again contrast how Variants behave compared to simple data types.

LOCALE EFFECTS

Suppose you were writing a little calculator program, where the user types a number into a text box and the program displays the square of this number as the contents of the text box change.

Private Sub Text1_Change()
    If IsNumeric(Text1.Text) Then
        Label1.Caption = Text1.Text * Text1.Text
    Else
        Label1.Caption = ""
    End If
End Sub

Note that the IsNumeric test verifies that it is safe to multiply the contents of the text box by itself without fear of type mismatch problems. Suppose "1,000" was typed into the text box: the label underneath would show 1,000,000 or 1, depending on the regional settings. On the one hand, it's good that you get this international behavior without performing any extra coding, but it could also be a problem if the user was not conforming to the regional setting in question. Further, to prevent this problem, if a number is to be written to a database or file, it should be written as a number without formatting, in case it is read at a later date on a machine where the settings are different.

You should also avoid writing any code yourself that parses numeric strings. For example, if you were trying to locate the decimal point in a number using string functions, you might have a problem:

InStr(53.6, ".") 

This line of code will return 3 on English/American settings, but 0 on German settings.

Note, finally, that Visual Basic itself does not adhere to this convention in its own source code. The number 53.6 means the same whatever the regional settings. We all take this for granted, of course.
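One way to follow the advice about unformatted storage is to lean on Str and Val, which, like Visual Basic source code, always treat the period as the decimal point, whereas CStr and CDbl follow the regional settings. The sketch below assumes two hypothetical helpers, NumberToStorage and NumberFromStorage; the names are mine, for illustration only.

```vb
' Hypothetical helpers for locale-independent storage of numbers.
Private Function NumberToStorage(ByVal n As Double) As String
    ' Str always writes a period as the decimal point. Trim$ drops the
    ' leading space that Str reserves for the sign of positive numbers.
    NumberToStorage = Trim$(Str(n))
End Function

Private Function NumberFromStorage(ByVal s As String) As Double
    ' Val recognizes only the period as the decimal point.
    NumberFromStorage = Val(s)
End Function
```

With these, NumberToStorage(53.6) gives "53.6" on both English and German settings, and NumberFromStorage("53.6") reads it back as 53.6 on either.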

Case 2: Function parameters and return types

Consider the following procedure:

Sub f(ByVal i As Integer, ByVal s As String)
End Sub

This procedure is called by the following code:

Dim i As Integer, s As String
s = "Hello"
i = 1234
Call f(s, i)

You'll notice I put the parameters in the wrong order.

With pre-OLE versions of Visual Basic you get a Parameter Type Mismatch error at compile time, but in Visual Basic 4, 5, and 6 the situation is the same as in the previous example—a run-time type mismatch, depending on the value in s, and whether the implicit CInt could work.

Instead, the procedure could be defined using Variants:

Sub f(ByVal i As Variant, ByVal s As Variant)
End Sub


Now no run-time errors or compile-time type mismatch errors occur. Of course, it's not necessarily so obvious by looking at the declaration what the parameters mean, but then that's what the parameter name is for.

Returning to our survey of how Variants behave compared to simple data types, we now look at expressions involving Variants.

Case 3: Operators

I have already suggested, for the purposes of assignment and function parameters and return values, that using Variants cuts down on problematic run-time errors. Does this also apply to the use of Visual Basic's own built-in functions and operators? The answer is, "It depends on the operator or function involved."

Arithmetic operators

All the arithmetic operators (such as +, -, *, \, /, and ^) evaluate their parameters at run time and throw the ubiquitous type mismatch error if the parameters do not apply. With arithmetic operators, there is neither an advantage nor a disadvantage to using Variants instead of simple data types; in either case, it's the value, not the data type, that determines whether the operation can take place. In Example A, we get type mismatch errors on both lines:

Dim s As String, v As Variant
s = "Fred"
v = "Fred"
s = s - s
v = v - v

But in Example B, these lines both succeed:

Dim s As String, v As Variant
s = "123"
v = "123"
s = s - s
v = v - v

A lot of implicit type conversion is going on here. The parameters of "-" are converted at run time to Doubles before being supplied to the subtraction operator itself. CDbl("Fred") does not work, so both lines in Example A fail. CDbl("123") does work, so the subtraction succeeds in both lines of Example B.

There is one slight difference between v and s after the assignments in Example B: s is a string of length 1 containing "0", while v is a Variant of subtype Double containing the value 0. The subtraction operator is defined as returning a Double, so 0 is returned in both assignments. This is fine for v - v, which becomes a Variant of subtype Double, with value 0. On the other hand, s is a string, so CStr is called to convert the Double value 0 to the string "0".

All other arithmetic operators behave in a similar way to subtraction, with the exception of +.

Option "Strict Type Checking"

Some other authors have argued for the inclusion of another option along the lines of "Option Explicit" that would enforce strict type checking. Assignment between variables of different types would not be allowed and such errors would be trapped at compile time. The conversion functions such as CInt and CLng would need to be used explicitly for type conversion to take place.

This would effectively return the Visual Basic language to its pre-OLE style, and Examples A, B, and C would all generate compile-time errors. Example D would still return a run-time type mismatch, however.

Examples E, F, and G would succeed with the same results as above. In other words, code using Variants would be unaffected by the feature.

Comparison operators

We normally take the comparison operators (such as <, >, and =) for granted and don't think too much about how they behave. With Variants, comparison operators can occasionally cause problems.

The comparison operators are similar to the addition operator in that they have behavior defined for both numeric and string operands, and unfortunately this behavior is different.

A string comparison will not necessarily give the same result as numeric comparison on the same operands, as the following examples show:

Dim a, b, a1, b1

a = "1,000"
b = "500"
a1 = CDbl(a)
b1 = CDbl(b)
' Now a1 > b1 but a < b

Notice also that all four variables—a, b, a1, and b1—are numeric in the sense that IsNumeric will return True for them.

As with string and number addition, the net result is that you must always be aware of the potential bugs here and ensure that the operands are converted to a numeric or string subtype before the operator is used.
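To make the conversion explicit, a comparison of the Variants a and b from the example above could be written along these lines. This is a sketch only; the procedure name CompareExample is made up, and the numeric result assumes English (US) regional settings, where the comma is a thousands separator.

```vb
' CompareExample is an illustrative name, not part of the original example.
Private Sub CompareExample()
    Dim a, b
    a = "1,000"
    b = "500"

    ' Both Variants hold strings, so < compares them as text.
    Debug.Print a < b               ' True: "1" sorts before "5"

    ' Converting both operands first forces a numeric comparison.
    Debug.Print CDbl(a) < CDbl(b)   ' False under English (US) settings
End Sub
```

Under German settings CDbl("1,000") is 1, so the second comparison would print True. That is another reason to convert once, at the point where the data enters your program, rather than at every comparison.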

Case 4: Visual Basic's own functions

Visual Basic's own functions work well with Variants, with a few exceptions. I won't cover this exhaustively but just pick out some special points.

The Visual Basic mathematical functions work fine with Variants because they each have a single behavior that applies only to numerics, so there is no confusion. In this way, these functions are similar to the arithmetic operators. Provided the Variant passes the IsNumeric test, the function will perform correctly, regardless of the underlying subtype.

a = Hex("1,234")
a = Log("1,234")
' etc. No problems here

Type mismatch errors will be raised should the parameter not be numeric.

The string functions do not raise type mismatch errors, because all simple data types can be converted to strings (for this reason there is no IsString function in Visual Basic). Thus, you can apply the string functions to Variants with numeric subtypes—Mid, InStr, and so forth all function as you would expect. However, exercise extreme caution because of the effect regional settings can have on the string version of a numeric. (This was covered earlier in the chapter.)

The function Len is an interesting exception, because once again it has different behavior depending on what the data type of the parameter is. For simple strings Len returns the length of the string. For simple nonstring data Len returns the number of bytes used to store the variable. However, less well known is the fact that for Variants, it returns the length of the Variant as if it were converted to a string, regardless of the Variant's actual subtype.

Dim v As Variant, i As Integer
i = 100
v = i
' The following are now true:
' Len(i) = 2
' Len(v) = 3

This provides one of the few ways of distinguishing a simple Integer variable from a Variant of subtype Integer at run time.

Flexibility

Some time ago, while I was working for a big software house, I heard this (presumably exaggerated) anecdote about how the company had charged a customer $1 million to upgrade the customer's software. The customer had grown in size, and account codes required five digits instead of four. That was all there was to it. Of course, the client was almost certainly being ripped off, but there are plenty of examples in which a little lack of foresight proves very costly to repair. The Year 2000 problem is a prime example.

It pays to allow yourself as much flexibility and room for expansion as can reasonably be foreseen. For example, if you need to pass the number of books as a parameter to a function, why allow only fewer than 32,768 books (an Integer holds at most 32,767)? You might also need to allow for half a book, so you wouldn't want to restrict the parameter to Integer or Long. You'd want to allow floating-point inputs. You could at this point declare the parameter to be of type Double, because this covers the range and precision of Integer and Long as well as handling floating points. But even this approach is still an unnecessary restriction. Not only might you still want the greater precision of Currency or Decimal, you might also want to pass in inputs such as "An unknown number of books."

The solution is to declare the number of books as a Variant. The only commitment that is made is about the meaning of the parameter—that it contains a number of books—and no restriction is placed on that number. As much flexibility as possible is maintained, and the cost of those account code upgrades will diminish.

Function ReadBooks(ByVal numBooks As Variant)
    ' Code in here to read books
End Function

Suppose we want to upgrade the function so that we can pass "An unknown number of books" as a valid input. The best way of doing this is to pass a Variant of subtype Null. Null is specifically set aside for the purpose of indicating "not known."
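A version of ReadBooks that accepts Null might look something like the sketch below. How the function should respond to an unknown count is not specified here, so passing Null straight back to the caller is purely my assumption for illustration.

```vb
Function ReadBooks(ByVal numBooks As Variant) As Variant
    If IsNull(numBooks) Then
        ' Unknown number of books. Returning Null is an assumed policy,
        ' chosen only to keep the sketch short.
        ReadBooks = Null
    Else
        ' Code in here to read numBooks books
        ReadBooks = numBooks
    End If
End Function
```

A caller signals the unknown case with ReadBooks(Null). Note that the test must be IsNull(numBooks) rather than numBooks = Null, because any comparison involving Null itself evaluates to Null, which an If statement treats as False.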

If the parameter had not been a Variant, you would have had some choices:

If the parameters are Variants, you avoid these unsatisfactory choices when modifying the functions. In the same way, parameters and return types of class methods, as well as properties, should all be declared as Variants instead of first-class data types.

HUNGARIAN NOTATION

The portion of Hungarian notation that refers to data type has little relevance when programming with Variants. Indeed, as variables of different data types can be freely assigned and interchanged, the notation has little relevance in Visual Basic at all.

I still use variable prefixes, but only to assist in the categorization of variables at a semantic level. So, for example, "nCount" would be a number that is used as a counter of something. The n in this instance stands for a general numeric, not an Integer.

Defensive Coding

I have extolled the virtues of using Variants and the flexibility that they give. To be more precise, they allow the interface to be flexible. By declaring the number of books to be a Variant, you make it unlikely that the data type of that parameter will need to be modified again.

This flexibility of Variants has a cost to it. What happens if we call the function with an input that doesn't make sense?

N = ReadBooks("Ugh") 

Inside the function, we are expecting a number—so what will it make of this? If we are performing some arithmetic operations on the number, we risk a type mismatch error when a Variant with these contents is passed. You must assert your preconditions for the function to work. If, as in this instance, the input must be numeric, be sure that this is the case:

Function ReadBooks(ByVal numBooks As Variant) As Variant
    If IsNumeric(numBooks) Then
        ' Do stuff, return no error
    Else
        ' Return error
    End If
End Function

In other words, you code defensively by using the set of Is functions to verify that a parameter is suitable for the operation you're going to perform on it.

You might think about using Debug.Assert in this instance, but it is no help at run time because all the calls to the Assert method are stripped out in compilation. So you would still need to implement your own checks anyway.

Of course, verifying that your input parameter is appropriate and satisfies the preconditions is not just about checking the type. It would also involve range checks, ensuring that we are not dividing by 0, and so on.
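Taken together, the type and range checks might be sketched as follows. The range rule (no negative counts) and the decision to report failures with Err.Raise using errors 13 and 5 are assumptions of mine for illustration; your own preconditions and error strategy may well differ.

```vb
Function ReadBooks(ByVal numBooks As Variant) As Variant
    ' Type check: reject anything that cannot be treated as a number.
    If Not IsNumeric(numBooks) Then
        Err.Raise 13, "ReadBooks", "numBooks must be numeric"
    End If

    ' Range check (assumed rule for illustration): no negative counts.
    If CDbl(numBooks) < 0 Then
        Err.Raise 5, "ReadBooks", "numBooks must not be negative"
    End If

    ' Preconditions hold; the real work would go here.
    ReadBooks = CDbl(numBooks)
End Function
```

IsNumeric returns False for Null, so a function that also accepts Null to mean "not known" would need to test IsNull before these checks.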

Is this feasible? In practice, coding defensively like this can become a major chore, and it is easy to slip up or not bother with it. It would be prudent if you were writing an important piece of component code, especially if the interface is public, to place defensive checks at your component entry points. But it is equally likely that a lot of the time you will not get around to this.

What are the consequences of not performing the defensive checks? While this naturally depends on what you are doing in the function, it is most likely that if there is an error it will be a type mismatch error. If the string "Ugh" in the previous example was used by an operator or built-in function that only worked with numerics, a type mismatch would occur. Interestingly, had the parameter to ReadBooks been declared as a Double instead of a Variant, this same error would be raised if the string "Ugh" was passed.

The only difference is that in the case of the Variant the error is raised within the function, not outside it. You have the choice of passing this error back to the calling client code or just swallowing the error and carrying on. The approach you take will depend on the particular circumstances and your preferences.

Using the Variant as a General Numeric Data Type

Don't get sidetracked by irrelevant machine-specific details. Almost all the time, we want to deal with numbers. For example, consider your thought process when you choose between declaring a variable to be of type Integer or type Long. You might consider what the likely values of the variable are going to be, worry a little bit about the effect on performance or memory usage, and maybe check to see how you declared a similar variable elsewhere so that you can be consistent. Save time—get into the habit of declaring all these variables as Variants.

NOTE

All variables in my code are either Variants or references to classes. Consequently, a lot of code starts to look like this:

Dim Top As Variant
Dim Left As Variant
Dim Width As Variant
Dim Height As Variant

After a time I started to take advantage of the fact that Variants are the default, so my code typically now looks like this:

Dim Top, Left, Width, Height

I see no problem with this, but your current Visual Basic coding standards will more than likely prohibit it. You might think about changing them.

VARIANT BUGS WHEN PASSING PARAMETERS BY REFERENCE

Variants do not always work well when passed by reference, and can give rise to some hard-to-spot bugs. The problem is illustrated in the following example:

Private Sub Form_Load()
    Dim i As Integer
    i = 3
    subByVal i
    subByRef i
End Sub

Private Sub subByVal(ByVal x As Variant)
    x = 6.4
    Debug.Print x     'shows 6.4
End Sub

Private Sub subByRef(x As Variant)
    x = 6.4
    Debug.Print x     'shows 6
End Sub

Notice that the only difference between the procedures subByVal and subByRef is that the parameter is passed ByVal in subByVal and ByRef in subByRef. When subByVal is called, the actual parameter i is of type Integer. In subByVal, a new parameter x is created as a Variant of subtype Integer, and is initialized with the value 3. In other words, the subtype of the Variant within the procedure is defined by the type of the variable that the procedure was actually called with. When x is then set to a value of 6.4, it converts to a Variant of subtype Double with value 6.4. Straightforward.

When subByRef is called, Visual Basic has a bit more of a problem. The Integer is passed by reference, so Visual Basic cannot allow noninteger values to be placed in it. Instead of converting the Integer to a Variant, Visual Basic leaves it as an Integer. Thus, even in the procedure subByRef itself, where x is declared as a Variant, x is really an Integer. The assignment of x = 6.4 will result in an implicit CInt call and x ends up with the value 6. Not so straightforward.

Procedures like subByVal are powerful because they can perform the same task, whatever the data type of the actual parameters. They can even perform different tasks depending on the type of the actual parameter, though this can get confusing.

Procedures like subByRef lead to bugs—avoid them by avoiding passing by reference.
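If a procedure needs to hand a result back, one alternative to a ByRef parameter is to take the input ByVal and return the result from a Function. The sketch below is only an illustration of that pattern; AddHalf is a hypothetical helper of my own naming.

```vb
' AddHalf is a hypothetical helper, named for illustration only.
Private Function AddHalf(ByVal x As Variant) As Variant
    ' x is a private Variant copy, so it can change subtype freely.
    x = x + 0.5
    AddHalf = x
End Function

Private Sub Form_Load()
    Dim i As Integer, result As Variant
    i = 3
    result = AddHalf(i)
    Debug.Print i           ' shows 3: the caller's Integer is untouched
    Debug.Print result      ' shows 3.5: a Variant of subtype Double
End Sub
```

Because the result travels back through the return value, nothing has to be squeezed into the caller's Integer, and no implicit CInt takes place.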



Advanced Microsoft Visual Basic (Mps)
ISBN: 1572318937
EAN: 2147483647
Year: 1997
Pages: 168
Authors: Mandelbrot Set International Ltd