Variables | Beginning Visual C#® 2005

As discussed in the introduction to this chapter, variables are concerned with the storage of data. Essentially, you can think of variables in computer memory as boxes sitting on a shelf. You can put things in boxes and take them out again, or you can just look inside a box to see if anything is there. The same goes for variables; you place data in them and can take it out or look at it, as required.

Although all data in a computer is effectively the same thing (a series of zeros and ones), variables come in different flavors, known as types. Again using the box analogy, you can imagine that your boxes come in different shapes and sizes, and some things will only fit in certain boxes. The reasoning behind this type system is that different types of data may require different methods of manipulation, and by restricting variables to individual types you can avoid getting mixed up. It wouldn't, for example, make much sense to treat the series of zeros and ones that make up a digital picture as an audio file.

To use variables, you have to declare them. This means that you have to assign them a name and a type. Once you have declared variables you can use them as storage units for the type of data that you declared them to hold.

The C# syntax for declaring variables simply involves specifying the type and variable name as follows:

 <type> <name>;

If you try to use a variable that hasn't been declared, your code won't compile. The compiler tells you exactly what the problem is, however, so this isn't really a disastrous error. Trying to use a variable before assigning it a value also causes an error, and again the compiler detects it.
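As a minimal sketch of the rules just described, the following program declares a variable, assigns it a value, and only then reads it. The class and method names (DeclarationSketch, DeclareAssignRead) are made up for illustration, and the commented-out lines show the two mistakes the compiler would reject.

```csharp
using System;

class DeclarationSketch
{
    // Declares a variable, assigns it a value, then reads it back.
    public static int DeclareAssignRead()
    {
        int count;      // declaration: <type> <name>;
        count = 3;      // assignment must happen before the first read
        return count;
    }

    static void Main()
    {
        Console.WriteLine(DeclareAssignRead());

        // Uncommenting either of the following stops compilation:
        // total = 5;                             // 'total' was never declared
        // int unset; Console.WriteLine(unset);   // 'unset' is read before it is assigned
    }
}
```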

So, what are the types that you can use?

In fact, the number of types that you can use is effectively unlimited, because you can define your own types to hold whatever convoluted data you like.

Having said this, there are certain types of data that just about everyone will want to use at some point or another, such as a variable that stores a number. Therefore, there are a number of simple, predefined types that you should be aware of.

Simple Types

Simple types are those types such as numbers and Boolean (true or false) values that make up the fundamental building blocks for your applications, and for other, more complex types. Most of the simple types available are numeric, which at first glance seems a bit strange — surely, you only need one type to store a number?

The plethora of numeric types comes down to the mechanics of storing numbers as a series of zeros and ones in the memory of a computer. For integer values, you simply take a number of bits (individual digits that can be zero or one) and represent your number in binary format. A variable storing N bits allows you to represent any integer between 0 and (2^N – 1). Any number above this value is too big to fit into the variable.

As an example, let's say you have a variable that can store 2 bits. The mapping between integers and the bits representing those integers is, therefore, as follows:

 0 = 00
 1 = 01
 2 = 10
 3 = 11

If you want to be able to store more numbers, you need more bits (3 bits will let you store the numbers from 0 to 7, for example).
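The arithmetic above can be checked with a short sketch. The helper names here (BitRange, MaxValueForBits, ToBits) are hypothetical, and the sketch assumes unsigned values and between 1 and 63 bits.

```csharp
using System;

class BitRange
{
    // Largest unsigned integer that fits in the given number of bits: 2^bits - 1.
    // This sketch assumes 1 to 63 bits; larger shifts would wrap around.
    public static ulong MaxValueForBits(int bits)
    {
        return (1UL << bits) - 1;
    }

    // Binary representation of value, left-padded with zeros to the given width.
    public static string ToBits(int value, int bits)
    {
        return Convert.ToString(value, 2).PadLeft(bits, '0');
    }

    static void Main()
    {
        // Reproduce the 2-bit mapping from the text.
        for (int value = 0; value <= 3; value++)
        {
            Console.WriteLine("{0} = {1}", value, ToBits(value, 2));
        }

        // Show how the range grows as bits are added.
        for (int bits = 1; bits <= 8; bits++)
        {
            Console.WriteLine("{0} bits: 0 to {1}", bits, MaxValueForBits(bits));
        }
    }
}
```

With 2 bits the largest value comes out as 3, with 3 bits as 7, and with 8 bits as 255, which matches the byte type you meet shortly.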

The inevitable conclusion of this argument is that you would need an infinite number of bits to be able to store every imaginable number, which isn't going to fit in your trusty PC. Even if there were a quantity of bits you could use for every number, it surely wouldn't be efficient to use all these bits for a variable that, for example, was only required to store the numbers between 0 and 10 (because storage would be wasted). Four bits would do the job fine here, allowing you to store many more values in this range in the same space of memory.

Instead, you have a number of different integer types that can be used to store various ranges of numbers, and take up differing amounts of memory (up to 64 bits). The list of these is shown in the following table.

Note

Note that each of these types makes use of one of the standard types defined in the .NET Framework. As discussed in Chapter 1, this use of standard types is what allows interoperability between languages. The names you use for these types in C# are aliases for the types defined in the Framework. The table lists the names of these types as they are referred to in the .NET Framework library.

Type	Alias For	Allowed Values
sbyte	System.SByte	Integer between –128 and 127.
byte	System.Byte	Integer between 0 and 255.
short	System.Int16	Integer between –32768 and 32767.
ushort	System.UInt16	Integer between 0 and 65535.
int	System.Int32	Integer between –2147483648 and 2147483647.
uint	System.UInt32	Integer between 0 and 4294967295.
long	System.Int64	Integer between –9223372036854775808 and 9223372036854775807.
ulong	System.UInt64	Integer between 0 and 18446744073709551615.

The u at the start of some type names is shorthand for unsigned, meaning that you can't store negative numbers in variables of those types, as can be seen in the Allowed Values column of the table.
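If you want to confirm these ranges for yourself, each integer type exposes MinValue and MaxValue constants. The following is a small sketch; the class name IntegerRanges and the helper FitsInByte are invented for this example.

```csharp
using System;

class IntegerRanges
{
    // True if value lies within the range that a byte can hold (0 to 255).
    public static bool FitsInByte(int value)
    {
        return value >= byte.MinValue && value <= byte.MaxValue;
    }

    static void Main()
    {
        Console.WriteLine("sbyte:  {0} to {1}", sbyte.MinValue, sbyte.MaxValue);
        Console.WriteLine("byte:   {0} to {1}", byte.MinValue, byte.MaxValue);
        Console.WriteLine("short:  {0} to {1}", short.MinValue, short.MaxValue);
        Console.WriteLine("ushort: {0} to {1}", ushort.MinValue, ushort.MaxValue);
        Console.WriteLine("int:    {0} to {1}", int.MinValue, int.MaxValue);
        Console.WriteLine("uint:   {0} to {1}", uint.MinValue, uint.MaxValue);
        Console.WriteLine("long:   {0} to {1}", long.MinValue, long.MaxValue);
        Console.WriteLine("ulong:  {0} to {1}", ulong.MinValue, ulong.MaxValue);

        Console.WriteLine(FitsInByte(255));   // True
        Console.WriteLine(FitsInByte(256));   // False
    }
}
```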

Of course, as well as integers you also need to store floating-point values, which are those that aren't whole numbers. There are three floating-point variable types that you can use: float, double, and decimal. The first two of these store floating points in the form ±m × 2^e, where the allowed values for m and e differ for each type. decimal uses the alternative form ±m × 10^e. These three types are shown in the following table, along with their allowed values of m and e, and these limits in real numeric terms:

Type	Alias For	Max m	Min e	Max e	Approx. Min Value	Approx. Max Value
float	System.Single	2²⁴	–149	104	1.5 × 10⁻⁴⁵	3.4 × 10³⁸
double	System.Double	2⁵³	–1075	970	5.0 × 10⁻³²⁴	1.7 × 10³⁰⁸
decimal	System.Decimal	2⁹⁶	–28	0	1.0 × 10⁻²⁸	7.9 × 10²⁸
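One practical consequence of the base-2 and base-10 forms can be seen in a short sketch. The names here (FloatingPointSketch and its two methods) are hypothetical. The values 0.1 and 0.2 have no exact base-2 representation, so the double comparison fails, whereas decimal stores them exactly.

```csharp
using System;

class FloatingPointSketch
{
    // double uses m x 2^e, so 0.1, 0.2, and 0.3 are all stored as approximations.
    public static bool DoubleSumIsExact()
    {
        double a = 0.1, b = 0.2;
        return a + b == 0.3;
    }

    // decimal uses m x 10^e, so the same values are stored exactly.
    public static bool DecimalSumIsExact()
    {
        decimal a = 0.1M, b = 0.2M;
        return a + b == 0.3M;
    }

    static void Main()
    {
        Console.WriteLine(DoubleSumIsExact());    // False
        Console.WriteLine(DecimalSumIsExact());   // True
    }
}
```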

In addition to numeric types, there are three other simple types available, shown in the next table.

Type	Alias For	Allowed Values
char	System.Char	Single Unicode character, stored as an integer between 0 and 65535
bool	System.Boolean	Boolean value, true or false
string	System.String	A sequence of characters

Note that there is no upper limit on the number of characters making up a string, because it can use varying amounts of memory.

The Boolean type bool is one of the most commonly used variable types in C#, and indeed similar types are equally prolific in code in other languages. Having a variable that can be either true or false has important ramifications when it comes to the flow of logic in an application. As a simple example, consider how many questions there are that can be answered with true or false (or yes and no). Performing comparisons between variable values or validating input are just two of the programmatic uses of Boolean variables that you will examine very soon.
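As a brief preview, here is a sketch of a bool holding the result of a comparison and of a simple input check. The comparison operators are covered later in the book; the names (BooleanSketch, IsValidAge) and the limit of 150 are assumptions made purely for illustration.

```csharp
using System;

class BooleanSketch
{
    // Validates input: true only if age falls in a plausible range.
    // The upper limit of 150 is an arbitrary choice for this sketch.
    public static bool IsValidAge(int age)
    {
        return age >= 0 && age <= 150;
    }

    static void Main()
    {
        int age = 25;
        bool isAdult = age >= 18;   // stores the answer to a yes/no question

        Console.WriteLine(isAdult);           // True
        Console.WriteLine(IsValidAge(age));   // True
        Console.WriteLine(IsValidAge(-3));    // False
    }
}
```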

Now that you've seen these types, let's have a quick example of declaring and using them. In the following Try It Out you use some simple code that declares two variables, assigns them values, and then outputs these values.

Try It Out – Using Simple Type Variables

Create a new console application called Ch03Ex01 in the directory C:\BegVCSharp\Chapter3.

Add the following code to Program.cs:

 static void Main(string[] args)
 {
    int myInteger;
    string myString;
    myInteger = 17;
    myString = "\"myInteger\" is";
    Console.WriteLine("{0} {1}.", myString, myInteger);
    Console.ReadKey();
 }

Execute the code. The result is shown in Figure 3-1.

Figure 3-1

How It Works

The code you have added does three things:

It declares two variables.
It assigns values to those two variables.
It outputs the values of the two variables to the console.

Variable declaration occurs in the following code:

 int myInteger;
 string myString;

The first line declares a variable of type int with a name of myInteger, and the second line declares a variable of type string called myString.

Note

Note that variable naming is restricted and you can't use just any sequence of characters. You look at this in the following section on naming variables.

The next two lines of code assign values:

 myInteger = 17;
 myString = "\"myInteger\" is";

Here, you assign two fixed values (known as literal values in code) to your variables using the = assignment operator (the Expressions section of this chapter covers operators in more detail). You assign the integer value 17 to myInteger, and the string "myInteger" is (including the quotes around myInteger) to myString. When you assign string literal values in this way, double quotation marks are required to enclose the string. Because of this, certain characters cause problems if they are included in the string itself, such as the double quotation mark. You must escape such characters by substituting a sequence of characters (an escape sequence) that represents the character you want to use. In this example, you use the sequence \" to escape a double quotation mark:

myString = "\"myInteger\" is";

If you didn't use these escape sequences and tried coding this as

myString = ""myInteger" is";

you would get a compiler error.

Note that assigning string literals is another situation in which you must be careful with line breaks: the C# compiler will reject string literals of this kind that span more than one line. If you want to add a line break, you can use the escape sequence for a new line in your string, which is \n. For example, the following assignment

 myString = "This string has a\nline break.";

would be displayed on two lines in the console view as follows:

 This string has a
 line break.

All escape sequences consist of the backslash symbol followed by one of a small set of characters (you look at the full set a little later). Because this symbol is used for this purpose, there is also an escape sequence for the backslash symbol itself, which is simply two consecutive backslashes, \\.
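The three escape sequences met so far can be tried out together in a small sketch. The class and method names are invented for this example; the string contents are the ones used in the text, plus a short path to show the double backslash.

```csharp
using System;

class EscapeSketch
{
    // \" places a double quotation mark inside the string.
    public static string Quoted()
    {
        return "\"myInteger\" is";
    }

    // \n places a new line inside the string.
    public static string TwoLines()
    {
        return "This string has a\nline break.";
    }

    // \\ places a single backslash inside the string.
    public static string WithBackslash()
    {
        return "C:\\Temp";
    }

    static void Main()
    {
        Console.WriteLine(Quoted());
        Console.WriteLine(TwoLines());
        Console.WriteLine(WithBackslash());

        // Each escape sequence counts as one character in the resulting string.
        Console.WriteLine(WithBackslash().Length);   // 7
    }
}
```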

Getting back to the code, there is one more new line that you haven't looked at:

Console.WriteLine("{0} {1}.", myString, myInteger);

This looks similar to the simple method of writing out text to the console that you saw in the first example, but now you are specifying your variables. Now, I don't want to get too far ahead here, so I'm not going to go into too much detail about this line of code at this point. Suffice it to say that it is the technique you will be using in the first part of this book to output text to the console window. Within the parentheses you have two things:

A string
A list of variables whose values you want to insert into the output string, separated by commas

The string you are outputting, "{0} {1}.", doesn't seem to contain much useful text. As you have seen, however, this is not what you actually see when you run the code. The reason for this is that the string is actually a template into which you insert the contents of your variables. Each set of curly brackets in the string is a placeholder that will contain the contents of one of the variables in the list. Each placeholder (or format string) is represented as an integer enclosed in curly brackets. The integers start at 0 and are incremented by 1, and the total number of placeholders should match the number of variables specified in the comma-separated list following the string. When the text is output to the console, each placeholder is replaced by the corresponding value for each variable. In the example you just saw, the {0} is replaced with the actual value of the first variable, myString, and {1} is replaced with the contents of myInteger.

This method of outputting text to the console is what you will use to display output from your code in the examples that follow.
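To see the placeholder substitution in isolation, you can build the same text with string.Format, which uses the same template syntax as Console.WriteLine. This is only a sketch; the class FormatSketch and its Build method are hypothetical names.

```csharp
using System;

class FormatSketch
{
    // {0} is replaced by the first value after the template, {1} by the second.
    public static string Build(string text, int number)
    {
        return string.Format("{0} {1}.", text, number);
    }

    static void Main()
    {
        string myString = "\"myInteger\" is";
        int myInteger = 17;

        Console.WriteLine(Build(myString, myInteger));   // "myInteger" is 17.
    }
}
```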

Finally, the code has the line seen in the earlier example for waiting for user input before terminating:

Console.ReadKey();

Again, I don't want to dissect this code at this point, but you will see it quite a lot in subsequent examples. All you need to know for now is that it pauses code execution until you press a key.

Variable Naming

As mentioned in the last section, you can't just choose any sequence of characters as a variable name. This isn't as worrying as it might sound at first, however, because you're still left with a very flexible naming system.

The basic variable naming rules are:

The first character of a variable name must be either a letter, an underscore character (_), or the at symbol (@).
Subsequent characters may be letters, underscore characters, or numbers.

In addition, there are certain keywords that have a specialized meaning to the C# compiler, such as the using and namespace keywords you saw earlier. If you should use one of these by mistake, the compiler will complain, and you'll soon know you've done something wrong, so don't worry about this too much.

For example, the following variable names are fine:

 myBigVar
 VAR1
 _test

These aren't, however:

 99BottlesOfBeer
 namespace
 It's-All-Over

And remember, C# is case-sensitive, so you have to be careful not to forget the exact case used when you declare your variables. References to them made later in the program with even so much as a single letter in the wrong case will prevent compilation.

A further consequence of this is that you can have multiple variables whose names differ only in case, for example the following are all separate names:

 myVariable
 MyVariable
 MYVARIABLE
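The following sketch shows the naming rules and case sensitivity in action. The wrapper names (NamingSketch, SumOfThree, SumOfValidNames) are made up for illustration, while the variable names are the ones listed above.

```csharp
using System;

class NamingSketch
{
    // Three distinct variables whose names differ only in case.
    public static int SumOfThree()
    {
        int myVariable = 1;
        int MyVariable = 2;
        int MYVARIABLE = 3;
        return myVariable + MyVariable + MYVARIABLE;
    }

    // The valid names from the text, each holding its own value.
    public static int SumOfValidNames()
    {
        int myBigVar = 10;
        int VAR1 = 20;
        int _test = 30;
        return myBigVar + VAR1 + _test;
    }

    static void Main()
    {
        Console.WriteLine(SumOfThree());        // 6
        Console.WriteLine(SumOfValidNames());   // 60

        // The following would not compile:
        // int 99BottlesOfBeer = 0;   // starts with a digit
        // int namespace = 0;         // namespace is a keyword
    }
}
```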

Naming Conventions

Variable names are something you will use a lot. Because of this, it's worth spending a bit of time discussing the sort of names that you should use. Before you get started, though, it is worth bearing in mind that this is controversial ground. Over the years, different systems have come and gone, and some developers will fight tooth and nail to justify their personal system.

Until recently the most popular system was what is known as Hungarian notation. This system involves placing a lowercase prefix on all variable names that identifies the type. For example, if a variable were of type int then you might place an i (or n) in front of it, for example iAge. Using this system, it is easy to see at a glance what types different variables are.

In more modern languages such as C#, however, this system is tricky to implement. For the types you've seen so far, you could probably come up with one- or two-letter prefixes signifying each type. However, because you can create your own types, and there are many hundreds of these more complex types in the basic .NET Framework, this quickly becomes unworkable. With several people working on a project, it would be easy for different people to come up with different and confusing prefixes, with potentially disastrous consequences.

Developers have now realized that it is far better to name variables appropriately for their purpose. If any doubt arises, it is easy enough to work out what the type of a variable is. In VS, you just have to hover the mouse pointer over a variable name and a pop-up box will tell you what the type is soon enough.

There are currently two naming conventions in use in the .NET Framework namespaces, known as PascalCase and camelCase. The casing used in the names is indicative of their usage. They both apply to names that are made up of multiple words and specify that each word in a name should be in lowercase except for its first letter, which should be uppercase. In camelCasing, there is an additional rule: that the first word should start with a lowercase letter.

The following are camelCase variable names:

 age
 firstName
 timeOfDeath

Then the following are PascalCase:

 Age
 LastName
 WinterOfDiscontent

For your simple variables, stick to camelCase, and use PascalCase for certain more advanced naming. This is the Microsoft recommendation.

Finally, it is worth noting that many past naming systems involved frequent use of the underscore character, usually as a separator between words in variable names, such as yet_another_variable. This usage is now discouraged (one thing I'm happy about — I always thought it looked ugly!).

Literal Values

In the previous Try It Out, you saw two examples of literal values: integer and string. The other variable types also have associated literal values, as shown in the following table. Many of these involve suffixes, where you add a sequence of characters to the end of the literal value to specify the type desired. Some literals have multiple types, determined at compile time by the compiler based on their context, as shown in the table.

Type(s)	Category	Suffix	Example/Allowed Values
bool	Boolean	None	true or false
int, uint, long, ulong	Integer	None	100
uint, ulong	Integer	u or U	100U
long, ulong	Integer	l or L	100L
ulong	Integer	ul, uL, Ul, UL, lu, lU, Lu, or LU	100UL
float	Real	f or F	1.5F
double	Real	None, d or D	1.5
decimal	Real	m or M	1.5M
char	Character	None	'a', or escape sequence
string	String	None	"a...a", may include escape sequences
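You can check how the compiler types a literal by asking the value for its runtime type. This is a sketch only; LiteralSketch and TypeNameOf are hypothetical names, and the literals come from the table.

```csharp
using System;

class LiteralSketch
{
    // Returns the full .NET type name of whatever value is passed in.
    public static string TypeNameOf(object value)
    {
        return value.GetType().FullName;
    }

    static void Main()
    {
        Console.WriteLine(TypeNameOf(100));          // System.Int32
        Console.WriteLine(TypeNameOf(100U));         // System.UInt32
        Console.WriteLine(TypeNameOf(100L));         // System.Int64
        Console.WriteLine(TypeNameOf(100UL));        // System.UInt64
        Console.WriteLine(TypeNameOf(1.5F));         // System.Single
        Console.WriteLine(TypeNameOf(1.5));          // System.Double
        Console.WriteLine(TypeNameOf(1.5M));         // System.Decimal
        Console.WriteLine(TypeNameOf('a'));          // System.Char
        Console.WriteLine(TypeNameOf("a"));          // System.String
        Console.WriteLine(TypeNameOf(true));         // System.Boolean

        // With no suffix, the first of int, uint, long, ulong that can hold the value is used.
        Console.WriteLine(TypeNameOf(4294967295));   // System.UInt32
    }
}
```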

String Literals

Earlier in this chapter, you saw a few of the escape sequences that you can use in string literals. It is worth presenting a full table of these for reference purposes.

Escape Sequence	Character Produced	Unicode Value of Character
\'	Single quotation mark	0x0027
\"	Double quotation mark	0x0022
\\	Backslash	0x005C
\0	Null	0x0000
\a	Alert (causes a beep)	0x0007
\b	Backspace	0x0008
\f	Form feed	0x000C
\n	New line	0x000A
\r	Carriage return	0x000D
\t	Horizontal tab	0x0009
\v	Vertical tab	0x000B

The Unicode value column of the preceding table shows the hexadecimal values of the characters as they are found in the Unicode character set.

As well as the preceding, you can specify any Unicode character using a Unicode escape sequence. These consist of the standard \ character followed by a u and a four-digit hexadecimal value (for example, the four digits after the x in the preceding table).

This means that the following strings are equivalent:

 "Karli\'s string."
 "Karli\u0027s string."

Obviously, you have more versatility using Unicode escape sequences.

You can also specify strings verbatim. This means that all characters contained between the two double quotation marks are included in the string, including end-of-line characters and characters that would otherwise need escaping. The only exception is the double quotation mark character itself, which must still be escaped to avoid ending the string; in a verbatim string you do this by writing two consecutive double quotation marks (""). To specify a string verbatim, you place the @ character before it:

 @"Verbatim string literal."

This string could just as easily be specified in the normal way, but the following requires this method:

 @"A short list:
 item 1
 item 2"

Verbatim strings are particularly useful for file paths, because these use plenty of backslash characters. Using normal strings, you'd have to use double backslashes all the way along the string, for example:

 "C:\\Temp\\MyDir\\MyFile.doc"

With verbatim string literals you can make this more readable. The following verbatim string is equivalent to the preceding one:

 @"C:\Temp\MyDir\MyFile.doc"
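The equivalences described in this section can be confirmed with a few comparisons. The class and method names below are invented for the sketch, and the quoted-greeting string is an extra made-up example showing the doubled quotation mark inside a verbatim string.

```csharp
using System;

class StringLiteralSketch
{
    // \' and the Unicode escape \u0027 both produce a single quotation mark.
    public static bool UnicodeEscapeMatches()
    {
        return "Karli\'s string." == "Karli\u0027s string.";
    }

    // A verbatim string treats backslashes as ordinary characters.
    public static bool VerbatimPathMatches()
    {
        return "C:\\Temp\\MyDir\\MyFile.doc" == @"C:\Temp\MyDir\MyFile.doc";
    }

    // Inside a verbatim string, "" stands for one double quotation mark.
    public static bool VerbatimQuoteMatches()
    {
        return "She said \"hello\"." == @"She said ""hello"".";
    }

    static void Main()
    {
        Console.WriteLine(UnicodeEscapeMatches());   // True
        Console.WriteLine(VerbatimPathMatches());    // True
        Console.WriteLine(VerbatimQuoteMatches());   // True
    }
}
```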

Note

Note that, as you will see later in the book, strings are reference types, unlike the other types you've seen in this chapter, which are value types. One consequence of this is that strings can also be assigned the value null, which means that the string variable doesn't reference a string.
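A very small sketch of this difference follows. The names (NullSketch, IsMissing) are hypothetical, and the comparison with null is simply one way of checking whether a string variable references anything.

```csharp
using System;

class NullSketch
{
    // True when the variable does not reference any string at all.
    public static bool IsMissing(string text)
    {
        return text == null;
    }

    static void Main()
    {
        string noString = null;   // references nothing
        string empty = "";        // references a string with no characters

        Console.WriteLine(IsMissing(noString));   // True
        Console.WriteLine(IsMissing(empty));      // False

        // A value type such as int cannot be assigned null in this way:
        // int number = null;   // would not compile
    }
}
```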

Variable Declaration and Assignment

As a quick recap, recall that you declare variables simply using their type and name, for example:

 int age;

You then assign values to variables using the = assignment operator:

 age = 25;

Note

Remember that variables must be initialized before you use them. The preceding assignment could be used as an initialization.

There are a couple of other things you can do here that you are likely to see in C# code. The first is declaring multiple variables of the same type at the same time, which you can do by separating their names with commas after the type, for example:

 int xSize, ySize;

Here, xSize and ySize are both declared as integer types.

The second technique you are likely to see is assigning values to variables at the same time as declaring them, which basically means combining two lines of code:

 int age = 25;

You can use both these techniques together:

 int xSize = 4, ySize = 5;

Here, both xSize and ySize are assigned different values.

Note that the following

 int xSize, ySize = 5;

will result in only ySize being initialized — xSize is just declared, and it still needs to be initialized before it's used.
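Putting these forms together, a sketch might look like the following. The class and method names are made up for illustration, and the arithmetic is there only so that each variable is actually used.

```csharp
using System;

class DeclarationForms
{
    // Declares and initializes two variables in a single statement.
    public static int Area()
    {
        int xSize = 4, ySize = 5;
        return xSize * ySize;
    }

    // Only ySize is initialized by the declaration; xSize is assigned afterward.
    public static int PartialInit()
    {
        int xSize, ySize = 5;
        xSize = 4;   // without this line, reading xSize would not compile
        return xSize + ySize;
    }

    static void Main()
    {
        int age = 25;   // declaration and assignment combined
        Console.WriteLine(age);             // 25
        Console.WriteLine(Area());          // 20
        Console.WriteLine(PartialInit());   // 9
    }
}
```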