Objects In Programming | Pro Visual C++ 2005 for C# Developers

Now that you understand what an object is conceptually (and in everyday life), you can learn more specifically how to apply these concepts to programming.

If you've programmed on Windows before, chances are you're already familiar with objects. For example, think about the various controls that you can place in Windows, including text boxes, list boxes, buttons, and so on. Microsoft has written these controls for you so that you don't need to know, for example, how a text box works internally. You just know that it does certain things. For example, you can set its Text property to display text onscreen, or you can set its Width property to have the text box resize itself.

In programming, you need to distinguish between a class and an object. A class is the generic definition of what an object is — a template. For example, a class could be "car radio" — the abstract idea of a car radio. The class specifies what properties an object must have to qualify as a car radio.

Class Members

So far, you've seen that there are two sides to an object: what it does, which is usually publicly known, and how it works, which is usually hidden. In programming, the "what it does" is normally represented in the first instance by methods, which are blocks of functionality that you can use. A method is just C# parlance for a function. The "how it works" is represented both by methods and by any data (variables) that the object stores. In Java and C++, this data is described as member variables, whereas in Visual Basic this data would be represented by any module-level variables in the class module. In C# the terminology is fields. In general, a class is defined by its fields and methods.

The term member is used by itself to denote anything that is part of a class, be it a field, a method, or any other item that can be defined within a class.

Defining a Class

The easiest way to understand how to code a class is by looking at an example. In the following sections, you develop a simple class called Authenticator. Assume you're in the process of writing a large application, which at some point requires users to log in and supply a password. Authenticator is the name of the class that will handle this aspect of the program. You won't worry about the rest of the application — you'll just concentrate on writing this class. However, you will also write a small piece of test harness code to verify that Authenticator works as intended.

Authenticator allows you to do two things: set a new password and check whether a password is valid. The C# code you need to define the class looks like this:

public class Authenticator
{
    private string password = "";

    public bool IsPasswordCorrect(string tryPassword)
    {
        return (tryPassword == password) ? true : false;
    }

    public bool ChangePassword(string oldPassword, string newPassword)
    {
        if (oldPassword == password)
        {
            password = newPassword;
            return true;
        }
        else
            return false;
    }
}

The keyword class in C# indicates that you are going to define a new class (type of object). The word immediately following class is the name you're going to use for this class. Then the actual definition of the object — consisting of variables (fields) and methods — follows in braces. In this example, the definition consists of one field, password, and two methods, IsPasswordCorrect() and ChangePassword().

Access Modifiers

The only field in Authenticator, password, stores the current password (initially an empty string when an Authenticator object is created) and is marked by the keyword private. This means that it is not visible outside the class, only to code that is part of the Authenticator class itself. Marking a field or method as private effectively ensures that that field or method will be part of the internal working of the class, as opposed to the external interface. The advantage of this is that if you decide to change the internal working (perhaps you later decide not to store password as a string but to use some other more specialized data type), you can just make the change without breaking the code outside the Authenticator class definition — nothing from outside of this class can access this field.

Any code that uses the Authenticator class can only access the methods that have been marked with the keyword public — in this case the IsPasswordCorrect() and ChangePassword() methods. Both of these methods have been implemented in such a way that nothing will be done (other than returning true or false) unless the calling code supplies the current correct password, as you'd expect for software that implements security. The implementations of these functions access the password field, but that's fine because this code forms part of the Authenticator class itself. Notice that these public functions simultaneously give you the interface to the external world (in other words, any other code that uses the Authenticator class) and define what the Authenticator class does, as viewed by the rest of the world.

private and public are not the only access modifiers available to define what code is allowed to know about the existence of a member. Later, this appendix discusses protected, which makes the member available to this class and certain related classes. C# also allows members to be declared as internal, which restricts access to code within the same assembly, and as protected internal, which allows access from code in the same assembly or from derived classes.
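As a rough illustration of these modifiers, the following sketch uses a hypothetical Account class (it is not part of the Authenticator example, and its member names are invented for this purpose) with members at several access levels. The commented-out line shows the kind of access that the compiler rejects.

```csharp
using System;

// Hypothetical class used only to illustrate access modifiers.
public class Account
{
    private decimal balance = 0m;           // visible only inside Account
    internal string branchCode = "N/A";     // visible anywhere in this assembly
    protected internal int auditLevel = 1;  // this assembly, or derived classes elsewhere

    public void Deposit(decimal amount)     // part of the external interface
    {
        balance += amount;                  // fine: this code is inside Account
    }

    public decimal GetBalance()
    {
        return balance;
    }
}

public class AccessDemo
{
    public static void Main()
    {
        Account account = new Account();
        account.Deposit(25m);

        Console.WriteLine(account.GetBalance());  // allowed: public method
        Console.WriteLine(account.branchCode);    // allowed: same assembly
        Console.WriteLine(account.auditLevel);    // allowed: same assembly
        // Console.WriteLine(account.balance);    // compile error: balance is private
    }
}
```

Code in a different assembly could still call Deposit() and GetBalance(), but it would not see branchCode at all.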

Instantiating and Using Objects

The easiest way to understand how to use a class in your code is to think of the class as a new type of variable. You're used to the predefined variable types — such as int, float, double, and so on. By defining the Authenticator class, you've effectively told the compiler that there's a new type of variable called an Authenticator. The class definition contains everything the compiler needs to know to be able to process this variable type. Therefore, just as the compiler knows that a double contains a floating-point number stored in a certain format (which enables you to add doubles, for example), you've told the compiler that a variable of type Authenticator contains a string and allows you to call the IsPasswordCorrect() and ChangePassword() methods.

Note

Although a class is described here as a new type of variable, the more common terminology is data type, or simply type.

Creating a user-defined variable (an object) is known as instantiation, because you create an instance of the object. An instance is simply any particular occurrence of the object. So, if your Authenticator object is another kind of variable, you should be able to use it just like any other variable — and you can, as demonstrated in the following example.

Create the MainEntryPoint class, as shown in the following code sample, and place it in the Wrox.ProCSharp.OOProg namespace along with the Authenticator class you created earlier:

using System;

namespace Wrox.ProCSharp.OOProg
{
    class MainEntryPoint
    {
        static void Main()
        {
            Authenticator myAccess = new Authenticator();
            bool done;

            done = myAccess.ChangePassword("", "MyNewPassword");
            if (done == true)
                Console.WriteLine("Password for myAccess changed");
            else
                Console.WriteLine("Failed to change password for myAccess");

            done = myAccess.ChangePassword("", "AnotherPassword");
            if (done == true)
                Console.WriteLine("Password for myAccess changed");
            else
                Console.WriteLine("Failed to change password for myAccess");

            if (myAccess.IsPasswordCorrect("WhatPassword"))
                Console.WriteLine("Verified myAccess\' password");
            else
                Console.WriteLine("Failed to verify myAccess\' password");
        }
    }

    public class Authenticator
    {
        // implementation as shown earlier
    }
}

The MainEntryPoint class is like Authenticator — it can have its own members (that is, its own fields, methods, and so on). However, you've chosen to use this class solely as a container for the program entry point, the Main() method. Doing it this way means that the Authenticator class can sit as a class in its own right that can be used in other programs (either by cutting and pasting the code or by compiling it separately into an assembly). MainEntryPoint only really exists as a class because of the syntactical requirement of C# that even the program's main entry point has to be defined within a class, rather than being defined as an independent function.

Because all the action is happening in the Main() method, let's take a closer look at it. The first line of interest is:

Authenticator myAccess = new Authenticator();

Here you are declaring and instantiating a new Authenticator object instance. Don't worry about = new Authenticator() for now — it's part of C# syntax and is there because in C#, classes are always accessed by reference. You could actually use the following line if you just wanted to declare a new Authenticator object called myAccess:

 Authenticator myAccess;

This declaration can hold a reference to an Authenticator object, without actually creating any object (in much the same way that the line Dim obj As Object in Visual Basic doesn't actually create any object). The new operator in C# is what actually instantiates an Authenticator object.
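The difference between declaring a reference and instantiating an object can be sketched as follows. The Authenticator class here is a cut-down stand-in for the one shown earlier, and DeclarationDemo is a hypothetical name. The variable is set to null explicitly because the C# compiler does not allow an unassigned local variable to be read.

```csharp
using System;

// Minimal stand-in for the Authenticator class shown earlier.
public class Authenticator
{
    private string password = "";

    public bool IsPasswordCorrect(string tryPassword)
    {
        return tryPassword == password;
    }
}

public class DeclarationDemo
{
    public static void Main()
    {
        Authenticator myAccess = null;        // a reference, but no object yet
        Console.WriteLine(myAccess == null);  // True

        myAccess = new Authenticator();       // new creates the object
        Console.WriteLine(myAccess == null);  // False

        // The initial password is the empty string, so this prints True.
        Console.WriteLine(myAccess.IsPasswordCorrect(""));
    }
}
```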

Calling class methods is done using the period symbol (.) appended to the name of the variable:

done = myAccess.ChangePassword("", "MyNewPassword");

Here you have called the ChangePassword() method on the myAccess instance and fed the return value into the done Boolean variable. You can retrieve class fields in a similar way. Note, however, that you cannot do this:

 string myAccessPassword = myAccess.password;

This code will actually cause a compilation error, because the password field was marked as private, so other code outside the Authenticator class cannot access it. If you changed the password field to be public, the previous line would compile and feed the value of password into the string variable.

You should note that if you are accessing member methods or fields from inside the same class, you can simply give the name of the member directly.

Now that you understand how to instantiate objects, call class methods, and retrieve public fields, the logic in the Main() method should be pretty clear. If you save this code as Authenticator.cs and then compile and run it, you will get this:

Authenticator
Password for myAccess changed
Failed to change password for myAccess
Failed to verify myAccess' password

There are a couple of points to note from the code. First, you'll notice that so far you're not doing anything new compared to what you would do when coding a Visual Basic class module, nor do you do anything that differs from the basic C# syntax covered in the first part of this book. The purpose here is to make sure that you are clear about the concepts behind classes.

Second, the previous example uses the Authenticator class directly in other code within the same source file. You'll often want to write classes that are used by other projects that you or others work on. To do this, you write the class in exactly the same way, but compile the code for the class into a library, as explained in Chapter 15, "Assemblies."

Using Static Members

You may have noticed in the example that the Main() method was declared as static. This section discusses what effect this static keyword has.

Creating static fields

It's important to understand that by default each instance of a class (each object) has its own set of all the fields you've defined in the class. For example, in the following snippet the instances karli and julian each contain their own string called password:

Authenticator julian = new Authenticator();
Authenticator karli = new Authenticator();
karli.ChangePassword("OldKarliPassword", "NewKarliPassword");
julian.ChangePassword("OldJulianPassword", "NewJulianPassword");

Changing the password in karli has no effect on the password in julian, and vice versa (unless the two references happen to be pointing to the same address in memory, which is discussed later). This situation resembles Figure A-1.

Figure A-1

In some cases this might not be the behavior you want. For example, suppose you want to define a minimum length for all passwords (and therefore for all of the password fields in all instances) in your Authenticator class. You do not want each password to have its own minimum length. Therefore, you really want the minimum length to be stored only once in memory, no matter how many instances of Authenticator you create.

To indicate that a field should only be stored once, no matter how many instances of the class you create, you place the keyword static in front of the field declaration in your code:

public class Authenticator
{
    private static uint minPasswordLength = 6;
    private string password = "";
    // remaining members as shown earlier
}

Storing a copy of minPasswordLength with each Authenticator instance not only wastes memory but also causes problems if you want to be able to change its value! By declaring the field as static, you ensure that it is stored only once, and this field is shared among all instances of the class. Note that in this code snippet you also set an initial value. You also use an unsigned integer (uint) as opposed to a standard integer (int) because you never want this variable to hold a negative value. Fields declared with the static keyword are referred to as static fields or static data, whereas fields that are not declared as static are referred to as instance fields or instance data. Another way of looking at this is that an instance field belongs to an object, whereas a static field belongs to the class.

Important

VB developers shouldn't confuse static fields with static variables in Visual Basic, which are variables whose values remain between invocations of a method.

If a field has been declared as static, it exists when your program is running from the moment that the particular module or assembly containing the definition of the class is loaded (that is, as soon as your code tries to use something from that assembly), so you can always guarantee a static variable is there when you want to refer to it. This is independent of whether you actually create any instances of that class. By contrast, instance fields exist only while an instance of the class exists, with one set of instance fields for each instance.

Note

In some ways, static fields perform the same functions as global variables performed for older languages such as C and FORTRAN.

You should note that the static keyword is independent of the accessibility of the member to which it applies. A class member can be public static or private static.
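To make the difference between static and instance fields concrete, here is a small sketch. The Visitor class and its members are hypothetical names chosen for this illustration. Each object keeps its own myVisits count, while totalVisits is stored once and updated by every instance.

```csharp
using System;

// Hypothetical class: one static field shared by all instances,
// one instance field stored separately in each object.
public class Visitor
{
    private static int totalVisits = 0;  // stored once, for the class
    private int myVisits = 0;            // stored once per object

    public void Visit()
    {
        totalVisits++;
        myVisits++;
    }

    public int GetMyVisits() { return myVisits; }
    public int GetTotalVisits() { return totalVisits; }
}

public class StaticFieldDemo
{
    public static void Main()
    {
        Visitor karli = new Visitor();
        Visitor julian = new Visitor();

        karli.Visit();
        karli.Visit();
        julian.Visit();

        Console.WriteLine(karli.GetMyVisits());      // 2
        Console.WriteLine(julian.GetMyVisits());     // 1
        Console.WriteLine(karli.GetTotalVisits());   // 3
        Console.WriteLine(julian.GetTotalVisits());  // 3 (same shared field)
    }
}
```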

Creating static methods

As explained in the Authenticator example, by default a method such as ChangePassword() is called against a particular instance, as indicated by the name of the variable in front of the period (.) operator. That method then implicitly has access to all the members (fields, methods, and so on) of that particular instance.

However, just as with fields, it is possible to declare methods as static, provided that they do not attempt to access any instance data or other instance methods. For example, you might want to provide a method to allow users to view the minimum password length:

public class Authenticator
{
    private static uint minPasswordLength = 6;

    public static uint GetMinPasswordLength()
    {
        return minPasswordLength;
    }
    ...

You can download the code for Authenticator with this modification from the Wrox Press Web site (www.wrox.com) as the Authenticator2 sample.

In the earlier Authenticator example, the Main() method of the MainEntryPoint class is declared as static. This allows it to be invoked as the entry point to the program, despite the fact that no instance of the MainEntryPoint class was ever created.

Accessing static members

The fact that static methods and fields are associated with a class rather than an object is reflected in how you access them. Instead of specifying the name of a variable before the . operator, you specify the name of the class, like this:

 Console.WriteLine(Authenticator.GetMinPasswordLength());

Also notice that in this code you access the Console.WriteLine() method by specifying the name of the class, Console. That is because WriteLine() is a static method too — you don't need to instantiate a Console object to use WriteLine().
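The same calling pattern applies to any class with static methods. In the sketch below, TimeHelper and its ToSeconds() method are hypothetical names invented for the illustration, while Math.Max() is a static method from the .NET class library.

```csharp
using System;

// Hypothetical helper class with only static members.
public class TimeHelper
{
    private static int secondsPerMinute = 60;

    public static int ToSeconds(int minutes)
    {
        return minutes * secondsPerMinute;
    }
}

public class StaticAccessDemo
{
    public static void Main()
    {
        // No TimeHelper object is created; the class name goes before the period.
        Console.WriteLine(TimeHelper.ToSeconds(3));  // 180

        // Math.Max() is static as well, so it is called on the Math class.
        Console.WriteLine(Math.Max(3, 7));           // 7
    }
}
```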

How instance and static methods are implemented in memory

As mentioned earlier, each object stores its own copy of a class's instance fields. This is, however, not the case for methods. If each object had its own copy of the code for a method, it would waste a lot of memory, because the code for the methods remains the same across all object instances. Therefore, instance methods, just like static methods, are stored only once, and associated with the class as a whole. Later on, you'll see other types of class members (constructors, properties, and so on) that contain code rather than data and follow the same logic.

Figure A-2 shows how instance and static methods are implemented in memory.

Figure A-2

If instance methods are stored only once, how is a method able to access the correct copy of each field? In other words, how can the compiler generate code that accesses Karli's password with the first method call and Julian's with the second in the following example?

karli.ChangePassword("OldKarliPassword", "NewKarliPassword");
julian.ChangePassword("OldJulianPassword", "NewJulianPassword");

The answer is that instance methods actually take an extra implicit parameter, which is a reference to where in memory the relevant class instance is stored. You can almost think of this code example as the user-friendly version that you have to write, because that's how C# syntax works. However, what's actually happening in your compiled code is this:

ChangePassword(karli, "OldKarliPassword", "NewKarliPassword");
ChangePassword(julian, "OldJulianPassword", "NewJulianPassword");

Declaring a method as static makes calling it slightly more efficient, because it will not be passed this extra parameter. On the other hand, if a method is declared as static but attempts to access any instance data, the compiler will raise an error for the obvious reason that you can't access instance data unless you have the address of a class instance!

This means that in the Authenticator example you could not declare ChangePassword() or IsPasswordCorrect() as static because both of these methods access the password field, which is not static.
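One way to picture the hidden parameter is to write it out by hand. The sketch below uses a hypothetical Counter class with an ordinary instance method and a static method that does the same job but receives the instance explicitly. It only illustrates the idea and is not a description of what the compiler literally emits.

```csharp
using System;

// Hypothetical class used to make the hidden instance parameter visible.
public class Counter
{
    private int count = 0;

    // Normal instance method: the instance arrives as a hidden parameter.
    public void Increment()
    {
        count++;
    }

    // Static equivalent: the instance has to be passed explicitly.
    public static void IncrementExplicit(Counter target)
    {
        target.count++;
    }

    public int GetCount() { return count; }
}

public class HiddenParameterDemo
{
    public static void Main()
    {
        Counter karli = new Counter();
        Counter julian = new Counter();

        karli.Increment();                  // instance call syntax
        Counter.IncrementExplicit(karli);   // same effect, instance passed by hand
        Counter.IncrementExplicit(julian);

        Console.WriteLine(karli.GetCount());   // 2
        Console.WriteLine(julian.GetCount());  // 1
    }
}
```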

Interestingly, although the hidden parameter that comes with instance methods is never declared explicitly, you do actually have access to it in your code. You can get to it using the keyword this. You can rewrite the code for the ChangePassword() method as follows:

public bool ChangePassword(string oldPassword, string newPassword)
{
    if (oldPassword == this.password)
    {
        this.password = newPassword;
        return true;
    }
    else
        return false;
}

Generally, you wouldn't write your code like this unless you have to distinguish between variable names. All you've achieved here is to make the method longer and slightly harder to read.
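A case where this is genuinely needed is a method whose parameter has the same name as a field. The following sketch assumes a hypothetical Label class with a text field; inside SetText(), the bare name text refers to the parameter, so the field has to be reached through this.

```csharp
using System;

// Hypothetical class in which a parameter name hides a field name.
public class Label
{
    private string text = "";

    public void SetText(string text)
    {
        // this.text is the field; plain text is the parameter.
        this.text = text;
    }

    public string GetText()
    {
        return text;  // no parameter in scope here, so this is the field
    }
}

public class ThisDemo
{
    public static void Main()
    {
        Label label = new Label();
        label.SetText("Hello");
        Console.WriteLine(label.GetText());  // Hello
    }
}
```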

A Note About Reference Types

Before leaving the discussion of classes, you should be aware of one potential gotcha that can occur in C# because C# regards all classes as reference types. This can have some unexpected effects when it comes to comparing instances of classes for equality and setting instances of classes equal to each other. For example, look at this code:

Authenticator User1;
Authenticator User2 = new Authenticator();
Authenticator User3 = new Authenticator();
User1 = User2;
User2.ChangePassword("", "Tardis");  // This sets password for User1 as well!
User3.ChangePassword("", "Tardis");
if (User2 == User3)
{
    // contents of this if block will NOT be executed even though the
    // objects referred to by User2 and User3 contain identical values,
    // because the variables refer to different objects
}
if (User2 == User1)
{
    // any code here will be executed because User1 and User2 refer
    // to the same memory
}

In this code, you declare three variables of type Authenticator: User1, User2, and User3. However, you instantiate only two objects of the Authenticator class, because you use only the new operator twice. Then you set the variable User1 equal to User2. Unlike with a value type, this does not copy any of the contents of User2. Rather, it means that User1 is set to refer to the same memory as User2 is referring to. What that means is that any changes you make to User2 also affect User1, because they are not separate objects; both variables refer to the same data. You can also say that they point to the same data, and the actual data referred to is sometimes described as the referent. So when you set the password of User2 to Tardis, you are implicitly also setting the password of User1 to Tardis. This is very different from how value types behave.

The situation gets even less intuitive when you try to compare User2 and User3 in the next statement:

if (User2 == User3)

You might expect that this condition returns true because User2 and User3 have both been set to the same password, so both instances contain identical data. The comparison operator for reference types, however, doesn't compare the contents of the data by default — it simply tests to see whether the two references are referring to the same address in memory. Because they are not, this test returns false, which means anything inside this if block will not be executed. By contrast, comparing User2 with User1 returns true because these variables do point to the same address in memory.

Note

Note that this behavior does not apply to strings, because the == operator has been overloaded for strings. Comparing two strings with == always compares string content. (Any other behavior for strings would be extremely confusing!)
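The contrast between reference comparison and string comparison can be sketched as follows. Token, SetName(), and HasSameName() are hypothetical names for this illustration; object.ReferenceEquals() is the standard .NET method that tests whether two references point to the same object.

```csharp
using System;

// Hypothetical class that does not overload ==.
public class Token
{
    private string name = "";

    public void SetName(string newName) { name = newName; }

    // Explicit content comparison, written by hand.
    public bool HasSameName(Token other) { return name == other.name; }
}

public class EqualityDemo
{
    public static void Main()
    {
        Token first = new Token();
        Token second = new Token();
        Token alias = first;          // second reference to the same object

        first.SetName("Tardis");
        second.SetName("Tardis");

        Console.WriteLine(first == second);            // False: two separate objects
        Console.WriteLine(first == alias);             // True: same object
        Console.WriteLine(first.HasSameName(second));  // True: contents compared

        string a = "Tardis";
        string b = new string(a.ToCharArray());  // separate string object, same characters
        Console.WriteLine(a == b);                        // True: == compares string content
        Console.WriteLine(object.ReferenceEquals(a, b));  // False: still two objects
    }
}
```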

Overloading Methods

To overload a method is to create several methods each with the same name, but each with a different signature. The reason you might want to use overloading is best explained with an example. Consider how in C# you write data to the command line, using the Console.WriteLine() method. For example, if you want to display the value of an integer, you can write this:

int x = 10;
Console.WriteLine(x);

To display a string you can write:

string message = "Hello";
Console.WriteLine(message);

Even though you are passing different data types to the same method, both of these examples compile. This is because there are actually lots of Console.WriteLine() methods, but each has a different signature — one of them takes int as a parameter, while another one takes string, and so on. There is even a two-parameter overload of the method that allows for formatted output and lets you write code like this:

string Message = "Hello";
Console.WriteLine("The message is {0}", Message);

Microsoft provides all of these Console.WriteLine() methods because there are many different data types whose values you might want to display.

Method overloading is very useful, but there are some pitfalls to be aware of when using it. Suppose you write:

short y = 10;
Console.WriteLine(y);

A quick look at the documentation reveals that no overload of WriteLine() takes short. So what will the compiler do? In principle, it could generate code that converts short to int and call the int version of Console.WriteLine(). Or it could convert short to long and call Console.WriteLine(long). It could even convert short to string.

In this situation, each language will have a set of rules for what conversion will be the one that is actually performed (for C#, the conversion to int is the preferred one). However, you can see the potential for confusion. For this reason, if you define method overloads, you need to take care to do so in a way that won't cause any unpredictable results.
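You can observe the rule for C# with a small experiment. The Describe() overloads below are hypothetical methods that simply report which version was selected; there is deliberately no overload that takes short.

```csharp
using System;

public class OverloadDemo
{
    // Hypothetical overloads that report which version the compiler chose.
    public static string Describe(int value)    { return "int"; }
    public static string Describe(long value)   { return "long"; }
    public static string Describe(string value) { return "string"; }

    public static void Main()
    {
        short y = 10;
        Console.WriteLine(Describe(y));     // int: short is converted to int, not long
        Console.WriteLine(Describe(10L));   // long
        Console.WriteLine(Describe("10"));  // string
    }
}
```

If you removed the int overload, the same call would compile against the long version instead, which is exactly the kind of silent change that makes careless overloading risky.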

When to Use Overloading

Generally, you should consider overloading a method when you need a number of methods that take different parameters, but conceptually do the same thing, as with Console.WriteLine() in the preceding section. The situations in which you will normally use overloading are explained in the following subsections.

Optional parameters

One common use of method overloads is to allow certain parameters to a method to be optional and to have default values if the client code does not specify their values explicitly. For example, consider this code:

public void DoSomething(int x, int y)
{
    // do whatever
}

public void DoSomething(int x)
{
    DoSomething(x, 10);
}

These overloads allow client code to call DoSomething(), supplying one required parameter and one optional parameter. If the optional parameter isn't supplied, you effectively assume the second int is 10. Compilers can often inline a small forwarding call like this one, as the .NET JIT compiler typically does, so the extra method call usually costs little or nothing at run time.

Note

Some languages, including Visual Basic and C++, allow default parameters to be specified explicitly in function declarations, with a syntax that looks like public void DoSomething(int X, int Y=10). C# versions before 4.0 do not allow this; in those versions you have to simulate default parameters by providing multiple overloads of methods as shown in the previous example.

Different input types

You have already seen this very common reason for defining overloads in the Console.WriteLine() example.

Different output types

This situation is far less common; however, occasionally you might have a method that calculates or obtains some quantity, and depending on the circumstances, you might want this to be returned in more than one way. For example, in an airline company, you might have a class that represents aircraft timetables, and you might want to define a method that tells you where an aircraft should be at a particular time. Depending on the situation, you might want the method to return either a string description of the position ("over Atlantic Ocean en route to London") or the latitude and longitude of the position.

You cannot distinguish overloads using the return type of a method. However, you can do so using out parameters. So you could define these:

void GetAircraftLocation(DateTime Time, out string Location)
{
    ...
}

void GetAircraftLocation(DateTime Time, out float Latitude, out float Longitude)
{
    ...
}

Note, however, that in most cases using overloads to obtain different out parameters does not lead to an architecturally neat design. In the preceding example, a better design would perhaps involve defining a Location struct that contains the location string as well as the latitude and longitude and returning this from the method call, hence avoiding the need for overloads.
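A minimal sketch of that alternative design might look like the following. The Location struct, its field names, the Timetable class, and the coordinate values are all assumptions made for illustration; a real implementation would calculate the position from the timetable data.

```csharp
using System;

// Hypothetical struct bundling both forms of the answer.
public struct Location
{
    public string Description;
    public float Latitude;
    public float Longitude;
}

public class Timetable
{
    // Placeholder logic: returns fixed, made-up values regardless of the time.
    public Location GetAircraftLocation(DateTime time)
    {
        Location result = new Location();
        result.Description = "over Atlantic Ocean en route to London";
        result.Latitude = 50.0f;
        result.Longitude = -30.0f;
        return result;
    }
}

public class LocationDemo
{
    public static void Main()
    {
        Timetable timetable = new Timetable();
        Location position = timetable.GetAircraftLocation(DateTime.Now);

        // The caller picks whichever representation it needs.
        Console.WriteLine(position.Description);
        Console.WriteLine(position.Latitude);
        Console.WriteLine(position.Longitude);
    }
}
```

With a single method returning one value, there is only one signature to document and no overloads to keep in step.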

Properties

As mentioned earlier, a class is defined by its fields and methods. However, classes can also contain other types of class members, including constructors, indexers, properties, delegates, and events. For the most part these other items are used only in more advanced situations and are not essential to understanding the principles of object-oriented design. The exception is properties, which are extremely common and can significantly simplify the external user interface exposed by classes, so they are discussed here. The other class members are introduced in Part I.

Note

Visual Basic programmers will find that C# properties correspond almost exactly to properties in VB class modules and are used in just the same way.

Properties exist for the situation in which you want to make a method call look like a field. You can see what a property is by looking at the minPasswordLength field in the Authenticator class. In this section, you extend the class so that users can read and modify this field without having to use a GetMinPasswordLength() method like the one introduced earlier.

A property is a method or pair of methods exposed to the outside world as if they are fields. To create a property for the minimum password length, modify the code for the Authenticator class as follows:

public static uint MinPasswordLength
{
    get
    {
        return minPasswordLength;
    }
    set
    {
        minPasswordLength = value;
    }
}

As you can see from this, you define a property in much the same way as a field, except that after the name of the property, you have a code block enclosed by curly braces. In the code block there may be two methods called get and set. These are known as the get accessor and the set accessor. Note that although no parameter is explicitly mentioned in the definition of the set accessor, there is an implicit parameter passed in, and referred to by the name value. Also, the get accessor always returns the same data type as the property was declared as (in this case uint).

Now, to retrieve the value of minPasswordLength, you use this syntax:

 uint i = Authenticator.MinPasswordLength;

What actually happens here is that the MinPasswordLength property's get accessor is called. In this case, the accessor is implemented to simply return the value of the minPasswordLength field.

To set the minPasswordLength field using the property, use the following code:

 Authenticator.MinPasswordLength = 7;

This code causes the MinPasswordLength property's set accessor to be called, which is implemented to assign the required value (7) to the minPasswordLength field. As mentioned earlier, the set accessor has an implicit parameter, called value.

Note that in this particular example, the property in question happens to be static. In general that is not necessary. Just as for methods, you will normally declare properties as static only if they refer to static data.
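For comparison, here is a sketch of an instance property, which is the more usual case. The UserProfile class and its UserName property are hypothetical and are not part of the Authenticator sample. Because the property wraps an instance field, it is accessed through a variable rather than through the class name.

```csharp
using System;

// Hypothetical class with an instance (non-static) property.
public class UserProfile
{
    private string userName = "";

    public string UserName
    {
        get { return userName; }
        set { userName = value; }
    }
}

public class InstancePropertyDemo
{
    public static void Main()
    {
        UserProfile karli = new UserProfile();
        UserProfile julian = new UserProfile();

        karli.UserName = "Karli";    // calls the set accessor on karli
        julian.UserName = "Julian";  // calls the set accessor on julian

        Console.WriteLine(karli.UserName);   // Karli
        Console.WriteLine(julian.UserName);  // Julian
    }
}
```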

Data Encapsulation

You may wonder what the point of all the preceding code is. Wouldn't it have been easier to make the minPasswordLength field public, so that you could access it directly and not have to bother about any properties? The answer is that fields represent the internal data of an object, so they are an integral part of the functionality of an object. Now, in OOP, you aim to make it so that users of objects only need to know what an object does, not how it does it. So making fields directly accessible to users defeats the ideology behind OOP.

Ideology is all very well, but there must be practical reasons behind it. One reason is this: If you make fields directly visible to external users, you lose control over what they do to the fields. They might modify the fields in such a way as to break the intended functionality of the object (give the fields inappropriate values, for example). However, if you use properties to control access to a field, this is not a problem; you can add functionality to the property that checks for inappropriate values. Related to this, you can also provide read-only properties by omitting the set accessor completely. The principle of hiding fields from client code in this way is known as data encapsulation.
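Both ideas can be sketched on a cut-down Authenticator. The range check in the set accessor (1 to 64) and the HasPassword property are assumptions made for this illustration and are not part of the original sample. Silently ignoring an invalid value is only one possible policy; a real class might report the error instead.

```csharp
using System;

public class Authenticator
{
    private static uint minPasswordLength = 6;
    private string password = "";

    public static uint MinPasswordLength
    {
        get { return minPasswordLength; }
        set
        {
            // Hypothetical rule: only accept lengths from 1 to 64.
            if (value >= 1 && value <= 64)
                minPasswordLength = value;
        }
    }

    // Read-only property: there is no set accessor.
    public bool HasPassword
    {
        get { return password.Length > 0; }
    }

    public bool ChangePassword(string oldPassword, string newPassword)
    {
        if (oldPassword == password)
        {
            password = newPassword;
            return true;
        }
        return false;
    }
}

public class EncapsulationDemo
{
    public static void Main()
    {
        Authenticator.MinPasswordLength = 8;
        Authenticator.MinPasswordLength = 0;  // rejected by the set accessor
        Console.WriteLine(Authenticator.MinPasswordLength);  // 8

        Authenticator myAccess = new Authenticator();
        Console.WriteLine(myAccess.HasPassword);  // False
        myAccess.ChangePassword("", "secret");
        Console.WriteLine(myAccess.HasPassword);  // True
        // myAccess.HasPassword = true;           // compile error: property is read-only
    }
}
```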

You should use properties only for operations that appear simply to set or retrieve a value; in all other instances use methods. That means that the set accessor takes exactly one parameter (the implicit value) and returns void, whereas the get accessor cannot take any parameters. For example, it would not be possible to rewrite the IsPasswordCorrect() method in the Authenticator class as a property, because that method takes a parameter and returns a bool, which fits neither accessor.