Computer programming is essentially about constructing the software that bridges the gap between the very primitive operations of the hardware presented in the previous section (which are pretty useless to the typical computer user) and programs like word processors and spreadsheet programs that feature specialized functionality targeted at various users' specific needs.
What happens when a C# program is running on the computer? Even though C# abstracts away from the underlying hardware, it is obviously still utilizing the functionality of the computer's processor and memory systems at some point.
Figure 1.5 depicts the main parts involved in executing a C# program. First, you need to write the C# program and then store it in the auxiliary memory, typically on a hard disk (see 1 in Figure 1.5). For example, your program could be called MyFirstProgram.cs. Notice the previously mentioned .cs extension that is required for all C# program files.
When prompted, the program or the parts of the program that are to be executed will be loaded into the main memory (2 in Figure 1.5). Through a complex set of operations, the main memory will now collaborate with the processor and execute the operations specified by the program. This is also referred to as running or executing the program.
The lifeblood of the program is the data that it processes. It can get the data from the user (3 in Figure 1.5) in the form of keys pressed on the keyboard and mouse movements and clicks, or from other external sources, such as networks and the Internet. Data is also often read in from files kept in the auxiliary memory of the computer (4 in Figure 1.5). For example, if the C# program is a word processor, this data could be the unfinished letter you stored yesterday (5 in Figure 1.5) while you were working in the word processor. When you need to finish it today, you request to have it loaded back into the word processing C# program. After you are finished with the letter, you store it back into the auxiliary memory. The computer provides output (6 in Figure 1.5) all along by showing the letters of the document on the screen. If you want to print out the letter, the output is then provided through a printer.
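To make this flow of data a little more concrete, here is a minimal sketch of the letter scenario in C#. It is only an illustration under stated assumptions: the class name LetterDemo, the methods LoadLetter and SaveLetter, and the file name letter.txt are all made up for this example, and a real word processor would obviously do far more.

```csharp
using System;
using System.IO;

// Hypothetical sketch: a "word processor" reduced to its bare data flow.
public static class LetterDemo
{
    // Read the stored letter from auxiliary memory (a file) into main memory.
    // Returns an empty string if no letter has been stored yet.
    public static string LoadLetter(string path)
    {
        return File.Exists(path) ? File.ReadAllText(path) : "";
    }

    // Store the letter back into auxiliary memory.
    public static void SaveLetter(string path, string text)
    {
        File.WriteAllText(path, text);
    }

    public static void Main()
    {
        string path = "letter.txt";                // assumed file name

        string letter = LoadLetter(path);          // data read from a file (4 in Figure 1.5)
        Console.WriteLine("The letter so far:");   // output on the screen (6 in Figure 1.5)
        Console.WriteLine(letter);

        Console.Write("Add a line: ");
        string line = Console.ReadLine();          // input from the user (3 in Figure 1.5)
        letter = letter + line + Environment.NewLine;

        SaveLetter(path, letter);                  // the letter goes back into the file
    }
}
```

Each time the program runs, it picks up the letter where you left it, which mirrors the store-yesterday, finish-today cycle described above.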
There is one important piece of software not mentioned in the previous discussion: the operating system. Examples of widely used operating systems are Windows 2000, Linux, DOS, MacOS (Apple), and UNIX. Even though they are very different, they all have important things in common. The operating system is involved in most operations performed by the computer. In fact, the operating system conducts the entire operation of the computer. It is the first program to be loaded into the computer when it is turned on, and it is so closely fused with the computer that it is easy to mistake it for the computer itself. Examples of particular operations performed by the operating system are the retrieval of a program prompted by a command (in the form of a mouse click on an icon or a command typed in with the keyboard) and the initiation of a program's execution. In this scenario, the retrieval and initiation are closely controlled by the operating system.
The first computers manufactured in the 1940s were monstrosities to program. They were programmed with machine language that contained sequences of bits directly controlling a processor's simple operations. Programming on the level of bits is an enormously time-consuming and tedious task. The following is a fraction of a program that calculates the greatest common divisor of two integers:
27bdffd0 afbf0014 0c1002a8 00000000 0c1002a8 afa2001c 8fa4001c and so on, and so on.
Yes, I agree: total gobbledygook. As you can imagine, programmers soon began to yearn for a less machine-like, more human-like language to increase their productivity. This resulted in the so-called assembly languages, where one encounters slightly more human commands such as move, getint, and putint. Even though assembly languages were slightly easier to read and understand, there was still a one-to-one correspondence between them and machine language. The programmer still had to think in terms of low-level processor operations. As computers evolved and the demand for more complex programs increased, programmers began to wish for a totally machine-independent language. In the mid-1950s, this resulted in what was probably the first high-level language, called FORTRAN. It was suddenly possible to articulate numerical computations by expressions resembling mathematical algebra such as the following:
(20 + 30) / 2
which calculates the average of 20 and 30.
The popularity of these abstractions away from the machine language was immense, and many other high-level languages soon followed. Today, the number of high-level languages is estimated at more than two thousand. One of the latest additions is C#.
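For contrast with the machine language fragment shown earlier, here is a sketch of how a greatest common divisor calculation might look in C#. I am assuming Euclid's algorithm and non-negative inputs; the machine language fragment does not reveal which method it uses, and the names GcdDemo and Gcd are chosen purely for illustration. The last line of Main also computes the average of 20 and 30 with an algebra-like expression.

```csharp
using System;

public static class GcdDemo
{
    // Greatest common divisor by Euclid's algorithm (an assumed choice).
    // Intended for non-negative integers.
    public static int Gcd(int a, int b)
    {
        while (b != 0)
        {
            int remainder = a % b;   // what is left after dividing a by b
            a = b;
            b = remainder;
        }
        return a;
    }

    public static void Main()
    {
        Console.WriteLine(Gcd(48, 18));      // prints 6
        Console.WriteLine((20 + 30) / 2);    // prints 25
    }
}
```

Even without knowing any C# yet, you can probably follow the gist of this code, which is exactly the point of a high-level language.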
No matter how far we abstract away from the basic computer operations with various high-level languages, we still need to end up with machine language code comprehensible (and hence executable) by the computer hardware. Systems programs called compilers have traditionally performed the job of turning a high-level language into low-level machine language.
Figure 1.6 is a simple illustration of how the source code of a typical high-level language is turned into an executable program. The text you write that contains high-level language instructions is called a source program or source code. In the case of C#, this source code is kept in a .cs file. To turn the source code into machine language, you need a compiler to compile it. The result of this compilation is called an executable program consisting of machine language.
The word program is often used to describe two different things: an entire executable program or a piece of source code. In this book, I will strive to use either source code, source program, or simply code to denote a piece of source code. An executable program will be referred to as an executable program, application, or just a program.
Before continuing with the discussion of C# and its execution environment, we need to introduce an important technology closely related to C# called .NET (pronounced dot net).
While C# refers to a language with a set of rules for how to write a source program, .NET is somewhat harder to identify. .NET is an umbrella term for many important services provided during the construction and execution of a C# program. In fact, C# is totally dependent on .NET and, consequently, many of the features and constructs of C# can be traced directly back to .NET. The following are a few important services provided by .NET:
.NET provides the means to execute the instructions contained in a C# program. This part of .NET is called an execution engine.
.NET helps promote a so-called type safe environment (more about this in Chapter 6, "Types Part I: The Simple Types") where only certain types of values meant for specific memory locations will be allowed. Metaphorically speaking, .NET ensures the matching of triangular shapes with triangular holes and round shapes with round holes.
.NET frees the programmer from the tedious and error-prone job of managing the computer memory used by the program.
.NET provides a secure environment, attempting to make life harder for computer hackers and their like.
The .NET Framework holds a library containing a vast amount of pre-written program parts that you can make use of in your programs. The .NET Framework library can save you vast amounts of time when constructing various parts of a program. You are, in effect, reusing program components already constructed and thoroughly tested by professional programmers at Microsoft.
Getting a program ready for use (also called deployment) has been simplified in .NET.
.NET provides cross language interoperability. Any language targeting .NET can seamlessly work together with other languages of this platform. At the time of this writing, about 15 languages are being ported to the .NET platform. Because the same .NET runtime is used to execute all languages targeting the .NET platform, the .NET runtime is often called the Common Language Runtime (CLR).
A program constructed with the intent of reuse is called a component or a software component.
These points only represent a superficial listing of a platform featuring many state-of-the-art technologies.
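Three of the services listed above (type safety, automatic memory management, and the reusable library) can be glimpsed in a few lines of C#. The following is merely a sketch: the class ServicesDemo and its method names are hypothetical, chosen for this illustration, while Array.Sort and InvalidCastException are genuine parts of the .NET library.

```csharp
using System;

public static class ServicesDemo
{
    // Type safety: only a value that really is an int fits the int-shaped hole.
    // (Assumes value is not null.)
    public static bool TryUnboxInt(object value, out int result)
    {
        try
        {
            result = (int)value;   // the runtime checks the actual type here
            return true;
        }
        catch (InvalidCastException)
        {
            result = 0;            // wrong shape: the runtime refused the conversion
            return false;
        }
    }

    // Automatic memory management: we allocate arrays but never free them.
    public static int AllocateAndForget(int count)
    {
        int total = 0;
        for (int i = 0; i < count; i++)
        {
            int[] scratch = new int[1000];   // unreachable after each pass
            total += scratch.Length;
        }
        return total;                        // no explicit "free" anywhere
    }

    // Library reuse: sorting is already written and tested for us.
    public static int[] SortedCopy(int[] values)
    {
        int[] copy = (int[])values.Clone();
        Array.Sort(copy);
        return copy;
    }

    public static void Main()
    {
        int number;
        Console.WriteLine(TryUnboxInt(42, out number));        // prints True
        Console.WriteLine(TryUnboxInt("round", out number));   // prints False
        Console.WriteLine(AllocateAndForget(10));              // prints 10000
        Console.WriteLine(SortedCopy(new int[] { 3, 1, 2 })[0]);   // prints 1
    }
}
```

Notice that AllocateAndForget never hands memory back explicitly; reclaiming the discarded arrays is left entirely to .NET.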
The traditional compilation process for converting the source code of high-level languages into executable programs as described in the previous section has several disadvantages. Two of these are discussed next.
Problem 1: We need a different high-level language compiler for every make of computer because different types of computers have different machine language configurations. Consequently, if you want to run your FORTRAN program on four computers of different makes, you will need four different FORTRAN compilers. Furthermore, whenever a computer manufacturer makes changes to its computer hardware or extends its line of computers for sale, costly and time-consuming compiler adjustments and additions will be necessary.
Problem 2: Most programmers have a preferred programming language, and many "multilingual" programmers have a preferred programming language for specific kinds of programming tasks. As you gradually get acquainted with the world of programming, you will no doubt encounter quite a few programmers with a nearly religious attachment to their favored languages. This is all very well.
However, a problem appears when programmers with different preferences have to collaborate to write the source code for one single project. Perhaps the best solution would be to allow each programmer to use his or her favorite language, but this is very difficult when following the compilation process illustrated in Figure 1.6. Different languages represent the same functionality in different ways at the machine level, due in part to different compiler configurations. This, in turn, makes collaboration between different languages impossible.
An attempt has been made to solve this problem by introducing so-called component systems (such as CORBA and COM) that provide standards for the interactions between different parts of a program. Programmer A can then write a component called X in, say, a language such as Visual Basic that can interact with Programmer B's component Y written in the C++ language.
The commercial success of these component systems has been enormous. However, apart from introducing several other problems, the components involved in the program cannot interact on the same detailed level as if all parts of the program were written in the same language.
C# and .NET provide interesting solutions to the two problems described. Let's have a look at the overall elements involved when compiling under .NET, as shown in Figure 1.7.
First, notice that two other languages (C++ and Visual Basic) are displayed in Figure 1.7. Whether we write in C# or in any other language targeting .NET, the overall process of compiling under .NET stays the same. After the source code is written, we still need to compile it, but, as you can see, the source code is compiled into another language called Microsoft Intermediate Language (MSIL) instead of being compiled into machine language right away. In fact, all language compilers targeting the .NET platform need to compile into this intermediate language. The idea of an intermediate language is not new. It was already applied in connection with another high-level language, called Pascal, by utilizing what the designers called a "UCSD Pascal p-machine." Similar ideas are also being used in languages like Smalltalk and Java.
As the name implies, MSIL is somewhere in between high-level languages and machine languages (also called native code), allowing it to be efficiently translated into machine language by a so-called JIT-Compiler (Just in Time-Compiler). The output from a JIT-Compiler is similar to that of a conventional compiler, but the JIT-Compiler uses a slightly different strategy. Instead of using memory and time to convert all of the MSIL, it only converts the parts that are actually needed during execution. So in effect, the code is compiled on the run, and the unused parts of the MSIL code (this can be a substantial amount) never waste the JIT-Compiler's time.
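The convert-only-what-is-needed strategy can be imitated with a toy model. Please read the following strictly as a sketch of the idea, not as a picture of how the real JIT-Compiler is built: the class LazyTranslator and its members are invented for this illustration, and the "translation" step simply hands back a ready-made piece of C# code instead of turning MSIL into native code.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: the real JIT-Compiler translates MSIL into native code.
public class LazyTranslator
{
    // Methods that have already been "translated", keyed by name.
    private readonly Dictionary<string, Func<int, int>> translated =
        new Dictionary<string, Func<int, int>>();

    // How many translations have been performed so far.
    public int TranslationCount { get; private set; }

    // Stand-in for the costly translation step.
    private Func<int, int> Translate(string name)
    {
        Func<int, int> code;
        switch (name)
        {
            case "Double": code = x => x * 2; break;
            case "Square": code = x => x * x; break;
            default: throw new ArgumentException("Unknown method: " + name);
        }
        TranslationCount++;
        return code;
    }

    // Translate a method the first time it is called; reuse the result afterwards.
    public int Call(string name, int argument)
    {
        Func<int, int> code;
        if (!translated.TryGetValue(name, out code))
        {
            code = Translate(name);
            translated[name] = code;
        }
        return code(argument);
    }
}

public static class JitDemo
{
    public static void Main()
    {
        LazyTranslator translator = new LazyTranslator();
        Console.WriteLine(translator.Call("Double", 21));   // prints 42
        Console.WriteLine(translator.Call("Double", 5));    // prints 10
        Console.WriteLine(translator.TranslationCount);     // prints 1
    }
}
```

"Double" is translated once and then reused, while "Square" is never called and so never costs any translation time at all.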
So what are some of the advantages of the .NET architecture? Until now, we only seem to have complicated matters. Well, by inserting the MSIL between the high-level language and the machine language, we have essentially decoupled those two languages. The MSIL remains unchanged, no matter what kind of computer system it is used on. The JIT-Compiler is the only part that needs to be changed or adjusted when changes are made to the computer system. Each computer will have its own JIT-Compiler that translates from MSIL to machine language suited for that particular computer configuration. As a result, we only ever need to compile our high-level languages into one unchanging language. This solves Problem 1 mentioned earlier.
Now, for a simplified explanation of how Problem 2 is solved. Notice the Metadata next to the MSIL in Figure 1.7. Metadata is emitted by the high-level language compiler and contains detailed descriptions of all the elements in your source code. Metadata is also defined as data about data. So detailed is this information that source code from other high-level languages will be able to utilize your source code as if it were written in exactly the same language. It is now possible for a Visual Basic programmer to work with a C++ programmer and a C# programmer on the same project, collaborating as if all were using just one single language.
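You can get a feel for how detailed the Metadata is by asking a compiled program to describe itself. The sketch below does this through the reflection part of the .NET library (System.Reflection). The classes Calculator and MetadataDemo and the method Describe are hypothetical names made up for this example; GetMethod, GetParameters, and ReturnType are actual library members.

```csharp
using System;
using System.Reflection;

// A hypothetical class whose description ends up in the Metadata.
public static class Calculator
{
    public static int Add(int a, int b)
    {
        return a + b;
    }
}

public static class MetadataDemo
{
    // Build a one-line description of a method purely from its Metadata.
    public static string Describe(MethodInfo method)
    {
        ParameterInfo[] parameters = method.GetParameters();
        string[] parts = new string[parameters.Length];
        for (int i = 0; i < parameters.Length; i++)
        {
            parts[i] = parameters[i].ParameterType.Name + " " + parameters[i].Name;
        }
        return method.ReturnType.Name + " " + method.Name
            + "(" + string.Join(", ", parts) + ")";
    }

    public static void Main()
    {
        MethodInfo add = typeof(Calculator).GetMethod("Add");
        Console.WriteLine(Describe(add));   // prints Int32 Add(Int32 a, Int32 b)
    }
}
```

The name of the method, its return type, and even the names of its parameters are all available, which is the kind of detail another .NET language needs to call Add as if it were its own. (Int32 is the .NET name for the type C# calls int.)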
Metadata and MSIL add many other exciting features to C# and .NET in addition to the two already mentioned. We will discuss those aspects when relevant as we advance through the book.
It is important to know about the existence of MSIL, but in your everyday programming, you will not be directly aware of MSIL's presence. If you use the compiler suggested in this book, you will typically give two commands: one to compile your program (into MSIL/Metadata) and one to run the program (thereby activating the JIT-Compiler). Running the program can simply be viewed as an execution of the final output from the compilers. The MSIL is not exposed during this process.
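As a rough illustration of this two-command routine, here is a minimal source program together with the commands you might type. The commands in the comments assume Microsoft's command-line C# compiler, csc, is installed and on your path; your own setup may call for different commands. The Greeting method and its text are made up for this sketch.

```csharp
// File: MyFirstProgram.cs
//
// Assumed commands, typed at the command prompt:
//   csc MyFirstProgram.cs    compiles the source code into MyFirstProgram.exe
//                            (MSIL and Metadata)
//   MyFirstProgram           runs the program; the JIT-Compiler translates
//                            the MSIL into machine language as needed
using System;

public static class MyFirstProgram
{
    // The text the program displays (chosen for illustration).
    public static string Greeting()
    {
        return "Hello from my first C# program!";
    }

    public static void Main()
    {
        Console.WriteLine(Greeting());
    }
}
```

At no point do you see the MSIL itself; you only see the source code going in and the greeting coming out on the screen.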