What Is a Software Bus?

In teaching my COM and COM+ classes at UCLA Extension, I found that thinking of COM as the software equivalent of the hardware bus that exists on all PCs really helped my students. Like the hardware bus on your computer, COM provides sockets that you can plug components into. However, when you build your software components for the COM software bus, you must, in addition to building the application-specific logic of your component, build a plug that fits into the socket that the software bus provides. In COM terms, this plug is the additional logic that you must write to enable your components to work with COM: implementing the IUnknown interface, creating a class factory, and creating and registering Globally Unique Identifiers (GUIDs) for your classes and interfaces. This is not unlike what a PC sound card manufacturer has to do when it builds a new sound card: it has to build the card in such a way that it plugs properly into the hardware bus of a PC. Building this plug is extra work, but, if you do the extra work, your software components are endowed with the attributes that the software bus provides. In the case of COM, these attributes were location transparency and programming language independence. Figure 2-1 shows the COM software bus.

Figure 2-1. The COM software bus.
Notice that the software bus encompasses only a LAN. Also notice that the end-user applications that use COM software components are PC based. Figure 2-2 shows the .NET software bus as it functions over the Internet.

Figure 2-2. The .NET Internet software bus.
There are several differences here. The software bus encompasses the entire Internet. The software components used on the .NET software bus are XML Web services. The network protocol that the .NET software bus uses is SOAP, which is platform independent because it requires no special software other than an HTTP server, client software that can send and receive HTTP messages, and an XML parser. Finally, and most important, the clients used with the .NET software bus are not just PC-based Windows applications. Smart phones and PDAs, as well as Windows-based PCs and even Unix workstations, will be able to use the Web services that are plugged into the .NET software bus. Figure 2-3 shows the .NET software bus as it functions within machines.

Figure 2-3. The Intramachine .NET software bus.
Notice in this case that the integration substrate is the CLR. By integrating with the CLR, a component automatically gains programming language independence and the other features provided by the CLR, including versioning, security, and tamper resistance.

The COM software bus enables a stronger notion of location transparency. The intramachine and intermachine integration substrates are the same; they are fundamentally built on RPC. DCOM is built on DCE RPC, so, when you are communicating across machines, you are using DCE RPC. When you are communicating between executable servers on the same machine, COM uses a lightweight variant of RPC called Local RPC (LRPC). In this way, DCOM is essentially just COM with a longer wire. Thus, with COM, it really didn't make any sense to distinguish between an intramachine and intermachine software bus. With .NET, the intramachine integration technology (the CLR) is fundamentally different from the intermachine substrate (SOAP and XML Web services). As you will soon see, location transparency is not as strong a concept as it was under COM, but the advantages that .NET has over COM more than make up for this shortcoming.

A number of key technical problems must be solved in order to implement a software bus. In order to be useful and robust, a software bus has to provide a solution for each of the following problems:
COM used GUIDs to provide unique names for software components. It also used reference counting through the AddRef and Release methods of the IUnknown interface to implement life cycle management. COM also used a binary interface standard based on C++ vtables to provide programming language independence, and it used DCOM and LRPC to provide location transparency. Versioning was implemented by immutable interfaces, that is, by requiring software components to continue to support old interfaces after they are released. .NET offers an entirely different set of solutions to these problems. In the rest of this chapter, I explore these problems and the .NET solutions one by one; working through them will help you understand .NET better.

Naming

A precise statement of the naming problem is simple. You need two things:
The naming problem is slightly different when you are using components on a single machine as compared to when you are using components across the Internet. Consider first the intramachine case and what would happen if two companies created spell-checking software components (with completely different interfaces) and both decided to assign their components the imaginative name "SpellChecker." People would write software that used these components, and everything would work okay as long as only one of the companies' spell checkers was installed on a particular host. What do you think would happen if both SpellCheckers were installed on the same machine? Client software that uses the SpellChecker and expects to get company A's component might instead receive company B's component, with unpredictable results. Another question is how I would even know whether company A's, company B's, or any other SpellChecker is installed on the machine at all.

Now consider the Internet case. Eventually, the Internet and corporate intranets will contain thousands of XML Web services performing all sorts of services for end users. How do I uniquely identify my XML Web service among the tens of thousands of other XML Web services out there so that someone halfway around the world can use it? Moreover, how can I promote my Web service so that someone halfway around the world can discover its capabilities and build an application that uses it, hopefully paying me a fee in the process?

If you are a COM programmer, you are familiar with the COM solution to the naming problem: GUIDs. A GUID is a 16-byte (128-bit) integer that is used to identify COM classes and interfaces. When GUIDs are used to identify classes, they are called class IDs or CLSIDs. When they are used to identify interfaces, they are called interface IDs or IIDs. The following CLSIDs for two well-known COM classes are shown as they appear in the registry.
GUIDs are a sort of social security number for COM objects. The COM runtime includes a function called CoCreateGuid that creates GUIDs; this function uses an algorithm designed so that, for all practical purposes, it never generates the same value twice. COM identifies objects through GUIDs whether you are using a COM object on the same host, on another machine on your LAN, or on a machine halfway around the world that you are connected to by the Internet. Sure, there are programmatic identifiers (ProgIDs), which are more user-friendly names for COM objects (such as "Word.Document"), but ProgIDs aren't guaranteed to be unique. They must also be turned into GUIDs before they can be used to identify a COM object anyway. GUIDs for COM objects are stored in the registry of their host machine. An application can discover which objects reside on a machine by interrogating the registry.

The naming solution is different with .NET. However, before I dive into those differences, I need to talk a little bit about the differences between the way that COM and .NET components are deployed. COM components are deployed in either an in-process (DLL) or out-of-process (executable) server. The server binary (exe or DLL) is the smallest enforceable unit of deployment, reuse, versioning, and (unless you are using COM+/MTS) security.

Note: If you are using COM+/MTS, you can configure security on a per-interface or per-class basis.

A COM component may depend on other files, such as resources or other DLLs, but it was up to you, the developer, to make sure that the correct versions of all of the files that make up your component were deployed along with the server binary. In most cases it made sense to just bundle all of the code and resources that the component depends on into its server binary. This was the only way that you could be sure that the correct versions of all the constituent pieces of your COM component would be deployed.
This complicated the use of COM components in a Web download deployment scenario because it tended to make COM binaries large.

.NET defined a notion of an assembly to solve all of these problems. An assembly is the smallest unit of deployment, versioning, and security in the .NET Framework. Logically, an assembly is a collection of related types (classes, interfaces, enumerations, and so forth) and their associated resources and data. Physically, an assembly consists of one or more modules as shown in Figure 2-4.

Figure 2-4. A .NET assembly.
Each module may contain either a managed code file with MSIL code and metadata, or it may contain resources like JPEG or bitmap files. One of the code modules must contain a manifest, which is metadata that lists all the types exposed by the assembly and which module (file) each type is located in. The CLR uses the manifest to create a single-file illusion for consumers of the assembly. The consumer only needs to reference a single file: the file that contains the manifest. The CLR takes care of loading the other modules as needed. The CLR also verifies that the version number and hash of the contents of each module match those that the assembly was built with. In other words, the CLR makes sure that the correct versions of all the files that make up the assembly are deployed. If they are not, it will throw an error.

.NET assemblies can be categorized into two groups: private and shared. Private assemblies are deployed within a subdirectory beneath the root directory of the application that uses them, and they are used only by that application. Shared assemblies are installed into the Global Assembly Cache (GAC) and may be used by several applications on a particular machine. Private assemblies aren't required to have a unique name because, at runtime, the CLR will search only subdirectories beneath the client application for the assembly. Moreover, there is no need to discover the private assemblies that are installed on a particular machine; indeed, these private assemblies are not supposed to be discovered. Shared assemblies have all of the naming problems that we associate with COM components. In other words, in the absence of a unique name, it would be possible for two companies to install two different assemblies with the same name into the GAC, with unpredictable results. Microsoft's recommended approach is that you use private assemblies. I discuss why when I talk about versioning.
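The manifest check described above can be pictured with a small sketch. The following Python code is only an illustration of the idea, not how the CLR is implemented: it assumes a hypothetical manifest that maps each module file name to a SHA-1 digest recorded at build time, and the function names build_manifest and verify_modules are invented for this example.

```python
import hashlib


def build_manifest(modules):
    """Record a digest for each module's bytes (a stand-in for an assembly manifest)."""
    return {name: hashlib.sha1(data).hexdigest() for name, data in modules.items()}


def verify_modules(manifest, deployed):
    """Return the names of modules that are missing or differ from the manifest."""
    bad = []
    for name, expected in manifest.items():
        data = deployed.get(name)
        if data is None or hashlib.sha1(data).hexdigest() != expected:
            bad.append(name)
    return bad


# Hypothetical two-module assembly: one code module and one resource module.
built = {"main.dll": b"MSIL code and metadata", "logo.jpg": b"image bytes"}
manifest = build_manifest(built)

# A deployment in which the resource file was swapped after the build.
tampered = {**built, "logo.jpg": b"different image bytes"}

print(verify_modules(manifest, built))     # []
print(verify_modules(manifest, tampered))  # ['logo.jpg']
```

A real loader would raise an error rather than return a list; returning the mismatched names simply makes the sketch easy to inspect.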
Before I discuss how you can assign unique names to a .NET assembly, I need to talk about .NET assembly naming in a more general sense. All .NET assemblies, whether they are private or shared, have a four-part name of the following form:

FileName, Version=version#, Culture=[CultureName|Neutral], PublicKeyToken=KeyToken

The first part of the name contains the file name of the assembly without the .dll or .exe extension. The "Version" part of the name contains a four-part version number, the details of which I discuss in Chapter 3. The "Culture" part of the name contains a Request for Comments (RFC) 1766 culture name if the assembly contains culture-specific resources or "Neutral" if the assembly contains code. The PublicKeyToken contains the hash of a cryptographic key or null. If the PublicKeyToken is null, the name is a weak name and is not guaranteed to be unique. If the PublicKeyToken field contains the hash of a cryptographic key, the name is guaranteed to be unique, and it is called a strong name. Don't worry if you don't understand what a cryptographic key is. I'll explain this shortly.

Note: RFC 1766 culture names contain both a language and a culture identifier. For instance, "en-US" identifies U.S. English whereas "en-GB" identifies United Kingdom English.

The following example shows a weak name for an assembly:

MyAssembly,Version=1.0.8.4,Culture=Neutral, PublicKeyToken=null

Notice that the PublicKeyToken is equal to null. The following example shows a strong name for an assembly:

MyAssembly,Version=1.1.0.0,Culture=Neutral, PublicKeyToken=8a707be49fd7d8f4

In the second case, the PublicKeyToken part of the name contains a hash of the public key associated with the key pair that was used to digitally sign the assembly. The process for creating a cryptographic key pair and signing an assembly is simple. The .NET Framework SDK contains an application called the Shared Name Utility (sn.exe), which is also known as the Strong Name tool.
To create a cryptographic key pair, run the Shared Name Utility with the following command line:

sn -k keyfilename

The Shared Name Utility will write the key pair to the file called keyfilename. You can then digitally sign an assembly using the key file that you just generated by adding the following attribute declaration to one of your source files:

[assembly: AssemblyKeyFile("keyfilename")]

Adding this code to your assembly will insert the public key from the file called keyfilename into the assembly, and it will use the private key to create a digital signature. A digital signature is just a hash of the contents of the assembly encrypted with the private key. You will want to keep the private key secure. (For details about why you need to do this, see my sidebar in this chapter on public and private key cryptography.) The Shared Name Utility has a number of options that make it simple for you to use one key pair while you are developing your code and a second key pair (the official pair) when you are ready to ship. The .NET Framework supports an attribute called AssemblyDelaySign that allows you to reserve space in an assembly for a digital signature without actually doing the signing. The following code shows how you would do this:

[assembly: AssemblyDelaySign(true)]

You would then sign the assembly later using the Shared Name Utility. Most companies will allow only a select few individuals within the company to have access to their private key.

Now, Microsoft could have stuck with GUIDs to provide uniqueness for its assembly names, but cryptographic keys have a huge advantage. They can also be used to verify the authenticity and integrity of an assembly, that is, to validate that the assembly that you are using was actually created by the person or organization that you think it was created by (authenticity) and that it has not been altered since it was created (integrity).
I discuss how this is done and also how to sign your assembly with a cryptographic key in Chapter 3. The CLR cannot perform these validations unless you give your assembly a strong name, so it makes sense to give all of your assemblies strong (unique) names even if they will not be shared.
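To make the four-part name concrete, here is a minimal Python sketch that splits an assembly name of the form shown earlier into its parts and reports whether it is a weak or a strong name. It is an illustration only; the functions parse_assembly_name and is_strong_name are hypothetical helpers written for this example and are not part of the .NET Framework.

```python
def parse_assembly_name(display_name):
    """Split 'FileName,Version=...,Culture=...,PublicKeyToken=...' into a dict."""
    parts = [p.strip() for p in display_name.split(",")]
    name = {"FileName": parts[0]}
    for part in parts[1:]:
        key, _, value = part.partition("=")
        name[key.strip()] = value.strip()
    return name


def is_strong_name(display_name):
    """A name is strong when its PublicKeyToken is present and not null."""
    token = parse_assembly_name(display_name).get("PublicKeyToken", "null")
    return token.lower() != "null"


# The two example names from the text.
weak = "MyAssembly,Version=1.0.8.4,Culture=Neutral, PublicKeyToken=null"
strong = "MyAssembly,Version=1.1.0.0,Culture=Neutral, PublicKeyToken=8a707be49fd7d8f4"

print(parse_assembly_name(strong)["Version"])  # 1.1.0.0
print(is_strong_name(weak))                    # False
print(is_strong_name(strong))                  # True
```

The sketch treats a missing PublicKeyToken the same as null, which is an assumption made here for simplicity.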
You probably know that, when you instantiate a COM class by passing the GUID (CLSID) for the class to the CoCreateInstance function (or a similar function), the COM runtime looks up the entry for the GUID in the registry and loads the executable or DLL associated with the registry entry. The .NET Framework performs this name-to-implementation mapping in a different way. In order to use a class in a .NET assembly from an application, you must reference the class's assembly at compile time. When you reference an assembly, your CLR-compliant compiler will store the full name of the assembly in the metadata of the referencing application. These entries tell the CLR which assembly to load when the application uses the class. At runtime, when your client application first uses a class from an assembly that is not already loaded, the CLR will use its search algorithm to locate an assembly that contains the class; it will search for the assembly using the full name, which is stored in the metadata of the referencing application. I discuss the full details of this algorithm in Chapter 3 when I drill down into the CLR, but, for now, a grossly simplified way of looking at the CLR's search algorithm is that it searches the GAC first. It then searches a configurable codebase and finally searches a configurable list of private directories. The GAC may contain multiple versions of the same assembly, so the CLR will use the version information that is cached in the referencing application to try to find the same version of the assembly that the application was built (and tested) with.
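The grossly simplified search order just described can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the CLR's actual algorithm: the GAC, the codebase, and the private directories are represented as plain dictionaries keyed by full assembly name, and locate_assembly is a hypothetical function invented for this example.

```python
def locate_assembly(full_name, gac, codebase, private_dirs):
    """Return (where, path) for the first match: GAC, then codebase, then private dirs."""
    if full_name in gac:
        return ("GAC", gac[full_name])
    if full_name in codebase:
        return ("codebase", codebase[full_name])
    for directory, contents in private_dirs:
        if full_name in contents:
            return (directory, contents[full_name])
    return None  # a real loader would report a load failure here


# Hypothetical stores. The GAC holds two versions of the same assembly.
gac = {
    "MyAssembly,Version=1.0.8.4": "gac/MyAssembly_1.0.8.4.dll",
    "MyAssembly,Version=1.1.0.0": "gac/MyAssembly_1.1.0.0.dll",
}
codebase = {"Helper,Version=2.0.0.0": "downloads/Helper.dll"}
private_dirs = [("bin", {"Local,Version=1.0.0.0": "bin/Local.dll"})]

# The full name recorded in the referencing application selects the version.
print(locate_assembly("MyAssembly,Version=1.1.0.0", gac, codebase, private_dirs))
print(locate_assembly("Local,Version=1.0.0.0", gac, codebase, private_dirs))
```

Because the lookup key includes the version, the referencing application gets the version it was built with even though the GAC contains more than one.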
So far I have only talked about naming in the intramachine case, that is, uniquely identifying a component from among the other components installed on a particular machine. Developers also need a way to uniquely identify XML Web services. Fortunately, a unique naming scheme exists on the Internet already: URLs. An XML Web service's URL is its unique name on the Internet. A URL has the following form:

Protocol://domain_name/pathname

The protocol identifies the communication mechanism that a client will use to exchange documents with the host server. Typically, this protocol will be HTTP when using XML Web services. Domain_name is the Domain Name System (DNS) name of the server where the resource resides. The domain name is the part of the URL that guarantees uniqueness. Domain names are unique because they are handed out on a first-come, first-served basis by a centralized agency. After you have registered a domain name, this name is unique across the Internet because no one else is allowed to have that name. The pathname portion of the URL identifies the relative path beneath the root directory of the server where the resource resides. Two example URLs are as follows:

http://strongbeach/financialwebservice/financial.asmx
http://localhost/DevDotNet/

The first is the URL for a Web service that I am running on my machine right now. Notice that the second URL does not contain a file name. This URL identifies the default resource in the specified directory. Usually, this will be a file called index.html or default.html or something similar.

To discover Web services on the Internet, you can use Universal Description, Discovery, and Integration (UDDI). UDDI defines a way to publish and discover information about XML Web services that other developers have created.
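The three parts of a URL described above can be pulled apart with a short Python sketch that uses the standard library's urllib.parse module. The helper describe_url and its dictionary keys are hypothetical names chosen to mirror the Protocol://domain_name/pathname form; the sketch says nothing about how a Web service client actually resolves a URL.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Break a URL into the protocol, domain name, and pathname parts."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "domain_name": parts.hostname,
        "pathname": parts.path,
        # A path ending in "/" names a directory, so the server picks a default resource.
        "has_file_name": not parts.path.endswith("/"),
    }


# The two example URLs from the text.
service = describe_url("http://strongbeach/financialwebservice/financial.asmx")
directory = describe_url("http://localhost/DevDotNet/")

print(service["domain_name"], service["pathname"])
print(directory["domain_name"], directory["has_file_name"])
```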