|
To demonstrate how to customize the CLR's default behavior for locating and loading assemblies, I need to introduce a new deployment model in which you change all three of the built-in assumptions discussed earlier. Specifically, you need a model in which the assemblies are stored in a format other than the standard PE file format on disk, are found in places other than the application's base directory or the global assembly cache, and have different versioning rules. To this end, I introduce a new deployment model called a cocoon.[1]
A cocoon is a new packaging format for applications. A single cocoon file contains all the assemblies needed to run an application (minus the assemblies shipped as part of the .NET Framework). Packaging all of an application's files into one single file simplifies deployment because the application is more self-contained: it can be installed, removed, or copied simply by moving a single file around. Cocoon files are very similar in concept to .cab files.

After I describe how cocoon files are structured and built, I'll walk through the steps needed to write a CLR host that runs applications contained in cocoons. In going through this exercise, I discuss the details of how to write an assembly loading manager. Toward the end of the chapter, I write a program that runs cocoons completely in managed code using the events and methods of System.AppDomain and System.Reflection.Assembly. This second program won't provide the same level of customization as the CLR host does, but it will serve to demonstrate the different capabilities offered by the two approaches.

My implementation of the cocoon deployment model is based on object linking and embedding (OLE) structured storage files. The structured storage technology lends itself particularly well to this scenario because it includes concepts that map directly to directories and files on disk (namely, storages and streams). If you're not familiar with structured storage, or your knowledge is a bit rusty, you can find plenty of documentation on the Microsoft Developer Network (MSDN) or in the platform SDK.

Cocoons are built by a utility I wrote called makecocoon.exe. This utility packages all executable files in the directory from which it's run into a structured storage file with a .cocoon extension. Each file in the directory ends up as a stream in the .cocoon file. The name of the stream is set to the name of the file on disk, minus its file extension.
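The stream-naming rule just described (file name minus its extension) can be sketched in portable C++. This is only an illustration under stated assumptions: StreamNameFromFileName is a hypothetical helper, not part of makecocoon.exe, and Listing 8-1 later in this section takes the simpler route of dropping the last four characters of the name.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of makecocoon.exe): derives a stream name
// from a file name by removing any directory portion and the last extension.
std::wstring StreamNameFromFileName(const std::wstring& fileName)
{
    // Skip past the last path separator, if there is one.
    const std::size_t slash = fileName.find_last_of(L"\\/");
    const std::size_t start = (slash == std::wstring::npos) ? 0 : slash + 1;

    // Find the last dot. A dot that belongs to a directory name, or that
    // is the first character of the file name, is not treated as an
    // extension separator.
    const std::size_t dot = fileName.find_last_of(L'.');
    if (dot == std::wstring::npos || dot <= start)
    {
        return fileName.substr(start);
    }
    return fileName.substr(start, dot - start);
}
```

With this sketch, StreamNameFromFileName(L"HRTracker.exe") yields HRTracker. Searching for the last dot avoids assuming that every extension is exactly three characters long.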
Makecocoon.exe takes as input the executable containing the entry point for the application and the name of the type within that executable that contains the main method. The name of the .cocoon file created by makecocoon.exe is based on the name of the main executable for the application. For example, consider an application called hrtracker that is contained in the directory shown in the following listing:

     Volume in drive C has no label.
     Volume Serial Number is 18EE-14D2

     Directory of C:\HRTracker

    10/03/2003  10:30 AM    <DIR>          .
    10/03/2003  10:30 AM    <DIR>          ..
    10/01/2003  04:37 PM            50,688 HRTracker.exe
    10/01/2003  04:36 PM           122,880 Benefits.dll
    09/24/2003  01:56 PM            16,384 Employee.dll
    10/01/2003  04:36 PM           453,348 Payroll.dll
                   5 File(s)        643,300 bytes
                   2 Dir(s)  45,701,091,328 bytes free

The following command would create a cocoon file named hrtracker.cocoon:

    MakeCocoon HRTracker.exe HRTracker.Application

Hrtracker.cocoon contains a stream for each assembly as shown in Figure 8-1.

Figure 8-1. A .cocoon file for the HRTracker application

In addition to the streams containing the main executable and its dependent assemblies, each .cocoon file also contains three additional streams. These extra streams contain data that is needed by the programs I write later in the chapter to run executables contained in .cocoon files. The first of these streams contains the name of the type in the main executable that contains the main method. This stream, called _entryPoint, is needed so you know which type to instantiate to run the application contained in the .cocoon file. The need for the other two streams isn't quite as obvious. To understand the role these streams play, I need to introduce the notion of CLR binding identities.
CLR Binding Identities

Recall from Chapter 7 that assemblies can be referenced by strings consisting of the assembly's friendly name and optional values for the public key used to sign the assembly, the assembly's version, and the culture for any resources that the assembly contains. Working with these string-based identities can be problematic for three reasons:
To help alleviate these problems, the CLR hosting interfaces provide a set of methods that make it easy to work with string-based identities. These methods are part of an interface called ICLRAssemblyIdentityManager. Given an assembly, ICLRAssemblyIdentityManager gives you back a fully qualified string identity in the correct format. These canonical textual strings are what I was referring to earlier as binding identities. The nice thing about binding identities is that you can (and should) treat them as opaque identifiers for assemblies, so you don't need to parse them or interpret their contents in any way. The methods on ICLRAssemblyIdentityManager and the methods provided by the interfaces you use as part of an assembly loading manager handle all that for you. In fact, if you ever find yourself looking inside a binding identity, it's likely you're doing something wrong.

The extra streams included in each .cocoon file are needed because the assembly loading manager I write later in the chapter requires the use of binding identities. I use them in two specific places, hence the need for two additional binding identity-related streams in the .cocoon files.

The first place I use a binding identity is to load the executable containing the entry point for the application in the cocoon. Remember that one of the goals of writing an assembly loading manager is to force the CLR to call the host to resolve references to assemblies contained in cocoon files. For this to work properly, all assemblies must be referenced by a full identity; the CLR will not call the assembly loading manager for partial references. I've added a stream to the .cocoon file that contains the binding identity (remember, these are fully qualified) for the assembly containing the application's entry point. This stream is called _exeBindingIdentity.

I also need to use binding identities when the CLR calls the assembly loading manager to resolve a reference to an assembly.
As you'll see, the CLR passes the assembly reference to resolve in the form of a binding identity. You must know which stream in the .cocoon file contains the assembly with the given binding identity. The easiest way to implement this would have been simply to name the streams in the cocoon based on the binding identity of the assembly the stream contains. Unfortunately, OLE structured storage places constraints on how streams can be named, and binding identities violate those constraints. To work around this limitation, I name the assembly streams based on the assembly's friendly name and create an index stream that maps binding identities to the names of the streams containing the assemblies. This mapping stream is called _index. The format of the _index stream is shown in Figure 8-2.

Figure 8-2. The _index stream in a .cocoon file

Now that you understand the need for the additional streams I had to create, take a look at the overall structure of a .cocoon file. To summarize, each .cocoon file has the following streams:

    One stream for each assembly (the main executable and its dependent assemblies), named after the file on disk minus its extension
    _entryPoint, which contains the name of the type in the main executable that contains the main method
    _exeBindingIdentity, which contains the binding identity of the assembly containing the application's entry point
    _index, which maps binding identities to the names of the streams containing the assemblies
The platform SDK contains a utility called DocFile Viewer that you can use to look at the contents of structured storage files. Figure 8-3 shows the contents of the HRTracker cocoon file using DocFile Viewer.

Figure 8-3. A .cocoon file as shown in DocFile Viewer
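The role of the _index stream described earlier, mapping a binding identity to the name of the stream that holds the assembly, can be sketched as a simple in-memory lookup. This is a minimal sketch under stated assumptions: StreamIndexSketch is a hypothetical class, not the CStreamIndex helper used by makecocoon.exe, and it does not reproduce the on-disk format shown in Figure 8-2. Binding identities are treated as opaque keys and are never parsed.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory index (not the book's CStreamIndex class).
// Keys are binding identities, treated as opaque strings; values are the
// names of the streams that contain the corresponding assemblies.
class StreamIndexSketch
{
public:
    // Records that the assembly with this binding identity lives in
    // the named stream.
    void Add(const std::wstring& bindingIdentity,
             const std::wstring& streamName)
    {
        entries_[bindingIdentity] = streamName;
    }

    // Looks up the stream for a binding identity. Returns false if the
    // identity is not in the index; streamName is left unchanged then.
    bool TryGetStreamName(const std::wstring& bindingIdentity,
                          std::wstring* streamName) const
    {
        assert(streamName != nullptr);
        const auto it = entries_.find(bindingIdentity);
        if (it == entries_.end())
        {
            return false;
        }
        *streamName = it->second;
        return true;
    }

    std::size_t Size() const { return entries_.size(); }

private:
    std::map<std::wstring, std::wstring> entries_;
};
```

An exact string comparison is enough here because the keys are canonical: the same assembly should always produce the same binding identity, so no parsing or normalization is attempted.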
Obtaining Binding Identities

Now that you've seen the role that binding identities will play in the cocoon scenario, take a look at the steps involved in obtaining these identities. As I mentioned, the ICLRAssemblyIdentityManager interface includes methods that return binding identities for a given assembly. In addition to returning binding identities, ICLRAssemblyIdentityManager also has methods that help determine the list of an assembly's references, the list of files the CLR will look for when attempting to resolve a reference to an assembly, and so on. The complete list of methods on ICLRAssemblyIdentityManager is shown in Table 8-1.
As shown in the table, ICLRAssemblyIdentityManager enables you to supply the assembly for which you'd like a binding identity by either providing a pathname to the file containing that assembly's manifest or by supplying a pointer to an IStream that contains the assembly's contents. Given these methods, two steps are involved in obtaining a binding identity for an assembly:
Step 1: Obtaining a Pointer to ICLRAssemblyIdentityManager

Unfortunately, obtaining a pointer to an ICLRAssemblyIdentityManager is more involved than obtaining pointers to the rest of the hosting interfaces implemented by the CLR. You may recall from Chapter 2 that a host typically uses the ICLRControl interface to request pointers to the hosting interfaces implemented by the CLR. ICLRAssemblyIdentityManager doesn't follow this pattern. Instead, you must call a function named GetCLRIdentityManager to get a pointer of type ICLRAssemblyIdentityManager. Here's the definition of GetCLRIdentityManager from mscoree.idl:

    STDAPI GetCLRIdentityManager(REFIID riid, IUnknown **ppManager);

To make matters more complicated, GetCLRIdentityManager is implemented in the main CLR runtime DLL, mscorwks.dll, not in the startup shim (mscoree.dll) like the other functions we've used, such as CorBindToRuntimeEx. Even though GetCLRIdentityManager is implemented in mscorwks.dll, you must still go through mscoree.dll to access it. Recall from Chapter 3 that all accesses to the CLR from unmanaged code must go through mscoree.dll to make sure the proper CLR runtime DLLs are loaded when multiple versions are installed on the machine. The end result of this is that you must access GetCLRIdentityManager dynamically through a function pointer obtained from the GetRealProcAddress function exported from mscoree.dll. GetRealProcAddress redirects the request for a particular function to the proper version of mscorwks.dll. The following sample code uses GetRealProcAddress to get a pointer to the GetCLRIdentityManager function and calls through that function pointer to get an interface of type ICLRAssemblyIdentityManager:

    // Declare a type for our pointer to GetCLRIdentityManager.
    typedef HRESULT (__stdcall *CLRIdentityManagerProc)(REFIID, IUnknown **);

    // Declare variables to hold both the function pointer and the
    // interface of type ICLRAssemblyIdentityManager.
    CLRIdentityManagerProc pIdentityManagerProc = NULL;
    ICLRAssemblyIdentityManager *pIdentityManager = NULL;

    // Use GetRealProcAddress to get a pointer to GetCLRIdentityManager.
    HRESULT hr = GetRealProcAddress("GetCLRIdentityManager",
                                    (void **)&pIdentityManagerProc);

    // Call GetCLRIdentityManager to get a pointer to
    // ICLRAssemblyIdentityManager. Only call through the function pointer
    // if the lookup succeeded.
    if (SUCCEEDED(hr))
    {
        hr = (pIdentityManagerProc)(IID_ICLRAssemblyIdentityManager,
                                    (IUnknown **)&pIdentityManager);
    }

Step 2: Calling GetBindingIdentityFromFile (or Stream)

Now that you've got a pointer of type ICLRAssemblyIdentityManager, you can call either GetBindingIdentityFromFile or GetBindingIdentityFromStream to obtain a binding identity for an assembly. Mscoree.idl defines these two methods as follows:

    interface ICLRAssemblyIdentityManager : IUnknown
    {
        HRESULT GetBindingIdentityFromFile(
            [in] LPCWSTR pwzFilePath,
            [in] DWORD dwFlags,
            [out, size_is(*pcchBufferSize)] LPWSTR pwzBuffer,
            [in, out] DWORD *pcchBufferSize
        );

        HRESULT GetBindingIdentityFromStream(
            [in] IStream *pStream,
            [in] DWORD dwFlags,
            [out, size_is(*pcchBufferSize)] LPWSTR pwzBuffer,
            [in, out] DWORD *pcchBufferSize
        );

        // other methods omitted
    }

Makecocoon.exe deals with files, so it uses GetBindingIdentityFromFile exclusively. As discussed, the _index stream requires a binding identity for every file in the cocoon. So, GetBindingIdentityFromFile is called by makecocoon as it iterates through the files in the directory in preparation to add them to a cocoon.

GetBindingIdentityFromFile takes as input a buffer in which it will store the binding identity for the assembly you request. However, binding identities vary in size based on certain factors, including the assembly's friendly name, whether it has a strong name, and so on. Given this, there's no way to know how much buffer space to allocate beforehand. As a result, the GetBindingIdentityFromFile method is designed to be called twice in succession.
On the first call to GetBindingIdentityFromFile, you pass NULL for the buffer in which the binding identity is to be stored and 0 for the pcchBufferSize parameter. The CLR determines how much buffer space is required for the binding identity you are asking for and returns the required size in pcchBufferSize. Next, you allocate a buffer of the requested size and call GetBindingIdentityFromFile again, passing it the allocated buffer. After this second call returns, pwzBuffer contains the binding identity. The following code shows how you call GetBindingIdentityFromFile twice to obtain a binding identity for a given assembly:

    // Call once to get the required buffer size. pszFileName
    // contains the path to the manifest of the assembly for which you'd like
    // a binding identity.
    DWORD cbBuffer = 0;
    HRESULT hr = m_pIdentityManager->GetBindingIdentityFromFile(
        pszFileName, 0, NULL, &cbBuffer);

    // Allocate a buffer of size cbBuffer. This example uses UNICODE strings,
    // hence the multiplication by sizeof(wchar_t).
    wchar_t *pBindingIdentity = (wchar_t *)malloc(cbBuffer*sizeof(wchar_t));

    // Call again to actually get the binding identity.
    hr = m_pIdentityManager->GetBindingIdentityFromFile(
        pszFileName, 0, pBindingIdentity, &cbBuffer);

    // pBindingIdentity now contains the binding identity.
    // ...

    // Remember to free the string containing the binding identity.
    free(pBindingIdentity);

The Makecocoon.exe Program

Now that you've looked at all the pieces required to build makecocoon.exe, take a closer look at how the program works. Makecocoon.exe begins by creating a structured storage file based on the name of the executable file passed in. It then enumerates the contents of the directory looking for files with a .dll extension. For each .dll file, makecocoon.exe maps a view of the file's contents into memory using the Win32 memory-mapped file APIs.
Given the view of the file in memory, makecocoon.exe creates a new stream in the structured storage file and writes the contents of the mapped memory to that stream. As each stream is created, I build up a data structure that contains the name of the stream and the binding identity of the assembly contained in that stream. This data structure is eventually written to the _index stream I described earlier.

The source code for makecocoon.exe's primary source file is given in Listing 8-1. The program includes a few other files that contain helper classes for obtaining an ICLRAssemblyIdentityManager and for maintaining the index data structure. The complete source code can be found at this book's companion Web site.

Listing 8-1. Makecocoon.cpp

    //
    // MakeCocoon.cpp
    //
    // Takes a directory of files and makes a "cocoon." MakeCocoon.exe takes
    // as input the main executable to wrap in the cocoon. It streams that
    // executable, plus all DLLs in the same directory into an OLE structured
    // storage file.

    #include "stdafx.h"
    #include "CStreamIndex.h"
    #include "CCLRIdentityManager.h"

    // Given an assembly file on disk, this function creates a stream under
    // pRootStorage and writes the bytes of the assembly to that stream. It also
    // creates an entry in the index that maps the name of the new stream to the
    // binding identity of the file it contains.
    HRESULT CreateStreamForAssembly(IStorage *pRootStorage,
                                    CStreamIndex *pStreamIndex,
                                    LPWSTR pAssemblyFileName)
    {
        // Make sure you can open the file.
        HANDLE hFile = CreateFile(pAssemblyFileName, GENERIC_READ, 0, NULL,
                                  OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
        if (hFile == INVALID_HANDLE_VALUE)
        {
            // The file name is a wide string, so use wprintf.
            wprintf(L"Error opening file: %s\n", pAssemblyFileName);
            return E_FAIL;
        }

        wprintf(L"Creating Stream for Assembly in file: %s\n",
                pAssemblyFileName);

        // Get the file size so you know how many bytes to write to the OLE
        // structured storage file.
        DWORD dwSize = GetFileSize(hFile, NULL);

        // Map the file into memory.
        HANDLE hFileMapping = CreateFileMapping(hFile, NULL, PAGE_READONLY,
                                                0, dwSize, NULL);
        PVOID pFile = MapViewOfFile(hFileMapping, FILE_MAP_READ, 0, 0, 0);

        // Pull the file extension off the name so you're left with just the
        // simple assembly name. This assumes a four-character extension
        // such as ".exe" or ".dll".
        wchar_t wszSimpleAsmName[MAX_PATH];
        ZeroMemory(wszSimpleAsmName, sizeof(wszSimpleAsmName));
        wcsncpy(wszSimpleAsmName, pAssemblyFileName,
                wcslen(pAssemblyFileName)-4);

        // Create a stream in which to store the assembly.
        IStream *pMainStream = NULL;
        HRESULT hr = pRootStorage->CreateStream(wszSimpleAsmName,
            STGM_DIRECT | STGM_CREATE | STGM_WRITE | STGM_SHARE_EXCLUSIVE,
            0, 0, &pMainStream);
        assert(SUCCEEDED(hr));

        // Write the assembly into the stream.
        ULONG ulSizeWritten = 0;
        hr = pMainStream->Write(pFile, dwSize, &ulSizeWritten);
        assert(SUCCEEDED(hr));
        assert(ulSizeWritten == dwSize);

        // Clean up - release the Stream, Unmap the file, and close handles.
        pMainStream->Release();
        UnmapViewOfFile(pFile);
        CloseHandle(hFileMapping);
        CloseHandle(hFile);

        // Add an entry to the index for this stream.
        CCLRIdentityManager *pIdentityManager = new CCLRIdentityManager();
        wchar_t *pBindingIdentity =
            pIdentityManager->GetBindingIdentityForFile(pAssemblyFileName);
        assert(pBindingIdentity);

        hr = pStreamIndex->AddIndexEntry(wszSimpleAsmName, pBindingIdentity);
        assert(SUCCEEDED(hr));

        free(pBindingIdentity);
        delete pIdentityManager;

        return hr;
    }

    // Create a stream that holds a string. Use this to write entry point data
    // into the storage and to record the binding identity of the assembly
    // containing the application's executable.
    HRESULT CreateStreamForString(IStorage *pRootStorage,
                                  wchar_t *pszStreamName,
                                  wchar_t *pszString)
    {
        wprintf(L"Creating String Stream containing: %s\n", pszString);

        // Create a stream in which to store the string.
        IStream *pStringStream = NULL;
        HRESULT hr = pRootStorage->CreateStream(pszStreamName,
            STGM_DIRECT | STGM_CREATE | STGM_WRITE | STGM_SHARE_EXCLUSIVE,
            0, 0, &pStringStream);
        assert(SUCCEEDED(hr));

        // Write the string to the stream.
        ULONG ulSizeWritten = 0;
        DWORD dwSize = wcslen(pszString)*sizeof(wchar_t);
        hr = pStringStream->Write(pszString, dwSize, &ulSizeWritten);
        assert(SUCCEEDED(hr));
        assert(ulSizeWritten == dwSize);

        pStringStream->Release();

        return S_OK;
    }

    int wmain(int argc, wchar_t* argv[])
    {
        // Make sure the correct number of arguments was passed.
        if (argc != 3)
        {
            wprintf(L"Usage: MakeCocoon <exe file name> <name of type containing Main()>\n");
            return 0;
        }

        // Construct the filename for the cocoon. I use the name of the exe
        // minus ".exe" + the ".cocoon" extension.
        wchar_t wszCocoonName[MAX_PATH];
        ZeroMemory(wszCocoonName, sizeof(wszCocoonName));
        wcsncpy(wszCocoonName, argv[1], wcslen(argv[1])-4);
        wcscat(wszCocoonName, L".cocoon");

        // Create the structured storage file in which to store the assemblies.
        wprintf(L"Creating Cocoon: %s\n", wszCocoonName);
        IStorage *pRootStorage = NULL;
        HRESULT hr = StgCreateDocfile(wszCocoonName,
            STGM_DIRECT | STGM_READWRITE | STGM_CREATE | STGM_SHARE_EXCLUSIVE,
            0, &pRootStorage);
        assert(SUCCEEDED(hr));

        // Create the index you'll use to map stream names to binding identities.
        CStreamIndex *pStreamIndex = new CStreamIndex(pRootStorage);

        // Initialize and start the CLR.
        ICLRRuntimeHost *pCLR = NULL;
        hr = CorBindToRuntimeEx(
            L"v2.0.41013",
            L"wks",
            STARTUP_CONCURRENT_GC,
            CLSID_CLRRuntimeHost,
            IID_ICLRRuntimeHost,
            (PVOID*) &pCLR);
        assert(SUCCEEDED(hr));

        pCLR->Start();

        // Obtain an identity manager. This is a helper class that wraps the
        // methods provided by ICLRAssemblyIdentityManager.
        CCLRIdentityManager *pIdentityManager = new CCLRIdentityManager();

        // Get the binding identity for the application's executable.
        wchar_t *pExeIdentity =
            pIdentityManager->GetBindingIdentityForFile(argv[1]);
        assert(pExeIdentity);

        // Create a stream to hold the binding identity of the exe file.
        hr = CreateStreamForString(pRootStorage, L"_exeBindingIdentity",
                                   pExeIdentity);
        assert(SUCCEEDED(hr));

        free(pExeIdentity);
        delete pIdentityManager;

        // Create a stream that contains the name of the type containing the
        // application's main() method.
        hr = CreateStreamForString(pRootStorage, L"_entryPoint", argv[2]);
        assert(SUCCEEDED(hr));

        // Create a stream for the exe file.
        hr = CreateStreamForAssembly(pRootStorage, pStreamIndex, argv[1]);
        assert(SUCCEEDED(hr));

        // Loop through the current directory creating streams for all
        // dependent assemblies.
        wchar_t bCurrentDir[MAX_PATH];
        GetCurrentDirectory(MAX_PATH, bCurrentDir);
        wcsncat(bCurrentDir, L"\\*", 2);

        WIN32_FIND_DATA fileData;
        HANDLE hFind = FindFirstFile(bCurrentDir, &fileData);
        assert(hFind != INVALID_HANDLE_VALUE);

        // Use do/while so the entry returned by FindFirstFile is examined too.
        do
        {
            // Determine if the file is a DLL - ignore everything else.
            wchar_t *pDllExtension = wcsstr(fileData.cFileName, L".dll");
            if (pDllExtension)
            {
                // Create a stream in the Compound File for the assembly.
                hr = CreateStreamForAssembly(pRootStorage, pStreamIndex,
                                             fileData.cFileName);
                assert(SUCCEEDED(hr));
            }
        } while (FindNextFile(hFind, &fileData) != 0);

        // Write the index to the structured storage file. This creates the
        // _index stream.
        pStreamIndex->WriteStream();

        // Clean up.
        delete pStreamIndex;
        FindClose(hFind);
        pRootStorage->Release();

        return 0;
    }
|
|
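Listing 8-1 identifies dependent assemblies with wcsstr, which matches ".dll" anywhere in the file name and is case sensitive. A stricter test checks only the end of the name and ignores case. The following is a minimal sketch of that idea in portable C++; HasDllExtension is a hypothetical helper that is not part of makecocoon.exe.

```cpp
#include <cassert>
#include <cwctype>
#include <string>

// Hypothetical helper (not part of makecocoon.exe): returns true only when
// fileName ends with ".dll", compared without regard to case. A name that
// consists of nothing but ".dll" is rejected.
bool HasDllExtension(const std::wstring& fileName)
{
    const std::wstring ext = L".dll";
    if (fileName.size() <= ext.size())
    {
        return false;
    }

    // Compare the last four characters against ".dll", lowercasing each
    // character of the file name first.
    const std::size_t offset = fileName.size() - ext.size();
    for (std::size_t i = 0; i < ext.size(); ++i)
    {
        const std::wint_t ch =
            std::towlower(static_cast<std::wint_t>(fileName[offset + i]));
        if (ch != static_cast<std::wint_t>(ext[i]))
        {
            return false;
        }
    }
    return true;
}
```

With this check, Payroll.dll and PAYROLL.DLL are accepted, while a name such as Payroll.dll.bak is skipped even though it contains the substring ".dll".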