The CComBSTR Class


The CComBSTR class is an ATL utility class that is a useful encapsulation for the COM string data type, BSTR. The atlcomcli.h file contains the definition of the CComBSTR class. The only state maintained by the class is a single public member variable, m_str, of type BSTR.

////////////////////////////////////////////////////
// CComBSTR
class CComBSTR {
public:
    BSTR m_str;
    ...
} ;


Constructors and Destructor

Eight constructors are available for CComBSTR objects. The default constructor simply initializes the m_str variable to NULL, which is equivalent to a BSTR that represents an empty string. The destructor destroys any BSTR contained in the m_str variable by calling SysFreeString. The SysFreeString function explicitly documents that the function simply returns when the input parameter is NULL so that the destructor can run on an empty object without a problem.

CComBSTR() { m_str = NULL; }
~CComBSTR() { ::SysFreeString(m_str); }


Later in this section, you will learn about numerous convenience methods that the CComBSTR class provides. However, one of the most compelling reasons for using the class is so that the destructor frees the internal BSTR at the appropriate time, so you don't have to free a BSTR explicitly. This is exceptionally convenient during times such as stack frame unwinding when locating an exception handler.

Probably the most frequently used constructor initializes a CComBSTR object from a pointer to a NUL-character-terminated array of OLECHAR characters or, as it's more commonly known, an LPCOLESTR.

CComBSTR(LPCOLESTR pSrc) {
    if (pSrc == NULL) m_str = NULL;
    else {
        m_str = ::SysAllocString(pSrc);
        if (m_str == NULL)
            AtlThrow(E_OUTOFMEMORY);
    }
}


You invoke the preceding constructor when you write code such as the following:

CComBSTR str1 (OLESTR ("This is a string of OLECHARs")) ; // [5]


[5] The OLESTR macro is similar to the _T macros; it guarantees that the string literal is of the proper type for an OLE string, depending on compile options.

The previous constructor copies characters until it finds the end-of-string NUL character terminator. When you want some lesser number of characters copied, such as the prefix to a string, or when you want to copy from a string that contains embedded NUL characters, you must explicitly specify the number of characters to copy. In this case, use the following constructor:

CComBSTR(int nSize, LPCOLESTR sz); 


This constructor creates a BSTR with room for the number of characters specified by nSize; copies the specified number of characters, including any embedded NUL characters, from sz; and then appends a terminating NUL character. When sz is NULL, SysAllocStringLen skips the copy step, creating an uninitialized BSTR of the specified size. You invoke the preceding constructor when you write code such as the following:

// str2 contains "This is a string"
CComBSTR str2 (16, OLESTR ("This is a string of OLECHARs"));
// Allocates an uninitialized BSTR with room for 64 characters
CComBSTR str3 (64, (LPCOLESTR) NULL);
// Allocates an uninitialized BSTR with room for 64 characters
CComBSTR str4 (64);
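The difference between the NUL-terminated and the counted constructor can be sketched in portable C++. This is only a stand-in, not ATL code: std::wstring plays the role of the BSTR, and the helper names CopyUntilNul and CopyCounted are hypothetical. Unlike the real counted constructor, the sketch does not model the uninitialized allocation you get when sz is NULL.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Mimics CComBSTR(LPCOLESTR): copying stops at the first NUL character.
// A null pointer yields an empty string, as it does for CComBSTR.
std::wstring CopyUntilNul(const wchar_t* src) {
    return src ? std::wstring(src) : std::wstring();
}

// Mimics CComBSTR(int nSize, LPCOLESTR sz): exactly count characters are
// copied, embedded NUL characters included.
std::wstring CopyCounted(std::size_t count, const wchar_t* src) {
    assert(src != nullptr);  // the NULL case is not modeled in this sketch
    return std::wstring(src, count);
}
```

Given the 17-character buffer L"part one\0part two", CopyUntilNul keeps only "part one", whereas CopyCounted(17, ...) keeps all 17 characters.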


The CComBSTR class provides a special constructor for the str3 example in the preceding code, which doesn't require you to provide the NULL argument. The preceding str4 example shows its use. Here's the constructor:

CComBSTR(int nSize) {
    ...
    m_str = ::SysAllocStringLen(NULL, nSize);
    ...
}


One odd semantic feature of a BSTR is that a NULL pointer is a valid value for an empty BSTR string. For example, Visual Basic considers a NULL BSTR to be equivalent to a pointer to an empty string, that is, a string of zero length in which the first character is the terminating NUL character. To put it symbolically, Visual Basic considers IF p = "", where p is a BSTR set to NULL, to be true. The SysStringLen API properly implements the checks; CComBSTR provides the Length method as a wrapper:

unsigned int Length() const { return ::SysStringLen(m_str); } 


You can also use the following copy constructor to create and initialize a CComBSTR object to be equivalent to an already initialized CComBSTR object:

CComBSTR(const CComBSTR& src) {
    m_str = src.Copy();
    ...
}


In the following code, creating the str5 variable invokes the preceding copy constructor to initialize the new object:

CComBSTR str1 (OLESTR("This is a string of OLECHARs")) ;
CComBSTR str5 = str1 ;


Note that the preceding copy constructor calls the Copy method on the source CComBSTR object. The Copy method makes a copy of its string and returns the new BSTR. Because the Copy method allocates the new BSTR using the length of the existing BSTR and copies the string contents for the specified length, the Copy method properly copies a BSTR that contains embedded NUL characters.

BSTR Copy() const {
    if (!*this) { return NULL; }
    return ::SysAllocStringByteLen((char*)m_str,
        ::SysStringByteLen(m_str));
}


Two constructors initialize a CComBSTR object from an LPCSTR string. The single argument constructor expects a NUL-terminated LPCSTR string. The two-argument constructor permits you to specify the length of the LPCSTR string. These two constructors are functionally equivalent to the two previously discussed constructors that accept an LPCOLESTR parameter. The following two constructors expect ANSI characters and create a BSTR that contains the equivalent string in OLECHAR characters:

CComBSTR(LPCSTR pSrc) {
    ...
    m_str = A2WBSTR(pSrc);
    ...
}
CComBSTR(int nSize, LPCSTR sz) {
    ...
    m_str = A2WBSTR(sz, nSize);
    ...
}


The final constructor is an odd one. It takes an argument that is a GUID and produces a string containing the string representation of the GUID.

CComBSTR(REFGUID src); 


This constructor is quite useful when building strings used during component registration. In a number of situations, you need to write the string representation of a GUID to the Registry. Some code that uses this constructor follows:

// Define a GUID as a binary constant
static const GUID GUID_Sample = { 0x8a44e110, 0xf134, 0x11d1,
    { 0x96, 0xb1, 0xBA, 0xDB, 0xAD, 0xBA, 0xDB, 0xAD } };
// Convert the binary GUID to its string representation
CComBSTR str6 (GUID_Sample) ;
// str6 contains "{8A44E110-F134-11D1-96B1-BADBADBADBAD}"
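To make the registry-style string format concrete, here is a portable sketch of the conversion. It is not the ATL implementation: SampleGuid is a hypothetical mirror of the Win32 GUID layout, GuidToString is an illustrative helper, and the sketch assumes the uppercase hexadecimal form used in Registry entries.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical mirror of the Win32 GUID structure, for illustration only.
struct SampleGuid {
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint8_t  Data4[8];
};

// Formats a GUID as {XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}.
std::string GuidToString(const SampleGuid& g) {
    char buf[39];  // 38 characters plus the terminating NUL
    const int n = std::snprintf(buf, sizeof(buf),
        "{%08lX-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X}",
        static_cast<unsigned long>(g.Data1),
        static_cast<unsigned>(g.Data2),
        static_cast<unsigned>(g.Data3),
        static_cast<unsigned>(g.Data4[0]), static_cast<unsigned>(g.Data4[1]),
        static_cast<unsigned>(g.Data4[2]), static_cast<unsigned>(g.Data4[3]),
        static_cast<unsigned>(g.Data4[4]), static_cast<unsigned>(g.Data4[5]),
        static_cast<unsigned>(g.Data4[6]), static_cast<unsigned>(g.Data4[7]));
    assert(n == 38);  // the format always yields exactly 38 characters
    return std::string(buf, static_cast<std::size_t>(n));
}
```

Note that the first two bytes of Data4 form the fourth group of the string and the remaining six bytes form the fifth group.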


Assignment

The CComBSTR class defines three assignment operators. The first one initializes a CComBSTR object using a different CComBSTR object. The second one initializes a CComBSTR object using an LPCOLESTR pointer. The third one initializes the object using an LPCSTR pointer. The following operator=() method initializes one CComBSTR object from another CComBSTR object:

CComBSTR& operator=(const CComBSTR& src) {
    if (m_str != src.m_str) {
        ::SysFreeString(m_str);
        m_str = src.Copy();
        if (!!src && !*this) { AtlThrow(E_OUTOFMEMORY); }
    }
    return *this;
}


Note that this assignment operator uses the Copy method, discussed earlier in this section, to make an exact copy of the specified CComBSTR instance. You invoke this operator when you write code such as the following:

CComBSTR str1 (OLESTR("This is a string of OLECHARs"));
CComBSTR str7 ;
str7 = str1; // str7 contains "This is a string of OLECHARs"
str7 = str7; // This is a NOP. Assignment operator
             // detects this case


The second operator=() method initializes one CComBSTR object from an LPCOLESTR pointer to a NUL-character-terminated string.

CComBSTR& operator=(LPCOLESTR pSrc) {
    if (pSrc != m_str) {
        ::SysFreeString(m_str);
        if (pSrc != NULL) {
            m_str = ::SysAllocString(pSrc);
            if (!*this) { AtlThrow(E_OUTOFMEMORY); }
        } else {
            m_str = NULL;
        }
    }
    return *this;
}


Note that this assignment operator uses the SysAllocString function to allocate a BSTR copy of the specified LPCOLESTR argument. You invoke this operator when you write code such as the following:

CComBSTR str8 ;
str8 = OLESTR ("This is a string of OLECHARs");


It's quite easy to misuse this assignment operator when you're dealing with strings that contain embedded NUL characters. For example, the following code demonstrates how to use and misuse this method:

CComBSTR str9 ;
str9 = OLESTR ("This works as expected");
// BSTR bstrInput contains "This is part one\0and here's part two"
CComBSTR str10 ;
str10 = bstrInput; // str10 now contains "This is part one"


To properly handle situations such as this one, you should turn to the AssignBSTR method. This method is implemented very much like operator=(LPCOLESTR), except that it uses SysAllocStringByteLen.

HRESULT AssignBSTR(const BSTR bstrSrc) {
    HRESULT hr = S_OK;
    if (m_str != bstrSrc) {
        ::SysFreeString(m_str);
        if (bstrSrc != NULL) {
            m_str = ::SysAllocStringByteLen((char*)bstrSrc,
                ::SysStringByteLen(bstrSrc));
            if (!*this) { hr = E_OUTOFMEMORY; }
        } else {
            m_str = NULL;
        }
    }
    return hr;
}


You can modify the code as follows:

CComBSTR str9 ;
str9 = OLESTR ("This works as expected");
// BSTR bstrInput contains
// "This is part one\0and here's part two"
CComBSTR str10 ;
str10.AssignBSTR(bstrInput);     // works properly
// str10 now contains "This is part one\0and here's part two"


The third operator=() method initializes one CComBSTR object using an LPCSTR pointer to a NUL-character-terminated string. The operator converts the input string, which is in ANSI characters, to a Unicode string; then it creates a BSTR containing the Unicode string.

CComBSTR& operator=(LPCSTR pSrc) {
    ::SysFreeString(m_str);
    m_str = A2WBSTR(pSrc);
    if (!*this && pSrc != NULL) { AtlThrow(E_OUTOFMEMORY); }
    return *this;
}


The final assignment methods are two overloaded methods called LoadString.

bool LoadString(HINSTANCE hInst, UINT nID) ;
bool LoadString(UINT nID) ;


The first loads the specified string resource nID from the specified module hInst (using the instance handle). The second loads the specified string resource nID from the current module using the global variable _AtlBaseModule.

CComBSTR Operations

Four methods give you access, in varying ways, to the internal BSTR string that is encapsulated by the CComBSTR class. The operator BSTR() method enables you to use a CComBSTR object in situations where a raw BSTR pointer is required. You invoke this method any time you cast a CComBSTR object to a BSTR implicitly or explicitly.

operator BSTR() const { return m_str; } 


Frequently, you invoke this operator implicitly when you pass a CComBSTR object as a parameter to a function that expects a BSTR. The following code demonstrates this:

HRESULT put_Name (/* [in] */ BSTR pNewValue) ;
CComBSTR bstrName = OLESTR ("Frodo Baggins");
put_Name (bstrName); // Implicit cast to BSTR


The operator&() method returns the address of the internal m_str variable when you take the address of a CComBSTR object. Use care when taking the address of a CComBSTR object. Because the operator&() method returns the address of the internal BSTR variable, you can overwrite the internal variable without first freeing the string. This causes a memory leak. However, if you define the macro ATL_CCOMBSTR_ADDRESS_OF_ASSERT in your project settings, you get an assertion to help catch this error.

#ifndef ATL_CCOMBSTR_ADDRESS_OF_ASSERT
// Temp disable CComBSTR::operator& Assert
#define ATL_NO_CCOMBSTR_ADDRESS_OF_ASSERT
#endif
BSTR* operator&() {
#ifndef ATL_NO_CCOMBSTR_ADDRESS_OF_ASSERT
    ATLASSERT(!*this);
#endif
    return &m_str;
}


This operator is quite useful when you are receiving a BSTR pointer as the output of some method call. You can store the returned BSTR directly into a CComBSTR object so that the object manages the lifetime of the string.

HRESULT get_Name (/* [out] */ BSTR* pName);
CComBSTR bstrName ;
get_Name (&bstrName); // bstrName empty so no memory leak


The CopyTo method makes a duplicate of the string encapsulated by a CComBSTR object and copies the duplicate's BSTR pointer to the specified location. You must free the returned BSTR explicitly by calling SysFreeString.

HRESULT CopyTo(BSTR* pbstr); 


This method is handy when you need to return a copy of an existing BSTR property to a caller. For example:

STDMETHODIMP SomeClass::get_Name (/* [out] */ BSTR* pName) {
  // Name is maintained in variable m_strName of type CComBSTR
  return m_strName.CopyTo (pName);
}


The Detach method returns the BSTR contained by a CComBSTR object. It empties the object so that the destructor will not attempt to release the internal BSTR. You must free the returned BSTR explicitly by calling SysFreeString.

BSTR Detach() { BSTR s = m_str; m_str = NULL; return s; } 


You use this method when you have a string in a CComBSTR object that you want to return to a caller and you no longer need to keep the string. In this situation, using the CopyTo method would be less efficient because you would make a copy of a string, return the copy, and then discard the original string. Use Detach as follows to return the original string directly:

STDMETHODIMP SomeClass::get_Label (/* [out] */ BSTR* pName) {
  CComBSTR strLabel;
  // Generate the returned string in strLabel here
  *pName = strLabel.Detach ();
  return S_OK;
}


The Attach method performs the inverse operation. It attaches a BSTR to an empty CComBSTR object. Ownership of the BSTR now resides with the CComBSTR object, and the object's destructor will eventually free the string. Note that if the CComBSTR already contains a string, it releases the string before it takes control of the new BSTR.

void Attach(BSTR src) {
    if (m_str != src) {
        ::SysFreeString(m_str);
        m_str = src;
    }
}


Use care when using the Attach method. You must have ownership of the BSTR you are attaching to a CComBSTR object because eventually the object will attempt to destroy the BSTR. For example, the following code is incorrect:

STDMETHODIMP SomeClass::put_Name (/* [in] */ BSTR bstrName) {
  // Name is maintained in variable m_strName of type CComBSTR
  m_strName.Attach (bstrName); // Wrong! We don't own bstrName
  return E_BONEHEAD;
}


More often, you use Attach when you're given ownership of a BSTR and you want a CComBSTR object to manage the lifetime of the string.

STDMETHODIMP SomeClass::get_Name (/* [out] */ BSTR* pName);
...
BSTR bstrName;
pObj->get_Name (&bstrName); // We own and must free the raw BSTR
CComBSTR strName;
strName.Attach(bstrName); // Attach raw BSTR to the object
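The ownership rules behind Detach and Attach can be checked with a small portable model. Nothing here is ATL code: FakeAlloc and FakeFree are hypothetical stand-ins for SysAllocString and SysFreeString that count live allocations, and Owner is a minimal class in the spirit of CComBSTR. A balance of zero at the end means every string was freed exactly once.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Number of strings allocated by FakeAlloc and not yet freed.
static int g_liveStrings = 0;

// Hypothetical stand-in for SysAllocString.
char* FakeAlloc(const char* src) {
    char* copy = static_cast<char*>(std::malloc(std::strlen(src) + 1));
    if (copy == nullptr) return nullptr;
    std::strcpy(copy, src);
    ++g_liveStrings;
    return copy;
}

// Hypothetical stand-in for SysFreeString: a no-op on a null pointer.
void FakeFree(char* s) {
    if (s == nullptr) return;
    assert(g_liveStrings > 0);  // freeing more than was allocated is a bug
    std::free(s);
    --g_liveStrings;
}

// Minimal owner modeled on CComBSTR (not the ATL class itself).
class Owner {
public:
    Owner() : m_str(nullptr) {}
    ~Owner() { FakeFree(m_str); }
    Owner(const Owner&) = delete;
    Owner& operator=(const Owner&) = delete;

    // Take ownership of src, releasing any string already held.
    void Attach(char* src) {
        if (m_str != src) {
            FakeFree(m_str);
            m_str = src;
        }
    }
    // Hand ownership to the caller; the destructor then frees nothing.
    char* Detach() { char* s = m_str; m_str = nullptr; return s; }

    char* m_str;
};

// Moves one string through Detach and Attach, then reports the balance.
int RoundTrip() {
    char* raw = nullptr;
    {
        Owner producer;
        producer.Attach(FakeAlloc("label"));
        raw = producer.Detach();   // the caller now owns the string
    }                              // producer's destructor frees nothing
    {
        Owner consumer;
        consumer.Attach(raw);      // consumer now owns the string
    }                              // consumer's destructor frees it
    return g_liveStrings;
}
```

Attaching a string that some other party still owns would make the model free it twice, which is the same mistake as the incorrect put_Name example.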


You can explicitly free the string encapsulated in a CComBSTR object by calling Empty. The Empty method releases any internal BSTR and sets the m_str member variable to NULL. The SysFreeString function explicitly documents that the function simply returns when the input parameter is NULL so that you can call Empty on an empty object without a problem.

void Empty() { ::SysFreeString(m_str); m_str = NULL; } 


CComBSTR supplies two additional interesting methods. These methods enable you to convert BSTR strings to and from SAFEARRAYs, which might be useful for converting to and from string data to adapt to a specific method signature. Chapter 3, "ATL Smart Types," presents a smart class for handling SAFEARRAYs.

HRESULT BSTRToArray(LPSAFEARRAY *ppArray) {
    return VectorFromBstr(m_str, ppArray);
}
HRESULT ArrayToBSTR(const SAFEARRAY *pSrc) {
    ::SysFreeString(m_str);
    return BstrFromVector((LPSAFEARRAY)pSrc, &m_str);
}


As you can see, these methods merely serve as thin wrappers for the Win32 functions VectorFromBstr and BstrFromVector. BSTRToArray assigns each character of the encapsulated string to an element of a one-dimensional SAFEARRAY provided by the caller. Note that the caller is responsible for freeing the SAFEARRAY. ArrayToBSTR does just the opposite: It accepts a pointer to a one-dimensional SAFEARRAY and builds a BSTR in which each element of the SAFEARRAY becomes a character in the internal BSTR. CComBSTR frees the encapsulated BSTR before overwriting it with the values from the SAFEARRAY. ArrayToBSTR accepts only SAFEARRAYs that contain char type elements; otherwise, the function returns a type mismatch error.

String Concatenation Using CComBSTR

Eight methods concatenate a specified string with a CComBSTR object: six overloaded Append methods, one AppendBSTR method, and the operator+=() method.

HRESULT Append(LPCOLESTR lpsz, int nLen);
HRESULT Append(LPCOLESTR lpsz);
HRESULT Append(LPCSTR);
HRESULT Append(char ch);
HRESULT Append(wchar_t ch);
HRESULT Append(const CComBSTR& bstrSrc);
CComBSTR& operator+=(const CComBSTR& bstrSrc);
HRESULT AppendBSTR(BSTR p);


The Append(LPCOLESTR lpsz, int nLen) method computes the sum of the length of the current string plus the specified nLen value, and allocates an empty BSTR of the correct size. It copies the original string into the new BSTR and then concatenates nLen characters of the lpsz string onto the end of the new BSTR. Finally, it frees the original string and replaces it with the new BSTR.

CComBSTR strSentence = OLESTR("Now is ");
strSentence.Append(OLESTR("the time of day is 03:00 PM"), 9);
// strSentence contains "Now is the time "
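The steps that Append(LPCOLESTR lpsz, int nLen) is described as taking can be sketched in portable C++, with std::wstring standing in for the BSTR. AppendCounted is a hypothetical helper for illustration, not the ATL implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds a new string from the current contents plus nLen characters of
// lpsz, then replaces the current string with the result.
void AppendCounted(std::wstring& current, const wchar_t* lpsz,
                   std::size_t nLen) {
    assert(lpsz != nullptr);
    std::wstring combined;
    combined.reserve(current.size() + nLen);  // room for both parts
    combined.append(current);                 // copy the original string
    combined.append(lpsz, nLen);              // then nLen characters of lpsz
    current.swap(combined);                   // replace the original
}
```

Starting from L"Now is " and appending the first 9 characters of L"the time of day is 03:00 PM" leaves L"Now is the time ", matching the example above.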


The remaining overloaded Append methods all use the first method to perform the real work. They differ only in the manner in which the method obtains the string and its length. The Append(LPCOLESTR lpsz) method appends the contents of a NUL-character-terminated string of OLECHAR characters. The Append(LPCSTR lpsz) method appends the contents of a NUL-character-terminated string of ANSI characters. Individual characters can be appended using either Append(char ch) or Append(wchar_t ch). The Append(const CComBSTR& bstrSrc) method appends the contents of another CComBSTR object. For notational and syntactic convenience, the operator+=() method also appends the specified CComBSTR to the current string.

CComBSTR str11 (OLESTR("for all good men "));
// calls Append(const CComBSTR& bstrSrc);
strSentence.Append(str11);
// strSentence contains "Now is the time for all good men "
// calls Append(LPCOLESTR lpsz);
strSentence.Append(OLESTR("to come "));
// strSentence contains "Now is the time for all good men to come "
// calls Append(LPCSTR lpsz);
strSentence.Append("to the aid ");
// strSentence contains
// "Now is the time for all good men to come to the aid "
CComBSTR str12 (OLESTR("of their country"));
strSentence += str12; // calls operator+=()
// "Now is the time for all good men to come to
// the aid of their country"


When you call Append using a BSTR parameter, you are actually calling the Append(LPCOLESTR lpsz) method because, to the compiler, the BSTR argument is an OLECHAR* argument. Therefore, the method appends characters from the BSTR until it encounters the first NUL character. When you want to append the contents of a BSTR that possibly contains embedded NUL characters, you must explicitly call the AppendBSTR method.

One additional method exists for appending an array that contains binary data:

HRESULT AppendBytes(const char* lpsz, int nLen); 


AppendBytes does not perform a conversion from ANSI to Unicode. The method uses SysAllocStringByteLen to properly allocate a BSTR of nLen bytes (not characters) and append the result to the existing CComBSTR.

You can't go wrong following these guidelines:

  • When the parameter is a BSTR, use the AppendBSTR method to append the entire BSTR, regardless of whether it contains embedded NUL characters.

  • When the parameter is an LPCOLESTR or an LPCSTR, use the Append method to append the NUL-character-terminated string.

  • So much for function overloading. . .

Character Case Conversion

The two character case-conversion methods, ToLower and ToUpper, convert the internal string to lowercase or uppercase, respectively. In Unicode builds, the conversion is performed in place using the Win32 CharLowerBuff API (CharUpperBuff for ToUpper). In ANSI builds, the internal character string is first converted to MBCS and then CharLowerBuff is invoked. The resulting string is then converted back to Unicode and stored in a newly allocated BSTR. Any string data stored in m_str is freed using SysFreeString before it is overwritten. When everything works, the new string replaces the original string as the contents of the CComBSTR object.

HRESULT ToLower() {
    if (m_str != NULL) {
#ifdef _UNICODE
        // Convert in place
        CharLowerBuff(m_str, Length());
#else
        UINT _acp = _AtlGetConversionACP();
        ...
        int nRet = WideCharToMultiByte(
            _acp, 0, m_str, Length(),
            pszA, _convert, NULL, NULL);
        ...
        CharLowerBuff(pszA, nRet);
        nRet = MultiByteToWideChar(_acp, 0, pszA, nRet,
                                   pszW, _convert);
        ...
        BSTR b = ::SysAllocStringByteLen(
            (LPCSTR) (LPWSTR) pszW,
            nRet * sizeof(OLECHAR));
        if (b == NULL)
            return E_OUTOFMEMORY;
        SysFreeString(m_str);
        m_str = b;
#endif
    }
    return S_OK;
}


Note that these methods convert case properly even when the original string contains embedded NUL characters. Also note, however, that in ANSI builds the conversion is potentially lossy, in the sense that it cannot convert a character when the local code page doesn't contain a character equivalent to the original Unicode character.

CComBSTR Comparison Operators

The simplest comparison operator is operator!(). It returns true when the CComBSTR object is empty, and false otherwise.

bool operator!() const { return (m_str == NULL); } 


There are four overloaded versions of the operator<() methods, four of the operator>() methods, and five of the operator==() and operator!=() methods. The additional overload for operator==() simply handles the special case of comparison to NULL. The code in all these methods is nearly the same, so I discuss only the operator<() methods; the comments apply equally to the operator>() and operator==() methods.

These operators internally use the VarBstrCmp function, so unlike previous versions of ATL that did not properly compare two CComBSTRs that contain embedded NUL characters, these new operators handle the comparison correctly most of the time. So, the following code works as expected. Later in this section, I discuss properly initializing CComBSTR objects with embedded NULs.

BSTR bstrIn1 =
    SysAllocStringLen(
        OLESTR("Here's part 1\0and here's part 2"), 31);
BSTR bstrIn2 =
    SysAllocStringLen(
        OLESTR("Here's part 1\0and here is part 2"), 32);
CComBSTR bstr1(::SysStringLen(bstrIn1), bstrIn1);
CComBSTR bstr2(::SysStringLen(bstrIn2), bstrIn2);
bool b = bstr1 == bstr2; // correctly returns false
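Why a length-aware comparison matters can be shown with a portable sketch. It is a simplification: VarBstrCmp performs a locale-aware comparison, whereas the hypothetical helpers below compare ordinally, and std::wstring stands in for the BSTR.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// NUL-terminated comparison: stops at the first NUL in either string.
bool EqualUntilNul(const wchar_t* a, const wchar_t* b) {
    assert(a != nullptr && b != nullptr);
    return std::wcscmp(a, b) == 0;
}

// Counted comparison: examines every character, embedded NULs included.
bool EqualCounted(const wchar_t* a, std::size_t aLen,
                  const wchar_t* b, std::size_t bLen) {
    assert(a != nullptr && b != nullptr);
    return std::wstring(a, aLen) == std::wstring(b, bLen);
}
```

Two buffers that share a prefix up to the first NUL but differ afterward compare equal with EqualUntilNul and unequal with EqualCounted.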


In the first overloaded version of the operator<() method, the operator compares against a provided CComBSTR argument.

bool operator<(const CComBSTR& bstrSrc) const {
    return VarBstrCmp(m_str, bstrSrc.m_str,
        LOCALE_USER_DEFAULT, 0) ==
        VARCMP_LT;
}


In the second overloaded version of the operator<() method, the operator compares against a provided LPCSTR argument. An LPCSTR isn't the same character type as the internal BSTR string, which contains wide characters. Therefore, the method constructs a temporary CComBSTR and delegates the work to operator<(const CComBSTR& bstrSrc), just shown.

bool operator<(LPCSTR pszSrc) const {
    CComBSTR bstr2(pszSrc);
    return operator<(bstr2);
}


The third overload for the operator<() method accepts an LPCOLESTR and operates very much like the previous overload:

bool operator<(LPCOLESTR pszSrc) const {
    CComBSTR bstr2(pszSrc);
    return operator<(bstr2);
}


The fourth overload for the operator<() method accepts an LPOLESTR; the implementation does a quick cast and calls the LPCOLESTR version to do the work:

bool operator<(LPOLESTR pszSrc) const {
    return operator<((LPCOLESTR)pszSrc);
}


CComBSTR Persistence Support

The last two methods of the CComBSTR class read and write a BSTR string to and from a stream. The WriteToStream method writes a ULONG count containing the number of bytes in the BSTR, including the terminating NUL character, to a stream. It writes the BSTR characters to the stream immediately following the count. Note that the method does not tag the stream with an indication of the byte order used to write the data. Therefore, as is frequently the case for stream data, a CComBSTR object writes its string to the stream in a hardware-architecture-specific format.

HRESULT WriteToStream(IStream* pStream) {
    ATLASSERT(pStream != NULL);
    if(pStream == NULL)
        return E_INVALIDARG;
    ULONG cb;
    ULONG cbStrLen = ULONG(m_str ?
        SysStringByteLen(m_str)+sizeof(OLECHAR) : 0);
    HRESULT hr = pStream->Write((void*) &cbStrLen,
        sizeof(cbStrLen), &cb);
    if (FAILED(hr))
        return hr;
    return cbStrLen ?
        pStream->Write((void*) m_str, cbStrLen, &cb) :
        S_OK;
}


The ReadFromStream method reads a ULONG count of bytes from the specified stream, allocates a BSTR of the correct size, and then reads the characters directly into the BSTR string. The CComBSTR object must be empty when you call ReadFromStream; otherwise, you will receive an assertion from a debug build or will leak memory in a release build.

HRESULT ReadFromStream(IStream* pStream) {
    ATLASSERT(pStream != NULL);
    ATLASSERT(!*this); // should be empty
    ULONG cbStrLen = 0;
    HRESULT hr = pStream->Read((void*) &cbStrLen,
        sizeof(cbStrLen), NULL);
    if ((hr == S_OK) && (cbStrLen != 0)) {
        //subtract size for terminating NULL which we wrote out
        //since SysAllocStringByteLen overallocates for the NULL
        m_str = SysAllocStringByteLen(NULL,
            cbStrLen-sizeof(OLECHAR));
        if (!*this) hr = E_OUTOFMEMORY;
        else hr = pStream->Read((void*) m_str, cbStrLen, NULL);
        ...
    }
    if (hr == S_FALSE) hr = E_FAIL;
    return hr;
}
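The count-then-bytes format can be modeled in portable C++. This is a sketch under stated assumptions, not ATL code: a std::string buffer stands in for IStream, char16_t stands in for OLECHAR, and WriteCounted and ReadCounted are hypothetical names. Like the ATL methods, the sketch writes the count in the host's native byte order. Unlike them, it has no NULL string, so it never writes a zero count.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Writes a 32-bit byte count followed by the characters. The count
// includes the terminating NUL, as in WriteToStream.
std::string WriteCounted(const std::u16string& text) {
    const std::uint32_t cb =
        static_cast<std::uint32_t>((text.size() + 1) * sizeof(char16_t));
    std::string stream(sizeof(cb) + cb, '\0');
    std::memcpy(&stream[0], &cb, sizeof(cb));
    std::memcpy(&stream[sizeof(cb)], text.c_str(), cb);
    assert(stream.size() == sizeof(cb) + cb);
    return stream;
}

// Reads the count, validates it against the buffer, and restores the text.
bool ReadCounted(const std::string& stream, std::u16string& out) {
    std::uint32_t cb = 0;
    if (stream.size() < sizeof(cb)) return false;
    std::memcpy(&cb, stream.data(), sizeof(cb));
    if (cb < sizeof(char16_t) || cb % sizeof(char16_t) != 0) return false;
    if (stream.size() - sizeof(cb) < cb) return false;
    // Drop the terminator that was written out; the string adds its own.
    out.assign(cb / sizeof(char16_t) - 1, u'\0');
    if (!out.empty()) {
        std::memcpy(&out[0], stream.data() + sizeof(cb),
                    out.size() * sizeof(char16_t));
    }
    return true;
}
```

Because the byte count travels with the data, a string with embedded NUL characters survives the round trip intact. A reader on a machine with a different byte order would misread both the count and the characters.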


Minor Rant on BSTRs, Embedded NUL Characters in Strings, and Life in General

The compiler considers the types BSTR and OLECHAR* to be synonymous. In fact, the BSTR symbol is simply a typedef for OLECHAR*. For example, from wtypes.h:

typedef /* [wire_marshal] */ OLECHAR __RPC_FAR *BSTR; 


This is more than somewhat brain damaged. An arbitrary BSTR is not an OLECHAR*, and an arbitrary OLECHAR* is not a BSTR. One is often misled in this regard because frequently a BSTR works just fine as an OLECHAR*.

STDMETHODIMP SomeClass::put_Name (LPCOLESTR pName) ;
BSTR bstrInput = ...
pObj->put_Name (bstrInput) ; // This works just fine... usually
SysFreeString (bstrInput) ;


In the previous example, because the bstrInput argument is defined to be a BSTR, it can contain embedded NUL characters within the string. The put_Name method, which expects an LPCOLESTR (a NUL-character-terminated string), will probably save only the characters preceding the first embedded NUL character. In other words, it will cut the string short.

You also cannot use a BSTR where an [out] OLECHAR* parameter is required. For example:

STDMETHODIMP SomeClass::get_Name(OLECHAR** ppName) {
  BSTR bstrOutput =... // Produce BSTR string to return
  *ppName = bstrOutput ; // This compiles just fine
  return S_OK ;          // but leaks memory as caller
                         // doesn't release BSTR
}


Conversely, you cannot use an OLECHAR* where a BSTR is required. When it does happen to work, it's a latent bug. For example, the following code is incorrect:

STDMETHODIMP SomeClass::put_Name (BSTR bstrName) ;
// Wrong! Wrong! Wrong!
pObj->put_Name (OLESTR("This is not a BSTR!")) ;


If the put_Name method calls SysStringLen to obtain the length of the BSTR, it will try to get the length from the integer preceding the string, but there is no such integer. Things get worse if the put_Name method is remoted, that is, lives out-of-process. In this case, the marshaling code will call SysStringLen to obtain the number of characters to place in the request packet. This is usually a huge number (4 bytes from the preceding string in the literal pool, in this example) and often causes a crash while trying to copy the string.
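The role of the length prefix is easier to see in a portable model of the layout: a 4-byte byte count, then the characters, then a terminating NUL, with the string pointer addressing the first character. The names below (ModelAllocLen, ModelLen, ModelFree) are hypothetical, and the model uses malloc rather than the real OLE Automation allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

using ModelChar = char16_t;   // stands in for OLECHAR
using ModelBstr = ModelChar*; // points at the characters, not the prefix

// Allocates prefix + count characters + terminator and returns a pointer
// to the characters. A null src leaves the characters uninitialized.
ModelBstr ModelAllocLen(const ModelChar* src, std::uint32_t count) {
    const std::size_t bytes = count * sizeof(ModelChar);
    auto* block = static_cast<unsigned char*>(
        std::malloc(sizeof(std::uint32_t) + bytes + sizeof(ModelChar)));
    if (block == nullptr) return nullptr;
    const std::uint32_t prefix = static_cast<std::uint32_t>(bytes);
    std::memcpy(block, &prefix, sizeof(prefix));        // length prefix
    ModelBstr text = reinterpret_cast<ModelBstr>(block + sizeof(prefix));
    if (src != nullptr) std::memcpy(text, src, bytes);  // counted copy
    text[count] = 0;                                    // terminator
    return text;
}

// Reads the length from the prefix; a null pointer is an empty string.
std::uint32_t ModelLen(ModelBstr s) {
    if (s == nullptr) return 0;
    std::uint32_t prefix = 0;
    std::memcpy(&prefix,
                reinterpret_cast<unsigned char*>(s) - sizeof(prefix),
                sizeof(prefix));
    return prefix / static_cast<std::uint32_t>(sizeof(ModelChar));
}

// Frees the whole block, starting at the prefix; a no-op on null.
void ModelFree(ModelBstr s) {
    if (s == nullptr) return;
    std::free(reinterpret_cast<unsigned char*>(s) - sizeof(std::uint32_t));
}
```

Passing a plain string literal to ModelLen would read whatever 4 bytes happen to precede the literal, which is the failure described above.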

Because the compiler cannot tell the difference between a BSTR and an OLECHAR*, it's quite easy to accidentally call a method in CComBSTR that doesn't work correctly when you are using a BSTR that contains embedded NUL characters. The following discussion shows exactly which methods you must use for these kinds of BSTRs.

To construct a CComBSTR, you must specify the length of the string:

BSTR bstrInput =
  SysAllocStringLen (
    OLESTR ("This is part one\0and here's part two"),
    36) ;
CComBSTR str8 (bstrInput) ; // Wrong! Unexpected behavior here
                            // Note: str8 contains only
                            // "This is part one"
CComBSTR str9 (::SysStringLen (bstrInput),
    bstrInput); // Correct!
// str9 contains "This is part one\0and here's part two"


Assigning a BSTR that contains embedded NUL characters to a CComBSTR object never works. For example:

// BSTR bstrInput contains
// "This is part one\0and here's part two"
CComBSTR str10;
str10 = bstrInput; // Wrong! Unexpected behavior here
                   // str10 now contains "This is part one"


The easiest way to perform an assignment of a BSTR is to use the Empty and AppendBSTR methods:

str10.Empty();                // Ensure object is initially empty
str10.AppendBSTR (bstrInput); // This works!


In practice, although a BSTR can potentially contain embedded NUL characters, most of the time it doesn't. Of course, this means that, most of the time, you don't see the latent bugs caused by incorrect BSTR use.




ATL Internals: Working with ATL 8 (2nd Edition)
ISBN: 0321159624
Year: 2004
Pages: 172
