A.3 The One-Definition Rule in Detail | C++ Templates: The Complete Guide


As we implied in the introduction to this appendix, there are many details to the actual rule. We organize the rule's constraints by their scope.

A.3.1 One-per-Program Constraints

There can be at most one definition of the following items per program:

Noninline functions and noninline member functions
Variables with external linkage (essentially, variables declared in a namespace scope or in the global scope, and without the static specifier)
Static data members
Noninline function templates, noninline member function templates, and noninline members of class templates when they are declared with export
Static data members of class templates when they are declared with export

For example, a C++ program consisting of the following two translation units is invalid: ^[2]

^[2] Interestingly, it is valid C because C has a concept of tentative definition, which is a variable definition without an initializer and can appear more than once in a program.

  // Translation unit 1:
  int counter;

  // Translation unit 2:
  int counter;  // ERROR: defined twice! (ODR violation)
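The usual way to stay within this constraint is to repeat only declarations and to keep a single definition. The following single-file sketch illustrates the idea. The file names counter.hpp and counter.cpp and the functions next_count() and check_counter() are hypothetical, and the comments mark which lines would normally live in which file.

```cpp
#include <cassert>

// Would normally live in a shared header (hypothetical counter.hpp).
extern int counter;        // declaration only: may be repeated freely
extern int counter;        // a second declaration is harmless
int next_count();          // function declaration, also repeatable

// Would normally live in exactly one source file (hypothetical counter.cpp).
int counter = 0;           // the one definition in the whole program

int next_count()           // the one definition of this noninline function
{
    return ++counter;
}

// Hypothetical self-check: call it from main() to exercise the sketch.
void check_counter()
{
    int before = counter;
    assert(next_count() == before + 1);
    assert(counter == before + 1);
}
```

Every translation unit that includes the header sees the declarations, but only one object named counter and one function named next_count() exist in the program.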

This rule does not apply to entities with internal linkage (essentially, entities declared in an unnamed namespace scope or in the global scope using the static specifier) because even when two such entities have the same name, they are considered distinct. In the same vein, entities declared in unnamed namespaces are considered distinct if they appear in distinct translation units. For example, the following two translation units can be combined into a valid C++ program:

  // Translation unit 1:
  static int counter = 2;  // unrelated to other translation units

  namespace {
      void unique()        // unrelated to other translation units
      {
      }
  }

  // Translation unit 2:
  static int counter = 0;  // unrelated to other translation units

  namespace {
      void unique()        // unrelated to other translation units
      {
          ++counter;
      }
  }

  int main()
  {
      unique();
  }

Furthermore, there must be exactly one of the previously mentioned items in the program if they are used. The term used in this context has a precise meaning: it indicates that there is some sort of reference to the entity somewhere in the program. This reference can be an access to the value of a variable, a call to a function, or the taking of the address of such an entity. The reference can be explicit in the source, or it can be implicit. For example, a new expression may create an implicit call to the associated delete operator to handle situations in which a constructor throws an exception, requiring the unused (but allocated) memory to be cleaned up. Another example consists of copy constructors, which must be defined even if they end up being optimized away. Virtual functions are also implicitly used (by the internal structures that enable virtual function calls), unless they are pure virtual functions. Several other kinds of implicit uses exist, but we omit them for the sake of conciseness.
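The implicit use of the delete operator by a new expression can be observed directly. The sketch below is an illustration only: the class Fragile, its counter delete_calls, and the helper functions are hypothetical names. The class supplies its own operator new and operator delete so that the implicit call becomes visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical type whose constructor always fails.
struct Fragile {
    static int delete_calls;   // counts calls to the class-specific operator delete

    Fragile() { throw std::runtime_error("construction failed"); }

    static void* operator new(std::size_t size)
    {
        if (void* p = std::malloc(size)) {
            return p;
        }
        throw std::bad_alloc();
    }

    static void operator delete(void* p)
    {
        ++delete_calls;
        std::free(p);
    }
};

int Fragile::delete_calls = 0;

// Returns how often operator delete ran for one failed new expression.
int count_implicit_deletes()
{
    Fragile::delete_calls = 0;
    try {
        Fragile* f = new Fragile;   // no delete expression appears anywhere
        (void)f;                    // never reached: the constructor throws
    } catch (const std::runtime_error&) {
        // the storage has already been released by the implicit call
    }
    return Fragile::delete_calls;
}

// Hypothetical self-check: call it from main() to exercise the sketch.
void check_implicit_delete()
{
    assert(count_implicit_deletes() == 1);
}
```

Although the source contains no delete expression, operator delete must be defined because the new expression uses it implicitly.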

There are two kinds of references that do not constitute a use in the previous sense. The first kind occurs when a reference to an entity appears as part of a sizeof operator. The second kind is similar but with a twist: If a reference appears as part of a typeid operator (see Section 5.6 on page 58), it is not a use in the previous sense, unless the argument of the typeid operator ends up designating a polymorphic object (an object with (possibly inherited) virtual functions). For example, consider the following single-file program:

  #include <typeinfo>

  class Decider {
  #if defined(DYNAMIC)
      virtual ~Decider() {
      }
  #endif
  };

  extern Decider d;

  int main()
  {
      const char* name = typeid(d).name();
      return (int)sizeof(d);
  }

This is a valid program if and only if the preprocessor symbol DYNAMIC is not defined. Indeed, the variable d is not defined, but the reference to d in sizeof(d) does not constitute a use, and the reference in typeid(d) is a use only if d is an object of a polymorphic type (because in general it is not always possible to determine the result of a polymorphic typeid operation until run time).
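The same reasoning covers functions that are declared but never defined. The following sketch is an illustration with hypothetical names (Plain, never_defined, also_never_defined, and the helper functions): nothing here requires a definition of the declared entities, because they appear only inside sizeof and inside typeid applied to a nonpolymorphic type.

```cpp
#include <cassert>
#include <cstddef>
#include <typeinfo>

struct Plain {                     // nonpolymorphic: no virtual functions
    int value;
};

extern Plain never_defined;        // declared, but defined nowhere
double also_never_defined();       // declared, but defined nowhere

std::size_t plain_size()
{
    return sizeof(never_defined);              // not a use of never_defined
}

std::size_t result_size()
{
    return sizeof(also_never_defined());       // the call is never made
}

bool same_static_type()
{
    return typeid(never_defined) == typeid(Plain);   // resolved from the static type
}

// Hypothetical self-check: call it from main() to exercise the sketch.
void check_unevaluated()
{
    assert(plain_size() == sizeof(Plain));
    assert(result_size() == sizeof(double));
    assert(same_static_type());
}
```

Giving Plain a virtual function would turn the typeid expression into a use of never_defined, and the program would then need a definition for it.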

According to the C++ standard, the constraints described in this section do not require a diagnostic from a C++ implementation. In practice, they are almost always reported by linkers as duplicate or missing definitions.

A.3.2 One-per-Translation Unit Constraints

No entity can be defined more than once in a translation unit. So the following example is invalid C++:

  inline void f() {}
  inline void f() {}  // ERROR: duplicate definition

This is one of the main reasons for surrounding the code in header files with so-called guards :

  // File guard_demo.hpp:
  #ifndef GUARD_DEMO_HPP
  #define GUARD_DEMO_HPP

  #endif  // GUARD_DEMO_HPP

Such guards ensure that the second time a header file is #included, its contents are discarded, thereby avoiding a duplicate definition of any class, inline function, or template it contains.
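The effect of a guard can be reproduced in a single file by writing out the text of a guarded header twice, which is what the preprocessor sees when the header is included twice. This is a sketch only; the function guarded_answer() and the helper check_guard() are hypothetical.

```cpp
#include <cassert>

// First "inclusion" of the hypothetical header guard_demo.hpp.
#ifndef GUARD_DEMO_HPP
#define GUARD_DEMO_HPP
inline int guarded_answer() { return 42; }
#endif  // GUARD_DEMO_HPP

// Second "inclusion" of the same text: GUARD_DEMO_HPP is now defined,
// so the preprocessor discards everything up to the matching #endif.
#ifndef GUARD_DEMO_HPP
#define GUARD_DEMO_HPP
inline int guarded_answer() { return 42; }
#endif  // GUARD_DEMO_HPP

// Hypothetical self-check: call it from main() to exercise the sketch.
void check_guard()
{
    assert(guarded_answer() == 42);
}
```

Without the guard lines, the second copy would define guarded_answer() again in the same translation unit and the compiler would reject it.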

The ODR also specifies that certain entities must be defined in certain circumstances. This can be the case for class types, inline functions, and non-export templates. In the following few paragraphs we review the detailed rules.

A class type X (including structs and unions) must be defined in a translation unit prior to any of the following kinds of uses in that translation unit:

The creation of an object of type X (for example, as a variable declaration or through a new expression). The creation could be indirect, for example, when an object that itself contains an object of type X is being created.
The declaration of a data member of type X .
Applying the sizeof or typeid operator to an object of type X .
Explicitly or implicitly accessing members of type X .
Converting an expression to or from type X using any kind of conversion, or converting an expression to or from a pointer or reference to X (except void*) using an implicit cast, static_cast, or dynamic_cast.
Assigning a value to an object of type X .
Defining or calling a function with an argument or return type of type X. Merely declaring such a function does not require the type to be defined, however.
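A few of these cases can be seen side by side in one translation unit. The sketch below uses hypothetical names (Node, pass_through, make_node, check_node); the lines that would be rejected while Node is still incomplete are left as comments.

```cpp
#include <cassert>

class Node;                          // declaration only: Node is incomplete here

Node* pass_through(Node* p)          // OK: pointers to an incomplete type
{
    return p;
}

Node make_node(int value);           // OK: merely declaring such a function

// Node early;                       // ERROR if uncommented: creates an object
// int n = sizeof(Node);             // ERROR if uncommented: sizeof needs the definition

class Node {                         // from here on, Node is complete
  public:
    explicit Node(int v) : value(v) {}
    int value;
};

Node make_node(int value)            // OK: the definition follows the class definition
{
    return Node(value);
}

// Hypothetical self-check: call it from main() to exercise the sketch.
void check_node()
{
    Node n = make_node(7);
    assert(n.value == 7);
    assert(pass_through(&n) == &n);
}
```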

The rules for types also apply to types X generated from class templates, which means that the corresponding templates must be defined in those situations in which such a type X must be defined. These situations create so-called points of instantiation or POIs (see Section 10.3.2 on page 146).

Inline functions must be defined in every translation unit in which they are used (in which they are called or their address is taken). However, unlike class types, their definition can follow the point of use:

  inline int not_so_fast();

  int main()
  {
      not_so_fast();
  }

  inline int not_so_fast()
  {
      return 0;  // a value-returning function must not flow off its end
  }

Although this is valid C++, some compilers do not actually "inline" the call to a function with a body that has not been seen yet; hence the desired effect may not be achieved.

Just as with class templates, the use of a function generated from a parameterized function declaration (a function or member function template, or a member function of a class template) creates a point of instantiation. Unlike class templates, however, the corresponding definition can appear after the point of instantiation (or not at all if it is exported).
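For a nonexported function template, the ordering described above can be sketched as follows. The names twice, use_before_definition, and check_twice are hypothetical.

```cpp
#include <cassert>

template<typename T>
T twice(T x);                         // declaration only

int use_before_definition()
{
    return twice(21);                 // creates a point of instantiation for twice<int>
}

template<typename T>
T twice(T x)                          // the definition follows the point of use
{
    return x + x;
}

// Hypothetical self-check: call it from main() to exercise the sketch.
void check_twice()
{
    assert(use_before_definition() == 42);
}
```

If the definition of twice() were missing from the translation unit altogether, a typical toolchain would report the problem only at link time, if at all.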

The facets of the ODR explained in this appendix are generally easily verified by C++ compilers; hence the C++ standard requires that compilers issue some sort of diagnostic when one of these rules is violated. An exception is the lack of definition of a nonexported parameterized function. Such situations are typically not diagnosed.

A.3.3 Cross-Translation Unit Equivalence Constraints

The ability to define certain kinds of entities in more than one translation unit brings with it the potential for a new kind of error: multiple definitions that don't match. Unfortunately, such errors are hard to detect by traditional compiler technology in which translation units are processed one at a time. Consequently, the C++ standard doesn't mandate that differences in multiple definitions be detected or diagnosed (it does allow it, of course). If this cross-translation unit constraint is violated, however, the C++ standard qualifies this as leading to undefined behavior, which means that anything reasonable or unreasonable may happen. Typically, such undiagnosed errors may lead to program crashes or wrong results, but in principle they can also lead to other, more direct, kinds of damage (for example, file corruption). ^[3]

^[3] Version 1 of the gcc compiler actually jokingly did this by starting the game of Rogue in situations like this.

The cross-translation unit constraints specify that when an entity is defined in two different places, the two places must consist of exactly the same sequence of tokens (the keywords, operators, identifiers, and so forth remaining after preprocessing). Furthermore, these tokens must mean the same thing in their respective context (for example, the identifiers may need to refer to the same variable).

Consider the following example:

  // Translation unit 1:
  static int counter = 0;

  inline void increase_counter()
  {
      ++counter;
  }

  int main()
  {
  }

  // Translation unit 2:
  static int counter = 0;

  inline void increase_counter()
  {
      ++counter;
  }

This example is in error because even though the token sequence for the inline function increase_counter() looks identical in both translation units, each sequence contains a token counter that refers to a different entity. Indeed, because the two variables named counter have internal linkage (static specifier), they are unrelated despite having the same name. Note that this is an error even though neither of the inline functions is actually used.
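One possible way to repair such code, assuming that a single program-wide counter is what was intended, is to let the inline function refer to an entity that is the same in every translation unit. The sketch below uses a static local variable of an inline function with external linkage, which always designates one object; shared_counter() and check_shared_counter() are hypothetical names, and this is not the only possible fix.

```cpp
#include <cassert>

// Would live in a header included by every translation unit that needs it.
inline int& shared_counter()
{
    static int value = 0;      // one object for the whole program
    return value;
}

inline void increase_counter()
{
    ++shared_counter();        // names the same entity in every translation unit
}

// Hypothetical self-check: call it from main() to exercise the sketch.
void check_shared_counter()
{
    int before = shared_counter();
    increase_counter();
    assert(shared_counter() == before + 1);
}
```

Note that this changes the meaning of the original code: the translation units now share one counter instead of each having its own.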

Placing the definitions of entities that can be defined in multiple translation units in header files that are #included whenever the definitions are needed ensures that token sequences are identical in almost all situations. ^[4] With this approach, situations in which two identical tokens refer to different things become fairly rare, but when they do occur, the resulting errors are often mysterious and hard to track.

^[4] Occasionally, conditional compilation directives evaluate differently in different translation units. Use such directives with care. Other differences are possible too, but they are even less common.

The cross-translation unit constraints apply not only to entities that can be defined in multiple places, but also to default arguments in declarations. In other words, the following program has undefined behavior:

  // Translation unit 1:
  void unused(int = 3);

  int main()
  {
  }

  // Translation unit 2:
  void unused(int = 4);

We should note here that the equivalence of token streams can sometimes involve subtle implicit effects. The following example is lifted (in a slightly modified form) from the C++ standard:

  // Translation unit 1:
  class X {
    public:
      X(int);
      X(int, int);
  };

  X::X(int = 0)
  {
  }

  class D : public X {
  };

  D d2;  // X(int) called by D()

  // Translation unit 2:
  class X {
    public:
      X(int);
      X(int, int);
  };

  X::X(int = 0, int = 0)
  {
  }

  class D : public X {  // X(int, int) called by D();
  };                    // D()'s implicit definition violates the ODR

In this example, the problem occurs because the implicitly generated default constructor of class D is different in the two translation units. One calls the X constructor taking one argument, and the other calls the X constructor taking two arguments. If anything, this example is an additional incentive to limit default arguments to one location in the program (if possible, this location should be in a header file). Fortunately, placing default arguments on out-of-class definitions is a rare practice.
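A sketch of the safer arrangement, with the default argument stated once on the in-class declaration (which would normally sit in a header), might look as follows. The data member arity and the helper check_default_ctor() are hypothetical additions that make the chosen constructor observable.

```cpp
#include <cassert>

// Would live in a header shared by all translation units.
class X {
  public:
    X(int = 0);          // default argument stated once, on the declaration
    X(int, int);
    int arity;           // hypothetical member recording which constructor ran
};

class D : public X {
};

// Would live in exactly one source file: no default arguments are added here.
X::X(int) : arity(1)
{
}

X::X(int, int) : arity(2)
{
}

// Hypothetical self-check: call it from main() to exercise the sketch.
void check_default_ctor()
{
    D d;                 // D() calls X(int) with the default argument 0
    assert(d.arity == 1);
}
```

Because every translation unit sees the same declarations, the implicitly generated D() means the same thing everywhere.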

There is also an exception to the rule that says that identical tokens must refer to identical entities. If identical tokens refer to unrelated constants that have the same value and the address of the resulting expressions is not used, then the tokens are considered equivalent. This exception allows for program structures like the following:

  // File header.hpp:
  #ifndef HEADER_HPP
  #define HEADER_HPP

  int const length = 10;

  class MiniBuffer {
      char buf[length];
      ...
  };

  #endif  // HEADER_HPP

In principle, when this header file is included in two different translation units, two distinct constant variables named length are created because const in this context implies static. However, such constant variables are often meant to define compile-time constant values, not a particular storage location at run time. Hence, if we don't force such a storage location to exist (by referring to the address of the variable), it is sufficient for the two constants to have the same value. This exception to the ODR equivalence rules applies only to integral and enumeration values (floating-point types and pointer types don't fall in this category).
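The difference between using the value and using the address can be sketched as follows. The function clamp_to_length(), the commented-out function where(), and the helper check_length() are hypothetical; buf is made public here only so that the sketch stays short.

```cpp
#include <cassert>

// Would live in header.hpp; every including translation unit gets its own length.
int const length = 10;

class MiniBuffer {
  public:
    char buf[length];              // uses only the value of length
};

inline int clamp_to_length(int n)
{
    if (n > length) {              // reads the value only
        return length;
    }
    return n;
}

// inline int const* where() { return &length; }
// If uncommented, this inline function would take the address of a different
// object in each translation unit, so the exception would no longer apply.

// Hypothetical self-check: call it from main() to exercise the sketch.
void check_length()
{
    assert(sizeof(MiniBuffer) == 10);
    assert(clamp_to_length(25) == 10);
    assert(clamp_to_length(3) == 3);
}
```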

Finally, a note about templates. The names in templates bind in two phases. So-called nondependent names bind at the point where the template is defined. For these, the equivalence rules are handled similarly to other nontemplate definitions. For names that bind at the point of instantiation, the equivalence rules must be applied at that point, and the bindings must be equivalent. This leads to a subtle observation: Although exported templates are defined in only one location, they may have multiple instances which must obey the equivalence rules. Here is a particularly far-fetched violation of the ODR:

  // File header.hpp:
  #ifndef HEADER_HPP
  #define HEADER_HPP

  enum Color { red, green, blue };

  // the associated namespace of Color is the global namespace

  export template<typename T> void highlight(T);

  void init();

  #endif  // HEADER_HPP

  // File tmpl_def.cpp:
  #include "header.hpp"

  export template<typename T>
  void highlight(T x)
  {
      paint(x);  // (1) a dependent call: argument-dependent lookup required
  }

  // File init.cpp:
  #include "header.hpp"

  namespace {  // unnamed namespace!
      void paint(Color c)  // (2)
      {
      }
  }

  void init()
  {
      highlight(blue);  // argument-dependent lookup of (1) resolves to (2)
  }

  // File main.cpp:
  #include "header.hpp"

  namespace {  // unnamed namespace!
      void paint(Color c)  // (3)
      {
      }
  }

  int main()
  {
      init();
      highlight(red);  // argument-dependent lookup of (1) resolves to (3)
  }

To understand this example, we must remember that functions defined in an unnamed namespace have external linkage, but they are distinct from any functions defined in an unnamed namespace of other translation units. Therefore, the two paint() functions are distinct. However, the call to paint() in the exported template has a template-dependent argument and is therefore not bound until the points of instantiation. In our example, there are two points of instantiation for highlight<Color>, but they result in different bindings of the name paint; hence the program is invalid.
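The two binding phases themselves can be observed in a single file, without export, on a compiler that implements two-phase name lookup. The sketch below is only an illustration and is itself a valid program: the names scale, probe, gfx, and check_binding are hypothetical, and paint() is placed in a named namespace so that argument-dependent lookup finds it through the type of the argument.

```cpp
#include <cassert>

inline int scale(double) { return 1; }

template<typename T>
int probe(T x)
{
    int fixed = scale(0);      // nondependent: bound here, to scale(double)
    int late = paint(x);       // dependent: bound at the point of instantiation
    return fixed * 1000 + late;
}

inline int scale(int) { return 2; }   // declared too late to be considered by probe()

namespace gfx {                // hypothetical namespace
    enum Color { red, green, blue };

    inline int paint(Color c)  // found through argument-dependent lookup
    {
        return 100 + c;
    }
}

// Hypothetical self-check: call it from main() to exercise the sketch.
void check_binding()
{
    assert(probe(gfx::blue) == 1102);   // 1 * 1000 + (100 + 2)
}
```

Because the binding of paint depends on what is visible at each point of instantiation, two translation units that instantiate the same specialization must arrive at the same binding.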
