2.9 Thermodynamics, Information, and Entropy

Whereas questions about how limits on information and entropy interact remain more theoretical than practical for most present computer technologies, in systems using molecular recognition they must be considered from the outset. As already mentioned, one reason for the present emphasis on photochemistry is its possibilities for (a) parallel input, (b) processing, and (c) output. How to read information into and out of molecular systems should be considered one of the defining problems of the field. Many of the systems under discussion will require some form of parallel output or a defining value arising from collective behavior. Although devices capable of observing single atoms, such as STMs and AFMs, have been developed, it will be hard to build a device that can interact with every molecular unit in a system in a reasonable amount of time. It would be much better to use the self-organizing behavior of the system to provide a signal at the macroscopic level. We have already seen this possibility in the example given above of size amplification of chirality.

By allowing the spontaneous but controlled buildup of complex systems from their components, self-organization also provides means of bypassing tedious nanofabrication and nanomanipulation procedures, a feature of great interest for nanotechnology.

In molecular systems, unlike present-day computers, there is no real split between hardware and software. It is theoretically possible to develop a molecular version of the 0s and 1s of present-day magnetic or optical memory using supermolecules with bipolar or multipolar states. It should be kept in mind, however, that it may be impossible to access the information easily without erasing it or perturbing it to the point where it loses its value. Although in many cases it may be possible to conceive of ways around this problem (e.g., regeneration of the data), putting such a scheme into effect may negate many of the putative advantages of molecular systems (quick reaction time, fuzzy logic, etc.). Add to this the inherent "floppiness" of molecular systems (because one is invariably working at nonzero temperatures) and the possibility of parallel or competing processes occurring. The result is that one starts to address problems that sound very much like those found in quantum mechanics: the inability to "know" what a particular state is without perturbing it, the spreading out of probability waves (in chemical systems, information waves), and the overlapping and mixing of different states. The major stumbling block in using molecular systems may not be technical; the challenge is to conceive how to make adequate use of their properties!

Chemical systems may store information either in an analog fashion, in the structural features (size, shape, nature, and disposition of interaction sites) of a molecule or a supermolecule; or in a digital fashion, in the various states or connections of a chemical entity. The evaluation of the information content of a recognition process based on structural sensing in receptor-substrate pairs requires an assessment of the relevant molecular characteristics. Recognition is not an absolute but a relative notion. It results from the structural (and possibly dynamical) information stored in the partners and is defined by the fidelity of its reading, which rests on the difference in free energy of interaction between states, represented by different receptor-substrate combinations. It is thus not a yes/no process but is relative to a threshold level separating states and making them distinct. It depends on free energy and, consequently, on temperature. The parameter kT could be a possible reference quantity against which to evaluate threshold values, differences between states, and reading accuracy. Both analog and digital processing of chemical information depend on such factors.
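To make the role of kT as a reference quantity concrete, the following minimal sketch estimates how reliably a receptor "reads" the correct substrate when a wrong one competes for it. It assumes a deliberately simplified picture that is not taken from the text: two substrates at equal concentration, equilibrium (Boltzmann) populations, and a single free-energy gap between the correct and the incorrect pair. The function names and the gap parameter are hypothetical, and the molar form RT is used in place of the per-molecule kT.

```python
import math

# Molar gas constant in kJ/(mol*K); RT is the per-mole counterpart of kT.
R_KJ_PER_MOL_K = 8.314462618e-3


def gap_in_units_of_rt(gap_kj_per_mol, temperature_k=298.15):
    """Express a free-energy gap (correct vs. incorrect pair) in units of RT."""
    return gap_kj_per_mol / (R_KJ_PER_MOL_K * temperature_k)


def selectivity_ratio(gap_kj_per_mol, temperature_k=298.15):
    """Equilibrium population ratio correct:incorrect (assumed two-state model)."""
    return math.exp(gap_in_units_of_rt(gap_kj_per_mol, temperature_k))


def reading_fidelity(gap_kj_per_mol, temperature_k=298.15):
    """Fraction of receptors holding the correct substrate in the same model."""
    x = gap_in_units_of_rt(gap_kj_per_mol, temperature_k)
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    # Illustrative gaps only; these are not measured values.
    for gap in (0.0, 2.5, 5.7, 11.4):
        print(
            f"gap = {gap:5.1f} kJ/mol  "
            f"({gap_in_units_of_rt(gap):4.1f} RT)  "
            f"fidelity = {reading_fidelity(gap):.4f}"
        )
```

In this picture a gap of zero gives a fidelity of one half (no recognition at all), and each additional RT ln 10, about 5.7 kJ/mol near room temperature, improves the correct-to-incorrect ratio tenfold. Raising the temperature at a fixed gap lowers the fidelity, which is the temperature dependence noted above.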

As an example of how both thermodynamics and structural factors come into play in self-recognition, let us examine the example given above, of the multicomponent self-assembly of the oligobipyridine strands around different ions. Here, we have three structural factors: (1) the structural features of the ligands (nature, number, and arrangement of the binding subunits; nature and position of the spacers), (2) the coordination geometries of the metal ions, and (3) the steric and conformational effects within the different assembled species resulting from the various possible combinations of ligands and metal ions in a given mixture. We also have two thermodynamic factors: (1) the energy-related principle of "maximal site occupancy", which implies that the system evolves toward the species or the mixture of species that presents the highest occupancy of the binding sites available on both the ligand and the ions (i.e., site saturation) and (2) the entropy factor, which favors the state of the system with the largest number of product species. Of course, other factors may contribute, such as the binding of other species present in the medium and environmental effects.

As already mentioned, these are all chemical systems following a natural process (i.e., entropy always increases). The decrease in entropy implied by information storage in a covalent structure is (over)compensated by the entropy increase occurring in the course of the stepwise synthesis of the "informed" molecule in question.
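This bookkeeping can be written as a simple balance. The symbols are introduced here only for illustration and do not appear in the original text: Delta S_info denotes the (negative) entropy change associated with storing information in the covalent structure, and Delta S_synth the entropy produced in the system and its surroundings over the steps of the synthesis.

```latex
\Delta S_{\text{total}} = \Delta S_{\text{info}} + \Delta S_{\text{synth}} \geq 0,
\qquad
\Delta S_{\text{info}} < 0
\quad\Longrightarrow\quad
\Delta S_{\text{synth}} \geq \lvert \Delta S_{\text{info}} \rvert .
```

The "(over)" in "(over)compensated" corresponds to the case of strict inequality, which is the case for any real, irreversible synthesis.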

Another aspect of "information transfer" is what happens in self-replication. Here, a molecule catalyzes its own formation by acting as a template for the constituents, which react to generate a copy of the template. Such systems display autocatalysis and may be termed informational or noninformational, depending on whether or not replication involves the conservation of a sequence of information (Orgel 1992). In self-replicating systems employing three starting constituents, competition between them can occur. Such processes are on the way to systems displaying information transfer, whereas the two-component ones are noninformational.
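The autocatalytic character of such a two-component replicator can be illustrated with a small kinetic sketch. It assumes the simplest possible rate law, which is a choice made here and not taken from the text: constituents A and B form the template T through a slow uncatalyzed channel and through a channel whose rate is proportional to the amount of T already present. Product inhibition, which affects many real template replicators, is ignored. All names, rate constants, and concentrations are hypothetical and unitless.

```python
def simulate_replicator(a0=1.0, b0=1.0, t0=0.0,
                        k_uncat=1e-3, k_cat=1.0,
                        dt=0.01, steps=5000):
    """Euler integration of A + B -> T with template-assisted catalysis.

    Assumed rate law: d[T]/dt = (k_uncat + k_cat * [T]) * [A] * [B].
    Returns the template concentration after every step, starting with t0.
    """
    a, b, t = a0, b0, t0
    history = [t]
    for _ in range(steps):
        rate = (k_uncat + k_cat * t) * a * b
        # Never consume more of A or B than is still available.
        delta = min(rate * dt, a, b)
        a -= delta
        b -= delta
        t += delta
        history.append(t)
    return history


if __name__ == "__main__":
    unseeded = simulate_replicator(t0=0.0)
    seeded = simulate_replicator(t0=0.1)  # a little template added at the start
    for step in (0, 100, 300, 600, 1000):
        print(
            f"time {step * 0.01:5.1f}  "
            f"T formed, unseeded: {unseeded[step]:.4f}  "
            f"seeded: {seeded[step] - 0.1:.4f}"
        )
```

Without seeding, the template appears slowly at first and then faster as its own concentration grows, giving the sigmoidal curve typical of autocatalysis. Adding a small amount of template at the start shortens the initial lag. Whether such a system counts as informational depends, as stated above, on whether a sequence is conserved, which this sketch does not model.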

Entities resulting from self-assembly and self-organization of a number of components may undergo self-correction and adaptation. Such features might also explain why large multisite protein architectures are formed by the association of several smaller protein subunits rather than from a single long polypeptide.

In a broader perspective, these results point to the emergence of a new outlook involving a change in paradigm, from "pure compounds" to "instructed mixtures", from "unicity" (pure substance) to "multiplicity + information" (mixture of instructed components and program). Rather than pursuing mere chemical purity of a compound or a material, one would seek the design of instructed components that, as mixtures, would lead through self-processes to the spontaneous and selective formation of the desired (functional) superstructures.

Beyond programmed systems, the next step in complexity consists in the design of chemical "learning" systems, systems that are not just instructed but can be trained, that possess self-modification ability and adaptability in response to external stimuli. This opens perspectives toward systems that would undergo evolution (i.e., progressive change of internal structure under the pressure of environmental factors). It also implies the passage from closed systems to open systems that are connected spatially and temporally to their surroundings.

The progression from elementary particles to the nucleus, the atom, the molecule, the supermolecule, and the supramolecular assembly represents steps up the ladder of complexity. Particles interact to form atoms, atoms to form molecules, molecules to form supermolecules and supramolecular assemblies, and so on. At each level, novel features appear that did not exist at a lower one. Thus a major line of development of chemistry is toward complex systems and the emergence of complexity. Complexity implies and results from multiple components and interactions between them with integration (i.e., long-range correlation, coupling, and feedback). It is interaction between components that makes the whole more than the sum of the parts and leads to collective properties. The species and properties defining a given level of complexity result from, and may be explained on the basis of, the species belonging to the level below and their multibody interactions.

Because the higher depends on the lower, we work our way backward from organism to interacting assemblies down to supramolecular entities and to the recognition process. Molecular recognition is the level at which information processes and programming procedures are implemented in chemical systems, based on the storage of information in the molecular components and its supramolecular processing through specific interactional algorithms. It brings to light the third basic feature of chemical systems, in addition to matter and energy: information.