3.1. Defining Information

What is information? Consult a dictionary and enter a strange loop of circular definitions resembling the impossible structures of M.C. Escher, shown in Figure 3-1. Data is information is knowledge is information is data. Ask an expert and receive a philosophical treatise on the fine distinctions between data, information, knowledge, and wisdom. Ask a colleague and they'll question your sanity. But go ahead anyway. Ask someone to define information. Then poke holes in their definitions. Keep at it. Don't let them off the hook. I'll bet it's easy and fun, in a disturbing sort of way, like shooting fish in a barrel.

Our inability to precisely answer this question speaks volumes about the subject. Information surrounds us. We can cite examples ad infinitum: articles, books, cartoons, databases, encyclopedias, files, gestures, holograms, images, journals, knowledge bases, laws, maps, numbers, ontologies, paintings, quizzes, rules, signs, texts, users, variables, web sites, xeroxes, yaks, and zebras. We use information. We create information. But we can't draw a circle around the category and agree what's in and out. Take yaks and zebras for instance. Scholars argue that under the right circumstances, animals can enter the category we call documents. We'll revisit this bizarre claim later, but for now, let's agree to disagree about the definition of information.

Figure 3-1. Relativity (left) and Sky and Water I (right) by M.C. Escher (© 2005 The M.C. Escher Company-Holland. All rights reserved.)

It's not that good people haven't tried to crack this nut. In fact, the field of information science was first defined in the early 1960s as:

The science that investigates the properties and behavior of information, the forces governing the flow of information, and the means of processing information for optimum accessibility and usability. The processes include the origination, dissemination, collection, organization, storage, retrieval, interpretation, and use of information.[*]

Since then, working definitions have emerged within the field:

  • Data. A string of identified but unevaluated symbols.

  • Information. Evaluated, validated, or useful data.

  • Knowledge. Information in the context of understanding.

But they only lead to more questions. Can evaluation turn data into information? Or is information defined by its value to the end user? Is it data that makes a difference? How about the knowledge we seek to manage? Can knowledge (and understanding) even exist outside the mind? We conflate distinctions of source, process, impact, and location. Yet we muddle through. We negotiate. We translate. We communicate. And therein lies the key to this Tower of Babel. Information is about communication. It involves the exchange of symbols, ideas, messages, and meaning between people. As such, it's characterized by ambiguity, redundancy, inefficiency, error, and indescribable beauty.

Communication is among humanity's greatest gifts. Without it, we would always be alone, trapped in our own thoughts and constrained by our own limitations. With it, we enjoy an amazing extension of the human nervous system. A child runs toward a street chasing a ball. The driver of an approaching car is blinded by the sun. A neighbor yells a single word. Stop! Communication is first and foremost about cooperation. It is evolutionary evidence that, in the words of Ben Franklin, "we must all hang together, or assuredly we shall all hang separately."[*] For countless millennia, small groups of humans have used gestural and verbal communication for collaborative hunting, gathering, fighting, parenting, learning, and decision-making.

Communication is the backbone of all human society from ancient tribes to modern nations. And information is the principal ingredient that enables cooperation to scale from clans with a few dozen members to an interconnected global economy of billions. Information allows us to communicate across time and space. From marks on bark to etchings in silicon, we're able to share observations, experience, insight, and emotion. Documents are talking objects. They make possible the wonders of art, business, engineering, government, law, literature, and science. Documents enable us to stand on the shoulders of giants. Information is heady stuff indeed.

Yet when we try to define information, we become lost in a hall of mirrors occupied by human reflections, and we're back to the illusions and infinite loops of M.C. Escher. To escape this prison of relativity, we must abandon our discussions of disembodied, decontextualized, generic information. We must add substance and specificity. In short, we must classify. For once we categorize and contextualize, definitions come easy. Consider, for instance, a recipe book. It contains a collection of recipes that specify instructions and ingredients. It is typically a printed, bound document used by a cook in a kitchen for selecting and preparing meals. It provides multiple ways to find recipes, often by cooking method and type of cuisine. It includes a table of contents and an index.

This example illustrates the power and pervasiveness of genre. The term recipe book conjures up a specific image complete with format, structure, content, organization, context, and purpose. When we talk about novels, speeches, movies, magazines, letters, emails, billboards, blogs, and web sites, we rely on genre as shorthand to indicate both the message and the medium. Of course, new technologies complicate matters. Once upon a time, a story was an oral and aural experience, spoken directly from storyteller to listener, stored in human memory, and passed down through generations. This type of "story" enjoyed monopoly power for between 40,000 and 2,000,000 years. We're not exactly sure since we have no records of this "pre-history." In any case, around 5,000 years ago we invented written language and complicated things. We now had to qualify whether we "heard a story" or "read a story." Then, in the last 30 years, personal computers and the Internet happened. Suddenly, stories could flow from spoken word to printed text to digital document, and back again. A story could be encoded into a series of bits, transported around the world at the speed of light, and stored on a multitude of media. War and Peace on a laptop. The Odyssey on a flash drive. The Bible on a Treo.

Make our recipe book available online and it's no longer just a book. It becomes an interactive product and database with a user interface for search and navigation. It's available on desktop computers and mobile devices around the world. And thanks to Google's keyword search, each recipe has become a discrete findable object. Users may access individual recipes without experiencing the broader collection we call a book. In fact, our recipe book has mutated into something else, and we're not sure what to call it. Perhaps it's an online cookbook, a recipe collection, or a recipe archive. Or maybe it's just a web site with a branded name like A Google search elicits even stranger real world examples including Recipe Goldmine, Recipe Cottage, and RecipeLand. Clearly, the Web has confused our notions of genre. The recipes survive but the book is blown to bits. Technology and genre are intertwined. Books. Television. The Web. The way we experience the message is shaped by the medium. And the ways we define information are shaped by the properties of that medium and the context of use.

Our confusion today is but one sign of the turbulence in our media landscape being wrought by the relentless drive of Moore's Law towards faster, smaller, cheaper computer processors. Today Google and the World Wide Web dominate our information universe. Tomorrow will be different. But before we look ahead, it's worth looking back at the universe of Calvin Mooers and the science of information retrieval.