John Mauldin: Knowledge and Power

A key insight came from an analogy with the game of Twenty Questions: paring down a complex problem to a chain of binary, yes-no choices, which Shannon would be the first to call “bits” in print (a coinage he credited to John Tukey). Then this telephonic tinkerer went to work for Bell Labs at its creative height, when it was a place where a young genius could comfortably unicycle down the hallways juggling several balls over his head.

During the war, he worked on cryptography there and talked about thinking machines over tea with the great tragic figure Alan Turing. Having conceived a generic abstract computer architecture, Turing is arguably the progenitor of information theory in the broadest sense. At Bletchley Park in Britain, his contributions to breaking German codes were critical to the Allied victory. After the war, he committed suicide by eating a poisoned apple, having undergone court-mandated estrogen therapy as the penalty for his conviction for homosexuality. (The Apple logo, with its missing bite, is seen by some as an homage to Turing, but Steve Jobs said he only wished he had been that smart.)

The two computing-obsessed cryptographers, Shannon and Turing, also discussed during these wartime teas what Shannon described as his burgeoning “notions on Information Theory” (for which Turing provided “a fair amount of negative feedback”).

In 1948, Shannon published those notions on Information Theory in The Bell System Technical Journal as a 78-page monograph, “A Mathematical Theory of Communication.” (The next year—with an introduction by Warren Weaver, one of America’s leading wartime scientists—it reappeared as a book, retitled The Mathematical Theory of Communication.) It became the central document of the dominant technology of the age and still resonates today as the theoretical underpinning for the Internet.

Shannon’s first wife described the arresting magnetism of his countenance as “Christ-like.” Like Leonardo and fellow computing pioneer Charles Babbage, he was said by one purported witness to have built floating shoes for walking on water. With his second wife, herself a “computer” when he met her at AT&T, he created a home full of pianos, unicycles, chess-playing machines, and his own surprising congeries of seriously playful gadgets. These included a mechanical white mouse named Theseus—built soon after the information theory monograph—which could learn its way through a maze; a calculator that worked in Roman numerals; a rocket-powered Frisbee; a chair lift to take his children down to the nearby lake; a diorama in which three tiny clowns juggled 11 rings, 10 balls, and 7 clubs; and an analog computer and radio apparatus, built with the help of blackjack card-counter and fellow MIT professor Edward Thorp, to beat the roulette wheels at Las Vegas (it worked in Shannon’s basement but suffered technical failure in the casino). Later an uncannily successful investor in technology stocks, Shannon insisted on the crucial differences between a casino and a stock exchange that eluded some of his followers.

When I wrote my book, Microcosm, on the rise of the microchip, I was entranced with physics and was sure that the invention of the transistor at Bell Labs in 1948 was the paramount event of the post-war decade. Today, I find that physicists are entranced with the theory of information. I believe, with his biographer James Gleick, that Shannon’s Information Theory was a breakthrough comparable to the transistor. While the transistor is today ubiquitous in information technology, Shannon’s theories are immanent in all the ascendant systems of the age. As universal principles, they grow ever more fruitful and fertile as time passes. Every few weeks, I encounter another company crucially rooted in Shannon’s theories, full of earnest young engineers conspiring to beat the Shannon limit. The technology of our age seems to be at once both Shannon limited and Shannon enabled. So is the modern world.

Let us imagine the lineaments of an economics of disorder, disequilibrium, and surprise that could explain and measure the contributions of entrepreneurs. Such an economics would begin with the Smithian mold of order and equilibrium. Smith himself spoke of property rights, free trade, sound currency, and modest taxation as crucial elements of an environment for prosperity. Smith was right: An arena of disorder, disequilibrium, chaos, and noise would drown the feats of creation that engender growth. The ultimate physical entropy envisaged as the heat death of the universe, in its total disorder, affords no room for invention or surprise. But entrepreneurial disorder is not chaos or mere noise. Entrepreneurial disorder is some combination of order and upheaval that might be termed “informative disorder.”

Shannon defined information in terms of digital bits and measured it by the concept of information entropy: unexpected or surprising bits. The reported source of the name was John von Neumann, inventor of computer architectures, game theory, quantum mathematics, nuclear devices, military strategies, and cellular automata, among other ingenious phenomena. Encountering von Neumann in a corridor at MIT, Shannon allegedly told him about his new idea. Von Neumann suggested that he name it “entropy” after the thermodynamic concept (according to Shannon, von Neumann said it would be a great word to use because no one knows what it means).

Shannon’s entropy is governed by a logarithmic equation nearly identical to the thermodynamic equation of Ludwig Boltzmann that describes physical entropy. But the parallels between the two entropies conceal several pitfalls that have ensnared many. Physical entropy is maximized when all the molecules in a physical system are at an equal temperature (and thus can yield no more useful work). Shannon entropy is maximized when all the bits in a message are equally improbable (and thus cannot be further compressed without loss of information). These two nearly identical equations point to a deeper affinity that MIT physicist Seth Lloyd identifies as the foundation of all material reality—at the beginning was the entropic bit.

For the purposes of economics, the key insight of information theory is that information is measured by the degree to which it is unexpected. Information is “news,” gauged by its surprisal, which is the entropy. A stream of predictable bits conveys no information at all. A stream of uncoded chaotic noise conveys no information, either.
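
To make this concrete, here is a minimal Python sketch (mine, not from the original text) of Shannon's measure, the sum over symbols of minus p times the base-two logarithm of p: entropy is maximized when the symbols are equally improbable, and it falls to zero for a perfectly predictable stream. The function name and example distributions are purely illustrative.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits: sum over symbols of -p * log2(p); impossible symbols contribute nothing."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Four equally improbable symbols: entropy is maximized at log2(4) = 2 bits per symbol.
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0

# A skewed, more predictable source carries less information per symbol.
print(shannon_entropy([0.7, 0.1, 0.1, 0.1]))      # about 1.36

# A perfectly predictable stream (one symbol with probability 1) conveys no information at all.
print(shannon_entropy([1.0]))                     # 0.0
```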

In the Shannon scheme, a source selects a message from a portfolio of possible messages, encodes it by means of a dictionary or lookup table using a specified alphabet, then transcribes the encoded message into a form that can be transmitted down a channel. Afflicting that channel is always some level of noise or interference. At the destination, the receiver decodes the message, translating it back into its original form. This is what is happening when a radio station modulates electromagnetic waves and your car radio demodulates those waves, translating them back into the original sounds or voices at the radio station.
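
A toy sketch of that pipeline may help; the codebook, the flip probability, and the function names below are invented for illustration and are not Shannon's own formalism.

```python
import random

# Hypothetical lookup table (a "dictionary" of codewords) for a four-symbol alphabet.
CODEBOOK = {"A": "00", "B": "01", "C": "10", "D": "11"}
DECODEBOOK = {bits: symbol for symbol, bits in CODEBOOK.items()}

def encode(message):
    """The source's message is translated into bits via the lookup table."""
    return "".join(CODEBOOK[symbol] for symbol in message)

def noisy_channel(bits, flip_probability=0.05):
    """The channel always afflicts the transmission with some noise: each bit may flip."""
    return "".join(
        ("1" if b == "0" else "0") if random.random() < flip_probability else b
        for b in bits
    )

def decode(bits):
    """The receiver translates the bits back into the original alphabet."""
    return "".join(DECODEBOOK[bits[i:i + 2]] for i in range(0, len(bits), 2))

message = "ABCD"
received = decode(noisy_channel(encode(message)))
print(message, "->", received)  # usually identical; occasionally a symbol is corrupted by noise
```

In Shannon's fuller scheme, redundancy added by an error-correcting code lets the receiver detect and repair most such flips, up to the capacity of the channel.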

Part of the genius of information theory is its understanding that this ordinary concept of communication through space extends also through time. A compact disc, iPod memory, or TiVo personal video recorder also conducts a transmission from a source (the original song or other content) through a channel (the CD, DVD, microchip memory, or “hard drive”) to a receiver chiefly separated by time. In all these cases, the success of the transmission depends on the existence of a channel that does not change significantly during the course of the communication, either in space or in time.

Change in the channel is called noise, and an ideal channel is perfectly linear: what comes out is identical to what goes in. A good channel, whether for telephony, television, or data storage, does not change in significant ways during the period between the transmission and receipt of the message. Because the channel is changeless, the message in the channel can communicate changes. The message of change can be distinguished from the unchanging parameters of the channel.

In that radio transmission, a voice or other acoustic signal is imposed on a band of electromagnetic waves through a modulation scheme. This set of rules allows a relatively high-frequency non-mechanical wave (measured in kilohertz to gigahertz and traveling at the speed of light) to carry a translated version of the desired sound, which the human ear can receive only in the form of a lower-frequency mechanical wave (measured in acoustic hertz to low kilohertz and traveling close to a million times slower). The receiver can recover the modulation changes of amplitude or frequency or phase (timing) that encode the voice merely by subtracting out the changeless carrier wave. This process of recovery can occur years later if the modulated waves are sampled and stored on a disk or in long-term memory.
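
As a rough numerical sketch of that process, the fragment below amplitude-modulates a low tone onto a higher-frequency carrier and then recovers it with a simple envelope detector (rectify and average away the carrier) rather than literal subtraction. The frequencies, sample rate, and filter are arbitrary choices for illustration, and the numpy library is assumed.

```python
import numpy as np

fs = 200_000                      # sample rate in Hz (illustrative)
t = np.arange(0, 0.05, 1 / fs)    # 50 ms of signal

voice = np.sin(2 * np.pi * 200 * t)        # low-frequency "acoustic" tone, 200 Hz
carrier = np.cos(2 * np.pi * 10_000 * t)   # changeless high-frequency carrier, 10 kHz

# Modulation: the voice rides on the carrier as slow changes in amplitude.
transmitted = (1.0 + 0.5 * voice) * carrier

# Demodulation: rectify, then average over one carrier period to strip away the carrier,
# leaving only the slow amplitude changes that encode the voice.
rectified = np.abs(transmitted)
window = int(fs / 10_000)                  # samples in one carrier period
envelope = np.convolve(rectified, np.ones(window) / window, mode="same")
recovered = envelope - envelope.mean()     # remove the constant offset added at modulation

print(np.corrcoef(recovered, voice)[0, 1])  # close to 1: the voice has been recovered
```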

The accomplishment of Information Theory was to create a rigorous mathematical discipline for the definition and measurement of the information in the message sent down the channel. Shannon entropy or surprisal defines and quantifies the information in a message. In close analogy with physical entropy, information entropy is always a positive number, measured as minus the base-two logarithm of the message’s probability.

Information in Shannon’s scheme is quantified in terms of a probability because Shannon interpreted the message as a selection or choice from a limited alphabet. Entropy is thus a measure of freedom of choice. In the simplest case of maximum entropy, with equally probable elements, the probability of each element is merely the inverse of the number of elements or symbols. A coin toss offers two possibilities, heads or tails; the probability of either is one out of two; the logarithm of one half is minus one. With the minus sign in Shannon’s formula, the entropy of the toss comes out to one: a single bit of information.
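
A quick check of that arithmetic in Python (again a sketch, not part of the original text); the surprisal helper is just an illustrative name for minus the base-two logarithm.

```python
import math

def surprisal(p):
    """Information of a message with probability p, in bits: minus the base-two logarithm."""
    return -math.log2(p)

print(surprisal(1 / 2))     # 1.0 bit: the fair coin toss described above
print(surprisal(1 / 8))     # 3.0 bits: one outcome among eight equally likely choices
print(surprisal(1 / 1024))  # 10.0 bits: a far less probable, far more surprising message
```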
