The illustrative example has now clarified some basic principles about the nature of information. Further details follow.
4.1 The Lowest Level of Information: Statistics
When considering a book B, a computer program C, or the human genome (the totality of genes), we first discuss the following questions:
– How many letters, numbers, and words make up the entire text?
– How many single letters does the employed alphabet contain (e.g., a, b, c, …, z, or G, C, A, T)?
– How frequently do certain letters and words occur?
To answer these questions, it is immaterial whether we are dealing with actual meaningful text, with pure nonsense, or with random sequences of symbols or words. Such investigations are not concerned with the contents, but only with statistical aspects. These topics all belong to the first and lowest level of information, namely the level of statistics.
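The purely statistical character of this level can be illustrated with a short Python sketch. The function name and the sample strings are invented for illustration and do not come from the text; the point is only that the same counts are produced whether or not the input is meaningful.

```python
from collections import Counter

def statistical_summary(text):
    """Purely statistical description of a text: counts and frequencies.
    Nothing here depends on whether the text is meaningful."""
    letters = [c for c in text.lower() if c.isalpha()]
    words = text.lower().split()
    return {
        "total_letters": len(letters),           # how many letters make up the text?
        "alphabet_size": len(set(letters)),      # how many distinct symbols occur?
        "letter_frequencies": Counter(letters),  # how often does each letter occur?
        "word_frequencies": Counter(words),      # how often does each word occur?
    }

# The summary is computed identically for meaningful text and for nonsense:
print(statistical_summary("to be or not to be"))
print(statistical_summary("xq zr xq vt xq"))
```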
As explained fully in appendix A1, Shannon’s theory of information is suitable for describing the statistical aspects of information, e.g., those quantitative properties of languages which depend on frequencies. At this level nothing can be said about whether a given sequence of symbols is meaningful; the question of grammatical correctness is also completely excluded. Conclusions:
Definition 1: According to Shannon’s theory, any random sequence of symbols is regarded as information, without regard to its origin or whether it is meaningful or not.
Definition 2: The statistical information content of a sequence of symbols is a quantitative concept, measured in bits (binary digits).
According to Shannon’s definition, the information content of a single message (which could be one symbol, one sign, one syllable, or a single word) is a measure of the probability of its being received correctly: the less probable a message, the greater its information content. Since probabilities lie between 0 and 1, this measure (the negative logarithm of the probability) is never negative. The information content of a number of messages (signs, for example) is found by adding the individual information values, as required by the condition of summability. An important property of information according to Shannon is:
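The relationships described in this paragraph can be written compactly as follows. The notation p(x) for the probability of a message x and I(x) for its information content is introduced here only for illustration and is not taken from appendix A1.

```latex
% Shannon information content of a single message x with probability p(x), in bits:
I(x) \;=\; \log_2 \frac{1}{p(x)} \;=\; -\log_2 p(x) \;\ge\; 0
  \qquad \text{since } 0 < p(x) \le 1 .
% Summability: for independent messages x and y, p(x,y) = p(x)\,p(y), hence
I(x,y) \;=\; I(x) + I(y) .
```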
Theorem 4: A message which has been subjected to interference ("noise") in general comprises more information than an error-free message.
This theorem follows from the larger number of possible alternatives in a distorted message, and Shannon states that the information content of a message increases with the number of symbols (see equation 6 in appendix A1). It is obvious that the actual information content cannot be described in such terms at all, as the following example shows: when somebody uses many words to say practically nothing, the message is assigned a large information content simply because of the large number of letters used; if somebody else, who is really knowledgeable, concisely expresses the essentials, his message is assigned a much smaller information content.
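Under the simplifying assumption of N equally probable letters, Shannon’s measure assigns n·log₂N bits to a text of n letters, so the score depends on length alone. The short Python sketch below makes the verbose-versus-concise example explicit; the two sample sentences and the helper shannon_bits are invented for illustration.

```python
import math

def shannon_bits(text, alphabet_size=26):
    """Shannon information content under the simplifying assumption that all
    symbols are equally probable: n letters from an alphabet of N symbols
    carry n * log2(N) bits."""
    n = sum(1 for c in text.lower() if c.isalpha())
    return n * math.log2(alphabet_size)

verbose = ("with reference to the matter under consideration it may perhaps "
           "be said that nothing definite can presently be stated")
concise = "no result yet"

print(round(shannon_bits(verbose)))  # many bits, although it says almost nothing
print(round(shannon_bits(concise)))  # far fewer bits, although it says the same thing
```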
Some quotations concerning this aspect of information are: the French president Charles de Gaulle (1890–1970) remarked, "The Ten Commandments are so concise and plainly intelligible because they were compiled without first having a commission of inquiry." A philosopher added, "There are about 35 million laws on earth to validate the Ten Commandments." A certain representative in the American Congress concluded, "The Lord’s Prayer consists of 56 words, and the Ten Commandments contain 297 words. The Declaration of Independence contains 300 words, but the recently published ordinance about the price of coal comprises no fewer than 26,911 words."
Theorem 5: Shannon’s definition of information exclusively concerns the statistical properties of sequences of symbols; meaning is completely ignored.
It follows that this concept of information is unsuitable for evaluating the information content of meaningful sequences of symbols. We now realize that an appreciable extension of Shannon’s information theory is required to evaluate meaningfully information and information processing in both