living and inanimate systems. The concept of information and the five levels required for a complete description are illustrated in Figure 12. This diagram can be regarded as a nonverbal description of information. In the greatly extended description and definition that follows, which concerns real information, Shannon’s theory is useful only for describing the statistical level (see chapter 5).
Figure 12: The five aspects of information. A complete characterization of the information concept requires all five aspects (statistics, syntax, semantics, pragmatics, and apobetics), which are essential for both the sender and the recipient. Information originates as a language; it is first formulated, and then transmitted or stored. An agreed-upon alphabet comprising individual symbols (code) is used to compose words. The (meaningful) words are then arranged in sentences according to the rules of the relevant grammar (syntax) to convey the intended meaning (semantics). It is obvious that the information concept also includes the expected/implemented action (pragmatics) and the intended/achieved purpose (apobetics).
4.2 The Second Level of Information: Syntax
When considering the book B mentioned earlier, it is obvious that the letters do not appear in random sequences. Combinations like "the," "car," "father," etc. occur frequently, but we do not find other possible combinations like "xcy," "bkaln," or "dwust." In other words:
• Only certain combinations of letters are allowed, namely the agreed-upon English words. Other conceivable combinations do not belong to the language. Nor is the arrangement of words in sentences a random process; the rules of grammar must be adhered to.
Both the construction of words and the arrangement of words in sentences to form information-bearing sequences of symbols are subject to quite specific rules, based on deliberate conventions [9], for each and every language.
Definition 3: Syntax is meant to include all structural properties of the process of setting up information. At this second level, we are concerned only with the actual sets of symbols (codes) and the rules governing the way they are assembled into sequences (grammar and vocabulary), independent of any meaning they may or may not have.
Note: It has become clear that this level consists of two parts, namely:
A) Code: Selection of the set of symbols used.
B) The syntax proper: inter-relationships among the symbols.
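The two parts named above can be illustrated with a minimal sketch. It assumes a toy symbol set (the lowercase Latin letters) and a three-word vocabulary taken from the examples given earlier; the names ALPHABET, VOCABULARY, uses_valid_symbols, and is_allowed_word are hypothetical and serve only this illustration. The check is purely structural: nothing in it refers to what a word means.

```python
import string

# Part A, the code: the agreed-upon set of symbols (an assumption for this sketch).
ALPHABET = set(string.ascii_lowercase)

# Part B, the syntax proper: the agreed-upon combinations of those symbols.
# A real language has a far larger vocabulary plus rules of grammar.
VOCABULARY = {"the", "car", "father"}


def uses_valid_symbols(text):
    """Return True if every character belongs to the agreed symbol set."""
    return all(ch in ALPHABET for ch in text)


def is_allowed_word(text):
    """Return True if the sequence is built from valid symbols and is in the vocabulary."""
    return uses_valid_symbols(text) and text in VOCABULARY


if __name__ == "__main__":
    for candidate in ["the", "car", "father", "xcy", "bkaln", "c@r"]:
        print(candidate, uses_valid_symbols(candidate), is_allowed_word(candidate))
```

A sequence such as "xcy" passes the first check, since it uses only symbols of the code, but fails the second, since it is not an agreed-upon combination. A sequence such as "c@r" already fails at the level of the code.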
A) The Code: The System of Symbols Used for Setting Up Information
A set of symbols is required for the representation of information at the syntax level. Most written languages use letters, but a very wide range of conventions exists: Morse code, hieroglyphics, international flag codes, musical notes, various data processing codes, genetic codes, figures made by gyrating bees, pheromones (scents) released by insects, and hand signs used by deaf-mute persons.
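As a small illustration of one such convention, the following sketch encodes and decodes a few letters of International Morse code. Only six letters are listed, and the names MORSE, DECODE, encode, and decode are hypothetical helpers introduced for this example. The point is that the mapping between letters and dot-dash sequences is an agreement shared by sender and recipient; the same table must be used at both ends.

```python
# Excerpt of International Morse code (only a few letters, for illustration).
MORSE = {
    "e": ".",
    "t": "-",
    "a": ".-",
    "n": "-.",
    "s": "...",
    "o": "---",
}

# The recipient needs the inverse of the same agreed-upon table.
DECODE = {signal: letter for letter, signal in MORSE.items()}


def encode(word):
    """Translate a word into Morse symbols, separating letters by single spaces."""
    return " ".join(MORSE[ch] for ch in word)


def decode(signal):
    """Translate space-separated Morse symbols back into letters."""
    return "".join(DECODE[token] for token in signal.split(" "))


if __name__ == "__main__":
    message = encode("sos")
    print(message)
    print(decode(message))
```

Letters outside the excerpt raise a KeyError in this sketch, which mirrors the situation described in the text: a symbol that is not part of the agreed code cannot be represented.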
Several questions are relevant: What code should be used? How many symbols are available? What criteria are used for constructing the code? What mode of transmission is suitable? How could we determine whether an unknown system is a code or not?
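The question of how many symbols are available can be made concrete with a short calculation. For a code with q different symbols, there are q to the power n distinct sequences of length n, and at the statistical level each symbol can carry at most log2(q) bits if all symbols are equally likely. The sketch below only evaluates these two formulas; the function names are hypothetical, and the values of q are chosen for illustration.

```python
import math


def sequences_of_length(q, n):
    """Number of distinct sequences of length n over a code with q symbols."""
    return q ** n


def bits_per_symbol(q):
    """Maximum information per symbol in bits, assuming equally likely symbols."""
    return math.log2(q)


if __name__ == "__main__":
    for q in (2, 4, 26):
        print(q, bits_per_symbol(q), sequences_of_length(q, 3))
```

A code with fewer symbols needs longer sequences to distinguish the same number of items: eight binary symbols and four quaternary symbols both yield 256 distinct sequences.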
The number of symbols: The number of different symbols q employed by a coding system can vary greatly and depends strongly on the purpose and the application. In computer technology, only two switch positions are recognized, so binary codes were created, which comprise only two different symbols. Quaternary codes, comprising four different symbols, are used in all living organisms. The reason why four symbols represent an optimum in this case is discussed in chapter 6. The various alphabet systems used by different languages consist of between 20 and 35 letters, and this number of letters is sufficient for representing all the sounds of the language concerned. Chinese writing is not based on elementary sounds; instead, pictures are employed, every one of which represents a single word, so that the