Shannon's fundamental theorems

Díaz Nafría, José María
 Incorporated contributions
Díaz (12/2009)
 Usage domain
Théorème fondamental de Shannon
 German Shannon-Grundsatz


  1. Fundamental theorem for a noiseless channel
  2. Fundamental theorem for a discrete channel with noise  
  3. Complementarity of both theorems

1. Fundamental theorem for a noiseless channel

Let a source have entropy H (bits per symbol) and a channel have a capacity C (bits per second). Then it is possible to encode the output of the source in such a way as to transmit at the average rate C/H–ε symbols per second over a channel where ε is arbitrary small. It is not possible to transmit at an average rate greater than C/H. (Shannon 1948: 16)

Shannon probes here the existence of a limit to the efficiency of what has been called source coding (®encoder). If the entropy of a source –characterised by the emission of a finite set of symbols– can be determined, then we know H (in bit/symbol) would correspond to the minimum binary digits to be used for its coding. Any move to this limit translates into a growing complexity (in operational and/or circuital costs). As in other fundamental results of the MTC, it deals with a non constructive conclusion “leaving open the problem of designing codes” (®C.E.Shannon).

In technical practice, source coding is not only attained to the statistical level addressed by Shannon. The most sophisticated techniques of source coding are actually a combination of:

1)   predictive coding, in which the sender only conveys what cannot be predicted from previous sendings, achieving optimal results if source peculiarities and pragmatic context are analysed in depth (e.g. for the reproduction of a piano playing, keyboard touching is just registered).

2)   Transformational coding (specially applicable for signals addressed to sensory organs), in which a linear transformation is applied to signals to be conveyed (reversible) enabling to distinguish ranges of different sensibility. This makes possible to leave out data being imperceptible or under certain quality thresholds (operation entailing an irreversible loss of data –not necessarily information, as it is commonly said, if this data is not able in the least to ‘inform’ recipients). In that coding, efficiency is achieved through an analysis in depth of the sensory perception.

3)   Statistical coding, in the sense pointed out by the MTC where source emissions are regarded as ergodic and stationary processes.

2. Fundamental theorem for a discrete channel with noise  

Let a discrete channel have the capacity C and a discrete source the entropy per second H. If H≤C there exist a coding system such that the output of the source can be transmitted over the channel with an arbitrarily small frequency of errors (or an arbitrary small equivocation). If H>C it is possible to encode the source so that the equivocation is less than H–C+ε where ε is arbitrarily small. There is no method of encoding which gives an equivocation less than H–C. (Shannon 1948: 22)

Since here the source is characterised by its information transmission rate (according to Shannon’s definition of entropy), this theorem warns us that the transmission of this information flow requires at least a channel of capacity bigger than H. We might vainly try to transmit it through a channel of lesser capacity, any excess of source entropy with respect to channel capacity will imply a corresponding increase in the rate of error reception. On the other hand, approaching to the threshold (C≈H) leads to an increase in (operational/circuital) complexity.

How can the distance between source entropy H and channel capacity C be employed? Redundancy might be employed in order to facilitate recipients identification and correction of transmission errors. This kind of coding in named channel coding (®encoder). There are several techniques to add redundancy, which can be classified in block codes and convolutional codes. In the former, consecutive data blocks are used to determine the added redundancy; in the convolutional ones, state machines are used, which output depends on the coder state and entry data. Error correction looks in the former for the most similar valid block, in the later for the most similar sequence of valid code.

3. Complementarity of both theorems

Thus, there is a certain practical complementarity between these two theorems: the former indicates how far can we compress the code for conveying source messages (maximally removing redundancy); the second shows us the redundancy the system could use in order to facilitate error correction.

At a glance, source coding tries to equate binary digits to bits, maximizing entropy and eliminating whatever is non entropic and useless for decoding purposes, whereas channel coding adds non entropic digits that can be recognized by recipients to eliminate transmission errors.

  • SHANNON, C. E. (1948). “A Mathematical Theory of Communication”. The Bell System Technical Journal, Vol. 27, pp. 379–423, 623–656, July, October.
  • SHANNON, C. y WEAVER, W. (1949). The mathematical theory of communication. Urbana: The University of Illinois Press.
  • SKLAR, Bernard (2001). Digital Communications. Fundamentals and Applications. New Jersey: Prentice Hall. 
New entry. Before doing a new entry, please, copy this line and the following ones and paste them at the column bottom. Next fill out the fields: 'name', 'date' and 'text', and delete this upper blue paragraph.
Name (date)
[Entry text]

Incorporated entries
J.M. Díaz (12/2009)
[It corresponds with the first version of the article, which is now showed in the left column]