Audiovisual content has a double relation with information. As physical objects, audiovisual documents can be observed as carriers of information about their own nature; given the content they transmit, they can also be considered information carriers in the terms of the Unified Theory of Information. UNESCO’s Memory of the World Program recognizes that documents, including audiovisual documents, have two components: the information content and the carrier on which it resides.


The value of information often depends on how easily it can be found, retrieved, accessed, filtered and managed. An immense and still growing amount of audiovisual information is becoming available in digital form: in digital archives, on the World Wide Web, in broadcast data streams and in personal and professional databases. Although users have increasing access to these resources, the growing volume makes identifying and managing them efficiently more difficult. The question of identifying content is not restricted to database retrieval applications such as digital libraries, but extends to areas like broadcast channel selection, multimedia editing, and multimedia directory services.


Furthermore, images are rich in content, while in many applications text may not be rich enough to describe images effectively. To overcome these difficulties, content-based image retrieval emerged in the early 1990s as a promising means of describing and retrieving images. Content-based image retrieval systems describe images by their own visual content, such as color, texture, and object shape information, rather than by text. In 1996 MPEG* recognized the need to identify multimedia content and started a work item formally called ‘Multimedia Content Description Interface’, better known as MPEG-7.
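The idea of describing an image by its visual content rather than by text can be illustrated with a minimal sketch. It assumes an image is simply a list of (r, g, b) pixels, uses a coarse color histogram as the description, and compares two images by histogram intersection. The function names and the toy images are illustrative assumptions; this is not the MPEG-7 color descriptor, only one simple technique in the same spirit.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized coarse RGB histogram for a list of (r, g, b) pixels."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_id: n / total for bin_id, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of the per-bin minimum of two normalized histograms."""
    return sum(min(share, h2.get(bin_id, 0.0)) for bin_id, share in h1.items())

def rank_by_color(query_pixels, collection):
    """Order the names in `collection` (name -> pixels) by similarity to the query."""
    query = color_histogram(query_pixels)
    scores = {
        name: histogram_intersection(query, color_histogram(pixels))
        for name, pixels in collection.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Toy "images" made up for illustration: mostly orange, orange plus blue, all green.
sunset = [(250, 120, 30)] * 6 + [(200, 60, 20)] * 2
beach = [(245, 110, 40)] * 4 + [(30, 90, 220)] * 4
forest = [(20, 140, 40)] * 8

ranking = rank_by_color(sunset, {"beach": beach, "forest": forest})
print(ranking)
```

No textual annotation is involved: the beach image ranks above the forest image for the sunset query only because it shares more of the same coarse color bins.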



The standard covers the description of the physical characteristics of the image, but MPEG-7 also includes Descriptors, which define the syntax and the semantics of each feature. The specific structure, semantics and relationships among the components of the content are collected in Description Schemes. There are thus two different schema types: Descriptors and Description Schemes.
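The relationship between the two schema types can be pictured with a small data-structure sketch. It is only a conceptual model under stated assumptions: the class names `Descriptor` and `DescriptionScheme`, their fields, and the sample values are hypothetical and do not reproduce the syntax defined by the standard.

```python
from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class Descriptor:
    """One feature and its value, e.g. a dominant color (hypothetical model)."""
    feature: str
    value: Any

@dataclass
class DescriptionScheme:
    """Groups descriptors and nested schemes, giving them structure."""
    name: str
    descriptors: List[Descriptor] = field(default_factory=list)
    children: List["DescriptionScheme"] = field(default_factory=list)

    def find(self, feature: str) -> List[Any]:
        """Collect the values of every descriptor with this feature name, depth-first."""
        found = [d.value for d in self.descriptors if d.feature == feature]
        for child in self.children:
            found.extend(child.find(feature))
        return found

# Made-up example: a video scheme containing two shot schemes.
shot1 = DescriptionScheme("shot", [Descriptor("dominant_color", "orange")])
shot2 = DescriptionScheme("shot", [Descriptor("dominant_color", "blue")])
video = DescriptionScheme("video", [Descriptor("title", "Example clip")], [shot1, shot2])

print(video.find("dominant_color"))
```

In this model a descriptor carries a single feature value, while a scheme records how descriptors and sub-schemes relate to one another, here a video that contains shots.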


According to this philosophy, the MPEG-7 descriptors of the audio-visual content may include all the items that the standard considers as informative:

  • Information describing the creation and production processes of the content (director, title, short feature movie).
  • Information related to the usage of the content (copyright pointers, usage history, broadcast schedule).
  • Information about the storage features of the content (storage format, encoding).
  • Structural information on spatial, temporal or spatio-temporal components of the content (scene cuts, segmentation in regions, region motion tracking).
  • Information about low level features in the content (colors, textures, sound timbres, melody description).
  • Conceptual information about the reality captured by the content (objects and events, interactions among objects).
  • Information about how to browse the content in an efficient way (summaries, variations, spatial and frequency subbands).
  • Information about collections of objects.
  • Information about the interaction of the user with the content (user preferences, usage history).
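
To make the categories above more concrete, the following sketch stores a few of them as nested dictionaries and runs a simple search over them. Everything in it is an illustrative assumption: the key names, the `search` helper and the sample records are made up, and real MPEG-7 descriptions are not expressed as Python dictionaries.

```python
# Hypothetical catalogue: the top-level keys mirror some of the categories listed above.
catalogue = [
    {
        "creation": {"title": "Harbour at Dawn", "director": "A. Example"},
        "storage": {"format": "MP4", "encoding": "H.264"},
        "structure": {"scene_cuts_seconds": [0.0, 12.5, 40.0]},
        "low_level": {"colors": ["orange", "blue"]},
    },
    {
        "creation": {"title": "Forest Walk", "director": "B. Example"},
        "storage": {"format": "MP4", "encoding": "H.264"},
        "structure": {"scene_cuts_seconds": [0.0, 30.0]},
        "low_level": {"colors": ["green"]},
    },
]

def search(records, category, key, wanted):
    """Return records whose value at category/key equals `wanted` or contains it."""
    hits = []
    for record in records:
        value = record.get(category, {}).get(key)
        if value == wanted or (isinstance(value, list) and wanted in value):
            hits.append(record)
    return hits

def titles(records):
    """Convenience helper: list the creation titles of the given records."""
    return [record["creation"]["title"] for record in records]

print(titles(search(catalogue, "low_level", "colors", "blue")))
print(titles(search(catalogue, "storage", "encoding", "H.264")))
```

Because every category is described in the same structured way, the same query can target production data, storage features or low-level features without any change to the search logic.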


*The Moving Picture Experts Group (MPEG) is a working group of ISO/IEC (formally ISO/IEC JTC1/SC29/WG11) in charge of the “development of international standards for compression, decompression, processing, and coded representation of moving pictures, audio, and their combination, in order to satisfy a wide variety of applications”.

