Layout analysis encompasses the techniques used to infer how the pages of document images are organized. From a physical point of view, the layout can be described as a set of blocks, in most cases rectangular, that are arranged on the page and contain homogeneous content, such as text, vector graphics, or illustrations.

From a logical point of view, text blocks can take on different meanings depending on their content and their position on the page. For instance, in technical papers, blocks can correspond to the title, author, or abstract of the paper. The learning algorithms adopted in this domain are often supervised classifiers, applied at various processing levels to label the objects in the document image according to physical or logical categories.

Classification can be performed on individual pixels, on regions, or even on whole pages. This chapter analyzes the different ways in which supervised classifiers are used in layout analysis.
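To make region-level classification concrete, the following is a minimal sketch of a supervised classifier that assigns a physical label to a region. It is not a method taken from the chapter: the nearest-centroid rule, the two features (a squashed aspect ratio and an ink density), the names `Region`, `features`, `train`, and `classify`, and the training values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Region:
    """A rectangular block on the page (hypothetical representation)."""
    width: float
    height: float
    ink_density: float  # fraction of dark pixels in the block, in [0, 1]


def features(region):
    """Map a region to a 2-D feature vector with both components in [0, 1]."""
    aspect = region.width / region.height
    # Squash the aspect ratio into [0, 1) so it is comparable to the density.
    return (aspect / (1.0 + aspect), region.ink_density)


def train(examples):
    """Compute one centroid per label from (Region, label) pairs."""
    sums = {}
    for region, label in examples:
        f = features(region)
        total, count = sums.get(label, ((0.0, 0.0), 0))
        sums[label] = ((total[0] + f[0], total[1] + f[1]), count + 1)
    return {label: (t[0] / c, t[1] / c) for label, (t, c) in sums.items()}


def classify(centroids, region):
    """Return the label whose centroid is closest to the region's features."""
    f = features(region)
    return min(centroids, key=lambda label: dist(centroids[label], f))


# Made-up labeled examples: wide, sparse blocks as text; squarer, denser
# blocks as illustrations.
training_set = [
    (Region(400, 40, 0.12), "text"),
    (Region(380, 60, 0.15), "text"),
    (Region(200, 180, 0.55), "illustration"),
    (Region(220, 200, 0.60), "illustration"),
]
centroids = train(training_set)
text_guess = classify(centroids, Region(390, 50, 0.10))
picture_guess = classify(centroids, Region(210, 190, 0.50))
```

The same pattern would apply at the pixel or page level by changing what a feature vector describes; in practice the hand-picked features and centroid rule would likely be replaced by a stronger learned model.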

Therefore, and unlike many other approaches for layout analysis, ours can easily adapt itself to a variety of document analysis problems. One need only specify the page grammar and provide a .

A document image is composed of a variety of physical entities or regions such as text blocks, lines, words, figures, tables, and background. We could also assign functional or logical labels such.
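One possible way to represent such entities in code is sketched below. The physical categories are the ones listed above; the type names `PhysicalLabel` and `PageRegion`, the bounding-box fields, the helper `regions_with`, and the use of a free-form string for the logical label are assumptions made for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PhysicalLabel(Enum):
    """Physical entity types mentioned in the text."""
    TEXT_BLOCK = "text block"
    LINE = "line"
    WORD = "word"
    FIGURE = "figure"
    TABLE = "table"
    BACKGROUND = "background"


@dataclass
class PageRegion:
    """A labeled rectangle on the page (hypothetical representation)."""
    x: int
    y: int
    width: int
    height: int
    physical: PhysicalLabel
    # Optional functional/logical label, e.g. "title" or "abstract".
    logical: Optional[str] = None


def regions_with(regions, physical):
    """Select the regions that carry a given physical label."""
    return [r for r in regions if r.physical is physical]


# A made-up page with three regions.
page = [
    PageRegion(50, 40, 500, 60, PhysicalLabel.TEXT_BLOCK, logical="title"),
    PageRegion(50, 120, 500, 200, PhysicalLabel.TEXT_BLOCK, logical="abstract"),
    PageRegion(50, 340, 300, 250, PhysicalLabel.FIGURE),
]
text_blocks = regions_with(page, PhysicalLabel.TEXT_BLOCK)
```

Keeping the physical and logical labels as separate fields lets a region be physically classified first and logically labeled later, or not at all.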

