
1.4 Cadastral Map

In the document L’Université de La Rochelle (Pages 28-33)

This section gives some fundamental information about a particular kind of image, the cadastral map: its definition, its history and its classification. This kind of image is the main focus of our study.

The extraordinary potential of the automatic analysis of color documents raises new interest and represents a real challenge, since color has always been considered a strong tool for information extraction [Dorin Comaniciu 1997]. As mentioned earlier, in the context of the project called “ALPAGE”, we are considering the digitization of ancient maps. In this project, we consider cadastral maps from the 19th century (called “Atlas VASSEROT”), on which objects are drawn using color, for instance to distinguish parcels. This project deals with the classical graphic recognition problems, to which are added difficulties due to the presence of color and to the strong time-induced degradation of relevant information: color degradation, yellowing of the paper, pigment fading, etc. In the context of this multi-disciplinary project, the idea is to provide strategic information for historians or students, which means that the purpose is to propose a set of processing steps that segment and recognize all the objects in the documents. In such a setting, the objects handled can number in the millions. This volume of data gives rise to new services such as intelligent indexing, document browsing and content searching.

If the analysis of a given document were reduced to digitizing the paper document into a “bitmap” image, the problem would be commonplace. In fact, the underlying scientific problems are very complex because the objective is much more ambitious: the conversion of the paper document into its semantic interpretation [BELAID A. 1992]. Retro-conversion is a semantic digitization: starting from elementary data and contextual information, the analysis is carried out through a color graphic recognition process whose aim is to build structured information dedicated to a GIS. A classical bottom-up approach, from pixel to object, calls on various low-level tools such as color segmentation or line tracking, while at the top, high-level methods allow the integration of a priori knowledge, which contributes to the interpretation process with the aim of archiving information (figure 1.5) [Lladós J. 2003].

Figure 1.5: Architecture of a graphic document analysis system

An example of a cadastral map is shown in figure 1.6. The most obvious feature of these documents is their use of color; hence, we need to consider the meaning of color to extract cadastral information (i.e., a parcel), and we take a closer look at color representation. Consequently, in this thesis, we propose a general architecture that takes into account the color information embedded in graphic documents, with the objective of building a relevant Content-Based Map Retrieval (CBMR) system.

An overview of our framework is displayed in figure 1.7. Our method relies on three major steps:

1. Firstly, a preprocessing stage prepares the data. This includes:

(a) Finding the best color model in terms of distinction between different colors. We assume that the choice of an efficient color model will be decisive, since the performance of any color-dependent system is highly influenced by the color model it uses.

Figure 1.6: A sample of cadastral map.

(b) Removing undesirable data so that only the information of interest, the parcels within the map, is preserved (in data-mining terms, keeping the “true diamond”).

2. Secondly, a vectorization step and its performance evaluation provide, respectively, vectorial objects to be inserted into the GIS and tools for comparing maps. The three sub-steps composing this second part are as follows:

(a) A color segmentation approach dedicated to documents is presented; it is inspired by graphic construction rules of cadastral maps.

(b) Digital curve approximation transforms pixels into vectors.

(c) Performance evaluation of the vectorization and a vectorial dissimilarity measure between maps.

3. Finally, a Content-Based Image Retrieval system adapted to cadastral maps is presented. This so-called Content-Based Map Retrieval (CBMR) application relies on a vectorial distance between maps. The vectorization stage feeds the CBMR process; it thus provides a morphological analysis and makes it possible, from a query image, to find similar cadastral maps.

(a) Images of maps are like no others, and a CBMR approach should take advantage of the spatial organization within such a document. A graph-based representation is therefore likely to perform better, and consequently the CBMR system should involve a graph distance when searching for a map by similarity.
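Sub-step 2(b) above, turning chains of pixels into vectors, can be illustrated with a short sketch. The text does not say which approximation algorithm is used, so the example below swaps in the well-known Ramer-Douglas-Peucker polyline simplification purely as an assumption; the function names and the tolerance value are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment [a, b]."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p on the line (a, b), clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_polyline(points, tolerance):
    """Ramer-Douglas-Peucker: keep only the points needed so that no
    removed point lies farther than `tolerance` from the result."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        # Every intermediate point is close enough to the chord.
        return [first, last]
    # Split at the farthest point and simplify both halves.
    left = simplify_polyline(points[: index + 1], tolerance)
    right = simplify_polyline(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical pixel chain following an L-shaped parcel border.
chain = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
print(simplify_polyline(chain, 0.5))  # [(0, 0), (3, 0), (3, 3)]
```

The seven pixels collapse to the three corner points of the border, which is the kind of vectorial object that could then be inserted into a GIS.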


Figure 1.7: Overall methodology of our system.

1.5 Conclusion

This thesis deals with a problem of graphic detection and retrieval, focused specifically on ancient colored cadastral maps. It is also part of a historical document project named ALPAGE, whose main objective is to preserve and derive benefit from ancient documents. The main objectives of this thesis are to propose a vectorization and pattern recognition framework for a database of maps and to develop a CBIR system that gives interested users reliable access to that database and its functions.

This thesis is organized into eight chapters. The interaction between chapters is illustrated in figure 2, and a short description of each chapter follows:

Chapter 1 introduces the project and presents the overall concept of this thesis. We introduce the general aspects of document image analysis, the need for and importance of historical documents, and the related project named ALPAGE. Next, we focus on colored cadastral maps and define the scope and objectives of this study.

Chapter 2 concerns the state of the art. This chapter reviews the literature related to historical document analysis. The review provides the fundamental knowledge, an overall view and the terminology, and presents recent ideas and techniques that are useful to our work.

Chapter 3 is dedicated to the color processing aspects. This project deals with the classical graphic recognition problems, to which are added difficulties due to the presence of color and to the strong time-induced degradation of relevant information: color degradation, yellowing of the paper, pigment fading, etc. In particular, this chapter introduces some principles of color restoration and provides a guided tour of color representation and its subsequent selection.
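To make the idea of selecting a color representation by how well it distinguishes colors more concrete, here is a minimal sketch. It is not the selection procedure of the thesis: the criterion (distance between class means divided by within-class spread), the candidate models (RGB and YIQ from Python's standard `colorsys` module) and the sample values are all assumptions chosen for illustration.

```python
import colorsys
import math


def mean(vectors):
    """Component-wise mean of a list of 3-component color vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(3))


def scatter(vectors):
    """Average Euclidean distance of the vectors to their mean."""
    m = mean(vectors)
    return sum(math.dist(v, m) for v in vectors) / len(vectors)


def separability(class_a, class_b, convert):
    """Between-class distance over within-class spread, measured in the
    color model reached by `convert(r, g, b)`. Higher means the two
    color classes are easier to tell apart in that model."""
    a = [convert(*rgb) for rgb in class_a]
    b = [convert(*rgb) for rgb in class_b]
    between = math.dist(mean(a), mean(b))
    within = scatter(a) + scatter(b)
    return between / within if within > 0 else math.inf


def identity(r, g, b):
    """RGB is the input model, so no conversion is needed."""
    return (r, g, b)


# Hypothetical candidate set; a real study would compare many more models.
COLOR_MODELS = {"RGB": identity, "YIQ": colorsys.rgb_to_yiq}


def rank_models(class_a, class_b):
    """Return (model name, score) pairs, best separability first."""
    scores = [(name, separability(class_a, class_b, fn))
              for name, fn in COLOR_MODELS.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Made-up RGB samples in [0, 1]: a faded pink parcel fill vs. yellowed paper.
parcel = [(0.90, 0.60, 0.60), (0.85, 0.55, 0.60), (0.95, 0.65, 0.62)]
paper = [(0.90, 0.85, 0.60), (0.88, 0.80, 0.55), (0.93, 0.87, 0.65)]
for name, score in rank_models(parcel, paper):
    print(name, round(score, 2))
```

The ranking depends entirely on the samples supplied, so the sketch only shows the mechanics of comparing models, not which model is best for cadastral maps.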

Chapter 4 deals with the problem of extracting information from cadastral maps. It presents the methods and low-level algorithms involved in retrieving quarters and parcels. Parcel information is then structured into a graph-based representation. These data will be used to feed a CBIR stage.

Chapter 5 assesses the vectorization process. The question of performance evaluation is raised and a set of metrics is defined. These indices reveal errors that occur in raster-to-vector conversion.

Chapter 6 presents graph matching and graph classification methods. Cadastral maps are modeled by attributed relational graphs that take into account the relationships between parcels; hence, the question of finding similarities between maps turns into a graph matching problem. This chapter gives the theoretical and experimental foundations of the graph mining methods considered for that purpose.
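As an illustration of what an attributed relational graph of parcels might look like, the sketch below builds one from vectorized polygons. The attributes (area, number of corners), the adjacency rule (two parcels are linked when they share an identical boundary edge) and all names are assumptions made for this example; the thesis may use different attributes and relations.

```python
from itertools import combinations


def polygon_area(vertices):
    """Area of a simple polygon given as a vertex list (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def boundary_edges(vertices):
    """Set of undirected boundary edges, each a frozenset of two vertices."""
    n = len(vertices)
    return {frozenset((vertices[i], vertices[(i + 1) % n])) for i in range(n)}


def build_parcel_graph(parcels):
    """Build (nodes, edges) from a dict mapping parcel name to vertex list.

    Nodes carry parcel attributes; an edge links two parcels that share
    at least one boundary edge with exactly matching end points.
    """
    nodes = {name: {"area": polygon_area(v), "corners": len(v)}
             for name, v in parcels.items()}
    edges = {}
    for a, b in combinations(sorted(parcels), 2):
        shared = boundary_edges(parcels[a]) & boundary_edges(parcels[b])
        if shared:
            edges[(a, b)] = {"shared_edges": len(shared)}
    return nodes, edges


# Hypothetical parcels: A and B share a wall, C stands apart.
parcels = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
    "C": [(5, 5), (6, 5), (6, 6)],
}
nodes, edges = build_parcel_graph(parcels)
print(nodes["A"])  # {'area': 4.0, 'corners': 4}
print(edges)       # {('A', 'B'): {'shared_edges': 1}}
```

Once two maps are encoded this way, comparing them becomes a matter of comparing their node and edge sets, which is the graph matching problem mentioned above. The exact-vertex adjacency test is a simplification: real vectorization output would need a tolerance when deciding that two borders coincide.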

Chapter 7 addresses the Content-Based Image Retrieval (CBIR) topic from a general point of view. The suitability of structural approaches for a CBIR system is discussed. Finally, and more specifically, a CBIR application based on polygon features is described. Given a query map, browsing of the cadastral map collection is computer-aided, with the objective of retrieving the most similar cadastral maps from a morphological point of view. In this way, the CBIR paradigm is adapted to give birth to what we call Content-Based Map Retrieval (CBMR).

Chapter 8 discusses the results and draws the conclusions of this thesis. Finally, we give perspectives and introduce possible future work.

Chapter 2

Color Map Understanding: State
