Cartographic Raster to Vector Conversion - Color Map Understanding: State of the art

Color Map Understanding: State of the art

2.2 Cartographic Raster to Vector Conversion

Two main methods are currently used for map vectorization [Levachkine S. 2000]:

(V1) Paper map digitization by electronic-mechanical digitizers, and (V2) Raster map (a map obtained after scanning of the paper original) digitization. The dig-itizing of paper maps cannot be automated; hence the only practical approach to design R2V conversions is the development of methods and software to automate vectorization of raster maps. Raster map digitization approaches can be divided into four intersecting groups [Levachkine S. 2000]: (D1) Manual; (D2) Interactive;

(D3) Semi-automated, and (D4) Automatic. In practice, (D1) = (V1). In the case of "punctual" objects, the operator visually locates graphic symbols and fixes their coordinates. In the case of "linear" and polygonal objects, the operator uses recti-linear segments to approximate curvirecti-linear contours. The manual digitization rate is one to two objects per minute. Interactive digitization uses special programs, which, once the operator indicates the starting point on a line segment, automati-cally follow the contours of the line (tracing). These programs are capable of tracing relatively simple lines. If the program cannot solve a graphical ambiguity on the raster map, it returns a message to the operator (alarm). Recently, vector editors capable of carrying out this digitization process have appeared, reducing the time consumption by a factor of 1.5 to 2. These can be called semi-automated systems [Levachkine 2004] [Levachkine 2003] [Levachkine S. 2000] [E-Cognition ] [R2V ].

In theory, automatic vector editors automatically digitize all objects of a given class, leaving to the operator the error correction in the resulting vector layers. Some vector editors use this approach [E-Cognition] [R2V ] [Benz U.C. 2003]. However, in practice, the high error level resulting from any small complication in the raster map means that alternative methods should be sought to reduce the huge volume of manual corrections.

Computer aided correcting system lies on two points: (1) An automatic detection of ambiguities (low confidence zones) and (2) a pertinent graphical user interface in order to locally correct errors without perturbation of the global data coherency.

A system approach can enhance the outcome not only of processing (map recog-nition) but also pre-processing (preparation of paper maps and their corresponding

2.2. Cartographic Raster to Vector Conversion 13

Figure 2.1: Main processing steps required in raster to vector conversion.

rasters) and post-processing (final processing of the results of automatic digitiza-tion). Thus, in the following three sections one shall consider these processing stages in the context of system approach. Figure 2.1depicts an overview of the complete process.

2.2.1 Pre-processing

The main goal of pre-processing is to prepare raster cartographic images in such a way as to simplify them and increase the reliability of their recognition in the automatic system. The proposed sequence of operations for the preparation of raster maps for automatic recognition is:

Pre-processing=

1. Preparation of the cartographic materials for scanning (a) Restoration

(b) Copying

(a) Test scanning

(b) Definition of the optimal scanning parameters (c) Final scanning

(d) Joining raster facets

(e) Correction of the raster image geometry by the reference points 3. Preparation of the raster maps for recognition of the cartographic objects

(a) Edition of raster map

(b) Elimination of the map notations and map legend

(d) Restoration of the topology of cartographic images in pixel level (e) Separation of basic colors of the graphical codification on a raster map (f) Restoration of the color palette of a raster map

(g) Stratification of raster map

(h) Stratification by reduced color palette

(i) Logical stratification of the cartographic objects

The comments on all pre-processing steps can be found in [Levachkine 2004]–[Levachkine S. 2000]. Concerning this stage, let us discuss only the most important points that expose clearly our problem.

Increasing the contrast of image objects. To simplify the process of vectorization, objects of the same class can be highlighted on a copy of the map using a contrasting color. Typically such marking involves enclosing linear objects such as streets, or inside rooms. In practice, outlines of polygonal objects, which do not have explicit borders (such as water well, stairs, etc.), and are delineated only by dashed or patterned lines, should be drawn in. In particular, various polygonal objects may overlap, one represented by color, another outlined by a dashed line, and a third by patterned lines; in such cases, the objects should all be outlined explicitly.

Stratification of the raster map. A raster map, considered as a unified hetero-geneous graphical image, is suitable for parallel human vision. In contrast, raster images, containing homogeneous graphical information, are suited to consecutive machine vision. Two approaches can be used for the stratification of the original raster map: 1) stratification by reduced color palette or 2) logical stratification of the cartographic objects. In the first case, maps are derived from the original raster map which preserve only those pixels that have a strongly defined set of colors corresponding to the images of one class (for example, red and yellow, say, might correspond to the icons of two distinct parcel owners). In the second case, the map only preserves fragments of general raster image context corresponding to the locations of cartographic objects of one class.

2.2.2 Processing

The main goal of this principal stage of automatic vectorization of raster maps is the recognition of cartographic images, i.e. generation of vector layers and attributive information in electronic maps. The fundamental idea of R2V automation is the development of methods, algorithms and programs that focus only on locating and identifying specific cartographic objects that constitute the map semantic structure.

Each cartographic image has its own graphical representation parameters, which can be used for automatic object recognition on a raster map. The particular attributes depend on the topological class of the object. Traditionally in GIS, vector map objects are divided into three types: points, segments and polygons, representing

2.2. Cartographic Raster to Vector Conversion 15

respectively "punctual", "linear" and area objects. Graphical images have color, geometric (location, shape), topological and attributive (quantitative and qualitative parameters, e.g. the name of object) information, which can be merged into the concept of cartographic image.

An important element of the automation of raster map vectorization is the devel-opment of an optimal sequence of steps for cartographic image recognition, succes-sively eliminating elements already decoded from the raster map field and restoring images, which were hidden by the eliminated elements. The basic principle of this optimized ordering is from simple to complex. Nevertheless, the possibility of using information from objects already digitized (whether manually or by an automatic system) should be provided for in the development of a recognition strategy. For example, the punctual layer of street information can be successfully used for recog-nition of polygonal elements of the quarters. Taking this into account, it becomes clear that street names and numbers should be retrieved before the quarters are digitized. Eliminating them from the raster map, one can use their locations and attributive data to aid in recognition of elements of the quarters. This strategy can also be considered as use of different sources of evidence to resolve ambiguities in a R2V conversion.

In other words, maximal use of already existing information (directly or indi-rectly related to the vectorized objects) employed as general principle of automatic cartographic image recognition can increase efficiency and reliability. Summarizing the processing of raster maps, we notice that the methods and algorithms used for this process should provide complete, even redundant cartographic image recogni-tion (no matter a number of erroneously recognized objects), since visual control and correction of the vector layers can be carried out more quickly than manual digitation of missed objects. To conclude the discussion in this section, the process of automatic cartographic image recognition (processing) often follows this scheme:

Processing=

1. Development of the strategy of automatic digitization of raster maps 2. Recognition of cartographic images

(a) Digitization of objects which have vector analogues (b) Digitization of objects which have not vector analogues

(a) Restoration of image covered by recognized objects (b) Correction of restored image

2.2.3 Post-Processing

The main goal of the post-processing of raster maps (after map image recognition) is an automatic correction of vectorization errors. For automatic correction of

digitiza-tion, two approaches can be distinguished: 1) using the topological characteristics of objects in vector layers and 2) using the sources of evidence (textual, spatial distribution information).

The first approach is based on the fact that many cartographic objects in the map have well-defined topological characteristics, which can be used for the correction of vectorization errors. Let us give just one obvious example: Parcels. The topological characteristics of parcel systems (e.g. contour lines) are:

(a) parcels are continuous, (b) they cannot overlap each other, and (c) each parcel has at least one edge on the street side. However, in a raster map these character-istics, as a rule, may be lost due to several reasons: (i) the lines are broken where a street number for a given parcel is written, (ii) some parcels are not well drawn in high density regions, (iii) the folding of the page may corrupt the raster image, and (iv) degradations due to the storage condition can cause many defects of printing the paper maps. The contrast between the parcel color and the color of the street is not strong enough to make a clear distinction, so, the boundaries are broken and they need special consideration. These elements of the map’s graphical design, if not considered as the parts of the parcel system, hinder the correct assembly of the polygons and either should be eliminated or (better) detached in a separate vector layer. They can be restored on the vector map and used for the automatic attribution of polygons assembled from the contour lines.

The second approach has proven its efficiency in toponym recognition of car-tographic maps and can be also used in general R2V conversion post-processing [Gelbukh A. 2003]. For instance, it is obvious to say that close to a street number, there is a parcel. This information of relation between a text component and a parcel object can be useful. The example presented shows that the characteristics of internal structure and relationships between the vector objects can be used in automatic correction of errors of the automatic vectorization. In practice, it means the development of more specific software for automatic cadastral image recognition.

Summarizing the discussion of this section, the process of automatic correction of results of automatic cadastral image recognition (postprocessing) follows the scheme:

Post-processing =

1. Correction of vector layers based on peculiarities of their internal topology 2. Correction using the sources of evidence (textual, street names or numbers) 3. Final correction of vector layers in whole electronic map system

Color cadastral map interpretation is at the edge between cartographic raster to vector conversion and technical drawing understanding. From its color aspect,

Dans le document L’Université de La Rochelle (Page 34-39)