Ontology and Data Mining Against Semantics Lack in Image Retrieval

Part III Multimedia Data Indexing

9. New Image Retrieval Principle: Image Mining and Visual Ontology

9.3 Ontology and Data Mining Against Semantics Lack in Image Retrieval

Users automatically assign semantics to an image, using its visual content and their own knowledge. Image retrieval systems, by contrast, suffer from a lack of expressive power because they do not integrate enough semantics. Properties such as colour, texture, and shape, as well as key words, are not sufficient to capture all the concepts a user wants to express. Images that are similar in visual content may be semantically very different. This is why much research is devoted to image semantics. Many approaches aim to propagate annotations starting from partially annotated image databases. Different points of view can be considered: key word annotation propagation, semiautomatic annotation, use of ontologies, and knowledge discovery.

As far as we are concerned, we focus on the last two points. We think that, without a real synergy between ontology and data mining, the gap between low-level textual and visual descriptors and high-level concepts cannot be reduced. It is well known that ontologies help the information retrieval process. However, the combination of data mining techniques with ontologies to mine, interpret, and (re)organize knowledge is less common. For us, an ontology may be seen as a priori knowledge and used to improve the data mining process; conversely, the data mining process may be useful for creating an ontology. Moreover, these techniques should also improve both navigation and image retrieval. More precisely, our goal is to build a visual ontology dedicated to a specific application and to use it for large image database exploration.

Indeed, these techniques should not only allow efficient access to large image databases by providing a relevant synthetic view, but also act as a filter by reducing the search space, which is essential in media retrieval. The contribution of each technique (ontology and data mining) to more ingenious image retrieval is developed in the following sections.

9.3.1 Knowledge Discovery in Large Image Databases

Data mining techniques belong to the family of knowledge discovery methods, known as knowledge discovery in databases (KDD) [29], whose aim is to discover knowledge in large databases without predetermined information about the application field. Data mining methods try to discover knowledge from an exploratory or a decisional point of view.

Data mining techniques have been used extensively on traditional data. In the context of multimedia data, however, databases contain both numeric features, as in many current databases, and voluminous quantities of nonstandard data. Image mining (more generally, multimedia mining) does not consist of applying alphanumerical mining techniques to images. Classical data mining methods cannot be directly applied to images because of their nature: image data may be considered complex data owing to their high dimensionality [25]. Some approaches for extracting knowledge from multimedia data have been proposed, and we think that image (more generally, multimedia) mining is a promising way to overcome the semantics problem in visual retrieval systems. This research field is still in its infancy, but image and multimedia data raise new challenges for learning and discovering knowledge from large quantities of such nonstructured data. In the image context, among data mining techniques (classification by decision trees or neural networks, clustering, association rules), the most used methods are clustering, association rules [30], and neural networks.

In recent years, some approaches for extracting knowledge from multimedia data have been proposed. The SKICAT system [31] deals with knowledge discovery in astronomical images and integrates techniques for image processing and classification. Decision trees are used to classify objects obtained by image segmentation.

In [32], the authors propose methods for mining content-based associations with recurrent items and with spatial relationships from large visual data repositories. They introduce a progressive resolution refinement approach in which frequent item sets are first mined at rough resolution levels; finer resolutions are then mined only on the candidate frequent item sets derived from the rough levels. The proposed algorithm is an extension of the well-known Apriori algorithm that takes into account the number of object occurrences in the images.
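To make the idea of recurrent items concrete, the following is a minimal sketch of an Apriori-style, level-wise search in which an item is a pair (label, minimum number of occurrences). It is not the algorithm of [32]: the progressive resolution refinement and the spatial relationships are left out, and the toy images, the labels, and the function names are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def support(itemset, images):
    """Fraction of images that contain every (label, min_count) item."""
    hits = sum(
        all(image[label] >= count for label, count in itemset)
        for image in images
    )
    return hits / len(images)


def frequent_itemsets(images, min_support):
    """Level-wise search over items of the form (label, min_count)."""
    # Level 1: every label paired with every occurrence count seen in the data.
    items = sorted({
        (label, k)
        for image in images
        for label, n in image.items()
        for k in range(1, n + 1)
    })
    current = [frozenset([item]) for item in items
               if support([item], images) >= min_support]
    result = {}
    while current:
        for itemset in current:
            result[itemset] = support(itemset, images)
        # Join step: merge two frequent k-item sets into a (k+1)-candidate,
        # keeping at most one count threshold per label.
        candidates = {
            a | b
            for a, b in combinations(current, 2)
            if len(a | b) == len(a) + 1
            and len({label for label, _ in a | b}) == len(a | b)
        }
        current = [c for c in candidates if support(c, images) >= min_support]
    return result


# Hypothetical toy data: each image is a bag of detected object labels.
images = [
    Counter(tree=2, house=1),
    Counter(tree=2),
    Counter(tree=1, house=1),
    Counter(car=1),
]
found = frequent_itemsets(images, min_support=0.5)
for itemset, value in found.items():
    print(sorted(itemset), value)
```

Counting occurrences lets the search distinguish "at least one tree" from "at least two trees", which a set-based Apriori would treat as the same item.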

In [33], an algorithm for discovering association rules in image databases based on image content is proposed. This algorithm relies on four major steps: feature extraction, object identification, auxiliary image creation, and object mining.

The main advantage of this approach is that it does not use any domain knowledge and does not produce meaningless or false rules. However, it suffers from several drawbacks, the most important being the relative slowness of the feature extraction step and its poor performance on complex images.
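The last step, object mining, can be illustrated with a small sketch that derives one-to-one association rules from sets of object identifiers. It assumes that the earlier steps have already reduced each image to a set of object labels; the labels, thresholds, and function name are hypothetical, and the code is not the procedure of [33].

```python
from itertools import permutations


def pairwise_rules(images, min_support, min_confidence):
    """Return rules (a, b, support, confidence) meaning 'a implies b'."""
    total = len(images)
    labels = sorted(set().union(*images))
    rules = []
    for a, b in permutations(labels, 2):
        with_a = sum(1 for image in images if a in image)
        with_both = sum(1 for image in images if a in image and b in image)
        rule_support = with_both / total
        confidence = with_both / with_a  # with_a >= 1 because a occurs in the data
        if rule_support >= min_support and confidence >= min_confidence:
            rules.append((a, b, rule_support, confidence))
    return rules


# Hypothetical output of the object identification step.
object_sets = [
    {"circle", "square"},
    {"circle", "square"},
    {"circle"},
    {"triangle"},
]
rules = pairwise_rules(object_sets, min_support=0.5, min_confidence=0.9)
for a, b, s, c in rules:
    print(f"{a} -> {b} (support={s:.2f}, confidence={c:.2f})")
```

On this toy data, only "square implies circle" passes both thresholds: every image containing a square also contains a circle, whereas a circle appears once without a square.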

In [34], the author proposes an architecture that integrates knowledge extraction from image databases. Association rules are extracted to characterize images, and they are used to classify new images during insertion.

In [35], a recent experiment shows the dependencies between textual and visual indexing. The experiment is performed on different corpora containing photographs manually indexed by key words. The authors compare text-only classification, visual-only classification, and the fusion of textual and visual classification. They show that the fusion significantly improves on text-only classification.
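One common way to fuse two classifiers is a weighted average of their class scores (late fusion). The sketch below illustrates that general idea only; the source does not say which fusion scheme [35] uses, and the class names, scores, and weight are invented for the example.

```python
def late_fusion(text_scores, visual_scores, text_weight=0.5):
    """Combine per-class scores from a text and a visual classifier.

    Scores are dicts mapping class name to a value in [0, 1]; a class
    missing from one classifier is given a score of 0 for that classifier.
    """
    classes = set(text_scores) | set(visual_scores)
    fused = {
        name: text_weight * text_scores.get(name, 0.0)
        + (1.0 - text_weight) * visual_scores.get(name, 0.0)
        for name in classes
    }
    best = max(fused, key=fused.get)
    return best, fused


# Hypothetical classifier outputs for one photograph.
text_scores = {"beach": 0.6, "city": 0.4}
visual_scores = {"beach": 0.3, "city": 0.7}

label_equal, _ = late_fusion(text_scores, visual_scores, text_weight=0.5)
label_text_heavy, _ = late_fusion(text_scores, visual_scores, text_weight=0.8)
print(label_equal, label_text_heavy)
```

The weight controls how much the key word evidence dominates; in practice it would be tuned on held-out annotated images.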

The following section presents how the ontology concept may be seen as a priori knowledge and used to improve the data mining process.

9.3.2 Ontologies and Metadata

Semantics can be expressed using weak semantics such as taxonomies or rich semantics such as ontologies [17, 36]. Semantic information may also appear as semantic annotations or metadata. Several formats have been designed to meet this goal, among them the Resource Description Framework (RDF) [37] from the W3C. RDF aims at describing resources and establishing relationships among them. RDF can be enriched with RDF Schema (RDFS), which expresses class hierarchies and typing constraints, for example, to specify that a given relation type can connect only specific classes.

A taxonomy is a hierarchically organised controlled vocabulary. Taxonomies abound because human beings naturally classify things, but they are semantically weak. A thesaurus is a “controlled vocabulary arranged in a known order and structured so that equivalence, homographic, hierarchical, and associative relationships among terms are displayed clearly and identified by standardized relationship indicators” [38]. The purpose of a thesaurus is to facilitate document retrieval. Wordnet [39] is a thesaurus that organizes English nouns, verbs, adverbs, and adjectives into sets of synonyms and defines relationships between these sets.

According to [40], “an ontology is an explicit specification of a conceptualization.” Ontologies consist of a hierarchical description of the important concepts of a domain and a description of each concept’s properties. They can be defined more or less formally, from natural language to description logics [41]. OWL (Web Ontology Language) [42] belongs to this last category. OWL is built upon RDF and RDFS and extends them to express class properties.
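The roles of RDF statements, class hierarchies, and typing constraints can be sketched with plain Python tuples standing in for triples. This is an illustration under assumptions, not an RDF library: the `ex:` resources are invented, and the check reads `rdfs:domain` and `rdfs:range` as constraints, as the text describes, whereas the RDFS specification formally treats them as entailment rules.

```python
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"

# Hypothetical vocabulary and data, written as (subject, predicate, object).
triples = {
    ("ex:Photograph", SUBCLASS_OF, "ex:Image"),
    ("ex:Image", SUBCLASS_OF, "ex:Document"),
    ("ex:depicts", DOMAIN, "ex:Image"),
    ("ex:depicts", RANGE, "ex:Object"),
    ("ex:photo1", TYPE, "ex:Photograph"),
    ("ex:tree1", TYPE, "ex:Object"),
    ("ex:photo1", "ex:depicts", "ex:tree1"),
}


def superclasses(cls, graph):
    """Classes reachable from cls through rdfs:subClassOf, cls included."""
    seen, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS_OF and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def types_of(resource, graph):
    """Declared types of a resource plus all their superclasses."""
    result = set()
    for s, p, o in graph:
        if s == resource and p == TYPE:
            result |= superclasses(o, graph)
    return result


def respects_typing(statement, graph):
    """Check a statement against the domain and range of its predicate."""
    s, p, o = statement
    domains = {c for prop, rel, c in graph if prop == p and rel == DOMAIN}
    ranges = {c for prop, rel, c in graph if prop == p and rel == RANGE}
    return domains <= types_of(s, graph) and ranges <= types_of(o, graph)


print(respects_typing(("ex:photo1", "ex:depicts", "ex:tree1"), triples))
print(respects_typing(("ex:tree1", "ex:depicts", "ex:photo1"), triples))
```

The first statement is accepted because `ex:photo1` is a photograph and therefore, through the class hierarchy, an image; the reversed statement is rejected because `ex:tree1` is not an image.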

Many tools and methodologies exist for the construction of ontologies [43–46]. They differ in the expressiveness of the knowledge model, the existence of an inference and query engine, the type of storage, the formalism generated and its compatibility with other formalisms, the degree of automation, consistency checking, and so on. However, building an ontology from scratch is a tedious and time-consuming task. To reduce this effort, several approaches for partially automating the knowledge acquisition process have been proposed. They use natural language analysis and machine learning techniques [47–49].

The contribution of data mining techniques to ontology construction can be seen from two points of view, depending on whether prior knowledge exists. In the first case, the ontology is constructed from text by semantic matching with an existing ontology or thesaurus. In the second case, the ontology is built from scratch, and the quality of the resulting ontology is very difficult to evaluate. Several similarity measures can be used in both cases. Clustering is then performed using a well-suited distance.
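As an illustration of clustering terms with a chosen distance, the sketch below groups terms by the Jaccard distance between their context sets, using single-link agglomerative merging up to a threshold. The distance, the linkage, the threshold, and the toy contexts are all assumptions; the source does not prescribe a particular measure.

```python
def jaccard_distance(a, b):
    """1 minus the ratio of shared to total elements of two sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def cluster_terms(contexts, max_distance):
    """Single-link agglomerative clustering of terms.

    contexts maps each term to the set of words it co-occurs with.
    Two clusters are merged while their closest members are within
    max_distance of each other.
    """
    clusters = [{term} for term in sorted(contexts)]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                closest = min(
                    jaccard_distance(contexts[a], contexts[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if closest <= max_distance:
                    clusters[i] |= clusters[j]
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break
    return clusters


# Hypothetical co-occurrence contexts extracted from a text corpus.
contexts = {
    "cat": {"pet", "fur", "animal"},
    "dog": {"pet", "fur", "animal", "bark"},
    "car": {"wheel", "engine"},
    "truck": {"wheel", "engine", "cargo"},
}
groups = cluster_terms(contexts, max_distance=0.4)
print(groups)
```

Each resulting group is a candidate concept; naming the concept and placing it in a hierarchy would still require either an existing thesaurus or human validation.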

Medianet [50] is an example of a multimedia knowledge base built with the help of Wordnet and including semantic and perceptual relationships. Several representations can be associated with a concept (text, image, video, audio); concepts are linked using semantic relationships (e.g., specialization) and perceptual relationships (e.g., similar shape). The construction is semiautomatic: the user has to specify which interpretation of an annotation given by Wordnet is correct. The knowledge base is built starting from a set of partially annotated images, visual feature extraction tools, and Wordnet.
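A knowledge base of this kind can be pictured as concepts that carry several media representations and typed links to other concepts. The sketch below is a hypothetical data structure inspired by that description, not Medianet's actual model; the class, field, and relation names are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A concept with media representations and typed relationships."""
    name: str
    # Media type ("text", "image", "video", "audio") -> list of examples.
    representations: dict = field(default_factory=dict)
    # Each link is (kind, relation, target), with kind "semantic" or "perceptual".
    links: list = field(default_factory=list)

    def add_link(self, kind, relation, target):
        self.links.append((kind, relation, target))

    def related(self, kind):
        """Return (relation, target) pairs for one kind of relationship."""
        return [(rel, target) for k, rel, target in self.links if k == kind]


# Hypothetical entries.
rose = Concept("rose")
rose.representations["text"] = ["rose"]
rose.representations["image"] = ["rose_001.jpg"]
rose.add_link("semantic", "specialization_of", "flower")
rose.add_link("perceptual", "similar_shape", "tulip")

print(rose.related("semantic"))
print(rose.related("perceptual"))
```

Keeping the kind of each link explicit lets a retrieval system follow semantic links when expanding a key word query and perceptual links when expanding a query by visual example.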

From the document Multimedia Data Mining and Knowledge Discovery (pages 193-196)