
Our proposal within a graph-based framework

In the document The DART-Europe E-theses Portal (pages 55-61)

Firstly, we aim to provide an alternative method for image modeling that can accommodate different types of image representations and different visual features.

One of our objectives is a model that can combine several points of view on an image. We are also motivated by the gap that remains between low-level feature models and high-level semantic ones. We therefore create an intermediate-level image representation layer, between the low-level features and the image semantics, consisting of concepts built from various visual features along with the spatial relations among them. Such a representation layer can describe the image content directly, for example, “building is to the left of the tree” or “cloud is on top of the building”.

Secondly, generative models have been around for decades and have been applied successfully to textual retrieval. These methods are both practical to implement and efficient in terms of computational cost. Moreover, an extension of the generative matching process already exists for complex knowledge representations such as conceptual graphs [Maisonnasse et al. 2008]. To the best of our knowledge, no one has tried to use generative methods for the graph matching process. In this regard, our second objective is to study the effects and benefits of using a probabilistic framework for matching graph-based image representations.

Therefore, our proposed graph-based framework includes the following original contributions to the current state of the art:

A unified graph-based representation for image modeling. Our goal is to automatically deduce, for each image, a visual graph representing the image content. To this end, image regions are automatically associated with visual concepts, and spatial relations are used to create links between these regions and keypoints. The frequencies of the visual concepts and of their relations are also captured as weights in our visual graphs.

The advantage of this model is that it offers an intuitive representation of image content. Moreover, by allowing the user to select the image representations (such as visual concepts) and the spatial relations to be considered, it can be more easily adapted to a particular image category.
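As an illustration only, the following minimal sketch shows one way such a weighted visual graph could be stored. It assumes the image has already been reduced to a regular grid of concept labels; the function name `build_visual_graph` and the relation names `left_of` and `top_of` are hypothetical choices that mirror the examples given earlier, not the notation used in the following chapters.

```python
from collections import Counter

def build_visual_graph(concept_grid):
    """Build a weighted visual graph from a 2-D grid of concept labels.

    Returns (concepts, relations): concepts maps each label to its frequency,
    relations maps (concept_a, relation, concept_b) triples to their frequency.
    """
    concepts = Counter()
    relations = Counter()
    for r, row in enumerate(concept_grid):
        for c, label in enumerate(row):
            concepts[label] += 1
            # Horizontal neighbour: this patch is to the left of the next one.
            if c + 1 < len(row):
                relations[(label, "left_of", row[c + 1])] += 1
            # Vertical neighbour: this patch is on top of the one below it.
            if r + 1 < len(concept_grid):
                relations[(label, "top_of", concept_grid[r + 1][c])] += 1
    return concepts, relations

# Toy 2x3 image with illustrative labels: sky above a building and a tree.
grid = [
    ["cloud", "cloud", "sky"],
    ["building", "building", "tree"],
]
concepts, relations = build_visual_graph(grid)
```

Here the counters play the role of the weights: a concept or relation that occurs more often in the image receives a larger count.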

A generative matching method using language modeling. To reduce the computational cost, we propose to use language modeling for the generative graph matching process. Unfortunately, the current conceptual language modeling framework is limited to a single set of concepts and a single set of relations [Maisonnasse et al. 2008]. Therefore, we extend the theory of this framework to take into account multiple concept sets and multiple relation sets. To do so, we have to make several independence assumptions about the concept sets and relation sets. We also propose a simple smoothing method for estimating the probabilities of concepts and relations in this framework.
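To make the idea concrete, here is a hedged sketch of how a query graph might be scored against a document graph under independence assumptions. It is not the formulation developed in this thesis: Jelinek-Mercer interpolation with a collection model is assumed as the smoothing method, relations are scored independently of the concepts they link, and the names `smoothed_prob`, `log_likelihood` and the parameter `lam` are hypothetical.

```python
import math
from collections import Counter

def smoothed_prob(item, doc_counts, coll_counts, lam=0.5):
    """Jelinek-Mercer estimate (assumed here): mix document and collection frequencies."""
    doc_total = sum(doc_counts.values())
    coll_total = sum(coll_counts.values())
    p_doc = doc_counts.get(item, 0) / doc_total if doc_total else 0.0
    p_coll = coll_counts.get(item, 0) / coll_total if coll_total else 0.0
    return (1 - lam) * p_doc + lam * p_coll

def log_likelihood(query_sets, doc_sets, coll_sets, lam=0.5):
    """Sum log-probabilities over every concept set and relation set.

    Each argument is a list of Counters, one per concept or relation set.
    Treating the sets as independent turns the product of probabilities
    into a sum of logarithms.
    """
    score = 0.0
    for q, d, coll in zip(query_sets, doc_sets, coll_sets):
        for item, count in q.items():
            p = smoothed_prob(item, d, coll, lam)
            if p == 0.0:
                return float("-inf")  # item never seen in the whole collection
            score += count * math.log(p)
    return score

# One concept set and one relation set per graph (toy counts).
top = ("cloud", "top_of", "building")
left = ("building", "left_of", "tree")
collection = [Counter(cloud=4, building=3, tree=3), Counter({top: 2, left: 2})]
doc_a = [Counter(cloud=2, building=2), Counter({top: 2})]
doc_b = [Counter(tree=3, building=1), Counter({left: 1})]
query = [Counter(cloud=1, building=1), Counter({top: 1})]

score_a = log_likelihood(query, doc_a, collection)
score_b = log_likelihood(query, doc_b, collection)
```

In this toy setting `doc_a` shares both the concepts and the relation of the query and therefore obtains the higher score, while smoothing keeps the score of `doc_b` finite even though it contains no `cloud` concept.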

3.7 Conclusion

To summarize, in this chapter we surveyed the current learning models, namely generative and discriminative approaches. The main theoretical aspects of language modeling, as inspired by information retrieval, were presented in section 3.3. Furthermore, we investigated different structured representations for image modeling, for instance conceptual graphs and attributed relational graphs. We also studied graph matching methods based on discriminative approaches (such as the embedding of paths and walks in kernel-based classification) and on generative approaches (such as Markov models and language modeling). Motivated by the limitations of the current state-of-the-art methods, we proposed a new approach based on a graph-based image representation and a generative process for graph matching.

The next part is devoted to the design of the proposed method. Chapter 4 explains how the framework works in three principal steps: image processing, graph modeling and graph retrieval. Chapter 5 details the graph formulation and the graph matching based on language modeling. We will give some examples to illustrate the constructed graphs and how we compute the likelihood for a pair of graphs.

Part II

Our Approach

Chapter 4

Proposed Approach

Design is not just what it looks like and feels like. Design is how it works.

Steve Jobs

4.1 Framework overview

Inspired by the bag-of-words model, images are modeled as a set of visual words (concepts) described and supported by different visual features and representations. As we explained previously, our goal is to automatically deduce, from a given image, a graph that represents the image content. Such a graph will contain concepts directly associated with the elements present in the image, as well as spatial relations that express how the concepts are related in the image.

The reason we have chosen a graph as the image representation is its capacity to embed complex symbolic relations and attributes of concepts (such as numerical values or probability estimates). In addition, with this representation we can apply an extension of language modeling, which is a generative probabilistic model, to the graph retrieval process.

To do so, we present in this section the system architecture, which consists of three main stages (see Figure 4.1).

1. Image processing aims at extracting image regions from the image (by segmentation, grid partition or saliency point detection). It also includes computing the numerical feature vectors (e.g., color, edge histogram, and local feature information) associated with the regions or saliency points.

2. Graph modeling consists of two main steps. First, extracted image regions that are visually similar are grouped into clusters using an unsupervised learning algorithm (e.g., k-means clustering). Each cluster is then associated with a visual concept. The second step consists of generating the spatial relations between the visual concepts. After these two steps, each image is represented by a visual graph generated from a set of visual concepts and a set of spatial relations among the visual concepts.

Figure 4.1: System architecture of the graph-based model for image retrieval.

3. Graph retrieval aims to retrieve the images relevant to a new image query. Query graphs are generated following the graph modeling step described above. Inspired by the language model for text retrieval, we extend this framework to match the query graph with the trained graphs from the database. Images are then ranked according to the probabilities of their corresponding graphs.
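The first stage and the clustering step of the second stage can be sketched as follows. This is a toy illustration under strong assumptions rather than the actual system: the image is a nested list of gray-level intensities, the feature vector is a hypothetical (mean, range) pair instead of color or edge histograms, and the small `kmeans` function is a plain deterministic implementation written for this example.

```python
def grid_partition(image, patch):
    """Split a 2-D list of pixel intensities into patch x patch blocks."""
    rows, cols = len(image), len(image[0])
    return [
        [[image[r + i][c + j] for i in range(patch) for j in range(patch)]
         for c in range(0, cols - patch + 1, patch)]
        for r in range(0, rows - patch + 1, patch)
    ]

def feature(pixels):
    """Toy feature vector: (mean, max - min) of the patch intensities."""
    return (sum(pixels) / len(pixels), max(pixels) - min(pixels))

def kmeans(points, k, iters=10):
    """Plain k-means, initialised deterministically with the first k points."""
    centers = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: index of the nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

# Toy 4x4 image: dark left half, bright right half.
image = [[0, 0, 200, 200] for _ in range(4)]
patches = grid_partition(image, 2)
features = [feature(p) for row in patches for p in row]
labels = kmeans(features, k=2)

# Reshape the flat cluster labels into a grid of visual concept identifiers.
ncols = len(patches[0])
concept_grid = [labels[i:i + ncols] for i in range(0, len(labels), ncols)]
```

Each cluster index stands for one visual concept, so `concept_grid` is the kind of input from which spatial relations between neighbouring concepts can then be generated.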

Indeed, these three phases are clearly distinct from each other. They can be associated with the three layers of Marr's classical paradigm in machine vision, as introduced in chapter 2: the processing layer (1), the mapping layer (2), and the high-level interpretation layer (3). Our contributions mainly concern the graph modeling and graph retrieval problems. In the graph modeling step, we propose a unified graph-based framework for image representation. We then propose a graph matching algorithm based on an extension of the language model that was initially proposed in the information retrieval community. We describe these steps in the following sections.
