CONTENT-BASED IMAGE RETRIEVAL, MULTIRESOLUTION
ANALYSIS AND LOGISTIC REGRESSION MODELS
by
Riadh Ksantini
Thesis submitted to the Département d'informatique
in partial fulfillment of the requirements for the degree of philosophiae doctor (Ph.D.)
FACULTÉ DES SCIENCES
UNIVERSITÉ DE SHERBROOKE
Library and Archives Canada, Published Heritage Branch
395 Wellington Street, Ottawa ON K1A 0N4, Canada
ISBN: 978-0-494-42627-2
NOTICE:
The author has granted a non-exclusive license allowing Library and Archives Canada to reproduce, publish, archive, preserve, conserve, communicate to the public by telecommunication or on the Internet, loan, distribute and sell theses worldwide, for commercial or non-commercial purposes, in microform, paper, electronic and/or any other formats.
The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
In compliance with the Canadian Privacy Act some supporting forms may have been removed from this thesis.
While these forms may be included in the document page count, their removal does not represent any loss of content from the thesis.
October 5, 2007
The jury accepted the thesis of Mr. Riadh Ksantini in its final version.
Jury members:
Mr. François Dubeau, Supervisor, Département de mathématiques
Mr. Djemel Ziou, Co-supervisor, Département d'informatique
Mr. Richard Egli, Member, Département d'informatique
Mr. Hamid Krim, External member, Electrical and Computer Engineering Department, NC State University
Mr. Bernard Colin, President-rapporteur, Département de mathématiques
SUMMARY
This thesis presents our contributions to content-based image retrieval using multiresolution analysis, together with linear and nonlinear classification. In the first part, we propose a simple and fast content-based image retrieval method. To represent color images, we introduce new feature descriptors, namely histograms weighted by the multispectral gradient. To measure the similarity degree between two images quickly and effectively, we use a weighted pseudo-metric that relies on the wavelet decomposition and compression of the histograms extracted from the images. The weights of the pseudo-metric are adjusted using the classical logistic regression model in order to improve its discriminatory power and the retrieval accuracy. In the second part, we propose a new Bayesian logistic regression model based on a variational method. A comparison of this new model with the classical logistic regression model is carried out in the image retrieval setting. We then show that, compared with the classical model, the Bayesian model yields a marked improvement in the discriminatory power of the pseudo-metric and in the retrieval accuracy. In the third part, we detail the derivation of the new Bayesian logistic regression model based on a variational method, and we compare this model with the classical logistic regression model as well as with other linear classifiers from the literature. We then compare our retrieval method, which uses the Bayesian logistic regression model, with other previously published retrieval methods. In the fourth part, we introduce feature selection to improve our retrieval method based on the model introduced above. Indeed, feature selection automatically gives more importance to the most discriminative features and less importance to the least discriminative ones. Finally, in the fifth part, we propose a new kernel-based Bayesian logistic discriminant analysis model, which allows flexible nonlinear classification.
ACKNOWLEDGEMENTS
I first wish to express my gratitude to my research supervisor, Professor François Dubeau of the Université de Sherbrooke, for having agreed to direct my work. His guidance, his sound advice and his availability were key to the successful conduct of my research projects.
I would also like to thank my co-supervisor, Professor Djemel Ziou of the Université de Sherbrooke, for the fruitful discussions I had with him, as well as for his judicious criticism and relevant suggestions.
I would also like to thank Professor Bernard Colin of the Université de Sherbrooke for the fruitful discussions I had with him, as well as for his judicious criticism and relevant suggestions.
My thanks go to the Bell University Laboratories, Canadian Heritage, the ISM, the Natural Sciences and Engineering Research Council of Canada (NSERC), as well as the Faculté des sciences and my supervisor and co-supervisor, for their financial support.
A special thought goes to my colleagues of the MOIVRE group at the Université de Sherbrooke for their scientific support and the excellent working atmosphere. I must also acknowledge the contribution of research professional Karim Hamel, who helped me with the implementation of one of my works.
Finally, I wish to thank all the members of my family and all those close to me for their encouragement and support. Special thanks go to my mother for her help and unfailing support, without which I could not have pursued my studies.
Riadh Ksantini
Sherbrooke, July 2007
TABLE OF CONTENTS
SUMMARY
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
INTRODUCTION
CHAPTER 1 — Image retrieval based on region separation and multiresolution analysis
CHAPTER 2 — Weighted pseudo-metric for a fast content-based image retrieval method
CHAPTER 3 — Improving the discriminatory power of a pseudo-metric using a Bayesian logistic regression model based on a variational method
CHAPTER 4 — Logistic regression models for a fast content-based image retrieval method based on feature selection
CHAPTER 5 — Kernel-based Bayesian logistic discriminant analysis model: an improvement to kernel-based Fisher discriminant analysis
INTRODUCTION
With the recent developments in data production, transmission and processing techniques, there has been an explosion in the quantity and complexity of the data generated every day. These data come in different formats, including text, images, sound, video and multimedia. They may be found in databases, in unorganized collections, or on the World Wide Web. As an example, the number of Web pages indexed by Google currently exceeds 4 billion. However, this abundance of information does not have only positive impacts. The great paradox people face today is that more and more data are available on any specific subject, while it is increasingly difficult to locate the relevant information within a reasonable time. This has given rise to a new need: inventing tools that help people locate the information they want. These tools are information retrieval engines. We can therefore define a search engine as a tool to which the user submits a query, textual or otherwise, and which searches a data collection for all the items matching that query. The first search engines to appear, and to attract much interest among both researchers and users, were text search engines. The fact that Web pages such as those of Google or Yahoo! are among the most visited pages on the Internet illustrates the importance and usefulness of such tools. However, despite the amount of visual information in databases and on the Web, few people have addressed the image retrieval problem, and most existing engines are primitive, with rather limited performance.
In image retrieval, the first approach to appear has its origins in text retrieval algorithms. To make images searchable, they are first annotated with text, and text retrieval techniques are then applied to find them. This approach, known as "text-based image retrieval", dates back to the 1970s and comes from the database management community. Even though a few commercial systems such as Google image search and AltaVista photo finder have adopted it, this technique suffers from several limitations. First, many image collections are not annotated with text, and annotating them manually can prove tedious and very costly. Second, even when a collection is annotated, the annotation is generally done by humans and can therefore be subjective: two different people may use different terms to annotate the same image. Moreover, relying exclusively on text is often insufficient, especially when users are interested in the visual components of the image, which can hardly be described in words. Indeed, an image may contain several objects, and each object may have a long list of attributes, which defies description with words.
These limitations led people to think about another solution, which consists in "letting the images describe themselves". This gave rise to a second approach based on the visual features of images, such as color and texture. This approach, known as "content-based image retrieval" (CBIR), was proposed in the early 1990s and comes from the computer vision community. Several recent search engines have adopted it, such as QBIC, SIMPLIcity and Cires. The first content-based search engines required the user to select the visual features of interest and to provide numerical values for each of them. However, for various reasons, it is generally difficult for the user to specify the values of visual features explicitly. First of all, a user may be unaware of the details of imaging and its jargon: imagine, for example, a user asked to choose between the "Gabor filter" and "wavelets"! Moreover, it is difficult, even for an imaging specialist, to translate the images he has in mind into a combination of features and numerical values. People therefore started thinking about another alternative, and the solution they adopted consists in letting the user specify the features of interest implicitly, through a paradigm known as "query by example" (QBE). Using the interface offered by the engine, the user chooses a query image, and the engine then scans the data collection, extracting all the images that resemble this query. More precisely, the query chosen by the user and the database images are first represented by feature vectors. Then, the similarity between the query image and a target image is measured by a metric computed between their feature vectors. Finally, the target images closest to the query image, in the sense of the metric, are returned to the user. The metric must be flexible enough to account for distortions of the query with respect to the target, and also fast enough to be usable on large databases.
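The QBE loop just described (feature vectors, a metric, top-ranked targets) can be sketched as follows. This is a minimal illustration only: the plain Euclidean distance, the array shapes and the function name are assumptions, not the weighted wavelet-based pseudo-metric developed in this thesis.

```python
import numpy as np

def qbe_retrieve(query_vec, featurebase, k=5):
    """Return indices of the k database images whose feature vectors
    are closest to the query vector (Euclidean metric for illustration)."""
    dists = np.linalg.norm(featurebase - query_vec, axis=1)
    return np.argsort(dists)[:k]

# Toy featurebase: 8 images, each described by a 4-dimensional vector.
rng = np.random.default_rng(0)
featurebase = rng.random((8, 4))
query = featurebase[3] + 0.01  # a slightly distorted copy of image 3
top = qbe_retrieve(query, featurebase, k=3)
print(top)  # image 3 should rank first
```

In a real system the feature vectors would be precomputed offline and stored in the featurebase, so that only the distance computations run at query time.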
In this thesis, our objective is to develop a content-based image retrieval method that is both effective and fast. To this end, we use several tools, such as wavelets, new feature descriptors or feature vectors, a specific data structure, and linear and nonlinear classification. We have therefore made contributions to content-based image retrieval, feature description, and linear and nonlinear classification. In Chapter 1 of the thesis, we propose a simple and fast content-based image retrieval method. To represent color images, we introduce new feature descriptors, namely histograms weighted by the multispectral gradient. To measure the similarity degree between two images quickly and effectively, we use a weighted pseudo-metric that relies on the wavelet decomposition and compression of the histograms extracted from the images. The weights of the pseudo-metric are adjusted using the classical logistic regression model in order to improve its discriminatory power and the retrieval accuracy. In Chapter 2, we propose a new Bayesian logistic regression model based on a variational method. A comparison of this new model with the classical logistic regression model is carried out in the image retrieval setting. We then show that, compared with the classical model, the Bayesian model yields a marked improvement in the discriminatory power of the pseudo-metric and in the retrieval accuracy. In Chapter 3, we detail the derivation of the new Bayesian logistic regression model based on a variational method, and we compare this model with the classical logistic regression model as well as with other linear classifiers from the literature. We then compare our retrieval method, which uses the Bayesian logistic regression model, with other previously published retrieval methods. In Chapter 4, we introduce feature selection to improve our retrieval method based on the model introduced above. Indeed, feature selection automatically gives more importance to the most discriminative features and less importance to the least discriminative ones. Finally, in Chapter 5, we propose a new kernel-based Bayesian logistic discriminant analysis model, which allows flexible nonlinear classification.
In all of the articles that follow, the implementation of the contributions throughout was carried out by the principal author, Riadh Ksantini, with the invaluable help of Professors François Dubeau, Djemel Ziou and Bernard Colin.
CHAPTER 1
Image retrieval based on region separation and multiresolution analysis
In this chapter, we propose a simple and fast content-based image retrieval method. Using the multispectral gradient, a color image is split into two disjoint parts: the homogeneous color regions and the edge regions. The homogeneous regions are represented by the traditional color histograms, and the edge regions are represented by histograms of the means of the multispectral gradient moduli computed at each pixel of the color image. To measure the similarity degree between two color images quickly and effectively, we use a weighted pseudo-metric based on the Daubechies-8 wavelet decomposition and the compression of the extracted histograms. The weights of the pseudo-metric are adjusted with the classical logistic regression model to improve its discriminatory power and the retrieval accuracy. Our retrieval method is invariant to object translations and to the color intensities in the images. The experiments were carried out on a collection of 10,000 color images.
In the following pages, we present an article entitled Image Retrieval Based on Region Separation and Multiresolution Analysis, published in the March 2006 issue of the International Journal of Wavelets, Multiresolution and Information Processing (IJWMIP). A preliminary version of the article was presented at the Conférence Internationale en Recherche Opérationnelle (CIRO'05), Marrakech, Morocco, 2005.
Image Retrieval Based on Region Separation and Multiresolution Analysis
R. Ksantini (1), D. Ziou (1), and F. Dubeau (2)
(1) Département d'informatique, Faculté des sciences, Université de Sherbrooke, Sherbrooke, Qc, Canada J1K 2R1. Email: riadh.ksantini@usherbrooke.ca, djemel.ziou@usherbrooke.ca
(2) Département de mathématiques, Faculté des sciences, Université de Sherbrooke, Sherbrooke, Qc, Canada J1K 2R1. Email: francois.dubeau@usherbrooke.ca
Keywords: Multispectral Gradient, Region Separation, Multiresolution Analysis, Pseudo-metric, Logistic Regression, Color Image Retrieval.
Abstract
In this paper, a simple and fast querying method for content-based image retrieval is presented. Using the multispectral gradient, a color image is split into two disjoint parts which are the homogeneous color regions and the edge regions. The homogeneous regions are represented by the traditional color histograms, and the edge regions are represented by multispectral gradient module mean histograms. In order to measure the similarity degree between two color images both quickly and effectively, we use a one-dimensional pseudo-metric, which makes use of the one-dimensional Daubechies decomposition and compression of the extracted histograms. Our querying method is invariant to the query color image object translations and color intensities. The experimental results are reported on a collection of 10000 LAB color images.
1 Introduction
The rapid expansion of the Internet and the wide use of digital data in many real-world applications in the fields of medicine, weather prediction, communications, commerce and academia have increased the need for both efficient image database creation and retrieval procedures. For this reason, the content-based image retrieval (CBIR) approach was proposed [19], [2]. In this approach, the first step is to compute for each database image a feature vector capturing certain visual features of the image, such as color, texture and shape. This feature vector is stored in a featurebase; then, given a query image chosen by a user, its feature vector is computed and compared to the featurebase feature vectors by a distance metric or a similarity measure, and finally the database images most similar to the query image are returned to the user. In order to characterize local image properties effectively, and to increase data storage efficiency and querying execution speed in the CBIR field, a wavelet-based indexing approach was introduced. Wavelet transforms are proven to have the advantage of allowing better resolution in time and frequency. Consequently, they have received much attention as a tool for developing CBIR systems. In the following we review some of the CBIR systems based on wavelet-domain feature extraction.
Related work:
Jacob et al. [5] proposed a fast image querying algorithm in databases ranging in size from 1093 to 20,558 color images. RGB, HSV, and YIQ color spaces are chosen separately to represent the database color images before the querying. Each database color image component is Haar wavelets decomposed. Dominant coefficients of this decomposition are retained to represent the spatial information and color visual features of the color image. The similarity degree between a query and potential targets is measured by a weighted metric which compares how many significant wavelet coefficients they have in common.
Wang et al. [14] proposed the WBIIS querying system on a database of 10,000 RGB color images. All database color images undergo a four-stage Daubechies-8 wavelet decomposition. The lower-frequency bands of each decomposed color image represent the object configurations in the image, and the higher-frequency bands represent the texture and local color variations. The similarity degree between a query and potential targets is first measured by comparing the variances of their lowpass-band coefficients; these coefficients are then compared using a Euclidean distance. Finally, a weighted Euclidean distance is used to compare the query with the remaining color images' lowest-resolution subimages representing the lowpass bands, horizontal bands, vertical bands, and diagonal bands. For database color images, this procedure is repeated on all three color channels.
Kuo et al. [15] proposed the WaveGuide querying system on a database of 2127 YUV color images. Each database color image is wavelet-packet decomposed and wavelet-pyramid decomposed to extract texture, shape and color features, respectively. Both wavelet transforms are followed by a successive approximation quantization (SAQ), which uses a sequence of thresholds in order to identify relevant and irrelevant wavelet coefficients and their locations in the subbands of each wavelet-transformed color image. The texture descriptor is extracted from the significant coefficients in the subbands of the wavelet-packet transformed image Y-component. The color descriptor is extracted from the three color image components with respect to the twelve SAQ thresholds of the wavelet-pyramid transformed image. The shape descriptor is extracted from the significant coefficients of the first three scale vertical, horizontal and diagonal subbands of the wavelet-pyramid transformed color image Y-component. The texture, color and shape similarities between a query color image and the database color images are defined using the L1 distance.
N. Khelil and A. Benazza-Benyahia [17] suggested a method for image retrieval in a database of 2815 multispectral SPOT3 images and in a database of 2815 multispectral SPOT4 images. The components of each database color image are decomposed separately according to the lifting scheme (second-generation wavelet transform) [1] through the 5/3 transform. The wavelet coefficients of a 5/3-transformed database color image represent the salient features of the image. The wavelet coefficients related to all components at a given resolution level of each database 5/3-transformed color image are merged into a common subband, whatever the transform orientations, and this subband is then modeled by a zero-mean Generalized Gaussian Distribution (GGD). The similarity degree between the query image and the database color images is measured by a weighted metric combining the symmetrical Kullback-Leibler distance, which is used to evaluate how different two GGDs are, with a second-order distance computed between their scaling coefficient variances.
Our approach is based on the use of the multispectral gradient in order to separate the homogeneous regions from the edge regions of each database color image, and to represent each region by feature vectors which are weighted histograms. The weighted histograms representing the homogeneous regions are color histograms constructed after edge-region elimination, and the weighted histograms representing the edge regions are multispectral gradient module mean histograms. In the querying, only very few dominant coefficients of the wavelet-decomposed versions of these weighted histograms are considered, which yields effective querying at a lower querying computational complexity. Among all kinds of wavelets, Daubechies-8 wavelets are proven to have good frequency properties and to be good for 1-D signal synthesis [14]; therefore, Daubechies-8 wavelets are chosen in our approach. In order to measure the similarity degree between two color images, we use the one-dimensional version of the weighted metric proposed by Jacob et al. [5], which makes use of the compressed and quantized versions of the Daubechies-8 wavelet-decomposed histograms. In order to discriminate most effectively, the metric weights are adjusted using the standard logistic regression model. Our querying method does not suffer from query-image object translation variance, and the querying is invariant to the color intensities of the query image, thanks to a modification of the multispectral gradient module mean histograms. We apply our retrieval method by representing our database color images in the LAB color space, because it is a perceptually uniform color space that describes color by just two coordinates.
The difference between the related-work approaches and the approach developed in this paper is that, in order to reduce the computational complexity and to increase data storage efficiency, our approach is based on the wavelet decomposition and compression of the feature vectors themselves, instead of the database color images. A variety of heuristic histogram similarity measures has been proposed in the literature in the context of image retrieval [18]. However, these similarity measures do not handle wavelet-decomposed and compressed histograms; in fact, they are computed over the totality of the histogram bins. Consequently, they are more expensive to compute, especially when we have a large database. For these reasons, in our application we use the one-dimensional version of the weighted metric proposed by Jacob et al. [5].
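To make the decompose-compress-match pipeline concrete, here is a minimal sketch. Several simplifications are assumptions: a plain Haar transform stands in for the Daubechies-8 decomposition, kept coefficients are quantized to their sign, and matches are scored with one uniform weight in the spirit of Jacob et al.'s metric [5] rather than the trained logistic-regression weights.

```python
import numpy as np

def haar_decompose(h, levels=3):
    """Stand-in for the Daubechies-8 decomposition: repeated Haar
    averaging/differencing of a 1-D histogram (length must be
    divisible by 2**levels). Returns coarse coefficients first."""
    h = h.astype(float)
    details = []
    for _ in range(levels):
        approx = (h[0::2] + h[1::2]) / np.sqrt(2)
        details.append((h[0::2] - h[1::2]) / np.sqrt(2))
        h = approx
    return np.concatenate([h] + details[::-1])

def compress(coeffs, m=8):
    """Keep only the m largest-magnitude coefficients, quantized to sign."""
    keep = np.argsort(np.abs(coeffs))[-m:]
    q = np.zeros_like(coeffs)
    q[keep] = np.sign(coeffs[keep])
    return q

def score(q_query, q_target, w=1.0):
    """Weighted count of significant coefficients that match in
    position and sign (higher means more similar)."""
    return w * np.sum((q_query != 0) & (q_query == q_target))

hist = np.bincount(np.random.default_rng(1).integers(0, 32, 500), minlength=32)
qa = compress(haar_decompose(hist))
qb = compress(haar_decompose(hist + 1))  # a near-identical histogram
print(score(qa, qb))
```

Only the few quantized coefficients per histogram need to be stored and compared, which is what keeps the querying cheap on a large database.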
In the next section, we explain how we construct the homogeneous- and edge-region histograms. In the third section, we briefly explain the Daubechies-8 wavelet decomposition and the compression of a one-dimensional image. In section 4, we define the one-dimensional version of the metric proposed by [5], we explain the one-dimensional image or feature vector querying algorithm, and we describe the logistic regression model and the training performed to adjust the metric weights. In section 5, we present the color image querying method. Finally, in section 6, we perform some experiments to evaluate our querying method and to show the querying improvement.
2 Color images and histogram decomposition
In this section, we present the advantages of the LAB color space and we explain how we represent our database color images in this space. We also explain the separation between the homogeneous color regions and the edge regions, using the multispectral gradient. Then, we define the two feature vectors, namely the traditional color histograms constructed without considering the edge regions, and the multispectral gradient module mean histogram.
2.1 Color images and color space
In order to extract color features from a color image, we need to choose a color space in which to represent it. We decided to use the LAB color space. Indeed, the LAB color space is approximately perceptually uniform [11], [7], [3]: it maps equally perceived color differences into approximately equal Euclidean distances in space. It also allows a good separation between the luminance and the colors. In this space, L defines the luminance, with values from 0 for black to 100 for a perfectly white body; A denotes the red/green chrominance, with values from -200 for green to 200 for red; and B denotes the yellow/blue chrominance, with values from -200 for blue to 200 for yellow. In the LAB color space, each color image pixel is represented by a vector (L, a, b), where L is the luminance, a is the red/green component and b is the yellow/blue component. In our application, each LAB color image is numerically represented by three matrices I_L, I_a and I_b, containing the pixel intensities of the luminance, red/green and yellow/blue, respectively. To simplify, we use a linear interpolation to represent the intensities of each LAB color image component between 0 and 255.
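The linear interpolation of each component onto [0, 255] can be sketched as follows; the function name and the exact rounding are assumptions, and the nominal ranges are those stated above.

```python
import numpy as np

def rescale(channel, lo, hi):
    """Linearly map values in [lo, hi] to the integer range [0, 255]."""
    return np.rint((channel - lo) * 255.0 / (hi - lo)).astype(np.uint8)

# Nominal LAB ranges used here: L in [0, 100], a and b in [-200, 200].
L = np.array([[0.0, 50.0, 100.0]])
a = np.array([[-200.0, 0.0, 200.0]])
I_L, I_a = rescale(L, 0, 100), rescale(a, -200, 200)
print(I_L.tolist(), I_a.tolist())  # [[0, 128, 255]] [[0, 128, 255]]
```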
2.2 Weighted histograms
The luminance histogram and the color histograms represent how the pixels of an image are distributed in the LAB channels. Given any LAB color image, its luminance histogram h_L contains the number of pixels of luminance L, and its color histograms h_a and h_b contain the number of pixels of the chrominances a (red/green) and b (yellow/blue), respectively. Therefore, the three histograms of an M x N pixel LAB color image can be written as follows:

h_L(c) = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \delta(I_L(i,j) - c),
h_a(c) = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \delta(I_a(i,j) - c),   for each c \in \{0, \ldots, 255\},   (1)
h_b(c) = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \delta(I_b(i,j) - c),

where \delta is the Kronecker symbol at 0, defined by

\delta(x - c) = 1 if x = c, and 0 otherwise.   (2)
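Equation (1) amounts to per-channel bin counting; a minimal sketch, assuming the components are already rescaled to integers in [0, 255]:

```python
import numpy as np

def lab_histograms(I_L, I_a, I_b):
    """Per-channel 256-bin histograms of an M x N LAB image (eq. 1):
    each pixel increments the bin of its intensity by one."""
    return tuple(np.bincount(ch.ravel(), minlength=256)
                 for ch in (I_L, I_a, I_b))

img = np.random.default_rng(2).integers(0, 256, (4, 6, 3), dtype=np.int64)
hL, ha, hb = lab_histograms(img[..., 0], img[..., 1], img[..., 2])
print(hL.sum())  # 24, i.e. every pixel of the 4 x 6 image counted once
```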
LAB histograms have been widely used in many content-based image retrieval systems with some success [16]. They provide only the distribution of the luminance and the color; that is why histogram-based color retrieval techniques suffer from a lack of important spatial knowledge.
In order to overcome these drawbacks, spatial information should be integrated. Several recently proposed approaches augment the color histogram with some spatial information. Examples include the Laplacian-weighted histogram proposed by [6], the color coherence vector (CCV) proposed by [9] and enhanced by [10], and the color correlogram proposed by [13]. In our case, we will use the multispectral gradient module mean histogram. The multispectral gradient changes from one color image to another having different edge shapes, which increases the discriminative power of the querying. The multispectral gradient module mean histogram is inspired by the Laplacian-weighted histogram proposed by [6]. The latter is a good tool for distinguishing the pixels located in the neighbourhoods of a color image's edges. However, it is noisy, because it is based on the second derivatives of the color image components, and it can also represent a false feature when the color image contains several staircase edges [12].
In a color image, the number of pixels belonging to the homogeneous regions is far greater than the number of edge pixels. The edge pixels therefore have a negligible influence on the color histogram shape; their statistical importance becomes insignificant, rendering their effect on the querying negligible. A solution to this problem is to separate the homogeneous regions from the edge regions. In our application, this separation is performed only on the chrominance images. That is why we preserve the luminance histogram and introduce weighted color histograms which combine the color distribution with its spatial properties.
The weighted color histograms for each region are

h_k^w(c) = Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} δ(I_k(i,j) − c) W_k^w(i,j),  for each c ∈ {0,...,255} and k = a, b,  (3)

where W_k^w is the weight and

w = h for homogeneous regions, w = e for edge regions.  (4)

In [6], the weight W_k(i,j) = ΔI_k(i,j)^2 for each k = a, b, where ΔI_k(i,j) is the laplacian of the image I_k at the pixel (i,j).
2.3 Separation of homogeneous and edge regions
According to [4], color edge detection is based on finding local maxima in the first directional derivative of the vector-valued color image. The magnitude of the strongest change of the vector-valued image, which represents the multispectral gradient module, coincides with the largest eigenvalue of the matrix J^T J, denoted by λ_max, where J is the Jacobian matrix of the vector-valued image. In our case, the Jacobian matrix of the chrominance images at each pixel (x,y) is given by
J = ( ∂I_a(x,y)/∂x   ∂I_a(x,y)/∂y
      ∂I_b(x,y)/∂x   ∂I_b(x,y)/∂y ),  (5)

where ∂I_a(x,y)/∂x, ∂I_a(x,y)/∂y, ∂I_b(x,y)/∂x and ∂I_b(x,y)/∂y are the first partial derivatives of the a and b images, respectively. Consequently, the matrix J^T J at each pixel (x,y) is defined by
J^T J = ( a_11(x,y)   a_12(x,y)
          a_21(x,y)   a_22(x,y) ),  (6)

where

a_11(x,y) = (∂I_a(x,y)/∂x)^2 + (∂I_b(x,y)/∂x)^2,
a_22(x,y) = (∂I_a(x,y)/∂y)^2 + (∂I_b(x,y)/∂y)^2,
a_12(x,y) = a_21(x,y) = ∂I_a(x,y)/∂x · ∂I_a(x,y)/∂y + ∂I_b(x,y)/∂x · ∂I_b(x,y)/∂y.

Therefore, the largest eigenvalue λ_max of J^T J, which represents the strongest change of the vector-valued image
at each pixel (x,y), is given by
λ_max(x,y) = (1/2) ( a_11(x,y) + a_22(x,y) + sqrt( (a_11(x,y) − a_22(x,y))^2 + 4 a_12(x,y)^2 ) ).  (7)
A pixel is considered to be in an edge region if the λ_max computed over it is greater than a given threshold η, and in a homogeneous region otherwise. In our application, we simply use a threshold defined by the mean of the largest eigenvalues computed over all pixels. Explicitly,
η = (1/(M N)) Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} λ_max(i,j),  (8)

where M and N are respectively the length and the width of the color image. Let us remark that other thresholding strategies are also possible.
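The edge/homogeneous separation of equations (5)-(8) can be sketched with numpy as follows; the function name and the use of np.gradient for the first partial derivatives (including its one-sided handling of image borders) are our own assumptions, not the thesis's implementation.

```python
import numpy as np

def edge_mask(Ia, Ib):
    """Classify each pixel of the chrominance images as edge (True) or
    homogeneous (False) from the largest eigenvalue of J^T J, thresholded
    by its mean over the image, as in equations (5)-(8)."""
    # First partial derivatives of the a and b chrominance images.
    day, dax = np.gradient(Ia.astype(float))
    dby, dbx = np.gradient(Ib.astype(float))
    # Entries of the 2x2 matrix J^T J at each pixel.
    a11 = dax ** 2 + dbx ** 2
    a22 = day ** 2 + dby ** 2
    a12 = dax * day + dbx * dby
    # Largest eigenvalue lambda_max: the multispectral gradient module.
    lmax = 0.5 * (a11 + a22 + np.sqrt((a11 - a22) ** 2 + 4 * a12 ** 2))
    eta = lmax.mean()  # threshold eta of equation (8)
    return lmax > eta, lmax
```

On a synthetic chrominance image with a single vertical step edge, the mask is True exactly on the two columns straddling the step and False in the flat regions.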
2.4 Homogeneous regions: separation of modes
For these regions the weight W_k^h(i,j) in the formula (3) is given by W_k^h(i,j) = χ_[0,η]( λ_max(i,j) ), for each k = a, b. Therefore, the weighted histograms are

h_k^h(c) = Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} δ(I_k(i,j) − c) χ_[0,η]( λ_max(i,j) ),  (9)

for each c ∈ {0,...,255} and k = a, b, and where χ_E is the characteristic function of the set E, defined by

χ_E(x) = 1 if x ∈ E, 0 otherwise.  (10)
In some images, edge pixels can cause overlaps or noise between the color histogram populations. A consequence of not considering edge pixels in the formula (9) is the avoidance of these overlaps and this noise. The following figure gives an example of the separation between two modes of a histogram.
Figure 1: Separation between two histogram modes: a) Color image, b) The a component histogram before edge region elimination and c) The a component weighted histogram.
2.5 Edge regions
For these regions the weight W_k^e(i,j) in the formula (3) is given by W_k^e(i,j) = χ_]η,+∞[( λ_max(i,j) ) λ_max(i,j), for each k = a, b. Therefore, the weighted color histograms are the multispectral gradient module weighted histograms, given by

h_k^e(c) = Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} δ(I_k(i,j) − c) χ_]η,+∞[( λ_max(i,j) ) λ_max(i,j),  (11)

for each c ∈ {0,...,255} and k = a, b. Thanks to the term χ_]η,+∞[( λ_max(i,j) ), the weighted histogram h_k^e(c) takes into account the high values of the multispectral gradient modules. Thus, it provides information about the overall contrast in the chrominances.
According to the formula (11), two color images having the same colors and object shapes, but different object sizes, can have different multispectral gradient module weighted histograms. To overcome this drawback, we can consider the means of the multispectral gradient modules. In fact, the means represent the global gradient module values of a chrominance. For each chrominance, the multispectral gradient module mean histogram is given by
h̄_k^e(c) = h_k^e(c) / N_{P,k}(c),  for each c ∈ {0,...,255} and k = a, b,  (12)

where N_{P,k}(c) is the number of the edge region pixels with value c, defined as

N_{P,k}(c) = Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} δ(I_k(i,j) − c) χ_]η,+∞[( λ_max(i,j) ),  (13)

for k = a, b.
3 Wavelet decomposition, compression and quantization of a histogram
3.1 Histogram and multiresolution analysis
A histogram is a (1D) image or signal supported by 2^J pixels (J ∈ N) and represented by a sequence of coefficients {I_i^J}_{i=0}^{2^J−1}. In order to analyse the histogram we use the Daubechies-8 wavelets. In fact, Daubechies-8 wavelets are continuous functions which can analyse continuous (1D) signals efficiently. Also, Daubechies-8 wavelets are a good compromise between computational time and performance, since they have eight-coefficient overlapping filters. Furthermore, they have four vanishing moments, which produce a marked contrast in wavelet coefficient sizes between smooth and non-smooth sections of the (1D) signal.
The Daubechies-8 scaling function is compactly supported by the interval [0,7] and its values can be calculated thanks to a given initial value and to a recurrence relation given by

φ(x) = Σ_{i=0}^{7} h_i φ(2x − i),  (14)

where (h_0 = −0.014, h_1 = 0.046, h_2 = 0.043, h_3 = −0.264, h_4 = −0.039, h_5 = 0.890, h_6 = 1.009, h_7 = 0.325) is the lowpass filter.
The Daubechies-8 scaling function φ serves as the basic building block for its associated Daubechies-8 wavelet function, denoted by ψ, and defined by the following recursion

ψ(x) = Σ_{i=−6}^{1} (−1)^i h_{1−i} φ(2x − i).  (15)

In order to ensure that φ and ψ are compactly supported by the same interval [0,7] and are equal to zero outside it, we shift the Daubechies-8 wavelet function ψ from x to x − 3. The scaled, translated and normalized versions of φ are denoted by φ_i^j(x) = √(2^{−(J−j)}) φ(2^{−(J−j)} x − i), and the shifted versions of ψ, denoted by ψ_i^j(x − 3), are defined analogously.
For the finest resolution level, we introduce a vector space V^J which is the set of all possible linear combinations of the shifted versions of the Daubechies-8 scaling function. With implicit periodicity considerations, we can write

V^J = Lin{ φ_i^J : i = 0, ..., 2^J − 1 }.  (16)
By supposing that the (1D) signal I ∈ V^J, we can approximate it as follows

I(x) ≈ Σ_{i=0}^{2^J−1} I_i^J φ_i^J(x).  (17)
We define the vector subspace W^{J−1} = Lin{ ψ_i^{J−1} : i = 0, ..., 2^{J−1} − 1 } to be the orthogonal complement of V^{J−1} in V^J. Explicitly,

V^J = V^{J−1} ⊕ W^{J−1}.  (18)
The Daubechies-8 wavelet transform decomposes the 2^J pixel (1D) signal into its components for J different scales. It consists of passing successively from the space V^J to the space V^0, while generating through this decomposition the spaces {W^k}_{k=0}^{J−1}. So we obtain

V^J = V^0 ⊕ ( ⊕_{k=0}^{J−1} W^k ).  (19)
Consequently, we can rewrite the 2^J pixel (1D) signal in the Daubechies-8 basis as follows

I(x) ≈ I_0^0 φ_0^0(x) + Σ_{j=0}^{J−1} Σ_{k=0}^{2^j−1} I_k^j ψ_k^j(x − 3),  (20)
where I_0^0 is the overall average of the (1D) signal, called the scaling factor. The Daubechies-8 wavelet transform decomposes a histogram having 256 pixels into its components for 8 different scales. Consequently, the smooth components and the detailed components of the histogram are readily separated. The resulting scaling factor represents the average of the histogram Y-coordinate magnitudes. Therefore, the scaling factor of a Daubechies-8 wavelet decomposed histogram of a LAB color image component represents the average of the overall intensity of the latter.
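As a concrete sketch, the full decomposition of a 2^J-bin histogram can be implemented with the lowpass filter of (14) and its quadrature-mirror highpass filter. The function names and the periodic boundary handling are our own assumptions, not the thesis's implementation, and the filter values are the rounded ones listed above.

```python
import numpy as np

# Lowpass filter from the recurrence phi(x) = sum_i h_i phi(2x - i);
# dividing by sqrt(2) yields the orthonormal analysis filter.
H = np.array([-0.014, 0.046, 0.043, -0.264, -0.039, 0.890, 1.009, 0.325]) / np.sqrt(2)
# Quadrature-mirror highpass filter associated with H.
G = np.array([(-1) ** k * H[len(H) - 1 - k] for k in range(len(H))])

def analysis_step(signal, filt):
    """One filtering step: periodic convolution followed by downsampling by 2."""
    n = len(signal)
    out = np.zeros(n // 2)
    for i in range(n // 2):
        for k, c in enumerate(filt):
            out[i] += c * signal[(2 * i + k) % n]
    return out

def wavelet_decompose(hist):
    """Decompose a 2^J-bin histogram into a scaling factor and J bands of
    detail coefficients (coarsest band first), as in equation (20)."""
    approx = np.asarray(hist, dtype=float)
    details = []
    while len(approx) > 1:
        details.append(analysis_step(approx, G))  # detail coefficients
        approx = analysis_step(approx, H)         # smoothed half-length signal
    details.reverse()                             # coarsest scale first
    return approx[0], details
```

For a flat 256-bin histogram the detail coefficients are close to zero and the scaling factor is close to the histogram sum divided by 2^{J/2} = 16 (only approximately here, because the filter values are rounded).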
3.2 Signal compression and quantization
The compression acts on the number of retained coefficients representing the decomposed (1D) image. The wavelet coefficients represent the local intensity variations in the image. Their magnitudes represent the importance of the variations, while their signs express the type of these variations. If we keep only the coefficients with the largest magnitudes, we can obtain a good approximation of the decomposed image. Only the absolute values of the wavelet coefficients are taken into account during the compression. In the compression method we rewrite the decomposed (1D) image (20) as follows
I(x) ≈ I[0] φ_0^0(x) + Σ_{i=1}^{2^J−1} I[i] u_i(x),  (21)
where I[i] = I_k^j and u_i(x) = ψ_k^j(x − 3) for (i = 2^j + k, k = 0, ..., 2^j − 1, j = 0, ..., J − 1). By sorting these coefficients in order of decreasing magnitude and by using a permutation σ, we obtain

I(x) ≈ I[0] φ_0^0(x) + Σ_{i=1}^{2^J−1} I[σ(i)] u_{σ(i)}(x),  (22)

where

| I[σ(i_1)] | ≥ | I[σ(i_2)] |  for all 0 < i_1 < i_2.  (23)
Consequently, when we keep the m largest coefficients, we obtain an approximation I^c(x) representing the compressed version of the decomposed (1D) image I(x), defined by

I^c(x) ≈ I[0] φ_0^0(x) + Σ_{i=1}^{2^J−1} I^c[i] u_i(x),  (24)

where

I^c[i] = I[i] if 1 ≤ σ^{−1}(i) ≤ m, and I^c[i] = 0 otherwise.  (25)

This approximation introduces an error, called the L2-error, given by

Σ_{i=m+1}^{2^J−1} ( I[σ(i)] )^2.  (26)

The coefficients retained after the compression have the largest magnitudes and carry the most relevant data in the decomposed (1D) image. However, the storage of these coefficients requires a large space. For this reason, we use a quantization of our (1D) images which reduces the storage space. Every significant non-zero coefficient is quantified to just two levels: +1 represents the largest positive coefficients, and −1 represents the largest negative coefficients. Therefore, the quantified version I_q^c of the compressed (1D) image I^c is given by

I_q^c(x) ≈ I[0] φ_0^0(x) + Σ_{i=1}^{2^J−1} I_q^c[i] u_i(x),  (27)

where

I_q^c[i] = +1 if I^c[i] > 0, 0 if I^c[i] = 0, −1 if I^c[i] < 0,  for i = 1, ..., 2^J − 1.  (28)

4 The metric and the querying algorithm
4.1 The metric or pseudo-metric
In the last section, we showed that the compression gives a good approximation of the Daubechies-8 decomposed (1D) image and that the quantization reduces its storage space. According to [5], if we consider only the signs of the coefficients retained and quantified after the compression in the querying, we can reduce the execution time of the comparison algorithm.
Let us consider Q and T as the query and the target (1D) images, respectively. The (1D) version of the metric proposed by [5] is obtained from the following expression
|| Q, T || = ω_0 |Q[0] − T[0]| + Σ_{i=1}^{2^J−1} ω_i |Q_q^c[i] − T_q^c[i]|,  (29)

where the ω_i are the metric weights, Q[0] and T[0] are the scaling function coefficients of the (1D) images Q and T, and Q_q^c[i] and T_q^c[i] represent their i-th decomposed, compressed and quantified coefficients. Since the possible values of a coefficient are +1, 0 or −1, the term

|Q_q^c[i] − T_q^c[i]| ∈ {0, 1, 2}.  (30)
In the case where

|Q_q^c[i] − T_q^c[i]| = 2,  (31)

Q and T preserved their coefficients at the i-th position despite the decomposition and compression, but the two coefficient signs do not match: they have opposite variations. With respect to the distance between two different images, two opposite variations at the same position do not represent more proximity than when one of them is equal to zero after the compression. Also, since we assume that the vast majority of database images do not match the query (1D) image well at all, the number of mismatches is larger than the number of matches. That is why we write our metric as follows
|| Q, T || = ω_0 |Q[0] − T[0]| + Σ_{i=1}^{2^J−1} ω_i ( Q_q^c[i] ≠ T_q^c[i] ),  (32)

where

( Q_q^c[i] ≠ T_q^c[i] ) = 1 if Q_q^c[i] ≠ T_q^c[i], and 0 otherwise.  (33)
In order to make the metric faster, we only consider terms in which the query has a non-zero wavelet coefficient. A disadvantage of this approach is that it technically disqualifies our metric from being a metric, because of its asymmetry. With this modification our metric becomes

|| Q, T || = ω_0 |Q[0] − T[0]| + Σ_{i : Q_q^c[i] ≠ 0} ω_i ( Q_q^c[i] ≠ T_q^c[i] ).  (34)
To compute the metric over a database of (1D) images, it is generally quicker to count the number of matching coefficients of Q_q^c and T_q^c than the number of mismatching coefficients. For this reason, we rewrite

Σ_{i : Q_q^c[i] ≠ 0} ω_i ( Q_q^c[i] ≠ T_q^c[i] ) = Σ_{i : Q_q^c[i] ≠ 0} ω_i − Σ_{i : Q_q^c[i] ≠ 0} ω_i ( Q_q^c[i] = T_q^c[i] ),  (35)

where

( Q_q^c[i] = T_q^c[i] ) = 1 if Q_q^c[i] = T_q^c[i], and 0 otherwise.  (36)

So our metric becomes

|| Q, T || = ω_0 |Q[0] − T[0]| + Σ_{i : Q_q^c[i] ≠ 0} ω_i − Σ_{i : Q_q^c[i] ≠ 0} ω_i ( Q_q^c[i] = T_q^c[i] ).  (37)
Since the sum of the terms ω_i is independent of every target T_q^c, we can discard it. Consequently, we obtain

|| Q, T || = ω_0 |Q[0] − T[0]| − Σ_{i : Q_q^c[i] ≠ 0} ω_i ( Q_q^c[i] = T_q^c[i] ).  (38)
To simplify, we suppose that the weights associated with the coefficients belonging to the same scale are identical. Thus, we group the weights according to the resolution levels, by using a simple bucketing function bin() such that bin(i) is the floor of log2(i) for i ∈ {1, ..., 2^J − 1}:

bin(i) = ⌊log2(i)⌋  with i = 1, ..., 2^J − 1.  (39)
Consequently, we use a set of weights w_0 and {w_j}_{j=0}^{J−1} and define the ω_0 and ω_i's by

ω_0 = w_0 and ω_i = w_j for i ∈ 2^j + {0, ..., 2^j − 1} and j = 0, ..., J − 1 (j = bin(i)),
where J is the maximum number of resolution levels. Finally, it suffices to compute the expression
|| Q, T || = w_0 |Q[0] − T[0]| − Σ_{i : Q_q^c[i] ≠ 0} w_{bin(i)} ( Q_q^c[i] = T_q^c[i] ).  (40)
4.2 One-dimensional image querying algorithm
In order to optimize the metric computation process, we introduce two arrays called search arrays: Θ+ for the coefficients quantified to +1 and Θ− for those quantified to −1. Each array contains 2^J − 1 elements and each element contains a list. For example, the element Θ+[i] points to the list of all the database (1D) images having a large positive wavelet coefficient at the i-th position after compression. In the same way, the element Θ−[i] points to the list of all the database (1D) images having a large negative wavelet coefficient at the i-th position. Thanks to these arrays and to the compression, during the querying process we only have to go through the m lists associated with the query's m retained coefficients instead of 2^J − 1 coefficients. After the creation of the search arrays and the computation of the weights w_0 and {w_j}_{j=0}^{J−1}, the retrieval procedure for a query Q in the database of (1D) images T_k (k = 1, ..., |DB|), where |DB| denotes the database size, is defined as follows
Procedure Retrieval(Q, m, Θ+, Θ−)
  Q ← FastDaubechiesWaveletDecomposition(Q)
  Initialize Score[k] = 0 for each k ∈ {1, ..., |DB|}
  For each k ∈ {1, ..., |DB|} do
    Score[position of T_k in the DB] = w_0 * |Q[0] − T_k[0]|
  End for
  Q^c ← Compress(Q, m)
  Q_q^c ← Quantify(Q^c)
  For each Q_q^c[i] ≠ 0 do
    If Q_q^c[i] > 0 then
      List ← Θ+[i]
    Else
      List ← Θ−[i]
    End if
    For each I of List do
      Score[position of I in the DB] = Score[position of I in the DB] − w_bin(i)
    End for
  End for
  Return Score
End procedure
This procedure returns an array Score such that Score[k] = || Q, T_k || for each k ∈ {1, ..., |DB|}. The elements of the array Score, which are the similarity degrees between the query Q and the database (1D) images T_k (k ∈ {1, ..., |DB|}), can be negative or positive. The most negative similarity degree corresponds to the closest target to the query Q.
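The Retrieval procedure above can be sketched in Python as follows; the data layout (a dict of the query's retained quantified coefficients, plain lists for the search arrays) and all names are our own illustrative assumptions.

```python
import numpy as np

def retrieval(q0, q_signs, db_scaling, theta_plus, theta_minus, w0, w):
    """Sketch of the Retrieval procedure. q0: query scaling factor;
    q_signs: dict {i: +1/-1} of the query's m retained, quantified
    coefficients; db_scaling[k]: scaling factor of target k;
    theta_plus[i] / theta_minus[i]: lists of target indices whose i-th
    quantified coefficient is +1 / -1; w0, w: the metric weights."""
    # Scaling-factor term of the metric (40).
    score = w0 * np.abs(q0 - np.asarray(db_scaling))
    for i, sign in q_signs.items():
        matches = theta_plus[i] if sign > 0 else theta_minus[i]
        for k in matches:                    # one matching coefficient found:
            score[k] -= w[int(np.log2(i))]   # subtract w_bin(i)
    return score  # most negative entry = closest target
```

Only the m lists indexed by the query's non-zero coefficients are visited, which is the point of the search arrays.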
4.3 Weights adjustment and logistic regression
Logistic regression is one of the most popular data mining tools. In the content-based image retrieval field, logistic regression was used to model the relevance feedback [8]. In our application, the logistic regression can be used to tune the weights of our metric. A weight w_k represents the relative importance of the query's k-th resolution level coefficients, retained after the compression, with respect to the target coefficients belonging to the same resolution level and having the same positions and signs. Note that if we set the weights w_0 and {w_k}_{k=0}^{J−1} all equal to 1, the coefficients belonging to different resolution levels all have the same relative importance with respect to the query coefficients. We divide our target set into two classes: a matching class and a mismatching class. Each class contains a set of observations extracted from the database. The logistic regression method introduces a binary target variable and a set of explanatory variables to represent the class of a given observation. Specifically,
Y : target variable,

y_T = 1 in case of mismatch with the target T, y_T = 0 in case of match with the target T,

t_T = (i_{0,T}, t_{0,T}, ..., t_{J−1,T}) : explanatory variables,

where i_{0,T} is the absolute value of the difference between the scaling factors of the query and the target T, {t_{k,T}}_{k=0}^{J−1} are the numbers of mismatches between the k-th resolution level coefficients of the query and the target T, J is the maximum number of resolution levels and y_T is the binary target variable; it is either 0 or 1, depending on whether or not the query and the target T are intended to match. We denote the posterior probability of a mismatch with the target T, given the explanatory variables, by p_T, i.e., p_T = P(y_T | t_T). In our logistic regression model we assume that this posterior probability is given by
p_T = P(y_T = 1 | t_T) = F( w_0 i_{0,T} + Σ_{k=0}^{J−1} w_k t_{k,T} ),  (41)

and

q_T = 1 − p_T = P(y_T = 0 | t_T) = F( −w_0 i_{0,T} − Σ_{k=0}^{J−1} w_k t_{k,T} ),  (42)

where w_0 and {w_k}_{k=0}^{J−1} are the weights to compute and

F(x) = 1 / (1 + e^{−x}).  (43)

In order to tune the weights, we perform a maximum likelihood estimation. The likelihood function measures the likelihood that the different t_T have given rise to the observed y_T. Provided the observed target variables have independent Bernoulli distributions with the probabilities p_T for each target T, the form of the likelihood is given by
L(w_0, w_0, ..., w_{J−1}) = Π_{T=1}^{n} p_T^{y_T} (1 − p_T)^{1−y_T}  (44)
                         = Π_{T=1}^{n_0} (1 − p_T) Π_{T=1}^{n_1} p_T,  (45)

where n is the number of all observations and n_0 and n_1 are the numbers of cases with target variable value 0 and 1, respectively. We want to choose w_0 and {w_k}_{k=0}^{J−1} so as to maximize the natural logarithm of the above likelihood. This can be done by using the software SAS 6.12 or the Fisher-scoring algorithm. The training is an important step in the logistic regression method, used to extract the observations from the database.
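As a minimal, self-contained alternative to SAS or Fisher scoring, the log-likelihood of (44) can be maximized by plain gradient ascent; the function name, learning rate and iteration count are illustrative assumptions, not the estimation procedure used in the thesis.

```python
import numpy as np

def fit_logistic_weights(X, y, iters=500, lr=0.1):
    """Maximum-likelihood fit of the metric weights by gradient ascent on
    the Bernoulli log-likelihood of equations (41)-(45). Each row of X is
    an observation (i_{0,T}, t_{0,T}, ..., t_{J-1,T}); y is 1 for a
    mismatching pair and 0 for a matching pair."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))    # p_T = F(w . t_T), eq. (41)
        w += lr * X.T @ (y - p) / len(y)    # gradient of the log-likelihood
    return w
```

On synthetic observations where mismatch is driven up by the first feature and down by the second, the fitted weights take the corresponding signs.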
4.4 Training (for the logistic regression method)
Let us consider a color image database which consists of several color image sets such that each set contains color images which are perceptually close to each other in terms of object shapes and colors. In order to compute the metric weights w_0^l and {w_k^l}_{k=0}^{7} (l ∈ {1, ..., N}) by the logistic regression, we have to create the matching classes (y_T^l = 0) (l ∈ {1, ..., N}) and the mismatching classes (y_T^l = 1) (l ∈ {1, ..., N}). To create a single matching class (y_T^l = 0), we draw all possible pairs of histograms or feature vectors representing color images belonging to the same database color image sets, and for each pair we compute the values of i_{0,T}^l and {t_{k,T}^l}_{k=0}^{7}. Similarly, to create a single mismatching class (y_T^l = 1), we draw all possible pairs of histograms or feature vectors representing color images belonging to different database color image sets, and then for each pair we compute the values of i_{0,T}^l and {t_{k,T}^l}_{k=0}^{7}.

5 Color image retrieval method
The querying method proceeds in two phases. The first phase is a preprocessing phase, done once for the entire database containing |DB| color images. The second phase is the querying phase.
5.1 Preprocessing (of the LAB color image database)
We detail the preprocessing phase, done once for all the database color images before the querying, in a general case by the following steps.

1. Convert each database color image to the LAB color space.

2. Compute the N feature histograms T_{l,i} (l ∈ {1, ..., N}) for each i-th LAB color image of the database, where i ∈ {1, ..., |DB|}.

3. The feature histograms representing the database color images are Daubechies-8 wavelet decomposed, compressed to m coefficients each and quantified.

4. Organize the decomposed, compressed and quantified feature vectors into search arrays Θ_l+ and Θ_l− (l = 1, ..., N).

5. Adjust the metric weights w_0^l and {w_k^l}_{k=0}^{7} for each set of feature histograms T_{l,i} (i = 1, ..., |DB|) representing the database color images, where l ∈ {1, ..., N}.
5.2 The querying algorithm
We detail the querying algorithm in a general case by the following steps.
1. Given a query LAB color image, we denote the feature vectors representing the query image by Q_l (l = 1, ..., N).
2. The feature vectors representing the query image are Daubechies-8 wavelet decomposed, compressed to m coefficients each and quantified.
3. The similarity degrees between Q_l (l = 1, ..., N) and the database color image feature histograms T_{l,i} (l = 1, ..., N) (i = 1, ..., |DB|) are represented by the arrays Score_l (l = 1, ..., N) such that Score_l[i] = || Q_l, T_{l,i} || for each i ∈ {1, ..., |DB|}. These arrays are returned by the procedures Retrieval(Q_l, m, Θ_l+, Θ_l−) (l = 1, ..., N), respectively.

4. The similarity degrees between the query color image and the database color images are represented by a resulting array TotalScore, such that TotalScore[i] = Σ_{l=1}^{N} γ_l Score_l[i] for each i ∈ {1, ..., |DB|}, where {γ_l}_{l=1}^{N} are weightfactors used to fine-tune the influence of each individual feature.

5. Sort the database color images in order of increasing similarity degree from the array TotalScore. The most negative similarity degrees correspond to the closest target images to the query image. Finally, return to the user the closest target color images to the query color image, whose number is denoted by RI and chosen by the user.
5.3 The querying method dataflow diagram
For simplicity, we describe the querying method steps by the following diagram.
Preprocessing:
  Extraction of N feature histograms from each database LAB color image.
  ↓
  The extracted feature histograms are Daubechies-8 wavelet decomposed, compressed and quantified.
  ↓
  Construction of the search arrays for each set of feature histograms.
  ↓
  Adjustment of the metric weights for each set of feature histograms.

Querying:
  Extraction of N feature histograms from the LAB query color image.
  ↓
  The query color image N feature histograms are Daubechies-8 wavelet decomposed, compressed and quantified.
  ↓
  The similarity degrees between the query color image N feature histograms and the database color image feature histograms are computed using the procedure Retrieval and are represented by the arrays Score_l (l = 1, ..., N).
  ↓
  The similarity degrees between the query color image and the database color images are represented by the resulting array TotalScore, such that TotalScore[i] = Σ_{l=1}^{N} γ_l Score_l[i] for each i ∈ {1, ..., |DB|}.
  ↓
  The most negative resulting similarity degrees correspond to the closest target images to the query image.
6 Experimental results and evaluation
In this section we perform some experiments to validate and evaluate our querying method in a database of |DB| = 10000 animal, landscape, art, bridge and building LAB color images. It consists of several color image sets such that each set contains color images which are perceptually close to each other in terms of object shapes and colors.
The first evaluation will be based on a comparison between the querying after using the classical color histograms h_L, h_a and h_b, given by (1), and the querying after using h_L, h_a^h, h_b^h, h̄_a^e and h̄_b^e, given by (1), (9) and (12), respectively, to represent the database color images. The second evaluation will be based on a comparison between the querying results when the metric weights are equal to 1 and the querying results when the metric weights are computed by the logistic regression. In this evaluation each database color image is represented by its h_L, h_a^h, h_b^h, h̄_a^e and h̄_b^e. The third evaluation will be based on a comparison between the querying after using h_L, h_a^h, h_b^h, given by (1) and (9), respectively, and the laplacian weighted histograms proposed by [6], and the querying after using h_L, h_a^h, h_b^h, h̄_a^e and h̄_b^e, to represent the database color images. In the fourth evaluation we will show the importance of considering the metric term w_0 |Q[0] − T[0]|, i.e. the scaling factors, in the querying. Finally, the fifth evaluation is carried out to show the improvement of the querying results thanks to the transformation of the query color image h̄_a^e and h̄_b^e, used to make the querying invariant to the color intensities of the query color image.
Generally, to carry out an evaluation in the image retrieval field, two principal issues are required: the acquisition of ground truth and the definition of performance criteria. For ground truth, we introduce human observations. In fact, two external persons participate in all the evaluations below. Concerning performance criteria, we represent the results of each evaluation by the precision-scope curve Pr = f(RI), where the scope RI is the number of images returned to the user. In each querying performed in an evaluation experiment, each human subject is asked to give a goodness score to each retrieved image. The goodness scores are 2 if the retrieved image is almost similar to the query, 1 if the retrieved image is fairly similar to the query and 0 if there is no similarity between the retrieved image and the query. Consequently, we can compute the precision as follows: Pr = (the sum of goodness scores for the retrieved images) / RI. Therefore, the curve Pr = f(RI) gives the precision for different values of RI, which lie between 1 and 10 in our evaluation experiments. When the human subjects perform different queryings in the same evaluation experiment, we compute an average precision for each value of RI, and then we construct the precision-scope curve.
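The precision-scope computation just described is a one-liner per scope value; the following sketch (function name assumed) turns one subject's goodness scores, in retrieval order, into the precision values for every scope RI.

```python
def precision_scope(goodness_scores):
    """Pr(RI) = (sum of goodness scores of the first RI retrieved images) / RI,
    for each scope RI = 1, ..., len(goodness_scores).
    Each goodness score is 0, 1 or 2, so Pr can exceed 1."""
    precisions = []
    running = 0
    for ri, score in enumerate(goodness_scores, start=1):
        running += score
        precisions.append(running / ri)
    return precisions
```

Averaging these lists over all queryings of an experiment gives the precision-scope curve.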
The following experiments will be carried out to extract the curve Pr = f(RI) for each evaluation men-tioned above.
First experiment
This experiment is carried out to show the advantage of using our weighted histograms instead of the classical color histograms to represent the query color image before the querying. Each human subject is asked to formulate a query from the database and to execute a querying using N = 3 feature histograms, namely h_L, h_a and h_b given by (1), to represent the query color image, while computing the metric weights by the logistic regression and keeping the weightfactors {γ_l}_{l=1}^{3} equal to 1, and to give a goodness score to each retrieved image; then to reformulate a query from the database and to execute a querying using N = 5 feature histograms, namely h_L, h_a^h, h_b^h, h̄_a^e and h̄_b^e, to represent the query color image, while computing the metric weights by the logistic regression and keeping the weightfactors {γ_l}_{l=1}^{3} equal to 1/2 and γ_4 and γ_5 equal to 1 to give more importance to the edge region features, and to give a goodness score to each retrieved image. Each querying is repeated twenty times by choosing a new query from the database each time. We repeat this experiment for different orders of compression m ∈ {30, 20, 10}. The resulting precision-scope curves are the following.
Figure 3: Evaluation: precision-scope curves for retrieval when the database LAB color images are represented by h_L, h_a and h_b each, and when the database LAB color images are represented by h_L, h_a^h, h_b^h, h̄_a^e and h̄_b^e each, for the compression orders m ∈ {30, 20, 10}.
The following three figures give three examples of the improvement of our querying results, thanks to the separation between the modes of the color histograms representing the database LAB color images and to the use of the multispectral gradient module mean histogram to take the database color image edge regions into account in the querying. We choose the same query for the three examples. For each example the query is located at the top-left of the dialog box.
Figure 4: Comparison (m = 30): a) first 7 color images retrieved after being represented by their h_L, h_a and h_b, b) second 7 color images retrieved after being represented by their h_L, h_a and h_b, a') first 7 color images retrieved after being represented by their h_L, h_a^h, h_b^h, h̄_a^e and h̄_b^e and b') second 7 color images retrieved after being represented by their h_L, h_a^h, h_b^h, h̄_a^e and h̄_b^e.
Figure 5: Comparison (m = 20): a) first 7 color images retrieved after being represented by their h_L, h_a and h_b, b) second 7 color images retrieved after being represented by their h_L, h_a and h_b, a') first 7 color images retrieved after being represented by their h_L, h_a^h, h_b^h, h̄_a^e and h̄_b^e and b') second 7 color images retrieved after being represented by their h_L, h_a^h, h_b^h, h̄_a^e and h̄_b^e.
Figure 6: Comparison (m = 10): a) first 7 color images retrieved after being represented by their h_L, h_a and h_b, b) second 7 color images retrieved after being represented by their h_L, h_a and h_b, a') first 7 color images retrieved after being represented by their h_L, h_a^h, h_b^h, h̄_a^e and h̄_b^e and b') second 7 color images retrieved after being represented by their h_L, h_a^h, h_b^h, h̄_a^e and h̄_b^e.
Second experiment
This experiment is carried out to show how the weights computed by the logistic regression can optimize our querying results. Each database color image is represented by its h_L, h_a^h, h_b^h, h̄_a^e and h̄_b^e. Each human subject is asked to formulate a query from the database and to execute a querying, using weights computed by the logistic regression, and to give a goodness score to each retrieved image; then to reformulate a query from the database and to execute the querying, using weights equal to 1, and to give a goodness score to each retrieved image. Each querying is repeated twenty times by choosing a new query from the database each time. We repeat this experiment for different orders of compression m ∈ {30, 20, 10} and we keep the weightfactors {γ_l}_{l=1}^{3} equal to 1/2 and γ_4 and γ_5 equal to 1 to give more importance to the edge region features. The following table contains the weights computed by the logistic regression, for the order m = 30 of the compression.
         h_L (l = 1)   h_a^h (l = 2)   h_b^h (l = 3)   h̄_a^e (l = 4)   h̄_b^e (l = 5)
ω_0^l       3.31           3.59            5.14            5.78            5.63
w_0^l       7.23           4.65            8.29            5.67            8.08
w_1^l       5.34           6.28            3.51            8.52            6.18
w_2^l       6.85           8.26            4.40            7.04            9.37
w_3^l       8.51           4.38            8.59            6.85            9.47
w_4^l       8.37           6.37            5.90            8.01            5.05
w_5^l       6.24           7.21            8.72            7.09            7.57
w_6^l       7.25          10.41            7.83            6.82            8.43
w_7^l      10.14           3.91            6.93            6.62            7.94

Table 1: Weights computed by the logistic regression for m = 30.
The resulting precision-scope curves for each compression order are the following.
Figure 7: Evaluation: precision-scope curves for retrieval using weights equal to 1 and weights computed by the logistic regression, for the compression orders m ∈ {30, 20, 10}.
The following three Figures give three examples of the improvements of our querying results when we use weights computed by the logistic regression instead of weights equal to 1, to tune the metric. We choose the same query for the three examples. For each example the query is located at the top-left of the dialog box.
Figure 8: Comparison (m = 30): a) first 7 color images retrieved after using weights equal to 1, b) second 7 color images retrieved after using weights equal to 1, a') first 7 color images retrieved after using weights computed by the logistic regression and b') second 7 color images retrieved after using weights computed by the logistic regression.
Figure 9: Comparison (m = 20): a) first 7 color images retrieved after using weights equal to 1, b) second 7 color images retrieved after using weights equal to 1, a') first 7 color images retrieved after using weights computed by the logistic regression and b') second 7 color images retrieved after using weights computed by the logistic regression.
Figure 10: Comparison (m = 10): a) first 7 color images retrieved after using weights equal to 1, b) second 7 color images retrieved after using weights equal to 1, a') first 7 color images retrieved after using weights computed by the logistic regression and b') second 7 color images retrieved after using weights computed by the logistic regression.