1. Introduction 2. Hachage ouvert 3. Hachage fermé

(1)

1. Introduction 2. Hachage ouvert 3. Hachage fermé

4. Implémentation des fonctions

Méthodes de hachage

(2)

Recherche dichotomique

3 4 8 9 10 20 40 50 70 75 80 83 85 90

1 2 3 4 5 6 7 8 9 10 11 12 13 14

d i f

4 ?

d i f d i f

di f df

table

fonction ELEMENT (x, table, d, f ) ; début

si (d = f) alors

si (x = table [d ]) alors retour (vrai) sinon retour (faux)

sinon {

i ← (d+f ) / 2 ;

si (x > table [i]) alors

retour (ELEMENT (x, table, i+1, f )) sinon retour (ELEMENT (x, table, d, i)) }

fin

(3)

3 4 8 9 10 20 40 50 70 75 80 83 85 90

Recherche par interpolation

table

1 2 3 4 5 6 7 8 9 10 11 12 13 14

di f

4 ?

df fonction ELEMENT (x, table, d, f ) ; début

si ( d = f ) alors

si (x = table [d ] alors retour (vrai) sinon retour (faux)

sinon { i ← d +

si (x > table [i] alors

retour (ELEMENT (x, table, i+1, f )) sinon retour (ELEMENT (x, table, d, i)) }

fin

[ ] ^f ^table [ ] [ ] ^d ⁽ ^f ^d ^);

table

d table

x  ∗ −



 





−

(4)

Idée : Établir une relation entre un élément et l’adresse à laquelle il est rangé en mémoire

Théorème : Si les éléments de table [1 … n] et x sont choisis uniformément dans un intervalle [a,b], le temps moyen

d’exécution de la recherche par interpolation est O (log log n)

(5)

Type abstrait « dictionnaire »

Ensembles avec les opérations principales

contains (x, A)

add (x, A) temps constant en moyenne

remove ( x, A)

Implémentation non adapté au classement.

Table de hachage

table dont les indices sont dans [0 .. B-1]

Fonction de hachage

h : éléments → [0 .. B-1] non injective en général Résolution des collisions

Hachage ouvert : avec listes, triées ou non.

Hachage fermé : linéaire, quadratique, aléatoire, uniforme, double, ...

}

Hachage

(6)

Hachage ouvert

table 0

1

B-1

Liens explicites Taille variable

int h (char x [ ], int B) { int i ;

sum = 0 ;

for (i = 0 ; x [i] != ’\0’; i ++) { sum = sum + x [i] ;

}

return (sum % B) ;

(7)

int hash (char * s) { char * p ;

unsigned h = 0, g ;

for (p = s ; * p ; p++) { h = (h<< 4) + (* p) ;

if (g = h & 0xf0000000) { h = h ^ (g >> 24) ;

h = h ^ g ; }

}

return (h % PRIME) ; }

Voir Aho, Sethi, Ullman, Compilers, Addison-Wesley, 1986

(8)

#define B 127

typedef struct cell { ElementType e;

struct cell * next;

} Cell

typedef Cell * List ;

void clear (List d [ ], int size) { int i ;

for (i = 0; i < size; i ++) { d [ i ] = null ;

}

int contains (ElementType x, List d [ ] ) { List p ;

p ← d [ h(x) ] ; while (p != null) {

if (compare(p->e,x) == 0) return 1 ; else p ← p->next ;

}

(9)

void add ( ElementType x, List d [ ]) { addListHead ( x, d, h(x) ) ;

}

x h(x)

void addListHead (ElementType x , List d [ ], int i) { List p ;

if ( contains (x, d) == 0) {

p = (List ) malloc (sizeof (Cell)) p->e = x ;

p->next = d [ i ] ; d [ i ] = p;

} }

(10)

Hachage ouvert

- initialisation (clear) : O (B)

- ajout : temps constant (après test d ’appartenance) - appartenance

- suppression

si les événements "h (x) = i" sont équiprobables Création d'une table de n éléments : O ( n (1 +

_Bⁿ

))

} ^O ⁽¹ ⁺

^Bⁿ

⁾

Temps des opérations

(11)

Hachage fermé

table 0

1

B-1

• liens implicites : gain de place

• taille limitée

• ré-allocation en cas de débordement

Re-hachage

h₀ (x) = h (x), h₁ (x), … h_i (x), …

où h_i (x) dépend généralement de x et de i Suppression d ’un élément

distinction entre « vide » et « disponible »

facteur de charge : n / B où n = nombre d ’éléments

(12)

Hachage linéaire

Re-hachage : h_i (x) = (h(x) + i) % B

FORWARD

disponible THEN

vide vide

FOR TO 0

1 2

3 4 5 B-1= 6

EXEMPLE B = 7

h (x) = (‘c’ – ‘a’) % B

où c première lettre de x.

add (’’BEGIN’’) add (’’FOR’’)

add (’’FUNCTION’’) add (’’FORWARD’’) add (’’THEN’’)

remove (’’FUNCTION’’) remove (’’BEGIN’’)

add (’’TO’’)

A H O V B I P W C J Q X C K R Y E L S Z F M T

G N U

(13)

#define B 7

#define VIDE 0

#define DISPONIBLE 1

void clear (ElementType d [ ]) { int i ;

for (i = 0, i < B, i ++) { d [ i] ← VIDE ; }

int position (ElementType x, ElementType d [ ]) {

/* calcule la seule position possible où ajouter x dans A */

int i ; i = 0 ;

while ( ((i < B) && (compare(d [h_i (x)], x) != 0) ) &&

((d [h_i (x)] != VIDE) && (d [h_i (x)] != DISPONIBLE)) ) { i = i + 1 ;

}

return (h_i (x)) ; }

(14)

int positionForContains (ElementType x, ElementType d [ ]) { int hi, dernier ;

hi ← h (x) ; dernier ← (hi + B-1) % B ;

while (( hi != dernier && (d [hi] ∉ {x, VIDE})) { hi ← (hi + 1) % B ;

}

return (hi) }

int contains (ElementType x , ElementType d [ ]) {

if (compare(d [positionForContains (x, d)] , x) == 0) return 1 ; else return 0 ;

}

void remove (ElementType x élément, ElementType d [ ]) { int i ;

i = positionForContains (x, d) ;

if (compare(d [ i ] , x) == 0) d [ i ] = DISPONIBLE ; }

(15)

int positionForAdd (ElementType x , ElementType d [ ]) { int hi, dernier;

hi = h (x) ; dernier := (hi + B-1) % B ;

while ((hi != dernier) && (d [hi] ∉ {x, VIDE, DISPONIBLE})) { hi ← (hi + 1) % B ;

}

return hi ; }

void add (ElementType x , ElementType d [ ]) { int i ;

i = positionForAdd (x, d) ;

if (d [ i ] ∈ {VIDE, DISPONIBLE} ) { d [ i ] = x ;

} else if (compare(d [ i ] , x ) != 0) { error ( " table pleine " ) ;

} }

(16)

Hachage quadratique

Re-hachage : h_i (x) = (h (x) + i ²) % B

int position (ElementType x , ElementType d [ ]) { int hi, inc;

hi = h (x) ; inc = 1 ;

while ((inc < B) && (A [hi] ∉ {x, VIDE, ? }) { hi = (hi + inc) % B ;

inc = inc + 2 ; }

return hi ; }

•

seule la moitié de la table est examinée par re-hachage

→ utiliser la suite :

h (x), h (x) + 1, h (x) - 1, h (x) + 4, h (x) - 4, … avec B premier.

(17)

Hachage double

Re-hachage : h_i (x) = (h(x) + i g (x)) % B - B premier et 1 ≤ g (x) ≤ B - 1

- ou B premier avec chacun des g (x)

pour examen de toute la table par re-hachage.

int position (ElementType x, ElementType d [ ]) { int hi, inc, dernier;

hi = h (x) ; inc = g (x) ;

dernier = (hi + (B-1)* inc) % B ;

while ((hi != dernier) && (A [hi] ∉ {x, VIDE, ? })) { hi = (hi + inc) % B ;

}

return hi ; }

(18)

Hachage aléatoire

Re-hachage : h_i (x) = (h (x) + d_i) % B,

d₁, d₂, …, d_B-1 permutation aléatoire de (1,..., B-1) Génération des d_i par « décalage ».

- choix de k ∈ (1,…, B-1)

2

.

^d_i ^{si 2}

.

^d_i

≤

^B-1

(

²

.

^d_i ^- B) ⊕ k sinon

Exemple B = 8 k = 3 = 11₂

d1 = 5₁₀ = 1012

d2 = 1₁₀ = 0012

d3 = 2₁₀ = 0102

d4 = 4₁₀ = 1002

d5 = 3₁₀ = 0112

d6 = 6₁₀ = 1102

d7 = 7 = 1112

-

^d_i + 1 =

{

(19)

Hachage fermé (aléatoire) table contenant n éléments - initialisation (clear) : O (B)

- probabilité d'avoir i comparaisons :

- coût d'un ajout réussi

i

B n B-i+ n-i+

. ...

B- n-

b n . 



 



≅  1 1 1 1

n B = -

B + n

i

i=

1 1 1

1





 



∑ 

≤

^∞

- B n B . -

- ~ = n

C

_e

nB n-

k=

n

kB

1 1

log 1 1

1

¹ ₁

∑

0

n/B 50 % 80 % 90 % C

_n

1,39 2,01 2,56

Temps des opérations

- création d'une table à n éléments (n ≤ B) C_n= coût moyen d'un ajout

(20)

F tableau associatif : nombre fini d'indices de type quelconque

⇔ F fonction de domaine fini

⇔ F représentable par l'ensemble E = { (x,

F [x]) / x ∈ domaine de F } Opérations

- INIT (F) rendre F [x] non défini pour tout x - DEFINIR (F, x, y) poser F [x] = y

- CALCULER (F, x) = y si F [x] = y

nul sinon

Implémentation

- représenter E par hachage sur le domaine de F

{

Tableau associatif

(21)

Traducteur mot-à-mot dico

if si

then alors else sinon

entrée text ---

if --- then

sortie texte --- Trad si ^--- alors TRADUCTEUR

1. Introduction 2. Hachage ouvert 3. Hachage fermé