Robust digital image watermarking

(1)

Thesis

Reference

Robust digital image watermarking

PEREIRA, Shelby

Abstract

Invisible Digital watermarks have been proposed as a method for discouraging illicit copying and distribution of copyright material. While a myriad of algorithms have appeared in recent years, a number of problems remain. In this thesis we develop methods for robustly embedding and extracting 60 to 100 bits of information from an image. These bits can be used to link a buyer and a seller to a given image and consequently be used for copyright protection. When embedding the watermark, a fundamental tradeoff must be made between robustness, visibility and capacity. Robustness refers to the fact that the watermark must survive against attacks from potential pirates. Visibility refers to the requirement that the watermark be imperceptible to the eye. Finally, capacity refers to the amount of information that the watermark must carry.

PEREIRA, Shelby. Robust digital image watermarking . Thèse de doctorat : Univ. Genève, 2000, no. Sc. 3191

DOI : 10.13097/archive-ouverte/unige:46164

Available at:

http://archive-ouverte.unige.ch/unige:46164

Disclaimer: layout of this document may differ from the published version.

1 / 1

(2)

Robust Digital Image Watermarking

THÈSE

présentée àla Facultédes sciences de Université de Genève

pour obtenirle grade de Docteurès sciences, mention informatique

par

Shelby PEREIRA

de

Montréal (Canada)

Thèse N o

3191

GENÈVE

(3)

technique Fédérale de Lausanne), C. PELLEGRINI, professeur ordinaire (Départe-

mentd'informatique),A.HERRIGEL,docteur(DCT,DigitalCopyrightTechnologies-

Zürich)etS.VOLOSHYNOVSKIY, docteur (Département d'informatique), autorise

l'impression de laprésente thèse, sans exprimer d'opinion sur les propositions qui y

sont énoncées.

Genève,le 22août, 2000

Thèse -3191-

Le Doyen, Jacques WEBER

(4)

Table of Contents

Table of Contents : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : i

List of Tables : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : iv

List of Figures : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : v

Acknowledgements : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : vii

Abstract : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : viii

Résumé : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : x

0.1 Introduction . . . x

0.1.1 Robustesse. . . x

0.1.2 Visibilité . . . xi

0.1.3 Capacité . . . xi

0.1.4 Contributions principales . . . xi

0.2 Stratégies d'insertion de Filigrane . . . xii

0.2.1 Filigranagelinéaire additif . . . xii

0.2.2 Filigranagenon-linéaire. . . xiv

0.3 Filigranagedans le domaineDFT . . . xiv

0.3.1 Insertion du Filigrane. . . xv

0.3.2 Insertion du Template . . . xv

0.3.3 Décodage . . . xv

0.4 Filigranagedans le Domaine DCT . . . xvi

0.5 Évaluationde qualité . . . xvii

0.6 Résultats. . . xix

1 Introduction : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 1 2 Watermarking for Copyright Protection : : : : : : : : : : : : : : : : 4 2.1 Watermarking Framework . . . 4

2.2 Watermarking Requirements . . . 6

(5)

2.2.2 Robustness . . . 7

2.2.3 Visibility. . . 9

3 Watermarking as Communications: : : : : : : : : : : : : : : : : : : : 16 3.1 Watermarking as a Communications Problem . . . 16

3.2 Coding . . . 17

3.2.1 SpreadSpectrum Encoding . . . 17

3.2.2 Capacity . . . 22

3.2.3 M-aryModulation . . . 23

3.2.4 Error Correction Coding . . . 24

4 Algorithm Review : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 27 4.1 Linear Watermarking . . . 27

4.1.1 Message embedding . . . 27

4.1.2 Attacking channel. . . 28

4.1.3 Message extraction . . . 28

4.2 CategoryI watermarks . . . 30

4.3 CategoryII Watermarks . . . 32

4.3.1 SpatialDomain Category II Watermarking . . . 32

4.3.2 Transform Domain Category II Watermarking . . . 33

4.4 CategoryIII Watermarking . . . 36

4.5 ResistingGeometric Transformations . . . 37

4.5.1 Invariant Watermarks . . . 37

4.5.2 TemplateBased Schemes . . . 37

4.5.3 Autocorrelation Techniques . . . 38

4.6 Evaluationand Benchmarking . . . 39

4.7 Analysisand Research Directions . . . 41

5 Embedding in the Fourier Domain : : : : : : : : : : : : : : : : : : : : 43 5.1 The DFT and its Properties . . . 44

5.1.1 Denition . . . 44

5.1.2 GeneralProperties of the Fourier Transform . . . 44

5.1.3 DFT: Translation . . . 45

5.2 Embedding . . . 45

5.2.1 Embedding the Watermark . . . 45

5.2.2 Embedding the Template. . . 47

5.3 Decoding. . . 47

5.3.1 TemplateDetection . . . 47

5.3.2 Decoding the Watermark . . . 51

(6)

6 Optimized Transform Domain Embedding : : : : : : : : : : : : : : : 53

6.1 Overview. . . 53

6.2 SpatialDomain Masking . . . 54

6.3 ProblemFormulation . . . 55

6.4 Eective Channel Coding. . . 57

6.5 Wavelet Domain Embedding . . . 58

6.6 Results . . . 59

6.7 Summaryand Open Issues . . . 60

7 Attacks : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 63 7.1 Problemformulation . . . 64

7.2 Watermark attacks based on the weak points of linearmethods. . . . 65

7.3 Watermark removalbased on denoising . . . 66

7.3.1 ML solutionof image denoising problem . . . 67

7.3.2 MAP solutionof image denoising problem . . . 67

7.4 Lossywavelet Compressionattack . . . 69

7.5 Denoising/Compressionwatermarkremovalandpredictionfollowedby perceptual remodulation . . . 70

7.6 Watermark copy attack . . . 71

7.7 Template/ACF removalattack . . . 72

7.8 Denoisingwith RandomBending . . . 73

8 Towards a Second Generation Benchmark : : : : : : : : : : : : : : : 74 8.1 Perceptual Quality Estimation . . . 74

8.1.1 The Watsonmodel . . . 74

8.1.2 Comparison ofthe Watson metric and the PSNR . . . 75

8.2 Second Generation Benchmarking . . . 77

9 Results : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 80 9.1 Perceptual Quality Evaluation . . . 80

9.2 Benchmarking results . . . 83

10 Conclusion and further research directions : : : : : : : : : : : : : : 86

Bibliography : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 88

(7)

List of Tables

1 Codage par Magnitude . . . xvii

6.1 Magnitude Coding . . . 57

6.2 JPEGthresholds at quality factor 10 . . . 58

8.1 PSNRand Watson measures for the images benz and mandrill . . . . 77

9.1 Watson measures forimages bear, boat, girl,lena,watch and mandrill. 84

9.2 Results relative toPetitcolas'benchmark . . . 84

9.3 Benchmark Results . . . 85

(8)

List of Figures

1 Modèle d'un système de ligranageadaptatif . . . xii

2 Modèle d'un système de ligranagenon-linéaire . . . xiv

3 a)OriginalBarbara b)PSNR=24.60 c)PSNR=24.59d)PSNR=24.61 . xviii 1.1 Robustness, Visibilityand Capacity . . . 2

2.1 Modelof Watermarking Process . . . 5

2.2 Sensitivity of the HVS to changes inluminance . . . 10

2.3 Originalimages of Barbara (a) and Fish (b) along with their NVF as determinedby ageneralized gaussian model(c) and (d). . . 14

3.1 Watermarking as a Communications Problem . . . 16

3.2 LinearFeedback ShiftRegister . . . 19

3.3 MaximalLength Sequence Generation . . . 20

3.4 Error Probability for Msequences with a 1bit message . . . 21

3.5 Probability of correctdecoding fora 60 bitmessage . . . 21

3.6 GaussianChannel Capacity . . . 22

3.7 M-aryencoding . . . 23

3.8 M-aryDecoding . . . 23

3.9 M-arySystem Performance. . . 24

3.10 TurboEncoding . . . 25

3.11 Decoding of Turbo codes . . . 26

4.1 CategoryI Watermark . . . 31

4.2 CategoryII Watermark . . . 32

4.3 CategoryIII Watermark . . . 36

4.4 Peaks inDFT beforeand after rotation . . . 39

5.1 ORIGINAL LENA IMAGE ANDLOGOF MAGNITUDE OFFFT . 46 6.1 OriginalimageLena(a) alongwith watermarked image(b) and water- markin(c)=(a)-(b). . . 62

7.1 Communication formulation of awatermarking system . . . 64

(9)

7.3 Scaling/shrinkage functions ofthe imagedenoising algorithms . . . . 69

7.4 Approximation of the soft-shrinkage function by quantization with zero-zone. . . 70

7.5 DFT peaks associated with a template based scheme(a) and auto- correlationbased scheme(b) . . . 73

8.1 . . . 76

8.2 a)TheoriginalBarbara imageb)PSNR=24.60,TPE=7.73, NB1=119, NB2=0c)PSN=24.59,TPE=7.87,NB1=128,NB2=0d)PSNR=24.61, TPE=9.27, NB1=146, NB2=3 . . . 78

8.3 a)The original Lena image b)initials of aname added . . . 79

9.1 Originaltest images. . . 81

9.2 Examples ofimages marked by the 4 algorithms.. . . 82

9.3 a)The bear marked image b)total perceptualerrors for blocks 1616 83

(10)

Acknowledgements

I wouldrstliketothank myprofessorThierryPun foracceptingmeasastudent

in the Vision Group and closely supervising this thesis over the past two years and

oering many helpful suggestions duringthe courseof this time.

I alsothank the members of the watermarking group. Firstly I thank Sviatoslav

Voloshynovskiyformanyinstructivetechnicaldiscussionsandalsoforoeringhelpful

suggestions while editing this document. I also thank him for agreeing to be a jury

member. Thanks alsoto Frederic Deguillaume,GabrielaCsurka and Joe Oruanaidh

with whom I was able to share numerous enlightening discussions. I also thank

Alexander Herrigel from Digital Copyright Technologies for funding this project in

part and also for agreeing to be a member of the jury. Thanks also to Murat Kunt

and Christian Pelligrini for agreeing to be members of the jury and for taking time

toread my thesis.

I thank Lori Petrucci, Patrick Roth, Alex Dupuis, Stephane Marchand-Maillat,

Wolfgang Muller, Henning Muller, David Squire, Alexandre Masselot, Paul Albu-

querque, Sergui Startchi, Christian Rauberwho all helped make my stay in Geneva

more enjoyable. I thank them for the many lunches, coee breaks, dinner partiess,

barbeques, and outings we shared together which all helped me to integrate into

Geneva. I thank alsoGermaine Gusthiot for her help with the many administrative

tasks and Dorothy Hauser for her promptness in obtaining many references which

greatly facilitatedthe research task.

Finally I thankthe many friends inGeneva whoallmade my stay morepleasant.

(11)

Abstract

Invisible Digital watermarks have been proposed as a method for discouraging

illicit copying and distribution of copyright material. While a myriad of algorithms

have appeared in recent years, a number of problems remain. In this thesis we de-

velop methods for robustly embedding and extracting 60 to 100 bits of information

from animage. These bits can be used to link abuyerand a seller toa given image

andconsequently beusedfor copyright protection. When embeddingthewatermark,

a fundamental tradeo must be made between robustness, visibility and capacity.

Robustness refers to the fact that the watermark must survive against attacks from

potential pirates. Visibility refers to the requirement that the watermark be imper-

ceptible to the eye. Finally, capacity refers to the amount of information that the

watermark must carry.

We present 2fundamentallydierentapproaches. The rst approach is basedon

theDiscreteFourierTransform(DFT),themagnitudeofwhichisusedforembedding

bits. Anetransformationsonanimageleadtocorrespondinganetransformations

in the DFT which suggests that the DFT can be used to recover watermarks which

have undergone such transformations. We propose the use of a template consisting

of peaks in the DFT domain. If the image is transformed, the peaks are detected

and the transformationis detected by solving apoint matching problem. In orderto

ensure that the watermark is invisible, anoise visibilityfunction is calculated in the

spatial domainand afterthe embeddingis performed inthe DFT domain,the pixels

are modulatedin the spatial domain to ensure invisibility. Our results indicate that

the proposed methodsuccessfully recovers watermarksfromtransformed images,but

is relativelyweak against compressionand cropping.

In recent years ithas been recognizedthat embeddinginformationinatransform

domain leads to more robust watermarks. A major diculty in watermarking in a

transform domainlies in the fact that constraints onthe allowable distortionat any

pixel are usually specied inthe spatial domain. Consequently the second approach

consists of a generalframework for optimizingthe watermark strength inthe trans-

formdomain when the visibility constraints are speciedin the spatialdomain. The

main idea is to structure the watermark embedding as a linear programming prob-

lem in which we wish to maximize the strength of the watermark subject to a set

of linear constraints on the pixel distortions as determined by a masking function.

(12)

using the Haar wavelet and Daubechies 4-tap lter in conjunction with a masking

function based on a non-stationary Gaussian model, but the algorithm is applicable

toanycombinationoftransformandmaskingfunctions. Unfortunatelythealgorithm

is not applicable to the DFT since the computational complexity of global transfor-

mations isoverwhelming. Our results indicatethat the proposed approach performs

well against lossy compression such as JPEG and other types of ltering which do

not change the geometry of the image, however at this time the watermark cannot

be recovered from images which have undergone an anetransformation.

As asecond aspectof the thesis wealsodevelop robustevaluationmethods. This

consists of two parts. Firstly we develop a new approach for evaluating watermark

visibilitybasedontheWatsonmetric. Theresultsindicatethattheperceptualquality

as measured by the Watson metric is consistently more accurate than that provided

by the typically used PSNR criterion. We also dene a new benchmark consisting

of attacks which take into account prior information about the watermark and the

image. Theseattacksaremuchmorepowerfulthantheones availableinotherbench-

marking toolswhich donot use prior information,but rather perform generalimage

processing operations. Our results relativeto the proposed benchmark indicate that

theoptimizednon-linearembeddingapproachwedevelopedperformsmarkedlybetter

than existingcommercialsoftware which suggests that futureresearch shouldconsist

of pursuing these lines rather than the linear additive watermarking paradigmdealt

with inthe bulkof the literature.

(13)

Résumé

0.1 Introduction

Le travaildécritdans cettethèse sesitue dansledomainedu ligranage d'images

digitales. Il s'insère dans un projet global dont le but est de développer un système

pourlaprotectiondesdroitsd'auteursurleWeb. Nousregardonsicilecasparticulier

des droitsd'auteurs appliquésauximagesetnousnouslimitonsàl'aspect traitement

d'imagedu problème. Lesaspects cryptographiques sont traitéspar Herrigel [33].

L'idéeprincipaleduligranageconsisteàencoderuneinformationdansuneimage

en eectuant des légères modications sur les pixels. Pour qu'un ligranesoit utile,

troiscritèresprincipauxdoiventêtresatisfaits: leligranedoitêtrerobuste, invisible,

et doit contenir une certaine quantité d'information quel'on désigne par le termede

capacité. Nousconsidérons ces trois pointsimportantsen détail.

0.1.1 Robustesse

Un ligrane doit être capable à résister plusieurs type d'attaques. Ces attaques

peuvent être divisées en quatre catégories:

Lapremierecatégorieconcernelesattaquessimplesquinechangentpaslagéométrie

de l'image, et qui n'utilise pas d'information a priori sur l'image. Les attaques de

cette catégorie sont : la compression JPEG et ondeletes, le ltrage, l'addition de

bruit, la correction gamma, et l'impression suivie par une rénumérisation à l'aide

d'un scanner.

Ladeuxièmecatégoriecomprendlesattaquesquimodientlagéométriedel'image.

On y trouve la rotation, le changement d'échelle, et les transformation anes. Des

attaques plus sophistiqués incluses dans cette même catégorie sont l'enlèvement de

lignes oude colonnes, ainsi queles distortions locales non-linéairesqui sont misesen

oeuvre dans leprogramme Stirmark[66].

La troisième catégorie d'attaques comprend les attaques dites d'ambiguité. Ici

l'idéeestdecréerunesituationoùl'onnepeutpasdéciderdelavéritableorigined'un

ligrane. Craver [13] démontre quedans certaines conditions, onpeut créer un faux

original en soustraiant un ligrane d'une image déjà marquée. Récemment Kutter

(14)

marquepeut êtreestiméestatistiquementsur l'imageprotégée, puisajoutée àl'image

de destination.

La dernière catégorie comprend les attaques consistant à enlever le ligrane de

l'image. Ici on utilise des informations a priori sur le ligrane et sur l'image pour

eectuer un débruitage optimalde l'image, oùleligraneest traitécomme un bruit.

Voloshynovsky[95]démontrequ'ilestmêmepossibled'augmenterlaqualitédel'image

en enlevant leligrane avec des algorithmes de débruitage bien adaptés.

0.1.2 Visibilité

Leligranedoitaussiêtreinvisible. Enpratiquececiimpliquequel'imageoriginale

et l'image marquée doivent être indiérenciables à l'oeil. L'évaluation objective de

la visibilité est un sujet dicile. Dans un premie temps, Petitcolas a proposé un

PSNR de 38dB pour indiquer si deux images sont indiérenciable. Nous trouvons

en pratique que cette mesure est insusanteet dans cette thèse nous proposons une

mesure basée sur lamétrique de Watson.

0.1.3 Capacité

Leligranedoit aussiconteniraumoins60bitsd'information. Ceci estimportant

selon les dernières standardisations puisque l'on aimerait insérer un identicateur

qui associe un vendeur, un acheteur et une image. Malheureusement la plupart des

publications actuelles sur le sujet décrivent des ligranes ne portant qu'un seul bit

d'information.

0.1.4 Contributions principales

Les contributions principales de cette thèse sont:

1. Le développement de techniques basées sur la transformée discrète de Fourrier

(DFT)ande rendrelesligranesrésistantsauxtransformationsgéométriques.

2. Ledéveloppement d'un nouvel algorithmenon-linéaire, basésur latransformée

DCT, qui dépend de l'image, et qui résiste à l'attaque consistant à copier un

ligraned'une image etl'insérer dans une autre.

3. Le développementd'une nouvelle méthode pour évaluer laqualité d'une image

signée.

Nousdétaillonsces contributions dans lessections qui suivent.

(15)

0.2 Stratégies d'insertion de Filigrane

Il existeplusieursstratégies pourinsérerun ligranesatisfaisantauxtroiscritères

présentés. Cox [12] utiliseleschéma généralde lagure0.2. Tout d'abordoncalcule

Module D’attenuation

Image

Modele

Image Filigrane

Perceptuel

avec Filigrane

Figure1: Modèle d'un système de ligranageadaptatif

des contraintes imposées à partirde l'imageà marqueran de préserver l'invisibilité

du ligrane. Oncalculeensuiteleligrane,onappliquenoscontraintes, etlerésultat

est ajoutéà l'imageoriginale,nous donnantainsil'imagesignée. Ceciest un système

de ligranesadditifetlinéaire. Laplus grandepartiedelarecherche dans ledomaine

de ligranage s'est eectué dans ce cadre, etnous en détaillonslesprincipes.

0.2.1 Filigranage linéaire additif

Nous avons l'information binaire b = (b

1

;:::;b

L

) à insérer dans l'image x =

(x

1

;:::;x

N )

T

de tailleM

1

M

2

, où N = M

1 M

2

. Typiquement b est converti dans

un formeplus robuste en utilisant lescodes de correctionsd'erreurs tel quelescodes

BCH, LDPC ou turbo [62, 93]. En général cette conversion peut être décrite par

la fonction c = Enc(b;Key) où Key est la clé secrète de l'utilisateur. Le ligrane

s'exprime alors commeune fonction de cpar:

w(j)= K

X

k=1 c

k p

k

(j)M(j) (1)

où M est un masque local qui atténue le ligrane pour le rendre invisible, et p est

un ensemble de fonction orthogonales en deux dimensions. L'image avec le ligrane

additifs'écrit alors :

(16)

Pour décoder le ligrane on considère que l'on reçoit l'image marquée y 0

après

avoir subit diverses distortions. An d'extraire le ligrane nous calculons d'abord

une estimation du ligrane:

b

w=Extr(y 0

;Key) (3)

Pour calculer cette estimation, on modélise le bruit introduit par l'image ainsi que

le bruit apporté par des attaques par une distribution de probabilité p

X

(:), et l'on

obtient alors le maximum de vraisemblancepar:

b

w=argmax

e w2<

N p

X (y

0

jw)e (4)

Dans le cas d'un bruit Gaussien nous obtenons la moyenne. Dans le cas d'un bruit

Laplacien,nousobtenonslamédiane. Ilestaussipossibled'utiliserl'estimateurMAP

(Maximum a Posteriori).

b

w=argmax

e w 2<

Nf p

X (y

0

jw)e p

W

(w)ge (5)

où p

W

(:) est la distribution du ligrane. Si l'on fait l'hypothèse que l'image est

locallement Gaussienne avec x N(x;R

x

) et w N(0;R

w

) et avec matrices de

covariance R

x etR

w

, nous avons:

b w=

R

w

R

w +R

x (y

0

y 0

) (6)

oùy 0

x,et b

R

x

=max

0;

b

R

y R

w

est l'estimationdelavariancelocaledel'image

b

R

x

= 2

x I

.

Pour décoder leligrane nous eectuons d'abord les corrélations:

r =hw;b pi: (7)

Le décodeur optimal s'exprimecomme:

b

b =argmax

e

b p

r j e

b;x

: (8)

Enpratiquececi revientàdécoderdes codes de correctiond'erreurs. Il est important

de noterque souventoneectueune quanticationdes valeursavantledécodage des

codes de correction d'erreurs,ce qui conduitàune perte de 3-6dB.Lescodes lesplus

puissants (turbo et LDPC) utilisent toute l'informationan d'obtenir une meilleure

performance.

(17)

Image

Modele

Image Filigrane

avec Filigrane Optimisation

Perceptuel

Figure2: Modèle d'un système de ligranagenon-linéaire

0.2.2 Filigranage non-linéaire

Lemodèle deligranage linéaireest pratiqueparcequ'ilest simple. Malheureuse-

ment, l'utilisation de ce modèle entraîne des pertes à l'insertion de la signature qui

résultentdu faitqu'ontraitel'imagecommeun bruit. Cependant,l'imageestconnue,

et cette information devrait être exploitée pour l'insertion. Cox propose le modèle

généralengure0.2.2pourun systèmedeligranagepluspuissant. Danscesystème,

au lieu d'une simple atténuation, nous optimisons le ligrane par rapport à l'image

lors de l'insertion de la signature. Typiquement nous pouvons utiliser les tables de

quantisation JPEG ou des ondelettes pour optimiser le ligrane par rapport à la

compression.

0.3 Filigranage dans le domaine DFT

LaDFTaplusieurspropriétésintéressantesquifontquecettetransforméeseprête

bien auligranage. Le DFT est déni en 2D par:

F(k

1

;k

2 )=

N

1 1

X

x

1

=0 N

2 1

X

x

2

=0 f(x

1

;x

2 )e

j2x

1 k

1

=N

1 j2x

2 k

2

=N

2

(9)

et son inverse par:

f(x

1

;x

2 )=

1

N

1 N

2 N1 1

X

k

1

=0 N2 1

X

k

2

=0 F(k

1

;k

2 )e

j2k1x1=N1+j2k2x2=N2

(10)

où N

1 et N

2

sont les dimensions de l'image. En pratique nous travaillons avec la

(18)

trouvons que lamagnitude de laDFT subitla transformation suivante :

k

1

k

2

! T

1

T

k

1

k

2

(11)

Cettepropriétéestimportantecarsinouspouvonsdéterminerlatransformationsubit

parl'image,alors nouspouvons lacompenserdans ledomaineDFT avantde décoder

le ligrane. Nous allons voir comment la transformation peut être détectée à l'aide

d'une mire de recalage ( template). On note aussi que la magnitude de la DFT est

invarianteà la translation.

0.3.1 Insertion du Filigrane

Pour insérer un ligrane dans le domaineDFT nous commençons par compléter

l'image avec des zéros an de l'amener à une taille xe de 10241024. Ensuite la

séquence de bits est traduite par la correspondance 0! 1 et 1 ! 1. Pour insérer

lemessageonchoisit une bandede fréquences entre f

w1 etf

w2

,quise situeaumilieu

du spectre. Danscette bande, nous choisissons des paires de points aléatoirementet

nous changeons les valeurs de sorte que k

w

~ m

ci

= (x

i

;y

i ) (y

i

; x

i

), où (x

i

;y

i ) sont

les points choisis, m~

c

i

sont les bits du message, et k

w

est la force du ligrane. La

forceest choisie de façonàlimiterlesdistorsionsdans ledomainespatiallorsque l'on

maximise laforce dans ledomainespectral.

0.3.2 Insertion du Template

Pourinsérerletemplateonchoisit8pointsaléatoirementdansleDFTetoninsère

des pics àces points. Letemplate necontientaucune informationmaisil sertcomme

un outil pour détecter les transformationssubies par l'image.

0.3.3 Décodage

Le décodage se divise en deux parties: détection du template et le décodage du

ligrane. Pour détecter le template nous utilisons l'algorithmesuivant:

1. Calculde lamagnitude du DFT de l'image.

2. Extractiondes maxima locaux.

3. Estimationde lamatricede correspondance entre lespositionsconnues dutem-

plate et les positions des maxima détectés. Pour limiter l'exhaustivité de la

recherche, oncontraintletypedematriceàestimerauxtransformationsraison-

(19)

Si le template est détecté nous pouvons dire avec une probabilité P

fal se

= ( 9

2000 )

8

3000 5:010 16

qu'un ligrane était inséré. Une fois que le template est détecté,

lamatricecalculéenouspermetde compenser pour latransformationgéométriqueet

de décoder le ligranecorrectement.

0.4 Filigranage dans le Domaine DCT

La contribution la plus importante de cette thèse est le développement d'une

méthodenon-additivetenantcomptedetoutel'informationdel'imagelorsdel'insertion

du ligrane. Le problème avec l'insertiond'un ligranedans le domainespectral est

de limiter les distortions dans le domaine spatial tout en maximisant la force dans

ledomainespectral. Voloshynovskiy propose l'utilisationd'une fonctionNVF (Noise

VisibilityFonction)pour mesurer lesdistortionsvisuelles, quiindiqueleniveaumax-

imal de modication acceptable à chaque pixel sans qu'elle devienne visible. Cette

fonction sedénit comme suit :

NVF(i;j)=

w(i;j)

w(i;j)+ 2

x

; (12)

où w(i;j)=[()]

1

kr(i;j)k 2

etr(i;j)=

x(i;j) x(i;j)

x

. Le NVF nous dit surtout com-

ment marquer les régions texturées. Pour marquer les régions uniformes, nous pou-

vons exploiter laluminance,en observant quel'oeilest moinssensible auxvariations

dans les régionsclaires que dans lesrégions foncées. Ceci conduità:

pi;j

=(1+kCST(x

i;j

)((1 NVF(i;j))S+NVF(i;j)S

1

) (13)

oùCST estlafonctiondesensibilitéàlaluminance,etk,S,etS

1

sontdesconstantes

qui pondèrent lepoid des composantes.

Pour encoder dans ledomaineDCT nouspouvons alors formuler un problèmede

minimisation. Pour encoder un 1 on maximiseraune valeur DCT et pour encoder

un0 onminimiseraune valeurDCT. Pouroptimiserlaforcedu ligraneonformule

le problème commesuit :

min

x f

0

x ; Axb (14)

où x = [x

11 :::x

81 x

12 :::x

82 :::x

18 :::x

88 ]

t

est le vecteur de valeurs DCT, f est un

vecteurde zérossaufauxpositionsoùl'onaimeraitinsérer1 ou0. A cespositions

oninsère 1 et 1 respectivement dans f. Axb contient lescontraintes:

A= 2

4

IDCT

IDCT 3

5

; b= 2

4

p

3

5

(15)

(20)

où IDCT est la matrice d'inversion de la DCT. On résout la minimisation par la

méthode Simplex.

Pour mettre ceci en pratique, nous divisons l'image en blocs de taille 88 et

dans chaque bloconchoisit2endroitspourinsérer l'information. Audécodage onlit

la séquence des endroits choisis et l'on décode le code de correction d'erreurs. Dans

cettecontexteilsetrouvequelesturbocodesnousdonnentlameilleureperformance.

Une amélioration importante de la méthode consiste à utiliser le codage décrit

dans letableau1. Audécodageon prendlamagnitude de lavaleur etonla compare

Table 1: Codage par Magnitude

signe(c

i

) bit Codage

+ 0 baisse c

i

0 augmentec

i

+ 1 baisse c

i

1 augmentec

i

à un seuil T qu'on xé à 15. L'avantage principal de ce type de codage est que le

ligranevarieselonl'image. Enparticulierl'insertionduligranedépend delavaleur

déjà présente dans le DCT, et par conséquent, ce type de ligrane ne peut pas être

copié d'une imageà une autre.

0.5 Évaluation de qualité

Pour eectuer des comparaisons objectives de la qualité des images marquées, il

faut dénir un critère objectif. Le PSNR, qui est en général utilisé, est insusant.

La gure3 montre une comparaisonentre 4images: l'originale,le ligraneadaptatif

utilisant un modèle stationnaire Gaussien généralisé, le ligrane adaptatif utilisant

un modèlenon-stationnaire Gaussien,etle ligranenon-adaptatif. Touteslesimages

sont marquées avec le même PSNR mais on remarque que les ligranes adaptatifs

restentinvisibles etalorsqueleligranenon-adaptatifestvisible. Cecirésultedufait

quel'oeilest plus sensibleauxmodicationsdans lesrégionsplates. LePSNR est un

critère globalqui ne pondèrepas lesdistorsions en fonction de larégion.

Pour remédier à ce problème, nous pouvons utiliser la métrique de Watson, qui

évalue l'impact des distorsions en fonction de la région de l'image. L'idée principale

est de calculerla DCTblocpar blocsur toute l'image. Lesblocssontde taille88.

Pour chaque coecientDCT, nous xons un seuilde visibilitét

ij

quiaété déterminé

expérimentalement. Ce seuil est ensuite ajusté en fonction de la luminance et de la

texture par les équations:

t

ijk

=t

ij (

c

00k

c )

a

t

(16)

(21)

(a)original (b)adaptatifsGG

(c)adaptatif nG (d)non-adaptatif

Figure3: a)OriginalBarbara b)PSNR=24.60 c)PSNR=24.59 d)PSNR=24.61

(22)

et

m

ijk

=Max[t

ijk

;jc

ikj j

wij

t

ijk 1 wij

] (17)

oùc

00k

est lecoecientDCdu block,c

00 ,eta

t

sontdes constantes, m

ijk

est un seuil

de visibilité,etw

ij

est ledegrédemasquagedans lesrégionstexturées. Pour mesurer

ladégradation subie par une image, nous utilisons alors:

d

ijk

= e

ijk

m

ijk

(18)

oùe

ijk

estl'erreurpour unecomposanteDCT.Eneetnouspondéronsl'erreurparsa

visibilité. PourobteniruncritèreglobalnouseectuonsunesommationdeMinkowski

des erreursaux carréessur touslesblocs. Enutilisantcette métrique,nous obtenons

pour le ligrane non-adaptatif une erreur de 9.27, alors que les ligranes adaptatifs

ont des erreurs comprises entre 7.73 et 7.87. Nous observons que la métrique arrive

à bien diérencier la qualité des images bien que le PSNR ne nous donne aucun

renseignement utile.

0.6 Résultats

Les résultats indiquent que l'algorithme DFT à une bonne performance globale.

En particulier, l'algorithme est robuste contre les distortions géométriques ainsi que

contre les ltres passe-haut et passe-bas. Malheureusement l'algorithme ne résiste

pas àune compressionJPEGavec unfacteur de qualité inférieurà60%, etne résiste

pas aux distortions locales mises en oeuvres dans le program Stirmark. De plus,

l'algorithme ne résiste pas à l'attaque de copie du ligrane. L'algorithme DCT, par

contre, n'a aucune résistance aux distortions géométriques. En revanche, il résiste

à l'attaque de copie du ligrane ainsi qu'à une compression JPEG à un facteur de

qualité de10%. Malheureusement,leligranene résistepas àdes distorsions locales.

Par conséquent, le travail qui reste à faire consiste à ajouter un template à

l'algorithme DCT pour le rendre résistant aux transformations géométriques. Il est

aussi important de travaillersur de nouvelles méthodes qui résistent aux distortions

locales.

(23)

Chapter 1

Introduction

The World WideWeb, digital networks and multimediaaord virtually unprece-

dentedopportunitiestopiratecopyrightedmaterial. Digitalstorageandtransmission

makeittrivialtoquicklyand inexpensivelyconstructexact copies. Consequently the

needformethodstoidentify,authenticate,andprotectmultimediahas spurredactive

research inwatermarking.

Watermarking is a special case of the general information hiding problem. The

central idea is to robustly embed information in a medium known as the cover

object in order to produce the stego object. The embedding is done in such a

way that the cover and stego objects are indistinguishable. Cover objects include

images,3D graphics,video, music, textdocuments andhtmldocuments. Asurvey of

methods used tohide informationinvarious mediais given in[65]. In this thesis we

willbeinterestedin theproblemof imagewatermarking forthe purpose ofcopyright

protection. Withrespecttothis application,wewould liketoembedinformationinto

animagewhichassociatesabuyerwithaseller. Inordertobeuseful thisinformation

must beeasily and reliably recovered at alater time if the need arises.

The rapidly expanding world wide web provides a plethora of applications for

watermarking technology. Recently, art galleries and museums as well as private

photographers and artists have showed interest in selling their work on the web.

The availablility of high resolution color scanners and monitors as well as image

processing software makes the process of transforming traditional color images into

highresolutiondigitalimagesarelativelyeasytask. Onceindigitalform,thegrowing

infrastructures for e-commerce can be exploited to reach a potentially enourmous

clientele over the web. However, the drawback of this approach arises fromthe fact

that with current technology, digital images can be reproduced easily, quickly and

without loss. Consequently some mechanism for the protection of owner's rights

must bedeveloped todiscourage theftwithinthis context. Thisneedhas spurred the

recent research eorts in imagewatermarking.

With respect to the general information hiding problem, a tradeo is involved

(24)

refers to the ability of the inserted information to withstand image modications

(intentional or unintentional). In information hiding we require that the modied

object be indistinguishablefromthe original. Inthe image watermarking context we

use the term visibility to designate this concept. In some publications the word

transparency is also used. Finally capacity refers to the amount of information

we are able to insert into the image. Designing and optimizing informaiton hiding

algorithmsinvolvesthedelicateprocessofjuidiciouslytradingobetweenthesethree

conicting requirements.

CAPACITY ROBUTSNESS

VISIBILITY

Figure 1.1: Robustness, Visibilityand Capacity

In themultimediacontextworkininformationhidinghas beenundertaken inthe

following 3 subgroups: steganography, tamper-proong, and watermarking. In the

case of steganography, we are interested in sending large quantities of information

(large capacity), however we are less concerned with robustness. Our visibility re-

quirementisthatthecoverobjectdoesnotappeartocontainanyhiddeninformation.

Many applications can be found in militarysettings. For example secrets might be

exchangedovertheweb byinvisiblyembeddinginformationinimageswhichare then

transmitted. A wide array of possibilitiesincludingembeddinginformationinmusic,

videos and even in text.

Tamper-proong[39] involvesembedding informationintothe coverobject which

is then used at detection to determine if and how the object has been modied. In

this case the robustness criteriais somewhat dierent since we would like the water-

mark to change as a function of the distortions introduced into the medium. Our

capacity requirement is relatively low and our visibility requirements may be rela-

tively weak since we may be able to tolerate some degradation in the image. One

major applicationisin theauthentication ofdigital evidenceinacourt case. Insuch

a scenarioa dierent approachemployinga fragilewatermark could beused to indi-

cate that indeed a digital image has not undergone processing intended to sabotage

evidence. Fragile watermarking refers to the process of inserting a watermark which

willbedestroyed ifthe mediumischanged. This problemis considerablyeasier than

(25)

the general tamper-proong problem is quite interesting since algorithms must be

designed sothatit ispossibletodetermine thelocationof modicationsinanimage.

Watermarking involves embedding informationwhich can beused to identify the

buyer and seller of a given cover object. This information can later be detected to

identify the true buyer and seller of a given cover object. In this context robustness

is of prime importance since a malicious third party may intentionally attempt to

remove the watermark and later illegally resell the object. Furthermore we also

require that the watermark be invisible since the cover object is of value. This is in

sharp contrast to steganography where the cover object has no value in itself. The

capacityrequirementisalsodierentsinceunlikesteganographyweonlyhaveasmall

amount of information to communicate which includes buyer and seller identiers.

This usually amounts toroughly 100 bits.

This thesis covered several aspects of the watermarking problem. The most im-

portantresultconsistsofaformalizationof theembeddingproblem. In particularwe

demonstrate how to maximize the strength of the watermark subject to constraints

on the visibility. Another important result is the development of methods for re-

covering watermarks which have undergone a generalane transformation. Finally,

we also propose a new method for evaluating image quality and a new benchmark

for evaluating the robustness of watermaking schemes. Both of these problems have

been addressed in the literature, but the initialevaluation criteria for visibility and

robustness provetobeinadequateinpractice. Our resultsindicatethatthe visibility

criteria developed yieldsa substantial improvement over the existing methods. Fur-

thermore our new benchmarks contains attacks whichcorrectly pinpoint weaknesses

of existing methods.

The thesis is structured as follows. In chapter 2 we present the watermarking

problemand the requirements that asuccessful watermarking algorithmmust fulll.

In chapter 3 we formulate watermarking as a communications problem. In chapter

4 we review algorithms which have appeared in the recent literature. In chapter 5

we begin the presentation of the central contributions of this thesis by describing

DFT domainalgorithms. In chapter 6 we present the key contributionof this thesis

which is a mathematical formalization of the embedding process. In chapter 7 we

turn our attention to attacking watermarkings. We then dene a new measure of

perceptual quality of animageinchapter 8and incroporatethe quality measure and

the attacks in a new benchmark which we use to evaluate our algorithmsas well as

two commercial software packages. Finally,we present our conclusions inchapter9.

(26)

Chapter 2

Watermarking for Copyright

Protection

While digital watermarking for copyright protection is arelatively new idea, the

idea of data hiding dates back to the ancient Greeks and has progressively evolved

overtheages. Anexcellentsurveyof theevolution ofdatahidingtechnologiescanbe

found in [38]. The inspiration of current watermarking technology can be traced to

paperwatermarkswhichwere used some700 yearsago for thepurpose of datingand

authenticating paper [28]. The legal power of suchwatermarks was demonstrated in

a court case known as Des Decorations [21]. Watermarks in the context of digital

images rst appeared in 1990[87] and has since received considerableattention with

aparticularlyrapid proliferationofresearchintowatermarking algorithmsinthe last

six years.

In this literature survey we highlight some of the key developments in the dig-

ital watermarking community. We begin in section 2.1 by presenting the general

watermarking framework. We continue in section 2.2 by stating the requirements

a watermarking system must fulll. In sections 3 we formulate the watermarking

problemas acommunicationsproblem. Then insections 4.2 to4.4, we present three

Categories of watermarking and present watermarking algorithms which fall under

each category. Here we consider only robustness againstattacks that donot modify

the geometry of the image. Section 4.5 describes strategies for making watermarks

robusttogeometrictransformationsoftheimage. Insection4.6weturnourattention

to the problem of evaluating and comparing algorithms. Finally in 4.7 we point out

the limitationsof currentstrategies and indicatethe directions taken inthis thesisin

order toimprovethe performance of current watermarking algorithms.

2.1 Watermarking Framework

A simple yet complete model of the watermarking framework appears in gure

(27)

message to be embedded and produces the stego data. The decoding process is

completely analagous where we take the stego data, a secret key, and attempt to

detect if the image contains a watermark, and if it does, we attempt to decode the

message. With respect to the watermark decoding process, the notion of oblivious

Embedding Attacks Decoding

Cover Data

Stego Data

Secret Key Secret Key

Copyright Message Copyright Message

Figure2.1: Model of Watermarking Process

recovery is important. Byoblivious we mean that the watermark can berecovered

fromanimagewithoutthe original. Clearly inthiscase the problemismore dicult

sinceiftheoriginalisavailable,asimplesubtractioncanbeperformedtoseparate the

imagefromthewatermark. Inpracticeobliviousrecoveryisanimportantrequirement

asotherwisealargesearchinadatabaseofimagesmayberequiredtondtheoriginal.

The notionof key basedembeddinganddetection isimportantinordertoensure

cryptographic security. Two types of systems exist: private and public. In a private

keysystem onlythe copyrightholderis ableto decode the watermarkwith hissecret

key. In a public system, the copyright holder embeds the watermark with a private

key, but the watermark can be read with a public key. This is useful in situations

where we have apublic detector. In particular, in certaincontexts a public detector

may check all images leaving or entering a local network. While appealing at rst

glance, Linnartz showed that in situations where a public watermark was available,

the watermark could be removed without any degradations in image quality [47].

Consequently, we willonly consider the case ofprivate keysin which decoding of the

watermark isonly possible with a private key.

In a typical scenario, a number of transactions must take place in order for an

image to be watermarked and securely transferred to the buyer. Figure 2.1 depicts

the exchanges that take place between the owner Alice and the buyer Bob during

the purchase of an image. Also depicted are the possible points of attack by the

maliciousMallet. A formal set of transactions involvingcryptographic protocols has

been proposed in [78]. A large number of threats are linked with cryptographic

security issues. In this thesis we willonly be concerned with attacks directly related

(28)

attempttodevelopalgorithmswhichare robusttoimageprocessingattacks. Wewill

elaborate onthis subset of image processingattacks insection 2.2.2.2.

2.2 Watermarking Requirements

Wereturntogure1.1whichsummarizestherequirementsandtradeosinvolved

in image watermarking. The requirements fall intothe three categories: robustness,

visibilityand capacityand inthe case ofimagewatermarkingwe mustmakecompro-

mises between these conicting requirements.

2.2.1 Capacity

We rst consider the capacity requirement since it is the simplest. There are

two approaches considered in the current watermarking literature. The bulk of the

literature contains 1 bit watermarks where at decoding, hypothesis testing [97] is

used todetermineif awatermark ispresent ornotin theimage. Thisis sucientfor

some copyright applications however many more possibilitiesare available when the

watermarkcontains60-100bitsofinformation. Thissecondclassofapproachesallows

the watermark to contain information such as the buyer and the seller of the image

Furthermoreadditionalagbitsmaybeusedtodescribetheimagecontent. This can

be useful in tracking illegalpornographical images over the Web. It is importantto

distinguishthesetwoclassesbecausealgorithmsusing1-bitwatermarksarenot easily

extended to higher capacities. The brute force approach of embedding 1 watermark

froma set of N possiblewatermarks and then using hypothesis testingto determine

which watermark was embedded is a possible approach which conveys log

2

(N) bits

of information. However this scheme breaks down when N becomes large since the

numberoftests whichmustbeperformedgrowsexponentially. Wewillreturntothis

(29)

2.2.2 Robustness

A secondimportantrequirement of watermarking schemes isrobustness. Clearly

a watermark is only useful if it is resistant to typical image processing operations

as well as to malicious attacks. However, it is important to note that the level of

robustness required varies with respect to the application athand. Werst consider

the required level of robustness of various applications and then categorize various

typesattacks against which watermarking algorithmsmust berobust.

2.2.2.1 Watermarking Applications

In[28] Hartungdistinguishesthe levelofrobustnessrequiredforfourcategoriesof

applications: authentication, data monitoring, ngerprinting, and copyright protec-

tion. In authentication applications, only certain types of attacks and in particular

mild compression must be considered. Since ultimately we wish to determine if an

image is authentic, we would like the watermark to be destroyed when the data is

manipulated. Data monitoring and tracking need higher levels of robustness. These

applications required the detection of transmitted or stored media typically for the

purposesofbilling. Here thewatermarkshouldberesistanttocompression aswellas

format conversions. Inthese rsttwocategories wenote thatonlya 1-bitwatermark

is needed since we need only make a decision about the presence or absence of a

watermark. Consequently thecapacityrequirementislow. The bulkofthe literature

containsalgorithmswhichaddressthesetwocategoriesofalgorithmsandmanyviable

solutionsbased on hypothesis testing [97] exist.

Fingerprintingapplicationsassociatereceiversofwatermarked mediatothemedia

itself by inserting a unique user ID and a unique document IDintothe media. This

provides a mechanism for tracing pirated copies to the person who pirated them.

Since in a typical example one image may be sold to many people, we immediately

see the need for highercapacity inthis application. We needenough bits so that we

canhaveauniqueidentierforeachimagesold. Therobustnessrequirementsarealso

high since the watermark must survive data processing as well as malicious attacks

by pirates who attemptto remove the watermark.

Similarly, copyright protection also requires high capacity (60-100 bits in typical

commercial systems) [75, 78, 34, 9] and also require a high level of robustness. In

addition todata processing and malicious attacks, watermarks for copyright protec-

tionmustbeabletowithstandmultiplewatermarkswhichmaybeinsertedbypirates

in order to create a deadlock situation where the true owner of an image becomes

ambiguous[14, 43].

In this thesis we will be concerned with the problem of copyright protection adn

oblivious recovery which is the most challenging of the aforementionedproblems.

(30)

2.2.2.2 Characterization of Attacks

As we have seen, depending onthe typeapplication, we require acertain level of

robustness against intentional or unintentional attempts to remove the watermark.

Theseattacks canbebrokendown into4categoriesasproposedbyHartungin[29].

The rst class of attacks are simple attacks that do not change the geometry

of the image and do not make any use of prior information about the watermark.

For example these methods do not treat the watermark as noise, but assume the

watermarkand the hostdata are inseparable. Attacks inthis category includelter-

ing, JPEG and wavelet domain compression, additionof noise, quantization, digital

to analog conversion, enhancement, histogram equalization, gamma correction, and

printingfollowed byre-scanning. Theseattacksattempttoweaken detector response

by increasing the noise relativetothe watermark.

The second class of attacks are those that disable the synchronization of the

watermark detector. This class of attacks includes geometric transformation such as

cropping,rotations,scalings,andshearingorgenerallineartransformations. Moreso-

phisticatedattacksinthiscategoryincludetheremovalofpixels, orlinesandcolumns

as done inthe programUnZign [91]. Even more subtle attacks are performed in the

programStirmark 3.1[67] wheretheimageisunnoticeablydistorted locallyby bend-

ing and resampling. The main goal of these attacks are to render the watermark

unreadable even thoughit isstill present inthe modiedimage.

The third class of attacks are ambiguity attacks. Here the aim is to create a

deadlock where it is unclear which image is original. One example is the insertion

of a second watermark by a pirate [35]. Craver [14] introduces the concept of non-

invertiblewatermarking schemes and demonstratesthat undercertaincircumstances

a fake original can be created. This creates a deadlock situation in which it is

impossible to determine the true owner of an image. Another attack which can be

placed in this category is the copy attack [43]. Here the attacker estimates the

watermark from one image and adds it to another image to produce a watermarked

image.

The nal class of attacks are the Removal Attacks. In many ways these are

the most sophisticated attacks since they take into account prior knowledge of the

watermarking process. These attacks attempt to estimate the watermark and then

remove the watermark withoutvisibledegradation to the host media. Examples are

collusion attacks which attempt to get a good estimate the watermark from several

watermarked images [84]. Another possibility is denoising where the watermark is

modeledasnoise[44]. Recently,Voloshynovsky [95]showedthatitispossibleinsome

cases to improve the quality of the image while removing the watermark. This isan

importantresultsince it demonstratesthe powerof denoising schemes inperforming

(31)

2.2.3 Visibility

Wenow turnour attention tothe question ofvisibility. Since watermarking algo-

rithmsembed informationby modifyingthe originaldata, inordertocreate invisible

watermarks, we must have a precise understanding as to the amount pixels can be

modied withoutvisiblechangesin the host image.

Most of the ideas currently used in watermarking with respect to evaluating the

visibility of watermarks have been inspired by perceptual coders designed for com-

pression. Thekeyidea isthat bothwatermarking andcompressionintroducenoisein

the imageand that inbothcases we would like tominimizethe perceived distortion.

Perceptual coders minimize the error perceived by the HVS. These were intro-

duced since it was found that working with the peak signal to noise ratio (PSNR)

criterion and the mean square error (MSE) criteria was inadequate in reducing per-

ceived distortions introduced by compression. Unlike PSNR and MSE which are

purely mathematicaland globalcriteriaperceptualcodersuse criteriawhichattempt

to take into account the behavior of the HVS. These coders have been shown to

greatly improve the performance of compression algorithms while minimizing visi-

ble distortion to the image. For example the JPEG compression scheme quantizes

DCT coecients based on tables which reect the perceptual impact of each DCT

coecient.

Themostrecentresearchonperceptualmodelsincorporatelowlevelvisionfactors

aswell ashigher level factors. Untilthe early1990's,most of the research concerned

low level factors. We begin by examining the low level factors in detail and then

briey consider higher level factors which have not yet been seen in watermarking

algorithms.

2.2.3.1 Low Level Vision Factors

Jayant [37] gives anoverview of perceptual coders and identies three important

low level factors which inuence whether noise will be visible in a given part of an

image: luminance,frequency, and texture.

Luminance: Jayant states that the HVS is more sensitive in detecting noise

for mid-level gray values than for bright and dark regions respectively. How-

ever conicting results have appeared inthe literature. Atypicalexperimental

setupisgiveningure2.2awherethesubjectisaskedatwhichpointthecentral

square becomes visible against a constant gray level background. All authors

agree that athigh luminance levels the sensitivity of the HVS follows Weber's

law which states that Æl

l

= k

Weber

where Æl is the change in luminance of the

central square and l is the luminance of the background. However at lower

luminance levels, while Jayant suggests that the HVS is less sensitiveto noise,

most recent publications suggest that the HVS is more sensitive to noise. Re-

(32)

(typically<l

th

=10cd=m 2

)whichstatesthat Æl

l

= q

l

th k

Weber

. Thecomplete

curve is shown in 2.2b. The x-axis contains the luminance between 0 and 255

while the y-axis indicates the contrast sensitivity threshold which is the ratio

Æl

l

at which the centralsquare becomesvisible.

(a)Experimental setup

0 50 100 150 200 250

0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08

luminance

contrast sensitivity threshold

(b)Sensitivityfunction

Figure2.2: Sensitivity of the HVS to changes inluminance

Frequency: the sensitivity ofthe eyetovariousfrequenciescan bedetermined

by using the same experimental setup where a mid-level grey value is used

and the subject is asked to determine the visibility threshold for the central

square which contains given spatial frequencies. It has been determined that

theeyeismostsensitiveatthelowerfrequenciesandmuchlesssensitiveforhigh

frequencies. Rather than using pure frequencies (sine waves) it isalso possible

to use the various subbands of a transformation, for example DCT or wavelet

domainsubbands. TheDCT88blocktransformhasbeenextensivelystudied

since it is used in the JPEG compression standard [64, 1, 98]. Consequently

the masking properties are well known. Masking refers to the fact that the

presenceof apatternisundetectable tothe eye ifasimilarpatternexists atthe

same spatiallocation. In the case of the DCTthe patterns weconsider are the

DCT basis functions. Fora given DCT component (i;j) we have the visibility

threshold given by [98]:

log

10 t

ij

= log

10 T

min

r

ij

+k(log

10 f

ij log

10 f

min )

2

(2.1)

with:r = r+(1 r)cos 2

(2.2)

(33)

where

T

min

=

L

S

0

ifL>L

T

L

S0 (

L

T

L )

1 a

t

ifLL

T

(2.3)

k=

k

0

ifL>L

k

0 (

L

k )

a

k

ifLL

k

(2.4)

f

min

=

f

0

ifL>L

f

0 (

L

f )

a

f

ifLL

f

(2.5)

whereL

T

=13:45cd=m 2

;S

0

=94:7;a

t

=0:649;L

k

=300cd=m 2

;k

0

=3:125;a

k

=

0:0706;L

f

=300cd=m 2

;f

0

=6:78cycles=deg;r=0:7 and a

f

=0:182:Also,

f

ij

= 1

16 q

(i=W

x )

2

+(j=W

y )

2

(2.6)

where W

x

and W

y

are the horizontal and vertical size of a pixel in degrees of

visualangle. The angularparameter is given by:

ij

=arcsin 2f

i0 f

0j

f

ij 2

(2.7)

The parameters were determined by extensive subjective tests and are now

widely adopted.

It has now been established that there is an important interaction between

luminance and frequency which Watson incorporatesin the modelby setting

t

ijk

=t

ij (

c

00k

c

00 )

a

t

(2.8)

wherec

00k

idtheDC coecient ofblock k,c

00

, anda

t

determines the degreeof

masking (set to0:65 typically).

Texture : Texture masking refers tothe fact that the visibilityof a patternis

reducedbythepresenceofanotherintheimage. Themaskingisstrongestwhen

both components are of the same spatial frequency, orientation and location.

Watsonextendstheresultsofluminanceandfrequencymaskingpresentedabove

toinclude texture masking. This isdone by setting:

m

ijk

=Max[t

ijk

;jc

ikj j

w

ij

t

ijk 1 w

ij

] (2.9)

where m

ijk

is the masked threshold and w

ij

determines the degree of texture

masking. Typicallyw =0and w =0:7for allother coecients.

(34)

From these three factors a perceptual error metric can be derived which yields an

indicator of image quality after modications to the image. This can be computed

by rst setting

d

ijk

= e

ijk

m

ijk

(2.10)

where e

ijk

is the error in a given DCT coecient in a given block k. We then use

Minkowskisummation to obtainthe qualitymetric:

d(I;

~

I)= 1

N 2

[ X

ij (

X

k d

ijk

s

)

f

s

] 1

s

(2.11)

with

s

=

f

=4.

The modelpresented abovehas been developed byextensivetestingand hasbeen

adaptedforuseinconjunctionwiththeDCT88transform. RecentlyPodilchuk[71]

used such a model ina DCT domain watermarking scheme which demonstrated the

applicabilityof such modelsto theproblemof watermarking. Bycontrast,recentlya

spatial domain mask has been derived based on a stochastic modellingof the image

[94]. Since this mask will be used with the algorithms developed in this thesis, we

consider the modelindetail.

One of the most popular stochastic image models, which has found wide appli-

cation in image processing, is the Markov Random Field (MRF) model [24]. The

distribution of MRF's is writtenusing aGibbs distribution:

p(x)= 1

Z e

P

c2A Vc(x)

; (2.12)

whereZ isanormalizationconstantcalledthepartitionfunction,V

c

()isafunctionof

alocalneighboringgroupcofpointsandAdenotes theset ofallpossiblesuchgroups

orcliques. Animportantspecial caseofthis modelistheGeneralizedGaussian(GG)

model. Assume that the coverimage isa randomprocess with non-stationarymean.

Then, using autoregressive (AR) modelnotation, one can write the cover imageas:

x=Ax+"=x+"; (2.13)

where xis the non-stationary localmean and " denotes the residualterm due to the

error of estimation. The particularities of the above modeldepend on the assumed

stochastic properties of the residualterm:

"=x x=x Ax=(I A)x=Cx; (2.14)

where C = I A and I is the unitary matrix. If A is a low-pass lter, then C

represents a high-pass lter. We note that the same distribution models the error

(35)

This type of model has found widespread applications in image restoration and

denoising [52, 8] as well as in some recent wavelet compression algorithms [48, 50].

Here we use the stationary Generalized Gaussian (GG) modelfor the residual term

". The advantage of this model is that it takes local features of the image into

account. This is accomplished by using an energy function, which preserves the

image discontinuities under stationaryvariance.

The auto-covariancefunction for the stationary modelisequal to:

R

x

= 0

B

@

2

x

0 0

0

2

x 0

.

. 0 .

.

0

0 0 0

2

x 1

C

A

; (2.15)

where 2

x

isthe global image variance. The model can be writtenas:

p

x

(x)=( ()

2 ( 1

)

) N

2

1

jdetR

x j

1

2

exp f ()(jCxj

2

) T

R

2

x jCxj

2

g; (2.16)

where () = r

( 3

)

( 1

)

and (t) = R

1

0 e

u

u t 1

du is the gamma function, R

x

is de-

termined according to (2.15), and the parameter is called the shape parameter.

Equation (2.16) includes the Gaussian ( = 2)and the Laplacian ( =1) models as

special cases. For real imagesthe shape parameter isin the range 0:3 1.

It has been shown [94] that from the generalized Gaussian model, we can derive a

noise visibility (NVF) at each pixelpositionas:

NVF(i;j)=

w(i;j)

w(i;j)+ 2

x

; (2.17)

where w(i;j)=[()]

1

kr(i;j)k 2

and r(i;j)=

x(i;j) x(i;j)

x .

The particularities of this model are determined by the choice of two parameters

of the model, e.g. the shape parameter and the global image variance 2

x . To

estimate the shape parameter, we use the moment matching method in [50]. The

shape parameter for most of real images is in the range 0:3 1. Once we

have computed the noise visibility function we can obtain the allowable distortions

by computing:

pi;j

=(1 NVF(i;j))S+NVF(i;j)S

1

(2.18)

where S and S

1

are the maximum allowable pixel distortions in textured and at

regions respectively Typically S may be as high as 30 while S

1

is usually about 3.

Wenote thatinat regions theNVF tendsto1 sothatthe rst term tendsto 0and

consequently the allowable pixel distortion is at most S which is small. Intuitively

(36)

this makes sense since we expect that the watermark distortions will be visible in

at regions and less visible in textured regions. Examples of NVFs for two images

are given in gure 2.3. We note that the modelcorrectly identies texturedand at

regions. In particular the NVFis close to0 intextured regions and close to 1in at

regions.

(a) (b)

50 100 150 200 250

50

100

150

200

250 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

(c)

50 100 150 200 250

50

100

150

200 250 0.1

0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

(d)

Figure 2.3: Original images of Barbara (a) and Fish (b) along with their NVF as

determined by a generalizedgaussian model(c) and (d).

2.2.3.2 High Level Vision Factors

We now briey consider higher level factors which will not be exploited further

in this thesis, but may well bethe subject of future research. Unlike low level vision

factors, high level vision factors require feedback from the brain. In particular, re-

cently much attention has been given to the problem of determiningthe importance

subjects assign to regions in an image. To this end, 8 factors have been identied

(37)

Contrast: Regionswhichhaveahighcontrastwiththeirsurroundsattractour

attention and are likelyto be ofgreater visualimportance[100].

Size: Larger regions attract our attention more than smaller ones however a

saturation point exists after which the importancedue to size levelso [22].

Shape: Regionswhicharelongand thin(suchasedges)attractmoreattention

than round at regions[80].

Colour: Some particular colours (red) attract more attention. Further more

the eect ismore pronounced when the colour of aregion is distinctfrom that

of the background [54].

Location: Humanstypicallyattachmoreimportancetothecenter ofanimage

[20]. Thismayormay not be importantinwatermarking applicationssince we

must account for the situation where the image is cropped. When this occurs,

the center changes.

Foreground/Background: Viewersaremoreinterestedinobjectsinthefore-

groundthan those in the background [6].

People: Many studies have shown that we are drawn to focus on people in a

scene and inparticular their faces, eyes, mouthand hands [100].

Context: Viewers eye movements can be dramaticallychanged depending on

the instructions they are given prior to or during the observation of an image

[100].

(38)

Chapter 3

Watermarking as Communications

Having looked at the requirements of a watermarking algorithm, we are now in

positiontoadoptastandardformulationwhichwillserveasaparadigmforanalyzing

and developing watermarking schemes.

3.1 Watermarking as a Communications Problem

Weformulatewatermarkingasacommunicationsproblemasdepictedingure3.1

[95]. Heretheoriginalbinarymessagebisencoded typicallyby errorcorrectioncodes

00 00 11 11

00 11 0

1 0 1

Cover Image

Message

Perceptual Model

Watermark Embedder

Encoder

Attacks Watermark Extractor

Watermark Decoder

b x

M

c Key

w ^ b ^

y’

y

Detector Watermark

Νο Yes

k

Figure3.1: Watermarking asa Communications Problem

[30]orM-sequences [89]toproduce thecoded messagec . This codedmessageisthen

embedded intothe cover image xby some embeddingfunction y =E(c;x;k)where

k is a cryptographic key. Typically the embedder rst takes the coded message c

andthe perceptual maskM tocreateawatermark wwhichistheninserted (usually

added or multiplied) into the image either in the spatial domain or in a transform

domain. The resultingimagey maythen undergoattackswhichproducethe y 0

from

which we would like to recover the watermark. The watermark recovery process in

its most general form consists of three steps: watermark extraction (or estimation),

0