• Aucun résultat trouvé

The DART-Europe E-theses Portal

N/A
N/A
Protected

Academic year: 2022

Partager "The DART-Europe E-theses Portal"

Copied!
120
0
0

Texte intégral

(1)

HAL Id: tel-01087106

https://tel.archives-ouvertes.fr/tel-01087106

Submitted on 25 Nov 2014

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

semi-paramétriques et non paramétriques

Jean-Bernard Salomond

To cite this version:

Jean-Bernard Salomond. Propriétés fréquentistes des méthodes Bayésiennes semi-paramétriques et non paramétriques. Mathématiques générales [math.GM]. Université Paris Dauphine - Paris IX, 2014.

Français. �NNT : 2014PA090034�. �tel-01087106�

(2)

Thèse

présentée à

L'université Paris-Dauphine

Éole dotorale de Dauphine

Centre de Reherhe en Mathématiques de la Déision

Pour l'obtention du titre de

Doteur en mathémathiques appliquées

Présentée etsoutenue par

Jean-Bernard Salomond

Le30 Septembre 2014

Propriétés fréquentistes des méthodes

Bayésiennes semi-paramétriques et

non-paramétriques

Jury :

IsmaëlCastillo CNRS Examinateur

FabienneComte Université Paris Desartes Examinateur

Céile Durot Université Paris Ouest Nanterre LaDéfense Examinateur

Elisabeth Gassiat Université Paris-Sud Rapporteur

Dominique Piard Université Paris Diderot -Paris7 Examinateur

VinentRivoirard Université Paris Dauphine Examinateur

Judith Rousseau Université Paris-Dauphine Diretrie de thèse

(3)
(4)

La reherhe sur les méthodes bayésiennes non-paramétriques onnaît un essor

onsidérable depuis les vingt dernières années notamment depuis le développe-

ment d'algorithmes de simulation permettant leur mise en pratique. Il est don

néessaire de omprendre, d'un point de vue théorique, le omportement de es

méthodes. Cettethèse présentediérentes ontributionsàl'analysedes propriétés

fréquentistes des méthodes bayésiennes non-paramétriques. Si se plaer dans un

adreasymptotiquepeutparaîtrerestritifdeprimeabord,elapermetnéanmoins

d'appréhender lefontionnementdes proédures bayésiennes dansdes modèlesex-

trêmementomplexes. Celapermetnotammentdedéteterlesaspetsdel'apriori

partiulièrementinuentssur l'inferene. Denombreux résultatsgénérauxontété

obtenus dans e adre, ependant au fur et à mesure que les modèles deviennent

de plus en plus omplexes, de plus en plus réalistes, es derniers s'éartent des

hypothèses lassiques et ne sont plus ouverts par la théorie existante. Outre

l'intérêt intrinsèque de l'étude d'un modèle spéique ne satisfaisant pas les hy-

pothèses lassiques, ela permet aussi de mieux omprendre les méanismes qui

gouvernent le fontionnement des méthodes bayésiennes non-paramétriques.

Chapitre 1 L'introdutionprésente le paradigme bayésien et l'approhe bayési-

enne des problèmes non-paramétriques. Nous introduisons les propriétés

fréquentistes des méthodes bayésiennes et présentons leur importane dans

la ompréhension du omportement de la loi a posteriori. Nous présentons

ensuite les prinipaux modèles étudiés dans ette thèse, et les diultés

posées par eux-i pour l'étude de leurs propriétés asymptotiques.

Chapitre 2 Dans e hapitre, nous étudions la onsistane et la vitesse de on-

entration de la loia posterioridans lemodèle de densitédéroissante pour

diérentesmétriques. Cemodèleestpartiulièrementintéressantarlesden-

sitésdéroissantesontunereprésentationsousformedemélanged'uniformes

et sont don un as partiulier de mélange pour lequel le supportdu noyau

dépend du paramètre. Dans e adre, les hypothèses lassiques néessaires

pour laonsistanede laloiaposteriorine sontpas vériées. Notammentla

loiapriorine met passusammentde massesurlesvoisinagesde Kullbak-

(5)

Leiblerduvrai paramètre,etuneadaptationdes méthodes usuellesest don

néessaire. Pour deux familles d'a priori lassiques, nous prouvons que l'a

posterioriseonentre àlavitesse minimaxepour lespertes

L 1

etHellinger.

Nousétudions ensuitelaonsistane de la loiaposterioride ladensitépour

lespertespontuelle etnormesup. Cesdeux métriquessonten généraldi-

ilesàétudierarellesnepeuventêtrereliéesàladivergenenaturellequ'est

la divergene de Kullbak-Leibler. Pour es deux pertes, nous prouvons la

onsistanede l'aposteriorietdonnons une bornesupérieurepourlavitesse

de onentration.

Chapitre 3 Nous proposons un test bayésien non paramétrique de déroissane

d'une fontion dans le modèle de régression gaussien. Dans e adre, outre

le fait que les deux hypothèses sont non-paramétriques, l'hypothèse nulle

est inlue dans l'alternative. Il s'agit don d'un as de test partiulière-

ment diile. En outre dans e as, l'approhe usuelle par le fateur de

Bayes n'est pas onsistante. Nous proposons don une approhe alternative

reprenant les idées d'approximation d'une hypothèse pontuelle par un in-

tervalle. Nous prouvons que pour une large famille de lois a priori, le test

proposé est onsistant et sépare les hypothèses à la vitesse minimaxe. De

plus notre proédure est faile à implémenter et à mettre en ÷u vre. Nous

étudions ensuite son omportement sur des données simulées et omparons

lesrésultatsave lesméthodeslassiquesexistantes danslalittérature. Pour

haun des as onsidérés, nous obtenons des résultats au moins aussi bons

quelesméthodesexistantes, etlessurpassonspourunertainnombredeas.

Chapitre 4 (o-éritave BartekKnapik)Nousproposonsuneméthodegénérale

pourl'étudedesproblèmesinverseslinéairesmal-posésdansunadrebayésien.

S'il existe de nombreux résultats sur les méthodes de régularisation et la

vitesse de onvergene d'estimateurs lassiques, pour l'estimation de fon-

tions dans un problème inverse mal-posé, les vitesses de onentration d'a

posterioridansleadrebayésienn'aétéquetrèspeuétudiédanseadre. De

pluses quelquesrares résultatsexistantne onsidèrent quedes famillestrès

limitéesde loisa priori,en généralreposant sur la déompositionen valeurs

singulières de l'opérateur onsidéré. Dans e hapitre nous proposons des

onditions générales sur la loi a priori sous lesquelles l'a posteriori se on-

entre à une ertaine vitesse. Notre approhe nous permet de trouver les

vitesses de onentration de l'a posteriori pour de nombreux modèles et de

larges lasses de loi a priori. Cette approhe est de plus partiulièrement

intéressantearellepermetde mieuxomprendrelefontionnementde laloi

aposteriori etnotammentl'impat de l'opérateursur l'inférene.

(6)
(7)
(8)

Researh on Bayesian nonparametri methods has reeived a growing interest for

the past twenty years, espeiallysine the development of powerfulsimulation al-

gorithmswhih makestheimplementationof omplexBayesian methodspossible.

From that point it is neessary to understand from a theoretial point of view

the behaviour of Bayesian nonparametri methods. This thesis presents various

ontributionstothestudyoffrequentistpropertiesofBayesiannonparametripro-

edures. Although studying these methods from an asymptoti angle may seems

restritive,itallowstograsptheoperationofthe Bayesianmahineryinextremely

omplex models. Furthermore, this approah is partiularly useful to detet the

harateristis of the prior that are strongly inuential in the inferene. Many

general results have been proposed in the literature in this setting, however the

moreomplexand realistithemodelsthefurthertheygetfromtheusualassump-

tions. Thusmany models that are of greatinterestin pratie are not overed by

the general theory. If the study of a model that does not fall under the general

theory has aninterest onitsowns, italsoallowsfor abetterunderstanding of the

behaviour of Bayesian nonparametri methods ina general setting.

Chapter 1 The introdution presents the Bayesian paradigm and the Bayesian

approah to nonparametri problems. We introdue frequentist properties

of Bayesian proedures and present their importane in the understanding

of the behaviourof the posterior distribution. Wethen present the dierent

models studied in this manusript and the hallenge faed in studying of

their asymptotiproperties.

Chapter 2 In this hapter, we study onsisteny and onentration rates of the

posterior distributionunder several metris inthe monotonedensity model.

This model is partiularly interesting as monotone densities an be written

as a mixture of uniform kernels whih is a speial ase of kernels for whih

the supportdependsonthe parameter. In thisase theusual hypotheses re-

quired toderive posterioronentrationrate are not satised. Inpartiular,

the prior distribution we onsider do not put positive mass on Kullbak-

Leiblerneighbourhoodsofthetrue parameterandwethushavetoadaptthe

(9)

standardmethodstogetanupperboundontheposterioronentrationrate.

Fortwostandardpriordistributions,weprovethattheposterioronentrate

attheminimax ratefor the

L 1

and theHellinger losses. Wethen study on-

sistenyof the posteriorunderthe pointwiseand supremum loss. These two

metris are in general diult to study in the Bayesian framework as they

arenot relatedtothe Kullbak-Leiblerdivergene whihisthe naturalsemi-

metriin this setting. We however provethat the posterioris onsistent for

both losses and get an upperbound for the posterioronentration rate.

Chapter 3 We propose a Bayesian nonparametri proedure to test for mono-

toniity in the regression setting. In this ase, not only the null and the

alternativehypothesesare nonparametri, but one isembedded inthe other

whih makes the testing problem partiularly diult. In partiular the

Bayes-Fator, whih is a usual Bayesian answer to testing problems, is not

onsistent under the null hypothesis. We propose an alternative approah

thatreliesontheideasofapproximatingapointnullhypothesisbyshrinking

intervals. Theproposedproedureisonsistentforawidefamilyofpriordis-

tributions and separate the hypotheses at the minimax rate. Furthermore,

ourapproahiseasytoimplementand doesnotrequire heavy omputations

ontrariwisetothe existingproedures. Wethenstudyitsbehaviouronsim-

ulated data and for all the onsidered ases, our proedure doesat least as

goodas the lassial ones, and outperform themin some ases.

Chapter 4 (Joint work with Bartek Knapik) We propose a general approah to

study nonparametri ill-posedlinear inverse problems ina Bayesian setting.

Although there is a wide literature on regularisation methods and onver-

geneofestimators inthis setting,theposterioronentrationinaBayesian

setting has not reeived muh attention yet. Furthermore, the few existing

results only onsider very restrited families of prior distributions, mostly

related to the singular value deomposition of the operator at hand. In

this hapter we give general onditions on the prior suh that the posterior

onentratesataertainrate. Thisapproahallowsustoderiveasymptoti

resultsforvariousill-posedinverseproblemsandwidefamiliesofpriors. Fur-

thermore, this approah is partiularly interesting inthe sense that it gives

somevaluableinsightsonthebehaviourofthe posteriordistributioninthese

models and the impat onthe operator onthe inferene.

(10)

Résumé iii

Summary vii

1 Introdution 1

1.1 Bayesian nonparametri approahes . . . 2

1.1.1 Bayesian modeling . . . 2

1.1.2 Bayesian nonparametris . . . 3

1.2 Asymptoti properties of the posterior distribution . . . 5

1.2.1 Posterior onsisteny . . . 5

1.2.2 Posterior onentration rate . . . 7

1.2.3 Minimaxonentration rates and adaptation . . . 9

1.3 NonparametriBayesian testing . . . 10

1.4 Challengingasymptoti properties . . . 11

1.4.1 Inferene under monotoniity onstraints . . . 12

1.4.2 Ill posed linear inverse problems . . . 15

2 Monotone densities 27 2.1 Introdution . . . 28

2.2 Mainresults . . . 31

2.2.1

L 1

and Hellinger metri . . . . . . . . . . . . . . . . . . . . 32

2.2.2 Pointwise and supremum loss . . . 33

2.3 Proofs . . . 35

2.3.1 Proof of Theorems 2.1and 2.2 . . . 35

2.3.2 Proof of Theorems 2.3and 2.5 . . . 39

2.3.3 Proof of Theorem 2.4 . . . 42

2.3.4 Proof of Theorem 2.6 . . . 43

2.4 Tehnial Lemmas . . . 44

2.4.1 Proof of Lemma 2.1 . . . 44

2.4.2 Proof of Lemma 2.2 . . . 50

2.5 Additionnal proof . . . 51

(11)

2.6 Disussion . . . 53

3 Bayesian testing for monotoniity 57 3.1 Introdution . . . 58

3.1.1 Modelling with monotone onstraints . . . 58

3.1.2 The Bayes fator approah . . . 59

3.1.3 Analternativeapproah . . . 59

3.2 Construtionof the test . . . 61

3.2.1 The testingproedure . . . 61

3.2.2 Theoretialresults . . . 62

3.2.3 A hoie for the prior inthe non informativease . . . 64

3.3 Simulated Examples . . . 65

3.4 AppliationtoGlobal Warming data . . . 68

3.5 Proofof Theorem 3.1 . . . 69

3.6 Proofsof tehnial lemmas . . . 71

3.6.1 Proof of Lemma3.1 . . . 71

3.6.2 Proof of lemma3.2 . . . 74

3.6.3 Proof of Lemma3.3 . . . 77

3.7 Disussion . . . 77

4 Ill-posed inverse problems 81 4.1 Introdution . . . 82

4.2 GeneralTheorem . . . 83

4.3 Modulus of ontinuity . . . 85

4.4 Somemodels . . . 87

4.4.1 White noise . . . 87

4.4.2 Regression . . . 96

4.5 Disussion . . . 106

(12)

Introdution

PerhapsI shouldnot have been a sherman,he thought.

Butthatwasthe thingthatI wasborn for.

Ernest Hemingway, The old manand the sea.

Résumé

L'introdution présente leparadigme bayésien et l'approhe bayésienne des prob-

lèmesnon-paramétriques. Nousintroduisonslespropriétésfréquentistesdesméth-

odes bayésiennes et présentons leur importane dans la ompréhension du om-

portement de la loi a posteriori. Nous présentons ensuite les prinipaux modèles

étudiés dans ette thèse, etles diultés posées par eux-i pour l'étudede leurs

propriétés asymptotiques.

(13)

This introdution presents the main onepts ommon to the following hap-

ters,thestatistialmodelinganditsBayesianapproahthatweadoptinthisthesis.

Weproeedwithaquikintrodutiontononparametristatistisandtheonstru-

tion of prior distributions in aninnite dimensional spae, and we emphasize the

importaneof frequentists properties of Bayesian nonparametri proedures. We

then present the dierent statistialmodels studiedinthis manusript.

1.1 Bayesian nonparametri approahes

Themaingoalofstatistisistoinferonarandomphenomenongiven observations.

The ore onept of statistis is probabilisti modelling, that is a mathematial

approximation of the random phenomenon at hand. In a statistial model, an

observation

X

on an observation spae

X

is assumed to be generated from a

probability distribution

P

that belongs to a model

P

. Usually this distribution is haraterized by a parameter

θ

in a parameter set

Θ

whih gives the sampling

model

{X , P θ , θ ∈ Θ } .

The aim of statistis is then to infer, and make deisions on the model, based

on the observed data. To model omplex data generating phenomenon, the pa-

rameterspae

Θ

may be very large and possibly innitedimensional. As oftenin mathematialsienes, it is interesting to delineate regions of statistial method-

ology, and modern mathematial statistis tends to dierentiate Bayesian versus

frequentistmethods, parametri versus nonparametri models. In this setion,we

deneBayesian nonparametri models and underline their importane.

1.1.1 Bayesian modeling

Statisti models usually fall into either the frequentist paradigmor the Bayesian

one. The frequentistparadigmonsiders that the data are generated from axed

distribution

P θ 0

assoiated with the true parameter

θ 0

. Let

g

be a funtion from

Θ

to

Ξ

, suh that one is interested in making inferene on

g(θ 0 )

. Frequentist

statistiians look forstatistis, that isfuntions

S : X 7→ Ξ

that minimizesarisk

R(g(θ 0 ), S(X)).

The risk is most of the time assoiated with a metri or semi-metri

d

, or more

generalyany lossfuntion, and an be rewritten

R(g(θ), S(X)) =

E

θ 0 [d(g(θ 0 ), S(X))] ,

whereE

θ

is the expetation with respet to

P θ

.

(14)

IntheBayesianparadigm,onemodelstheignoraneontheparameter

θ

through

a probability distribution

Π

based on prior beliefs (the prior distribution). An extensive introdution to Bayesian statistis an be found in Robert (2007). A

Bayesian model isthus a samplingmodel

X ∼ P θ , θ ∈ Θ

together with a prior model

θ ∼ Π

whih an be ombinedthrough the Bayes' rules toget aprobability distribution

of the parameter given the data alled the posterior distribution dened as for all

measurable

A ⊂ Θ

Π(θ ∈ A | X) = R

A P θ (X)Π(dθ) R

Θ P θ (X)Π(dθ) .

(1.1)

It isthe singleobjet onwhihallinferene(e.g. estimation,testing, onstrution

of rediblesets, et.) isbased.

The Bayesian approah to statistis has beome inreasingly popular, espe-

ially sine the 1990's beause of the development of new sampling methodssuh

as Markov-Chains Monte-Carlo (MCMC) algorithms that makes sampling under

the posterior distribution feasible if not easy. Bayesian methods are now used in

a wide variety of domains, from biology to nane and data analysts are more

and more attrated by its axiomati viewof unertainty and its apaity to han-

dleomplexmodels,see Gelman etal.(2004)for instane. However, the fatthat

somemethodsarealledBayesian emphasizesthe fatthatthereisstilltwophilo-

sophial approahes to statistial modeling. When the parameter spae is nite

dimensional,Bayesian and frequentistmethodsusually agree whenthe amountof

information grows. In partiular, under weak assumptions on the prior distribu-

tion,thesoalledBernstein-von-MiseTheorem,aspresentedinLeCam and Yang

(2000), shows that Bayesian redible sets and frequentist ondene intervals are

asymptotiallyequivalent. Thisresultispartiularlyimportantasitindiatesthat

Bayesian models with dierent priors 1

will eventually agree when the amount of

information 2

grows, and willgive a similaransweras frequentists ones.

1.1.2 Bayesian nonparametris

Nonparametri models are often dened as probabilisti models with massively

many parameters (see Müller and Mitra, 2013) or with an innite dimensional

1

With aslightabuseofnotations,wemaysaypriorforpriordistributions whenthere isno

onfusion

2

(15)

parameter spae as inGhosh and Ramamoorthi(2003). These models oer more

exibilitythanparametrimethodsbuttheirmathematialomplexityisingeneral

more involved than for parametri methods.

A rst probleminBayesian nonparametris is todenea probabilitydistribu-

tion onaninnitedimensionalspae. Choosinga priordistribution isakey point

inoftheBayesianinferene,andgoingfrompriorknowledgetoapriordistribution

an behallenging. In partiular forinnitedimensional parameterspaes, assur-

ingthatthepriordistributionhasasuientlylargesupportisadiulttask,not

mentioning the diulty to ompute the posterior distribution for suh models.

A popular tool in the Bayesian nonparametri literature is the Dirihlet proess

introduedbyFerguson(1974). TheDirihletproessisaprobabilitymeasureson

the set of probability measure and an be dened as follows:

Denition 1.1 (Dirihlet proess, Ferguson, 1974). Let

α

be a non null nite

measure on

X

. Wesay that

P

followsaDirihletproess

DP (α)

,ifforall

k ∈ N

, allpartitionof measurable sets

(B 1 , . . . , B k )

of

X

,

(P (B 1 ), . . . , P (B k )) ∼ D (α(B 1 ), . . . , α(B k ))

where

D

isthe Dirihlet distribution.

The Dirihlet proesses have been proved to have a large weak support (see

Ferguson,1973),whihisalldistributionswhosesupportisinludedinthesupport

of the basemeasure

α

. Its hyperparametersare easilyinterpretableand itlead to tratable posteriors. Moreover Sethuraman (1994) showed that the Dirihlet pro-

essanbeobtainedinaonstrutiveway alledthe stikbreaking representation.

In addition, it opened the way to more exible prior distributions on the set of

density funtions. Sinethenmany priordistributionsoninnitedimensionalsets

have been proposed. Forinstane Antoniak (1974)introduedmixturesof Dirih-

letproessinthe ontext ofprobabilitydensitiesestimation. Given aolletionof

kernels

K µ ( · )

depending ona parameter

µ

we denethe mixture

θ( · ) =

Z

X

K µ ( · )dP (µ),

where

P

is a probability measure. Thus, hoosing a prior on

P

(e.g. a Dirihlet

proess prior) induesa prior on

θ

.

Many otherpriorshavebeenproposedinthe literature, generallasses ofmix-

turesforthe density model(Lo,1984),hierarhial Dirihletproesses(Teh etal.,

2006), Gaussian proesses (Lenk , 1991) among others. It is typially diult in

nonparametri settings to quantify the impat of a prior distribution onthe pos-

(16)

that when the amount of information inreases, inferene based on dierent pri-

ors will merge, it does not hold easily when the parameter spae is very large.

Diaonis and Freedman (1986)showed that in some ases, Bayesian nonparamet-

ri proedures an lead to inonsistent results (when the data are assume to be

sampled from a distribution

P θ 0

, the posterior distribution does not aumulate its mass around the true parameter). Some other examples showthat even if the

posterioronentratesitsmassaroundthetrueparameter,thepriorstillinuenes

the rate atwhih this onentration ours.

1.2 Asymptoti properties of the posterior distri-

bution

Looking at the asymptoti behaviour of the posterior distribution helps under-

standingtheimpatof theprioronthe posteriordistribution. It isalsoimportant

to detet whih parts of the prior inuene the most the posterior. In partiular,

some aspets of the prior may remain when the amount of informationgrows to

innityand may thusbe highlyinuentialforsmallsamplesizes for instane. We

nowdenetwomain asymptotipropertiesofthe posteriordistributionstudiedin

this manusript, namelyposterioronsisteny and posterioronentration rate.

1.2.1 Posterior onsisteny

Consisteny of the posteriordistribution an beonsidered asa leastrequirement

for Bayesian nonparametri proedures. Diaonis and Freedman (1986) proved

that in the ase of exhangeable data, onsisteny of the posteriordistribution is

equivalenttoweakmergingofposteriorsassoiatedwithdierentproperpriordis-

tributions. This ispartiularlyinteresting as,asargued before, itis oftendiult

togofrompriorknowledgeontheparametertoapriordistribution,andtwostatis-

tiians ouldomewith two dierent priors. Wegive amoredetaileddenition of

onsisteny oftheposteriordistribution,aspresented inGhosh and Ramamoorthi

(2003).

Let the observations

X n ∈ X n

be some random variables sampled from a

distribution

P θ n

for

θ ∈ Θ

. Here

n

is onsidered to be a quantiation of the amount ofinformation. Consider

Π

aprior probabilitydistribution on

Θ

. We an

thus ompute the posteriordistribution of

θ

denoted

Π( ·| X n )

(see (1.1)). Assume

that there exists an unknown parameter

θ 0 ∈ Θ

suh that the data are generated

fromthetrue distribution

P θ n 0

,anddenean

ǫ

-neighbourhoodof

θ 0

assoiatedwith

the loss funtion

d

B ǫ (θ 0 ) = { θ, d(θ, θ 0 ) ≤ ǫ } .

(17)

Denition 1.2. The posterior distribution is said to be onsistent at

θ 0

for the

loss

d

if for any

ǫ > 0

,the posteriorprobability of

B ǫ (θ 0 ) Π(B ǫ (θ 0 ) | X n ) → 1

eitherin

P θ n 0

probability or

P θ 0

-almost surely.

A rst result of Doob (1949) shows that when

d

is a metri and

(Θ, d)

is a

omplete separable spae, any posterior distribution is onsistent at

θ 0

,

Π

-almost

surely,undersomeergodiityonditions. Thisresultisinterestingbutveryweakas

itdoesnotprovidesanyinformationonthesetofparametersatwhihonsisteny

holds.

Ausualrequirementfortheposteriortobeonsistentisthatthepriorputspos-

itive mass onneighbourhoodsof

θ 0

. More preisely, if

P θ

isabsolutely ontinuous

with respet to

P θ 0

,dene the Kullbak-Leibler divergene as

KL(P θ , P θ 0 ) = Z

log dP θ

dP θ 0

dP θ,

one willrequire that

Π(KL(P θ , P θ 0 ) < ǫ) > 0

for all

ǫ

.

Aseondonditionisthatthemodelmakesitpossibletodierentiatesbetween

θ 0

and parameters outside

B ǫ (θ 0 )

. This an be formalized by the existene of a

sequene of tests of

H 0 : θ = θ 0 ,

versus

H 1 : θ ∈ B ǫ c (θ 0 ).

(1.2)

We then dene an exponentially onsistent sequene of tests

{ φ n ( X n ) }

as fol-

lows

Denition 1.3. The sequene of tests

{ φ n ( X n ) }

is exponentially onsistent for testing (1.2)if thereexists

c > 0

suhthat for all

n

E

θ 0 (φ n ( X n )) . e cn , sup

θ ∈ B ǫ c (θ 0 )

E

θ (1 − φ n ( X n )) . e cn .

Forindependentidentiallydistributed observations

X n = (X 1 , . . . , X n )

where

the parameter of interest is the ommon density

f

with respet to a measure

λ

,

we thushave

f = dP θ

dλ , θ n ( X n ) = Y n

i=1

θ(X i ),

hene, in this ase

θ = f

, Shwartz (1965) gives general onditions on the model

toahieveonsisteny. InthisasetheKullbak-Leiblerdivergenebetween

f

and

f 0

is

KL(f, f 0 ) = Z

X

f (x) log

f(x) f 0 (x)

dx.

(1.3)

(18)

Shwartz (1965) requires that the prior has positive mass on all

ǫ

-neighborhoods for the Kullbak-Leibler divergene for all

ǫ > 0

Π (f : KL(f, f 0 ) ≤ ǫ) > 0.

The truth

f 0

is then said to belong to the

KL

-support of the prior

Π

. This

onditionensures that thesupport ofthe priorislargeinthesenseofthe Kullba-

Leibler divergene.

Inthe density setting, Shwartz's Theorem then states:

Theorem 1.1 (Shwartz (1965)). Let

Π

be a prior on

Θ

, and

θ 0 ∈ Θ

suh that

• θ 0

isin the

KL

-support of

Π

there exists an exponentiallyonsistent sequene of tests for (1.2)

then

Π(B ǫ0 ) | X n ) → 1 P θ 0

almost surely.

SinethisresultofShwartz, othertypesofresultshavebeenobtained,inmany

dierentsettings,see forinstane Walkerand Hjort(2001),Walker(2003),Walker

(2004), Lijoiet al.(2007).

1.2.2 Posterior onentration rate

A more rened asymptoti property is the posterior onentration rate. Loosely

speaking, itis the rate at whih the

ǫ

-neighborhoods

B ǫ0 )

an shrink suh that

the posteriorprobabilityof

B ǫ (θ 0 )

remindsloseto

1

. Toget abetterunderstand- ing of the impat of the prior on the posterior, we need to study sharper results

than mere onsisteny. Some aspets of the prior may inuene signiantly the

posterioronentrationrate. They arethuslikelytobehighlyinuentialfornite

datasets and should thus be handled with are. We now give a preise denition

of the posterior onentration rate and present some general results proposed in

the literature.

Denition 1.4. Lettheobservations

X n

besampledfromadistribution

P θ n 0

with

θ 0 ∈ Θ

andlet

Π

beaprioron

Θ

. Aposterioronentrationrateat

θ 0

withrespet

to a semimetri

d

on

Θ

is a sequene

ǫ n

suh that for all positive sequenes

M n

going to innity

Π(θ, d(θ, θ 0 ) ≤ M n ǫ n | X n ) → 1,

in

P θ n 0

probabilityas

n

goes toinnity.

In their seminal papers Ghosal et al. (2000a) (see also Shen and Wasserman ,

(19)

rates in the density model (i.e. independent and identially distributed observa-

tions

X n

)ina similarwayShwartz (1965)did foronsisteny. This ideahas then been extended tomany other models, and other approahes have been proposed,

see for instane Ghosal and van der Vaart (2007). Their approah requires also

thatthe prior puts enoughmassonshrinking Kullbak-Leiblerneighbourhoodsof

the truth. However the neighbourhoods here are more restritive than the ones

onsidered for onsisteny. Dene the

k

-th entred Kullbak-Leibler moment, if

dP θ n

is absolutelyontinuous with respet to

dP θ n 0

,

V k (P θ n , P θ n 0 ) =

Z

X

log

dP θ n dP θ n 0

− KL(P θ n , P θ n 0 )

k

dP θ .

We then dene the followingKullbak-Leibler neighborhood

S n0 , ǫ, k) =

KL(P θ n , P θ n 0 ) ≤ nǫ 2 , V k (P θ n , P θ n 0 ) ≤ nǫ .

AsinShwartz'sTheorem,Ghosal and vander Vaart(2007)alsorequirestheexis-

teneofanexponentiallyonsistentsequeneoftests,butinsteadoftestingagainst

the omplement of the shrinking ball

B ǫ n0 )

, itis suient totest against sets

B n j (θ 0 ) = { θ ∈ Θ n , jǫ n ≤ d(θ, θ 0 ) ≤ 2jǫ n } ,

for any integer

j ≥ J 0

for some positive

J 0

,where

Θ n

is aninreasing sequene of

sets that takes most of the prior mass of

Π

. Their Theorem is thusas follows:

Theorem 1.2 (Theorem3of Ghosaland vander Vaart (2007)). Let

d

be asemi-

metri on

Θ

and onsider a sequene

ǫ n

suh that

ǫ n → 0

,

2 n → ∞

as

n → ∞

.

For

k > 1

,

K > 0

and

Θ n ⊂ Θ

, if there exists a sequene of tests

φ n

suh that for

J 0 > 0

, for every

j ≥ J 0

E

θ 0 φ n → 0, sup

B j n (θ 0 )

(1 − φ n ) ≤ e Knj 2 ǫ 2 n ,

(1.4)

and if

Π(B n j0 ))

Π(S n (θ 0 , ǫ n , k)) ≤ e Knj 2 ǫ 2 n /2 ,

(1.5)

then for every sequene

M n → ∞

we have

Π(θ ∈ Θ n , d(θ, θ 0 ) ≤ M n ǫ n | X n ) → 1

in

P θ n 0

-probability as

n

goes to innity.

A usualway ofinsuringthe existeneoftests inondition(1.4), forwellsuited

semimetri

d

is to ontrol the overing number of the sets

B n j (θ 0 )

. For instane

when the semimetri

d

is the Hellinger metri,the wellknown results by Le Cam

(1986) or LeCam and Yang (2000) insure the existene of suh sequene of tests

(20)

1.2.3 Minimax onentration rates and adaptation

Theonentrationrate'stheoryanberelatedtothe lassialoptimalonvergene

rate ofestimators. Ghosalet al.(2000a)showintheontextofdensityestimation,

that the posterior yields a point estimate that onverges atthe same rate as the

posterior onentration rate when the onsidered loss is bounded and onvex. It

thus makes sense to ompare frequentists and Bayesian approahes based on this

asymptotiproperty.

Tostudy the asymptoti behaviourof the posterior distribution,we only on-

sidersomesubspae

Θ 0

oftheparameterspaeonwhihthefuntionsarebehaving

well. One of the most ommon riterion for studying optimality of an estimator

is the minimaxrisk dened by the minimum overallestimatorof maximal riskof

this estimator. More preisely, if

d

is a semimetri on

Θ

, the minimax risk over

Θ 0 ⊂ Θ

is dened as (see Tsybakov,2009)

R n = inf

T n

sup

θ ∈ Θ 0

E

θ [d(T n , θ)] ,

wheretheinmumistaken overallestimators

T n

. Theminimaxratein

Θ 0

isthus

the sequene

ǫ n

suh that there exists a xed positive onstant

C

with

lim sup

n →∞ ǫ n 1 R n = C.

We say that a Bayesian proedure onentrates at the minimax rate if the

onentration rate of the posterior in the lass

Θ 0

is the minimax onvergene

rate. Many models (prior and sampling models) studied in the literature have

been proven to onentrate at the minimax rate in

Θ 0

. In partiular, in the

density model, nonparametri mixture models are known to onentrate at the

minimaxrate(up toa

log

fator)overlassesofHölderfuntionsforvarioustypes

ofkernels,seeGhosaland vander Vaart(2001),Ghosaland vander Vaart(2007),

Shen etal. (2013) for Gaussian kernels, Kruijer et al. (2010) for loation sale

mixturesorRousseau(2010)forbetakernels. Manyothertypesofpriorshavebeen

proven to leadto the minimax onentrationrate, van der Vaart and VanZanten

(2008) proved minimax onentration rates of the posterior for Gaussian proess

priors, Ghosal and vander Vaart (2007) and Knapik et al. (2011) show minimax

onentration rates for series expansions priors for regression and the white noise

model respetively, Arbel etal. (2013), ?, Belitser and Ghosal (2003) obtained

generi resultsfor varioussampling models.

Thesubspaes

Θ 0

arerestritedthroughregularityassumptionssuhasSobolev orHöldersmoothness,shaperestrition,orsparity. Theselassesoffuntionsare

in general indexed by a parameter, say

β

, that aounts for the level of regu-

(21)

on this parameter. However, it is often diult to x

β

a priori, it is thus nat-

ural to seek proedures that perform well over a wide variety of

β

values, say

β ∈ I

. Suh proedures are alled adaptive as they automatially adapt the on- entration rate over the whole olletion of spaes

Θ 0,β ∈ I

. Frequentist adaptive

estimators havebeen well studiedin the literature for the past three deades, see

forinstane Efroimovih(1986), Polyak and Tsybakov (1990),orTsybakov (2009)

for a review. From a Bayesian perspetive, adaptive proedures have beome

more and more popular, see Belitserand Ghosal (2003) for innite dimensional

Gaussian distributions, Sriiolo (2006) obtained adaptive rates in the density

model, van der Vaart and van Zanten (2009) onsidered Gaussian random elds

priors,De Jonge et al.(2010)onsidered loationsale mixtures. Other examples

ofadaptiveBayesianproeduresan befoundinRivoirardet al.(2012),Rousseau

(2010)or Arbelet al.(2013) for instane.

1.3 Nonparametri Bayesian testing

Anotheraspet ofBayesiannonparametri inferenethat hasbeeninvestigated in

this workis the soalled testingproblemor modelhoie. In this ase, one isnot

interested in reovering an unknown parameter

θ

but rather in taking a deision

on the parameter given the observations. This problem of testing in a Bayesian

framework is well known and an be dated bak to Laplae (1814). It an be

formalized as follows: let

Θ 0

and

Θ 1

be two distint subspaes of the parameter

spae

Θ

, assoiated with prior probability

π 0

and

π 1

, one wants to infer whether

θ ∈ Θ 0

versus

θ ∈ Θ 1

, whih an be seen asthe estimation of

I Θ

1 (θ)

asargued in

Robert (2007). Consider

Π

a priordistribution on

Θ = Θ 0 ∪ Θ 1

. In this settingit

is naturalto onsider the

0

-

1

loss with weights

γ 0 , γ 1

similar to the one proposed

by Neyman and Pearson (1938)whih isdened fora deision

ϕ L(θ, ϕ) =

( γ 0

if

ϕ = 0

and

I Θ

0 (θ) = 0 γ 1

otherwise

.

TheBayesiansolutiontothis problem(i.e. theminimizeroftheBayesianrisk)

isthen

ϕ( X n ) =

( 1

if

Π(Θ 1 | X n )/π 1 ≥ γ 0 γ 0 1 Π(Θ 0 | X n )/π 0

0

otherwise

.

(1.6)

Toavoid the impatof

Π(Θ 0 )

and

Π(θ 1 )

or

γ 0

and

γ 1

, one an equivalently dene

the Bayes-Fator

B 0,1 = Π(Θ 0 | X n ) Π(Θ 1 | X n ) × π 1

π 0

.

(22)

The testing proedure orresponds to rejeting

Θ 0

if

B 0,1

is small but the Bayes-

Fator provides more information than just a

0

-

1

answer. Standard thresholds

are given by Jereys' sale. A test proedure based on the Bayes-Fator

B 0,1

is

said tobe onsistent if

B 0,1

goesto innity in

P θ n 0

probability for all

θ 0 ∈ Θ 0

and

onverges to

0

in

P θ n 0

probabilityfor all

θ 0 ∈ Θ 1

. Bayes-Fatorsfor nonparametri goodness of t test have been studied in term of their asymptoti properties in

the literature, see Dass and Lee (2004); MVinish etal. (2009); Rousseau (2007);

Rousseau and Choi(2012)forinstane. Whenbothhypothesesarenonparametri

and one is embedded inthe other, the determinationof Bayesian proedures that

have good asymptotiproperties is diultin general.

Similarly to the estimation problem, asymptoti properties of a Bayesian an-

swer to a testing problem are of great interest from both a theoretial and a

methodologialpointofviewsineinferenebasedoninonsistentposteriorsould

be highly misleading. A similar requirement should also hold for testing proe-

dures. In this ontext, we will say that a proedure is onsistent if it gives the

right answer with probability that goes to

1

as the amount of information grows toinnity. More preisely,atestingproedure(1.6)issaidtobeonsistentforthe

metri orsemi-metri

d

, if for all

ρ > 0 sup

θ ∈ Θ 0

E θ (ϕ( X n )) = o(1), sup

d(θ,Θ 0 )>ρ

E θ (1 − ϕ( X n )) = o(1).

(1.7)

Similarlyto the frequentist literature,we onsider hereuniform onsisteny, how-

ever this denition of onsisteny slightly diers from the one usually onsidered

in the frequentist setting, as here we do not x a level for the type I error of the

test. It is also interesting to study the ounterpart of the onentration rate in

the testingproblemnamely the separation rate ofthe test. The separation rateis

dened as the smallest sequene

ρ = ρ n

suh that (1.7) is still valid. It indiates

howfastthe testan dierentiatebothhypotheses. Similarlytotheonentration

rate, it also indiates whih part of the prior inuenes the Bayesian proedure

even asymptotially. This is of great interest as it is well known that in testing

problems, the sensitivity to the prior isa major issue.

1.4 Challenging asymptoti properties of Bayesian

nonparametri proedures

Wehaveseenthat studyingthe asymptotibehaviourofthe posteriordistribution

is a major tool to understand the inuene of the prior in the nonparametri

setting. Wehavealsoseenthatthereexistssuientonditionsonthemodelunder

whih the proedure is known to be onsistent and to have optimal asymptoti

(23)

do not fall under this general theory. These models present a new hallenge for

the Bayesian nonparametri ommunity. In this setion we present two of these

problems namely the inferene under monotoniity onstraints and estimation in

linear ill-posedinverse problems.

1.4.1 Inferene under monotoniity onstraints

In many statistial problems, it is useful to impose some restritions on the pa-

rameter spae to be able to arry out the inferene. When modelling real world

situations, shape onstraints on the parameter of interest may appear naturally,

this is the ase for instane for drug response models or insurvivalanalysis. Fur-

thermore, theses hypotheses are often easy to interpret, understand and explain

ompared to smoothness restritions for instane. Among dierent shape on-

straints, monotoniityrestritions have been fairly popular inthe literature. In a

regression settingfor instane, amonotoniity ofa response isoften granted from

physial of theoretial onsiderations. Shape onstraints inferene, and mono-

toniity in partiular an be dated bak to Brunk (1955) and most of the early

works on the subjet an be found in Barlowet al. (1972). Sine then mono-

toniity onstraintshave been used in many appliedproblems: in pharmaeutial

ontext in Bornkamp and Ikstadt (2009), for survival analysis in Laslett (1982),

Neelon and Dunson (2004) studied monotone regression for trend analysis and

Dunson (2005) onsidered monotoniity onstraints on ount data. Many other

appliationsan befound inRobertson etal. (1988).

In this setion, we present the two shapeonstrained problems studied inthis

thesis, namely the estimation of a density under monotoniity onstraints and

testingfor monotoniity ina regression setting.

1.4.1.1 Monotone densities

Monotonedensitiesare ommoninpratie, espeiallyinsurvivalanalysis. Arst

study of monotone density an be imputed to Grenander (1956) who onsidered

themaximumlikelihoodestimatorofamonotone density. Sinethen many others

havebeeninterestedinestimatingaunknowndistributionundershaperestritions.

Laslett(1982)onsiderstheproblemofestimatingthedistributionofrakslength

onaminewall,Sun and Woodroofe(1996)presentsomeappliationinastronomy

andrenewalanalysisamongothers. Usingshapeonstraintsproedureswillensure

that the estimate follows this onstraints, whih ould be a requirement of the

analysis.

SineWilliamson(1956),itisknown thatadensityismonotonenon inreasing

if and only if it is a mixture of uniform kernels. More preisely, let

F

be the set

of monotone non inreasing densities on

[0, ∞ )

, then for all

f ∈ F

there exists a

(24)

probability distribution

P

suh that

f ( · ) =

Z

0

I [0,θ] ( · )

θ dP (θ).

(1.8)

This mixture representation is partiularly interesting as it allows for inferene

based on the likelihood. Grenander (1956) showed that the maximum likelihood

estimator oinides with the rst derivative of the least onave majorant of the

umulative distribution funtion. Its asymptoti properties were later studied in

Groeneboom(1985)underthe

L 1

lossandPrakasa Rao(1970)studiedtheasymp-

toti behaviour of the maximum likelihood estimator evaluated at a xed point

in the interior of the support. In Groeneboom (1989), it is shown that the mini-

max rate of onvergene for this problem is of the order of

n 1/3

. This shows in

a way how monotoniity onstraints at as regularity onstraints as in this ase,

one obtains the same onvergene rate as for Lipshitz densities. Another sur-

prising aspet of monotone non inreasing densities is that the evaluation of the

maximum likelihood estimator at the boundaries of the support leads to inon-

sistent estimators. This problem has been studied in Sun and Woodroofe (1996)

and very preise results on the behaviour of the maximum likelihood estimator

at

0

an be found in Balabdaoui etal. (2011). More reently, Durotet al.(2012)

obtain some asymptoti results for the maximum likelihood estimator under the

supremum loss. IntheBayesianframework,monotonedensitieshavebeenstudied

in Brunner and Lo (1989) and Lo (1984). From a Bayesian point of view, the

mixture representation (1.8) leads naturally to a mixture type prior. Choosing a

prior modelon

P

in representation (1.8) naturally indues a prior on

F

. This is

the approahonsidered inBrunner and Lo(1989). InChapter 2weonsider two

types of priors on

P

namely Dirihlet proess and nite mixtures with a random

number of omponents. An interesting feature of these models is that the prior

doesnotputpositivemassontheKullbak-Leiblerneighborhoodofthetruth,and

thus ondition (1.5) will not hold and the standard approah based on the work

of Ghosal and vander Vaart (2007) annot be applied diretly. We prove that a

similar result holds when one only onsiders Kullbak-Leibler neighbourhoods of

trunated versions of the densities

f n ( · ) = f( · ) I [0,x n ]

F (x n ) , f θ 0 ,n ( · ) = f θ 0 ( · ) I [0,x n ] F θ 0 (x n ) ,

where

x n

is aninreasing sequene and

F

is the umulative distribution funtion of

f

. From this result,we provethat forboth prior models, the posterior onen-

tratesattheminimaxrate

n 1/3

uptoa

log(n)

term. Wealsostudytheasymptoti

propertiesof the posteriordistribution ofthe densityat axed point

x

of itssup-

(25)

general well suited for losses that are related to the Kullbak-Leibler divergene

(see Arbelet al., 2013; Homann et al., 2013). In partiular, the usual approah

of LeCam (1986)for onstruting exponentially onsistentsequene of tests does

not hold in this ase. However, we prove in Chapter 2 that for the onsidered

priordistributionthe posteriordistributionof

f (x)

isonsistentforevery

x

inthe

support of

f

, inluding the boundaries. The fat that the posterior distribution is onsistent at the boundaries of the support when the maximum likelihood es-

timatoris not an beimputed to the penalization indued by the prior. Another

interestingfeature ofourBayesianapproahisthatthe posteriorisalsoonsistent

for the supremum loss over the whole support. Here again, the supremum loss

is not related to the Kullbak-Leibler divergene, whih makes the onstrution

ofexponentiallyonsistent sequene of tests diult.

1.4.1.2 Nonparametri test for monotoniity

Althoughthereisawideliteratureontheproblemofestimatinganunknownfun-

tion under shape onstraints, an important question is whether it is appropriate

to impose a spei shape onstraint. If it is, then the estimation proedures

ould in general be greatly improved by using a shape onstrained estimation

proedure. Conversely, imposing shape onstraints in an appropriate ase ould

lead to dramatially erroneous results. The problem of testing for monotoniity

has been widely studied inthe frequentist literature. Bowman et al.(1998)intro-

dueda test for monotoniityin the regression settingbase onthe idea of ritial

bandwidth introduedinSilverman(1981). Hall and Hekman (2000)showed that

this proedure is highlysensitive toat parts of the regression funtion,and pro-

posed another test proedure based on running gradient. Baraud et al. (2003),

Ghosalet al. (2000b) and Baraud et al. (2005)propose testing proedures in the

xeddesign regressionsettingand theGaussianwhite noisesetting. Durot(2003)

and Akakpo etal. (2014) onsider a test that exploits the onavity of a primi-

tive of a monotone funtion. A Bayesian approah to testing monotoniity in a

regression frameworkhas been proposed inSott etal. (2013).

In Chapter 3, we onsider the nonparametri regression model

Y i = f(x i ) + σǫ i , ǫ i

iid ∼ N (0, 1),

(1.9)

and we want totest

H 0 : f ∈ F ,

versus

H 1 : f 6∈ F ,

(1.10)

for

F

be the set of monotone non inreasing funtionson

[0, 1]

.

A rstdiultyintestingfor monotoniityina regressionsettingis thatboth

thenullandthealternativehypothesesarenonparametri. Asageneralrulewhen

(26)

aountthesensitivitytothepriordistribution. Thisistrueforparametrimodels

butisritialfornonparametrionesasinthatase,asstatedbefore,theprioran

stillinuenetheposteriorasymptotially. Aseondandprobablymoreimportant

diulty is the fat that when testing for monotoniity in a regression setting,

the null hypothesis is embedded in the alternative. This problem is ommon

in goodness of t tests where one is interested in testing

f = f 0

versus

f 6 = f 0

. This has been investigated in Dass and Lee (2004), Ghosal et al. (2008) or MVinish et al.(2009)amongothersinthe density setting,orRousseau and Choi

(2012)intheregression problem. Inthisase amaindiultyisthataparameter

in the null model an also be approximated by a parameter in the alternative

model. In fat it has been proved in Walkeretal. (2004) that the Bayes-Fator

willasymptotiallysupportthemodelwithpriorthatsatisestheKullbak-Leibler

property, some additionalonditions may be required when both priors do.

In the ase of testing for monotoniity, it seems that for a natural hoie of

prior,namelypieewiseonstantfuntionswithrandomnumberofbins,theBayes-

Fatorisnotonsistent. Wethusproposeanalternativetestthatisasymptotially

equivalenttotestingformonotoniityusingasimilarideaasapproximatingapoint

nullhypothesis by ashrinkinginterval(see Rousseau ,2007). Denoteby

F

the set

of monotone non inreasing funtions with support

[0, 1]

and let

d ˜

be a metri or

a semi-metri. Consider the test

H 0 a : ˜ d(f, F ) ≤ τ

versus

H 1 a : ˜ d(f, F ) ≥ τ

(1.11)

where

d(f, ˜ F ) = inf g ∈F d(f, g) ˜

and

τ

is agiven threshold. If

τ

dereases toward

0

,

both tests (1.10) and (1.11) are asymptotially equivalent. We propose a alibra-

tion of thethreshold

τ

,the Bayesian answer tothe test (1.11)assoiatedwith the

0

-

1

lossisonsistentfortheinitialproblemof testing(1.10)andgivesgoodresults

in pratie ompared to the frequentist proedures. Furthermore, for a spei

hoie of prior, the proposed Bayesian test is easy toimplement whih is a great

advantage ompared tothe existing methods.

We also study the separation rate of the test whih gives insights on the

eieny of the proedure. The adaptive minimax separation rates for testing

monotoniityhas been derived inBaraud etal.(2005)and Dümbgen et al.(2001)

overHölder alternatives. Under similarassumptions,we provethat our proedure

ahieves the minimax separation rate up to a

log(n)

fator.

1.4.2 Ill posed linear inverse problems

Another general lass of models that beame popular for statistial modelling

sinethe 1960'sisthesoalledinverseproblems. Theyappearnaturallywhenone

(27)

ase in many elds of appliations: medial imaging(omputerized tomography),

eonometry(instrumentalvariables),radioastronomy(interferometry),astronomy

(blurred images of Hubble telesope) or seismology among many others. In the

statistialsettingthis is modelledby onsideringthat the data arise froma prob-

ability distribution whose parameter has been transformed by a known operator

K

that atsontheparameterspae. Inmost ases,wean assumethatthe trans-

formation

K

does not indue additional noise in the observations. The sampling modelis thusmodied to

X n ∼ P n , θ ∈ Θ.

(1.12)

If the operator

K

an be inverted and if its inverse is ontinuous, then the gen-

eral theory applies and inferene on

θ

does not dier from the usual framework.

However, inmany ases,the inverseof theoperatorisnot ontinuous. In thisase

the problem is alled ill-posed with respet to Hadamard's denition, as in this

ase a small noise in the data will be greatly amplied in the inferene on

θ

. An

interesting lass of operators whih overs many appliations isthe lass of linear

operators onHilbert Spaes. It isusuallyassumed that the operator

K

isompat

and injetive and the Hilbertspaes are separable.

Statistialapproahtoinverse problems has grown popularsinethe standard

framework has been proposed in Tikhonov (1963). A usual toy example tostudy

suh methodsis the white noise model

X n = Kθ + σ W

√ n ,

(1.13)

where

W

is white noise and

σ > 0

a variane parameter. In this example we an

easily grasp the diulties at hand. In Chapter 4 we treat more general inverse

problems models of the form (1.12). In the following, we willreall some features

of statistialinferenein inverse problems and illustrateitwith model(1.13).

1.4.2.1 Singular value deomposition

Consider

K

to be aompat injetivelinear operatorbetween two Hilbert spaes

{ Θ, h· , ·i θ }

and

{ Ξ, h· , ·i ξ }

. Forreadingonveniene, weshalldropthe subsriptfor

theinner produtwhenthere isnoonfusion. A usualapproahtoinferon

θ

isto

onsider its deomposition in a basis of

Θ

. In the linear inverse problem setting,

a simple hoie for suh basis would be the one that diagonalize the operator

K

.

Morepreisely,denoteby

K

theadjointoperatorof

K

andsupposethattheauto-

adjointoperator

K K

isompat,thenthespetralTheoremstatesthat

K K

has

aomplete orthogonalsystem of eigenvetors

{ e i }

withorresponding eigenvalues

{ b i }

. We thus have for all

θ ∈ Θ K Kθ =

X ∞ i=1

b i h θ, e i i e i = X ∞

i=1

κ 2 i θ i e i ,

(1.14)

(28)

where

κ i = √

b i

and

θ i = h θ, e i i

. Inthisase wesaythat

K

admitsasingularvalue

deomposition (SVD) with singular values

{ κ i }

and singularbasis

{ e i }

. Inferring

on

θ

isthusequivalenttoinferontheinnitesequene

{ θ i }

. Fromtheobservations

X n

from model (1.12), one an get an estimator

η ˆ

of

η = Kθ

. Denoting

{ η ˆ i }

its

projetion ontothe SVD basis, a simple estimator

θ ˆ

of

θ

is given by

θ ˆ i = η ˆ i

κ i

.

When theproblemisill-posed,sine

{ κ i }

,goesto

0

,we see thatthe oeients

θ i

willbe over-estimated for large

i

.

To see this problem more learly, onsider the white noise example. By pro-

jeting (1.13)onto the basis

{ e i }

and sine

W

isawhite noise,we an rewritethe

modelas

x i = κ i θ i + σ

√ n ǫ i , ǫ i ∼ N (0, 1), i = 1, 2, . . .

with

x i = h x, e i i

. This sequene model has been a ornerstone in the study of

linear inverse problems, see for instane Donoho (1995); Cavalier and Tsybakov

(2002); Cavalier (2008). The ase where

K

is the identity operator (i.e.

κ i = 1

for all

i

) has been widely studied in the literature. From a Bayesian perspe- tive, this representation is highly interesting as in this ase, it is natural to on-

sider a prior onthe sequene

{ θ i }

. These types of priors have been onsidered in

Ghosal and van der Vaart (2007) when

K

is the identity or Knapik etal. (2011)

orAgapiou etal.(2013)inthe inverse problemsetting. Toinferon

θ

,we onsider

the transformed model

x i κ i 1 = θ i + κ i 1 σ

√ n ǫ i , i = 1, 2, . . . ,

whihreduestheproblemtoestimatingthemeanofaninniteGaussiansequene.

Sine the problem is ill-posed, the sequene

κ i 1 → ∞

, hene the variane of the

noise blows up.

It appears from these onsiderations that the diulty of an inverse problem

an bequantied by the rate atwhih the sequene

{ κ i 1 }

goesto innity.

Denition1.5 (Ill-posedness). Wedenethe degreeofill-posednessofaninverse

problemas follows:

We say that a problem is mildly ill-posed of degree

p

if the sequene of

singular values

{ κ i }

is suh that there exist onstants

0 < C d ≤ C u < ∞

suh that

C d i p ≤ κ i ≤ C u i p .

Références

Documents relatifs

Face à ce vide, dévoilement momentané ou durable, le sentiment d'inquiétante étrangeté peut envahir le sujet. La question de l'inquiétante étrangeté est fondamentale dans

Or nous avons identifié cinq façons récurrentes d’éluder le problème : on peut faire comme si l’efficience relevait de l’évidence, activer un cercle vicieux

Ces derniers, à l’occasion de tensions localisées ou dans des démarches au long cours, interviennent dans l’aménagement de leur cadre de vie comme dans les modes de

L’iconique se présente aussi comme un commentaire postérieur à l’œuvre, comme sa paraphrase ou son contresens parfois, sous forme d’illustrations, couvertures illustrées

On peut lancer assez de rayons afin d’obtenir une discr´etisation de la surface ´eclair´ee du mˆeme ordre que dans le cadre d’un calcul en m´ethode int´egrale.. Lors de calculs

Figure 5-5 : Comparaison des EISF déduits de l’analyse phénoménologique des spectres à 100µeV moyenné sur les trois températures (croix) à ceux attendus en

A titre d’illustration, nous allons exposer la r´ ` eponse de l’atome unique pour l’harmonique 35 g´ en´ er´ ee dans le n´ eon (calcul´ ee dans le cadre de l’approximation

Dans le cas o` u G est un groupe de Baire ab´ elien et A une alg` ebre de Banach, nous obtenons ` a l’aide du th´ eor` eme du graphe ferm´ e et du th´ eor` eme de Gelfand un r´