V. CONCLUSION
This paper showed that MCMC methods, which are widely used for Bayesian estimation, are also a suitable tool for supervised classification using hierarchical Bayesian learning. In the proposed HBL implementation, the training samples were used to build Markov chains whose target distributions were the parameter posterior distributions. The class-conditional probability densities were then obtained by Monte Carlo ergodic averaging of the Markov chain elements. An academic example showed that the performance of the proposed HBL implementation is very close to the theoretical performance obtained with closed-form expressions of the class-conditional distributions. Finally, the problem of classifying chirp signals by using HBL combined with MCMC methods was studied. The proposed classifier was shown to outperform conventional time-frequency classifiers, provided the number of training vectors is sufficient to learn the class-conditional distributions.
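As a rough illustration of this ergodic-averaging step, here is a minimal Python sketch: the class-conditional density at a test point is approximated by averaging the likelihood over the stored Markov chain elements, and the class with the largest density wins (equal priors assumed). The Gaussian likelihood, the toy chains and all names are illustrative assumptions, not the paper's actual signal model.

import numpy as np

def class_conditional_density(x, theta_chain, likelihood):
    # Ergodic average of the likelihood over the posterior samples:
    # p(x | class) ~= (1/T) * sum_t p(x | theta^(t))
    return np.mean([likelihood(x, theta) for theta in theta_chain])

def gaussian_likelihood(x, theta):
    # Illustrative scalar Gaussian model, theta = (mean, std)
    mean, std = theta
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

# Hypothetical MCMC outputs, one chain per class
chain_c0 = [(0.0, 1.0), (0.1, 1.1), (-0.05, 0.95)]
chain_c1 = [(2.0, 1.0), (1.9, 1.05), (2.1, 0.9)]
x_new = 1.2
p0 = class_conditional_density(x_new, chain_c0, gaussian_likelihood)
p1 = class_conditional_density(x_new, chain_c1, gaussian_likelihood)
print("decide class", 0 if p0 >= p1 else 1)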
This paper studies new strategies to classify linear and non-linear modulations. The first strategy is based on a practical suboptimal Bayes classifier using a "plug-in" rule initially proposed in [8]. It can be applied to recognize classical M-PSK schemes, as well as M-QAM (M-state QAM) or M-APSK (M-state APSK) modulations. The main idea is to estimate the unknown model parameters by Bayesian estimation combined with Markov chain Monte Carlo (MCMC) methods. The estimated parameters are then plugged into the posterior probabilities of the received modulated signal (conditionally on each class). The classical maximum a posteriori (MAP) classification rule is finally implemented with these estimated probabilities. Unfortunately, the complexity of this MCMC classifier may be prohibitive for some practical applications. To overcome this difficulty, we consider a new digital modulation classifier based on hidden Markov models (HMMs) to classify linear modulations transmitted through an unknown finite-memory channel and corrupted by additive white Gaussian noise (AWGN). This classifier is based on a state-trellis representation, allowing one to use a modified version of the Baum-Welch (BW) algorithm (proposed in [9] for speech recognition) to estimate the posterior probabilities of the possible modulations. These posterior probabilities are then plugged into the optimal Bayes decision rule. This BW classifier, initially introduced in [10], is interesting since it can be used to distinguish OQPSK modulation from other linear phase modulations. Indeed, since some transitions are not allowed in the case of OQPSK, a state-trellis representation distinct from that of QPSK can be defined. The BW classifier then exploits this state-trellis representation for modulation classification.
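The plug-in rule itself can be summarized in a few lines. The sketch below, with hypothetical names and a generic likelihood argument, replaces the unknown parameters of each candidate class by their posterior-mean estimates from the chains and then applies the MAP rule; it is a schematic of the idea, not the classifier of [8].

import numpy as np

def plug_in_map_classify(x, chains, priors, likelihood):
    # Plug-in rule: replace the unknown parameters of each candidate class
    # by a Bayesian point estimate (posterior mean of the MCMC samples),
    # then apply the MAP decision with the plugged-in class posteriors.
    scores = []
    for chain, prior in zip(chains, priors):
        theta_hat = np.mean(chain, axis=0)   # MMSE estimate from the chain
        scores.append(likelihood(x, theta_hat) * prior)
    return int(np.argmax(scores))

The design trade-off is that the plug-in rule condenses each posterior into a single point estimate, which is cheaper than averaging the predictive density over the whole chain but suboptimal when the posterior is diffuse.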
Fig. 1. Acquisition system.
Fig. 2. Synthetic trace.
[15] enable us to circumvent this difficulty by solving the integration and optimization problems by simulating random variables. This approach leads to stochastic versions of the EM algorithm [stochastic expectation maximization (SEM), SAEM] [12]. Moreover, we investigate the problem in a fully Bayesian framework in which prior information is introduced on the parameters [10]. The estimation of the Bernoulli-Gaussian model and of the wavelet is then solved by simulating random variables via MCMC algorithms such as the Gibbs sampler [10]. Once the model has been estimated, the next step is the deconvolution itself. This problem is said to be "ill-posed" because, in the presence of noise, different reflectivity sequences can lead to similar seismic data. Therefore, it is necessary to use as much prior information as possible about the reflectivity to limit the set of acceptable solutions. Thus, it is natural to account for the Bernoulli-Gaussian assumption introduced above. We then need an adequate procedure to achieve the detection of reflectors and the estimation of their amplitudes. In practice, the problem can be solved using either the maximum a posteriori (MAP) criterion, which can be optimized using the simulated annealing technique, or the suboptimal maximum posterior mode (MPM) method, which involves optimization by means of an MCMC technique.
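To make the MCMC route concrete, the following is a simplified single-site Gibbs sweep for a Bernoulli-Gaussian reflectivity observed through a known wavelet, under the standard model r_k = q_k a_k with q_k ~ Bernoulli(lam) and a_k ~ N(0, sig_a2): the amplitude is marginalized when sampling the indicator. All names, the dense convolution matrix and the fixed hyperparameters are illustrative assumptions; the works cited above use more elaborate versions (e.g. with hyperparameter sampling).

import numpy as np

def gibbs_bg_deconvolution(y, h, lam, sig_a2, sig_n2, n_iter=200, rng=None):
    # Single-site Gibbs sampler for a Bernoulli-Gaussian reflectivity r
    # observed through y = conv(h, r) + white Gaussian noise.
    rng = np.random.default_rng() if rng is None else rng
    n = len(y) - len(h) + 1                 # reflectivity length (full convolution)
    H = np.array([np.convolve(np.eye(n)[k], h) for k in range(n)]).T
    r = np.zeros(n)
    for _ in range(n_iter):
        for k in range(n):
            e = y - H @ r + H[:, k] * r[k]  # residual excluding site k
            s2 = 1.0 / (H[:, k] @ H[:, k] / sig_n2 + 1.0 / sig_a2)
            mu = s2 * (H[:, k] @ e) / sig_n2
            # Posterior odds of q_k = 1 vs q_k = 0, amplitude marginalized out
            odds = (lam / (1 - lam)) * np.sqrt(s2 / sig_a2) * np.exp(0.5 * mu**2 / s2)
            if rng.random() < odds / (1 + odds):
                r[k] = mu + np.sqrt(s2) * rng.standard_normal()
            else:
                r[k] = 0.0
    return r   # an MPM estimate would average indicators over post-burn-in sweeps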
This example shows that the MCMC method gives a better deconvolution than the ML method. At high SNR, the localization of impulses by MCMC is more accurate, with fewer misses and false detections than ML. As the SNR decreases, the number of missed impulses increases, because low-amplitude impulses are no longer detected; high-amplitude impulses continue to be well localized. Both methods nevertheless show good robustness to the SNR decrease.
Abstract
The resolution of many large-scale inverse problems using MCMC methods requires a step of drawing samples from a high-dimensional Gaussian distribution. While direct Gaussian sampling techniques, such as those based on Cholesky factorization, induce an excessive numerical complexity and memory requirement, sequential coordinate sampling methods present a low rate of convergence. Based on the reversible jump Markov chain framework, this paper proposes an efficient Gaussian sampling algorithm with reduced computation cost and memory usage. The main feature of the algorithm is to perform an approximate resolution of a linear system, with a truncation level adjusted by a self-tuning adaptive scheme so as to achieve the minimal computation cost. The connection between this algorithm and some existing strategies is discussed, and its efficiency is illustrated on a linear inverse problem of image resolution enhancement.
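One common way to realize such an approximate sampler, consistent with the description above though not necessarily identical to the proposed algorithm, is perturbation-optimization with a truncated conjugate-gradient solve; the truncation level maxiter plays the role of the adjustable approximation. The quadratic data/prior structure (matrices A and D) is an assumption for this sketch.

import numpy as np
from scipy.sparse.linalg import cg

def po_gaussian_sample(A, D, y, sig2, tau2, maxiter, rng=None):
    # Draw an (approximate) sample from N(Q^{-1} b, Q^{-1}) with
    # Q = A.T A / sig2 + D.T D / tau2 and b = A.T y / sig2, by solving
    # the perturbed normal equations Q x = eta, eta ~ N(b, Q), with
    # a conjugate-gradient solver truncated at 'maxiter' iterations.
    rng = np.random.default_rng() if rng is None else rng
    eps1 = rng.standard_normal(A.shape[0])
    eps2 = rng.standard_normal(D.shape[0])
    eta = A.T @ (y + np.sqrt(sig2) * eps1) / sig2 \
        + D.T @ (np.sqrt(tau2) * eps2) / tau2
    Q = A.T @ A / sig2 + D.T @ D / tau2
    x, _ = cg(Q, eta, maxiter=maxiter)      # truncated solve => approximate sample
    return x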
Markov chain Monte Carlo (MCMC) techniques are among the most popular approaches used in marginal likelihood estimation [1-3]. However, despite their well-known advantages, these methods have lost their charm in various machine learning applications, especially during the last decade, as they are perceived to be computationally very demanding. Indeed, the conventional approaches require a pass over the whole data set at each iteration, which makes the methods impractical even for moderate N. Recently, alternative approaches, under the name of stochastic gradient MCMC (SG-MCMC), have been proposed, aiming to develop computationally efficient MCMC methods that can scale up to the large-scale regime [4-11]. Unlike conventional MCMC methods, these methods need to 'see' only a small subset of the data per iteration, which enables them to handle large datasets.
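A canonical SG-MCMC update is stochastic gradient Langevin dynamics (SGLD), sketched below for a scalar parameter: the full-data gradient of the log-posterior is replaced by an unbiased minibatch estimate rescaled by N / batch_size, and Gaussian noise of matching scale is injected. All function names are illustrative.

import numpy as np

def sgld_step(theta, data, grad_log_prior, grad_log_lik, eps, batch_size, rng):
    # One SGLD update: minibatch gradient estimate plus Langevin noise.
    N = len(data)
    idx = rng.choice(N, size=batch_size, replace=False)
    grad = grad_log_prior(theta)
    grad += (N / batch_size) * sum(grad_log_lik(theta, data[i]) for i in idx)
    return theta + 0.5 * eps * grad + np.sqrt(eps) * rng.standard_normal()

# Toy usage: posterior of a Gaussian mean with unit-variance observations
rng = np.random.default_rng(0)
data = rng.normal(1.5, 1.0, size=10_000)
theta = 0.0
for _ in range(1_000):
    theta = sgld_step(theta, data,
                      grad_log_prior=lambda th: -th,        # N(0,1) prior
                      grad_log_lik=lambda th, x: x - th,    # N(th,1) likelihood
                      eps=1e-4, batch_size=100, rng=rng)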
3.3. Estimation Algorithm
The robustness of the estimation algorithm is a crucial point. Indeed, we need to process thousands of galaxies, so the algorithm must tolerate large errors in the initial parameters in order to be usable in an unsupervised mode. Several authors have pointed out the difficulty of providing a fully automatic algorithm for estimating the parameters of the much simpler two-component (bulge and disc) model in mono-band images [16, 29, 30]. Obviously, the estimation of a more complex model generates more difficulties. To overcome this problem we propose to use MCMC methods. MCMC algorithms allow the parameter space to be sampled according to the target distribution, and theoretical results prove the convergence of the distribution of the samples to the target distribution as the number of iterations tends to infinity.
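For reference, the generic random-walk Metropolis-Hastings recursion underlying such samplers is sketched below; it converges in distribution to the target from any reasonable initialization, which is the robustness property relied on here. Names are illustrative, and this is not the galaxy-fitting sampler itself.

import numpy as np

def random_walk_mh(log_target, theta0, step, n_iter, rng=None):
    # Generic random-walk Metropolis-Hastings sampler.
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    lt = log_target(theta)
    chain = [theta.copy()]
    for _ in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.shape)
        lp = log_target(prop)
        if np.log(rng.random()) < lp - lt:   # symmetric proposal => ratio of targets
            theta, lt = prop, lp
        chain.append(theta.copy())
    return np.array(chain)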
The outer Monte Carlo stage samples distributions restricted to {Y ∈ A}. A naive acceptance-rejection on Y fails to be efficient because most simulations of Y are wasted. Therefore, specific rare-event techniques have to be used. Importance sampling is one of these methods (see e.g. [RK08, BL12]); it can be efficient in small dimension (10 to 100) but fails to deal with larger dimensions. In addition, this approach relies heavily on particular types of models for Y and on suitable information about the problem at hand. Another option consists in using Markov chain Monte Carlo (MCMC) methods. Such methods amount to constructing a Markov chain (X^{(m)})_{m≥0} such that the chain possesses the restricted distribution as its invariant distribution.
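A minimal sketch of such a restricted sampler: an ordinary random-walk Metropolis-Hastings step whose acceptance ratio carries the indicator of A, so that proposals outside A are rejected and the chain never leaves the rare-event set. The names and the Gaussian proposal are assumptions for illustration.

import numpy as np

def mh_conditional_on_A(log_density, in_A, x0, step, n_iter, rng=None):
    # Metropolis-Hastings targeting the law of Y restricted to {Y in A}.
    # The indicator of A enters the acceptance ratio, so once the chain
    # is inside A no simulation is 'wasted' on leaving it.
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)          # x0 must already satisfy in_A(x0)
    chain = [x.copy()]
    for _ in range(n_iter):
        prop = x + step * rng.standard_normal(x.shape)
        if in_A(prop) and np.log(rng.random()) < log_density(prop) - log_density(x):
            x = prop
        chain.append(x.copy())
    return np.array(chain)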
Markov chain Monte Carlo (MCMC) sampling has proven to be a powerful and versatile method for Bayesian inference in a variety of models, e.g. large-treewidth graphical models in machine learning or intricate models in experimental physics. MCMC methods simulate a Markov chain (θ^{(t)}) that approximates independent draws from a previously defined target distribution.
Summary. In this paper we develop an original and general framework for automatically optimizing the statistical properties of Markov chain Monte Carlo (MCMC) samples, which are typically used to evaluate complex integrals. The Metropolis-Hastings algorithm is the basic building block of classical MCMC methods and requires the choice of a proposal distribution, which usually belongs to a parametric family. The correlation properties together with the exploratory ability of the Markov chain heavily depend on the choice of the proposal distribution. By monitoring the simulated path, our approach allows us to learn "on the fly" the optimal parameters of the proposal distribution for several statistical criteria.
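A minimal instance of such "on the fly" tuning is a Robbins-Monro adaptation of the proposal scale towards a target acceptance rate, sketched below with illustrative names; the framework summarized above optimizes richer criteria than the acceptance rate alone.

import numpy as np

def adaptive_rw_mh(log_target, theta0, n_iter, target_acc=0.234, rng=None):
    # Random-walk MH with diminishing adaptation of the log proposal scale,
    # steering the acceptance rate towards target_acc (0.234 is the
    # classical high-dimensional optimum for random-walk proposals).
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    log_step = 0.0
    lt = log_target(theta)
    chain = []
    for t in range(1, n_iter + 1):
        prop = theta + np.exp(log_step) * rng.standard_normal(theta.shape)
        lp = log_target(prop)
        acc = np.exp(min(0.0, lp - lt))
        if rng.random() < acc:
            theta, lt = prop, lp
        log_step += t ** -0.6 * (acc - target_acc)   # vanishing step => ergodicity
        chain.append(theta.copy())
    return np.array(chain)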
In the human genome, susceptibility to common diseases is likely to be determined by interactions between multiple genetic variants. We propose an innovative Bayesian method to tackle the challenging problem of multi-locus pattern selection in the case of quantitative phenotypes. For the first time in this domain, a whole Bayesian theoretical framework has been defined to incorporate additional transcriptomic knowledge. We thus fully integrate the relationships between phenotypes, transcripts (messenger RNAs) and genotypes. Within this framework, the relationship between the genetic variants and the quantitative phenotype is modeled through a multivariate linear model. The posterior distribution on the parameter space cannot be computed in closed form. Therefore, we design an algorithm based on Markov chain Monte Carlo (MCMC) methods. In our case, the number of putative transcripts involved in the disease is unknown; moreover, this dimension parameter is not fixed. To cope with trans-dimensional moves, our sampler is designed as a reversible jump MCMC (RJMCMC). In this document, we establish the whole theoretical background necessary to design this specific RJMCMC.
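For reference, a trans-dimensional move from (θ, u) to (θ', u') in an RJMCMC sampler is accepted with the standard reversible jump probability of Green (1995),

\alpha = \min\left\{ 1,\ \frac{\pi(\theta')\, j'(\theta')\, q'(u')}{\pi(\theta)\, j(\theta)\, q(u)}\ \left| \frac{\partial(\theta', u')}{\partial(\theta, u)} \right| \right\},

where π denotes the target posterior, j and j' the probabilities of selecting the forward and reverse moves, q and q' the densities of the dimension-matching variables u and u', and the last factor is the Jacobian of the deterministic mapping between the two parameterizations. The specific moves (e.g. birth and death of candidate transcripts) are design choices of the sampler.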
Abstract
In the big data context, traditional MCMC methods, such as Metropolis-Hastings algorithms and hybrid Monte Carlo, scale poorly because of their need to evaluate the likelihood over the whole data set at each iteration. To rescue MCMC methods, numerous approaches, falling into two categories (divide-and-conquer and subsampling), have been proposed. In this article, we study parallel MCMC techniques and propose a new combination method in the divide-and-conquer framework. Compared with parallel MCMC methods such as consensus Monte Carlo and the Weierstrass sampler, our method runs MCMC on rescaled subposteriors instead of sampling from the subposteriors directly, while sharing the same computation cost in the parallel stage. We also give a mathematical justification of our method and show its performance on several models. Moreover, although our new method is proposed in a parametric framework, it can be applied to non-parametric cases without difficulty.
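For context, the consensus-style combination the article compares against can be written in a few lines: each machine's draws are averaged with weights given by inverse subposterior covariances. This sketch assumes roughly Gaussian subposteriors and multivariate draws; it illustrates the comparator, not the proposed rescaled-subposterior method.

import numpy as np

def consensus_combine(sub_draws):
    # Weighted-average combination of subposterior draws (consensus Monte
    # Carlo style); exact when every subposterior is Gaussian.
    # sub_draws: list of (n_samples, dim) arrays, one per machine.
    weights = [np.linalg.inv(np.atleast_2d(np.cov(d, rowvar=False)))
               for d in sub_draws]
    total = np.linalg.inv(sum(weights))
    n = min(len(d) for d in sub_draws)       # align chain lengths
    combined = np.zeros((n, sub_draws[0].shape[1]))
    for d, w in zip(sub_draws, weights):
        combined += d[:n] @ w                # w is symmetric
    return combined @ total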
VII. CONCLUSION
This paper presented a hybrid Gibbs sampler for estimating the Potts parameter β jointly with the unknown parameters of a Bayesian segmentation model. In most image processing applications this important parameter is set heuristically by cross-validation. Standard MCMC methods cannot be applied to this problem because performing inference on β requires computing the intractable normalizing constant of the Potts model. In this work the estimation of β has been included within an MCMC method using an ABC likelihood-free Metropolis-Hastings algorithm, in which intractable terms have been replaced by simulation-rejection schemes. The ABC distance function has been defined using the Potts potential, which is the natural sufficient statistic of the Potts model. The proposed method can be applied to large images in both 2D and 3D scenarios. Experimental results obtained for synthetic data showed that estimating β jointly with the other unknown parameters leads to estimation results that are as good as those obtained with the actual value of β. On the other hand, choosing an incorrect value of β can degrade the estimation performance significantly. Finally, the proposed algorithm was successfully applied to real two-dimensional SAR and three-dimensional ultrasound images.
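A drastically simplified sketch of an ABC Metropolis-Hastings step for β is given below: a K-state Potts field on a toroidal grid, a flat prior and symmetric proposal assumed, and a pseudo-field regenerated by a few Gibbs sweeps at each proposal, with the Potts potential as ABC statistic. All names, the tolerance and the number of sweeps are illustrative choices, not the paper's settings.

import numpy as np

def potts_gibbs_sweep(z, beta, K, rng):
    # One Gibbs sweep of a K-state Potts field on a toroidal 2D grid.
    n, m = z.shape
    for i in range(n):
        for j in range(m):
            neigh = [z[(i - 1) % n, j], z[(i + 1) % n, j],
                     z[i, (j - 1) % m], z[i, (j + 1) % m]]
            logp = np.array([beta * sum(v == k for v in neigh) for k in range(K)])
            p = np.exp(logp - logp.max())
            z[i, j] = rng.choice(K, p=p / p.sum())

def potts_potential(z):
    # Sufficient statistic of the Potts model: number of equal neighbour pairs.
    return np.sum(z == np.roll(z, 1, axis=0)) + np.sum(z == np.roll(z, 1, axis=1))

def abc_mh_beta_step(beta, z_labels, K, step, tol, n_sweeps, rng):
    # ABC-MH step for beta: the intractable ratio of normalizing constants
    # is bypassed by simulating a pseudo-field at the proposed beta and
    # accepting only if its potential is within tol of the current labels'.
    beta_prop = beta + step * rng.standard_normal()
    if beta_prop <= 0:
        return beta
    z_sim = rng.integers(0, K, size=z_labels.shape)
    for _ in range(n_sweeps):
        potts_gibbs_sweep(z_sim, beta_prop, K, rng)
    if abs(potts_potential(z_sim) - potts_potential(z_labels)) <= tol:
        return beta_prop      # flat prior and symmetric proposal assumed
    return beta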
ABSTRACT
Adding inequality constraints (e.g. boundedness, monotonicity, convexity) into Gaussian processes (GPs) can lead to more realistic stochastic emulators. Due to the truncated Gaussianity of the posterior, its distribution has to be approximated. In this work, we consider Monte Carlo (MC) and Markov Chain Monte Carlo (MCMC) methods. However, strictly interpolating the observations may entail expensive computations due to highly restrictive sample spaces. Furthermore, having (constrained) GP emulators when data are actually noisy is also of interest for real-world implementations. Hence, we introduce a noise term for the relaxation of the interpolation conditions, and we develop the corresponding approximation of GP emulators under linear inequality constraints. We show with various toy examples that the performance of MC and MCMC samplers improves when considering noisy observations. Finally, on 2D and 5D coastal flooding applications, we show that more flexible and realistic GP implementations can be obtained by considering noise effects and by enforcing the (linear) inequality constraints.
or simulated from, respectively (Andrieu and Roberts, 2009; Marin et al., 2011).
Although current approximation methods can provide significant empirical performance improvements, they tend either to over- or under-utilize the surrogate, sacrificing exact sampling or potential speedup, respectively. In the first case, many methods produce some fixed approximation, inducing an approximate posterior. In principle, one might require only that the bias of a posterior expectation computed using samples from this approximate posterior be small relative to the variance introduced by the finite length of the MCMC chain, but current methods lack a rigorous approach to controlling this bias (Bliznyuk et al., 2008; Fielding et al., 2011); Cotter et al. (2010) show that bounding the bias is in principle possible, by proving that the rate of convergence of the forward model approximation can be transferred to the approximate posterior, but their bounds include unknown constants and hence do not suggest practical strategies for error control. Conversely, other methods limit potential performance improvement by failing to "trust" the surrogate even when it is accurate. Delayed-acceptance schemes, for example, eliminate the need for error analysis of the surrogate but require at least one full model evaluation for each accepted sample (Rasmussen, 2003;
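For concreteness, a delayed-acceptance step can be sketched as follows: the proposal is first screened with the surrogate posterior, and only survivors trigger a full model evaluation, with a second-stage correction (Christen-Fox style) that keeps the true posterior exactly invariant. Names are illustrative; in practice log_full(x) would be cached between iterations.

import numpy as np

def delayed_acceptance_step(x, log_full, log_surr, step, rng):
    # Two-stage MH with a symmetric random-walk proposal.
    prop = x + step * rng.standard_normal(np.shape(x))
    ls_x, ls_p = log_surr(x), log_surr(prop)
    a1_fwd = min(1.0, np.exp(ls_p - ls_x))
    if rng.random() >= a1_fwd:
        return x                       # screened out cheaply, no full model solve
    # Stage 2: correct with the full model; a1_bwd / a1_fwd restores detailed
    # balance with respect to the true posterior.
    a1_bwd = min(1.0, np.exp(ls_x - ls_p))
    a2 = min(1.0, np.exp(log_full(prop) - log_full(x)) * a1_bwd / a1_fwd)
    return prop if rng.random() < a2 else x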
We conclude that the Gibbs sampler is well adapted to pile-up-affected data, and since it attains the Cramér-Rao bound it may lead to a significant reduction of the acquisition time. We therefore compare the MCMC method to the following estimation practice. Data from the pile-up model are obtained at a low laser intensity (λ = 0.05) such that the probability of 2 or more photons per laser pulse is negligible. The observed arrival times are then treated as independent observations from the exponential mixture distribution given by (31), and the classical EM algorithm is applied. Repeated simulations provide estimates of the bias and the variance of the estimators for a two-component model and various numbers of observations. For the same two-component model we simulated data using the laser intensity λ_opt =
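For reference, the classical EM recursion for a two-component exponential mixture (the low-intensity baseline described above) is only a few lines; the initialization and names below are illustrative, and the mixture (31) is assumed to have density p(t) = w*l1*exp(-l1 t) + (1-w)*l2*exp(-l2 t).

import numpy as np

def em_two_exponentials(t, n_iter=100):
    # Classical EM for a two-component exponential mixture.
    t = np.asarray(t, dtype=float)
    w = 0.5
    l1, l2 = 1.0 / np.percentile(t, 25), 1.0 / np.percentile(t, 75)
    for _ in range(n_iter):
        f1 = w * l1 * np.exp(-l1 * t)
        f2 = (1 - w) * l2 * np.exp(-l2 * t)
        r = f1 / (f1 + f2)                 # E-step: responsibilities
        w = r.mean()                       # M-step: closed-form updates
        l1 = r.sum() / (r @ t)
        l2 = (1 - r).sum() / ((1 - r) @ t)
    return w, l1, l2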
account of the non-linear properties of hull materials and/or the non-linear characteristics of specific loading conditions (as in the case of shock, blast, or other military threats).
Dumez et al. (2008) have developed an ultra-fast 3D ship modelling and grid generation tool based on four cornerstones: parametric modeller, generativity, granularity, and propagation. These four elements enable the creation of 3D CAD models of complete ships in a few days. The model obtained is topologically connected, allowing automatic updates of the definition by changing some parameters, and making it possible to readily extract a structural mesh of the whole ship or its associated compartment plans.

For the same purpose, Forrest (2008) introduced a novel hullform generation technique for the Paramarine ship and submarine design system. He discussed the requirements that shaped the development of the technique in terms of the user interface, the underlying mathematical methods, the need to function in a parametric environment, and the importance of compatibility with the design system's extant solid modeller. Such requirements were assembled over many years using literature searches, application prototypes and user consultations. General features of the design solution are described. The user interface is a key component of the system and enables a patchwise hull to be developed rapidly and intuitively. Surface objects are built up from curves and define a hullform in terms of a series of patches. The curves are associative and use high-level parametric definitions in order to achieve the user's requirements.

In global FE ship analysis there are two laborious steps: building the global finite element model and assessing the structure based on the finite element results. In general the assessment cannot be performed using the global finite element model and results alone; additional information about structural details or loads is also needed when derived physical quantities like buckling usage factors are to be computed. Germanischer Lloyd (Wilken et al. 2008) proposes a technical solution, observing the different modelling requirements of finite element computation (where idealized structural information is necessary) and of derived-results assessment (where detailed structural models have to be used), together with a way to use 3D CAD data to derive this information.
• numerical methods ≡ all parameters of order 1, all sizes and dimensions of the same order (e.g. beam theory, lubrication theory)
• perturbation methods ≡ small parameter
• what is small? (dimensional analysis)
• very small means very large
Building Detection by Markov Object Processes and an MCMC Algorithm. Laurent Garcin, Xavier Descombes, Josiane Zerubia, Hervé Le Men. Research report.
Yang (2007) tackles some open questions mentioned in Roberts and Rosenthal (2007) by providing sufficient conditions, close to the conditions we give in Theorems 2.1 and 2.5, to ensure convergence of the marginals and a weak law of large numbers for bounded functions. The conditions in (Yang, 2007, Theorems 3.1 and 3.2) are stronger than our conditions, but we have noted some gaps and mistakes in the proofs of these theorems.

2.4.5. Comments on the methods of proof.

The proof of Theorem 2.1 is based on an argument extended from Roberts and Rosenthal (2007), which can be sketched heuristically as follows. For N large enough, we can expect P_N