
Graph cuts: The labeling produced by the graph cuts (GC) algorithm is determined by finding the minimum cut between the foreground and background seeds via a maximum flow computation. The original work on GC for interactive image segmentation was produced by Boykov and Jolly [32], and this work has subsequently been extended by several groups to employ different features [25] or user interfaces [151, 188]. Although GC is relatively new, the use of minimal surfaces in segmentation has been a common theme in computer vision for a long time [26, 99, 162], and other boundary-based user interfaces have been employed previously [61, 92, 112, 161]. Two concerns in the literature about the original GC algorithm are metrication error ("blockiness") and the shrinking bias. Metrication error was addressed in subsequent work on GC by including additional edges [34], by using continuous max flows [10], or by using total variation [210]. These methods successfully overcome metrication error, but may incur greater memory and computation costs than the application of maximum flow on a 4-connected lattice. The shrinking bias can cause overly small object segments because GC minimizes boundary length. Although techniques have been proposed for addressing the shrinking bias [10, 34, 214], they all require additional parameters or computation.
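As a concrete illustration of the min-cut/max-flow formulation described above, the sketch below segments a tiny grayscale image with an Edmonds-Karp max-flow on a 4-connected lattice. The function names, the Gaussian edge-weight formula, and the constants are illustrative assumptions, not the implementation of [32].

```python
from collections import deque, defaultdict
from math import exp

def graph_cut_segment(img, fg_seeds, bg_seeds, sigma=10.0):
    """Foreground/background labeling by min-cut / max-flow (Edmonds-Karp)
    on a 4-connected pixel lattice; returns a 0/1 label map."""
    h, w = len(img), len(img[0])
    S, T = h * w, h * w + 1                      # terminal node ids
    cap = defaultdict(int)                       # residual capacities
    adj = defaultdict(set)                       # neighbors in the residual graph

    def add_edge(u, v, c):
        cap[(u, v)] += c
        adj[u].add(v); adj[v].add(u)

    # n-links: high capacity between similar neighbors (cutting them is costly)
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < h and nx < w:
                    c = int(round(100 * exp(-(img[y][x] - img[ny][nx]) ** 2
                                            / (2 * sigma ** 2))))
                    u, v = y * w + x, ny * w + nx
                    add_edge(u, v, c); add_edge(v, u, c)

    INF = 10 ** 9                                # t-links: hard seed constraints
    for y, x in fg_seeds: add_edge(S, y * w + x, INF)
    for y, x in bg_seeds: add_edge(y * w + x, T, INF)

    while True:                                  # augment along shortest paths
        parent, q = {S: None}, deque([S])
        while q and T not in parent:
            u = q.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    q.append(v)
        if T not in parent:
            break                                # no augmenting path: cut found
        path, v = [], T
        while parent[v] is not None:
            path.append((parent[v], v)); v = parent[v]
        bottleneck = min(cap[e] for e in path)
        for u, v in path:
            cap[(u, v)] -= bottleneck
            cap[(v, u)] += bottleneck

    seen, q = {S}, deque([S])                    # source side of the min cut
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in seen and cap[(u, v)] > 0:
                seen.add(v); q.append(v)
    return [[1 if y * w + x in seen else 0 for x in range(w)] for y in range(h)]
```

On a toy image whose left half is bright and right half dark, one foreground seed on the left and one background seed on the right are enough for the cut to fall along the intensity boundary.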


convolution filters to decompose curvilinear features from background textures. The main idea behind such filter design is to create line-shaped templates to extract locally oriented gradient information. Frangi et al. analyze the eigenvalues of the Hessian matrix of a given image to obtain the principal directions of the local structure [Frangi 98]. Optimally Oriented Flux (OOF) measures the amount of outgoing gradient flux to find curvilinear structures [Law 08]. Morphological operators collect pixels according to structural similarity along an elongated path [Talbot 07]. However, owing to their lack of shape interpretation, such methods based on local image features are insufficient to reconstruct the underlying curvilinear structure. On the other hand, graphical models such as [González 10, Türetken 13b] define geometric constraints in a local configuration and globally minimize their cost function. More precisely, the graph-based algorithms initialize points that correspond strongly to the latent curvilinear structure, and then define a path connecting these points with geometric priors to produce plausible shapes. Geometric properties of the line network are incorporated as constraint terms when the energy optimization problem is formulated. Stochastic models [Lacoste 05, Jeong 15a] reconstruct curvilinear structures by sampling multiple line segments to maximize the posterior probability of the given image data. As in the graph-based representation, geometric priors are used to define the connectivity and curvature of line segments. Recently, machine learning algorithms have been proposed to detect curvilinear structure. Becker et al. [Becker 13] applied a boosting algorithm to obtain an optimal set of convolution filter banks. Sironi et al. [Sironi 14] developed a regression model to estimate the scale (width) of curvilinear structures and to localize their centerlines.
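A minimal sketch of the Hessian-eigenvalue idea behind Frangi-style filters: at each pixel, estimate the Hessian by finite differences and inspect its eigenvalues; on a bright ridge, the eigenvalue across the line is strongly negative while the one along it stays near zero. The function names and the crude vesselness-like score are illustrative assumptions, not the exact measure of [Frangi 98].

```python
from math import sqrt

def hessian_eigenvalues(img, y, x):
    """Eigenvalues of the finite-difference Hessian at interior pixel (y, x),
    sorted by increasing magnitude (|l1| <= |l2|)."""
    ixx = img[y][x + 1] - 2 * img[y][x] + img[y][x - 1]
    iyy = img[y + 1][x] - 2 * img[y][x] + img[y - 1][x]
    ixy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    half_tr = (ixx + iyy) / 2.0                      # closed form for a 2x2
    d = sqrt(((ixx - iyy) / 2.0) ** 2 + ixy ** 2)    # symmetric matrix
    return sorted((half_tr + d, half_tr - d), key=abs)

def ridge_strength(img, y, x):
    """Crude vesselness-like score: large when |l2| >> |l1| and l2 < 0
    (a bright line against a darker background)."""
    l1, l2 = hessian_eigenvalues(img, y, x)
    return max(-l2, 0.0) if abs(l1) < 0.5 * abs(l2) else 0.0
```

For a horizontal bright line on a dark background, the pixel at the centerline yields one eigenvalue near zero (along the line) and one strongly negative eigenvalue (across it), which is exactly the signature these filters exploit.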
Although the contour grouping algorithms [Tu 06, Arbeláez 11] examine image features corresponding to curves and lines, their goal is quite different from that of curvilinear structure reconstruction techniques. Contour grouping algorithms seek closed contour lines to divide an image into meaningful regions. Their cost functions therefore exploit global texture cues, in that the contours are associated with salient edges around object boundaries. We, on the other hand, look for multiple curvilinear structures that are not necessarily closed and are latent in homogeneous texture. Compared with contours, curvilinear structures must be estimated from subtle local image features. Internal similarity of the structure and an accurate design of the shape prior are essential to solve our problem.


Optical systems
Optical systems are inherently non-contact and extract three-dimensional information from the geometry and the texture of the visible surfaces in a scene. Structured-light (laser-based) systems can compute three-dimensional coordinates on most surfaces. In the case of systems that operate with ambient light (stereo- or photogrammetry-based systems), the surfaces that are measured must contain unambiguous features. Of course, external lighting can be projected onto surfaces in order to ease the processing tasks. Finally, these systems can acquire a large number of three-dimensional points in a single image at high data rates. With recent technological advances in electronics, photonics, computer vision and computer graphics, it is now possible to construct reliable, high-resolution and accurate three-dimensional optical measurement systems. The cost of high-resolution imaging sensors and high-speed processing workstations has decreased by an order of magnitude in the last five years. Furthermore, the convergence of photogrammetry and computer vision/graphics is helping system integrators provide users with cost-effective solutions for three-dimensional motion capture.
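For a rectified stereo pair with unambiguously matched features, recovering three-dimensional coordinates reduces to the standard triangulation relation. The sketch below (all parameter names are illustrative; a pinhole camera model is assumed) shows depth from disparity together with back-projection of a pixel to a 3-D point.

```python
def triangulate_depth(focal_px, baseline_m, disparity_px):
    """Depth of a matched feature in a rectified stereo pair: z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("a valid match needs a positive disparity")
    return focal_px * baseline_m / disparity_px

def backproject(u, v, cx, cy, focal_px, z):
    """3-D point for pixel (u, v) at depth z, pinhole model centered at (cx, cy)."""
    return ((u - cx) * z / focal_px, (v - cy) * z / focal_px, z)
```

For example, a 700 px focal length, a 12 cm baseline and a 14 px disparity place the feature at about 6 m from the cameras.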

Keywords: Stereovision; Stereo-correlation; 3-D digital image correlation (3D-DIC); Shape measurement; Displacement/strain measurement; Experimental mechanics
1. Introduction
Full-field optical techniques for displacement or strain measurements are now widely used in experimental mechanics. The main techniques are photoelasticity, geometric moiré, moiré interferometry, holographic interferometry, speckle interferometry (ESPI), the grid method and digital image correlation (DIC) [1–9]. It should be noted that some of these techniques can only measure in-plane displacements/strains on planar specimens, while others can give both in-plane and out-of-plane displacement/strain fields on any kind of specimen (planar or not). Due to its (apparent) simplicity and versatility, the DIC method is probably one of the most commonly used, and many applications can be found in the literature [10–45]. When it is used with a single camera (classical DIC), the DIC method can only give in-plane displacement/strain fields on planar objects. By using two cameras (stereovision), the 3-D displacement field and the

Boosting 3D-Geometric Features for Efficient Face Recognition and Gender Classification
Lahoucine Ballihi, Boulbaba Ben Amor, Mohamed Daoudi, Anuj Srivastava, and Driss Aboutajdine
Abstract—We combine ideas from two growing but disparate areas of computer vision – shape analysis using tools from differential geometry and feature selection using machine learning – to select and highlight the salient geometric facial features that contribute most to 3D face recognition and gender classification. First, a large set of geometric curve features is extracted using level sets (circular curves) and streamlines (radial curves) of the Euclidean distance functions of the facial surface; together they approximate facial surfaces with arbitrarily high accuracy. Then, we use the well-known AdaBoost algorithm for feature selection from this large set and derive a composite classifier that achieves high performance with a minimal set of features. This greatly reduced set, consisting of some level curves on the nose and some radial curves in the forehead and cheek regions, provides a very compact signature of a 3D face and a fast classification algorithm for face recognition and gender classification. It is also efficient in terms of data storage and transmission costs. Experimental results, carried out using the FRGCv2 dataset, yield a rank-1 face recognition rate of 98% and a gender classification rate of 86%.

cognition or concept representation needs to reconsider these deep hierarchies. In particular, the dynamics of neural processing is much more complex than hierarchical feedforward abstraction, and very important connectivity patterns such as lateral and recurrent interactions must be taken into account to overcome several pitfalls in understanding and modelling biological vision. In this section, we highlight some of these key novel features that should greatly influence computational models of visual processing. We also believe that identifying some of these problems could help in reunifying natural and artificial vision and in addressing more challenging questions, as needed for building adaptive and versatile artificial systems that are deeply bio-inspired. Vision processing starts at the levels of the retina and the lateral geniculate nucleus (LGN). Although this may sound obvious, the role played by these two structures seems largely underestimated. Indeed, most current models take images as inputs rather than their retina-LGN transforms. Thus, by ignoring what is processed at these levels, one could easily miss some key properties that underlie the efficiency of biological visual systems. At the retina level, the incoming light is transformed into electrical signals. This transformation was originally described using the linear-systems approach to model the spatio-temporal filtering of retinal images [86]. More recent research has changed this view, and several cortex-like computations have been identified in the retina of different vertebrates (see [110, 156] for reviews, and more details in Sec. 4.1). The fact that retinal and cortical levels share similar computational principles, albeit at different spatial and temporal scales, is an important point to consider when designing models of biological vision.

Such a change in perspective would have important consequences. For example, rather than considering how cortical circuits

its five descendants. The average gradient was 0.57 ± 0.11, indicating that much of the drop from full to no recognition occurs for a small change at the MIRC level (the MIRC itself or one level above, where the gradient was also found to be high). The examples in Fig. 4 illustrate how small changes at the MIRC level can have a dramatic effect on recognition rates. These changes disrupt visual features to which the recognition system is sensitive (6–9); these features are present in the MIRCs but not in the sub-MIRCs. Crucially, the role of these features is revealed uniquely at the MIRC level, because information is more redundant in the full-object image, where a similar loss of features has only a small effect. By comparing recognition rates of models at the MIRC and sub-MIRC levels, we were able to test computationally whether current models of human and computer vision extract and use similar visual features, and to test the ability of recognition models to recognize minimal images at a human level. The models in our testing included HMAX (10), a high-performing biological model of the primate ventral stream, along with four state-of-the-art computer vision models: (i) the Deformable Part Model (DPM) (11); (ii) support vector machines (SVM) applied to histogram-of-gradients (HOG) representations (12); (iii) extended Bag-of-Words (BOW) (13, 14); and (iv) deep convolutional networks (Methods) (15). All are among the top-performing schemes in standard evaluations (16).

1.4. Related Work
1.4.1 Online Programming Platforms
Our system follows in the footsteps of Scratch [6]. We take the basic concept of Scratch, an online programming and sharing platform designed to bring programming to under-served youth populations, and rethink it to fit the needs of the budding computer vision hobbyist. Scratch requires users to download and install a local application, whereas our system allows content authoring in the browser. Scratch uses Java as its publishing language, making it difficult to deploy camera-based applications since standard client installations do not include camera access features. In contrast, our system uses Flash, which includes camera access capabilities in the standard client installation. In addition to adding camera access, our system also allows programs to interact with services other than our own, enabling third-party integration.

14.1 Motivation
14.1.1 Efficiency and sparseness in biological representations of natural images
The central nervous system is a dynamical, adaptive organ which constantly evolves to provide optimal decisions for interacting with the environment. The early visual pathways provide a powerful system for probing and modeling these mechanisms. For instance, the primary visual cortex of primates (V1) is absolutely central to most visual tasks. There, it is observed that some neurons in the input layer of V1 exhibit selectivity for localized, edge-like features, as represented by their "receptive fields" (Hubel and Wiesel, 1968). Crucially, there is experimental evidence for sparse firing in the neocortex (Barth and Poulet, 2012; Willmore et al., 2011), and in particular in V1. A representation is sparse when each input signal is associated with a relatively small subset of simultaneously activated neurons within a whole population. For instance, the orientation selectivity of simple cells is sharper than the selectivity that would be predicted by linear filtering. Such a procedure produces a rough "sketch" of the image on the surface of V1 that is believed to serve as a "blackboard" for higher-level cortical areas (Marr, 1983). However, it is still largely unknown how neural computations act in V1 to represent the image. More specifically, what is the role of sparseness, as a generic neural signature, in the global function of neural computations?
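The notion of sparseness above can be made concrete with a toy encoder: project the input onto a dictionary of "receptive fields" and keep only the k most active responses, so that each signal activates a small subset of units. This is a deliberately crude sketch (a single projection pass, no reconstruction loop, invented names), not a model of V1.

```python
def sparse_code(signal, dictionary, k=2):
    """Toy sparse representation: correlate the signal with every 'receptive
    field' in the dictionary, then keep only the k strongest responses so that
    a small subset of units is simultaneously active."""
    acts = [sum(s * a for s, a in zip(signal, atom)) for atom in dictionary]
    keep = set(sorted(range(len(acts)), key=lambda i: abs(acts[i]),
                      reverse=True)[:k])
    return [acts[i] if i in keep else 0.0 for i in range(len(acts))]
```

With an orthonormal dictionary, this is exact hard thresholding; with an overcomplete one, real sparse coding would iterate (e.g., matching pursuit), which this sketch deliberately omits.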

∗ Corresponding author.
E-mail address: julien.maitre1@uqac.ca (J. Maitre).
and time consuming. Two approaches are typically used to identify and characterize mineral grains in sediments or milled rocks: visual sorting with optical microscopy and automated Scanning Electron Microscopy (SEM) (Gottlieb et al., 2000; Sutherland and Gottlieb, 1991). Techniques such as chemical analysis and X-ray diffraction of sands or milled rocks will not provide a real mineral count. In the case of optical microscopy, a highly qualified mineralogist identifies each individual mineral grain in a Petri dish at a typical rate of 60 grains per minute. It is tedious work that demands constant attention, where any minute distraction can ruin a day's work. Also, it provides grain percentage instead of area percentage (Nie and Peng, 2014). The main drawbacks of the optical approach are the fatigue of highly qualified personnel and the misidentification of minerals due to their lack of distinctive features and their small size. Alternatively, the SEM produces images of a mineral grain sample by scanning the surface with a focused beam of high-energy electrons to generate a variety of signals. Those signals are produced by electron-sample interaction and provide information such as the grain surface characteristics by secondary electrons (SE), its atomic density by backscattered electrons (BSE) and/or the chemical composition (from characteristic peaks in the X-ray spectrum). Mineral

In the first phase, a sliding window technique based on three features is used to track the rodent and determine its coarse position in the frame. The second phase uses the edge map and


Localization using Stochastic Modeling

Batool & Chellappa [1, 2] were the first to propose a generative stochastic model for wrinkles using Marked Point Processes (MPP). In their model, wrinkles were considered stochastic spatial arrangements of sequences of line segments, and were detected in an image by proper placement of line segments. Under a Bayesian framework, a prior probability model dictated the more probable geometric properties and spatial interactions of line segments. A data likelihood term, based on intensity gradients caused by wrinkles and highlighted by Laplacian-of-Gaussian (LoG) filter responses, indicated the more probable locations for line segments. Wrinkles were localized by sampling the MPP posterior probability using the Reversible Jump Markov Chain Monte Carlo (RJMCMC) algorithm. They proposed two MPP models in their work, [1] and [2], where the latter produced better localization results by introducing different moves in the RJMCMC algorithm and a different data likelihood term. They also presented an evaluation setup to quantitatively measure the performance of the proposed model in terms of detection and false-alarm rates in [2]. They demonstrated localization results on a variety of images obtained from the Internet. Figures 17 and 18 show examples of wrinkle localization from the two MPP models in [1] and [2], respectively.
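A greatly simplified, hypothetical stand-in for the sampling idea: a birth/death Metropolis chain over a point configuration whose energy combines a per-point prior penalty with a data term (here a precomputed score map playing the role of the LoG responses). Real RJMCMC over marked line segments additionally handles marks (orientation, length), dimension-matching acceptance ratios, and richer moves; none of that is attempted here.

```python
import random
from math import exp

def sample_points(score_map, n_iter=2000, penalty=5.0, seed=1):
    """Toy birth/death Metropolis sampler over a point configuration.
    Energy: +penalty for each point present (prior against clutter),
            -score_map[y][x] for each point on a high-response pixel (data)."""
    rng = random.Random(seed)
    h, w = len(score_map), len(score_map[0])
    cells = [(y, x) for y in range(h) for x in range(w)]
    state = set()
    for _ in range(n_iter):
        y, x = rng.choice(cells)                 # propose toggling one site
        d_birth = penalty - score_map[y][x]      # energy change if (y, x) is born
        dE = d_birth if (y, x) not in state else -d_birth
        if dE <= 0 or rng.random() < exp(-dE):   # Metropolis acceptance
            state ^= {(y, x)}                    # apply the birth or death move
    return state
```

On a score map with one line of strong responses, the sampled configuration concentrates on that line, mimicking how the MPP posterior favors segments lying on wrinkle-induced gradients.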

ε = − for double-precision numbers. In both cases, with these numerical values, the collapsing effect disappears and the invariant measure of any component is the Lebesgue measure [11], as we show below. In the case of computation using floating-point numbers, starting from most initial conditions it is possible to find a mega-periodic orbit (i.e., with period equal to 1,320,752). When computations are done with double-precision numbers, no periodic orbit can be found up to n = 5 × 10^11 iterations. In [11] the computations were performed on a Dell computer with a Pentium IV microprocessor, using a Borland C compiler computing with ordinary (IEEE-754) double-precision numbers.

Formal approaches
VO is a particular instance of statistical estimation, in which a quantity of interest, the state of the system, is involved in a criterion depending on some data (e.g., image features), and whose functional form derives from a statistical model of the various components (sensor noise, prior distributions on variables) and their relationships. Optimizing this criterion yields the optimal estimate of the state given the data, with the (implicit) relationship between data and estimated state being referred to as the estimator. Modeling efforts allow the properties of the estimator to be characterized theoretically. Some properties concern the discrepancy between the estimated state and the true one, such as bias (systematic error) and variance (statistical dispersion). Bias and variance are usually associated with the performance of the estimation. They are, in turn, characterized by another level of properties, called structural properties. Efficiency refers to the optimality of bias and variance for the problem at hand, i.e., that no other estimator can achieve lower values. Consistency expresses the fact that they correctly characterize the performance; that is to say, that the true state indeed lies within the interval of values defined by bias and variance. It clearly pertains to the safety of vision-based navigation: with a consistent estimator it is, for instance, possible to guarantee that the plane remains within some known bounds around the requested trajectory. Unfortunately, consistency is very difficult to assess for vision-based odometry or SLAM estimators. This is due to the non-linearity of the relationship between image data and state parameters. Also, as already mentioned, vision is prone to outliers, which are not accounted for in the problem modeling and which lead to inconsistency.
Hence, consistency is not a definitive answer to VO/SLAM safety issues, yet the vast literature on the subject includes relevant works, for instance regarding consistency-check techniques in indoor environments. In addition, contrasts should be stable
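The bias/variance vocabulary above can be checked empirically by Monte Carlo: draw many synthetic datasets, apply the estimator to each, and compare the spread of estimates with the true value. In this sketch the names and the choice of test estimator (the variance estimator dividing by n rather than n − 1) are ours for illustration; it recovers the well-known −σ²/n bias.

```python
import random
import statistics

def bias_variance(estimator, true_value, make_dataset, trials=5000, seed=0):
    """Monte Carlo characterization of an estimator: returns (bias, variance)
    of the estimates over many independently drawn datasets."""
    rng = random.Random(seed)
    estimates = [estimator(make_dataset(rng)) for _ in range(trials)]
    return statistics.fmean(estimates) - true_value, statistics.variance(estimates)

def naive_var(xs):
    """Variance estimator dividing by n (not n - 1): biased by -sigma^2 / n."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# N(0, 1) samples of size 10: true variance is 1, expected bias is -1/10
bias, var = bias_variance(
    naive_var, true_value=1.0,
    make_dataset=lambda rng: [rng.gauss(0.0, 1.0) for _ in range(10)])
```

A consistency check in the sense used above would then ask whether the true value lies within the interval implied by the reported bias and variance.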

S3C is a useful feature extractor that performs comparably to the best approaches when large amounts of labeled data are available.
5.7.2 CIFAR-100
Having verified that S3C features help to regularize a classifier, we proceed to use them to improve performance on the CIFAR-100 dataset, which has ten times as many classes and ten times fewer labeled examples per class. We compare S3C to two other feature extraction methods: OMP-1 with thresholding, which Coates and Ng (2011) found to be the best feature extractor on CIFAR-10, and sparse coding, which is known to perform well when less labeled data is available. We evaluated only a single set of hyperparameters for S3C. For sparse coding and OMP-1 we searched over the same set of hyperparameters as Coates and Ng (2011): {0.5, 0.75, 1.0, 1.25, 1.25} for the sparse coding penalty and {0.1, 0.25, 0.5, 1.0} for the thresholding value. In order to use a comparable amount of computational resources in all cases, we used at most 1600 hidden units and a 3 × 3 pooling grid for all three methods. For S3C, this was the only feature encoding we evaluated. For SC (sparse coding) and OMP-1, which double their number of features via sign splitting, we also evaluated 2 × 2 pooling with 1600 latent variables and 3 × 3 pooling with 800 latent variables, to be sure the models do not suffer from overfitting caused by the larger feature set. These results are summarized in Fig. 5.9.
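To make the encoding vocabulary concrete, here is a hedged sketch of thresholding with sign splitting (which doubles the feature count, as noted above for SC and OMP-1) followed by sum pooling over a spatial grid. It mirrors the Coates and Ng (2011) pipeline only schematically; the function names and the sum-pooling choice are our assumptions.

```python
def threshold_encode(acts, t=0.5):
    """Thresholding with sign splitting: each activation a yields two
    non-negative features, max(a - t, 0) and max(-a - t, 0), doubling
    the feature count."""
    return [max(a - t, 0.0) for a in acts] + [max(-a - t, 0.0) for a in acts]

def sum_pool(fmap, grid=2):
    """Sum-pool an h x w x d feature map over a grid x grid spatial partition."""
    h, w, d = len(fmap), len(fmap[0]), len(fmap[0][0])
    out = [[[0.0] * d for _ in range(grid)] for _ in range(grid)]
    for y in range(h):
        for x in range(w):
            gy, gx = y * grid // h, x * grid // w  # which pooling cell (y, x) is in
            for k in range(d):
                out[gy][gx][k] += fmap[y][x][k]
    return out
```

Note how the pooled representation grows as grid² × d, which is why the experiment above trades pooling resolution (2 × 2 vs. 3 × 3) against the number of latent variables to keep the classifier input comparable.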


Abstract
We introduce an approach for analyzing the variation of features generated by convolutional neural networks (CNNs) with respect to scene factors that occur in natural images. Such factors may include object style, 3D viewpoint, color, and scene lighting configuration. Our approach analyzes CNN feature responses corresponding to different scene factors by controlling for them via rendering using a large database of 3D CAD models. The rendered images are presented to a trained CNN, and responses for different layers are studied with respect to the input scene factors. We perform a decomposition of the responses based on knowledge of the input scene factors and analyze the resulting components. In particular, we quantify their relative importance in the CNN responses and visualize them using principal component analysis. We show qualitative and quantitative results of our study on three CNNs trained on large image datasets: AlexNet [18], Places [40], and Oxford VGG [8]. We observe important differences across the networks and CNN layers for different scene factors and object categories. Finally, we demonstrate that our analysis based on computer-generated imagery translates to the network representation of natural images.

Keywords: Fog, Meteorological Optical Range, Contrast, Visibility, Imaging, Computer Vision
1 Introduction
Fog is a quite common meteorological phenomenon. It occurs under certain wind, temperature, and humidity conditions, when vapour condenses into microscopic water droplets around airborne particles, causing the optical density of the atmosphere to rise dramatically. The net result is that visibility drops to levels where traffic becomes hazardous, with disruptive effects on ground (and other modes of) transport. Presently, the main way to prevent such disruptions is to warn drivers ahead of a foggy area, so that they can adapt their behaviour (Al-Ghamdi, 2007).
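The link between the optical density of the atmosphere and visibility is usually quantified through the Koschmieder relation: the meteorological optical range (MOR) is the distance at which the apparent contrast of a black target against the horizon drops to the conventional 5% threshold. A small sketch (function names are ours):

```python
from math import exp, log

def meteorological_optical_range(beta):
    """Koschmieder relation: MOR = -ln(0.05) / beta  (about 3 / beta),
    where beta is the atmospheric extinction coefficient in 1/m."""
    return -log(0.05) / beta

def apparent_contrast(c0, beta, d):
    """Contrast of a target seen through fog: C(d) = C0 * exp(-beta * d)."""
    return c0 * exp(-beta * d)
```

For an extinction coefficient of 0.03 m⁻¹, the MOR is about 100 m, and a unit-contrast target at that distance indeed shows 5% apparent contrast, which is what camera-based visibility meters try to estimate from images.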

enough (arithmetic roughness equal to 1.2 nm over 1 µm × 1 µm) and do not influence the current collection.
V. C ONCLUSION
In this work, the influence of the tip-plane configuration involved in AFM measurements on the electric field in a thin dielectric layer is studied. Experimental and FEM results demonstrate that, concerning the charge injection mechanism, the radial electric field influences the lateral spreading of charge whereas the axial electric field governs the amount of injected charge. Moreover, the nanostructured nature of the dielectric layer mainly influences the injection process. Concerning the C-AFM measurements, the macroscopic laws fail to interpret the experimental results, and a new model needs to be developed to reproduce the real configuration (heterogeneous electric field distribution and influence of injected charge). Indeed, taking into account the electric field at the contact point is not enough to reproduce the real conditions and its distribution in the

However, when a risk of distortion or cracking during quenching is detected for a given industrial component, numerical simulations are not systematically performed to quantify it. To the authors' knowledge, the main reason is that the models involved require a very large amount of data, such as the phase transformation curves, mechanical properties of each constituent, heat exchange coefficients, etc. Moreover, the consequences of using any simplified set of data on the numerical results are not clearly understood. For example, it is known that, during the cooling part of the quenching process, distortions occur at high temperature (at the beginning of cooling), while residual stresses essentially develop at low temperature (at the end of cooling). Nevertheless, the influence of high- and low-temperature parameters on residual distortions and stresses, respectively, has not yet been quantified. It turns out that, before any numerical simulation of a quenching process, a long time period is often necessary to
