
Learning representations from functional MRI data

Abstract. Thanks to the advent of functional brain-imaging technologies, cognitive neuroscience is accumulating maps of neural activity responses to specific tasks or stimuli, or of spontaneous activity. In this work, we consider data from functional Magnetic Resonance Imaging (fMRI), which we study in a machine-learning setting: we learn a model of brain activity that should generalize to unseen data. After reviewing standard fMRI data-analysis techniques, we propose new methods and models to benefit from recently released large fMRI data repositories. Our goal is to learn richer representations of brain activity. We first focus on unsupervised analysis of terabyte-scale fMRI data acquired on subjects at rest (resting-state fMRI). We perform this analysis using matrix factorization. We present new methods for running sparse matrix factorization/dictionary learning on hundreds of fMRI records in reasonable time. Our leading approach relies on introducing randomness in stochastic optimization loops and provides a speed-up of an order of magnitude on a variety of settings and datasets. We provide an extended empirical validation of our stochastic subsampling approach on datasets from fMRI, hyperspectral imaging and collaborative filtering. We derive convergence properties for our algorithm, in a theoretical analysis that reaches beyond the matrix factorization problem. We then turn to fMRI data acquired on subjects undergoing behavioral protocols (task fMRI). We investigate how to aggregate data from many source studies, acquired with many different protocols, in order to learn more accurate and interpretable decoding models that predict stimuli or tasks from brain maps. Our multi-study shared-layer model learns to reduce the dimensionality of input brain images while simultaneously learning to decode these images from their reduced representation.
This fosters transfer learning between studies, as we learn the undocumented cognitive aspects that the many fMRI studies have in common. As a consequence, our multi-study model performs better than single-study decoding. Our approach identifies a universally relevant representation of brain activity, supported by a few task-optimized networks learned during model fitting.
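The stochastic optimization loop mentioned above can be illustrated with scikit-learn's MiniBatchDictionaryLearning, which implements the classic online (mini-batch) sparse dictionary-learning algorithm; this is a minimal sketch on a random stand-in for a (timepoints × voxels) resting-state matrix, and it does not reproduce the thesis's additional stochastic subsampling trick, only the family of methods it accelerates.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.RandomState(0)
# Toy stand-in for an (n_timepoints, n_voxels) resting-state fMRI matrix.
X = rng.randn(200, 50)

# Online (mini-batch) dictionary learning: each iteration touches only a
# small batch of rows, the kind of stochastic loop the thesis speeds up
# further by also subsampling columns (voxels) at each step.
dico = MiniBatchDictionaryLearning(n_components=10, alpha=1.0,
                                   batch_size=20, random_state=0)
codes = dico.fit_transform(X)
print(codes.shape, dico.components_.shape)  # (200, 10) (10, 50)
```

Each of the 10 components plays the role of a spatial network; the sparse codes give its loading at each timepoint.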

Operator-valued Kernels for Learning from Functional Response Data

…∑_i [(θ_i + λ)^{-1} ∑_{j=1}^{n} ⟨z_{ij}, y_j⟩ z_i]. To put our algorithm into context, we recall that a crucial question about the applicability of functional data is how one can find an appropriate space and basis in which the functions can be decomposed in a computationally feasible way while taking into account the functional nature of the data. This is exactly what Algorithm 1 does. In contrast to parametric FDA methods, the basis functions here are not fixed in advance but implicitly defined by choosing a reproducing operator-valued kernel acting on both input and output data. The spectral decomposition of the block operator kernel matrix naturally allows the assignment of an appropriate basis function to the learning process for representing input and output functions. Moreover, the formulation is flexible enough to be used with different operators and thus to be adapted to various applications involving functional data. Also, in the context of nonparametric FDA, where the notion of semi-metric plays an important role in modeling functional data, we note that Algorithm 1 is based on computing and choosing a finite number of eigenfunctions. This is strongly related to the semi-metric building scheme of Ferraty and Vieu (2006), which is based on, for example, functional principal components or successive derivatives. Operator-valued kernels constructed from the covariance operator (Kadri et al., 2013b) or the derivative operator make it possible to design semi-metrics similar to those just mentioned. In this sense, the eigendecomposition of the block operator kernel matrix offers a new way of producing semi-metrics.
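To make the role of the eigendecomposition concrete, the following sketch builds a kernel Gram matrix over discretized functions and uses its leading eigenvectors as a data-driven basis, with the (θ_i + λ)^{-1} weighting that appears in the expression above. A plain scalar Gaussian kernel stands in for the block operator-valued kernel; the kernel choice, grid, and sample functions are illustrative assumptions, not the paper's Algorithm 1.

```python
import numpy as np

rng = np.random.RandomState(0)
t = np.linspace(0, 1, 50)                          # evaluation grid
# 30 sampled functions: sine waves with random frequencies
Y = np.sin(2 * np.pi * np.outer(1 + rng.rand(30), t))

# Stand-in for (the scalar part of) the block operator kernel matrix:
# a Gaussian kernel between discretized functions.
sq = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq / sq.mean())

# Spectral decomposition: the leading eigenvectors act as the data-driven
# basis discussed above, and the theta_i are the eigenvalues.
theta, Z = np.linalg.eigh(K)
theta, Z = theta[::-1], Z[:, ::-1]                 # sort descending
lam = 1e-2                                          # regularization parameter
coef = 1.0 / (theta[:5] + lam)                      # (theta_i + lambda)^{-1} weights
print(np.round(theta[:3], 2))
```

In the actual method the matrix is operator-valued and acts jointly on input and output functions; here the decomposition only illustrates how an adapted finite basis emerges from the kernel rather than being fixed a priori.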

Learning from genomic data : efficient representations and algorithms.

CHAPTER 1. INTRODUCTION

…a review). For example, in object recognition tasks, deep convolutional neural networks have largely superseded the manually engineered SIFT features. The representations learned by these deep convolutional architectures were shown to capture concepts such as edges or textures in the first layers, and objects or parts of objects, such as eyes or cats, in deeper layers. In NLP, representation learning techniques have also imposed their supremacy over more traditional models such as N-grams or latent semantic analysis (LSA). Today, Word2Vec [Mikolov et al., 2013] yields among the best vector representations for words. It refers to one of two models, the continuous bag-of-words (CBOW) model or the skip-gram model. Both are linear models whose architecture contains a single hidden layer (with no non-linearity applied). The CBOW model predicts which word is most likely to appear given a certain number of words that precede and follow it in a sentence. By contrast, the skip-gram model predicts the words that are likely to surround a given word in a sentence. Both models take as input the one-hot encoding of words and implement a softmax regression to predict the output. The new vector representation of words learned by these models is the representation of the words in the hidden layer. This representation was shown to be interesting since it encodes not only syntactic but also semantic similarities between words. For example, simple arithmetic operations such as Vec('King') - Vec('Man') + Vec('Woman') yield a new vector which is closest to the vector that represents the word 'Queen' in the database. The representation learned with Word2Vec has been successfully used in various NLP tasks.
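The skip-gram idea described above can be sketched in a few lines of NumPy: a toy corpus, one-hot inputs, a single linear hidden layer, and a softmax output trained to predict neighboring words. This is a didactic miniature (full softmax, plain SGD, an invented corpus), not Mikolov et al.'s optimized implementation with negative sampling.

```python
import numpy as np

corpus = "the king rules the land the queen rules the land".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 8                  # vocabulary size, embedding dimension

rng = np.random.RandomState(0)
W_in = rng.randn(V, D) * 0.1          # hidden layer: one vector per word
W_out = rng.randn(D, V) * 0.1         # output (softmax) layer

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Skip-gram training pairs: each word predicts its immediate neighbors.
pairs = [(idx[corpus[i]], idx[corpus[j]])
         for i in range(len(corpus))
         for j in (i - 1, i + 1) if 0 <= j < len(corpus)]

def total_loss():
    return -sum(np.log(softmax(W_in[c] @ W_out)[o]) for c, o in pairs)

before = total_loss()
for _ in range(200):                  # plain SGD over the full softmax
    for c, o in pairs:
        p = softmax(W_in[c] @ W_out)
        p[o] -= 1.0                   # gradient of cross-entropy w.r.t. logits
        grad_in = W_out @ p
        W_out -= 0.05 * np.outer(W_in[c], p)
        W_in[c] -= 0.05 * grad_in
print(total_loss() < before)          # the model fits its tiny corpus
```

The learned word vectors are the rows of `W_in`: exactly the hidden-layer representation the text refers to.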

Non-invasive inference of information flow using diffusion MRI, functional MRI, and MEG

Our results on HCP data showed that the solutions found using functional and diffusion MRI identify fewer cortical regions while still explaining the M/EEG data. It is important to recall that, because there are fewer MEG measurements than cortical sources to estimate, recovering brain activity from MEG measurements is an ill-posed problem. As a direct consequence, infinitely many source configurations will explain the observed measurements. To resolve this ambiguity, our approach makes use of prior information from other modalities to select the single source configuration which is closest to the priors. In doing so, our approach also estimates white-matter information flow, understood as the posterior likelihood of a connection being active given the MEG measurements. With the addition of functional MRI, the information flow between cortical regions known to be involved in visuomotor tasks was identified with no manual selection of these regions of interest.
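The ill-posedness and the role of priors can be illustrated with a weighted minimum-norm estimate: with fewer sensors than sources, a prior covariance (here a hypothetical fMRI-derived weighting) decides which of the infinitely many explanations is selected. The gain matrix, weights, and regularization below are invented for illustration; the chapter's actual model additionally uses connectivity priors from diffusion MRI.

```python
import numpy as np

rng = np.random.RandomState(0)
n_sensors, n_sources = 20, 100        # far fewer measurements than sources
G = rng.randn(n_sensors, n_sources)   # hypothetical forward (gain) matrix
x_true = np.zeros(n_sources)
x_true[[3, 50]] = 1.0                 # two active sources
y = G @ x_true + 0.01 * rng.randn(n_sensors)   # "MEG" measurements

# Weighted minimum-norm estimate: among the infinitely many source
# configurations explaining y, pick the one closest to the prior.
# w encodes prior information (a made-up fMRI-derived weighting here:
# larger weight = source a priori more likely to be active).
w = np.full(n_sources, 0.1)
w[[3, 50]] = 1.0
R = np.diag(w)                        # prior source covariance
lam = 1e-2                            # noise regularization
x_hat = R @ G.T @ np.linalg.solve(G @ R @ G.T + lam * np.eye(n_sensors), y)
print(np.argsort(np.abs(x_hat))[-2:])  # indices of the two strongest sources
```

With a flat prior (`w` constant) the estimate smears activity over many sources; the informative prior concentrates it, which is the mechanism behind the "fewer cortical regions" observation above.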

Kernel-based learning on hierarchical image representations : applications to remote sensing data classification

Figure 3.18: Classification accuracy w.r.t. the maximum considered subpath length P. SBoSK is computed on the Strasbourg Spot-4 image with D = 4096.

…known techniques for spatial/spectral remote sensing image classification. The spatial-spectral kernel [61] was introduced to take into account both the pixel spectral value and spatial information through access to the nesting region. We therefore implement the spatial-spectral kernel based on the multiscale segmentation used throughout this paper, and select the best level (determined by cross-validation) to extract spatial information. The attribute profile [48] is considered one of the most powerful techniques for describing image content through context features. We use the full multi-spectral bands with automatic level selection for the area attribute and the standard-deviation attribute, as detailed in [75]. The stacked vector was adopted in [96, 25, 113] and relies on features extracted from a hierarchical representation. We use a Gaussian kernel with a stacked vector that concatenates all nodes from ascending paths generated by our multiscale segmentation. The comparison is done by randomly choosing n = [50, 100, 200, 400] samples for training and keeping the rest for testing. The classification accuracies of the different methods are shown in Tab. 3.3. Three accuracy assessment measures common in the remote sensing community [15, 40] are reported: overall accuracy, av-

Learning representations for Information Retrieval

Summary

Information retrieval is generally concerned with answering questions such as: is this document relevant to this query? How similar are two queries or two documents? How can query-document similarity be used to enhance relevance estimation? In order to answer these questions, it is necessary to access computational representations of documents and queries. For example, similarities between documents and queries may correspond to a distance or a divergence defined on the representation space. It is generally assumed that the quality of the representation has a direct impact on the bias with respect to the true similarity, estimated by means of human intervention. Building useful representations for documents and queries has always been central to information retrieval research. The goal of this thesis is to provide new ways of estimating such representations and the relevance relationship between them. We present four articles that have been published in international conferences and one published in an information retrieval evaluation forum. The first two articles can be categorized as feature engineering approaches, which transduce a priori knowledge about the domain into the features of the representation. We present a novel retrieval model that compares favorably to existing models in terms of both theoretical originality and experimental effectiveness. The remaining two articles mark a significant change in our vision and originate from the widespread interest in deep learning research that took place during the time they were written. They therefore naturally belong to the category of representation learning approaches, also known as feature learning. Unlike the previous approaches, the learning model discovers on its own the most important features for the task at hand, given a considerable amount of labeled data. We propose to model the semantic relationships between documents and queries and between queries themselves.
The models presented have also shown improved effectiveness on standard test collections. These last articles are among the first applications of representation learning with neural networks to information retrieval. This series of research leads to the following observation: future improvements in information retrieval effectiveness will have to rely on representation learning techniques rather than on manually defined representation spaces.

Learning from ranking data : theory and methods

Whatever the type of task considered (supervised or unsupervised), machine-learning algorithms generally rest upon the computation of statistical quantities such as averages or linear combinations of the observed features that represent the data efficiently. However, summarizing ranking variability is far from straightforward, and extending simple concepts such as the average or median to preference data, i.e. ranking aggregation, raises a number of deep mathematical and computational problems, on which we focused in Part I. Regarding dimensionality reduction, it is far from straightforward to adapt traditional techniques such as Principal Component Analysis and its numerous variants to the ranking setup, the main barrier being the absence of a vector space structure on the set of permutations. In this chapter, we develop a novel framework for representing the distribution of ranking data in a simple manner, which is shown to extend, in some sense, consensus ranking. The rationale behind the approach we promote is that, in many situations encountered in practice, the set of instances may be partitioned into subsets/buckets such that, with high probability, objects belonging to a certain bucket are either all ranked higher or all ranked lower than objects lying in another bucket. In such a case, the ranking distribution can be described in a sparse fashion by: 1) a partial ranking structure (related to the buckets) and 2) the marginal ranking distributions associated with each bucket. Precisely, optimal representations are defined here as those associated with a bucket order minimizing a certain distortion measure we introduce, the latter being based on a mass transportation metric on the set of ranking distributions.
In this chapter, we also establish rate bounds describing the generalization capacity of bucket order representations obtained by minimizing an empirical version of the distortion, and we address model selection issues related to the choice of the bucket order size/shape. Numerical results are also displayed, providing in particular strong empirical evidence for the relevance of the notion of sparsity on which the introduced dimensionality reduction technique is based.
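The bucket-order sparsity idea can be illustrated numerically: if rankings are drawn by shuffling within buckets and concatenating the buckets, the pairwise marginals show a deterministic block structure across buckets, which is exactly what makes the sparse description (bucket order + within-bucket marginals) lossless in this idealized case. The bucket sizes and sampling scheme below are illustrative assumptions, not the chapter's distortion-minimization procedure.

```python
import numpy as np

rng = np.random.RandomState(0)
buckets = [[0, 1], [2, 3, 4]]        # bucket order: {0, 1} above {2, 3, 4}
n_items, n_samples = 5, 500

def sample_ranking():
    # Shuffle within each bucket, then concatenate the buckets:
    # cross-bucket comparisons are deterministic, within-bucket ones random.
    r = []
    for b in buckets:
        r.extend(rng.permutation(b))
    return r

# Estimate the pairwise marginals P(i ranked before j).
P = np.zeros((n_items, n_items))
for _ in range(n_samples):
    pos = {item: k for k, item in enumerate(sample_ranking())}
    for i in range(n_items):
        for j in range(n_items):
            if i != j and pos[i] < pos[j]:
                P[i, j] += 1
P /= n_samples
print(P[0, 2], P[2, 0])  # 1.0 0.0: the cross-bucket block is deterministic
```

Only the within-bucket entries of P carry statistical information, so storing the bucket order plus the per-bucket marginals fully describes the distribution.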

Learning Myelin Content in Multiple Sclerosis from Multimodal MRI through Adversarial Training

3.2 Qualitative Evaluation

Figure 3 shows a qualitative comparison of our prediction results, a 2-layer DNN as in [8], and a single cGAN (corresponding to the sketcher in our approach) with the corresponding input multimodal MRI and the true [11C]PIB PET DVR parametric map. We find that the 2-layer DNN fails to capture the nonlinear mapping between the multimodal MRI and the myelin content in PET. In particular, some anatomical or structural traces (that are not present in the ground truth) can still be found in the 2-layer-DNN-predicted PET. This highlights that the relationship between myelin content and multimodal MRI data is complex, and that two layers are not powerful enough to encode-decode it.

Radiological classification of dementia from anatomical MRI assisted by machine learning-derived maps

Materials and Methods. We studied 34 patients with early-onset Alzheimer's disease (EOAD), 49 with late-onset AD (LOAD), 39 with frontotemporal dementia (FTD) and 24 with depression from the pre-existing cohort CLIN-AD. Automatic support vector machine (SVM) classifiers using 3D T1 MRI were trained to distinguish: LOAD vs depression, FTD vs LOAD, EOAD vs depression, and EOAD vs FTD. We extracted SVM weight maps, which are three-dimensional representations of the discriminant atrophy patterns used by the classifier to make its decisions, and printed posters of these maps. Four radiologists (2 senior neuroradiologists and 2 unspecialized junior radiologists) performed a visual classification of the 4 diagnostic pairs using 3D T1 MRI. Classifications were performed twice: first with standard radiological reading, and then using the SVM weight maps as a guide.
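A weight map of the kind described can be extracted from any linear SVM. The sketch below uses scikit-learn on synthetic data with "atrophy" injected into a known block of features; all sizes and effect strengths are invented, and a real pipeline would first extract gray-matter maps from the 3D T1 images before mapping the weights back into 3D.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
# Toy stand-in: 40 "patients" x 100 "voxels"; the second group has
# atrophy (lower intensities) in voxels 10-19.
X = rng.randn(40, 100)
y = np.array([0] * 20 + [1] * 20)
X[20:, 10:20] -= 1.5

clf = SVC(kernel="linear").fit(X, y)
weight_map = clf.coef_.ravel()        # one discriminant weight per voxel
# In the study, these weights are rendered as 3D maps and printed as
# posters showing the atrophy pattern driving the classifier's decisions.
print(np.abs(weight_map[10:20]).mean() > np.abs(weight_map).mean())
```

The weights concentrate on the atrophied region, which is what makes such maps readable guides for radiologists.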

Learning Disentangled Representations via Mutual Information Estimation

1. Introduction

The success of deep learning involves supervised learning, where massive amounts of labeled data are used to learn useful representations from raw data. As labeled data is not always accessible, unsupervised learning algorithms have been proposed to learn useful data representations that are easily transferable to downstream tasks. A desirable property of these algorithms is to perform dimensionality reduction while keeping the most important attributes of the data. For instance, methods based on deep neural networks have been proposed using autoencoder approaches [13, 16, 17] or generative models [1, 6, 10, 18, 20, 25]. Nevertheless, learning from high-dimensional data can be challenging. Autoencoders have difficulty dealing with multimodal data distributions, and generative models rely on computationally demanding architectures [9, 15, 24] which are particularly complicated to train.

Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures

…vectors, and thus an appropriate statistical model of realistic data. Targeting classification applications by assuming a mixture-of-concentrated-random-vectors model, this article studies the spectral behavior of Gram matrices G in the large n, p regime. Precisely, we show that these matrices asymptotically (as n, p → ∞ with p/n → c < ∞) have the same first-order behavior as for a Gaussian Mixture Model (GMM). As a result, by generating images using the BigGAN model (Brock et al., 2018) and considering several commonly used deep representation models, we show that the spectral behavior of the Gram matrix computed on these representations is the same as for a GMM with the same p-dimensional means and covariances. A surprising consequence is that, for GAN data, the aforementioned sufficient statistics characterizing the quality of a given representation network are only the first- and second-order statistics of the representations. Simulations show that this behavior extends beyond random GAN data to real images from the ImageNet dataset (Deng et al., 2009). The rest of the paper is organized as follows. In Section 2, we introduce the notion of concentrated vectors and their main properties. Our main theoretical results are provided in Section 3. In Section 4 we present experimental results. Section 5 concludes the article.
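The spike-plus-bulk spectral behavior underlying the result can be simulated directly for a GMM; per the theorem, data with the same first two moments (e.g. deep representations of GAN images) would produce the same limiting spectrum. The dimensions and class means below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.RandomState(0)
p, n = 400, 200                       # large-dimensional regime, c = n/p
mu = np.zeros(p)
mu[0] = 3.0                           # class mean separation
# Two-class Gaussian mixture: means +/- mu, identity covariance.
X = np.vstack([rng.randn(n // 2, p) + mu,
               rng.randn(n // 2, p) - mu])

G = X @ X.T / p                       # Gram matrix
eig = np.linalg.eigvalsh(G)           # eigenvalues in ascending order

# The spectrum shows a Marchenko-Pastur-like bulk plus one isolated
# spike carrying the class structure; the theorem says any data with
# the same first two moments yields the same limiting spectrum.
print(round(eig[-1] / eig[-2], 1))    # spike well above the bulk edge
```

Replacing `X` with deep representations of BigGAN images (same means/covariances) would, by the paper's result, leave this first-order spectral picture unchanged.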

A shape base framework to segmentation of tongue contours from MRI data

We briefly review the previous work we believe to be most relevant to the presented method. [1] proposed a popular model that has been frequently used in speech processing: a PCA-guided articulatory model is built to control tongue shapes in 2D. [2] developed active contours that use a shape model defined by a PCA; the curve evolves locally based on image gradients and curvature, and globally towards the MAP estimate of the position and shape of the object. [3] adopted an implicit representation of the segmenting curve and calculated pose parameters to minimize a region-based energy functional. In this paper, we introduce a robust variational framework for segmentation. Following the work in [4] and [5], we construct a total energy that includes both global and local image statistics. Shape priors are incorporated into segmentation via a PCA model. We describe this framework in Section 2. The implementation details are discussed in Section 3. In Section 4, we present results obtained using the proposed framework and make comparisons with other approaches. We conclude in Section 5.
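The PCA shape prior at the heart of such frameworks can be sketched as follows: learn a mean contour and a few modes of variation from training shapes, then project any candidate curve onto that subspace so the evolving contour stays within plausible shapes. The synthetic sine-bump "tongue contours" below are purely illustrative.

```python
import numpy as np

rng = np.random.RandomState(0)
t = np.linspace(0, np.pi, 30)         # 30 contour points
# Training shapes: a bump whose amplitude varies across examples.
shapes = np.array([np.sin(t) * (1 + 0.3 * rng.randn()) for _ in range(50)])

# PCA shape model: mean shape plus a few modes of variation.
mean = shapes.mean(axis=0)
U, S, Vt = np.linalg.svd(shapes - mean, full_matrices=False)
modes = Vt[:2]                        # keep the 2 leading modes

# During segmentation, a candidate curve is projected onto the model,
# constraining the evolving contour to plausible shapes.
noisy = shapes[0] + 0.2 * rng.randn(30)       # corrupted candidate curve
b = modes @ (noisy - mean)                     # shape coefficients
regularized = mean + modes.T @ b               # closest shape in model space
err_reg = np.linalg.norm(regularized - shapes[0])
err_noisy = np.linalg.norm(noisy - shapes[0])
print(err_reg < err_noisy)                     # the prior removes most noise
```

In the variational framework this projection step alternates with the image-driven (gradient/region energy) evolution of the curve.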

Reproducible evaluation of Alzheimer's Disease classification from MRI and PET data

Rathore, S., Habes, M., Iftikhar, M.A., Shacklett, A., Davatzikos, C. (2017), 'A review on neuroimaging-based classification studies and associated feature extraction methods for Alzheimer's disease and its prodromal stages', NeuroImage, vol. 155, pp. 530–548.

Samper-González, J., Burgos, N., Fontanella, S., Bertin, H., Habert, M.-O., Durrleman, S., Evgeniou, T., Colliot, O., ADNI (2017), 'Yet another ADNI machine learning paper? Paving the way towards fully-reproducible research on classification of Alzheimer's disease', Machine Learning in Medical Imaging, MLMI 2017, LNCS, vol. 10541, pp. 53–60.

Learning Multicriteria Fuzzy Classification Method PROAFTN from Data

…_j(b_i^h), for j = 1, 2, …, m; h = 1, 2, …, k; and i = 1, 2, …, L_h. When evaluating a certain quantity or measure with a regular (crisp) interval, there are two extreme cases that we should try to avoid. It is possible to make a pessimistic evaluation, but then the interval will be wider. It is also possible to make an optimistic evaluation, but then there is a risk that the output measure falls outside the resulting narrow interval, so that the reliability of the obtained results becomes doubtful. Fuzzy intervals do not have these problems: they make it possible to have both pessimistic and optimistic representations of the studied measure simultaneously [22]. This is why we introduce the thresholds d_1
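A minimal sketch of a fuzzy interval, assuming a standard trapezoidal membership function (the method's exact parameterization with its thresholds may differ): the core plays the role of the optimistic (narrow) evaluation and the support that of the pessimistic (wide) one, with membership degrading linearly in between instead of jumping from 1 to 0 at a crisp boundary.

```python
def trapezoidal(x, a, b, c, d):
    """Membership of x in a fuzzy interval with support [a, d] and core [b, c]."""
    if x < a or x > d:
        return 0.0          # outside even the pessimistic interval
    if b <= x <= c:
        return 1.0          # inside the optimistic core
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge

# A measurement of 4.5 lies fully in the core, 5.5 only partially:
print(trapezoidal(4.5, 3, 4, 5, 6), trapezoidal(5.5, 3, 4, 5, 6))
# 1.0 0.5
```

A crisp interval is the special case a = b and c = d, which is exactly the all-or-nothing behavior the text argues against.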

Frankenstein: Learning Deep Face Representations using Small Data

Although hand-crafted features and metric learning achieve promising performance for uncontrolled face recognition, it remains cumbersome to improve the design of hand-crafted local features (such as SIFT [25]) and their aggregation mechanisms (such as Fisher vectors [38]). This is because the experimental evaluation results of the features cannot be automatically fed back to improve robustness to nuisance factors such as pose, illumination and expression. The major advantage of CNNs is that all processing layers, starting from the raw pixel-level input, have configurable parameters that can be learned from data. This obviates the need for manual feature design and replaces it with supervised, data-driven feature learning. Learning the large number of parameters in CNN models (millions of parameters are the rule rather than the exception) requires very large training datasets. For example, the CNNs that achieve state-of-the-art performance on the LFW benchmark are trained using datasets with millions of labeled faces: Facebook's DeepFace [54] and Google's FaceNet [40] were trained using 4 million and 200 million training samples, respectively.

Using partial correlation to enhance structural equation modeling of functional MRI data.

The relationships between structural equation modeling and conditional correlation have been the topic of much research and involve graph-theoretic concepts like morality and d-separation (Whittaker, 1990; Lauritzen, 1996; Pearl, 2001). Theoretical considerations led us to hypothesize that partial and, more generally, conditional correlation coefficients could extract the (undirected) structure of effective connectivity from the data (Marrelec et al., 2005a). The analysis developed in this paper strongly supports this assumption. Indeed, while we demonstrated that a lack of partial correlation between two regions can potentially be related to a lack of underlying anatomical connection, the example used suggests that a strong and significant partial correlation can be interpreted as the presence of an effective connection. Whether this behavior is a general property of fMRI data or only incidental remains to be investigated. Nonetheless, we believe that partial correlation will prove essential to the investigation of effective connectivity, for it can compensate for some of the most important drawbacks from which SEM analysis suffers (i.e., the difficulty of providing a structural model a priori and the lack of control over the SEM algorithms and results).
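The partial-correlation computation the paper relies on can be sketched directly from the inverse covariance (precision) matrix. The three-region chain below is an invented example in which A and C are marginally correlated but conditionally independent given B, mimicking two regions with no direct anatomical connection.

```python
import numpy as np

rng = np.random.RandomState(0)
n = 5000
# Chain of three "regions": A -> B -> C.  A and C are marginally
# correlated but conditionally independent given B.
A = rng.randn(n)
B = A + 0.5 * rng.randn(n)
C = B + 0.5 * rng.randn(n)
X = np.column_stack([A, B, C])

# Partial correlations from the precision (inverse covariance) matrix:
# r_ij = -P_ij / sqrt(P_ii * P_jj)
P = np.linalg.inv(np.cov(X, rowvar=False))
d = np.sqrt(np.diag(P))
partial = -P / np.outer(d, d)

marginal = np.corrcoef(X, rowvar=False)
print(round(marginal[0, 2], 2), round(partial[0, 2], 2))
# the marginal A-C correlation is strong, the partial one near zero
```

The vanishing A-C partial correlation recovers the absence of a direct connection, which is the behavior the paper interprets in terms of effective connectivity.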

Fuzzy Rule Learning for Material Classification from Imprecise Data

Fig. 3: Screenshots of the different visualizations introduced in [2] to highlight the proximity, in terms of chemical composition, of the current voxel with previously inspected voxels whose content is known.

Thus, it is not a question of recognizing the materials present in the container, but of displaying visually close, known containers in order to deduce the contents. This approach has the advantage of requiring neither learning nor parameterization, since it relies on the manual selection of a neighborhood. In Figure 3, we can also see two classical representations that have been used in conjunction with this method. These are projections of the current voxel onto two triangles: a triangle called the "materials triangle" indicates the proximity of the voxel to metals, ceramics and organic materials, while a so-called "alert triangle" presents the ratios between carbon, nitrogen and oxygen. This last triangle makes it possible to distinguish between drugs and explosives. The main drawback of the visual analytics approach is that operators must be able to interpret the different representations themselves.

Spatio-temporal wavelet regularization for parallel MRI reconstruction: application to functional MRI

periodic blocked designs. However, this interleaved partial k-space sampling cannot be exploited in aperiodic dynamic acquisition schemes such as resting-state fMRI (rs-fMRI) or fast event-related fMRI paradigms [22, 23]. In rs-fMRI, spontaneous brain activity is recorded without any experimental design in order to probe intrinsic functional connectivity [22, 24, 25]. In fast event-related designs, jittering combined with random delivery of stimuli introduces a trial-varying delay between the stimulus and acquisition time points [26]. This prevents the use of an interleaved k-space sampling strategy between successive scans, since there is no guarantee that the BOLD response is quasi-periodic. Because the vast majority of fMRI studies in neuroscience use either rs-fMRI or fast event-related designs [26, 27], the most reliable acquisition strategy in such contexts remains the "scan and repeat" approach, although it is suboptimal. To our knowledge, only one kt-contribution (kt-GRAPPA [19]) has claimed the ability to accurately reconstruct fMRI images in aperiodic paradigms.

Role of homeostasis in learning sparse representations.

Laurent Perrinet, Institut de Neurosciences de la Timone, UMR 7289, CNRS / Aix-Marseille Université, France. E-mail: Laurent.Perrinet@univ-amu.fr. http://invibe.net/LaurentPerrinet/Publications/Perr[r]


Learning Obstacle Representations for Neural Motion Planning

Keywords: neural motion planning, obstacle avoidance, representation learning

1 Introduction

Motion planning is a fundamental robotics problem [2, 3] with numerous applications in mobile robot navigation [4], industrial robotics [5], humanoid robotics [6] and other domains. Sampling-based methods such as Rapidly-exploring Random Trees (RRT) [7] and Probabilistic Roadmaps (PRM) [8] have proven successful at finding collision-free paths in complex environments with many obstacles. Such methods are able to solve the so-called piano mover problem [9] and typically assume static environments and prior knowledge about the shape and location of obstacles. In many practical applications, however, it is often difficult or even impossible to obtain detailed a priori knowledge about the real state of the environment. It is therefore desirable to design methods that rely on partial observations obtained from sensor measurements and enable motion planning in unknown and possibly dynamic environments. Moreover, given the high complexity devoted to exploration in sampling-based methods, it is also desirable to design more efficient methods that use prior experience to quickly find solutions for motion planning in new environments. To address these challenges, several works [10, 11, 12, 13, 14, 15] adopt neural networks to learn motion planning from previous observations. Such Neural Motion Planning (NMP) methods either improve the exploration strategies of sampling-based approaches [13] or learn motion policies with imitation learning [12, 15] and reinforcement learning [14]. In this work we follow the NMP paradigm and propose a new learnable obstacle representation for motion planning.
