Multilevel diffusion tensor imaging classification technique for characterizing neurobehavioral disorders


DALBONI DA ROCHA, Josue Luiz, et al.


DALBONI DA ROCHA, Josue Luiz, et al . Multilevel diffusion tensor imaging classification technique for characterizing neurobehavioral disorders. Brain Imaging and Behavior , 2020, vol. 14, no. 3, p. 641-652

DOI : 10.1007/s11682-018-0002-2

Available at:

http://archive-ouverte.unige.ch/unige:142502


ORIGINAL RESEARCH

Multilevel diffusion tensor imaging classification technique for characterizing neurobehavioral disorders

Josué Luiz Dalboni da Rocha^1,2 · Gabriel Coutinho^3 · Ivanei Bramati^3 · Fernanda Tovar Moll^3,4 · Ranganatha Sitaram^5,6,7

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Abstract

This proposed novel method consists of three levels of analysis of diffusion tensor imaging data: 1) voxel level analysis of fractional anisotropy of white matter tracts, 2) connection level analysis, based on fiber tracks between specific brain regions, and 3) network level analysis, based on connections among multiple brain regions. Machine-learning techniques for feature selection (Fisher score), pattern classification (Support Vector Machine), and cross-validation (Leave-one-out) are applied to recognize neural connectivity patterns for diagnostic purposes. For validation purposes, this multilevel approach achieved an average classification accuracy of 90% between Alzheimer’s disease and healthy controls, 83% between Alzheimer’s disease and mild cognitive impairment, and 83% between mild cognitive impairment and healthy controls. The results indicate that the multilevel diffusion tensor imaging approach used in this analysis is a potential diagnostic tool for clinical evaluations of brain disorders. The presented pipeline is now available as a tool for scientific applications in a broad range of studies across the clinical and behavioral spectrum, including studies of autism, dyslexia, schizophrenia, dementia, and motor body performance, among others.

Keywords Diffusion tensor imaging . Fractional anisotropy . Fiber tracking . Graph theory . Machine learning

Introduction

Stochastic water displacement without any physical barrier obeys a three-dimensional Gaussian distribution (Einstein 1956). This property of displacing identically along the three dimensions is known as isotropy, which indicates independence of direction. Inside the brain, however, water molecules are immersed in the neuronal microenvironment and must eventually cross different types of biological tissue, in both white and gray matter. These tissues form obstacles and drastically reduce water diffusion in specific directions. Diffusion then becomes an anisotropic process (Pierpaoli and Basser 1996), one that depends on direction. In the brain environment, the highest anisotropy occurs in the axons (Varkuti et al. 2011), due to free movement in the direction parallel to the axon and strong restriction in the directions perpendicular to it.

* Ranganatha Sitaram [email protected]

1 Brain and Language Lab, Department of Clinical Neuroscience, University of Geneva, Geneva, Switzerland

2 Department of Biomedical Engineering, University of Florida, Gainesville, USA

3 D’Or Institute for Research and Education, Rio de Janeiro, Brazil

4 Federal University of Rio de Janeiro, Rio de Janeiro, Brazil

5 Institute for Biological and Medical Engineering, Schools of Engineering, Biology and Medicine, Pontificia Universidad Católica de Chile, Santiago, Chile

6 Department of Psychiatry and Section of Neuroscience, School of Medicine, Pontificia Universidad Católica de Chile, Santiago, Chile

7 Laboratory for Brain-Machine Interfaces and Neuromodulation, Pontificia Universidad Católica de Chile, Santiago, Chile

Published online: 5 December 2018

Diffusion tensor imaging (DTI) can estimate the diffusion of water molecules within brain tissues by magnetic resonance imaging (MRI), in terms of intensity and vectorial direction, represented by a tensor (Bihan and Breton 1985). The data acquisition consists of a non-diffusion weighted image, known as the B0 image, and a defined number of diffusion weighted images (DWI), each one corresponding to a different gradient direction. From those images, it is possible to calculate three eigenvalues (λ1, λ2, λ3) per volumetric picture element (voxel), whose values indicate the diffusion intensities in the direction of each one of the three dimensions represented by the respective eigenvectors (v1, v2, v3). Fractional anisotropy is a scalar measure varying from 0 to 1, and it quantifies the local anisotropy of a diffusion process. Voxel-based fractional anisotropy is calculated from the diffusion eigenvalues, according to Eq. 1 (Basser and Pierpaoli 1996).

$$\mathrm{FA} = \frac{\sqrt{(\lambda_1-\lambda_2)^2 + (\lambda_3-\lambda_2)^2 + (\lambda_1-\lambda_3)^2}}{\sqrt{2\,(\lambda_1^2+\lambda_2^2+\lambda_3^2)}} \tag{1}$$

If all the eigenvalues are equal, as for isotropic diffusion, the fractional anisotropy is 0. The fractional anisotropy is equal to 1 when the diffusion occurs exclusively in the direction of a single eigenvector (Basser and Pierpaoli 1996). In the ventricles of the brain, where the cerebrospinal fluid moves almost freely, the fractional anisotropy is close to 0.
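The two limiting cases described above can be checked numerically. The following is a minimal sketch, assuming the standard fractional anisotropy definition with squared eigenvalues in the denominator; the function name `fractional_anisotropy` and the example eigenvalues are illustrative and do not come from the published pipeline.

```python
import math

def fractional_anisotropy(l1, l2, l3):
    """Fractional anisotropy from the three diffusion eigenvalues (Eq. 1)."""
    denom = 2.0 * (l1 ** 2 + l2 ** 2 + l3 ** 2)
    if denom == 0.0:
        # Assumption of this sketch: a voxel with no diffusion is treated as isotropic.
        return 0.0
    numer = (l1 - l2) ** 2 + (l3 - l2) ** 2 + (l1 - l3) ** 2
    return math.sqrt(numer / denom)

# Equal eigenvalues (isotropic diffusion) give FA = 0.
print(fractional_anisotropy(1.0, 1.0, 1.0))
# Diffusion along a single eigenvector gives FA = 1.
print(fractional_anisotropy(1.0, 0.0, 0.0))
# Illustrative intermediate case (hypothetical eigenvalues).
print(fractional_anisotropy(1.7, 0.3, 0.2))
```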

White matter has higher fractional anisotropy values than gray matter. The reasons are the greater length of the axons and the parallel arrangement of their bundles in white matter tissue. Damage to axons in white matter breaks down this organized parallel structure and promotes a decrease in the local fractional anisotropy values. Therefore, white matter has been recognized as the best region in which to study fractional anisotropy values, providing an in vivo marker of cerebral integrity (Varkuti et al. 2011).

A voxel-based fractional anisotropy dataset allows the composition of three-dimensional fractional anisotropy maps of the brain. Based on this association, a three-dimensional modeling technique called tractography (or fiber tracking) has been developed (Yeh et al. 2013). This technique tracks connection fibers along a curve whose tangent vector at each voxel is the eigenvector corresponding to the highest eigenvalue at that location, so the track is built along the principal direction of the tensor. As long as the fractional anisotropy of the voxels being crossed exceeds a pre-defined threshold value, the track is extended, continuing from the direction along which it was previously drawn. The biophysical validation for the existence of these bundles of nerves has been provided by post-mortem dissections (Lawes et al. 2008).

Discrete tractography allows the composition of connectivity graphs, where nodes are brain regions and edges are the bundles of nerves connecting those regions (Gong et al. 2008). From such a connectivity graph, a connectivity matrix is extracted, in which the rows and columns represent the brain regions according to a standard brain atlas, and each cell represents the number of connections between the brain regions of the corresponding row and column. This connectivity graph is undirected: it does not specify at which node a connection starts or at which node it ends (Rosen 2006).
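The construction of such a matrix can be sketched as follows. This is a minimal illustration, assuming each tracked fiber has already been reduced to the pair of region indices where it terminates; the function name `connectivity_matrix`, the input format, and the choice to skip fibers that start and end in the same region are assumptions of this sketch, not details taken from DSI Studio.

```python
def connectivity_matrix(fiber_endpoints, n_regions):
    """Count fibers between pairs of regions in a symmetric (undirected) matrix."""
    matrix = [[0] * n_regions for _ in range(n_regions)]
    for a, b in fiber_endpoints:
        if a == b:
            # Assumption: fibers within a single region are not counted as connections.
            continue
        matrix[a][b] += 1
        matrix[b][a] += 1  # undirected graph: the matrix stays symmetric
    return matrix

# Hypothetical example with 4 regions and 5 fibers.
fibers = [(0, 1), (1, 0), (0, 2), (2, 3), (3, 3)]
for row in connectivity_matrix(fibers, 4):
    print(row)
```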

The connectivity graph can also be analyzed by network measures, such as degree, betweenness centrality, clustering coefficient, efficiency, and vulnerability. The degree of a node is the total number of direct connections to other nodes (Sporns 2003). Betweenness centrality (Barthelemy 2004) indicates the level of centrality of a node. It is the fraction of shortest paths in the network that pass through the given node (among all the shortest paths between every pair of other nodes in the whole graph). A node with higher betweenness centrality participates more in the optimal operation of the network, and damage to it would affect the transfer flows along more shortest paths. Depending on the availability and quality of alternative paths to replace the affected shortest paths (which is not measured by betweenness centrality), the damage may lead to relevant alterations in the global network operation.

The local clustering coefficient of a node quantifies the fraction of direct connections that actually exist between its nearest neighbor nodes, among all direct connections that could possibly exist between those neighbors (Fagiolo 2007). The efficiency of a node estimates how easily data is transferred from/to other nodes, by the inverse of the harmonic mean of the distances from every other node (Latora and Marchiori 2001). The vulnerability of a node in a network measures how strongly the average of efficiencies among all the other nodes of the whole network decreases when that node is removed (Varkuti et al. 2011). If a node with high betweenness centrality, high degree, and low clustering coefficient is damaged, a central hub is removed, which reduces efficiency and characterizes high node vulnerability. If a network is fully connected (where all nodes are connected to all other nodes), the degree, efficiency and clustering coefficient of all nodes are extremely high, but vulnerability is extremely low, since damage to a node does not affect the efficiency of the other nodes, due to the existence of parallel pathways.
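Several of these measures can be illustrated on a small binary undirected graph. The sketch below is written under stated assumptions: the graph is an adjacency dictionary of sets, distances are hop counts, vulnerability is taken as the relative drop in global efficiency after removing a node (one common formulation), and betweenness centrality is omitted for brevity. All function names are illustrative and are not those of the published pipeline.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop-count distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def degree(adj, node):
    return len(adj[node])

def clustering_coefficient(adj, node):
    """Fraction of existing links among the neighbors of node."""
    neighbors = list(adj[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if neighbors[j] in adj[neighbors[i]])
    return 2.0 * links / (k * (k - 1))

def nodal_efficiency(adj, node):
    """Mean inverse distance to all other nodes (unreachable nodes add 0)."""
    others = len(adj) - 1
    if others <= 0:
        return 0.0
    dist = bfs_distances(adj, node)
    return sum(1.0 / d for n, d in dist.items() if n != node) / others

def global_efficiency(adj):
    if not adj:
        return 0.0
    return sum(nodal_efficiency(adj, n) for n in adj) / len(adj)

def vulnerability(adj, node):
    """Relative drop in global efficiency when node is removed."""
    base = global_efficiency(adj)
    if base == 0.0:
        return 0.0
    reduced = {n: nbrs - {node} for n, nbrs in adj.items() if n != node}
    return (base - global_efficiency(reduced)) / base

# Star graph: node 0 is a hub with high degree and zero clustering.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
# Fully connected graph on four nodes.
full = {i: {j for j in range(4) if j != i} for i in range(4)}

print(degree(star, 0), clustering_coefficient(star, 0), vulnerability(star, 0))
print(degree(full, 0), clustering_coefficient(full, 0), vulnerability(full, 0))
```

In this toy example, removing the hub of the star graph disconnects every remaining node, while removing any node of the fully connected graph leaves the efficiency of the others unchanged, which mirrors the two situations described in the text.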

As classification of single-level DTI neuroscientific datasets has been shown to be a challenging pattern recognition problem (Dyrba et al. 2013; Li et al. 2014a; Demirhan et al. 2015; Ebadi et al. 2017), we propose the use of a multilevel DTI classification technique for characterizing neurobehavioral disorders, giving the classifier further opportunities to identify patterns at different levels of the data. A multilevel ECG approach has been proposed (Li et al. 2014b) using a five-level signal quality classification algorithm, as well as multimodal approaches (Hong et al. 2017; Zurita et al. 2018), but no multilevel DTI classification approach has been proposed yet.

Methods

The multilevel approach for DTI analyses (Fig. 1) aims to discriminate classes of subjects by performing binary classification based on input feature sets from three different levels: voxel level, connection level, and network level. The dataflow consists of three steps: preprocessing, processing and cross-validation. Preprocessing is composed of realignment, coregistration and normalization, performed in SPM8 (Ashburner et al. 2010), followed by segmentation, performed in DSI Studio (Yeh et al. 2013). Processing includes reconstruction, probabilistic tracking, connectivity matrix generation (by nodal discretization) and calculation of graph theory measures. A Leave-one-out cross-validation is then performed, including feature selection and classification. The output labels obtained from classification are compared with the input labels to determine accuracy levels.

Fig. 1 Dataflow of the multilevel approach for DTI analyses. Stages shown: preprocessing (B0 image, DWI, T1 image); reconstruction (MD map, FA map, eigenvector); probabilistic tracking (brain connection fibers, connectivity matrix, graph metrics); Leave-one-out cross-validation (feature selection at the voxel, connection and network levels, classifier training, classification); accuracy / sensitivity / specificity

Data preprocessing

Due to head movement during the experiment, some of the images may be acquired in the wrong position. Realignment corrects for motion across each session of B0 and DWI acquisition for an individual subject. Each time-series of DWI images is realigned to its respective first volume, known as the reference image, to which all subsequent scans are realigned. To remove movement-related artifacts, the routine realigns images acquired from the same subject by translation and rotation, using a least squares approach and a 6-parameter rigid body spatial transformation, in SPM8 (Ashburner et al. 2010), with acceptable movement up to 6 mm. The voxel values of these images are adjusted within the general linear model to discount movement-related components (Friston 1996).

Since the B0 and DWI images have a much lower spatial resolution than the structural spin-lattice relaxation time (T1) weighted images, the realigned B0 and DWI images are coregistered to the T1 images. The within-subject, voxel-similarity-based coregistration is performed by rigid body transformation (Huettel et al. 2004). The reference image is the image that is assumed to remain stationary (also known as the target or template image). The source image is the image that is moved about to best match the reference image. Other images are all the remaining images that need to stay in alignment with the source image; for that, they are submitted to the same translational and rotational transformations. The images are resliced to match the reference image voxel-for-voxel in the defined space.

The realigned and coregistered B0 and DWI images can then be used for calculation of fractional anisotropy values, as described in the voxel level processing section below. The normalization step warps each brain to the Montreal Neurological Institute (MNI) template (Ashburner et al. 2010). The transformation matrix (set of warps) used to normalize the T1 image (source image) to MNI space (template image) is applied to the fractional anisotropy maps (which are produced during the voxel level processing, see next section) of the respective subject. Normalization is performed to allow multisubject voxel level analysis on normalized fractional anisotropy images, while the fractional anisotropy images used for tractography in the connection level analysis remain unnormalized.

Segmentation is performed in three steps. First, the voxels inside the brain are selected. For this, an intensity threshold is applied to the T1 image to separate brain from non-brain. Voxels inside the brain that may have been discarded by the threshold are later reintegrated into the brain selection by a dilation and erosion procedure. The second segmentation step separates the T1 image into white matter, grey matter and cerebrospinal fluid. Finally, in the third step, grey matter is segmented into Brodmann areas (Yeh and Tseng 2011).

Voxel level processing

Based on the B0 and DWI images, reconstruction is performed for each subject in DSI Studio (Yeh et al. 2013) to compute three-dimensional diffusion at voxel resolution, computing for each voxel the three eigenvalues and their respective eigenvectors. After that, mean diffusivity, fractional anisotropy and the main eigenvector (the eigenvector associated with the highest eigenvalue) are calculated at the voxel level. The fractional anisotropy data output from DSI Studio is then loaded into Matrix Laboratory (MATLAB) as the input features for further calculation, allowing comparison among subjects by Leave-one-out cross-validation at the voxel level.

Connection level processing

Whole brain deterministic tractography is conducted using DSI Studio (Yeh et al. 2013) on the unnormalized fractional anisotropy image. This procedure uses a fractional anisotropy threshold of 0.1, a maximum angle of 60 degrees, a step size of 1.25 mm, a length constraint from 25 mm to 100 mm, and no spatial smoothing. Once the tractography is performed, a brain connectivity matrix is extracted with the specified nodes. In our approach, each Brodmann area is represented by a node. Brodmann areas are human cortical brain regions with specific localization, structure, and organization of cells (Brodmann 1909). The idea is to evaluate interregional gray matter connectivity through white matter pathways, based on the number of fibers connecting different Brodmann areas. The connectivity matrix of each subject, provided as output by DSI Studio, is then imported into MATLAB as the input feature for Leave-one-out cross-validation, allowing machine learning classification of subjects at the connection level.


Network level processing

Graph measures are extracted from the brain connectivity matrices (where each Brodmann area is a node) in the MATLAB environment. Thereby, degree, clustering coefficient, efficiency, betweenness centrality and vulnerability of each individual brain connectivity matrix are calculated and provided as the input features for subject classification at the network level.

Leave-one-out cross-validation

For maximization of the number of subjects in the classifier training dataset, the Leave-one-out cross-validation has been recognized as the most suitable approach (Radmacher et al. 2002). In this approach, classification is performed in n iterations, where n is the number of samples (in this case subjects) from each class. Each iteration is divided into three steps: feature selection, classifier training, and classification. Feature selection and classifier training are performed using n-1 subjects per class, leaving out one subject per class for subsequent classification. This process is done n times to apply classification to all subjects (Fig. 2).

This process results in an output classification label for each subject, which can be positive (+1) or negative (−1). This output is then compared with the input label (neuropsychological diagnosis) to measure the success level of the approach. If the output label from the classifier is equivalent to the input label, the classification is recognized as successful for that individual iteration. The accuracy of classification is the percentage of correct output labels for all subjects from both positive and negative input classes. The sensitivity is the percentage of positive input labels correctly identified by the output labels, and specificity is the percentage of negative input labels correctly recognized as such.
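The cross-validation loop and the three performance measures can be sketched as follows. This is a minimal illustration, assuming equal numbers of subjects per class and one subject per class left out in each iteration, as described above. The callback `train_and_predict` and the toy `nearest_mean` classifier are hypothetical stand-ins for the feature selection and Support Vector Machine steps of the actual pipeline.

```python
def leave_one_out(pos_samples, neg_samples, train_and_predict):
    """Run n iterations, leaving out one subject per class in each iteration."""
    n = len(pos_samples)
    if n != len(neg_samples):
        raise ValueError("this sketch assumes the same number of subjects per class")
    results = []  # list of (input label, output label) pairs
    for i in range(n):
        training = [(x, +1) for j, x in enumerate(pos_samples) if j != i]
        training += [(x, -1) for j, x in enumerate(neg_samples) if j != i]
        for x, label in ((pos_samples[i], +1), (neg_samples[i], -1)):
            results.append((label, train_and_predict(training, x)))
    return results

def summarize(results):
    """Accuracy, sensitivity and specificity from (input, output) label pairs."""
    tp = sum(1 for t, p in results if t == +1 and p == +1)
    tn = sum(1 for t, p in results if t == -1 and p == -1)
    n_pos = sum(1 for t, _ in results if t == +1)
    n_neg = len(results) - n_pos
    return {"accuracy": (tp + tn) / len(results),
            "sensitivity": tp / n_pos,
            "specificity": tn / n_neg}

def nearest_mean(training, x):
    """Toy classifier on one scalar feature (hypothetical, not an SVM)."""
    pos = [v for v, y in training if y == +1]
    neg = [v for v, y in training if y == -1]
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    return +1 if abs(x - mean_pos) <= abs(x - mean_neg) else -1

# Hypothetical scalar features for three positive and three negative subjects.
results = leave_one_out([5.0, 6.0, 7.0], [1.0, 2.0, 3.0], nearest_mean)
print(summarize(results))
```

Keeping the classifier behind a callback makes explicit that any parameter choice, including feature selection, has to be repeated inside each iteration using only the training subjects.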

Feature selection focuses here on finding the best voxel sites to increase classification accuracy. Feature selection on the input dataset is performed using a high-pass filter approach based on the Fisher score. Classification (as well as classifier training) is performed using a linear Support Vector Machine.

Fisher score

Consider a set of n data points (subjects) with labels, $\{x_i, y_i\}_{i=1}^{n}$, $y_i \in \{1, \dots, c\}$, where x is the feature to be scored and y is the input label. Let $n_i$ represent the number of data points (subjects) in class i. Let $\mu_i$ and $\sigma_i$ be the mean and standard deviation of class i on the evaluated feature. Let $\mu$ and $\sigma$ represent the mean and the standard deviation of the whole feature dataset. Then, the Fisher score for each feature is defined by Eq. 2 (He et al. 2005).

Fig. 2 Representation of the Leave-one-out cross-validation. This cross-validation approach is performed in n iterations, where n is the number of samples. Each iteration performs classification on only one sample (testing data - gray square), based on parameters decided by feature selection and training applied only on the other samples (training data - white squares) and leaving one (the testing sample) out

$$\mathrm{FS} = \frac{\sum_{i=1}^{c} n_i \,(\mu_i - \mu)^2}{\sum_{i=1}^{c} n_i \,\sigma_i^2} \tag{2}$$

Basically, the idea is to find those features that distinguish the classes the most (Jin et al. 2009), based on high interclass deviations and low intraclass variations. A good feature has a large separation between the class averages and high uniformity within each class. This measure is expected to work well when the data is normally distributed within each class. On the other hand, if the data is not normally distributed, this score can fail.

When the numbers of training samples per class are the same, the Fisher score can be represented by Eq. 3:

$$\mathrm{FS} = \frac{\sum_{i=1}^{c} (\mu_i - \mu)^2}{\sum_{i=1}^{c} \sigma_i^2} \tag{3}$$

In the special case of two classes with the same number of training data, the Fisher score is represented by Eq. 4.

$$\mathrm{FS} = \frac{(\mu_1 - \mu_2)^2}{2\,(\sigma_1^2 + \sigma_2^2)} \tag{4}$$
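The Fisher score and the resulting top-k feature selection can be sketched in a few lines. The sketch below follows Eq. 2 and assumes the population variance (division by the class size) for each class, which the text does not specify; the function names `fisher_score` and `select_top_features` and the example data are illustrative.

```python
def fisher_score(values, labels):
    """Fisher score of one feature across subjects, following Eq. 2."""
    overall_mean = sum(values) / len(values)
    numerator = 0.0
    denominator = 0.0
    for c in sorted(set(labels)):
        group = [v for v, y in zip(values, labels) if y == c]
        n_c = len(group)
        mean_c = sum(group) / n_c
        # Assumption: population variance of the class (divide by n_c).
        var_c = sum((v - mean_c) ** 2 for v in group) / n_c
        numerator += n_c * (mean_c - overall_mean) ** 2
        denominator += n_c * var_c
    return numerator / denominator if denominator > 0 else float("inf")

def select_top_features(feature_matrix, labels, k):
    """Indices of the k features with the highest Fisher score."""
    n_features = len(feature_matrix[0])
    scores = [fisher_score([row[j] for row in feature_matrix], labels)
              for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]

# Two classes of three subjects each; feature 0 separates the classes, feature 1 does not.
labels = [1, 1, 1, 2, 2, 2]
features = [[1.0, 4.0], [2.0, 1.0], [3.0, 5.0],
            [5.0, 2.0], [6.0, 4.0], [7.0, 3.0]]
print(fisher_score([row[0] for row in features], labels))
print(select_top_features(features, labels, 1))
```

For feature 0 in this example, the class means are 2 and 6 and both class variances are 2/3, so Eq. 4 gives (2 − 6)² / (2 · (2/3 + 2/3)) = 6, the same value as Eq. 2.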

Linear support vector machine

Support Vector Machine, one of the most popular machine learning techniques, can discriminate between classes of patient populations based on features from the input dataset. Support Vector Machine discriminates data points by dividing the feature space into two domains with a surface and assigning each domain to one class. In the linear Support Vector Machine, this surface is a hyperplane. This hyperplane is given by Eq. 5 (Vapnik and Lerner 1963), where w and x are vectors in the feature space.

Fig. 3 Example of the binary majority function decision tree used for multilevel analysis


$$f(x) = w \cdot x + b = 0 \tag{5}$$

The two regions separated by the hyperplane $H_0$ are $w \cdot x + b > 0$ and $w \cdot x + b < 0$. Those regions represent the two classes in the Support Vector Machine classification. Conventionally, they are called negative class (−1) and positive class (+1), according to Eq. 6 (Vapnik and Lerner 1963).

$$g(x) = \begin{cases} -1, & w \cdot x + b < 0 \\ +1, & w \cdot x + b > 0 \end{cases} \tag{6}$$
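Once the weight vector w and the offset b are known, applying Eqs. 5 and 6 is a single dot product and a sign test. The sketch below shows only this decision step; training the Support Vector Machine (finding w and b) is not shown. The function names and example values are illustrative, and assigning points that lie exactly on the hyperplane to the negative class is an assumption of this sketch, since Eq. 6 leaves that case undefined.

```python
def decision_value(w, x, b):
    """Left-hand side of Eq. 5: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, x, b):
    """Eq. 6: +1 on the positive side of the hyperplane, -1 otherwise."""
    # Assumption: points exactly on the hyperplane are labeled -1.
    return +1 if decision_value(w, x, b) > 0 else -1

# Hypothetical hyperplane x1 + x2 - 3 = 0 in a two-feature space.
w, b = [1.0, 1.0], -3.0
print(classify(w, [2.5, 2.0], b))
print(classify(w, [0.5, 1.0], b))
```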

Multilevel analysis

The combination of the three levels (voxel, connection, and network) is performed by a binary majority function decision tree (Becker and Drechsler 1998), based on the output of the Support Vector Machine classification for each participant at each level (Fig. 3). The multilevel output is positive for a subject whenever the majority of the Support Vector Machine outputs across levels is positive for that subject. Whenever the majority is negative, the multilevel output is negative. Positive and negative classes represent the binary classification. Table 1 contains the outcome of the multilevel analysis for each possible combination of voxel, connection, and network binary inputs.
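With labels coded as +1 and −1, the majority rule reduces to the sign of the sum of the three single-level outputs. The following minimal sketch enumerates all eight input combinations; the function name `multilevel_output` is illustrative.

```python
from itertools import product

def multilevel_output(voxel, connection, network):
    """Majority vote over three labels, each +1 (positive) or -1 (negative)."""
    return +1 if (voxel + connection + network) > 0 else -1

# Enumerate every combination of the three single-level outputs.
names = {+1: "Positive", -1: "Negative"}
for v, c, n in product((+1, -1), repeat=3):
    print(names[v], names[c], names[n], "->", names[multilevel_output(v, c, n)])
```

Because the number of levels is odd, ties cannot occur, so the rule always yields a definite class.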

Table 1 Outcome of the multilevel analysis, according to the binary inputs from each level

Voxel level   Connection level   Network level   Multilevel analysis (outcome)
Positive      Positive           Positive        Positive
Positive      Positive           Negative        Positive
Positive      Negative           Positive        Positive
Positive      Negative           Negative        Negative
Negative      Positive           Positive        Positive
Negative      Positive           Negative        Negative
Negative      Negative           Positive        Negative
Negative      Negative           Negative        Negative

Fig. 4 The simplified data flow of the multilevel DTI approach applied for Alzheimer’s disease diagnosis

The results of the multilevel analysis are presented in terms of classification accuracy and confusion matrix, as well as the permutation-based p value and confidence interval (Cumming and Calin-Jageman 2017) for the achieved accuracy. The one-tail permutation-based p value (Good 2000) is calculated as the proportion of samples with randomly permuted labels (also including the sample with original labels, for statistical reasons) whose achieved accuracy is higher than or equal to that of the sample with original labels. The user chooses the number of permutations, but we recommend the use of at least 30 permutations to test at the significance level of 5% (Ojola and Garriga 2010). This approach adopts a confidence level of 95% for the confidence interval (CI).
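The permutation-based p value described above can be sketched as follows. This is a minimal illustration, assuming a user-supplied function `accuracy_fn` (hypothetical) that reruns the whole cross-validated classification for a given label assignment and returns its accuracy; the function name `permutation_p_value`, the fixed random seed, and the toy accuracy function in the example are also assumptions of this sketch.

```python
import random

def permutation_p_value(labels, accuracy_fn, n_permutations=100, seed=0):
    """One-tail p value: share of label assignments scoring at least the original."""
    rng = random.Random(seed)
    observed = accuracy_fn(labels)
    count = 1  # the sample with the original labels is included
    for _ in range(n_permutations):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        if accuracy_fn(shuffled) >= observed:
            count += 1
    return count / (n_permutations + 1)

# Toy stand-in for the pipeline: agreement between the labels and a fixed output.
fixed_output = [+1] * 10 + [-1] * 10
def toy_accuracy(labels):
    return sum(1 for a, b in zip(labels, fixed_output) if a == b) / len(labels)

print(permutation_p_value([+1] * 10 + [-1] * 10, toy_accuracy))
```

Counting the original labeling in both the numerator and the denominator means the smallest attainable p value is 1 / (number of permutations + 1), which is why the number of permutations limits the significance level that can be tested.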

Validation

Overview of Alzheimer’s disease

DTI is a promising imaging technique for early diagnosis of Alzheimer’s disease and mild cognitive impairment. Recent machine learning approaches for discrimination between Alzheimer’s disease and controls using fractional anisotropy values at voxel resolution as the input features have attained classification accuracies in the range of 75%–88% (Dyrba et al. 2013; Li et al. 2014a; Demirhan et al. 2015). Another machine learning approach (Ebadi et al. 2017) used graph measures and achieved a classification accuracy of 80% between Alzheimer’s disease and healthy controls, and 83% between Alzheimer’s disease and mild cognitive impairment patients.

Following the hypothesis that DTI measures are useful for early diagnosis of dementia, we applied this multilevel algorithm to discriminate Alzheimer’s disease patients, mild cognitive impairment patients, and healthy control volunteers. The multilevel approach for DTI analyses evaluated whether feature sets obtained from analysis of fractional anisotropy (voxel level), axonal tractography (connection level), and graph theory (network level) are good discriminators of these three classes of subjects (Fig. 4).

Validation methodology

Data acquisition

Forty-five adults were recruited for DTI data acquisition (fifteen Alzheimer’s disease patients, fifteen mild cognitive impairment patients, and fifteen healthy volunteers). Healthy volunteers were selected by matching age and years of education (Table 2).

Validation results

Voxel level

Fractional anisotropy was extracted and used as the feature for linear Support Vector Machine classification among the three classes in the study. We performed feature selection, selecting the most discriminant voxels based on the Fisher score between healthy and Alzheimer’s disease subjects (according to the Leave-one-out cross-validation approach). The set of 80 voxels with the highest Fisher scores obtained the highest classification level, with an accuracy of 80% (CI: ± 14%) between Alzheimer’s disease and controls (Fig. 5). Using this same set of voxels, classification accuracy between Alzheimer’s disease and mild cognitive impairment patients was 67% (CI: ± 17%), and between healthy volunteers and mild cognitive impairment patients was 53% (CI: ± 18%).
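The reported interval half-widths are consistent with a normal-approximation (Wald) interval for a proportion at the 95% level. The source does not state which formula was used, so the sketch below is an inference and not the documented method. It assumes n = 30 subjects per pairwise classification (15 per group), and the function name `accuracy_ci_half_width` is illustrative.

```python
import math

def accuracy_ci_half_width(accuracy, n, z=1.96):
    """Half-width of the normal-approximation interval for a proportion."""
    return z * math.sqrt(accuracy * (1.0 - accuracy) / n)

# Voxel level accuracies reported above, with the assumed n = 30.
for acc in (0.80, 0.67, 0.53):
    print(acc, round(100 * accuracy_ci_half_width(acc, 30)))
```

Under these assumptions, the rounded half-widths come out at 14, 17 and 18 percentage points, matching the values given for the voxel level.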

Connection level

Deterministic tractography was performed considering the whole brain as seed. After that, a brain connectivity matrix was extracted considering each Brodmann area as a node. This connectivity matrix was used as the input feature for classification of subjects. Applying Fisher score feature selection (top 1, 2, 5, 10, 20, 50 and 100 edges), the top 1 edge obtained the highest classification accuracy, reaching 83% (CI: ± 13%) for Alzheimer’s disease patients versus healthy controls, 73% (CI: ± 16%) for mild cognitive impairment versus healthy controls, and 70% (CI: ± 16%) for Alzheimer’s disease versus mild cognitive impairment patients (Fig. 6).

Network level

Five graph measures (degree, betweenness centrality, clustering coefficient, efficiency, and vulnerability) were extracted from each node as features for subject categorization by linear Support Vector Machine. After application of Fisher score feature selection (top 1, 2, 4, 8 and 16 features), the highest classification level was 80% (CI: ± 14%) for Alzheimer’s disease patients versus healthy controls, as well as for both mild cognitive impairment versus Alzheimer’s disease patients, and mild cognitive impairment patients versus healthy controls (Fig. 7).

Table 2 Adults in this study

Group                   Healthy controls   Mild cognitive impairment   Alzheimer’s disease
Number of adults        15                 15                          15
Sex (females / males)   11 / 4             10 / 5                      9 / 6
Age (years)             74.6 ± 6.9         74.3 ± 6.8                  74.5 ± 6.5
Years of education      12.0 ± 4.1         11.9 ± 5.0                  12.1 ± 4.3

Fig. 5 Linear Support Vector Machine accuracy based on fractional anisotropy for different numbers of voxels selected by Fisher score (panel: VOXEL LEVEL; x-axis: all voxels, 1280, 640, 320, 160, 80, 40, 20 and 10 voxels; y-axis: accuracy from 0% to 100%; series: AD vs Controls, MCI vs Controls, AD vs MCI)

Fig. 6 Linear Support Vector Machine accuracy for different numbers of connections on sets of different numbers of edges selected by top-scoring Fisher score (panel: CONNECTION LEVEL; x-axis: 1, 2, 5, 10, 20, 50 and 100 edges, all edges; y-axis: accuracy from 0% to 100%)

Multilevel analysis

The binary majority function decision tree was applied to the Support Vector Machine output classification from the voxel level (top 80 voxels), connection level (top 1 edge) and network level (top 4 features extracted from all 5 graph measures). The one-tail permutation-based p value was calculated by performing 100 permutations. Linear Support Vector Machine applied across Leave-one-out cross-validation achieved an accuracy of 90% (p value < 0.01; CI = ± 11%) for Alzheimer’s disease patients in contrast to healthy controls. For mild cognitive impairment patients versus healthy controls, as well as for Alzheimer’s disease versus mild cognitive impairment patients, accuracy was 83% (p value < 0.01; CI = ± 13%).

Comparison between single-level and multilevel approaches

The multilevel approach achieved an average accuracy of 86% considering all the binary classifications performed, while the single-level approaches achieved an average accuracy of 74% (voxel level: 67%, connection level: 76%, network level: 80%), also considering all the binary classifications performed. Based on the z-score test for comparing accuracies (Johnson and Freund 2011), the multilevel approach performed significantly better than the average of the single-level approaches (Z-score = 2.34; p value = 0.010). Specifically, the multilevel approach performed significantly better than the voxel (Z-score = 3.01; p value = 0.001) and connection (Z-score = 1.71; p value = 0.044) levels. However, the multilevel approach did not show significant improvement when compared with the network level approach (Z-score = 1.07; p value = 0.142).
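A comparison of two accuracies by a z-score can be sketched with a pooled two-proportion test and a one-tail p value. The sample sizes below are assumptions that the text does not state: 90 classifications per approach (3 pairwise comparisons × 30 subjects), and 270 for the average over the three single levels. The function name `two_proportion_z` is illustrative.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic and its one-tail p value."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_one_tail = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability
    return z, p_one_tail

# Multilevel (86%) against each alternative, with the assumed sample sizes.
comparisons = {
    "single-level average": (0.74, 270),
    "voxel level": (0.67, 90),
    "connection level": (0.76, 90),
    "network level": (0.80, 90),
}
for name, (acc, n) in comparisons.items():
    z, p = two_proportion_z(0.86, 90, acc, n)
    print(name, round(z, 2), round(p, 3))
```

Under these assumed sample sizes, the sketch yields Z-scores and p values that agree with the reported ones to within rounding, which suggests, but does not confirm, that a test of this form was used.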

Conclusion

This new multilevel approach for DTI analyses is based on three levels of analysis of DTI data:

1) Voxel level analysis of fractional anisotropy of white matter tracts.

2) Connection level, based on fiber tracks between specific brain regions.

3) Network level, based on connections among multiple brain regions.

Fig. 7 Linear Support Vector Machine accuracy for all 5 proposed measures on sets of different numbers of nodes selected by top-scoring Fisher score (panel: NETWORK LEVEL; x-axis: 1, 2, 4, 8 and 16 features, all features; y-axis: accuracy from 0% to 100%)

The combination of the three levels (voxel, connection, and network) is performed by a binary majority function decision tree. This pipeline is now available for non-invasive measurement of brain connectivity. In this way, this technique can be applied for diagnosis of neural pathologies, e.g. discrimination among Alzheimer’s disease patients, mild cognitive impairment patients, and healthy individuals.

This multilevel approach for DTI analysis was applied to pairwise classification among Alzheimer’s disease patients, mild cognitive impairment patients, and healthy control volunteers. Classification accuracy reached up to 90% between Alzheimer’s disease patients and healthy control volunteers, up to 83% between Alzheimer’s disease patients and mild cognitive impairment patients, and up to 83% between mild cognitive impairment patients and healthy control volunteers.

These results indicate that DTI approaches are potentially useful diagnostic tools for supporting the clinical evaluation of brain disorders, and they support the validity of the presented approach.

For the specific data sample analyzed in this study, we performed binary classification among Alzheimer’s disease patients, mild cognitive impairment patients, and healthy controls. The accuracy achieved with the multilevel approach is significantly higher than the average accuracy achieved with the single levels. While in this study Alzheimer’s disease and mild cognitive impairment have shown greater discriminability at the voxel level, other types of brain disorders, such as multiple sclerosis (Zurita et al. 2018), may show greater differences at the connection and network levels. The etiology of the disease and its influence on brain connections and networks may therefore determine which level of analysis is most important for its diagnosis. Hence, having all levels of analysis, from the voxel level to the connection and network levels, in one analysis pipeline may be advantageous for effective classification.

However, this approach has limitations, and its functionalities are open to future improvement. For example, the use of Fisher score and Support Vector Machine inside the leave-one-out cross-validation may not be the best approach for machine learning classification of multilevel DTI data in every specific study, considering the range of different types of subjects’ conditions and diseases that could be studied. Moreover, the current pipeline is designed to perform binary classification, and a further improvement will allow it to perform multiclass assignments.
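The phrase "inside the leave-one-out cross-validation" means that feature scores are recomputed on the training subjects of each fold, so the held-out subject never influences which features are selected. The sketch below illustrates that structure on toy data. It is not the released pipeline: the Fisher score uses one common two-class form, a nearest-centroid rule stands in for the Support Vector Machine to keep the example dependency-free, and all function names and data values are hypothetical.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Two-class Fisher score per feature: (mean0 - mean1)^2 / (var0 + var1)."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        num = (mean(a) - mean(b)) ** 2
        den = pvariance(a) + pvariance(b)
        if den > 0:
            scores.append(num / den)
        else:
            # Zero within-class variance: perfectly separating or uninformative.
            scores.append(float("inf") if num > 0 else 0.0)
    return scores


def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def nearest_centroid_predict(X_train, y_train, x, features):
    """Stand-in classifier (not an SVM): pick the class with the closest centroid."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        centroid = [mean(r[j] for r in rows) for j in features]
        dist = sum((x[j] - c) ** 2 for j, c in zip(features, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, k):
    """Leave-one-out accuracy with feature selection redone in every fold."""
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        features = top_k(fisher_scores(X_train, y_train), k)
        correct += nearest_centroid_predict(X_train, y_train, X[i], features) == y[i]
    return correct / len(X)


# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [3.0, 4.5], [3.1, 3.5], [2.9, 5.2]]
y = [0, 0, 0, 1, 1, 1]
accuracy = loocv_accuracy(X, y, k=1)
```

Selecting features once on the full data set and then cross-validating would leak information from the held-out subject and tend to inflate the estimated accuracy, which is why the selection step sits inside the loop.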

Nevertheless, the current approach is now available as a tool for scientific applications in both clinical and behavioral studies, including studies of autism, dyslexia, dementia, schizophrenia, and motor performance, among others. Moreover, the range of applications can also be extended to studies of aptitude for musical instrument and singing performance, and of language or math learning. Interested users can apply this multilevel DTI freeware to their own DTI data by downloading the script pipeline available online at ‘https://osf.io/bgfer/’, a link stored on the ‘Open Science Framework’ data sharing platform (Foster and Deardorff 2017).

Acknowledgements The senior author of this study was supported by the Indigo Project FKZ 01DQ13004, and Fondecyt Regular projects number 1171313 and number 1171320.

Compliance with ethical standards

Ethical approval All procedures involving human participants were in accordance with the ethical standards of the institutional research committee and with the 1964 Helsinki declaration.

Conflict of interest The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

Ashburner, J., Barnes, G., Chen, C., et al. (2010). SPM 8 Manual. https://lsa.umich.edu/psych/danielweissmanlab/downloads/spm8_manual.pdf. Accessed 09 September 2018.

Barthelemy, M. (2004). Betweenness centrality in large complex networks. The European Physical Journal B, 38(2), 163–168.

Basser, P. J., & Pierpaoli, C. (1996). Microstructural and physiological features of tissues elucidated by quantitative-diffusion-tensor MRI. Journal of Magnetic Resonance. Series B, 111, 209–219.

Becker, B., & Drechsler, R. (1998). Binary decision diagrams: Theory and implementation. Springer. ISBN: 978-1-4419-5047-5.

Bihan, D. L., & Breton, E. (1985). Imagerie de diffusion in-vivo par résonance. Comptes rendus de l'Académie des Sciences, 301(15), 1109–1112.

Brodmann, K. (1909). Vergleichende Lokalisationslehre der Grosshirnrinde. Leipzig: Johann Ambrosius Barth.

Cumming, G., & Calin-Jageman, R. (2017). Introduction to the new statistics. New York: Routledge.

Demirhan, A., Nir, T. M., Zavaliangos-Petropulu, A. (2015). Feature selection improves the accuracy of classifying Alzheimer disease using diffusion tensor images. IEEE 12th International Symposium on Biomedical Imaging, New York. https://doi.org/10.1109/ISBI.2015.7163832.

Dyrba, M., Ewers, M., Wegrzyn, M., Kilimann, I., Plant, C., Oswald, A., Meindl, T., Pievani, M., Bokde, A. L. W., Fellgiebel, A., Filippi, M., Hampel, H., Klöppel, S., Hauenstein, K., Kirste, T., Teipel, S. J., & the EDSD study group. (2013). Robust automated detection of classification of multicenter DTI data. PLoS One, 8(5), e64925. https://doi.org/10.1371/journal.pone.0064925.

Ebadi, A., Dalboni da Rocha, J. L., Nagaraju, D. B., Tovar-Moll, F., Bramati, I., Coutinho, G., Sitaram, R., & Rashidi, P. (2017). Ensemble classification of Alzheimer's disease and mild cognitive impairment based on complex graph measures from diffusion tensor images. Frontiers in Neuroscience, 11, 56. https://doi.org/10.3389/fnins.2017.00056.

Einstein, A. (1956). Investigations on the theory of Brownian motion. New York: Dover.

Fagiolo, G. (2007). Clustering in complex directed networks. Physical Review E, 76(2), 026107.

Foster, E. D., & Deardorff, A. (2017). Open science framework (OSF). Journal of the Medical Library Association, 105(2), 203–206. https://doi.org/10.5195/jmla.2017.88.

Friston, K. J. (1996). Statistical parametric mapping and other analysis of functional imaging data. Brain Mapping: The Methods (pp. 363–385). Academic Press.

Gong, G., He, Y., Concha, L., Lebel, C., Gross, D., Evans, A., & Beaulieu, C. (2008). Mapping anatomical connectivity patterns of human cerebral cortex using in vivo diffusion tensor imaging tractography. Cerebral Cortex, 19(3), 524–536. https://doi.org/10.1093/cercor/bhn102.

Good, P. I. (2000). Permutation tests: A practical guide to resampling methods for testing hypotheses. Springer Series in Statistics (Vol. 2). Springer.

He, X., Cai, D., & Niyogi, P. (2005). Laplacian score for feature selection. Advances in Neural Information Processing Systems (pp. 507–514).

Hong, S., Bernhardt, B. C., Caldairou, B., Hall, J. A., Guiot, M. C., Schrader, D., Bernasconi, N., & Bernasconi, A. (2017). Multimodal MRI profiling of focal cortical dysplasia type II. Neurology, 88(8), 734–742. https://doi.org/10.1212/WNL.0000000000003632.

Huettel, S. A., Song, A. W., & McCarthy, G. (2004). Functional magnetic resonance imaging. Sunderland, Massachusetts: Sinauer Associates Publishers.

Jin, B., Strasburger, A., Laken, S. J., Kozel, F. A., Johnson, K. A., et al. (2009). Feature construction and selection for fMRI-based deception detection. BMC Bioinformatics, 10(Suppl 9), S15.

Johnson, R., & Freund, J. (2011). Miller and Freund’s probability and statistics for engineers (8th ed.). Prentice Hall International.

Latora, V., & Marchiori, M. (2001). Efficient behavior of small-world networks. Physical Review Letters, 87, 198701.

Lawes, I. N., Barrick, T. R., Murugam, V., Spierings, N., Evans, D. R., et al. (2008). Atlas-based segmentation of white matter tracts of the human brain using diffusion tensor tractography and comparison with classical dissection. Neuroimage, 39, 62–79.

Li, M., Qin, Y., Gao, F., Zhu, W., & He, X. (2014a). Discriminative analysis of multivariate features from structural MRI and diffusion tensor images. Magnetic Resonance Imaging, 32, 1043–1051. https://doi.org/10.1016/j.mri.2014.05.008.

Li, Q., Rajagopalan, C., & Clifford, G. D. (2014b). A machine learning approach to multi-level ECG signal quality classification. Computer Methods and Programs in Biomedicine, 117(3), 435–447. https://doi.org/10.1016/j.cmpb.2014.09.002.

Ojola, M., & Garriga, G. C. (2010). Permutation tests for studying classifier performance. Journal of Machine Learning Research, 11, 1833–1863.

Pierpaoli, C., & Basser, P. J. (1996). Toward a quantitative assessment of diffusion anisotropy. Magnetic Resonance in Medicine, 36(6), 893–906.

Radmacher, M. D., McShane, L. M., & Simon, R. (2002). A paradigm for class prediction using gene expression profiles. Journal of Computational Biology, 9, 505–512.

Rosen, K. H. (2006). Discrete mathematics and its applications. McGraw-Hill.

Sporns, O. (2003). Graph theory methods for the analysis of neural connectivity patterns. Neuroscience Databases, 171–186.

Vapnik, V., & Lerner, A. (1963). Pattern recognition using generalized portrait method. Automation and Remote Control, 24, 774–780.

Varkuti, B., Cavusoglu, M., Kullik, A., Schiffler, B., Veit, R., Yilmaz, O., et al. (2011). Quantifying the link between anatomical connectivity, gray matter volume and regional cerebral blood flow: An integrative MRI study. PLoS One, 6(4), e14801.

Yeh, F. C., & Tseng, W. Y. (2011). NTU-90: A high angular resolution brain atlas constructed by q-space diffeomorphic reconstruction. Neuroimage, 58, 91–99.

Yeh, F. C., Verstynen, T. D., Wang, Y., Fernández-Miranda, J. C., & Tseng, W. I. (2013). Deterministic diffusion fiber tracking improved by quantitative anisotropy. PLoS One, 8(11), e80713. https://doi.org/10.1371/journal.pone.0080713.

Zurita, M., Montalba, C., Labbé, T., Cruz, J. P., Dalboni da Rocha, J., Tejos, C., Ciampi, E., Cárcamo, C., Sitaram, R., et al. (2018). Characterization of relapsing-remitting multiple sclerosis patients using support vector machine classifications of functional and diffusion MRI data. NeuroImage Clinical, 20, 724–730. Advance online publication. https://doi.org/10.1016/j.nicl.2018.09.002.