Compositional analyses and other things
Rémi Losno
IPGP – Université de Paris
Compositional Analysis
Statistics applied to geochemistry
Composition of matter
● A given material is composed of parts. The sum of the weights of the parts is equal to the weight of the material.
● The composition of a given part is the weight of the part divided by the weight of the material.
– wtotal = Σ wi
– Ci = wi / wtotal
● w is an extensive parameter, C an intensive one
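The two definitions above can be sketched in Python. This is only an illustration: the function name `closure` and the sample weights are assumptions and do not come from the slides.

```python
def closure(weights, total=100.0):
    """Turn part weights wi into compositions Ci = wi / wtotal, scaled to `total`."""
    w_total = sum(weights)
    return [total * w / w_total for w in weights]

# Hypothetical sample: three parts weighing 2 g, 3 g and 5 g.
weights = [2.0, 3.0, 5.0]
composition = closure(weights)
print(composition)

# Doubling the sample doubles every wi (extensive) but leaves every Ci unchanged (intensive).
bigger = closure([2 * w for w in weights])
print(bigger)
```

The second call shows why C is intensive: the composition does not depend on how much material was weighed.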
Closure condition
● If the given material contains D components, each wi can vary independently and all possible values span a D-dimensional space.
● The space determined by the Ci values has a dimension equal to D-1 because of the closure condition constraint: Σ Ci = 100%
● All compositional values are therefore linked together and cannot vary independently. For example, if Cj increases, the sum of all the other Ci must decrease.
Consequences
● So-called "spurious correlations"
– Variations of Cj make all the other Ci vary together
● Difficult to decipher what is really varying: Cj or Ck?
"Spurious correlation"
Quantities data

A     B     C     Sum
6.4   6.8   97    110
8.4   7.4   62    78
7.0   6.5   105   119
5.7   6.6   137   149
6.0   7.8   145   159
7.0   6.3   86    99
6.2   5.4   107   118
7.2   7.6   93    108
6.8   5.5   98    111
6.6   7.7   42    56
6.3   6.3   93    105

Compositional data

A     B     C
6%    6%    88%
11%   10%   80%
6%    6%    89%
4%    4%    92%
4%    5%    91%
7%    6%    87%
5%    5%    90%
7%    7%    86%
6%    5%    89%
12%   14%   74%
6%    6%    88%

3 components A, B and C, dilution by C
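The effect can be checked numerically on the first table above (columns A, B, C, Sum). The sketch below computes the Pearson correlation between A and B before and after closure. The helper names `pearson` and `closed` are assumptions made for this illustration, and values computed from the rounded table entries may differ slightly from those shown on the slide.

```python
import math

# Quantities (A, B, C) copied from the table above.
quantities = [
    (6.4, 6.8, 97), (8.4, 7.4, 62), (7.0, 6.5, 105), (5.7, 6.6, 137),
    (6.0, 7.8, 145), (7.0, 6.3, 86), (6.2, 5.4, 107), (7.2, 7.6, 93),
    (6.8, 5.5, 98), (6.6, 7.7, 42), (6.3, 6.3, 93),
]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Closure: divide every row by its own sum.
closed = [tuple(v / sum(row) for v in row) for row in quantities]

r_raw = pearson([r[0] for r in quantities], [r[1] for r in quantities])
r_closed = pearson([r[0] for r in closed], [r[1] for r in closed])

print("R2 between A and B, quantities:  ", round(r_raw ** 2, 3))
print("R2 between A and B, compositions:", round(r_closed ** 2, 3))
```

A and B are only weakly related as quantities. Dividing by a sum dominated by the variable C is enough to make them appear strongly correlated.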
[Figure: B plotted against A, for the quantities data and for the compositional data. Linear fit on the compositional data: f(x) = 1.018 x − 0.001, R² = 0.868]
This behaviour cannot be removed by patching the regression equation. The solution is to change the variable space.
New tools for compositional statistics
● Composition properties
– do not depend on the size of the sample
– a minor component has no influence if a simple sum is used
● Performing elemental ratios
– log ratio rather than linear ratio
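A short sketch of why ratios, and log ratios in particular, are convenient. It uses the first row of the table of quantities above (A = 6.4, B = 6.8, C = 97); the function name `log_ratio` is an assumption made for this illustration.

```python
import math

def log_ratio(x, i, j):
    """Natural log of the ratio between parts i and j of x."""
    return math.log(x[i] / x[j])

raw = (6.4, 6.8, 97.0)                   # quantities A, B, C
closed = [v / sum(raw) for v in raw]     # the same sample after closure

# A ratio between two parts is not affected by closure (or by sample size).
lr_raw = log_ratio(raw, 0, 1)
lr_closed = log_ratio(closed, 0, 1)
print(lr_raw, lr_closed)

# The log ratio is symmetric: log(A/B) = -log(B/A).
# The linear ratios A/B and B/A are not mirror images of each other.
print(log_ratio(raw, 0, 2), log_ratio(raw, 2, 0))
print(raw[0] / raw[2], raw[2] / raw[0])
```

With a log ratio, swapping numerator and denominator only changes the sign, so enrichment and depletion are treated on the same footing.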
Example: aerosols and soils
● Chemical composition
– Soils
– Sieved soils
– Generated aerosols
● Comparison of 2 aerosol generation methods
– wind tunnel
– Sygavib
● 4 soils
Trabelsi et al. submitted to JGR
Sygavib system
Classical approach: ratio on bulk soil
[Figure: ratio on bulk soil for CaO, SiO2, Al2O3, Fe2O3, MgO, K2O, Na2O, TiO2, SrO and MnO, logarithmic scale from 0.1 to 10. Series: Syg Hsar, WT Hsar, FS Hsar]
Classical approach: linear scale
[Figure: same ratios for CaO, SiO2, Al2O3, Fe2O3, MgO, K2O, Na2O, TiO2, SrO and MnO, linear scale from 0 to 10. Series: Syg Hsar, WT Hsar, FS Hsar]
What else? SiO2 removed
[Figure: ratios for CaO, Al2O3, Fe2O3, MgO, K2O, Na2O, TiO2, SrO and MnO, logarithmic scale from 0.1 to 10. Series: Syg Hsar, WT Hsar, FS Hsar]
What else? (linear = bad)
[Figure: ratios for CaO, Al2O3, Fe2O3, MgO, K2O, Na2O, TiO2, SrO and MnO, linear scale from 0 to 7. Series: Syg Hsar, WT Hsar, FS Hsar]
What else?
[Figure: four panels, one per soil (Attaya, Cherarda, Hsar, Ghraiba), showing ratios for CaO, Al2O3, Fe2O3, MgO, K2O, Na2O, TiO2, SrO and MnO on a logarithmic scale from 0.1 to 10. Series: Syg, WT and FS for Attaya, Hsar and Ghraiba; Syg and WT for Cherarda]
Compositional tools
● R, "compositions" package
● Variance and covariance analyses of log-ratios with
– PCA (Principal Component Analysis)
– Plot samples on a "compositional distance" graph.
● Very sensitive to analytical uncertainties; fails if a single zero is encountered.
– remove dubious variables
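The slides rely on the R "compositions" package. The Python sketch below is only a stand-in to show the idea, and it assumes that the "compositional distance" is the Aitchison distance, i.e. the Euclidean distance between centred log-ratio (clr) vectors. The function names are illustrative, and the two compositions are rows 1 and 10 of the percentage table shown earlier.

```python
import math

def clr(x):
    """Centred log-ratio: log of each part minus the mean log of all parts."""
    logs = [math.log(v) for v in x]      # math.log raises ValueError for a zero part
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr vectors of x and y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))

x = [6.0, 6.0, 88.0]      # A, B, C in percent
y = [12.0, 14.0, 74.0]

d = aitchison_distance(x, y)
# Rescaling x (same composition, larger sample) does not change the distance.
d_scaled = aitchison_distance([10 * v for v in x], y)
print(d, d_scaled)

# A single zero breaks the log-ratio approach.
try:
    clr([0.0, 6.0, 94.0])
    zero_ok = True
except ValueError:
    zero_ok = False
print("zero accepted:", zero_ok)
```

The failure on a zero is why dubious variables, for example those near the detection limit, are removed before the analysis.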
Biplot including bulk and fine soils, and generated aerosol. Comp 1 and Comp 2 together account for ca. 86% of the total variance (56% and 30%, respectively). a: Attaya, b: Cherarda, c: Ghraiba, d: Hsar
Biplot when Si and Na are removed.
Differences remain between bulk soil and fine fractions
How to read what is plotted
From John Aitchison, A Concise Guide to Compositional Data Analysis, 2nd Compositional Data Analysis Workshop (CoDaWork'05), Universitat de Girona, Girona
What happens element by element?
● Comparison between two sampling heads
– One commercial
– One home made
● Field campaign in Tunisia
● Y. Xu et al., in preparation (hope to be submitted soon).
Where we were
Biplot on composition
Projection of the perturbation vector
Perturbation vector
● Compositional distance
● To see element by element
● probability of composition change
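The slides do not give a formula for the perturbation vector. The sketch below assumes the usual definition from Aitchison geometry: the perturbation that takes composition x to composition y is the closure of the part-by-part ratios y/x. The function names are illustrative, and the two compositions are rows 1 and 10 of the percentage table shown earlier.

```python
def closure(x):
    """Rescale the parts of x so that they sum to 1."""
    s = sum(x)
    return [v / s for v in x]

def perturbation_difference(x, y):
    """Perturbation vector p such that x perturbed by p gives y."""
    return closure([b / a for a, b in zip(x, y)])

x = closure([6.0, 6.0, 88.0])      # A, B, C
y = closure([12.0, 14.0, 74.0])

p = perturbation_difference(x, y)
neutral = 1.0 / len(p)             # every part equals 1/D when nothing changes

# Element by element: parts above 1/D are relatively enriched, parts below are depleted.
for name, p_i in zip("ABC", p):
    trend = "enriched" if p_i > neutral else "depleted"
    print(name, round(p_i, 3), trend)

# Applying p to x (part-by-part product, then closure) recovers y.
recovered = closure([a * q for a, q in zip(x, p)])
```

Under this assumption, the norm of the clr-transformed perturbation vector is the Aitchison distance between x and y, which ties the element-by-element view to the compositional distance.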
REEs: subset of compositional data
REEs perturbation vector
A new tool for REEs profile presentation
Crozet Kerguelen
Instead of ...
[Figure: Normalized REE profiles for La, Ce, Pr, Nd, Sm, Eu, Gd, Tb, Dy, Ho, Er, Tm, Yb and Lu, linear scale from 0 to 3.5. Series: Ker, Cro]
Who can see here that Ce will not discriminate Cro from Ker?
Conclusions
● Compositional tools improve
– discussions on compositional data
– mathematics related to chemical compositions
– robustness of conclusions
● Compositional tools dramatically decrease the number of words and sentences needed to describe compositional evolution