
Traceability in chemical measurements: the role of data analysis


Academic year: 2021


Publisher's version:


La Rivista del Nuovo Cimento, 2019


NRC Publications Archive Record:

https://nrc-publications.canada.ca/eng/view/object/?id=c1646f29-76ca-4838-b6a3-e68af980fe42


Meija, Juris


Juris Meija

National Research Council Canada - Ottawa ON, Canada


Summary.— Countless chemical measurements are performed worldwide each day and they rely not only on calibration standards and measurement methods but also on mathematical models. The choices analysts make about such models can have a significant effect on the reported measurement results. The challenge is therefore for analysts to explore the rich variety of modeling options, appreciate their effect on the results, and recognize that a larger statistical toolkit can raise the bar for more reliable results.

1. – Introduction

The result of a chemical measurement is determined by many crucial components such as the primary calibration standards, the choice of measurement methods, the physical act of measurement, the mathematical measurement model, other input quantities, calculations, and the expression of the result. Together, these steps form a full picture of the measurement in the sense that it could be fully understood or even reproduced by others. Chemical measurements are rarely performed without recourse to data analysis and to mathematical or statistical models. These models, and their implementation, form an integral part of the modern measurement process and, much like the physical act of measurement, can lead to errors. Indeed, the modern view is that scientific measurement is a form of model-based inference [1].

2. – Calculations as the source of error

Good measurement can lead to bad results when incorrect mathematics is employed [2]. In 2002, the Laufenburg bridge was built across the Rhine river between Switzerland and Germany. Engineers from both countries were building the bridge from their sides. Switzerland uses the Mediterranean Sea as the reference point for elevation measurements whereas Germany adopts the North Sea. There is a 27 cm difference between these two elevations and it was certainly known and taken into consideration during the planning stages of the bridge. Yet, things went wrong during the implementation: an error in applying this correction (wrong sign) led to a mismatch of 54 cm in the height of the base pillars on the two sides of the bridge. Hence, the error in this project can be traced to an incorrect implementation of a simple measurement model,

$$h_{\mathrm{new}} = h_{\mathrm{old}} + 27\ \mathrm{cm}.$$

A similar example is often encountered at doctors' offices in Canada, where a person's height is measured using rulers with an imperial length scale and the results are then converted to the metric system using a simplified conversion rule, $h_{\mathrm{metric}}/\mathrm{cm} = 2.5 \cdot h_{\mathrm{imperial}}/\mathrm{in}$. The difference between the defined conversion factor, 1 in = 2.54 cm, and the measurement model actually used, 1 in = 2.5 cm, leads to an error of about 2% in arguably the simplest and most reliable measurement one can imagine in a doctor's office. Both of these examples demonstrate that even trivial calculations can lead to errors.
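Both calculation errors can be reproduced with a few lines of arithmetic. The following is a minimal sketch: the starting elevation and the height in inches are hypothetical values chosen only for illustration, and the function names are not taken from the source.

```python
# Intended model: h_new = h_old + 27 cm (offset between the two sea-level references).
OFFSET_CM = 27.0

def corrected_height(h_old_cm, sign=+1):
    """Apply the elevation correction; sign=-1 reproduces the wrong-sign mistake."""
    return h_old_cm + sign * OFFSET_CM

h_old = 100.0                            # hypothetical elevation, cm
right = corrected_height(h_old, +1)
wrong = corrected_height(h_old, -1)
mismatch_cm = abs(right - wrong)         # twice the offset: 54 cm

def inches_to_cm(height_in, factor=2.54):
    """Convert inches to centimetres; factor=2.5 is the simplified rule."""
    return factor * height_in

height_in = 70.0                         # hypothetical height, in
exact = inches_to_cm(height_in)
simplified = inches_to_cm(height_in, factor=2.5)
rel_error = (simplified - exact) / exact  # about -1.6 %, independent of the height

print(f"pillar mismatch: {mismatch_cm:.0f} cm")
print(f"relative error of the 2.5 cm/in rule: {100 * rel_error:.1f} %")
```

A wrong sign doubles the offset rather than merely omitting it, which is why a 27 cm correction produced a 54 cm mismatch.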

3. – Human errors

The 2007 report of the Intergovernmental Panel on Climate Change (IPCC) noted that “uncertainties in deep-ocean nutrient observations may be responsible for the lack of coherence in the nutrient changes” [3]. When reading such a statement, it is natural to think of the complexities underlying the measurements of nitrates or phosphates. Yet, the International Ocean Carbon Coordination Project survey on what causes the most errors in seawater nutrient analysis revealed that most errors have little to do with chemistry or the measurement process itself, but rather involve a variety of effects that could be classified as human errors (see table I) [4].

Table I. – What causes errors in chemical analysis? Example of seawater nutrient analysis [4].

| Source of error | Magnitude of error |
|---|---|
| Temperature effect on air volume | 5% |
| Weighing of impure standard | 4% |
| Weighing of wet standard | 3% |
| Reporting µM/L as µM/kg | 3% |
| Forcing linear fit to nonlinear data | 3% |
| Heat-distorted plastic pipettes | 1% |

One of the most avoidable errors is certainly the misreporting of units. There is a documented example of an extraordinarily high blood glucose reading, 42 mmol/L, which was interpreted as an extraordinarily low blood glucose level, 42 mg/dL, and intravenous dextrose was requested even though the patient required the opposite: insulin [5]. This cannot possibly be an isolated example and all chemists must be more vigilant when communicating their results. When reporting nitrate results, for example, care must be taken to avoid the potential confusion between the mass of nitrate ion and the corresponding mass of nitrogen. Ambiguity in reporting chemical results can indeed have big consequences. In 2015, the US Supreme Court heard a pharmaceutical patent dispute that centered around the presumed meaning of the term “average molecular weight” [6]. Indeed, there are many ways one can “average” masses of molecules: one can add up the mass of each molecule and divide by the number of molecules (number average molecular weight), one can take the mass of the molecule that is most prevalent in the mix (peak average molecular weight), or one can calculate the average mass while giving heavier molecules a bonus when doing so (weight average molecular weight). Ultimately, parts of US Patent 5,800,808 were found indefinite in the relevant context and were rendered invalid. Perhaps this is a reminder for chemists to strive for terminological clarity in their work. After all, many chemists employ the term “concentration” to report all kinds of quantities (mass fractions, mass concentrations, or amount concentrations), thus leading to unwanted misunderstandings. An example of this is the conflation of mass and amount in various legal definitions of natural or enriched uranium. Some countries define natural uranium as having 0.720% uranium-235 by mass even though that value refers to the amount fraction. (Note that $n(^{235}\mathrm{U})/n(\mathrm{U}) = 0.007\,20$ mol/mol is equivalent to $m(^{235}\mathrm{U})/m(\mathrm{U}) = 0.007\,11$ g/g for natural uranium.) Moreover, some countries use convoluted language to define natural uranium as containing uranium-233 and uranium-235 in an amount such that the abundance ratio of the sum of those isotopes to the isotope uranium-238 is 0.720%. Hence, we have the following statements:

$$n(^{235}\mathrm{U})/n(\mathrm{U}) = 0.007\,20\ \mathrm{mol/mol},$$

$$m(^{235}\mathrm{U})/m(\mathrm{U}) = 0.007\,20\ \mathrm{g/g},$$

$$\bigl(n(^{233}\mathrm{U}) + n(^{235}\mathrm{U})\bigr)/n(^{238}\mathrm{U}) = 0.007\,20\ \mathrm{mol/mol}.$$

While all of these definitions use the same numerical value, 0.007 20, they are in conflict with one another as they (inadvertently) refer to different physical quantities.
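The difference between the amount fraction and the mass fraction of uranium-235 can be checked directly. The sketch below assumes rounded molar masses for uranium-235 and for natural uranium (values not given in the text), and the function name is illustrative only.

```python
# Rounded molar masses (assumed here; they are not quoted in the text).
M_U235 = 235.0439       # g/mol, uranium-235
M_U_NATURAL = 238.0289  # g/mol, natural uranium (standard atomic weight)

def amount_to_mass_fraction(x_isotope, m_isotope, m_element):
    """Convert an amount fraction (mol/mol) of an isotope to a mass fraction (g/g)."""
    return x_isotope * m_isotope / m_element

w_u235 = amount_to_mass_fraction(0.00720, M_U235, M_U_NATURAL)
print(f"m(235U)/m(U) = {w_u235:.5f} g/g")
```

Because uranium-235 is lighter than the average uranium atom, its mass fraction is smaller than its amount fraction, so the value 0.007 20 cannot be correct for both quantities at once.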

4. – Do the results speak for themselves?

Assumptions about how to interpret data are often more critical than the mathematics. Consider the following example [7]:

“Richard received samples of sterling that was produced by melting pure silver and copper. He has to analyze this material and report the silver content from ten 1-g pins at his disposal. Five of them were used to determine the mass of silver and the other five to determine the mass of copper with the following results for each pin:

$m_{\mathrm{Ag}}$ = 0.844 g, 0.888 g, 0.825 g, 0.907 g, 0.882 g,

$m_{\mathrm{Cu}}$ = 0.060 g, 0.096 g, 0.067 g, 0.075 g, 0.070 g.”

Table II. – The choice of measurement model can have a significant effect on the interpretation of the results.

| Model | Measurement model for w(Ag) | Explanation of the measurement model | Result |
|---|---|---|---|
| A | $w_A = m_{\mathrm{Ag}}/(1\ \mathrm{g})$ | Ignore Cu data | 0.869(15) |
| B | $w_B = (1\ \mathrm{g} - m_{\mathrm{Cu}})/(1\ \mathrm{g})$ | Ignore Ag data | 0.926(6) |
| C | $w_C = m_{\mathrm{Ag}}/(m_{\mathrm{Ag}} + m_{\mathrm{Cu}})$ | Ignore that each pin weighs 1 g | 0.922(6) |
| D | $w_i \sim N[w_D, u^2(w_i)]$ $(i \in \{A, B\})$ | Fixed effects model of Cu-Ag data | 0.918(6) |
| E | $w_i \sim N[w_E, u^2(w_i) + \tau^2]$ $(i \in \{A, B\})$ | Random effects model of Cu-Ag data | 0.900(29) |

It is clear that the results are contradictory: the sum of the average copper and silver values falls 6% short of the stated mass of each pin (1 g). Table II summarizes several reasonable choices on how one can interpret such results. These approaches are by no means the only choices that can be made and some may choose entirely different measurement models to interpret this seemingly simple scenario. For example, one can adopt a statistical model

$$m_{\mathrm{Ag},i}/(1\ \mathrm{g}) \sim N[w_{\mathrm{Ag}}, u_1^2] \quad \text{and} \quad m_{\mathrm{Cu},i}/(1\ \mathrm{g}) \sim N[1 - w_{\mathrm{Ag}}, u_2^2] \quad (i = 1 \ldots 5), \tag{1}$$

whose three parameters ($w_{\mathrm{Ag}}$, $u_1$, $u_2$) can be estimated using Bayesian methods [8]. If a decision is made to reject the sterling when its silver content is below 0.925 g/g, different conclusions will be reached depending on the mathematical model used. Analysts face similar situations daily and are confronted with a plurality of reasonable choices that can lead to different results and uncertainties [9].

Data analysis is often seen as a process that reveals the results and indeed it might be convenient to overlook the fact that results might depend on the chosen analysis strategy. As we have seen in the above example of sterling analysis, there is often no single way to interpret data. On the contrary, often there are many reasonable approaches to evaluate the same data [9]. While this is generally understood by scientists, in many cases there is not enough appreciation for the implications of model choice in practice. It is therefore important to appreciate the fact that results emerge from data through measurement models. Consequently, different choices of statistical models can lead to different conclusions. The significance of the measurement model was recently demonstrated in a study where 29 research teams were given the same soccer game dataset and asked to determine if referees are more likely to give red cards to dark-skinned players [10]. Each research team used a different statistical model which they thought best applied to the dataset. Twenty teams declared that the data contain significant evidence of racial bias whereas nine teams did not. Both examples demonstrate that subjective good-faith modeling choices can lead to significantly different conclusions from the same data.
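The sterling example is small enough to recompute by hand or in a few lines of code. The sketch below is one possible reading of models A to E in table II and makes several assumptions that the text does not spell out: standard uncertainties are taken as standard errors of the mean, model C is taken as a ratio of means, and the between-method variance of model E is estimated with the DerSimonian-Laird procedure, which is only one of several available estimators.

```python
from math import sqrt
from statistics import mean, stdev

m_ag = [0.844, 0.888, 0.825, 0.907, 0.882]  # g of silver per 1-g pin
m_cu = [0.060, 0.096, 0.067, 0.075, 0.070]  # g of copper per 1-g pin
PIN_MASS = 1.0                              # g

def mean_and_u(values):
    """Mean and its standard uncertainty (standard error of the mean)."""
    return mean(values), stdev(values) / sqrt(len(values))

def weighted_mean(estimates, uncertainties, tau2=0.0):
    """Inverse-variance weighted mean; tau2 > 0 gives a random effects model."""
    weights = [1.0 / (u ** 2 + tau2) for u in uncertainties]
    total = sum(weights)
    value = sum(w * x for w, x in zip(weights, estimates)) / total
    return value, 1.0 / sqrt(total)

def dersimonian_laird_tau2(estimates, uncertainties):
    """Moment estimate of the between-method variance (an assumed choice)."""
    weights = [1.0 / u ** 2 for u in uncertainties]
    total = sum(weights)
    centre = sum(w * x for w, x in zip(weights, estimates)) / total
    q = sum(w * (x - centre) ** 2 for w, x in zip(weights, estimates))
    denom = total - sum(w ** 2 for w in weights) / total
    return max(0.0, (q - (len(estimates) - 1)) / denom)

w_a, u_a = mean_and_u([m / PIN_MASS for m in m_ag])               # model A
w_b, u_b = mean_and_u([(PIN_MASS - m) / PIN_MASS for m in m_cu])  # model B
w_c = mean(m_ag) / (mean(m_ag) + mean(m_cu))                      # model C
w_d, u_d = weighted_mean([w_a, w_b], [u_a, u_b])                  # model D
tau2 = dersimonian_laird_tau2([w_a, w_b], [u_a, u_b])
w_e, u_e = weighted_mean([w_a, w_b], [u_a, u_b], tau2)            # model E

for label, w in zip("ABCDE", (w_a, w_b, w_c, w_d, w_e)):
    verdict = "accept" if w >= 0.925 else "reject"
    print(f"model {label}: w(Ag) = {w:.3f} g/g -> {verdict}")
```

Under these assumptions the estimates land close to the tabulated values, and only model B clears the 0.925 g/g threshold on the point estimate alone, which illustrates how the accept/reject decision hinges on the model rather than on the data.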

5. – Dark uncertainty

There are stories of a spectrography laboratory where measurement precision was degraded on the days when cleaners had polished the floors with wax. The solvent of the polish contained enough UV-absorbing compounds to affect the light transmission in air-path spectrometers. Similarly, a microwave oven in a nearby kitchen was the source of radio signals, called perytons, that baffled scientists at Australia's Parkes radio telescope for two decades. In both examples, human judgment is essential in choosing which influences to consider in method validation and uncertainty evaluation. This exercise of judgment should be acknowledged as a crucial aspect of the analyst's professional skills.

There is plenty of evidence that scientists tend to underestimate the uncertainty of their measurement results [11, 12] and analytical chemists are no exception. This might not be too surprising since the traditional “bottom-up” approach to uncertainty evaluation requires analysts to identify all sources of uncertainty. If one is not aware of a certain influence on the results, it will not be accounted for in the uncertainty budget and will become part of the dark uncertainty, that is, the uncertainty that remains invisible to the analyst [13]:

“In every case, at least some of the laboratories must be omitting important contributions from their formal uncertainty calculations, by underestimation of recognized uncertainty sources, by omitting important effects from the model used or for other reasons. This is evidence for the prevalence of dark uncertainty in much of chemical measurement.”

The coherence of chemical measurements is largely underpinned by the use of Certified Reference Materials (CRMs), with their stated uncertainties often taken as a measure of quality. Yet, long recognized is the “CRM syndrome” where analysts can succeed in repeating reported results, but can fail when samples are presented as blind unknowns [14]. Dark uncertainty seems to be an inevitable part of chemical measurement and, as a result, we tend to underestimate uncertainties of our measurements [12]. As an example, fig. 1 shows that the observed biases from measurements of numerous inorganic CRMs exhibit much wider tails than a Gaussian distribution would suggest. This observation, in turn, calls for a better understanding of the uncertainty evaluation methods in chemical measurements.

6. – Understanding the data-generation process

Increasingly, analytical chemists spend more time evaluating the uncertainty of their measurement results because the uncertainty is essential to judge the significance of the results. To illustrate this point, consider the figure skating scoring system. In the 2018 Winter Olympics, the gold and silver medals in ladies single figure skating were awarded with corresponding total scores of 239.57 and 238.26 points. How significant is the 1.31 point gap that separates Olympic gold from silver? To better understand these results, it is helpful to explore how these scores are obtained. Table III shows an example of a figure skating score sheet. In short, each technical element has a certain agreed-upon base value, $b_i$, and its execution is evaluated by nine judges. After the removal of the lowest and highest marks, the average score is calculated (trimmed mean) and added to the base value with a certain predetermined weight, $w_i$:

$$s_i = b_i + \frac{w_i}{7}\left(-J_{\min} - J_{\max} + \sum_{j=1}^{9} J_j\right). \tag{2}$$

Table III. – Excerpt of figure skating score card (Alina Zagitova, 2018 Winter Olympics). 3S stands for triple Salchow, 3F for triple flip, and 2A for double Axel.

| Component i | Base $b_i$ | J1 | J2 | J3 | J4 | J5 | J6 | J7 | J8 | J9 | Weight $w_i$ | Score $s_i$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3S | 4.84 | 2 | 2 | 3 | 2 | 1 | 2 | 2 | 2 | 2 | 0.7 | 6.24 |
| 3F | 5.83 | 3 | 2 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 0.7 | 7.53 |
| 2A | 3.63 | 1 | 1 | 1 | 2 | 2 | 1 | 1 | 1 | 1 | 0.5 | 4.20 |

The final scores represent the total outcome of such a linear multi-component model which includes professional opinion scores from nine judges. Given that judges do not always agree on their scores [15], it is reasonable to explore the observed levels of disagreement. One way to assess the reliability of the judging scores is to simulate samples by randomly drawing judging scores with replacement and then calculating the total score for each such random sample (fig. 2). This method is commonly known as nonparametric bootstrap resampling [16] and is widely used for uncertainty evaluations in science, medicine, and engineering. In this simplified case, the bootstrap provides a simple way to appreciate the effect of judging on the final result.

Fig. 1. – Chemists tend to underestimate their measurement uncertainties. In this case, the heavy-tailed Cauchy distribution (t-distribution with degrees of freedom ν = 1) describes the observed biases from measurements of numerous inorganic Certified Reference Materials at the NRC more adequately than the Gaussian distribution.

Fig. 2. – Probabilistic interpretation of the ladies single figure skating medal scores at the 2018 Winter Olympics. According to this model, the probability of Evgenia Medvedeva having received the gold medal instead of silver is 6% as a result of judging uncertainty.
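The scoring rule and the bootstrap idea can be sketched for the three elements listed in table III. This is only an illustration under stated assumptions: it uses the excerpt rather than the full score sheet, it resamples the nine judges jointly across elements (the text does not say whether judges or individual marks were resampled), and the number of draws and the seed are arbitrary.

```python
import random
from statistics import fmean, stdev

# (base value, weight, marks of judges J1..J9) for 3S, 3F and 2A from table III
ELEMENTS = [
    (4.84, 0.7, [2, 2, 3, 2, 1, 2, 2, 2, 2]),
    (5.83, 0.7, [3, 2, 3, 3, 3, 2, 2, 2, 2]),
    (3.63, 0.5, [1, 1, 1, 2, 2, 1, 1, 1, 1]),
]

def element_score(base, weight, marks):
    """Base value plus the weighted trimmed mean (lowest and highest mark removed)."""
    trimmed_sum = sum(marks) - min(marks) - max(marks)
    return base + weight * trimmed_sum / (len(marks) - 2)

def bootstrap_totals(elements, n_draws=5000, seed=1):
    """Totals obtained by redrawing the panel of judges with replacement."""
    rng = random.Random(seed)
    n_judges = len(elements[0][2])
    totals = []
    for _ in range(n_draws):
        panel = [rng.randrange(n_judges) for _ in range(n_judges)]
        totals.append(sum(
            element_score(base, weight, [marks[j] for j in panel])
            for base, weight, marks in elements
        ))
    return totals

scores = [element_score(*element) for element in ELEMENTS]
totals = bootstrap_totals(ELEMENTS)
print("element scores:", [round(s, 2) for s in scores])
print(f"bootstrap spread of the three-element total: {stdev(totals):.2f} points")
```

Applying the same resampling to the complete score sheets of two skaters and counting how often their order is reversed would give a probability of the kind quoted in fig. 2.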

7. – The importance of the measurement model

There is an idealized view shared by many that “data should speak for themselves” [17]. For data to provide any meaningful context or conclusions, there has to be an understanding of their generation, purpose, and context. Ultimately, mathematical models are necessary to interpret data (measurement results) and they play a crucial role in reaching conclusions and decisions.

7.1. Titration endpoint. – Consider one of the oldest classical methods of chemical analysis: the titrimetric determination of chloride ions with silver nitrate and potassium chromate as indicator (Mohr method, 1856). When all chloride ions have reacted with silver, the reddish-brown silver chromate will precipitate, which indicates the endpoint of the titration to the analyst. To what are the results of such a titration traceable and what is the measurement model in this case?

When 100 mL of 0.01 M chloride solution is titrated with 0.1 M silver nitrate solution in the presence of chromate indicator ($c_{\mathrm{ind}}$ = 0.002 M), most will agree that the titration endpoint is reached precisely at 10 mL, in accordance with the following measurement model:

$$V_{\mathrm{end}} = V_0 \frac{c_0}{c}. \tag{3}$$

Fig. 3. – Potentiometric titration of KHP with NaOH. How do we estimate the endpoint? In this example, the smooth line is an empirical mathematical model, the 5-parameter logistic regression $y = a + (b - a)/(1 + 10^{c(d-x)})^{e}$, with the inflection point at $x = d + \log_{10}(e)/c$.

How accurate is this expression? It is important to realize that this simplified model assumes that silver chromate will start forming only when all chloride ions are consumed by the silver ions. In reality, both AgCl and Ag2CrO4 are formed continuously during the titration and the titration endpoint corresponds to the point in time when Ag2CrO4 begins to precipitate or, mathematically speaking, the solubility product of Ag2CrO4 is crossed. Considering only the most relevant chemical reactions

$$\mathrm{Ag^+ + Cl^- = AgCl}, \qquad K_{\mathrm{sp1}} = [\mathrm{Ag^+}][\mathrm{Cl^-}] = 10^{-9.75}\ \mathrm{mol^2\,L^{-2}}, \tag{4a}$$

$$\mathrm{2Ag^+ + CrO_4^{2-} = Ag_2CrO_4}, \qquad K_{\mathrm{sp2}} = [\mathrm{Ag^+}]^2[\mathrm{CrO_4^{2-}}] = 10^{-11.9}\ \mathrm{mol^3\,L^{-3}}, \tag{4b}$$

and that the two predominant ions in the solution are Ag+ and Cl−, whose total charge must balance each other, the following simplified theoretical measurement model is obtained [18]:

$$\frac{K_{\mathrm{sp1}}}{\sqrt{K_{\mathrm{sp2}}\dfrac{V_0 + V_{\mathrm{end}}}{c_{\mathrm{ind}} V_0}}} - \sqrt{K_{\mathrm{sp2}}\frac{V_0 + V_{\mathrm{end}}}{c_{\mathrm{ind}} V_0}} - \frac{c_0 V_0 - c V_{\mathrm{end}}}{V_0 + V_{\mathrm{end}}} = 0. \tag{5}$$
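The charge-balance model has no convenient closed-form solution for the endpoint volume, but it is easy to solve numerically. The sketch below writes the model as [Cl−] − [Ag+] − (c0V0 − cVend)/(V0 + Vend) = 0, with [Ag+] fixed by the solubility product of silver chromate at the diluted indicator concentration, and finds the root by bisection. The bracketing interval, the tolerance, and the function names are assumptions made for illustration.

```python
from math import sqrt

K_SP1 = 10 ** -9.75  # [Ag+][Cl-], mol^2 L^-2
K_SP2 = 10 ** -11.9  # [Ag+]^2 [CrO4^2-], mol^3 L^-3

def endpoint_residual(v_end, v0=100.0, c0=0.01, c=0.1, c_ind=0.002):
    """Left-hand side of the model; volumes in mL, concentrations in mol/L."""
    # [Ag+] at which Ag2CrO4 starts to precipitate from the diluted indicator
    silver = sqrt(K_SP2 * (v0 + v_end) / (c_ind * v0))
    chloride = K_SP1 / silver
    return chloride - silver - (c0 * v0 - c * v_end) / (v0 + v_end)

def solve_endpoint(lo=9.0, hi=11.0, tol=1e-9):
    """Bisection on an interval where the residual changes sign."""
    f_lo = endpoint_residual(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = endpoint_residual(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

v_end = solve_endpoint()
v_simple = 100.0 * 0.01 / 0.1  # stoichiometric model, 10 mL
print(f"V_end = {v_end:.3f} mL ({100 * (v_end / v_simple - 1):.2f} % above 10 mL)")
```

Changing `c_ind` in this sketch shows how the indicator concentration shifts the endpoint, an effect that the purely stoichiometric model cannot represent.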

Solving this measurement model, $f(V_{\mathrm{end}}) = 0$, for $V_{\mathrm{end}}$ yields $V_{\mathrm{end}}$ = 10.022 mL, which differs by 0.2% from the simple measurement model based on stoichiometry alone ($V_{\mathrm{end}}$ = 10 mL). Titration is a primary method of chemical analysis with results typically believed to have sub-percent uncertainties. This example demonstrates that the accuracy of titration results can be significantly improved by a better understanding of the physico-chemical conditions of the analyzed system.

7.2. Detecting the endpoint. – At the core of any titration is the estimation of the endpoint. This is the moment in the titration when the analyst observes a sudden change signaling the consumption of the analyte. Today, most titrations are conducted with the help of instrumental measurements of absorbance or electric conductivity (see fig. 3). In such cases the endpoint is estimated as a parameter of the observed titration curve. Typically, the endpoint is defined as the inflection point of the titration curve although it is also reasonable to define it as a point with a certain pH of the system. Even when we agree that the inflection point of the titration curve is an unbiased estimate of the equivalence point, there are many ways to estimate it [19]. There are nonparametric methods, such as the numerical derivative calculation (Euler method), and parametric methods which provide the inflection point from the parameters of the titration curve fit. For the latter, one needs to provide an adequate mathematical description of the titration curve [20]. As can be seen from table IV, several methods to obtain the endpoint show a 2% spread, which requires further contemplation by the analyst.

Table IV. – A summary of endpoint estimates from a potentiometric titration of potassium hydrogen phthalate with NaOH.

| Model for endpoint | Estimation method | Description | Result |
|---|---|---|---|
| Inflection point | Nonparametric | 1st-order derivative (forward) | 5.00(5) mL |
| Inflection point | Nonparametric | 1st-order derivative (central) | 4.95(6) mL |
| Inflection point | Nonparametric | Extremum distance estimator | 4.88(3) mL |
| Inflection point | Parametric | 5P logistic curve fitting | 4.92(4) mL |
| Inflection point | Parametric | 4P logistic curve fitting | 4.90(2) mL |
| Point at pH = 7 | Parametric | 5P logistic curve fitting | 4.94(2) mL |
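For the five-parameter logistic curve quoted in the caption of fig. 3, the inflection point follows from the fitted parameters as x = d + log10(e)/c. The sketch below compares this closed form with a nonparametric estimate, the location of the steepest slope found by central differences on a fine grid. The parameter values are hypothetical and do not come from the titration in table IV.

```python
from math import log10

def logistic5(x, a, b, c, d, e):
    """Five-parameter logistic curve y = a + (b - a) / (1 + 10**(c*(d - x)))**e."""
    return a + (b - a) / (1.0 + 10.0 ** (c * (d - x))) ** e

def inflection_closed_form(c, d, e):
    """Inflection point of the 5P logistic curve, x = d + log10(e)/c."""
    return d + log10(e) / c

def inflection_numeric(params, x_min, x_max, step=1e-4, h=1e-5):
    """Grid search for the steepest slope using central differences."""
    best_x, best_slope = x_min, 0.0
    n_points = int(round((x_max - x_min) / step)) + 1
    for k in range(n_points):
        x = x_min + k * step
        slope = (logistic5(x + h, *params) - logistic5(x - h, *params)) / (2 * h)
        if abs(slope) > abs(best_slope):
            best_x, best_slope = x, slope
    return best_x

params = (3.0, 11.0, 2.0, 4.9, 1.5)  # hypothetical a, b, c, d, e
x_closed = inflection_closed_form(*params[2:])
x_numeric = inflection_numeric(params, 4.0, 6.0)
print(f"closed form: {x_closed:.3f} mL, numerical derivative: {x_numeric:.3f} mL")
```

On noise-free model values the two estimates agree to within the grid spacing; with real, noisy readings they generally do not, which is one source of the spread seen in table IV.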

7.3. Solubility calculations. – Solubility of chemical substances is often communicated in terms of the solubility product, which is the equilibrium constant of a hypothetical dissociation reaction into the constituent ions. For nickel dimethylglyoxime, Ni(dmg)2, as an example, this corresponds to the following hypothetical reaction:

$$\mathrm{Ni(dmg)_2 = Ni^{2+} + 2\,dmg^-}, \qquad K_{\mathrm{sp}} = [\mathrm{Ni^{2+}}][\mathrm{dmg^-}]^2. \tag{6}$$

Thus, the calculation of the solubility constant requires knowledge of the concentration of the dissolved metal ion, [Ni2+], and of the free ligand, [dmg−], which is often simplified by considering [dmg−] = 2[Ni2+]. If this expression is used to calculate the solubility constant, i.e. $K_{\mathrm{sp}} = 4[\mathrm{Ni^{2+}}]^3$, one will obtain incorrect results because other species are present in significant amounts, including OH−, NiOH+, and Hdmg [21]. As an example, the concentration of dmg− ions in a saturated aqueous solution of Ni(dmg)2 is two orders of magnitude higher than the approximation [dmg−] = 2[Ni2+] suggests [22].

Simple mathematical models have a tendency to permeate textbooks and, in turn, scientific practices. In fact, many analysts choose simplicity over accuracy when it comes to mathematical models and the half-titration method to determine the pKa values of acids is a well-known example of such a phenomenon [23]. These examples illustrate the importance of dissecting the robustness of simple measurement models before adopting them in daily practice.

8. – Traceability in curve-fitting

Curve fitting plays an important role in modern science and one can argue that nearly all everyday measurements in chemistry rely on it. In this context, we shall consider two “high-end” calibration methods of analytical chemistry: standard additions and isotope dilution.

8.1. Method of standard additions. – The method of standard additions was introduced in the 1930s as a way to deal with the effect of the sample matrix on analytical signals [24]. It consists of making aliquots of the sample to which different amounts of the standard are added. All samples are then processed and subjected to chemical measurement. The mass fraction of the analyte in the sample is obtained by regressing the observed signals against the mass of the added standard and extrapolating to zero analytical signal. Considering the simplest case of ordinary least squares fitting, the measurement model for the mass fraction of the analyte in the sample, w, is as follows:

$$w = a/b, \quad \text{where} \quad \{a, b\} = \arg\min_{a,b} \sum_i (y_i - a - b x_i)^2. \tag{7}$$

In this case, both the intercept and the slope, a and b, are correlated, normally distributed variables. The ratio of two correlated normally distributed random variables is a rather controversial topic of mathematical statistics [25]. To make a point by exaggeration, the normal ratio distributions can be asymmetric and even bimodal.

Overall, obtaining a result which involves a calibration curve, whether it is external or internal, involves three layers of modeling. The first layer involves the choice of the measurement model function, which can be linear, quadratic, or rational, as an example. The second layer involves the choice of optimality criteria for fitting, which can be ordinary least squares, weighted least squares, or errors-in-variables, as an example. The third layer involves the choice of the statistical method of analysis. This can involve many things: for example, propagation of uncertainties (Gauss method) or of probability density functions (Monte Carlo method), nonparametric resampling of data, or the choice between frequentist and Bayesian methods of uncertainty evaluation.

Consider the effects of the model choice in the mass spectrometric determination of nitrate in a standard solution with a known amount of nitrate, $w(\mathrm{NO_3^-})$ = 50.5 ± 0.2 mg/kg (fig. 4). The coefficient of determination is $r^2$ = 0.9996, which most would take as a convincing level of linearity, yet the difference in the result between the linear and quadratic models is 8%. Both of these fitting models are empirical and both fit the data rather well. Shall one simply ignore the results from the quadratic model? While the Akaike Information Criterion (AIC) or Fisher's F-test of the residual variances can evaluate which model fits the data better, they do not tell us which is the correct model. Once the measurement model is adopted, the fitting method has to be chosen. For linear regression, chemists often use ordinary and weighted least squares with a variety of weighting alternatives such as $w = x^{-2}, x^{-1}, 1, y^{-1}$, and $y^{-2}$ [26, 27]. With replicate measurements often lacking, expert judgment plays an important role in deciding which weighting scheme, if any, to adopt. Depending on the choice of weighting method, the (standard addition) results can vary by several percent. Hence, if more precise measurements are required, one has to consider the role of statistical methods in chemical analysis.

Fig. 4. – Effect of the measurement model in the GC-MS determination of nitrate by standard additions: the linear model gives the result $w(\mathrm{NO_3^-})$ = 54.0 ± 1.8 mg/kg (expanded uncertainty, k = 2) whereas the quadratic measurement model gives an 8% lower value, 49.9 ± 1.3 mg/kg, in agreement with the true value 50.5 ± 0.2 mg/kg (unpublished data).
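The second modeling layer, the choice of fitting criterion, can be explored with a small sketch of the standard additions estimate w = a/b. The data below are hypothetical (they are not the nitrate measurements of fig. 4), the function names are illustrative, and the 1/y² weighting is just one of the schemes mentioned in the text.

```python
def fit_line(x, y, weights=None):
    """Weighted least squares fit of y = a + b*x; unit weights give ordinary least squares."""
    if weights is None:
        weights = [1.0] * len(x)
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

def standard_additions(x, y, weights=None):
    """Analyte content as the ratio of intercept to slope, w = a/b."""
    intercept, slope = fit_line(x, y, weights)
    return intercept / slope

# Hypothetical data: added standard (mg/kg) and observed signal (arbitrary units)
added = [0.0, 25.0, 50.0, 75.0, 100.0]
signal = [10.3, 15.1, 20.6, 24.8, 30.9]

w_ols = standard_additions(added, signal)
w_wls = standard_additions(added, signal, weights=[1.0 / s ** 2 for s in signal])
print(f"ordinary least squares:  w = {w_ols:.1f} mg/kg")
print(f"weighted (1/y^2) fit:    w = {w_wls:.1f} mg/kg")
```

Because the result is a ratio of two correlated fitted parameters, a small change in the weighting moves both the intercept and the slope, and the ratio can shift by more than either parameter alone would suggest.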

8.2. Isotope dilution. – Isotope dilution has been around in chemistry for a long time, tracing its conceptual roots from early 20th century ecology [28] and leading to the 1943 Nobel Prize in chemistry to George de Hevesy. The simplest implementation of this method is a form of single-point calibration and requires addition of a known amount of isotopic standard to the sample with a subsequent measurement of the isotopic composition of the resulting mixture. In order to perform multi-point calibration, one has to employ lengthy measurement model equations or resort to curve-fitting. Consider an isotope dilution measurement involving a three-point calibration. The calibration plot is obtained by preparing three mixtures of a standard (A*) and isotopic spike (B) and measuring the isotopic composition of these blends ($R_1$, $R_2$, and $R_3$). Then, the sample (A) is mixed with the isotopic standard (B) and the isotopic composition of this blend is measured ($R_4$). The somewhat simplified theoretical measurement model for the mass fraction of the analyte in the sample, $w_A$, is as follows [29]:

$$w_A = w_{A^*}\,\frac{m_{A^*,2}\,m_{A^*,3}\,m_{B,1}\,m_{B,4}(R_2 - R_3)(R_1 - R_4) + m_{A^*,1}\,m_{A^*,3}\,m_{B,2}\,m_{B,4}(R_1 - R_3)(R_4 - R_2) + m_{A^*,1}\,m_{A^*,2}\,m_{B,3}\,m_{B,4}(R_1 - R_2)(R_3 - R_4)}{m_{A,4}\,m_{A^*,1}\,m_{B,2}\,m_{B,3}(R_2 - R_3)(R_4 - R_1) + m_{A,4}\,m_{A^*,2}\,m_{B,1}\,m_{B,3}(R_1 - R_3)(R_2 - R_4) + m_{A,4}\,m_{A^*,3}\,m_{B,1}\,m_{B,2}(R_1 - R_2)(R_4 - R_3)}. \tag{8}$$

Here, $w_{A^*}$ is the mass fraction of the analyte in the standard solution and $m_{X,i}$ are the masses of solutions used to make the various blends. Instead of employing this lengthy equation, one can use regression-based methods to fit the calibration curve from measurements $R_1$, $R_2$, $R_3$:

$$y_i = \frac{a_1 + a_2 \cdot x_i}{1 + a_3 \cdot x_i} \quad (i = 1 \ldots 3). \tag{9}$$

Here, $x_i = w_{A^*} m_{A^*,i}/m_{B,i}$ and $y_i = R_i$. The mass fraction of the analyte, $w_A$, is then obtained by applying the measurement $R_4$ to this calibration function. In fact, it is a common practice in biomedical research to construct calibration graphs by analyzing mixtures of a standard and an isotopically labeled substance. By plotting the observed isotope ratios against the mass ratios of the standard and the labeled substance, one obtains the calibration plot. Such calibration curves can display slight curvature which is often modeled empirically with quadratic or cubic polynomials

$$y_i = a + b \cdot x_i + c \cdot x_i^2\ (+\, d \cdot x_i^3). \tag{10}$$

Polynomial models have been used in isotope-based quantitation of biomolecules since the 1970s but they can lead to biased results, that is, the results obtained from the theoretical measurement model (eq. (8) or eq. (9)) and the empirical measurement model (eq. (10)) might differ by several percent [30].

The examples from titrimetry, standard additions, and isotope dilution all demonstrate that data interpretation in analytical chemistry requires choices to be made about the underlying measurement models, and that these choices have significant effects on the results.
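The rational calibration function y = (a1 + a2·x)/(1 + a3·x) has three parameters, so three calibration blends determine it exactly: multiplying out gives a1 + a2·x − a3·x·y = y, which is linear in the parameters. The sketch below solves this 3×3 system with Cramer's rule, inverts the curve for the sample blend, and converts the result to a mass fraction via x4 = wA·mA,4/mB,4. All numerical values and function names are hypothetical and serve only to illustrate the mechanics.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def fit_rational(xs, ys):
    """Parameters (a1, a2, a3) of y = (a1 + a2*x)/(1 + a3*x) through three points."""
    rows = [[1.0, x, -x * y] for x, y in zip(xs, ys)]  # a1 + a2*x - a3*x*y = y
    det = det3(rows)
    params = []
    for k in range(3):
        replaced = [row[:] for row in rows]
        for r in range(3):
            replaced[r][k] = ys[r]  # Cramer's rule: swap column k for the right-hand side
        params.append(det3(replaced) / det)
    return tuple(params)

def invert_rational(y, a1, a2, a3):
    """Solve y = (a1 + a2*x)/(1 + a3*x) for x."""
    return (y - a1) / (a2 - a3 * y)

def sample_mass_fraction(r4, params, m_sample, m_spike):
    """w_A from the sample blend, using x4 = w_A * m_sample / m_spike."""
    return invert_rational(r4, *params) * m_spike / m_sample

# Hypothetical calibration blends: x_i = w_A* m_A*,i / m_B,i and measured ratios R_i
x_cal = [0.5, 1.0, 2.0]
r_cal = [0.4976, 0.9619, 1.8273]
params = fit_rational(x_cal, r_cal)
w_a = sample_mass_fraction(1.2300, params, m_sample=1.05, m_spike=0.98)
print(f"a1, a2, a3 = {params[0]:.4f}, {params[1]:.4f}, {params[2]:.4f}; w_A = {w_a:.4f}")
```

With exactly three blends the fit is an interpolation with no residual degrees of freedom, so any uncertainty statement has to come from propagating the uncertainties of the masses and measured ratios rather than from the scatter about the curve.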

9. – What is the best estimate?

The result of the analysis is often viewed as the “best estimate” of the measurand. The Guide to the Expression of Uncertainty in Measurement (GUM), for example, states that the result of a measurement is expressed as $Y = y \pm U(y)$, which is interpreted to mean that the best estimate of the value attributable to the measurand Y is y. Consider further that an output quantity y is given from the measured input quantity x by a function $y = f(x)$. Trivially, if x = 5 and $y = x^2$ then the best estimate of the output quantity is $y = 5^2 = 25$. What happens, however, when the best estimate x has uncertainty, as all measurement results do? It turns out that if x = 5 with standard uncertainty u(x) = 1, which is modeled using a normal distribution, then the best estimate of $x^2$ is no longer 25 but rather 26, albeit with a large standard uncertainty $u(y) = u(x^2) = 10$.
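The shift from 25 to 26 follows from the identity E[x²] = μ² + σ² for any random variable with mean μ and standard deviation σ; for a normal variable the variance of x² is 4μ²σ² + 2σ⁴. A minimal Monte Carlo sketch (the number of draws and the seed are arbitrary choices) confirms both values:

```python
import random
from math import sqrt
from statistics import fmean

def propagate_square(x, u_x, n_draws=200_000, seed=1):
    """Monte Carlo mean and standard deviation of x**2 for normally distributed x."""
    rng = random.Random(seed)
    draws = [rng.gauss(x, u_x) ** 2 for _ in range(n_draws)]
    centre = fmean(draws)
    spread = sqrt(sum((d - centre) ** 2 for d in draws) / (n_draws - 1))
    return centre, spread

mc_mean, mc_sd = propagate_square(5.0, 1.0)
exact_mean = 5.0 ** 2 + 1.0 ** 2                      # mu^2 + sigma^2 = 26
exact_sd = sqrt(4 * 5.0 ** 2 * 1.0 ** 2 + 2 * 1.0 ** 4)  # sqrt(102), about 10.1

print(f"Monte Carlo: {mc_mean:.2f} +/- {mc_sd:.2f}")
print(f"exact:       {exact_mean:.2f} +/- {exact_sd:.2f}")
```

The bias term equals the variance of the input, so it vanishes only when the input is known exactly.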

As the above example suggests, the best value of an output quantity might not correspond to the best values of the input quantities. Consider the isotopic abundance of the ion $^{13}\mathrm{C}_{1}{}^{12}\mathrm{C}_{92}$ (having a mass of 1117 Da) in the molecular fragment $\mathrm{C}_{93}$. The measurement model is given from the isotopic abundance of carbon-13 atoms, $x_{13}$, and the binomial expansion

$$x_{1117} = 93 \cdot x_{13}(1 - x_{13})^{92}. \qquad (11)$$

If the isotopic abundance of carbon-13 is taken as $x_{13} = 0.0107$ with standard uncertainty $u(x_{13}) = 0.0005$ and modeled as a Gaussian random variable, then the best estimate of $x_{1117}$ is 0.369 46(59), as obtained using the Monte Carlo method, which effectively propagates the probability distribution of $x_{13}$ through eq. (11). This estimate of $x_{1117}$ does not correspond to the value obtained by inserting $x_{13}$ into eq. (11) but rather to that obtained by inserting $x_{13} - 1.1 \cdot u(x_{13})$, a consequence of the nonlinear character of eq. (11), which leads to a highly asymmetric distribution of $x_{1117}$. Asymmetric distributions are commonplace in physics and they are becoming more popular in chemistry, especially for high-purity materials [31]; chemists need to become acquainted with their properties.
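The Monte Carlo propagation described above can be sketched in a few lines of Python. This is a minimal illustration under the stated assumptions (Gaussian $x_{13}$ with the mean and standard uncertainty quoted in the text), not the authors' actual implementation. The function name `x1117`, the seed, and the sample size are choices made here.

```python
import math
import random

def x1117(x13):
    """Eq. (11): abundance of the ion containing exactly one carbon-13 atom in C93."""
    return 93.0 * x13 * (1.0 - x13) ** 92

x13, u_x13 = 0.0107, 0.0005  # values quoted in the text

# Value obtained by inserting x13 directly into eq. (11)
plug_in = x1117(x13)

# Propagate the Gaussian distribution of x13 through eq. (11)
rng = random.Random(2019)  # arbitrary seed for reproducibility
n = 200_000                # arbitrary sample size
draws = [x1117(rng.gauss(x13, u_x13)) for _ in range(n)]
mean_mc = math.fsum(draws) / n
sd_mc = math.sqrt(math.fsum((d - mean_mc) ** 2 for d in draws) / (n - 1))

print(f"plug-in value:    {plug_in:.5f}")
print(f"Monte Carlo mean: {mean_mc:.5f}")
print(f"Monte Carlo sd:   {sd_mc:.5f}")
```

The Monte Carlo mean falls below the plug-in value because eq. (11) reaches its maximum at $x_{13} = 1/93 \approx 0.01075$, very close to the assumed input value, so deviations of $x_{13}$ in either direction can only lower $x_{1117}$.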

10. – Traceability chains

The global economy relies on measurement results that can be compared and trusted worldwide. How can we decide that atmospheric CO$_2$ has been steadily increasing since the 1950s unless we have confidence in these measurements throughout the last 70 years? The use of Certified Reference Materials for calibration is a straightforward way to provide metrological traceability and trust in measurement results [32]. Discussions on traceability often idealize it by using the analogy of a simple chain [33]. However, the relationship between the various standards can often be much more complicated.

10.1. Carbon isotope delta. – High-precision carbon isotope ratio measurements can tell whether sugar is derived from maple syrup or high fructose corn syrup, whether the testosterone in an athlete’s blood has been made in the body or in the lab, and, more importantly, whether carbon dioxide comes from plants or from burning fossil fuels [34]. The carbon isotope delta measurements are traceable to the Vienna Peedee Belemnite (VPDB) scale, which is defined by the reference material NBS 19. In practice, a variety of other reference materials are employed in carbon isotope measurements and their relationship to NBS 19 is shown in fig. 5. It is clear that the various international standards are heavily intertwined [35]. While such connectivity helps to enhance the coherence of the reference materials, complications arise when some elements in this network are revised. The best estimates of carbon isotope deltas in USGS40 and USGS41 have been questioned since their introduction [36, 37] and LSVEC has been found unreliable as a carbon isotope reference material [38]. It remains unclear how to revise the entire network of isotope delta reference materials as a result of these observations other than by measuring their absolute isotopic composition.

Fig. 5. – A network of relationships between the international carbon isotope delta reference materials related to the NRC sugar reference material BEET-1. How do the revisions of some of these materials affect BEET-1? Red color indicates the links from the scale-defining material NBS 19 whereas blue color shows all other reference materials.

10.2. Arsenobetaine. – In 2007, a CCQM interlaboratory comparison was conducted to assess capabilities to determine arsenobetaine (AsBet) in fish tissues. During this study (CCQM-P96), discordant results among the five participants were observed and doubts were raised over the certified value of AsBet in primary reference materials BCR-626 (IRMM, Belgium) and NMIJ-7901a (NMIJ, Japan), both of which were used as primary calibrators for AsBet measurements and both of which shared a common traceability link via BCR-626. As a result of this study, the certified value for AsBet in NMIJ-7901a was revised by 20% in 2009, whereas for AsBet in BCR-626 only the expanded uncertainty was subsequently revised, from 0.6% to 7% [39, 40]. Thus we witnessed vastly different reactions to identical changes in the traceability chain.

Measurement results often depend on a variety of additional input quantities that might undergo changes. For example, one of the largest uncertainty components in the NIST benzoic acid standard SRM 350b (2005) was the standard atomic weight of carbon. While IUPAC revised the latter in 2009, SRM 350b was not revised accordingly. It remains unclear how best to capture such changes or if end-users are “allowed” to revise the certified values on their own. Consequently, it is important that traceability statements be scrutinized to ensure they reflect the measurement practices.

11. – Summary

This lecture highlights the fact that a plurality of reasonable choices in data analysis and interpretation can have a significant impact on the results of chemical measurements, including the comparability of Certified Reference Materials. Chemical measurements require a rich background in analytical chemistry and the ability to evaluate the underlying mathematical models and assumptions, which are often overlooked. The challenge is for analysts to excel in both and to recognize that measurement models are crucial for a better understanding of measurement results.

REFERENCES

[1] Tal E., Stanford Encyclopedia of Philosophy (Stanford University) 2015, Ch. Measurement in Science.
URL https://plato.stanford.edu/entries/measurement-science/

[2] Lewis H. A., PRIMUS, 25 (2014) 181.
URL https://doi.org/10.1080/10511970.2014.928657

[3] Bindoff N. et al., Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change (Cambridge University Press, Cambridge) 2007.

[4] Hydes D. et al., IOCCP Report No. 14 (ICPO Publication Series No. 134) 2010, Ch. Recommendations for the Determination of Nutrients in Seawater to High Levels of Precision and Inter-Comparability using Continuous Flow Analysers.
URL https://www.go-ship.org/Manual/Hydes et al Nutrients.pdf

[5] Tomaszewski C., J. Med. Toxicol., 3 (2007) 87.
URL https://doi.org/10.1007/bf03160915

[6] Noonan K., Nat. Rev. Drug Discov., 14 (2015) 157.
URL https://doi.org/10.1038/nrd4564

[7] Meija J., Anal. Bioanal. Chem., 409 (2017) 2497.
URL https://doi.org/10.1007/s00216-017-0210-4

[8] Kruschke J. K., Trends Cognitive Sci., 14 (2010) 293.
URL https://doi.org/10.1016/j.tics.2010.05.001

[9] Possolo A. and Pintar A. L., Metrologia, 54 (2017) 617.
URL https://doi.org/10.1088/1681-7575/aa7e4a

[10] Silberzahn R. et al., Adv. Methods Practices Psychol. Sci., 1 (2018) 337.
URL https://doi.org/10.1177/2515245917747646

[11] Henrion M. and Fischhoff B., Am. J. Phys., 54 (1986) 791.
URL https://doi.org/10.1119/1.14447

[12] Bailey D. C., R. Soc. Open Sci., 4 (2017) 160600.
URL https://doi.org/10.1098/rsos.160600

[13] Thompson M. and Ellison S. L. R., Accredit. Quality Assur., 16 (2011) 483.
URL https://doi.org/10.1007/s00769-011-0803-0

[14] Kuselman I. and Pennecchi F., Pure Appl. Chem., 88 (2016) 477.
URL https://doi.org/10.1515/pac-2015-1101

[15] Emerson J. W., Seltzer M. and Lin D., Am. Statist., 63 (2009) 124.

[16] Efron B., Ann. Statist., 7 (1979) 1.
URL https://doi.org/10.1214/aos/1176344552

[17] Blackstone E. H., Seminars Thoracic Cardiovasc. Surgery: Pediatric Cardiac Surgery Annual, 7 (2004) 192.
URL https://doi.org/10.1053/j.pcsu.2004.02.002

[18] Meija J., Michalowska-Kaczmarczyk A. M. and Michalowski T., Anal. Bioanal. Chem., 408 (2016) 4469.
URL https://doi.org/10.1007/s00216-016-9555-3

[19] Forigua D. A. A. and Meija J., Anal. Bioanal. Chem., 411 (2019) 3705.
URL https://doi.org/10.1007/s00216-019-01800-7

[20] Morales D. A., J. Chemometrics, 16 (2002) 247.
URL https://doi.org/10.1002/cem.719

[21] Michalowska-Kaczmarczyk A. M. and Michalowski T., Anal. Bioanal. Chem., 407 (2015) 4877.

[22] Izquierdo A., Guasch J., Ferre M. and Rius F., Polyhedron, 5 (1986) 1007.
URL https://doi.org/10.1016/s0277-5387(00)80143-6

[23] Meija J. and Bisenieks J., Anal. Bioanal. Chem., 389 (2007) 1301.
URL https://doi.org/10.1007/s00216-007-1566-7

[24] Kelly W. R., Pratt K. W., Guthrie W. F. and Martin K. R., Anal. Bioanal. Chem., 400 (2011) 1805.
URL https://doi.org/10.1007/s00216-011-4908-4

[25] Marsaglia G. et al., J. Stat. Software, 16 (2006) 1.

[26] Asuero A. G. and González G., Crit. Rev. Anal. Chem., 37 (2007) 143.
URL https://doi.org/10.1080/10408340701244615

[27] Raposo F., Trends Anal. Chem., 77 (2016) 167.
URL https://doi.org/10.1016/j.trac.2015.12.006

[28] Meija J. and Mester Z., Anal. Chim. Acta, 607 (2008) 115.
URL https://doi.org/10.1016/j.aca.2007.11.050

[29] Pagliano E., Mester Z. and Meija J., Anal. Bioanal. Chem., 405 (2013) 2879.
URL https://doi.org/10.1007/s00216-013-6724-5

[30] Pagliano E., Mester Z. and Meija J., Anal. Chim. Acta, 896 (2015) 63.
URL https://doi.org/10.1016/j.aca.2015.09.020

[31] Possolo A., Merkatas C. and Bodnar O., Metrologia, 56 (2019) 045009.
URL https://doi.org/10.1088/1681-7575/ab2a8d

[32] Hibbert D. B., Accredit. Quality Assur., 11 (2006) 543.
URL https://doi.org/10.1007/s00769-006-0177-x

[33] Bièvre P. D., Dybkær R., Fajgelj A. and Hibbert D. B., Pure Appl. Chem., 83 (2011) 1873.
URL https://doi.org/10.1351/pac-rep-07-09-39

[34] Brand W. A. and Coplen T. B., Isotopes Environ. Health Studies, 48 (2012) 393.
URL https://doi.org/10.1080/10256016.2012.666977

[35] Dunn P. J., Malinovsky D. and Goenaga-Infante H., Rapid Commun. Mass Spectrom., 34 (2020) e8711.
URL https://doi.org/10.1002/rcm.8711

[36] Schimmelmann A., Qi H., Coplen T. B., Brand W. A., Fong J., Meier-Augenstein W., Kemp H. F., Toman B., Ackermann A., Assonov S., Aerts-Bijma A. T., Brejcha R., Chikaraishi Y., Darwish T., Elsner M., Gehre M., Geilmann H., Gröning M., Hélie J.-F., Herrero-Martín S., Meijer H. A. J., Sauer P. E., Sessions A. L. and Werner R. A., Anal. Chem., 88 (2016) 4294.
URL https://doi.org/10.1021/acs.analchem.5b04392

[37] Chartrand M. M., Meija J., Kumkrong P. and Mester Z., Rapid Commun. Mass Spectrom., 33 (2019) 272.
URL https://doi.org/10.1002/rcm.8357

[38] Assonov S., Rapid Commun. Mass Spectrom., 32 (2018) 827.
URL https://doi.org/10.1002/rcm.8102

[39] Miura T., Chiba K., Kuroiwa T., Narukawa T., Hioki A. and Matsue H., Talanta, 82 (2010) 1143.
URL https://doi.org/10.1016/j.talanta.2010.06.024

[40] Kumkrong P., Thiensong B., Le P. M., McRae G., Windust A., Deawtong S., Meija J., Maxwell P., Yang L. and Mester Z., Anal. Chim. Acta, 943 (2016) 41.
URL https://doi.org/10.1016/j.aca.2016.09.031


Table I. – What causes errors in chemical analysis? Example of seawater nutrient analysis [4].
Table II. – The choice of measurement model can have a significant effect on the interpretation of the results.
Table III. – Excerpt of figure skating score card (Alina Zagitova, 2018 Winter Olympics).
Fig. 2. – Probabilistic interpretation of the ladies single figure skating medal scores at the 2018 Winter Olympics