5. BASIC STATISTICAL TOOLS FOR THE ANALYTICAL CHEMIST

5.7. Student’s t distribution

We introduce our topic by considering the following problem. An experiment has been carried out to evaluate a new analytical method to determine arsenic in seafood as part of the control of export products. The maximum allowed amount of As in the commodity is 0.5 mg/kg. Six independent determinations were made, with the following results: 0.46, 0.61, 0.52, 0.48, 0.57 and 0.54 mg/kg. Do the six measurements present sufficient evidence to indicate that the average mass fraction exceeds 0.5 mg/kg?

The distribution of the test statistic

t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \qquad (15)

for samples drawn from a normally distributed population was discovered by W.S. Gosset and published (1908) under the pen name of Student. He referred to the quantity under study as t, and it has ever since been known as Student's t. We omit the complicated mathematical expression for the density function of t but describe some of its characteristics.

The distribution of t in repeated sampling is, like that of z, bell-shaped and perfectly symmetrical about t = 0. Unlike z, it is much more variable, tailing out to the right and left, a phenomenon that may readily be explained.

The variability of z in repeated sampling is due solely to x̄; the other quantities appearing in z (n and σ) are non-random. On the other hand, the variability of t is contributed by two random quantities, x̄ and s, which can be shown to be independent of one another. Thus, when x̄ is very large, s may be very small, and vice versa. As a result, t is more variable than z in repeated sampling. Finally, as we might expect, the variability of t decreases as n increases, because the estimate s of σ is based on a larger and larger sample. When n is infinitely large, the t and z distributions are identical. Thus, Gosset discovered that the distribution of t depends upon the sample size, n.

The divisor of the sum of squares of deviations, (n − 1), which appears in the formula for s², is called the number of degrees of freedom associated with s². The origin of the term "degrees of freedom" is linked to the statistical theory underlying the probability distribution of s². One may say that the test statistic t is based upon a sample of n measurements, or that it possesses (n − 1) degrees of freedom.

The critical values of t, which separate the rejection and acceptance regions for the statistical test, are presented in Table XV. The tabulated value t_α is the value of t such that an area α lies to its right. The degrees of freedom associated with s², d.f., are shown in the first and last columns of Table XV, and the values of t_α corresponding to various values of α appear in the top row. Thus, if we wish to find the value of t such that 5% of the area lies to its right, we would use the column marked t₀.₀₅. The critical value of t for our example, found in the t₀.₀₅ column opposite d.f. = (n − 1) = (6 − 1) = 5, is t = 2.015. Thus, we would reject H₀: µ = 0.5 when t > 2.015.

The reason for choosing n = 30 as the dividing line between large and small samples is apparent. For n = 30 (d.f. = 29), the critical value t₀.₀₅ = 1.699 is numerically quite close to z₀.₀₅ = 1.645. For a two-tailed test based upon n = 30 measurements and α = 0.05, we would place 0.025 in each tail of the t distribution and reject H₀: µ = µ₀ when t > 2.045 or t < −2.045. Note that this is very close to the z₀.₀₂₅ = 1.96 employed in the z test.
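This convergence of t towards z is easy to verify numerically. The following is a minimal sketch in Python (the use of scipy is our assumption, not part of the original text) that reproduces the one-sided 5% critical values quoted above:

```python
from scipy import stats

# Upper 5% critical value of the standard normal distribution
z_crit = stats.norm.ppf(0.95)  # 1.645

# Upper 5% critical values of Student's t for increasing degrees of freedom:
# t approaches z as the sample size (hence d.f.) grows.
for df in (2, 5, 10, 29, 100, 1000):
    t_crit = stats.t.ppf(0.95, df)
    print(f"d.f. = {df:4d}: t_0.05 = {t_crit:.3f}  (z_0.05 = {z_crit:.3f})")
```

For d.f. = 5 this prints 2.015 and for d.f. = 29 it prints 1.699, the values used in this section.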

It is important to note that Student's t and the corresponding tabulated critical values are based upon the assumption that the sampled population possesses a normal probability distribution. This is indeed a very restrictive assumption because, in many sampling situations, the properties of the population will be completely unknown and may well be non-normal. If this were to seriously affect the distribution of the t statistic, the application of the t test would be very limited.

Fortunately, this point is of little consequence, as it can be shown that the distribution of the t statistic is relatively stable for populations that are not normally distributed but possess a bell-shaped probability distribution. This property of the t statistic, and the common occurrence of bell-shaped distributions of data in nature, enhance the value of Student's t for use in statistical inference.

One would note that x̄ and s² must be independent (in a probabilistic sense) in order that the quantity below (Equation 16) exhibit a t distribution in repeated sampling. As mentioned previously, this requirement is automatically satisfied when the sample has been randomly drawn from a normal population.

\frac{\bar{x} - \mu}{s / \sqrt{n}} \qquad (16)

Having discussed the origin of Student's t and the tabulated critical values (Table XV), we now return to the problem of making an inference about the mean mass fraction of As in our seafood, based upon our n = 6 measurements.

The statistical test of a hypothesis concerning a population mean may be stated as follows:

Null hypothesis, H₀: µ = µ₀

Alternative hypothesis, H₁: specified by the experimenter, depending upon the alternative values of µ he wishes to detect.

Test statistic:

t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \qquad (17)

To apply this test to the data, we must first calculate the sample mean, x̄, and the standard deviation, s. The latter quantity is calculated using the formula explained before.

Mean, x̄ = 0.53 mg/kg (18)

Standard deviation, s = 0.0559 mg/kg (19)

Remember that we wish to test the null hypothesis that the mean mass fraction of As is not significantly different from 0.5 mg/kg against the alternative hypothesis that it is greater than 0.5 mg/kg. The elements of the test as defined above are then:

H₀: µ = 0.5 mg/kg (20)

Test statistic:

t = \frac{(0.53 - 0.5)\sqrt{6}}{0.0559} = 1.31 \qquad (21)

The rejection region for H₀, for α = 0.05 and (n − 1) = (6 − 1) = 5 degrees of freedom, is t > 2.015.

The calculated value of the test statistic does not fall in the rejection region. Therefore, we do not reject H₀. This implies that the data do not present sufficient evidence to indicate that the mean mass fraction of As in the sample exceeds 0.5 mg/kg.
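For readers who wish to reproduce this worked example, here is a minimal sketch in Python (the use of numpy and scipy is our assumption, not part of the original text):

```python
import numpy as np
from scipy import stats

# The six As determinations (mg/kg) from the worked example
x = np.array([0.46, 0.61, 0.52, 0.48, 0.57, 0.54])
mu0 = 0.5  # maximum allowed mass fraction (mg/kg)

# Test statistic t = (x_bar - mu0) / (s / sqrt(n)), Equation (17)
n = len(x)
t_calc = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(n))  # ddof=1 gives the sample s

# One-sided critical value for alpha = 0.05 and 5 degrees of freedom
t_crit = stats.t.ppf(0.95, df=n - 1)
# compare Equation (21): t = 1.31 (computed there with rounded mean and s) vs 2.015
print(f"t_calc = {t_calc:.2f}, t_crit = {t_crit:.3f}")

# The same one-sided test in a single call (requires scipy >= 1.6)
t_stat, p_value = stats.ttest_1samp(x, mu0, alternative='greater')
print(f"p = {p_value:.3f}")  # p > 0.05, so H0 is not rejected
```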

When performing hypothesis tests, one finds that there are two types of approaches to the problem, depending on how it is presented: a one-sided test and a two-sided test.

5.7.1. One sided test

A one-sided test is a statistical hypothesis test in which the values for which we can reject the null hypothesis, H₀, are located entirely in one tail of the probability distribution. In other words, the critical region for a one-sided test is the set of values smaller than the critical value of the test, or the set of values greater than the critical value of the test. A one-sided test is also referred to as a one-tailed test of significance.

5.7.2. Two sided test

A two-sided test is a statistical hypothesis test in which the values for which we can reject the null hypothesis, H₀, are located in both tails of the probability distribution. In other words, the critical region for a two-sided test is the set of values smaller than a first critical value of the test together with the set of values greater than a second critical value of the test. A two-sided test is also referred to as a two-tailed test of significance.

The choice between a one-sided test and a two-sided test is determined by the purpose of the investigation.

As an example, let us suppose we want to test a manufacturer's claim that there are, on average, 50 matches in a box. We could set up the following hypotheses:

H₀: µ = 50 against H₁: µ < 50 or H₁: µ > 50 (22)

Either of these two alternative hypotheses would lead to a one-sided test. Presumably, we would want to test the null hypothesis against the first alternative hypothesis, since it would be useful to know whether there are likely to be fewer than 50 matches, on average, in a box (no one would complain if they get the correct number of matches in a box, or more).

Another alternative hypothesis could be tested against the same null hypothesis, leading this time to a two-sided test:

H₀: µ = 50 against H₁: µ ≠ 50 (23)

That is, nothing specific can be said about the average number of matches in a box; only that, if we could reject the null hypothesis in our test, we would know that the average number of matches in a box is likely to be less than or greater than 50.

Hypothesis testing can also be performed to establish the properties of one given sample, or to relate or compare two samples.

5.7.3. One sample t-test

A one sample t-test is a hypothesis test for answering questions about the mean where the data are a random sample of independent observations from a normally distributed population.

The null hypothesis for the one sample t-test is: H₀: µ = µ₀ (where µ₀ is known)

That is, the sample has been drawn from a population of a given mean and unknown variance (which therefore has to be estimated from the sample).

This null hypothesis, Ho is tested against one of the following alternative hypotheses, depending on the question posed:

– H₁: µ ≠ µ₀

– H₁: µ > µ₀

– H₁: µ < µ₀
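If scipy is available (our assumption, version ≥ 1.6, not part of the original text), these three alternatives map directly onto the `alternative` argument of `ttest_1samp`. A minimal sketch, reusing the As data from above:

```python
from scipy import stats

data = [0.46, 0.61, 0.52, 0.48, 0.57, 0.54]  # As data (mg/kg) from above
mu0 = 0.5

# The three possible alternative hypotheses and the matching scipy keyword.
# The t statistic is the same in each case; only the p-value changes.
for h1, keyword in [("mu != mu0", "two-sided"),
                    ("mu >  mu0", "greater"),
                    ("mu <  mu0", "less")]:
    t_stat, p_value = stats.ttest_1samp(data, mu0, alternative=keyword)
    print(f"H1: {h1}  ->  t = {t_stat:.2f}, p = {p_value:.3f}")
```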

5.7.4. Two sample t-test

A two sample t-test is a hypothesis test for answering questions about the mean where the data are collected from two random samples of independent observations, each from a normally distributed population.

When carrying out a two sample t-test, it is usual to assume that the variances for the two populations are equal, that is:

σ₁² = σ₂² (24)

The null hypothesis for the two sample t-test is:

H₀: µ₁ = µ₂ (25)

That is, the two samples have both been drawn from the same population.

This null hypothesis is tested against one of the following alternative hypotheses, depending on the question to be answered.

– H₁: µ₁ ≠ µ₂

– H₁: µ₁ > µ₂

– H₁: µ₁ < µ₂

To illustrate the concepts mentioned above, here are some examples.

5.7.5. Testing the mean against a given value

This is the case when validating an analytical method, or when comparing the results from a routine analytical method with the value established for that analyte in an RM or a QCM.

We will calculate t using the equation

t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \qquad (26)

where x̄ is the mean of your data, µ is the given (certified or reference) value, n is the number of measurements and s is the standard deviation of your data. Let's suppose that we have analysed an RM for Cu and found the following results: 10.5, 11.0, 10.0, 10.8 and 10.4 mg/kg. The certified mass fraction in the RM is 10.0 mg/kg. Our question would be whether there is evidence, at the 95% confidence level, of any significant difference between the mean and the reference value.

The procedure is to calculate t from the above equation and then compare this value with the tabulated t value at the chosen confidence level and for (n − 1) degrees of freedom. If t_calc is lower than the tabulated t, we do not reject H₀; thus, there is no significant difference between the two values.

H₀: do the results come from a population with mean µ = 10.00 mg/kg?

µ = 10.00 mg/kg

x̄ = 10.54 mg/kg, s = 0.385, t_calc = (10.54 − 10.00)√5 / 0.385 = 3.14

t(tab, 0.05, 4) = 2.78, so t_calc > t(tab, 0.05, 4); therefore, there is a significant difference between the mean and the reference value at the 95% confidence level.
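A minimal Python sketch of this check (numpy and scipy assumed available, as before):

```python
import numpy as np
from scipy import stats

cu = np.array([10.5, 11.0, 10.0, 10.8, 10.4])  # Cu results for the RM (mg/kg)
mu0 = 10.0                                     # certified reference value (mg/kg)

n = len(cu)
t_calc = (cu.mean() - mu0) / (cu.std(ddof=1) / np.sqrt(n))
t_tab = stats.t.ppf(1 - 0.05 / 2, df=n - 1)    # two-sided, 95% confidence level

print(f"t_calc = {t_calc:.2f}, t_tab = {t_tab:.2f}")  # 3.14 vs 2.78
if abs(t_calc) > t_tab:
    print("significant difference between mean and reference value")
else:
    print("no significant difference")
```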

5.7.6. Testing two means

In this example we will suppose that two samples have been analysed by the same method. We can test whether the means are significantly different by a t-test.

We will assume that the standard deviations of the two sets are not significantly different. Thus, H₀ is that there is no significant difference between the means, i.e., the difference between the means should be zero:

µ₁ = µ₂ (27)

Proceed as follows:

(1) Calculate the mean and standard deviation of each set.

(2) Calculate the pooled standard deviation using the following equation:

s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \qquad (28)

(3) Calculate t using:

t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \qquad (29)

As an example, let’s supposed that two replicates from one sample have been analysed by two methods and we want to test if the means are significantly different by a t-test:

With method A: mean = 28.0, standard deviation = 0.3, n = 10

With method B: mean = 26.25, standard deviation = 0.23, n = 10

The results are the following:

Pooled standard deviation (Equation 28):

s_p^2 = \frac{9 \times 0.3^2 + 9 \times 0.23^2}{18} = 0.0715, \qquad s_p = 0.267

Test statistic (Equation 29):

t = \frac{28.0 - 26.25}{0.267 \sqrt{\frac{1}{10} + \frac{1}{10}}} = 14.7 \qquad (30)

t(tab, 0.05, 18) = 2.1

Therefore, since t_calc > t_tab, the null hypothesis is rejected, which means that the results from the two methods are significantly different.
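Since only summary statistics are quoted in this example, the calculation can be reproduced with scipy's `ttest_ind_from_stats` (our choice of tool, assuming equal variances as stated above); a minimal sketch:

```python
from math import sqrt
from scipy import stats

# Summary statistics from the example above
m1, s1, n1 = 28.0, 0.3, 10    # method A
m2, s2, n2 = 26.25, 0.23, 10  # method B

# Pooled standard deviation, Equation (28)
sp = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

# Test statistic, Equation (29)
t_calc = (m1 - m2) / (sp * sqrt(1.0 / n1 + 1.0 / n2))
t_tab = stats.t.ppf(1 - 0.05 / 2, df=n1 + n2 - 2)
print(f"sp = {sp:.3f}, t_calc = {t_calc:.1f}, t_tab = {t_tab:.1f}")
# sp = 0.267, t_calc = 14.6 (14.7 in the text, where sp is first rounded), t_tab = 2.1

# The same test computed directly from the summary statistics
t_stat, p_value = stats.ttest_ind_from_stats(m1, s1, n1, m2, s2, n2, equal_var=True)
print(f"scipy: t = {t_stat:.1f}, p = {p_value:.2e}")  # p << 0.05 -> reject H0
```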