
Exercise 1. (Tutorial for lesson page 6)

8 different letters have to be drawn from our alphabet. The “success” letters are vowels. The random variable X is the number of successes among the 8 drawn letters.

a. Justify that the distribution of X is Hypergeometric and give its parameters.

* A letter is a vowel (success) or a consonant. N = 26, a = 6.

* 8 letters are to be drawn (n = 8) without putting back.

* After 8 draws, X is the number of successes.

The probability distribution of X is then H(8 ; 6 ; 26).

b. Calculate p(X = 0), p(X = 3), p(X = 8).

$p(X = 0) = \dfrac{C_{6}^{0} \times C_{20}^{8}}{C_{26}^{8}} \approx 0.0806$ ; $p(X = 4) = \dfrac{C_{6}^{4} \times C_{20}^{4}}{C_{26}^{8}} \approx 0.0465$ ; $X = 8$ is impossible.
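As a cross-check of these values, here is a minimal Python sketch of the hypergeometric probability. The helper name `hypergeom_pmf` is an assumption made for illustration; it is not part of the original exercise.

```python
from math import comb

def hypergeom_pmf(k, n, a, N):
    """P(X = k): k successes in n draws without replacement,
    from N items of which a are successes (hypothetical helper)."""
    return comb(a, k) * comb(N - a, n - k) / comb(N, n)

# 8 letters drawn from 26, of which 6 are vowels
for k in (0, 4, 8):
    print(k, round(hypergeom_pmf(k, n=8, a=6, N=26), 4))
```

`math.comb(6, 8)` returns 0, so the impossible case k = 8 needs no special handling.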

c. Calculate the expectation and the standard deviation of X, then comment the expectation.

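$E(X) = n \times \dfrac{a}{N} = 8 \times \dfrac{6}{26} \approx 1.846$ ; $V(X) = n \times \dfrac{a}{N} \times \dfrac{N - a}{N} \times \dfrac{N - n}{N - 1} = 8 \times \dfrac{6}{26} \times \dfrac{20}{26} \times \dfrac{18}{25} \approx 1.0225$.

$\sigma(X) = \sqrt{V(X)} \approx 1.0112$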

The expectation suggests that two vowels is the most likely number of successes. On average:

1.846 successes in 8 letters.
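The formulas for the mean and variance, and the claim about the most likely value, can be checked numerically. This is only a sketch: the variable names are illustrative assumptions.

```python
from math import comb, sqrt

n, a, N = 8, 6, 26  # draws, successes in the population, population size

mean = n * a / N
variance = n * (a / N) * ((N - a) / N) * ((N - n) / (N - 1))
sigma = sqrt(variance)

# Full distribution: X can only take the values 0..min(n, a)
pmf = {k: comb(a, k) * comb(N - a, n - k) / comb(N, n)
       for k in range(min(n, a) + 1)}
mode = max(pmf, key=pmf.get)                       # most likely value
mean_from_pmf = sum(k * p for k, p in pmf.items())

print(round(mean, 3), round(variance, 4), round(sigma, 4), mode)
```

Summing k × p(X = k) over the distribution gives the same mean as the closed formula, which is a useful sanity check on both.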

d. Build a bar-chart of this probability distribution.

Exercise 2.

A supermarket sells 24 fruit species, 8 of which carry the “bio” label. A blind control consists in choosing 10 fruits of different species. The variable X gives the number of “bio” species among these 10.

1. Give, with explanations, the probability law of X.

* A species is bio (success) or not. N = 24, a = 8.

* 10 species are to be drawn (n = 10) without putting back (10 different species).

* After 10 draws, X is the number of successes.

The probability distribution of X is then H(10 ; 8 ; 24).

2. Calculate its expected value and standard deviation.

E(X) = 10×8/24 = 3.333 bio species.

V(X) = 10×8/24×16/24×14/23 = 1.3527. σ(X) = 1.163 bio species.

3. What is the probability that less than two “bio” species would be chosen?

p(X < 2) = p(X = 0) + p(X = 1) = 0.0041 + 0.0467 = 0.0508.


Exercise 3.

Three light bulbs are taken at random from a batch of 15 at the same time, 5 of which are defective. Calculate the probability of events:

A : at least one light bulb is defective B : the three are defective C : exactly one is defective

* A bulb lights up (failure) or does not (success). N = 15, a = 5.

* 3 bulbs are to be drawn (n = 3) without putting back.

* Among the 3 bulbs, X is the number of successes. The law of X is then H(3 ; 5 ; 15).

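$p(A) = p(X > 0) = 1 - p(X = 0) = 1 - \dfrac{C_{5}^{0} \times C_{10}^{3}}{C_{15}^{3}} = 1 - \dfrac{120}{455} = \dfrac{335}{455} \approx 0.7363$

$p(B) = p(X = 3) = \dfrac{C_{5}^{3} \times C_{10}^{0}}{C_{15}^{3}} = \dfrac{10}{455} \approx 0.02198$

$p(C) = p(X = 1) = \dfrac{C_{5}^{1} \times C_{10}^{2}}{C_{15}^{3}} = \dfrac{225}{455} \approx 0.4945$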
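The three probabilities can also be obtained as exact fractions, which avoids any rounding doubt. A small sketch using Python's `fractions` module; the helper `pmf` is a hypothetical name.

```python
from fractions import Fraction
from math import comb

def pmf(k, n=3, a=5, N=15):
    """Exact hypergeometric probability P(X = k) as a fraction."""
    return Fraction(comb(a, k) * comb(N - a, n - k), comb(N, n))

p_A = 1 - pmf(0)   # at least one defective bulb
p_B = pmf(3)       # all three defective
p_C = pmf(1)       # exactly one defective

for name, p in (("A", p_A), ("B", p_B), ("C", p_C)):
    print(name, p, float(p))
```

`Fraction` reduces automatically, so 335/455 is displayed in lowest terms.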

Exercise 4.

The oral examination consists of a total of 100 subjects; the candidates randomly select three subjects and then choose the subject to be covered from these three subjects. A candidate comes, having studied only 60 of the 100 possible topics. What is the probability that this candidate actually studied:

a. the three randomly chosen subjects? b. exactly two of the three? c. none of the three?

* A subject has been studied (success) or not. N = 100, a = 60.

* 3 subjects are to be drawn (n = 3) without putting back.

* Among the 3 subjects, X is the number of studied ones. The law of X is then H(3 ; 60 ; 100).

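a. $p(X = 3) = \dfrac{C_{60}^{3} \times C_{40}^{0}}{C_{100}^{3}} = \dfrac{34220}{161700} \approx 0.2116$

b. $p(X = 2) = \dfrac{C_{60}^{2} \times C_{40}^{1}}{C_{100}^{3}} = \dfrac{70800}{161700} \approx 0.4378$

c. $p(X = 0) = \dfrac{C_{60}^{0} \times C_{40}^{3}}{C_{100}^{3}} = \dfrac{9880}{161700} \approx 0.0611$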

Exercise 5.

A wheel (roulette) is divided into 26 equal sectors: 6 sectors are white and the others are red. After spinning the wheel, a success is: "it stops on a white sector". The random variable X gives the total number of successes after 8 successive attempts.

a. Explain why the law of X is binomial and give its parameters.

* The wheel will stop on a white sector (success) or on a red one.

* For each of the 8 attempts (n = 8), the probability of success is constant: p = 6/26.

* X is the number of successes among the 8 attempts. The law of X is then B(8 ; 6/26).

b. Calculate p(X = 2). On your calculator, list the probabilities of each possible value of X.

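Formula: $p(X = k) = C_{n}^{k} \times p^{k} \times q^{n-k}$ ; here: $p(X = k) = C_{8}^{k} \times \left(\dfrac{6}{26}\right)^{k} \times \left(\dfrac{20}{26}\right)^{8-k}$ ;

so: $p(X = 2) = C_{8}^{2} \times \left(\dfrac{6}{26}\right)^{2} \times \left(\dfrac{20}{26}\right)^{6} \approx 0.3089$.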

On the calculator, for instance using the formula:

Enter the possible values k in List1 : 0, 1, 2, 3, 4, 5, 6, 7, 8

Enter the formula in List2 : 8CList1*(6/26)^List1*(20/26)^(8-List1)

Of course, we can instead use the "binomial" tool of the calculator.
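For readers without such a calculator, the same list can be built in Python. This is a sketch that mirrors the List1/List2 idea; the variable names are illustrative.

```python
from math import comb

n, p = 8, 6 / 26  # number of attempts, probability of a white sector

# Equivalent of List2: p(X = k) for each k of List1 = 0, 1, ..., 8
probabilities = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

for k, prob in enumerate(probabilities):
    print(k, round(prob, 4))
```

The probabilities sum to 1, and the largest one is obtained for k = 2.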

c. Calculate the expected value and the standard deviation of X, then comment these values.

E(X) = np = 8×6/26 ≈ 1.846 ; σ(X) = √(npq) ≈ 1.192

In the long run, we can expect 1.846 successes on average every eight attempts. The event "2 successes" is then the most likely.

By adding and subtracting the standard deviation to the mean, we can build an interval that contain the


d. Graph (sticks) these results.

Exercise 6.

A car driver meets five traffic lights on his way. They all have the same cycle: 40 seconds green and 20 seconds red. These lights are not synchronized, so that the color of one light is independent of the color of another one.

1. At the moment you come to the first light, what’s the probability it would be green?

The light is green during the two thirds of each minute.

At any moment, the probability it's green is p = 2/3.

2. What’s the probability that the lights would all have been green on crossing them?

The same experiment is conducted 5 times. Each time, the probability of success is p, 2/3.

X is the random variable that counts the number of successes (green lights) after 5 attempts.

Then, the law of X is B(5 ; 2/3) and

$p(X = 5) = C_{5}^{5} \times \left(\dfrac{2}{3}\right)^{5} \times \left(\dfrac{1}{3}\right)^{0} = \left(\dfrac{2}{3}\right)^{5} \approx 0.1317$

3. What’s the probability that at least two lights would have been red?

$p(X = 4) = C_{5}^{4} \times \left(\dfrac{2}{3}\right)^{4} \times \left(\dfrac{1}{3}\right)^{1} = 5 \times \left(\dfrac{2}{3}\right)^{4} \times \dfrac{1}{3} \approx 0.3292$

$p(X \le 3) = 1 - p(X = 5) - p(X = 4) \approx 0.5391$

4. What’s the mean expected number of green lights driving this way?

E(X) = np = 5×2/3 = 3.333 green lights, on average.

Exercise 7.

The germination capacity of a seed is 0.8 (probability to germinate).

1. 8 seeds are sown. Calculate the probabilities of the following events : a. exactly 5 seeds will germinate.

The same experiment is repeated eight times, identically: p = 0.8. Let X be the number of germinated seeds, among the eight. Then, the law of X is B(8 ; 0.8).

$p(X = 5) = C_{8}^{5} \times 0.8^{5} \times 0.2^{3} \approx 0.1468$

b. At least 7 seeds will germinate.

$p(X \ge 7) = C_{8}^{7} \times 0.8^{7} \times 0.2^{1} + C_{8}^{8} \times 0.8^{8} \times 0.2^{0} \approx 0.5033$

2. When a seed has germinated, the probability that a slug eats the young plant is 0.4.

a. Calculate the probability that a seed will finally become a grown plant.

Considering a seed, we could build a tree in which only one path is to be taken, containing the probability that it germinates (0.8) and then the probability that slugs don't eat it, given that it germinated (0.6). The probability that it will become a grown plant is 0.8×0.6 = 0.48.

b. How many seeds must be sown to get more than 99% chances of getting at least one grown plant?

The random variable Y counts the number of grown plants (probability of success: 0.48) among n seeds.

The law of Y is B(n ; 0.48).

The complementary event (of this question) is simple: p(Y = 0) must be less than 0.01.

$p(Y = 0) < 0.01 \Leftrightarrow C_{n}^{0} \times 0.48^{0} \times 0.52^{n} < 0.01 \Leftrightarrow 1 \times 1 \times 0.52^{n} < 0.01 \Leftrightarrow n > \dfrac{\ln 0.01}{\ln 0.52}$

We must have n > 7.042: at least 8 seeds have to be sown.
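The threshold can be computed and then verified directly. A short sketch, with illustrative variable names:

```python
from math import floor, log

p_grown = 0.8 * 0.6                        # germinates, then survives the slugs
threshold = log(0.01) / log(1 - p_grown)   # n must be strictly greater than this
n_min = floor(threshold) + 1               # smallest integer above the threshold

print(round(threshold, 3), n_min)
# Direct check: with n_min seeds, p(Y = 0) is below 1%; with one seed fewer it is not
print((1 - p_grown) ** n_min < 0.01, (1 - p_grown) ** (n_min - 1) < 0.01)
```

Dividing by ln(0.52), which is negative, is what reverses the inequality in the derivation above.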

Exercise 8.

$p(X \ge 8) = p(X = 8) + p(X = 9) + p(X = 10) = 0.3020 + C_{10}^{9} \times 0.8^{9} \times 0.2^{1} + 0.1074 \approx 0.6778$

Exercise 9.

6% of French people are clients of the mobile phone operator “Yellow”. A survey consists in asking 50 French people, chosen at random, which mobile phone operator they use. The variable X gives the number of “Yellow” clients among these 50 people.

1. a. Justify and give the probability law of X.

* An individual is a client (success) or isn't.

* 50 individuals are to be interviewed, without putting back.

* X counts the successes among the 50. The law of X is then H(50 ; a ; N)… but we don’t know N.

A hypothesis has to be made: the French population is bigger than 20 times 50 (and we know it’s true).

The probability of success is then considered as invariable (p = 0.06) and we can use B(50 ; 0.06).

b. What are the chances that the proportion of clients in the sample would be the same as in the population?

6% of 50 people = 3 people. p(X = 3) = 0.2311

c. What is special about this probability?

This is the greatest probability, because E(X) = 50×0.06 = 3.

d. What’s the probability that none of the 50 people would be a “Yellow” client?

p(X = 0) = 0.04533

e. What’s the probability that there would be at least 4 “Yellow” clients?

p(X ≥ 4) = 1 – p(X ≤ 3) = 1 – 0.6473 = 0.3527

2. In this part, the number of individuals to be called is still unknown. How many people would have to be called, to get more than 99% chances finding at least one “Yellow” client?

p(X = 0) < 0.01 iff 0.94ⁿ < 0.01 iff n > ln(0.01)/ln(0.94) iff n > 74.43

At least 75 people have to be called.

The law of the variable X is binomial with parameters n = 50 and p = 0.06.

a. Obtain (calculator's lists) p(X = k) for each integer k from 0 to 7.

Enter the values k in List1 : 0, 1, 2, 3, 4, 5, 6, 7

Enter the following formula in List2 : 50CList1*0.06^List1*0.94^(50-List1)

The "Poisson" tool can be used too.

b. Justify the approximation of this law by a Poisson's one whose parameter has to be


n > 30 ? yes (n = 50) ; p < 0.1 ? yes (p = 0.06) ; np < 10 ? yes (np = 3).

Hence the existence of a Poisson distribution whose results are all close to reality.

λ = np = 3. This distribution is the law P(3).
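To see how close the two laws are, one can tabulate both for k from 0 to 7. This is a sketch only; `binom_pmf` and `poisson_pmf` are hypothetical helper names.

```python
from math import comb, exp, factorial

n, p = 50, 0.06
lam = n * p  # Poisson parameter λ = np

def binom_pmf(k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k):
    return exp(-lam) * lam**k / factorial(k)

for k in range(8):
    print(k, round(binom_pmf(k), 4), round(poisson_pmf(k), 4))
```

With these parameters, every Poisson value in the table is within 0.01 of the corresponding binomial value.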

c. Give, by using Poisson's law table, the probabilities asked above.

Compare them to the ones obtained with the binomial law.

In the Poisson's law table you can find the opposite series:

_(5).

2. Calculate the probability that…

a. All these people won’t give their name. p(X = 0) ≈ 0.00674

b. At least 5 people will give their name. p(X ≥ 5) = 1 − p(X ≤ 4) ≈ 0.5595

Exercise 13.

A box contains 250 matches. It has been exposed to moisture, so that 20% of the matches won’t light.

Taking 10 matches at random, the variable X gives the number of matches that will light.

1. Demonstrate that X can be described by a binomial law and give its parameters and expected value.

* A match will light (success) or won't.

* 10 matches are to be tested, without putting back.

The law of X is then H(10 ; 200 ; 250).

As 250 is bigger than 20 times 10, we can consider the probability of success as invariable (p = 0.8) and we can use B(10 ; 0.8). E(X) = np = 8.

2. Calculate the following probabilities:

a. No match will light: $p(X = 0) = C_{10}^{0} \times 0.8^{0} \times 0.2^{10} = 0.2^{10} \approx 10^{-7}$

b. They will all light: $p(X = 10) = C_{10}^{10} \times 0.8^{10} \times 0.2^{0} = 0.8^{10} \approx 0.1074$

c. At least 3 won’t light: $p(X < 8) = 1 - p(X > 7) = 1 - p(X = 8) - p(X = 9) - p(X = 10) = 1 - 0.3020 - 0.2684 - 0.1074 \approx 0.3222$

3. a. Calculate the same probabilities, this time using a Poisson’s law.

λ = E(X) = 8 ; p(X = 0) = 0.00034 ; p(X = 10) = 0.09926 ;


p(X < 8) = p(X = 0) + p(X = 1) + ... + p(X = 7) = 0.45296

b. Explain the differences between your answers at questions 2 and 3.

Poisson's results are rather inaccurate, because the conditions for approximating a binomial distribution by a Poisson one are not met: we need n > 30, but n = 10; we need p < 0.1, but p = 0.8…

However np < 10 (indeed: np = 8).
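The gap between the two models is easy to reproduce. Here is a minimal sketch comparing p(X < 8) under B(10 ; 0.8) and under a Poisson law with λ = 8; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p = 10, 0.8
lam = n * p  # λ = 8, used although the approximation conditions fail

binom_lt8 = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(8))
poisson_lt8 = sum(exp(-lam) * lam**k / factorial(k) for k in range(8))

print(round(binom_lt8, 4), round(poisson_lt8, 5))
```

A Poisson variable has no upper bound, whereas X cannot exceed 10 here, which is one way to see why the approximation breaks down when p is large.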

Exercise 14.

In a large population, 0.4% of people are blind on average.

1. Into a sample of 100 people, what’s the probability there’s no blind one? at least 2?

* An individual is blind (success) or not.

* 100 people are to be chosen, without putting back.

The law of X is hypergeometric (H), but the population size N is unknown.

A hypothesis has to be made: the population is bigger than 20 times 100. We can then use B(100 ; 0.004).

$p(X = 0) = C_{100}^{0} \times 0.004^{0} \times 0.996^{100} = 0.996^{100} \approx 0.6698$

$p(X \ge 2) = 1 - p(X = 0) - p(X = 1) = 1 - 0.6698 - C_{100}^{1} \times 0.004^{1} \times 0.996^{99} \approx 0.06123$

2. Answer these questions using the correct Poisson’s law (justify its use).

n > 30 ; p < 0.1 ; np = 0.4 < 10. Hence B(100 ; 0.004) is approximated by P(0.4).

According to the Poisson's table (λ = 0.4): p(X = 0) ≈ 0.6703

p(X ≥ 2) = 1 − p(X = 0) − p(X = 1) = 1 − 0.67032 − 0.26813 ≈ 0.06155

These results are very close to the former ones.

1) Let's consider a statistical data set, taken from a large population, that shows a symmetric distribution with most values near the centre. E.g.: many objects were manufactured and weighed. Their theoretical mass is 3.8 kg, and the weights of 200 objects are distributed as follows:

mass (kg)    [3.5 ; 3.7[   [3.7 ; 3.77[   [3.77 ; 3.8[   [3.8 ; 3.83[   [3.83 ; 3.9[   [3.9 ; 4.1[
frequency    9             27             63             60             29             12
freq. rate   0.045         0.135          0.315          0.3            0.145          0.06

Let's graph the frequency histogram of this series: on the x-axis, the variable (mass, kg); on the y-axis, the frequency density (in % of objects per kg). The areas of the rectangles are proportional to the frequencies.

What's the probability, for an object taken randomly, to weigh less than 3.77 kg ?

The frequencies of the corresponding ranges have to be added: 0.045 + 0.135 = 0.18 = 18%.
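The link between frequencies, rectangle heights and areas can be made concrete with a short sketch. The data come from the table above; the variable names are assumptions chosen for illustration.

```python
# (lower bound, upper bound, number of objects) for each mass class, in kg
classes = [(3.5, 3.7, 9), (3.7, 3.77, 27), (3.77, 3.8, 63),
           (3.8, 3.83, 60), (3.83, 3.9, 29), (3.9, 4.1, 12)]

total = sum(count for _, _, count in classes)
rates = [count / total for _, _, count in classes]

# Height of each rectangle = frequency rate / class width (share of objects per kg)
densities = [rate / (hi - lo) for (lo, hi, _), rate in zip(classes, rates)]

# Probability of weighing less than 3.77 kg: sum the rates of the classes below 3.77
p_below_377 = sum(rate for (_, hi, _), rate in zip(classes, rates) if hi <= 3.77)

print(total, [round(d, 3) for d in densities], round(p_below_377, 3))
```

Multiplying each height by its class width gives back the frequency rate, so the total area of the histogram is 1.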

2) Now, the 200 results can be given more precisely, in a greater number of thinner intervals. The new frequency histogram is the following (each point is the middle of the top side of a rectangle):

A bell curve is appearing, typical of numerous distributions in many concrete fields (production, economy, biology, ecology, …).

How could we find, using this histogram, the probability that the mass of an object chosen at random is less than 3.7 kg? Between 3.7 kg and 3.9 kg?

The frequencies of the corresponding ranges have to be added as well, that are the rectangles areas.

3) We could consider weighing far more than 200 pieces, and with far more accurate results. The histogram could contain a large number of rectangles and would become difficult to draw and to read ! The only useful graph would contain only a points cloud, that would actually follow a bell-shaped curve, that could be modelled by a function f. In this context, how would we calculate the probabilities asked above?

This time, the sum of the rectangle areas has to be replaced by the area between the curve and the x-axis, that is to say the integral of the density function between both bounds (desired abscissas).

Give or calculate the probabilities asked below, from the standard normal law table and the available transformations formulas. Then, check your results thanks to the tool "normal law" of your calculator.

in green: what can be found directly in the standard normal law table

« = » in red: reversed problem: p(U > –) = p(U < +) or p(U < –) = p(U > +) = 1 – p(U < +)

p(U < 1) = 0.8413

p(U < 1.96) = 0.9750

p(U < 2.58) = 0.9951

p(U > 1) = 1 – p(U < 1) = 1 – 0.8413 = 0.1587

p(U > 1.63) = 1 – p(U < 1.63) = 1 – 0.9484 = 0.0516

p(U > 0.35) = 1 – p(U < 0.35) = 1 – 0.6368 = 0.3632


p(1 < U < 2) = p(U < 2) – p(U < 1) = 0.9772 – 0.8413 = 0.1359

p(0.42 < U < 1.07) = p(U < 1.07) – p(U < 0.42) = 0.8577 – 0.6628 = 0.1949

p(U < –1) = p(U > 1) = 1 – p(U < 1) = 1 – 0.8413 = 0.1587

p(U < –0.88) = p(U > 0.88) = 1 – p(U < 0.88) = 1 – 0.8106 = 0.1894

p(U > –0.5) = p(U < 0.5) = 0.6915

p(U > –2.23) = p(U < 2.23) = 0.9871

p(–1.85 < U < –1.07) = p(1.07 < U < 1.85) = p(U < 1.85) – p(U < 1.07) = 0.1101

p(–1.12 < U < 0.6) = p(U < 0.6) – p(U < –1.12) = p(U < 0.6) – p(U > 1.12) = p(U < 0.6) – (1 – p(U < 1.12)) = 0.5943

Exercise 17. (Tutorial for lesson page 12)

Calculate the probabilities, using the standard normal law table and the available transformations formulas.

Then, check your results thanks to the tool "normal law" of your calculator.
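If no calculator is at hand, the "normal law" check can be sketched with Python's standard library. The example below uses the law N(50 , 10), read as mean 50 and standard deviation 10; the variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=50, sigma=10)   # law of X
U = NormalDist()                  # standard normal law N(0 ; 1)

p_less_60 = X.cdf(60)
p_less_43 = X.cdf(43)
p_between = X.cdf(55) - X.cdf(45)

# Same result through the change of variable U = (X - mu) / sigma
z = (43 - 50) / 10
p_less_43_via_U = U.cdf(z)

print(round(p_less_60, 4), round(p_less_43, 4), round(p_between, 4))
```

Values read from a printed table are rounded to four decimals, so a difference in the last digit between the table and the computed result is expected.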

1. law of X: N(50 , 10). Calculate p(X < 60), p(X < 43), p(45 < X < 55)

$p(X < 60) = p\left(U < \dfrac{60 - 50}{10}\right) = p(U < 1) = 0.8413$

$p(X < 43) = p\left(U < \dfrac{43 - 50}{10}\right) = p(U < -0.7) = 1 - p(U < 0.7) = 0.2420$

$p(45 < X < 55) = p\left(\dfrac{45 - 50}{10} < U < \dfrac{55 - 50}{10}\right) = p(-0.5 < U < 0.5) = p(U < 0.5) - p(U < -0.5) = p(U < 0.5) - (1 - p(U < 0.5)) = 0.3830$

2. law of X: N(3 , 0.45). Calculate p(X > 4), p(X < 2.55), p(3.2 < X < 3.7)

$p(X > 4) = p\left(U > \dfrac{4 - 3}{0.45}\right) = p(U > 2.22) = 1 - p(U < 2.22) = 0.0132$

$p(X < 2.55) = p\left(U < \dfrac{2.55 - 3}{0.45}\right) = p(U < -1) = 1 - p(U < 1) = 0.1587$

$p(3.2 < X < 3.7) = p\left(\dfrac{3.2 - 3}{0.45} < U < \dfrac{3.7 - 3}{0.45}\right) = p(0.44 < U < 1.56) = p(U < 1.56) - p(U < 0.44) = 0.9406 - 0.6700 = 0.2706$

Exercise 18. (Tutorial for lesson page 13)

In a land, 30 % of the companies do exports. We decide to choose 80 companies at random and we set X as the number of those doing exports among the 80.

1. What is the probability distribution of X and what are its parameters?

80 different companies are chosen and the variable counts how many of them do exports (success). Its law is then Hypergeometric, but we don't know the size of the population.

2. Justify that a binomial law can be used instead of the former one.

We assume that the population is big, more than 20 times bigger than the sample, which allows us to assume as a constant the probability of success, p = 0.3, and then to use a binomial law.

X is then approximately distributed by: B(80 ; 0.3).

3. Justify that a normal law can be used; give its parameters.

n = 80 > 30 ; np = 24 > 5 ; nq = 56 > 5. Hence: B(80 ; 0.3) ≈ N(24 ; √16.8 ≈ 4.1)

4. Using that normal law, then checking with the tool "binomial law" of your calculator, give:

a. the probability that more than 30 companies do exports among the 80.

p(X > 30.5) = 0.05644 ; using the binomial distribution: p(X > 30) = 0.05875

b. the probability that 30 companies do exports among the 80.

p(X = 30) = p(29.5 < X < 30.5) = 0.03344 ; binomial: p(X = 30) = 0.03285
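The continuity correction used here (30.5 and 29.5 instead of 30) can be compared with the exact binomial values. A sketch under the assumption that the standard deviation is taken as √16.8 without rounding, so the normal results may differ slightly in the last digits from those obtained with 4.1; `binom_pmf` is a hypothetical helper.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 80, 0.3
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

def binom_pmf(k):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# "More than 30" becomes X > 30.5 for the continuous normal law
normal_gt30 = 1 - approx.cdf(30.5)
binom_gt30 = sum(binom_pmf(k) for k in range(31, n + 1))

# "Exactly 30" becomes 29.5 < X < 30.5
normal_eq30 = approx.cdf(30.5) - approx.cdf(29.5)
binom_eq30 = binom_pmf(30)

print(round(normal_gt30, 4), round(binom_gt30, 4))
print(round(normal_eq30, 4), round(binom_eq30, 4))
```

Without the half-unit shift, p(X = 30) would be zero under the normal law, since a continuous variable takes any single value with probability 0.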


Exercise 19.

A company has put its merchant website online. Tests have shown that a connection problem appears on average once in 500. For its brand image, it considers that a bad week is one with more than 50 connection problems and that during the year no more than 5 bad weeks may occur.

1. Given that 20000 weekly user connections are expected for the following weeks, build the probability distribution of the number X of problems per week.

Each one of the 20,000 connection attempts has one chance out of 500 to fail, which can be considered as a probability of success p = 0.002, invariable since it's a statement of this exercise. In 20,000 attempts, the number X of problems is then a random variable distributed by the law B(20,000 ; 0.002).

2. Determine which normal law can be used in that case.

n = 20000 > 30, np = 40 > 5 and of course nq > 5. Then, a relevant normal distribution is: N(40 ; 6.318).

3. What is the probability that in a given week more than 50 problems may occur?

With this normal distribution: p(X > 50.5) = 0.04826.

4. What is the probability distribution of the number Y of bad weeks during one year?

The previous probability is the probability of success (bad week) for the variable Y, and it does not vary over time.

The variable Y is then distributed by B(52 ; 0.04826).

5. What is the probability that Y would be more than 5?

With this binomial distribution: p(Y > 5) = 0.03854.
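The whole chain (binomial for one week, normal approximation, then binomial over 52 weeks) can be reproduced in a few lines. A sketch with illustrative variable names; small differences in the last digits may come from rounding the intermediate probability.

```python
from math import comb, sqrt
from statistics import NormalDist

n_conn, p_fail = 20000, 1 / 500
weekly = NormalDist(mu=n_conn * p_fail,
                    sigma=sqrt(n_conn * p_fail * (1 - p_fail)))

# Bad week: more than 50 problems, i.e. X > 50.5 with the continuity correction
p_bad_week = 1 - weekly.cdf(50.5)

# Y = number of bad weeks among 52, binomial with the probability found above
weeks = 52
p_at_most_5 = sum(comb(weeks, k) * p_bad_week**k * (1 - p_bad_week)**(weeks - k)
                  for k in range(6))
p_more_than_5 = 1 - p_at_most_5

print(round(p_bad_week, 5), round(p_more_than_5, 4))
```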

Exercise 20.

It’s been stated that the variable X “mass (kg) of a newborn baby” is distributed by the law N(3.4 ; 0.5).

1. What’s the probability that a newborn baby weighs more than 4 kg ?

p(mass > 4) = 0.1151

2. What’s the probability that a newborn baby weighs less than 3 kg ?

p(mass < 3) = 0.2119

3. What’s the probability that a newborn baby’s weight is between 3 and 4 kg ?

p(3 < mass < 4) = 1 – 0.1151 – 0.2119 = 0.673

Exercise 21.

A company manufactures beacons (flashing lights) for all types of machines, in large quantities. The probability that a beacon is defective is p = 0.04. A random sample of 600 beacons is taken from the production. X is the random variable that gives the number of defective beacons among the 600.

1. Show that the random variable X has a binomial distribution whose parameters are to be specified.

Each beacon is defective (success), or isn't, with a constant probability of success: p = 0.04.

The variable X is the number of successes among 600 items.

Hence, X is distributed by B(600 ; 0.04). (we assume the overall production is much bigger than 600)

2. Show that we can approximate the distribution of X by a normal distribution.

n = 600 > 30, np = 24 > 5 and of course nq > 5, so a normal law can correctly represent this variable.

3. Determine µ and σ, mean and standard deviation of the variable X for the normal distribution.

np = 24 and √(npq) = 4.8. This normal distribution is N(24 ; 4.8).

4. Then calculate the probability of having at least 27 defective flashing lights in the draw of 600.

p(X > 26.5) = 0.3012

Exercise 22.

A commercial agent is assigned to telephone solicitations. On average, one in five phone calls leads to an order. We name X the random variable “number of orders after 60 calls”.

1. a. Give the name and the parameters of the probability distribution of X.

A phone call leads to an order (success) or doesn't. The probability of success, invariable, is p = 1/5 = 0.2.

Hence the binomial distribution B(60 ; 0.2).


b. Justify that the law can be approximated by a normal distribution, give its parameters.

n = 60 > 30, np = 12 > 5 and nq = 48 > 5. We can use a normal distribution instead of the binomial one.

np = 12 and √(npq) = 3.098. Thus, we can use: N(12 ; 3.098).

c. Calculate p(X > 15), p(X < 10), p(X = 12).

* p(X > 15.5) = 0.1293

* p(X < 9.5) = 0.2098

N(0.2n ; 0.4√n).

The variable change, between X and U, is expressed by: $U = \dfrac{X - 0.2n}{0.4\sqrt{n}}$.

On the other hand, the standard normal law table provides us the following information: u0 has to be less than –1.645 so that p(U > u0) is more than 95%.

Combining these elements, we must have: $\dfrac{14.5 - 0.2n}{0.4\sqrt{n}} < -1.645 \Leftrightarrow 0.2n - 0.658\sqrt{n} - 14.5 > 0$.

From that, you can test several values for n, or solve this inequality (by writing n = N²).

The result is that n must be more than 106:

107 phone calls have to be made, to have a 95% chance of getting at least 15 orders.
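Both routes mentioned above (testing values, or solving a quadratic in √n) can be sketched as follows. The function name `condition` and the variable `r` are illustrative assumptions.

```python
from math import sqrt

def condition(n):
    """True when 0.2n - 0.658*sqrt(n) - 14.5 > 0."""
    return 0.2 * n - 0.658 * sqrt(n) - 14.5 > 0

# Route 1: test successive values of n
n = 1
while not condition(n):
    n += 1

# Route 2: with r = sqrt(n), solve 0.2 r^2 - 0.658 r - 14.5 = 0 (positive root)
r = (0.658 + sqrt(0.658**2 + 4 * 0.2 * 14.5)) / (2 * 0.2)

print(n, round(r**2, 2))
```

The positive root gives the boundary value of n; the first integer above it is the answer found by the loop.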

Exercise 23.

It is assumed that you’re checked in the bus by a controller on average once every 20 travels. Mr Don Wanapay makes 800 trips a year on the line.

1. What is the probability that Mr Don Wanapay would be checked between 30 and 50 times a year?

Each trip can end in a control (success, p = 1/20 = 0.05 invariable) or not.

Let X be the random number of controls among 800 trips. Its distribution is B(800 ; 0.05).

n = 800 > 30, np = 40 > 5 and nq > 5, so we are allowed to use a normal law. np = 40 and √(npq) = 6.164.

This distribution is N(40 ; 6.164). p(29.5 < X < 50.5) = 0.9118.

2. Mr Don Wanapay always travels without a ticket. An annual subscription would be €320 / year.

At what level must the company set the fine, so that at least 99% of cheaters would be better off taking an annual subscription?

Let's find out the number x0 of controls that someone has 99% chances to exceed: p(U > u0) = 0.99 gives u0 = –2.33 and, by a variable change: x0 = 25.66. 99% of people will be controlled 25 times at least, in 800 travels. Thus, 25 fines must cost more than the annual subscription (€320), hence a €12.80 fine at least.
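The reversed reading of the normal law used here can be sketched with `inv_cdf` from Python's standard library. The variable names are illustrative; the exact quantile is used instead of the table value –2.33, which gives the same result to two decimals.

```python
from math import floor, sqrt
from statistics import NormalDist

n, p = 800, 0.05
controls = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# x0 such that p(X > x0) = 0.99, i.e. p(X < x0) = 0.01
x0 = controls.inv_cdf(0.01)

guaranteed_controls = floor(x0)        # controls that 99% of travellers reach
min_fine = 320 / guaranteed_controls   # fine at which these controls cost the subscription price

print(round(x0, 2), guaranteed_controls, min_fine)
```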


1. From a normal population with µ = 120 and σ = 40, all possible SRS of sizes n = 10 and n = 50 are considered.

a. What are the distributions of these sample means?

Let X be the variable "value of an individual in the population", whose law is N(120 ; 40).

The variable X̄ refers to the average value in any sample composed of n individuals.

On SRS, X̄ is distributed by N(120 ; 40/√n). If n = 10: N(120 ; 12.65) and if n = 50: N(120 ; 5.657).

b. Graph both distributions, roughly, in the same frame, in order to compare them.

(the most spread out is the one whose standard deviation is bigger)

c. What is the probability that the mean of a random 10-sized sample would be more than 130?

p(X̄ > 130) = 0.2146

d. Same question for a 50-sized sample. p(X̄ > 130) = 0.03855
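The effect of the sample size on the distribution of the mean can be sketched as follows; `p_mean_exceeds` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 120, 40  # population parameters

def p_mean_exceeds(threshold, n):
    """P(sample mean > threshold) for simple random samples of size n."""
    sample_mean = NormalDist(mu=mu, sigma=sigma / sqrt(n))
    return 1 - sample_mean.cdf(threshold)

for n in (10, 50):
    print(n, round(sigma / sqrt(n), 3), round(p_mean_exceeds(130, n), 5))
```

Multiplying the sample size by 5 divides the standard deviation of the mean by √5, which is why the probability of exceeding 130 falls so sharply.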

2. Several years ago, the world counted 3.38 billion women and 3.12 billion men. P is the variable giving the proportion of women in every sample of 100 people.

a. Give the probability distribution of P.

The proportion of women in the world was then: π = 3.38/6.5 = 0.52.

Considering a SRS (the population is much more than 20 times bigger than the sample), P is distributed by N(π ; √(π(1 − π)/n)).


Exercise 26.

A candidate obtained 55% of votes cast in an election.

1. What is the probability that, in a sample of 100 people, his result be less than 50%? Among 2000 people?

The proportion of voters of this candidate, in samples of 100 people, is a random variable P whose law is N(0.55 ; √(0.55×0.45/100)) = N(0.55 ; 0.04975). p(P < 0.5) = 0.1574.

For 2000 people, the law of P is N(0.55 ; 0.01112). p(P < 0.5) = 0.000003456

2. How many people are required so as the probability his result be less than 50% drops below 1%?

Let's take our reasoning in the opposite direction: p(U < –u0) < 0.01 means p(U > u0) < 0.01, and then p(U < u0) > 0.99 which implies u0 equals at least 2.33.

The variable change between P and U is $U = \dfrac{P - \pi}{\sqrt{\dfrac{0.55 \times 0.45}{n}}}$.

Then, we need $\dfrac{0.5 - 0.55}{\sqrt{\dfrac{0.55 \times 0.45}{n}}} < -2.33 \Leftrightarrow \dfrac{0.05\sqrt{n}}{\sqrt{0.55 \times 0.45}} > 2.33 \Leftrightarrow \sqrt{n} > \dfrac{2.33 \times \sqrt{0.55 \times 0.45}}{0.05} \Leftrightarrow n > 537.46$

At least 538 votes have to be analysed so that the probability that less than 50% of them voted for the candidate drops below 1%.
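The bound on n can be computed directly and then checked against the normal law. A sketch using the table value u = 2.33 quoted in the text; the variable names are illustrative.

```python
from math import floor, sqrt
from statistics import NormalDist

pi, limit, u = 0.55, 0.5, 2.33   # true proportion, threshold, table value for 99%

bound = (u * sqrt(pi * (1 - pi)) / (pi - limit)) ** 2   # n must exceed this
n_min = floor(bound) + 1

# Check: law of the sample proportion P for n_min people
P = NormalDist(mu=pi, sigma=sqrt(pi * (1 - pi) / n_min))
print(round(bound, 2), n_min, round(P.cdf(limit), 4))
```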

Exercise 27.

In a region, during the summer period, we assume that the number of tourists present in a day follows a normal distribution whose mean is 50,000 and standard deviation is 8,000.

1. The prefecture considers that the tourism is "manageable" (reception, environment, pollution, ...) when more than 70% of days count less than 55,000 tourists each. What is the actual situation?

p(X < 55000) = 0.7340 > 70%.

2. Officials want to base their thinking on 10 vacation days periods.

a. What is the law of X̄: "Average daily number of vacationers in a sample of 10 days"?

X̄ is distributed by N(50000 ; 8000/√10) = N(50000 ; 2530)

b. What is the probability that, in 10 days, this average daily number be less than 55,000?

p(X̄ < 55000) = 0.9759

Exercise 28.

A large population took an IQ test. The results are normally distributed with µ = 102 and σ = 15.

1. What’s the proportion of people whose IQ is less than 100? p(X < 100) = 0.4470

2. We wish to analyse the results of a few samples of this population. For that, let's create groups of 20 individuals selected by SRS, and the average IQ of each group will be calculated.

a. Give the parameters of the normal distribution of IQ means of all 20-sized samples.

mean: 102 ; standard deviation: 15/√20 = 3.354

b. What is the probability that a group of 20 people has an average IQ below 100?

p(X < 100) = 0.2755

c. Instead of 20, how many people would have to be chosen for this probability to be less than 5%?

prob < 0.05 iff p(U < u) > 0.95 iff u ≥ 1.645 given that u also equals 2/(15/√n) = 2√n/15 according to the variable change. We get √n > 1.645×15/2 ≈ 12.34, that is n > 152.2. At least 153 people have to be chosen.

3. Using the answer of question 1, what is the probability that, in a group of 20 people, the proportion (of individuals whose IQ is less than 100) is more than 50%?

In the population, π = 0.4470. The variable P, proportion of people whose IQ is less than 100 in a sample of n = 20 people, is distributed by N(π ; √(π(1 − π)/n)).


Exercise 29.

An elevator can carry a load of 580 kg. It is assumed that someone's weight is a random variable following a normal distribution N(µ, σ) with µ = 70 kg and σ = 16 kg. What is the maximum number of people that may be allowed together in the elevator if the risk of overload must not exceed 0.01?

Let's consider a group of n people entering the elevator as a sample. Given the individual mean mass and standard deviation, we can deduce the distribution of the sample's mean mass: N(70 ; 16/√n). The sample's total mass is n times its mean mass, and is then distributed by: N(70n ; 16√n).

Within the standard normal law, the value of U that has only 1% chance to be exceeded is 2.33.

The variable change formula $U = \dfrac{X - \text{mean}}{\text{standard deviation}}$ is then translated into: $2.33 = \dfrac{580 - 70n}{16\sqrt{n}}$.

We can solve this equation or test several values for n, the result being that not more than 6 people should be allowed to enter this elevator (if more, $\dfrac{580 - 70n}{16\sqrt{n}}$ will drop below 2.33 and then our probability will be more than 1%).
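The "test several values" approach can be sketched by computing the overload risk for each group size. The helper `overload_risk` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist

capacity, mu, sigma, max_risk = 580, 70, 16, 0.01

def overload_risk(n):
    """P(total mass of n people > capacity); the total follows N(70n ; 16*sqrt(n))."""
    total = NormalDist(mu=n * mu, sigma=sigma * sqrt(n))
    return 1 - total.cdf(capacity)

# Largest group whose overload risk stays within the allowed level
n = 1
while overload_risk(n + 1) <= max_risk:
    n += 1

print(n, overload_risk(n), overload_risk(n + 1))
```

The risk jumps from a negligible value to more than 1% when one more person is added, because the expected total mass then comes within about two standard deviations of the capacity.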

Exercise 30.

A sample of companies of the same industry provided the following results:

turnover (M€)            [0 ; 2[   [2 ; 3[   [3 ; 4[   [4 ; 5[   [5 ; 7[
size (# of companies)    6         12        17        10        5

1. Give point estimates of the mean turnover and its standard deviation in the whole set of companies.

estimate of µ: $\hat{\mu} = \bar{x} \approx 3.41$ M€ ; estimate of σ: $\hat{\sigma} = s\sqrt{\dfrac{n}{n-1}} \approx 1.358$ M€

2. Give the 95% confidence interval of the mean turnover in this industry.

σ is unknown; then, we have to use a Student's law.

dof = n – 1 = 49 ; confidence level: 95%, so α = 0.05 ; then: t = 2.010 (Student's law table)

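$I_{\alpha} = \left[\bar{x} - t\dfrac{s}{\sqrt{n-1}} \; ; \; \bar{x} + t\dfrac{s}{\sqrt{n-1}}\right] = \left[3.41 - 2.01 \times \dfrac{1.3442}{7} \; ; \; 3.41 + 2.01 \times \dfrac{1.3442}{7}\right] = [3.024 \; ; \; 3.796]$ (M€).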

3. Give a point estimate of the proportion of companies whose turnover is more than M€ 4.5.

On reading the table, 10 companies show a turnover that is more than M€ 4.5, hence the proportion in the sample: p = 10/50 = 0.2. Point estimate of π: $\hat{\pi} = p = 0.2$

4. Give the 99% confidence interval of this proportion in this industry.

Formula: $I_{\alpha} = \left[p - u\sqrt{\dfrac{p(1-p)}{n}} \; ; \; p + u\sqrt{\dfrac{p(1-p)}{n}}\right]$, confidence level: 99%, α = 0.01

We look for u such that p(U < u) = 0.995: u = 2.58.

$I_{\alpha} = \left[0.2 - 2.58\sqrt{\dfrac{0.2(1 - 0.2)}{50}} \; ; \; 0.2 + 2.58\sqrt{\dfrac{0.2(1 - 0.2)}{50}}\right] = [0.054 \; ; \; 0.3459]$ = [5.4% ; 34.59%].
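The interval formula for a proportion can be wrapped in a small function. A sketch, assuming the exact normal quantile from `inv_cdf` rather than the rounded table value 2.58, so the bounds may differ slightly in the last digits; `proportion_ci` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p, n, level):
    """Confidence interval for a proportion, normal approximation."""
    u = NormalDist().inv_cdf(1 - (1 - level) / 2)   # e.g. p(U < u) = 0.995 for 99%
    half_width = u * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

low, high = proportion_ci(p=0.2, n=50, level=0.99)
print(round(low, 4), round(high, 4))
```

With only 50 companies the interval is wide; the half-width shrinks in proportion to 1/√n.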

Exercise 31.

From a vine, 10 grapes have been taken at random and weighed, which gave the following results in kilograms:

2.4 ; 3.2 ; 3.6 ; 4.1 ; 4.3 ; 4.7 ; 5.3 ; 5.4 ; 6.5 ; 6.9

1. Give the mean and standard deviation of a grape’s mass in this sample.

This sample's mean is 4.64 kg and its standard deviation is 1.348 kg. (calculator: mode stat, 1 variable)

2. Give a point estimate of the variance of the grape mass in the whole vine (population).

$\hat{\sigma}^{2} = \dfrac{n}{n-1} s^{2} = \dfrac{10}{9} \times 1.348^{2} = 2.019$

3. Give a 95% confidence interval of the average mass of grapes in the whole population.

The standard deviation of the population is unknown. So, we have to use a Student's law.

The 95% confidence interval of µ is: $I_{0.05} = \left[\bar{x} - t\dfrac{s}{\sqrt{n-1}} \; ; \; \bar{x} + t\dfrac{s}{\sqrt{n-1}}\right]$ where t = 2.262 (dof = 9 ; α = 0.05).

$I_{0.05} = \left[4.64 - 2.262 \times \dfrac{1.348}{3} \; ; \; 4.64 + 2.262 \times \dfrac{1.348}{3}\right] = [3.624 \; ; \; 5.656]$

4. Calculate the minimum number of grapes that would have to be analyzed so that this interval be 1 kg wide, assuming that the estimated standard deviation (q.2.) is the real one of the population.

Taking the estimated variance (2.019) as an actual value allows us to build a confidence interval with the normal law: $I = \left[\bar{x} - u\dfrac{\sigma}{\sqrt{n}} \; ; \; \bar{x} + u\dfrac{\sigma}{\sqrt{n}}\right]$, then to fix: $u\dfrac{\sigma}{\sqrt{n}} = 0.5$. Moreover, u = 1.96 since α = 5%, and σ = √2.019 ≈ 1.421 kg as told. $u\dfrac{\sigma}{\sqrt{n}} = 0.5$ gives in these conditions √n ≈ 5.57 and so n ≈ 31.02. To conclude, since n must be at least 31.02, we need a sample of at least 32 grapes so that the interval be 1 kg wide or less.

Exercise 32.

A laboratory wishes to analyze the level of contamination of trees by the soil’s pollution, in a given territory.

After having examined one thousand trees, 142 of them appeared to have been affected.

Give an estimate of the proportion π of affected trees in this territory, by a 90% confidence interval.

The proportion p found in the sample is 0.142. The 90% confidence interval of π, the proportion of affected trees in the population, is:

[0.142 − 1.645 × √(0.142 × 0.858/1000) ; 0.142 + 1.645 × √(0.142 × 0.858/1000)] = [0.124 ; 0.160].

Exercise 33.

When managing a grain elevator, we want to know the minimal safety stock that has a 99% chance of satisfying customers at any time. To this end, the weekly consumption of grain has been recorded over a sample of 15 weeks. The following results were obtained:

weekly consumption (in tons) 4.6 4.7 4.8 4.9 5 5.1 5.2 5.3

number of weeks 1 0 2 3 5 2 1 1

1. Give the mean and standard deviation of the consumption in this sample.

x̄ = 4.973 and s = 0.1652, in tons

2. We set X the “weekly consumption of grain” at any time, and we assume that its distribution is normal.

a. Give the point estimates of µ and σ.

Estimates: µ̂ = x̄ = 4.973 tons and σ̂ = s × √(n/(n − 1)) = 0.1652 × √(15/14) ≈ 0.1710 ton.
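
The sample statistics and the corrected estimate can be recomputed from the frequency table with a short Python sketch. The helper name grouped_stats is illustrative, not part of the original solution.

```python
from math import sqrt

def grouped_stats(values, counts):
    """Mean, standard deviation s (divisor n) and corrected estimate (divisor n - 1)."""
    n = sum(counts)
    mean = sum(v * c for v, c in zip(values, counts)) / n
    ss = sum(c * (v - mean) ** 2 for v, c in zip(values, counts))
    return mean, sqrt(ss / n), sqrt(ss / (n - 1))

tons = [4.6, 4.7, 4.8, 4.9, 5.0, 5.1, 5.2, 5.3]
weeks = [1, 0, 2, 3, 5, 2, 1, 1]
mean, s, sigma_hat = grouped_stats(tons, weeks)
print(f"{mean:.3f} {s:.4f} {sigma_hat:.4f}")
```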

b. Using this normal law, calculate the value of X that has 99% chances not being exceeded.

For the law N(0 ; 1), p(U < 2.33) = 0.9901. Value of X corresponding to u = 2.33: x = 2.33σ + µ = 5.371.

As a rough estimate, we can say that the needs will be covered in 99% of cases with a weekly minimum stock of 5.371 tons.
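
This quantile can also be obtained directly, without a table. The sketch below assumes the point estimates of question 2.a (µ = 4.973, σ = 0.1710) and uses the standard library's NormalDist; it works with the exact 99% quantile (u ≈ 2.326) instead of the table value 2.33, so the last decimal may differ.

```python
from statistics import NormalDist

# Point estimates from question 2.a
consumption = NormalDist(mu=4.973, sigma=0.1710)
stock_99 = consumption.inv_cdf(0.99)   # value not exceeded with probability 0.99
print(f"{stock_99:.3f} tons")
```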

3. a. Using the results of question 2, build a 99% confidence interval of the average weekly consumption.

σ being unknown, let's use the Student's law, with 14 dof.

The interval is: I = [4.973 – 2.977×0.1652/√(14) ; 4.973 + 2.977×0.1652/√(14)] = [4.842 ; 5.104]

b. What’s the probability that this average value would exceed the upper limit of this interval?

p(µ > 5.104) = 0.5%

Exercise 34.

A company wants to specialize in the delivery of large packages. Those that have already been carried are considered as a representative sample of all future packages.

data set of large packages that have already been carried:

volume (L) 200 to 400 400 to 500 500 to 600 600 to 1000

# of packages 15 40 60 10

1. Give the point estimates of the mean and standard deviation of the future packages volume.

volume (L) 300 450 550 800

# of packages 15 40 60 10

x̄ = 508 and s = 118.1, in L

Point estimate of µ: 508 L. Point estimate of σ: 118.1 × √(125/124) = 118.5 L.

2. Give a 99% confidence interval of the average volume of the future packages.

σ being unknown, we must determine the coefficient t of a Student's law with 124 dof at a 1% significance level: t = 2.576. I = [508 – 2.576×118.1/√124 ; 508 + 2.576×118.1/√124] = [480.7 ; 535.3]

A normal law would also be acceptable here, since the number of dof is large, with u = 2.58.

3. In this question, the standard deviation of the population is considered known and its value is the one you found in question 1. We want to use a confidence interval of the average volume, whose size would be 50 L. What would be the confidence level of such an interval?

The half-width of this interval is u×118.5/√125 = 10.6u and must be equal to 25 (half of the size given in the question).

Thus, u must be equal to 2.36. p(-2.36 < U < 2.36) = 2×0.9909 - 1 = 0.9818.

The confidence level is here 98.18%, for an interval whose size is 50 L, which is then [483 ; 533].

So, there is a 98.18% chance that the average volume of future packages lies inside this interval.
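
The reasoning of question 3 runs the usual formula backwards: from a half-width to a confidence level. A minimal sketch, assuming σ = 118.5 is known; the function name confidence_level is illustrative, and u is not rounded to 2.36, so the result differs slightly from 0.9818.

```python
from math import sqrt
from statistics import NormalDist

def confidence_level(sigma, n, half_width):
    """Level of the interval x_bar +/- half_width when sigma is known."""
    u = half_width * sqrt(n) / sigma
    return 2 * NormalDist().cdf(u) - 1

level = confidence_level(sigma=118.5, n=125, half_width=25)
print(f"{level:.4f}")
```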

Exercise 35.

A die has been rolled 120 times. The results are gathered in the table below.

Considering this sample of results, can we say that this die is a fake one, at a 2% significance level?

Let's build a table:

result                 1     2     3     4     5     6     total

observed number        26    15    14    24    25    16    120

theoretical number     20    20    20    20    20    20    120

(obs − th)²/th         1.8   1.25  1.8   0.8   1.25  0.8   7.7

Null hypothesis H0: the die is not a fake one. Decision variable: χ² = Σ (obs − th)²/th = 7.7.
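
The χ² computation for the die can be sketched as follows. The limit 13.39 (5 dof, 2% level) is an assumption borrowed from the χ² table values quoted for 5 dof in Exercise 39 of this document; the function name chi_square is illustrative.

```python
def chi_square(observed, expected):
    """Sum of the partial terms (obs - th)^2 / th."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [26, 15, 14, 24, 25, 16]
expected = [120 / 6] * 6        # fair die: 20 throws expected per face
chi2_calc = chi_square(observed, expected)
chi2_lim = 13.39                # table value, 6 - 1 = 5 dof, 2% level (assumed)
reject_h0 = chi2_calc > chi2_lim
print(round(chi2_calc, 2), reject_h0)
```

With these inputs chi2_calc stays below the limit, so this sketch does not reject H0 at the 2% level.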

Exercise 36.

An experiment consists in trying something three times, with 1/3 chances of success each time.

X is the random variable “number of successes at the end of the experiment”.

1. Prove that p(X = 0) = 8/27, p(X = 1) = 12/27, p(X = 2) = 6/27 and p(X = 3) = 1/27.

Since the probability of success is invariable, the number of successes after 3 attempts is distributed by the binomial law B(3 ; 1/3), thanks to which we obtain the desired probabilities.
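
For instance, the four probabilities can be obtained exactly with Python's Fraction type. The function name binomial_pmf is illustrative, not part of the original solution.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """List of p(X = k) for k = 0..n when X follows B(n ; p)."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

pmf = binomial_pmf(3, Fraction(1, 3))
print(pmf)   # fractions are shown in lowest terms, e.g. 12/27 appears as 4/9
```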

2. Now, let's imagine that 54 people performed this experiment.

a. Complete the following table:

number of successes per individual     0     1     2     3     total

observed number of individuals         20    14    16    4     54

theoretical number of individuals      16    24    12    2     54

b. By a χ² test, at a 5% significance level, decide whether the observed results are in adequacy with the expected theoretical ones.

Null hypothesis H0: the observed distribution doesn't contradict the theoretical one.

Calculation of the partial χ² values, (obs − th)²/th:

1     4.1667     1.3333     2     total: 8.5

Calculated χ² from the sample: Chi²calc = 8.5

Limit Chi² before rejection (5% significance level, 3 dof): Chi²lim = 7.815

Decision : At a 5% significance level, we can reject H0. (our chance to be wrong is less than 5%).


Exercise 37.

For five French groups in the same industry, the annual budget for promotion on the Internet has been reported alongside the overall annual budget for promotion:

group                   A     B     C     D     E

Internet budget (k€)    47    55    58    63    72

overall budget (k€)     558   545   587   560   585

Let's build a table containing the useful data:

group A B C D E

Internet budget (k€) 47 55 58 63 72

overall budget (k€) 558 545 587 560 585

observed rate int./over. 0.084229 0.100917 0.098807 0.1125 0.123077

10% of the overall budget 55.8 54.5 58.7 56 58.5

χ² 1.387814 0.004587 0.008348 0.875 3.115385

Part 1

1. Determine the sample's proportion of groups whose Internet budget exceeds 10% of the overall one.

3 groups out of 5 fall into this category (B, D and E), hence a proportion p = 3/5 = 0.6.

2. a. Determine the 95% confidence interval of the proportion that could be observed in all French groups in this industry.

The formula gives, with p = 0.6 and u = 1.96 : I = [0.1706 ; 1.0294]

(one can notice that it's very wide and even passes over 1!)

b. This industry actually consists of 58 groups in France. What is the minimal number, that can be assumed with a confidence level of 80%, of groups whose internet budget exceeds 10% of their overall budget?

The probability of being below this number is 20%. It is then the lower bound of a 60% confidence interval, for which u = 0.85. The corresponding proportion is 0.6 − 0.85 × √(0.6 × 0.4/5) = 0.4138 and finally: 41.38% of 58 groups is 24 groups.

Part 2

Perform a Chi-squared test to tell, with a significance level of 5%, if the observed data set of five companies is in adequacy with the following assertion: “in France, Internet budget is worth 10% of overall budget”.

The table above gives the partial χ² values, whose sum is 5.391 (Chi²calc).

With 4 dof and a 5% level, the table gives Chi² limit = 9.488.

Hence we can't reject, at a 5% significance level, the hypothesis that in France the Internet budget represents 10% of the overall budget.

Exercise 38.

A study was conducted in a sample of 50 plastics companies, getting their 2016 net income (variable R):

net income R (M€) [-1 ; 1[ [1 ; 1.5[ [1.5 ; 2[ [2 ; 3[ [3 ; 5[

# of companies 3 10 18 15 4

Part 1

1. Give the income’s mean and standard deviation of this sample.

x̄ = 1.950 and s = 0.8761 (in M€)

2. Give a 99% confidence interval of the average net income in the whole large population of plastics companies (you may notice that the population’s standard deviation is unknown).

σ being unknown, we look for the coefficient t (Student) with 49 dof, for α = 1% : 2.680.

The interval is then: [1.615 ; 2.285].


Part 2

Our aim here is to decide whether the distribution of net incomes is in adequacy with the normal law N(2 ; 0.9). Let's name X the variable of this law.

1. a. Calculate: p(-1 < X < 1) ; p(1 < X < 1.5) ; p(1.5 < X < 2) ; p(2 < X < 3) ; p(3 < X < 5).

p(-1 < X < 1) = 0.1328 ; p(1 < X < 1.5) = 0.1560 ; p(1.5 < X < 2) = 0.2107 ; p(2 < X < 3) = 0.3667 ; p(3 < X < 5) = 0.1328

b. Explain why, in adequacy with this normal law, and then in accordance with the five probabilities you just calculated, a theoretical sample of 50 plastics companies would give the following table:

net income R (M€) [-1 ; 1[ [1 ; 1.5[ [1.5 ; 2[ [2 ; 3[ [3 ; 5[

# of companies 6.642 7.8 10.537 18.337 6.642

Multiplying the probabilities by 50 makes us find this list.
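
The five probabilities and the theoretical counts can be reproduced with the standard library's NormalDist. This is a minimal sketch; the variable names are illustrative.

```python
from statistics import NormalDist

law = NormalDist(mu=2, sigma=0.9)
bounds = [-1, 1, 1.5, 2, 3, 5]      # class limits, in M€
n = 50

# p(a < X < b) = F(b) - F(a) for each class, then scaled to 50 companies
probs = [law.cdf(b) - law.cdf(a) for a, b in zip(bounds, bounds[1:])]
theoretical = [n * p for p in probs]
print([round(t, 3) for t in theoretical])
```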

2. a. Then, perform an adequacy χ² test between this normal law and reality, at a 5% significance level.

The comparison between the observed list (3 ; 10 ; 18 ; 15 ; 4) and the theoretical one (q.1.b.) allows us to calculate the partial χ² values: 1.9967 ; 0.6206 ; 5.2855 ; 0.6073 ; 1.0506.

Chi² calc is their sum: 9.56. With a χ² law, 4 dof, α = 5%, Chi² limit = 9.488.

We can reject, at a 5% significance level, the idea that the net income is distributed by N(2 ; 0.9).

b. Give detailed explanations of this significance level.

On rejecting this adequacy hypothesis, the chance to be wrong is less than 5%.

Exercise 39.

The study of 320 families with 5 children has given the distribution of the following table.

children 5 boys 4 boys 3 boys 2 boys 1 boy 0 boy

0 girl 1 girl 2 girls 3 girls 4 girls 5 girls

# families 18 56 110 88 40 8

Are these results compatible with the hypothesis that the births of a boy and a girl are equally likely events?

Here, the difficulty is the calculation of the theoretical frequencies, since it obliges us to calculate the probabilities of each kind of event in 5-children families (assuming that boys and girls are equally likely). Let's begin with that.

A child is a girl (success, p = 0.5, invariable) or a boy (failure, q = 0.5).

Among five children (n = 5), X is the random number of girls, hence distributed by B(5 ; 0.5). Then, every probability concerning any number of girls can be obtained (see the table below: "prob").

Finally, these probabilities have to be multiplied by 320 so as to get the theoretical numbers of families of each kind ("th. freq").

children 5 boys 4 boys 3 boys 2 boys 1 boy 0 boy

0 girl 1 girl 2 girls 3 girls 4 girls 5 girls

obs freq 18 56 110 88 40 8

prob 0.03125 0.15625 0.3125 0.3125 0.15625 0.03125

th. freq 10 50 100 100 50 10

Chi2 6.4 0.72 1 1.44 2 0.4

Chi²calc = 11.96

Values of some Chi² limits for 5 dof:

dof     1%      2%      5%      10%

5       15.09   13.39   11.07   9.236

H0 can't be rejected at a 2% significance level.

H0 can be rejected at a 5% (or higher) significance level according to the table. The matching level of χ² = 11.96 is located between 2% and 5%; that is to say, we can afford to reject the null hypothesis with a 95% confidence level, but not with 98%.
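
The whole chain (binomial probabilities, theoretical frequencies, χ², decision per level) can be sketched in Python. The limits are the 5-dof table values quoted above; the variable names are illustrative.

```python
from math import comb

n_children, n_families = 5, 320
observed = [18, 56, 110, 88, 40, 8]          # families with 0, 1, ..., 5 girls

# B(5 ; 0.5): p(X = k) = C(5, k) / 2^5, then scaled to 320 families
probs = [comb(n_children, k) / 2**n_children for k in range(n_children + 1)]
expected = [n_families * p for p in probs]

chi2_calc = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
limits = {0.01: 15.09, 0.02: 13.39, 0.05: 11.07, 0.10: 9.236}   # 5 dof
decisions = {alpha: chi2_calc > lim for alpha, lim in limits.items()}
print(round(chi2_calc, 2), decisions)
```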

Exercise 40.

A greengrocer wishes to buy vegetables from a new supplier. The latter claims that his beans measure 10 cm on average. If this value is plausible, or if the estimate is even higher, then the greengrocer will choose this supplier. Of course he won’t if a sample gives too low an average size. The greengrocer fixed his risk level at 5%. Let X be the random variable "size of a bean (cm)", distributed by N(µ ; 2.3). After having taken a sample of n = 25 beans, the calculated average size was x̄ = 9.5 cm. Will he buy the beans here?

a. Both hypotheses: Null hypothesis: H0 : µ = 10 ; alternate hypothesis: H1 : µ < 10 (left one-sided test: the greengrocer won't buy the beans if the claim "their average length is less than 10 cm" is very likely).

b. Statistics: under H0, X̄ is distributed by N(10 ; 2.3/√25) = N(10 ; 0.46).

c. Rejection area: this area is located on the left of ulim = –1.645 (prob = 5%).

d. Decision: from the sample, u_obs = (x̄ − 10)/0.46 = (9.5 − 10)/0.46 = −1.09. That is outside the rejection area; then, we can't assume the alternate hypothesis at a 5% significance level. The supplier's claim remains plausible, so the greengrocer will buy the beans.
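
Steps b to d can be sketched as a small Python function, assuming σ = 2.3 is known. The name left_sided_z_test is illustrative, not part of the original solution.

```python
from math import sqrt
from statistics import NormalDist

def left_sided_z_test(x_bar, mu0, sigma, n, alpha):
    """Return (u_obs, u_lim, reject_h0) for H0: mu = mu0 against H1: mu < mu0."""
    u_obs = (x_bar - mu0) / (sigma / sqrt(n))
    u_lim = NormalDist().inv_cdf(alpha)      # negative: about -1.645 for alpha = 5%
    return u_obs, u_lim, u_obs < u_lim

u_obs, u_lim, reject = left_sided_z_test(x_bar=9.5, mu0=10, sigma=2.3, n=25, alpha=0.05)
print(round(u_obs, 2), round(u_lim, 3), reject)
```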

Exercise 41.

A quarry should produce 300 tons of ore, on daily average, not more, not less. It is assumed that the daily mass of produced ore is normally distributed. Produced daily quantities have been examined for 10 days (see the following results, in tons):

302 287 315 322 341 324 329 345 392 289

Can we consider that the all-days average production is 300 tons, at a 5% significance level?

a. Both hypotheses: Null hypothesis: H0 : µ = 300 ; alternate hypothesis: H1 : µ ≠ 300 (two-sided test: the production must be as close to 300 tons as possible).

b. Statistics: under H0, and with σ unknown, the law of X̄ is St(300 ; s/√9) = St(300 ; 9.74).

c. Rejection area: (9 dof, α = 5%) this area is located outside the interval bounded by tlim = ±2.262.

d. Decision: from the sample, t_obs = (x̄ − 300)/9.74 = (324.6 − 300)/9.74 = 2.526. That is inside the rejection area; then, we can reject the null hypothesis: we're 95% sure that the overall average production differs from 300 t/day.
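
The observed t value can be recomputed from the raw data. In this sketch the limit 2.262 is still read from a Student table (the standard library has no Student quantile function), and the name t_statistic is illustrative.

```python
from math import sqrt
from statistics import fmean, pstdev

def t_statistic(sample, mu0):
    """t_obs = (x_bar - mu0) / (s / sqrt(n - 1)), with s computed with divisor n."""
    n = len(sample)
    return (fmean(sample) - mu0) / (pstdev(sample) / sqrt(n - 1))

tons = [302, 287, 315, 322, 341, 324, 329, 345, 392, 289]
t_obs = t_statistic(tons, mu0=300)
t_lim = 2.262                       # Student table, 9 dof, alpha = 5% (two-sided)
reject_h0 = abs(t_obs) > t_lim
print(round(t_obs, 3), reject_h0)
```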

Exercise 42.

A manufacturer claims that the strings he produces have a 300 kg average tensile strength with a standard deviation of 30 kg. It is assumed that the variable “strength of a string” is normally distributed. Strength tests made on 10 strings revealed the following breakdown tensions:

251 277 255 305 341 324 329 314 272 289

Can we consider, thanks to this sample, that the average tensile strength of the whole production is equal to 300 kg? (significance level: 10%)

a. Both hypotheses: strings are in compliance with the standard in case their actual average strength is equal to or more than 300 kg. We have to perform a left one-sided test: H0 "the tensile strength is 300 kg" and H1 "the tensile strength is less than 300 kg".

b. Statistics: the standard deviation of tensile strengths is unknown in the population, hence the use of a Student's law. Standard deviation in the sample: s = 30. Under H0, X̄ is distributed by St(300 ; s/√9) = St(300 ; 10).

c. Rejection area: Student (9 dof, α = 10%), this area is located below tlim = –1.383 (the Student's table was designed for centred intervals).

d. Decision: from the sample, t_obs = (x̄ − 300)/10 = (295.7 − 300)/10 = −0.43, which is outside the rejection area. At a 10% significance level, we can't reject H0.

Exercise 43.

Out of 1000 French baccalaureate candidates chosen at random, 875 were successful. Test at a 5% significance level the claim of the minister that the success rate in France is 90%.

a. Both hypotheses: the minister tells the truth if the result of our sample is not too far from 90%. We have to perform a two-sided test: H0 "the success rate is 90%" and H1 "the success rate is not 90%".

b. Statistics: a proportion is the parameter to be tested; then, our tool is the normal law. Under H0, the proportions found in every sample of 1000 people are normally distributed around 90%; in detail: N(0.9 ; √(0.9 × 0.1/1000)) = N(0.9 ; 0.009487).

c. Rejection area: it's located on both sides and must contain 5% of the possible samples. The corresponding values of U are ulim = –1.96 and 1.96.

d. Decision: from the sample, u_obs = (0.875 − 0.9)/0.009487 = −2.635, which is located inside the rejection area.

We can claim, with less than a 5% chance of being wrong, that the success rate is not 90%.
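
A two-sided test on a proportion follows the same four steps each time, so it lends itself to a small helper. This is a sketch under the normal approximation; the name two_sided_proportion_test is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_proportion_test(p_obs, p0, n, alpha):
    """Return (u_obs, u_lim, reject_h0) for H0: proportion = p0."""
    u_obs = (p_obs - p0) / sqrt(p0 * (1 - p0) / n)
    u_lim = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 5%
    return u_obs, u_lim, abs(u_obs) > u_lim

u_obs, u_lim, reject = two_sided_proportion_test(p_obs=875 / 1000, p0=0.9, n=1000, alpha=0.05)
print(round(u_obs, 3), round(u_lim, 2), reject)
```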

Exercise 44.

In several countries, the weather forecast is given as a probability.

Forecasting "the probability of rain tomorrow is 0.4" was made 50 times during the past year and it appeared that the rain actually came 26 times the day after. Test the accuracy of the prediction, with a 5% α-level.

a. Both hypotheses: the prediction is considered as accurate in case the observed rate is close enough to 0.4. We have to perform a two-sided test: H0 "rate = 0.4" and H1 "rate ≠ 0.4".

b. Statistics: a proportion is the parameter to test; then, our tool is the normal law. Under H0, the proportions found in every sample of 50 such predictions are normally distributed around 0.4; in detail: N(0.4 ; √(0.4 × 0.6/50)) = N(0.4 ; 0.06928).

c. Rejection area: this area is located outside the interval bounded by ulim = ±1.96. (since p(U < 1.96) = 0.975 implies p(U > 1.96) = 0.025 = 2.5%).

d. Decision: from the sample, u_obs = (0.52 − 0.4)/0.06928 = 1.732, which is outside the rejection area.

At a 5% significance level, the accuracy of the prediction can't be rejected.