All you need to know about probability pdf - Web Education


Thank you for buying this ebook by Paolo Manca

All you need to know about probability... Probably


English Translation: Fiona Cunningham

Editing: goWare ebook team

Cover: Lorenzo Puliti

ePub Development: Marco Arrighi

goWare is a Florence startup specialized in digital publishing.

TABLE OF CONTENTS

Cover

Title page

Colophon

Introduction

1. Warning

2. Uncertainty

3. The simple lottery

4. Composite lotteries

5. Kolmogoroff’s formal model

6. Probability: the only possible definition

7. Conditional probability and Bayes’ Theorem

8. Random variables, mean value, risk

9. Independent random variables, exchangeable random variables

10. Jakob Bernoulli’s Theorem

11. The Frequentist “theory” of probability

12. Probability and statistics

13. Some so-called “Paradoxes of Probability”

14. Gambling and the theorem of the ruin of the gambler

15. Attempts to “objectify” probability

16. The limits of probability and Fuzzy Logic

Plea in mitigation

INTRODUCTION – ABOUT THE AUTHOR

The cover of this book shows the roulette game. There are 37 numbers and the complexity of calculus increases with the increasing game combinations. Find out how to become a master of probability!

But this work is not only about calculus... There are also many funny examples and paradoxes: statistics tell us that 20% of motorway car accidents are caused by drivers with high blood alcohol levels. It can then be derived that 80% of accidents are caused by sober drivers. Therefore, we should supply alcohol to those who drive on motorways!

Probability, that great unknown, deserves a presentation that enables even non-experts to acquire a correct understanding of an instrument that is essential for dealing with the world around us.

This text takes only one hour to read, but it will help you to avoid many mistakes and enable you to understand the origin of those you might have made in the past.

* * *

PAOLO MANCA, professor of Probability Calculus and Chair of Financial Mathematics at the University of Pisa, Director of the Masters Program of Finance and Financial Markets, currently works as a financial consultant specialized in derivative valuations and savings protection. He is president of the association CERFIDI (Centro Studi di Finanza e Diritto) and author of numerous publications in financial economics.


1. Warning

This book was prompted by the contents of some of the textbooks on probability and statistics that are on the reading lists of university courses even today, in 2016 AD1, by the entertaining drivel floating about on the internet, and by the survival of internet sites and television adverts that sell “foolproof” methods of winning at gambling and making a fortune on the stock market.

I would never have believed it possible that such nonsense about probability could still be said and read, but there we are.

Although the fundamentals of the intrinsic nature of probability were laid out in the 1930s, especially by the Italian genius Bruno de Finetti (as always, appreciated more abroad than at home), and excellently restated and enlarged upon by distinguished scholars, it appears that their work was all in vain.

Surprised that his ideas, simple yet brilliant, are still ignored, I felt obliged to set out, with all due humility, Bruno de Finetti’s thinking on probability, trying, as far as possible, to avoid technicalities, in order to be understood not only by those familiar with the field but also by those who are not familiar with mathematics and its jargon.

It is, perhaps, a foolhardy mission, since I will doubtless be criticised by experts and non-experts alike. The former will object to certain simplifications that I myself acknowledge to be rather insouciant; the latter will object to the more complex reasoning that I have been forced to resort to from time to time.

I think of the great debates of past centuries: the Ptolemaic system versus the Copernican, the theory of the evolution of species versus creationism, the theories of relativity of Galileo and Einstein. Nonetheless, I am surprised to note that, even today, there is still debate about the nature of probability.

Yet probability can be expressed in this single sentence:

In the context of a non-deterministic view of the world (which seems more appropriate than a rigid determinism), probability is a number that measures the degree of confidence that a given subject has in the truth of a given statement/event, whilst the probability calculus (PrC) is a set of rules and theorems that allow us to draw logical inferences from initial evaluations of probability.


2. Uncertainty

The term uncertainty is the opposite of the term certainty; we will use the word uncertainty interchangeably with the words casualness, chance, aleatory (randomness) and stochasticity, while stressing its conceptual difference from terms such as inaccuracy, vagueness and indeterminacy.

Chance (in Italian caso) derives from the Latin casus (fall), used in the sense of “that which befalls one”; aleatory (random) derives from alea, which in Latin means die (hence the famous words of Caesar: alea iacta est –the die is cast–); stochastic comes from the Greek stochastikos, which literally means conjectural.

Unfortunately, in our society, much of the organisation of knowledge carries deterministic connotations and uncertainty is seen as disorder, irregularities as uncomfortable. Yet uncertainty should be an essential physiological feature in many areas and in many models of both classical sciences and social sciences.

Uncertainty is something that characterises the whole of our lives and the world around us, and the need to make decisions in unpredictable conditions occurs daily.

Sometimes the lack of certainty is inherent because it relates to future events. Sometimes it relates to the lack of available information, which could be obtained, but which is not, for reasons of the cost or the length of time obtaining the information would involve.

When we need to make a decision quickly, on issues of limited significance, we trust in common sense and often in instinct, and this is inevitable.

However, there are important issues and decisions that can have a profound impact on the course of our lives and for which we must proceed in the most well informed way possible: it is here that probability and probability calculus (PrC) are an essential tool for those who wish to recognise uncertainty and manage it responsibly.

We must therefore, as mentioned above, reject the deterministic view of the world around us and instead embrace probability and the probability calculus offered by science as appropriate tools for reasonable people to use to understand and manage rationally a large part of the considerable uncertainty that surrounds us.

Reflect upon the criteria that you use to decide how and where to invest your savings, how to insure against adverse events (death, theft, fire, etc.), how to choose a partner, where to go on holiday...

Try to answer the following questions:

Why in poker are four aces worth more than a full house and, in general, why is four of a kind worth more than a full house?

If I choose two lottery numbers and they come up, how much should I win?

In roulette, the house has a small advantage (the zero) but also many expenses (premises, croupiers, management...) yet still makes a significant profit. How come?

What comment would you make on the following assertions?

You have to risk more to earn more.

You shouldn’t put all your eggs in the same basket.

Weather forecasts are very unreliable.


3. The simple lottery

To understand the principles of probabilistic logic it is helpful to consider the so-called games of “pure chance”. It was from these that the first rational approaches to measuring uncertainty, and the birth of probability calculus (PrC), originated.

The first documented “scientific” approach to probability calculus is the book De ludo aleae (On The Game of Dice), by Girolamo Cardano, probably written around 1525. The book, although published posthumously in 1663, shows the level of understanding of PrC at the time.

There is also a very important correspondence (six letters), written around 1654, between the Chevalier de Méré, Blaise Pascal and Pierre de Fermat. De Méré posed a number of questions related to games of chance, including one comparing the probability of throwing at least one six in four throws of a single die with the probability of throwing at least one double six in twenty-four throws of two dice.2
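The comparison raised in this correspondence can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes true dice and independent throws, and the function name `at_least_one` is invented here.

```python
from fractions import Fraction

def at_least_one(p_single, throws):
    """Probability that an event of probability p_single occurs
    at least once in `throws` independent throws."""
    # "At least once" is the complement of "never".
    return 1 - (1 - p_single) ** throws

# At least one six in four throws of a single die.
p_six = at_least_one(Fraction(1, 6), 4)

# At least one double six in twenty-four throws of two dice.
p_double_six = at_least_one(Fraction(1, 36), 24)

print(round(float(p_six), 4), round(float(p_double_six), 4))
```

Under these assumptions the first bet is slightly better than even and the second slightly worse.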

Without entering into historical digression, we can say that from that correspondence was born a “science”, aimed initially at helping people to manage the uncertainty linked to games of chance, but subsequently used in many fields of knowledge to provide convincing models to explain the reality around us.

The basis of, and the many applications of, probability and PrC can be illustrated using a simple yet powerful model, familiar even to those not working in the field: the urn.

To describe the model of the urn let’s consider a common lottery.

Imagine an (ideal) urn that contains N indistinguishable balls; each ball contains a number, from one to N. One ball is extracted from the urn.

Before the ball is extracted, those who wish to play the lottery pay a certain sum s to choose a number (obviously between 1 and N).

If the number inside the ball extracted is the same as that chosen by the player, he receives a prize R from the promoter of the lottery, the bank.

Suppose for these purposes that the players and the bank have exactly the same information, that is, that the balls are indistinguishable from one another.3

Naturally, one can participate in the lottery without questions or analysis, just for the pleasure of betting. However, it makes sense to ask ourselves whether there are logical criteria for making decisions for those who want to participate, whether as players or as the bank. Namely, given that the amount of the prize R is fixed, how much, s, would it be reasonable for a player to pay to participate in the game?

To interpret the word “reasonable”, note that the player would like to pay as little as possible; the bank, on the other hand, would like him to pay as much as possible. Therefore, both need the protection afforded by accepting a few “reasonable”, indeed indispensable, rules of conduct.

We can suppose that the participants (both players and bank):

1) are greedy: they prefer to have more money rather than less;

2) consider that all the balls in the urn are indistinguishable;

3) will set the price and the prize so that no participant is guaranteed to make money.

The first two conditions are self-explanatory; the third essentially states that the lottery must have an uncertain outcome for everyone: if there were a strategy that allowed one party to be absolutely guaranteed to make money, then everyone would want to take that role and this would make it impossible to play.

We say that the price paid or received is fair to the participants if it is within the constraints of the conditions 1-2-3 above. We also say that the participants are coherent if they meet the criteria in 1-2-3.

We can conclude that:

The fair price s to pay to receive the prize R, if you are coherent, is equal to the prize multiplied by the probability of choosing the correct number, the latter being equal to the ratio of favourable outcomes to possible outcomes.

(1) s = p · R

The product p · R is also called the mean value of the bet.

It is not difficult to demonstrate that the fair price for a rational participant must be equal to the prize R divided by N:

(2) s = R / N

From which:

(3) s / R = 1 / N

If we denote the ratio s / R as p, we define p as the probability of the extraction of a given number. So:

(4) p = 1 / N

is the fair price to be paid/asked for a ticket when the prize is R = 1.

The demonstration is simple.

It must be N · s ≥ R. In fact, if it were N · s < R, a player could buy a ticket for every number and be certain to win R and thus to pocket overall the sum R – N · s, greater than zero.

It must be N · s ≤ R. In fact, if it were N · s > R, the bank would sell tickets for every number, pocketing overall the sum N · s – R, greater than zero.

It therefore cannot be other than: N · s = R.
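The argument can be sketched in code as a check that neither side has a guaranteed profit. This is a minimal illustration, not part of the original demonstration; the function names and the sample prices are hypothetical.

```python
from fractions import Fraction

def sure_profits(n_balls, price, prize):
    """Guaranteed profit of each side when every number is covered.

    Returns (player, bank): the player's profit from buying all
    n_balls tickets, and the bank's profit from selling all of them.
    """
    player = prize - n_balls * price
    bank = n_balls * price - prize
    return player, bank

def is_fair(n_balls, price, prize):
    """Condition 3: no participant is guaranteed to make money."""
    player, bank = sure_profits(n_balls, price, prize)
    return player <= 0 and bank <= 0

# With N = 90 and R = 1, only s = 1/90 leaves nobody a sure win.
for s in (Fraction(1, 100), Fraction(1, 90), Fraction(1, 80)):
    print(s, is_fair(90, s, 1))
```

A price below R / N hands a sure profit to the player, a price above it hands one to the bank.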

It is interesting to note that the probability of guessing the number extracted is equal to 1 / N and that is equal to the ratio between the number of favourable outcomes (one outcome) and the number of possible outcomes (N outcomes).

More generally, if a complex bet can be deconstructed and expressed as a combination of simpler bets, we can define the mean value of the complex bet as the sum of the mean values of the simple bets that compose it.

Remembering that in a fair lottery the ratio between the price and the prize must be equal to the probability of winning, it should be stressed that this ratio does not depend on the absolute size of the ticket price or of the prize, because the subject is not obliged to participate in the bet and is therefore not influenced by the size of either the price or the prize.4

If we take as a starting point the reasoning linked to a bet on the extraction of a single number we can, using the logical connectors and, or and not, consider more complex bets, such as whether the number drawn will be odd or even, lower than or higher than a given number, whether, if two numbers are extracted, they will be consecutive, and so on.

The result obtained in the particular case of the extraction of only one ball, is generalized for those more complex bets and, with reasoning similar (albeit more complicated) to that used above, still shows that the fair price is equal to the mean value of the bet.

The fair price for those bets is always the product of the prize and the probability of winning it, that probability being defined as the ratio of favourable outcomes to all possible outcomes.

So for N = 90 and R = 1, if you bet that the number drawn will be even, the fair price will be p = 45/90.

If you bet that the number drawn will be higher than 35 the fair price will be p = (90 – 35)/90 = 55/90. If you bet that the number extracted will range between 10 and 35 it will have to be p = (35 – 9)/90 = 26/90.
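These three prices can be reproduced by simply counting favourable outcomes. The sketch below assumes the same urn (N = 90, R = 1); the helper `fair_price` is a name chosen here for illustration.

```python
from fractions import Fraction

def fair_price(is_favourable, n_balls=90, prize=1):
    """Prize times (favourable outcomes / possible outcomes)."""
    favourable = sum(1 for k in range(1, n_balls + 1) if is_favourable(k))
    return prize * Fraction(favourable, n_balls)

even = fair_price(lambda k: k % 2 == 0)          # 45 favourable numbers
above_35 = fair_price(lambda k: k > 35)          # 36, 37, ..., 90
between = fair_price(lambda k: 10 <= k <= 35)    # 10, 11, ..., 35

# Fraction reduces automatically, so 45/90 is shown as 1/2.
print(even, above_35, between)
```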

Intuitively this result conforms with the “spontaneous”, unchallenged reasoning that, very roughly, goes like this: “If all the numbers have the same chance of being selected and therefore none has an advantage, the price of the lottery ticket must be proportionately lower the higher the number of possible outcomes, and proportionately higher the greater the number of favourable outcomes.”

In fact, one can demonstrate very easily that this is the case by noting that the price of taking part in a lottery that offers the certainty of winning the prize R cannot be other than equal to R, bearing in mind that the price must be the mean value.

It seems reasonable to conclude at this point, that probability, that is the ratio of favourable outcomes to possible outcomes, represents a fair and consistent measure of the degree of confidence that the participants have in the chance of winning, in the extraction of a ball in a simple urn lottery, knowing that the various balls in the urn are indistinguishable from each other. In summary:

If we accept the conditions of reasonableness 1-2-3 above and we want to participate, as players or bank, in a simple urn lottery, we must know that the fair price of the lottery ticket is always equal to the prize multiplied by the probability of winning it, such probability being defined as the ratio of favourable (winning) outcomes to possible outcomes.5

We have, perhaps, arrived at this conclusion too rapidly but we will return to it later. What we want to stress now is that the probability, as it has been defined, the ratio of favourable outcomes to possible outcomes, is operational, in the sense that it is defined by means of a process that allows calculation, and is compatible with the conditions of reasonableness 1-2-3.

In the context of the simple urn lottery, this probability, defined and understood as the ratio between favourable outcomes and all possible outcomes of a given event, is known as classical probability.

Given that two events are said to be mutually exclusive if the occurrence of one excludes the occurrence of the other and vice versa, we can immediately see that classical probability has the following properties:

a) it is always between zero and one;

b) the probability of an event that is the logical union of two mutually exclusive events is equal to the sum of the probabilities of those two events.

Under point (a), note that if there are no favourable outcomes to the event (impossible event) the probability is obviously zero, while if all outcomes are favourable to the event (certain event) the probability is one.

Under point (b), note that, given that the two events are mutually exclusive, if you bet that at least one will occur then the favourable outcomes of the occurrence of either event are equal to the sum of favourable outcomes of each event; it follows that the probability that one or other of these two events occurs is equal to the sum of their respective probabilities.


So, for example, if I have to evaluate the chances that a card drawn from a deck of 52 cards is black, the card could be a spade (probability 13/52 = 1/4) or a club (probability 13/52 = 1/4). The probability of drawing a black card is the sum of the probabilities of these two mutually exclusive events, drawing a spade or drawing a club: 1/4 + 1/4 = 1/2.
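The card example can be verified by enumerating the deck and counting. This is a small sketch; the suit labels and the helper `prob` are illustrative choices.

```python
from fractions import Fraction
from itertools import product

SUITS = ("spades", "clubs", "hearts", "diamonds")
RANKS = range(1, 14)
DECK = list(product(SUITS, RANKS))  # 52 (suit, rank) pairs

def prob(event):
    """Classical probability: favourable cards over all cards."""
    return Fraction(sum(1 for card in DECK if event(card)), len(DECK))

p_spade = prob(lambda card: card[0] == "spades")
p_club = prob(lambda card: card[0] == "clubs")
p_black = prob(lambda card: card[0] in ("spades", "clubs"))

# Spade and club are mutually exclusive, so the probabilities add.
print(p_spade, p_club, p_black)
```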

Observation 1

Let’s not forget that knowing the fair price does not necessarily mean adopting it, or at any rate being obliged to adopt it. Apart from anything else, people play lotteries that are not even-handed and, conversely, ultra-cautious people refuse investments that carry risk even when they are offered at a price below their mean value.

However, knowing the fair price means knowing how much it costs to participate in a lottery that is not even-handed.

If the difference between the price we are willing to pay and the fair price is positive, it’s an indicator of our willingness to take risks, or, if you like, the price of the thrill (fleeting) of dreaming that we might become millionaires; if it’s negative, it measures our aversion to risk or, if you like, the price of the terror (lasting) of thinking that we might become poorer. Usually a person is more inclined to take the risk if the price of the bet appears very modest in comparison to his financial resources, and much more averse to the risk if the price seems significant in the context of his financial position.

If I have a monthly income of €3,000 and savings of €500,000, I can buy a lottery ticket for €5, even though the mean value of the ticket is one euro, to have the possibility of winning a million euros (though the probability of doing so is vanishingly small). Buying the ticket and losing has little effect on my financial situation, but if I win my situation changes massively, from savings of €500,000 to a balance of €1,500,000.

If, however, the price of the ticket is €5000 and the prize a billion euros, it would be more difficult to bet for the opposite reason. After all, the possibility of having a billion instead of a million is not markedly different for me: the only difference would be having the worry of how to spend 999 more useless millions.

Observation 2

In a “fair” bet, neither the bank nor the players are guaranteed to make money; in reality the bank, that is, those who organise the lottery, bears the expenses of running it and is therefore motivated to organise it only if the players pay a bit more (a mark-up) than the fair price, at least enough to cover the expenses.

In French roulette, there are 37 numbers on which to bet: 0, 1, 2, ... , 36.

For a bet of one euro, in the case of a win, the bank pays 36 euros instead of the 37 that a fair game would pay.

Seen like this, the mark-up on the fair price seems modest, being: 1/36 – 1/37 = 0.00075

Many states do not behave like this, using instead “mark-ups” amounting to usury.

It is worth looking at some instructive examples from the game of lotto (leaving aside details, not relevant to the present argument, of rules and regulations that can vary). In the Italian state lottery, for those who have two numbers drawn, the lotto pays 250 times the stake instead of 4,005; for three it pays 4,500 instead of 117,480; for four it pays 120,000 instead of 2,555,190; for five it pays 6,000,000 instead of 43,949,268.
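The size of these mark-ups is easier to judge side by side. The sketch below is a rough comparison under stated assumptions: the “fair” multiples quoted for the lotto appear to be the binomial coefficients C(90, k), so they are recomputed with `math.comb`, and the keys 2 to 5 (how many numbers are matched) are an interpretation of the text.

```python
from fractions import Fraction
from math import comb

# Roulette: price per euro of prize actually charged (1/36)
# minus the fair price (1/37).
roulette_markup = Fraction(1, 36) - Fraction(1, 37)
print(round(float(roulette_markup), 5))

# Lotto: multiples paid according to the text, keyed by how many
# numbers are matched (assumption: 2, 3, 4 and 5 numbers).
paid = {2: 250, 3: 4_500, 4: 120_000, 5: 6_000_000}

# "Fair" multiple taken as C(90, k), following the figures in the text.
fair = {k: comb(90, k) for k in paid}

for k in paid:
    ratio = fair[k] / paid[k]
    print(k, paid[k], fair[k], round(ratio, 1))
```

The last column shows how many times larger the quoted fair multiple is than the multiple actually paid.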

In general all lotteries, apart from being unfair, are also immoral because they hit, indeed rob, the very weakest and the least advantaged; to quote one cynical saying, the poor should be taxed above all, because there are so many of them.

Observation 3

All of the considerations we have examined so far deal with a single bet. Very often, those who bet do so repeatedly, and therefore take part in a series of lotteries. Let’s call a succession of lotteries/bets, not necessarily identical, the game of lotteries (or at times simply the game).

Unfortunately, it is the case that a game made up of a sequence of fair bets would not be “worthwhile”. We will spend some time later on this surprising and “unfortunate” state of affairs but we can anticipate the overall conclusion: extended participation in a game of lotteries, whether fair or not fair, by a man who is not rich, leads inevitably to his ruin.

To have any chance at all, you have to be very much richer than the other players.

We will look later at the detailed reasons that make running a casino economically advantageous, even with fair games.

Observation 4

Let’s not forget that the sizes of s and R have no influence on the rule that allows us to quantify probability. The thing that counts is the ratio s / R.

To put it another way, in determining s, neither the participants’ propensity or aversion to risk nor their financial position is relevant.

However, the propensity or aversion to risk plays a significant role in the lotteries that are in reality offered to the public.

The price charged for a ticket is much higher than the “fair” price, but being a modest sum in comparison to prizes of very significant amounts, tempts many to participate.

Running lotteries is extremely lucrative for the bank, and for this reason many states, with that unique sense of ethics which sets them apart, reserve all rights in the matter to themselves.6

2 If the dice are true and the person throwing them is not a con-artist, the so-called “classical” probability is, in the first case, 0.5177 and, in the second case, 0.491.

3 Note however that in real lotteries not everything works in the same way as in the perfect model. There is the famous case in which the balls were extracted by blindfolded children, but the balls “to be extracted” were heated. Obviously, in that case the model would not be applicable.

4 The explanation that only the relationship R / s counts here may seem nit-picking, but in fact it is a very important aspect of the definition, because if the participant were really obliged to bet he would behave in a different way, depending on the (absolute) value of the outcomes and not on their relationship.

5 Repetita iuvant.

6 More than a few states, with a marked sense of ethical behaviour, reserve the right to control betting to “fight against illegal betting”, in the same way as they allow the sale of cigarettes for the sole purpose of having adverts on the packet to inform us that “Smoking kills”. Deeply moving.

4. Composite lotteries

The simple lottery model can be generalised in various ways, by considering in particular the extraction of a predetermined number of balls or putting the ball extracted back in the urn (returning it) or not putting it back (not returning it).

One can also imagine constructing variations on the simple lottery based on criteria of descending urns: one urn, the mother urn, contains indistinguishable balls with numbers inside. The number extracted from the mother urn identifies the daughter urn, the extraction of a ball from the daughter urn identifies the granddaughter urn, and so on.

In general, the extraction of a ball from the superior level identifies the urn at the next level, from which is extracted the ball that identifies the urn at the next level and so on until the urn at the final level.

Fortunately, all of the variants of the composite lottery described above can be taken back to a simple lottery, and therefore all of the observations previously made remain valid here too. Composite lotteries are preferred to simple ones because, described in this way, the formulation of problems of probability is intuitively more comprehensible.

The paradigm of the composite lottery is, in its turn, very powerful and encompasses issues that take up a great deal of space in advanced courses in PrC.

An inexhaustible theme is that of the random walk, which includes, amongst others, Brownian motion and Markov chains (Andrej Markov, 1856-1922): they involve complex arguments which have applications in many fields.

We will look at Brownian motion later, but here we can introduce Markov chains with the tale of the enchanted castle (and I do hope Markov will forgive me).

The usual princess, victim of a magic spell, is condemned to sleep every night in the bedroom corresponding to the number that she draws from an urn; every bedroom in the castle is equipped with an urn containing balls with the numbers of the bedrooms. The princess takes a ball from the urn in the room where she has slept and she has to sleep the next night in the room she has drawn. Prince Charming knows the composition of the various urns and therefore the probability of moving from one room to another, and has as his aim finding the princess.
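To make the tale concrete, here is a minimal sketch of how Prince Charming might reason. Everything specific in it is invented for illustration: a castle with three bedrooms (numbered 0, 1, 2) and the composition of each urn, encoded in `TRANSITION`.

```python
from fractions import Fraction

# Hypothetical urns: TRANSITION[i][j] is the probability that the urn
# in bedroom i yields the number of bedroom j. Each row sums to one.
TRANSITION = [
    [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)],
    [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)],
    [Fraction(0), Fraction(1, 2), Fraction(1, 2)],
]

def next_night(dist):
    """Probabilities of each bedroom tomorrow, given those of tonight."""
    rooms = range(len(TRANSITION))
    return [sum(dist[i] * TRANSITION[i][j] for i in rooms) for j in rooms]

def after_nights(start_room, nights):
    """Where the princess may be `nights` draws after sleeping in start_room."""
    dist = [Fraction(int(room == start_room)) for room in range(len(TRANSITION))]
    for _ in range(nights):
        dist = next_night(dist)
    return dist

# The prince knows she slept in bedroom 0 two nights ago.
print(after_nights(0, 2))
```

The only information needed for each step is the room where she slept last, which is the defining feature of the model.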

Let us demonstrate by example how the complex lottery can be reduced to the equivalent of a simple lottery. As usual, we will take simple examples, observing that the reasoning applies more generally.

Let’s consider an urn with 5 balls, numbered from one to five, from which we extract two balls, without returning the first ball. The process is equivalent to the extraction of one ball from an urn containing 10 balls marked with the following pairs of numbers: (1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5).

Now let’s consider an urn of 5 balls, numbered one to five, from which we extract two balls in succession, this time returning the first drawn before drawing the second. This is equivalent to drawing one ball from an urn containing 25 balls marked with all of the possible ordered pairs of the 5 numbers:

(1, 1), (1, 2), (2, 1), (1, 3), (3, 1), (1, 4), (4, 1), (1, 5), (5, 1),

(2, 2), (2, 3), (3, 2), (2, 4), (4, 2), (2, 5), (5, 2),

(3, 3), (3, 4), (4, 3), (3, 5), (5, 3),

(4, 4), (4, 5), (5, 4),

(5, 5).
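Both equivalent urns can be built mechanically. The sketch below uses Python’s `itertools`: unordered pairs for the draw without return, ordered pairs for the draw with return.

```python
from itertools import combinations, product

balls = range(1, 6)  # the urn with balls numbered one to five

# Two balls drawn without returning the first: unordered pairs
# of distinct numbers.
without_return = list(combinations(balls, 2))

# Two balls drawn in succession, returning the first: ordered pairs,
# repetitions allowed.
with_return = list(product(balls, repeat=2))

print(len(without_return), without_return)
print(len(with_return))
```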

Let’s consider a mother urn containing one white ball and one red. The extraction of the white ball identifies the daughter urn B. The extraction of the red ball identifies the daughter urn C. We proceed to extract a ball from the mother urn; this identifies the daughter urn from which, in turn, we extract a ball. This process is equivalent to extracting one ball from a single urn whose balls are marked with pairs: the first mark records the ball drawn from the mother urn (w or r) and the second the ball drawn from the daughter urn, for example (w, b) and (w, r) for the balls of urn B, and likewise for the balls of urn C.

Among the composite lotteries, those composite lotteries with variable composition merit attention: the composition of the urn changes according to the ball extracted. The simplest example is the lottery with additions: there are black and white balls. If a white (or black) ball is extracted the ball is returned to the urn and a further ball (or balls) of the same colour is added to the urn.

Again, the lotteries with variable composition can be reduced to composite lotteries, which in their turn can be reduced to simple lotteries.

With reference to the lottery with additions, let’s think of the initial lottery as a mother lottery and (at the first descending level) of two daughter lotteries. The first contains an additional white ball with respect to the mother lottery, the second contains an additional black ball. The extraction of the white ball identifies the first daughter lottery, the extraction of the black ball identifies the second daughter lottery.
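The reduction of the lottery with additions to a simple lottery can be sketched by counting pair-marked balls, one for each (first draw, second draw) combination. The starting composition (2 white, 3 black) and the function name are assumptions made for the example.

```python
from fractions import Fraction

def second_draw_white(white, black, added=1):
    """Probability that the second ball is white in a lottery with additions.

    Counts the balls of the equivalent simple urn: each ball of the
    mother urn is paired with each ball of the daughter urn it selects.
    """
    daughter_size = white + black + added
    total_pairs = (white + black) * daughter_size
    # First white: the daughter urn holds white + added white balls.
    # First black: the daughter urn still holds `white` white balls.
    favourable = white * (white + added) + black * white
    return Fraction(favourable, total_pairs)

print(second_draw_white(2, 3))  # 12 favourable pairs out of 30
```

In this example the probability of white on the second draw equals the probability of white on the first draw, 2/5.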

Observation

In classical probability calculus, to evaluate the possible and favourable outcomes one has to calculate the number of combinations of n things taken k at a time without repetition, and the number of arrangements of n things taken k at a time.

The k-combinations are all the subsets of k of the n objects that differ from each other by at least one object. The arrangements are all the ordered selections of k of the n objects that differ from each other by at least one object or by order.

The number of combinations is indicated by C_{n, k}, that of the arrangements by D_{n, k}. Result:

C_{n, k} = n · (n – 1) · ... · (n – k + 1) / k!

where k! (k factorial) is the product of the first k whole numbers: 1 · 2 · ... · k.

The arrangements are “more” than the combinations and are precisely:

D_{n, k} = k! · C_{n, k} = n! / (n – k)!

and in fact from every combination of k objects, permuting them in all possible ways, we obtain k! arrangements.
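These counts are available directly in Python’s standard library (`math.comb` and `math.perm`, Python 3.8 or later), and can be cross-checked by explicit enumeration. The values n = 5, k = 2 are just an example.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

n, k = 5, 2

c_nk = comb(n, k)   # combinations: order does not matter
d_nk = perm(n, k)   # arrangements: order matters

# Cross-check by listing the selections explicitly.
listed_combinations = list(combinations(range(1, n + 1), k))
listed_arrangements = list(permutations(range(1, n + 1), k))

print(c_nk, d_nk)
print(d_nk == factorial(k) * c_nk == factorial(n) // factorial(n - k))
```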


5. Kolmogoroff’s formal model

The simple (compound) urn model is not appropriate for bets on events such as a football match, a horse race, or a boxing or tennis match. Equally, the model is not appropriate for a world of other applications of probability in the fields of medicine, science and social science. The need to widen the classical definition of probability and the use of the simple urn model is addressed using two approaches: the first is the formal, the second is that of the search for a satisfactory operational definition.7

The formal approach to the question was addressed and resolved by a Russian mathematician, Andrey Nikolaevic Kolmogoroff (1903 – 1987), around 1930. He observed that classical probability enjoys two simple and evident properties:

a) It is always between zero and one.

b) The probability of the union of two mutually exclusive events is equal to the sum of the probabilities of those events.

Kolmogoroff, deliberately without troubling himself with specifying how to measure probability, constructed a formal model of probability calculus, defining probability through the properties set out above.
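That classical probability does enjoy both properties can be checked exhaustively on a small example. The sketch below uses one throw of a die as the urn (an assumption made for illustration), lists all 64 events as sets of outcomes, and tests properties a) and b) on every one of them.

```python
from fractions import Fraction
from itertools import combinations

OUTCOMES = tuple(range(1, 7))  # one throw of a die

def all_events(outcomes):
    """Every subset of the outcomes, from the impossible to the certain event."""
    return [frozenset(c)
            for size in range(len(outcomes) + 1)
            for c in combinations(outcomes, size)]

def prob(event):
    """Classical probability: favourable over possible outcomes."""
    return Fraction(len(event), len(OUTCOMES))

EVENTS = all_events(OUTCOMES)

# a) every probability lies between zero and one
bounded = all(0 <= prob(e) <= 1 for e in EVENTS)

# b) for mutually exclusive events the probability of the union is the sum
additive = all(prob(a | b) == prob(a) + prob(b)
               for a in EVENTS for b in EVENTS if not (a & b))

print(len(EVENTS), bounded, additive)
```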

The idea of constructing an axiomatic formal model starting from undefined terms (or primitive notions) and axioms is not new in mathematics and Euclidean geometry is but the most famous example. A formal model is constituted from the undefined terms, terms derived by definition from the undefined terms, a non-contradictory group of axioms and all of the theorems that one can derive from the axioms.

The undefined terms are characterised exclusively by the axioms and the theorems arrive as logical consequences of the formal rules.

Naturally, in the same way as for Euclidean geometry and for any formal systems, the usefulness of the formal model emerges if a “real model”, or, using technical terms, an interpretation exists, which is obtained by giving a real meaning to the undefined terms in a way that, using that meaning, the axioms show themselves to be true, turning into real formulations.

But we will look at this later.

To introduce Kolmogoroff’s formal model, we need definitions of the terms: event/proposition, algebra of events, mutually exclusive events, and partition.8


An event is a proposition for which it makes sense to ask oneself whether it is true or false.

1) “Two is a number” is an event

2) “Giovanni is tall” is not an event if what is meant by tall is not unambiguously defined.

3) “The 4th of April 1837 was a Thursday” is an event.

4) “The pairs of prime numbers with a difference of two are infinite” is an event.

5) “The world high jump record in 2030 will be not lower than 250 centimetres” is an event.

Events can be connected to one another using the logical connectives or (disjunction), and (conjunction), not (negation).

If we denote the events with upper case letters, disjunction with ∪, conjunction with ∩, negation with ¬, we can verify that the symbols ∪, ∩, ¬ enjoy the same formal properties that characterise union, intersection and complement in the theory of sets. This “correspondence” renders the use of the symbols very pleasing.9

An algebra of events is a family of events that contains the impossible event, which we denote as Φ, and the certain event, which we denote as Ω, and such that if two events A and B belong to the algebra then their disjunction A ∪ B is an event that belongs to the algebra.

With this terminology and these notations, an algebra of events Π is a family of events which always contains the events Ω and Φ and which is closed with respect to the operations of conjunction, disjunction and negation.

The term “closed” means that if certain events belong to the algebra, then every event which one can obtain by combining these events using the logical operations outlined above also belongs to the algebra.10

A partition of the certain event is defined as any family of mutually exclusive events whose disjunction is the certain event.

An algebra of events is a simple object, which we often use implicitly: every algebra formed from a finite number of events (and here we are speaking only about that) is in fact constituted from a partition of events and of all of the events that one can possibly construct by combining them using the logical connectives.

If A₁, ... , A_n is a partition of events, the algebra generated by these events is formed by the events themselves, by their unions two at a time, three at a time, … , n – 1 at a time, and also by Ω and Φ. Overall, there are 2^n events.

In this sense, the events of a partition are like differently coloured Lego bricks that are put together in every possible way to produce every event in the algebra.
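The "Lego brick" construction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `generated_algebra` and the representation of an event as a `frozenset` of partition elements are choices made here, not notation from the text.

```python
from itertools import combinations


def generated_algebra(atoms):
    """Return every event of the algebra generated by a finite partition.

    Each event is a frozenset of partition elements: the empty frozenset
    plays the role of the impossible event, the full set that of the
    certain event.
    """
    atoms = list(atoms)
    events = []
    for size in range(len(atoms) + 1):
        for combo in combinations(atoms, size):
            events.append(frozenset(combo))
    return events


# Two coins tossed in succession: a partition of four events.
atoms = ["HH", "HT", "TH", "TT"]
algebra = generated_algebra(atoms)
print(len(algebra))  # 16, that is 2**4
```

With this representation, disjunction, conjunction and negation become set union, intersection and complement with respect to the full set of partition elements, so the closure of the algebra can be checked directly.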

An event E generates an algebra consisting of: E, its negation, the impossible event and the certain event. (Moreover, it is the smallest algebra that contains the event E).

Taking two coins tossed at the same time, and denoting by H a coin that lands heads and by T one that lands tails, we have the algebra consisting of the events (H, H), (H, T), (T, T), and of the events that one obtains by combining these events using the logical connectives or, and, not.

Taking a throw of a die and its six events: throwing a one, two, three, four, five, six and all of the combinations of events that are obtained...

Toss a coin: the partition is formed by only two events: heads (H), tails (T). The algebra generated is formed by the events Ω, Φ, H, T (note that Ω = T ∪ H = ¬ Φ).

Toss two coins at the same time. The partition is formed by the events (H, T), (H, H), (T, T). The algebra generated by those three events comprises the events themselves, their unions two at a time, (H, T) ∪ (T, T), (H, T) ∪ (H, H), (T, T) ∪ (H, H), and also Ω, Φ.

Toss two coins in succession: the partition is formed by the events (H, T), (T, H), (H, H), (T, T). The algebra generated is formed from the four events themselves, from their unions two at a time and three at a time, and also Ω, Φ.

Simultaneous extraction of three balls from an urn containing white w and red r balls: the partition is formed by the trios (r,r,r), (w,w,w), (r,r,w), (r,w,w).

Extraction in succession of three balls with replacement in the urn: the partition is formed by the trios (r,r,r), (r,r,w), (r,w,r), (w,r,r), (r,w,w), (w,r,w), (w,w,r), (w,w,w).

Two events of an algebra, A and B, are mutually exclusive if the occurrence of one excludes the possibility of the occurrence of the other, or, equally, if we know that A is true it follows that B is false and vice versa. Expressed as a formula, two events A and B are mutually exclusive if:

A ∩ B = Φ

At this point, we can introduce the formal definition of probability.

A probability, P, defined on an algebra of events Π is a law that attributes to every event of the algebra a number which enjoys the following properties:

a) 0 ≤ P (A) ≤ 1

b) P (A ∪ B) = P (A) + P (B) if A ∩ B = Φ

Ultimately a probability on Π is any function defined on Π that attributes to every element of Π (event) a value between zero and one and which enjoys the properties a) and b).
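As a small check of properties a) and b), the sketch below builds one such function on the algebra generated by the six faces of a die and verifies both properties on every event. The equal weights of 1/6 and the names `atom_prob` and `prob` are assumptions made for the illustration; any non-negative weights adding up to one would do.

```python
from fractions import Fraction
from itertools import combinations

# Assumed example: the six faces of a die, each given probability 1/6.
atom_prob = {face: Fraction(1, 6) for face in range(1, 7)}


def prob(event):
    """Probability of an event given as a set of faces of the die."""
    return sum((atom_prob[a] for a in event), Fraction(0))


# Every event of the generated algebra (2**6 = 64 of them).
events = [frozenset(c) for k in range(7) for c in combinations(atom_prob, k)]

# Property a): every value lies between zero and one.
assert all(0 <= prob(e) <= 1 for e in events)

# Property b): additivity for mutually exclusive events.
for a in events:
    for b in events:
        if not (a & b):  # A and B have no face in common
            assert prob(a | b) == prob(a) + prob(b)

print(prob(frozenset({2, 4, 6})))  # 1/2
```

Nothing in the check depends on how the weights were chosen, which is exactly the sense in which the formal definition is silent on how probability is measured.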

Kolmogoroff’s axiomatic formulation stimulated further research and gave rise to the development of a “new” discipline: probability calculus (PrC).

The PrC reached extraordinary levels of sophistication, making use of, amongst other things, a great number of results acquired in another sector of mathematics, “measure theory”.

Observation 1

It can almost immediately be demonstrated that finite additivity derives from property (b): if A₁, A₂, ... , A_k are k mutually exclusive events, then the probability that one of those k events occurs is equal to the sum of the probabilities of those events.

In the formal model, additivity is usually required not only for a finite number of mutually exclusive events but also for any countably infinite set of events (in the jargon this is sigma-additivity), and this allows, as anticipated, the abstract results and the power of measure theory to be translated into terms of probability.

Sigma-additivity, however, is problematic for the interpretation of some of its results in terms of probability, that is, the probability that measures the level of confidence of an individual towards an event. It is a thorny, technical subject that we cannot examine in depth here.

Let’s just touch upon the fact that in a countably infinite set of mutually exclusive events it is impossible, under sigma-additivity, to attribute the same probability to every event, so that even the intuitive idea of a “chance” choice creates problems. For example, the random choice of a number in the set of natural numbers leads one to attribute to every natural number a probability of zero of being extracted; nevertheless the probability assigned overall to all of the natural numbers must have a value of one.

That is not to dismiss sigma-additivity: its theoretical uses are many. It is just to bear in mind that one cannot always use it to explain completely the valuation of probability.

Observation 2

The formal definition does not include the semantic meaning that must be attributed to the term probability in order to make it a useful tool.

There are many different semantic interpretations of the formal model: for example, if one takes as the algebra of events the set of figures on a plane contained in a square of unit side and for which it makes sense to calculate the area, then the area of these figures satisfies the axioms (but the area is not a probability).

The axioms are also satisfied by the empirical frequency relative to an experiment which can give rise to different results and which can be repeated several times, not necessarily in the same conditions.

But probability is another thing: for us it is to do with an event, a circumstance, an occurrence, an affirmation, a proposition, susceptible to being either true or false and from which, on the basis of the information that we have, we cannot affirm with certainty if it is true or false, but about which we can express, with a certain degree of confidence, its truth or falsity.

7 The concept of an effective, operative definition is fundamental in science, and has changed the way of seeing and working with scientific knowledge. In the operative context, what matters is not what a physical quantity is but how we measure it. For example, understanding time and mass comes down to knowing in what way and with what instrument one measures them.


possible that different declarations can express the same proposition. So, for example, Paolo loves Maria, Maria is loved by Paolo, are different declarations that all express the same proposition.

9 Think, for example, of the famous property of set theory ¬ (¬ A) = A, by which the complement of the complement of a set A is A itself and which, in terms of events, expresses that the negation of the negation of the event A coincides with A.

10 For the more obstinate, remember that closure with respect to the union implies closure with respect to the intersection and vice versa, since A ∪ B = ¬ (¬ A ∩ ¬ B).


6. Probability: the only possible definition

Probability calculus is a branch of mathematics, as are geometry, rational mechanics, algebra, and therefore, as for those disciplines, to construct a satisfactory theory for probability we must identify

– A theoretical, formal aspect

– An interpretive aspect in the domain of reality

– A set of applications

And indeed every scientific theory organised as a logical theoretical system can be considered:

– syntactically, that is as a set of symbols that can be combined by way of the rules of logic and defined by the axioms

– semantically, that is in relation to the “intuitive-real” meaning

– in a practical sense, that is as a function of its applications.

The semantic aspect looks at the meaning and interpretation of the symbols which, in order to render the theoretical aspect useful, have to be reframed in a “representation of reality” or in the “domain of reality”, that is, a real model in which we find a correspondence between the undefined entities of the axiomatic system and entities in the real model, such that the real entities satisfy the propositions stated by the axioms.

To understand more clearly, and to avoid referring to the usual example of Euclidean geometry, let’s talk about the game of chess: the theoretical aspect is represented by a group of undefined entities (pawns, knights, bishops, castles, king and queen) and by the rules that characterise them (how the pieces are moved and how they can be taken); the semantic aspect and an interpretation of the model are, for example, the chessboard with the 32 pieces (possibly finely carved); the applications are the chess championships, the handbooks on strategy, the computer programmes that play chess etc…

For probability, the syntactic aspect is represented by the formal model, originally proposed by Kolmogoroff in which no attention is paid, quite deliberately, to the semantic aspect and to the establishment of a “real” depiction.

In fact, probability calculus is concerned with the consequences that derive from the axioms that characterise probability itself, but it is not concerned with methods capable of furnishing an operational definition of probability, that is, with a method or process that would allow a numerical measure of probability to be assigned to an event.

As for other formal models, the formal model of probability proposed by Kolmogoroff, exactly because it does not provide any criteria (nor does it aim to provide such) for measuring probability, lends itself to representing a range of real systems.

Thus, as already anticipated, the area of a figure on a plane contained in a square of unit side satisfies the axioms: in this case, the events are the figures; the logical operations of negation, conjunction and disjunction correspond to the operations of complement, intersection and union between the planar sets.

Likewise, the empirical frequency with which a characteristic appears within a population (of machines, of people…) satisfies the axioms: it is the number of cases in which the characteristic is present divided by the number of subjects that constitute the population being examined.

But probability, the probability to which we refer, is not the area, nor the empirical frequency nor yet a defined function on the algebra of events: probability of the kind that interests us expresses our degree of confidence in the truth of an event and therefore we need an operational definition that allows us to measure that.

The reference to an operational definition merits clarification. As the winner of the Nobel Prize for Physics, Percy Williams Bridgman brilliantly illustrated in his The Logic of Modern Physics (1927), many concepts in physics and in science in general are accepted without criticism (for example those of time and of mass), and therefore have a pure nominalistic value.

To define them unequivocally the need is not to identify what they are, but instead to provide an unequivocal method of measuring them.

In other words, in the field of science, we understand a magnitude only if we have an operational procedure with which we can measure it.11

To arrive at an operational definition of probability we start from the observation that the truth or falsity of an event depends on the information possessed by those who want to ascertain it. We must add that, in the absence of sufficient information, every event, whether we refer to the past or to the future, is uncertain.

One can add that, excluding fortune tellers, then, at least for those of us endowed with only normal gifts, every event that refers to occurrences in the future is uncertain.

Let’s consider the following statements/events:

– two and two are four: the event is true for those who know arithmetic

– the 4th of April 1837 was a Friday: the event is true or false for those who have or are able to consult a calendar for 1837

– the pairs of prime numbers formed by two consecutive odd numbers are infinite: the event is neither true nor false because it concerns a mathematical statement that, up until now, has been demonstrated to be neither true nor false, and we do not know whether we will ever arrive at a demonstration as to whether it is true or false.

– the world high jump record in 2030 will be not lower than 250 centimetres: to ascertain the truth we will have to wait.

The degree of certainty of an event can be expressed with various gradations, which can be qualitative or quantitative. An event, for a given subject, can be judged true or false, but also credible, implausible, and many other shades of certainty.

Probability and probability calculus are a valid attempt (and not the only one) that allows a rational subject to express the degree of certainty/uncertainty of an event in quantitative terms and which provides him with rules of behaviour adequate to manage uncertainty.

Many approaches indicate the criteria with which a subject can attribute a value to probability in the context of the algebra of events, but the only valid operational criterion is that of the bet that a subject believes to be appropriate to gamble on the truth of that event. Obviously to give unequivocal meaning to the amount of the bet, which otherwise would be “the minimum possible” (or “the maximum possible”) we have to take a subject that we will call rational and coherent and thus disposed to accept rules of behaviour that are more than reasonable.

The first rule is to accept the ability to express the real degree of certainty on the truth/falsity of an event using a numerical valuation based on betting principles.

The second rule is to accept coherence, that obliges the subject to be fair and impartial in his evaluation.

On the basis of the first rule the subject expresses his degree of certainty of the truth of an event E by establishing the amount s that he is willing to pay/receive to participate in a bet that envisages a win/loss R if and only if the event E is shown to be true.

The notation pay/receive, win/lose means that the subject must establish the value of s fairly and so, s having been established, must accept to participate in the bet, whether as the bank or as the gambler. The ratio s / R measures the probability that the subject attaches to the event E. Following Bruno de Finetti:

The probability of an event E, in the opinion of a given coherent individual, is the ratio between the price s that he estimates fair to bet and the sum R payable at the occurrence of E.12

The condition of coherence must be considered in the context of the algebra of events: the subject, in the attribution of a probability to an event in the context of the algebra of events, must respect the rule that no combination of bets on the events allows a guaranteed profit. We note that in the case of a single event E which produces the algebra formed by the events E, ¬ E, Φ, Ω, coherence dictates simply that if p is the probability of E then 1 – p must be the probability of ¬ E and vice versa.


For an algebra of events the formulation of the rule of coherence in more rigorous terms might appear over complex so we will let it be: let’s just say that coherence does not allow that the probability of events can be defined in a way to find a combination of bets that allows for the certainty of winning: no rational subject would accept that attribution of probability.
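For the simplest algebra, the one generated by a single event E, the rule can be made concrete. The sketch below assumes a subject who declares a price for the bet on E and a price for the bet on ¬ E and, as the definition requires, accepts either role in each bet. The function names and the example prices are hypothetical, chosen only to show that incoherent prices hand the opponent a certain win.

```python
from fractions import Fraction


def gambler_net(price, wins, R=1):
    """Net result for whoever pays price * R to receive R if the event is true."""
    return (R if wins else 0) - price * R


def opponent_net(p_e, p_not_e, e_is_true, R=1):
    """Net result for an opponent who exploits the subject's two prices.

    The subject must accept either role, so the opponent plays the gambler
    on both bets when the prices add up to less than one, and the bank on
    both bets otherwise.
    """
    as_gambler = (gambler_net(p_e, e_is_true, R)
                  + gambler_net(p_not_e, not e_is_true, R))
    return as_gambler if p_e + p_not_e < 1 else -as_gambler


for p_e, p_not_e in [(Fraction(3, 5), Fraction(3, 5)),   # prices add up to more than 1
                     (Fraction(3, 10), Fraction(1, 2)),  # prices add up to less than 1
                     (Fraction(3, 5), Fraction(2, 5))]:  # coherent prices
    print(p_e, p_not_e, [opponent_net(p_e, p_not_e, e) for e in (True, False)])
```

In the first two cases the opponent gains 1/5 whether E turns out true or false; only when the two prices add up to one does the guaranteed gain vanish, which is the condition stated above for the algebra formed by E, ¬ E, Φ, Ω.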

The condition of coherence, fortunately, is equivalent to a property that, at the intuitive level, is usually accepted without any difficulty: the additivity of the probability of mutually exclusive events, that is, the fulfilment of Kolmogoroff’s axioms.

Historically, the definition of probability just discussed was called “subjective probability”. This term, which specifies the very nature of probability has, unfortunately, been badly misunderstood: as if there existed other kinds of probability and, in particular, as if there also existed an objective probability.

“Probability” is by its nature subjective and represents an extension to so-called classical probability. It is an extension because, if on the basis of the available information, we hold that all events are equally probable, then, to be logical and coherent we must adopt the classical definition.13

Observation 1

If several subjects hold that, on the basis of the available information, all of the events are equally probable then they adopt the same valuation of probability: we are dealing with a valuation that is widely shared, but obviously not objective.

Observation 2

On the basis of the definition given, it appears evident how a subject, in defining his level of confidence in the truth of an event, takes into account all the information he has on the event. In this sense, in games of chance, if one has faith in the impartiality of whoever distributes the cards, or throws the dice, or the mechanism that extracts the lottery numbers and so on, and if one does not have any other relevant information, it is reasonable to choose classical probability as a logical measure of the degree of confidence.

If there is reason to believe that, for example, the mechanism of the roulette wheel is defective and therefore not all of the numbers are equally probable, and it happens that the same numbers always come up, then in the expression of our degree of confidence we would duly take account of the empirical frequency observed.

If in the end, the croupier or the manufacturer of the wheel tells us that the roulette is rigged then we behave accordingly.

On the weight to be given to previous outcomes, note that where one has to attribute probability to the outcome of a football match, no one would dream of asking the footballers to repeat the same match a hundred times; rather one reads the specialist press, one questions the experts and one considers the bookmakers’ odds.

Observation 3

Not all uncertain events can be modelled satisfactorily in the context of probability and probability calculus even in the context of so-called subjective probability, the only acceptable probability.

Uncertainty that cannot be modelled in terms of probability arises especially when it comes to deciding quickly in a context of lack of information; it should be added that total ignorance does not equate to equality of probability of the events.

There are decision-making situations of such uncertainty, and which at the same time carry such important consequences, that no reasonable person would model them in the context of probability. See in this respect also the section on fuzzy logic (par. 16).

Imagine that, whilst driving on the motorway, you see a man waving. Is he asking for help, greeting a friend or warning of a serious accident? What do you do? If you stop, you risk being hit by others. If you go ahead, you could find yourself confronted by an overturned truck blocking the roadway.

11 For example, it is exactly by starting from the process of measuring time that we discover the relativity of the concepts before and after. What is more, we discover that time, in terms of physics, exists only if space and change also exist.

12 Bruno de Finetti (1906-1985): an Italian genius and a master teacher. I invite my readers to find out more about him by reading, even on the internet, about his scientific contribution and his life. He was witty, as demonstrated by these few examples from the vocabulary he invented, and which are still apposite: “bureausaurs” for bureaucrats, “trinomites” for the illness that afflicts many teachers who are fixated on useless and annoying exercises such as those on trinomials, “adhockery” for demonstrations done ad hoc, specious, leaden, lacking in spirit and imagination.

13 In the past humanity survived quite happily, renouncing the geocentric notion of the universe, accepting the relativity of motion (Galileo), accepting the relativity of time and space (Einstein), accepting the principle of indeterminism (Heisenberg), and survived even accepting the subjectivity of probability.

But in the end what is the fundamental reason for this demand for objectivity if not a weakness of the human soul?


7. Conditional probability and Bayes’ Theorem

By definition, in measuring the probability of an event we should always specify the range of information that we possess: it is assumed that if new information becomes available the assessment of probability may change.

In this sense, strictly speaking, to represent the probability of an event A, in the opinion of a certain person at a certain time t, we should use the symbol P (A, θ, t) to emphasize the dependence on time of evaluation, t, and the information, θ, available at that time.

If, therefore, after assessing the probability of an event A, we learn that another event H took place, we must obviously ask whether and how this new information will change our initial assessment.14

In classical probability the response is related to whether the knowledge of the event H changes the number of favourable outcomes.

So the probability of drawing two aces from a pack of cards is: 12/(52 · 51) = 3/(13 · 51) = 0.004524

but if we learn that the two cards drawn are spades or clubs the probability is 2/(26 · 25) = 1/(13 · 25) = 0.003077.

The new information decreases the initial probability, because only two of the 26 cards that remain possible are aces.

If, on the other hand, we learn that the two cards drawn are both clubs then the probability that they are both aces (excluding the presence of a magician) becomes zero. The new information decreases the initial probability.

Even if we learn that in a previous draw two spades were drawn, the valuation remains unchanged because we believe that the cards in the pack have no memory.

We denote by P (A) and P (A | H) respectively the probability of A and the probability of A knowing that H occurred: P (A) for convenience “probability a priori”, indicates our degree of confidence on the truth of A not knowing H, P (A | H), for convenience “probability a posteriori”, indicates the degree of confidence that a logical individual puts in the occurrence of A in the hypothesis that H is true.

In other words, P (A | H) is the price that is judged fair to bet that event A occurs having learned that the event H occurred.

In accordance with the principle of fairness, the individual, to be coherent, must evaluate P (A | H) with the following formula:15

(1) P (A | H) = P (A ∩ H) / P (H)

The formula (1), which, in Kolmogoroff’s formal approach is taken as a definition, in terms of subjective probability is a necessary and sufficient condition for coherence.

Let’s return to the example of drawing two cards from a deck of 52. Drawing two simultaneously, what is the probability that they are both aces?

If both cards drawn are spades or clubs how is the initial probability changed?

If one of the two cards drawn is an ace what is the probability that the second is an ace?

First question: all pairs of two cards that can be formed from a pack of 52 are (52 · 51)/2 = 1326. As favourable outcomes we have the pairs of aces that can be formed with four aces: that is 6.

The probability (classical) is thus: 6/1326 = 0.004524.

Second question: denote as H the event that the cards drawn are of spades or clubs and as A the event the cards drawn are two aces.

Noting that P (H) = (26 · 25)/(52 · 51) = 25/102 and that P (A ∩ H) = 2/(52 · 51) = 1/1326, from (1) we have:

P (A | H) = P (A ∩ H)/P (H) = (1/1326)/(25/102) = 1/325 = 0.003077

Third question: let A denote the event the first card is an ace and H the event the second card is an ace. Since P (H) = 4/52 = 1/13 and P (A ∩ H) = 12/(52 · 51) = 1/221, from (1) we have:

P (A | H) = P (A ∩ H)/P (H) = (1/221)/(1/13) = 1/17 = 0.0588.
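The three questions can be checked by brute force, counting favourable and possible cases over every ordered draw of two cards from a 52-card deck. The sketch below is only a verification aid: the encoding of the cards and the function names are assumptions made here, and rank 1 is taken to stand for the ace.

```python
from fractions import Fraction
from itertools import permutations

RANKS = range(1, 14)                       # rank 1 stands for the ace
SUITS = ("spades", "clubs", "hearts", "diamonds")
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]

# Every ordered draw of two distinct cards: 52 * 51 equally probable cases.
DRAWS = list(permutations(DECK, 2))


def prob(event, given=None):
    """Classical probability: favourable cases divided by possible cases."""
    possible = [d for d in DRAWS if given is None or given(d)]
    favourable = [d for d in possible if event(d)]
    return Fraction(len(favourable), len(possible))


def both_aces(draw):
    return draw[0][0] == 1 and draw[1][0] == 1


def spades_or_clubs(draw):
    return all(suit in ("spades", "clubs") for _, suit in draw)


def first_is_ace(draw):
    return draw[0][0] == 1


def second_is_ace(draw):
    return draw[1][0] == 1


print(prob(both_aces))                          # 1/221
print(prob(both_aces, given=spades_or_clubs))   # 1/325
print(prob(first_is_ace, given=second_is_ace))  # 1/17
```

Restricting the possible cases to those in which H is true and then counting is the classical counterpart of dividing P (A ∩ H) by P (H) in formula (1).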

It could happen that the knowledge that H is true does not change our degree of certainty that we have about A, in this case it must be P (A | H) = P (A), and then from (1) we have equally:

(2) P (A ∩ H) = P (A) · P (H)

(2') P (A | H) = P (A)

(2'') P (H | A) = P (H)

and then, if A is independent from H, H also is independent of A and therefore the equivalent conditions (2, 2', 2'') identify events which we will call stochastically independent or for simplicity, where there is no issue of confusion, independent.

Two events that are not independent are called positively or negatively correlated if: P (A | H) > P (A) or P (A | H) < P (A).

In the first case, the knowledge of H increases our degree of confidence on the occurrence of A. In the second case, it decreases.

An important generalization of (1) is represented by Bayes’ theorem, which shows how probability should be changed with the arrival and/or acquisition of new information: it is deduced formally from the axioms of the calculus of probability and from the definition of conditional probability.16


The demonstration, which we omit here, is elementary; conversely, the implications are significant and shed light on what is the essential characteristic of probability: a possible and satisfactory logic of uncertainty.

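Bayes’ theorem is:

(3) P (A_i | B) = P (A_i) · P (B | A_i) / [P (A₁) · P (B | A₁) + ... + P (A_n) · P (B | A_n)]

where A₁, A₂, ... , A_n is a partition of the certain event (incompatible events whose union is the certain event) and B is an event with P (B) ≠ 0.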

The theorem is also referred to as the “theorem of causes”, where the mutually exclusive and exhaustive events A₁, A₂, ... , A_n are interpreted as causes and the event B as the effect.

The theorem tells us how to change the initial probability of the cause in a coherent way after having seen the effects.

A basic example: Three machines A₁, A₂, A₃ produce the same quantity of springs.

The proportion of defective pieces produced by A1 is 0.1, by A2 is 0.15 and by A3 is 0.05.

A spring is found to be defective: how probable is it that it was produced by machine A3?

Since P (A1) = P (A2) = P (A3) = 1/3

we have:

P (A1) · P (B | A1) + P (A2) · P (B | A2) + P (A3) · P (B | A3) = 1/3 · (0.1 + 0.15 + 0.05) = 0.1

P (B | A3) · P (A3) = 0.05 · 1/3

and therefore for (3): P (A3 | B) = 1/6.
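The calculation for the three machines can be reproduced with a short function. This is a minimal sketch: the name `bayes` and the list-based interface are assumptions made here, and exact fractions are used so that the result can be compared with the figures in the text.

```python
from fractions import Fraction


def bayes(priors, likelihoods):
    """Posterior P(A_i | B) for each cause A_i of a partition.

    priors[i] is P(A_i); likelihoods[i] is P(B | A_i).
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # P(B), the sum of P(A_i) * P(B | A_i)
    return [j / total for j in joint]


priors = [Fraction(1, 3)] * 3
defect_rates = [Fraction(10, 100), Fraction(15, 100), Fraction(5, 100)]
posteriors = bayes(priors, defect_rates)
print(posteriors)  # 1/3 for A1, 1/2 for A2, 1/6 for A3
```

The three posterior values add up to one, as they must, and the machine with the highest defect rate, A2, becomes the most probable cause of the defective spring.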

A less basic example, useful to introduce some ideas on Bayesian statistical inference.

An urn contains 3 balls divided between white and black. In the absence of further information the events A1, a white ball and two black, and A2, two white balls and one black, are considered equally probable.

Two extractions of a ball, with replacement, are made. Knowing that both balls drawn are white (event B), how do you change the initial probability about the composition of the urn?

Intuitively we should increase the degree of confidence in composition A2 and therefore, applying Bayes, we should have: P (A2 | B) > P (A1 | B).

We have P (A1) = P (A2) = 1/2, P (B | A1) = 1/9, P (B | A2) = 4/9

P (A1) · P (B | A1) + P (A2) · P (B | A2) = 1/2 · (1/9) + 1/2 · (4/9) = 5/18

And then: P (A1 | B) = (1/9) · (1/2)/(5/18) = 1/5

and: P (A2 | B) = (4/9) · (1/2)/(5/18) = 4/5
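The same result is reached by updating one draw at a time, with the posterior after the first white ball serving as the prior for the second. The sketch below assumes, as in the example, that the draws are made with replacement, so that each draw has the same probability of white under a given composition; the function name `update` and the dictionary keys are illustrative choices.

```python
from fractions import Fraction


def update(prior, likelihood):
    """One Bayesian update; prior and likelihood are dicts keyed by hypothesis."""
    joint = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(joint.values())
    return {h: joint[h] / total for h in joint}


# Probability of drawing a white ball under each composition of the urn.
p_white = {"A1": Fraction(1, 3), "A2": Fraction(2, 3)}

belief = {"A1": Fraction(1, 2), "A2": Fraction(1, 2)}
for draw in ("white", "white"):
    belief = update(belief, p_white)
    print(draw, belief)
```

After the first white ball the probabilities are 1/3 and 2/3; after the second they are 1/5 and 4/5, in agreement with the direct calculation. This is the germ of Bayesian inference: each new observation revises the current assessment.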

Observation 1

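In the case of a partition formed by two events A₁ and A₂, we have A₂ = ¬ A₁ and P (A₂) = 1 – P (A₁), so that Bayes’ theorem becomes:

P (A₁ | B) = P (A₁) · P (B | A₁) / [P (A₁) · P (B | A₁) + (1 – P (A₁)) · P (B | ¬ A₁)]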

Observation 2

Conditional probability is introduced without comment from Kolmogoroff through condition (1). Only subjective probability manages to justify it fully: it expresses the necessary and sufficient condition to maintain coherence with the arrival of new information.

This consideration is essential to understand the scope and consequences of Bayes’ rule that, demonstrated by means of a simple manipulation of symbols, acquires, with the definition (the only one possible) of subjective probability, the meaning of necessary and sufficient condition for coherence.

Observation 3

Conditional probability is a probability in so much as it satisfies the axioms that define a probability. Precisely if H is an event of the algebra Π with P (H) ≠ 0 then P ( · |H) is a probability defined in the algebra of events that is obtained from the events of algebra Π at its intersection with the event H.

Observation 4

Three or more events are called independent if knowledge of any of these events does not change the probability of any other events.

In more formal terms, they are independent if, for every choice of two or more of them, the probability that the chosen events occur simultaneously is equal to the product of their probabilities.

Simply because this is unintuitive, it is worth noting that three (or more) events that are independent two by two need not be independent as a whole.
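A standard illustration uses two fair coins. The example and the event names below are not taken from the text; they are a common textbook construction, sketched here under the assumption that the four outcomes are equally probable.

```python
from fractions import Fraction
from itertools import product

# Two fair coins: four equally probable outcomes.
OUTCOMES = list(product("HT", repeat=2))


def prob(event):
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))


def first_heads(o):
    return o[0] == "H"


def second_heads(o):
    return o[1] == "H"


def same_face(o):
    return o[0] == o[1]


def all_of(*events):
    """Conjunction of the given events."""
    return lambda o: all(e(o) for e in events)


pairs = [(first_heads, second_heads),
         (first_heads, same_face),
         (second_heads, same_face)]
pairwise = all(prob(all_of(x, y)) == prob(x) * prob(y) for x, y in pairs)
jointly = (prob(all_of(first_heads, second_heads, same_face))
           == prob(first_heads) * prob(second_heads) * prob(same_face))
print(pairwise, jointly)  # True False
```

Each pair of events satisfies the product rule, yet the three together do not: knowing that both coins landed heads makes it certain that they show the same face.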

Observation 5

In case of independence of two events A and H the probability a priori of A, P (A), coincides with the probability a posteriori, P (A | H) , and thus in that case “Bayes’ formula does not change any evaluation of probability.”

14 Many things in this world that appear totally original have already been stated. I quote Jakob Bernoulli’s Ars Conjectandi of 1713: “…it follows that that which appears contingent to a person in a certain moment could appear necessary to another person or even to the first person at a different moment.”

15 I show that P (H) · P (A | H) = P (A ∩ H), as follows:

1) P (H) is the sum I bet to receive one when H occurs; therefore P (H) · P (A | H) is the sum I bet to receive P (A | H) when H occurs.

2) Knowing that H is true, I bet the sum P (A | H), which I have just received on the occurrence of H, to receive one if A occurs.

With this strategy, therefore, I bet the sum P (H) · P (A | H) and, if A and H both occur, I collect one.

But by betting the sum P (A ∩ H) I also obtain one if A and H both occur. For coherence, the two prices must therefore coincide.


8. Random variables, mean value, risk

With the introduction of the term random variables (r.v.), many problems of PrC can be formulated with greater precision and concision.

The idea and the definition of random variable arises from considering situations such as the value of the prizes that can be won in a lottery, the number of goals scored in a given football game, the time taken by the winner in a marathon, the closing price tomorrow of Coca Cola shares on the New York Stock Exchange, the roll of a die, the appearance of a number in roulette, the temperature at 2 p.m. tomorrow at the top of the leaning tower of Pisa, the price of gold on the first working day of next month.

A r.v. is essentially an “object” characterized:

– by an exhaustive set of events which represent a partition of the certain event;

– by their respective probabilities (which add up to one);

– by numerical values corresponding to the occurrence of each of the events of the partition.

So the r.v. “throwing of a die” is characterized by the partition of the certain event into the events E₁: a one thrown, E₂: a two, ... , E₆: a six; and each of these events is associated with a probability (that is 1/6 if we are in conditions of equiprobability) and a numeric value: the value one if a one is thrown, value two if a two, ..., value six if a six.

The whole can be set out as follows:

event partition: E₁, ... , E₆

associated values: 1, ... , 6

associated probability: p₁, ... , p₆

In abstract terms therefore, a r.v. is characterised as a set E₁, ..., E_n , of mutually exclusive and exhaustive events with their respective probabilities p₁, ..., p_n , and by a set of numerical values v₁, ..., v_n associated with those events.

(1)

events: E₁, ... , E_n

values: v₁, ... , v_n

probabilities: p₁, ... , p_n


The events are usually implied by the context and therefore a r.v. is often presented as just a list of the values and their respective probabilities.

If we use Y to denote the r.v. defined by (1), we can also write P (Y = v_h) to indicate the probability p_h that Y takes the value v_h, which happens when the event E_h occurs.

With reference to the r.v. of throwing a die, consider a bet that gives a prize in euros equal to the number thrown (one euro if it is a 1, two euros if it is a 2 and so on).

The player can replicate the bet with six distinct bets on six mutually exclusive events: the number one is thrown, the number two is thrown, ..., the number six is thrown.

He bets that number one is thrown at a price of 1/6 (the prize of one euro multiplied by the probability 1/6), bets that number two is thrown at 2/6, ... , bets that number six is thrown at 6/6, and he spends in total 21/6 = 3.5 euros.

The total sum of the fair prices of the bets that comprise the overall bet is called the mean value of the overall bet and represents the fair price.

In general, for a r.v. defined as in (1) above, the mean value is defined by the formula:

M = v₁ · p₁ + ... + v_n · p_n
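As a minimal sketch, the formula can be applied to the die bet described above, assuming equiprobable faces. The function name mean_value is only illustrative. The code lists the fair price of each sub-bet (value times probability) and their total.

```python
from fractions import Fraction

def mean_value(values, probabilities):
    """M = v1*p1 + ... + vn*pn for a r.v. given as values and probabilities."""
    if sum(probabilities) != 1:
        raise ValueError("probabilities must add up to one")
    return sum(v * p for v, p in zip(values, probabilities))

# The die bet: prize in euros equal to the number thrown, assumed equiprobable.
values = [1, 2, 3, 4, 5, 6]
probabilities = [Fraction(1, 6)] * 6

# Fair price of each sub-bet: prize multiplied by probability.
sub_bet_prices = [v * p for v, p in zip(values, probabilities)]
M = mean_value(values, probabilities)

print([str(price) for price in sub_bet_prices])  # ['1/6', '1/3', '1/2', '2/3', '5/6', '1']
print(M, float(M))                               # 7/2 3.5
```

Exact fractions are used instead of floating-point numbers so that the total of the sub-bet prices matches the mean value without rounding.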

The mean value measures the fair price of a bet made up of sub-bets, one for each event in the partition: following the principle of coherence, it is the sum of the fair prices of the sub-bets into which the bet is divided.17

Essentially, following the principle of coherence means that “in the market” one cannot obtain discounts or pay premiums when buying lottery tickets: the final price is set by the principle of totalling the prices. If a bet is bought at its mean value, the overall mean value of the operation is zero, and that condition is a consequence of coherence.

So let us reconsider, for example, (4) in paragraph 3 above:

s = p · R

We can verify that the player who bets s to win R, attributing a probability p to winning, has a probability q = 1 – p of losing; therefore, in total, his position is represented by the r.v. which assumes the value – s with probability q and the value R – s with probability p, and which therefore has mean value:

– s · q + (R – s) · p = – s · (p + q) + R · p = – s + R · p = 0
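The same verification can be sketched in code. The figures below (probability 1/37 of winning a prize of 36) are hypothetical and chosen only for illustration. The function name position_mean is likewise an assumption.

```python
from fractions import Fraction

def position_mean(p, R):
    """Mean value of the player's position when the stake is the fair price s = p * R."""
    s = p * R            # fair price of the bet
    q = 1 - p            # probability of losing
    # The position is worth -s with probability q and R - s with probability p.
    return -s * q + (R - s) * p

# Hypothetical figures: probability 1/37 of winning a prize of 36.
print(position_mean(Fraction(1, 37), 36))   # 0
```

Whatever values of p and R are supplied, the result is zero, because the stake is computed as s = p · R inside the function.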

Now, while in the case of a simple bet (price s, prize R if the event occurs) the mean value serves to determine the probability, in the case of a non-simple bet the mean value expresses a price for the bet that respects the condition of coherence.

They are not the same thing.

In terms which are unscientific but are expressive: “the mean value always represents the coherent price, but the more the bet becomes complex the more the meaning of the mean