Measuring «Correlation Neglect»: experimental procedures and applications

Academic year: 2021


Measuring “Correlation Neglect”

Experimental Procedures and Applications

Master's thesis (Mémoire)
Blanchard Conombo
Master's in Economics (Maîtrise en économique), Maître ès sciences (M.Sc.)
Québec, Canada
© Blanchard Conombo, 2017

Résumé

Recent economic studies have identified the difficulty people have in making optimal decisions when different (random) state variables are correlated, which the literature now calls “correlation neglect.” In this article, we assume that correlation neglect is an individual trait and propose different measures of this characteristic. We compare the measures in terms of correlation, using results from laboratory experiments. We present applications of these measures in specific domains.

Keywords: heuristics and biases, correlation neglect, measurement.

Abstract

Recent economic studies have identified the difficulty people have in making optimal decisions when (random) state variables are correlated, a phenomenon now referred to in the literature as “correlation neglect.” In this article, we presume correlation neglect to be an individual trait and propose different measures of this characteristic. We compare the measures in terms of their correlation, based on results from laboratory experiments. We present applications of the measures in the field.

Keywords: heuristics and biases, correlation neglect, measurement.

Table of Contents

Résumé
Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
Introduction
1 Literature
2 Experiment
2.1 Experimental Design
2.2 Treatments
3 Empirical Measure
3.1 Simple Set Up
4 Results
4.1 Descriptive Statistics
4.2 Heterogeneity in correlation neglect
4.3 Cognitive effort and learning over time
4.4 Mechanisms underlying correlation neglect
5 Applications
5.1 Mother’s age at first childbearing and child mortality
5.2 Financial Allocation Decisions
5.3 Pre-election Polls
Conclusion
Bibliography
A Appendix
A.1 Enke and Zimmermann: correlation between signals
A.2 Convergence of Correlation in the Simple Set-Up
A.3 Other Results
A.4 Instructions

List of Tables

2.1 Experimental Design: Variation of experimental parameters and correlation coefficients
2.2 Belief predictions for the joint distribution
2.3 Structure of portfolio choice problem
4.1 Descriptive statistics
4.2 Median correlation by treatment and informational presentation
4.3 Structural
4.4 Empirical without order
4.5 Empirical with order
4.6 Cognitive effort and learning over time
4.7 Correlation neglect and informational presentation
4.8 Correlation = 0.33
4.9 Correlation = 0.20
4.10 Correlation = 0.11
4.11 Correlation = 0.005
4.12 Number of historical information over time
4.13 Correlation neglect and number of past observations
5.1 Joint distribution
5.2 Subjects’ choices for each type of individual

List of Figures

4.1 Heterogeneity in Correlation Neglect
A.1 Median Correlation Neglect - Structural vs Empirical Without Order
A.2 Median Correlation Neglect - Structural vs Empirical With Order
A.3 Median Correlation Neglect - Empirical With Order vs Empirical Without Order
A.4 Mean correlation neglect per value of true correlation
A.5 Histograms of the difference between original median correlation neglect parameters and modified median correlation neglect parameters when excluding the one parameter that is closest to the original median correlation neglect parameter
A.6 Presentation - Structural
A.7 Presentation - Empirical Without Order
A.8 Presentation - Empirical With Order
A.9 Presentation - Empirical With Order

To my father Daniel Conombo, Rest in peace

Work is life itself, and life is continual work

Émile Zola

Acknowledgements

I wish to express my deep gratitude to my supervisors, Professors Sabine Kröger and Charles Bellemare, for agreeing to supervise and fund this research, for their great availability, and for their insightful advice. I also thank Professor Sylvain Dessy for his many suggestions.

Introduction

A decision maker’s ignorance of the correlation between two variables, the so-called “correlation neglect,” has been demonstrated in many different choice contexts, e.g., portfolio choice (Kallir and Sonsino, 2009), investment decisions (Eyster and Weizsäcker, 2011), and predictions of outcomes of random draws (Enke and Zimmermann, 2015). All of these studies compare treatments with and without correlation between two variables to show that correlation neglect exists in a particular context.

In real life, as in the experimental studies above, the way in which people are presented with, and can identify, the correlation between two variables varies. Enke and Zimmermann (2015) present the correlation through the structure of the situation. In Eyster and Weizsäcker (2011), a decision maker can observe realizations of the different variables but has no information on the structure. Finally, Kallir and Sonsino (2009) give information on both the structure and the realizations.

Whether these presentations are equivalent, or which of them makes the correlation easier to identify, is not a straightforward question and requires empirical investigation.

This study proposes a general measure of individual correlation neglect, in the same spirit as measures of individual risk aversion (e.g., Cohen et al. (1987); Holt and Laury (2002)); to the best of our knowledge, no such measure exists so far.

The paper is structured as follows. Section 1 summarizes the existing literature. In Section 2 we present an experimental design that allows us to assess the severity of correlation neglect. Section 3 presents our empirical approach to modeling and estimating correlation neglect. In Section 4 we present results from a series of laboratory experiments that allow us to compare the presentations. In Section 5 we show how these abstract measures can easily be adapted to applications in the field, when the aim is to assess the severity of correlation neglect of decision makers in particular situations. Section 6 concludes.

Chapter 1

Literature

Recent literature has analysed the psychological phenomenon of correlation neglect (Eyster and Weizsäcker, 2011; Enke and Zimmermann, 2015; Levy and Razin, 2015; Ortoleva and Snowberg, 2015). Correlation neglect implies that individuals underappreciate the correlation between state variables or between the different events they observe. This cognitive bias has negative spillovers on individual decision making and leads to overconfidence in market settings, which predicts bubbles and crashes. A small set of previous experiments has examined individuals’ responses to correlations in informational sources and found that subjects have limited attention and find it cognitively challenging to work with joint distributions of random variables (Eyster and Weizsäcker, 2011). However, the meaning of correlation neglect differs conceptually within this literature (structure of the environment versus empirical analysis of historical data). The literature on bounded rationality and belief formation in networks uses the structure of the information and analyses a double-counting problem in information sources when people update their beliefs about a state variable (Enke and Zimmermann, 2015). As a result, subjects in experiments on group communication (DeMarzo et al., 2003) or political competition and voting behavior (Ortoleva and Snowberg, 2015) overweight informational redundancies in their beliefs.

Enke and Zimmermann (2015) used the structure of the decision problem and analysed the double-counting problem in belief formation. They find that experimental subjects in a relatively simple setting neglect correlations in information sources when forming beliefs, with heterogeneity at the individual level. They suggest a measure of individual correlation neglect in a between-subjects design, under the assumption that signals are drawn from a truncated, discretized normal distribution with mean $\mu$ (the true value of the state) and standard deviation $\sigma = \mu/2$. Truncation implies that signals belong to the interval $[0, 2\mu]$, which avoids negative signals and hence negative correlations. But in real life, people face many situations involving both negative and positive information (signals) about events, and pieces of information might be positively or negatively correlated. In their experiment, two computers, A and B, generate two i.i.d. unbiased signals $s_A$ and $s_B$, with $s_h \sim N(\mu, (\mu/2)^2)$, $h \in \{A, B\}$. Subjects observe these signals as numbers that they must use to estimate the number of items in an imaginary container. In the correlated treatment, subjects observe the realization of computer A ($s_A$) and the mean realization of the two computers, $\tilde{s}_B = (s_A + s_B)/2$, so that the two observed signals have a correlation of 71%.¹

Subjects in the control condition observe two independent signals ($s_A$ and $s_B$). In that condition, the unbiased estimate of the number of items is the empirical mean of the two signals. To estimate the number of items rationally, subjects in the correlated treatment must therefore take into account the correlation between the two observed signals implied by the structure of the environment: a rational subject must extract $s_B$ from $\tilde{s}_B$ and compute the mean of $s_A$ and $s_B$ as an estimate of the number of items in the container. The following rule describes how each subject in the correlated treatment tries to extract the right signal $s_B$:

$$\hat{s}_B = \chi \tilde{s}_B + (1 - \chi) s_B$$

$\chi$ is an individual measure of correlation neglect that captures a subject’s ability to extract the right signal into $\hat{s}_B$ when the authors consider only the structure of the environment:

$$\chi = \begin{cases} 0 & \text{for a rational subject} \\ 1 & \text{for a full correlation neglecter} \end{cases}$$

As a result, people’s beliefs in the correlated treatment deviate from rationality because subjects neglect informational redundancies. Their individual measure of correlation neglect can be applied in different settings with an informational structure (e.g., group communication, voting behavior, political competition). While one part of the literature focuses on the structure of the environment, the other analyses people’s limited attention to the joint distribution of random variables when they engage in empirical analysis of historical data (Kallir and Sonsino, 2009; Eyster and Weizsäcker, 2011).
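The extraction rule attributed to Enke and Zimmermann (2015) can be illustrated with a short numerical sketch. This is not code from that paper: the function names `extract_signal`, `belief`, and `corr_with_average` are hypothetical, and the example signals are made up. The sketch assumes that a subject's estimate is the simple mean of $s_A$ and the extracted $\hat{s}_B$, and it checks the 71% figure under the simplifying assumption of untruncated i.i.d. signals.

```python
import math


def extract_signal(s_a: float, s_b_tilde: float, chi: float) -> float:
    """Extracted signal chi * s_b_tilde + (1 - chi) * s_b."""
    # The independent signal is recovered by inverting s_b_tilde = (s_a + s_b) / 2.
    s_b = 2.0 * s_b_tilde - s_a
    return chi * s_b_tilde + (1.0 - chi) * s_b


def belief(s_a: float, s_b_tilde: float, chi: float) -> float:
    """Estimate of the state: mean of s_a and the extracted signal (assumption)."""
    return 0.5 * (s_a + extract_signal(s_a, s_b_tilde, chi))


def corr_with_average(var: float = 1.0) -> float:
    """Correlation between s_a and (s_a + s_b) / 2 for i.i.d. signals with variance var."""
    cov = 0.5 * var        # Cov(s_a, (s_a + s_b) / 2) = Var(s_a) / 2
    var_tilde = 0.5 * var  # Var((s_a + s_b) / 2) = (var + var) / 4
    return cov / math.sqrt(var * var_tilde)


if __name__ == "__main__":
    s_a, s_b = 10.0, 14.0            # made-up signals
    s_b_tilde = 0.5 * (s_a + s_b)    # what a subject sees in the correlated treatment
    print(belief(s_a, s_b_tilde, chi=0.0))  # rational subject: 12.0
    print(belief(s_a, s_b_tilde, chi=1.0))  # full correlation neglecter: 11.0
    print(corr_with_average())              # 1 / sqrt(2), about 0.707
```

With $\chi = 1$ the estimate equals $0.75\,s_A + 0.25\,s_B$, so $s_A$ is counted twice, which is the double-counting problem described above.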

Kallir and Sonsino (2009) find that changing the correlation in a portfolio-choice problem leads to little or no change in participants’ decision making. In their experiment, subjects observed historical data on the joint distribution of the realized returns of two virtual assets with different levels of correlation over 12 preceding periods; they then had to predict the realized returns of the first asset in four additional observations while observing the returns of the second asset. In these prediction-allocation problems, the results show that subjects recognize shifts in correlation in their prediction tasks but fail to account for this correlation in their allocation decisions. Correlation neglect therefore predicts no change in participants’ behavior. The authors shed additional light on the cognitive nature of the bias, which is consistent with the interpretation of correlation neglect as deriving from limited attention. No formal measure of individual correlation neglect has been suggested.

1. See Appendix A.

Eyster and Weizsäcker (2011) focus on the impact of correlation neglect on financial decision making, because an investor who fails to account for correlation when allocating a financial portfolio can end up holding a portfolio that contains undesirable risks. They suggest a measure of individual correlation neglect in a series of controlled experiments using a framing variation in which each participant faces two versions of the same portfolio-choice problem. The assets in the correlated frame are linear combinations of those in the uncorrelated frame and span exactly the same set of earnings distributions. Under the hypothesis that people correctly perceive the covariance structure, the framing variation does not affect behavior. Even though the authors ensure that participants understand the payoff structure and the co-movements of the asset returns, they find that behavior changes strongly. People ignore the correlation and treat correlated assets as independent, sometimes following a simple “1/N heuristic,” which consists of investing equal shares of the portfolio in all available assets. They measure people’s “ignorance” using the following transformation of the variance-covariance matrix, with penalties on the variance and covariance terms:

$$V = \begin{pmatrix} (\sigma_1^2)^l & k \cdot \mathrm{sgn}(\sigma_{12}) |\sigma_{12}|^l \\ k \cdot \mathrm{sgn}(\sigma_{12}) |\sigma_{12}|^l & (\sigma_2^2)^l \end{pmatrix}$$

$k$ and $l$ represent the parameters of correlation neglect and variance neglect, respectively, and are estimated for each subject in the experiment. The authors then classify subjects into 3 different groups according to the severity of their correlation neglect.
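As a minimal sketch of how the transformed matrix $V$ above behaves, the following function builds it from a true variance-covariance matrix. The function name `perceived_covariance` and the example numbers are hypothetical; only the formula comes from the text. Reading directly from the formula, $k = l = 1$ returns the true matrix and $k = 0$ removes the covariance term.

```python
import math


def perceived_covariance(var1: float, var2: float, cov12: float,
                         k: float, l: float) -> list:
    """Build V with penalties k (covariance term) and l (exponent) as in the formula above."""
    # sgn(cov12): +1, -1, or 0 when the covariance is exactly zero.
    sign = math.copysign(1.0, cov12) if cov12 != 0 else 0.0
    off_diagonal = k * sign * abs(cov12) ** l
    return [[var1 ** l, off_diagonal],
            [off_diagonal, var2 ** l]]


if __name__ == "__main__":
    # Made-up example: variances 4 and 9, covariance -3.
    print(perceived_covariance(4.0, 9.0, -3.0, k=1.0, l=1.0))  # true matrix
    print(perceived_covariance(4.0, 9.0, -3.0, k=0.0, l=1.0))  # covariance ignored
```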

These conceptual differences in the context and structure of the decision problem, together with the variety of experimental approaches (within- and between-subjects designs), make it difficult to apply a measure from one paper to evaluate correlation neglect in another. To the best of our knowledge, there exists no single measure of correlation neglect that can be applied in different contexts.

Our paper is related to this literature on correlation neglect, and we propose a measure that can be used as a general measure of correlation neglect regardless of context. We do not need to make any assumption about individuals’ preferences; we only assess their beliefs about the distribution of state variables in our experiment. We then compute a measure of individual correlation neglect.

Chapter 2

Experiment

A decision maker can identify correlation between state variables either by engaging in empirical analysis of historical data or by analyzing the structure of the environment. Correlation neglect is present when information that can help to identify the correlation is available but is ignored in the decision process. It is important to distinguish between the two dimensions because doing so allows us to localise the cognitive shortcoming in judgement and because it might affect the choice of policy if one wants to mitigate the neglect.

Following this reasoning, we propose to measure correlation neglect in two ways: first, the correlation of state variables is presented through the structure of the decision problem (e.g., Eyster and Weizsäcker (2011), Enke and Zimmermann (2015)); second, it is presented through observed realizations of both variables (e.g., Kallir and Sonsino (2009)).

As in the literature on correlation neglect, we presume that ignoring the correlation between two random variables affects beliefs about the joint distribution of those variables. However, in contrast to the existing literature, we propose to measure correlation neglect directly by eliciting (subjective) beliefs and not indirectly via observed choices.

2.1 Experimental Design

The basic set-up of our experiment consists of two urns, Urn 1 and Urn 2, containing $N_1$ and $N_2$ balls, respectively. Balls are either blue ($B$) or green ($G$), with $B_1 + G_1 = N_1$ and $B_2 + G_2 = N_2$, where $B_1$ and $G_1$ are the numbers of blue and green balls in Urn 1 and $B_2$ and $G_2$ are the numbers of blue and green balls in Urn 2. The composition of each urn is summarized by its ratio of blue balls, $b_1 = B_1/N_1$ and $b_2 = B_2/N_2$. A number $D_1$ of balls is drawn from Urn 1 without replacement and placed in Urn 2, from which $D_2$ balls are then drawn, again without replacement. This procedure is repeated $S$ times, each time resetting both urns to the original set-up. The task of the participant is to give a personal evaluation of the following three distributions: the distribution of variable $X$, representing the distribution of blue ($X_B$) or green ($X_G$) balls in $D_1$ over $S$; the distribution of variable $Y$, representing the distribution of blue ($Y_B$) or green ($Y_G$) balls in $D_2$ over $S$; and the distribution of variable $Z$, representing their joint distribution over $S$.

Simple Set Up

       Urn 1                     Urn 2
No.    N1   b1    D1   D1/N1     N2    b2    D2   D2/(N2+D1)    E[X]   E[Y]   E[XY]    Cov       Corr ρ
X = XB, Y = YB
1      2    0.5   1    0.5       2     0.5   1    0.33          0.5    0.5    0.33     0.08      0.33
2      2    0.5   1    0.5       4     0.5   1    0.20          0.5    0.5    0.30     0.05      0.20
3      2    0.5   1    0.5       8     0.5   1    0.11          0.5    0.5    0.28     0.03      0.11
4      2    0.5   1    0.5       200   0.5   1    0.005         0.5    0.5    0.25     0.005     0.005
X = XB, Y = YG
5      2    0.5   1    0.5       2     0.5   1    0.33          0.5    0.5    0.16     -0.08     -0.33
6      2    0.5   1    0.5       4     0.5   1    0.20          0.5    0.5    0.20     -0.05     -0.20
7      2    0.5   1    0.5       8     0.5   1    0.11          0.5    0.5    0.22     -0.03     -0.11
8      2    0.5   1    0.5       200   0.5   1    0.005         0.5    0.5    0.245    -0.005    -0.005

Table 2.1 – Experimental Design: Variation of experimental parameters and correlation coefficients.

A simple case of our experiment is shown by the following example, which is summarized in Table 2.1 (row 1). Each urn contains 2 balls, one blue and one green. One ball is drawn from each urn, with a total of $S = 100$ repetitions. $x_B$ is the number of times out of 100 repetitions that a blue ball would be drawn first, $y_B$ is the number of times out of 100 that a blue ball would be drawn second, and $z_B$ is the number of times out of 100 that both draws would be blue. The corresponding questions eliciting the distribution of those variables are:

1. “What are the chances out of 100 that a blue ball is drawn from the first urn?”

2. “What are the chances out of 100 that a blue ball is drawn from the second urn?”

3. “What are the chances out of 100 that a blue ball is drawn from both urns?”

With the response “$x_B$ out of 100,” question 1 elicits $E[X] = \Pr[X = B] = \Pr[D_1 = B] = x_B/100 = \bar{x}_B$. The response to question 2 reveals $E[Y] = \Pr[Y = B] = \Pr[D_2 = B] = y_B/100 = \bar{y}_B$, and the response to question 3 reveals $E[Z] = E[XY] = \Pr[X = B, Y = B] = \Pr[D_1 = B, D_2 = B] = z_B/100 = \bar{z}_B$. The correlation is then obtained as

$$\rho = \frac{\bar{z}_B - \bar{x}_B \bar{y}_B}{\sqrt{\bar{x}_B \bar{y}_B (1 - \bar{x}_B)(1 - \bar{y}_B)}}$$

Given the experimental design of the simple example, the theoretical correlation between the two random variables is 0.33.
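The mapping from the three elicited answers to a subjective correlation can be sketched in a few lines. The function name `correlation_from_answers` is hypothetical and the answers used in the example are made up; the formula is the one for $\rho$ given in this section.

```python
import math


def correlation_from_answers(x_b: float, y_b: float, z_b: float, s: int = 100) -> float:
    """Subjective correlation implied by answers 'x_b, y_b, z_b out of s'."""
    x_bar, y_bar, z_bar = x_b / s, y_b / s, z_b / s
    denom = math.sqrt(x_bar * y_bar * (1.0 - x_bar) * (1.0 - y_bar))
    if denom == 0.0:
        # A marginal belief of 0 or 1 leaves the correlation undefined.
        raise ValueError("correlation undefined when a marginal is 0 or 1")
    return (z_bar - x_bar * y_bar) / denom


if __name__ == "__main__":
    # A subject answering 50, 50 and 25 treats the draws as independent.
    print(correlation_from_answers(50, 50, 25))       # 0.0
    # Answers matching the design of the simple example (both blue in 1/3 of draws).
    print(correlation_from_answers(50, 50, 100 / 3))  # about 0.33
```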

With this structure, we introduce a correlation between $X$ and $Y$ by taking a ball from the first urn and putting it in the second urn. Because ignoring this correlation affects beliefs about the joint distribution of $X$ and $Y$, Table 2.2 presents the predicted beliefs of rational subjects and of full correlation neglecters for 4 of the parameterizations presented in Table 2.1. Someone who fully neglects the structure of correlation thinks that there is no link between $X$ and $Y$, so that

$$P[X = x_i, Y = y_j] = P[X = x_i] \times P[Y = y_j] \quad \text{for } i, j \in \{\text{Blue}, \text{Green}\}$$

P[X=1, Y=1]

Treatment correlation    Rational beliefs    Full corr. neglect beliefs
0.33                     0.33                0.25
0.20                     0.30                0.25
0.11                     0.27                0.25
0.005                    0.25                0.25

Table 2.2 – Belief predictions for the joint distribution
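The two belief columns of Table 2.2 can be recomputed from the urn design. The sketch below is an illustration only: the function name `simple_setup_beliefs` is hypothetical, and it hard-codes the Simple Set Up with one blue and one green ball in Urn 1, half of Urn 2 blue, and one ball moved and one ball drawn, with $X$ and $Y$ both indicating a blue draw.

```python
import math
from fractions import Fraction


def simple_setup_beliefs(n2: int) -> dict:
    """Exact joint beliefs P[X=B, Y=B] for an Urn 2 of size n2 (half blue)."""
    if n2 <= 0 or n2 % 2 != 0:
        raise ValueError("n2 must be a positive even number so that b2 = 1/2")
    p_x = Fraction(1, 2)                 # P[first draw is blue]
    blue_in_urn2 = Fraction(n2, 2)
    # After the transfer Urn 2 holds n2 + 1 balls.
    p_y_if_blue = (blue_in_urn2 + 1) / (n2 + 1)   # a blue ball was added
    p_y_if_green = blue_in_urn2 / (n2 + 1)        # a green ball was added
    p_y = p_x * p_y_if_blue + (1 - p_x) * p_y_if_green
    rational = p_x * p_y_if_blue         # accounts for the transfer
    neglect = p_x * p_y                  # treats the draws as independent
    cov = rational - neglect
    corr = float(cov) / math.sqrt(float(p_x * (1 - p_x) * p_y * (1 - p_y)))
    return {"rational": rational, "neglect": neglect, "corr": corr}


if __name__ == "__main__":
    for n2 in (2, 4, 8, 200):
        b = simple_setup_beliefs(n2)
        print(n2, float(b["rational"]), float(b["neglect"]), round(b["corr"], 3))
```

The full-neglect belief stays at 1/4 for every urn size, while the rational belief approaches 1/4 as Urn 2 grows.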

Our design is structured so that $E[X] = E[Y] = 1/2$ for all belief elicitation problems. The Simple Set Up allows the correlations to lie between −0.33 and 0.33.¹ In the Simple Set Up we fix $N_1 = 2$, $b_1 = b_2 = 0.5$ and $D_1 = D_2 = 1$. By varying $N_2$, the size of the second urn, we can manipulate the level of correlation to be between 0 and 0.33.² And by varying whether the subjective expectation for the second draw concerns the same color as the one in the first draw or the other color, we manipulate the direction of the correlation to be positive or negative. Rows (1)-(8) of Table 2.1 show 8 possible parameterizations resulting in correlations of {−0.33, −0.20, −0.11, −0.005, 0.005, 0.11, 0.20, 0.33}, uniformly covering the range of possible values.

This first design makes it possible to measure people’s understanding of the correlation when they face situations in which correlation is introduced by constructing one variable as a combination of itself with the other. For instance, Eyster and Weizsäcker (2011) construct portfolio choice problems with state-dependent returns, using a framing variation in which each participant faces two versions of the same portfolio-choice problem. Across the two framing variations, they switch asset correlation on and off, as presented in Table 2.3.

State-dependent returns

               {X(1), X(2), X(3), X(4)}    {Y(1), Y(2), Y(3), Y(4)}
portfolio 1    A = {12, 24, 12, 24}        B = {12, 12, 24, 24}
portfolio 2    C = {12, 24, 12, 24}        D = {12, 18, 18, 24}

Table 2.3 – Structure of portfolio choice problem.
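The returns in Table 2.3 can be checked numerically: asset C repeats A, and asset D is the state-by-state average of A and B. The sketch below assumes, purely for illustration, that the four states are equally likely; the helper `pearson` is a hypothetical name, not something taken from Eyster and Weizsäcker (2011).

```python
import math


def pearson(u: list, v: list) -> float:
    """Population Pearson correlation across equally weighted states (assumption)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / n
    var_u = sum((a - mean_u) ** 2 for a in u) / n
    var_v = sum((b - mean_v) ** 2 for b in v) / n
    return cov / math.sqrt(var_u * var_v)


# State-dependent returns from Table 2.3.
A = [12, 24, 12, 24]
B = [12, 12, 24, 24]
C = list(A)                                # C = A
D = [(a + b) / 2 for a, b in zip(A, B)]    # D = (A + B) / 2

if __name__ == "__main__":
    print(D)              # [12.0, 18.0, 18.0, 24.0], matching Table 2.3
    print(pearson(A, B))  # 0.0: the uncorrelated frame
    print(pearson(C, D))  # about 0.707: the correlated frame
```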

In portfolio 1 there is no correlation between assets A and B. Portfolio 2 is constructed such that the returns are $C = A$ and $D = (A + B)/2$, thus introducing a correlation between C and D. Under the hypothesis that people correctly perceive the correlation structure, this framing variation does not affect their behaviour. Our simple design makes it possible to measure people’s understanding of the correlation in this kind of situation before they make their decisions.

1. By switching the color of the ball drawn in the second urn we allow the correlation to be positive or negative.

2. $\lim_{N_2 \to \infty} \rho_{X,Y} = 0$ (proof in Appendix A.2).

Instead of observing the structure of the environment (decision problem), subjects may face situations in which they observe historical data on state variables. This situation is illustrated in Kallir and Sonsino (2009), where subjects observe the joint distribution of realized returns of two virtual assets with two levels of returns (high and low) for 12 preceding periods. They consider five prediction problems involving five different levels of correlation between asset returns, and subjects are asked to predict returns for 4 additional periods under the assumption that future returns are sampled from the empirical distribution.

We integrate this situation into our experiment: subjects observe $S$ draws from the first and the second urn simultaneously. We allow the number of draws $S$ to be endogenous to each participant, so each subject decides how many draws to observe. By endogenizing $S$, participants control the quantity of information they have, a possible source of debiasing. The task of the subject is to give a personal evaluation of the following distributions: the distribution of variable $X$, representing the distribution of blue ($X_B$) or green ($X_G$) balls in $D_1$ over 100; the distribution of variable $Y$, representing the distribution of blue ($Y_B$) or green ($Y_G$) balls in $D_2$ over 100; and the distribution of variable $Z$, representing their joint distribution over 100. The corresponding questions eliciting the distribution of those variables are:

1. “In how many out of 100 draws do you think that a blue ball or a green ball is drawn from the first urn?”

2. “In how many out of 100 draws do you think that a blue ball or a green ball is drawn from the second urn?”

3. “In how many out of 100 draws do you think that a blue ball is drawn from the first urn and a green ball from the second urn?”

4. “In how many out of 100 draws do you think that a green ball is drawn from the first urn and a blue ball from the second urn?”

2.2 Treatments

We consider three different presentations. First, a presentation of the structure of the situation, but no demonstration of realizations, i.e., $S = 0$, with questions as in the structural presentation above. Second, no information on the structure, but a time series showing the actual realizations of a certain number $S$ of draws, with questions as in the empirical presentation. Third, no information on the structure, but a time series showing the joint distributions of the variables as relative frequencies in matrix form, with questions as in the empirical presentation.

Chapter 3

Empirical Measure

In this section, we present our measure of individual correlation neglect and the other measures in the literature.

3.1 Simple Set Up

In this paper, we presume correlation neglect to be an individual trait and we propose a measure of this characteristic. To compute a subjective measure of “individual correlation” from subjects’ beliefs about the joint distribution of state variables in the Empirical treatment, we use a measure of correlation for binary data, the so-called “Phi Coefficient.” This is one of the most straightforward and useful methods for assessing the correlation between two binary variables, and it has the same interpretation as Pearson’s correlation. The “Individual Phi Coefficient” of each subject is compared to the true value of the correlation implied by our experiment in the same treatment.

During the experiment, in the empirical treatments, subjects answer different questions about binary variables, and for each subject we construct a 2 × 2 matrix corresponding to his or her answers. Let the answers of subject $i$ to the questions in treatment $j$ form the following 2 × 2 matrix.

                              Urn 2: Variable Y
                              Blue       Green      Total
Urn 1: Variable X    Blue     a_i^j      b_i^j      e_i^j
                     Green    c_i^j      d_i^j      f_i^j
                     Total    g_i^j      h_i^j      n = 100

In the experiment, $e_i^j$ and $f_i^j$ represent the subjective distribution of variable $X$ for individual $i$, and $g_i^j$ and $h_i^j$ represent the distribution of variable $Y$ (the color of the ball drawn from the second urn) in the same treatment. With this presentation, the value of the correlation for individual $i$ in treatment $j$ is computed as follows:

$$\phi_i^j = \frac{a_i^j d_i^j - c_i^j b_i^j}{\sqrt{e_i^j f_i^j g_i^j h_i^j}}$$

Since $\phi_i^j$ is the subjective value of the correlation for individual $i$ in treatment $j$, it is compared to the true value of the correlation in the same treatment (denoted $\phi^j$). Our measure is defined as follows:

$$\chi_i^j = \phi^j - \phi_i^j$$

$\chi_i^j$ quantifies the correlation neglect of individual $i$ in treatment $j$. Within this framework $\chi_i^j$ lies in the interval $[-2, 2]$, and values near 0 represent rational subjects.
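The computation of $\phi_i^j$ and $\chi_i^j$ from a subject's 2 × 2 answer matrix can be sketched as follows. The function names `phi_coefficient` and `correlation_neglect` are hypothetical and the cell counts in the example are made up; the formulas are the ones given above.

```python
import math


def phi_coefficient(a: float, b: float, c: float, d: float) -> float:
    """Phi coefficient of a 2 x 2 table with cells a, b (first row) and c, d (second row)."""
    e, f = a + b, c + d   # row totals: distribution of X
    g, h = a + c, b + d   # column totals: distribution of Y
    denom = math.sqrt(e * f * g * h)
    if denom == 0.0:
        # A zero row or column total leaves phi undefined.
        raise ValueError("phi undefined when a row or column total is zero")
    return (a * d - c * b) / denom


def correlation_neglect(phi_true: float, phi_subjective: float) -> float:
    """chi = true correlation minus the subject's implied correlation."""
    return phi_true - phi_subjective


if __name__ == "__main__":
    # Made-up answers out of n = 100 draws.
    phi_i = phi_coefficient(33, 17, 17, 33)
    print(phi_i)                             # 0.32
    print(correlation_neglect(0.33, phi_i))  # close to 0: approximately rational
    # A subject who reports independence in the 0.33 treatment.
    print(correlation_neglect(0.33, phi_coefficient(25, 25, 25, 25)))  # 0.33
```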

In the structural treatments we do not need the definition of the “Phi Coefficient”; subjects’ responses are used to compute their subjective correlation in the corresponding treatment using the formula

$$\rho_i = \frac{\mathrm{cov}_i(X, Y)}{\sigma_{i,X}\,\sigma_{i,Y}} = \frac{P[X = x, Y = y] - E[X]E[Y]}{\sqrt{P[X = x](1 - P[X = x]) \times P[Y = y](1 - P[Y = y])}}$$

This value is then compared to the theoretical value of the correlation, as described above, to quantify their neglect.

For the purpose of some statistical analyses, and because our belief formation tasks allow for 3 different presentations of the information, we compute a single measure of individual correlation neglect (median correlation neglect) for each subject and each informational presentation. We compute this measure by taking the median over treatments $j$ of the correlation neglect parameters within the same informational presentation $k$:

$$\chi_i^k = \operatorname{med}_j\left(\phi^{j,k} - \phi_i^{j,k}\right)$$

Each subject thus has one correlation neglect parameter per type of informational presentation.
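A minimal sketch of the median aggregation, for one subject, is given below. The function name `median_correlation_neglect`, the dictionary layout, and the subjective correlations in the example are all hypothetical; only the median-of-differences rule comes from the text.

```python
from statistics import median


def median_correlation_neglect(phi_true: dict, phi_subjective: dict) -> dict:
    """Median over treatments j of (true - subjective) correlation, per presentation k.

    phi_true maps treatment -> true correlation.
    phi_subjective maps presentation -> {treatment -> subjective correlation}.
    """
    return {
        presentation: median(phi_true[j] - phi_ij for j, phi_ij in by_treatment.items())
        for presentation, by_treatment in phi_subjective.items()
    }


if __name__ == "__main__":
    phi_true = {1: 0.33, 2: 0.20, 3: 0.11, 4: 0.005}
    # Made-up subjective correlations for a single subject.
    phi_subjective = {
        "structural": {1: 0.0, 2: 0.20, 3: 0.11, 4: 0.005},
        "empirical without order": {1: 0.13, 2: 0.0, 3: 0.0, 4: 0.005},
    }
    print(median_correlation_neglect(phi_true, phi_subjective))
```

Using the median rather than the mean keeps a single extreme answer from dominating a subject's summary parameter.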

Chapter 4

Results

4.1 Descriptive Statistics

In this experiment, 17 participants took part in 12 belief formation tasks, for a total of 204 observations. The experiment was run using the experimental software z-Tree, and a session lasted about 90 minutes. Subjects were all paid the same amount, 25 Canadian dollars. Table 4.1 presents descriptive statistics of participants. The mean age is 30 years and about 41% of subjects are women. The subjects are either undergraduate or graduate students.

Table 4.1 – Descriptive statistics

Variables    Mean    Std. Dev.    Min    Max
Age          30      6            22     47
Gender
  Male       0.59    0.49         0      1
  Female     0.41    0.49         0      1

Pooling the data across treatments, Table 4.2 presents summary statistics for all treatments and reveals that subjects’ evaluation of the correlation differs from the true value of the correlation in some treatments and presentations. Table 4.2 thus shows a substantial amount of correlation neglect across individuals and treatments. We focus our analysis on correlation neglect at the individual level in the next section.

4.2 Heterogeneity in correlation neglect

Because Table 4.2 reveals that some subjects neglect correlation, we develop a measure of individual correlation neglect in order to investigate this heterogeneity. Our design allows us to estimate each individual’s correlation neglect parameter by informational presentation $k$ and treatment $j$, and to summarize it as $\chi_i^k = \operatorname{med}_j(\phi^{j,k} - \phi_i^{j,k})$ over the levels of correlation. At the individual level, graphs 6-8 reveal that neglecting the correlation depends somewhat on the presentation of information.¹

Table 4.2 – Median correlation by treatment and informational presentation

                          Median subjective correlation
True correlation value    Structural    Empirical without order    Empirical with order
0.33                      0.00          0.15                       0.20
0.20                      0.20          0.18                       0.20
0.11                      0.12          0.00                       0.00
0.005                     0.00          0.00                       0.00

The table presents summary statistics of subjective correlations across treatments for each presentation.

Figure 4.1 provides kernel density estimates of the distribution of these median correlation neglect parameters by informational presentation. The plots reveal 2 spikes for the structural presentation: one around zero for the vast majority of subjects (who behave approximately rationally) and one around 0.2 (correlation neglecters). For the other types of informational presentation, we observe 3 spikes; the majority of subjects again behave rationally, with a spike around 0. The other spikes suggest the presence of different types of individuals. This procedure, however, ignores the variability in subjects’ tendency to neglect correlation. Figure 4.1 suggests the existence of different types of individuals who neglect the correlation to different degrees depending on the informational presentation. For each type of presentation, some participants behave as if they completely ignore the correlation between variables. For the purpose of a finite mixture model, we suppose that each participant is characterized by one of a set of two-dimensional types $(\chi_t^k, \sigma_t^k)$ with $t \in \{1, ..., T\}$ and $k \in \{1, 2, 3\}$, where the population weights $\pi_t$ are estimated along with $(\chi_t, \sigma_t)$. $\sigma_t^k$ is the variance of individual type $t$ in informational presentation $k$. The correlation neglect parameter of subject $i$ in round $j$ for presentation $k$ can be expressed as $\chi_i^{j,k} = \chi_t^k + \mu_i^{j,k}$, with $\mu_i^{j,k} \sim N(0, \sigma_t^k)$. The likelihood contribution of individual $i$ in presentation $k$ is given by:

$$L_i(\chi^k, \sigma^k, \pi^k) = \sum_{t=1}^{T} \pi_t \prod_{j=1}^{17} P\left[\chi_i^{j,k} = \chi_t^k + \mu_i^{j,k} \mid \chi_t^k, \sigma_t^k\right]$$

The grand log-likelihood is obtained by summing the logs of the individual likelihood contributions. The model generates different results depending on the number of types $T$ included. The estimations are run for up to $T = 5$, and we report results for up to $T = 3$ for the two historical data presentations (with and without order) and for up to $T = 4$ for the structural presentation of the information. For the other numbers of types, the results are very similar to the cases $T = 3$ and $T = 4$, and the likelihoods are scarcely improved.
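To make the structure of the finite mixture likelihood concrete, here is a minimal sketch for one informational presentation. It is not the estimation code used for the tables: the function names are hypothetical, the data in the example are made up, and the dispersion parameter is treated as a strictly positive standard deviation, which is an assumption (the text calls it a variance, and degenerate types with zero dispersion are not handled here).

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def individual_likelihood(chi_obs: list, chi_types: list,
                          sd_types: list, weights: list) -> float:
    """Sum over types of weight times the product of densities of one subject's parameters."""
    total = 0.0
    for chi_t, sd_t, pi_t in zip(chi_types, sd_types, weights):
        total += pi_t * math.prod(normal_pdf(x, chi_t, sd_t) for x in chi_obs)
    return total


def grand_log_likelihood(chi_by_subject: list, chi_types: list,
                         sd_types: list, weights: list) -> float:
    """Sum of the logs of the individual likelihood contributions."""
    return sum(
        math.log(individual_likelihood(obs, chi_types, sd_types, weights))
        for obs in chi_by_subject
    )


if __name__ == "__main__":
    # Made-up neglect parameters for three subjects, four rounds each.
    data = [[0.01, -0.02, 0.00, 0.03], [0.20, 0.18, 0.22, 0.19], [0.02, 0.00, -0.01, 0.01]]
    # Two hypothetical types: near-rational and correlation neglecter.
    print(grand_log_likelihood(data, [0.0, 0.2], [0.05, 0.05], [0.7, 0.3]))
```

In practice the type parameters and weights would be chosen to maximize this quantity; the maximization step is omitted from the sketch.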

1. We investigate this issue in Section 4.4.1.

Figure 4.1 – Heterogeneity in Correlation Neglect (kernel densities for the structural, empirical without order, and empirical with order presentations; vertical axis: density)

The estimates show that when we allow for only one type of individual in the experiment, the estimated variance is high for each informational presentation. This means that the model with one class of individuals masks a considerable degree of heterogeneity. Allowing for two types of individuals improves the model fit, but the variance is still high in some groups. The model suggests that the data can be explained as a mixture of two different groups of subjects. For one group, the estimates generate a correlation neglect parameter close to 0 (rational subjects). The second group is characterized by a large amount of correlation neglect. The high variances in some groups point to the presence of further sub-populations in the data, so we also allow for three types of individuals. If we allow for more than three types in our historical presentations and more than four types in the structural presentation, the model fit improves, but not dramatically, and the parameters estimated for some groups remain unchanged. This individual-level analysis shows that subjects’ tendency to ignore correlation masks considerable heterogeneity, and the results are consistent with the inference from Figure 4.1.


Table 4.3 – Structural

| Model | Type | χ | σ | π (%) | LL | AIC | BIC |
|---|---|---|---|---|---|---|---|
| T=1 | t=1 | 0.04 (0.02) | 0.14 (0.02) | 100 | 40.12 | -78.25 | -76 |
| T=2 | t=1 | -0.28 (0) | 0 (0) | 6 (0.03) | 724.2391 | -1442.478 | -1435.82 |
| | t=2 | 0.06 (0.014) | 0.11 (0.01) | 94 (0.03) | | | |
| T=3 | t=1 | -0.28 (0) | 0 (0) | 6 (0.03) | 735.9765 | -1459.953 | -1446.636 |
| | t=2 | -0.02 (0.008) | 0.05 (0.006) | 59 (0.06) | | | |
| | t=3 | 0.18 (0.01) | 0.044 (0.0065) | 35 (0.06) | | | |
| T=4 | t=1 | -0.28 (0) | 0 (0) | 6 (0.03) | 824.5472 | -1631.094 | -1611.119 |
| | t=2 | 0.004 (0.0004) | 0.002 (0.0003) | 47 (0.06) | | | |
| | t=3 | 0.18 (0.009) | 0.04 (0.006) | 35 (0.06) | | | |
| | t=4 | -0.12 (0.005) | 0.013 (0.034) | 12 (0.04) | | | |

χ, σ and π are parameter estimates; LL, AIC and BIC are goodness-of-fit statistics. 17 subjects, standard errors in parentheses.

Table 4.4 – Empirical without order

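| Model | Type | χ | σ | π (%) | LL | AIC | BIC |
|---|---|---|---|---|---|---|---|
| T=1 | t=1 | 0.13 (0.02) | 0.16 (0.03) | 100 | 26.600 | -51.198 | -48.978 |
| T=2 | t=1 | 0.05 (0.013) | 0.09 (0.01) | 77 (0.05) | 39.523 | -69.046 | -57.948 |
| | t=2 | 0.38 (0.014) | 0.05 (0.01) | 23 (0.05) | | | |
| T=3 | t=1 | -0.014 (0.008) | 0.044 (0.005) | 47 (0.06) | 91.186 | -166.372 | -148.616 |
| | t=2 | 0.38 (0.014) | 0.055 (0.05) | 23 (0.05) | | | |
| | t=3 | 0.15 (0.0008) | 0.003 (0.0005) | 30 (0.055) | | | |

χ, σ and π are parameter estimates; LL, AIC and BIC are goodness-of-fit statistics. 17 subjects, standard errors in parentheses.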

Table 4.5 – Empirical with order

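| Model | Type | χ | σ | π (%) | LL | AIC | BIC |
|---|---|---|---|---|---|---|---|
| T=1 | t=1 | 0.13 (0.03) | 0.21 (0.05) | 100 | 8.073 | -14.14576 | -11.926 |
| T=2 | t=1 | 0.05 (0.012) | 0.07 (0.009) | 75.7 (0.075) | 36.106 | -62.213 | -51.116 |
| | t=2 | 0.4 (0.09) | 0.27 (0.05) | 24.3 (0.075) | | | |
| T=3 | t=1 | 0.055 (0.01) | 0.07 (0.0075) | 82.5 (0.05) | 185.695 | -359.390 | -346.073 |
| | t=2 | 0.86 (0.00) | 0 (0.00) | 6 (0.03) | | | |
| | t=3 | 0.34 (0.011) | 0.033 (0.008) | 11.5 (0.04) | | | |

χ, σ and π are parameter estimates; LL, AIC and BIC are goodness-of-fit statistics. 17 subjects, standard errors in parentheses.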
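The goodness-of-fit columns can be related to the log-likelihood through the usual definitions AIC = 2k - 2LL and BIC = k ln(n) - 2LL. The sketch below is only a consistency check for the structural T = 2 model: the parameter count k = 3 and the sample size n = 68 are not stated in the text but are backed out from the reported figures, so they should be read as assumptions.

```python
import math


def aic(log_likelihood, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_likelihood


ll_structural_t2 = 724.2391  # LL reported for the structural T = 2 model
k_assumed = 3                # assumption: inferred from the reported AIC
n_assumed = 68               # assumption: inferred from the reported BIC

print(round(aic(ll_structural_t2, k_assumed), 3))
print(round(bic(ll_structural_t2, k_assumed, n_assumed), 2))
```

Lower AIC and BIC values indicate a better trade-off between fit and the number of parameters, which is how the tables compare models with different numbers of types.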


4.3 Cognitive effort and learning over time

This section investigates the relationship between correlation neglect and subjects' response times, which are commonly used as a proxy for cognitive effort (Rubinstein (2007)). Because of cognitive costs, subjects might develop a solution strategy and opt for a simplifying heuristic. Table 4.6 reports the results of a panel regression, with random effects and heteroskedasticity-robust standard errors, of the correlation neglect parameter for each treatment on subjects' response time. We then check whether subjects learn about correlation over time. The results show that correlation neglect is not associated with response time: spending more time on a task does not affect a subject's tendency to neglect correlation.

Table 4.6 – Cognitive effort and learning over time

Dependent variable: Correlation Neglect Parameter

| Variable | Estimate |
|---|---|
| Time Treatment | -0.00002 (0.00003) |
| Time Trend | -0.002 (0.009) |
| Time Treatment # Time Trend | 0.00001 (0.00002) |
| 1 if Female | 0.18 (0.13) |
| Age | 0.093** (0.038) |
| Age² | -0.0015*** (0.0006) |
| Education | -0.014 (0.045) |
| Constant | -1.36** (0.63) |

Control variables include age, gender, level of education. *p < 0.10, **p < 0.05, ***p < 0.01.

Do subjects learn about correlation over time? The results suggest that correlation neglect does not become smaller over time.


4.4 Mechanisms underlying correlation neglect

4.4.1 Correlation neglect and informational presentation

As shown in Table 4.7 and Figures A.1-A.4 (Appendix A), correlation neglect is associated with informational presentation. Some presentations make it easier than others to recognize the correlation between state variables. To investigate this issue, we regress the individual correlation neglect parameter on the informational presentation.

Table 4.7 – Correlation neglect and informational presentation

| Median correlation neglect | (1) | (2) |
|---|---|---|
| 2 if empirical without order | 0.14*** (0.04) | 0.065* (0.04) |
| 3 if matrix form | 0.04 (0.04) | 0.001 (0.04) |
| Constant | 0.0025 (0.03) | 0.05 (0.17) |
| Controls | No | Yes |
| Obs. | 204 | 204 |
| Pseudo R² | 0.05 | 0.08 |

Median regression, standard errors in parentheses. Control variables include age, gender, level of education. *p < 0.10, **p < 0.05, ***p < 0.01.

The results show that, relative to the baseline (structural presentation), observing historical data without ordering significantly increases subjects' tendency to neglect the correlation. There is no significant difference between the structural presentation and the historical data presentation with the possibility of ordering. These results indicate that the structural presentation and the empirical presentation with ordering are better at reducing individuals' correlation neglect.


4.4.2 Number of historical observations

Our endogenous treatment allows subjects to decide how many past observations they observe. A good strategy is to observe all the data, up to 100 draws, and to report the same distribution in the belief formation tasks. Tables 4.8-4.11 give the number of subjects who simultaneously choose a given number of draws in the two empirical treatments (with and without ordering) for each level of true correlation. The tables reveal that the majority of individuals observe up to 100 draws in both informational presentations. However, they do not tell us whether the same subject chooses the same number in all empirical treatments during the experiment. We investigate this issue by regressing the number of observations chosen by each subject on a time trend. The results in Table 4.12 suggest that the number of observations does not increase significantly over time; some subjects keep the same number of observations throughout the experiment. The results also suggest that level of education and gender affect the number of observations: more educated individuals tend to request more observations than less educated ones, and women request fewer observations than men.

Table 4.8 – Correlation=0.33

| Emp. Without order \ Emp. With order | 10 | 20 | 100 | Total |
|---|---|---|---|---|
| 10 | 2 | 1 | 0 | 3 |
| 20 | 3 | 0 | 1 | 4 |
| 30 | 1 | 0 | 2 | 3 |
| 100 | 0 | 0 | 7 | 7 |
| Total | 6 | 1 | 10 | 17 |

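Table 4.9 – Correlation=0.20

| Emp. Without order \ Emp. With order | 10 | 20 | 40 | 100 | Total |
|---|---|---|---|---|---|
| 10 | 1 | 1 | 0 | 0 | 2 |
| 20 | 2 | 1 | 1 | 2 | 6 |
| 30 | 0 | 0 | 0 | 1 | 1 |
| 60 | 1 | 0 | 0 | 1 | 2 |
| 100 | 1 | 0 | 0 | 5 | 6 |
| Total | 5 | 2 | 1 | 9 | 17 |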

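Table 4.10 – Correlation=0.11

| Emp. Without order \ Emp. With order | 10 | 20 | 30 | 100 | Total |
|---|---|---|---|---|---|
| 10 | 3 | 1 | 1 | 1 | 6 |
| 20 | 1 | 2 | 0 | 1 | 4 |
| 30 | 0 | 0 | 0 | 1 | 1 |
| 100 | 0 | 0 | 0 | 6 | 6 |
| Total | 4 | 3 | 1 | 9 | 17 |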

We are also interested in the effect of the number of historical observations allowed by our two empirical presentations on subjects' tendency to neglect the correlation. In Table 4.13, we run a median regression of the correlation neglect parameter on the number of observations. The table reveals that more information tends to reduce correlation neglect. It is therefore important to observe as much of the available information as possible before forming beliefs.


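Table 4.11 – Correlation=0.005

| Emp. Without order \ Emp. With order | 10 | 20 | 100 | Total |
|---|---|---|---|---|
| 10 | 1 | 0 | 0 | 1 |
| 20 | 1 | 2 | 2 | 5 |
| 30 | 1 | 0 | 0 | 1 |
| 40 | 0 | 0 | 1 | 1 |
| 90 | 0 | 0 | 1 | 1 |
| 100 | 3 | 1 | 4 | 8 |
| Total | 6 | 3 | 8 | 17 |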

Table 4.12 – Number of historical information over time

Dependent variable: Number of historical observations

Columns (1)-(2): Empirical without order; columns (3)-(4): Empirical with order; columns (5)-(6): Emp. without order + Emp. with order.

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Time | 0.66 (1.23) | 0.57 (1.2) | 0.07 (0.97) | -0.02 (0.0001) | 0.00 (0.45) | 0.31 (0.74) |
| Age | | -0.88 (0.83) | | -2.4*** (0.8) | | -1.65*** (0.57) |
| 1 if female | | -18.2 (11.13) | | -17.75* (10.6) | | -17.8** (7.65) |
| Educational level | | 2.23 (5.62) | | 9.22* (5.35) | | 5.8 (3.86) |
| Constant | 48.70*** (10) | 80*** (28) | 60*** (11.5) | 127*** (26.7) | 52.9*** (0.04) | 103.4*** (19.22) |
| Obs. | 68 | 68 | 68 | 68 | 136 | 136 |
| R² | 0.004 | 0.11 | 0.0001 | 0.30 | 0.0023 | 0.19 |

Median regression, standard errors in parentheses. Control variables include age, gender, level of education. *p < 0.10, **p < 0.05, ***p < 0.01.

Table 4.13 – Correlation neglect and number of past observations

Dependent variable: Correlation Neglect Parameter

Columns (1)-(2): Empirical without order; columns (3)-(4): Empirical with order.

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Numb. past observations | -0.002 (0.0008) | -0.002* (0.0007) | -0.002** (0.0006) | -0.0008 (0.0005) |
| Constant | 0.14** (0.06) | 0.56*** (0.16) | 0.17*** (0.05) | 0.06 (0.13) |
| Controls | No | Yes | No | Yes |
| Obs. | 68 | 68 | 68 | 68 |
| Pseudo R² | 0.03 | 0.11 | 0.07 | 0.08 |

Control variables include age, gender, level of education. *p < 0.10, **p < 0.05, ***p < 0.01.


Chapter 5

Applications

5.1 Mother’s age at first childbearing and child mortality

Young maternal age is associated with adverse birth outcomes for child and mother (Omar et al. (2010); Wang et al. (2012); Shrim et al. (2011); Kang et al. (2015); Fraser et al. (1995)). Measuring people's “ignorance” of this correlation may help design new strategies to reduce adverse birth outcomes for children and mothers. Recent literature on the determinants of the demand for child brides (Diarra and Dessy (2017)) suggests that people ignore the correlation between women's age at first birth and the level of mother mortality, or fail to factor this information into their marriage decisions. Our design allows us to measure individual correlation neglect by eliciting people's expectations about the age of the bride at first childbearing and the level of infant mortality.¹

We elicit these distributions using natural frequency questions, asking for probabilities of outcome intervals. People are asked to state, out of 100 randomly selected women at age 12 or less, how many will have their first child in each of several age groups. This question elicits their subjective distribution of age at first birth.

Then, people are asked to state, out of 100 randomly selected newborn babies, how many will die before reaching their first birthday. This question assesses their subjective distribution of infant mortality.

Finally, we elicit beliefs about the joint distribution of the two variables. Subjects have to answer these questions in the following way:

A.) I think that out of the 100 randomly selected women at age 12 or less,

(1) “...d1... will have their first birth before 18 years old,”
(2) “...d0... will have their first birth at 18 years old or later.”

B.) I think that out of the 100 randomly selected newborn babies,

(1) “...b1... will die before their first birthday,”
(2) “...b0... will be alive until their first birthday.”

C.) I think that out of the ...d1... women who will have their first birth before 18 years old,

(1) “...d1b1... will have their baby die before reaching age 1,”
(2) “...d1b0... will have their baby alive until their first birthday.”

D.) I think that out of the ...d0... women who will have their first birth at 18 years old or later,

(1) “...d0b1... will have their baby die before reaching age 1,”
(2) “...d0b0... will have their baby alive until their first birthday.”

1. Mother mortality is not observed in the data, so an empirical correlation with mother mortality cannot be computed.
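The answers to parts C and D form a 2×2 table of subjective frequencies, from which a subjective correlation can be computed and compared with an empirical one. The sketch below uses the phi coefficient (the Pearson correlation of two binary variables) as one possible summary; this choice, the function names and the example answers are assumptions for illustration, not the estimator used in this thesis.

```python
import math


def answers_are_consistent(d1, d0, d1b1, d1b0, d0b1, d0b0, total=100):
    """Check that the conditional answers add up to the marginal answers."""
    return d1 + d0 == total and d1b1 + d1b0 == d1 and d0b1 + d0b0 == d0


def phi_from_counts(d1b1, d1b0, d0b1, d0b0):
    """Phi coefficient of a 2x2 table of counts.

    Rows: first birth before 18 (d1) or at 18 and later (d0).
    Columns: baby dies before age 1 (b1) or survives (b0).
    Returns None when a margin is empty and the coefficient is undefined.
    """
    d1 = d1b1 + d1b0
    d0 = d0b1 + d0b0
    b1 = d1b1 + d0b1
    b0 = d1b0 + d0b0
    denominator = math.sqrt(d1 * d0 * b1 * b0)
    if denominator == 0:
        return None
    return (d1b1 * d0b0 - d1b0 * d0b1) / denominator


# Hypothetical answers of one respondent.
answers = {"d1": 60, "d0": 40, "d1b1": 12, "d1b0": 48, "d0b1": 4, "d0b0": 36}
print(answers_are_consistent(**answers))
print(phi_from_counts(answers["d1b1"], answers["d1b0"],
                      answers["d0b1"], answers["d0b0"]))
```

A respondent who reports the same mortality share in both age groups obtains a coefficient of zero, which is the pattern one would expect from someone who ignores the correlation entirely.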

5.2 Financial Allocation Decisions

Our measure based on belief elicitation can also be applied to measure people's perception of the correlation in asset returns when they make financial allocation decisions. For simplicity, as in Kallir and Sonsino (2009), the portfolio contains two assets, A and B, each with two levels of returns: "high" and "low". Information on the joint distribution of returns is presented in the form of empirical frequencies for the 12 preceding periods, as shown in Table 5.1. Subjects are told to predict returns for 100 additional periods, under the assumption that future returns are sampled from the empirical distribution.

| | Asset B: high | Asset B: low |
|---|---|---|
| Asset A: high | 5/12 | 1/12 |
| Asset A: low | 1/12 | 5/12 |

Table 5.1 – Joint distribution

The corresponding questions eliciting the joint distribution of 100 future returns for assets A and B are :

1. “How many out of the 100 additional periods do you think that assets A and B will simultaneously have high return levels?”

2. “How many out of the 100 additional periods do you think that assets A and B will simultaneously have low return levels?”

3. “How many out of the 100 additional periods do you think that asset A will have a high return and asset B a low return?”

4. “How many out of the 100 additional periods do you think that asset A will have a low return and asset B a high return?”

By varying the correlation in asset returns and asking the same questions as above, we can analyse subjects' responses to changes in correlation.
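For the joint distribution of Table 5.1, the benchmark answers and the correlation that subjects are supposed to perceive can be worked out directly. The following sketch is an illustration only (the function and variable names are hypothetical): it scales the empirical frequencies to 100 periods and computes the correlation of the two binary return indicators, which equals 2/3 for this table.

```python
import math
from fractions import Fraction

# Joint distribution from Table 5.1, keyed by (return of A, return of B).
joint = {
    ("high", "high"): Fraction(5, 12),
    ("high", "low"): Fraction(1, 12),
    ("low", "high"): Fraction(1, 12),
    ("low", "low"): Fraction(5, 12),
}


def benchmark_counts(joint, periods=100):
    """Expected number of periods in each cell when sampling from `joint`."""
    return {cell: float(p * periods) for cell, p in joint.items()}


def return_correlation(joint):
    """Correlation between the indicators 'A is high' and 'B is high'."""
    p_a = joint[("high", "high")] + joint[("high", "low")]
    p_b = joint[("high", "high")] + joint[("low", "high")]
    covariance = joint[("high", "high")] - p_a * p_b
    variance_product = p_a * (1 - p_a) * p_b * (1 - p_b)
    return float(covariance) / math.sqrt(float(variance_product))


print(benchmark_counts(joint))
print(return_correlation(joint))
```

Changing the four probabilities while keeping the marginals at one half is one simple way to vary the correlation across tasks.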


5.3 Pre-election Polls

Subjects are told to predict the results of an election. They have a choice between two options, A and B. Each option has two levels of cost: high cost or low cost. The two options represent the two candidates in the presidential race. The population consists of 4 types of individuals, characterized by 4 different colors: blue, dark green, pale green and yellow. Each individual belongs to exactly one color type. Subjects in the experiment observe the decisions of 100 individuals randomly selected from the whole population (25 individuals per color). The results are presented in Table 5.2. Subjects are told that this table represents the joint distribution of the choice of an option and its cost. Subjects have to predict the choices of 100 additional people sampled from the same population, with 25 persons drawn from each group. The following questions are used to assess their beliefs about the 100 additional observations:

| Group | Option | High cost | Low cost | Total |
|---|---|---|---|---|
| Blue | Option A | 2 | 20 | 22 |
| Blue | Option B | 1 | 2 | 3 |
| Blue | Total | 3 | 22 | 25 |
| Pale Green | Option A | 12 | 3 | 15 |
| Pale Green | Option B | 5 | 5 | 10 |
| Pale Green | Total | 17 | 8 | 25 |
| Dark Green | Option A | 1 | 11 | 12 |
| Dark Green | Option B | 1 | 12 | 13 |
| Dark Green | Total | 2 | 23 | 25 |
| Yellow | Option A | 4 | 1 | 5 |
| Yellow | Option B | 18 | 2 | 20 |
| Yellow | Total | 22 | 3 | 25 |

Table 5.2 – Subjects' choices for each type of individual

1. “How many out of the 100 additional people do you think will choose option A and will pay high costs?”

2. “How many out of the 100 additional people do you think will choose option A and will pay low costs?”

3. “How many out of the 100 additional people do you think will choose option B and will pay high costs?”

4. “How many out of the 100 additional people do you think will choose option B and will pay low costs?”

Subjects who perceive the correlation between the variables should answer by summing, across groups, the number of individuals in the same situation in the historical distribution they observed. This design makes it possible to measure each subject's level of correlation neglect.
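The summation across groups can be sketched as follows, using the counts of Table 5.2. The pooled counts are the benchmark answers to the four questions; the distance function at the end is a hypothetical way to compare a subject's answers with that benchmark and is not the measure defined in this thesis.

```python
# Counts from Table 5.2, keyed by (option, cost) within each color group.
groups = {
    "blue": {("A", "high"): 2, ("A", "low"): 20, ("B", "high"): 1, ("B", "low"): 2},
    "pale_green": {("A", "high"): 12, ("A", "low"): 3, ("B", "high"): 5, ("B", "low"): 5},
    "dark_green": {("A", "high"): 1, ("A", "low"): 11, ("B", "high"): 1, ("B", "low"): 12},
    "yellow": {("A", "high"): 4, ("A", "low"): 1, ("B", "high"): 18, ("B", "low"): 2},
}


def pooled_counts(groups):
    """Sum each (option, cost) cell across all color groups."""
    pooled = {}
    for cells in groups.values():
        for cell, count in cells.items():
            pooled[cell] = pooled.get(cell, 0) + count
    return pooled


def total_absolute_deviation(answers, benchmark):
    """Hypothetical distance between a subject's answers and the benchmark."""
    return sum(abs(answers[cell] - benchmark[cell]) for cell in benchmark)


benchmark = pooled_counts(groups)
print(benchmark)

# Hypothetical answers of a subject who spreads the 100 people evenly.
even_answers = {cell: 25 for cell in benchmark}
print(total_absolute_deviation(even_answers, benchmark))
```

With 25 people per group in both the observed sample and the additional sample, the pooled counts need no reweighting before they are compared with the answers.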


Conclusion

This paper has provided a sequence of belief formation tasks that vary the level of correlation across tasks and has shown that people neglect correlation when they face some types of informational presentation. We treat correlation neglect as an individual characteristic of a person and propose an empirical measure of it. Our measure is based on a sequence of laboratory experiments and is tested under 3 types of informational presentation. In the first, subjects learn about the structure of the decision problem and make their predictions; in the second, they observe historical data on the state variables without the possibility of ordering; and in the third, they can sort the historical data.

Our results suggest a substantial amount of heterogeneity in correlation neglect at the individual level. Two types of presentation reduce subjects' tendency to neglect correlation: the structural presentation and the historical presentation with ordering. Women are likely to neglect correlation more than men, and the level of education does not significantly affect neglect. The analysis of the empirical treatments suggests that individuals should observe all the available information when forming their beliefs about state variables. Subjects do not learn about correlation over time, and cognitive effort does not affect the correlation neglect parameter.

Because our strategy measures correlation neglect at the individual level without any ancillary hypothesis about individuals' preferences and without observing their decision-making process, it can be used as a general measure of correlation neglect whenever people are asked about their expectations of the distribution of state variables.

We propose a simple measure of individual correlation neglect that can be applied in various domains, whereas some authors suggest measures that are specific to one domain (investment decisions and portfolio choice problems, auction market settings, etc.). These measures introduce the correlation either through the structure of information (correlation neglect in informational sources or a common source of information) or through observed historical data (e.g., returns), and compute correlation neglect from individuals' investment decisions.


Bibliography

Cohen, M., J.-Y. Jaffray, and T. Said (1987). Experimental comparisons of individual behavior under risk and under uncertainty for gains and for losses. Organizational Behavior and Human Decision Processes 39, 1–22.

DeMarzo, P., D. Vayanos, and J. Zwiebel (2003). Persuasion bias, social influence, and unidimensional opinions. Quarterly Journal of Economics 118 (3), 909–968.

Diarra, S. and S. Dessy (2017). The determinants of the demand for child brides in Sub-Saharan Africa. Working paper.

Enke, B. and F. Zimmermann (2015). Correlation neglect in belief formation. University of Zurich. Mimeo.

Eyster, E. and G. Weizsäcker (2011). Correlation neglect in financial decision making. Working paper.

Fraser, A., J. Brockert, and R. Ward (1995). Association of young maternal age and adverse reproductive outcomes. The New England Journal of Medicine 332 (17).

Holt, C. A. and S. K. Laury (2002). Risk aversion and incentive effects. American Economic Review 92 (5), 1644–1655.

Kallir, I. and D. Sonsino (2009). The neglect of correlation in allocation decisions. Southern Economic Journal 75 (4), 1045–1066.

Kang, G., J. Lim, A. Sugam, and L. Y. Lee (2015). Adverse effects of young maternal age on neonatal outcomes. Singapore Medical Journal 56 (3), 157–163.

Levy, G. and R. Razin (2015). Correlation neglect, voting behavior, and information aggregation. American Economic Review 105 (4), 1634–1645.

Omar, K., S. Hasim, N. A. Muhammad, and A. Jaffar (2010). Adolescent pregnancy outcomes and risk factors in Malaysia. International Journal of Gynecology and Obstetrics 111, 220–223.

Ortoleva, P. and E. Snowberg (2015). Overconfidence in political behavior. American Economic Review 105 (2), 504–535.

Rubinstein, A. (2007). Instinctive and cognitive reasoning: A study of response times. Economic Journal 117 (523), 1243–1259.

Shrim, A., S. Ates, A. Mallozzi, and R. Brown (2011). Is young maternal age really a risk factor for adverse pregnancy outcome in a Canadian tertiary referral hospital? North American Society for Pediatric and Adolescent Gynecology 24, 218–222.

Wang, S., L. Wang, and M. Lee (2012). Adolescent mothers and older mothers: who is at high risk for adverse birth outcomes? Public Health 126, 1038–1043.


Appendix A

Appendix

A.1 Enke and Zimmermann: correlation between signals

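Computers generate two signals $s_A$ and $\tilde{s}_B = \frac{s_A + s_B}{2}$, where $s_h \sim N\left(\mu, \left(\frac{\mu}{2}\right)^2\right)$ for $h \in \{A, B\}$. Treating $s_A$ and $s_B$ as independent, we have:

$$\mathrm{corr}(s_A, \tilde{s}_B) = \frac{\mathrm{cov}\left(s_A, \frac{s_A + s_B}{2}\right)}{\sigma_{s_A} \sigma_{\tilde{s}_B}} = \frac{\frac{1}{2}(\sigma_{s_A})^2}{\sigma_{s_A} \sigma_{\tilde{s}_B}} = \frac{1}{2} \frac{\sigma_{s_A}}{\sigma_{\tilde{s}_B}}$$

$$\sigma_{\tilde{s}_B} = \sqrt{(\sigma_{\tilde{s}_B})^2} = \sqrt{\mathrm{Var}\left(\frac{s_A + s_B}{2}\right)} = \sqrt{\frac{1}{4}\left(\frac{\mu}{2}\right)^2 + \frac{1}{4}\left(\frac{\mu}{2}\right)^2} = \frac{\mu}{2}\sqrt{\frac{1}{2}} = \sqrt{\frac{1}{2}}\,\sigma_{s_A}$$

Finally, the linear (Pearson) correlation coefficient between $s_A$ and $\tilde{s}_B$ is:

$$\mathrm{corr}(s_A, \tilde{s}_B) = \frac{\sqrt{2}}{2} \approx 71\%$$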
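The value of about 71% can be checked by simulation. The sketch below draws independent normal signals with standard deviation µ/2 and computes the sample Pearson correlation between the first signal and the average of the two; the value µ = 10, the sample size and the seed are arbitrary assumptions, and the independence of the two raw signals is taken from the derivation rather than stated in the text.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_signal_correlation(mu=10.0, n=100_000, seed=0):
    """Correlation between s_A and the averaged signal (s_A + s_B) / 2."""
    rng = random.Random(seed)
    sd = mu / 2.0
    s_a = [rng.gauss(mu, sd) for _ in range(n)]
    s_b = [rng.gauss(mu, sd) for _ in range(n)]
    s_tilde_b = [(a + b) / 2.0 for a, b in zip(s_a, s_b)]
    return pearson(s_a, s_tilde_b)


print(simulate_signal_correlation())
print(math.sqrt(2.0) / 2.0)  # theoretical value for comparison
```

The simulated value should be close to the theoretical one for any positive µ, since the correlation does not depend on the scale of the signals.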

A.2 Convergence of Correlation in the Simple Set-Up

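The linear correlation between X and Y is defined as follows:

$$\rho_{X,Y} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E[XY] - E[X]E[Y]}{\sigma_X \sigma_Y}$$

$$E[X] = P[X = B] = \frac{B_1}{N_1} = \frac{N_1/2}{N_1} = \frac{1}{2}$$

$$E[Y] = P[Y = B] = \frac{B_1}{N_1} \cdot \frac{B_2 + 1}{N_2 + 1} + \frac{G_1}{N_1} \cdot \frac{B_2}{N_2 + 1} = \frac{1}{2} \quad \text{with } G_1 = \frac{N_1}{2}$$

$$\sigma_X = \sqrt{P[X = B](1 - P[X = B])} = \frac{1}{2}$$

$$\sigma_Y = \sqrt{P[Y = B](1 - P[Y = B])} = \frac{1}{2}$$

$$E[XY] = P[X = B; Y = B] = \frac{B_1}{N_1} \cdot \frac{B_2 + 1}{N_2 + 1} = \frac{1}{4} \cdot \frac{N_2 + 2}{N_2 + 1}$$

$$\rho_{X,Y} = \frac{1}{N_2 + 1}$$

$$\lim_{N_2 \to \infty} \rho_{X,Y} = 0, \qquad \lim_{N_1 \to \infty} \rho_{X,Y} = \frac{1}{N_2 + 1}$$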
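The closed form ρ = 1/(N2 + 1) can be checked numerically. The sketch below reads the formulas as two binary draws in which the outcome of the first draw adds one ball to the second urn; this reading, the names used, and the condition B2 = N2/2 (which the result E[Y] = 1/2 requires but this appendix does not state) are assumptions made for illustration.

```python
import math
from fractions import Fraction


def urn_correlation(n1, n2):
    """Correlation between X and Y when half of each urn is of type B.

    n1, n2 : sizes of the first and second urn (N1 and N2 in the text).
    """
    b1 = Fraction(n1, 2)          # B1 = N1 / 2, as in the text
    g1 = n1 - b1                  # G1 = N1 / 2
    b2 = Fraction(n2, 2)          # assumption: B2 = N2 / 2
    p_x = b1 / n1                 # P[X = B]
    p_y_given_b = (b2 + 1) / (n2 + 1)   # P[Y = B | X = B]
    p_y_given_g = b2 / (n2 + 1)         # P[Y = B | X = G]
    p_y = p_x * p_y_given_b + (g1 / n1) * p_y_given_g
    e_xy = p_x * p_y_given_b      # P[X = B; Y = B]
    covariance = e_xy - p_x * p_y
    variance_product = p_x * (1 - p_x) * p_y * (1 - p_y)
    return float(covariance) / math.sqrt(float(variance_product))


for n2 in (4, 20, 100, 1000):
    print(n2, urn_correlation(10, n2), 1.0 / (n2 + 1))
```

The output does not change with `n1`, in line with the second limit, while it shrinks toward zero as `n2` grows, in line with the first.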


A.3 Other Results

[Figure: axes "Median Corr. Neglect - Structural" and "Median Corr. Neglect - Empirical without order", each ranging from -1 to 1.5]

Figure A.1 – Median Correlation Neglect - Structural vs Empirical Without Order

[Figure: axes "Median Corr. Neglect - Structural" and "Median Corr. Neglect - Empirical with order", each ranging from -1 to 1.5]

Figure A.2 – Median Correlation Neglect - Structural vs Empirical With Order


[Figure: axes "Median Corr. Neglect - Empirical without order" and "Median Corr. Neglect - Empirical with order", each ranging from -1 to 1.5]

Figure A.3 – Median Correlation Neglect - Empirical With Order vs Empirical Without Order

[Figure: Mean Bias (-.2 to .2) against Value of Correlation (0 to .4); series: Structural, Empirical without order, Empirical with order]

Figure A.4 – Mean correlation neglect per value of true correlation


[Figure: three histogram panels (Structural, Empirical without order, Empirical with order); vertical axis Fraction (0 to .8), horizontal axis from -.5 to .5]

Figure A.5 – Histograms of the difference between original median correlation neglect parameters and modified median correlation neglect parameters when excluding one parameter that is closest to the original median correlation neglect parameter.


A.4 Instructions

Figure A.6 – Presentation - Structural


Figure A.7 – Presentation - Empirical Without Order


Figure A.8 – Presentation - Empirical With Order


Figure A.9 – Presentation - Empirical With Order
