A review of small area estimation methods for poverty mapping

(1)

A review of small area estimation methods for poverty mapping

Isabel Molina

Dept. of Statistics, Univ. Carlos III de Madrid

Coauthors: Mar´ıa Guadarrama, Balgobin Nandram, J.N.K. Rao

(2)

POPULATION AT RISK OF POVERTY

(3)

EXAMPLE

Survey on Income and Living Conditions 2006 Total sample size:n = 34,389 persons.

Province×gender sample sizes:

(Barcelona,F) (C´ordoba,F) (Tarragona,M) (Soria,F)

1483 230 129 17

3

(4)

NOTATION

• U finite population of sizeN.

• U partitioned into D subsetsU1, . . . ,UD of sizes N1, . . . ,ND, calleddomains or areas.

• s sample of size n drawn from the population U.

• sd=s∩Ud sub-sample from domaind of size nd.

• r_d =U_d−s_d out-of-sample elements from domaind.

4

(5)

DOMAIN PARAMETERS

• y_dj outcome for unit j in area d.

• y_d = (y_d1, . . . ,y_dN_d)⁰ vector of outcomes forarea d.

• Target quantities:Possiblynon-linear function of y_d, δd =hd(yd), d = 1, . . . ,D.

• Example: mean of d-th area,

Y¯d = 1 N_d

Nd

X

j=1

ydj.

5

(6)

DIRECT ESTIMATION

• Based essentially on the area-specificsample data.

• π_dj =P(j ∈s_d) inclusion prob., w_dj = 1/π_dj sampling weight.

• Sampling weights w_dj protect against informative sampling (probability of selection depending on outcomes).

• Example: Horvitz-Thompson direct estimator, ˆ¯

Y_d^DIR = 1 N_d

X

j∈sd

w_djy_dj.

• Sampling varianceV_π( ˆY¯_d^DIR) can be estimated easily with the area-specific data.

(7)

POVERTY AND INEQ. INDICATORS

• Edj welfare measure for indiv.j in domaind.

• z = poverty line.

• FGT poverty indicator of order α for domain d:

F_αd = 1 Nd

Nd

X

j=1

z−E_dj z

α

I(E_dj <z), α≥0.

• Whenα= 0⇒Poverty incidence(or at-risk-of-poverty rate)

• Whenα= 1⇒ Poverty gap

• Other: Quintile share ratio, Gini coef., Sen index, Theil index, Generalized entropy, Fuzzy monetary/supplementary index.

XFoster, Greer& Thornbecke (1984), Econom.

XNeri, Ballini & Betti (2005), Stat. in Transition 7

(8)

DIRECT ESTIMATORS

• FGT pov. indicator as a mean:

F_αd = 1 N_d

Nd

X

j=1

F_αdj, F_αdj =

z−E_dj z

α

I(E_dj <z)

• HT estimator:

Fˆ_αd^DIR = 1 Nd

X

j∈sd

w_djF_αdj.

(9)

DIRECT ESTIMATORS

ADVANTAGES:

• No model assumptions.

• Sampling weights can be used⇒ Approx.design-unbiased even under informative sampling.

• Additivity (Benchmarking property):

D

X

d=1

Yˆ_d^DIR = ˆY^DIR.

DISADVANTAGES:

• V_π( ˆY¯_d^DIR)↑ as n_d ↓. Veryinefficient for small domains.

• Estimator of sampling error very inefficientfor small domains.

• Cannot be calculated for out-of-sample areas.

9

(10)

INDIRECT ESTIMATORS

• Indirect estimator: It borrows strengthfrom other areas by making some kind ofhomogeneityassumption across areas (model withcommonparameters) that uses auxiliary information.

10

(11)

FAY-HERRIOT (FH) MODEL

(i) Linking model:

δd =x⁰_dβ+ud, ud

iid∼(0, σ²_u), d = 1, . . . ,D σ_u² unknown

(ii) Sampling model:

δˆ^DIR_d =δ_d+e_d, e_d ^ind∼ (0, ψ_d), d = 1, . . . ,D ud anded indep., ψd =Vπ(ˆδ^DIR_d |δ_d)known∀d (iii) Combined model: Linear mixed model

δˆ^DIR_d =x⁰_dβ+u_d+e_d, d = 1, . . . ,D.

XFay & Herriot (1979), JASA 11

(12)

BEST LINEAR UNBIASED PREDICTOR

• Minimizes the MSE among linearandunbiased estimators.

• Easily obtained by fitting the mixed model:

δ˜_d^BLUP =x⁰_dβ˜+ ˜ud, where

β˜= ˜β(σ²_u) =

D

X

d=1

γ_dx_dx⁰_d

!⁻¹ _D X

d=1

γ_dx_dδˆ^DIR_d ,

˜

ud = ˜ud(σ_u²) =γd(ˆδ_d^DIR−x⁰_dβ),˜ γd = σ_u² σ²_u+ψ_d

(13)

GOOD PROPERTY OF THE BLUP

• Weighted combination of direct and “regression synthetic”

estimator:

δ˜^BLUP_d =γ_dˆδ_d^DIR+ (1−γ_d)x⁰_dβ,˜ γ_d = σ_u² σ_u²+ψ_d.

• When ˆδ^DIR_d is reliable (↓ ψd) or when area heterogeneity is not well explained by x⁰_dβ˜ (↑ σ²_u), ˜δ_d^BLUP −→δˆ^DIR_d .

• Otherwise, if ˆδ^DIR_d unreliableorx⁰_dβ˜reliable, ˜δ_d^BLUP −→x⁰_dβ.˜ 13

(14)

EMPIRICAL BLUP (EBLUP)

• δ˜^BLUP_d depends on unknownσ²_u through ˜βand γd: δ˜_d^BLUP = ˜δ^BLUP_d (σ_u²)

• Empirical BLUP (EBLUP) ofδ_d: ˆσ²_u estimator of σ_u², δˆ^EBLUP_d = ˜δ_d^BLUP(ˆσ_u²), d = 1, . . . ,D

• The EBLUP remains model-unbiased under certain conditions (typically satisfied).

• MSE of EBLUP under FH model can be approximated with o(1/D) bias under normality.

(15)

NESTED ERROR MODEL

• Nested error unit level model:

y_dj =x⁰_djβ+u_d+e_dj, j = 1, . . . ,N_d, d = 1, . . . ,D u_d ^iid∼N(0, σ_u²), e_dj ^iid∼N(0, σ²_e)

XBattese, Harter & Fuller (1988), JASA

• The distribution of incomes E_dj is highly right skewed.

• Select a transformation T() such that the distribution of ydj =T(Edj) is approximately Normal.

• Assumption: y_dj =T(E_dj) satisfies the nested error model.

15

(16)

EB METHOD FOR POVERTY ESTIMATION

• Poverty indicators in terms of yd = (yd1, . . . ,ydNd)⁰:

F_αd = 1 Nd

Nd

X

j=1

z−T⁻¹(y_dj) z

α

I

T⁻¹(y_dj)<z =h_α(y_d).

• Partition yd into sample and out-of-sample: yd = (y⁰_ds,y_dr⁰ )⁰

• Best predictor:Minimizes the MSE F˜_αd^B =E_y_dr

F_αd|y_ds;β, σ²_u, σ_e² .

• Empirical best (EB) predictor: Fˆ_αd^EB = ˜F_αd^B ( ˆβ,ˆσ_u²,σˆ_e²).

XMolina and Rao (2010), CJS 16

(17)

MODEL-BASED EXPERIMENT

• Simulate I = 10⁴ populations from thenested-error model.

• For each population i = 1, . . . ,10⁴, compute true domain FGT poverty indicatorsF_αd⁽ⁱ⁾ for α= 0,1 and d = 1, . . . ,D.

• Take the sample part of each population (assuming SRS within each domain) and compute EB, direct and ELL estimates(Elbers, Lanjouw and Lanjouw (2003), Econometrica).

• Approximate true MSEsof EB estimators as

MSE( ˆF_αd^EB) = 1 I

I

X

i=1

Fˆ_αd^EB(i⁾−F_αd⁽ⁱ⁾ 2

, α= 0,1, d = 1, . . . ,D.

• Similarly for direct and ELL estimators.

17

(18)

MODEL-BASED EXPERIMENT

• Population and sample sizes:

N = 20000, D=80

N_d = 250, n_d=50, d = 1, . . . ,D

• Variance components:

σ_e² = (0.5)², σ²_u= (0.15)²

• Explanatory variables:2 dummies:

X₁ ∈ {0,1}, p_1d = 0.3 + 0.5d/80, d = 1, . . . ,D.

X₂ ∈ {0,1}, p_2d = 0.2, d = 1, . . . ,D.

• Coefficients:

β= (3,0.03,−0.04)⁰

(19)

POVERTY INCIDENCE

•EBmuch more efficient than ELL and direct estimators.

•ELL even less efficientthan direct estimators!

Bias ( %)

0 20 40 60 80

−0.3−0.2−0.10.00.10.20.3

Area

Bias, poverty incidence (x100)

●

●●

●

●●

●

●●●

●

●●

●

●●

●

●●

●

●●

●●●

●

● EB Direct ● ELL

MSE (×10⁴)

0 20 40 60 80

10203040506070

Area

MSE, poverty incidence (x10000)

●●●●●●●●●

●●●●●●●●●●

●

●●

●

●●●●●●●●●●●

●●●●●●

●

●●●●

●

●●●

●

●●●

●

●●●●●●●●●

●

●●●●

EB Direct● ELL

19

(20)

POVERTY GAP

•Same conclusions as for poverty incidence.

Bias ( %)

0 20 40 60 80

−0.10−0.050.000.050.10

Area

Bias, poverty gap (x100)

●

●●

●

●●

●

●●

●

●●

●

●●

●

●●●

●

● EB Direct ● ELL

MSE (×10⁴)

0 20 40 60 80

1234

Area

MSE, poverty gap (x10000)

●●

●

●●●●●●

●●●●

●

●●●●

●

●●

●

●●

●●●●●●●

●●●

●●●●

●

●●

●

●●

●

●●●

●

●●

●

●●●●●

●

●●●●

● EB Direct ELL

(21)

HIERARCHICAL BAYES METHOD

• Reparameterization:

ρ=σ²_u/(σ²_u+σ²_e)

• Reparameterized nested-error model:

y_dj|u_d,β, σ²_e ^ind∼ N(x⁰_djβ+u_d, σ²_e), u_d|ρ, σ_e²^ind∼ N

0, ρ

1−ρσ²_e

• Noninformativeprior:

π(β, σ_e², ρ)∝1/σ_e²

XRao, Nandram& Molina (2014), Annals of Applied Statistics21

(22)

HIERARCHICAL BAYES METHOD

• Properposterior density (provided X full column rank andρ is in a closed interval from (0,1)):

• Conditional distributions:

ud|β, σ²_e, ρ,ys

ind∼ Normal, β|σ²_e, ρ,y_s ∼Normal, σ⁻²_e |ρ,y_s ∼Gamma

• π₄(ρ|y_s) not simple butρ-values can be generated using a grid method.

XRao, Nandram& Molina (2014), Annals of Applied Statistics22

(23)

COMPARISON WITH FH ESTIMATES

•FH estimates biased because oflinearityproblems.

•HB≈EB estimates nearly unbiased and much more efficient.

Rel. Bias ( %)

0 20 40 60 80

−8−6−4−2024

1:D

Relative Bias (%)

●

●●

●

●●●

●●

●

●●

●

●●

●

●●●●

●

●●

●

●●

●

●●●●

●

●●

●

●●

●

● DIR FH HB

Rel. Root MSE ( %)

0 20 40 60 80

20222426283032

1:D

Relative MSE (%)

●

●●

●

●●●●●

●

●●

●

●●

●

●●

●

●●●

●

●●

● DIR FH HB

23

(24)

INFORMATIVE SAMPLING

•HB and EB estimates biasedunder informative sampling.

•FH estimates with known dependency of inclusion probabilities on true responsesless biased.

Rel. Bias ( %)

0 20 40 60 80

−10−505101520

1:D

Relative Bias (%)

●

●●●

●

●●

●●●

●●●●●

●

●●

●●●

●

●●

●

●●

●

●●

●

●●

●

●●

●

● DIR FH HB

Rel. Root MSE ( %)

0 20 40 60 80

22242628303234

1:D

Relative MSE (%)

●

●●

●

●●

●

●●

●

●●

●

●●●●●

●

●●

●

●●

●

●●●

●●

●

● DIR FH HB

(25)

PSEUDO EB

• Best predictor for additive area parameters:

F˜_αd^B =Eydr [Fαd|y_ds] = 1 N_d



 X

j∈s_d

Fαdj +X

j∈r_d

E(Fαdj|y_ds)

| {z }



,

• Under the nested-error model:

E(F_αdj|y_ds) =E(F_αdj|¯y_d)−→E(F_αdj|¯y_dw).

• Pseudo Best predictorfor additive parameters:

F˜_αd^PB= 1 Nd



 X

j∈s_d

F_αdj +X

j∈r_d

E(F_αdj|¯y_dw)

| {z }



.

25

(26)

PSEUDO EB

•Includingsampling weights reduces the design bias!

•Pseudo EB estimators do not lose much efficiency.

Rel. Bias ( %)

0 20 40 60 80

−50510152025

Area

Relative Bias (%)

SM WSM EB Pseudo EB

Rel. Root MSE ( %)

0 20 40 60 80

40506070

Area

RRMSE (%)

SM WSM EB Pseudo EB

(27)

OTHER EXTENSIONS

• Existence of two grouping levels−→ EB method under a two-foldnested-error model.

XMarhuenda, Molina, Morales and Rao (2017), JRSSA

• Particular non linear parameter: Area mean under a model for the log-transformation of the target variable −→

Explicit exact EB estimator andasymptotic MSE.

XMolina and Mart´ın (2018), AOS

• EB method for poverty estimation assumesnormality for some transformation of the variable of interest−→ Extension to skeweddistributions.

XGraf, Mar´ın and Molina (2018), Test

27

(28)

POVERTY MAPPING IN SPAIN

• Data source: Spanish Survey on Income and Living Conditions (EU-SILC) of 2006.

• Target:Calculate EB and HB estimates of poverty incidences and gaps for Spanish provinces by gender.

• Areas: D= 52provincesfor eachgender. We fit a separate model for each gender.

• Transformation: We consider the nested-error model for the log-equivalized disposable income:

y_dj =T(E_dj) = log(E_dj +k).

• Explanatory variables:indicators of 5age groups, of having Spanishnationality, of 3education levels and of laborforce status (unemployed, employed or inactive).

(29)

HIERARCHICAL BAYES METHOD

•HB estimates practically the same as EB ones. The same in simulations under the frequentist setup (frequentist validity).

Poverty incidence ( %)

EB estimate poverty incidence (x100)

HB estimate poverty incidence (x100)

10 15 20 25 30 35

●

● ●

●

10 15 20 25 30 35

gender

●M F

Poverty gap ( %)

EB estimate poverty gap (x100)

HB estimate poverty gap (x100)

4 6 8 10 12 14

●

●●

●

4 6 8 10 12 14

gender

●M F

29

(30)

POVERTY MAPPING IN SPAIN

• Estimated CVs of direct, EB and HB estimators of poverty incidences for selected provinces crossed with gender:

Province Gender n_d Obs. Poor CVˆ Dir. CVˆ EB CVˆ HB

Soria F 17 6 51.87 16.56 19.82

Tarragona M 129 18 24.44 14.88 12.35

C´ordoba F 230 73 13.05 6.24 6.93

Badajoz M 472 175 8.38 3.48 4.24

Barcelona F 1483 191 9.38 6.51 4.52

(31)

RESULTS

Poverty incidence ( %): Men

under 15 15 − 20 20 − 25 25 − 30 over 30

Poverty incidence ( %): Women

under 15 15 − 20 20 − 25 25 − 30 over 30

Pov.inc.≥30 %,Men: Almer´ıa, Granada, C´ordoba, Badajoz, ´Avila, Salamanca, Zamora, Cuenca.

Women:also Ja´en, Albacete, Ciudad Real, Palencia, Soria. 31

(32)

RESULTS

Poverty gap ( %): Men

under 5 5 − 7.5 7.5 − 10 10 − 12.5 over 12.5

Poverty gap ( %): Women

under 5 5 − 7.5 7.5 − 10 10 − 12.5 over 12.5

Pov.gap≥12.5 %,Men:Almer´ıa, Badajoz, Zamora, Cuenca.

(33)

COMPARISON WITH DIRECT ESTIMATORS

DISADVANTAGES:

• Requiremodel assumptions (model checking important!).

• Not design-unbiased in general =⇒ Sampling weights can be incorporated to reduce design bias.

• Requireadjustment to satisfy the benchmarking property:

D

X

d=1

Yˆ_d = ˆY^DIR.

ADVANTAGES:

• Very efficientfor small domains.

• Estimator of model MSE efficientfor small domains as well.

• Can be calculated for out-of-sampleareas.

33

(34)

SOFTWARE

The R packagesaecontains functions:

• FH model: eblupFH,mseFH.

• Spatial FH model:eblupSFH,mseSFH,pbmseSFH, npbmseSFH.

• Spatio-temporal FH model:eblupSTFH,pbmseSTFH.

• Nested-error model: eblupBHF,pbmseBHF.

• EB method: ebBHF,pbmseebBHF.

• Other: direct,pssynt,ssd.

• Data sets and examples.

(35)

A review of small area estimation methods for poverty mapping