
HAL Id: hal-01010962

https://hal.archives-ouvertes.fr/hal-01010962

Preprint submitted on 21 Jun 2014


Optimal design for linear forms of the parameters in a Chebyshev regression

Michel Broniatowski, Giorgio Celant

To cite this version:
Michel Broniatowski, Giorgio Celant. Optimal design for linear forms of the parameters in a Chebyshev regression. 2014. ⟨hal-01010962⟩

Optimal design for linear forms of the parameters in a Chebyshev regression

Michel Broniatowski (1), Giorgio Celant (2,*)

(1) LSTA, Université Pierre et Marie Curie, Paris, France
(2) Dipartimento di Scienze Statistiche, Università degli Studi di Padova, Italy
(*) Corresponding author.

Abstract

This paper, of pedagogical nature, considers optimal designs for linear combinations of the parameters in a Chebyshev regression scheme, namely defined by a vector $c$. Simple algebraic arguments identify the class of such linear forms which admit unbiased linear estimators. This class is the Elfving set. Geometrical properties of this set provide a description of its frontier points as convex combinations of elements in the span of the regressors. The optimal design is shown to result from this representation. This statement is made precise in Elfving's Theorem. However this Theorem does not provide an explicit form for the optimal design, but merely its existence and uniqueness. A further result due to Karlin and Studden provides a bridge between the optimal design properties and the theory of uniform approximation of functions by functions in a Chebyshev or Haar system. This in turn provides the explicit form of the optimal design. The derivation of these results makes use of geometrical considerations pertaining to the class of moment matrices, following the approach by Pukelsheim.

Key words: Chebyshev regression; Elfving Theorem; moment matrix; optimal design; estimable function

1 Introduction

In [1], pertaining to extrapolation designs, we considered the estimation of peculiar linear forms of the parameters, namely
$$f(x) := \sum_{j=0}^{g-1}\theta_j\varphi_j(x) = \langle\theta, X(x)\rangle,$$
where
$$X(x) := (\varphi_0(x),\ldots,\varphi_{g-1}(x))' \qquad (1)$$
with $|x| > 1$.

In the present paper some results which we obtained previously are generalized to a generic linear form of the parameter $\theta$, namely of the form
$$\langle c, \theta\rangle \qquad (2)$$
with
$$c := (c_0,\ldots,c_{g-1})'.$$
Such a linear form is called a $c$-form, with $c$ a known vector in $\mathbb{R}^g$, in order to stress the fact that all conditions for existence and optimality of the estimator of (2) will pertain to $c$. Also a $c$-form which admits a linear unbiased estimator with minimal variance (under a given design $\xi$) is called estimable.

We make use of the following notation. The class of all probability measures defined on $[-1,1]$ with finite support of cardinality $g$ is denoted $M_d([-1,1])$. A measure in $M_d([-1,1])$ is a design. The design may also be defined by the set of $g$ nodes $x_0,\ldots,x_{g-1}$. Given some $n$ in $\mathbb{N}$ and $n_0,\ldots,n_{g-1}$ in $\mathbb{N}$ such that $n_0+\cdots+n_{g-1}=n$, define $\xi(x_j) := n_j/n$. This subclass of measures in $M_d([-1,1])$ is precisely the class of designs in relation to experimental planning. For convenience, and since $n$ is fixed, we use $M_d([-1,1])$ to denote this class of measures.

In the above example
$$c = (\varphi_0(x),\ldots,\varphi_{g-1}(x))'.$$

An important example is
$$c = (0,\ldots,0,1,0,\ldots,0)'$$
where the 1 is at the $i$-th position; hence the linear form is the value of the $i$-th component of the vector of parameters $\theta$.

In [2] we had
$$X(x) = (1, x, \ldots, x^{g-1})'.$$

We will consider unbiased linear estimators. The $c$-forms of the parameter which admit such an estimator determine a strict subset of all vectors in $\mathbb{R}^g$, which is the Elfving set. For any fixed design $\xi$ consider the variance of the estimate of the $c$-form. Consider all $c$-forms for which $\xi$ defines an optimal linear unbiased estimator; it is proved that this class of $c$-forms reduces to a single element. Hence for a given design there exists a unique $c$-form which is estimated in an optimal way by this design; the Theorem of Elfving provides the correspondence between the $c$-form and the design $\xi$. Based on Elfving's result, the optimality Theorem by Karlin and Studden provides, for a given $c$-form, an explicit expression of the corresponding optimal design.

In order to describe estimable $c$-forms it is convenient to introduce a formalism related to the geometry of symmetric matrices.

The relation of this program with the theory of uniform approximation of functions is apparent when we turn to the explicit characterization of the optimal design of the $c$-form through the optimality Theorem by Karlin and Studden. Indeed it will be seen that the minimal variance of the generic linear form (2) over all designs may be written as
$$\min_{\xi} c'M^{-}(\xi)c = \frac{1}{\min_{d\in\mathbb{R}^g}\max_{x\in[-1,1]}(d'X(x))^2} \qquad (3)$$
where $c'M^{-}(\xi)c$ is the variance of the Gauss-Markov estimator of (2). The matrix $M^{-}(\xi)$ is a generalized inverse of the so-called moment matrix of $\xi$, and $d$ is an element of $\mathbb{R}^g$ which satisfies $d'c = 1$. As in [1] the model writes
$$E(y(x)) = (X(x))'\theta, \qquad X(x) := (\varphi_0(x),\ldots,\varphi_{g-1}(x))',$$
and
$$\{\varphi_0,\ldots,\varphi_{g-1}\}$$
is a Chebyshev system in $[-1,1]$.

We will make use of the following form of the Borel-Chebyshev Theorem, which characterizes oscillating functions in the span of $\{\varphi_0,\ldots,\varphi_{g-1}\}$ which assume equal absolute values at their equioscillation points. This Theorem is useful in order to build the optimal design.

Theorem 1 (Karlin and Studden) Let $\{\varphi_0,\ldots,\varphi_{g-1}\}$ be a Chebyshev system in $[-1,1]$. Then there exists a unique function $u$ in $V := \operatorname{span}\{\varphi_0,\ldots,\varphi_{g-1}\}$ defined on $[-1,1]$, $u(x) := \sum_{j=0}^{g-1}a_j\varphi_j(x)$, which satisfies a) $|u(x)| \leq 1$ for all $x\in[-1,1]$; b) there exist $g$ points $\tilde x_0,\ldots,\tilde x_{g-1}$ in $[-1,1]$ such that $-1 \leq \tilde x_0 < \cdots < \tilde x_{g-1} \leq 1$ and $u(\tilde x_j) = (-1)^{g-1-j}$, $j = 0,\ldots,g-1$. Up to a multiplicative constant this function $u$ is the best uniform approximation of the null function.

Proof. See Karlin and Studden [9].

The contents of the chapter is as follows. We first study the properties of the moment matrices associated to designs. Then we introduce unbiased linear estimators. The geometry of the class of moment matrices is the frame in which the estimable forms are best handled. Then we introduce the Elfving set of vectors, which provides a link between estimable linear forms of the parameters and the corresponding optimal designs. Finally, a Theorem by Studden and Karlin provides a complete characterization of the optimal designs; at this point some use is made of the theory of best uniform approximation of functions, in relation with the minimax problem stated in (3). It also leads to an effective way to obtain this design.

This chapter is based on the papers [5], [6], [9], [12], [13], [14], [7].

2 Matrix of moments

With $X(x)$ defined as above in (1),
$$X(x) = (\varphi_0(x),\ldots,\varphi_{g-1}(x))',$$
to any $\xi := (\xi_0,\ldots,\xi_{g-1}) \in M_d([-1,1])$ we may associate the moment matrix defined by
$$M(\xi) := \sum_{i=0}^{g-1}\xi_i\,X(x_i)X'(x_i).$$
Denote by $\mathcal{M}(\Xi)$ the family of all moment matrices defined in this way, namely
$$\mathcal{M}(\Xi) := \{M(\xi) : \xi\in M_d([-1,1])\}.$$
Since $X(x)X'(x)$ is symmetric, so is $M(\xi)$.

The class of all symmetric matrices of order $g$ is denoted $S(g)$, and it holds
$$\mathcal{M}(\Xi) \subset S(g).$$
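As a minimal numerical illustration (not part of the paper), the following sketch builds the moment matrix $M(\xi)$ of a finitely supported design, assuming the monomial Chebyshev system $\varphi_j(x) = x^j$ of [2]; the symmetry and positive semidefiniteness noted above can be checked directly.

```python
# Minimal sketch (not from the paper): the moment matrix M(xi) of a design,
# assuming the monomial system phi_j(x) = x^j on [-1, 1] used in [2].
import numpy as np

def regressors(x, g):
    """X(x) = (phi_0(x), ..., phi_{g-1}(x))' for phi_j(x) = x^j."""
    return np.array([x**j for j in range(g)], dtype=float)

def moment_matrix(nodes, weights, g):
    """M(xi) = sum_i xi_i X(x_i) X(x_i)'  (a symmetric g x g matrix)."""
    M = np.zeros((g, g))
    for x, w in zip(nodes, weights):
        Xx = regressors(x, g)
        M += w * np.outer(Xx, Xx)
    return M

# Example: a uniform design on 3 nodes for g = 3 parameters.
nodes, weights = [-1.0, 0.0, 1.0], [1/3, 1/3, 1/3]
M = moment_matrix(nodes, weights, g=3)
print(np.allclose(M, M.T))    # True: M(xi) is symmetric
print(np.linalg.eigvalsh(M))  # nonnegative: M(xi) is positive semidefinite
```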

Some special subsets of $S(g)$ deserve interest in the sequel:
$$S_{\geq 0}(g) := \{A\in S(g) : x'Ax \geq 0 \text{ for any } x\in\mathbb{R}^g\},$$
$$S_{+}(g) := \{A\in S(g) : x'Ax > 0 \text{ for any } x\in\mathbb{R}^g\setminus\{0\}\}.$$
The set $S_{\geq 0}(g)$ is the class of all symmetric positive semidefinite matrices of order $g$, whereas the second class $S_{+}(g)$ is the class of all symmetric positive definite matrices of order $g$. We will also make use of $S_{-}(g)$ and $S_{\leq 0}(g)$, with the obvious interpretation.

The following result will be used.

Theorem 2 (Spectral Theorem) Let $M\in S(g)$. Its eigenvalues $\lambda_0,\ldots,\lambda_{g-1}$ are real. Let $v_i$ be an eigenvector associated to the eigenvalue $\lambda_i$, $i = 0,\ldots,g-1$. Then $\langle v_i, v_j\rangle = 0$ for all $i\neq j$. Furthermore, according as $M$ belongs to $S_{+}(g)$, $S_{\geq 0}(g)$, $S_{-}(g)$ or $S_{\leq 0}(g)$, all eigenvalues of $M$ are positive, nonnegative, negative or nonpositive. If $M\in S(g)$ then $M = PDP^{-1}$, where $P$ denotes the matrix of the eigenvectors of $M$ and $D$ is the diagonal matrix whose diagonal terms are the eigenvalues of $M$.

Proof. See [11].

Corollary 3 Let $\lambda_{\min}(M)$ be the smallest eigenvalue of $M$. It holds: 1) $M\in S_{\geq 0}(g)$ iff $\lambda_{\min}(M) \geq 0$ iff $\operatorname{tr}(MB) \geq 0$ for all $B\in S_{\geq 0}(g)$; 2) $M\in S_{+}(g)$ iff $\lambda_{\min}(M) > 0$ iff $\operatorname{tr}(MB) > 0$ for all $B\neq 0$ with $B\in S_{\geq 0}(g)$.

Proof. Clear.

These classes induce a partial ordering on $S(g)$, the so-called Loewner ordering, defined by
$$A \succeq B \iff A - B \in S_{\geq 0}(g), \qquad A \succ B \iff A - B \in S_{+}(g).$$
It is customary to write $A - B \succeq 0$ in the first case and $A - B \succ 0$ in the second one. The Loewner ordering enjoys a number of properties. Let $\alpha\in\mathbb{R}_+$, and let $A$, $B$, $A_n$ belong to $S_{\geq 0}(g)$, assuming that $\lim_{n\to\infty}A_n$ exists. Then
$$\alpha A \succeq 0, \qquad A + B \succeq 0, \qquad \lim_{n\to\infty}A_n \succeq 0;$$
see [12].

We note that the linear space $M_{n\times g}$ of all $n\times g$ matrices is isomorphic to $\mathbb{R}^{n\cdot g}$. We introduce a topology on $M_{n\times g}$, and therefore on $S(g)$. For this purpose we introduce the trace operator. The mapping
$$(A,B) \mapsto \langle A,B\rangle_{tr} := \operatorname{tr}(A'B)$$
defines an inner product, from which derive the norm and the distance
$$\|\cdot\|_{tr} : S(g)\to\mathbb{R}_{\geq 0}, \qquad A\mapsto \|A\|_{tr} := \sqrt{\operatorname{tr}(A^2)},$$
$$\operatorname{dist}(A,B) := \|A-B\|_{tr} = \sqrt{\operatorname{tr}\big((A-B)^2\big)}.$$
The closed ball with radius 1 and center 0 is defined by
$$S_{tr}(0,1) := \{A\in S(g) : \|A\|_{tr} \leq 1\}.$$

Remark 4 If $A \succeq B \succeq 0$, then $\|A\|_{tr}^2 = \operatorname{tr}(A^2) \geq \operatorname{tr}(AB) \geq \operatorname{tr}(B^2) = \|B\|_{tr}^2$.

Remark 5 If $B\in S_{tr}(0,1)$ then the absolute values of its eigenvalues are less than or equal to 1. Indeed, denoting respectively $\lambda_j$ and $v_j$ the eigenvalues and corresponding eigenvectors with norm 1 of $B$, we have $B = \sum_j\lambda_j v_jv_j'$. Hence $\operatorname{tr}(B^2) = \sum_j\lambda_j^2$. Since $B\in S_{tr}(0,1)$ iff $\operatorname{tr}(B^2) \leq 1$, it holds $\sum_j\lambda_j^2 \leq 1$, which yields $|\lambda_j| \leq 1$.

Furthermore it holds $x'Bx \leq x'x$ for all $B\in S_{tr}(0,1)$. Indeed $x'Bx\in\mathbb{R}$ and therefore
$$x'Bx \leq |x'Bx| = \Big|x'\Big(\sum_j\lambda_j v_jv_j'\Big)x\Big| = \Big|\sum_j\lambda_j (x'v_j)^2\Big| \leq \sum_j|\lambda_j|(x'v_j)^2 \leq \sum_j(x'v_j)^2 = x'\Big(\sum_j v_jv_j'\Big)x = x'x.$$
Indeed the eigenvectors are orthonormal, i.e.
$$\langle v_j, v_i\rangle = \begin{cases}1 & \text{for } i = j\\ 0 & \text{for } i\neq j.\end{cases}$$

The geometric structure of $S_{\geq 0}(g)$ is described through the following result.

Theorem 6 (Pukelsheim, p. 29) $S_{\geq 0}(g)$ is a closed convex cone. It is pointed. Furthermore $S_{+}(g)$ is the relative interior of $S_{\geq 0}(g)$ in $S(g)$. We denote it $\operatorname{int}(S_{\geq 0}(g))$.

Proof. $S_{\geq 0}(g)$ is a cone. Indeed if $A\in S_{\geq 0}(g)$ then $x'Ax \geq 0$. Hence for any $\alpha\geq 0$, $\alpha(x'Ax) = x'(\alpha A)x \geq 0$, so $\alpha A\in S_{\geq 0}(g)$.

We prove that $S_{\geq 0}(g)$ is convex. Let $A, B\in S_{\geq 0}(g)$. It holds $x'Ax\geq 0$ and $x'Bx\geq 0$. Hence $x'Ax + x'Bx = x'(A+B)x \geq 0$, so that $A + B\in S_{\geq 0}(g)$. Let $\alpha\in(0,1)$. Since $S_{\geq 0}(g)$ is a cone, when $A, B\in S_{\geq 0}(g)$ we have $\alpha A\in S_{\geq 0}(g)$ and $(1-\alpha)B\in S_{\geq 0}(g)$; it follows that so does their sum, $\alpha A + (1-\alpha)B\in S_{\geq 0}(g)$. Hence $S_{\geq 0}(g)$ is convex.

We see that $S_{\geq 0}(g)$ is pointed, namely $0\in S_{\geq 0}(g)$.

We prove that $S_{\geq 0}(g)$ is closed by proving that its complement $S(g)\setminus S_{\geq 0}(g)$ in $S(g)$ is open. Let $A$ be a generic point of $S(g)\setminus S_{\geq 0}(g)$. We prove that $A$ is contained in a closed set entirely included in $S(g)\setminus S_{\geq 0}(g)$. By definition of $A$ there exists $x\neq 0$ with $x'Ax < 0$. In order to determine a closed set containing $A$ and included in $S(g)\setminus S_{\geq 0}(g)$, define
$$\delta := -\frac{x'Ax}{2\,x'x}.$$
Since $x'Ax < 0$ and $x'x > 0$, it holds $\delta > 0$. Let $B\in S_{tr}(0,1)$. For any such $B$ it holds $x'(A+\delta B)x = x'Ax + \delta\,x'Bx$. Since $B\in S_{tr}(0,1)$, by Remark 5, $x'Bx \leq x'x$. This inequality, together with the definition of $\delta$, entails
$$x'(A+\delta B)x = x'Ax + \delta\,x'Bx \leq x'Ax + \delta\,x'x = x'Ax - \frac{x'Ax}{2x'x}\,x'x = \frac{x'Ax}{2}.$$
On the other hand, since $A\notin S_{\geq 0}(g)$ it holds $x'Ax < 0$. It follows that
$$x'(A+\delta B)x \leq \frac{x'Ax}{2} < 0,$$
and therefore $x'(A+\delta B)x < 0$. Hence $A+\delta B\in S(g)\setminus S_{\geq 0}(g)$. This holds for any $B\in S_{tr}(0,1)$. We have exhibited a closed set $A + \delta S_{tr}(0,1)$ which contains $A$ and which is included in $S(g)\setminus S_{\geq 0}(g)$. Hence $S_{\geq 0}(g)$ is closed.

We prove that $S_{+}(g)$ is the relative interior of $S_{\geq 0}(g)$ in $S(g)$.

A) We start with
$$\operatorname{int}(S_{\geq 0}(g)) \subseteq S_{+}(g).$$
Let $A\in\operatorname{int}(S_{\geq 0}(g))$. Since $A$ is an interior point of $S_{\geq 0}(g)$, there exists some $\delta > 0$ with
$$A + \delta S_{tr}(0,1) \subseteq S_{\geq 0}(g). \qquad (4)$$
For any $x\neq 0$, the matrix
$$B := -\frac{xx'}{x'x}$$
belongs to $S_{tr}(0,1)$. Indeed evaluate $\|B\|_{tr}$: it holds
$$B^2 = \frac{xx'}{x'x}\frac{xx'}{x'x} = \frac{xx'}{x'x}, \qquad \|B\|_{tr} = \sqrt{\operatorname{tr}(B^2)} = \sqrt{\operatorname{tr}\Big(\frac{xx'}{x'x}\Big)} = \sqrt{\frac{\operatorname{tr}(x'x)}{x'x}} = \sqrt{\frac{x'x}{x'x}} = 1.$$
Furthermore we have $x'Ax - \delta\,x'x = x'(A+\delta B)x$. Indeed
$$x'Bx = -\frac{x'xx'x}{x'x} = -x'x,$$
so substituting $x'Bx$ by $-x'x$ we obtain $x'(A+\delta B)x = x'Ax + \delta\,x'Bx = x'Ax - \delta\,x'x$. If we can prove that
$$x'Ax - \delta\,x'x \geq 0, \qquad (5)$$
it then follows that $x'Ax \geq \delta\,x'x$. Now clearly $\delta\,x'x > 0$ for $x\neq 0$, since $\delta > 0$. Thus it holds $x'Ax > 0$ and therefore $A\in S_{+}(g)$. In order to conclude it is thus enough to prove (5). Since $x'Ax - \delta\,x'x = x'(A+\delta B)x$ and $B\in S_{tr}(0,1)$, then by (4) $A+\delta B\in S_{\geq 0}(g)$, so (5) holds true.

B) We now prove that
$$S_{+}(g) \subseteq \operatorname{int}(S_{\geq 0}(g)).$$
Let $A\in S_{+}(g)$ and denote $\lambda_{\min}$ the minimal eigenvalue of $A$, so that $\lambda_{\min} > 0$ and $x'Ax \geq \lambda_{\min}\,x'x$ for all $x$. For any $B\in S_{tr}(0,1)$ it holds $x'Bx \geq -|x'Bx|$, and by Remark 5, $|x'Bx| \leq x'x$ for all $x\in\mathbb{R}^g$; hence $-x'x \leq x'Bx \leq x'x$, i.e. $x'Bx \geq -x'x$. We now consider $x'(A+\lambda_{\min}B)x$. It holds $x'(A+\lambda_{\min}B)x = x'Ax + \lambda_{\min}\,x'Bx$. Since $x'Ax \geq \lambda_{\min}\,x'x$ we get
$$x'Ax + \lambda_{\min}x'Bx \geq \lambda_{\min}x'x + \lambda_{\min}x'Bx \geq \lambda_{\min}x'x - \lambda_{\min}x'x = 0.$$
Therefore $x'(A+\lambda_{\min}B)x \geq 0$. Hence $A+\lambda_{\min}B\in S_{\geq 0}(g)$ for any $B\in S_{tr}(0,1)$. We have therefore found a closed set $A + \lambda_{\min}S_{tr}(0,1)$ which contains $A$ and which is included in $S_{\geq 0}(g)$. Therefore $A\in S_{+}(g)$ implies $A\in\operatorname{int}(S_{\geq 0}(g))$.

As a subset of $S(g)$, the set $\mathcal{M}(\Xi)$ enjoys interesting geometric properties, as seen now.

Associated with the Chebyshev system of regressors $X(x)$ defined in (1), as $x$ belongs to $[-1,1]$, we define
$$H := \{X(x)\in\mathbb{R}^g : x\in[-1,1]\}. \qquad (6)$$
We also define the regression range as the linear space generated by $H$.

Theorem 7 $\mathcal{M}(\Xi) \subset S(g)$ is a compact and convex subset of $S_{\geq 0}(g)$.

Proof. Convexity is obvious. Indeed any moment matrix is obtained as a convex combination of the $X(x)X'(x)$'s:
$$M(\xi) = \sum_{x_i\in\operatorname{supp}(\xi)}\xi(x_i)\,X(x_i)X'(x_i).$$
The coefficients of this combination are the values $\xi(x_i)$. Varying those coefficients, and therefore varying the measure $\xi$ in $M_d([-1,1])$, we generate $\mathcal{M}(\Xi)$. Hence $\mathcal{M}(\Xi)$ is the smallest convex set which contains $\{X(x)X'(x) : x\in[-1,1]\}$. In order to prove the compactness of $\mathcal{M}(\Xi)$ it is enough to prove that $\{X(x)X'(x) : x\in[-1,1]\}$ is compact; indeed in a finite dimensional linear space the convex hull of a compact set is compact. We thus prove that the set $\{X(x)X'(x) : x\in[-1,1]\}$ is compact. Indeed this holds since $[-1,1]$ is a compact set and the mapping $X(\cdot)X'(\cdot)$ is continuous, since so is the mapping $X(\cdot) : [-1,1]\to\mathbb{R}^g$, $x\mapsto X(x) := (\varphi_0(x),\ldots,\varphi_{g-1}(x))'$.

We will now consider optimal designs following Kiefer, Wolfowitz and Studden, mostly using the approach of Karlin and Studden [8], which makes use of a theorem due to Elfving [4], [12] and [14]. The approach by Kiefer and Wolfowitz [10] makes use of some arguments from game theory.

3 Estimable functions

3.1 Notation

As already used, it may at times be useful to denote the nodes taking into account their multiplicity, namely for each of them the number of replications of the experiment to be performed. On the node $x_j$ we denote by $n_j$ the number of replications. The discrete measure characterizing a design is therefore described as
$$\underbrace{x_0,\ldots,x_0}_{n_0\ \text{times}},\ \ldots,\ \underbrace{x_j,\ldots,x_j}_{n_j\ \text{times}},\ \ldots,\ \underbrace{x_{g-1},\ldots,x_{g-1}}_{n_{g-1}\ \text{times}}$$
with
$$n_0+\cdots+n_{g-1} = n.$$
This design can also be written as
$$t_1,\ldots,t_n$$
where $t_1,\ldots,t_{n_0}$ describe the $n_0$ equal values $x_0$, and so on.

3.2 Model and estimators

The system which describes the $n$ observations is given by
$$Y = T\theta + \varepsilon \qquad (7)$$
where
$$Y := \begin{pmatrix}y_1\\ \vdots\\ y_n\end{pmatrix}, \quad T := \begin{pmatrix}\varphi_0(t_1) & \cdots & \varphi_{g-1}(t_1)\\ \vdots & & \vdots\\ \varphi_0(t_i) & \cdots & \varphi_{g-1}(t_i)\\ \vdots & & \vdots\\ \varphi_0(t_n) & \cdots & \varphi_{g-1}(t_n)\end{pmatrix}, \quad \theta := \begin{pmatrix}\theta_0\\ \vdots\\ \theta_{g-1}\end{pmatrix}, \quad \varepsilon := \begin{pmatrix}\varepsilon_1\\ \vdots\\ \varepsilon_n\end{pmatrix},$$
$$E(Y) = T\theta, \qquad \mathrm{var}(\varepsilon) = \sigma^2 I_n,$$
and $I_n$ is the identity matrix of order $n$. The vector of parameters $\theta$ belongs to $\mathbb{R}^g$ and is unknown.

We assume that $\sigma^2 > 0$.

We introduce a linear form $c$ on $\mathbb{R}^g$, identified with the vector $(c_0,\ldots,c_{g-1})'$ through
$$\theta \mapsto \sum_{j=0}^{g-1}c_j\theta_j := \langle c,\theta\rangle. \qquad (8)$$
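As a small consistency check (not part of the paper), the following sketch builds the matrix $T$ of model (7) from replicated nodes, assuming the monomial system $\varphi_j(x)=x^j$ and hypothetical replication counts, and verifies that $T'T/n$ coincides with the moment matrix of Section 2.

```python
# Minimal sketch (not from the paper): T of model (7) built from replicated nodes
# t_1, ..., t_n, and the identity T'T / n = M(xi), assuming phi_j(x) = x^j.
import numpy as np

g = 3
nodes = np.array([-1.0, 0.0, 1.0])
reps = np.array([2, 3, 5])                      # hypothetical n_j; n = 10
t = np.repeat(nodes, reps)                      # t_1, ..., t_n
n = t.size

T = np.vander(t, g, increasing=True)            # row i is (phi_0(t_i), ..., phi_{g-1}(t_i))
M_from_T = T.T @ T / n

xi = reps / n                                   # xi(x_j) = n_j / n
M = sum(w * np.outer([x**j for j in range(g)], [x**j for j in range(g)])
        for w, x in zip(xi, nodes))
print(np.allclose(M_from_T, M))                 # True
```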

We assume that $c\neq 0$.

In order to emphasize the vector $c$ of the coefficients of the parametric function $\sum_{j=0}^{g-1}c_j\theta_j$, we call a $c$-form the linear form $\theta\mapsto\sum_{j=0}^{g-1}c_j\theta_j := \langle c,\theta\rangle$.

Definition 8 Let $c\in\mathbb{R}^g$ and identify $c$ with the linear form (8). We say that the linear form $c$ is estimable if and only if there exists some unbiased linear estimator of $\langle c,\theta\rangle$ in the model (7).

Example 9 Consider $c := (1,0,\ldots,0)'$; therefore $\langle c,\theta\rangle = \theta_0$; the linear form $c$ is estimable if we can define an unbiased linear estimator of $\theta_0$.

From this definition, $\theta\mapsto\langle c,\theta\rangle$ is estimable if and only if there exists $u' := (u_1,\ldots,u_n)\in\mathbb{R}^n$ such that $\widehat{\langle c,\theta\rangle} = \langle u, Y\rangle$ and
$$E(u'Y) = \langle c,\theta\rangle \text{ for all } \theta\in\mathbb{R}^g,$$
i.e. $u'E(Y) = \langle c,\theta\rangle$ for all $\theta\in\mathbb{R}^g$, which amounts to $u'T\theta = c'\theta$ for all $\theta\in\mathbb{R}^g$, which entails $u'T = c'$.

Finally we see that a linear form is estimable if and only if $c\in\operatorname{Im}T'$.

In the next paragraph we make explicit the Gauss-Markov estimator of a linear form.

3.3 Matrix of moments and Gauss-Markov estimators of a linear form

We discuss the link which connects the moment matrix and the estimable linear forms. The variance of the Gauss-Markov estimator of an estimable linear form will be derived.

Consider a generic design $\xi$ with
$$\xi(x_j) = \frac{n_j}{n},\quad j = 0,\ldots,g-1, \qquad \sum_{j=0}^{g-1}n_j = n.$$
The model to be considered writes as
$$y_i(x_j) = \sum_{r=0}^{g-1}\theta_r\varphi_r(x_j) + \varepsilon_i,\qquad E(\varepsilon_i) = 0,\qquad \mathrm{cov}(\varepsilon_i,\varepsilon_k) = \begin{cases}\sigma^2 & i = k\\ 0 & i\neq k,\end{cases}\qquad i = 1,\ldots,n_j,\ j = 0,\ldots,g-1.$$
Let $X(x) := (\varphi_0(x),\ldots,\varphi_{g-1}(x))'$ for all $x$ in $[-1,1]$, the observable domain. The matrix
$$T'T = \sum_{j=0}^{g-1}n_j\,X(x_j)X(x_j)'$$
has dimension $g\times g$. The matrix
$$M(\xi) := \frac{1}{n}T'T = \sum_{x_j\in\operatorname{supp}(\xi)}\xi(x_j)\,X(x_j)X(x_j)' = \int_{[-1,1]}X(x)X'(x)\,d\xi(x),$$
that is
$$M(\xi) = \begin{pmatrix}\int\varphi_0^2\,d\xi & \cdots & \int\varphi_0\varphi_{g-1}\,d\xi\\ \vdots & & \vdots\\ \int\varphi_i\varphi_0\,d\xi & \cdots & \int\varphi_i\varphi_{g-1}\,d\xi\\ \vdots & & \vdots\\ \int\varphi_{g-1}\varphi_0\,d\xi & \cdots & \int\varphi_{g-1}^2\,d\xi\end{pmatrix},$$
is named the moment matrix.

We may rewrite the estimability condition of the $c$-form as
$$c\in\operatorname{Im}(M(\xi)).$$
Indeed $c\in\operatorname{Im}T'$ since the form is estimable, and $M(\xi)$ and $T'$ have the same image. It is common use to say that the $c$-form is estimable with respect to the measure $\xi$.

In the sequel we will assume that the $c$-form is estimable w.r.t. $\xi$. Solving the system of linear normal equations pertaining to (7), we obtain the least squares estimator, say $\hat\theta$, of $\theta$. Since the matrix $M(\xi)$ is invertible (the family $\{\varphi_0,\ldots,\varphi_{g-1}\}$ being linearly independent), it holds
$$\hat\theta := \frac{1}{n}(M(\xi))^{-1}T'Y.$$
A way to estimate the linear form
$$\langle c,\theta\rangle = c_0\theta_0+\cdots+c_{g-1}\theta_{g-1}$$
consists in plugging $\hat\theta_j$ in place of $\theta_j$, $j = 0,\ldots,g-1$, which yields the least squares estimator
$$\widehat{\langle c,\theta\rangle} = c_0\hat\theta_0+\cdots+c_{g-1}\hat\theta_{g-1} = a'Y, \qquad a := \frac{1}{n}T(M(\xi))^{-1}c.$$
We can easily see that this estimator is optimal within all linear unbiased ones, and that it is the only estimator enjoying this property. Indeed let $u'Y$ denote some other linear unbiased estimator of the $c$-form. Then
$$u'Y = a'Y + d'Y$$
for some $d$ in $\mathbb{R}^n$. Using unbiasedness, $E(u'Y) = c'\theta$. Therefore
$$E(u'Y) = E(a'Y + d'Y) = c'\theta + d'E(Y) = c'\theta + d'T\theta = c'\theta,$$
which yields $d'T\theta = 0$ for all $\theta$, i.e. $d'T = 0$.

Let us now evaluate the variance of this estimator. It holds

$$\mathrm{var}(u'Y) = \mathrm{var}(a'Y + d'Y) = \mathrm{var}(a'Y) + \mathrm{var}(d'Y) + 2\,\mathrm{cov}(a'Y, d'Y).$$
Since
$$\mathrm{cov}(a'Y, d'Y) = E\big[(a'Y - c'\theta)\,d'Y\big] = \frac{\sigma^2}{n}\,c'(M(\xi))^{-1}T'd = 0,$$
it follows that
$$\mathrm{var}(u'Y) = \mathrm{var}(a'Y) + \mathrm{var}(d'Y) = \mathrm{var}(a'Y) + \sigma^2 d'd.$$
This variance reaches its minimal value if and only if $d = 0$. It follows that the Gauss-Markov estimator of the $c$-form is
$$\widehat{\langle c,\theta\rangle} = c_0\hat\theta_0+\cdots+c_{g-1}\hat\theta_{g-1}.$$
Furthermore
$$\mathrm{var}\,\widehat{\langle c,\theta\rangle} = \mathrm{var}(c'\hat\theta) = c'\,\mathrm{var}(\hat\theta)\,c = c'\,\mathrm{var}\Big(\frac{1}{n}(M(\xi))^{-1}T'Y\Big)c = \frac{\sigma^2}{n}\,c'(M(\xi))^{-1}c.$$
Once the optimal estimator of a $c$-form is defined, we intend to characterize the optimal measure $\xi$ pertaining to this estimator.

The variance of the optimal estimator $\widehat{\langle c,\theta\rangle}$ depends on the matrix $M(\xi)$ induced by the design. Now $M(\xi)$ is a symmetric positive definite matrix. The optimal design, defined through the minimization of this variance, will result from a study of a partial ordering of the symmetric matrices. Another form of the variance of the estimator of a linear form can also be obtained when the moment matrix is singular. Indeed the following important result holds. Denote $\mathrm{var}\,y$ the covariance matrix of the vector $y$, which is a symmetric positive semidefinite matrix of order $n$.

Proposition 10 (Karlin and Studden) Let $\xi\in M_d([-1,1])$. Assume that $\operatorname{supp}(\xi) := \{x_0,\ldots,x_{g-1}\}$ and that
$$\xi(x_i) = \frac{n_i}{n},$$
with $n_i > 0$ for all $i$. Let $F(\xi)$ be the set of unbiased linear estimators $\langle u, y\rangle$ of the linear form $\langle c,\theta\rangle$, where the measure $\xi$ is fixed. Assume that $F(\xi)\neq\emptyset$ and that $\langle u^*, y\rangle$ is the Gauss-Markov estimator of $\langle c,\theta\rangle$. It then holds
$$\mathrm{var}\langle u^*, y\rangle := \min_{F(\xi)}\mathrm{var}\langle u, y\rangle = \frac{\sigma^2}{n}\sum_{i=1}^{s}\frac{\langle v_i, c\rangle^2}{\lambda_i},$$
where $v_i$ and $\lambda_i$ are respectively the eigenvectors with norm 1 and the (positive) eigenvalues of the matrix $M(\xi)$, and $s := \dim\operatorname{Im}M(\xi)$.

Proof. We assume that $\sigma^2 = 1$.

We first prove that for any element $u'y$ of $F(\xi)$ it holds
$$\mathrm{var}(u'y) \geq \sup_{0\neq d\in(\ker M(\xi))^{\perp}}\frac{1}{n}\frac{\langle c, d\rangle^2}{\langle d, M(\xi)d\rangle}.$$
Consider the inner product $\langle c, d\rangle$ with $0\neq d\in(\ker M(\xi))^{\perp}$. Since the linear form $\langle c,\theta\rangle$ is estimable it holds $c\in\operatorname{Im}(M(\xi))$ and therefore $c\in\operatorname{Im}T'$. Indeed
$$M(\xi) = \frac{1}{n}T'T \implies \operatorname{Im}(M(\xi)) = \operatorname{Im}T'.$$
Hence there exists some vector $u$ such that $c = T'u$. Write henceforth
$$\langle c, d\rangle = \langle T'u, d\rangle = \langle u, Td\rangle.$$
Applying the Cauchy-Schwarz inequality,
$$\langle u, Td\rangle \leq \sqrt{\langle u, u\rangle}\sqrt{\langle Td, Td\rangle},$$
therefore $\langle c, d\rangle \leq \sqrt{\langle u, u\rangle}\sqrt{\langle Td, Td\rangle}$. Now $\mathrm{var}(u'y) = u'\,\mathrm{var}(y)\,u = \sigma^2\langle u, u\rangle = \langle u, u\rangle$ (with $\sigma^2 = 1$), and $\langle Td, Td\rangle = \langle d, T'Td\rangle = n\langle d, M(\xi)d\rangle$. Hence
$$\langle c, d\rangle^2 \leq \langle u, u\rangle\,\langle Td, Td\rangle = \mathrm{var}(u'y)\; n\,\langle d, M(\xi)d\rangle,$$
so that
$$\mathrm{var}(u'y) \geq \frac{1}{n}\frac{\langle c, d\rangle^2}{\langle d, M(\xi)d\rangle}.$$
Going to the supremum on both sides of this inequality we obtain
$$\mathrm{var}(u'y) \geq \sup_{0\neq d\in(\ker M(\xi))^{\perp}}\frac{1}{n}\frac{\langle c, d\rangle^2}{\langle d, M(\xi)d\rangle}.$$
We now prove that equality holds for some element of $F(\xi)$. Namely we prove that there exists $u^*\in\mathbb{R}^n$ such that, for fixed $\xi$, $\mathrm{var}\langle u^*, y\rangle := \min_{F(\xi)}\mathrm{var}\langle u, y\rangle$. Clearly, by definition, $\langle u^*, y\rangle$ will then be the Gauss-Markov estimator of $\langle c,\theta\rangle$. Note that a basis of the linear space generated by the column vectors of $M(\xi)$ is given by $\{v_i : i = 1,\ldots,s\}$ where $s := \dim\operatorname{Im}M(\xi)$. We assume the vectors $v_i$ to have norm 1. When $M(\xi)$ is of full rank then $s = g$. The condition for estimability $c\in\operatorname{Im}(M(\xi))$ may then be written as follows:
$$c = \sum_{i=1}^{s}\langle v_i, c\rangle\,v_i.$$
Therefore
$$\langle c, d\rangle^2 = \Big(\sum_{i=1}^{s}\langle v_i, c\rangle\langle v_i, d\rangle\Big)^2 = \Big(\sum_{i=1}^{s}\frac{\langle v_i, c\rangle}{\sqrt{\lambda_i}}\,\sqrt{\lambda_i}\,\langle v_i, d\rangle\Big)^2.$$
Apply the Cauchy-Schwarz inequality to the vectors with components $\frac{\langle v_i, c\rangle}{\sqrt{\lambda_i}}$ and $\sqrt{\lambda_i}\langle v_i, d\rangle$, $i = 1,\ldots,s$. We get
$$\langle c, d\rangle^2 \leq \Big(\sum_{i=1}^{s}\frac{\langle v_i, c\rangle^2}{\lambda_i}\Big)\Big(\sum_{i=1}^{s}\lambda_i\langle v_i, d\rangle^2\Big).$$
From the spectral Theorem 2 we get that
$$M(\xi) = \sum_{i=1}^{s}\lambda_i\,v_iv_i',$$
hence
$$\langle c, d\rangle^2 \leq \Big(\sum_{i=1}^{s}\frac{\langle v_i, c\rangle^2}{\lambda_i}\Big)\Big(\sum_{i=1}^{s}\lambda_i\langle v_i, d\rangle^2\Big) = \Big(\sum_{i=1}^{s}\frac{\langle v_i, c\rangle^2}{\lambda_i}\Big)\,\langle d, M(\xi)d\rangle. \qquad (9)$$
In this last display, equality holds between the first and the last members in two cases (see Karlin and Studden, p. 788, for details): either when there exists some constant $h$ such that $\langle v_i, c\rangle^2 = h\,\lambda_i\langle v_i, d\rangle$, or when $d$ is proportional to
$$\bar d := \sum_{i=1}^{s}\frac{\langle v_i, c\rangle}{\lambda_i}\,v_i.$$
We only consider this latter case; see [7]. Recall that $M(\xi) = \frac{1}{n}T'T$. Taking $u = u^* := \frac{1}{n}T\bar d$ in
$$\mathrm{var}(u'y) \geq \sup_{0\neq d\in(\ker M(\xi))^{\perp}}\frac{1}{n}\frac{\langle c, d\rangle^2}{\langle d, M(\xi)d\rangle} \qquad (10)$$
we get equality in (9). In order to conclude the proof it is necessary to prove that the vector $u^* := \frac{1}{n}T\bar d$ belongs to the set $F(\xi)$. Now
$$T'u^* = \frac{1}{n}T'T\bar d = M(\xi)\bar d = \sum_{i=1}^{s}\lambda_i\frac{\langle v_i, c\rangle}{\lambda_i}v_i = \sum_{i=1}^{s}\langle v_i, c\rangle v_i = c,$$
so that $\langle u^*, y\rangle$ is unbiased for $\langle c,\theta\rangle$, which completes the proof.

Remark 11 The above Proposition 10 asserts that, whatever $\xi\in M_d([-1,1])$, the Gauss-Markov estimator $\langle u^*, y\rangle$ of the linear form $\langle c,\theta\rangle$ has variance
$$\mathrm{var}\langle u^*, y\rangle := \min_{F(\xi)}\mathrm{var}\langle u, y\rangle = \frac{1}{n}\sup_{0\neq d\in(\ker M(\xi))^{\perp}}\frac{\langle c, d\rangle^2}{\langle d, M(\xi)d\rangle}.$$
Clearly if $(M(\xi))^{-1}$ exists then $\mathrm{var}\langle u^*, y\rangle = \frac{1}{n}\langle c, (M(\xi))^{-1}c\rangle$. Indeed the supremum is attained at $d := (M(\xi))^{-1}c$, for which
$$\frac{1}{n}\frac{\langle c, d\rangle^2}{\langle d, M(\xi)d\rangle} = \frac{1}{n}\frac{\langle c, (M(\xi))^{-1}c\rangle^2}{\langle (M(\xi))^{-1}c,\, M(\xi)(M(\xi))^{-1}c\rangle} = \frac{1}{n}\langle c, (M(\xi))^{-1}c\rangle.$$
The above formula also holds when the moment matrix is not invertible. Denote $(M(\xi))^{-}$ a generalized inverse of $M(\xi)$. Then
$$\mathrm{var}\langle u^*, y\rangle = \min_{F(\xi)}\mathrm{var}\langle u, y\rangle = \frac{1}{n}\sup_{0\neq d\in(\ker M(\xi))^{\perp}}\frac{\langle c, d\rangle^2}{\langle d, M(\xi)d\rangle} = \frac{1}{n}\langle c, (M(\xi))^{-}c\rangle.$$

Remark 12 In the above Proposition 10 the measure $\xi$ is fixed in $M_d([-1,1])$. Now, since $M(\xi) := \frac{1}{n}T'T = \int_{[-1,1]}X(x)X'(x)\,d\xi(x)$, let $\xi$ vary in $M_d([-1,1])$ and define the optimal design $\xi^*$, which minimizes the variance, as follows:
$$\mathrm{var}\langle u^*, y\rangle_{\xi^*} := \min_{\xi\in M_d([-1,1])}\mathrm{var}\langle u^*, y\rangle = \min_{\xi}\min_{F(\xi)}\mathrm{var}\langle u, y\rangle = \min_{\xi}\frac{1}{n}\langle c, (M(\xi))^{-}c\rangle = \min_{\xi}\sup_{0\neq d\in(\ker M(\xi))^{\perp}}\frac{1}{n}\frac{(d'c)^2}{\int_{[-1,1]}(d'X(x))^2\,d\xi(x)}.$$
Since $d\in(\ker M(\xi))^{\perp}$ can be chosen up to an arbitrary multiplicative constant (see formula (10)), we may assume that $d'c = 1$. Minimizing over $\xi$, by choosing a measure whose support consists of the points where the mapping $x\mapsto d'X(x)$ assumes its maximal absolute value, it holds
$$\mathrm{var}\langle u^*, y\rangle_{\xi^*} = \min_{\xi}\sup_{d}\frac{1}{n\int_{[-1,1]}(d'X(x))^2\,d\xi(x)} = \sup_{d}\frac{1}{n\,(\max_x d'X(x))^2\int_{[-1,1]}d\xi(x)} = \sup_{d}\frac{1}{n\,(\max_x d'X(x))^2} = \frac{1}{n\,\min_{0\neq d:\ d'c=1}(\max_x d'X(x))^2},$$
which turns the problem of finding the optimal measure into a problem of optimal uniform approximation of some function, as will be seen in Section 5.

3.4 Geometric interpretation of estimability. Elfving's set

We already saw that estimability is related to a precise geometric relation. This paragraph introduces the geometric context; we follow the presentation of [12].

The condition for estimability of a linear form $\langle c,\theta\rangle$ is given by $c\in\operatorname{Im}(M(\xi))$, where $M(\xi) := \frac{1}{n}T'T$.

This property may be extended, independently of the measure $\xi$, to a generic element of $S_{\geq 0}(g)$. We thus consider a generic matrix $A\in S_{\geq 0}(g)$ such that $c\in\operatorname{Im}A$, and indeed all matrices $A\in S_{\geq 0}(g)$ for which $c\in\operatorname{Im}A$.

Definition 13 The set $\mathcal{A}(c) := \{A\in S_{\geq 0}(g) : c\in\operatorname{Im}A\}$ is called the feasibility cone.

That $M(\xi)$ belongs to $\mathcal{M}(\Xi)$ simply means that $M(\xi)$ is a moment matrix.

Proposition 14 The feasibility cone $\mathcal{A}(c)$ for $\langle c,\theta\rangle$ is a convex subcone of $S_{\geq 0}(g)$ which includes $S_{+}(g)$.

Proof. If $\alpha > 0$ and $A\in\mathcal{A}(c)$ then, since $\operatorname{Im}A = \operatorname{Im}(\alpha A)$, it holds $\alpha A\in\mathcal{A}(c)$ for any positive $\alpha$. Hence $\mathcal{A}(c)$ is a cone. By definition $\mathcal{A}(c)\subseteq S_{\geq 0}(g)$, and therefore $\mathcal{A}(c)$ is a subcone of $S_{\geq 0}(g)$.

We prove that $\mathcal{A}(c)$ is convex. Let $\alpha\in(0,1)$ and $A, B\in\mathcal{A}(c)$. Since $\operatorname{Im}(\alpha A + (1-\alpha)B) = \operatorname{Im}A + \operatorname{Im}B$ for positive semidefinite matrices, it follows that for any $A$ and $B$ in $\mathcal{A}(c)$ it holds $\alpha A + (1-\alpha)B\in\mathcal{A}(c)$. Finally, if $A\in S_{+}(g)$ then $\operatorname{Im}A = \mathbb{R}^g\ni c$, so $S_{+}(g)\subset\mathcal{A}(c)$.

Since estimability pertains to the expectation of an estimator and not to its variance, we now characterize it using a generic unbiased linear estimator of a linear form.

Given
$$E(y(x_j)) = E\Big(\sum_{j=0}^{g-1}\theta_j\varphi_j(x_j) + \varepsilon_j\Big) = X'(x_j)\,\theta,$$
an easy way to estimate a linear form $c'\theta$ consists in a weighted mean. Assume that we observed $y(x)$ at the points $x_0,\ldots,x_{g-1}$ with respective frequencies $n_j$, $j = 0,\ldots,g-1$. Denote
$$\widehat{c'\theta} := \sum_{j=0}^{g-1}u_j\,\bar y(x_j), \qquad \text{where } \bar y(x_j) := \frac{\sum_{i=1}^{n_j}y_i(x_j)}{n_j},\ j = 0,\ldots,g-1,$$
and where the $u_j$'s are coefficients to be determined in such a way that $\widehat{c'\theta}$ is unbiased. Then
$$c'\theta = E\big(\widehat{c'\theta}\big) = \sum_{j=0}^{g-1}u_jE(\bar y(x_j)) = \sum_{j=0}^{g-1}u_j\frac{\sum_{i=1}^{n_j}E\big(X'(x_j)\theta + \varepsilon_i\big)}{n_j} = \sum_{j=0}^{g-1}u_j\frac{n_j\,X'(x_j)\theta}{n_j} = \sum_{j=0}^{g-1}u_jX'(x_j)\,\theta.$$
Therefore $\widehat{c'\theta}$ is unbiased iff
$$c' = \sum_{j=0}^{g-1}u_jX'(x_j).$$

Observe that there exists at least one index $j$ such that $u_j\neq 0$; indeed otherwise no data would enter the definition of the estimator. It follows that $\sum_{j=0}^{g-1}|u_j|\neq 0$. Henceforth, dividing
$$c' = \sum_{j=0}^{g-1}u_jX'(x_j)$$
by $\sum_{j=0}^{g-1}|u_j|$ we get
$$\frac{c'}{\sum_{j=0}^{g-1}|u_j|} = \frac{\sum_{j=0}^{g-1}u_jX'(x_j)}{\sum_{j=0}^{g-1}|u_j|},$$
which, setting $\varepsilon_j := \operatorname{sign}(u_j) = \pm1$, becomes
$$\frac{c}{\sum_{j=0}^{g-1}|u_j|} = \sum_{j=0}^{g-1}\frac{\varepsilon_j|u_j|}{\sum_{k=0}^{g-1}|u_k|}\,X(x_j).$$
For $j = 0,\ldots,g-1$, the numbers
$$\alpha_j := \frac{|u_j|}{\sum_{k=0}^{g-1}|u_k|}$$
define a discrete probability measure with support $\{x_0,\ldots,x_{g-1}\}$ included in $[-1,1]$.

The condition for estimability may thus be stated as follows.

Proposition 15 The linear form $c'\theta$ is estimable if and only if $\frac{c}{\sum_{j=0}^{g-1}|u_j|}$ is a convex linear combination of the vectors $\varepsilon_jX(x_j)$, $j = 0,\ldots,g-1$.

Since $\varepsilon_j$ is a sign, we conclude that $c'\theta$ is estimable iff $\frac{c}{\sum_{j=0}^{g-1}|u_j|}$ belongs to the convex hull generated by the set $\{\pm X(x_j) : j = 0,\ldots,g-1\}$. Call
$$\mathcal{R}_+ := \{X(x_j) : j = 0,\ldots,g-1\}, \qquad \mathcal{R}_- := \{-X(x_j) : j = 0,\ldots,g-1\}, \qquad \mathcal{R} := \text{convex-hull}(\mathcal{R}_+\cup\mathcal{R}_-).$$
Since $\{X(x_j) : j = 0,\ldots,g-1\}$ is a finite set of vectors, the set $\mathcal{R}$ is a polytope. It holds:
$$c'\theta \text{ is estimable iff } \frac{c}{\sum_{j=0}^{g-1}|u_j|}\in\mathcal{R},\ \text{i.e. iff a positive multiple of } c \text{ belongs to } \mathcal{R}.$$
More generally, since the $x_j$'s are to be determined, as are the $n_j$'s, it is customary to define
$$\mathcal{R}_+ := \{X(x) : x\in[-1,1]\}, \qquad \mathcal{R}_- := \{-X(x) : x\in[-1,1]\}, \qquad \mathcal{R} := \text{convex-hull}(\mathcal{R}_+\cup\mathcal{R}_-).$$
The set $\mathcal{R}$ is the Elfving set. It is to be noted that it is the convex hull of the union of the two sets $\mathcal{R}_+$ and $\mathcal{R}_-$; the intuitive representation is to see $\mathcal{R}$ as a "cylinder".

Remark 16 Minimizing the variance of $\widehat{c'\theta}$ over the $n_j$'s follows in a simple way. Indeed
$$\mathrm{var}\,\widehat{c'\theta} = \mathrm{var}\Big(\sum_{j=0}^{g-1}u_j\bar y(x_j)\Big) = \sum_{j=0}^{g-1}u_j^2\,\mathrm{var}(\bar y(x_j)) = \sum_{j=0}^{g-1}u_j^2\,\frac{\sigma^2}{n_j}.$$
The problem
$$\begin{cases}\min_{n_j}\ \sum_{j=0}^{g-1}\frac{u_j^2}{n_j}\\ \sum_{j=0}^{g-1}n_j = n\end{cases}$$
has solution $n_j^* = n\,|u_j|/\sum_{k=0}^{g-1}|u_k|$, $j = 0,\ldots,g-1$, as follows from the Cauchy-Schwarz inequality (see the sketch below).

3.4.1 Geometry of the Elfving set

The Elfving set is symmetric and convex, by its very definition.

The points of $\mathcal{R}$ may be seen as expected values of probability measures. Indeed the random variable which assumes the value $\varepsilon_jX(x_j)$ with probability $\alpha_j$, $j = 0,\ldots,g-1$, has expectation $\sum_{j=0}^{g-1}\alpha_j\varepsilon_jX(x_j)$; the reciprocal statement clearly holds. Thus to any point $z$ of $\mathcal{R}$ we may associate a design $\xi$.

The Elfving set is contained in the regression range, namely in the linear space $\operatorname{span}\{X(x) : x\in[-1,1]\}$. Indeed the convex combinations of $\pm X(x)$ belong to this space.

On $\operatorname{span}\{X(x) : x\in[-1,1]\}$ define the norm (gauge)
$$\rho : \operatorname{span}\{X(x) : x\in[-1,1]\}\to\mathbb{R}_+, \qquad z\mapsto\rho(z) := \inf\{t\geq 0 : z\in t\mathcal{R}\}.$$
This norm is useful in order to locate any point $z\in\operatorname{span}\{X(x) : x\in[-1,1]\}$ with respect to $\mathcal{R}$.

For example, if $\rho(z) = 0$ then $z\in 0\cdot\mathcal{R} = \{0\}$; therefore $z = 0\in\mathcal{R}$. If $\rho(z) = 1+\varepsilon$ with $\varepsilon > 0$ then $z\notin\mathcal{R}$. If $\rho(z) = \frac{1}{n}$ then $z\in\frac{1}{n}\mathcal{R}$ and $z\notin\frac{1}{n+\varepsilon}\mathcal{R}$. The larger $n$, the closer $z$ is to the null vector; reciprocally, small values of $n$ bring $z$ close to the boundary $Fr(\mathcal{R})$ of $\mathcal{R}$, and for $n < 1$ the point lies outside $\mathcal{R}$.

Clearly $z\in\mathcal{R}$ if and only if $\rho(z)\leq 1$. It follows that the Elfving set coincides with the closed ball, for the gauge $\rho$, with radius 1 and center $z = 0$ in $\operatorname{span}\{X(x) : x\in[-1,1]\}$. It follows that
$$\mathcal{R} = \{z\in\operatorname{span}\{X(x) : x\in[-1,1]\} : \rho(z)\leq 1\},$$
which yields that $\mathcal{R}$ is compact.

We now characterize the boundary points of $\mathcal{R}$.

Theorem 17 (Carathéodory) Let $A$ be a non void subset of $\mathbb{R}^{g}$. Then any convex combination of elements of $A$ can be written as a convex combination of at most $g+1$ points of $A$. Furthermore, if $z\in Fr(\operatorname{conv}(A))$, then $z$ is a convex combination of at most $g$ points of $A$.

Proof. See e.g. [15], p. 41.

Proposition 18 Let $z\in\operatorname{span}\{X(x) : x\in[-1,1]\}$. Then there exists a discrete measure $\xi$ with $\operatorname{supp}(\xi)\subset\mathcal{R}_+\cup\mathcal{R}_-$ such that
$$\frac{z}{\rho(z)} = \int_{[-1,1]}\varepsilon(X(x))\,X(x)\,d\xi(X(x)), \qquad \varepsilon(X(x))\in\{-1,1\}.$$

Proof. Since $\frac{z}{\rho(z)}\in Fr(\mathcal{R})$ and since $\mathcal{R}$ is compact, it holds $\frac{z}{\rho(z)}\in\mathcal{R}$. Now $\mathcal{R} = \operatorname{conv}(\mathcal{R}_+\cup\mathcal{R}_-)$ is a convex set; therefore $\frac{z}{\rho(z)}$ is a convex combination of elements of $(\mathcal{R}_+\cup\mathcal{R}_-)\subset\mathbb{R}^g$. From the above Carathéodory Theorem, since $\frac{z}{\rho(z)}$ is a frontier point of $\mathcal{R}$, it follows that $\frac{z}{\rho(z)}$ is a convex combination of $g$ points of $\mathcal{R}_+\cup\mathcal{R}_-$. Hence there exists a measure $\xi$, defined on points $\varepsilon(X(x_i))X(x_i)\in\mathcal{R}_+\cup\mathcal{R}_-$, such that
$$\frac{z}{\rho(z)} = \sum_{i=0}^{g-1}\xi\big(\varepsilon(X(x_i))X(x_i)\big)\,\varepsilon(X(x_i))\,X(x_i) = \int_{[-1,1]}\varepsilon(X(x))\,X(x)\,d\xi(X(x)), \qquad \varepsilon(X(x))\in\{-1,1\}.$$
Since $\frac{z}{\rho(z)}$ is a frontier point of $\mathcal{R}$, it holds $\rho\big(\frac{z}{\rho(z)}\big) = 1$.

Note that the measure $\xi$ cannot put positive mass on two opposite points $X(x)$ and $-X(x)$ of $\mathcal{R}$. Indeed assume that $0 < \xi(X(x_0))\leq\xi(X(x_1))$ with $X(x_1) = -X(x_0)$ and $X(x_0)\in\mathcal{R}_+$. Then the two corresponding terms in the above representation combine into a single one,
$$\xi(X(x_0))\,X(x_0) + \xi(X(x_1))\,(-X(x_0)) = \big(\xi(X(x_1)) - \xi(X(x_0))\big)\,(-X(x_0)),$$
so that, by the triangle inequality for the gauge $\rho$ and since $\rho(\pm X(x_i))\leq 1$ for all $i$,
$$1 = \rho\Big(\frac{z}{\rho(z)}\Big) \leq \big(\xi(X(x_1)) - \xi(X(x_0))\big) + \sum_{i\geq 2}\xi\big(\varepsilon(X(x_i))X(x_i)\big) < \xi(X(x_0)) + \xi(X(x_1)) + \sum_{i\geq 2}\xi\big(\varepsilon(X(x_i))X(x_i)\big) = 1,$$
a contradiction. Henceforth $X(x_0) = -X(x_1)$ cannot hold. For a boundary point it therefore holds
$$\frac{z}{\rho(z)} = \int_{[-1,1]}\varepsilon(X(x))\,X(x)\,d\xi(X(x)), \qquad \varepsilon(X(x))\in\{-1,1\},$$
where $\xi(X(x)) = 0$ if $\xi(-X(x)) > 0$ and $\xi(-X(x)) = 0$ if $\xi(X(x)) > 0$.

We use the fact that at any boundary point of a convex set there exists a tangent (supporting) hyperplane to the convex set. This hyperplane divides $\operatorname{span}\{X(x) : x\in[-1,1]\}$ into two subsets; the first one, "below", contains $\mathcal{R}$ and the second one, "above", does not contain any point of $\mathcal{R}$. This fact allows for the determination of the boundary of $\mathcal{R}$.

Let
$$\frac{c}{\rho(c)}\in Fr(\mathcal{R}).$$

Proposition 19 There exists a vector $h$ in $\mathbb{R}^g$ such that for any $z$ in $\mathcal{R}$ it holds
$$z'hh'z\leq 1.$$

Proof. The tangent hyperplane to $\mathcal{R}$ at the point $\frac{c}{\rho(c)}$ is defined as follows. Let $\tilde h$ be a vector; this vector defines the linear form which in turn determines the hyperplane if, for any $z\in\mathcal{R}$,
$$z'\tilde h \leq \frac{c'\tilde h}{\rho(c)}.$$
This relation states that all points of $\mathcal{R}$ lie "below" the hyperplane. Since $\mathcal{R}$ is a symmetric set, when $z$ satisfies $z'\tilde h\leq\frac{c'\tilde h}{\rho(c)}$ then the same holds for $-z$, which also belongs to $\mathcal{R}$. Hence
$$\begin{cases}z'\tilde h\leq\frac{c'\tilde h}{\rho(c)}\\ -z'\tilde h\leq\frac{c'\tilde h}{\rho(c)}.\end{cases}$$
It follows that for any $z\in\mathcal{R}$ we have $|z'\tilde h|\leq\frac{c'\tilde h}{\rho(c)}$. The real number
$$\beta := \frac{c'\tilde h}{\rho(c)}$$
is therefore non negative. Furthermore it does not equal 0, since otherwise $\mathcal{R}$ would have a void interior. Hence $\beta > 0$. Define therefore the vector
$$h := \frac{\tilde h}{\beta},$$
so that
$$z'h\leq 1 = \frac{c'h}{\rho(c)}.$$
Hence $\rho(c) = c'h$. Using $|z'h|\leq 1$ we get $z'h\,h'z\leq 1$ for any $z\in\mathcal{R}$.

We now define cylinders generated by a fixed vector $k$ in $\mathbb{R}^g$.

Definition 20 The cylinder induced by the matrix $N(k) := kk'$ is defined through
$$\{z\in\operatorname{span}\{X(x) : x\in[-1,1]\}\ \text{such that}\ z'kk'z\leq 1\}.$$

From $\rho(c) = c'h$ we get $(\rho(c))^2 = c'hh'c$. We will identify the cylinder with the symmetric positive semidefinite matrix $N = hh'$.

The union of all cylinders over all choices of vectors $k$ in $\mathbb{R}^g$ is denoted $\mathcal{N}$, which is the set of all cylinders. Observe that
$$\{z : z'h\,h'z\leq 1\}$$
is a cylinder which contains $\mathcal{R}$.

This indicates that, from the outside of $\mathcal{R}$, we may identify the boundary points either through hyperplanes or through cylinders. From inside, the boundary points of the Elfving set are obtained through convex combinations of points of $\mathcal{R}_+$ and $\mathcal{R}_-$:
$$\frac{z}{\rho(z)} = \int_{[-1,1]}\varepsilon(X(x))\,X(x)\,d\xi(X(x)), \qquad \varepsilon(X(x))\in\{-1,1\},$$
with $\xi(X(x)) = 0$ if $\xi(-X(x)) > 0$ and $\xi(-X(x)) = 0$ if $\xi(X(x)) > 0$. We have seen that the tangent hyperplane to $\mathcal{R}$ at $\frac{c}{\rho(c)}$ defines a cylinder which contains $\mathcal{R}$.

Since $\mathcal{R}$ is the set of all vectors which define estimable linear forms, it appears natural to use the notion of cylinders in order to study the variance of the estimators of the corresponding $c$-forms.

3.4.2 The relation between cylinders and the variance of the estimator of the $c$-form

The convex set $\mathcal{R}$ can be approximated from outside through its tangent hyperplanes; this outer approximation is optimal among all outer approximations by hyperplanes containing $\mathcal{R}$.

Let $c\in\mathcal{R}$. The best hyperplane which approximates $\mathcal{R}$ from outside at the point $\frac{c}{\rho(c)}$ is the tangent hyperplane to $\mathcal{R}$ at $\frac{c}{\rho(c)}$; it is defined through a vector $\tilde h$. It is convenient to introduce the vector $h := \frac{\tilde h}{\beta}$, $\beta = \big\langle\frac{c}{\rho(c)},\tilde h\big\rangle$. To this vector $h$ we link in a unique way the symmetric matrix $N := hh'$. We consider the associated cylinder
$$\{z = X(x),\ x\in[-1,1] : z'Nz\leq 1\}. \qquad (11)$$
This cylinder contains $\mathcal{R}$. We consider the question of the optimality of this cylinder when approximating $\mathcal{R}$ locally (at $\frac{c}{\rho(c)}$) within the class of cylinders. This question admits a positive answer; indeed the cylinder (11) is optimal in the class of all cylinders which contain $\mathcal{R}$ and which are generated by a symmetric matrix, including therefore all moment matrices. This result is proved in Theorem 22.

Henceforth $c'Nc\leq c'(M(\xi))^{-}c$ for every discrete measure $\xi$ with finite support in $[-1,1]$. This induces the question whether there exists a measure $\xi^*$ such that $N = (M(\xi^*))^{-}$ with
$$(M(\xi^*))^{-}\preceq(M(\xi))^{-}$$
for all $\xi$ in $M_d([-1,1])$. This would in turn imply
$$c'(M(\xi^*))^{-}c\leq c'(M(\xi))^{-}c = \mathrm{Var}\,\widehat{\langle c,\theta\rangle}\quad\text{for any }\xi\in M_d([-1,1]),$$
where $\widehat{\langle c,\theta\rangle}$ is the Gauss-Markov estimator of $\langle c,\theta\rangle$; here and in the sequel the factor $\sigma^2/n$ is omitted, so that we write $\mathrm{Var}\,\widehat{\langle c,\theta\rangle} = c'(M(\xi))^{-}c$. Conditions which ensure this fact are stated in Theorem 26.

Lower bound for the variance. We introduce the following class of cylinders, $\mathcal{N}(X)$: the nonnegative semidefinite matrices $N$ (which depend on $X$) for which all elements of the regression range $z = X(x)$ satisfy $z'Nz\leq 1$ as $x$ runs in $[-1,1]$. Namely we define

Definition 21
$$\mathcal{N}(X) := \{N\in S_{\geq 0}(g)\ \text{such that}\ (X(x))'NX(x)\leq 1\ \text{for all } x\in[-1,1]\}.$$

Recall that the variance of the Gauss-Markov estimator of the $c$-form with design $\xi$ is $c'(M(\xi))^{-}c$, where $(M(\xi))^{-}$ is a generalized inverse of $M(\xi)$, hence an element of $S_{\geq 0}(g)$. The next result compares this variance with the homologue terms obtained when $(M(\xi))^{-}$ is substituted by a generic element of $\mathcal{N}(X)$, providing a lower bound for the variance over all designs.

Theorem 22 (Pukelsheim) Assume that $M(\xi)\in\mathcal{A}(c)\cap\mathcal{M}(\Xi)$. Then for any $N$ in $\mathcal{N}(X)$,
$$\mathrm{Var}\,\widehat{\langle c,\theta\rangle} = c'(M(\xi))^{-}c\geq c'Nc.$$

Proof. We first prove that
$$\operatorname{tr}(M(\xi)N)\leq 1. \qquad (12)$$
Since $M(\xi)\in\mathcal{M}(\Xi)$, $\xi$ is a finite probability measure on $[-1,1]$. Integrate both sides of the inequality $(X(x))'NX(x)\leq 1$ with respect to $\xi$:
$$\int_{[-1,1]}(X(x))'NX(x)\,d\xi(x)\leq\int_{[-1,1]}1\,d\xi(x) = 1.$$
Since $M(\xi) = \int X(x)(X(x))'\,d\xi(x)$ and $\operatorname{tr}(AB) = \operatorname{tr}(BA)$, taking $A := X(x)$ and $B := (X(x))'N$ we obtain
$$(X(x))'NX(x) = \operatorname{tr}\big((X(x))'NX(x)\big) = \operatorname{tr}\big(X(x)(X(x))'N\big), \qquad (13)$$
so that
$$\int_{[-1,1]}(X(x))'NX(x)\,d\xi(x) = \operatorname{tr}\Big(\int_{[-1,1]}X(x)(X(x))'\,d\xi(x)\;N\Big) = \operatorname{tr}(M(\xi)N)\leq 1.$$

We now prove that
$$\operatorname{tr}(M(\xi)N)\geq\big(c'(M(\xi))^{-}c\big)^{-1}c'Nc. \qquad (14)$$
This follows from the fact that $M(\xi)\in\mathcal{A}(c)$. Indeed if $M(\xi)\in\mathcal{A}(c)$ then, by the Gauss-Markov Theorem, it can be proved (see [12], Chapters 21 and 22) that
$$M(\xi)\succeq c\,\big(c'(M(\xi))^{-}c\big)^{-1}c'.$$
Multiplying both sides by $N$ and using the fact that $A\succeq B$ entails $\operatorname{tr}(AN)\geq\operatorname{tr}(BN)$ for $N\succeq 0$, we obtain
$$\operatorname{tr}(M(\xi)N)\geq\operatorname{tr}\big(c(c'(M(\xi))^{-}c)^{-1}c'N\big) = \big(c'(M(\xi))^{-}c\big)^{-1}\operatorname{tr}(c'Nc) = \big(c'(M(\xi))^{-}c\big)^{-1}c'Nc.$$

We now prove the claim. From (14),
$$c'(M(\xi))^{-}c\,\operatorname{tr}(M(\xi)N)\geq c'Nc.$$
Now, by (12), $\operatorname{tr}(M(\xi)N)\leq 1$ and therefore, multiplying by $c'(M(\xi))^{-}c\geq 0$,
$$c'(M(\xi))^{-}c\geq\operatorname{tr}(M(\xi)N)\,c'(M(\xi))^{-}c\geq c'Nc, \qquad (15)$$
and finally
$$\mathrm{Var}\,\widehat{\langle c,\theta\rangle} = c'(M(\xi))^{-}c\geq c'Nc.$$
We will see that this lower bound can be achieved, which yields a criterion for the optimality of the design $\xi$.
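As a small numerical check of this bound (not part of the paper), assuming the monomial system on a grid of $[-1,1]$: a rank-one cylinder $N = hh'$, rescaled so that $\max_x(h'X(x))^2 = 1$ on the grid, satisfies $c'Nc\leq c'(M(\xi))^{-}c$ for randomly drawn designs $\xi$ supported on grid points.

```python
# Minimal sketch (not from the paper): checking c'Nc <= c'M(xi)^- c (Theorem 22),
# assuming phi_j(x) = x^j on a grid of [-1, 1].
import numpy as np

rng = np.random.default_rng(1)
g = 3
grid = np.linspace(-1.0, 1.0, 201)
Xg = np.vstack([grid**j for j in range(g)])          # column k is X(x_k)

c = np.array([1.0, 2.0, 4.0])                        # e.g. c = X(2): extrapolation at x = 2

h = rng.normal(size=g)
h /= np.max(np.abs(h @ Xg))                          # so max_x (h'X(x))^2 = 1 on the grid
N = np.outer(h, h)                                   # a cylinder matrix in N(X)

for _ in range(200):                                 # random designs on g grid nodes
    nodes = rng.choice(grid, size=g, replace=False)
    w = rng.dirichlet(np.ones(g))
    M = sum(wi * np.outer(np.array([x**j for j in range(g)]),
                          np.array([x**j for j in range(g)])) for wi, x in zip(w, nodes))
    assert c @ N @ c <= c @ np.linalg.pinv(M) @ c + 1e-8
```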

The lower bound can be achieved. We now state three technical lemmas. For fixed $c\in\mathcal{R}$, denote $\tilde h$ the vector of the coefficients of the tangent hyperplane to $\mathcal{R}$ at the point $\frac{c}{\rho(c)}$ and
$$h := \frac{\tilde h}{\beta}, \qquad \beta = \Big\langle\frac{c}{\rho(c)},\tilde h\Big\rangle.$$
Accordingly define the symmetric positive semidefinite matrix
$$N := N_c := hh'.$$

Lemma 23 Let $M(\xi)\in\mathcal{A}(c)$. Then
$$M(\xi)(M(\xi))^{-}c = c, \qquad (16)$$
$$\frac{T'}{\sqrt n} = M(\xi)(M(\xi))^{-}\frac{T'}{\sqrt n}, \qquad (17)$$
and, if $G$ is any generalized inverse of $M(\xi)$,
$$\frac{T}{\sqrt n}\,Gc = \frac{T}{\sqrt n}\,(M(\xi))^{-}c. \qquad (18)$$

Proof. We prove (16). From $M(\xi)\in\mathcal{A}(c)$ it follows that $c\in\operatorname{Im}M(\xi)$; hence there exists a vector $z$ such that $c = M(\xi)z$. Hence
$$M(\xi)(M(\xi))^{-}c = M(\xi)(M(\xi))^{-}M(\xi)z = M(\xi)z = c.$$

We now prove (17). Positive semidefinite matrices can be expressed as squares. Hence
$$M(\xi) = \frac{T'}{\sqrt n}\frac{T}{\sqrt n}, \qquad N = hh'.$$
By the definition of a generalized inverse, $M(\xi) = M(\xi)(M(\xi))^{-}M(\xi)$, therefore
$$\frac{T'}{\sqrt n}\frac{T}{\sqrt n} = M(\xi)(M(\xi))^{-}\frac{T'}{\sqrt n}\frac{T}{\sqrt n},$$
and, since $\operatorname{Im}\frac{T'}{\sqrt n} = \operatorname{Im}M(\xi)$, this yields
$$\frac{T'}{\sqrt n} = M(\xi)(M(\xi))^{-}\frac{T'}{\sqrt n}.$$

We now prove (18). Let $G$ be a generalized inverse of $M(\xi)$ and evaluate $\frac{T}{\sqrt n}Gc$. Transposing (17), $\frac{T}{\sqrt n} = \frac{T}{\sqrt n}\big((M(\xi))^{-}\big)'M(\xi)$. Writing $c = M(\xi)z$ and using $M(\xi)GM(\xi) = M(\xi)$,
$$\frac{T}{\sqrt n}Gc = \frac{T}{\sqrt n}\big((M(\xi))^{-}\big)'M(\xi)GM(\xi)z = \frac{T}{\sqrt n}\big((M(\xi))^{-}\big)'M(\xi)z = \frac{T}{\sqrt n}\big((M(\xi))^{-}\big)'c.$$
The right-hand side does not depend on $G$; taking in particular $G = (M(\xi))^{-}$ gives
$$\frac{T}{\sqrt n}Gc = \frac{T}{\sqrt n}(M(\xi))^{-}c.$$

Lemma 24 (Pukelsheim) Assume that $M(\xi)\in\mathcal{A}(c)\cap\mathcal{M}(\Xi)$. Then $\operatorname{tr}(M(\xi)N) = 1$ if and only if $(X(x))'NX(x) = 1$ for all $x\in\operatorname{supp}(\xi)$.

Proof. From (13) in the proof of Theorem 22, $\operatorname{tr}(M(\xi)N) = \int_{[-1,1]}(X(x))'NX(x)\,d\xi(x)$. Since $(X(x))'NX(x)\leq 1$ for all $x$, this integral equals 1 if and only if $(X(x))'NX(x) = 1$ for $\xi$-almost every $x$, that is, for all $x\in\operatorname{supp}(\xi)$.

Lemma 25 (Pukelsheim) Assume that $M(\xi)\in\mathcal{A}(c)\cap\mathcal{M}(\Xi)$ and let $N\in\mathcal{N}(X)$, i.e. such that for any $x\in[-1,1]$ it holds $(X(x))'NX(x)\leq 1$. Then
$$\operatorname{tr}(M(\xi)N) = 1 \quad\text{and}\quad \operatorname{tr}(M(\xi)N) = \big(c'(M(\xi))^{-}c\big)^{-1}c'Nc$$
if and only if
$$(X(x))'NX(x) = 1\ \text{for any } x\in\operatorname{supp}(\xi) \quad\text{and}\quad M(\xi)N = c\,\big(c'(M(\xi))^{-}c\big)^{-1}c'N.$$

Proof. We prove the direct implication. From (15),
$$\operatorname{tr}(M(\xi)N)\geq\big(c'(M(\xi))^{-}c\big)^{-1}c'Nc, \qquad\text{i.e.}\qquad \operatorname{tr}(M(\xi)N)-\big(c'(M(\xi))^{-}c\big)^{-1}c'Nc\geq 0.$$
We prove that $\operatorname{tr}(M(\xi)N)-\big(c'(M(\xi))^{-}c\big)^{-1}c'Nc$ is the squared trace norm of the matrix
$$A := \frac{T}{\sqrt n}\,h - \frac{T}{\sqrt n}\,(M(\xi))^{-}c\,\big(c'(M(\xi))^{-}c\big)^{-1}c'h.$$
Some calculus yields
$$\|A\|_{tr}^2 := \operatorname{tr}(A'A) = \operatorname{tr}(M(\xi)N)-\big(c'(M(\xi))^{-}c\big)^{-1}c'Nc.$$
Assuming that $\operatorname{tr}(M(\xi)N) = \big(c'(M(\xi))^{-}c\big)^{-1}c'Nc$, it holds $\|A\|_{tr}^2 = 0$, which entails $A = 0$. Hence
$$\frac{T}{\sqrt n}\,h = \frac{T}{\sqrt n}\,(M(\xi))^{-}c\,\big(c'(M(\xi))^{-}c\big)^{-1}c'h.$$
It then follows that
$$\frac{T'}{\sqrt n}\frac{T}{\sqrt n}\,hh' = \frac{T'}{\sqrt n}\frac{T}{\sqrt n}\,(M(\xi))^{-}c\,\big(c'(M(\xi))^{-}c\big)^{-1}c'hh',$$
i.e.
$$M(\xi)N = M(\xi)(M(\xi))^{-}c\,\big(c'(M(\xi))^{-}c\big)^{-1}c'N.$$
From (16) in Lemma 23 we have $M(\xi)(M(\xi))^{-}c = c$, hence
$$M(\xi)N = c\,\big(c'(M(\xi))^{-}c\big)^{-1}c'N.$$
Moreover, $\operatorname{tr}(M(\xi)N) = 1$ yields, by Lemma 24, $(X(x))'NX(x) = 1$ for any $x\in\operatorname{supp}(\xi)$.

The reciprocal statement follows straightforwardly.

The following Theorem indicates conditions for
$$c'(M(\xi))^{-}c = c'Nc$$
to hold.

Theorem 26 (Pukelsheim) Assume that $M(\xi)\in\mathcal{A}(c)\cap\mathcal{M}(\Xi)$. Then
$$c'(M(\xi))^{-}c = c'Nc$$
if and only if
$$(X(x))'NX(x) = 1\ \text{for any } x\in\operatorname{supp}(\xi) \quad\text{and}\quad M(\xi)N = c\,\big(c'(M(\xi))^{-}c\big)^{-1}c'N.$$

Proof. By Lemma 25, the two conditions above hold if and only if
$$\operatorname{tr}(M(\xi)N) = \big(c'(M(\xi))^{-}c\big)^{-1}c'Nc \quad\text{and}\quad \operatorname{tr}(M(\xi)N) = 1.$$
Hence, under these conditions, $\big(c'(M(\xi))^{-}c\big)^{-1}c'Nc = 1$, i.e. $c'(M(\xi))^{-}c = c'Nc$, and conversely.

4 Elfving Theorem

4.1 An intuitive approach for a design supported by two points in [-1,1]

From the above discussion pertaining to the Elfving set it follows that the vectors $c$ corresponding to an optimal measure (in the sense of minimizing the variance of the estimable $c$-form) are frontier points of $\mathcal{R}$.

In order to describe these points, through Elfving's Theorem, we develop Elfving's approach, following his development in dimension 2.

Elfving [4] provides a geometric property which characterizes optimal measures for the estimation of linear $c$-forms. We first get some insight into his result, following Elfving's treatment of the following example.

Consider the regression problem
$$y_i = x_{i1}\theta_1 + x_{i2}\theta_2 + \varepsilon_i, \qquad i = 1,\ldots,n,$$
where for all $i$: $E(\varepsilon_i) = 0$, $\mathrm{var}(\varepsilon_i) = \sigma^2 > 0$ with $\sigma^2$ unknown, and the $\varepsilon_i$'s are uncorrelated.

The experimenter may choose the values of the vector $x' := (x_1, x_2)$ in a compact set $S$ in $\mathbb{R}^2$. The vector $\theta := (\theta_1,\theta_2)'$ is unknown.

The aim is to find two points $x_j$ in $S$, and the number $n_j$ of replications of the experiment at each of them, with
$$\sum_{j=1}^{2}n_j = n,$$
such that the variance of the Gauss-Markov estimator of the $c$-form
$$\langle c,\theta\rangle := c_1\theta_1 + c_2\theta_2, \qquad c := (c_1,c_2)',$$
be minimal. We denote $x_j' := (x_{j1},x_{j2})$, $j = 1,2$, the two points where the experiment is to be performed. We thus have
$$\widehat{\langle c,\theta\rangle} := \sum_{j=1}^{2}\beta_j\,\bar y_j, \qquad \bar y_j := \frac{\sum_{i=1}^{n_j}y_i(j)}{n_j},$$
where $y_i(j)$ is the $i$-th observation at the point $x_j$, $j = 1,2$, $i = 1,\ldots,n_j$. The vector $\beta := (\beta_1,\beta_2)$ is deduced from the normal equations following from the standard least squares approach. We evaluate the variance of $\widehat{\langle c,\theta\rangle}$. It holds
$$\mathrm{var}\,\widehat{\langle c,\theta\rangle} = \sum_{j=1}^{2}\beta_j^2\,\mathrm{var}(\bar y_j) = \sum_{j=1}^{2}\beta_j^2\,\frac{\sigma^2}{n_j}.$$
Denote $p_j := \frac{n_j}{n}$ and $p := (p_1,p_2)$, and assume for simplicity that $\sigma^2/n = 1$; it holds
$$\mathrm{var}\,\widehat{\langle c,\theta\rangle} = \sum_{j=1}^{2}\beta_j^2\,\mathrm{var}(\bar y_j) = \sum_{j=1}^{2}\frac{\beta_j^2}{p_j}.$$

The initial problem may now be formalized as
$$\begin{cases}\min_{\beta}\min_{p}\ \sum_{j=1}^{2}\frac{\beta_j^2}{p_j}\\ \sum_{j=1}^{2}p_j = 1,\ p_j\geq 0,\ c = \sum_{j=1}^{2}\beta_j x_j,\end{cases}$$
together with the unbiasedness condition pertaining to the estimator $\widehat{\langle c,\theta\rangle}$, namely
$$c'\theta = E\,\widehat{\langle c,\theta\rangle} = \sum_{j=1}^{2}\beta_jE(\bar y_j) = \sum_{j=1}^{2}\beta_j\,x_j'\theta,$$
which amounts to
$$c = \sum_{j=1}^{2}\beta_j\,x_j.$$
We proceed in a two-step procedure; in the first one, use the Kuhn-Tucker Theorem in order to minimize with respect to $p$. This yields
$$p_j^* := \frac{|\beta_j|}{\sum_{j=1}^{2}|\beta_j|}.$$
Substitution in the variance yields
$$\mathrm{var}\,\widehat{\langle c,\theta\rangle} = \sum_{j=1}^{2}\beta_j^2\,\mathrm{var}(\bar y_j) = \sum_{j=1}^{2}\beta_j^2\,\frac{\sum_{k=1}^{2}|\beta_k|}{|\beta_j|} = \Big(\sum_{j=1}^{2}|\beta_j|\Big)^2.$$
Denoting
$$k := \sum_{j=1}^{2}|\beta_j|,$$
we now have to solve the problem
$$\begin{cases}\min_{\beta}\ \sum_{j=1}^{2}|\beta_j|\\ c = \sum_{j=1}^{2}\beta_j x_j.\end{cases}$$
Since
$$p_j^* = \frac{|\beta_j|}{\sum_{k=1}^{2}|\beta_k|} = \frac{(\operatorname{sgn}\beta_j)\,\beta_j}{\sum_{k=1}^{2}|\beta_k|},$$
it holds
$$(\operatorname{sgn}\beta_j)\,p_j^* = \frac{\beta_j}{\sum_{k=1}^{2}|\beta_k|} \qquad\text{and}\qquad \beta_j = (\operatorname{sgn}\beta_j)\,|\beta_j|,$$
which in turn yields
$$c = \sum_{j=1}^{2}\beta_j x_j = \Big(\sum_{j=1}^{2}|\beta_j|\Big)\sum_{j=1}^{2}\frac{\beta_j}{\sum_{k=1}^{2}|\beta_k|}\,x_j.$$
Therefore
$$c = k\sum_{j=1}^{2}p_j^*\,(\operatorname{sgn}\beta_j)\,x_j = k\,a,$$
where we have set $a := \sum_{j=1}^{2}p_j^*\,(\operatorname{sgn}\beta_j)\,x_j$.

This proves that the unbiasedness condition $E\,\widehat{\langle c,\theta\rangle} = c_1\theta_1 + c_2\theta_2$ entails
$$\frac{c}{k}\in\operatorname{conv}\{\pm x_j,\ j = 1,2\}.$$
Furthermore the same unbiasedness condition shows that the vector $c$ is parallel to the vector $a$. Therefore
$$\|c\| = |k|\,\|a\| \qquad\text{i.e.}\qquad k = \frac{\|c\|}{\|a\|}.$$
We have obtained the following: minimizing $k = \sum_{j=1}^{2}|\beta_j|$ amounts to maximizing $\|a\|$ under the constraint that $a$ belongs to the convex set $\operatorname{conv}\{\pm x_j,\ j = 1,2\}$. The length of the vector $a$ is maximal when
$$a = a^*\in Fr\big(\operatorname{conv}\{\pm x_j,\ j = 1,2\}\big).$$
Therefore the variance of the estimator of the $c$-form is minimal whenever $a\in Fr(\operatorname{conv}\{\pm x_j,\ j = 1,2\})$. Observe that $(0,0)\in\operatorname{conv}\{\pm x_j,\ j = 1,2\}$.

Denote now by $A$ the intersection of the oriented segment with origin $(0,0)$ which represents the vector $a^*$ with the frontier of the set $\operatorname{conv}\{\pm x_j,\ j = 1,2\}$, and recall that $a^*$ is parallel to $c$. Denote further by $X_{i_1}$, $X_{i_2}$ the extremities of the oriented segments with origin $(0,0)$ which represent, respectively, the vectors $\varepsilon_{i_1}x_{i_1}$, $\varepsilon_{i_2}x_{i_2}$ (with $\varepsilon_{i_j} = \pm1$). The point $A$ is henceforth represented through the convex combination
$$A = p_{i_1}X_{i_1} + (1-p_{i_1})X_{i_2}, \qquad p_{i_1}\in(0,1).$$
The optimal design is therefore given by
$$\xi^*(x) := \begin{cases}p_{i_1} & \text{for } x = x_{i_1}\\ 1-p_{i_1} & \text{for } x = x_{i_2}\\ 0 & \text{for } x\notin\{x_{i_1},x_{i_2}\}.\end{cases}$$
The optimal variance of the estimator is then
$$k^{*2} := \Big(\frac{\|c\|}{\|a^*\|}\Big)^2.$$
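As a numerical sketch of this construction (not part of the paper), the minimization of $\sum_j|\beta_j|$ under $c = \sum_j\beta_jx_j$ can be written as a small linear program; the weights $p_j = |\beta_j|/\sum_k|\beta_k|$ and the optimal variance $(\sum_j|\beta_j|)^2$ then follow, with the $\sigma^2/n = 1$ normalization used above. The two design points and the vector $c$ below are hypothetical.

```python
# Minimal sketch (not from the paper): Elfving's two-point construction solved as the
# LP  min sum_j |beta_j|  s.t.  sum_j beta_j x_j = c,  via beta = b+ - b-.
import numpy as np
from scipy.optimize import linprog

x = np.array([[1.0, 0.0],    # x_1
              [1.0, 1.0]])   # x_2 (hypothetical design points in S)
c = np.array([1.0, 2.0])     # coefficient vector of the c-form

A_eq = np.hstack([x.T, -x.T])                 # sum_j (b_j+ - b_j-) x_j = c
res = linprog(c=np.ones(4), A_eq=A_eq, b_eq=c, bounds=[(0, None)] * 4, method="highs")

beta = res.x[:2] - res.x[2:]                  # beta_j
k = np.abs(beta).sum()                        # Elfving norm rho(c)
p = np.abs(beta) / k                          # weights p_j on the points sgn(beta_j) x_j
print("beta =", beta, " weights =", p, " optimal variance =", k**2)
```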

(40)

In order to generalize the above construction de…ne L (supp ( )) := span fX (x) : x 2 supp ( )g which is a subset of Rg:

Remark 27 Taking into account the fact that when A and B are two de…nite non negative square matrices with same dimension then Im (A + B) = Im A+Im B, and assuming that supp( ) :=fx0; :::; xg 1g, then Im (M ( )) =

L (supp ( )) : Indeed since M ( ) = Pxj2supp( ) (x) (X (x)) X (x) 0,we obtain Im (M ( )) := Im 0 @ X x2supp( ) (x) X (x) (X (x))0 1 A = g 1 X j=0 Im (xj) (X (xj)) (X (xj))0 = g 1 X j=0 Im (X (xj)) (X (xj))0 = g 1 X j=0 Im ((X (xj))) = (g 1 X j=0 jX (xj) ; j 2 R ) :

Clearly a c- form is estimable if and only if c2 L (supp ( )) :

4.2 The general case

Theorem 28 (Elfving) Assume that the regression range $\{X(x) : x\in[-1,1]\}\subset\mathbb{R}^g$ is compact, and that the coefficient vector $c\in\mathbb{R}^g$ lies in the regression space $L(\operatorname{supp}(\xi))$ and has Elfving norm $\rho(c) > 0$. Then a design $\xi^*\in M([-1,1])$ is optimal for $\langle c,\theta\rangle$ in $M([-1,1])$ if and only if there exists a function $\varepsilon$ on $\{X(x) : x\in[-1,1]\}$ which takes values $\pm1$ on the support of $\xi^*$ and such that
$$\frac{c}{\rho(c)} = \int_{\{X(x):x\in[-1,1]\}}\varepsilon(X(x))\,X(x)\,d\xi^*(X(x)). \qquad (19)$$

Proof. First step (from Pukelsheim, p. 51). Assume that there exists a function $\varepsilon$ on $\{X(x) : x\in[-1,1]\}$ which on the support of $\xi^*$ takes values $\pm1$ and such that (19) holds; we prove that $\xi^*$ is an optimal design for $\langle c,\theta\rangle$ in $M([-1,1])$, and that the optimal variance is $(\rho(c))^2$. We thus prove that $\langle c,\theta\rangle$ is estimable and that its Gauss-Markov estimator has minimum variance for the measure $\xi^*$.

We prove that $\langle c,\theta\rangle$ is estimable. By hypothesis,
$$\frac{c}{\rho(c)} = \sum_{x\in\operatorname{supp}(\xi^*)}\xi^*(x)\,\varepsilon(X(x))\,X(x), \qquad \varepsilon(X(x)) = \pm1. \qquad (20)$$
Since $\rho\big(\frac{c}{\rho(c)}\big) = 1$ it follows that $\frac{c}{\rho(c)}\in Fr(\mathcal{R})$. Hence there exists a tangent hyperplane which touches $\mathcal{R}$ at $\frac{c}{\rho(c)}$. Let $h$ be the vector of the coefficients of this hyperplane. It holds (see Proposition 19)
$$\varepsilon(X(x))\,(X(x))'h\leq 1 \quad\text{for any } x\in[-1,1]. \qquad (21)$$
Furthermore the tangency condition at the point $\frac{c}{\rho(c)}$ provides
$$\frac{c'h}{\rho(c)} = 1.$$
Substituting (20) into this latest expression we obtain
$$1 = \frac{c'h}{\rho(c)} = \sum_{x\in\operatorname{supp}(\xi^*)}\xi^*(x)\,\varepsilon(X(x))\,(X(x))'h.$$
From (21) we get $\xi^*(x)\,\varepsilon(X(x))\,(X(x))'h\leq\xi^*(x)$, and therefore
$$1 = \frac{c'h}{\rho(c)} = \sum_{x\in\operatorname{supp}(\xi^*)}\xi^*(x)\,\varepsilon(X(x))\,(X(x))'h\leq\sum_{x\in\operatorname{supp}(\xi^*)}\xi^*(x) = 1.$$
We deduce that
$$\sum_{x\in\operatorname{supp}(\xi^*)}\xi^*(x)\,\varepsilon(X(x))\,(X(x))'h = 1. \qquad (22)$$
Assume that
$$\varepsilon(X(x))\,(X(x))'h\neq 1$$
for some $x$ in the support of $\xi^*$; by (21) this means $\varepsilon(X(x))(X(x))'h < 1$ for that $x$. Multiplying by $\xi^*(x) > 0$ and summing over all points of the support of $\xi^*$ then gives
$$\sum_{x\in\operatorname{supp}(\xi^*)}\xi^*(x)\,\varepsilon(X(x))\,(X(x))'h < 1,$$
a contradiction with (22). Hence $\varepsilon(X(x))\,(X(x))'h = 1$ for all $x$ in the support of $\xi^*$.

From (22) we get
$$(X(x))'h = \frac{1}{\varepsilon(X(x))} = \varepsilon(X(x)) \qquad (23)$$
for $x\in\operatorname{supp}(\xi^*)$, and therefore, substituting $\varepsilon(X(x))$ by $(X(x))'h$ in (20) and noting that $\sum_x\xi^*(x)X(x)(X(x))' = M(\xi^*)$, we obtain
$$\frac{c}{\rho(c)} = \sum_{x\in\operatorname{supp}(\xi^*)}\xi^*(x)\,\varepsilon(X(x))\,X(x) = \sum_{x\in\operatorname{supp}(\xi^*)}\xi^*(x)\,X(x)\,(X(x))'h = M(\xi^*)\,h.$$
This proves that
$$\frac{c}{\rho(c)} = M(\xi^*)h \qquad (24)$$
and therefore $\frac{c}{\rho(c)}\in\operatorname{Im}M(\xi^*)$, which yields $M(\xi^*)\in\mathcal{A}(c)$. Hence $\langle c,\theta\rangle$ is estimable.

We now prove that $\xi^*$ provides a minimal variance Gauss-Markov estimator of $\langle c,\theta\rangle$. At the frontier point $\frac{c}{\rho(c)}$, using (24),
$$h'M(\xi^*)h = \frac{c'h}{\rho(c)}.$$

Hence
$$h'M(\xi^*)h = 1.$$
By Theorem 26, $\xi^*$ is optimal iff $c'(M(\xi^*))^{-}c = c'Nc$. We evaluate $c'(M(\xi^*))^{-}c$. Since $\frac{c}{\rho(c)} = M(\xi^*)h$ by (24),
$$c'(M(\xi^*))^{-}c = (\rho(c))^2\frac{c'}{\rho(c)}(M(\xi^*))^{-}\frac{c}{\rho(c)} = (\rho(c))^2\,(M(\xi^*)h)'(M(\xi^*))^{-}(M(\xi^*)h) = (\rho(c))^2\,h'M(\xi^*)(M(\xi^*))^{-}M(\xi^*)h.$$
Now $M(\xi^*)$ is symmetric and $M(\xi^*)(M(\xi^*))^{-}M(\xi^*) = M(\xi^*)$; hence, since $h'M(\xi^*)h = 1$,
$$c'(M(\xi^*))^{-}c = (\rho(c))^2\,h'M(\xi^*)h = (\rho(c))^2.$$
This proves that if $\xi^*$ is optimal then the variance of the estimator of $\langle c,\theta\rangle$ equals $(\rho(c))^2$.

In order to prove optimality recall that, by Theorem 26, $c'(M(\xi^*))^{-}c = c'Nc$ iff
$$(X(x))'NX(x) = 1\ \text{for any } x\in\operatorname{supp}(\xi^*) \quad\text{and}\quad M(\xi^*)N = c\,\big(c'(M(\xi^*))^{-}c\big)^{-1}c'N.$$
Since $c'(M(\xi^*))^{-}c = (\rho(c))^2$, $\frac{c}{\rho(c)} = M(\xi^*)h$, $hh' = N$ and $h'M(\xi^*)h = 1$, we have
$$c\,\big(c'(M(\xi^*))^{-}c\big)^{-1}c'N = \frac{c}{\rho(c)}\,\frac{c'}{\rho(c)}\,N = M(\xi^*)h\,\big(h'M(\xi^*)h\big)\,h' = M(\xi^*)hh' = M(\xi^*)N.$$
Therefore $M(\xi^*)N = c\big(c'(M(\xi^*))^{-}c\big)^{-1}c'N$. Moreover, for any $x\in\operatorname{supp}(\xi^*)$, by (23),
$$(X(x))'NX(x) = (X(x))'h\,\big(h'X(x)\big) = \varepsilon(X(x))\,\varepsilon(X(x)) = 1.$$
We have proved optimality and also that
$$(\rho(c))^2 = \min_{\xi}\mathrm{var}\,\widehat{\langle c,\theta\rangle}.$$

Second step: we now prove that if $\xi^*$ is optimal then
$$\frac{c}{\rho(c)} = \int_{\{X(x):x\in[-1,1]\}}\varepsilon(X(x))\,X(x)\,d\xi^*(X(x)), \qquad \varepsilon(X(x)) = \pm1.$$
If $\xi^*$ is the optimal measure to estimate $\langle c,\theta\rangle$, then $M(\xi^*)\in\mathcal{A}(c)$ and $c'(M(\xi^*))^{-}c = (\rho(c))^2$. Furthermore, since $\xi^*$ is optimal it holds
$$(X(x))'NX(x) = 1\ \text{for any } x\in\operatorname{supp}(\xi^*), \qquad M(\xi^*)N = c\,\big(c'(M(\xi^*))^{-}c\big)^{-1}c'N.$$
Now, since $(X(x))'NX(x) = 1$ for any $x\in\operatorname{supp}(\xi^*)$, it holds, using Theorem 26,
$$1 = (X(x))'NX(x) = (X(x))'hh'X(x) = \big((X(x))'h\big)^2 \quad\text{for any } x\in\operatorname{supp}(\xi^*). \qquad (25)$$
From $M(\xi^*)N = c\big(c'(M(\xi^*))^{-}c\big)^{-1}c'N$ we get, multiplying on the right by $\frac{h}{h'h}$,
$$M(\xi^*)N\frac{h}{h'h} = c\,\big(c'(M(\xi^*))^{-}c\big)^{-1}c'N\frac{h}{h'h}.$$
Simplifying (since $N = hh'$, $N\frac{h}{h'h} = h$), we have
$$M(\xi^*)h = c\,\big(c'(M(\xi^*))^{-}c\big)^{-1}c'h.$$
By the optimality of $\xi^*$ it holds $c'(M(\xi^*))^{-}c = (\rho(c))^2$, and, since $c'h = \rho(c)$,
$$M(\xi^*)h = c\,\frac{c'h}{(\rho(c))^2} = c\,\frac{\rho(c)}{(\rho(c))^2} = \frac{c}{\rho(c)}.$$
Denote now $\varepsilon(X(x)) := (X(x))'h$. From (25), $1 = ((X(x))'h)^2$ for any $x\in\operatorname{supp}(\xi^*)$; hence $\varepsilon(X(x)) = \pm1$ for $x\in\operatorname{supp}(\xi^*)$. Write now
$$\sum_{x\in\operatorname{supp}(\xi^*)}\xi^*(x)\,X(x)\,\varepsilon(X(x)) \stackrel{(1)}{=} \sum_{x\in\operatorname{supp}(\xi^*)}\xi^*(x)\,X(x)\,(X(x))'h \stackrel{(2)}{=} M(\xi^*)h \stackrel{(3)}{=} \frac{c}{\rho(c)}.$$
The equality (1) in the above display is obtained substituting $\varepsilon(X(x))$ by $(X(x))'h$. Equality (2) follows from $M(\xi^*) = \sum_x\xi^*(x)X(x)(X(x))'$, and (3) from the fact that $M(\xi^*)h = \frac{c}{\rho(c)}$. Therefore
$$\frac{c}{\rho(c)} = \sum_{x\in\operatorname{supp}(\xi^*)}\xi^*(x)\,X(x)\,\varepsilon(X(x)), \qquad \varepsilon(X(x)) = \pm1 \text{ for } x\in\operatorname{supp}(\xi^*),$$
which completes the proof.

Remark 29 Elfving's Theorem asserts that the vectors in $\mathcal{R}$ to which an optimal measure is associated are necessarily frontier points of the Elfving set. Indeed, clearly $\rho\big(\frac{c}{\rho(c)}\big) = 1$.

In the next section and in the last one, we discuss the results by Kiefer, Wolfowitz and Studden. These authors have characterized optimal designs whose support consists of Chebyshev points. Our starting point is the optimal design which has been described above, through the Elfving Theorem 28.

5 Extension of the Hoel-Levine result: optimal design for a linear c-form

From [3] we know (Borel-Chebyshev Theorem) that any continuous function $f$ defined on a compact set in $\mathbb{R}$ has a uniquely defined best uniform approximation in the class of polynomials with prescribed degree. More generally, given a finite class $\{\varphi_0,\ldots,\varphi_{g-1}\}$ of functions, a necessary and sufficient condition for any $f$ in $C^{(0)}([-1,1])$ to admit a unique best uniform approximation $\varphi^*\in\operatorname{span}\{\varphi_0,\ldots,\varphi_{g-1}\}$ is that $\{\varphi_0,\ldots,\varphi_{g-1}\}$ be a Chebyshev system in $C^{(0)}([-1,1])$; this is Haar's Theorem. Finally, the Borel-Chebyshev equioscillation Theorem asserts that the resulting error of approximation by polynomials with fixed degree less than or equal to $g-1$ attains its common maximal absolute value at $g+1$ points of $[-1,1]$ with alternating signs.

This important characterization of approximating schemes may be generalized through the following Theorem.

Theorem 30 (Karlin and Studden) Let $\{\varphi_0,\ldots,\varphi_{g-1}\}$ be a Chebyshev system in $[-1,1]$. Then there exists a unique element $u(x) := \sum_{j=0}^{g-1}a_j\varphi_j(x)$ in $V := \operatorname{span}\{\varphi_0,\ldots,\varphi_{g-1}\}$ which enjoys the following properties:

1) $|u(x)|\leq 1$ for all $x\in[-1,1]$; (26)

2) there exist $g$ points $\tilde x_0,\ldots,\tilde x_{g-1}$ in $[-1,1]$ such that
$$-1\leq\tilde x_0<\cdots<\tilde x_{g-1}\leq 1 \quad\text{and}\quad u(\tilde x_j) = (-1)^{g-1-j},\ j = 0,\ldots,g-1. \qquad (27)$$

We state and prove the following Theorem, which extends the Hoel-Levine result. The proof is due to Karlin and Studden.

Let
$$\mathcal{C} := \left\{c := (c_0,\ldots,c_{g-1})'\in\mathbb{R}^g\ :\ \det\begin{pmatrix}\varphi_0(x_0) & \cdots & \varphi_0(x_{g-1}) & c_0\\ \varphi_1(x_0) & \cdots & \varphi_1(x_{g-1}) & c_1\\ \vdots & & \vdots & \vdots\\ \varphi_{g-1}(x_0) & \cdots & \varphi_{g-1}(x_{g-1}) & c_{g-1}\end{pmatrix}\neq 0\right\}.$$
For any $c\in\mathcal{C}$, consider the projections $\pi_i$ on the axes, $i = 0,\ldots,g-1$,
$$\pi_i : \mathcal{C}\to\mathbb{R}, \qquad c\mapsto\pi_i(c) := c_i,$$
and let
$$z = \varphi_i^{-1}(\pi_i(c)) = \varphi_i^{-1}(c_i),$$
so that $c = X(z)$. Finally denote
$$d(c,\xi) := \sup_{d}\frac{\langle c, d\rangle^2}{\langle d, M(\xi)d\rangle},$$
which is the variance of $\widehat{\langle c,\theta\rangle}$, and
$$B := \{x\in\mathbb{R} : u^2(x) = 1\}.$$

Theorem 31 (Optimality) Let $\{\varphi_0,\ldots,\varphi_{g-1}\}$ be a Chebyshev system in $C^{(0)}([-1,1])$. Then:

1. There exists a unique function
$$x\mapsto u(x) := \sum_{j=0}^{g-1}a_j\varphi_j(x)\in\operatorname{span}\{\varphi_0,\ldots,\varphi_{g-1}\}$$
such that
$$d(c,\xi)\geq u^2(z), \qquad \forall\,\xi\in M_d([-1,1]). \qquad (28)$$

2. Let $\tilde x_0<\cdots<\tilde x_{g-1}$ be $g$ points in $B$ such that $u(\tilde x_j) = (-1)^{g-1-j}$, $j = 0,\ldots,g-1$. Define $l_{\tilde x_j}$, $j = 0,\ldots,g-1$, the Lagrange polynomials of degree $g-1$ defined on the nodes $\tilde x_j$. Then $d(c,\xi) = u^2(z)$ if and only if $\xi = \xi^*$, where $\xi^*$ is the measure with support $\{\tilde x_j,\ j = 0,\ldots,g-1\}$ and
$$\xi^*(\tilde x_j) := \frac{|l_{\tilde x_j}(z)|}{\sum_{j=0}^{g-1}|l_{\tilde x_j}(z)|}.$$

3. If there exists $\tilde a := (\tilde a_0,\ldots,\tilde a_{g-1})'\in\mathbb{R}^g$ such that the function $x\mapsto U(x) := \sum_{j=0}^{g-1}\tilde a_j\varphi_j(x)$ coincides with the constant function $\mathbf{1}_{[-1,1]} : [-1,1]\to\mathbb{R}$, $x\mapsto\mathbf{1}_{[-1,1]}(x) = 1$, then
$$\#B = g \quad\text{and}\quad \tilde x_0 = -1,\ \tilde x_{g-1} = 1.$$
Furthermore
$$d(c,\xi) = u^2(z) \quad\text{iff}\quad \xi = \xi^*.$$

Remark 32 Statement 1 means that the variance of the estimator of the $c$-form $\langle c,\theta\rangle$ is bounded from below, whatever $\xi$. Statement 2 means that for any vector $c$ there exists an optimal measure $\xi^*$ which provides optimality for the estimate of the $c$-form $\langle c,\theta\rangle$. Statement 3 asserts uniqueness.
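As a numerical sketch of statement 2 (not part of the paper), assuming the monomial system $\varphi_j(x) = x^j$ and an extrapolation vector $c = X(z)$ with $|z| > 1$ (the Hoel-Levine setting): the support is taken as the Chebyshev points of Theorem 30 and the weights are proportional to $|l_{\tilde x_j}(z)|$; the attained value $c'M(\xi^*)^{-1}c$ is then compared with $u^2(z) = T_{g-1}(z)^2 = \big(\sum_j|l_{\tilde x_j}(z)|\big)^2$.

```python
# Minimal sketch (not from the paper): the Hoel-Levine design for extrapolation at z,
# assuming phi_j(x) = x^j; weights xi*(x~_j) proportional to |l_j(z)|.
import numpy as np

g, z = 4, 1.5
nodes = np.cos((g - 1 - np.arange(g)) * np.pi / (g - 1))      # Chebyshev points x~_j

def lagrange_at(z, nodes, j):
    """Elementary Lagrange polynomial l_{x~_j} evaluated at z."""
    others = np.delete(nodes, j)
    return np.prod((z - others) / (nodes[j] - others))

l = np.array([lagrange_at(z, nodes, j) for j in range(g)])
weights = np.abs(l) / np.abs(l).sum()                         # xi*(x~_j)

X = lambda x: np.array([x**j for j in range(g)], dtype=float)
M = sum(w * np.outer(X(x), X(x)) for w, x in zip(weights, nodes))
c = X(z)

value = c @ np.linalg.solve(M, c)                             # c' M(xi*)^{-1} c
u2 = np.cosh((g - 1) * np.arccosh(z)) ** 2                    # u(z)^2 = T_{g-1}(z)^2
print(value, u2, np.abs(l).sum() ** 2)                        # the three values coincide
```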

Proof (Karlin-Studden). Statement 1, namely the existence and uniqueness of $u$, follows from the above Theorem 30.

Statement 2 is proved as follows. There exist $g$ points $\tilde x_0,\ldots,\tilde x_{g-1}$ in $B\subset[-1,1]$ such that $-1\leq\tilde x_0<\cdots<\tilde x_{g-1}\leq 1$ and $u(\tilde x_j) = (-1)^{g-1-j}$, $j = 0,\ldots,g-1$. Since $\{\varphi_0,\ldots,\varphi_{g-1}\}$ is a Chebyshev system in $[-1,1]$, the functions $\varphi_0,\ldots,\varphi_{g-1}$ are linearly independent. The function $u(x) := \sum_{j=0}^{g-1}a_j\varphi_j(x)$ is defined in a unique way by its values at the points $\tilde x_0,\ldots,\tilde x_{g-1}$. Further, there exists a unique polynomial $P_{g-1}$ of degree $g-1$ which assumes the same values as $u$ at $\tilde x_0,\ldots,\tilde x_{g-1}$; therefore $u$ equals its interpolation polynomial of degree $g-1$. Hence the system $u(\tilde x_j) = P_{g-1}(\tilde x_j)$, $j = 0,\ldots,g-1$, has a unique solution in the unknown numbers $a_j$. Considering the basis which consists of the elementary Lagrange polynomials $l_{\tilde x_j}$, $j = 0,\ldots,g-1$, we may write $P_{g-1}$, and henceforth $u$, as follows: $u(x) = \sum_{j=0}^{g-1}l_{\tilde x_j}(x)\,u(\tilde x_j)$.

Consider $x = z$; it holds $u(z) = \sum_{j=0}^{g-1}l_{\tilde x_j}(z)\,u(\tilde x_j)$. Consider now the functions $\varphi_i\in\operatorname{span}\{\varphi_0,\ldots,\varphi_{g-1}\}$, $i = 0,\ldots,g-1$. Using the same basis, it holds
$$\varphi_i(z) = \sum_{j=0}^{g-1}l_{\tilde x_j}(z)\,\varphi_i(\tilde x_j), \qquad i = 0,\ldots,g-1.$$
Since $l_{\tilde x_j}(z) = (-1)^{g-1-j}|l_{\tilde x_j}(z)|$, denoting $\varepsilon_j := (-1)^{g-1-j}$ we get
$$\varphi_i(z) = \sum_{j=0}^{g-1}\varepsilon_j\,|l_{\tilde x_j}(z)|\,\varphi_i(\tilde x_j), \qquad i = 0,\ldots,g-1.$$
Recall that $X(x) := (\varphi_0(x),\ldots,\varphi_{g-1}(x))'$; the $g$ equalities above write as
$$X(z) = \sum_{j=0}^{g-1}\varepsilon_j\,|l_{\tilde x_j}(z)|\,X(\tilde x_j).$$
Denoting
$$\xi_j^* := \frac{|l_{\tilde x_j}(z)|}{\sum_{k=0}^{g-1}|l_{\tilde x_k}(z)|}, \qquad j = 0,\ldots,g-1,$$
we have
$$\frac{X(z)}{\sum_{k=0}^{g-1}|l_{\tilde x_k}(z)|} = \sum_{j=0}^{g-1}\varepsilon_j\,\frac{|l_{\tilde x_j}(z)|}{\sum_{k=0}^{g-1}|l_{\tilde x_k}(z)|}\,X(\tilde x_j) = \sum_{j=0}^{g-1}\varepsilon_j\,\xi_j^*\,X(\tilde x_j).$$
Denote $\gamma := \frac{1}{\sum_{k=0}^{g-1}|l_{\tilde x_k}(z)|}$. We then have
$$\gamma\,X(z) = \sum_{j=0}^{g-1}\varepsilon_j\,\xi_j^*\,X(\tilde x_j).$$
By Elfving's Theorem it follows that if we prove that $\gamma X(z)\in Fr(\mathcal{R})$ then $\xi^*$ is optimal.

We now prove that $\gamma X(z)\in Fr(\mathcal{R})$. This follows from the fact that there exists a tangent hyperplane to $\mathcal{R}$ at $\gamma X(z)$, defined by a vector $a = (a_0,\ldots,a_{g-1})'$ with
$$\langle a, \gamma\,c\rangle = 1 \quad\text{and}\quad \langle y, a\rangle\leq 1 \text{ for any } y\in\mathcal{R},$$
where $c = X(z)$. Indeed, we have
$$u(z) = \sum_{j=0}^{g-1}a_j\varphi_j(z) = \langle a, X(z)\rangle = \langle a, c\rangle = \sum_{j=0}^{g-1}l_{\tilde x_j}(z)\,u(\tilde x_j).$$
By definition $u$ alternates sign at the points $\tilde x_j$; hence
$$u^2(\tilde x_j) = 1 \quad\text{and}\quad u(\tilde x_j)\,l_{\tilde x_j}(z) = |l_{\tilde x_j}(z)|.$$
Now
$$u^2(z) = \Big(\sum_{j=0}^{g-1}a_j\varphi_j(z)\Big)^2 = \Big(\sum_{j=0}^{g-1}l_{\tilde x_j}(z)\,u(\tilde x_j)\Big)^2 = \Big(\sum_{j=0}^{g-1}|l_{\tilde x_j}(z)|\Big)^2 = \frac{1}{\gamma^2},$$
i.e.
$$\gamma^2 = \frac{1}{u^2(z)}.$$
Clearly $u(z) > 0$; indeed $u(z) = \sum_{j=0}^{g-1}|l_{\tilde x_j}(z)|$. Hence $\gamma = \frac{1}{u(z)}$ and therefore
$$\langle a, \gamma c\rangle = \gamma\,u(z) = \frac{u(z)}{u(z)} = 1.$$
By property (26) it holds
$$\langle a, X(x)\rangle = \sum_{j=0}^{g-1}a_j\varphi_j(x)\leq 1 \quad\text{and}\quad \langle a, X(x)\rangle\geq -1 \quad\text{for all } x\in[-1,1];$$
therefore $\langle y, a\rangle\leq 1$ for all $y\in\mathcal{R}$.

Hence the hyperplane defined by the vector $a$ is tangent to $\mathcal{R}$ at $\gamma c$, so that $\gamma X(z)\in Fr(\mathcal{R})$; by Elfving's Theorem $\xi^*$ is optimal.

We finally prove that $d(c,\xi)\geq u^2(z)$. By Elfving's Theorem, $\frac{1}{\gamma^2} = (\rho(c))^2$ is the minimum value of the variance; hence
$$\frac{1}{\gamma^2} = \min_{\xi}d(c,\xi).$$
We have just seen that $u^2(z) = \frac{1}{\gamma^2}$. This proves the claim.

We prove statement 3. It holds $|u(x)|\leq 1$ for $x\in[-1,1]$, i.e. $-1\leq u(x)\leq 1$ for $x\in[-1,1]$, hence
$$0\leq u(x)+1\leq 2 \quad\text{and}\quad 0\leq 1-u(x)\leq 2 \quad\text{for } x\in[-1,1].$$
Thus $1-u(x)$ and $1+u(x)$ are nonnegative functions on $[-1,1]$, so that $|1-u(x)| = 1-u(x)$ and $|1+u(x)| = 1+u(x)$. From $u(\tilde x_{g-1-j}) = (-1)^j$, $j = 0,\ldots,g-1$, it follows that $1-u(x) = 0$ where $u(x) = 1$ and $1+u(x) = 0$ where $u(x) = -1$. Therefore, counting the zeros lying in $(-1,1)$ with multiplicity 2, the functions $1-u(x)$ and $1+u(x)$ have, together, $g$ zeros in $[-1,1]$.

Assume now that there exists a vector of coefficients $\tilde a := (\tilde a_0,\ldots,\tilde a_{g-1})'\in\mathbb{R}^g$ for which the function $U(x) := \sum_{j=0}^{g-1}\tilde a_j\varphi_j(x)$ coincides with the constant function $\mathbf{1}_{[-1,1]} : [-1,1]\to\mathbb{R}$, $x\mapsto\mathbf{1}_{[-1,1]}(x) = 1$. Then the functions
$$\sum_{j=0}^{g-1}\tilde a_j\varphi_j(x) - u(x), \qquad \sum_{j=0}^{g-1}\tilde a_j\varphi_j(x) + u(x)$$
have $g$ zeros in $[-1,1]$; these are the points $\pm1$ and $\tilde x_{g-1-j}$, $j = 0,\ldots,g-3$. Since there exists a unique linear combination of the Chebyshev system which assumes the value 0 at the points $\pm1$, $\tilde x_{g-1-j}$, $j = 0,\ldots,g-3$, it follows that
$$\#B = g.$$
We now prove that
$$d(c,\xi) = u^2(z) \quad\text{iff}\quad \xi = \xi^*,$$
where $\xi^*$ is the measure defined at point 2 of this Theorem. Assume $d(c,\xi) = u^2(z)$ and consider a generic finitely supported probability measure $\xi$ whose support strictly contains the $g$ points of $B$. We prove that the variance associated with $\xi$ is not optimal.
