DOI:10.1051/ps/2009012 www.esaim-ps.org
EXPANSIONS FOR THE DISTRIBUTION OF M-ESTIMATES WITH APPLICATIONS TO THE MULTI-TONE PROBLEM

Christopher S. Withers¹ and Saralees Nadarajah²

Abstract. We give a stochastic expansion for estimates $\hat\theta$ that minimise the arithmetic mean of (typically independent) random functions of a known parameter $\theta$. Examples include least squares estimates, maximum likelihood estimates and, more generally, M-estimates. This is used to obtain leading cumulant coefficients of $\hat\theta$ needed for the Edgeworth expansions for the distribution and density of $n^{1/2}(\hat\theta-\theta_0)$ to magnitude $n^{-3/2}$ (or to $n^{-2}$ for the symmetric case), where $\theta_0$ is the true parameter value and $n$ is typically the sample size. Applications are given to least squares estimates for both real and complex models. An alternative approach is given when the linear parameters of the model are nuisance parameters. The methods are illustrated with the problem of estimating the frequencies when the signal consists of the sum of sinusoids of unknown amplitudes.
Mathematics Subject Classification. 62E17, 62E20.
Received November 13, 2008. Revised March 9, 2009 and June 4, 2009.
1. Introduction and summary
Let $\hat\theta$ denote an estimate of $\theta$ in $\mathbb{R}^p$ based on a random sample of size $n$. There is a large amount of work on expansions for $\hat\theta-\theta_0$, where $\theta_0$ is the true value of $\theta$. However, most of the work to date is for the sample mean and functions of it. For example, Monti [5] obtains an expansion for the sample mean up to the second order by expanding the saddlepoint approximation. Booth et al. [1] give tilted expansions of a sample mean from a distribution on $k$ points. Kakizawa and Taniguchi [4] obtain expansions for $P(\hat\theta < x)$ under the assumption that $\hat\theta$ has a cumulant expansion in powers of $n^{-1}$. Gatto and Ronchetti [3] provide approximations for $P(m(\bar X) < x)$ up to $1 + O(n^{-1})$ for $m(\cdot)$ a smooth function. For a comprehensive review of the known work, we refer the readers to [8].
The aim of this paper is to provide expansions for those $\hat\theta$ that minimise the arithmetic mean of random functions of $\theta$. Maximum likelihood estimates (MLEs), least squares estimates (LSEs), and more generally M-estimates are examples of $\hat\theta$ which minimise a random mean function
$$\Lambda = \Lambda(\theta) = n^{-1}\sum_{N=1}^n \Lambda_N(\theta)$$
Keywords and phrases. Bias, Edgeworth, maximum likelihood, M-estimates, skewness.
1 Applied Mathematics Group, Industrial Research Limited, Lower Hutt, New Zealand.
2 School of Mathematics, University of Manchester, Manchester M13 9PL, UK; mbbsssn2@manchester.ac.uk
Article published by EDP Sciences © EDP Sciences, SMAI 2011
for $\theta$ in $\mathbb{R}^p$. If $E\,\partial\Lambda/\partial\theta = 0$ and $E\,\partial^2\Lambda/\partial\theta\,\partial\theta' > 0$, then $\hat\theta \to_p \theta_0$ as $n\to\infty$. (We use $\mathbb{R}$ and $\mathbb{C}$ to denote the real and complex numbers.)
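To fix ideas, here is a minimal numerical sketch of such an estimate: a Huber M-estimate of location, which minimises the arithmetic mean of random loss functions and is consistent under the stated conditions. The model, tuning constant and contamination scheme below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def huber_location(y, c=1.345, steps=100):
    # M-estimate of location: minimises the mean of the Huber losses
    # Lambda_N(theta) = rho(y_N - theta), via Newton steps on the
    # estimating equation sum_N psi(y_N - theta) = 0, where psi = rho'.
    theta = np.median(y)                  # robust starting value
    for _ in range(steps):
        r = y - theta
        psi = np.clip(r, -c, c)           # psi = rho'
        n_inside = max(int(np.sum(np.abs(r) < c)), 1)   # = -d/dtheta of sum psi
        theta += psi.sum() / n_inside
    return theta

rng = np.random.default_rng(0)
theta0 = 3.0
errs = []
for n in (100, 10000):
    y = rng.normal(theta0, 1.0, n)
    outlier = rng.random(n) < 0.1         # 10% symmetric gross errors
    y[outlier] += rng.normal(0.0, 20.0, outlier.sum())
    errs.append(abs(huber_location(y) - theta0))
# the error at n = 10000 is close to zero despite the contamination
```

Because the contamination is symmetric, the population estimating equation still has its root at $\theta_0$, so $\hat\theta \to_p \theta_0$ even though the data are not normal.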
The contents of this paper are organized as follows. In Section 2 we give a stochastic expansion for $\hat\theta-\theta_0$ of the form
$$\hat\theta-\theta_0 \approx \sum_{a=1}^{\infty} \delta^a,$$
where $\delta^a = O_p(n^{-a/2})$. (The $a$ in $\delta^a$ is a superscript, not a power.)
In Section 3 we use this to obtain the leading coefficients in the expansions for the cumulants of $\hat\theta$:
$$\kappa(\hat\theta_{i_1},\dots,\hat\theta_{i_r}) \approx \sum_{j=r-1}^{\infty} k_j^{i_1\dots i_r}\, n^{-j} \qquad (1.1)$$
for $r\ge1$ with $k_0^{i} = \theta_{0i}$. This implies that $Y_n = n^{1/2}(\hat\theta-\theta_0) \to \mathcal{N}_p(0,V)$ with $V = (k_1^{i_1i_2}) = A^{-1}$ for $A = E\,\partial^2\Lambda/\partial\theta\,\partial\theta'$. The leading bias and skewness coefficients $k_1^{i_1}$ and $k_2^{i_1i_2i_3}$ give the Edgeworth expansion of the distribution (and its derivatives) of $Y_n$ to $O(n^{-1})$, while the coefficients $k_2^{i_1i_2}$ and $k_3^{i_1\dots i_4}$ give the Edgeworth expansion of the distribution of $Y_n$ to $O(n^{-3/2})$ and $P(Y_n\in S)$ to $O(n^{-2})$ for $S = -S \subset \mathbb{R}^p$.
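The limit $Y_n \to \mathcal{N}_p(0,V)$ and the $n^{-1}$ bias term can be seen in a toy example (ours, not the paper's): the MLE of an exponential rate, for which the information identity makes $V = A^{-1}$ exact in the limit.

```python
import numpy as np

rng = np.random.default_rng(1)
theta0, n, reps = 2.0, 200, 40000
# Lambda_N(theta) = theta*X_N - log(theta), minus the log-likelihood of an
# Exponential(rate theta) observation; the minimiser of the mean is the
# MLE theta-hat = 1/Xbar, and A = E d^2 Lambda / d theta^2 = 1/theta0^2.
x = rng.exponential(1.0 / theta0, (reps, n))
theta_hat = 1.0 / x.mean(axis=1)
v = n * theta_hat.var()              # variance of sqrt(n)(theta-hat - theta0)
bias = theta_hat.mean() - theta0     # exact bias here is theta0/(n-1)
# v is close to A^{-1} = theta0^2 = 4, and bias is close to theta0/n
```

The sample variance of $\sqrt{n}(\hat\theta-\theta_0)$ approaches $A^{-1}=\theta_0^2$, while the bias is of the smaller order $n^{-1}$, matching the pattern of (1.1).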
Section 4 applies these results to the LSEs for the general signal plus noise model
$$Y_N = S_N(\theta) + e_N \ \text{ in } \mathbb{R} \text{ or } \mathbb{C}, \quad 1\le N\le n, \qquad (1.2)$$
with $\Lambda_N(\theta) = |Y_N - S_N(\theta)|^2/2$, where the residuals $e_1,\dots,e_n$ are assumed independent with mean zero. While the complex formulation can also be dealt with by the real formulation, there are some significant simplifications in staying with the complex model. The M-estimate with respect to a given convex function $\rho$ on $\mathbb{R}$ or $\mathbb{C}$ for the model (1.2) is the $\hat\theta$ obtained by taking $\Lambda_N(\theta) = \rho(Y_N - S_N(\theta))$. For smooth $\rho$ the leading cumulant coefficients were essentially found by this method in the real case in [7].
Section 5 considers two examples on the signal frequency problem
$$y_k = s(\theta) + n_k \ \text{ in } \mathbb{C}^M, \quad k = 1,\dots,K, \qquad (1.3)$$
where $n_k$ is complex normal with covariance not depending on $\theta$, and the $m$th component of $s(\theta)$ is
$$s_m(\theta) = \sum_{r=1}^R a_r \exp\big(j\phi_r + jw_r(m+m_0)/M\big) \quad \text{for } m = 0,1,\dots,M-1,$$
where $j = \sqrt{-1}$. The $p = 3R$ parameters are $\theta = (a,\phi,w)$. So, (1.3) can be written in the form (1.2) with $n = 2KM$. Changing the centering constant $m_0$ is equivalent to reparameterising $\phi_r$: we shall see that if $R = 1$ then taking $m_0 = -(M-1)/2$ makes the asymptotic covariance of $\hat\theta$ diagonal.
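The multi-tone signal, and the fact that shifting $m_0$ only reparameterises the phases $\phi_r$, can be checked directly; the amplitudes, phases and frequencies below are arbitrary illustrative values.

```python
import numpy as np

def multitone(a, phi, w, M, m0=0.0):
    # s_m = sum_r a_r exp(j(phi_r + w_r (m + m0)/M)),  m = 0,...,M-1
    m = np.arange(M)
    nu = (m[:, None] + m0) / M
    return (a * np.exp(1j * (phi + nu * w))).sum(axis=1)

a   = np.array([1.0, 0.7])     # amplitudes   (illustrative values)
phi = np.array([0.3, -1.2])    # phases
w   = np.array([40.0, 55.0])   # frequencies
M   = 64
d   = -(M - 1) / 2             # the centering choice of m0 discussed above
s1 = multitone(a, phi, w, M, m0=d)
s2 = multitone(a, phi + w * d / M, w, M, m0=0.0)  # same signal, shifted phases
```

Since $w_r(m+m_0)/M = w_r m/M + w_r m_0/M$, the constant $m_0$ is absorbed into $\phi_r$, so `s1` and `s2` coincide.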
In Section 6 we give a variation of the method for the case when the linear parameters of the model are nuisance parameters.
Appendix A provides a list of summation notations used throughout the paper. Some technical details required for the two examples in Section 5 are given in Appendices B and C. The proofs of all theorems are given in Appendix D.
For $x$ a complex matrix we shall use $x'$ to denote its transpose, $\bar x$ its complex conjugate, and $x^*$ the transpose of its complex conjugate.
2. The stochastic expansion
Suppose $\{\Lambda_N(\theta)\}$ are real random functions of $\theta$ in $\mathbb{R}^p$. Here we show that $\hat\theta$ minimising $\Lambda = \Lambda(\theta) = n^{-1}\sum_{N=1}^n \Lambda_N(\theta)$ in $\mathbb{R}^p$ has the stochastic expansion
$$\delta = \hat\theta - \theta_0 = \sum_{a=1}^{\infty} \delta^a \qquad (2.1)$$
with $\delta^a = O_p(n^{-a/2})$. To avoid excessive subscripts we shall fix $1\le i_0,i_1,\dots\le p$ and set
$$\delta^a_j = (\delta^a)_{i_j},\quad \partial_j = \partial/\partial\theta_{i_j},\quad \delta_j = (\delta)_{i_j},\quad \Lambda_{\cdot12\dots} = \partial_1\partial_2\cdots\Lambda(\theta), \qquad (2.2)$$
$$A_{12\dots} = E\,\Lambda_{\cdot12\dots},\quad\text{and}\quad \Delta_{12\dots} = \Lambda_{\cdot12\dots} - A_{12\dots}. \qquad (2.3)$$
Theorem 2.1 gives the first three $\delta^a$ explicitly in terms of these $\Delta$'s and $A$'s. By (2.1) this gives $\hat\theta$ explicitly in terms of $\Delta$'s and $A$'s to $O_p(n^{-2})$.
For $\hat\theta$ to be a consistent estimate we need to assume that
$$A_1 = E\,\Lambda_{\cdot1} = 0, \qquad (2.4)$$
$$\Delta_{1\dots r} = O_p(n^{-1/2}) \quad\text{as } n\to\infty. \qquad (2.5)$$
Typically the model contains a location parameter, and the constraint (2.4) effectively specifies how it is defined, as well as identifying the other parameters of the model. The constraint (2.5) generally follows by the Central Limit Theorem if the $\{\Lambda_N(\theta)\}$ are independent or weakly dependent.
Theorem 2.1. Suppose $\hat\theta$ is the estimate as defined above satisfying (2.4) and (2.5). Suppose also that the eigenvalues of the $p\times p$ matrix $A = (A_{12} : 1\le i_1,i_2\le p)$ are bounded away from zero as $n\to\infty$, so $A$ has bounded inverse $A^{-1} = (A^{12})$ as $n\to\infty$. Then (2.1) holds. Furthermore,
$$\delta^1_0 = -A^{01}\Delta_1, \qquad (2.6)$$
$$\delta^2_0 = A^{01}\Delta_{12}A^{23}\Delta_3 - B^{045}\Delta_4\Delta_5, \qquad (2.7)$$
$$\begin{aligned}
\delta^3_0 &= -A^{01}\Big(\Delta_{12}\delta^2_2 - 2^{-1}A_{123}\sum^2_{23}A^{24}\Delta_4\,\delta^2_3 + 2^{-1}\Delta_{123}A^{24}\Delta_4A^{35}\Delta_5 - C^{1567}\Delta_5\Delta_6\Delta_7\Big)\\
&= -A^{01}\Big(\Delta_{12}\big(A^{23}\Delta_{34}A^{45}\Delta_5 - B^{234}\Delta_3\Delta_4\big)
- 2^{-1}A_{123}\sum^2_{23}A^{24}\Delta_4\big(A^{35}\Delta_{56}A^{67}\Delta_7 - B^{356}\Delta_5\Delta_6\big)\\
&\qquad\quad + 2^{-1}A^{24}A^{35}\Delta_{123}\Delta_4\Delta_5 - C^{1234}\Delta_2\Delta_3\Delta_4\Big), \qquad (2.8)
\end{aligned}$$
and so on, where $B^{123} = A^{14}A^{25}A^{36}A_{456}/2$ and $C^{1567} = A_{1234}A^{25}A^{36}A^{47}/6$.
Note that $B^{045}$, $B^{234}$, $B^{356}$ and $C^{1234}$ in (2.6)–(2.8) follow from the definitions given for $B^{123}$ and $C^{1567}$ by relabelling indices; for example, $B^{045} = A^{06}A^{47}A^{58}A_{678}/2$, $B^{234} = A^{25}A^{36}A^{47}A_{567}/2$ and $C^{1234} = A_{1567}A^{25}A^{36}A^{47}/6$. The first three $\delta^a$ given will be sufficient to obtain the cumulant coefficients needed.
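The leading term (2.6), which in the scalar case reads $\delta^1 = -\Delta_1/A$, can be checked numerically on a toy one-parameter least squares model; the model and all numerical values here are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n, theta0, sigma = 500, 2.0, 0.05
x = np.linspace(0.1, 1.0, n)
S  = lambda t: np.sin(t * x)          # toy signal S_N(theta) = sin(theta x_N)
S1 = lambda t: x * np.cos(t * x)      # dS/dtheta
e = rng.normal(0.0, sigma, n)
y = S(theta0) + e

t = theta0                             # Newton iterations for the LSE
for _ in range(50):
    r = y - S(t)
    grad = -np.mean(r * S1(t))                           # Lambda'(t)
    hess = np.mean(S1(t)**2 + r * x**2 * np.sin(t * x))  # Lambda''(t)
    t -= grad / hess

A      = np.mean(S1(theta0)**2)      # A = E Lambda''(theta0)
Delta1 = -np.mean(e * S1(theta0))    # Delta_1 = Lambda'(theta0), mean zero
delta1 = -Delta1 / A                 # scalar case of (2.6)
remainder = (t - theta0) - delta1    # should be O_p(1/n), not O_p(n^{-1/2})
```

The remainder $\hat\theta-\theta_0-\delta^1$ is an order of magnitude smaller than $\delta^1$ itself, as the expansion predicts.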
3. The leading cumulant coefficients
In this section we give the cumulant coefficients of (1.1) needed for the distribution of $n^{1/2}(\hat\theta-\theta_0)$ to $O(n^{-3/2})$, namely $k_1^{12}$, $k_1^0$, $k_2^{123}$, $k_2^{12}$, $k_3^{1234}$, where we extend the notation of (D.1), (2.3) by setting $k_j^{1\dots r} = k_j^{i_1\dots i_r}$. We now assume $\Lambda_1(\theta),\dots,\Lambda_n(\theta)$ independent. For $\pi$ a sequence of integers in $\{1,\dots,p\}$, using the dot notation of (2.2) for partial derivatives,
$$\Lambda_{\cdot\pi} = n^{-1}\sum_{N=1}^n \Lambda_{N\cdot\pi}.$$
So, for $\pi_1,\pi_2,\dots$ such sequences, the joint cumulants of the $\Lambda_{\cdot\pi}$ are given by
$$\kappa(\Lambda_{\cdot\pi_1},\dots,\Lambda_{\cdot\pi_r}) = n^{1-r}[\pi_1,\dots,\pi_r], \qquad (3.1)$$
where
$$[\pi_1,\dots,\pi_r] = n^{-1}\sum_{N=1}^n \kappa(\Lambda_{N\cdot\pi_1},\dots,\Lambda_{N\cdot\pi_r}). \qquad (3.2)$$
For example, $[1\dots r] = E\,\Lambda_{\cdot1\dots r} = A_{1\dots r}$. We shall give the leading cumulant coefficients we need in terms of these $[\cdot]$ functions.
Set $\delta_{rs} = 1$ for $r = s$ and $0$ otherwise, and $a^{b_1\dots b_r}_{1\dots r} = \kappa(\delta^{b_1}_1,\dots,\delta^{b_r}_r)$. This has an expansion of the form
$$a^{b_1\dots b_r}_{1\dots r} = \sum_{j\ge(b_1+\dots+b_r)/2} a^{b_1\dots b_r}_{1\dots r\cdot j}\, n^{-j}.$$
Also, the left hand side of (1.1) is equal to $\theta_{0i_1}\delta_{r1} + \kappa(\delta_1,\dots,\delta_r)$. Substituting (2.1) into this gives
$$k_j^{1\dots r} = \theta_{0i_1}\delta_{r1}\delta_{j0} + \sum_{b_1+\dots+b_r\le 2j} a^{b_1\dots b_r}_{1\dots r\cdot j}, \qquad (3.3)$$
where $b_1,\dots,b_r$ are positive integers.
Theorem 3.1. The coefficients $k_1^{12}$, $k_1^0$, $k_2^{123}$, $k_2^{12}$ and $k_3^{1234}$ of (3.3) can be expressed as
$$k_1^{12} = a^{11}_{12\cdot1},\qquad k_1^0 = a^{2}_{0\cdot1},\qquad k_2^{123} = a^{111}_{123\cdot2} + \sum^3_{112} a^{112}_{123\cdot2}, \qquad (3.4)$$
$$k_2^{12} = \sum^2_{12} a^{12}_{12\cdot2} + \sum^2_{13} a^{13}_{12\cdot2} + a^{22}_{12\cdot2} \qquad (3.5)$$
and
$$k_3^{1234} = a^{1111}_{1234\cdot3} + \sum^4_{1112} a^{1112}_{1234\cdot3} + \sum^4_{1113} a^{1113}_{1234\cdot3} + \sum^6_{1122} a^{1122}_{1234\cdot3}, \qquad (3.6)$$
where
$$k_1^{12} = a^{11}_{12\cdot1} = A^{13}A^{24}[3,4], \qquad (3.7)$$
$$k_1^0 = a^{2}_{0\cdot1} = A^{01}A^{23}[12,3] - B^{045}[4,5], \qquad (3.8)$$
$$a^{111}_{123\cdot2} = -A^{14}A^{25}A^{36}[4,5,6], \qquad (3.9)$$
$$a^{112}_{123\cdot2} = A^{14}A^{25}\Big(A^{36}A^{78}\sum^2_{45}[4,67][5,8] - B^{367}\sum^2_{45}[4,6][5,7]\Big), \qquad (3.10)$$
$$a^{12}_{12\cdot2} = -A^{13}A^{24}A^{56}[3,45,6] + A^{13}B^{245}[3,4,5], \qquad (3.11)$$
$$\begin{aligned}
a^{13}_{12\cdot2} = A^{13}A^{24}\Big\{ & A^{56}A^{78}\sum^3[3,45][67,8] - B^{567}\sum^3[3,45][6,7]\\
&- 2^{-1}A_{456}\sum^2_{56}A^{57}\Big(A^{68}A^{9\,10}\sum^3[3,7][89,10] - B^{689}\sum^3[3,7][8,9]\Big)\\
&+ 2^{-1}A^{57}A^{68}\sum^3[3,456][7,8] - C^{4567}\sum^3[3,5][6,7]\Big\}, \qquad (3.12)
\end{aligned}$$
$$a^{22}_{12\cdot2} = A^{13}A^{45}A^{26}A^{78}\sum^2_{34,5}[5,8][34,67] - \sum^2_{12}A^{13}A^{45}B^{267}\sum^2_{6,7}[5,6][7,34] + B^{134}B^{267}\sum^2_{34}[3,6][4,7], \qquad (3.13)$$
$$a^{1111}_{1234\cdot3} = A^{15}A^{26}A^{37}A^{48}[5,6,7,8], \qquad (3.14)$$
$$a^{1112}_{1234\cdot3} = A^{15}A^{26}A^{37}\Big(-A^{48}A^{9\,10}\sum^6[89,10][5,6,7] + B^{489}\sum^6[8,9][5,6,7]\Big), \qquad (3.15)$$
$$\begin{aligned}
a^{1113}_{1234\cdot3} = A^{15}A^{26}A^{37}A^{48}\Big\{ & A^{9\,10}A^{11\,12}\sum^{15}[5,6][7,89][10\,11,12] - B^{9\,10\,11}\sum^{15}[5,6][7,89][10,11]\\
&- 2^{-1}A_{8\,9\,10}\sum^2_{9\,10}A^{9\,11}\Big(A^{10\,12}A^{13\,14}\sum^{15}[5,6][7,11][12\,13,14] - B^{10\,12\,13}\sum^{15}[5,6][7,11][12,13]\Big)\\
&+ 2^{-1}A^{9\,11}A^{10\,12}\sum^{15}[5,6][7,8\,9\,10][11,12] - C^{8\,9\,10\,11}\sum^{15}[5,6][7,9][10,11]\Big\} \qquad (3.16)
\end{aligned}$$
and
$$\begin{aligned}
a^{1122}_{1234\cdot3} = A^{15}A^{26}\Big\{ & A^{37}A^{89}\Big(A^{4\,10}A^{11\,12}\sum^{15}[5,6][78,9][10\,11,12] - B^{4\,10\,11}\sum^{15}[5,6][78,9][10,11]\Big)\\
&- B^{378}\Big(A^{4\,10}A^{11\,12}\sum^{15}[5,6][7,8][10\,11,12] - B^{4\,10\,11}\sum^{15}[5,6][7,8][10,11]\Big)\Big\}. \qquad (3.17)
\end{aligned}$$
(Indices of ten or more within one symbol are separated by thin spaces; for example $A^{9\,10}$ has the two superscripts $i_9,i_{10}$, and $[10\,11,12] = \kappa(\Lambda_{\cdot10\,11},\Lambda_{\cdot12})$.)
4. Least squares estimates
Here we apply the previous section to LSEs for both real and complex models. We begin with the real model. Suppose we observe
$$Y = S(\theta) + e \ \text{ in } \mathbb{R}^n, \quad\text{that is}\quad Y_N = S_N(\theta) + e_N \ \text{ in } \mathbb{R} \ \text{ for } 1\le N\le n, \qquad (4.1)$$
with $e_1,\dots,e_n$ independent and identically distributed (i.i.d.) with mean zero. Denote their $r$th cumulant by $\lambda_r = \kappa_r(e_1)$. The LSE is $\hat\theta$ minimising
$$\Lambda = n^{-1}|Y - S(\theta)|^2/2 = n^{-1}\sum_{N=1}^n \Lambda_N(\theta), \qquad (4.2)$$
where $\Lambda_N(\theta) = |Y_N - S_N(\theta)|^2/2$. So,
$$\Lambda_{N\cdot1\dots r} = \partial_1\cdots\partial_r S_N^2/2 - Y_N S_{N\cdot1\dots r}, \qquad
A_{1\dots r} = [1\dots r] = n^{-1}\sum_{N=1}^n \big\{\partial_1\cdots\partial_r S_N^2/2 - S_N S_{N\cdot1\dots r}\big\}.$$
For $\pi_1,\pi_2,\dots$ sequences of integers in $1\dots p$, set
$$\langle\pi_1,\pi_2,\dots\rangle = n^{-1}\sum_{N=1}^n S_{N\cdot\pi_1}S_{N\cdot\pi_2}\cdots \qquad (4.3)$$
Theorem 4.1 uses the previous section to express the cumulant coefficients we need in terms of these functions. Theorem 4.2 notes what form these take for the special case when $A$ is diagonal, having in mind the example of the next section for the case $R = 1$.
Theorem 4.1. For the model given by (4.1), the cumulant coefficients of Theorem 3.1 are
$$k_1^{12} = \lambda_2 A^{12}, \qquad (4.4)$$
$$k_1^0 = 2^{-1}\lambda_2 A^{01}A^{23}\big\{-\langle1,23\rangle - \langle2,13\rangle + \langle3,12\rangle\big\}, \qquad (4.5)$$
$$k_2^{123} = A^{14}A^{25}A^{36}\Big(\lambda_3\langle4,5,6\rangle + \lambda_2^2\sum^3_{456}\langle4,56\rangle\Big) - 6\lambda_2^2 B^{123}, \qquad (4.6)$$
$$a^{12}_{12\cdot2} = \lambda_3 A^{13}\Big(A^{24}A^{56}\langle3,45,6\rangle - B^{245}\langle3,4,5\rangle\Big), \qquad (4.7)$$
$$\begin{aligned}
a^{13}_{12\cdot2}/\lambda_2^2 = {}& A^{13}A^{24}A^{56}\Big(A^{78}\sum^2_{38}\langle3,45\rangle\langle8,67\rangle + \langle34,56\rangle\Big)\\
&- 2^{-1}A^{24}\langle3,45\rangle A^{56}A_{678}\big(A^{13}A^{78} + 2A^{18}A^{37}\big)\\
&+ A^{13}A^{24}\Big(-2^{-1}\langle3,45\rangle B^5 - \langle6,45\rangle B_3^{56} + 2A_{346}B^6 + 2^{-1}A_{456}A_{378}A^{57}A^{68}\\
&\qquad\qquad + 2^{-1}\langle3,456\rangle A^{56} + \langle6,345\rangle A^{56} - 2^{-1}A_{3456}A^{56}\Big), \qquad (4.8)
\end{aligned}$$
$$a^{22}_{12\cdot2}/\lambda_2^2 = A^{13}A^{26}\big(A^{47}\langle34,67\rangle + A^{45}A^{78}\langle34,8\rangle\langle5,67\rangle\big) - 2\sum^2_{12}A^{13}B^{246}\langle6,34\rangle + 2B_6^{13}B_2^{63}, \qquad (4.9)$$
$$a^{1111}_{1234\cdot3} = \lambda_4 A^{15}A^{26}A^{37}A^{48}\langle5,6,7,8\rangle, \qquad (4.10)$$
$$\begin{aligned}
a^{1112}_{1234\cdot3} = \lambda_2\lambda_3\Big\{&-A^{15}A^{26}A^{37}A^{48}A^{9\,10}\sum^3_{567}\langle5,89\rangle\langle10,6,7\rangle\\
&- A^{48}\big[A^{26}A^{37}A^{19}\langle89,6,7\rangle + A^{15}A^{37}A^{29}\langle89,5,7\rangle + A^{15}A^{26}A^{39}\langle89,5,6\rangle\big]\\
&+ A^{15}A^{26}A^{37}A^{48}A^{9\,10}\sum^3_{567}A_{5\,8\,10}\langle6,7,9\rangle\Big\}, \qquad (4.11)
\end{aligned}$$
$$\begin{aligned}
a^{1113}_{1234\cdot3} = \lambda_2^3\, A^{15}A^{26}A^{37}A^{48}\Big\{& A^{9\,10}A^{11\,12}\sum^{15}\langle5,6\rangle\langle7,89\rangle\langle12,10\,11\rangle - B^{9\,10\,11}\sum^{15}\langle5,6\rangle\langle7,89\rangle\langle10,11\rangle\\
&- 2^{-1}A_{8\,9\,10}\sum^2_{9\,10}A^{9\,11}\Big(A^{10\,12}A^{13\,14}\sum^{15}\langle5,6\rangle\langle7,11\rangle\langle12\,13,14\rangle - B^{10\,12\,13}\sum^{15}\langle5,6\rangle\langle7,11\rangle\langle12,13\rangle\Big)\\
&+ 2^{-1}A^{9\,11}A^{10\,12}\sum^{15}\langle5,6\rangle\langle7,8\,9\,10\rangle\langle11,12\rangle - C^{8\,9\,10\,11}\sum^{15}\langle5,6\rangle\langle7,9\rangle\langle10,11\rangle\Big\} \qquad (4.12)
\end{aligned}$$
and
$$\begin{aligned}
a^{1122}_{1234\cdot3} = \lambda_2^3\, A^{15}A^{26}\Big\{& A^{37}A^{89}\Big(A^{4\,10}A^{11\,12}\sum^{15}\langle5,6\rangle\langle78,9\rangle\langle10\,11,12\rangle - B^{4\,10\,11}\sum^{15}\langle5,6\rangle\langle78,9\rangle\langle10,11\rangle\Big)\\
&- B^{378}\Big(A^{4\,10}A^{11\,12}\sum^{15}\langle5,6\rangle\langle7,8\rangle\langle10\,11,12\rangle - B^{4\,10\,11}\sum^{15}\langle5,6\rangle\langle7,8\rangle\langle10,11\rangle\Big)\Big\}, \qquad (4.13)
\end{aligned}$$
where $B^1 = B^{123}A_{23} = 2^{-1}A^{12}A_{234}A^{34}$ and $B_1^{23} = B^{234}A_{14} = 2^{-1}A_{145}A^{24}A^{35}$.
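The leading coefficient (4.4) says that the covariance of $\sqrt{n}(\hat\theta-\theta_0)$ is approximately $\lambda_2 A^{-1}$. For a linear model the LSE has a closed form, so this is easy to check by simulation; the design and parameter values below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps, lam2 = 60, 20000, 0.25                  # lam2 = Var(e_N)
X = np.column_stack([np.ones(n), np.linspace(0.0, 1.0, n)])  # S_N = x_N' theta
theta0 = np.array([1.0, -2.0])
A = X.T @ X / n                                  # A_12 = <1,2> for this model
E = rng.normal(0.0, np.sqrt(lam2), (reps, n))
Y = X @ theta0 + E
theta_hat = np.linalg.solve(X.T @ X, X.T @ Y.T).T   # closed-form LSE per rep
V = n * np.cov(theta_hat.T)                      # cov of sqrt(n)(that - theta0)
V_theory = lam2 * np.linalg.inv(A)               # the claim of (4.4)
```

For a linear model the second derivatives $S_{N\cdot12}$ vanish, so brackets such as $\langle1,23\rangle$ are zero and the bias term (4.5) vanishes, consistent with the exact unbiasedness of the closed-form LSE.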
Theorem 4.2. The corresponding expressions of Theorem 4.1 for $A$ diagonal are
$$k_1^0 = -2^{-1}\lambda_2 A^{00}A^{22}\langle0,22\rangle, \qquad (4.14)$$
$$k_2^{123} = A^{11}A^{22}A^{33}\Big(\lambda_3\langle1,2,3\rangle + \lambda_2^2\Big[\sum^3_{123}\langle1,23\rangle - 3A_{123}\Big]\Big), \qquad (4.15)$$
$$a^{12}_{12\cdot2} = \lambda_3 A^{11}\big(A^{22}A^{55}\langle1,25,5\rangle - B^{245}\langle1,4,5\rangle\big), \qquad (4.16)$$
$$\begin{aligned}
a^{13}_{12\cdot2}/\lambda_2^2 = {}& A^{11}A^{22}A^{55}\big[A^{77}\big(\langle1,25\rangle\langle7,57\rangle + \langle1,57\rangle\langle7,25\rangle\big) + \langle12,55\rangle\big]\\
&- 2^{-1}A^{11}A^{22}A^{55}\big(\langle1,25\rangle A_{577}A^{77} + 2\langle3,25\rangle A_{531}A^{33}\big)\\
&+ A^{11}A^{22}\big[-2^{-1}\langle1,25\rangle B^5 - \langle6,25\rangle B_1^{56} + 2A_{126}B^6 + 2^{-1}A_{156}A_{256}A^{55}A^{66}\\
&\qquad\qquad + 2^{-1}\langle1,255\rangle A^{55} + \langle5,125\rangle A^{55} - 2^{-1}A_{1255}A^{55}\big], \qquad (4.17)
\end{aligned}$$
$$a^{22}_{12\cdot2}/\lambda_2^2 = A^{11}A^{22}A^{44}\big(\langle14,24\rangle + A^{77}\langle7,14\rangle\langle4,27\rangle\big) - 2\sum^2_{12}A^{11}B^{246}\langle6,14\rangle + 2B_6^{13}B_2^{63}, \qquad (4.18)$$
$$a^{1111}_{1234\cdot3} = \lambda_4 A^{11}A^{22}A^{33}A^{44}\langle1,2,3,4\rangle, \qquad (4.19)$$
$$a^{1112}_{1234\cdot3}/(\lambda_2\lambda_3) = A^{11}\cdots A^{44}\Big(-A^{55}\sum^3_{123}\langle1,45\rangle\langle2,3,5\rangle - \sum^3_{123}\langle1,2,34\rangle + A^{55}\sum^3_{123}A_{145}\langle2,3,5\rangle\Big), \qquad (4.20)$$
$$\begin{aligned}
a^{1113}_{1234\cdot3}/\lambda_2^3 = A^{11}A^{22}A^{33}A^{44}\Big\{& A^{99}A^{11\,11}\sum^{15}\langle1,2\rangle\langle3,49\rangle\langle11,9\,11\rangle - B^{9\,10\,11}\sum^{15}\langle1,2\rangle\langle3,49\rangle\langle10,11\rangle\\
&- 2^{-1}A_{4\,9\,10}\sum^2_{9\,10}A^{99}\Big(A^{10\,10}A^{13\,13}\sum^{15}\langle1,2\rangle\langle3,9\rangle\langle10\,13,13\rangle - B^{10\,12\,13}\sum^{15}\langle1,2\rangle\langle3,9\rangle\langle12,13\rangle\Big)\\
&+ 2^{-1}A^{99}A^{10\,10}\sum^{15}\langle1,2\rangle\langle3,4\,9\,10\rangle\langle9,10\rangle - C^{4\,9\,10\,11}\sum^{15}\langle1,2\rangle\langle3,9\rangle\langle10,11\rangle\Big\} \qquad (4.21)
\end{aligned}$$
and
$$\begin{aligned}
a^{1122}_{1234\cdot3}/\lambda_2^3 = A^{11}A^{22}\Big\{& A^{33}A^{88}\Big(A^{44}A^{11\,11}\sum^{15}\langle1,2\rangle\langle38,8\rangle\langle4\,11,11\rangle - B^{4\,10\,11}\sum^{15}\langle1,2\rangle\langle38,8\rangle\langle10,11\rangle\Big)\\
&- B^{378}\Big(A^{44}A^{11\,11}\sum^{15}\langle1,2\rangle\langle7,8\rangle\langle4\,11,11\rangle - B^{4\,10\,11}\sum^{15}\langle1,2\rangle\langle7,8\rangle\langle10,11\rangle\Big)\Big\}, \qquad (4.22)
\end{aligned}$$
where
$$B^{123} = A^{11}A^{22}A^{33}A_{123}/2, \qquad C^{1234} = A_{1234}A^{22}A^{33}A^{44}/6, \qquad B^1 = 2^{-1}A^{11}A^{22}A_{122}, \qquad B_1^{23} = 2^{-1}A_{123}A^{22}A^{33}. \qquad (4.23)$$
Note that the implicit summations are over the $i$'s not on the left hand side. For example, in (4.14) the summation is over $i_2$, not $i_0$.
Note that $B^5$, $B^6$, $B_3^{56}$, $B_6^{13}$ and $B_2^{63}$ in Theorem 4.1 follow from the definitions given for $B^1$ and $B_1^{23}$. For example, $B^5 = B^{523}A_{23}$, $B^6 = B^{623}A_{23}$, $B_3^{56} = B^{564}A_{34}$ and $B_6^{13} = B^{134}A_{64}$. Similar comments apply to Theorem 4.2.
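As a sanity check on the bias coefficient, in the scalar case ($p=1$) (4.14) reduces to $\text{bias} \approx k_1^0/n = -\lambda_2\langle1,11\rangle/(2nA^2)$ with $A = \langle1,1\rangle$. The toy model $S_N(\theta) = \theta^2 x_N$ below is our own illustrative choice (not from the paper), picked because its LSE has the closed form $\hat\theta = (\sum x_N Y_N/\sum x_N^2)^{1/2}$.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps, sigma, theta0 = 20, 200000, 0.5, 1.0
x = np.linspace(0.5, 1.5, n)
S1 = 2.0 * theta0 * x        # dS/dtheta   for S_N(theta) = theta^2 x_N
S2 = 2.0 * x                 # d2S/dtheta2
A = np.mean(S1 * S1)         # A = <1,1>
# scalar case of the bias coefficient:  k_1^0/n = -lambda_2 <1,11> / (2 n A^2)
b_theory = -0.5 * sigma**2 * np.mean(S1 * S2) / (n * A**2)

E = rng.normal(0.0, sigma, (reps, n))
beta_hat = (x * (theta0**2 * x + E)).sum(axis=1) / (x * x).sum()
theta_hat = np.sqrt(beta_hat)          # closed-form LSE of this toy model
b_mc = theta_hat.mean() - theta0       # Monte Carlo bias
```

The Monte Carlo bias agrees with the theoretical value $-\sigma^2\sum S'S''/(2(\sum S'^2)^2)$, which for this model coincides with the delta-method bias of $\sqrt{\hat\beta}$.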
Now suppose we replace the real model (4.1) by the complex model
$$Y = S(\theta) + e \ \text{ in } \mathbb{C}^n, \quad\text{that is}\quad Y_N = S_N(\theta) + e_N \ \text{ in } \mathbb{C} \ \text{ for } 1\le N\le n, \qquad (4.24)$$
with $e_N = e_{N1} + je_{N2}$ for $j = \sqrt{-1}$ and $\{e_{N1}, e_{N2}\}$ independent and identically distributed with mean zero and cumulants $\{\lambda_r\}$. The LSE is again given by (4.2). This can be put in the framework of (4.1) with $n$ replaced by $2n$ (so that $\Lambda$ and $A_{1\dots r}$ are half what they are for the complex version with $e_{N1} = e_N$ of (4.1)), but it is simpler to adapt the preceding as follows:
$$\Lambda_{N\cdot1\dots r} = \partial_1\cdots\partial_r \bar S_N S_N/2 - \bar Y_N S_{N\cdot1\dots r}/2 - Y_N \bar S_{N\cdot1\dots r}/2,$$
$$A_{1\dots r} = [1\dots r] = n^{-1}\sum_{N=1}^n \big\{\partial_1\cdots\partial_r |S_N|^2/2 - \mathrm{Re}\,\bar S_N S_{N\cdot1\dots r}\big\} = n^{-1}\big(\partial_1\cdots\partial_r |S|^2/2 - \mathrm{Re}\, S^*S_{\cdot1\dots r}\big).$$
(Recall that $\bar Y_N$ is the complex conjugate of $Y_N$, and $S^* = \bar S'$.) Let us extend the notation of (4.3) by writing
$$\langle\bar\pi_1,\pi_2,\pi_3\rangle = n^{-1}\sum_{N=1}^n \bar S_{N\cdot\pi_1} S_{N\cdot\pi_2} S_{N\cdot\pi_3}$$
and so on for $\pi_1,\pi_2,\dots$ sequences of integers in $1\dots p$. One obtains
$$n\Lambda_{\cdot1} = -\mathrm{Re}\, e^*S_{\cdot1}, \qquad n\Lambda_{\cdot12} = \mathrm{Re}\big(S_{\cdot1}^*S_{\cdot2} - e^*S_{\cdot12}\big),$$
$$A_{12} = \mathrm{Re}\, B_{12} = (B_{12}+\bar B_{12})/2 \ \text{ for } B_{12} = S_{\cdot1}^*S_{\cdot2}/n = \langle\bar1,2\rangle, \qquad (4.25)$$
$$\Lambda_{\cdot1\dots r} = \mathrm{Re}\big(B_{1\dots r} - e^*S_{\cdot1\dots r}/n\big), \qquad A_{1\dots r} = \mathrm{Re}\, B_{1\dots r},$$
where
$$B_{123} = \sum^3_{123} S_{\cdot1}^*S_{\cdot23}/n = \sum^3_{123}\langle\bar1,23\rangle,$$
$$B_{1234} = \sum^4_{1234} S_{\cdot1}^*S_{\cdot234}/n + \sum^3_{123} S_{\cdot12}^*S_{\cdot34}/n = \sum^4_{1234}\langle\bar1,234\rangle + \sum^3_{123}\langle\overline{12},34\rangle.$$
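The remark above, that $A_{1\dots r}$ for the real formulation with $2n$ observations is half its complex-model value, can be checked at $r=2$, where (4.25) gives $A_{12} = \mathrm{Re}\,\langle\bar1,2\rangle$. The derivative vectors below are random hypothetical stand-ins for the $S_{\cdot r}$ of some complex model.

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 50, 3
# hypothetical first-derivative vectors S_{.r}, r = 1..p, of a complex model
Sd = rng.normal(size=(p, n)) + 1j * rng.normal(size=(p, n))

A_complex = (np.conj(Sd) @ Sd.T).real / n            # A_rs = Re<rbar,s>, (4.25)
stacked = np.concatenate([Sd.real, Sd.imag], axis=1) # real formulation, 2n pts
A_real = stacked @ stacked.T / (2 * n)               # A of the stacked model
# A_real equals A_complex / 2, since Re(conj(a) b) = a1 b1 + a2 b2
```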
Note that $B_{1\dots r}$ can be written down using the form for the partial Bell polynomial $B_{r2}$ on page 307 of [2]; his $B_{22} = x_1^2$, his $B_{32} = 3x_1x_2$, his $B_{42} = 4x_1x_3 + 3x_2^2$, his $B_{52} = 5x_1x_4 + 10x_2x_3$, so that
$$B_{12345} = \sum^5 S_{\cdot1}^*S_{\cdot2345}/n + \sum^{10} S_{\cdot12}^*S_{\cdot345}/n = \sum^5\langle\bar1,2345\rangle + \sum^{10}\langle\overline{12},345\rangle,$$
and so on. We now give the complex form of (D.5). Set $\gamma_i = S_{\cdot\pi_i}$ and $T_i = e^*\gamma_i$. If $r > 1$ then
$$(-2)^r[\pi_1,\dots,\pi_r] = 2^r n^{-1}\kappa(\mathrm{Re}\,T_1,\dots,\mathrm{Re}\,T_r) = n^{-1}\kappa(T_1+\bar T_1,\dots,T_r+\bar T_r)
= \sum_{i=0}^r k(1^i\bar1^{r-i})\sum^{\binom{r}{i}}\langle\bar\pi_1,\dots,\bar\pi_i,\pi_{i+1},\dots,\pi_r\rangle, \qquad (4.26)$$
where $k(1^i\bar1^j) = \kappa(e_1,\dots,e_1,\bar e_1,\dots,\bar e_1)$, counting $e_1$ $i$ times and $\bar e_1$ $j$ times. Also the inner summation is over all such $(\pi_1,\dots,\pi_i)$ giving different terms. These joint cumulants can be written in terms of the real cumulants $\{\lambda_r\}$: $k(1^i\bar1^{r-i}) = [1+(-1)^{r-i}j^r]\lambda_r$, so that $k(1^2) = 0$, $k(1\bar1) = 2\lambda_2$. For example,
$$4[\pi_1,\pi_2] = n^{-1}\kappa(T_1+\bar T_1,\,T_2+\bar T_2) = 2\lambda_2\sum^2\langle\bar\pi_1,\pi_2\rangle = 4\lambda_2\,\mathrm{Re}\,\langle\bar\pi_1,\pi_2\rangle,$$
$$-8[\pi_1,\pi_2,\pi_3] = 2\lambda_3\,\mathrm{Re}\Big[(1+j)\Big(\langle\pi_1,\pi_2,\pi_3\rangle + \sum^3\langle\pi_1,\bar\pi_2,\bar\pi_3\rangle\Big)\Big],$$
$$16[\pi_1,\pi_2,\pi_3,\pi_4] = 4\lambda_4\,\mathrm{Re}\Big[\langle\pi_1,\pi_2,\pi_3,\pi_4\rangle + \sum^3_{234}\langle\bar\pi_1,\bar\pi_2,\pi_3,\pi_4\rangle\Big].$$
So, by (D.5) the cumulant coefficients for the complex case are obtained from those for the real case by replacing
$$\langle\pi_1,\pi_2\rangle \ \text{ by } \ \mathrm{Re}\,\langle\bar\pi_1,\pi_2\rangle,$$
$$\langle\pi_1,\pi_2,\pi_3\rangle \ \text{ by } \ \mathrm{Re}\Big[(1+j)\Big(\langle\pi_1,\pi_2,\pi_3\rangle + \sum^3\langle\pi_1,\bar\pi_2,\bar\pi_3\rangle\Big)\Big]\Big/4, \qquad (4.27)$$
$$\langle\pi_1,\pi_2,\pi_3,\pi_4\rangle \ \text{ by } \ \mathrm{Re}\Big[\langle\pi_1,\pi_2,\pi_3,\pi_4\rangle + \sum^3_{234}\langle\bar\pi_1,\bar\pi_2,\pi_3,\pi_4\rangle\Big]\Big/4,$$
and so on. A simpler way of seeing this – without having to involve the joint cumulants $k(1^i\bar1^{r-i})$ – is as follows.
Consider the complex numbers $a = a_1 + ja_2$, $b = b_1 + jb_2,\dots$ Then $a_1 = (a+\bar a)/2$ and $a_2 = -(a-\bar a)j/2$. So,
$$a_1b_1 + a_2b_2 = \mathrm{Re}(\bar a b),$$
$$a_1b_1c_1 + a_2b_2c_2 = \mathrm{Re}\Big[(1+j)\Big(abc + \sum^3 \bar a\bar b c\Big)\Big]\Big/4, \qquad (4.28)$$
$$a_1b_1c_1d_1 + a_2b_2c_2d_2 = \mathrm{Re}\Big[abcd + \sum^3_{bcd}\bar a\bar b cd\Big]\Big/4 = \mathrm{Re}\big[abcd + \bar a\bar b cd + \bar a\bar c bd + \bar a\bar d bc\big]\big/4,$$
and so on. Now take $a = S_{N\cdot\pi_1}$, $b = S_{N\cdot\pi_2},\dots$ Then $n^{-1}\sum_{N=1}^n(a_1b_1c_1 + a_2b_2c_2)$ is twice $\langle\pi_1,\pi_2,\pi_3\rangle$ for the real version of the complex model with $n$ replaced by $2n$. By (4.28) it equals the right hand side of (4.27). Similarly, one can write down $\langle\pi_1,\dots,\pi_r\rangle$ for the real version of the complex model in terms of $\langle\pi_1,\dots,\pi_r\rangle$ for the complex model. So, the real versions (4.4), (4.5), (4.6) imply the complex versions
$$k_1^{12} = \lambda_2 A^{12}, \qquad k_1^0 = 2^{-1}\lambda_2 A^{01}A^{23}\,\mathrm{Re}\big(-\langle\bar1,23\rangle - \langle\bar2,13\rangle + \langle\bar3,12\rangle\big),$$
$$k_2^{123} = A^{14}A^{25}A^{36}\,\mathrm{Re}\Big[\lambda_3(1+j)\Big(\langle4,5,6\rangle + \sum^3\langle4,\bar5,\bar6\rangle\Big)\Big/4 + \lambda_2^2\sum^3_{456}\langle\bar4,56\rangle\Big] - 6\lambda_2^2 B^{123}.$$
Similarly, $k_2^{12}$ can be written down from its real form given by (3.5), (3.11), (3.12), (3.13), and $k_3^{1234}$ can be written down from its real form given by (3.6), (3.14)–(3.17). Note that if $e_1$ is complex normal with components having the same variance, then (4.26) implies that $[\pi_1,\dots,\pi_r] = 0$ for $r > 2$, just as this holds by (D.5) for the real case (4.1) if $e_1 \sim \mathcal{N}(0,\lambda_2)$.
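The real-part identities (4.28), which drive the passage from the real to the complex cumulant coefficients, can be verified directly for random complex numbers:

```python
import numpy as np

rng = np.random.default_rng(6)
a, b, c, d = rng.normal(size=4) + 1j * rng.normal(size=4)
cj = np.conj
# pairs:     a1 b1 + a2 b2             = Re(conj(a) b)
lhs2 = a.real*b.real + a.imag*b.imag
rhs2 = (cj(a)*b).real
# triples:   a1 b1 c1 + a2 b2 c2      = Re[(1+j)(abc + sum of 3 terms with
#            two factors conjugated)] / 4
lhs3 = a.real*b.real*c.real + a.imag*b.imag*c.imag
rhs3 = ((1+1j)*(a*b*c + cj(a)*cj(b)*c + cj(a)*b*cj(c) + a*cj(b)*cj(c))).real / 4
# quadruples, with a and one of b, c, d conjugated in the 3-term sum
lhs4 = a.real*b.real*c.real*d.real + a.imag*b.imag*c.imag*d.imag
rhs4 = (a*b*c*d + cj(a)*cj(b)*c*d + cj(a)*cj(c)*b*d + cj(a)*cj(d)*b*c).real / 4
```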
5. Examples
We now drop the convention of Sections 2–4 of suppressing the $i$'s and return to the usual convention that
$$S_{N\cdot i_1\dots i_r} = \partial_{i_1}\cdots\partial_{i_r} S_N$$
for $\partial_i = \partial/\partial\theta_i$. That is, we write $\langle i_1,\dots,i_r\rangle$ where we had $\langle1,\dots,r\rangle$, $A_{i_1i_2} = \langle i_1,i_2\rangle$ where we had $A_{12} = \langle1,2\rangle$ in (D.3), $\langle i_1,i_2i_3\rangle$ where we had $\langle1,23\rangle$ in (D.4), and so on. So, now
$$A_{rs} = \langle r,s\rangle = n^{-1}\sum_{N=1}^n S_{N\cdot r}S_{N\cdot s}.$$
Example 5.1. Consider the $R$-signal $M$-frequency problem: observe
$$y_k = s(\theta) + n_k \ \text{ in } \mathbb{C}^M \ \text{ for } k = 1,\dots,K,$$
where $n_1,\dots,n_K$ are independent $\mathcal{CN}_M(0,vI_M)$, that is, with real and imaginary parts independent $\mathcal{N}_M(0,vI_M/2)$. (So, $\lambda_2$ of Section 4 is $v/2$ and $\lambda_r = 0$ for $r \ne 2$.) Counting $m = 0,1,\dots,M-1$, suppose that the $m$th component of $s(\theta)$ has the form
$$s_m(\theta) = \sum_{r=1}^R a_r \exp(j\alpha_{mr}) = s_{m1} + js_{m2},$$
say, for $j = \sqrt{-1}$, where $\alpha_{mr} = \phi_r + \nu_m w_r$ and $\nu_m = (m+m_0)/M$, and $\{a_r, \phi_r, w_r\}$ are real so that $p = 3R$ and we can take $\theta = (a,\phi,w)$. The main parameter is $w$; $(a,\phi)$ are nuisance parameters. We shall obtain the leading cumulant coefficients firstly by using the real model (4.1), and then for comparison using the complex model (4.24). For the real model the MLE $\hat\theta$ minimises
$$\sum_{k=1}^K |y_k - s(\theta)|^2/2 = \sum_{N=1}^n (Y_N - S_N(\theta))^2/2, \quad\text{where } n = 2KM,$$
$$\begin{pmatrix}Y_1\\S_1\end{pmatrix},\dots,\begin{pmatrix}Y_n\\S_n\end{pmatrix}
= \left\{\begin{pmatrix}y_{km1}\\s_{m1}\end{pmatrix},\begin{pmatrix}y_{km2}\\s_{m2}\end{pmatrix} : 0\le m < M,\ 1\le k\le K\right\},$$
and $y_{km} = y_{km1} + jy_{km2}$. This puts the problem into the real formulation of (4.1) with $e_N \sim \mathcal{N}(0,\lambda_2)$.
The constant $m_0$ is arbitrary, since it reparameterises $\phi$. Choose $m_0 = -(M-1)/2$, so that $\{\nu_m\}$ have arithmetic mean zero. For $1\le r,s\le R$,
$$s_{m1\cdot r} = \cos\alpha_{mr},\quad s_{m1\cdot r+R} = -a_r\sin\alpha_{mr},\quad s_{m1\cdot r+2R} = -\nu_m a_r\sin\alpha_{mr},$$
$$s_{m2\cdot r} = \sin\alpha_{mr},\quad s_{m2\cdot r+R} = a_r\cos\alpha_{mr},\quad s_{m2\cdot r+2R} = \nu_m a_r\cos\alpha_{mr},$$
and
$$A_{rs} = (2M)^{-1}\sum_{m=0}^{M-1}\sum_{j=1}^{2} s_{mj\cdot r}\, s_{mj\cdot s}.$$
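For $R = 1$, the claim of Section 1 that the centering $m_0 = -(M-1)/2$ makes $A$ (and hence the asymptotic covariance of $\hat\theta$) diagonal can be checked numerically from these derivative formulas; the single-tone parameter values below are arbitrary.

```python
import numpy as np

M, a1, phi1, w1 = 64, 1.3, 0.7, 50.0   # single tone, R = 1 (toy values)
m0 = -(M - 1) / 2                      # centering: the nu_m average to zero
nu = (np.arange(M) + m0) / M
alpha = phi1 + nu * w1
# rows of the Jacobian for theta = (a, phi, w); real parts then imaginary
J = np.vstack([
    np.concatenate([np.cos(alpha),             np.sin(alpha)]),
    np.concatenate([-a1 * np.sin(alpha),       a1 * np.cos(alpha)]),
    np.concatenate([-nu * a1 * np.sin(alpha),  nu * a1 * np.cos(alpha)]),
])
A = J @ J.T / (2 * M)       # A_rs = (2M)^{-1} sum_m sum_j s_{mj.r} s_{mj.s}
# A is diagonal: diag(1/2, a1^2/2, a1^2 (M^2-1)/(24 M^2))
```

The $(a,\phi)$ and $(a,w)$ entries cancel term by term via $\sin^2+\cos^2=1$, and the $(\phi,w)$ entry is $a_1^2\sum_m\nu_m/(2M) = 0$ precisely because of the centering.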
Fix $1\le r,s\le R$ and set
$$\phi_{rs} = \phi_r - \phi_s,\quad w_{rs} = w_r - w_s,\quad \delta_m = \alpha_{mr} - \alpha_{ms} = \phi_{rs} + \nu_m w_{rs}. \qquad (5.1)$$
Then the elements of $A = \{A_{ab} : 1\le a,b\le 3R\}$ can be identified as follows.
$(a,a)$:
$$\langle r,s\rangle = (2M)^{-1}\sum_{m=0}^{M-1}\cos\delta_m.$$
So, $\langle r,r\rangle = 2^{-1}$ and for $r\ne s$,
$$\langle r,s\rangle = 2^{-1}\int_{-1/2}^{1/2}\cos(\phi_{rs}+xw_{rs})\,\mathrm{d}x + O(M^{-1})
= (\cos\phi_{rs})\, w_{rs}^{-1}\sin(w_{rs}/2) + O(M^{-1}) \ \text{ as } M\to\infty,$$
$(a,\phi)$:
$$\langle r,s+R\rangle = a_s(2M)^{-1}\sum_{m=0}^{M-1}\sin\delta_m = a_s 2^{-1}\int_{-1/2}^{1/2}\sin(\phi_{rs}+xw_{rs})\,\mathrm{d}x + O(M^{-1})$$
$$= a_s(\sin\phi_{rs})\, w_{rs}^{-1}\sin(w_{rs}/2) + O(M^{-1}) \ \text{ if } r\ne s, \qquad = 0 \ \text{ if } r = s,$$