Adaptive Robust Efficient Methods for Periodic Signal Processing Observed with Colours Noises

(1)

HAL Id: hal-02334918

https://hal.archives-ouvertes.fr/hal-02334918

Submitted on 27 Oct 2019

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Processing Observed with Colours Noises

Evgeny Pchelintsev, Mariana Marcokova, Serguei Pergamenchtchikov

To cite this version:

Evgeny Pchelintsev, Mariana Marcokova, Serguei Pergamenchtchikov. Adaptive Robust Eﬀicient

Methods for Periodic Signal Processing Observed with Colours Noises. Advances in Electrical and

Electronic Engineering, VSB-Technical University of Ostrava, 2019, �10.15598/aeee.v17i3.3132�. �hal-

02334918�

(2)

Adaptive Robust Efficient Methods for Periodic Signal Processing Observed with Colours Noises

Evgeny PCHELINTSEV

^1,2

, Serguei PERGAMENSHCHIKOV

^3,4

, Mariana MARCOKOVA

^5,6

1

Department of Mathematical Analysis and Theory of Functions, Faculty of Mechanics and Mathematics, Tomsk State University, Lenin Avenue 36, 634050 Tomsk, Russia

2

International Laboratory of Statistics of Stochastic Processes and Quantitative Finance, Tomsk State University, Lenin Avenue 36, 634050 Tomsk, Russia

3

Laboratory of Mathematics Raphael Salem, University of Rouen Normandie, Avenue de l’Universite 12, 76801 Saint Etienne du Rouvray, France

4

International Laboratory of Statistics of Stochastic Processes and Quantitative Finance, Tomsk State University, Lenin Avenue 36, 634050 Tomsk, Russia

5

Department of Structural Mechanics and Applied Mathematics, Faculty of Civil Engineering, University of Zilina, Univerzitna 8215/1, 010 26 Zilina, Slovak Republic

6

Department of Electronics and Nanoelectronics,

National Research University "Moscow Power Engineering Institute", Krasnokazarmennaya 14, 111250 Moscow, Russia

[email protected], [email protected], [email protected] DOI: 10.15598/aeee.v17i3.3132

Abstract. In this paper, we consider the problem of robust adaptive efficient estimating a periodic sig- nal observed in the transmission channel with the dependent noise defined by non-Gaussian Ornstein- Uhlenbeck processes with unknown correlation proper- ties. Adaptive model selection procedures, based on the shrinkage weighted least squares estimates, are pro- posed. The comparison between shrinkage and least squares methods is studied and the advantages of the shrinkage methods are analyzed. Estimation properties for proposed statistical algorithms are studied on the basis of the robust mean square accuracy defined as the maximum mean square estimation error over all pos- sible values of unknown noise parameters. Sharp ora- cle inequalities for the robust risks have been obtained.

The robust efficiency of the model selection procedure has been established.

Keywords

Asymptotic efficiency, model selection, non- parametric regression, Ornstein-Uhlenbeck pro- cess, periodic signals, robust quadratic risk, sharp oracle inequality, shrinkage estimation, weighted least squares estimates.

1. Introduction

In this paper, we consider the estimation problem for the 1-periodic signal S(t) on the basis of observations (y

t

)

_0≤t≤n

given by the stochastic differential equation:

dy

t

= S(t)dt + dξ

t

, 0 ≤ t ≤ n, (1) where n is the duration of observation and (ξ

t

)

_0≤t≤n

is unobserved colour noise. Note that if (ξ

t

)

_0≤t≤n

is Brownian motion, then we obtain the well-known "sig- nal + white noise" model which is very popular in sta- tistical radio-physics (see, for example, [1], [2] and [3]).

In this paper, we assume that the useful signal S is distorted by the impulse flow described by the non- Gaussian Ornstein-Uhlenbeck processes, which allows studying the signal estimation problems with depen- dent pulse noises, i.e. we assume that the noise process (ξ

t

)

_0≤t≤n

obeys the equation:

dξ

_t

= aξ

_t

dt + du

_t

, (2) u

t

= ´ n

1

w

t

+ ´ n

2

z

t

and z

t

= x ∗ (µ − µ) ˜

_t

, where a, n ´

1

and n ´

2

are some unknown constants, (w

t

)

_t≥0

is the standard Brownian motion, µ(ds dx) is the jump measure with deterministic compensator

˜

µ (ds dx) = dsΠ (dx), Π(·) is the Levy measure, i.e.

some positive measure on R

^∗

= R \ {0}, such that Π x

²

= 1 and Π x

⁶

< ∞. Here we use the no- tation Π (|x|

^m

) = R

R^∗

|y|

^m

Π (dy). Note that the Levy

(3)

measure Π ( R

^∗

) could be equal to +∞. We use ∗ for the stochastic integrals with respect to random measures (see [4], Chapters 2 and 3), i.e.:

x ∗ (µ − µ) ˜

_t

=

t

Z

0

Z

R^∗

y (µ − µ) (ds, dy) ˜ . (3)

It should be noted that if a = 0, then we obtain the Levy regression model considered in [5]. In the case when Π (·) = 0 we obtain the well-known Gaussian Ornstein-Uhlenbeck regression model introduced in [6]

and [7]. The model in the Eq. (1) and Eq. (2) in which the jump process (z

_t

)

_t≥0

is defined by the compound Poisson process was studied in [8] and [9]. However, the compound Poisson processes can describe only the large noise impulses of small fixed frequency, but the telecommunication and location systems may have the impulse noises with any frequency without any condi- tion. We note that in the papers [8] and [9] the pro- posed statistical procedures are based on the classical weighted least squares estimators.

The main goal of this paper is to develop a new improved adaptive robust efficient signal estimation methods for the non-Gaussian Ornstein-Uhlenbeck noise (ξ

t

)

_0≤t≤n

based on the general Levy processes with unknown distribution Q. We assume that this distribution belongs to the class Q

^∗_n

defined as a fam- ily of all these distributions for which the parameters

−a

∗

≤ a < 0, ´ n

1

≥ ξ

∗

and n ´

²₁

+ ´ n

²₂

≤ ξ

^∗

, where a

∗

, ξ

∗

and ξ

^∗

are some fixed positive bounds. The quality of an estimate S ˆ

_n

of the unknown signal S, i.e. some function of (y

_t

)

_0≤t≤n

, will be measured with the robust quadratic risk:

R

^∗

S ˆ

n

, S

= sup

Q∈Q^∗_n

R

Q

S ˆ

n

, S

, (4)

where

R

Q

S ˆ

n

, S

:= E

Q,S

S ˆ

n

− S

2

and (5)

kSk

²

=

1

Z

0

S

²

(t)dt.

Here E

_Q,S

is the expectation with respect to the distri- bution P

_Q,S

of the process in the Eq. (1) with a fixed distribution Q of the noise (ξ

t

)

_0≤t≤n

and a given func- tion S.

2. Shrinkage Estimation Methods

Let (φ

_j

)

_j≥1

be a trigonometric basis in L

₂

[0, 1]. We extend these functions by the periodic way on R i.e.

φ

_j

(t) = φ

_j

(t + 1) for any t ∈ R . For estimating the unknown function S in the Eq. (1), we consider its Fourier expansion:

S(t) =

∞

X

j=1

θ

j

φ

j

(t)

and (6)

θ

j

= (S, φ

j

) =

1

Z

0

S(t)φ

j

(t)dt.

The Fourier coefficients θ

j

can be estimated as:

θ ˆ

j,n

= 1 n

n

Z

0

φ

j

(t)dy

t

. (7) We define a class of weighted least squares estimates for S(t) as:

S ˆ

λ

=

n

X

j=1

λ(j)ˆ θ

j,n

φ

j

, (8) where the weights λ ∈ R

ⁿ

belong to some finite set Λ from [0, 1]

ⁿ

.

Now, for the first d ≤ n Fourier coefficients in Eq. (6), we use the improved estimation method pro- posed for parametric models in [10] and [11]. To this end we set θ ˜

_n

=

θ ˆ

_j,n

1≤j≤d

. In the sequel, we will use the norm |x|

²_d

=

d

P

j=1

x

²_j

for any vector x = (x

_j

)

_1≤j≤d

from R

ⁿ

. Now we define the shrinkage estimators as:

θ

^∗_j,n

= (1 − g (j)) ˆ θ

j,n

, (9) where g(j) =

^cⁿ

|

^θ^˜n

|

_d

1

_{1≤j≤d}

, 1

_A

is the indicator of the set A and c

n

is some known parameter such that c

n

≈

_n^d

as n → ∞. Now we introduce a class of shrink- age weighted least squares estimates for S as:

S

_λ^∗

=

n

X

j=1

λ(j)θ

^∗_j,n

φ

j

. (10) We denote the difference of quadratic risks of the esti- mates in Eq. (10) and Eq. (8) as ∆

_Q

(S) := R

_Q

(S

_λ^∗

, S)−

R

_Q

S ˆ

_λ

, S

. Now for this deviation, we obtain the fol- lowing result.

Theorem 1 Assume that for any vector λ ∈ Λ there exists some fixed integer d = d(λ) such that their first d components equal to one. Then for any n ≥ 1 and r > 0:

sup

Q∈Qn

sup

kSk≤r

∆

Q

(S) < −c

²_n

. (11)

The inequality in Eq. (11) means that non-

asymptotically, i.e. for any n ≥ 1 the estimate in the

(4)

Eq. (10) outperforms in mean square accuracy the es- timate in the Eq. (8). Moreover, as we will see below, nc

n

→ ∞ as n → ∞. This means that the improve- ment effect in the nonparametric case is more signifi- cant than for parametric regression [11].

3. Model Selection Procedure

This Section gives the construction of a model selection procedure for estimating a function S in the Eq. (1) on the basis of improved weighted least squares estimates and states the sharp oracle inequality for the robust risk of the proposed procedure.

The model selection procedure for the unknown func- tion S in the Eq. (1) will be constructed on the basis of a family of estimates (S

_λ^∗

)

_λ∈Λ

. The performance of any estimate S

_λ^∗

will be measured by the empirical squared error:

Err

_n

(λ) = kS

_λ^∗

− Sk

²

. (12) In order to obtain a good estimate, we have to write a rule to choose a weight vector λ ∈ Λ in the Eq. (6).

It is obvious that the best approach is to minimize the empirical squared error with respect to λ. Making use the estimate definition in the Eq. (6) and the Fourier transformation of S implies:

Err

n

(λ) =

n

X

j=1

λ

²

(j) θ

_j,n^∗

²

− 2

n

X

j=1

λ(j)θ

^∗_j,n

θ

j

+

n

X

j=1

θ

²_j

. (13) Since the Fourier coefficients (θ

j

)

_j≥1

are unknown, the weight coefficients (λ

_j

)

_j≥1

cannot be found by mini- mizing this quantity. To circumvent this difficulty one needs to replace the terms θ

^∗_j,n

θ

_j

by their estimators θ ˜

j,n

. We set:

θ ˜

j,n

= θ

^∗_j,n

θ ˆ

j,n

− σ ˆ

n

n , (14)

where σ ˆ

_n

is the estimate for the noise variance of σ

_Q

= E

_Q

ξ

_j,n²

which we choose in the following form:

ˆ σ

_n

=

n

X

j=

[

^√ⁿ

]

⁺¹

ˆ t

²_j,n

and ˆ t

_j,n

= 1 n

n

Z

0

φ

_j

(t)dy

_t

. (15)

For this change in the empirical squared error, one has to pay some penalty. Thus, one comes to the cost func- tion of the form:

J

n

(λ) =

n

X

j=1

λ

²

(j) θ

^∗_j,n

2

− 2

n

X

j=1

λ(j)˜ θ

j,n

+ δ P ˆ

n

(λ), (16) where δ is some positive constant and P ˆ

_n

(λ) is the penalty term defined as:

P ˆ

n

(λ) = σ ˆ

_n

|λ|

²_n

n . (17)

Substituting the weight coefficients, minimizing the cost function:

λ

^∗

= argmin

λ∈Λ

J

_n

(λ) (18)

in the Eq. (10) leads to the improved model selection procedure:

S

^∗

= S

_λ^∗∗

. (19) It will be noted that λ

^∗

exists because Λ is a finite set.

If the minimizing sequence in the Eq. (18) λ

^∗

is not unique, one can take any minimizer. In the case, when the value of σ

Q

is known, one can take σ ˆ

n

= σ

Q

and P

n

(λ) = σ

Q

|λ|

²_n

n

⁻¹

.

Theorem 2 For any n ≥ 2 and 0 < δ <

¹₂

, the robust risks defined in the Eq. (4) of estimate in the Eq. (19) for continuously differentiable function S satisfies the oracle inequality:

R

^∗

(S

^∗_λ∗

, S) ≤ 1 + 5δ 1 − δ min

λ∈Λ

R

^∗

(S

_λ^∗

, S) + B

_n^∗

nδ , (20) where the term B

_n^∗

is independent of S and such that B

_n^∗

n

⁻

→ 0 as n → ∞ for any > 0.

The inequality in Eq. (20) allows us to establish that the procedure in the Eq. (19) is optimal in the oracle inequalities sense. This property enables to provide asymptotic efficiency in the adaptive setting, i.e. when information about the signal regularity is unknown.

4. Asymptotic Efficiency

In order to study the asymptotic efficiency, we define the following functional Sobolev ball:

W

k,r

= (

f ∈ C

_p^k

[0, 1] :

k

X

i=0

f

⁽ⁱ⁾

2

≤ r )

, (21) where r > 0 and k ≥ 1 are some unknown parameters, C

_p^k

[0, 1] is the space of k times differentiable 1-periodic functions such that for any 0 ≤ i ≤ k − 1 : f

⁽ⁱ⁾

(0) = f

⁽ⁱ⁾

(1). In order to formulate our asymptotic results we set:

v

_n

= n

ξ

^∗

, l

_k

(r) = ((2k + 1) r)

^(2k+1)¹

k

π (k + 1)

_(2k+1)^2k

(22) and we denote by Σ

n

of all estimates S ˆ

n

of S mea- surable with respect to the σ-algebra generated by the process in the Eq. (1).

Theorem 3 The robust risk defined in the Eq. (4) ad- mits the following asymptotic lower bound:

lim inf

n→∞

inf

Sˆ_n∈Σ_n

v

^2k/(2k+1)_n

sup

S∈Wk,r

R

^∗

S ˆ

n

, S

≥ l

k

(r) .

(23)

(5)

This lower bound is sharp in the following sense.

Theorem 4 The robust risk defined in the Eq. (4) for the estimating procedure in the Eq. (19) has the follow- ing asymptotic upper bound:

lim sup

n→∞

v

^2k/(2k+1)_n

sup

S∈Wk,r

R

^∗

(S

^∗

, S) ≤ l

_k

(r) . (24)

Theorem 3 and Thm. 4 imply that the model selection procedure S

^∗

is efficient and the parameter l

k

(r) de- fined in the Eq. (22) is the Pinsker constant in this case [3].

5. Monte Carlo Simulations

In this section, we report the results of a Monte Carlo experiment to assess the performance of the proposed model selection procedure in the Eq. (19). In the Eq. (1) we choose 1-periodic function S which is defined as S(t) = t sin(2πt) + t

²

(1 − t) cos(2πt), for 0 ≤ t ≤ 1.

We simulate the Eq. (1) with the noise process defined as:

dξ

t

= −ξ

t

dt + 0.5dw

t

+ 0.5dz

t

, (25) where z

t

=

N_t

P

j=1

Y

j

, N

t

is a Poisson process with the intensity λ = 1 and (Y

j

)

_j≥1

is i.i.d. Gaussian (0, 1).

We use the model selection procedure defined in the Eq. (19) with the weights proposed in [8]: k

^∗

= 100 + √

ln n, =

_ln¹_n

and m =

1 ²

. We used the cost function with δ = (3 + ln n)

⁻²

. We define the em- pirical risk as R ¯

S, S ˜

=

¹_p

p

P

j=1

E ˆ

S ˜

n

(t

j

) − S (t

j

)

2

and E ˆ

S ˜

n

(·) − S (·)

2

=

_N¹

N

P

l=1

S ˜

_n^l

(·) − S (·)

2

with the frequency of observations p = 100001 and numbers of replications N = 10000.

Table 1 gives the values for the sample risks for dif- ferent numbers of observation period n.

Tab. 1: Empirical risks.

n R¯ S, S˜

R¯(S^∗, S) R¯ S, S˜

/R¯(S^∗, S)

100 0.0457 0.0289 1.6

200 0.0216 0.0089 2.4

500 0.0133 0.0021 6.3

1000 0.098 0.0011 8.9

6. Conclusion

In this paper, we considered the problem of nonpara- metric signal processing on the basis of the observa- tions with the dependent non-Gaussian impulse noises.

We developed adaptive efficient statistical model selec- tion procedures based on the shrinkage methods and we have shown that the shrinkage estimation methods con- siderably improve the non-asymptotic estimation accu- racy. The obtained theoretical results are confirmed by the numerical simulation. It turns out that numerically the improvement effect may increase 10 times. Next, for the developed statistical methods we obtained the adaptive efficiency property, which means that we pro- vide the best mean squares accuracy without using the smoothness information about the form of unknown signal. Moreover, in this paper, we studied the accu- racy properties for the proposed methods on the basis of the robust approach, i.e. uniformly over all possible unknown noise distributions. This allows us to syn- thesize the statistical algorithms possessing the high noise immunity properties. The results (their satisfac- tory concordance with the corresponding experimental data) can be used for the estimation of the signals.

Such problems are of a great importance in the fields of radio-and-hydroacoustic communications and posi- tioning, radio-and-hydrolocation, etc. (see [12] and ref- erences therein).

Acknowledgment

The results of this work are supported by the Min- istry of Science and Higher Education of the Russian Federation in the framework of the research project no. 2.3208.2017/4.6. The second author is partially supported by the Russian Federal Professor Program, project no. 1.472.2016/1.4 (Ministry of Science and Higher Education of the Russian Federation) and by the project XterM-Feder, University of Rouen. The results of Sec. 4. and Sec. 5. are supported by the RSF grant number 17-11-01049.

References

[1] IBRAGIMOV, I. A. and R. Z. KHASMINSKII.

Statistical Estimation: Asymptotic Theory. 1st ed.

New York: Springer, 1981. ISBN 978-1-4899-0027- 2.

[2] KASSAM, S. A. Signal Detection in Non- Gaussian Noise. 1st ed. New York: Springer, 1988.

ISBN 978-1-4612-3834-8.

[3] CHERNOYAROV, O. V., M. VACULIK,

A. SHIRIKYAN and A. V. SALNIKOVA. Statis-

tical Analysis of Fast Fluctuating Random Signals

with Arbitrary - Function Envelope and Unknown

Parameters. Communications - Scientific Letters

of the University of Zilina. 2015, vol. 17, no. 1a,

pp. 35–43. ISSN 1335-4205.

(6)

[4] CONT, R. and P. TANKOV. Financial Modelling with Jump Processes. 1st ed. Boca Raton: Chap- man & Hall, 2003. ISBN 1-58488-413-4.

[5] PCHELINTSEV, E., V. PCHELINTSEV and S. PERGAMENSHCHIKOV. Non asymptotic sharp oracle inequalities for the improved model selection procedures for the adaptive nonparamet- ric signal estimation problem. Communications - Scientific Letters of the University of Zilina. 2018, vol. 20, no. 1, pp. 72–76. ISSN 1335-4205.

[6] HOPFNER, R. and Y. A. KUTOYANTS. On LAN for parametrized continuous periodic sig- nals in a time-inhomogeneous diffusion. Statisti- cal Decisions. 2009, vol. 27, iss. 4, pp. 309–326.

ISSN 0721-2631. DOI: 10.1524/stnd.2009.1064.

[7] HOPFNER, R. and Y. A. KUTOYANTS. Esti- mating discontinuous periodic signals in a time- inhomogeneous diffusion. Statistical Inference for Stochastic Processes. 2010, vol. 13, iss. 3, pp. 193–

230. ISSN 1387-0874. DOI: 10.1007/s11203-010- 9046-7.

[8] KONEV, V. V. and S. PERGAMENSHCHIKOV.

Efficient robust nonparametric estimation in a semimartingale regression model. Annales de l’Institut Henri Poincare (B) Probability and Statistics. 2012, vol. 48, no. 4, pp. 1217–1244.

ISSN 0246-0203. DOI: 10.1214/12-AIHP488.

[9] KONEV, V. V. and S. PERGAMENSHCHIKOV.

Robust model selection for a semimartingale continuous time regression from discrete data.

Stochastic Processes and their Applications. 2015, vol. 125, iss. 1, pp. 294-326. ISSN 0304-4149.

DOI: 10.1016/j.spa.2014.08.003.

[10] KONEV, V. V., S. PERGAMENSHCHIKOV and E. PCHELINTSEV. Estimation of a regression with the pulse type noise from discrete data.

Theory of Probability and Its Applications. 2014, vol. 58, iss. 3, pp. 442–457. ISSN 0040-585X.

DOI: 10.1137/S0040585X9798662X.

[11] PCHELINTSEV, E. Improved estimation in a non-Gaussian parametric regression. Sta- tistical Inference for Stochastic Processes.

2013, vol. 16, iss. 1, pp. 15–28. ISSN 1387-0874.

DOI: 10.1007/s11203-013-9075-0.

[12] CHERNOYAROV, O. V., Y. A. KUTOYANTS and M. MARCOKOVA. On frequency estimation for partially observed system with small noises in state and observation equations. Communications - Scientific Letters of the University of Zilina.

2018, vol. 20, no. 1, pp. 66–71. ISSN 1335-4205.

About Authors

Evgeny PCHELINTSEV was born in Almaty, Kazakhstan. He received his M.Sc. in Mathematics from Tomsk State University in 2009, Ph.D. in Applied Mathematics and Statistics from Tomsk State University and Rouen University in 2012. His research interests include statistical modelling and identification of stochastic dynamic systems, soft computing, and simulation.

Serguei PERGAMENSHCHIKOV was born in Tomsk, Russia. He received his M.Sc. from Tomsk State University in 1980, Ph.D. in Probability Theory and Statistics from Tomsk State University in 1986, Doctor of Sciences in Mathematics from Central Economics and Mathematics Institute of RAS in 1994.

His research interests include statistical modelling and identification of stochastic dynamic systems, stochastic differential equations, optimization, and control.