HAL Id: hal-01265256
https://hal.archives-ouvertes.fr/hal-01265256
Submitted on 2 Feb 2016
GENERAL PROCEDURE FOR SELECTING LINEAR ESTIMATORS
Alexander Goldenshluger, Oleg Lepski
To cite this version:
Alexander Goldenshluger, Oleg Lepski. GENERAL PROCEDURE FOR SELECTING LINEAR ESTIMATORS. Theory of Probability and Its Applications c/c of Teoriia Veroiatnostei i Ee Primenenie, Society for Industrial and Applied Mathematics, 2013, 57 (2), pp.209-226. 10.4213/tvp4446. hal-01265256
Vol. 57, No. 2, pp. 1608
GENERAL PROCEDURE FOR SELECTING LINEAR ESTIMATORS ∗
GOLDENSHLUGER A. V.† AND LEPSKI O. V.‡ (Translated by )
Abstract. In the general statistical experiment model we propose a procedure for selecting an estimator from a given family of linear estimators. We derive an upper bound on the risk of the selected estimator and demonstrate how this result can be used to construct minimax and adaptive minimax estimators in specific nonparametric estimation problems.
Key words. statistical experiment, linear estimators, oracle inequality, adaptive minimax estimation, majorant.
1. Introduction. Let $(\mathcal{X}^{(n)}, \mathfrak{B}^{(n)}, \mathbb{P}^{(n)}_f, f \in \mathbb{F})$, $n \in \mathbb{N}^*$, be a sequence of statistical experiments generated by the observation $X^{(n)}$. Here $\mathfrak{B}^{(n)}$ is the $\sigma$-algebra generated by the random element $X^{(n)} \in \mathcal{X}^{(n)}$, and the distribution of $X^{(n)}$ belongs to the family of probability measures $(\mathbb{P}^{(n)}_f, f \in \mathbb{F})$. Let $(D, \mathcal{D}, \nu)$ be a measurable space, and let $\mathbb{F}_0$ be a given set of measurable functions $f: D \to \mathbb{R}$. In this paper we suppose that $\mathbb{F} \subseteq \mathbb{F}_0$; spaces of continuous and bounded functions and $L_2(D, \nu)$ are classical examples of the set $\mathbb{F}$.
Consider the problem of estimating the function $f$ on the basis of the observation $X^{(n)}$. By an estimator of $f$ we mean any $\mathfrak{B}^{(n)}$-measurable map $\hat f: \mathcal{X}^{(n)} \to \mathbb{F}_0$. With any estimator $\hat f$ we associate the risk
$$\mathcal{R}^{(n)}_\ell[\hat f; f] = \Big\{ \mathbb{E}^{(n)}_f \big[\ell(\hat f - f)\big]^q \Big\}^{1/q}, \tag{1}$$
where $\mathbb{E}^{(n)}_f$ is the expectation with respect to the probability measure $\mathbb{P}^{(n)}_f$, $\ell: \mathbb{F}_0 \to \mathbb{R}_+$ is a seminorm, and $q \ge 1$ is a given real number. Our goal is to develop an estimator of $f$ with small risk $\mathcal{R}^{(n)}_\ell[\hat f; f]$.
The following three models are typical examples of statistical experiments.
Example 1 (regression model). Let $X^{(n)} = \{(Y_1, Z_1), \ldots, (Y_n, Z_n)\}$, $n \in \mathbb{N}^*$, where
$$Y_i = f(Z_i) + \xi_i, \quad i = 1, \ldots, n, \tag{2}$$
and $\xi_i$, $i = 1, \ldots, n$, are independent identically distributed random variables with zero mean, $\mathbb{E}\xi_1 = 0$, and finite second moment, $\mathbb{E}\xi_1^2 = \sigma^2 < \infty$. The design points $Z_i$, $i = 1, \ldots, n$, are assumed to be either fixed points in $D$ or independent random elements identically distributed on $D$.
∗The first author is supported by ISF grant 104/11.
†Department of Statistics, University of Haifa, 31905 Haifa, Israel; e-mail: goldensh@stat.haifa.ac.il
‡Laboratoire d'Analyse, Topologie et Probabilites UMR CNRS 6632, Universite Aix-Marseille, 39, rue F. Joliot Curie, 13453 Marseille, France; e-mail: lepski@cmi.univ-mrs.fr
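Example 2 (Gaussian white noise model). Let $W$ be the white noise on $D$ of intensity $\nu$ (see [8, p. 36]). Consider the statistical experiment generated by the observation $X^{(n)} = \{Y_n(\phi), \phi \in L_2(D, \nu)\}$, $n \in \mathbb{N}^*$, where $Y_n(\cdot)$ satisfies the equation
$$Y_n(\phi) = \langle f, \phi \rangle + \frac{1}{\sqrt{n}} \int_D \phi(u)\, W(du) \quad \forall \phi \in L_2(D, \nu). \tag{3}$$
Here $\langle \cdot, \cdot \rangle$ is the scalar product on $L_2(D, \nu)$ and $f \in L_2(D, \nu)$. Recall that
$$Y_n(\phi) \sim \mathcal{N}\big(\langle f, \phi \rangle,\; n^{-1}\|\phi\|_2^2\big), \tag{4}$$
where $\|\cdot\|_2$ denotes the norm in the space $L_2(D, \nu)$.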
Example 3 (density model). Let $P$ be a probability measure on the $\sigma$-algebra $\mathcal{D}$ with density $f$ with respect to the measure $\nu$. Assume that we observe the vector $X^{(n)} = (X_1, \ldots, X_n)$, $n \in \mathbb{N}^*$, where $X_i$, $i = 1, \ldots, n$, are independent identically distributed random elements with values in $D$, distributed according to $P$.
Choosing the seminorm $\ell(\cdot)$ in (1) differently, we arrive at different estimation problems.
1. Global estimation. If we want to estimate the whole function $f$ on a given set $D_0 \subseteq D$, then it is natural to choose the seminorm $\ell$ as the $L_p$-norm on $(D_0, \nu)$: $\ell(g) = \|g\|_{p,\nu}$ (in what follows, for brevity, we omit the dependence on $\nu$ in the norm notation). In this case the corresponding risk is given by the formula
$$\mathcal{R}^{(n)}_p[\hat f; f] = \Big[ \mathbb{E}^{(n)}_f \|\hat f - f\|_p^q \Big]^{1/q}, \quad p \in [1, \infty]. \tag{5}$$
2. Pointwise estimation. If we are interested in estimating the function $f$ at a fixed point $x \in D_0 \subseteq D$, then the seminorm and the corresponding risk are defined as follows: $\ell(g) = |g(x)|$,
$$\mathcal{R}^{(n)}_x[\hat f; f] = \Big\{ \mathbb{E}^{(n)}_f \big|\hat f(x) - f(x)\big|^q \Big\}^{1/q}.$$
Note that in the definition of the seminorm we use $D_0 \subseteq D$. This allows us to avoid discussing boundary effects in those models where such effects can exist (Examples 1 and 2, and Example 3 if $D$ is a bounded set).
Our approach to the estimation problem outlined above is based on selecting an estimator from a given family of linear estimators. Linear methods are ubiquitous in nonparametric estimation, and it is well known that they are optimal in many statistical problems. Let us discuss the typical form of a linear estimator of $f$ in the models of Examples 1–3.
Regression model. A linear estimator of the regression function $f$ is defined by the formula
$$\hat f(x) = \sum_{j=1}^n K_j(x) Y_j,$$
where the weights $K_j(\cdot)$, $j = 1, \ldots, n$, satisfy the condition $\sum_{j=1}^n K_j(x) = 1$ for any $x \in D_0$. The following important example of weights leads to the Nadaraya–Watson kernel estimator (here $D \subseteq \mathbb{R}^d$):
$$K_j(x) = \frac{w_h(Z_j - x)}{\sum_{i=1}^n w_h(Z_i - x)}, \quad j = 1, \ldots, n,$$
where $w: \mathbb{R}^d \to \mathbb{R}$ is a given function, $h \in \mathbb{R}^d_+ \setminus \{0\}$, $h = (h_1, \ldots, h_d)$, and $w_h$ is given by $w_h(\cdot) = \big[\prod_{i=1}^d h_i^{-1}\big] w(\cdot/h)$. Here and in what follows, for vectors $x, y \in \mathbb{R}^d$ the division $x/y$ is understood coordinatewise. Note that if the variables $Z_j$ are deterministic, then
$$\mathbb{E}^{(n)}_f\big[\hat f(x)\big] = \sum_{j=1}^n K_j(x) f(Z_j).$$
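As a minimal sketch of the Nadaraya–Watson weights in dimension $d = 1$, the following Python fragment computes $K_j(x)$ and the resulting linear estimate. The Gaussian choice of $w$, the function names, and the toy design are assumptions made for illustration only; they are not taken from the paper.

```python
import math

def gaussian_kernel(u):
    """Standard Gaussian density; an assumed choice for the function w."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def nw_weights(x, design, h, w=gaussian_kernel):
    """Weights K_j(x) = w_h(Z_j - x) / sum_i w_h(Z_i - x) for d = 1."""
    raw = [w((z - x) / h) / h for z in design]
    total = sum(raw)
    if total == 0.0:
        raise ValueError("all kernel weights vanish at x; increase h")
    return [r / total for r in raw]

def nw_estimate(x, design, responses, h, w=gaussian_kernel):
    """Linear estimator f_hat(x) = sum_j K_j(x) * Y_j."""
    weights = nw_weights(x, design, h, w)
    return sum(k * y for k, y in zip(weights, responses))

# Toy fixed design on [0, 1] with noiseless responses from a straight line.
design = [i / 10 for i in range(11)]
responses = [2.0 * z + 1.0 for z in design]
weights = nw_weights(0.5, design, h=0.2)
estimate = nw_estimate(0.5, design, responses, h=0.2)
```

By construction the weights sum to one, and at the centre of this symmetric design the estimate of a linear function has no bias, which is what the check on `estimate` relies on.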
Gaussian white noise model. Let $K: D \times D \to \mathbb{R}$ be a function satisfying the conditions $\int_D K(t, x)\,\nu(dt) = 1$ and $K(\cdot, x) \in L_2(D, \nu)$ for any $x \in D$. A linear estimator of $f$ is defined by the expression
$$\hat f(x) = Y_n\big(K(\cdot, x)\big), \quad x \in D_0.$$
In this model, by (4), for any $x \in D_0$ the estimator $\hat f(x)$ is a Gaussian random variable with mean $\int_D K(t, x) f(t)\,\nu(dt)$ and variance $n^{-1}\int_D K^2(t, x)\,\nu(dt)$.
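Let $D \subseteq \mathbb{R}^d$ and let $\nu$ be the Lebesgue measure. If, for example, a function $w: D \to \mathbb{R}$ is such that $\int w(t)\,dt = 1$, and $h \in \mathbb{R}^d_+ \setminus \{0\}$, then, setting $K(t, x) = \big[\prod_{i=1}^d h_i^{-1}\big] w((t - x)/h)$, we arrive at the kernel estimator.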
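Density model. A linear estimator in the density model is given by the formula $\hat f(x) = (1/n)\sum_{j=1}^n K(X_j, x)$, $x \in D_0$, and the Parzen–Rosenblatt kernel estimator (here again $D \subseteq \mathbb{R}^d$) is
$$\hat f(x) = \frac{1}{n} \Big[\prod_{i=1}^d h_i^{-1}\Big] \sum_{j=1}^n w\Big(\frac{X_j - x}{h}\Big).$$
Here $\mathbb{E}^{(n)}_f[\hat f(x)] = \int K(t, x) f(t)\,\nu(dt)$.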
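The Parzen–Rosenblatt formula can be sketched in a few lines for $d = 1$. The Epanechnikov kernel, the sample values, and the grid used for the numerical integral are illustrative assumptions, not choices made in the paper.

```python
def epanechnikov(u):
    """Epanechnikov kernel, an assumed choice of w; it integrates to 1."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def kde(x, sample, h, w=epanechnikov):
    """f_hat(x) = (1/n) * h^{-1} * sum_j w((X_j - x) / h) for d = 1."""
    n = len(sample)
    return sum(w((xj - x) / h) for xj in sample) / (n * h)

# Hypothetical sample and bandwidth.
sample = [0.1, 0.4, 0.45, 0.5, 0.9]
h = 0.3

# Riemann sum of the estimate over a grid that covers its support.
step = 0.001
grid = [-1.0 + step * i for i in range(3001)]
mass = sum(kde(x, sample, h) for x in grid) * step
```

Because $w$ integrates to one, the estimate is itself a probability density, so `mass` should be close to 1 up to discretisation error. The estimate is not linear in the observations $X_j$, although its expectation is linear in $f$.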
It should be emphasized that linear estimators do not necessarily depend linearly on the observations (Example 3). The common feature, however, is that for each fixed $x$ the expectation of $\hat f(x)$ is a linear functional of $f$. We take this property as the definition of a linear estimator in the context of the general statistical experiment (see Definition 1 in Section 2).
Thus, in different models linear estimators are characterized by some weight function $K$. It is known that the accuracy of a linear estimator is determined by the weight $K$ or, for kernel estimators, by the kernel $w$ and the bandwidth $h$. Therefore, irrespective of the model considered, when constructing a kernel estimator we naturally face the crucial problem: how to select the kernel and the bandwidth of the kernel estimator? Or, more generally, how to select the weight function when constructing a linear estimator? Note that the risk (1) (or any reasonable upper bound on it) depends on the function $f$ to be estimated; therefore, direct optimization of the risk with respect to the weight function is impossible.
In this paper we propose a solution to the described problem which is based on selecting an estimator from a fixed family of linear estimators. In order to clarify our approach, suppose for a moment that we have a family $\mathcal{F}(\mathcal{K})$ of linear estimators indexed by a set of weight functions, i.e. $\mathcal{F}(\mathcal{K}) = \{\hat f_K, K \in \mathcal{K}\}$. Here $\hat f_K$ is the linear estimator associated with the weight function $K$ (the precise definition will be given below). For example, we can view $\mathcal{K}$ as the set of all weights of the type $\big[\prod_{i=1}^d h_i^{-1}\big] w(\cdot/h)$, where the kernel $w$ is fixed and the bandwidth $h \in \mathbb{R}^d_+ \setminus \{0\}$ takes values in a given interval $[h_{\min}, h_{\max}] \subset \mathbb{R}^d$; other examples of families $\mathcal{K}$ will be considered below. We propose a fully data-driven selection rule $\hat K$ and, under rather general conditions on the family $\mathcal{F}(\mathcal{K})$, we establish an upper bound on the risk of the selected estimator $\hat f_{\hat K}$. The upper bound has the following form: for any function $f \in \mathbb{F}$
$$\mathcal{R}^{(n)}_\ell[\hat f_{\hat K}; f] \le \inf_{K \in \mathcal{K}} U^{(n)}_\ell(K, f) + \Delta^{(n)}_\ell(\mathcal{K}), \tag{6}$$
where the quantity $U^{(n)}_\ell(K, f)$ depends on $K$ and $f$, while the remainder $\Delta^{(n)}_\ell(\mathcal{K})$ is independent of $f$ and completely determined by the family $\mathcal{K}$. Note that inequality (6) by itself does not guarantee that the selected estimator is good. However, analysis of the expression on the right-hand side of (6) allows one to derive many useful minimax, adaptive minimax and oracle results in different estimation settings. Derivation of such results from (6) relies on the following ideas.
Oracle inequalities. In some problems one can show that there exist constants $c_1$ and $c_2$ such that
(i) $U^{(n)}_\ell(K, f) \le c_1 \mathcal{R}^{(n)}_\ell[\hat f_K; f]$ for all $f \in \mathbb{F}$ and $K \in \mathcal{K}$;
(ii) $\Delta^{(n)}_\ell(\mathcal{K}) \le c_2 \inf_{K \in \mathcal{K}} \mathcal{R}^{(n)}_\ell[\hat f_K; f]$ for all $f \in \mathbb{F}$.
Then (i) and (ii) together with (6) imply an oracle inequality for the risk of the selected estimator:
$$\mathcal{R}^{(n)}_\ell[\hat f_{\hat K}; f] \le (c_1 + c_2) \inf_{K \in \mathcal{K}} \mathcal{R}^{(n)}_\ell[\hat f_K; f].$$
In other words, the risk of the selected estimator coincides, up to a multiplicative constant, with the risk of the best estimator in $\mathcal{F}(\mathcal{K})$. Consequently, if the family $\mathcal{F}(\mathcal{K})$ contains a good estimator, then the selection rule mimics this good estimator up to a constant factor.
Minimax approach. Within the minimax approach it is assumed that the function $f$ belongs to some subset $\Sigma$ of the set $\mathbb{F}$. Usually $\Sigma$ is a functional class chosen by the statistician on the basis of available prior information about the function to be estimated. The accuracy of an estimator $\hat f$ is measured by the worst-case (maximal) risk over the class $\Sigma$:
$$\mathcal{R}^{(n)}_\ell[\hat f; \Sigma] = \sup_{f \in \Sigma} \mathcal{R}^{(n)}_\ell[\hat f; f],$$
and the goal is to find a rate-optimal estimator $\hat f^*$ such that
$$\mathcal{R}^{(n)}_\ell[\hat f^*; \Sigma] \asymp \varphi_n(\Sigma) := \inf_{\hat f} \mathcal{R}^{(n)}_\ell[\hat f; \Sigma], \quad n \to \infty;$$
here the infimum is taken over all possible estimators. It is known that linear estimators are rate-optimal in many problems for various seminorms $\ell$ and functional classes $\Sigma$.
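Now suppose that we have obtained an inequality of the form (6) and that the following conditions hold:
(i) there exists a weight function $K^* \in \mathcal{K}$ such that $\sup_{f \in \Sigma} U^{(n)}_\ell(K^*, f) \asymp \varphi_n(\Sigma)$ as $n \to \infty$;
(ii) $\Delta^{(n)}_\ell(\mathcal{K}) = O(\varphi_n(\Sigma))$ as $n \to \infty$.
Then inequality (6) implies $\mathcal{R}^{(n)}_\ell[\hat f_{\hat K}; \Sigma] \asymp \varphi_n(\Sigma)$, i.e. the selected estimator $\hat f_{\hat K}$ is rate-optimal on $\Sigma$.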
Adaptive minimax estimation. The main drawback of the minimax approach is that the choice of the weight function is determined solely by the functional class $\Sigma$, and a linear estimator that is rate-optimal on the class $\Sigma$ is usually not optimal on another class $\Sigma'$.
This fact motivates the development of adaptive estimators, which are rate-optimal on a whole scale of functional classes $\{\Sigma_s, s \in S\}$ rather than on a single class $\Sigma$. Here $s$ is the so-called nuisance parameter and $S$ is the set of nuisance parameters. In other words, we want to find an estimator $\hat f^*$ such that for any $s \in S$
$$\mathcal{R}^{(n)}_\ell[\hat f^*; \Sigma_s] \asymp \varphi_n(\Sigma_s), \quad n \to \infty. \tag{7}$$
Estimators satisfying (7) were first obtained in [1] in the Gaussian white noise model, where $\{\Sigma_s, s \in S\}$ is a scale of Sobolev classes and $\ell(g) = \|g\|_2$. In global estimation problems ($\ell(g) = \|g\|_p$, $1 \le p \le \infty$), adaptive minimax estimators were proposed in [7] for the case where $\{\Sigma_s, s \in S\}$ is a scale of Hölder classes.
Using inequality (6), it is easy to show that the estimator $\hat f_{\hat K}$ satisfies (7) if the following conditions hold:
(i) for any $s \in S$ there exists a weight function $K_s \in \mathcal{K}$ such that $\sup_{f \in \Sigma_s} U^{(n)}_\ell(K_s, f) \asymp \varphi_n(\Sigma_s)$ as $n \to \infty$;
(ii) $\Delta^{(n)}_\ell(\mathcal{K}) = O(\inf_{s \in S} \varphi_n(\Sigma_s))$ as $n \to \infty$.
It should be emphasized, however, that relation (7) does not always hold. For example, in the pointwise estimation problem ($\ell(g) = |g(x)|$) there is no estimator that is rate-optimal simultaneously on two Hölder classes $H(\alpha_1, L_1)$ and $H(\alpha_2, L_2)$ with $\alpha_1 \ne \alpha_2$ (see [6]). Nevertheless, there exist estimators that are nearly rate-optimal (up to a factor logarithmic in $n$) on the scale of Hölder classes, and this logarithmic factor cannot be removed. It is important to note here that estimators attaining the best rate of convergence on a scale of classes are given by a random (measurable with respect to the observation) selection from a family of linear estimators.
In the present paper we propose a general procedure for selecting an estimator from a given family of linear estimators. Our procedure can be applied to any statistical experiment and to any family of linear estimators satisfying a certain commutativity condition (see Section 2). In particular, this commutativity condition holds for any kernel estimators (if one ignores boundary effects in the settings where they arise). We derive an upper bound on the risk of the selected estimator and show how this bound can be used to obtain minimax and adaptive minimax results in various problems (see Section 5). The selection procedure presented in this paper generalizes and develops the ideas of [13], [14], [16] for the Gaussian white noise and density models.
The paper is organized as follows. In Section 2 we discuss properties of linear estimators and the associated weight functions in the context of the general statistical experiment. The proposed selection procedure requires uniform upper bounds (majorants) for the seminorm of certain random processes. The corresponding definitions and notions are introduced in Section 3. In Section 4 we define the selection procedure and state and prove the main results of the paper. Section 5 illustrates the application of the main results to density estimation in $L_p$ and to estimation in the projection pursuit model with observations in white noise.
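2. Linear estimators.
2.1. Linear estimators and the associated weight functions. Let $\hat f$ be an estimator of the function $f$ based on the observation $X^{(n)}$, i.e. $\hat f$ is a $\mathfrak{B}^{(n)}$-measurable map from $\mathcal{X}^{(n)}$ to $\mathbb{F}_0$. Suppose that the expectation $\mathbb{E}^{(n)}_f[\hat f(x)]$, $x \in D$, exists for all $f \in \mathbb{F}$ and that $\mathbb{E}^{(n)}_f[\hat f(\cdot)] \in \mathbb{F}_0$.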
Definition 1. An estimator $\hat f$ of the function $f$ is called linear if there exist a function $K: D \times D \to \mathbb{R}$ and a $\sigma$-finite measure $\mu$ on $D$ such that
$$\mathbb{E}^{(n)}_f\big[\hat f(x)\big] = \int_D K(t, x) f(t)\,\mu(dt) \quad \forall f \in \mathbb{F},\ \forall x \in D. \tag{8}$$
In what follows any linear estimator is denoted by $\hat f_K$, with explicit indication of the corresponding function $K$ from definition (8).
Note that $\mu$ may or may not coincide with the measure $\nu$ defined above. Important examples of the measure $\mu$ are: (i) the Lebesgue measure on $D \subset \mathbb{R}^d$; (ii) the counting measure $\sum_{i=1}^n \delta_{Z_i}$, where $Z_i \in D$, $i = 1, \ldots, n$, is a given sequence and $\delta_z$ is the Dirac mass.
Thus, an estimator $\hat f$ is linear if for every $x \in D$ the expectation of $\hat f(x)$ is a linear functional of $f$. We call
$$S_K(x) = \int_D K(t, x) f(t)\,\mu(dt)$$
a linear smoother, which we regard as an approximation of the function $f$ at the point $x$.
The accuracy of a linear estimator $\hat f_K$ is characterized by the bias (the error of approximating $f$ by the linear smoother $S_K$)
$$B_K(f, x) = B_K(x) = \mathbb{E}^{(n)}_f\big[\hat f_K(x)\big] - f(x) = S_K(x) - f(x) \tag{9}$$
and by the stochastic error
$$\xi_K(f, x) = \xi_K(x) = \hat f_K(x) - \mathbb{E}^{(n)}_f\big[\hat f_K(x)\big].$$
In particular, $\hat f_K(x) - f(x) = B_K(f, x) + \xi_K(f, x)$, and by the triangle inequality
$$\mathcal{R}^{(n)}_\ell[\hat f_K; f] \le \ell(B_K) + \Big\{ \mathbb{E}^{(n)}_f \big[\ell(\xi_K)\big]^q \Big\}^{1/q}.$$
The last inequality is usually referred to in the literature as the bias-variance decomposition.
We now discuss some natural properties that the function $K$ in (8) should possess.
Definition 2. Let $D_0 \subseteq D$. A function $K: D \times D \to \mathbb{R}$ is called a weight function, or simply a weight, if
$$\int_D K(t, x)\,\mu(dt) = 1 \quad \forall x \in D_0.$$
The set of all such weight functions is denoted by $\mathcal{W}(D, D_0)$. For any weight $K$ we have
$$B_K(f, x) = \int_D K(t, x)[f(t) - f(x)]\,\mu(dt), \quad x \in D_0;$$
hence the bias of a constant function is identically zero. Weight functions are the key elements in the construction of linear estimators in various nonparametric models. We illustrate this fact using Examples 1–3 considered in Section 1.
Examples. 1. Regression model with deterministic design. Consider the regression model (2), where $Z_i$, $i = 1, \ldots, n$, are deterministic elements of $D$. Obviously, for any function $K: D \times D \to \mathbb{R}$ the estimator $\hat f_K(\cdot) = \sum_{i=1}^n K(Z_i, \cdot) Y_i$ is linear. Indeed,
$$\mathbb{E}^{(n)}_f\big[\hat f_K(\cdot)\big] = \sum_{i=1}^n K(Z_i, \cdot) f(Z_i) = \int_D K(t, \cdot) f(t)\,\mu(dt),$$
where $\mu = \sum_{i=1}^n \delta_{Z_i}$ is the counting measure on $D$.
2. Gaussian white noise model. Consider the model introduced in Example 2 of Section 1. By (4), for any function $K: D \times D \to \mathbb{R}$ satisfying $K(\cdot, x) \in L_2(D, \nu)$, the estimator $\hat f_K(x) = Y_n\big(K(\cdot, x)\big)$ is linear, and
$$\mathbb{E}^{(n)}_f\big[\hat f_K(x)\big] = \int_D K(t, x) f(t)\,\nu(dt).$$
Thus, in this example the measures $\mu$ and $\nu$ coincide.
3. Density model. In Example 3 of Section 1 the estimator $\hat f_K(\cdot) = n^{-1}\sum_{j=1}^n K(X_j, \cdot)$ is linear, since
$$\mathbb{E}^{(n)}_f\big[\hat f_K(\cdot)\big] = \int_D K(t, \cdot) f(t)\,\nu(dt). \tag{10}$$
This estimator can be constructed for any function $K: D \times D \to \mathbb{R}$ for which the integral in (10) is defined. Note that here again $\mu = \nu$. This example demonstrates that a linear estimator is not necessarily linear in the observations.
In the next definition we introduce a certain subset of the set $\mathcal{W}(D, D_0)$. Let $D_1$ be a set such that $D_0 \subseteq D_1 \subseteq D$.
Definition 3. A function $K: D \times D \to \mathbb{R}$ such that
$$\int_D K(t, x)\,\mu(dt) = 1 \quad \forall x \in D_1, \qquad \operatorname{supp} K(\cdot, x) \subseteq D_1 \quad \forall x \in D_0,$$
is called a $D_1$-weight function, or a $D_1$-weight. The set of all $D_1$-weights is denoted by $\mathcal{W}_{D_1}(D, D_0)$.
Clearly, any $D_1$-weight is a weight. Indeed, the first relation in the definition implies that $\mathcal{W}_{D_1}(D, D_0) \subset \mathcal{W}(D, D_1)$, and $\mathcal{W}(D, D_1) \subset \mathcal{W}(D, D_0)$ because $D_0 \subseteq D_1$. For $D \subseteq \mathbb{R}^d$ a typical example of a $D_1$-weight is given by a function $K$ satisfying the following conditions:
(i) for some real number $\delta > 0$,
$$K(t, x) = 0 \quad \forall t, x \in \mathbb{R}^d:\ \|t - x\| \ge \delta,$$
where $\|\cdot\|$ is the Euclidean norm;
(ii) $\int_{\mathbb{R}^d} K(t, x)\,\mu(dt) = 1$ for all $x \in \mathbb{R}^d$.
For fixed intervals $D_0 \subset D_1 \subset D \subset \mathbb{R}^d$ there exists a sufficiently small $\delta$ such that $K$ is a $D_1$-weight.
2.2. Commutative weight systems. We equip the set of all $D_1$-weights $\mathcal{W}_{D_1}(D, D_0)$ with the following operation: for any pair $K, K' \in \mathcal{W}_{D_1}(D, D_0)$ define
$$[K \otimes K'](\cdot, \cdot) = \int_{D_1} K(\cdot, y) K'(y, \cdot)\,\mu(dy).$$
Definition 4. We say that $K \in \mathcal{W}_{D_1}(D, D_0)$ and $K' \in \mathcal{W}_{D_1}(D, D_0)$ commute if
$$[K \otimes K'](\cdot, x) \equiv [K' \otimes K](\cdot, x) \quad \mu\text{-a.e.},\ \forall x \in D.$$
A subset $\mathcal{K}$ of the set $\mathcal{W}_{D_1}(D, D_0)$ is called a commutative weight system if any pair of its elements commutes.
Commutative weight systems play an important role in the construction and analysis of our selection procedure. In particular, under rather weak technical conditions the selection procedure can be applied to any family of linear estimators generated by a commutative weight system.
We now give several examples of commutative weight systems.
Example 4. Let $D = [-a, a]^d$, $D_0 = [-a_0, a_0]^d$ and $D_1 = [-a_1, a_1]^d$, where $a > a_1 > a_0 > 0$ are given real numbers. Put $\mu(dt) = dt$, and let $\mathcal{W}_\delta$ be the set of continuous functions $w: \mathbb{R}^d \to \mathbb{R}$ such that $\int_{\mathbb{R}^d} w(t)\,dt = 1$ and $\operatorname{supp}\{w\} \subseteq [-\delta, \delta]^d$, where $0 < \delta < \min\{a - a_1, a_1 - a_0\}$. Let
$$\mathcal{K}[\mathcal{W}_\delta] = \big\{ K: \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}:\ K(t, x) = w(t - x),\ w \in \mathcal{W}_\delta \big\}.$$
It is easy to see that the set $\mathcal{K}[\mathcal{W}_\delta]$ is a commutative weight system. Indeed, the integration in the definitions of a $D_1$-weight and of $[K \otimes K']$ can be replaced by integration over the whole space $\mathbb{R}^d$. Thus, the operation $\otimes$ is the convolution $*$, and therefore
$$[K \otimes K'] = [K * K'] = [K' * K] = [K' \otimes K].$$
In fact, up to boundary effects, any collection of weights corresponding to kernel estimators forms a commutative weight system. We refer the interested reader to [13], where various examples of such collections, corresponding to kernel estimators in structural models, are given.
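The commutativity of convolution weights can be illustrated on a discrete grid. The sketch below is an assumption-laden toy: the two kernels (a box and a triangle sampled at five points) and the helper names are hypothetical, and discrete convolution stands in for the integral defining $\otimes$.

```python
def convolve(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def normalize(values):
    """Rescale a sequence so that it sums to one (discrete analogue of integral 1)."""
    total = sum(values)
    return [v / total for v in values]

# Two compactly supported kernels sampled on a common grid (illustrative).
box = normalize([1.0] * 5)
triangle = normalize([1.0, 2.0, 3.0, 2.0, 1.0])

ab = convolve(box, triangle)   # discrete analogue of K (x) K'
ba = convolve(triangle, box)   # discrete analogue of K' (x) K
max_gap = max(abs(u - v) for u, v in zip(ab, ba))
total_mass = sum(ab)
```

The two products agree up to floating-point rounding, and the composed kernel still sums to one, mirroring the fact that $K \otimes K'$ is again a weight.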
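Example 5. Let $\Lambda$ be a subset of the space $l_2$, and let $\{\psi_k, k \in \mathbb{N}^d\}$ be an orthonormal basis of $L_2(D, \nu)$ with the following properties:
$$\psi_0 \equiv c \ne 0, \qquad \int_D \psi_k(t)\,\nu(dt) = 0 \quad \forall k \ne 0 = (0, \ldots, 0).$$
In particular, the tensor product of one-dimensional trigonometric bases satisfies these conditions.
Let $\lambda = (\lambda_k, k \in \mathbb{N}^d)$; consider the family of weights
$$\mathcal{K}[\Lambda] = \Big\{ K_\lambda: D \times D \to \mathbb{R}:\ K_\lambda(t, x) = \sum_{k \in \mathbb{N}^d} \lambda_k \psi_k(t) \psi_k(x),\ \lambda \in \Lambda \Big\}.$$
If $\lambda_0 \nu(D) = c^{-1}$ for all $\lambda \in \Lambda$, then $K_\lambda$ is a $D$-weight. Then $\mathcal{K}[\Lambda]$ is a commutative weight system for any $\Lambda \subseteq l_2$ if we choose $\mu = \nu$. Indeed, for all $\lambda, \lambda' \in \Lambda$ we have
$$[K_\lambda \otimes K_{\lambda'}](t, x) = \int_D \Big[\sum_{k \in \mathbb{N}^d} \lambda_k \psi_k(t) \psi_k(y)\Big] \Big[\sum_{r \in \mathbb{N}^d} \lambda'_r \psi_r(y) \psi_r(x)\Big] \nu(dy)$$
$$= \sum_{k \in \mathbb{N}^d,\, r \in \mathbb{N}^d} \lambda_k \lambda'_r \psi_k(t) \psi_r(x) \int_D \big[\psi_k(y) \psi_r(y)\big]\,\nu(dy)$$
$$= \sum_{k \in \mathbb{N}^d} \lambda_k \lambda'_k \psi_k(t) \psi_k(x) = [K_{\lambda'} \otimes K_\lambda](t, x).$$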
Example 6. Let $D_0 = \{Z_1, \ldots, Z_n\}$, where $Z_i$, $i = 1, \ldots, n$, are fixed points in $D$. Let $\mu = \sum_{i=1}^n \delta_{Z_i}$ and suppose that we are interested in estimating the function $f$ at the design points $D_0$. This is a typical problem arising in the regression model. Let $\mathcal{W}$ be a given set of functions $w: D \times D \to \mathbb{R}$; then the set of weights can be chosen as the set of $(n \times n)$-matrices
$$\mathcal{K} = \big\{ K = \{w(Z_i, Z_j)\}_{i,j=1,\ldots,n},\ w \in \mathcal{W} \big\}.$$
Consequently, $K \otimes K' = KK'$, and therefore $\mathcal{K}$ is a commutative weight system if and only if any pair of matrices in $\mathcal{K}$ commutes. A necessary and sufficient condition for this is that all matrices in $\mathcal{K}$ are simultaneously diagonalizable [19, Theorem 1.3.19]. Numerous examples of linear smoothers associated with commuting matrices are given in [22].
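A concrete instance of such a matrix family, offered only as an illustration, is the set of circulant matrices: they are all diagonalized by the discrete Fourier basis and hence commute. The sketch below assumes a periodic design of six points and two hypothetical smoothing rows; none of these values come from the paper.

```python
def circulant(first_row):
    """Circulant matrix with entries K[i][j] = first_row[(j - i) mod n]."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Two hypothetical smoothing matrices on a periodic design with n = 6 points.
K1 = circulant([0.5, 0.25, 0.0, 0.0, 0.0, 0.25])
K2 = circulant([0.4, 0.2, 0.1, 0.0, 0.1, 0.2])

P12 = matmul(K1, K2)   # K1 (x) K2
P21 = matmul(K2, K1)   # K2 (x) K1
commutator_gap = max(abs(P12[i][j] - P21[i][j])
                     for i in range(6) for j in range(6))

# Weight condition for the counting measure: every column sums to one.
column_sums = [sum(P12[i][j] for i in range(6)) for j in range(6)]
```

Here `commutator_gap` vanishes up to rounding, and the product of two weights is again a weight, since its columns still sum to one.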
The following simple statement is basic for the proposed selection rule.
Lemma 1. Let $\mathcal{K}$ be a commutative weight system; then for any pair of weights $K, K' \in \mathcal{K}$ and for any $f \in \mathbb{F}$ and $x \in D_0$
$$S_{K \otimes K'}(x) - S_{K'}(x) = \int_D K'(y, x) B_K(f, y)\,\mu(dy), \tag{11}$$
$$S_{K \otimes K'}(x) - S_K(x) = \int_D K(y, x) B_{K'}(f, y)\,\mu(dy). \tag{12}$$
Proof. We first establish (11). Using Fubini's theorem and the definition of a $D_1$-weight, we have for any $x \in D_0$
$$\int_D [K \otimes K'](t, x) f(t)\,\mu(dt) = \int_D \Big[\int_{D_1} K(t, y) K'(y, x)\,\mu(dy)\Big] f(t)\,\mu(dt)$$
$$= \int_{D_1} K'(y, x) f(y)\,\mu(dy) + \int_{D_1} K'(y, x) \Big[\int_D K(t, y)\big(f(t) - f(y)\big)\,\mu(dt)\Big] \mu(dy).$$
It remains to note that $\int_D K(t, y)[f(t) - f(y)]\,\mu(dt) = B_K(f, y)$ and that, by the definition of a $D_1$-weight, integration of the weight $K(\cdot, x)$, $x \in D_0$, over $D_1$ can be replaced by integration over $D$.
Since (11) is established for an arbitrary pair of weights and, by commutativity, $S_{K \otimes K'} = S_{K' \otimes K}$, we arrive at (12). The lemma is proved.
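Identities (11) and (12) can be checked numerically in the matrix setting of Example 6. The following sketch assumes a periodic design with $D = D_0 = D_1$, the counting measure, two hypothetical circulant weights, and an arbitrary test function; all of these are illustrative choices.

```python
def circulant(first_row):
    """Circulant matrix; any two circulant matrices commute."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply_weight(K, g):
    """(integral of K(t, x_j) g(t) mu(dt)) = sum_i K[i][j] * g[i]."""
    n = len(g)
    return [sum(K[i][j] * g[i] for i in range(n)) for j in range(n)]

def max_gap(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

f = [0.0, 1.0, 4.0, 9.0, 4.0, 1.0]               # values of f at the design points
K = circulant([0.5, 0.25, 0.0, 0.0, 0.0, 0.25])
Kp = circulant([0.4, 0.2, 0.1, 0.0, 0.1, 0.2])   # plays the role of K'

S_K, S_Kp = apply_weight(K, f), apply_weight(Kp, f)
S_prod = apply_weight(matmul(K, Kp), f)          # smoother of K (x) K'
B_K = [s - v for s, v in zip(S_K, f)]            # bias of K, see (9)
B_Kp = [s - v for s, v in zip(S_Kp, f)]          # bias of K'

gap11 = max_gap([a - b for a, b in zip(S_prod, S_Kp)], apply_weight(Kp, B_K))
gap12 = max_gap([a - b for a, b in zip(S_prod, S_K)], apply_weight(K, B_Kp))
```

Both gaps vanish up to rounding. Identity (11) holds for any pair of weights, whereas (12) relies on $KK' = K'K$, which is where commutativity enters.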
3. Majorants. Our selection rule requires upper functions (which we call majorants) for families of random variables indexed by weights and associated with the stochastic errors of linear estimators. Majorants can be deterministic or random, depending on the problem under consideration. In this section we give the necessary definitions and formulate general conditions on majorants.
Let $\mathcal{K}$ be a system of weights; for any pair $K, K' \in \mathcal{K}$ put
$$\zeta_{K,K'} = \ell\big(\xi_{K \otimes K'} - \xi_K\big) \vee \ell\big(\xi_{K' \otimes K} - \xi_{K'}\big).$$
We now introduce the definition of a uniform upper bound for the family of random variables
$$\{\zeta_{K,K'},\ K, K' \in \mathcal{K}\}.$$
Definition 5. Let $\delta \in (0, 1)$ be a fixed number. We say that a family of $\mathfrak{B}^{(n)}$-measurable positive functions $\{M_{K,K'}(\delta),\ K, K' \in \mathcal{K}\}$ is a $\delta$-majorant if for all $f \in \mathbb{F}$ and $n \in \mathbb{N}^*$
(i) $\sup_{K,K' \in \mathcal{K}} [\zeta_{K,K'} - M_{K,K'}(\delta)]$ is $\mathfrak{B}^{(n)}$-measurable;
(ii) $\mathbb{E}^{(n)}_f \big\{ \sup_{K,K' \in \mathcal{K}} [\zeta_{K,K'} - M_{K,K'}(\delta)]_+^q \big\} \le \delta^q$.
Here $[a]_+ = \max(a, 0)$, $a \in \mathbb{R}$.
Note that a $\delta$-majorant is not uniquely defined. However, the main results of this paper, presented in Section 4.2, hold for any $\delta$-majorant. Of course, in specific problems we are interested in minimal majorants. The main tool for finding them is exponential inequalities for seminorms of Gaussian and empirical processes. We refer the reader to the recent paper [15] devoted to this topic. By definition, a $\delta$-majorant is a function of the observation $X^{(n)}$. Note, however, that in problems where the distribution of the stochastic error does not depend on the function to be estimated (the Gaussian white noise and regression models), the $\delta$-majorant $M_{K,K'}(\delta)$ can often be chosen to be deterministic.
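4. Selection rule and main results. In this section we define the rule for selecting from the family of linear estimators $\mathcal{F}(\mathcal{K}) = \{\hat f_K, K \in \mathcal{K}\}$, where $\mathcal{K}$ is a commutative weight system.
Denote
$$M^*_K(\delta) = \sup_{K' \in \mathcal{K}} M_{K',K}(\delta) \quad \forall K \in \mathcal{K},$$
and assume that $M^*_K(\delta)$ is $\mathfrak{B}^{(n)}$-measurable for all $K \in \mathcal{K}$.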
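4.1. Selection rule. For any $K \in \mathcal{K}$ put
$$\hat R_K = \sup_{K' \in \mathcal{K}} \Big[ \ell\big(\hat f_{K \otimes K'} - \hat f_{K'}\big) - M_{K,K'}(\delta) \Big] + 2M^*_K(\delta)$$
and define the selection rule as follows:
$$\bar K = \arg\inf_{K \in \mathcal{K}} \hat R_K. \tag{13}$$
If $\bar K \in \mathcal{K}$ and $\bar K$ is $\mathfrak{B}^{(n)}$-measurable, then the selection rule (13) leads to the estimator
$$\bar f = \hat f_{\bar K}. \tag{14}$$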
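Remark 1. Note that $\hat R_K \ge M^*_K(\delta)$ for all $K \in \mathcal{K}$. This inequality follows immediately from the definitions of $\hat R_K$ and $M^*_K(\delta)$:
$$\hat R_K \ge \Big[ \ell\big(\hat f_{K \otimes K} - \hat f_K\big) - M_{K,K}(\delta) \Big] + 2M^*_K(\delta) \ge M^*_K(\delta).$$
In particular, this means that the functional minimized in (13) is positive.
Remark 2. Although $\mathcal{K}$ is a commutative weight system, which implies $S_{K \otimes K'} = S_{K' \otimes K}$, the equality $\hat f_{K \otimes K'} = \hat f_{K' \otimes K}$ may fail. However, if the latter equality holds, then $\hat R_K$ in (13) can be redefined as follows:
$$\hat R_K = \sup_{K' \in \mathcal{K}} \Big[ \ell\big(\hat f_{K \otimes K'} - \hat f_{K'}\big) - M_{K,K'}(\delta) \Big] + M^*_K(\delta).$$
It is easy to see that in this case $\hat R_K \ge 0$ for all $K \in \mathcal{K}$.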
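To make rule (13) concrete, here is a minimal sketch for a finite family of commuting smoothing matrices in a fixed-design regression with the sup-norm as $\ell$. Everything model-specific is an assumption: the circulant moving-average family, the simulated data, and in particular `placeholder_majorant`, which is a heuristic stand-in and not a $\delta$-majorant derived in the paper.

```python
import math
import random

def circulant(first_row):
    """Circulant matrix; every pair of circulant matrices commutes."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply_weight(K, y):
    """f_hat_K(Z_j) = sum_i K[i][j] * Y_i."""
    n = len(y)
    return [sum(K[i][j] * y[i] for i in range(n)) for j in range(n)]

def sup_norm(v):
    return max(abs(x) for x in v)

def max_col_norm(A):
    n = len(A)
    return max(math.sqrt(sum(A[i][j] ** 2 for i in range(n))) for j in range(n))

def placeholder_majorant(family, scale):
    """Heuristic stand-in for M_{K,K'}(delta); NOT the majorant of the paper."""
    m = len(family)
    M = [[0.0] * m for _ in range(m)]
    for k in range(m):
        for kp in range(m):
            prod = matmul(family[k], family[kp])   # equals K'K by commutativity
            d1 = [[p - a for p, a in zip(pr, ar)] for pr, ar in zip(prod, family[k])]
            d2 = [[p - a for p, a in zip(pr, ar)] for pr, ar in zip(prod, family[kp])]
            M[k][kp] = scale * max(max_col_norm(d1), max_col_norm(d2))
    return M

def select_weight(family, y, M):
    """Index minimising R_hat_K over the family, with all R_hat_K and M*_K."""
    m = len(family)
    fhat = [apply_weight(K, y) for K in family]
    m_star = [max(M[kp][k] for kp in range(m)) for k in range(m)]  # sup_{K'} M_{K',K}
    scores = []
    for k in range(m):
        gaps = []
        for kp in range(m):
            f_prod = apply_weight(matmul(family[k], family[kp]), y)
            dist = sup_norm([a - b for a, b in zip(f_prod, fhat[kp])])
            gaps.append(dist - M[k][kp])
        scores.append(max(gaps) + 2.0 * m_star[k])
    best = min(range(m), key=lambda k: scores[k])
    return best, scores, m_star

def moving_average_row(n, r):
    """First row of a periodic moving average over 2r + 1 neighbours."""
    row = [0.0] * n
    for s in range(-r, r + 1):
        row[s % n] = 1.0 / (2 * r + 1)
    return row

random.seed(0)
n, sigma = 24, 0.3
family = [circulant(moving_average_row(n, r)) for r in (0, 1, 2, 4)]
truth = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
y = [t + random.gauss(0.0, sigma) for t in truth]
scale = sigma * math.sqrt(2.0 * math.log(n * len(family) ** 2))
M = placeholder_majorant(family, scale)
best, scores, m_star = select_weight(family, y, M)
```

The selection loop itself does not depend on the model; only the majorant does. Whatever majorant is supplied, each score satisfies the lower bound $\hat R_K \ge M^*_K(\delta)$ of Remark 1, which gives a simple sanity check on an implementation.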
Remark 3. Our main results are established for the general statistical experiment, and therefore we assume that a $\delta$-majorant is given. In contrast to the selection rule itself, which does not depend on the model, the $\delta$-majorant is determined by the specific model, and finding it can be a difficult problem. Examples of $\delta$-majorants for some specific problems are presented in Section 5.
We emphasize that the proof of measurability of $\bar K$ and of its membership in the family $\mathcal{K}$ can be simplified if the selection rule is slightly redefined. Indeed, let $\hat K \in \mathcal{K}$ be such that for a fixed number $\tau > 0$ the inequality
$$\hat R_{\hat K} \le \inf_{K \in \mathcal{K}} \hat R_K + \tau$$
holds. Note that the existence of such $\hat K$ requires no proof, and the problem of measurable selection is also considerably simplified. Thus, if the weight $\hat K$ is $\mathfrak{B}^{(n)}$-measurable, we arrive at the estimator
$$\hat f = \hat f_{\hat K},$$
for which we shall prove our main result, Theorem 1.
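4.2. Main results. We introduce the following notation: for any $K \in \mathcal{K}$ put
$$\mathcal{E}_K(\ell, f) = \sup_{K' \in \mathcal{K}} \ell\Big( \int_D K'(t, \cdot) B_K(f, t)\,\mu(dt) \Big),$$
where $B_K(f, \cdot)$ is the bias of the linear estimator associated with the weight $K$ (see (9)).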
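Theorem 1. Let $\mathcal{K}$ be a commutative weight system and let $\mathcal{F}(\mathcal{K})$ be the corresponding family of linear estimators. Suppose that for any $\tau > 0$ there exists a $\mathfrak{B}^{(n)}$-measurable weight $\hat K \in \mathcal{K}$ satisfying the inequality
$$\hat R_{\hat K} \le \inf_{K \in \mathcal{K}} \hat R_K + \tau.$$
Then for any $f \in \mathbb{F}$, $n \in \mathbb{N}^*$, $\delta > 0$ and $\tau > 0$
$$\mathcal{R}^{(n)}_\ell[\hat f_{\hat K}; f] \le \inf_{K \in \mathcal{K}} \Big\{ \mathcal{R}^{(n)}_\ell[\hat f_K; f] + 3\mathcal{E}_K(\ell, f) + 5\Big( \mathbb{E}^{(n)}_f \big[M^*_K(\delta)\big]^q \Big)^{1/q} \Big\} + 3\delta + 2\tau.$$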
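Proof. Using the triangle inequality, we can write for any $K \in \mathcal{K}$
$$\ell\big(\hat f_{\hat K} - f\big) \le \ell\big(\hat f_{\hat K} - \hat f_{\hat K \otimes K}\big) + \ell\big(\hat f_{\hat K \otimes K} - \hat f_K\big) + \ell\big(\hat f_K - f\big).$$
We bound each term on the right-hand side of the last inequality separately.
By Lemma 1, for any $K \in \mathcal{K}$
$$\hat R_K - 2M^*_K(\delta) = \sup_{K' \in \mathcal{K}} \Big[ \ell\big(\hat f_{K \otimes K'} - \hat f_{K'}\big) - M_{K,K'}(\delta) \Big]$$
$$\le \mathcal{E}_K(\ell, f) + \sup_{K' \in \mathcal{K}} \Big[ \ell\big(\xi_{K \otimes K'} - \xi_{K'}\big) - M_{K,K'}(\delta) \Big]$$
$$\le \mathcal{E}_K(\ell, f) + \sup_{K,K' \in \mathcal{K}} \big[ \zeta_{K,K'} - M_{K,K'}(\delta) \big]_+.$$
Thus,
$$\hat R_K \le \mathcal{E}_K(\ell, f) + 2M^*_K(\delta) + \zeta, \tag{15}$$
where for brevity we write $\zeta = \sup_{K,K' \in \mathcal{K}} \big[ \zeta_{K,K'} - M_{K,K'}(\delta) \big]_+$.
Furthermore, applying Lemma 1, we obtain for any pair of weights $L, L' \in \mathcal{K}$
$$\ell\big(\hat f_{L \otimes L'} - \hat f_L\big) \le \mathcal{E}_{L'}(\ell, f) + \ell\big(\xi_{L \otimes L'} - \xi_L\big),$$
and, consequently,
$$\ell\big(\hat f_{L \otimes L'} - \hat f_L\big) \le \mathcal{E}_{L'}(\ell, f) + M_{L',L}(\delta) + \zeta \le \mathcal{E}_{L'}(\ell, f) + M^*_L(\delta) + \zeta.$$
Setting $L' = K$ and $L = \hat K$, we get
$$\ell\big(\hat f_{\hat K \otimes K} - \hat f_{\hat K}\big) \le \mathcal{E}_K(\ell, f) + M^*_{\hat K}(\delta) + \zeta. \tag{16}$$
By Remark 1, the definition of $\hat K$ and inequality (15), we have for any $K \in \mathcal{K}$
$$M^*_{\hat K}(\delta) \le \hat R_{\hat K} \le \hat R_K + \tau \le \mathcal{E}_K(\ell, f) + 2M^*_K(\delta) + \zeta + \tau,$$
which together with (16) yields
$$\ell\big(\hat f_{\hat K \otimes K} - \hat f_{\hat K}\big) \le 2\mathcal{E}_K(\ell, f) + 2M^*_K(\delta) + 2\zeta + \tau. \tag{17}$$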