• Aucun résultat trouvé

A dual adaptive control theory inspired by Hebbian associative learning

N/A
N/A
Protected

Academic year: 2021

Partager "A dual adaptive control theory inspired by Hebbian associative learning"

Copied!
7
0
0

Texte intégral

(1)

A dual adaptive control theory inspired

by Hebbian associative learning

The MIT Faculty has made this article openly available.

Please share

how this access benefits you. Your story matters.

Citation

Jun-e Feng, Chung Tin, and Chi-Sang Poon. “A dual adaptive

control theory inspired by Hebbian associative learning.” Decision

and Control, 2009 held jointly with the 2009 28th Chinese Control

Conference. CDC/CCC 2009. Proceedings of the 48th IEEE

Conference on. 2009. 4505-4510. © 2010 Institute of Electrical and

Electronics Engineers.

As Published

http://dx.doi.org/10.1109/CDC.2009.5400831

Publisher

Institute of Electrical and Electronics Engineers

Version

Final published version

Citable link

http://hdl.handle.net/1721.1/59424

Terms of Use

Article is made available in accordance with the publisher's

policy and may be subject to US copyright law. Please refer to the

publisher's site for terms of use.

(2)

A dual adaptive control theory inspired by Hebbian associative learning

Jun-e Feng, Chung Tin and Chi-Sang Poon IEEE Fellow

Abstract— Hebbian associative learning is a common form of neuronal adaptation in the brain and is important for many physiological functions such as motor learning, classical condi-tioning and operant condicondi-tioning. Here we show that a Hebbian associative learning synapse is an ideal neuronal substrate for the simultaneous implementation of high-gain adaptive control (HGAC) and model-reference adaptive control (MRAC), two classical adaptive control paradigms. The resultant dual adap-tive control (DAC) scheme is shown to achieve superior track-ing performance compared to both HGAC and MRAC, with increased convergence speed and improved robustness against disturbances and adaptation instability. The relationships be-tween convergence rate and adaptation gain/error feedback gain are demonstrated via numerical simulations. According to these relationships, a tradeoff between the convergence rate and overshoot exists with respect to the choice of adaptation gain and error feedback gain.

I. INTRODUCTION

The remarkable ability of the nervous system to discern unknown or underdetermined changes in the environment has attracted much attention in the control systems and computer science literature. One reason of its rising popularity is that it inspires new insights for tackling some difficult issues such as robustness and computation efficiency, which are important for control systems in real-life problems. After all, the brain is an excellent example of an intelligent controller in many aspects.

In this regard, a biologically plausible adaptive control strategy will be of extreme interest to the parties of both control theory and their biological counterpart. Questions of biological realism are important not only for the sake of physiological validity, but also because they may reveal important design principles that are unique to biological neural networks. Such biologically based principles may lend improved control performance more compatible brain-like behavior that is otherwise not possible. Biological plausi-bility is also essential consideration in any hardware (e.g. VLSI) implementation of the model for massively parallel computation.

On the other hand, the theory of adaptive control has been extensively studied in the last several decades, and has

This work was supported by NIH grants HL072849, HL067966 and EB005460. We thank Dr. Anuradha Annaswamy for helpful discussions.

Jun-e Feng is with the School of Mathematics, Shandong University, Jinan, 250100, P.R.China, and also was with Harvard-MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, MA 02139.fengjune@sdu.edu.cn

Chung Tin is with Harvard-MIT Division of Health Sciences and Tech-nology and Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139.ctin@mit.edu

Chi-Sang Poon is with Harvard-MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, MA 02139. Tel: +1-617-2585405; Fax: +1-617-2587906cpoon@mit.edu

inspired many applications in the areas of robotics, aircraft and rocket control, chemical processes, power system and ship steering, etc. These theories may lend new concepts to the characterization of biologically-based neural control paradigms. In particular, high-gain adaptive control (HGAC) and model reference adaptive control (MRAC) are two basic adaptive control strategies that have found wide applications in these areas [7], [8], [12], [21]. However, there have been few studies on the dependence of convergence rate on various control parameters. Paper [20] establishes an adaptive observer for a class of nonlinear systems with arbitrary exponential rate of convergence. Tracking errors and prediction errors are combined in [29] in order to further improve the performance of the adaptive controller. In [30], the adaptive controller is constructed using an estimate of prediction error and the proportional-integral or integral-relay adjustment law. These previous results suggest that suitable combinations of different adaptation strategies may be advantageous in improving the overall performance of the resultant adaptive system.

In this paper, we propose a biologically plausible neural control paradigm based on the theory of classical condi-tioning and operant condicondi-tioning [1] and show that they are analogous to a combination of HGAC and MRAC. Figure 1 shows a hypothetical neuronal mechanism of reinforcement learning based on operant conditioning. A strong stimulus (unconditioned stimulus,zU S(t)) arriving at a synaptic site

may induce significant changes in post-synaptic potential which, when paired with a weak input (conditioned stimulus, zCS(t)), may strengthen the synapse in the conditioned

stimulus pathway through some associative form of Hebbian long-term potentiation [2], [3]. In operant conditioning, the reinforcement signalzU S(t) may derive from the behavioral

response y(t) in a closed loop. In particular, if zU S(t) is a

feedback error signal (e.g. tracking error in a movement task) the resultant learning rule is same as the delta rule commonly used in feedforward neural networks [4], [5], [6] and MRAC [29], [8].

It turns out that implementing MRAC with Hebbian learn-ing rules concurrently brlearn-ings about HGAC, since zU S(t)

itself is a powerful input that may exert profound influence on system response. The resultant dual control model is interesting in two respects. From an engineering standpoint, dual adaptive control proves to afford improved system per-formance that is not possible with either mode alone. From a biological standpoint, dual adaptive control by Hebbian learning suggests a possible mechanism whereby intelligent control may be achieved by biological systems.

This paper is organized as follows. We begin in the

(3)

ing section with a brief overview of HGAC and MRAC and a necessary lemma to be used in the subsequent analysis. We also examine the stability and convergence rate of HGAC and MRAC and their relations to the corresponding adaptation gains. In section III, we will formulate the dual adaptive control (DAC) strategy. Its performance is tested against sev-eral types of uncertainties via numerical simulation. Finally, section IV concludes the whole paper.

Error Feedback, e(t) Conditioned stimulus, zCS(t) Output y(t) Reference, ym(t)

G

e

C

Fig. 1. Feedback error learning based on operant conditioning. A conditioned stimulus zCS(t) with weak connectivity may

be-come sensitized after repetitive pairing with a strong conditioned stimulus, zCS(t). In this case, zU S(t) is the error feedback

sign and is itself a powerful input that may exert profound influence on system respose. The zU S(t) or e(t) pathway also

undergoes homosynaptic Hebbian learning so that its connectivity is strengthened by the continued presence of the feedback signal.

II. PRELIMINARIES ANDBACKGROUND

In this section we give a brief outline for two stable adap-tive control strategies: high-gain adapadap-tive control (HGAC) and model reference adaptive control (MRAC). Both strate-gies have been extensively studied in traditional control literature. In what follows we consider only linear, strictly minimum-phase SISO systems unless otherwise stated. A. High-gain adaptive control

Letym, ypbe the reference target and output of the

closed-loop system, respectively, and hp(s) ≡ gp· αp(s)/βp(s) be

the process to be controlled, where αp and βp are monic,

coprime polynomials andgpis a non-zero constant. Assume

that:

(p1) The sign of gp is known.

(p2) The relative degree (pole excess) △p≡ deg[βp(s)] −

deg[αp(s)] is 0 or 1.

Suppose the target is zero origin, i.e., ym ≡ 0 for all

t > 0. Then, from classical control theory the control u = −sgn(gp)k0e, where e = yp− ymis the tracking error, will

stabilize the system for some gaink0> 0 that is sufficiently

large. For any fixed k0, the system structure represents a

proportional feedback controller.

In practice, the value ofk0needed to stabilize the process

may not be known a priori because αp and βp are not

completely specified. In this event the feedback gain may be adaptively tuned, starting from an arbitrary initial value, by using an adaptation law that causesk0 to increase whenever

e 6= 0. A commonly used law is ˙k0(t) = ηe

2

(1)

where the adaptation rate η > 0 is an arbitrary constant. It is well known that e decays to 0 asymptotically as k0(t) converges to some finite limiting value. This adaptive

stabilization scheme using high-gain negative feedback may also be generalized to other situations where the sign ofgp

is unknown [26], [32]; the pole excess is greater than1 (see [22], [23]); the process is nonlinear [18]; and plants are with multiple inputs and multiple outputs [13].

On the other hand, the high-gain adaptive controller (1) suffers from a lack of robustness with respect to bounded disturbances. Sigma modification achieves boundedness of all solutions [14], [17], but it does not maintain stability in the noise-free case [27]. In [28], two data-driven stability in-dicators have been introduced in order to improve robustness. However, ifymis non-zero and/or time-varying, thene does

not generally go to zero for any finitek0. Thus HGAC is not

applicable to problems involving regulation about nonzero targets or tracking of time-varying targets.

B. Model reference adaptive control

Suppose the process as defined above is to be controlled to track a nonzero and time-varying targetym(t) derived from

a n-th order reference model with given transfer function hm(s) ≡ gm· αm(s)/βm(s), where αm andβm are monic,

coprime polynomials. Assume that hm(s) is strictly proper

and, further, is strictly positive real (SPR) [8], [24], [29], which implies that △m = 1, where △m ≡ deg[βm(s)] −

deg[αm(s)] is the pole excess of hm(s). Systems satisfying

the above conditions are said to be dissipative. For simplicity and without loss of generality we assume thatgm> 0 and

gp> 0. In addition, we assume that the order of the process

is related to the model order as follows: (p3) deg[βp(s)] ≤ n.

(p4) 0 ≤ △p≤ △m.

Compared to HGAC, more assumptions about the process dynamics are needed here. Ifhp(s) is completely given, one

may readily design appropriate feedback and feedforward compensators that would balance any differences between process and model dynamics.

Ifhp(s) is not completely known, the above compensation

gains must be adaptively tuned to achieve perfect tracking. A common controller

u(t) = kT(t) · z(t) (2) and the corresponding adaptive law is the delta rule [8], [24], [29]: ˙k(t) = −sgn(gp)ηΛz(t)e(t) (3) where k = £ k1 k2 κT1 κT2 ¤T , z = £ r yp vT1 v2T ¤T

, v1 and v2 are the outputs

of the compensation filters, sgn(gp) = 1 (by

assumption), and Λ = diag£

λ1 λ2 ... λ2n

¤ with λi> 0, i = 1, 2, ..., 2n.

Such tuning may not always be feasible because the adap-tive system may not converge. However, the SPR condition ensures global stability for the above adaptive system, such that e goes to 0 asymptotically. Under this condition, the

ThB09.4

(4)

gain k will converge to the desired values provided z is persistently exciting [7], [8], [21].

In the more general case where △m > 1, the SPR

condition no longer holds but stability may be achieved by using an “augmented error” for gain adaptation [8], [24], [29]. In this paper, however, we will only consider reference models that satisfy SPR condition unless stated otherwise.

Finally we introduce one lemma which is called Meyer-Kalman-Yacubovich Positive Real Lemma or Popov-Kalman-Yacubovich Positive Real Lemma [7], [8], [21].

Lemma 1: Let H(s) be a SPR rational transfer function with H(∞) = 0. Then H(s) has a minimal state variable realization{A, b, c} with H(s) = c(sI − A)−1b, where

A + AT = −QQT − γ2I, b = cT

(4) for some non-zero matrixQ and non-zero constant γ.

III. CONVERGENCE RULES FORHGACANDMRAC A. Convergence of HGAC

From (1) it can be shown that convergence speed of HGAC is increased with increasing η. The maximum convergence rate that can be achieved by increasing η is dependent on △p. For△p= 1, convergence of the adaptive system can be

made arbitrarily fast by increasingη toward infinity [19]. In the limit, the response of the closed-loop system is dictated by the high-gain feedback. For△p= 2, convergence rate is

limited by system dynamics which is characterized by a pair of complex conjugate poles whose real part is independent of the loop gain (as long as it exceeds a critical value) [22]. For△p> 2 , the adaptive system may become unstable for

larger η since some complex pole pairs of the closed-loop system may move toward the right hand-plane as the loop gain is increased [12], [23].

B. Convergence of MRAC

Although MRAC is known to be stable, factors governing its rate of convergence are poorly understood. Generally, one cannot increase the convergence rate of MRAC by simply increasing the adaptation gain. Transient behaviors of adap-tive control have not been well considered in the literature until [34]. Afterwards, there has been increasing interest in this subject (see [11], [25], [31]). A MRAC algorithm which utilizes multiple models is proposed in [11] to improve the transient performance under large parameter variations for a class of SISO systems. Reference [25] develops a strategy for improving the transient response by switching between multiple models of the plant. In [31], the estimation error generated by the identification scheme is used as a control signal to counteract errors resulting from the certainty equivalence design for improving the transient performance of adaptive systems. However, none of these references deals with the relationship between the convergence rate and the adaptation gain/feedback gain.

For MRAC (2) and (3), the error equation is e(t) = yp(t) − ym(t) = gp gm hm(s)[zT(t)δk(t)] (5) where δk(t) = k(t) − k∗ and k∗

is the set of optimal parameter values for the controller. From (3), we know that δk(t) satisfies the following differential equation:

˙

δk(t) = −sgn(gp)ηΛz(t)e(t) = −ηΛz(t)e(t). (6)

Now we present the general result on the convergence of MRAC for SPR models. The proof of exponential stability for the adaptive system (5) and (6) can be obtained from Lemma B.2.3 in reference [21] via some system transforma-tions and a coordinate transformation.

Lemma 2: (Exponential stability for MRAC) For any positiveη > 0, the adaptive system (5) and (6) is exponen-tially stable, if

(1) hm(s) is SPR;

(2) kz(t)k and k ˙z(t)k are uniformly bounded; (3) z(t) is persistently exciting.

Proof. Let δx =£

x1 x2 ... xn ¤ T

be the model state and{Am, bm, cm} be a minimal realization satisfying (4) in

Lemma 1, and e(t) = cmδx(t), which means that hm(s) =

cm(sI − Am)−1bm,Am+ ATm= −QQT− γ 2I, b

m= cTm.

Then the error equation can be written as · δ ˙x(t) δ ˙k(t) ¸ = · A m gp gmbmz T(t) −ηΛz(t)cm 0 ¸ · δx(t) δk(t) ¸ . (7) Taking ¯bm = gp gmbm and η = ηˆ gm

gp, we get the equivalent

form of (7): · δ ˙x(t) δ ˙k(t) ¸ = · Am ¯bmzT(t) −ˆηΛz(t)¯bT m 0 ¸ · δx(t) δk(t) ¸ . (8)

Without loss of generality we assume Λ = I. Indeed, if Λ 6= I, consider the change of coordinates

δx = δx, ¯δk = (LT)−1

δk, Λ = LT

L, L ∈ R2n×2n.

In the new coordinates, definez(t) = Lz(t), then (8) can be¯ equivalently rewritten as · δ ˙x(t) ˙¯ δk(t) ¸ = · Am ¯bmz¯T(t) −ˆη¯z(t)¯bT m 0 ¸ · δx(t) ¯ δk(t) ¸ . (9)

Letηˆ12δk = ¯ˆ δk, then we can describe (9) equivalently as

" δ ˙x(t) ˙ˆ δk(t) # = · Am ηˆ 1 2¯bmz¯T(t) −ˆη12z(t)¯b¯ Tm 0 ¸ · δx(t) ˆ δk(t) ¸ . (10) Furthermore, it is easy to check that z(t) is persistently¯ exciting and thatk¯z(t)k and k ˙¯z(t)k are uniformly bounded, since z(t) = Lz(t) and L is nonsingular. Hence we have¯ verified that system (10) satisfies all conditions in Theorem 2.3 of reference [7]. Therefore, system (10) is exponentially

stable. ¤

C. Relationship between convergence rateβ and adaptation gainη for MRAC

In this subsection, we shall discuss the relationship be-tween the convergence rate β and adaptation gain η for MRAC via numerical example, which will show that, when adaptation gain exceeds a certain threshold, the system

(5)

converges more slowly as the adaptation gain increases. Therefore, there is a limit for improving the convergence rate by increasing the adaptation gainη alone.

Example 1: Consider the second-order system with

Ap = · 0 1 1 0 ¸ , Bp= · 2 0 ¸ , Cp= £ 1 2 ¤ . Take the model reference system with

Am= · −4 1 1 −4 ¸ , Bm= · 1 0 ¸ , Cm=£ 1 0 ¤ ,

r(t) = sin(t) + sin(10t), and the initial values xp= xm=

£ 0 0 ¤T , z(0) = £ 1 1 1 0 ¤T , e(0) = 0, δk(0) = £ 0.6 2.5 7.5 −3.5 ¤T

and compensation filters initial valuesv(0) =£

1 1 ¤T .

The simulation results are shown in Figure 2. Figures 2(a), 2(b) and 2(c) show the tracking error forη = 500, η = 1000 and η = 1500, respectively, over time. These figures show that the error converges the fastest withη = 1000, compared with η = 500 or 1500. Hence, by further increasing the adaptation gain from 1000 to 1500, the system performance actually get worse. A more sophisticated tweak to the control strategy will be necessary to further improve the system performance. The following section will introduce more efficient controller, DAC.

IV. DUAL ADAPTIVE CONTROL

A. Establishment of Dual adaptive controller

We now derive the dual adaptive control (DAC) scheme that draws together MRAC and HGAC, and discuss the relationship between convergence rate and error feedback gain. First introduce the error feedback into controller (2), i.e.,

u(t) = kT(t) · z(t) − k0e(t) (11)

where k(t) still satisfies (3). We call the corresponding control scheme dual adaptive control (DAC); see Fig.3. In this case, the equivalent error equation for DAC system is

e(t) = gp gm

hm(s)[zT(t)δk(t) − k0e(t)]. (12)

The corresponding error equation becomes · δ ˙x(t) δ ˙k(t) ¸ = · ∆m gp gmbmz T(t) −ηΛz(t)cm 0 ¸ · δx(t) δk(t) ¸ . (13) where∆m= Am−k0 gp gmbmcm. SinceAm+A T m= −QQT− γ2I, we have Am− k0 gp gm bmcm+ (Am− k0 gp gm bmcm)T = −(QQT + γ2 I + 2k0 gp gm bmcm) = −(D + 2k0 gp gm bmcm) = −(D + 2k0 gp gm bmbTm) , − ¯D. 0 20 40 60 80 100 −0.015 −0.01 −0.005 0 0.005 0.01 0.015 Time [s] Tracking error e η=500 (a) η= 500 0 20 40 60 80 100 −0.015 −0.01 −0.005 0 0.005 0.01 0.015 Time [s] Tracking error e η=1000 (b) η= 1000 0 20 40 60 80 100 −0.015 −0.01 −0.005 0 0.005 0.01 0.015 Time [s] Tracking error e η=1500 (c) η= 1500

Fig. 2. Simulation for Example 1. 2(a), 2(b) and 2(c) are about the tracking error for η= 500, η = 1000 and η = 1500. These figures show that the tracking error for η= 1000 conveges faster than those of η= 500 and η = 1500.

Noting that error equations (13) and (7) have essentially the same structure, we obtain the following convergence rule for DAC, by direct applicating of Lemma 2.

Theorem 1: (Convergence rule for DAC with givenk0)

For any positiveη > 0 and k0> 0, if the DAC system (13)

satisfies all conditions in Lemma 2, then (13) is exponentially stable.

The advantages of introducing of error feedback into MRAC will be shown via the following example.

Example 2: Consider the system in Example 1. In order to improve the speed of tracking, we introduce error feedback. Figure 4 shows the tracking errors for error feedback gain k0 = 100 (pink line) and k0 = 0 (black line) under

adaptation gain η = 1000. It can be seen that the tracking error converges much faster with error feedback than without.

ThB09.4

(6)

e d e e e r u0 yp ym W1 W0 W2 Plant G + + _

Fig. 3. Dual Adaptive Control

0 20 40 60 80 100 −0.015 −0.01 −0.005 0 0.005 0.01 0.015 time [s] tracking error e k0=0 k0=100

Fig. 4. Tracking error curves for k0 = 100 and k0 = 0 under

adaptation gain η= 1000. Compared with the tracking error without error feedback gain (black line), the tracking error using error feedback converges much faster.

Remark 1: Advantages of introducing error feedback into MRAC: (1) It can improve the convergence rate, which cannot be achieved by simply increasing adaptation gainη; (2) It can reduce overshoot, as shown in Example 2. B. Robustness analysis

Adaptive systems are known to be sensitive to unknown disturbances or uncertainties in the system structure. Such non-robust behavior may be exacerbated by over-adaptation with excessive adaptation gains. Therefore, robustness of adaptive control has received much attention in recent years (see [9], [10], [15], [16], [33] and the references therein). In this subsection, we shall examine the robustness of DAC, which will be accessed by subjecting the system to unknown disturbance input, unknown system dynamics and high-frequency reference input.

1) Unknown disturbance input: For simplicity, we first study first-order plants with disturbance input:

˙yp= −apyp− d + bpu (14)

where d is an unknown nonzero constant, which can be regarded as a constant disturbance input. Furthermore, the corresponding model reference is

˙ym= −amym+ bmr(t) (15)

wherer(t) is reference input, amandbmare known constant

parameters, andamis required to be strictly positive. Without

loss of generalitybmcan be chosen strictly positive. In such

case, the dual adaptive controller becomes

u = ˆar(t)r + ˆay(t)yp+ ˆad− k0e (16)

whereˆar,ˆafandˆadare variable feedback gains. The adaptive

law is

˙ˆar= −sgn(bp)ηer

˙ˆay= −sgn(bp)ηeyp

˙ˆad= −sgn(bp)ηe.

(Note whenk0= 0, controller (16) with the above adaptive

law is equivalent to MRAC [29]). It is obvious that the controller above is robust for any constant disturbance input d.

For high-order systems, yp = hp(s)u + hf(s)d, where d

is unknown constant and hp(s) satisfying same conditions

as before. Then the corresponding control law still has the form of (11), but there k = £

k1 k2 ˆad κT1 κ T 2 ¤T , z = £ r yp 1 v1T v2T ¤T

. Furthermore, k and z still satisfy (3), where Λ = diag£

λ1 λ2 ... λ2n+1 ¤ with

λi> 0, i = 1, 2, ..., 2n + 1.

2) Fast-adaptation instability: For convenience of discus-sion, we will confine our analysis to the following example of first-order plant with unknown dynamics:

˙x = −x + bw + u µ ˙w = −w + 2u y = x

(17)

where b > 1/2 is a constant parameter, and µ is a small positive number which corresponds to the time constant of an un-modeled statew. The control objective is to track the outputxm of a reference model

˙xm= −xm+ r. (18)

Forµ = 0, the MRAC law

u = kr, ˙k = −ηer, e = x − xm (19)

guarantees thate → 0 as t → ∞ and all signals are bounded for any bounded inputr. For 1 >> µ > 0, the closed-loop plant is:

˙e = −e + bw + (k − 1)r µ ˙w = −w + 2kr

˙k = −ηre. (20)

For largeη the above MRAC system may become unstable. In particular, for anyr = constant the system is linear and time invariant. Hence, it can be shown by using the Routh-Hurwitz criterion that (20) is stable iff the adaptation gainη satisfies ηr2 < 1 µ 1 + µ 2b − µ ≈ 1 2bµ (21) where the approximation in the above inequality is from µ << 1.

(7)

Now suppose that in addition to the above MRAC law there is a negative error feedback with a gaink0>> 1. The

first two equations in (20) becomes

˙e = −(1 + k0)e + bw − (k + 1)r

µ ˙w = −w + 2kr − 2k0e. (22)

Using the Routh-Hurwitz criterion as before it can be shown that the dual adaptive control system is stable iff

ηr2< 1

µ

(1 + k0+ 2bk0)[1 + (1 + k0)µ]

2b − (1 + k0)µ

. (23)

For k0 = 0 the stability condition is identical to (21). For 1

2b << k0 < −1 + 2b/µ, however, the right-hand side of

(23) tends to k0/µ >> 1/2bµ. This result shows that the

dual adaptive control system is considerably more robust than MRAC.

V. CONCLUSION

Combining HGAC and MRAC, we have formulated DAC which outperforms either of the controller implemented alone. DAC scheme has proved superior in terms of conver-gence rate and robustness properties, which are demonstrated with several simulation results in this paper. Furthermore, DAC also shows to be promising in dealing with a class of nonlinear systems, which is omitted since the space limited.

REFERENCES

[1] B. R. Dworkin, Learning and Physiological Regulation, University of Chicago Press, Chicago, 1993.

[2] D. O. Hebb, The Organization of Behavior, Wiley, New York, 1949. [3] T. H. Brown, P. F. Chapman, E. W. Kairiss and C. L. Keenan,

”Long-term synaptic potentiation,” Science, vol. 242, pp. 724-728, 1988. [4] D. E. Rumelhart, G. E. Hinton and R. J. William, ”Learning internal

representations by error propagation.”, In Rumelhard, D.E., McClel-land, J. L., and the PDP Research Group, editors, Paralled Distributed Processing. Explorations in the Microstructure of Cognition. Volume 1: Foundations, pp. 318-362. The MIT Press, Cambridge, MA. [5] P. J. Werbos, ”Neural networks for control and system identification.”

Proceedings of the 28th Conference on Decision and Control, FL, USA, vol. 1, pp. 260-265, 1989.

[6] J. V. Shah and C.-S. Poon, ”Linear independence of internal represen-tations in multilayer perceptrons,” IEEE Trans. Neural Network, vol. 10, pp. 10-18, 1999.

[7] B. D. O. Anderson, R. R. Bitmead, C. R. Johnson, Jr., P. V. Kokotovi´c, R. L. Kosut, I. M. Y. Mareels, L. Praly, and B. D. Riedle, Stability of adaptive systems: passivity and averaging systems, ser. MIT Press Series in Signal Processing, Optimization, and Control, 8. Cambridge, MA: MIT Press, 1986.

[8] K. ˙Astr¨om and B. Wittenmark, Adaptive Control, 2nd ed. MA : Addison Wesley, 1995.

[9] G. Bartolini, A. Ferrara, and A. A. Stotsky, “Robustness and perfor-mance of an indirect adaptive control scheme in presence of bounded disturbances,” IEEE Trans. Automat. Control, vol. 44, no. 4, pp. 789– 793, 1999.

[10] M. Cadic and J. W. Polderman, “Strong robustness in adaptive con-trol,” in Nonlinear and adaptive control (Sheffield, 2001), ser. Lecture Notes in Control and Inform. Sci. Berlin: Springer, 2003, vol. 281, pp. 45–54.

[11] M. K. Ciliz and A. Cezayirli, “Increased transient performance for the adaptive control of feedback linearizable systems using multiple models,” Internat. J. Control, vol. 79, no. 10, pp. 1205–1215, 2006. [12] A. Ilchmann, “Non-identifier-based adaptive control of dynamical

systems: a survey,” IMA J. Math. Control Inform., vol. 8, no. 4, pp. 321–366, 1991.

[13] A. Ilchmann and D. H. Owens, “Threshold switching functions in high-gain adaptive control,” IMA J. Math. Control Inform., vol. 8, no. 4, pp. 409–429, 1991.

[14] P. A. Ioannou and P. V. Kokotovi´c, “Instability analysis and improve-ment of robustness of adaptive control,” Automatica J. IFAC, vol. 20, no. 5, pp. 583–594, 1984.

[15] P. Ioannou and J. Sun, Robust adaptive control. Prentice Hall, Upper Saddle River, NJ, 1996.

[16] P. Ioannou and A. Datta, “Robust adaptive control: design, analysis and robustness bounds,” in Foundations of adaptive control (Urbana, IL, 1990), ser. Lecture Notes in Control and Inform. Sci. Berlin: Springer, 1991, vol. 160, pp. 71–152.

[17] H. Kaufman, I. Barkana, and K. Sobel, Direct adaptive control algorithms, 2nd ed., ser. Communications and Control Engineering Series. New York: Springer-Verlag, 1998, theory and applications, Case studies contributed by David S. Bayard and Gregory W. Neat. [18] H. K. Khalil and A. Saberi, “Adaptive stabilization of a class of

nonlinear systems using high-gain feedback,” IEEE Trans. Automat. Control, vol. 32, no. 11, pp. 1031–1035, 1987.

[19] I. Mareels and J. W. Polderman, Adaptive systems: An introduction. Boston, MA: Birkh¨auser Boston Inc., 1996.

[20] R. Marino and P. Tomei, “Adaptive observers with arbitrary exponen-tial rate of convergence for nonlinear systems,” IEEE Trans. Automat. Control, vol. 40, no. 7, pp. 1300–1304, 1995.

[21] ——, Nonlinear control design: geometric, adaptive, and robust. Prentice Hall International (UK) Limited, 1995.

[22] A. S. Morse, “A three-dimensional universal controller for the adaptive stabilization of any strictly proper minimum-phase system with relative degree not exceeding two,” IEEE Trans. Automat. Control, vol. 30, no. 12, pp. 1188–1191, 1985.

[23] D. Mudgett and A. Morse, “A smooth, high-gain adaptive stabilizer for linear systems with known relative degree,” in Proc. Automatic Control Conf., Kyoto, Japan, 1989, pp. 2315–2320.

[24] K. Narendra and A. Annaswamy, Stable Adaptive Systems, 2nd ed., ser. Applied Mathematical Sciences. Englewood Cliffs, NJ: Prenctice Hall, 1989.

[25] K. S. Narendra and J. Balakrishnan, “Improving transient response of adaptive control systems using multiple models and switching,” IEEE Trans. Automat. Control, vol. 39, no. 9, pp. 1861–1866, 1994. [26] R. D. Nussbaum, “Some remarks on a conjecture in parameter adaptive

control,” Systems Control Lett., vol. 3, no. 5, pp. 243–246, 1983. [27] I. Polderman, JW. Mareels, “Robustification of high gain adaptive

control,” in Proceedings of the 6th European Control Conference, Porto, Portugal, 2001, pp. 2428–3333.

[28] ——, “Two scale high gain adaptive control,” Int. J. Adapt. Control Signal Process, vol. 18, pp. 393–402, 2004.

[29] J.-J. E. Slotine and W. Li, Applied nonlinear control. Prentice Hall: Englewood Cliffs, New Jersey, 1991.

[30] A. Stotsky, “Lyapunov design for convergence rate improvement in adaptive control,” Int. J. Control, vol. 57, no. 2, pp. 501–504, 1993. [31] J. Sun, “A modified model reference adaptive control scheme for

im-proved transient performance,” IEEE Trans. Automat. Control, vol. 38, no. 8, pp. 1255–1259, 1993.

[32] J. C. Willems and C. I. Byrnes, “Global adaptive stabilization in the absence of information on the sign of the high frequency gain,” in Analysis and optimization of systems, Part 1 (Nice, 1984), ser. Lecture Notes in Control and Inform. Sci. Berlin: Springer, 1984, vol. 62, pp. 49–57.

[33] X. Xie and J. Li, “A robustness analysis of discrete-time direct model reference adaptive control,” Int. J. Control, vol. 79, no. 10, pp. 1196– 1204, 2006.

[34] B. E. Ydstie, “Transient performance and robustness of direct adaptive control,” IEEE Trans. Automat. Control, vol. 37, no. 8, part I, pp. 1091–1105, 1992.

ThB09.4

Figure

Fig. 1. Feedback error learning based on operant conditioning.
Fig. 2. Simulation for Example 1. 2(a), 2(b) and 2(c) are about the tracking error for η = 500, η = 1000 and η = 1500
Fig. 3. Dual Adaptive Control

Références

Documents relatifs

In section 3, we apply this construction to the games model, define real-time strategies and give the Galois con- nection that make all the denotations of λ-terms real-time.. In

DECISION SUPPORT SYSTEMS AND PERSONAL COMPUTING Page 22 Keen/Hacka thorn -- Draft #5 (10/10/79). Stabell, 1971], seems fully applicable for PS, but there

  L’après  68  verra  se  multiplier  les  polars  de  tout  bord  qui 

Reciprocal Kinematic Control: using human-robot dual adaptation to control upper limb assistive devices Mathilde Legrand, Étienne de Montalivet, Florian Richer, Nathanaël

Keywords: Closed-loop identification, Input design, System identification, Constrained systems, Markov decision process, Model predictive

Definition 5: The feedback controller is said to achieve cost functional objective if and only if the interconnected system P f ∧ c C satisfies the stability and

The parallel version of the epistemic gossip problem with no temporal constraints except for a bound on the number of steps is NP-complete even when the goal is a conjunction

The work of [13], [14] considered chance constraints on joint random variables, using the result of [15] to show that the optimization resulting from the