(1)

Reinforcement learning: a generic tool for solving sequential decision-making problems for electrical systems

(2)
(3)

Figure: the agent-environment loop. The agent receives a state and a reward (a numerical value) from the environment and sends back an action.
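This loop is easy to read as code. Below is a minimal sketch of the interaction, assuming a gym-style environment with reset/step methods and an agent with act/learn methods; all names are illustrative, not code from the talk.

```python
# Minimal sketch of the agent-environment loop from the figure.
# The Agent interface and the env protocol are illustrative assumptions.

class Agent:
    def act(self, state):
        """Choose an action for the current state."""
        raise NotImplementedError

    def learn(self, state, action, reward, next_state):
        """Update the agent from one observed transition."""
        raise NotImplementedError


def run_episode(env, agent, max_steps=1000):
    """Run one episode of agent-environment interaction."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = agent.act(state)                        # agent -> environment: action
        next_state, reward, done = env.step(action)      # environment -> agent: state, reward
        agent.learn(state, action, reward, next_state)   # reward is a plain numerical value
        total_reward += reward
        if done:
            break
        state = next_state
    return total_reward
```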

(4)

The battery controller + the energy market

State: (i) the battery level, (ii) everything you know about the market.

Reward: the money you make during the market period.

Action: the battery setting for the next market period.
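To make the mapping concrete, here is a hedged sketch of how such a battery-arbitrage problem could be encoded as an environment for the loop above. The class name, prices, capacity, and efficiency figures are illustrative assumptions, not the actual market model.

```python
import numpy as np

class BatteryMarketEnv:
    """Toy battery-arbitrage environment; every number here is an assumption."""

    def __init__(self, prices, capacity=1.0, max_power=0.25, efficiency=0.9):
        self.prices = prices          # one price (e.g. EUR/MWh) per market period
        self.capacity = capacity      # storage capacity in MWh
        self.max_power = max_power    # max energy exchanged per period in MWh
        self.efficiency = efficiency  # crude stand-in for conversion losses
        self.reset()

    def reset(self):
        self.t = 0
        self.level = 0.5 * self.capacity
        return self._state()

    def _state(self):
        # State: (i) the battery level, (ii) what you know about the market
        # (reduced here to the current price for simplicity).
        return np.array([self.level, self.prices[self.t]])

    def step(self, action):
        # Action: the battery setting for the next market period, in [-1, 1]
        # (negative = charge/buy, positive = discharge/sell).
        energy = float(np.clip(action, -1.0, 1.0)) * self.max_power
        energy = float(np.clip(energy, self.level - self.capacity, self.level))
        self.level -= energy
        # Reward: the money made during the market period
        # (selling earns price * energy, buying costs it; losses on discharge).
        price = self.prices[self.t]
        reward = energy * price * (self.efficiency if energy > 0 else 1.0)
        self.t += 1
        done = self.t >= len(self.prices)
        return (None if done else self._state()), reward, done
```

Plugged into the run_episode sketch above with any agent, this gives a complete, if crude, training setup.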

(5)

Table taken from: “Reinforcement Learning for Electric Power System Decision and Control: Past Considerations and Perspectives”. M. Glavic, R. Fonteneau and D. Ernst. Proceedings of the 20th IFAC World Congress.

(6)

Learning:

Exploration/exploitation: do not always take the action currently believed to be optimal, so as to allow exploration.

Generalization: generalize the experience gained in some states to other states.
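The exploration/exploitation point is commonly implemented with an ε-greedy rule: with a small probability, the agent deliberately tries an action other than the one it currently believes to be best. The generalization point is what function approximators (e.g. neural networks) bring when the state space is too large for a table. A minimal ε-greedy sketch, where the tabular Q-values and the action names are illustrative assumptions:

```python
import random
from collections import defaultdict

def epsilon_greedy(q_values, actions, state, epsilon=0.1):
    """Pick an action: explore with probability epsilon, otherwise exploit
    the action currently believed to be optimal."""
    if random.random() < epsilon:
        return random.choice(actions)                          # exploration
    return max(actions, key=lambda a: q_values[(state, a)])    # exploitation

# Illustrative usage with a tabular value estimate.
q_values = defaultdict(float)          # Q(state, action), all zero before learning
actions = ["charge", "idle", "discharge"]
chosen = epsilon_greedy(q_values, actions, state="low_level_high_price")
```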

(7)
(8)
(9)

Figures: the learning phase and the effect of the resulting control policy.

First control law for stabilizing power systems ever computed using reinforcement learning. More at: “Reinforcement Learning Versus Model Predictive Control: A Comparison on a Power System Problem”. D. Ernst, M. Glavic, F. Capitanescu, and L. Wehenkel. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 39, No. 2, April 2009.

(10)
(11)

Reinforcement learning for trading in the intraday market

More: “Intra-day Bidding Strategies for Storage Devices Using Deep Reinforcement Learning”. I. Boukas, D. Ernst, A. Papavasiliou, and B. Cornélusse. Proceedings of the 2018 15th International Conference on the European Energy Market (EEM).

Complex problem:

Adversarial environment
High-dimensional
Partially observable

Best results obtained with optimisation of strategies based on past data, together with supervised learning to learn from the optimised strategies (an imitative-learning type of approach).
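Read as a recipe, that last point amounts to: solve past market instances offline with an optimiser, then fit a supervised model that reproduces the optimised decisions and use it as the bidding policy. The sketch below follows that reading; the placeholder data and the scikit-learn regressor are illustrative assumptions, not the setup of the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Placeholder data standing in for (state, action) pairs extracted from
# strategies optimised on past market data; shapes and the model choice
# are assumptions made for illustration only.
rng = np.random.default_rng(0)
optimised_states = rng.random((1000, 6))                # e.g. battery level + market features
optimised_actions = rng.uniform(-1.0, 1.0, size=1000)   # settings chosen by the optimiser

# Supervised step: fit a model that imitates the optimised strategies.
policy = RandomForestRegressor(n_estimators=100, random_state=0)
policy.fit(optimised_states, optimised_actions)

# At decision time, the fitted model produces the setting/bid for a new state.
new_state = rng.random((1, 6))
bid = policy.predict(new_state)[0]
```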

(12)

Figure and table taken from: “Reinforcement Learning for Electric Power System Decision and Control: Past Considerations and Perspectives”.
