Reinforcement learning, energy systems and deep neural nets


(1)

Reinforcement learning, energy systems and deep neural nets

(2)

Figure: the agent-environment interaction loop. At each step the agent receives a state and a reward (a numerical value) from the environment and sends back an action.
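As a toy illustration of this loop (not taken from the slides; the environment here is a random stand-in), in Python:

import random

# Minimal sketch of the agent-environment loop: the agent receives a state and a
# numerical reward, and answers with an action.
def env_step(state, action):
    next_state = random.random()              # next state observed by the agent
    reward = 1.0 if action == 1 else 0.0      # reward: a numerical value
    return next_state, reward

state = random.random()
total_reward = 0.0
for t in range(10):
    action = random.choice([0, 1])            # action chosen by the (here, random) agent
    state, reward = env_step(state, action)
    total_reward += reward
print(total_reward)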

(3)

Agent: the battery controller.
Environment: the battery + the energy market.
State: (i) the battery level, (ii) everything you know about the market.
Action: the battery setting for the next market period.
Reward: the money you make during the market period.
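Purely as an illustration (this is not the authors' implementation; prices and battery dynamics are made up), the decision problem can be sketched as follows:

import random

# Sketch of the battery + energy market environment.
# State: (battery level, market information); action: battery setting for the next
# market period; reward: money made during that period.
class BatteryMarketEnv:
    def __init__(self, capacity=1.0):
        self.capacity = capacity
        self.level = 0.5 * capacity              # (i) the battery level
        self.price = random.uniform(20.0, 60.0)  # (ii) crude stand-in for market knowledge

    def step(self, setting):
        # setting > 0: charge (buy energy); setting < 0: discharge (sell energy)
        energy = max(-self.level, min(setting, self.capacity - self.level))
        self.level += energy
        reward = -energy * self.price            # money made during the market period
        self.price = random.uniform(20.0, 60.0)  # move to the next market period
        return (self.level, self.price), reward

env = BatteryMarketEnv()
state, reward = env.step(-0.2)  # discharge 0.2 units and collect the revenue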

(4)

Table taken from: “Reinforcement Learning for Electric Power System Decision and Control: Past Considerations and Perspectives”. M. Glavic, R. Fonteneau and D. Ernst. Proceedings of the 20th IFAC World Congress.

(5)

Learning:

Exploration/exploitation: do not always take the action that is believed to be optimal, so as to allow exploration.
Generalization: generalize the experience gained in some states to other states.
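The exploration/exploitation trade-off is often implemented with the standard epsilon-greedy rule, sketched here as a generic example (not taken from the slides):

import random

# Epsilon-greedy: with probability epsilon explore (random action), otherwise exploit
# the action currently believed to be optimal.
def epsilon_greedy(q_values, epsilon=0.1):
    if random.random() < epsilon:
        return random.randrange(len(q_values))                   # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit

action = epsilon_greedy([0.2, 0.8, 0.5])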

(6)
(7)
(8)

Figure: the learning phase and the effect of the resulting control policy.

First control law for stabilizing power systems ever computed using reinforcement learning. More at: “Reinforcement Learning Versus Model Predictive Control: A Comparison on a Power System Problem”. D. Ernst, M. Glavic, F. Capitanescu, and L. Wehenkel. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, Vol. 39, No. 2, April 2009.

(9)
(10)

Reinforcement learning for trading in the intraday market

More: “Intra-day Bidding Strategies for Storage Devices Using Deep Reinforcement Learning”. I. Boukas, D. Ernst, A. Papavasiliou, and B. Cornélusse. Proceedings of the 2018 15th International Conference on the European Energy Market (EEM).

Complex problem:

Adversarial environment
High-dimensional
Partially observable

Best results obtained by optimising strategies on past data, combined with supervised learning to learn from the optimised strategies (an imitative-learning type of approach).
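A rough sketch of that imitative-learning idea, with entirely hypothetical data and a generic scikit-learn classifier standing in for the models actually used in the paper:

from sklearn.linear_model import LogisticRegression

# (1) Optimise strategies offline on past market data (done elsewhere).
# (2) Fit a supervised model that maps states to the optimised decisions.
past_states = [[0.5, 32.0], [0.8, 55.0], [0.1, 18.0], [0.6, 41.0]]  # hypothetical (level, price) features
optimised_actions = [1, 1, 0, 1]                                    # decisions of the offline optimiser

policy = LogisticRegression().fit(past_states, optimised_actions)
action = policy.predict([[0.4, 47.0]])[0]  # the policy imitates the optimised strategy on new states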

(11)

“A critical present objective is to develop deep RL methods that can adapt rapidly to new tasks.”

(12)

Figure: synaptic plasticity and modulation.

Neuro-Walking: a meta-RL problem solved through synaptic plasticity and neuromodulation.

(13)

Classical architecture for solving meta-RL problems:
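The architecture figure is not reproduced here. As a hedged sketch, the classical approach feeds the current observation together with the previous action and reward into a recurrent core (for example a GRU), whose hidden state serves as the agent's memory across tasks; all names and shapes below are illustrative:

import torch
import torch.nn as nn

# Recurrent policy for meta-RL: adapting to a new task amounts to updating the GRU
# hidden state, not the network weights.
class RecurrentPolicy(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.gru = nn.GRU(obs_dim + act_dim + 1, hidden, batch_first=True)  # +1 for the previous reward
        self.head = nn.Linear(hidden, act_dim)

    def forward(self, obs, prev_action, prev_reward, h=None):
        x = torch.cat([obs, prev_action, prev_reward], dim=-1)
        out, h = self.gru(x, h)
        return self.head(out), h

policy = RecurrentPolicy(obs_dim=4, act_dim=2)
logits, h = policy(torch.zeros(1, 1, 4), torch.zeros(1, 1, 2), torch.zeros(1, 1, 1))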

(14)

Rectified Linear Unit:

Saturated Relu:

Parametrized sRelu:

Gated Recurrent Unit:
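The formulas on this slide were lost in extraction; the standard definitions are recalled below (the exact parametrization of the sReLU used by the authors is given in the reference cited later and is not reproduced here):

\mathrm{ReLU}(x) = \max(0, x)
\mathrm{sReLU}(x) = \min(1, \max(0, x))

GRU update, in its usual form:

z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)
r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)
\tilde{h}_t = \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h)
h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t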

(15)
(16)
(17)
(18)

More: “Introducing neuromodulation in deep neural networks to learn adaptive behaviours”. N. Vecoven, D. Ernst, A. Wehenkel and G. Drion. Download at: https://arxiv.org/abs/1812.09113
