
(1)

Machine Learning for Data-Driven Smart Grid Applications

Patrick Glauner, Jorge Augusto Meira and Radu State Interdisciplinary Centre for Security, Reliability and Trust,

University of Luxembourg

May 22, 2018

(2)

Motivation: non-technical losses (NTL)

Example of NTL: two assumed occurrences of NTL, indicated by significant consumption drops followed by inspections (visualized by vertical bars).

(3)

About us

We work on real-world machine learning problems together with industry partners. Recent research includes the detection of electricity theft/non-technical losses, the correction of biases in data, and augmented reality.

(4)

Goals of this tutorial

• Provide an introduction to machine learning

• Understand the three pillars of machine learning

• Know when to use which model

(5)

Contents

• Introduction

• Supervised learning

• Unsupervised learning

• Reinforcement learning

• Deep Learning

• Conclusions

(6)

Introduction

"Artificial intelligence is the science of knowing what to do when you don't know what to do." (Peter Norvig)

https://www.youtube.com/watch?v=rtmQ3xlt-4A

(7)

Introduction

Arthur Samuel (1959): “Field of study that gives computers the ability to learn without being explicitly programmed”.

(8)

Introduction

(9)

Introduction

(10)

Introduction

(11)

Introduction

“Machine Learning is a subset of Artificial Intelligence techniques which use statistical models to enable machines to improve with experience”

Use cases: data mining, autonomous cars, recommendation...

https://rapidminer.com/artificial-intelligence-machine-learning-deep-learning/

(12)

Introduction

Tom Mitchell (1998): "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E."

(13)

Introduction

A brief history (source: https://www.forbes.com):

• 1950: Alan Turing creates the “Turing Test” to determine if a computer has real intelligence.

• 1952: Arthur Samuel wrote the first computer learning program.

• 1957: Frank Rosenblatt designed the first neural network for computers (the perceptron).

• 1997: IBM’s Deep Blue beats the world champion at chess.

• 2006: Geoffrey Hinton coins the term “deep learning” to explain new algorithms that let computers “see” and distinguish objects and text in images and videos.

• 2015: AlphaGo became the first computer Go program to beat a human professional Go player.

(14)

Introduction: three pillars of machine learning

Supervised learning: induce a function that maps from input to output.

Unsupervised learning: find hidden structure in data.

Reinforcement learning: reward-based learning.

(15)

Supervised learning

Raw Data

Features

Models

(16)

Supervised learning

Two families of tasks: regression and classification.

(19)

Supervised learning

Regression: continuous values.

Classification: discrete values (categories), e.g. Class 1 vs. Class 2.

(20)

Supervised learning: use cases

• Detection of anomalies

• Forecasting

• Medical diagnosis

• ...

(21)

Supervised learning: models

• Linear/logistic regression

• Decision tree, random forest

• Support vector machine

• ...

• Neural networks, Deep Learning
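To make the first model on this list concrete, here is a hedged sketch of logistic regression trained by gradient descent on a tiny made-up data set (all numbers, and the helper name `predict`, are illustrative assumptions, not from the slides):

```python
import numpy as np

# Toy binary data: one feature, label 1 for the larger values (illustrative only).
x = np.array([0.5, 1.0, 1.5, 3.0, 3.5, 4.0])
y = np.array([0, 0, 0, 1, 1, 1])

w, b = 0.0, 0.0
for _ in range(2000):
    # Predicted probability of class 1 via the sigmoid function.
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    # Gradient of the logistic (cross-entropy) loss; 0.1 is an arbitrary rate.
    w -= 0.1 * ((p - y) * x).mean()
    b -= 0.1 * (p - y).mean()

predict = lambda v: int(1.0 / (1.0 + np.exp(-(w * v + b))) > 0.5)
```

After training, the decision boundary sits between the two groups, so small inputs map to class 0 and large inputs to class 1.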

(22)

Supervised learning: workflow

Training set (x, y) → Learning algorithm → hypothesis h

Given the features x, the hypothesis h produces a prediction.

(23)

How to represent ‘h’ (hypothesis)

Supervised learning: regression

(24)

How to represent ‘h’ (hypothesis)

Supervised learning: regression

hθ(x) = θ0 + θ1x (the familiar line y = ax + b)

(25)

hθ(x) = θ0 + θ1x (the familiar line y = ax + b)

How to represent ‘h’ (hypothesis)

Supervised learning: regression

For example, housing prices:

2014: ~180k, 2015: ~182k, 2016: ~184k, 2017: ~186k, 2018: ???

(26)

Example hypotheses:

hθ(x) = 1.5 + 0x (θ0 = 1.5, θ1 = 0)
hθ(x) = 0.5x (θ0 = 0, θ1 = 0.5)
hθ(x) = 1 + 0.5x (θ0 = 1, θ1 = 0.5)

How to represent ‘h’ (hypothesis)

Supervised learning: regression

(27)

Choose θ0 and θ1 so that hθ(x) is close to y for our training set.

hθ(x) = θ0 + θ1x

How to represent ‘h’ (hypothesis)

Supervised learning: regression

(28)

The idea is to choose θ0 and θ1 so that the error hθ(x) − y is small. Thus, we define a cost function J(θ0, θ1) to be minimized:

J(θ0, θ1) = (1/2m) Σᵢ (hθ(x⁽ⁱ⁾) − y⁽ⁱ⁾)²

Choose θ0 and θ1 so that hθ(x) is close to y for our training set.

hθ(x) = θ0 + θ1x

How to represent ‘h’ (hypothesis)

Supervised learning: regression
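One standard way to minimize this cost function (not spelled out on the slide) is gradient descent. A minimal NumPy sketch, using the housing-price figures above as a toy training set; the learning rate and iteration count are arbitrary choices:

```python
import numpy as np

# Toy training set from the housing example: x = years since 2014,
# y = price in thousands.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([180.0, 182.0, 184.0, 186.0])

theta0, theta1 = 0.0, 0.0   # parameters of h(x) = theta0 + theta1 * x
m = len(x)
alpha = 0.1                 # learning rate (arbitrary choice)

for _ in range(5000):
    h = theta0 + theta1 * x
    # Partial derivatives of J = 1/(2m) * sum((h - y)^2).
    grad0 = (h - y).sum() / m
    grad1 = ((h - y) * x).sum() / m
    theta0 -= alpha * grad0
    theta1 -= alpha * grad1

prediction_2018 = theta0 + theta1 * 4.0   # x = 4 corresponds to 2018
print(round(theta0, 1), round(theta1, 1), round(prediction_2018))  # 180.0 2.0 188
```

On this perfectly linear toy data the parameters converge to θ0 ≈ 180 and θ1 ≈ 2, so the 2018 forecast is about 188k.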

(29)

Supervised learning: classification

Example of non-technical losses (NTL): two assumed occurrences of NTL due to significant consumption drops, followed by inspections (visualized by vertical bars).

(30)

Weather

Supervised learning: decision tree

(33)

Weather?
- Sunny → no
- Rain → Wind? (Strong → no, Weak → yes)
- Cloudy → Humidity? (High → no, Normal → yes)

Supervised learning: decision tree
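Read as code, a decision tree is just nested conditionals. A hedged sketch (the function name and the exact branch labels are one plausible reading of the slide's tree):

```python
def play_outside(weather, wind=None, humidity=None):
    # Root split on Weather; inner splits mirror the branches on the slide.
    if weather == "Sunny":
        return "no"
    if weather == "Rain":
        # Rain branch splits on wind strength.
        return "no" if wind == "Strong" else "yes"
    if weather == "Cloudy":
        # Cloudy branch splits on humidity.
        return "no" if humidity == "High" else "yes"
    raise ValueError("unknown weather: %s" % weather)

print(play_outside("Rain", wind="Weak"))  # yes
```

In practice the tree structure and split thresholds are not written by hand but learned from labeled examples, e.g. by recursively choosing the split that best separates the classes.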

(34)

Unsupervised learning

Supervised: known labels.

(35)

Unsupervised learning

Supervised: known labels. Unsupervised: unknown labels.

(36)

Unsupervised learning

Supervised: known labels. Unsupervised: unknown labels.

(38)

Unsupervised learning: clustering

K-means algorithm

1: Define K centroids randomly.

(39)

Unsupervised learning: clustering

K-means algorithm

1: Define K centroids randomly.

2: Associate every observation with its nearest centroid.

(41)

Unsupervised learning: clustering

K-means algorithm

1: Define K centroids randomly.

2: Associate every observation with its nearest centroid.

3: Define new centroids as the means of the clusters.

(42)

Unsupervised learning: clustering

K-means algorithm

1: Define K centroids randomly.

2: Associate every observation with its nearest centroid.

3: Define new centroids as the means of the clusters.

4: Repeat steps 2 and 3 until convergence.
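The four steps can be sketched directly in NumPy. A minimal version, with made-up 2-D data (the function name and data are illustrative assumptions):

```python
import numpy as np

def kmeans(points, k, iters=10, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: pick k random observations as the initial centroids.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Step 2: assign every observation to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each centroid to the mean of its cluster.
        for j in range(k):
            if (labels == j).any():
                centroids[j] = points[labels == j].mean(axis=0)
        # Step 4: the loop repeats steps 2-3 until the assignment stabilizes.
    return labels, centroids

# Two well-separated toy clusters.
pts = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
labels, cents = kmeans(pts, k=2)
```

Note that k-means only finds a local optimum, so in practice it is typically run several times with different random initializations.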

(43)

Unsupervised learning: use cases

• Market segmentation

• Clustering of customers, news, etc.

• Dimensionality reduction of data

(44)

Unsupervised learning: models

• k-means clustering

• Expectation-maximization clustering

• Principal component analysis

• ...
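One of the models above, principal component analysis, can be sketched in a few lines via a singular value decomposition; the synthetic data set here is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2-D data, stretched so one direction carries most of the variance.
data = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [0.0, 0.3]])

# PCA: center the data, then the rows of Vt are the principal components,
# ordered by explained variance (singular values S are sorted descending).
centered = data - data.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

# Dimensionality reduction: project onto the first principal component.
projected = centered @ Vt[0]
```

The 1-D projection keeps the direction of largest variance, which is the sense in which PCA reduces dimensionality with minimal information loss.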

(45)

Reinforcement learning

(46)

Reinforcement learning: use cases

• Planning

• Playing games, e.g. the game of Go

• ...

(47)

Reinforcement learning: models

• Value/policy iteration

• Q-learning

• Deep reinforcement learning

• ...
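A hedged sketch of one model from this list, tabular Q-learning, on a made-up five-state corridor (the environment, rewards, and hyperparameters are all illustrative assumptions, not from the slides):

```python
import numpy as np

# Toy corridor: states 0..4; actions 0 = left, 1 = right.
# Reaching state 4 yields reward 1 and ends the episode.
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.2
rng = np.random.default_rng(0)

for _ in range(500):
    s = 0
    while s != 4:
        # Epsilon-greedy action choice; ties broken randomly so early episodes explore.
        if rng.random() < eps or Q[s, 0] == Q[s, 1]:
            a = int(rng.integers(n_actions))
        else:
            a = int(Q[s].argmax())
        s2 = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s2 == 4 else 0.0
        # Q-learning update: nudge Q(s, a) toward r + gamma * max_a' Q(s2, a').
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

greedy = [int(Q[s].argmax()) for s in range(4)]
print(greedy)  # [1, 1, 1, 1] once the reward has propagated back
```

The agent is never told the rules; the reward signal alone propagates backwards through the Q-table until the greedy policy walks straight to the goal.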

(48)

Deep Learning: neural network

(49)

Deep Learning
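To make the idea concrete, here is a hedged sketch of a tiny neural network: two layers trained with plain gradient descent on XOR. The architecture, seed, and learning rate are arbitrary illustrative choices, not from the slides:

```python
import numpy as np

# XOR is not linearly separable, so it needs a hidden layer (classic toy case).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)   # input  -> hidden
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses = []
for _ in range(5000):
    # Forward pass through both layers.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(float(((out - y) ** 2).mean()))
    # Backward pass: chain rule for the mean squared error (backpropagation).
    d_out = 2.0 * (out - y) / len(X) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * (h.T @ d_out); b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * (X.T @ d_h);   b1 -= 0.5 * d_h.sum(axis=0)
```

A single-layer model cannot fit XOR at all; the hidden layer is what makes the fit possible, and stacking more such layers is the core idea behind deep networks.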

(50)

Conclusions

• Machine Learning makes it possible to learn complex statistical patterns from data

• Comparatively little domain knowledge is required

• Many applications in daily life

• Tell us more about your workflows so that we can figure out how Machine Learning can help you!

(51)

References

[1] T. Mitchell, "Machine learning", McGraw Hill, 1997.

[2] C. Bishop, "Pattern recognition and machine learning", Springer, 2006.

[3] P. Glauner, et al., "The Top 10 Topics in Machine Learning Revisited: A Quantitative Meta-Study", Proceedings of the 25th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2017), Bruges, Belgium, 2017.

[4] P. Glauner, et al., "Impact of Biases in Big Data", Proceedings of the 26th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2018), Bruges, Belgium, 2018.
