t-distribution Stochastich Neighbor Embedding (t-SNE)

(1)

t-distribution Stochastich Neighbor Embedding (t-SNE)

Juliette & Thomas

Master 2 Économiste d'entreprise

April 17, 2020

M Éc E n

(2)

An unsupervised non-linear technique

•

t-SNE is a very recent algorithm developed in 2008 by Laurens van der Maatens and Georey Hinton.

•

It is a dimension reduction technique to maps multi-dimensional data to two or more dimensions

MEn t-SNE 3 / 16

(4)

What is t-SNE

Application elds

•

t-SNE is an algorithm usually used in the following areas:

I Medicine I Biology

I Signal treatments I Recognition

•

The following example of t-SNE is a representation of picture recognition:

Database: MNIST

MEn t-SNE 4 / 16

(5)

What is t-SNE

Dierence with PCA

•

PCA is a linear algorithm→it is not able to interpret complex polynomial relationship between features and so may lead to poor visualization.

•

PCA is a linear dimension reduction technique that seeks to maximize variance and preserves large pairwise distances. However it doesn't try to minimize the intra-group variance.

•

t-SNE diers from PCA by preserving only small pairwise distances or local similarities.

MEn t-SNE 5 / 16

(6)

What is t-SNE

Machine Learning trade o

•

t-SNE is more like a black box where the interpretation is put aside.

Credit: https://urlz.fr/crCP

MEn t-SNE 6 / 16

(7)

How t-SNE works

Step 1

•

We rst describe the functionning of a SNE algorithm.

•

Stochastic Neighbor Embedding (SNE) starts by converting the

high-dimensional Euclidean distances between data points into conditional probabilities that represent similarities.

•

We dene the similarity of datapointx_i to datapointx_j as:

Pj|i= exp(−||x_i−x_j||²/2σ_i²) Σk6=iexp(−||xi−xk||²/2σ_i²)

WherePj|iis the conditionnal probability: xi would pick xj as its neighbor if neighbors were picked in proportion to their probability density under a Gaussian centered atx_i

MEn t-SNE 7 / 16

(8)

How t-SNE works

Step 1

Consider the following example

Credit: https://www.youtube.com/watch?v=NEaUSP4YerM&t=587s

MEn t-SNE 8 / 16

(9)

How t-SNE works

Step 1

MEn t-SNE 9 / 16

(10)

How t-SNE works

Step 2

•

For the low-dimensional counterpartsyi andyj of the high-dimensional datapointsx_i andx_j it is possible to compute a similar conditional probability, which we denote by:

Qj|i= exp(−||yi−yj||²) Σ_k6=iexp(−||y_i−y_k||²)

•

Logically, the conditional probabilitiesPj|iandQj|imust be equal for a perfect representation of the similarity of the datapoints in the dierent dimensional spaces→the dierence between them must be zero for the perfect replication of the plot in high and low dimensions.

•

By this logic SNE attempts to minimize this dierence of conditional probability.

MEn t-SNE 10 / 16

(11)

How t-SNE works

Step 2

MEn t-SNE 11 / 16

(12)

How t-SNE works

Step 2

MEn t-SNE 12 / 16

(13)

How t-SNE works

Step 3

Here is the dierence between SNE and t-SNE. To calculate the similarity between datapoints in lower dimension the algorithme choose a t-distribution probability law instead of a Gaussian law. This allows a higher inter-group variance.

MEn t-SNE 13 / 16

(14)

How t-SNE works

Step 4

MEn t-SNE 14 / 16

(15)

Limits of t-SNE

Caveat emptor

•

It is important to analyze the performance of t-SNE. The algorithm computes pairwise conditional probabilities and tries to minimize the sum of the dierence of the probabilities in higher and lower dimensions.

•

This involves a lot of calculations and computations→the algorithm is quite heavy on the system resources. That's why is recommended to use a dataset with less than 10 000 points.

•

As already discuss, the t-SNE gives an output which is not interpretive. It may be ecient to use the output of a t-SNE as an input of unsupervised classication algorithm.

MEn t-SNE 15 / 16

(16)

Limits of t-SNE

Sources

•

https://distill.pub/2016/misread-tsne/

•

https://www.analyticsvidhya.com/blog/2017/01/t-sne-implementation-r- python/

•

https://www.youtube.com/watch?v=NEaUSP4YerM&t=587s

MEn t-SNE 16 / 16

t-distribution Stochastich Neighbor Embedding (t-SNE)