HAL Id: hal-02268478

https://hal.archives-ouvertes.fr/hal-02268478

Submitted on 21 Aug 2019


Semi-supervised Learning using Deep Generative Models and Metric Embedding Auxiliary Task

Jhosimar George Arias Figueroa

To cite this version:

Jhosimar George Arias Figueroa. Semi-supervised Learning using Deep Generative Models and Metric Embedding Auxiliary Task. LatinX in AI Research at ICML 2019, Jun 2019, Long Beach, United States. ⟨hal-02268478⟩


Semi-supervised Learning using Deep Generative Models and Metric Embedding Auxiliary Task

Jhosimar George Arias Figueroa

jariasf03@gmail.com

Motivation

In many real-world applications it is relatively easy to acquire large amounts of unlabeled data; however, obtaining the corresponding labels often requires the effort of a human expert and special devices. This labeling bottleneck results in small amounts of labeled data and large amounts of unlabeled data. Therefore, being able to utilize the surplus unlabeled data is desirable.

Background

Variational Autoencoder Evidence Lower Bound (ELBO)

We approximate the intractable posterior pθ(z|x) with a variational posterior qϕ(z|x), which yields a lower bound on the marginal log-likelihood:

\log p_\theta(x) \;\geq\; \mathbb{E}_{q_\phi(z|x)}\big[\log p_\theta(x|z)\big] - \mathrm{KL}\big(q_\phi(z|x)\,\|\,p(z)\big)
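For concreteness, a minimal PyTorch sketch of this bound for a standard VAE is given below; it assumes a Bernoulli decoder and a N(0, I) prior, which are illustrative choices rather than details taken from the poster.

```python
import torch
import torch.nn.functional as F

def vae_negative_elbo(x, x_logits, mu, logvar):
    """Negative ELBO for a standard VAE with a Bernoulli decoder and N(0, I) prior.

    x          : input batch with values in [0, 1]
    x_logits   : decoder logits for the reconstruction p(x|z)
    mu, logvar : parameters of the Gaussian posterior q(z|x)
    """
    # Reconstruction term: one-sample Monte Carlo estimate of -E_q[log p(x|z)]
    recon = F.binary_cross_entropy_with_logits(x_logits, x, reduction="sum")
    # KL(q(z|x) || N(0, I)) in closed form
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # Minimizing this quantity maximizes the lower bound on log p(x)
    return recon + kl
```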

Intuition

Our encoder transforms the data into feature representations. From these representations we infer a Gaussian latent variable, which provides a more robust set of features, and a discrete latent variable, which provides category assignments.

We take advantage of the small amount of labeled data to perform the category assignments for the unlabeled data.

The decoder uses the Gaussian representations and the category information for reconstruction.

Probabilistic Model

Graphical models of the inference model and the generative model.

Evidence Lower Bound (ELBO)

The first term is the reconstruction loss (LR). The second and third terms work as regularizers of the categorical (LC) and Gaussian (LG) distributions, respectively.
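The sketch below illustrates one way these three terms could be computed; it assumes a mean-field posterior qϕ(y|x)qϕ(z|x) with a uniform categorical prior p(y) and a standard normal prior p(z), which is an assumption about the model rather than the poster's exact factorization.

```python
import math
import torch
import torch.nn.functional as F

def elbo_terms(x, x_logits, y_logits, mu, logvar, num_classes):
    """Illustrative decomposition of the ELBO into L_R, L_C and L_G.

    Assumes q(y|x) is categorical with logits y_logits, q(z|x) = N(mu, diag(exp(logvar))),
    p(y) is uniform over num_classes categories and p(z) = N(0, I).
    """
    # L_R: reconstruction loss, one-sample estimate of -E_q[log p(x|y, z)]
    L_R = F.binary_cross_entropy_with_logits(x_logits, x, reduction="sum")

    # L_C: KL(q(y|x) || Uniform) regularizes the categorical distribution
    q_y = F.softmax(y_logits, dim=-1)
    L_C = torch.sum(q_y * (F.log_softmax(y_logits, dim=-1) + math.log(num_classes)))

    # L_G: KL(q(z|x) || N(0, I)) regularizes the Gaussian distribution
    L_G = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return L_R, L_C, L_G
```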

Auxiliary Task Loss

Assignment Loss (LA)

In order to assign labels to the unlabeled data, we use distance-weighted k-nearest neighbors (k-NN) on the feature space.

Once we have obtained the assignments for the unlabeled data, we use the cross-entropy loss.
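A minimal sketch of this assignment step is shown below, assuming Euclidean distances on the feature space and inverse-distance weighting; the value of k and the weighting scheme are illustrative choices, not taken from the poster.

```python
import torch
import torch.nn.functional as F

def assign_pseudo_labels(unlabeled_feats, labeled_feats, labels, num_classes, k=5, eps=1e-8):
    """Distance-weighted k-NN: labeled neighbors vote with weight 1 / distance."""
    dists = torch.cdist(unlabeled_feats, labeled_feats)        # (U, L) Euclidean distances
    knn_dists, knn_idx = dists.topk(k, dim=1, largest=False)   # k nearest labeled points
    weights = 1.0 / (knn_dists + eps)                          # inverse-distance weights
    votes = torch.zeros(unlabeled_feats.size(0), num_classes)
    votes.scatter_add_(1, labels[knn_idx], weights)            # accumulate weighted votes
    return votes.argmax(dim=1)                                 # pseudo-labels

def assignment_loss(class_logits, pseudo_labels):
    """L_A: cross-entropy between predicted categories and the k-NN assignments."""
    return F.cross_entropy(class_logits, pseudo_labels)
```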

Metric Embedding Loss (LM)

The goal of this loss is to regularize the feature space so that the distance between features belonging to the same category is smaller than the distance between features of different categories.

We perform metric learning via the lifted structured loss.

L_M = \frac{1}{2|\hat{P}|} \sum_{(i,j) \in \hat{P}} \max\big(0, J(f_i, f_j)\big)^2

J(f_i, f_j) = D(f_i, f_j) + \max\Big\{ \max_{(i,k) \in \hat{N}} \alpha - D(f_i, f_k),\; \max_{(j,l) \in \hat{N}} \alpha - D(f_j, f_l) \Big\}
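Here P̂ denotes the set of same-category (positive) pairs, N̂ the set of different-category (negative) pairs, D a distance on the feature space and α a margin, following the standard lifted structured loss. A possible PyTorch rendering over a mini-batch is sketched below; the implementation details are assumptions, not the paper's code.

```python
import torch

def lifted_structured_loss(features, labels, margin=1.0):
    """L_M: lifted structured loss over a mini-batch of embeddings.

    features : (B, d) embedding vectors f_i
    labels   : (B,) category assignments (true or pseudo-labels)
    """
    D = torch.cdist(features, features)                      # pairwise distances D(f_i, f_j)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)         # (B, B) same-category mask

    # For every anchor, the hardest-negative term: max over negatives of (margin - D)
    neg_term = torch.where(~same, margin - D, torch.full_like(D, float("-inf")))
    hardest_neg = neg_term.max(dim=1).values                  # (B,)

    # Positive pairs (i, j) with i < j, excluding the diagonal
    pos_i, pos_j = torch.triu_indices(len(labels), len(labels), offset=1)
    mask = same[pos_i, pos_j]
    pos_i, pos_j = pos_i[mask], pos_j[mask]
    if pos_i.numel() == 0:
        return features.new_zeros(())

    J = D[pos_i, pos_j] + torch.maximum(hardest_neg[pos_i], hardest_neg[pos_j])
    return torch.clamp(J, min=0).pow(2).sum() / (2 * pos_i.numel())
```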

Proposed Loss Function

The full objective combines the ELBO terms with the auxiliary task: the reconstruction (LR), categorical (LC) and Gaussian (LG) losses together with the assignment (LA) and metric embedding (LM) losses.
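A minimal sketch of how the five terms could be combined; the weights w_A and w_M are hypothetical hyperparameters, since the poster does not state how the terms are balanced.

```python
def total_loss(L_R, L_C, L_G, L_A, L_M, w_A=1.0, w_M=1.0):
    # ELBO terms plus the auxiliary task losses; the weighting is illustrative only.
    return L_R + L_C + L_G + w_A * L_A + w_M * L_M
```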


Network Architecture

Architecture of our proposed model. The encoder and decoder are given by Convolutional Neural Networks. The red diamonds represent the reconstruction loss (LR), categorical loss (LC), Gaussian loss (LG), assignment loss (LA) and metric embedding loss (LM).
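As an illustration of such an architecture, the sketch below defines a small convolutional encoder and decoder for 28×28 images; all layer sizes, the latent dimensionality and the feature dimensionality are hypothetical and do not come from the poster.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Convolutional encoder producing features, a Gaussian latent and category logits."""
    def __init__(self, num_classes=10, z_dim=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),   # 28x28 -> 14x14
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),  # 14x14 -> 7x7
            nn.Flatten(),
        )
        self.features = nn.Linear(64 * 7 * 7, 256)  # feature space used by L_A and L_M
        self.mu = nn.Linear(256, z_dim)             # Gaussian latent parameters
        self.logvar = nn.Linear(256, z_dim)
        self.logits = nn.Linear(256, num_classes)   # categorical latent

    def forward(self, x):
        f = torch.relu(self.features(self.conv(x)))
        return f, self.mu(f), self.logvar(f), self.logits(f)

class Decoder(nn.Module):
    """Reconstructs the image from the Gaussian sample z and the category y (one-hot)."""
    def __init__(self, num_classes=10, z_dim=32):
        super().__init__()
        self.fc = nn.Linear(z_dim + num_classes, 64 * 7 * 7)
        self.deconv = nn.Sequential(
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),  # 7x7 -> 14x14
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),              # 14x14 -> 28x28
        )

    def forward(self, z, y_onehot):
        h = self.fc(torch.cat([z, y_onehot], dim=1)).view(-1, 64, 7, 7)
        return self.deconv(h)  # logits of the reconstructed image
```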

Quantitative Evaluation


Performance of our probabilistic model after adding the assignment (LA) and metric embedding (LM) losses at different epochs on the MNIST and SVHN datasets.

Semi-supervised test error (%) on MNIST, SVHN and CIFAR-10 datasets for randomly and evenly distributed labeled data. Colored rows denote methods that use Bayesian approaches.

Model           MNIST (100 labels)   SVHN (1000 labels)   CIFAR-10 (4000 labels)
M1+TSVM         11.82 (± 0.25)       54.33 (± 0.11)       -
SWWAE            8.71 (± 0.34)       23.56                -
M1+M2            3.33 (± 0.14)       36.02 (± 0.10)       -
SS-Clustering    1.66 (± 0.20)       20.66 (± 1.09)       -
SDGM             1.32 (± 0.07)       16.61 (± 0.24)       -
ADGM             0.96 (± 0.02)       22.86                -
Improved GAN     0.93 (± 0.65)        8.11 (± 1.3)        18.63 (± 2.32)
Conv-Ladder      0.89 (± 0.50)       -                    20.40 (± 0.47)
Triplet GAN      0.91 (± 0.58)        5.77 (± 0.17)       16.99 (± 0.36)
Bad GAN          0.79 (± 0.09)        4.25 (± 0.03)       14.41 (± 0.30)
Proposed         1.15 (± 0.21)       11.74 (± 0.05)       28.75 (± 0.04)

Qualitative Evaluation

Generated images of our proposed model trained on MNIST: random generation and style generation.
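A sketch of how both kinds of generation could be produced with the decoder defined earlier, assuming that random generation samples z from the prior and that style generation keeps the Gaussian code of a real image while varying the category; this reading of "style generation" is an assumption.

```python
import torch
import torch.nn.functional as F

def random_generation(decoder, num_classes=10, z_dim=32, samples_per_class=8):
    """Sample z ~ N(0, I) and decode it together with each class label y."""
    z = torch.randn(samples_per_class * num_classes, z_dim)
    y = torch.arange(num_classes).repeat_interleave(samples_per_class)
    y_onehot = F.one_hot(y, num_classes).float()
    return torch.sigmoid(decoder(z, y_onehot))  # generated images in [0, 1]

def style_generation(decoder, encoder, x, num_classes=10):
    """Keep the Gaussian code (style) of one real image x and vary the category y."""
    _, mu, _, _ = encoder(x)                    # style code from one image
    z = mu.repeat(num_classes, 1)
    y_onehot = F.one_hot(torch.arange(num_classes), num_classes).float()
    return torch.sigmoid(decoder(z, y_onehot))
```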

Visualization of the Feature Space

Feature-space visualizations for the MNIST, SVHN and CIFAR-10 datasets.

LatinX in AI Workshop @ ICML 2019, Long Beach, CA, USA. https://github.com/jariasf/semisupervised-vae-metric-embedding
