HAL Id: hal-02268478
https://hal.archives-ouvertes.fr/hal-02268478
Submitted on 21 Aug 2019
To cite this version:
Jhosimar George Arias Figueroa. Semi-supervised Learning using Deep Generative Models and Metric Embedding Auxiliary Task. LatinX in AI Research at ICML 2019, Jun 2019, Long Beach, United States. ⟨hal-02268478⟩
Semi-supervised Learning using Deep Generative Models and Metric Embedding Auxiliary Task
Jhosimar George Arias Figueroa
jariasf03@gmail.com
Motivation
In many real-world applications it is relatively easy to acquire large amounts of unlabeled data; obtaining the corresponding labels, however, often requires the effort of a human expert and special devices. This labeling bottleneck leaves us with small amounts of labeled data and large amounts of unlabeled data, so being able to exploit the surplus unlabeled data is desirable.
Background
Variational Autoencoder: Evidence Lower Bound (ELBO)
We approximate the intractable posterior p_θ(z|x) with a variational posterior q_ϕ(z|x), which defines a lower bound on log p(x).
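In its standard form, this bound reads

\log p_\theta(x) \;\geq\; \mathbb{E}_{q_\phi(z|x)}\big[\log p_\theta(x|z)\big] \;-\; D_{KL}\big(q_\phi(z|x)\,\|\,p(z)\big),

where the first term rewards faithful reconstruction and the KL term keeps the approximate posterior close to the prior p(z).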
Intuition
▶ Our encoder transforms the data into feature representations; from these representations we infer a Gaussian latent variable, which provides a more robust set of features, and a discrete latent variable, which provides category assignments.
▶ We take advantage of the small amount of labeled data to perform the category assignments for the unlabeled data.
▶ The decoder uses the Gaussian representations together with the category information for reconstruction (a minimal sketch of this pipeline follows below).
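As an illustration, here is a minimal PyTorch-style sketch of this pipeline. The module names, layer sizes, and the Gumbel-softmax relaxation for the discrete variable are our own assumptions for the sketch, not the exact poster architecture (which uses convolutional networks).

import torch
import torch.nn as nn
import torch.nn.functional as F

class InferenceNet(nn.Module):
    """Encoder: data -> features -> (Gaussian latent z, categorical latent y)."""
    def __init__(self, in_dim=784, feat_dim=128, z_dim=32, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.mu = nn.Linear(feat_dim, z_dim)             # Gaussian mean
        self.logvar = nn.Linear(feat_dim, z_dim)         # Gaussian log-variance
        self.logits = nn.Linear(feat_dim, num_classes)   # category logits

    def forward(self, x, tau=1.0):
        f = self.features(x)
        mu, logvar = self.mu(f), self.logvar(f)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        y = F.gumbel_softmax(self.logits(f), tau=tau)    # relaxed category assignment
        return f, z, y, mu, logvar

class GenerativeNet(nn.Module):
    """Decoder: (Gaussian representation z, category y) -> reconstruction."""
    def __init__(self, out_dim=784, z_dim=32, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim + num_classes, out_dim), nn.Sigmoid())

    def forward(self, z, y):
        return self.net(torch.cat([z, y], dim=1))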
Probabilistic Model
Diagrams on the poster show the inference model (the encoder side) and the generative model (the decoder side) as graphical models.
Evidence Lower Bound (ELBO)
The first term is the reconstruction loss (LR). The second and third terms act as regularizers of the categorical (LC) and Gaussian (LG) distributions, respectively.
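Written out, one decomposition consistent with this description (a reconstruction under the usual assumptions for this model family; the exact conditioning structure follows the poster's graphical model) is

-\mathrm{ELBO} = L_R + L_C + L_G, \quad \text{with} \quad
L_R = -\mathbb{E}_{q_\phi(y,z|x)}\big[\log p_\theta(x|y,z)\big], \quad
L_C = D_{KL}\big(q_\phi(y|x)\,\|\,p(y)\big), \quad
L_G = D_{KL}\big(q_\phi(z|x)\,\|\,p(z)\big).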
Auxiliary Task Loss
Assignment Loss (LA)
▶ To assign labels to the unlabeled data, we use a distance-weighted k-nearest neighbor (k-NN) classifier on the feature space.
▶ Once we have obtained the assignments for the unlabeled data, we apply the cross-entropy loss (see the sketch below).
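A minimal NumPy sketch of distance-weighted k-NN assignment; the inverse-distance weighting and the constants (k, eps) are common choices assumed here, not taken from the poster.

import numpy as np

def assign_pseudo_labels(feats_l, labels_l, feats_u, k=5, num_classes=10, eps=1e-8):
    """Label unlabeled features by a distance-weighted vote of the k nearest labeled features."""
    # Pairwise Euclidean distances, shape (num_unlabeled, num_labeled)
    dist = np.linalg.norm(feats_u[:, None, :] - feats_l[None, :, :], axis=-1)
    nn_idx = np.argsort(dist, axis=1)[:, :k]      # indices of the k nearest labeled points
    votes = np.zeros((len(feats_u), num_classes))
    for i, idx in enumerate(nn_idx):
        weights = 1.0 / (dist[i, idx] + eps)      # closer neighbors vote more strongly
        for w, c in zip(weights, labels_l[idx]):
            votes[i, c] += w
    return votes.argmax(axis=1)                   # pseudo-labels for the cross-entropy loss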
Metric Embedding Loss (LM)
▶ The goal of this loss is to regularize the feature space so that the distance between features belonging to the same category is smaller than the distance between features of different categories.
▶ We perform metric learning via the lifted structured loss, defined below (a code sketch follows the equations).
L_M = \frac{1}{2|\hat{P}|} \sum_{(i,j)\in\hat{P}} \max\big(0, J(f_i, f_j)\big)^2

J(f_i, f_j) = D(f_i, f_j) + \max\Big\{ \max_{(i,k)\in\hat{N}} \big(\alpha - D(f_i, f_k)\big),\; \max_{(j,l)\in\hat{N}} \big(\alpha - D(f_j, f_l)\big) \Big\}
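Here P̂ and N̂ denote the positive and negative pairs in the mini-batch, D a distance in feature space, and α a margin, following the usual lifted-structure notation. A PyTorch sketch under those conventions (the vectorized batch handling is our own choice):

import torch

def lifted_structured_loss(feats, labels, alpha=1.0):
    """Lifted structured loss over a mini-batch of features and their (pseudo-)labels."""
    D = torch.cdist(feats, feats)                       # pairwise Euclidean distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)   # True where labels match
    neg_inf = torch.full_like(D, float('-inf'))
    # Per anchor i: max over negative pairs (i, k) of (alpha - D_ik)
    hardest_neg = torch.where(~same, alpha - D, neg_inf).max(dim=1).values
    # Positive pairs (i, j), each counted once (upper triangle, diagonal excluded)
    idx_i, idx_j = same.triu(diagonal=1).nonzero(as_tuple=True)
    if idx_i.numel() == 0:                              # no positive pair in this batch
        return feats.new_zeros(())
    J = D[idx_i, idx_j] + torch.maximum(hardest_neg[idx_i], hardest_neg[idx_j])
    return (torch.clamp(J, min=0.0) ** 2).mean() / 2.0  # (1 / 2|P|) * sum of max(0, J)^2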
Proposed Loss Function
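The architecture figure below marks five losses (LR, LC, LG, LA, LM), and the overall training objective combines them. A plausible form, with the weighting coefficients λ_A and λ_M being our assumption rather than taken from the poster, is

L = L_R + L_C + L_G + \lambda_A L_A + \lambda_M L_M.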
Network Architecture
Architecture of our proposed model. The encoder and decoder are convolutional neural networks. The red diamonds mark the reconstruction loss (LR), categorical loss (LC), Gaussian loss (LG), assignment loss (LA), and metric embedding loss (LM).
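To show where each loss attaches, here is a hedged PyTorch sketch of the combined objective. The uniform categorical prior, standard Gaussian prior, Bernoulli reconstruction term (pixels in [0, 1]), and loss weights are all assumptions for illustration; lifted_structured_loss is the sketch given earlier.

import torch
import torch.nn.functional as F

def total_loss(x, x_hat, y_logits, mu, logvar, feats, targets,
               lambda_a=1.0, lambda_m=1.0, alpha=1.0):
    """Combine the five losses marked in the architecture figure (weights assumed)."""
    L_R = F.binary_cross_entropy(x_hat, x)                        # reconstruction loss
    q = F.softmax(y_logits, dim=1)
    L_C = (q * (q * q.size(1) + 1e-8).log()).sum(dim=1).mean()    # KL(q(y|x) || Uniform)
    L_G = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()  # KL(q(z|x) || N(0, I))
    L_A = F.cross_entropy(y_logits, targets)                      # true or k-NN pseudo-labels
    L_M = lifted_structured_loss(feats, targets, alpha)           # metric embedding loss
    return L_R + L_C + L_G + lambda_a * L_A + lambda_m * L_M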
Quantitative Evaluation
Performance of our probabilistic model after adding the assignment (LA) and metric embedding (LM) losses at different epochs on the MNIST and SVHN datasets.
Semi-supervised test error (%) on MNIST, SVHN and CIFAR-10 datasets for randomly and evenly distributed labeled data. Colored rows denote methods that use Bayesian approaches.
Model            MNIST (100 labels)   SVHN (1000 labels)   CIFAR-10 (4000 labels)
M1+TSVM          11.82 (± 0.25)       54.33 (± 0.11)       –
SWWAE            8.71 (± 0.34)        –                    23.56
M1+M2            3.33 (± 0.14)        36.02 (± 0.10)       –
SS-Clustering    1.66 (± 0.20)        20.66 (± 1.09)       –
SDGM             1.32 (± 0.07)        16.61 (± 0.24)       –
ADGM             0.96 (± 0.02)        22.86                –
Improved GAN     0.93 (± 0.65)        8.11 (± 1.3)         18.63 (± 2.32)
Conv-Ladder      0.89 (± 0.50)        –                    20.40 (± 0.47)
Triplet GAN      0.91 (± 0.58)        5.77 (± 0.17)        16.99 (± 0.36)
Bad GAN          0.79 (± 0.09)        4.25 (± 0.03)        14.41 (± 0.30)
Proposed         1.15 (± 0.21)        11.74 (± 0.05)       28.75 (± 0.04)
Qualitative Evaluation
Random generation and style generation: images generated by our proposed model trained on MNIST.
Visualization of the Feature Space
Learned feature spaces visualized for MNIST, SVHN, and CIFAR-10.