
Effective Training of Convolutional Neural Networks for Insect Image Recognition


Academic year: 2022


(1)

Effective Training of Convolutional Neural Networks for Insect Image Recognition

Maxime Martineau, Romain Raveaux, Clément Chatelain, Donatello Conte, Gilles Venturini

ACIVS

LIFAT EA 6300

(2)

Outline

1. Context

2. Theoretical context

3. State of the art

4. Convolutional Neural Networks

5. Proposed method

6. Results

(3)

Context

(4)

Arthropod identification

Figure 1: Examples of insect images. Top: an image acquired in a controlled environment. Bottom: an image acquired in a field-based environment.

(5)

Applications

• Applied entomology

• Estimation of the insect populations

• Biodiversity assessment

• Integrated pest management


(6)

Why automation?

• Complex task

• Needs a lot of qualified workforce

(7)

Arthropod identification

How to automate the task?


(8)

Theoretical context

(9)

Theoretical context

Image classification

Let x ∈ ℝ^(n×m×3) be an image and let C be a class set.
We are searching for the classifier function f such that:

f : ℝ^(n×m×3) → C, x ↦ f(x)
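As a concrete toy illustration of this signature — with a hypothetical class set and a deliberately trivial decision rule standing in for the learned mapping f:

```python
import numpy as np

# Hypothetical class set C (insect orders), for illustration only
C = ["Coleoptera", "Diptera", "Lepidoptera"]

def f(x: np.ndarray) -> str:
    """Toy classifier R^(n x m x 3) -> C.

    Stands in for the learned mapping: it just picks the class whose
    index matches the dominant colour channel of the image.
    """
    channel_means = x.mean(axis=(0, 1))       # one mean per RGB channel
    return C[int(np.argmax(channel_means))]

x = np.zeros((32, 32, 3))
x[..., 1] = 1.0                               # green-dominant image
print(f(x))                                   # -> Diptera
```

Any real classifier (SVM on handcrafted features, CNN, ...) is a more elaborate instance of the same function signature.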


(10)

Theoretical context

Image-based insect classification

• High intra-class variability

• Sometimes low inter-class variability

• Multi-granularity

• Different sceneries (lab, field, . . . )

order

family

genus

species

(11)

State of the art

(12)

A survey on image-based insect classification

44 articles surveyed.

Features recorded for each article:

• Granularity
• Number of taxa
• Type of capture
• Constrained pose?
• Datasets
• Area of image
• Preprocessing
• Types of features used
• Classifier(s) used
• Accuracy
• Validation
• Cited

Clustering of the articles along these features.

(13)

State of the art

Figure: typical image-based insect classification pipeline. Image capture (e.g. entomart, Gassoumi 2000, janzen.sas.upenn.edu) → feature extraction (colour, SIFT, shape, BoW, sparse stacked auto-encoders, MLP, ...) → classification (SVM, decision tree, MLP, kNN, bagging, boosting, ...).

(14)

Features used

Handcrafted features:

• Domain-dependent: wing venations, geometry
• Global and generic image features: shape, colour, texture, raw pixels
• Local features: SIFT, others

Mid-level features:

• Unsupervised representations: BoW, PCA
• Supervised representations: MLP, sparse coding
• Hierarchical representations: auto-encoders

(15)

Conclusions

• More and more generic, learning-based approaches
• But no Convolutional Neural Network approach yet



(17)

Convolutional Neural Networks

(18)

Convolutional Neural Networks

Neural networks using convolution

(19)

Convolution

Source: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
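The sliding-window operation animated on this slide can be sketched in a few lines of NumPy — a plain "valid" cross-correlation, which is what CNN layers actually compute; the image and kernel values below are made up for illustration:

```python
import numpy as np

def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid 2D cross-correlation: slide the kernel over the image and
    take the elementwise product-sum at each position."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 image
edge = np.array([[1.0, 0.0, -1.0]] * 3)            # simple vertical-edge kernel
fmap = conv2d(image, edge)
print(fmap.shape)                                  # -> (3, 3)
```

A convolutional layer applies a bank of such kernels (learned, one per output channel) followed by a nonlinearity.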




(31)

Convolutional Neural Networks

• State of the art in image classification
• Can learn complex mappings between images and classes
• Need a lot of data


(32)

A lot of data

ImageNet:

• 1000 classes
• 15 M images

Our dataset:

• 30 classes
• 3000 images

(33)

Proposed method

(34)

Proposed method

Efficiently apply transfer learning to CNNs for insect image recognition.

(35)

Proposed method

ImageNet-1000 source network (VGG16):

3×3 conv, 64 | 3×3 conv, 64 | maxpool/2
3×3 conv, 128 | 3×3 conv, 128 | maxpool/2
3×3 conv, 256 | 3×3 conv, 256 | 3×3 conv, 256 | maxpool/2
3×3 conv, 512 | 3×3 conv, 512 | 3×3 conv, 512 | maxpool/2
3×3 conv, 512 | 3×3 conv, 512 | 3×3 conv, 512 | maxpool/2
flatten | fc 4096 | fc 4096 | fc 1000

Target network:

same convolutional stack as above
global avgpool | fc 256 | fc n
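A back-of-the-envelope count shows why swapping the flatten + fc-4096 head for global average pooling + a small fc layer matters: nearly all of VGG16's head parameters disappear. The sketch below derives the counts from the layer sizes shown above, assuming the standard 7×7×512 final VGG16 feature map and, hypothetically, n = 30 target classes:

```python
def fc_params(n_in: int, n_out: int) -> int:
    """Parameter count of a fully connected layer: weights + biases."""
    return n_in * n_out + n_out

# Original VGG16 head: flatten (7*7*512 = 25088) -> fc 4096 -> fc 4096 -> fc 1000
orig_head = (fc_params(7 * 7 * 512, 4096)
             + fc_params(4096, 4096)
             + fc_params(4096, 1000))

# Target head: global avgpool (512 values) -> fc 256 -> fc n, with n = 30 assumed
new_head = fc_params(512, 256) + fc_params(256, 30)

print(orig_head)   # ~124 M parameters
print(new_head)    # ~139 k parameters
```

Global average pooling also makes the head independent of the spatial size of the feature map, so input resolution need not match ImageNet's.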


(36)

Results

(37)

Comparative study

Model        IRBI Top-1      IRBI Top-5      ImageNet-arthropods Top-1   Top-5
SIFTBoW      52.3 % ± 3.7    82.7 % ± 3.3    11.7 % ± 0.2                25.9 % ± 0.4
VGG16-frsc   54.0 % ± 5.0    84.9 % ± 3.0    26.9 % ± 0.7                50.1 % ± 0.7
VGG16-fitu   73.6 % ± 1.8    92.4 % ± 2.2    43.5 % ± 1.1                71.3 % ± 0.8

Table 1: Recognition rates on 5-fold cross-validation.


(38)

How much do we have to learn?

(Same figure as slide 35: the ImageNet-1000 VGG16 source network and the target network, which differ only in the head.)

(39)

How much do we have to learn?

(40)

Results updated

Model        IRBI Top-1      mean time (s)   ImageNet-arthropods Top-1   mean time (s)
SIFTBoW      52.3 % ± 3.7    —               11.7 % ± 0.2                —
VGG16-frsc   54.0 % ± 5.0    101             26.9 % ± 0.7                1470
VGG16-fitu   73.6 % ± 1.8    102.6           43.5 % ± 1.1                1473.2
VGG16-fitu7  72.4 % ± 2.8    52.4            43.3 % ± 0.6                721.6

Table 2: Recognition rates and mean epoch times on 5-fold cross-validation.
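VGG16-fitu7 trains only part of the network, which roughly halves the epoch time at almost no accuracy cost. As a purely illustrative sketch — the exact split is not specified here; we hypothetically freeze the first seven convolutional layers — one can count how much of the convolutional stack would still be trained:

```python
# VGG16 convolutional layers as (in_channels, out_channels) pairs
vgg16_convs = [(3, 64), (64, 64),
               (64, 128), (128, 128),
               (128, 256), (256, 256), (256, 256),
               (256, 512), (512, 512), (512, 512),
               (512, 512), (512, 512), (512, 512)]

def conv_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Parameter count of a k x k convolution: weights + biases."""
    return k * k * c_in * c_out + c_out

FROZEN = 7  # hypothetical split: freeze the first 7 conv layers

trainable = sum(conv_params(ci, co) for ci, co in vgg16_convs[FROZEN:])
total = sum(conv_params(ci, co) for ci, co in vgg16_convs)
print(f"{trainable / total:.0%} of conv parameters still trained")  # -> 88%
```

Most parameters sit in the late 512-channel layers, so the time saving comes less from the parameter count than from skipping backpropagation through the frozen early layers.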

(41)

Conclusions

• An efficient way of using CNNs with transfer learning
• Application to insect image recognition
• Training made ∼2× faster by skipping the training of some layers


(42)

Thanks!
