
Analysis of the Scalability of a Deep-Learning Network for Steganography "Into the Wild"


(1)

Analysis of the Scalability of a Deep-Learning Network for Steganography "Into the Wild"

Hugo RUIZ¹,², Marc CHAUMONT¹,⁴, Mehdi YEDROUDJ¹,², Ahmed OULAD AMARA¹,², Frédéric COMBY¹,², Gérard SUBSOL¹,³

¹ LIRMM, ² Univ Montpellier, ³ CNRS, ⁴ Univ Nîmes; Montpellier, France

December 29, 2020

ICPR'2021, International Conference on Pattern Recognition; MMForWILD'2021, Workshop on MultiMedia FORensics in the WILD; Lecture Notes in Computer Science (LNCS), Springer.

January 10-15, 2021, Virtual Conference due to Covid (formerly Milan, Italy).

(2)

Outline

Introduction
Our test bench to assess scalability for DL-based steganalysis
  Choice of the network for JPEG steganalysis
  Choice of the database
  Choice of the payload
Experimental protocol
Results
Conclusions and perspectives

(3)


Introduction

Steganography / Steganalysis

(4)

Empirical security measurement:

Steganalysis empirical security-measurement ingredients:

- A few state-of-the-art CNN networks,
- A database,
- A scenario, such as the clairvoyant one:
  = a laboratory scenario,
  = the worst-case attack for Alice.

(5)


Introduction

Empirical security measurement:

Steganalysis empirical security-measurement ingredients:

- A few state-of-the-art CNN networks,
  → What is the minimum size required
    - to cope with a growing database,
    - to cope with growing diversity,
    - to be in the over-parameterized region?
  → Does the accuracy ranking hold if the database is larger?

- A database,
  → What is the minimum size needed to beat a random guesser?
  → Do CNNs collapse or not if the training set is larger?

(6)

(1) Macroscopic black-box first observations:

Model scaling general behavior:

→ It is beneficial to use over-parameterized networks,
i.e. networks with millions of parameters (≥ 10⁶).

(7)


Introduction

(2) Macroscopic black-box first observations:

Data scaling general behavior:

→ In the power-law region, the more data, the better the results,

→ The power-law region seems to start between 10⁴ and 10⁵ images.

(8)

General model for those two behaviors:

The test error (noted ε̃) can be simplified¹ as in [*]:

ε̃(m, n) = a·n^(−α) [dataset power-law] + b·m^(−β) [model power-law] + c_∞

- a, b, α, β, c_∞ are real positive constants,
- n = dataset size, m = model size,
- α and β control the speed of the power-law decrease,
- c_∞ is the irreducible error.

[*] Rosenfeld, J.S., Rosenfeld, A., Belinkov, Y., Shavit, N., "A Constructive Prediction of the Generalization Error Across Scales," ICLR'2020, Apr 2020.

¹ In the power-law regions.
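To make this concrete, here is a minimal Python sketch of the joint power-law model; the constant values below are made-up placeholders, not values taken from the paper:

```python
def test_error(n, m, a=0.5, alpha=0.1, b=2.0, beta=0.3, c_inf=0.15):
    """Joint power-law model of the test error (Rosenfeld et al., ICLR'2020).

    n: dataset size, m: model size (both assumed in their power-law regions).
    The constant values here are illustrative placeholders only.
    """
    return a * n ** -alpha + b * m ** -beta + c_inf

# More data (n) and a bigger model (m) both lower the predicted error,
# which can never drop below the irreducible error c_inf.
for n in (2e4, 1e5, 1e6):
    print(f"n = {n:.0e}, m = 3e5 -> predicted error: {test_error(n, 3e5):.3f}")
```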

(9)


Introduction

Effect of increasing the dataset size:

In this paper, we use only one CNN and study the effect of database scaling.

In the dataset power-law region, we should observe the power-law decrease [*]:

ε(n) = a₀·n^(−α₀) + c₀

[*] Hestness, J., Narang, S., Ardalani, N., Diamos, G., Jun, H., Kianinejad, H., Patwary, M.M.A., Yang, Y., Zhou, Y., "Deep Learning Scaling is Predictable, Empirically," unpublished, arXiv:1712.00409, 2017.

(10)

Why study the effect of increasing the dataset size?

- The ML community has observed this power-law. What about steganalysis?
- Database scaling: an important ingredient for empirical security analysis?
- Model scaling in steganalysis = future work².

² First observations were made during the Alaska#2 JPEG steganalysis competition, with the modified EfficientNet networks, which are built on the principle of gradually scaling up the EfficientNet architecture.

(11)


Our test bench to assess scalability for DL-based steganalysis

Outline

Introduction
Our test bench to assess scalability for DL-based steganalysis
  Choice of the network for JPEG steganalysis
  Choice of the database
  Choice of the payload
Experimental protocol
Results
Conclusions and perspectives

(12)

Outline

Introduction
Our test bench to assess scalability for DL-based steganalysis
  Choice of the network for JPEG steganalysis
  Choice of the database
  Choice of the payload
Experimental protocol
Results
Conclusions and perspectives

(13)

Our test bench to assess scalability for DL-based steganalysis
Choice of the network for JPEG steganalysis

Choice of the network for JPEG steganalysis:

Low Complexity network (LC-Net) [*]:

- One of the state-of-the-art CNNs until mid-2020,
- 20 times fewer parameters than SRNet,
- Faster learning than other networks,
- A medium-size model (3×10⁵ parameters),
  → WARNING: model size close to the interpolation threshold,
  → hence early stopping during learning.

[*] Huang, J., Ni, J., Wan, L., Yan, J., "A Customized Convolutional Neural Network with Low Model Complexity for JPEG Steganalysis," ACM IH&MMSec'2019, Jul 2019.

(14)

LC-Net rapid overview:

[Figure: LC-Net architecture. The input image goes through a preprocessing module (high-pass-filter convolution + TLU), a convolutional module of six blocks, and a classification module (fully connected layer + softmax) deciding cover vs. stego.]

Ingredients:

- 30 SRM filters for the pre-processing module,
- 6 blocks using residual connections,
- Blocks 3 to 6 downsample the feature maps,
- ReLU, Batch Norm, and 3×3 convolutions.
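To make the block structure concrete, here is a minimal PyTorch sketch of a residual block in the spirit of LC-Net (3×3 convolutions, Batch Norm, ReLU, residual connections, downsampling in the later blocks). It is an illustrative sketch, not the authors' exact LC-Net: the channel counts and the shortcut projection are assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """3x3 conv + BN + ReLU block with a residual connection.

    When downsample=True (blocks 3 to 6 in LC-Net), the feature maps are
    halved spatially and the shortcut is projected with a 1x1 convolution.
    """
    def __init__(self, in_ch, out_ch, downsample=False):
        super().__init__()
        stride = 2 if downsample else 1
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        # Project the shortcut only when the tensor shape changes.
        if in_ch == out_ch and not downsample:
            self.shortcut = nn.Sequential()  # identity
        else:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.shortcut(x))

# Example: one plain block and one downsampling block on a 256x256 input,
# e.g. on the 30 feature maps produced by the SRM high-pass filters.
x = torch.randn(1, 30, 256, 256)
y = ResidualBlock(30, 30)(x)
z = ResidualBlock(30, 64, downsample=True)(y)
print(y.shape, z.shape)  # (1, 30, 256, 256) (1, 64, 128, 128)
```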

(15)

Our test bench to assess scalability for DL-based steganalysis
Choice of the database

Outline

Introduction
Our test bench to assess scalability for DL-based steganalysis
  Choice of the network for JPEG steganalysis
  Choice of the database
  Choice of the payload
Experimental protocol
Results
Conclusions and perspectives

(16)

Choice related to the database:

Requirements:

- Grey-level images (color steganalysis is not well enough understood),
- More than one million images (a large dataset is needed),
- A controlled database (easier to analyze and to generate),
- A diverse database (more realistic),
- A quality factor of 75:
  → robustness to quantization diversity is not well enough understood,
  → this will facilitate future comparisons with uncontrolled databases;
- Small-size images (256×256; memory budget).

The LSSD database is available at:

http://www.lirmm.fr/~chaumont/LSSD.html
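As an illustration of these constraints (not the actual LSSD generation pipeline), one could prepare such an image with Pillow as follows; the file names are placeholders:

```python
from PIL import Image

def prepare_image(src_path, dst_path):
    """Grey-level, 256x256, JPEG QF=75: the constraints listed above.

    Illustrative only; the real LSSD pipeline may differ, e.g. in how the
    RAW images are developed and how the 256x256 crops are taken.
    """
    img = Image.open(src_path).convert("L")   # grey level
    img = img.crop((0, 0, 256, 256))          # top-left 256x256 crop
    img.save(dst_path, "JPEG", quality=75)    # quality factor 75

prepare_image("source.png", "cover.jpg")
```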

(17)

Our test bench to assess scalability for DL-based steganalysis
Choice of the payload

Outline

Introduction
Our test bench to assess scalability for DL-based steganalysis
  Choice of the network for JPEG steganalysis
  Choice of the database
  Choice of the payload
Experimental protocol
Results
Conclusions and perspectives

(18)

Choice of the payload:

Objectives:

- Accuracy in [60%, 70%] for a small database (≈ 20,000 images), i.e. sufficiently far from the random-guess region,
- → a large progression margin (when the dataset is scaled),
- → room for future work (using better networks).

→ J-UNIWARD at 0.2 bpnzacs on grey-level JPEG 256×256 images from the LSSD database, with QF = 75.
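Here, "bpnzacs" stands for bits per non-zero AC (DCT) coefficient. A hypothetical sketch of how the embedded payload size follows from this rate, assuming the quantized DCT coefficients are already available as a NumPy array:

```python
import numpy as np

def payload_bits(dct_coeffs, rate_bpnzac=0.2):
    """Payload size in bits for a rate given in bits per non-zero AC coefficient.

    dct_coeffs: quantized DCT coefficients of shape (H//8, W//8, 8, 8);
    position (0, 0) of each 8x8 block is the DC coefficient, excluded here.
    """
    ac = dct_coeffs.copy()
    ac[:, :, 0, 0] = 0                 # mask out the DC coefficients
    nnz_ac = np.count_nonzero(ac)      # number of non-zero AC coefficients
    return rate_bpnzac * nnz_ac

# Toy example on random coefficients for a 256x256 image.
rng = np.random.default_rng(0)
coeffs = rng.integers(-2, 3, size=(32, 32, 8, 8))
print(f"embedded payload: {payload_bits(coeffs):.0f} bits")
```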

(19)


Experimental protocol

Outline

Introduction
Our test bench to assess scalability for DL-based steganalysis
  Choice of the network for JPEG steganalysis
  Choice of the database
  Choice of the payload
Experimental protocol
Results
Conclusions and perspectives

(20)

Experimental protocol

Essential points:

- 4 learning sets: 20k, 100k, 200k, and 1 million (cover+stego) JPEG images,
- 5 models trained for each learning set (std < 0.8% for 20k),
- 1 unique test set: 200k (cover+stego) JPEG images,
- LC-Net hyper-parameters almost identical to those of the original paper,
- Use of an IBM container with access to 2 Tesla V100 GPUs.

(21)


Results

Outline

Introduction
Our test bench to assess scalability for DL-based steganalysis
  Choice of the network for JPEG steganalysis
  Choice of the database
  Choice of the payload
Experimental protocol
Results
Conclusions and perspectives

(22)

Results

Figure: Average probability of error with respect to the learning-database size. Notice that the abscissa scale is logarithmic.

(23)


Results

Analysis

Essential points:

- Accuracy improved by 6% from 20k to 1M images,
- LC-Net's performance does not collapse,
- The standard deviation gets smaller and smaller,
  → the learning process is more and more stable.

Other facts:

- Time consumption:
  - 20k ≈ 2 hours,
  - 1 million ≈ 10 days;
- Memory consumption:
  - 20k ≈ 10 GB (MAT file in double precision),
  - 1 million ≈ 500 GB (MAT file in double precision).
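These storage figures match simple arithmetic, as the following sketch shows (it assumes each 256×256 grey-level image is stored as uncompressed double-precision values, as in a MAT file):

```python
BYTES_PER_PIXEL = 8            # double precision
IMAGE_PIXELS = 256 * 256

def mat_storage_gb(n_images):
    """Approximate MAT-file size for n_images grey-level 256x256 images."""
    return n_images * IMAGE_PIXELS * BYTES_PER_PIXEL / 1e9

print(f"20k images ~ {mat_storage_gb(20_000):.0f} GB")     # ~10 GB
print(f"1M images  ~ {mat_storage_gb(1_000_000):.0f} GB")  # ~524 GB, i.e. ~500 GB
```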

(24)

What about the power-law?

Using a non-linear regression with Lagrange multipliers:

ε(n) = 0.492415·n^(−0.086236) + 0.168059

- It would be erroneous to affirm that the irreducible P_E is 16.8%,
- but, without much error on the prediction, the probability of error for 20M images should be close to 28%,
- For 20k images it was 37%,
  → a 9% reduction of the error probability, which is a considerable improvement in the steganalysis domain.
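Such a fit can be sketched with SciPy as below, using plain non-linear least squares rather than the authors' Lagrange-multiplier formulation; the (n, error) measurements are synthesized here from the reported curve, whereas in practice they would come from the trained models:

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, alpha, c):
    """Dataset power-law: error(n) = a * n^(-alpha) + c."""
    return a * n ** -alpha + c

# Hypothetical (n, error) measurements, generated from the fitted curve
# reported above; real measurements would come from the trained models.
n_train = np.array([2e4, 1e5, 2e5, 1e6])
errors = power_law(n_train, 0.492415, 0.086236, 0.168059)

params, _ = curve_fit(power_law, n_train, errors, p0=(0.5, 0.1, 0.1))
a, alpha, c = params
print(f"a={a:.4f}, alpha={alpha:.4f}, c={c:.4f}")
print(f"predicted error at 20M images: {power_law(2e7, a, alpha, c):.3f}")  # ~0.28
```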

(25)


Conclusions and perspectives

Outline

Introduction
Our test bench to assess scalability for DL-based steganalysis
  Choice of the network for JPEG steganalysis
  Choice of the database
  Choice of the payload
Experimental protocol
Results
Conclusions and perspectives

(26)

Conclusions (1)

The error power-law is also observed for steganalysis:

- even with a medium-size model (3 × 10⁵ parameters),
- even when starting with a medium-size database (2 × 10⁴ images).

Take-away message:

Increasing the database a lot (to 20 million images) will make you win almost 10% in accuracy.

(27)


Conclusions and perspectives

Conclusions (2)

Future work:

- Evaluate with more diversity (quality factors, payload sizes, embedding algorithms, colour, a less controlled database),
- Evaluate with other networks,
- Reduce the learning time and optimize memory management,
- Find a more precise irreducible-error value,
- Study the slope of the power-law depending on the starting point of the CNN (use of transfer learning, curriculum learning, or data augmentation such as pixels-off),
- Find innovative techniques to increase performance when the database is not huge.
