
HAL Id: hal-01687936

https://hal.archives-ouvertes.fr/hal-01687936v6

Preprint submitted on 5 Mar 2019

About the inevitability of adversarial examples

Adrien Chan-Hon-Tong

To cite this version:

Adrien Chan-Hon-Tong. About the inevitability of adversarial examples. 2019. ⟨hal-01687936v6⟩


About the inevitability of adversarial examples

Adrien Chan-Hon-Tong

March 5, 2019

Abstract

Adversarial examples are a burning issue for deep learning. Yet the basic definition of an adversarial example is not as trivial as it seems. Indeed, if one considers adversarial examples to be points close to the decision boundary, then one can prove that adversarial examples are not only inevitable but almost everywhere. Yet, seeing adversarial examples as points that are not expected to be close to the decision boundary may lead to very different conclusions.

1 Introduction

Since [4], deep learning has been the state of the art in supervised machine learning (see [5] for a review). In computer vision in particular, deep learning has brought an important jump in performance. Today, a large part of the community thinks that deep learning may enable critical applications like autonomous driving [1] or health care [2].

Yet deep learning successes are plagued by a phenomenon called adversarial examples [8, 12, 9]: it is possible to design a specific, invisible perturbation such that the network predicts different outputs for the original and the disturbed input.

There is a very large effort in the community to mitigate this issue (see for example [3]). As this effort is currently not very effective, [10] recently asked the question of the inevitability of adversarial examples.

Indeed, [10] looks at adversarial examples from a geometric point of view and offers very interesting statements. Eventually, [10] claims that adversarial examples are inevitable.

Yet, in this paper, I discuss this claim. Of course, I do not dispute the results (which are properly proven); I discuss their interpretation. It seems that the basic definition of adversarial examples is not that trivial, and that using different definitions of adversarial examples unsurprisingly leads to very different conclusions.

2 Some geometric lemmas

Consistently with classical notations, $||\cdot||_p$ for $p \in \{1, 2, \infty\}$ will denote the classical norm $1$, $2$ or $\infty$. $B_p(x, \delta)$ will be the ball of center $x$ and radius $\delta$ for the norm $p$. $|\cdot|$ will denote either the absolute value or the volume of a set.


2.1 $L_2$ ball

Consider $B_2(0,1)$, the unit ball of $\mathbb{R}^D$, and define $S_{\alpha,\beta} = \{x \in B_2(0,1) \,/\, \alpha \le x_D \le \beta\}$. This is the cap of the ball enclosed between the two hyperplanes $\{x \,/\, x_D = \alpha\}$ and $\{x \,/\, x_D = \beta\}$ (notice that, by rotation, any diameter could be used instead of the $D$-axis).

Fix $0 < \varepsilon < 1$, and imagine a classifier cutting the ball into two equal halves: $S_{-1,0}$ and $S_{0,1}$. The set of points close to the boundary of this cut is $S_{\mathrm{ambiguous}} = \{x \in B_2(0,1) \,/\, \exists x' \in B_2(0,1),\ ||x - x'||_2 \le \varepsilon,\ \{x, x'\} \not\subset S_{-1,0},\ \{x, x'\} \not\subset S_{0,1}\}$. Obviously, $S_{\mathrm{ambiguous}} = S_{-\varepsilon,\varepsilon}$.

Now, let us focus on the volume of $S_{\mathrm{ambiguous}}$: it is linked to the volume of a hyperspherical cap. Indeed, $|S_{\mathrm{ambiguous}}| = |B_2(0,1)| - 2 \times |S_{\varepsilon,1}|$, and $|S_{\varepsilon,1}|$ is the volume of the cap of height $1-\varepsilon$. Define $\mu_2(D,\varepsilon) = \frac{|S_{\mathrm{ambiguous}}|}{|B_2(0,1)|}$ (this quantity will be discussed in section 3). According to [6], it holds that
$$\mu_2(D,\varepsilon) = 1 - I_{2(1-\varepsilon)-(1-\varepsilon)^2}\!\left(\tfrac{D+1}{2}, \tfrac{1}{2}\right) = 1 - I_{1-\varepsilon^2}\!\left(\tfrac{D+1}{2}, \tfrac{1}{2}\right),$$
where $I_x(a,b)$ is the regularized incomplete beta function.

Yet, according to [7], $I_x(a,b) = \frac{x^a}{a}\big(O(1) + O(x)\big) \xrightarrow[a \to \infty]{} 0$ for fixed $0 < x < 1$. In the particular case of $\mu_2(D,\varepsilon)$, this implies that $\mu_2(D,\varepsilon) \xrightarrow[D \to \infty]{} 1$ (for a fixed value of $\varepsilon$).
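As a sanity check, this asymptotic behaviour is easy to observe numerically. The sketch below is illustrative only (it is not part of the paper; the sampler, sample sizes and seed are arbitrary choices): it draws points uniformly in the $L_2$ ball and estimates the relative volume of $S_{-\varepsilon,\varepsilon}$ directly, without going through the incomplete beta function.

```python
# Illustrative Monte Carlo check (not from the paper): estimate the fraction of
# the unit L2 ball lying in the slab S_{-eps,eps} = {|x_D| <= eps} as D grows.
import numpy as np

rng = np.random.default_rng(0)

def sample_l2_ball(n, d):
    """Uniform samples in the unit L2 ball of R^d: Gaussian direction, radius U^(1/d)."""
    g = rng.standard_normal((n, d))
    directions = g / np.linalg.norm(g, axis=1, keepdims=True)
    radii = rng.random(n) ** (1.0 / d)
    return directions * radii[:, None]

eps = 0.1
for d in (2, 10, 100, 1000):
    x = sample_l2_ball(20_000, d)
    print(f"D={d:5d}  mu_2 ~= {np.mean(np.abs(x[:, -1]) <= eps):.3f}")
# The estimated fraction climbs toward 1 as D grows, as mu_2(D, eps) -> 1 predicts.
```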

2.2 $L_1$ ball

Consider $B_1(0,1)$, the unit ball of $\mathbb{R}^D$ for the $L_1$ norm, and define $H_{\alpha,\beta} = \{x \in B_1(0,1) \,/\, \alpha \le x_D \le \beta\}$. This is again the cap of the ball (for the $L_1$ norm) enclosed between the two hyperplanes $\{x \,/\, x_D = \alpha\}$ and $\{x \,/\, x_D = \beta\}$, but here the axis of the cut is crucial.

Fix $0 < \varepsilon < 1$, and imagine again a classifier cutting the ball into two equal halves: $H_{-1,0}$ and $H_{0,1}$. The set of points close to the boundary of this cut is $H_{\mathrm{ambiguous}} = \{x \in B_1(0,1) \,/\, \exists x' \in B_1(0,1),\ ||x - x'||_2 \le \varepsilon,\ \{x, x'\} \not\subset H_{-1,0},\ \{x, x'\} \not\subset H_{0,1}\}$. Obviously, here again, $H_{\mathrm{ambiguous}} = H_{-\varepsilon,\varepsilon}$.

Define $\mu_1(D,\varepsilon) = \frac{|H_{\mathrm{ambiguous}}|}{|B_1(0,1)|}$. As $B_1(0,1)$ can be written as the disjoint union $H_{-1,-\varepsilon} \cup H_{-\varepsilon,\varepsilon} \cup H_{\varepsilon,1}$, one can take relative volumes:
$$\frac{|H_{\mathrm{ambiguous}}|}{|B_1(0,1)|} + \frac{|H_{\varepsilon,1}|}{|B_1(0,1)|} + \frac{|H_{-1,-\varepsilon}|}{|B_1(0,1)|} = 1.$$
For symmetry reasons, this reduces to $\frac{|H_{\mathrm{ambiguous}}|}{|B_1(0,1)|} = 1 - 2\,\frac{|H_{\varepsilon,1}|}{|B_1(0,1)|}$.

Yet $\frac{|H_{0,1}|}{|B_1(0,1)|} = \frac{1}{2}$, as $H_{0,1}$ is half the ball. And, for homothety reasons, $|H_{\delta,1}| = (1-\delta)^D |H_{0,1}|$ (for $\delta \ge 0$), since $H_{\delta,1}$ is the image of $H_{0,1}$ under the homothety of ratio $1-\delta$ centred at the vertex $e_D$. So, combining these equations, one gets $\mu_1(D,\varepsilon) = 1 - (1-\varepsilon)^D$.

In particular, it holds that $\mu_1(D,\varepsilon) = 1 - (1-\varepsilon)^D \xrightarrow[D \to \infty]{} 1$.
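This closed form is also easy to cross-check numerically. The sketch below is illustrative only (not from the paper; the exponential-based sampler and the sample sizes are implementation choices): it draws points uniformly in the $L_1$ ball and compares the Monte Carlo estimate of $\mu_1(D,\varepsilon)$ with $1-(1-\varepsilon)^D$.

```python
# Illustrative check (not from the paper) of mu_1(D, eps) = 1 - (1 - eps)^D:
# Monte Carlo over the unit L1 ball versus the closed form.
import numpy as np

rng = np.random.default_rng(0)

def sample_l1_ball(n, d):
    """Uniform samples in the unit L1 ball of R^d.

    Standard construction: with d+1 iid Exp(1) variables, the first d divided by
    the total sum are uniform on the simplex {y_i >= 0, sum y_i <= 1}; random
    signs then spread the point over the whole cross-polytope.
    """
    e = rng.exponential(size=(n, d + 1))
    y = e[:, :d] / e.sum(axis=1, keepdims=True)
    return rng.choice([-1.0, 1.0], size=(n, d)) * y

eps = 0.05
for d in (2, 10, 50, 200):
    x = sample_l1_ball(50_000, d)
    mc = np.mean(np.abs(x[:, -1]) <= eps)        # Monte Carlo estimate of mu_1(D, eps)
    closed = 1.0 - (1.0 - eps) ** d
    print(f"D={d:4d}  MC ~= {mc:.3f}   1-(1-eps)^D = {closed:.3f}")
```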

2.3 $L_\infty$ ball

Consider $B_\infty(0,1)$, the unit ball of $\mathbb{R}^D$ for the $L_\infty$ norm, and define $G_{\alpha,\beta} = \{x \in B_\infty(0,1) \,/\, \alpha \le x_D \le \beta\}$.

Fix $0 < \varepsilon < 1$, and imagine again a classifier cutting the ball into two equal halves: $G_{-1,0}$ and $G_{0,1}$. The set of points close to the boundary of this cut is $G_{\mathrm{ambiguous}} = \{x \in B_\infty(0,1) \,/\, \exists x' \in B_\infty(0,1),\ ||x - x'||_2 \le \varepsilon,\ \{x, x'\} \not\subset G_{-1,0},\ \{x, x'\} \not\subset G_{0,1}\} = G_{-\varepsilon,\varepsilon}$.


Define $\mu_\infty(D,\varepsilon) = \frac{|G_{\mathrm{ambiguous}}|}{|B_\infty(0,1)|}$. As all dimensions except the last one are unaffected by the cut of the ball by the hyperplane, $\mu_\infty(D,\varepsilon) = \varepsilon$ (the last coordinate is uniform on $[-1,1]$, and the ambiguous slab covers a fraction $\frac{2\varepsilon}{2} = \varepsilon$ of it).

In particular, it holds that $\mu_\infty(D,\varepsilon) = \varepsilon \not\xrightarrow[D \to \infty]{} 1$!

Such situations where $\mu(D,\varepsilon) \not\xrightarrow[D \to \infty]{} 1$ also seem possible with the $L_1$ ball, but by cutting it differently: consider again $B_1(0,1)$, the unit ball of $\mathbb{R}^D$ for the $L_1$ norm, but define $Q_{\alpha,\beta} = \{x \in B_1(0,1) \,/\, \alpha \le x^T \mathbf{1} \le \beta\}$. This is again a cap of the ball, but for a hyperplane with normal vector $\mathbf{1}$.

Define $\mu'_1(D,\varepsilon) = \frac{|Q_{\mathrm{ambiguous}}|}{|B_1(0,1)|}$. Even if this question is not investigated further in this paper, $\mu'_1(D,\varepsilon)$ could exhibit an asymptotic behaviour different from $\mu_1(D,\varepsilon)$.
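For completeness, the $L_\infty$ case can be checked in the same way; here the ambiguous fraction stays constant instead of growing with $D$. The sketch below is illustrative only and deliberately leaves aside the question about $\mu'_1$ raised above.

```python
# Illustrative Monte Carlo check (not from the paper): for the axis-aligned cut
# of the L-infinity ball, the ambiguous fraction does not grow with D.
import numpy as np

rng = np.random.default_rng(0)
eps = 0.05
for d in (2, 10, 100, 1000):
    x = rng.uniform(-1.0, 1.0, size=(20_000, d))   # uniform in B_inf(0,1) = [-1,1]^D
    print(f"D={d:5d}  mu_inf ~= {np.mean(np.abs(x[:, -1]) <= eps):.3f}")
# Every estimate stays around eps, in contrast with the L2 and L1 cases above.
```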

3 Implications for supervised classification

The results from the previous section are more or less weaker versions of the results from [10]. Mainly, these results prove that:

• all half cuts of the $L_2$ ball have a large $L_2$ $\varepsilon$-expansion;

• some cuts of the $L_1$ ball have a large $L_2$ $\varepsilon$-expansion (typically axis-aligned cuts);

• some cuts of the $L_\infty$ ball have a small $L_2$ $\varepsilon$-expansion (typically axis-aligned cuts).

Note that [10] proves that all cuts of the $L_2$ sphere have a large $L_2$ $\varepsilon$-expansion (within the sphere), which is a quite stronger result, but with the inconvenience of relying on a set with empty interior. Here, the presented results only consider hyperplanes containing $0$ (all of them for the $L_2$ ball, axis-aligned ones for the other balls). The only advantage of these statements compared to [10] is that the proofs are simpler.

But anyway, the point of this paper is not to offer results complementary to [10]. The point is to stress that, depending on how adversarial examples are defined, one could conclude either that they are inevitable or that they are nonexistent!

Let $a$ and $b$ be two uniform distributions on $H_{-1,-\frac{1}{2}}$ and $H_{\frac{1}{2},1}$, i.e. $a$ and $b$ are two distributions sitting at opposite vertices of the $L_1$ ball. A good (maybe the best) classifier to separate these two sets of points is obviously
$$f(x) = \begin{cases} \text{class } a & \text{if } x_D \le 0, \\ \text{class } b & \text{otherwise,} \end{cases}$$
which is probably the classifier resulting from support vector machine learning (SVM [11]). The point is that, in this situation, no real point from either the $a$ or the $b$ distribution admits any $\varepsilon$-adversarial example for $\varepsilon < 1$. But, at the same time, the probability that a point drawn uniformly in the $L_1$ ball admits an adversarial example goes to $1$ with $D$.
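The paradox can be made concrete with a small simulation. The sketch below is illustrative only (not from the paper; the sampler for the $L_1$ ball, the mirroring of the two classes and the sample sizes are implementation choices): it draws the two vertex distributions $a$ and $b$, checks that the axis-aligned classifier $f$ separates them while every real sample stays at distance at least $\frac{1}{2}$ from the decision boundary $\{x_D = 0\}$, and contrasts this with points drawn uniformly in the whole ball, which are very often $\varepsilon$-close to that boundary.

```python
# Illustrative sketch (not from the paper): the two "vertex" distributions a and b,
# the axis-aligned classifier f, and the contrast between real samples (far from
# the boundary {x_D = 0}) and points drawn uniformly in the whole L1 ball.
import numpy as np

rng = np.random.default_rng(0)

def sample_l1_ball(n, d):
    """Uniform samples in the unit L1 ball of R^d (exponential construction)."""
    e = rng.exponential(size=(n, d + 1))
    y = e[:, :d] / e.sum(axis=1, keepdims=True)
    return rng.choice([-1.0, 1.0], size=(n, d)) * y

def sample_class_b(n, d):
    """Uniform samples on H_{1/2,1} = {x in B_1(0,1) : x_D >= 1/2}."""
    x = sample_l1_ball(n, d)
    x[:, -1] = np.abs(x[:, -1])            # uniform on the half ball H_{0,1}
    return 0.5 * x + 0.5 * np.eye(d)[-1]   # homothety of ratio 1/2 centred at the vertex e_D

d, n, eps = 50, 100_000, 0.05
b = sample_class_b(n, d)
a = b.copy()
a[:, -1] *= -1.0                           # class a mirrors class b onto H_{-1,-1/2}

def f(x):
    """The axis-aligned classifier of the text."""
    return np.where(x[:, -1] <= 0, "a", "b")

print("accuracy of f on a and b:", np.mean(f(a) == "a"), np.mean(f(b) == "b"))
print("min distance of a real sample to {x_D = 0}:",
      min(np.abs(a[:, -1]).min(), np.abs(b[:, -1]).min()))

u = sample_l1_ball(n, d)                   # a point drawn uniformly in the whole ball ...
print("fraction of uniform points within eps of {x_D = 0}:",
      np.mean(np.abs(u[:, -1]) <= eps))    # ... is eps-close to the boundary with high probability
```

With $D = 50$ and $\varepsilon = 0.05$, the last fraction is already above $0.9$ (its expectation is $1-(1-\varepsilon)^D$), while the real samples never come closer than $\frac{1}{2}$ to the decision boundary.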

The critical question stressed by this paradox is the following: are adversarial points close to the decision boundary, or are adversarial points points that were not expected to be close to the boundary? In other words, the question is whether the interesting quantity is the probability that a point drawn uniformly in the space admits an adversarial example, or the probability that a point drawn from the underlying density admits an adversarial example.

Clearly, if one is concerned about the probability that a point drawn uniformly in the space admits an adversarial example, then the results from [10] (and from sections 2.1 and 2.2) show that this probability may go to $1$ with $D$; so, in this case, one can state that adversarial examples are inevitable.

Now, if one is concerned about the probability that a point drawn from the underlying distribution of the data admits an adversarial example, then this probability could be $0$. Indeed, this probability is $0$ for two perfectly separated distributions as soon as $\varepsilon$ is lower than the margin (as pointed out in [3]).

Even more interestingly, in the case of the $L_\infty$ norm, focusing on the $\varepsilon$-expansion of the classifier boundary may lead to a poor classifier. Consider the learning problem of separating two distributions centred on $-\mathbf{1}$ and on $\mathbf{1}$.

One can use
$$f(x) = \begin{cases} \text{class } a & \text{if } x_D \le 0, \\ \text{class } b & \text{otherwise,} \end{cases}$$
which may still separate the two distributions, with the additional property that the boundary of $f$ has a small $\varepsilon$-expansion. Yet, using
$$g(x) = \begin{cases} \text{class } a & \text{if } \mathbf{1}^T x \le 0, \\ \text{class } b & \text{otherwise,} \end{cases}$$
seems to make more sense: along a single axis the two distributions are at $L_2$ distance $2$, while along $\mathbf{1}$ they are at $L_2$ distance $2\sqrt{D}$. Yet, even if this is not investigated in this paper, the $\varepsilon$-expansion of the boundary of $g$ may be large.
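A small numeric illustration of the gap between $f$ and $g$: the paper does not specify the two distributions beyond their centres, so the sketch below assumes, purely for illustration, that they are isotropic Gaussians $\mathcal{N}(-\mathbf{1}, I)$ and $\mathcal{N}(\mathbf{1}, I)$. Under that assumption, $f$ only exploits a separation of $2$ between the projected class means, while $g$ exploits a separation of $2\sqrt{D}$.

```python
# Illustrative sketch under an assumed model (isotropic Gaussians centred on -1 and +1,
# which the paper does not specify): accuracy of the axis-aligned classifier f versus
# the diagonal classifier g.
import numpy as np

rng = np.random.default_rng(0)
d, n = 100, 50_000
xa = rng.standard_normal((n, d)) - 1.0     # assumed class a: N(-1_vec, I)
xb = rng.standard_normal((n, d)) + 1.0     # assumed class b: N(+1_vec, I)

def f(x):                                  # f from the text: threshold on x_D
    return np.where(x[:, -1] <= 0, "a", "b")

def g(x):                                  # g from the text: threshold on 1^T x
    return np.where(x.sum(axis=1) <= 0, "a", "b")

for name, clf in (("f", f), ("g", g)):
    acc = 0.5 * (np.mean(clf(xa) == "a") + np.mean(clf(xb) == "b"))
    print(f"{name}: accuracy ~= {acc:.4f}")
# Under this assumed model, f errs with probability about Phi(-1) ~ 0.16 per class,
# while g is essentially perfect, reflecting the 2 versus 2*sqrt(D) separation.
```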

So, to conclude this section, machine learning problems often involve samples belonging to a space $\Omega$, and the resulting classifier $f$ may be functional on all points of $\Omega$. But focusing on the behaviour of $f$ on points of $\Omega$ that do not belong to the real distribution may lead to poor decisions from a learning point of view. It may also lead to dramatic statements like "adversarial examples are inevitable" which may not reflect the behaviour on real samples.

At this point, one could think about requiring classifiers to also produce a prediction about whether a particular sample is real. Of course, as pointed out in [10] and in the examples from section 2, most of the space may be filled by unreal samples. But this is not a problem at all: this seems to be especially the case for images. Finally, to return to adversarial examples, one may claim that ($L_2$, but it could be for another norm) adversarial examples are not about
$$\{x \in \Omega \,/\, \exists x' \in \Omega,\ ||x - x'||_2 \le \varepsilon,\ f(x) \ne f(x')\}$$
but about
$$\{x \in \Omega \text{ given as non ambiguous} \,/\, \exists x' \in \Omega,\ ||x - x'||_2 \le \varepsilon,\ f(x) \ne f(x')\}.$$
To conclude, I claim that a good classifier should classify correctly all real images, detect non-real images as such, and real images should have non-real images as their only adversarials (the class must not change locally, except to mark that the sample is detected as unreal).

References

[1] Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele. The Cityscapes dataset for semantic urban scene understanding. In Conference on Computer Vision and Pattern Recognition, 2016.

[2] Hayit Greenspan, Bram van Ginneken, and Ronald M. Summers. Guest editorial: Deep learning in medical imaging: Overview and future promise of an exciting new technique. IEEE Transactions on Medical Imaging, 35(5):1153–1159, 2016.

[3] Todd Huster, Cho-Yu Jason Chiang, and Ritu Chadha. Limitations of the Lipschitz constant as a defense against adversarial examples. arXiv preprint arXiv:1807.09705, 2018.

[4] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.

[5] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015.

[6] Shengqiao Li. Concise formulas for the area and volume of a hyperspherical cap. Asian Journal of Mathematics and Statistics, 4(1):66–70, 2011.

[7] José L. López and Javier Sesma. Asymptotic expansion of the incomplete beta function for large values of the first parameter. Integral Transforms and Special Functions, 8(3-4):233–236, 1999.

[8] Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, Omar Fawzi, and Pascal Frossard. Universal adversarial perturbations. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.

[9] Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z. Berkay Celik, and Ananthram Swami. The limitations of deep learning in adversarial settings. In IEEE European Symposium on Security and Privacy (EuroS&P), pages 372–387. IEEE, 2016.

[10] Ali Shafahi, W. Ronny Huang, Christoph Studer, Soheil Feizi, and Tom Goldstein. Are adversarial examples inevitable? arXiv preprint arXiv:1809.02104, 2018.

[11] Vladimir N. Vapnik. Statistical Learning Theory, volume 1. Wiley, New York, 1998.

[12] Cihang Xie, Jianyu Wang, Zhishuai Zhang, Yuyin Zhou, Lingxi Xie, and Alan Yuille. Adversarial examples for semantic segmentation and object detection. In The IEEE International Conference on Computer Vision (ICCV), October 2017.
