
Robust adversarial inputs


July 17, 2017


We’ve created images that reliably fool neural network classifiers when viewed from varied scales and perspectives. This challenges a claim from last week that self-driving cars would be hard to trick maliciously since they capture images from multiple scales, angles, perspectives, and the like.

Out-of-the-box adversarial examples do fail under image transformations. We take a cat picture that has been adversarially perturbed so that Inception v3 trained on ImageNet misclassifies it as a desktop computer. A zoom factor as small as 1.002 is enough for the probability of the correct label, tabby cat, to overtake that of the adversarial label, desktop computer.

However, we’d suspected that active effort could produce a robust adversarial example, as adversarial examples have been shown to transfer to the physical world.

Scale-invariant adversarial examples

Adversarial examples can be created using an optimization method called projected gradient descent to find small perturbations to the image that arbitrarily fool the classifier.
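As a rough illustration of projected gradient descent, here is a minimal sketch on a toy problem. Everything in it is an assumption made for the example rather than something taken from the post: the two-class linear softmax model (`WEIGHTS`), the four-pixel "image", the L-infinity budget `eps`, and the signed-gradient step. Each iteration moves the input toward the target class, then projects it back into the allowed perturbation range.

```python
import math

def predict(weights, x):
    """Class probabilities of a toy linear softmax classifier."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def input_gradient(weights, x, target):
    """Gradient of -log p(target | x) with respect to the input x."""
    probs = predict(weights, x)
    return [
        sum((probs[k] - (1.0 if k == target else 0.0)) * weights[k][j]
            for k in range(len(weights)))
        for j in range(len(x))
    ]

def pgd_targeted(weights, x0, target, eps=0.1, step=0.01, iters=50):
    """Targeted PGD inside an L-infinity ball of radius eps around x0."""
    x = list(x0)
    for _ in range(iters):
        grad = input_gradient(weights, x, target)
        # Signed step that lowers the loss of the target class.
        x = [xi - step * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]
        # Projection: stay within eps of the original and within [0, 1].
        x = [min(max(xi, max(o - eps, 0.0)), min(o + eps, 1.0))
             for xi, o in zip(x, x0)]
    return x

# Hypothetical model: class 0 stands in for "tabby cat",
# class 1 for "desktop computer".
WEIGHTS = [[1.0, -1.0, 1.0, -1.0], [-1.0, 1.0, -1.0, 1.0]]
x0 = [0.55, 0.45, 0.55, 0.45]
x_adv = pgd_targeted(WEIGHTS, x0, target=1)
print(predict(WEIGHTS, x0), predict(WEIGHTS, x_adv))
```

A real attack on Inception v3 would obtain the gradient from an automatic differentiation framework instead of the closed form used here.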

Instead of optimizing for finding an input that’s adversarial from a single viewpoint, we optimize over a large ensemble of stochastic classifiers that randomly rescale the input before classifying it. Optimizing against such an ensemble produces robust adversarial examples that are scale-invariant.
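One way to picture optimizing against such an ensemble is to average the gradient over several randomly sampled zoom factors at every step. The sketch below assumes a 1-D, 16-pixel "image", a hypothetical two-class linear model, linear-interpolation zoom about the center, and a scale range of 0.9 to 1.1. None of these choices come from the post.

```python
import math
import random

def zoom_matrix(n, scale):
    """Row i holds the interpolation weights for output pixel i when a
    length-n signal is zoomed by `scale` about its center."""
    center = (n - 1) / 2.0
    rows = []
    for i in range(n):
        pos = min(max(center + (i - center) / scale, 0.0), n - 1.0)
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        row = [0.0] * n
        row[lo] += 1.0 - (pos - lo)
        row[hi] += pos - lo
        rows.append(row)
    return rows

def zoom(x, scale):
    return [sum(r * v for r, v in zip(row, x)) for row in zoom_matrix(len(x), scale)]

def predict(weights, y):
    logits = [sum(w * v for w, v in zip(row, y)) for row in weights]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    return [e / sum(exps) for e in exps]

def grad_through_zoom(weights, rows, x, target):
    """Gradient of -log p(target | zoom(x)) with respect to x."""
    n = len(x)
    y = [sum(r * v for r, v in zip(row, x)) for row in rows]
    probs = predict(weights, y)
    grad_y = [sum((probs[k] - (1.0 if k == target else 0.0)) * weights[k][i]
                  for k in range(len(weights))) for i in range(n)]
    # Chain rule through the linear zoom: transpose of the zoom matrix.
    return [sum(rows[i][j] * grad_y[i] for i in range(n)) for j in range(n)]

def scale_robust_pgd(weights, x0, target, scales=(0.9, 1.1), samples=8,
                     eps=0.1, step=0.01, iters=50, seed=0):
    rng = random.Random(seed)
    n = len(x0)
    x = list(x0)
    for _ in range(iters):
        avg = [0.0] * n
        for _ in range(samples):
            # Each draw acts as one member of the stochastic ensemble.
            rows = zoom_matrix(n, rng.uniform(*scales))
            g = grad_through_zoom(weights, rows, x, target)
            avg = [a + gi / samples for a, gi in zip(avg, g)]
        x = [xi - step * ((g > 0) - (g < 0)) for xi, g in zip(x, avg)]
        x = [min(max(xi, max(o - eps, 0.0)), min(o + eps, 1.0))
             for xi, o in zip(x, x0)]
    return x

# Hypothetical model: class 0 fires when the left half is brighter.
WEIGHTS = [[1.0] * 8 + [-1.0] * 8, [-1.0] * 8 + [1.0] * 8]
x0 = [0.55] * 8 + [0.45] * 8
x_adv = scale_robust_pgd(WEIGHTS, x0, target=1)
for s in (0.9, 1.0, 1.1):
    print(s, predict(WEIGHTS, zoom(x_adv, s))[1])
```

This toy model is simple enough that a single-scale attack would likely succeed too. The point of the sketch is the structure of the loop: sample transformations, average their gradients, step, project.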

Even when we restrict ourselves to only modifying pixels corresponding to the cat, we can create a single perturbed image that is simultaneously adversarial at all desired scales.
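Restricting the perturbation to the cat's pixels can be sketched as a masked update: only positions where a binary mask is set are stepped and projected, and everything else is left untouched. The function name `masked_pgd_step`, the flat pixel list, and the mask itself are hypothetical; the post does not say how the cat region was obtained.

```python
def masked_pgd_step(x, x0, grad, mask, eps=0.1, step=0.01):
    """One signed-gradient step applied only where mask is truthy.

    x    : current pixel values (flat list, range [0, 1])
    x0   : original pixel values
    grad : loss gradient with respect to each pixel
    mask : 1 for pixels that may change (e.g. the cat), 0 elsewhere
    """
    out = []
    for xi, oi, g, m in zip(x, x0, grad, mask):
        if m:
            xi = xi - step * ((g > 0) - (g < 0))
            # Project back into the eps-ball and the valid pixel range.
            xi = min(max(xi, max(oi - eps, 0.0)), min(oi + eps, 1.0))
        out.append(xi)
    return out

pixels = [0.5, 0.5, 0.5, 0.5]
stepped = masked_pgd_step(pixels, pixels, [1.0, -1.0, 1.0, -1.0], [1, 1, 0, 0])
print(stepped)
```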

Transformation-invariant adversarial examples

By adding random rotations, translations, scales, noise, and mean shifts to our training perturbations, the same technique produces a single input that remains adversarial under any of these transformations.
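One way to write this objective down, as a hedged formalization rather than a formula quoted from the post: let $x$ be the original image, $y_{\text{adv}}$ the adversarial target label, $T$ the distribution over transformations (rotations, translations, scales, noise, mean shifts), and $\epsilon$ a perturbation budget. The choice of the $\ell_\infty$ norm and all symbol names are assumptions.

```latex
\hat{x} = \arg\max_{x'} \; \mathbb{E}_{t \sim T}
          \big[ \log P(y_{\text{adv}} \mid t(x')) \big]
\quad \text{subject to} \quad \lVert x' - x \rVert_\infty \le \epsilon
```

In practice the expectation would be approximated by averaging over transformations sampled afresh at each optimization step.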

Our transformations are sampled randomly at test time, demonstrating that our example is invariant to the whole distribution of transformations.
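A simple way to check this kind of invariance is to draw fresh transformations at evaluation time and count how often the adversarial label survives. The sketch below assumes a hypothetical brightness-threshold classifier (`toy_classify`) and covers only two of the listed transformations, mean shift and noise. The parameter ranges are illustrative.

```python
import random

def random_transform(pixels, rng, max_shift=0.1, max_noise=0.02):
    """Random mean shift plus per-pixel uniform noise, clipped to [0, 1]."""
    shift = rng.uniform(-max_shift, max_shift)
    return [min(max(p + shift + rng.uniform(-max_noise, max_noise), 0.0), 1.0)
            for p in pixels]

def success_rate(classify, pixels, target, trials=1000, seed=0):
    """Fraction of freshly sampled transformations classified as `target`."""
    rng = random.Random(seed)
    hits = sum(classify(random_transform(pixels, rng)) == target
               for _ in range(trials))
    return hits / trials

def toy_classify(pixels):
    """Hypothetical classifier: label 1 for dark images, label 0 otherwise."""
    return 1 if sum(pixels) / len(pixels) < 0.5 else 0

robust = success_rate(toy_classify, [0.2] * 16, target=1)
fragile = success_rate(toy_classify, [0.45] * 16, target=1)
print(robust, fragile)
```

An input far from the decision boundary keeps its label under every draw, while one sitting close to the boundary loses it on part of the distribution. A single success rate summarizes robustness to the whole distribution instead of to one fixed transformation.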


Source

OpenAI News - openai.com
