Depth-2 Neural Networks Under a Data-Poisoning Attack

arXiv: 2005.01699
MaRDI QID: Q6340003

Anirbit Mukherjee, Sayar Karmakar, Theodore Papamarkou

Publication date: 4 May 2020

Abstract: In this work, we study the possibility of defending against data-poisoning attacks while training a shallow neural network in a regression setup. We focus on supervised learning for a class of depth-2, finite-width neural networks that includes single-filter convolutional networks. For this class, we attempt to learn the network weights in the presence of a malicious oracle that applies stochastic, bounded, and additive adversarial distortions to the true output during training. For the non-gradient stochastic algorithm that we construct, we prove worst-case near-optimal trade-offs among the magnitude of the adversarial attack, the weight-approximation accuracy, and the confidence achieved by the algorithm. Because our algorithm uses mini-batching, we analyze how the mini-batch size affects convergence. We also show how scaling the outer-layer weights, depending on the probability of attack, can counter output-poisoning. Lastly, we give experimental evidence that our algorithm outperforms stochastic gradient descent under various input data distributions, including instances of heavy-tailed distributions.
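To make the setup in the abstract concrete, below is a minimal Python sketch of a depth-2 network with a single shared inner filter, a bounded additive label-poisoning oracle, and a mini-batched, Tron-style non-gradient update. This is not the authors' algorithm as given in the paper; the leaky-ReLU activation, the fixed matrix M, and all names and hyperparameters are assumptions made for illustration only.

    import numpy as np

    rng = np.random.default_rng(0)

    d, k = 10, 5                 # input dimension, number of filter placements (assumed)
    eta, batch = 1e-2, 64        # step size and mini-batch size (assumed)
    theta, p_attack = 0.5, 0.3   # attack bound and attack probability (assumed)

    A = rng.standard_normal((k, d, d))  # fixed linear maps placing the single filter
    w_star = rng.standard_normal(d)     # ground-truth inner (filter) weight

    def act(z):
        # Leaky ReLU; the paper's conditions on the activation are not reproduced here.
        return np.maximum(z, 0.1 * z)

    def net(w, X):
        # Depth-2 net with one shared filter w and a uniform outer layer:
        # f(x) = (1/k) * sum_j act(<w, A_j x>)
        return np.mean([act(X @ A[j].T @ w) for j in range(k)], axis=0)

    def oracle(X):
        # Malicious oracle: with probability p_attack, adds a bounded
        # (|distortion| <= theta) additive corruption to the true output.
        y = net(w_star, X)
        mask = rng.random(len(y)) < p_attack
        return y + mask * rng.uniform(-theta, theta, size=len(y))

    M = np.mean(A, axis=0)  # fixed matrix used by the update (an assumption)
    w = np.zeros(d)
    for t in range(3000):
        X = rng.standard_normal((batch, d))  # fresh mini-batch of inputs
        y = oracle(X)                        # possibly poisoned labels
        # Tron-style non-gradient step: residual times a fixed linear
        # feature of the input, instead of the gradient of a loss.
        w += eta * np.mean((y - net(w, X))[:, None] * (X @ M.T), axis=0)

    print("filter estimation error:", np.linalg.norm(w - w_star))

The point this sketch illustrates is that the update multiplies the residual by a fixed linear feature of the input rather than by the gradient of a loss, which is what makes the step non-gradient; the trade-offs proven in the paper concern how large theta can be while such an update still recovers the filter.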

Has companion code repository: https://github.com/papamarkou/neurotron_experiments
