Robustness Testing of Neural Networks

This was a design project carried out under the supervision of Prof. Tirtharaj Dash at BITS Pilani, K K Birla Goa Campus. The aim was to train a model that generates the minimum amount of noise needed to fool a pre-trained neural network.

The following was done as part of the design project:

  1. Initially, random Gaussian noise was added to different (deep/shallow) neural networks trained on MNIST data, and their robustness was compared by measuring the minimum amount of noise needed to reach a given misclassification threshold.
  2. After this, we attempted to learn the noise with a neural network. A basic modulus-2 dataset was used for this purpose. A classifier NN was trained on it, and an adversarial NN was then trained to produce the minimum noise which, when added to the inputs of the classifier NN, makes it produce wrong outputs. Two loss functions were used to train the adversarial NN:
  • Misclassification Loss – The agreement (accuracy) between deliberately wrong labels for the inputs and the labels the classifier NN produces when it is fed the inputs plus the noise from the adversarial NN
  • Regularization Loss – An L2 loss on the amount of noise produced by the adversarial NN
  3. After this, a Dog/Cat image dataset was used so that the added noise could be visualized easily, making it possible to judge whether the noise is barely perceptible to human eyes yet still decisive in changing the class predicted by a pre-trained neural network. An auto-encoder architecture was employed for this purpose, and results were compared across different architectures, loss weightings, input sizes, etc.
  4. To try to improve the noise-producing capacity of the adversarial NN, we tried to convert it completely into a CNN with a deep architecture based on VGG16. However, we had problems getting the model to converge and produce stable noise, and further work is needed to try different architectural combinations.
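
The first two steps above can be illustrated with a minimal sketch in plain Python. It is not the project's code: `toy_classifier` is a hypothetical stand-in for a trained network, `min_noise_std` assumes the candidate noise levels are supplied in ascending order, and cross-entropy against the wrong label is an assumed differentiable substitute for the accuracy-based misclassification loss described in step 2. The weight `reg_weight` is likewise an illustrative parameter.

```python
import math
import random

def toy_classifier(x):
    # Hypothetical stand-in for a trained network: two logits, class 1 if x[0] > 0.
    return [-x[0], x[0]]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def min_noise_std(classifier, inputs, labels, stds, threshold=0.5, seed=0):
    """Step 1: return the first noise level in `stds` (assumed ascending) at which
    the misclassification rate under Gaussian input noise reaches `threshold`."""
    rng = random.Random(seed)
    for std in stds:
        wrong = 0
        for x, y in zip(inputs, labels):
            noisy = [xi + rng.gauss(0.0, std) for xi in x]
            logits = classifier(noisy)
            pred = max(range(len(logits)), key=logits.__getitem__)
            wrong += pred != y
        if wrong / len(inputs) >= threshold:
            return std
    return None  # threshold never reached for the tested noise levels

def misclassification_loss(probs, wrong_label):
    # Assumed surrogate: cross-entropy against the deliberately wrong label,
    # small when the classifier is confident in the wrong class.
    return -math.log(probs[wrong_label] + 1e-12)

def regularization_loss(noise):
    # L2 penalty: mean squared magnitude of the noise.
    return sum(n * n for n in noise) / len(noise)

def adversarial_loss(classifier, x, noise, wrong_label, reg_weight=0.1):
    """Step 2: combined objective for the noise generator."""
    perturbed = [xi + ni for xi, ni in zip(x, noise)]
    probs = softmax(classifier(perturbed))
    return (misclassification_loss(probs, wrong_label)
            + reg_weight * regularization_loss(noise))
```

For the input `[1.0]` with wrong label `0`, a noise of `[-2.0]` gives a lower combined loss than no noise at all, while a much larger noise such as `[-20.0]` is penalized by the L2 term. This trade-off is what pushes the generator toward the smallest perturbation that still flips the prediction.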
Shivin Thukral
