
Generate More Training Data When You Don’t Have Enough

  • Sabina Pokhrel
  • Sep 5, 2019
  • 4 min read

Find out techniques that can be used to increase your training data.


This post was originally published in the Towards Data Science publication on Medium.




Photo album (Photo by True Agency on Unsplash)


Computers outperform humans in image and object recognition.

Big corporations like Google and Microsoft have beaten the human benchmark on image recognition [1, 2]. On average, humans make errors on image recognition tasks about 5% of the time. As of 2015, Microsoft’s image recognition software reached an error rate of 4.94%, and at around the same time, Google announced that its software achieved a reduced error rate of 4.8% [3].


How is this possible?

This was made possible by training deep convolutional neural networks on millions of training examples from the ImageNet dataset, which contains hundreds of object categories [1].


One MILLION training examples!



German 1 million mark stamp (Image by Hebi B. from Pixabay)

“To teach a computer to recognize a cat from many angles, for example, could require thousands of photos covering a variety of perspectives.”- TOM SIMONITE


Image of a cat (Photo by Mikhail Vasilyev on Unsplash)

A large amount of data is required to successfully train a deep convolutional neural network for a computer vision task. These networks have multiple hidden processing layers, and as the number of layers increases, so does the number of examples the network needs to learn from. If enough training data is not available, the model tends to learn the training data too well. This is called overfitting. An overfitted model generalizes poorly, so its performance on unseen data is low.


But what if a huge amount of training data is not available?

Not every image recognition task at hand comes with millions of training examples. For some tasks, it is a challenge to gather even thousands of example images. This is usually the case for medical images such as mammograms for breast cancer detection and localization, chest X-rays for lung cancer detection, or MRI scans to locate brain tumours.

It comes down to one question.


How do we train a model that performs well for these tasks when we only have limited data?


Generate more training data by using AUGMENTATION

When we have only a small amount of image data for training a deep convolutional neural network, we can use data augmentation techniques to generate more training data from the ones that we already have.
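To make the idea concrete, here is a minimal sketch (assuming NumPy, with a hypothetical `augment` helper) of how a few random transforms turn a single image into many distinct training examples:

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(img):
    """Return one randomly augmented copy of a 2-D image array."""
    if rng.random() < 0.5:       # random horizontal flip
        img = np.fliplr(img)
    k = int(rng.integers(0, 4))  # random 0/90/180/270-degree rotation
    return np.rot90(img, k)

original = rng.random((32, 32))                    # dummy grayscale image
augmented = [augment(original) for _ in range(8)]  # 1 image -> 8 variants
print(len(augmented), augmented[0].shape)
```

Each call draws fresh random parameters, so the same original image yields a different variant every time; deep learning frameworks typically apply such transforms on the fly during training.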



Multiple images of a person


Data augmentation is a technique for generating multiple images from an original image. There are several different techniques for data augmentation; Mikolajczyk and Grochowski, in their paper [4], categorize them into two sub-categories: data augmentation using basic image manipulation and data augmentation using deep learning approaches.




Geometric transformation

Geometric transformations such as flipping, cropping, rotation and translation are some commonly used data augmentation techniques. We will discuss them in brief in this post.


Flipping

Original image of a dog on the left, horizontally flipped image about the centre on the right [6]

Flipping takes a mirror image of a given image. It is one of the easiest augmentation techniques. An image can be flipped either horizontally or vertically; horizontal flipping is, however, the more common of the two.
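In NumPy, both flips are one-liners; this small sketch uses a toy 2x3 array as the "image" so the mirroring is easy to see:

```python
import numpy as np

# Toy 2x3 grayscale "image"; real images would be HxW or HxWxC arrays.
img = np.array([[1, 2, 3],
                [4, 5, 6]])

h_flip = np.fliplr(img)  # horizontal flip: mirror about the vertical axis
v_flip = np.flipud(img)  # vertical flip: mirror about the horizontal axis

print(h_flip.tolist())  # [[3, 2, 1], [6, 5, 4]]
print(v_flip.tolist())  # [[4, 5, 6], [1, 2, 3]]
```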


Cropping

Original and randomly cropped images of a cat [7]

Cropping is a data augmentation technique that reduces the size of the original image by removing boundary pixels. The spatial dimensions are not preserved in cropping. With this type of augmentation, it is not guaranteed that the transformed image has the same label as the original image.

In the above image, four images are generated from the original by cropping pixels from the left and right. The size of the cropped images is reduced from 256x256 to 227x227.
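A random crop can be sketched in a few lines of NumPy (the `random_crop` helper name is my own, not from the paper): pick a random top-left corner and slice out a fixed-size window.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crop(img, crop_h, crop_w):
    """Return a random crop_h x crop_w window of img (H x W [x C])."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop_h + 1)   # random top edge
    left = rng.integers(0, w - crop_w + 1)  # random left edge
    return img[top:top + crop_h, left:left + crop_w]

img = rng.random((256, 256, 3))  # dummy 256x256 RGB image
crops = [random_crop(img, 227, 227) for _ in range(4)]
print(crops[0].shape)  # (227, 227, 3)
```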


Rotation

Original and rotated images of a cat [8]

An image can be rotated left or right about its centre by 1 to 359 degrees. Rotation between 1 and 20 degrees is known as slight rotation and can be a useful technique for augmenting the original image. As the degree of rotation increases, the transformed image might no longer preserve its original label.
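Assuming SciPy is available, `scipy.ndimage.rotate` handles arbitrary angles; `reshape=False` keeps the original spatial size, filling the vacated corners with a constant value:

```python
import numpy as np
from scipy.ndimage import rotate

rng = np.random.default_rng(0)
img = rng.random((64, 64))  # dummy grayscale image

# A slight rotation (1-20 degrees) is usually label-preserving;
# reshape=False keeps the 64x64 size, filling corners with cval.
rotated = rotate(img, angle=15, reshape=False, mode='constant', cval=0.0)
print(rotated.shape)  # (64, 64)
```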


Translation


Original and translated images of a tennis ball [8]

Translation is a technique of shifting an image to the left, right, up or down. It can be a very useful transformation for avoiding positional bias in the data. When an image is shifted, the vacated space is filled with 0s, 255s or random noise, thus preserving the original size of the image.
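A shift with constant fill can be implemented directly with array slicing; the `translate` helper below is my own minimal sketch of this, shown on a tiny 2x2 "image" so the fill behaviour is visible:

```python
import numpy as np

def translate(img, dy, dx, fill=0):
    """Shift a 2-D image by (dy, dx) pixels; vacated space gets `fill`."""
    out = np.full_like(img, fill)
    h, w = img.shape[:2]
    src_y = slice(max(0, -dy), h - max(0, dy))   # rows read from the source
    dst_y = slice(max(0, dy), h - max(0, -dy))   # rows written in the output
    src_x = slice(max(0, -dx), w - max(0, dx))
    dst_x = slice(max(0, dx), w - max(0, -dx))
    out[dst_y, dst_x] = img[src_y, src_x]
    return out

img = np.array([[1, 2],
                [3, 4]])
down = translate(img, 1, 0)   # shift down one pixel, zero-fill the top row
left = translate(img, 0, -1)  # shift left one pixel, zero-fill the right column
print(down.tolist())  # [[0, 0], [1, 2]]
print(left.tolist())  # [[2, 0], [4, 0]]
```

Passing `fill=255` or a noise array instead of 0 gives the other fill strategies mentioned above.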


GAN‑based Data Augmentation


A Generative Adversarial Network (GAN) is a generative modelling technique in which artificial instances are created from a dataset in such a way that the characteristics of the original set are retained [9].


A GAN consists of two artificial neural networks (ANNs) that compete against each other: the Generator and the Discriminator. The first creates new data instances, while the second evaluates them for authenticity [10].

Here are images of faces generated by a GAN trained on human faces. Note that these are synthetic faces, not photos of real people.


Synthetic faces generated by a GAN trained on human images [10]

These are some of the data augmentation techniques commonly used to generate more data from a limited dataset, so that a more effective convolutional neural network can be trained.


Ronneberger and his team used data augmentation techniques such as shifting, rotation and random elastic deformation on microscopy images to train a U-Net architecture model with only limited training data, and won the ISBI cell tracking challenge 2015 in these categories by a large margin [11].


So, the next time you are short on data while training a convolutional neural network, use these techniques to create some more of it.


What data augmentation techniques have you used? Share your thoughts in the comments below.


Sources:

[4] Mikolajczyk, A., & Grochowski, M. (2018). Data augmentation for improving deep learning in image classification problem. 2018 International Interdisciplinary Phd Workshop (Iiphdw). doi: 10.1109/iiphdw.2018.8388338

[5] Perez, L., & Wang, J. (2017). The effectiveness of data augmentation in image classification using deep learning. Stanford University Research Report.

[9] Shorten, C., & Khoshgoftaar, T. (2019). A survey on Image Data Augmentation for Deep Learning. Journal Of Big Data, 6(1). doi: 10.1186/s40537-019-0197-0

[10] Henrique, F., & Aranha, C. (2019). Data Augmentation Using GANs. Proceedings Of Machine Learning Research.

[11] Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation.


 
 

© 2019 by Suchi Tech Pvt. Ltd.
