Using AI to detect Cat and Dog pictures with TensorFlow & Keras (Part 2)

Figure: Cat or Dog?

Welcome back to our journey toward an effective cat/dog classifier using Keras.

If you haven’t seen part 1, please go here.

For the rest of you, our previous neural network peaked at a validation accuracy of approximately 65%. Our goal today is to use a convolutional neural network (CNN) to improve our classifier.


In the current architecture, we will not convert our dataset from RGB to grayscale. We will therefore repeat the same data preparation steps as before but skip the RGB-to-grayscale code block.

Now, just as in part 1, we wish to transform our datasets into NumPy arrays for ease of data manipulation.

Running both code blocks should output the following:

Figure: Data Shape

As you can see, we have 23,262 images with 23,262 corresponding labels. Notice also that each image in trainX has the shape (64, 64, 3). This represents a 64 px by 64 px image with three RGB values associated with each pixel.
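As a minimal sketch of that conversion, assuming the images were loaded into a Python list of 64 by 64 RGB arrays: the placeholder images and labels below are made up for illustration and are not the real dataset.

```python
import numpy as np

# Hypothetical stand-ins for the loaded data: five blank 64x64 RGB images
# and their labels (0 = cat, 1 = dog).
images = [np.zeros((64, 64, 3), dtype=np.uint8) for _ in range(5)]
labels = [0, 1, 0, 1, 1]

# Stacking the list gives one array with a leading "number of images" axis.
trainX = np.array(images)
trainY = np.array(labels)

print(trainX.shape)     # (5, 64, 64, 3)
print(trainX[0].shape)  # (64, 64, 3)
print(trainY.shape)     # (5,)
```

With the full dataset the first axis would be 23,262 instead of 5, while each individual image keeps the shape (64, 64, 3).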

Model Development:

Next, we want to begin developing our model. It will consist of 2D convolutional layers, each followed by a LeakyReLU activation layer and a MaxPooling layer. If you wish to know exactly how convolutional neural networks work, here is a good link.

Unlike our vanilla neural network, this network will accept inputs of shape (64, 64, 3). Convolutional layers slide their filters over the height and width of an image, so they need the spatial layout of the pixels to be preserved. Following the convolutional layers, we will introduce a Flatten layer and several Dense layers to output a value of 0 or 1, where 0 represents a cat and 1 represents a dog.
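A model of this kind could be sketched as follows. This is not the original code of this article: the number of filters, the kernel sizes, the width of the Dense layer, the L2 factor of 0.001, the sigmoid output and the Adam optimizer are all assumptions chosen for illustration, and build_cnn is a hypothetical helper name.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers


def build_cnn(input_shape=(64, 64, 3)):
    """Small CNN: (Conv2D -> LeakyReLU -> MaxPooling) twice, then Dense layers."""
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        # Assumed sizes: 32 and 64 filters with 3x3 kernels.
        layers.Conv2D(32, (3, 3)),
        layers.LeakyReLU(),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3)),
        layers.LeakyReLU(),
        layers.MaxPooling2D((2, 2)),
        # Flatten the feature maps so the Dense layers can consume them.
        layers.Flatten(),
        layers.Dense(64, activation="relu",
                     kernel_regularizer=regularizers.l2(0.001)),
        # Single sigmoid unit: values near 0 mean cat, values near 1 mean dog.
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


model = build_cnn()
model.summary()
```

The summary lists the output shape of every layer, which is a quick way to check that the image shrinks through the pooling layers before it is flattened.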

The rest of the code, including optimizers and validation datasets, is unchanged from the part 1 vanilla neural network.

Now, the observant reader will notice we imported a module called “regularizers” and used it within our Dense layers. As we have seen before, neural networks suffer from overfitting, and regularization helps offset it.

Weight regularization limits overfitting by adding a penalty on large weights to the loss function. This pushes the network toward smaller weights, so that no single connection can dominate the output and the model is less able to memorize noise in the training data.
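To make the idea concrete, here is a minimal NumPy sketch of an L2 weight penalty, one common form of regularization. Which regularizer and which factor were used in this article is not stated, so the L2 choice, the factor of 0.01 and the example weights are assumptions. Keras's regularizers.l2 computes the same quantity: the factor times the sum of the squared weights.

```python
import numpy as np


def l2_penalty(weights, factor):
    """L2 regularization term: factor * sum of squared weights."""
    return factor * np.sum(np.square(weights))


# Hypothetical weight matrix of a tiny Dense layer.
weights = np.array([[0.5, -1.0],
                    [2.0, 0.0]])

penalty = l2_penalty(weights, factor=0.01)

# The penalty is added to the ordinary loss, so large weights cost more.
data_loss = 0.40  # made-up loss value for illustration
total_loss = data_loss + penalty
print(penalty, total_loss)
```

Because the largest weight (2.0) contributes most of the penalty, gradient descent is nudged toward shrinking it first.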

Analysis of Network:

Just as before, we wish to acquire a visual comparison of how accurate our neural network is with validation and training data.

Let’s rerun our visualization code from part 1.

Figure: Accuracies granted from our CNN

As you can see, we have reached an accuracy of approximately 84%. That is roughly 19 percentage points above the ~65% of our simple vanilla neural network, which clearly shows the advantage CNNs have over vanilla neural networks when analyzing images.

Next, we will write some code to test a random cat/dog image on our network. All you have to do in your case is upload an image to Google Colab and change the name variable to match your image.
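The two steps around the prediction itself can be sketched like this, assuming the image has already been loaded and resized to 64 by 64 RGB and that the network ends in a single sigmoid unit. The helper names to_batch and label_from_output, and the 0.5 threshold, are hypothetical choices for illustration.

```python
import numpy as np


def to_batch(image):
    """Turn one (64, 64, 3) image into a batch of one: (1, 64, 64, 3)."""
    arr = np.asarray(image, dtype="float32")
    if arr.shape != (64, 64, 3):
        raise ValueError(f"expected shape (64, 64, 3), got {arr.shape}")
    return np.expand_dims(arr, axis=0)


def label_from_output(value, threshold=0.5):
    """Map the network output to a label (0 = cat, 1 = dog)."""
    return "dog" if value >= threshold else "cat"


# Placeholder image; in practice this would be the uploaded picture.
image = np.zeros((64, 64, 3))
batch = to_batch(image)
print(batch.shape)  # (1, 64, 64, 3)

# With a trained model one would use: value = float(model.predict(batch)[0][0])
print(label_from_output(0.91))  # dog
print(label_from_output(0.08))  # cat
```

The extra leading axis is needed because Keras models always predict on batches, even when the batch holds a single image.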

In my case I got this output:

Figure: Classification output

Further optimization of our network:

Unfortunately, if we take a closer look at our graph, the training accuracy is much higher than the corresponding validation accuracy, which is a sign of overfitting. To counter this, we will introduce the Dropout layer, which randomly zeroes a fraction of the outputs of the previous layer during training.
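What a Dropout layer does during training can be illustrated in plain NumPy. This is a simplified sketch rather than the Keras implementation: it zeroes each value with probability equal to the rate and rescales the survivors by 1 / (1 - rate), which is also how Keras keeps the expected sum unchanged. The function name and the example activations are made up.

```python
import numpy as np


def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`, rescale the rest."""
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)


rng = np.random.default_rng(0)
activations = np.ones((4, 1000))  # hypothetical outputs of a previous layer

dropped = dropout(activations, rate=0.5, rng=rng)
zero_fraction = np.mean(dropped == 0)

print(np.unique(dropped))  # only zeros and rescaled survivors remain
print(zero_fraction)       # close to the dropout rate
```

At prediction time no values are dropped, so the network uses all of its neurons.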

We will also introduce a Keras class called ImageDataGenerator. It generates additional training data from our initial training data by applying image transformations to it.

Expanding our dataset through image transformations makes it larger and more varied, which in turn exposes the network to more aspects of our data. This has the effect of reducing overfitting.

Firstly, let’s return to our second block which transforms our trainX and trainY into numpy arrays. We will do most of our data augmentation here.

The above block will separate our data into validation_data and partial training data. The latter will be expanded with several image transformations such as horizontal flips, zooms, cropping of images, and rotations.
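The split-and-augment idea can be shown with a simplified NumPy stand-in. It is not the ImageDataGenerator code of this article: it only performs horizontal flips and stores the flipped copies, whereas ImageDataGenerator applies its transformations randomly to each batch as it is drawn. The function names, the random placeholder data and the validation size of 2 are assumptions.

```python
import numpy as np


def split_validation(x, y, n_val):
    """Hold out the first n_val samples for validation."""
    validation = (x[:n_val], y[:n_val])
    partial_train = (x[n_val:], y[n_val:])
    return partial_train, validation


def add_horizontal_flips(x, y):
    """Append a mirrored copy of every image; labels stay the same."""
    flipped = x[:, :, ::-1, :]  # reverse the width axis of (N, H, W, C)
    return np.concatenate([x, flipped]), np.concatenate([y, y])


rng = np.random.default_rng(0)
trainX = rng.random((10, 64, 64, 3))  # placeholder images
trainY = np.array([0, 1] * 5)         # placeholder labels

(partial_x, partial_y), (val_x, val_y) = split_validation(trainX, trainY, n_val=2)
aug_x, aug_y = add_horizontal_flips(partial_x, partial_y)

print(val_x.shape)  # (2, 64, 64, 3)
print(aug_x.shape)  # (16, 64, 64, 3)
```

Only the partial training data is augmented; the validation images are left untouched so that they still measure performance on unmodified pictures.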

Next, we will go down to our model and remove the data separation portion, as well as add a Dropout layer with a rate of 50%.

We will also change our fitting of the network to run for 100 epochs and replace the batch size argument with steps_per_epoch. The latter is needed because the generator produced by ImageDataGenerator already yields the data in batches, 32 images at a time by default.
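A common way to choose the number of steps is to divide the number of training images by the batch size and round up, so that one epoch passes over the data roughly once. The sketch below assumes the default batch size of 32. The sample count of 23,262 is the size of the full dataset and is used only for illustration, since the size of the partial training set is not stated here.

```python
import math


def steps_per_epoch(n_samples, batch_size=32):
    """Number of batches needed to see every sample once per epoch."""
    return math.ceil(n_samples / batch_size)


print(steps_per_epoch(23262))  # 727
print(steps_per_epoch(64))     # 2
print(steps_per_epoch(65))     # 3
```

Rounding up means the last batch of an epoch may be smaller than 32 images.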

Now if we run the image visualization portion again, we get the following output:

Figure: CNN with image augmentation.

As you can see, both the validation accuracy and the training accuracy follow the same trend, suggesting that overfitting has been drastically reduced. We have reached a consistent maximum accuracy of 88%.

In later pieces, we will utilize pre-trained models and other avenues to drastically increase the accuracy of our model.

Next Article:


***Github Code*** :

Written by

A Somali physicist, electrical engineer, Software enthusiast, and political enthusiast.
