Please see the GitHub repository.
This repository presents my attempt to build a dog breed identifier tool using a neural network built with Keras.
This work is inspired by the Kaggle Dog Breed Identification Challenge (I did not take part in the competition because it was too late to submit). However, being a doggo friend, I enjoyed the topic and wanted to take the opportunity to improve my skills in computer vision! So, I took a dog breed identification dataset and built my own classifier.
As specified on Kaggle: You are provided with a training set and a test set of images of dogs. Each image has a filename that is its unique id. The dataset comprises 120 breeds of dogs. The goal of the competition is to create a classifier capable of determining a dog’s breed from a photo.
The dataset is composed of 133 different breeds with 8350 pictures in total. The number of pictures per breed ranges roughly from 25 to 100.
This is not enough data to train a neural network from scratch: the model would overfit the training images, and the error on the test set would be high.
To deal with this, I apply two methodologies:
Remark: Because of limited computing resources (in particular, I don't have a GPU), I was not able to train long enough to reach high accuracy. However, the code works correctly, and it will be possible to select the best model once it is trained on a GPU.
This is a multi-class classification problem with 133 classes (one label per image).
The evaluation metric is the accuracy score.
The project is decomposed into 3 parts:
The framework of this notebook is:
The number of images per breed in the dataset is small, which makes the model likely to overfit. The first step is therefore to augment the input data.
We can apply several transformations to augment the number of images. Moreover, the customer is not a professional photographer, so this kind of transformation also adds robustness against 'amateur' photos:
In practice, I used Keras's built-in training data generator; a minimal sketch is shown below.
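As an illustration, here is a minimal sketch of such a generator using Keras's ImageDataGenerator. The augmentation parameters, directory path, and image size are assumptions for the example, not necessarily the values used in the repository:

```python
from keras.preprocessing.image import ImageDataGenerator

# Hypothetical augmentation settings; the actual values may differ.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,       # normalize pixel values to [0, 1]
    rotation_range=20,       # random rotations up to 20 degrees
    width_shift_range=0.1,   # random horizontal shifts
    height_shift_range=0.1,  # random vertical shifts
    zoom_range=0.1,          # random zoom in/out
    horizontal_flip=True     # random horizontal flips
)

# Flow images from a directory organized as one sub-folder per breed
# ('data/train' is a hypothetical path).
train_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)
```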
In general, to build our model, we can:
For the reasons above and because of the small amount of data, we will use a pre-trained model.
Pre-trained models are advantageous here because:
With Keras, we have access to several pre-trained models (the ones tested here are in bold):
My final architecture is:
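To make the idea concrete, here is a minimal sketch of a customized head on top of a pre-trained Xception base. The head layers, sizes, dropout rate, and input shape are assumptions for illustration and are not necessarily the exact final architecture used in the repository:

```python
from keras.applications.xception import Xception
from keras.layers import GlobalAveragePooling2D, Dense, Dropout
from keras.models import Model

# Load Xception pre-trained on ImageNet, without its classification head.
base_model = Xception(weights='imagenet', include_top=False,
                      input_shape=(224, 224, 3))

# Freeze the convolutional base so only the custom head is trained.
for layer in base_model.layers:
    layer.trainable = False

# Custom classification head (layer sizes are illustrative).
x = GlobalAveragePooling2D()(base_model.output)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
predictions = Dense(133, activation='softmax')(x)  # 133 dog breeds

model = Model(inputs=base_model.input, outputs=predictions)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```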
The framework is to split the dataset into a training set and a test set (80% for training). The two models ((VGGNet + customized head) and (Xception + customized head)) are trained on 80% of the dataset, with 10% used as a validation set during training. The remaining 10% are used to test both models and select the best one. A minimal sketch of this split is shown below.
Remark: Having no GPU, I could not fully train the models, so the results are not representative of the real performance. After an hour of training, accuracy went from 1% to 15%, which is far better than random chance (1/133, about 0.75%). So, with more training time we can reach a better accuracy.
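A minimal sketch of this 80/10/10 split and training loop, assuming the images and one-hot labels are already loaded into arrays `X` and `y` and that `model` is the compiled network from the previous sketch (variable names, epochs, and batch size are illustrative):

```python
from sklearn.model_selection import train_test_split

# 80% for training, 20% held out; the held-out part is split in half
# into a validation set (10%) and a final test set (10%).
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.2, random_state=42)
X_valid, X_test, y_valid, y_test = train_test_split(
    X_hold, y_hold, test_size=0.5, random_state=42)

# Train with the validation set monitored at the end of each epoch.
model.fit(X_train, y_train,
          validation_data=(X_valid, y_valid),
          epochs=20, batch_size=32)

# Evaluate on the untouched test set to compare the candidate models.
test_loss, test_acc = model.evaluate(X_test, y_test)
print('Test accuracy: {:.3f}'.format(test_acc))
```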
Using the best model, I am now able (once it has been trained on a GPU) to identify the breed of a dog, or even to tell which breed a human face most resembles!
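Once a trained model is available, breed prediction for a new photo can be sketched as follows. The image path, the `breed_names` list, and the helper function are hypothetical names introduced for the example:

```python
import numpy as np
from keras.preprocessing import image

def predict_breed(img_path, model, breed_names):
    """Return the most likely breed name for the image at img_path."""
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img) / 255.0     # same rescaling as training
    x = np.expand_dims(x, axis=0)           # shape (1, 224, 224, 3)
    probabilities = model.predict(x)[0]
    return breed_names[np.argmax(probabilities)]

# Hypothetical usage:
# print(predict_breed('images/my_dog.jpg', model, breed_names))
```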
As seen, it is possible to build a neural network to identify the breed of dogs.
We can imagine building such a model to also identify, from a picture, the dog's age (at least young/old, given we already know the breed), its gender, and the presence of diseases. From this information, an insurance company could better price its insurance plans.
Technically, it would be possible to improve the model through: