CNN Part 3: Setting up Google Colab and training a Model using TensorFlow and Keras

Welcome to Part 3 of this CNN series. In the previous parts we covered an overview of Convolutional Neural Networks and how to preprocess the data for training. In this lesson, we will train our neural network in Google Colab.

First, let's talk about Google Colab. If you don't have a GPU and still want to learn ML, then trust me, this is the best place to work. Google Colab (Google Colaboratory) is a platform that lets you train your machine learning or TensorFlow project on a GPU for free. No upfront cost, no credit card required, nothing; just log in with your Google account and open a Colab notebook. It opens a Python notebook just like Jupyter, and you can train your model from there.

Now let's come back to our project. For training, first open Google Drive and create a folder with any name. The idea is that we will save all the files related to the project in that one folder, so the data stays easy to manage. I created a folder named Cat vs Dog. Then create three more folders inside it: Pickle, Logs and Models.

Then upload the pickle files into the Pickle folder. We will use the Logs and Models folders to store log and model files for later evaluation, so that we can pick the best model.

Now open Google Colab and click New Python 3 Notebook. You will get this window:

This is a Python notebook just like Jupyter, and from now on we will use it to train the model. To use the GPU you have to change the runtime: click Runtime, select Change runtime type, and choose GPU there.
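To confirm that the GPU runtime is actually active, you can run a quick check in a cell. This is just a sanity check, not required for training:

import tensorflow as tf

# an empty list means the runtime is still CPU-only
print(tf.config.list_physical_devices('GPU'))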

The next step is to link our Google Drive with Colab so that we can access files from there; we will also use Google Drive to save all our logs and models. To mount Google Drive, run these two lines:

from google.colab import drive
drive.mount("/content/drive")

You will then be asked to click a link that redirects you to an authorization page. Sign in, copy the authorization code it gives you, and paste it back into the notebook. After successful authentication, you will get this output:

Mounted at /content/drive
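To double-check that the mount worked and your project folder is visible, you can simply list it. The path below matches the folder structure I use later in the code; adjust it to your own folders:

import os

# list the contents of the project folder on Drive
print(os.listdir("/content/drive/My Drive/Projects/cat vs dog"))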

Now we are all set to train our very first model.

I am posting the full code here, and after that I will explain every step, so if you have a doubt at any particular step you can just navigate to that particular line.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.utils import to_categorical
import pickle
import time

NAME = "3-conv-128-layer-dense-1-out-2-softmax-categorical-cross-2-CNN"

pickle_in = open("/content/drive/My Drive/Projects/cat vs dog/Xv.pickle","rb")
X = pickle.load(pickle_in)

pickle_in = open("/content/drive/My Drive/Projects/cat vs dog/Y.pickle","rb")
y = pickle.load(pickle_in)
y = to_categorical(y)

X = X/255.0

model = Sequential()

model.add(Conv2D(128, (3, 3), input_shape=(50,50,1)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))

model.add(Conv2D(128, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))

model.add(Conv2D(128, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))

model.add(Flatten())  

model.add(Dense(128))
model.add(Activation('relu'))

model.add(Dense(2))
model.add(Activation('softmax'))

tensorboard = TensorBoard(log_dir="/content/drive/My Drive/Projects/cat vs dog/Logs2/{}".format(NAME))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'],
              )

model.fit(X, y,
          batch_size=32,
          epochs=10,
          validation_split=0.3,
          callbacks=[tensorboard])

model.save(f"/content/drive/My Drive/Projects/cat vs dog/Models/{NAME}.model")

First of all, we import all the necessary libraries and modules.

We are importing an Activation layer, a Conv2D layer, Dropout, and so on. We will discuss each of these imports when we use them in the program. For now, just keep in mind what we imported.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.utils import to_categorical
import pickle
import time

Moving forward

Here we define the name used to save the logs. It's good practice to set a well-defined name so that when we look at the logs later, the name itself explains what the run was.

Next we are defining the path to the pickle files that we created in part 2.

To find these files, click the arrow symbol in the top-left corner; a panel will slide out. Click Files and navigate to your pickle files. Your Drive is mounted inside the 'content' directory.

Then locate the pickle file, copy its path and paste it into the code. Don't forget to add a "/" at the start of the path.
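For reference, these are the corresponding lines from the listing above (the paths point at my own Drive folders, adjust them to yours):

NAME = "3-conv-128-layer-dense-1-out-2-softmax-categorical-cross-2-CNN"

# load the feature and label arrays we pickled in part 2
pickle_in = open("/content/drive/My Drive/Projects/cat vs dog/Xv.pickle","rb")
X = pickle.load(pickle_in)

pickle_in = open("/content/drive/My Drive/Projects/cat vs dog/Y.pickle","rb")
y = pickle.load(pickle_in)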

Next, we normalize X as we discussed in part 2. We know the lowest and highest possible values of a pixel, 0 and 255; since the pixels vary over this whole range, we scale them down to the range 0-1. We also one-hot encode the labels with to_categorical so that they match the two-unit softmax output of the model.
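Again, the corresponding lines from the listing above:

# one-hot encode the labels: 0 becomes [1, 0] and 1 becomes [0, 1], matching the 2-unit softmax output
y = to_categorical(y)

# scale pixel values from the 0-255 range down to 0-1
X = X/255.0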

Here we are selecting the model type. Sequential is an API that allows us to build a custom model layer by layer.

It is just like a custom PC build, where you choose the RAM size, HDD size or processor according to your requirements. In the same way, if we need a layer we add it, if not we leave it out, and if we need less capacity we can tune the number of filters.

As you can see, we are adding Conv2D layers, Dense layers, MaxPooling layers and so on, and we also define the number of filters for each. So we are building a custom model with the help of the Sequential API.

model = Sequential()

First of all, we add the Conv2D layers (hidden layers). Remember the jungle example in part 1? If you haven't read part 1 yet, I recommend reading it by clicking here; there I tried to explain how this layer works with a simple example.

The convolution layer takes the data from a (3, 3) window and draws a conclusion from it using filters. Here we are using 128 filters, which we can change to any number.

Generally we take 32, 64, 128, 256 and so on, but it is not a rule that you have to pick from these; you can use any number. When the first layer has made its conclusions, they are passed on to the second layer, where the same process happens again, and again we can specify the number of filters.

Next is the activation function. Again, it partly comes down to choice, and for a better understanding you have to dig deeper. ReLU, the rectified linear unit, is the most popular and versatile activation; if you don't know what to pick, just use ReLU.

Next is MaxPooling2D. We also covered this layer in the first lesson: it takes a 2x2 window, finds the maximum value in it, and only that value is passed on to the next layer.

We also add a Dropout layer. What this layer does is randomly drop a certain percentage of the units at every step to avoid overfitting.

model.add(Conv2D(128, (3, 3), input_shape=(50,50,1)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))

model.add(Conv2D(128, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))

model.add(Conv2D(128, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))

The Flatten layer is used to flatten the matrix. Basically, it converts the 3D output of the convolution layers into a 1D array, because Dense layers work on 1D input, so we have to flatten the data before passing it to the Dense layer.

One thing to notice is that the Dense layers come after the last convolution layer because they perform the classification, and at the end we use a Dense layer with 2 outputs because we are training our model to distinguish two classes.

model.add(Flatten())  

model.add(Dense(128))
model.add(Activation('relu'))

model.add(Dense(2))
model.add(Activation('softmax'))
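Once all the layers are added, it is worth sanity-checking the architecture. model.summary() prints every layer with its output shape and parameter count, which makes it easy to spot a wrong input shape:

model.summary()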

Next, we create a callback using TensorBoard. A callback is a piece of code that runs during training of the model. Here we use it to write logs so that we can analyze and compare models with TensorBoard, which we will cover in the next lesson. We specify the path to the Logs folder here.

tensorboard = TensorBoard(log_dir="/content/drive/My Drive/Projects/cat vs dog/Logs2/{}".format(NAME))

In this step, we compile the model. The choice of loss function depends on how the labels and the output layer are set up. Binary cross entropy is used for a two-class problem where the model has a single output unit with a sigmoid activation: the output is one number between 0 and 1, where values close to 0 mean the first class and values close to 1 mean the second.

Next is categorical cross entropy, which is what we use here. It expects one-hot encoded labels (that is why we called to_categorical above) together with a softmax output: the model returns a separate probability for each class, and the probabilities sum to 1.

The third one is sparse categorical cross entropy. It is the same as categorical cross entropy, but it expects the labels as plain integers (0, 1, 2, ...) instead of one-hot vectors, which is convenient when you have many classes.
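As a quick reference (this is only a summary, not part of the training script), the three losses pair with the output layer and label format like this:

# binary_crossentropy             -> last layer Dense(1, activation='sigmoid'), labels are 0 or 1
# categorical_crossentropy        -> last layer Dense(N, activation='softmax'), one-hot labels (our case, N=2)
# sparse_categorical_crossentropy -> last layer Dense(N, activation='softmax'), integer labels 0..N-1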

Next, we use Adam as the optimizer, mostly because we always do. Seriously, I have never used any optimizer other than Adam and it always gives me decent results. Then we add the metric we want to track, which in our case is accuracy.

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'],
              )

In the end we fit the model. Here we pass the features and labels, the batch size in which we want to process the data, and the number of epochs, i.e. how many times the model runs over the whole dataset in one training run. And the most important one:

validation_split. This splits off validation data automatically: Keras holds back the given fraction of the data (here 30%) and evaluates on it after every epoch. I don't know why people separate the training and validation data by hand before feeding it into the network instead of using this built-in Keras option that does the same job.

And last, we pass the callbacks. In this case we pass the TensorBoard callback because we want the logs, but you can also pass callbacks that, for example, save the model whenever the accuracy is highest or the validation loss is lowest (see the ModelCheckpoint sketch after the fit call below).

model.fit(X, y,
          batch_size=32,
          epochs=10,
          validation_split=0.3,
          callbacks=[tensorboard])

model.save(f"/content/drive/My Drive/Projects/cat vs dog/Models/{NAME}.model")

After running 10 epochs we get an accuracy of around 92%. At the end we save the model.
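As mentioned above, instead of only saving manually at the end you could let a ModelCheckpoint callback save the best model during training. A minimal sketch, assuming you want to keep the version with the lowest validation loss (the file path is just an example):

from tensorflow.keras.callbacks import ModelCheckpoint

# save a copy to Drive whenever the validation loss improves
checkpoint = ModelCheckpoint(
    "/content/drive/My Drive/Projects/cat vs dog/Models/best-{}.h5".format(NAME),
    monitor="val_loss",
    save_best_only=True,
)

# then pass it along with the TensorBoard callback:
# model.fit(X, y, batch_size=32, epochs=10, validation_split=0.3,
#           callbacks=[tensorboard, checkpoint])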

Now if you want to run a prediction with this model, you have to prepare the image in the same way as the images you fed into the neural network: the same width, height and colour format.

Here is the code for this

import cv2

def prepare(filepath):
    img_size = 50  # must match the input_shape=(50, 50, 1) the model was trained with
    img = cv2.imread(filepath, cv2.IMREAD_GRAYSCALE)     # read as grayscale, like the training data
    img = cv2.resize(img, (img_size, img_size)) / 255.0  # resize and scale to 0-1
    return img.reshape(-1, img_size, img_size, 1)         # add batch and channel dimensions
  
prediction=model.predict(prepare("/content/drive/My Drive/Projects/cat vs dog/Dog.jpg"))
print(prediction)

I have an image of a dog; let's see what result we get.

[[0.9980282  0.00197181]]

Woohh, the model is 99.8% sure that it's a dog.
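If you prefer a readable label instead of raw probabilities, you can take the argmax of the output. The class order below is an assumption on my part; it has to match the order of the categories you used when building the labels in part 2:

import numpy as np

CLASS_NAMES = ["Dog", "Cat"]  # assumed order, adjust to your category list from part 2
predicted_index = int(np.argmax(prediction[0]))  # index of the highest probability
print(CLASS_NAMES[predicted_index], prediction[0][predicted_index])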

But this may not really be how well the model performs; maybe we overfit the data, and if we did, we have to discard this model and try again with a different setup. But how do we know, and what if the next model fails too? We will answer this question in the next lesson, where we will also learn about TensorBoard.

That’s all for this lesson.

Thanks for reading and if you have any doubt or suggestion then please comment below.

Here is the link for CNN part 4: How to use TensorBoard to avoid Overfitting
