- Speed detection.
- Traffic control. For example, if a car is speeding, we can capture an image of it.
- Counting the cars, and so on.
We have two options for getting the bounding boxes. The first method is to train a Convolutional Neural Network ourselves: we would have to train it on a large set of car images, and also manually mark the boundary of each and every car so the model can learn from that boundary data.
The second option is to use a pre-trained object detection model from TensorFlow and use it to get the x, y coordinates of every car on the road.
So let's start.
First, we need to install the following dependencies:
1. TensorFlow
pip install tensorflow
If you are using a dedicated GPU, then install the GPU version:
pip install tensorflow-gpu
Note: As of February 2019, TensorFlow is not compatible with Python 3.7, so you have to use Python 3.6.
2. Jupyter Notebook
pip install jupyter
3. Matplotlib
pip install matplotlib
4. OpenCV
pip install opencv-python
5. lxml
pip install lxml
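Before moving on, you can quickly verify that everything installed correctly with a small script like this (just a sanity check, not part of the original setup):

# quick check that all the dependencies import correctly
import tensorflow as tf
import cv2
import matplotlib
from lxml import etree

print("TensorFlow:", tf.__version__)    # should print a 1.x version for this tutorial
print("OpenCV:", cv2.__version__)
print("Matplotlib:", matplotlib.__version__)
print("lxml:", etree.__version__)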
After installing the dependencies, we have to download two things. The first one is the object detection models repository from TensorFlow's GitHub.
To download it, just open the git repository and download it as a zip file.
The second one is the Protocol Buffers compiler, protoc.
Unzip both files, then open models, go into the research folder, and open a terminal or command prompt from there.

Next, copy the path of the protoc executable.

Then run this command from the command prompt or terminal:
Path-you-copied object_detection/protos/*.proto --python_out=.
That is it for Windows users, but for Mac users there is one more command:
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
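Putting the steps together, on macOS/Linux the whole sequence looks roughly like this (the protoc path below is only an example; use the path you copied):

# run from the models/research folder
cd ~/Downloads/models/research
# compile the .proto files (example path -- substitute the one you copied)
~/Downloads/protoc/bin/protoc object_detection/protos/*.proto --python_out=.
# macOS/Linux only: add research and slim to the Python path
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim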
This is it for the installation.
Now we have to open Jupyter Notebook. We can open it from the same command prompt or terminal that we used earlier; you can also open it from anywhere else and simply navigate to this folder. To open it, type jupyter notebook and it will open a notebook for you in the browser:
jupyter notebook
It looks like this. Then click on object_detection, as highlighted.

Then click on the object detection tutorial file.

Then, from the Cell menu, click on Run All. It will download the latest model.

Now our model is downloaded and we can work on it.
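If you prefer to download the model outside the notebook, a short script like this should work too (the URL follows the standard TensorFlow model zoo pattern for this model):

import six.moves.urllib as urllib
import tarfile

MODEL_NAME = 'ssd_mobilenet_v1_coco_2017_11_17'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# download the archive and unpack the frozen graph next to this script
opener = urllib.request.URLopener()
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
with tarfile.open(MODEL_FILE) as tar:
    tar.extractall()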
Open a Python file and save the following code in it, but don't forget to save the file in the object_detection folder, because the code imports functions from modules like utils that live in that folder.
I am attaching the full code here. The code has comments, so it is self-explanatory. The area of interest is the bounding box logic, which we will cover after the code.
''' Importing the necessary dependencies. There are many modules, like
collections and utils, which are imported from the object detection folder,
so don't forget to save this code in that folder. '''
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from PIL import Image
import cv2

from utils import label_map_util
from utils import visualization_utils as vis_util

# Path to the frozen detection graph. This is the actual model that is used
# for the object detection.
PATH_TO_CKPT = "/Users/pushkarsingh/Downloads/models/research/object_detection/ssd_mobilenet_v1_coco_2017_11_17/frozen_inference_graph.pb"

# The label map that maps box indices to class names.
# (Note: this must be the .pbtxt label map, not the minival ids file.)
PATH_TO_LABELS = "/Users/pushkarsingh/Downloads/models/research/object_detection/data/mscoco_label_map.pbtxt"

NUM_CLASSES = 90

# Load a (frozen) TensorFlow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

# Loading the label map.
# Label maps map indices to category names, so that when our convolutional
# network predicts `5`, we know that this corresponds to `airplane`. Here we
# use internal utility functions, but anything that returns a dictionary
# mapping integers to appropriate string labels would be fine.
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        cap = cv2.VideoCapture("/Users/pushkarsingh/Downloads/videoplayback.mp4")
        ## out = cv2.VideoWriter('/Users/pushkarsingh/Desktop/output.mp4', 0, 30.0, (640, 480))
        while cap.isOpened():
            ret, image_np = cap.read()
            if ret:
                image_np = cv2.resize(image_np, (420, 220))
                image_np_expanded = np.expand_dims(image_np, axis=0)
                image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
                # Each box represents a part of the image where a particular
                # object was detected.
                boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
                # Each score represents the level of confidence for each of
                # the objects. The score is shown on the result image,
                # together with the class label.
                scores = detection_graph.get_tensor_by_name('detection_scores:0')
                classes = detection_graph.get_tensor_by_name('detection_classes:0')
                num_detections = detection_graph.get_tensor_by_name('num_detections:0')
                # Actual detection.
                (boxes, scores, classes, num_detections) = sess.run(
                    [boxes, scores, classes, num_detections],
                    feed_dict={image_tensor: image_np_expanded})
                # Visualization of the results of a detection.
                vis_util.visualize_boxes_and_labels_on_image_array(
                    image_np,
                    np.squeeze(boxes),
                    np.squeeze(classes).astype(np.int32),
                    np.squeeze(scores),
                    category_index,
                    use_normalized_coordinates=True,
                    line_thickness=1)
                for i, b in enumerate(boxes[0]):
                    # Class 3 is car, 6 is bus and 8 is truck in the COCO label map.
                    if classes[0][i] == 3 or classes[0][i] == 6 or classes[0][i] == 8:
                        if scores[0][i] >= 0.5:
                            # Midpoint of the box in normalized coordinates.
                            mid_x = (boxes[0][i][1] + boxes[0][i][3]) / 2
                            mid_y = (boxes[0][i][0] + boxes[0][i][2]) / 2
                            # Distance proxy: the wider the box, the closer the vehicle.
                            apx_distance = round(((1 - (boxes[0][i][3] - boxes[0][i][1])) ** 2), 1)
                            cv2.putText(image_np, '{}'.format(apx_distance),
                                        (int(mid_x * 420), int(mid_y * 220)),
                                        cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2)
                            if apx_distance <= 0.4:
                                # Warn only if the vehicle is roughly in front of us.
                                if mid_x > 0.2 and mid_x < 0.8:
                                    cv2.putText(image_np, 'WARNING!!!', (50, 50),
                                                cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 0, 255), 2)
                ## out.write(image_np)
                cv2.imshow('window', image_np)
                if cv2.waitKey(25) & 0xFF == ord('q'):
                    break
            else:
                break
        cap.release()
        ## out.release()
        cv2.destroyAllWindows()
Since detecting real distance in a 2D image is practically impossible, we take a different approach. As you can see in the video, the bounding boxes are bigger for the cars that are near us and smaller for the cars that are far away, so a vehicle with a small bounding box is not a threat to us.
Thus, we use the size of the box as a stand-in for distance: if the box is bigger than a certain threshold, the vehicle is close enough to be treated as a threat.
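To get a feel for this proxy, here is a small standalone version of the same formula used in the code above (the box values are made up for illustration):

# the proxy from the code above: 1 minus the normalized box width, squared,
# so a wide (near) box gives a small value and a narrow (far) box a large one
def apx_distance(xmin, xmax):
    return round((1 - (xmax - xmin)) ** 2, 1)

print(apx_distance(0.10, 0.80))   # wide box, vehicle is near -> 0.1 (triggers the warning)
print(apx_distance(0.45, 0.55))   # narrow box, vehicle is far -> 0.8 (ignored)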
We select only three classes, i.e. car (3), bus (6), and truck (8).
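If you want to double-check these IDs, you can look them up in the category_index loaded earlier (the expected output assumes the standard COCO label map):

# look up the class names behind the IDs we filter on
print(category_index[3])   # {'id': 3, 'name': 'car'}
print(category_index[6])   # {'id': 6, 'name': 'bus'}
print(category_index[8])   # {'id': 8, 'name': 'truck'}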
The boxes in this model are returned as normalized [ymin, xmin, ymax, xmax] coordinates, so we first determine the approximate distance of each vehicle from the width of its bounding box (xmax - xmin). Because the values are returned as fractions of the frame size rather than pixels, we have to deal with them in that normalized form. It is not an actual distance, just a rough indicator. Then, if the approximate distance value is less than or equal to 0.4 and the vehicle is within the line of sight we defined (mid_x between 0.2 and 0.8), we display the warning.
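Since the coordinates come back normalized, converting a box to pixels just means multiplying by the frame size, exactly as the cv2.putText call does with mid_x and mid_y. A minimal sketch, assuming the 420x220 resize from the code above:

# boxes are [ymin, xmin, ymax, xmax], each value between 0 and 1;
# multiply by the frame width/height to get pixel coordinates
FRAME_W, FRAME_H = 420, 220

def box_to_pixels(box):
    ymin, xmin, ymax, xmax = box
    return (int(xmin * FRAME_W), int(ymin * FRAME_H),
            int(xmax * FRAME_W), int(ymax * FRAME_H))

print(box_to_pixels([0.2, 0.1, 0.6, 0.5]))   # -> (42, 44, 210, 132)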
After everything is done, we release the capture and close our windows.
This is it from my side. If you have any comments, concerns, doubts, or suggestions, then please leave them below.
Thanks for reading 😀
Hello, thank you for your tutorial, it's really useful.
I'm wondering, is there any way to reproduce the same app on mobile using TensorFlow Lite instead of regular TensorFlow?
Thanks, Massyl. I think the idea of a mobile app is really good; I will see if I can create an app like this.
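In the meantime, a rough starting point could be the TFLite converter that ships with TensorFlow 1.x. This is only a sketch, not a tested recipe: SSD detection models usually need the special export_tflite_ssd_graph.py script from the object_detection folder before they convert cleanly, and the tensor names and input shape below are assumptions based on the code above.

import tensorflow as tf

# sketch only: convert a frozen detection graph to a .tflite file (TF 1.x API)
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="frozen_inference_graph.pb",          # the model file used above
    input_arrays=["image_tensor"],
    output_arrays=["detection_boxes", "detection_scores",
                   "detection_classes", "num_detections"],
    input_shapes={"image_tensor": [1, 300, 300, 3]})     # assumed fixed input size
tflite_model = converter.convert()
with open("detect.tflite", "wb") as f:
    f.write(tflite_model)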