
Object Tracking BrickPi Robot

This page is about a project to create an intelligent Raspberry Pi powered mobile robot. The goal is to have a robot that can teach itself to track and chase objects. There are three phases: Object Tracking, Motor Control and Machine Learning. The first two are complete and the robot can track an object of a given colour and chase it. However it does not teach itself to do this. Over the summer one of my students will be creating an Artificial Neural Network for the machine learning part of the project.

Phase 1: Object detection

Program files on Github:
There are two versions of the files: one for use on a Raspberry Pi with camera module, and the other for use on a PC with webcam. The differences are few, but in order to keep the code clean and clear I have made separate versions.

First you need to make sure that your Raspberry Pi is properly set up (see Getting Started), and can obtain an image programmatically. My preferred option is to use v4l2, see the notes under Camera Trap. You will also need to install the open Computer Vision library openCV, with 'sudo apt-get install libopencv-dev python-opencv'.

1. Getting an image

Once you have those elements installed you can obtain an openCV image object with the Python program for this step. Open it in IDLE, and select Run>Run Module from the menu bar. You should see an image from the camera appear on screen. Then press a key to see a transformed image.

Once you have that working try the following:

2. Identify a region by hue

Computers normally store an image as a giant matrix with three values for each pixel: the intensities of red, green and blue (RGB values) that combine to make the colour of the pixel. A simple but fairly robust method of identifying an object is by colour. However, you want to specify the colour in a way that isn't too much affected by how light or dark the lighting on the object is, or how washed out or exposed the image is. This is tricky when specifying ranges of RGB values, but can be done by looking at the hue of the object.

This is done in the program; again, press a key to step through the images. The function cv2.cvtColor(image,cv2.COLOR_BGR2HSV) converts the representation from three colour values for each pixel (which openCV stores in blue, green, red order, hence BGR) to a Hue, Saturation and Value for each pixel. Hue gives the essential colour, Saturation gives the intensity of that colour and Value gives the overall brightness of the pixel, as depicted in this image.
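To see why hue is a steadier handle on colour than raw RGB values, here is a minimal sketch that uses Python's standard colorsys module instead of openCV, so it runs without a camera. The helper name to_opencv_hsv and the two sample pinks are illustrative assumptions, not taken from the project files. The scaling imitates openCV's 8-bit convention, where hue runs from 0 to 179 and saturation and value run from 0 to 255.

```python
import colorsys

def to_opencv_hsv(r, g, b):
    """Convert 8-bit RGB values to HSV on openCV's 8-bit scale.

    Hypothetical helper for illustration only: hue 0-179,
    saturation and value 0-255.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255))

bright_pink = (255, 64, 160)   # well-lit object (assumed sample values)
dim_pink = (128, 32, 80)       # the same colour at roughly half the brightness

# The RGB triples look very different, but the hue stays the same
# and only the value (brightness) changes.
print(to_opencv_hsv(*bright_pink))
print(to_opencv_hsv(*dim_pink))
```

Both pinks should land on the same hue, inside the 160 to 175 band used later on this page, which is why a tight hue range with loose saturation and value limits copes with changes in lighting.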

By specifying a tight range of hue values, and a very wide range of saturation and value values, we should identify all regions that contain objects of a given colour, regardless of lighting conditions. The print statement in the program will output the HSV values of the centre pixel of the image to the console.

The variables lower_pink and upper_pink in the program are used to specify Hue between 160 and 175, which is roughly the pink of pink post-it notes, and saturation and value values between 50 and 255, i.e. weak and dark up to strong and bright pink.

The function cv2.inRange is used to create a mask - a matrix of 0s and 255s with 255s where the corresponding pixel was sufficiently pink, and a 0 elsewhere. I also create an opposite mask (mask_inverted), by swapping 0s and 255s. 0 and 255 are used, because when interpreted as a greyscale image, this gives a black and white mask. The masks are used to make two images - one where I convert the original image to greyscale, then do a bitwise-and with the inverted mask to keep only pixels that were not pink, and the other from a bitwise-and of the original image and the mask to keep only the pink pixels. Combining these gives an image where pink parts are kept but everything else is greyscale.

Once you have this working try the following:

3. Target the direction with the greatest match for your hue

The next program involves a basic operation on the matrix values and a loop. First we take the mask, which is a matrix of 255s and 0s, where 255s represent pixels that are pink. We want the total number of pink pixels in each vertical line of the image.

We can get this from the function np.sum(mask,axis=0). (Note: the np. prefix means the function comes from NumPy, a useful library of mathematical functions for Python.) np.sum(mask,axis=0) means take the matrix 'mask' and sum each column, so LRarray = np.sum(mask,axis=0)/255 means we set LRarray to be a list of numbers, one for each column, where each value is the sum of the column values divided by 255 - i.e. the number of pink pixels in that vertical line.

The loop starts with 'for i in range(w):'. This means 'for each value of i in the range 0 up to w-1 do whatever is in the indented lines below'. Computers always count from 0, so we cover each of the w columns by counting from 0 to w-1. Python programs use indentation to define loops and other blocks of code - so getting the indentation right is important.

In the loop, for each i, we look at the ith entry in our list of numbers LRarray (denoted LRarray[i]) and ask if it is the biggest we have seen yet. The biggest value seen yet is called max_x_intensity; if LRarray[i] is bigger than this, then we remember this value of i (by setting max_x_coordinate to i), and reset max_x_intensity to the new biggest, otherwise we skip those two lines.

The result of the loop is that we know what value of i corresponds to the column with the largest number of pink pixels, and we have remembered it as max_x_coordinate. We then draw a line on the image from the top of this column to the bottom.
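Here is a small sketch of the column count and the loop, run on a tiny made-up mask so that the numbers are easy to check by hand. The variable names follow the text; the mask contents are an assumption for illustration. I have used // (integer division) so the counts stay whole numbers on Python 3 as well.

```python
import numpy as np

# Synthetic mask: 6 rows by 8 columns, pink (255) pixels in columns 2 and 5.
mask = np.zeros((6, 8), dtype=np.uint8)
mask[1:3, 2] = 255      # two pink pixels in column 2
mask[0:5, 5] = 255      # five pink pixels in column 5

h, w = mask.shape
LRarray = np.sum(mask, axis=0) // 255   # pink-pixel count for each column

max_x_intensity = 0
max_x_coordinate = 0
for i in range(w):
    if LRarray[i] > max_x_intensity:
        # New biggest column so far: remember where it is and how big.
        max_x_coordinate = i
        max_x_intensity = LRarray[i]

print(max_x_coordinate, max_x_intensity)

# NumPy can do the same search without an explicit loop.
print(int(np.argmax(LRarray)))
```

Both approaches pick the first column that has the largest count, so they agree even when several columns tie.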

Once you are happy with this, try the following:

4. Do it for real-time video

Now that we have loops, it is an easy step to do everything for live video, as is done in the program for this step. We simply add a huge loop around the code, and remove any waits for key presses and any display of unfinished images. Note that we have had to indent all the previous code, so that it is in the block that will be looped over.

Here I used a 'while(True):' loop. This loop will just keep on looping until we break it. I have replaced the function cv2.waitKey(0), which means wait here forever for a key press, with 'key_pressed = cv2.waitKey(1)' which means only wait 1 millisecond, but if a key is pressed, remember which one in the variable key_pressed. We can then check if the Esc key was pressed (key 27) and if so break (exit the loop), if not just keep looping.

5. Move from direction with greatest hue, to identifying objects

The approach we have so far could be confused between a large number of small pink objects one above the other and a single large pink object. Really we want to identify a single large object, and openCV has a convenient way to do this. In order to save ourselves a fair amount of effort, let's use openCV functions from here. We can use a function to find 'contours' which are outlines of a region of a single colour. The contours themselves are lists of points (given by x and y coordinates) which surround a region of pink in a join-the-dots style. A second function then draws the contours.

6. Target the largest matching object

Now we have almost got what we want. Once we have the list of contours, we can look through it for the biggest contour by area. For fun, rather than just outlining the area, let's draw a target over it, and make the target size dependent on the size of the object we found. This is done in the program for this step.

7. Final example

This example is from one of our Image Processing courses, and uses a slightly more advanced approach to track the target through time (Kalman Filters, here is an attempt to explain them simply!), but is still based on hue selection.
  • Choose a target by dragging the mouse over a region: example code.

  • This additional example (only accessible from inside our network) is part of a PhD project and is currently gathering data for further analysis. The video feed shows a thermal imaging camera pointing at our car park. Any vehicles or people that move across the scene should be detected and categorised using machine learning, not only by type (Car, Van, Person etc.) but also by activity (Walking, Running, Digging, etc.).
  • Thermal imaging feed.
Extensions

There are many directions in which you could take this further. I have used the information from the processed image to get a mobile robot to follow an object, as described in Phase 2 below.

Phase 2: Controlling the robot

The robot I created uses the BrickPi system for connecting to Lego Mindstorms motors. You will need to follow their instructions in order to install the necessary drivers - I took the easy option of using their Raspbian image and then installing everything else I wanted, but you can modify an existing system.

The first step is to gain control of the motors. The Lego motors attach with a simple cable to one of the ports on the BrickPi board - make sure you know which ports you have used (see image to the right)! You can then run a simple test program to check that the basic set up works. There is one in the BrickPi folder Sensor_Examples installed with the drivers. Note that the motor speeds must be set between -255 and 255, but low speeds (less than 100) may give poor results, i.e. the motor may not rotate under any load. Also, the command 'BrickPi.MotorSpeed[PORT_A] = 200' sets the desired motor speed, but does not actually make the motor run. Instead you must use the command 'BrickPiUpdateValues()' to push the values to the motors. Moreover, you must push the values very frequently, every 1/10 of a second, or the motors stop! Look at the example code to see how the command is repeated so often.

Unfortunately this is not a convenient thing to do when you want to be doing something else with your code. One solution is to set up a thread to repeat this command all the time. A thread is like a mini-program within your program that keeps on running on its own even when your main code has gone on to do other things. The next example includes some code at the beginning to launch a thread whose only purpose is to repeat the BrickPiUpdateValues() command every 1/10 second - once it is set up the rest of the code continues as normal.
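The idea of a background thread that keeps repeating one command can be sketched with Python's standard threading module. This is a generic outline, not the BrickPi example itself: the function name start_keepalive is hypothetical, and fake_update is a stand-in for BrickPiUpdateValues so that the sketch runs without the robot hardware attached.

```python
import threading
import time

def start_keepalive(update_fn, interval=0.1):
    """Call update_fn every `interval` seconds in a background thread.

    Returns an Event; call .set() on it to stop the thread.
    """
    stop_event = threading.Event()

    def worker():
        while not stop_event.is_set():
            update_fn()
            stop_event.wait(interval)   # sleep, but wake early if stopped

    thread = threading.Thread(target=worker)
    thread.daemon = True                # do not keep the program alive on exit
    thread.start()
    return stop_event

# Stand-in for BrickPiUpdateValues (an assumption for this demonstration).
calls = []
def fake_update():
    calls.append(time.time())

stop = start_keepalive(fake_update, interval=0.01)
time.sleep(0.1)      # the main program is free to do other work here
stop.set()
print(len(calls) > 1)
```

On the robot, passing BrickPiUpdateValues in place of fake_update and leaving the interval at 0.1 seconds should give the behaviour described above, while the main loop gets on with the image analysis.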

Once the motors are working you will have to build some sort of vehicle. I based mine on the classic Lego caster-bot, and BrickPi have a similar design called the Simplebot. This means I have a left and right motor, and by varying the speeds the bot can turn either as it moves forward or on the spot. You can run the motors with only a standard power source plugged in to the Raspberry Pi - this is how I use it for testing as long as the bot does not need to travel far! However, to have independence from wires you need to use batteries to power the Raspberry Pi and BrickPi. I have found that a pack of 8 rechargeable AA batteries (or 6 non-rechargeable) plugged in to the BrickPi board gives enough power to drive the motors and power the Raspberry Pi board for quite a while. However, other options are available.

However, we do not want the bot to follow a fixed program. Instead we want it to interpret what it sees, and behave accordingly. So make sure you have included the camera module in your bot and that it is mounted to look forward. Next take the video analysis loop from step 6 above and add it to the main loop of the Threaded_Motor_Test, then set the motor speeds appropriately each time a video frame is analysed, i.e. to turn left when the target is on the left of the screen, and right when it is on the right. The code gives a basic template, but there is much more that could be done.

Note: once you get into chasing objects, you need quick reactions. You will need to look through your code and make sure you are not doing anything unnecessary in the main loop, and that the image size (w,h) is as small as possible (so that image analysis is quick) but still large enough that objects can be reliably detected.
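One simple way to turn the target position into motor speeds is a proportional rule, sketched below. This is an illustrative assumption rather than the rule used in the project code: the function name steering_speeds, the base_speed and turn_gain values, and the idea that a positive speed drives each wheel forward are all made up for the example.

```python
def steering_speeds(target_x, frame_width, base_speed=200, turn_gain=100):
    """Map the target's horizontal position to (left, right) motor speeds.

    Hypothetical proportional rule: the further the target is from the
    centre of the frame, the bigger the speed difference between the wheels.
    """
    # offset is -1.0 at the left edge, 0.0 in the centre, +1.0 at the right.
    half = frame_width / 2.0
    offset = (target_x - half) / half

    left = base_speed + turn_gain * offset
    right = base_speed - turn_gain * offset

    def clamp(value):
        # BrickPi motor speeds must stay between -255 and 255.
        return int(max(-255, min(255, value)))

    return clamp(left), clamp(right)

print(steering_speeds(80, 160))    # target dead ahead: equal speeds
print(steering_speeds(20, 160))    # target on the left: right wheel faster
print(steering_speeds(150, 160))   # target on the right: left wheel faster
```

On the robot the two values would be written to BrickPi.MotorSpeed for whichever ports your left and right motors are plugged into. Keeping base_speed well above 100 follows the earlier advice that low speeds may not turn the motors under load.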

Raspberry Pi is a trademark of the Raspberry Pi Foundation