PixelLib – Segmenting Objects and Videos in Python


This tutorial is about applying object segmentation in five lines of code. We show a beginner-friendly code implementation using the PixelLib library on Google Colab.

Does machine learning have to be hard? Back when everything had to be coded from scratch, yes. But today we are standing on the shoulders of giants. Plenty of libraries help us apply machine learning more easily, so we don’t have to reinvent the wheel. In this tutorial, we are going to look at PixelLib, an easy-to-use library that segments objects in images and videos with five lines of code. Five lines of code for deep learning? Exactly. This article will take you from A to Z with code snippets and video complements. So read on and see for yourself 🙂

You can also check out this machine learning project with a video guide and advanced tips and tricks on the Finxter Computer Science Academy:

Install and Import Modules

Feel free to download the script for this tutorial from this GitHub repo. We will execute it on Google Colab and use its free GPU resources to run the pre-trained model. If you would like to try Google Colab, head over to the site and sign up using your Gmail account. It looks like a Jupyter Notebook, but its files are stored in your Google Drive. Upload the script to Google Colab and execute the blocks as we guide you through this article.

Execute the following command on a terminal or command prompt to clone any GitHub repo:

$ git clone https://github.com/username/project_name.git

The next step is to enable the GPU resource in our Colab environment. There are two ways to do this:

  • Method 1: Click on the Edit tab. Under Notebook settings, choose GPU from the drop-down, and click Save.
  • Method 2: Click on the Runtime tab. Under Change runtime type, choose GPU, and click Save.

Feel free to check out the video version of this tutorial for more in-depth explanations.

Now, install the necessary packages using pip:

$ pip install tensorflow-gpu pixellib --upgrade

As PixelLib relies on the TensorFlow library, we need to install that as well. If your execution environment has no GPU resource, omit the “-gpu” suffix from the installation command.
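To confirm that a GPU is actually visible before running the model, a quick check along these lines can help. This is a minimal sketch, not part of the original script: the helper name visible_gpus is made up for illustration, and it assumes TensorFlow 2.1 or newer, where tf.config.list_physical_devices is available.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf  # imported lazily so the check also runs without TensorFlow
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed yet.")
elif gpus:
    print("GPU(s) available:", gpus)
else:
    print("No GPU found; TensorFlow will fall back to the CPU.")
```

An empty list in Colab usually means the runtime type has not been switched to GPU yet.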

Now let’s import the required libraries:

import pixellib
from pixellib.instance import instance_segmentation
import matplotlib.pyplot as plt

We imported the main pixellib library as well as its instance_segmentation class. There you go, we just executed two out of five lines of code for object segmentation. We also imported the Matplotlib library to display images in our notebook. Those are all the libraries we need for this code project.

Although we focus only on instance segmentation, PixelLib has much more to offer, for example, background editing, semantic segmentation, and custom training. Check out its detailed GitHub repo to learn more.

Create Instance and Load Pretrained Model

To use instance segmentation, we must first create an instance of the class:

segment_image = instance_segmentation()

The new instance is assigned to a variable named segment_image. PixelLib takes a pre-trained model and performs object segmentation on a given input. It then returns a bounding box and a boundary for every detected object in the input.

We are going to use Mask R-CNN as the pre-trained model. Go to this resource page in the PixelLib GitHub repo and download the .h5 model file. We will also need a sample image and a sample video to test how well PixelLib performs. If you would like to use the same resources as this tutorial, they can be downloaded from this site. Also, check out this relevant video:

Short video of a busy city scene.

Also, feel free to use your own images and videos to test the library. Once you have the model and samples ready, upload them to Google Drive.

Now, go back to your Google Colab interface and mount the Google Drive to it so that you can access the files. To do that, click on the Google Drive icon at the left of the interface for mounting – as shown in Figure 1.

Figure 1: Mount Google Drive on Google Colab.

You will see a folder named “drive” appear in the file browser once it is mounted. You can also upload files directly to Google Colab, but that storage is not permanent: once you close your Google Colab session, the files will be gone the next time you open it.
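Mounting can also be done from a code cell. The sketch below assumes the google.colab.drive.mount helper that Colab provides; the mount point /content/drive and the function names mount_drive and missing_files are choices made here for illustration, not taken from the original script.

```python
import os


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab; return False anywhere else."""
    try:
        from google.colab import drive  # only importable inside a Colab runtime
    except ImportError:
        return False
    drive.mount(mount_point)  # Colab asks for authorization the first time
    return True


def missing_files(paths):
    """Return the paths that do not exist, so missing uploads are caught early."""
    return [path for path in paths if not os.path.exists(path)]


if mount_drive():
    # Sample image path as used in this tutorial; adjust to your own folder.
    print(missing_files(["/content/drive/MyDrive/bike-lane.jpg"]))
```

An empty list printed at the end means every file you listed was found on the mounted drive.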

Now you have access to the files. Time to reveal the fourth line of code:


The load_model() function takes a string path and uses the model for the segmentation work. Your folder location might be different than the sample code, so adjust accordingly.
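As an illustration of that step, here is a minimal sketch. The file name mask_rcnn_coco.h5 and the MyDrive folder are assumptions; substitute whatever name and location your downloaded model has. The helpers drive_path and load_segmenter are hypothetical names introduced here.

```python
from pathlib import PurePosixPath

# Assumed location: the folder where Colab exposes "My Drive" after mounting.
DRIVE_ROOT = PurePosixPath("/content/drive/MyDrive")


def drive_path(filename):
    """Build an absolute path to a file stored at the top level of Google Drive."""
    return str(DRIVE_ROOT / filename)


def load_segmenter(model_file="mask_rcnn_coco.h5"):
    """Create an instance_segmentation object and load the pre-trained weights."""
    from pixellib.instance import instance_segmentation  # needs pixellib installed

    segmenter = instance_segmentation()
    segmenter.load_model(drive_path(model_file))
    return segmenter


print(drive_path("mask_rcnn_coco.h5"))  # /content/drive/MyDrive/mask_rcnn_coco.h5
```

In a notebook, segment_image = load_segmenter() would then combine the instantiation and model-loading steps.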

Object Segmentation on Images

Let’s display the sample image using Matplotlib and see what it looks like:

plt.figure(figsize=(26, 18))
im_bike_lane = plt.imread("/content/drive/MyDrive/bike-lane.jpg")
plt.imshow(im_bike_lane)

We defined a plotting figure of size 26 inches in width and 18 inches in height. Then, we loaded and displayed the sample image using the imread() and imshow() functions. Figure 2 shows the image, which is a normal street view consisting of people, cars, bicycles, etc.

Figure 2: Our sample image, a street view photo (source).

OK, now comes the code for instance segmentation. Execute the following line:

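segment_image.segmentImage("/content/drive/MyDrive/bike-lane.jpg", show_bboxes=True, extract_segmented_objects=False, save_extracted_objects=False,
       output_image_name="/content/drive/MyDrive/bike-lane_segmented.jpg")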

We set the configurations for segmentImage() and let it do the heavy lifting. The show_bboxes config is set to True because we want every detected object to have a bounding box. We set extract_segmented_objects to False so that objects are not extracted as individual images, and save_extracted_objects to False so that they are not saved as separate files. The output_image_name config is the path where you want the new image to be saved. Figure 3 shows the processed image with all detected objects.

Figure 3: The sample image overlaid with segmented object boundaries and bounding boxes.

Not bad! Without any model training or tuning, PixelLib managed to detect objects in the image. Categories such as cars, bicycles, persons, and traffic lights are detected. If you choose to save the object segments, you will see object crop-outs saved as individual files.
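segmentImage() also returns the raw detections, which is handy for checking what was found without opening the output image. The sketch below assumes the first returned value is a dictionary with class_ids and scores entries, as in the Mask R-CNN results that PixelLib builds on; verify the keys against your installed version. The summarize_detections helper and the sample values are made up for illustration.

```python
from collections import Counter


def summarize_detections(segmask, min_score=0.5):
    """Count detections per class id, ignoring those scored below min_score."""
    pairs = zip(segmask["class_ids"], segmask["scores"])
    return Counter(int(class_id) for class_id, score in pairs if score >= min_score)


# Made-up values standing in for a real segmentImage() result.
example = {"class_ids": [1, 1, 3, 2], "scores": [0.99, 0.42, 0.87, 0.91]}
print(summarize_detections(example))
```

Raising min_score is a simple way to hide low-confidence detections when the scene is crowded.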

What if we want to segment only one, or a handful of categories instead of all categories? That can also be easily configured, as follows:

target_classes = segment_image.select_target_classes(person=True)
target_classes = segment_image.select_target_classes(car=True,bicycle=True)


We gave the select_target_classes() function the names of the categories to segment. The resulting configuration is assigned to the target_classes variable. The first call uses the config “person=True”, so only objects that are predicted as Person in the image will be segmented. Likewise, the second call sets the Car and Bicycle categories to True, which means we want to segment only these two kinds of objects. Figure 4 shows the segmented image with the target class Person. Figure 5 shows the one with target classes Car and Bicycle.
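The selection only takes effect once it is handed to segmentImage(). A minimal sketch of that step follows, assuming the segment_target_classes keyword that PixelLib's segmentImage() accepts for this purpose; the wrapper segment_only and any output file names are hypothetical.

```python
def segment_only(segmenter, image_path, output_path, **classes):
    """Segment only the given classes, e.g. person=True or car=True, bicycle=True."""
    target_classes = segmenter.select_target_classes(**classes)
    return segmenter.segmentImage(
        image_path,
        segment_target_classes=target_classes,  # restrict output to the chosen classes
        show_bboxes=True,
        output_image_name=output_path,
    )
```

Here segmenter is an instance_segmentation object with the model already loaded. Calling the helper once with person=True and once with car=True, bicycle=True, each with its own output path, keeps the two results from overwriting each other.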

Figure 4: Segmented objects predicted with the label “Person”.
Figure 5: Segmented objects predicted with labels “Car” and “Bicycle”.

So, what do you think about the object segmentation ability of PixelLib? Impressive, isn’t it? Objects are detected accurately according to the labels. Boundaries are defined precisely around the detected objects. Some objects are detected accurately even when half of their shapes are occluded.

Object Segmentation on Videos

So far, so good on images. Now let’s see how PixelLib performs for video segmentation. Figure 6 shows a screenshot of the video, which is another street view with humans and vehicles.

Figure 6: Our sample video, a recording of street view (source).

The code for video instance segmentation is similar to the code for image segmentation. Surely you get the drill by now with the straightforward function names. Execute the following code:

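segment_video = instance_segmentation()
segment_video.load_model(...)  # path to the same .h5 model file as before
segment_video.process_video(...,  # path to the sample video, then the same configs as for the image
       frames_per_second=5)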

We created another instance_segmentation instance and assigned it to the segment_video variable. The same pre-trained model is loaded using the load_model() function. We used the process_video() function and gave it the same configs. The additional frames_per_second config is the frame rate of the video to be saved.
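Put together, the video step might look like the sketch below. It uses the process_video() keywords show_bboxes, frames_per_second, and output_video_name; the function name segment_video_file is hypothetical, and the file paths are left as parameters because the original paths are not shown.

```python
def segment_video_file(segmenter, video_path, output_path, fps=5):
    """Run instance segmentation on a video and save the result as a new video."""
    return segmenter.process_video(
        video_path,
        show_bboxes=True,
        frames_per_second=fps,  # frame rate of the saved video
        output_video_name=output_path,
    )
```

As before, segmenter is expected to be an instance_segmentation object with the model already loaded.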

You might notice that PixelLib took much longer to process the video than the image. That is normal because a video is a sequence of images. PixelLib processes the video frame by frame and performs object segmentation on each one. Figure 7 shows a screenshot of what a segmented video frame looks like.
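A back-of-the-envelope estimate makes the difference concrete. The numbers below are purely illustrative (a 30-second clip at 30 frames per second and half a second of processing per frame); measure your own per-frame time to get a realistic figure.

```python
def estimate_processing_seconds(duration_s, source_fps, seconds_per_frame):
    """Estimate total time when every frame is segmented one after another."""
    total_frames = round(duration_s * source_fps)
    return total_frames * seconds_per_frame


# Illustrative numbers only: 30 s of video at 30 fps, 0.5 s per frame.
print(estimate_processing_seconds(30, 30, 0.5))  # 450.0
```

Under these assumptions, a half-minute clip takes several minutes, which is why a single image feels instant by comparison.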

Figure 7: Our sample video with object segmentation.

As expected, the library accurately detected objects in the video. That’s a ready-to-use object segmentation library without any hyperparameter tweaking! If you think PixelLib is useful, check out its repo to explore more functionalities.


That’s it! You have learned that there exists a well-performing object segmentation library. And it can be implemented in five lines of code. Another useful tool to add to your machine learning skillset. I hope this was a fun learning process for you! If you encounter any issues and would like an in-depth walkthrough of the code, the video explanation is there to help you out. Happy learning!
