Building a Computer Vision rig with SimpleCV and IPython

November 3, 2014

A project of mine requires some image processing and OCR, which has led me into the interesting world of computer vision.

My initial research led me pretty quickly to OpenCV, which is used by Google, among others, for all things computer vision. However, it seemed overkill for my use case, with a steeper learning curve than I was willing to start with, so I decided to build my knowledge from scratch first.

After playing around in R, using tools like Tesseract and ImageMagick, and trying some different cutting, projection and learning techniques, I finally came across the library I had been looking for. SimpleCV is a Python library designed to be easy to use, with wrappers for the big libraries like OpenCV and Tesseract. I also recommend the ebook, which doesn’t take long to read but gets you up to speed quickly on the major fundamentals of computer vision (and how to use them…).

Take a look at this video for a demonstration by one of the developers:

I’ve been using the library for a while now, in particular the blob detection and feature extraction elements, to create features to feed a scikit-learn based SVM model, with very good results.


I was particularly interested in extracting features from images and using a support vector machine to classify those images into different segments. SimpleCV comes with a load of functionality that makes loading images and extracting features very easy, so I was pretty pleased when I found it.
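The classification side of that pipeline looks roughly like this. A minimal sketch using scikit-learn: the feature vectors and labels here are made-up stand-ins for whatever you extract from your images (hue histograms, edge histograms, blob statistics and so on).

```python
from sklearn import svm

# Stand-in feature vectors -- in practice these would come from
# feature extractors run over your training images.
X_train = [[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]]
y_train = [0, 0, 1, 1]  # segment label for each training image

# Train a support vector machine on the extracted features
clf = svm.SVC(kernel='linear')
clf.fit(X_train, y_train)

# Classify a new image's feature vector
print(clf.predict([[0.95, 0.9]]))  # predicted segment label
```

The real work is in choosing features that separate your segments well; once you have them as fixed-length vectors, swapping classifiers in scikit-learn is a one-line change.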

However, that changed temporarily when I tried to install it. Despite an ‘easy to install’ superpack, getting the full experience working nicely in IPython Notebook turned out to be a pretty painful affair. I won’t go into all the details, but I’ll highlight a couple of the bigger issues and their solutions.

  • First off, it’s up to you, but I recommend installing the whole rig into a Python virtualenv, or, as in my case, onto a Linux virtual machine. The main reason for isolating it is that there are a lot of dependencies, so you risk messing up your normal workspace if you install everything natively.
  • On top of that, I wanted my setup to be portable across the computers I use (which run different operating systems), and the virtual machine allows that. It also seems marginally easier to set things up on a Linux host (I’m using Ubuntu GNOME), using apt-get for most of the big dependencies.
  • I started by installing a fresh copy of Anaconda, a Python distribution that ships with the major data science libraries like NumPy and SciPy.
  • I chose to install the 32-bit version of Anaconda, even though I was on a 64-bit machine, because most of the packages were far easier to install as 32-bit.
  • This one may be obvious, but it’s worth mentioning: I have to go through a web proxy at work. Normally I set this at the system level, and most programs know to use the system proxy. However, pip, the Python package manager, is an exception to that rule: you have to pass it the proxy details explicitly via its --proxy argument.
  • At one point after installation, I was getting errors on the SimpleCV hello world example. The cause was that my installation had omitted the ‘sampleimages’ folder, which turns out to be critical for running most things: it includes a logo which, if missing, causes the display function to crash.
  • When trying to get images to display in IPython Notebook:
    • Make sure you launch notebook with the ubiquitous ‘simplecv notebook pylab==inline’
    • Include the line ‘%matplotlib inline’ in one of your first notebook cells.
      %matplotlib inline
      disp = Display(displaytype='notebook')
      init_options_handler.enable_notebook()
    • Then when you want to show images, instead of using the .show() method, use:
      img.save(disp)

      This should make your images appear in line as desired!

  • On one installation, even after all this, I was getting the following error when I tried to display an image:
    IOError: [Errno 22] invalid mode ('rb') or filename: u'/tmp/e:\\temp\\tmprch1le.png'

    To fix this, I followed the stack trace to the ImageClass.py file and looked at the save method. I found the offending line at around line 2019:

    loc = '/tmp/' + tf.name.split('/')[-1]

    I changed this to the following, and everything started working…

    loc = tf.name.split('/')[-1]

    Note: I also spotted the same pattern in a different override of the save method, so changed it there too…

  • Finally, don’t go to all the bother of installing the Orange machine learning dependencies; just use scikit-learn, which has everything you should need.
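To see why that save method broke, note that the temp file name in my error came from a Windows-style path, which split('/') leaves untouched, so the whole backslash-separated path gets glued onto '/tmp/'. A stand-alone illustration, using the file name from the error message above:

```python
import ntpath

# The temp file name as it appeared on the broken installation
name = 'e:\\temp\\tmprch1le.png'

# The original line: splitting on '/' does nothing to a
# backslash-separated path, so the whole thing is appended
broken = '/tmp/' + name.split('/')[-1]
print(broken)  # -> /tmp/e:\temp\tmprch1le.png (the invalid path from the error)

# The edit described above simply drops the '/tmp/' prefix
fixed = name.split('/')[-1]

# A more portable alternative would be ntpath.basename, which
# understands Windows separators regardless of the host OS
print(ntpath.basename(name))  # -> tmprch1le.png
```

This is only an illustration of the failure mode; the actual two-character fix in ImageClass.py is the one shown earlier.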

Hopefully these steps will help, and the result is a great set of tools. Most functionality works with video as well as images, and there are other functions for detecting movement. It also ships with some pre-trained Haar cascades, which can be used to quickly find faces and facial keypoints in images and videos. I wonder if I can make my Sonos turn on by winking at my webcam…
