Sunday, November 23, 2014

Practical Python and OpenCV: a review (part 1)

A few weeks ago, I purchased the premium bundle of Practical Python and OpenCV. It consists of a pdf book (Practical Python and OpenCV) together with the short Python programs explained in the book, a Case Studies bundle that also consists of a pdf book and Python programs, and an Ubuntu VirtualBox virtual machine with all the computer vision and image processing libraries pre-installed.

In this post, I will give my first impressions after going through the Practical Python and OpenCV part of the bundle.  My intention at this point is to cover the Case Studies part in a future post, and to conclude with a review of the Ubuntu VirtualBox, including some annoying Windows-specific problems I encountered when I attempted to install and use it and, more importantly, the solutions to these problems. (In short: my Windows 8 laptop came with BIOS settings that prevented such VirtualBoxes from working - something that may be quite common.)

The Practical Python and OpenCV pdf book (hereafter designated simply by "this book") consists of 12 chapters.  Chapter 1 is a brief introduction motivating the reader to learn more about computer vision.  Chapter 2 explains how to install NumPy, SciPy, Matplotlib, OpenCV and Mahotas.  Since I used the virtual Ubuntu machine, I skipped that part.  If I had to install these (which I may in the future), I would probably install the Anaconda Python distribution (which I have used on another computer before), as it already includes the first three packages mentioned above in addition to many other useful Python packages not included in the standard distribution.

Chapter 3 is a short chapter that explains how to load, display and save images using OpenCV and friends.  After reading the first 3 chapters, which numerically represent one quarter of the book, I was far from impressed by the amount of useful material covered.  This view was reinforced by the fourth chapter (Image Basics, explaining what a pixel is, how to access and manipulate pixels, and the RGB color notation) and by the fifth chapter, which explains how to draw simple shapes (lines, rectangles and circles).

However, and this is important to point out, Chapter 5 ends at page 36 ... which is only one quarter of the way through the book.  In my experience, most books produced "professionally" tend to have chapters of similar length (except for the introduction), so that one gets a subconscious impression of the amount of material covered in an entire book by reading a few chapters.  By contrast, the author here seems to have focused on defining a chapter as a set of closely related topics, with little regard to the amount of material (number of pages) included in a given chapter.  After reading the entire book, this decision makes a lot of sense to me - even though it initially gave me a negative impression (Is that all there is? Am I wasting my time?) as I was reading the first few chapters.  So, if you purchased this book as well, and stopped reading before going through Chapter 6, I encourage you to resume your reading...

Chapter 6, Image Processing, is the first substantial chapter.  It covers topics such as image transformations (translation, rotation, resizing, flipping, cropping), image arithmetic, bitwise operations, masking, and splitting and merging channels, and concludes with a short section on color spaces which I would probably have put in an appendix.  As everywhere else in the book, each topic is illustrated by a simple program.
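To give a flavor of these operations, here is a minimal sketch that is not taken from the book. It assumes only NumPy and a small synthetic image, so it runs without OpenCV installed; OpenCV has its own functions for the same jobs (for example cv2.flip, cv2.resize and cv2.warpAffine).

```python
import numpy as np

# Synthetic 4x6 image with 3 channels (rows x cols x channels); values are arbitrary.
image = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)

flipped_h = image[:, ::-1]   # horizontal flip (mirror left-right)
flipped_v = image[::-1, :]   # vertical flip (mirror top-bottom)
cropped = image[1:3, 2:5]    # rows 1-2, columns 2-4

# Translation by (dx, dy) = (2, 1): shift content right and down, pad with zeros.
dx, dy = 2, 1
translated = np.zeros_like(image)
translated[dy:, dx:] = image[:-dy, :-dx]

# Masking: keep only the pixels inside a rectangle, set everything else to zero.
mask = np.zeros(image.shape[:2], dtype=bool)
mask[1:3, 1:4] = True
masked = np.where(mask[:, :, None], image, 0).astype(np.uint8)

# Splitting into channels and merging them back.
c0, c1, c2 = image[:, :, 0], image[:, :, 1], image[:, :, 2]
merged = np.dstack([c0, c1, c2])

# Image arithmetic: plain uint8 addition wraps around, so clip explicitly
# when a saturating result is wanted.
a = np.array([200], dtype=np.uint8)
b = np.array([100], dtype=np.uint8)
wrapped = a + b                                                       # 300 mod 256
saturated = np.clip(a.astype(np.int16) + b, 0, 255).astype(np.uint8)  # capped at 255
```

The slicing versions are handy for seeing what each operation does to the underlying array; rotation and resizing need interpolation and are better left to the library.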

Chapter 7 introduces color histograms, explaining what they are and how to manipulate them to change the appearance of an image.
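As an illustration, here is a rough NumPy-only sketch that computes one histogram per channel and then applies histogram equalization to a low-contrast grayscale image. Equalization is my choice of example for "manipulating" a histogram, and the random test images are assumptions for illustration, not the book's material.

```python
import numpy as np

rng = np.random.default_rng(0)

# One 256-bin histogram per channel of a random color image.
color = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
hists = [np.bincount(color[:, :, c].ravel(), minlength=256) for c in range(3)]

# A low-contrast grayscale image: all values squeezed into [100, 150).
gray = rng.integers(100, 150, size=(64, 64), dtype=np.uint8)

# Histogram equalization: map each gray level through the normalized
# cumulative histogram so the output spans the full 0-255 range.
hist = np.bincount(gray.ravel(), minlength=256)
cdf = hist.cumsum()
cdf_min = cdf[cdf > 0].min()
lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255)
lut = lut.clip(0, 255).astype(np.uint8)
equalized = lut[gray]
```

After equalization the darkest pixel maps to 0 and the brightest to 255, which is what stretches the contrast.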

Chapter 8, Smoothing and Blurring, explains four simple methods (simple averaging, Gaussian, median and bilateral) used to perform smoothing and blurring of images.
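To make the difference between averaging and median filtering concrete, here is a small sketch written with NumPy only. The helper names box_blur and median_blur are hypothetical (they are not OpenCV functions), and replicating the edge pixels is just one possible border choice.

```python
import numpy as np

def box_blur(gray, k=3):
    """Simple averaging over a k x k neighborhood (k odd), edges replicated."""
    h, w = gray.shape
    pad = k // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    total = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            total += padded[dy:dy + h, dx:dx + w]
    return np.round(total / (k * k)).astype(np.uint8)

def median_blur(gray, k=3):
    """Median over a k x k neighborhood (k odd), edges replicated."""
    h, w = gray.shape
    pad = k // 2
    padded = np.pad(gray, pad, mode="edge")
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(k) for dx in range(k)])
    return np.median(windows, axis=0).astype(gray.dtype)

# A black image with a single white "salt" pixel in the middle.
noisy = np.zeros((5, 5), dtype=np.uint8)
noisy[2, 2] = 255

averaged = box_blur(noisy)      # the spike is smeared over its neighbors
cleaned = median_blur(noisy)    # the spike is removed entirely
```

Averaging spreads the outlier around, while the median discards it, which is why median filtering is the usual choice for salt-and-pepper noise.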

Chapter 9, Thresholding, covers three approaches (simple, adaptive, and the automatic Otsu and Ridler-Calvard methods) to do thresholding.  What is thresholding?... It is a way to separate the pixels of a given image into two categories (think black or white).  This can be used as a preliminary step to identify individual objects in an image and focus on them.
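Here is a minimal sketch of simple thresholding and of Otsu's method, using NumPy only. The function otsu_threshold is a hypothetical helper written for this illustration, and the two-level test image is an assumption; this is not the book's code.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the gray level t that maximizes the between-class variance
    when pixels are split into the classes (<= t) and (> t)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)               # weight of the "<= t" class
    w1 = 1.0 - w0                      # weight of the "> t" class
    mu_cum = np.cumsum(prob * levels)  # cumulative mean up to t
    mu_total = mu_cum[-1]
    denom = w0 * w1
    between = np.zeros(256)
    valid = denom > 1e-12              # skip splits where one class is empty
    between[valid] = (mu_total * w0[valid] - mu_cum[valid]) ** 2 / denom[valid]
    return int(np.argmax(between))

# Bimodal image: dark background (50) on the left, bright object (200) on the right.
gray = np.full((10, 10), 50, dtype=np.uint8)
gray[:, 5:] = 200

# Simple thresholding with a hand-picked value.
manual = np.where(gray > 128, 255, 0).astype(np.uint8)

# Otsu picks the threshold automatically from the histogram.
t = otsu_threshold(gray)
automatic = np.where(gray > t, 255, 0).astype(np.uint8)
```

On an image with two clearly separated gray levels, any threshold between them gives the same black-and-white result; the value of the automatic methods shows up when the right cut-off is not obvious by eye.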

Chapter 10 deals with Gradients and Edge Detection.  Two approaches are introduced: gradient operators (Laplacian and Sobel) and the Canny edge detector.  This is a prelude to Chapter 11, which uses these techniques to count the number of objects in an image.
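For readers who have never seen a gradient operator, here is a short sketch of the Sobel gradient using NumPy only. The helper filter3x3 is a hypothetical name introduced for this illustration, and the test image with a single vertical edge is an assumption; in practice one would call the library routines (cv2.Sobel, cv2.Laplacian, cv2.Canny).

```python
import numpy as np

def filter3x3(gray, kernel):
    """Correlate a 2-D image with a 3x3 kernel, replicating edge pixels."""
    h, w = gray.shape
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

# Test image: black on the left, white on the right (edge between columns 2 and 3).
gray = np.zeros((6, 6), dtype=np.uint8)
gray[:, 3:] = 255

gx = filter3x3(gray, SOBEL_X)    # responds to horizontal intensity changes
gy = filter3x3(gray, SOBEL_Y)    # responds to vertical intensity changes
magnitude = np.hypot(gx, gy)

# Crude edge map: keep pixels whose gradient is at least half the maximum.
edges = magnitude > 0.5 * magnitude.max()
```

The computation is done in floating point because gradients can be negative or larger than 255, so an 8-bit result would lose information.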

Chapter 12 is a short conclusion.

After going (quickly) through the book, I found that every individual topic was well illustrated by at least one simple example (program) showing the original image and the expected output.  Since the source code and images used are included with the book, it was really easy to reproduce the examples and do further exploration, either using the same images or using my own.  Note that I have not (yet) tried all the examples, but all those I tried ran exactly as expected and are explained in sufficient detail that they are very straightforward to modify for further exploration.

For the advanced topics, you will not find any theoretical derivation (read: math) of the various techniques: this is a book designed for people having at least some basic knowledge of Python and who want to write programs to do image manipulation; it is not aimed at researchers or graduate students in computer vision.

At first glance, one may think that asking $22 for a short (143 pages) ebook with code samples and images is a bit on the high side as compared with other programming ebooks, taking into account how much free material is already available on the Internet. For example, I have not read (yet) any of the available tutorials on the OpenCV site ... However, I found that the very good organization of the material in the book, the smooth progression of topics introduced and the number of useful pointers (e.g. NumPy arrays are indexed as rows X columns, i.e. image[y, x], unlike the (x, y) order usually used for image coordinates; OpenCV stores images in the order Blue Green Red, as opposed to the traditional Red Green Blue, etc.) make it very worthwhile for anyone who would like to learn about image processing using OpenCV.
I should also point out that books on advanced topics (such as computer vision) tend to be much pricier than the average programming book.  So the asking price seems more than fair to me.
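The two pointers about axis order and channel order are easy to check for yourself. Here is a minimal sketch that assumes a synthetic NumPy array standing in for an image loaded by OpenCV; the 480 x 640 size and the pixel position are arbitrary choices for illustration.

```python
import numpy as np

# Stand-in for a loaded image: 480 pixels high, 640 wide, channels in BGR order.
image = np.zeros((480, 640, 3), dtype=np.uint8)
image[:, :, 0] = 255             # channel 0 is blue in BGR, so this image is pure blue

# The shape is (rows, columns, channels), i.e. (height, width, channels).
height, width, channels = image.shape

# The pixel at x=10, y=20 is addressed row first: image[y, x].
blue, green, red = image[20, 10]

# Reversing the last axis converts BGR to RGB, e.g. before handing
# the array to a library that expects RGB.
rgb = image[:, :, ::-1]
```

Mixing up either convention does not raise an error; it just produces a transposed index or swapped colors, which is why the reminders are worth having.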

If you are interested in learning about image processing using OpenCV (and Python, of course!), I would tentatively recommend this book.  I write "tentatively" because I have not yet read the Case Studies book: it could well turn out that my recommendation would be to purchase both as a bundle.  So, stay tuned if you are interested in this topic.

3 comments:

istudy said...

Thanks for providing your opinion, very helpful post.

coze said...

Interesting. I'm interested in the book, he did a real good job presenting it on the website. I guess your review is about the 1st edition, and the 2nd edition, which is out now and covers OpenCV 3, is probably even better.

However he seems to have bumped the prices to $47 for just the books, and $94 with the ubuntu image. Which is kind of outrageous. Gonna go check free sources first and Packtlib. Quality of books on packtlib is not on par, but I got a YEARLY subscription for like $50, which gives you access to everything (including videos) packt has published for one year. which is a pretty good deal imho.

ராமன் ராஜா said...

I have read the whole book and done the exercises. My recommendation: The book is good for beginners, so go for it. But the Ubuntu virtual machine is unnecessary. It is quite easy to install Python XY or Anaconda along with OpenCV.
I use a Windows 7 PC with Python 2.7 and OpenCV 3.0. They worked after some initial hiccups. (Tip: Do not change the default directories for installation, and set the system PATH yourself. Copy the file cv2.pyd from OpenCV to your \Lib\site-packages folder).