Python Lesson 42: Intro To Computer Vision Part One-Reading Images (AI pt. 8)

Hello everybody,

Michael here, and today’s Python lesson looks to be quite a bit of fun! Wonder why?

We’re introducing a new topic today-computer vision!

Now, one of the Python concepts I’ve covered over the course of this blog’s run is NLP-or natural language processing, which is a form of AI. Computer vision (or CV for short) is also a form of AI, but instead of using text language, computer vision deals with images.

An intro to OpenCV

To further explore computer vision in Python, we’ll introduce a package called OpenCV. If you don’t already have this package installed, here’s the line of code to run (either on your IDE or command prompt) to install it:

pip install opencv-python
  • Yes, you’ll need to install the opencv-python package. Using the line pip install opencv won’t work.
  • If you’re pip installing this or any other package from inside a notebook-style IDE (such as Jupyter), include the ! before the pip install line.

Once we get our package installed, let’s start exploring the fun stuff it can do!

And now, time to explore OpenCV

The OpenCV concepts we’ll explore in this post are two of its simpler functions-reading an image onto the IDE and displaying that image onto the IDE!

Before we begin, here’s the image I’ll be using for this section in case you want to follow along with this tutorial.

  • Regular readers of Michael’s Programming Bytes will likely recognize this cat as Simba/The Orange Boy who (along with his sister Marbles) got some well-deserved recognition in the second part of my fifth anniversary post (The Glorious Five-Year Plan Part Two).

Now to read the image into Python, here’s the code we’ll use for this tutorial:

import cv2

cat = cv2.imread(r'C:\Users\mof39\Downloads\IMG_5427.jpg', cv2.IMREAD_COLOR)
cv2.imshow("image", cat)
cv2.waitKey(60000)
cv2.destroyAllWindows()

Not sure what all this code means? Let’s break it down:

  • To use Python’s OpenCV package, you’d need to run the line import cv2, not import opencv.
  • The cv2.imread() function takes two parameters-the path to your image and the mode you want to use to read the image into the IDE. As of this writing, OpenCV has 13 different modes to read an image into the IDE! The IMREAD_COLOR mode reads the image as a standard 3-channel color image (it’s also the default if you leave the mode out).
  • The cv2.imshow() function also takes two parameters-the name of the window (the string "image" here, though any name works) and the image variable (cat in this case). Unlike the cv2.imread() function, this function displays the image (in this case, a window will pop up).
  • The cv2.waitKey() function takes an integer number of milliseconds as a parameter. It pauses the program for up to that long while waiting for a key press, and it also gives the image window time to draw. I used 60000 milliseconds in this case (equal to 1 minute), so the program moves on after a minute or as soon as a key is pressed. Passing 0 waits for a key press indefinitely.
  • The cv2.destroyAllWindows() function takes no parameters since all it does is, well, destroy all open OpenCV windows. Because it comes right after cv2.waitKey(), it runs once the wait is over.

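Don’t skip the cv2.waitKey() call, though. OpenCV’s documentation says cv2.imshow() should be followed by cv2.waitKey() (or cv2.pollKey()) so the window actually gets drawn and responds to input; without it, the window may come up blank or freeze. The cv2.destroyAllWindows() call is the tidy-up step, and if you leave it out, the window simply stays open until you close it or the program ends.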

Troubleshooting and the magic of OpenCV colorscales

Now here’s what the image looks like after it’s read into the IDE with OpenCV:

I’ll be honest, even though OpenCV did successfully read and display the image, it didn’t do a good job of displaying it (the window didn’t show the entire image). How can we fix this? Take a look at the code below:


import matplotlib.pyplot as plt
cat=cv2.imread(r'C:\Users\mof39\Downloads\IMG_5427.jpg', cv2.IMREAD_COLOR)
plt.figure(figsize=(8,8))
plt.imshow(cat)

Now, let’s see what kind of output we get:

OK, so what did we do differently? Well, we used a combination of the Matplotlib* and OpenCV packages to read our image onto the IDE and display it as well. While that combination of packages did the trick when it came to displaying the entire image on the IDE, you’ll notice that the cat (along with most other things in the image) looks rather blue.

*The plt.imshow() function is a Matplotlib function.

Why might that be? After all, we did use the IMREAD_COLOR mode to read the image into the IDE, so how did we get this result? Take a look at the revised code below for a solution to this issue:

cat=cv2.imread(r'C:\Users\mof39\Downloads\IMG_5427.jpg', cv2.IMREAD_COLOR)
cat=cv2.cvtColor(cat, cv2.COLOR_BGR2RGB)
plt.figure(figsize=(8,8))
plt.imshow(cat)

Here’s the output we get:

I used most of the same code as I did for the previous example, with one additional line-the line that uses the cv2.cvtColor() function.

Why did I need to use this function? Well, if you’re wondering why Orange Boy looked a little blue in the previous image, that’s because OpenCV read the image in the BGR colorscale, which is the OpenCV default method of reading images.

The BGR colorscale stands for blue, green, red colorscale, meaning OpenCV stores each pixel’s three color values in the order blue, green, red. Matplotlib’s plt.imshow() function, on the other hand, assumes the order is red, green, blue, so it treated every blue value as red and every red value as blue. Since an orange cat has lots of red and very little blue, swapping the two made him look blue upon display.

Now, here’s where the cv2.cvtColor() function comes in! This function takes two parameters-the image whose colorscale you want to convert and the conversion mode you want to use for the image. There are well over a hundred conversion modes to choose from, but in this example, we’ll use the COLOR_BGR2RGB conversion mode, which changes the image’s colorscale from BGR to RGB (or red, green, blue). After running this function, run the plt.imshow() function on the color-converted Orange Boy image so that you can see the normal-looking image, not the Blue Boy image.
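
To see the BGR-to-RGB swap on a single pixel, here’s a small sketch that uses only NumPy (OpenCV images in Python are NumPy arrays). The orange pixel value is a made-up example, and reversing the last axis is just one way to mimic what COLOR_BGR2RGB does for a 3-channel image, not necessarily how OpenCV does it internally.

```python
import numpy as np

# One hypothetical orange pixel, stored in OpenCV's blue, green, red order.
orange_bgr = np.array([[[0, 128, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the blue and red values, giving RGB order.
orange_rgb = orange_bgr[..., ::-1]

print(orange_bgr[0, 0].tolist())  # [0, 128, 255]
print(orange_rgb[0, 0].tolist())  # [255, 128, 0]
```

If the BGR values were handed to plt.imshow() unchanged, Matplotlib would read them as red=0, green=128, blue=255, which is a shade of blue rather than orange.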

Now, why does OpenCV read in images with a BGR colorscale by default? Back when OpenCV was first released to the public (in 2000), the BGR colorscale was the more popular colorscale to use for computer graphics. However, the RGB colorscale has since become the more widely-adopted colorscale for computer graphics (including Python packages like Matplotlib), and since Matplotlib expects RGB, converting to it is what makes images display correctly there.

Another colorscale conversion, why not?

Now that I’ve taught you the basics of reading images into the Python IDE with the OpenCV package, let’s have a little fun with the colorscale conversions, shall we? Take a look at the code below:

cat=cv2.imread(r'C:\Users\mof39\Downloads\IMG_5427.jpg', cv2.IMREAD_COLOR)
cat=cv2.cvtColor(cat, cv2.COLOR_BGR2RGB)
cat=cv2.cvtColor(cat, cv2.COLOR_RGB2HSV)
plt.figure(figsize=(8,8))
plt.imshow(cat)

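In this example, I converted the image of Orange Boy first to the RGB colorscale and then to the HSV (hue, saturation, value) colorscale. As you can see, the cat looks like something out of a thermal camera. That’s because plt.imshow() still treats the three values of each pixel as red, green, and blue, even though they now hold hue, saturation, and value.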

  • If you convert to RGB first like I did, use the COLOR_RGB2HSV mode afterwards so that the conversion mode matches the image’s current channel order. Running cv2.COLOR_BGR2HSV directly on the freshly read image gives the same HSV result in one step.
  • The cv2.cvtColor() function contains about 150 different color conversion modes!
  • Just in case you’re wondering what the HSV colorscale is, here’s a simple three-bullet-point explanation:
      • H stands for hue, which represents the type of color the image contains. The value for H is represented on the color wheel using a value from 0-360 degrees (as in angle degrees).
      • S stands for saturation, which represents the intensity (or well, saturation) of the color. The value for S is represented on a scale from 0% (fully desaturated pure grey) to 100% (fully saturated pure color).
      • V stands for value (or brightness), which represents how bright or dark the color will appear in the image. The value for V, just as with S, is represented on a scale from 0% (fully black) to 100% (fully illuminated color).
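
To put numbers on hue, saturation, and value, here’s a small sketch that uses Python’s built-in colorsys module rather than OpenCV. The orange color is a made-up example. One thing to keep in mind is that for 8-bit images, OpenCV stores H as 0-179 (the angle divided by 2, so it fits in a byte) and S and V as 0-255, which the last lines mimic.

```python
import colorsys

# A hypothetical orange, with red, green, blue given as fractions from 0 to 1.
h, s, v = colorsys.rgb_to_hsv(1.0, 0.5, 0.0)

hue_degrees = round(h * 360)     # position on the color wheel
saturation_pct = round(s * 100)  # 0% is grey, 100% is pure color
value_pct = round(v * 100)       # 0% is black, 100% is fully bright

print(hue_degrees, saturation_pct, value_pct)  # 30 100 100

# The same color rescaled the way OpenCV stores HSV in 8-bit images.
opencv_style = (hue_degrees // 2, round(s * 255), round(v * 255))
print(opencv_style)  # (15, 255, 255)
```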

Thanks for reading and be sure to stay tuned for Part Two of Intro to Computer Vision!

Michael
