November 2023 - Michael's Programming Bytes

Python Lesson 47: Image Rotation (AI pt. 13)

Let’s rotate an image!

First off, let’s figure out how to rotate images with OpenCV. Here’s the image we’ll be working with in this example:

This is an image of the Jumbotron at First Horizon Park in Nashville, TN, home ballpark of the Nashville Sounds (Minor League Baseball affiliate of the Milwaukee Brewers).

Before we rotate anything, let’s read in our image and convert it to RGB colorscale (OpenCV reads images in BGR order by default):

import cv2
import matplotlib.pyplot as plt

ballpark=cv2.imread(r'C:\Users\mof39\Downloads\20230924_140902.jpg', cv2.IMREAD_COLOR)
ballpark=cv2.cvtColor(ballpark, cv2.COLOR_BGR2RGB)
plt.figure(figsize=(10, 10))
plt.imshow(ballpark)

Now, how do we rotate this image? Let’s start by analyzing a 90-degree clockwise rotation:

clockwiseBallpark = cv2.rotate(ballpark, cv2.ROTATE_90_CLOCKWISE)
plt.figure(figsize=(10, 10))
plt.imshow(clockwiseBallpark)

All it takes to rotate an image in OpenCV is the cv2.rotate() method and two parameters: the image you wish to rotate and one of the following OpenCV rotation codes (more on these soon):

cv2.ROTATE_90_CLOCKWISE (rotates image 90 degrees clockwise)
cv2.ROTATE_180 (rotates image 180 degrees clockwise)
cv2.ROTATE_90_COUNTERCLOCKWISE (rotates image 90 degrees counterclockwise, which is the same as 270 degrees clockwise)

Let’s analyze the image rotation with the other two OpenCV rotation codes. First off, the ballpark image rotated 180 degrees clockwise:

clockwiseBallpark = cv2.rotate(ballpark, cv2.ROTATE_180)
plt.figure(figsize=(10, 10))
plt.imshow(clockwiseBallpark)

Alright, pretty impressive. It’s an upside down Jumbotron!

Now to rotate the image 270 degrees clockwise:

clockwiseBallpark = cv2.rotate(ballpark, cv2.ROTATE_90_COUNTERCLOCKWISE)
plt.figure(figsize=(10, 10))
plt.imshow(clockwiseBallpark)

Well well, it’s the amazing rotating Jumbotron!

And yes, in case you’re wondering, cv2.ROTATE_90_COUNTERCLOCKWISE is the correct rotation code for a 270-degree clockwise rotation because a 90-degree counterclockwise rotation is the same thing as a 270-degree clockwise rotation.

Now, I know I just discussed three possible ways to rotate an image. However, what if you wanted to rotate an image by an angle that’s not 90, 180, or 270 degrees? Well, if you try to do so with the cv2.rotate() method, you’ll get an error:

clockwiseBallpark = cv2.rotate(ballpark, 111)
plt.figure(figsize=(10, 10))
plt.imshow(clockwiseBallpark)

TypeError: Image data of dtype object cannot be converted to float

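When I tried to rotate this image 111 degrees clockwise, I got an error because the cv2.rotate() method only accepts one of the three rotation codes I mentioned above, not an arbitrary angle. Note that the error message itself comes from plt.imshow(), which suggests that cv2.rotate() didn’t hand back a usable image to plot.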

Let’s rotate an image (in any angle)!

However, if you want more freedom over how you rotate your images in OpenCV, use the cv2.getRotationMatrix2D() method. Here’s an example of how to use it:

height, width = ballpark.shape[:2]
center = (width/2, height/2)
rotationMatrix = cv2.getRotationMatrix2D(center,55,1)
rotatedBallpark = cv2.warpAffine(ballpark, rotationMatrix, (width, height))  # dsize is (width, height)
plt.figure(figsize=(10, 10))
plt.imshow(rotatedBallpark)

To rotate an image in OpenCV by an angle that’s not a multiple of 90 degrees (90, 180, 270), you’ll need to use both the cv2.getRotationMatrix2D() and the cv2.warpAffine() methods. The former builds the rotation matrix, a 2-by-3 array of numbers that encodes the center, angle, and scale of the rotation. The latter applies that matrix to the image, which is the step that actually rotates it.

Since both of these are new methods for us, let’s dive into them a little further! First off, let’s explore the parameters of the cv2.getRotationMatrix2D() method:

center: this parameter is the point the image rotates around, given as an (x, y) tuple. To rotate around the middle of the image, first retrieve the image’s shape and, from there, the height and width. Once you have the image’s height and width, create a 2-element tuple where you divide the image’s width and height by 2. Make sure to list the width before the height, since OpenCV expects the x-coordinate first.
angle: the angle (in degrees) you wish to use for the image rotation. In this example, I used 55. OpenCV treats positive angles as counterclockwise, so this rotates the image 55 degrees counterclockwise. If I wanted to rotate the image 55 degrees clockwise, I would’ve used -55 as the value for this parameter.
scale: this is a number (decimals are allowed) that represents the factor you wish to use to zoom the rotated image. In this example, I used 1 as the value for this parameter, indicating that I don’t want to zoom the rotated image at all. If I’d used a value greater than 1, I’d be zooming in, and if I’d used a value between 0 and 1, I’d be zooming out.

Next, let’s explore the parameters of the cv2.warpAffine() method!

src-The image you wish to rotate (in this example, I used the base ballpark image)
M-The rotation matrix you just created for the image using the cv2.getRotationMatrix2D() method (ideally you would’ve stored the rotation matrix in a variable).
dsize-A 2-element tuple indicating the size of the rotated image. Note the order: it’s (width, height), which is the reverse of the order in the image’s shape. Passing the base image’s width and height keeps the size of the rotated image the same as the original.

Now for some extra notes:

Why is the rotation method called warpAffine()? This is because the rotation we’re performing on the image is an example of an affine transformation, which is a transformation (such as rotating, scaling, or shifting) that keeps straight lines straight and parallel lines parallel.
You’ll notice that after rotating the image using the cv2.warpAffine() method, the entire image isn’t visible on the plot. This happens because the output canvas set by dsize is the same size as the original image, so the rotated corners end up outside of it. I haven’t fully explored how to fix this yet, but when I do, I can certainly share my findings here. Changing the size of the plot alone won’t bring the cropped corners back, though, since the canvas itself needs to be larger.

Thanks for reading, and for my readers in the US, have a wonderful Thanksgiving! For my readers elsewhere on the globe, have a wonderful holiday season (and no, this won’t be my last post for 2023)!

Python Lesson 46: Image Blurring (AI pt. 12)

Intro to image blurring

Now that we know a little bit about image blurring, let’s explore it with code. Here’s the image that we’ll be using:

The photo above is of Stafford Park, a lovely municipal park in Miami Springs, FL.

Unlike image eroding, image blurring has a pretty self-explanatory description since the aim of this process is to, well, blur images. How can we accomplish this through OpenCV?

Before we get into the fun image-blurring code, let’s discuss the three main types of image blurring that are possible with OpenCV:

Gaussian blur: this process softens out any sharp edges in the image
Median blur: this process helps remove image noise* by replacing each pixel with the median color of its neighbors
Bilateral blur: this process smooths out areas of similar color while keeping the edges between different areas sharp

*For those unfamiliar with image processing, image noise is (usually unwanted) random brightness or color deviations that appear in an image. Median blurring assists you with removing image noise.

Now that we know the three different types of image blurring, let’s see them in action with the code:

Gaussian blur

Before we start to blur the image, let’s read in the image in RGB colorscale:

import cv2
import matplotlib.pyplot as plt

park=cv2.imread(r'C:\Users\mof39\OneDrive\Documents\20230629_142648.jpg', cv2.IMREAD_COLOR)
park=cv2.cvtColor(park, cv2.COLOR_BGR2RGB)
plt.figure(figsize=(10, 10))
plt.imshow(park)

Next, let’s perform a Gaussian blur of the image:

gaussianPark = cv2.GaussianBlur(park, (7, 7), sigmaX=10, sigmaY=10)  # pass the sigmas by name; the fourth positional slot is dst
plt.figure(figsize=(10,10))
plt.imshow(gaussianPark)

Notice anything different about this image? The sharp corners in this photo (such as the sign lettering) have been smoothed out, which is the point of Gaussian blurring (to smooth out rough edges in an image).

Now, what parameters does the cv2.GaussianBlur() method take?

The image you wish to blur (park in this case)
A 2-integer tuple indicating the size of the kernel you wish to use for the blurring process. Yes, this is similar to the kernels we used for image erosion in the previous post Python Lesson 45: Image Resizing and Eroding (AI pt. 11) (we’re using a 7-by-7 kernel here).
Two numbers that represent the sigmaX and sigmaY of the Gaussian blur that you wish to perform. What are sigmaX and sigmaY? They are the standard deviations of the Gaussian (bell) curve used to weight the pixels inside the kernel, with sigmaX applying to the horizontal direction and sigmaY to the vertical direction. The larger the value, the stronger the blurring in that direction.

A few things to keep in mind regarding the Gaussian blurring process:

Just as you did with image erosion, ensure that both dimensions of the blurring kernel are positive and odd-numbered integers (like the 7-by-7 kernel we used above).
sigmaX is a required parameter, while sigmaY is optional. If you leave out sigmaY (or set it to 0), it takes the same value as sigmaX. If both sigmas are 0, OpenCV works them out from the kernel size, which might not blur your picture the way you intended. Likewise, if you use a very high value for both sigmas, you’ll end up with a very, very blurry picture.

Median blur

Since median blurring helps remove image noise, we’re going to be using this altered park image with a bunch of noise for our demo:

Next up, let’s explore the median blur with our noisyPark image:

medianPark = cv2.medianBlur(noisyPark, 5)
plt.figure(figsize=(10,10))
plt.imshow(medianPark)

As you can see, median blurring the noisyPark image cleared out a significant chunk of the image noise! But how does this function work? Let’s explore some of its parameters:

The image you wish to blur (noisyPark in this case)
A single integer indicating the size of the kernel you wish to use for the blurring process. Yes, this is similar to the kernels we used for Gaussian blurring, but you only need a single integer instead of a 2-integer tuple since the kernel is always square (we’re using a 5-by-5 kernel here). The integer must be odd and greater than 1.

Bilateral blur

Last but not least, let’s explore bilateral blurring! This time, let’s use the original park image (the one without the added noise).

bilateralPark = cv2.bilateralFilter(park, 12, 120, 120) 
plt.figure(figsize=(10,10))
plt.imshow(bilateralPark)

Wow! The purpose of bilateral blurring is to smooth out areas of similar color while keeping the edges between different colors sharp. And boy, does that seem to be the case here since the park sign and all its lettering (which have strong edges) really pop out while the softer details in the background seem a bit blurrier.

How does the cv2.bilateralFilter() function work its magic? Here’s how:

The image you wish to blur (park in this case)
The diameter (in pixels) of the neighborhood around each pixel that’s used for the blurring. In this case, I chose a 12-pixel diameter as my “blurring region”. It works in a similar fashion to the kernels we used for our “erosion region” in the previous lesson.
The next two numbers (both 120) are the sigmaColor and sigmaSpace parameters, respectively. sigmaColor controls how different two pixels’ colors can be while still being blended together, while sigmaSpace controls how far apart two pixels can be while still influencing each other. The higher both of these values are, the stronger the smoothing will be.

Thanks for reading,

Michael