Python Lesson 47: Image Rotation (AI pt. 13)


Hello loyal readers,

Michael here, and in this post, we’ll cover another fun OpenCV topic: image rotation!

Let’s rotate an image!

First off, let’s figure out how to rotate images with OpenCV. Here’s the image we’ll be working with in this example:

This is an image of the Jumbotron at First Horizon Park in Nashville, TN, home ballpark of the Nashville Sounds (Minor League Baseball affiliate of the Milwaukee Brewers).

First, let’s read in our image and convert it from OpenCV’s default BGR channel order to RGB so Matplotlib displays it correctly:

import cv2
import matplotlib.pyplot as plt

ballpark=cv2.imread(r'C:\Users\mof39\Downloads\20230924_140902.jpg', cv2.IMREAD_COLOR)
ballpark=cv2.cvtColor(ballpark, cv2.COLOR_BGR2RGB)
plt.figure(figsize=(10, 10))
plt.imshow(ballpark)

Now, how do we rotate this image? Let’s start by analyzing a 90-degree clockwise rotation:

clockwiseBallpark = cv2.rotate(ballpark, cv2.ROTATE_90_CLOCKWISE)
plt.figure(figsize=(10, 10))
plt.imshow(clockwiseBallpark)

All it takes to rotate an image in OpenCV is the cv2.rotate() method and two arguments: the image you wish to rotate and one of the following OpenCV rotation codes (more on these soon):

  • cv2.ROTATE_90_CLOCKWISE (rotates the image 90 degrees clockwise)
  • cv2.ROTATE_180 (rotates the image 180 degrees)
  • cv2.ROTATE_90_COUNTERCLOCKWISE (rotates the image 270 degrees clockwise, i.e., 90 degrees counterclockwise)

Let’s analyze the image rotation with the other two OpenCV rotation codes. First off, the ballpark image rotated 180 degrees:

clockwiseBallpark = cv2.rotate(ballpark, cv2.ROTATE_180)
plt.figure(figsize=(10, 10))
plt.imshow(clockwiseBallpark)

Alright, pretty impressive. It’s an upside down Jumbotron!

Now to rotate the image 270 degrees clockwise:

clockwiseBallpark = cv2.rotate(ballpark, cv2.ROTATE_90_COUNTERCLOCKWISE)
plt.figure(figsize=(10, 10))
plt.imshow(clockwiseBallpark)

Well well, it’s the amazing rotating Jumbotron!

And yes, in case you’re wondering, cv2.ROTATE_90_COUNTERCLOCKWISE is the correct rotation code here: a 90-degree counterclockwise rotation is the same thing as a 270-degree clockwise rotation.
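We can sanity-check that equivalence without OpenCV at all. Here’s a quick sketch using NumPy’s rot90 (which rotates counterclockwise for positive k) on a tiny array standing in for an image:

```python
import numpy as np

# A small test "image" so the rotation is easy to inspect.
img = np.arange(6).reshape(2, 3)

# np.rot90 rotates counterclockwise for positive k, so k=-1 is one
# 90-degree clockwise turn. Three clockwise turns = 270 degrees clockwise.
cw_270 = np.rot90(img, k=-3)

# One 90-degree counterclockwise turn.
ccw_90 = np.rot90(img, k=1)

print(np.array_equal(cw_270, ccw_90))  # True: the two rotations match
```

The same check works on a real image array loaded with cv2.imread(), since OpenCV images are just NumPy arrays.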

Now, I know I just discussed three possible ways to rotate an image. But what if you want to rotate an image by an angle that isn’t 90, 180, or 270 degrees? If you try to do so with the cv2.rotate() method, you’ll get an error:

clockwiseBallpark = cv2.rotate(ballpark, 111)
plt.figure(figsize=(10, 10))
plt.imshow(clockwiseBallpark)

TypeError: Image data of dtype object cannot be converted to float

When I tried to rotate this image 111 degrees clockwise, I got an error because the cv2.rotate() method will only accept one of the three rotation codes I mentioned above.

Let’s rotate an image (in any angle)!

However, if you want more freedom over how you rotate your images in OpenCV, use the cv2.getRotationMatrix2D() method. Here’s an example of how to use it:

height, width = ballpark.shape[:2]
center = (width / 2, height / 2)
rotationMatrix = cv2.getRotationMatrix2D(center, 55, 1)
# dsize is (width, height) -- note the order
rotatedBallpark = cv2.warpAffine(ballpark, rotationMatrix, (width, height))
plt.figure(figsize=(10, 10))
plt.imshow(rotatedBallpark)

To rotate an image in OpenCV by an angle that’s not a multiple of 90 degrees, you’ll need both the cv2.getRotationMatrix2D() and the cv2.warpAffine() methods. The former builds the rotation matrix, a 2-by-3 affine matrix encoding the angle, center, and scale of the rotation. The latter applies that matrix to the image, performing the actual rotation.

Since both of these are new methods for us, let’s dive into them a little further! First off, let’s explore the parameters of the cv2.getRotationMatrix2D() method:

  • center: the point to rotate around, given as an (x, y) tuple, so the width-based coordinate comes before the height-based one. To pivot on the image’s center, retrieve the image’s height and width from its shape and divide each by 2, as in the code above.
  • angle: the angle of rotation in degrees. Positive values rotate the image counterclockwise and negative values rotate it clockwise, so the 55 in this example rotates the image 55 degrees counterclockwise; to rotate it 55 degrees clockwise, I would use -55.
  • scale: a scaling factor for the rotated image. I used 1, indicating that I don’t want to zoom the rotated image at all; a value greater than 1 zooms in, and a value less than 1 zooms out.
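Under the hood, cv2.getRotationMatrix2D() returns a 2-by-3 affine matrix built from the angle, center, and scale. As a sketch (using the formula given in the OpenCV documentation), we can construct the same matrix with plain NumPy and confirm the defining property of a center rotation: the center point maps to itself.

```python
import numpy as np

def rotation_matrix_2d(center, angle_deg, scale):
    """Build the 2x3 affine matrix described in the OpenCV docs for
    getRotationMatrix2D (positive angles rotate counterclockwise)."""
    cx, cy = center
    a = scale * np.cos(np.radians(angle_deg))
    b = scale * np.sin(np.radians(angle_deg))
    return np.array([
        [a,  b, (1 - a) * cx - b * cy],
        [-b, a, b * cx + (1 - a) * cy],
    ])

M = rotation_matrix_2d((2000.0, 1500.0), 55, 1)

# Applying M to the center (in homogeneous coordinates) returns the center:
mapped_center = M @ np.array([2000.0, 1500.0, 1.0])
print(np.allclose(mapped_center, [2000.0, 1500.0]))  # True
```

The translation terms in the third column are exactly what shifts the pivot from the top-left origin to the center you asked for.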

Next, let’s explore the parameters of the cv2.warpAffine() method!

  • src: the image you wish to rotate (in this example, the base ballpark image)
  • M: the rotation matrix you just created with the cv2.getRotationMatrix2D() method (ideally stored in a variable)
  • dsize: a 2-element (width, height) tuple indicating the size of the output image; to keep the rotated image the same size as the original, pass the original width and height.

Now for some extra notes:

  • Why is the rotation method called warpAffine()? Because the rotation we’re performing is an affine transformation: a linear transform plus a translation, which preserves straight lines and parallelism in the image.
  • You’ll notice that after rotating the image with cv2.warpAffine(), the corners of the image are clipped off the plot. That’s because the rotated image no longer fits inside a canvas of the original size; a common remedy is to enlarge the output dsize and shift the matrix’s translation terms so the whole rotated image fits.
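One standard way to keep the whole rotated image visible (often called the “rotate bound” trick; this is my addition, not something from the post) is to compute the smallest canvas that contains the rotated corners. A sketch of just the size calculation:

```python
import math

def rotated_canvas_size(width, height, angle_deg):
    """Smallest (width, height) canvas that contains an image of the
    given size after rotating it by angle_deg about its center."""
    c = abs(math.cos(math.radians(angle_deg)))
    s = abs(math.sin(math.radians(angle_deg)))
    new_w = int(round(width * c + height * s))
    new_h = int(round(width * s + height * c))
    return new_w, new_h

# A 90-degree rotation simply swaps the dimensions.
print(rotated_canvas_size(4000, 3000, 90))  # (3000, 4000)
```

You would then add (new_w - width) / 2 and (new_h - height) / 2 to the matrix’s two translation entries before calling cv2.warpAffine() with the enlarged dsize.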

Thanks for reading, and for my readers in the US, have a wonderful Thanksgiving! For my readers elsewhere on the globe, have a wonderful holiday season (and no, this won’t be my last post for 2023)!

Python Lesson 46: Image Blurring (AI pt. 12)


Hello everybody,

Michael here, and in this post, we’ll explore image blurring! Image blurring is a pretty self-explanatory process since the whole point of image blurring is to make the image, well, blurry. This process has many uses, such as blurring the background on your work video calls (and yes, I do that all the time during work video calls).

Intro to image blurring

Now that we know a little bit about image blurring, let’s explore it with code. Here’s the image that we’ll be using:

The photo above is of Stafford Park, a lovely municipal park in Miami Springs, FL.

So, how can we accomplish this with OpenCV?

Before we get into the fun image-blurring code, let’s discuss the three main types of image blurring that are possible with OpenCV:

  • Gaussian blur: smooths the whole image by averaging each pixel with its neighbors using Gaussian weights, softening out any sharp edges
  • Median blur: helps remove image noise* by replacing each pixel with the median of its neighborhood
  • Bilateral blur: smooths regions of similar color while preserving sharp edges, so prominent features stay crisp while flatter areas get softer

*For those unfamiliar with image processing, image noise is (usually unwanted) random brightness or color deviations that appear in an image. Median blurring assists you with removing image noise.

Now that we know the three different types of image blurring, let’s see them in action with some code!

Gaussian blur

Before we start to blur the image, let’s read it in and convert it to RGB:

import cv2
import matplotlib.pyplot as plt

park=cv2.imread(r'C:\Users\mof39\OneDrive\Documents\20230629_142648.jpg', cv2.IMREAD_COLOR)
park=cv2.cvtColor(park, cv2.COLOR_BGR2RGB)
plt.figure(figsize=(10, 10))
plt.imshow(park)

Next, let’s perform a Gaussian blur of the image:

gaussianPark = cv2.GaussianBlur(park, (7, 7), sigmaX=10, sigmaY=10)
plt.figure(figsize=(10,10))
plt.imshow(gaussianPark)

Notice anything different about this image? The sharp corners in this photo (such as the sign lettering) have been smoothed out, which is the point of Gaussian blurring (to smooth out rough edges in an image).

Now, what parameters does the cv2.GaussianBlur() method take?

  • The image you wish to blur (park in this case)
  • A 2-integer tuple indicating the size of the blurring kernel (we’re using a 7-by-7 kernel here); yes, this is similar to the kernels we used for image erosion in the previous post Python Lesson 45: Image Resizing and Eroding (AI pt. 11)
  • sigmaX and sigmaY, the standard deviations of the Gaussian in the horizontal and vertical directions; the larger the sigma, the more weight distant neighbors get and the stronger the blur. Note that the fourth positional parameter of cv2.GaussianBlur() is dst, so sigmaY is best passed by keyword.

A few things to keep in mind regarding the Gaussian blurring process:

  • Just as you did with image erosion, ensure that both dimensions of the blurring kernel are positive, odd integers (like the 7-by-7 kernel we used above).
  • sigmaY is optional: if you leave it out (or pass 0), it takes sigmaX’s value, and if both sigmas are 0, OpenCV computes them from the kernel size, which might not blur your picture the way you intended. Likewise, very high sigma values will leave you with a very, very blurry picture.
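To see what sigma actually controls, here’s a minimal sketch (plain NumPy, not OpenCV’s exact implementation) of a 1-D Gaussian kernel: the weights always sum to 1, and a larger sigma spreads more weight away from the center pixel, which is what makes the blur stronger:

```python
import numpy as np

def gaussian_kernel_1d(ksize, sigma):
    """Normalized 1-D Gaussian weights for a kernel of odd size ksize."""
    x = np.arange(ksize) - ksize // 2
    w = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return w / w.sum()

narrow = gaussian_kernel_1d(7, 1)   # small sigma: weight concentrated
wide = gaussian_kernel_1d(7, 10)    # large sigma: weight spread out

# Both kernels are normalized, so neither brightens nor darkens the image.
print(round(narrow.sum(), 6), round(wide.sum(), 6))  # 1.0 1.0

# With a larger sigma, less weight stays on the center pixel:
print(narrow[3] > wide[3])  # True
```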

Median blur

Since median blurring helps remove image noise, we’re going to be using this altered park image with a bunch of noise for our demo:
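The post doesn’t show how the noisy image was produced, so here’s a minimal sketch of adding salt-and-pepper noise with NumPy (the synthetic gray stand-in image, the 1% noise fractions, and the seed are my own assumptions, not from the original):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the park photo: any uint8 RGB array works here.
park = np.full((100, 100, 3), 128, dtype=np.uint8)

noisyPark = park.copy()
mask = rng.random(park.shape[:2])
noisyPark[mask < 0.01] = 0      # "pepper": random black pixels
noisyPark[mask > 0.99] = 255    # "salt": random white pixels
```

With a real photo loaded via cv2.imread(), the same masking works unchanged.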

Next up, let’s explore the median blur with our noisyPark image:

medianPark = cv2.medianBlur(noisyPark, 5)
plt.figure(figsize=(10,10))
plt.imshow(medianPark)

As you can see, median blurring the noisyPark image cleared out a significant chunk of the image noise! But how does this function work? Let’s explore some of its parameters:

  • The image you wish to blur (noisyPark in this case)
  • A single odd integer greater than 1 indicating the size of the blurring kernel (here 5, meaning a 5-by-5 neighborhood); yes, this is similar to the kernels we used for Gaussian blurring, but you only need a single integer instead of a 2-integer tuple.
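To see why a median wipes out salt-and-pepper specks where an average wouldn’t, consider a single 5-by-5 neighborhood by hand (plain NumPy, just to illustrate the rule cv2.medianBlur() applies at every pixel):

```python
import numpy as np

# A flat gray patch with one noisy "salt" pixel in the middle.
patch = np.full((5, 5), 60, dtype=np.uint8)
patch[2, 2] = 255

# The median of the 25 values ignores the single outlier entirely,
# so the noisy center pixel gets replaced with the surrounding gray.
print(np.median(patch))  # 60.0

# A plain average, by contrast, would be pulled up by the outlier.
print(patch.mean() > 60)  # True
```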

Bilateral blur

Last but not least, let’s explore bilateral blurring! This time, let’s go back to the original, noise-free park image.

bilateralPark = cv2.bilateralFilter(park, 12, 120, 120) 
plt.figure(figsize=(10,10))
plt.imshow(bilateralPark)

Wow! Bilateral blurring smooths away texture while preserving sharp edges, and boy, does that seem to be the case here: the park sign and all its lettering really pop out while the flatter background looks a bit smoother.

How does the cv2.bilateralFilter() function work its magic? Here’s how:

  • The image you wish to blur (park in this case)
  • The diameter (in pixels) of the pixel neighborhood used to filter each pixel; in this case, I chose a 12-pixel diameter as my “blurring region”. It works in a similar fashion to the kernels we used for our “erosion region” in the previous lesson.
  • The next two values, both 120, are sigmaColor and sigmaSpace, respectively. sigmaColor controls how different in color two pixels can be and still be averaged together, while sigmaSpace controls how far apart two pixels can be and still influence each other. The higher both of these values are, the stronger the smoothing in flat regions.
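To make the two sigmas concrete, here’s a simplified 1-D sketch of the bilateral weight for a single pair of pixels (my own illustration of the idea, not OpenCV’s implementation): the weight is the product of a spatial term and a color term, so even a nearby pixel on the far side of a sharp edge gets almost no influence:

```python
import math

def bilateral_weight(distance, color_diff, sigma_space, sigma_color):
    """Weight of a neighboring pixel: near AND similar in color => high."""
    spatial = math.exp(-(distance ** 2) / (2 * sigma_space ** 2))
    color = math.exp(-(color_diff ** 2) / (2 * sigma_color ** 2))
    return spatial * color

# A neighbor 2 px away with nearly the same color is weighted heavily...
similar = bilateral_weight(2, 5, 120, 120)

# ...while an equally close neighbor across a strong edge is not.
across_edge = bilateral_weight(2, 200, 120, 120)

print(similar > across_edge)  # True
```

That color term is what keeps the sign’s lettering crisp while the background gets smoothed.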

Thanks for reading,

Michael

Python Lesson 45: Image Resizing and Eroding (AI pt. 11)


Hello everybody,

Michael here, and today’s lesson will be our first foray into image manipulation with OpenCV. We’ll learn two new techniques for image manipulation: resizing and eroding.

Let’s begin!

Resizing images

First off, let’s start this post by exploring how to resize images in OpenCV. Here is the image we’ll be working with throughout this post:

This is an image of a hawk on a soccer goal at Sevier Park (Nashville, TN), taken in August 2021.

Now, how could we possibly resize this image? Take a look at the code below (and yes, we’ll work with the RGB version of the image) to first read in and display the image:

import cv2
import matplotlib.pyplot as plt

hawk=cv2.imread(r'C:\Users\mof39\Downloads\20210807_172420.jpg', cv2.IMREAD_COLOR)
hawk=cv2.cvtColor(hawk, cv2.COLOR_BGR2RGB)
plt.figure(figsize=(9, 9))
plt.imshow(hawk)

Before we start with resizing the image, let’s first get the image’s size (I’ll explain why this information will be helpful later):

print(hawk.shape)

(3000, 4000, 3)

To get the image’s size, print the image’s shape attribute. It returns a 3-integer tuple of (height, width, channels); in the case of the hawk image, the image is 3000 px tall by 4000 px wide with 3 color channels (px stands for pixels; recall that computer image dimensions are measured in pixels).

Now, how can we resize this image? Take a look at the code below:

smallerHawk = cv2.resize(hawk, (2000, 1500))
plt.imshow(smallerHawk)

As you can see here, we halved the size of the hawk image without cropping out any of the image’s elements. How did we do that? We used the cv2.resize() method and passed in not only the hawk image but also a 2-integer tuple, (2000, 1500), giving the new width and height.

Now, there’s something important about the (2000, 1500) tuple I want to point out. When we printed the image’s shape, the 3-integer tuple that came back, (3000, 4000, 3), listed the image’s height before its width. However, cv2.resize() expects its size argument the other way around: width first, then height. Listing the width before the height is what lets you resize the image the way you intended.
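A safe pattern is to unpack shape first and then build the (width, height) tuple explicitly. A sketch (using a NumPy array as a stand-in for the hawk photo so it runs without the image file):

```python
import numpy as np

# Stand-in for the hawk image: 3000 px tall, 4000 px wide, 3 channels.
hawk = np.zeros((3000, 4000, 3), dtype=np.uint8)

height, width = hawk.shape[:2]          # shape reports (height, width, ...)
new_size = (width // 2, height // 2)    # cv2.resize() wants (width, height)

print(new_size)  # (2000, 1500)
```

With cv2 available, you would then call cv2.resize(hawk, new_size) and never have to remember the order by hand.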

Now, what happens when we make this image bigger? Take a look at the following code:

largerHawk = cv2.resize(hawk, (8000, 6000))
plt.figure(figsize=(9, 9))
plt.imshow(largerHawk)

Granted, the image may not appear larger at first, but that’s mostly due to how we’re plotting it with Matplotlib. If you look closely at the tick marks on each axis of the plot, you will see that the image size has indeed doubled: the image is now 8000 px wide by 6000 px tall.

Image erosion

The next image manipulation technique I want to discuss is image erosion. What does image erosion do?

The simple answer is that image erosion, well, erodes away the boundaries of an image’s foreground object, whatever that may be (if it helps, think of the OpenCV image erosion process like geological erosion, only for images). How the erosion is accomplished is more complicated than a simple method like cv2.resize(), so let’s explore the image erosion process in the code below:

import numpy as np
kernel = np.ones((5,5), np.uint8)
erodedHawk = cv2.erode(hawk, kernel)
plt.figure(figsize=(10,10))
plt.imshow(erodedHawk)

OK, so aside from the cv2.erode() method, we’re also creating a numpy array. Why is that?

Well, the numpy array kernel (aptly called kernel) is essentially a matrix of 1s like so:

[1 1 1 1 1
 1 1 1 1 1
 1 1 1 1 1
 1 1 1 1 1
 1 1 1 1 1]

Since we specified that our matrix is of size (5, 5), we get a 5-by-5 matrix of ones. Pretty simple right? Here are some other things to keep in mind when creating the kernel:

  • Make sure the kernel’s dimensions are both odd numbers to ensure the presence of a central point in the kernel.
  • Theoretically, you could create a kernel of 0s, but a kernel of 1s is better suited for image erosion.
  • Ideally, you should also include np.uint8 as the second parameter in the kernel creation. For those who don’t know, np.uint8 stands for NumPy unsigned 8-bit integer. The reason I suggest using this parameter is that it stores the matrix as 8-bit integers, which is beneficial for memory usage in computer programs.

Now, how does this kernel help with image erosion? The 5-by-5 kernel we just created slides over the image we wish to erode (hawk in this case). In the classic binary formulation, a pixel stays 1 only if every pixel under the kernel is 1; otherwise it is set to 0. For grayscale and color images like ours, cv2.erode() generalizes this rule by replacing each pixel with the minimum value found under the kernel.
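Here’s a sketch of that rule on a tiny binary image (pure NumPy, using a 3-by-3 kernel for brevity): a pixel survives only if its whole neighborhood is 1, so a white square shrinks by one pixel on every side:

```python
import numpy as np

def erode_binary(img, k):
    """Minimal binary erosion with a k-by-k kernel of ones (odd k)."""
    pad = k // 2
    out = np.zeros_like(img)
    for i in range(pad, img.shape[0] - pad):
        for j in range(pad, img.shape[1] - pad):
            # Center pixel stays 1 only if every pixel under the
            # kernel is 1; the min over the window enforces exactly that.
            out[i, j] = img[i - pad:i + pad + 1, j - pad:j + pad + 1].min()
    return out

img = np.zeros((7, 7), dtype=np.uint8)
img[1:6, 1:6] = 1                 # a 5x5 white square

eroded = erode_binary(img, 3)
print(int(eroded.sum()))          # 9: the square shrank to 3x3
```

cv2.erode() does the same thing far faster, and on color images it applies the min per channel.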

What do the 0s and 1s all mean here? Notice how the leaves on the tree in this eroded image look slightly darker than the tree leaves in the original image. That’s because image erosion manipulates an image’s foreground (in this case, OpenCV perceives the tree as the foreground) by removing pixels from the foreground’s boundaries, thus making certain parts of the image appear slightly darker after erosion. The slightly darker tree leaves make the image of the hawk stand out more than it did in the original image.

Thanks for reading,

Michael