And Now Let’s Create Some AI Art (Midjourney Version)(AI pt. 15)


Hello everybody,

Michael here, and for my final post of 2023, I wanted to try something a little different! Usually on this blog, I like to only use tools that are open-source (aka free) so that all of you can follow along with my tutorials.

However, for this post I wanted to try something different: Midjourney! Just like DALLE-2, Midjourney is an AI text-to-art generator (you may recall that I explored DALLE-2 in the post And Now Let’s Create Some AI Art! (AI pt.6)). However, unlike DALLE-2, Midjourney cannot be used for free. But since I wanted to fool around with Midjourney, I thought I could do a post on it for all of my loyal readers!

Let’s begin!

Five fast facts about Midjourney

In my intro paragraph, I did mention that Midjourney was an AI text-to-art generator. Here are five more fast facts about Midjourney:

  • It works in a very similar manner to DALLE-2 in the sense that both tools are text-to-art generators.
  • However, unlike DALLE-2, Midjourney wasn’t developed by OpenAI (it was created by Midjourney Labs).
  • As of this writing, Midjourney has been in open beta since its launch in July 2022.
  • Midjourney utilizes Discord as an interface to generate its AI art.
  • As long as you’re on a paid subscription, you can generate as many images as you want (unlike DALLE-2, where the free trial limits you to a certain number of image generations a month).

Setting up Midjourney

Before we start playing around with the magic of Midjourney, you’ll need two things to set it up:

  • A Midjourney subscription
  • A Discord account (I’ll explain this later)

If you need assistance setting up Midjourney, please follow the steps at this link: https://docs.midjourney.com/docs/quick-start.

Now, why would you need a Discord account? See, even though Midjourney is separate from Discord, I mentioned earlier that Midjourney currently uses Discord as its interface to generate AI art via a Midjourney Discord bot (which makes Midjourney a bit more convoluted to set up than DALLE-2).

And Now Let’s Make Some Midjourney AI Art

Once you’ve gotten Midjourney set up, let’s get started creating our very own AI art!

First, let’s open up our Discord Midjourney bot:

When you open up the bot, you’ll see the bot’s homepage. To start creating Midjourney art, go to any of the channels that start with newbies.

As you can see here, I am currently in the newbies-122 channel, which is where I can start generating AI art.

To begin with the AI art generation, I will first run the /imagine command and then type the prompt A Christmas card featuring Santa Claus and his reindeer saying "Happy Holidays To You" that is drawn in pencil sketch with lots of color. Let’s see what Midjourney spits out!

As you can see, after a few minutes, Midjourney will (just like DALLE-2) spit out four different images based on the prompt you submitted. Also, if you look at each image closely, you’ll see that Midjourney, like DALLE-2, hasn’t gotten the hang of generating coherent text (but it sure is good at generating gibberish).

However, you may be wondering what all of these buttons below the generated images do. Allow me to explain:

  • The U1-U4 buttons allow you to output only one of the four generated images. U1 represents the image in the upper left-hand corner, U2 the upper right, U3 the lower left, and U4 the lower right.
  • The V1-V4 buttons also represent the four images (V1=upper left hand image, V2=upper right hand image, V3=lower left hand image, V4=lower right hand image) but unlike the U1-U4 buttons, these buttons allow you to modify the prompt on an individual image-or as Midjourney calls it, “remixing” each image.
  • The refresh button allows you to generate four different images with the same prompt.
  • A note about these buttons: you can also use them for other users’ images, not just your own (I mean, it is fun to see other users’ prompts).

A little note about the Midjourney interface

If you’re playing around with Midjourney, you’ll notice that you’re far from the only one generating AI art. In fact, there are certainly going to be thousands of users at any given time trying to generate their own AI art. While I think it’s neat that anyone can log onto Midjourney at any time, it also makes the interface less user-friendly: if you want to find your generated art, you’ll need to do quite a bit of scrolling!

This image was taken at 11:32AM, so this should give you an idea as to how many people are generating Midjourney art at once!

Luckily, if you want to easily find your Midjourney art, head on over to https://www.midjourney.com/explore, log in to your Midjourney account, and navigate to My Images:

In this interface, I have all of the images I generated through Midjourney throughout the duration of my subscription. This way, if I ever want to find an image I generated on Midjourney, all I need to do is go here.

  • As you can see from the screenshot above, the functionality to /imagine new prompts from this interface has yet to be implemented as of this writing (December 2023).

You can even click on an image to see its prompt in case you want to use and/or modify that prompt for a future image generation:

As you see here, this image was generated as part of the Christmas card prompt I wrote earlier. Even though I asked for a Happy Holidays To You card featuring Santa and his reindeer, I instead got whatever this is. Santa has one reindeer in this image (and it’s wearing what I think is a necklace of leaves for some reason). The reindeer has six legs. There is no text in the image. The other creatures in this image look like two gerbils and two gremlin-things (it would be a stretch to call them elves). At least there’s something resembling a Christmas village in the background.

Let’s try some other scenarios

Next up, let’s try some other Midjourney scenarios! First, let’s see Midjourney’s capabilities for generating realistic-looking photographs.

Since it’s the holidays, let’s try this prompt next-/imagine A Nikon photo of the Avengers at a Christmas party at Avengers tower. Iron Man, Black Widow, Hulk, Captain America, Thor, and Hawkeye are there. 16:9 aspect ratio

Yes, Midjourney can even set aspect ratios and camera model-styles for the AI images it generates. Let’s take a look at the images we got from this prompt:

Throughout these four images, the only character that Midjourney seems to get right each time is Iron Man. Marvel fans like myself will likely recognize several mistakes throughout these four images:

All of this just goes to show you that Midjourney, like DALLE-2, doesn’t have the best attention to detail when generating AI art.

The images I got when I used this prompt-/imagine A Nikon photo of Kang The Conqueror and Thanos at a Christmas Party, 16:9 aspect ratio-weren’t much better. Here’s one such image:

I mean, at least Midjourney made Thanos purple, but Kang the Conqueror is another story entirely (unless he happens to be one of Kang’s many variants).

Let’s try generating AI people!

So now that we’ve explored some AI art-generation scenarios, let’s try something different! As you may recall from the post And Now Let’s Create Some AI Art! (AI pt.6), DALLE-2 wasn’t the best when it came to generating images of real people. However, let’s see if Midjourney is up for the task!

Here’s the prompt I’ll use-/imagine Stephen Curry drawn in Simpsons style

And here are the generated images:

Not gonna lie, I’m surprised that not only did Midjourney generate a pretty accurate-looking Stephen Curry, but that it also generated the correct Golden State Warriors logo and the text Golden State Warriors correctly. However, Midjourney didn’t really replicate the Simpsons art style, and in the third image it got Steph’s jersey number wrong (he wears #30, not #35, which is the number Kevin Durant wore during his stint with the Warriors).

Now, just as I did with my DALLE-2 experiment, I’ll try to generate an image of a female public figure and see where that goes. Here’s the prompt I used-/imagine A drawing of Margot Robbie in colored pencil sketch, and here’s the output:

Not gonna lie, I’m amazed at how much Midjourney’s generated images actually resemble the real Margot Robbie. Unlike DALLE-2, which didn’t allow me to generate images of female public figures for some reason, Midjourney does allow for these types of image generations, and it does a scarily accurate job of it too.

And now, let’s go to an AI-generated place!

So, we’ve tested how well Midjourney can replicate pop culture and people, but let’s see how well it knows places. Here’s the prompt I’ll use-/imagine A neon rendering of Bicentennial Capitol Mall State Park in Nashville, TN-and here’s the output:

From these images, I see that Midjourney at least got the Capitol part right (Bicentennial Capitol Mall State Park does have a view of the Tennessee state capitol building from the park), but the four images generated look like they could be a part of Downtown DC, not Downtown Nashville. At least Midjourney included a park in each of these four images.

If you’re wondering what Bicentennial Capitol Mall State Park looks like, here’s a picture of it (more specifically, the amphitheater in all its glory):

This is just a small sampling of the things Midjourney is capable of, and although it doesn’t always have the best attention to detail (or greatest text-generation abilities), it is still an amazing AI tool-though it can never, ever, ever replace human creativity or talent (or your friendly neighborhood coding tutorial writer).

With all that said, thank you all for another wonderful year of programming and development (and I hope you learned something along the way). Have a happy and festive holiday season to you all, and I’ll see you in 2024 for another amazing year of development and learning! Keep calm and code on!

AI-generated Santa wishes you a happy holiday season!

Michael!

Python Lesson 48: Image Borders (AI pt. 14)


Hello everybody,

Michael here, and in today’s post, we’ll explore image border creation using OpenCV!

Let’s create some image borders!

Before we start creating image borders, here’s the image we’ll be using for this lesson:

This is an image of Fry and Bender pumpkins (two main characters from the animated sitcom Futurama, if you weren’t familiar) that I created with a few Sharpie markers this past Halloween (creative, I know).

Now, let’s read in our image to the IDE in RGB form:

import cv2
import matplotlib.pyplot as plt

pumpkins = cv2.imread(r'C:\Users\mof39\Downloads\20231031_231020.jpg', cv2.IMREAD_COLOR)
pumpkins = cv2.cvtColor(pumpkins, cv2.COLOR_BGR2RGB)

Now that we’ve done that, let’s add a simple yellow border around our image:

pumpkinsNormalized = pumpkins / 255.0

pumpkinsWithYellowBorder = cv2.copyMakeBorder(pumpkinsNormalized, 30, 30, 30, 30, cv2.BORDER_CONSTANT, value=[1, 1, 0])

plt.figure(figsize=(10, 10))
plt.imshow(pumpkinsWithYellowBorder)
plt.show()

To add a simple yellow rectangular border around our image, we’ll need to use the cv2.copyMakeBorder() method and include the following parameters:

  • The image where you will add a border
  • Four integers indicating the border’s thickness (in pixels) on the top, bottom, left, and right sides of the border, respectively
  • One of five possible OpenCV image border modes; in this case, I used cv2.BORDER_CONSTANT, which adds a simple colored rectangular border to the image.
  • A three-integer array indicating the border color in RGB notation (more on that shortly)

Now, you may see something unfamiliar above: pumpkinsNormalized. This indicates that I have normalized the image. What does that mean?

In this case, normalizing an image means scaling every pixel value from the 0-255 range down to the 0-1 range (by dividing by 255). This matters here because the border color we pass in, [1, 1, 0], is expressed on the 0-1 scale: if you added that border to the original 0-255 image, the border would come out looking nearly black rather than yellow, since a value of 1 on a 0-255 scale is almost no intensity at all. Normalizing the image first puts the pixels and the border color on the same scale, so the border renders as the bright yellow we intended.

  • It likely goes without saying, but if you want the border on the RGB image, please remember to add the border to the normalized image, not the initial image you read into the IDE (even if you did convert it to RGB colorscale).

Now, as for the RGB color array, let’s dive into that. The array works the same way as other forms of RGB notation (e.g. RGB(210, 12, 12) represents the intensity of the red, green, and blue channels), but with one key difference: the values only range from 0 to 1 (including decimals in between). The values still represent the intensities of the red, green, and blue channels, respectively, but the representation reads more like a percentage of each color than an integer. In this example, since I wanted a simple yellow border on the image, I used the array [1, 1, 0], which is the same as saying RGB(255, 255, 0): in both notations, creating yellow requires full (100%) red and green but no blue.
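The two notations are related by a simple division by 255, as this quick sanity check shows:

```python
# RGB(255, 255, 0) (yellow) expressed on the 0-1 scale
yellow_255 = [255, 255, 0]
yellow_normalized = [v / 255 for v in yellow_255]
print(yellow_normalized)  # [1.0, 1.0, 0.0]
```

So any standard 0-255 RGB color can be used here, as long as you divide each channel by 255 first.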

Other border modes

Now, one thing to keep in mind with OpenCV’s image border modes is that, as of December 2023, there is no built-in way to make fun dotted, dashed, or dotted-and-dashed borders (though of course, that could change).

However, aside from cv2.BORDER_CONSTANT, which creates a plain rectangular border around the image, there are four other OpenCV image border modes. Let’s explore one of them: cv2.BORDER_REFLECT, which adds a reflective border to the image. To change the border from a simple colored border to a reflective one, let’s change this one line of code:

pumpkinsWithReflectiveBorder = cv2.copyMakeBorder(pumpkinsNormalized, 40, 40, 40, 40, cv2.BORDER_REFLECT)

All I had to do to get a reflective border was change this one line: I increased the border thickness (30 to 40 pixels), removed the color argument (this border mode doesn’t use one), and swapped the border mode to cv2.BORDER_REFLECT. And, well, check out the image with a reflective border:

In this image, there are a few spots where the reflective border is hard to find, but it’s there (and quite prominent along the bottom of the image, where if you look closely you can see the reflections of the pumpkins).

Thank you,

Michael