How to create images and visuals with generative AI

There’s one moment in the process of creating a blog post or news article that every small publisher dreads:

“What do I use for my featured image?”

Agencies and media companies have creative directors, photographers and artists at their beck and call to create this image for them. But what about the rest of us?

Some of us will head over to Google Images despite our better judgment. Others will go to a free site like Pexels or Unsplash. Some will go to sites like Adobe Stock, iStock or Shutterstock to pay for an image.

Hopefully, everyone reading this knows why it’s not a great idea to steal images off the web. Unless you’re using a public domain image, the images you download are owned by somebody.

If you plan on growing your business or brand, you don’t want your site filled with unlicensed images that may come back to haunt you one day.

As for stock photos, everyone who’s used a stock photo site has experienced the frustration of searching through page after page of search results and never finding the right one. So many stock photos are repetitive, generic or trite that they’ve literally become a joke.

And if you happen to find a decent stock photo, chances are it’s been used over and over again.

For example, this photo of a diverse group of co-workers on Pexels has been downloaded over 75,000 times and appears in Google Images on 175 sites. Which, ironically, is the opposite of “diversity.”

AI image generators

Remember when I said big companies have creative directors, photographers and artists at their beck and call? With AI image generators, you can now have all these, too.

Right now, two types of sites are becoming widely used to generate images from text.

The first are sites that focus only on images. The most popular is Midjourney. The next most popular are sites powered by the open-source Stable Diffusion model, such as Stability AI’s own DreamStudio.

Creatives and designers tend to favor these platforms because of their exclusive focus on AI art; they are at the cutting edge of image quality and allow many customization and fine-tuning options for artists.

For this article, I’m going to focus on AI chatbots, which are a bit more accessible to marketers and non-artists.

As of this writing, Anthropic’s Claude doesn’t support text-to-image and Google Gemini is too inconsistent for my tastes. (Most of the prompts I test there result in an error message or an image that doesn’t match what I asked for.)

On the other hand, OpenAI’s ChatGPT (with image generation powered by DALL-E) and xAI’s Grok (with image generation powered by FLUX.1) are getting jaw-droppingly good.

As of this writing, ChatGPT Plus costs $20 a month. It includes DALL-E image generation and access to the ChatGPT chatbot.

ChatGPT is what I had in mind when I wrote my article back in April predicting that people would use Google less once they got used to using AI chatbots. Since then, I’d say 80% of the searches I used to do on Google I now do on ChatGPT.

Grok comes as part of the Premium tier of the social media platform X and costs $8 a month. For that price, you get access to FLUX.1 image generation, as well as Grok’s chatbot and premium features on X.

As for which you should choose, I would suggest both.

Right now, I see ChatGPT still ahead of Grok as far as its usefulness as a chatbot, while Grok is arguably superior at generating art.

As you’ll see in a second, $28 a month is a pittance compared to the value you get from image generation alone, not to mention all the other ways AI chatbots can increase your productivity.

Generative AI as your personal creative director, photographer and artist

For those of you who have never used an AI chatbot to do text-to-image generation before, I’ll give a quick rundown of how it works.

Let’s say that you’re writing a blog post or an article on how to buy a mattress and you get to that point of having to choose a featured image.

Instead of hunting all over for an image, you just type this into your chatbot.

“Draw me a box mattress in a store.”

Here are the results I get:

ChatGPT

Grok

You can see that Grok understood what I meant, while ChatGPT thought I was talking about a “mattress in a box.” Score one for Grok.

While it’s a nice photorealistic image, it’s really nothing that you can’t find on any stock photo site. And let’s face it – it’s just as boring, repetitive and unoriginal as most “stock photos of mattresses.” 

Let’s change that.

Getting a little more detailed in your prompt

Let’s say that in your article you referenced the story of The Princess and the Pea. And it dawned on you that a nice visual might be a princess sleeping on a stack of mattresses. 

Type this prompt into your chatbot:

“Generate an image of a princess sleeping on top of a stack of mattresses.”

Here’s what ChatGPT gave me:

And what Grok gave me:

You can start to see the difference in how ChatGPT and Grok approach “art.”

ChatGPT tends to favor illustrations, while Grok seems to favor photorealism. But of course, you can “ask” either to try to draw in whatever style you like.

I should say that I didn’t get these images right away from either AI. In fact, the first images I got from both didn’t match what I wanted at all. But I “talked” to the chatbot just as I would to a creative director.

Here was my “conversation” with Grok to get to this final image:

“Draw me a picture of a stack of mattresses with a princess sleeping on top.”

“Those don’t look like mattresses, they look more like blankets. Can you draw me the kind of box mattress you’d find in a store?”

“I need them stacked up with a princess sleeping on top.”

“No no, draw me at least 10 mattresses stacked on top of each other with a princess sleeping on top.”

“This is good, but make the mattresses all have different patterns.”

It took a while, but I finally got one I was happy with.

Notice that all I had to do was have a “conversation” with Grok, just like I would with a creative director. And unlike a real creative director, Grok didn’t want to throw me out a window after the seventh round of changes.
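
If you’re curious why this back-and-forth works, here’s a rough sketch of the idea in Python. It is only an illustration: `ImageChat` and its `ask` method are names I made up, not a real ChatGPT or Grok API, and the “image” it returns is just a text description. The point is that every request gets added to a running history, so each new round builds on the earlier ones instead of starting from scratch.

```python
from dataclasses import dataclass, field


@dataclass
class ImageChat:
    """Hypothetical stand-in for a chatbot session (not a real ChatGPT or Grok API)."""

    history: list[str] = field(default_factory=list)

    def ask(self, prompt: str) -> str:
        # Every request is appended, so earlier instructions stay in context.
        self.history.append(prompt)
        # A real service would return an image; this stub only describes the request.
        count = len(self.history)
        return f"image v{count} built from {count} instruction(s)"


chat = ImageChat()

# The same five rounds of feedback from the conversation above.
rounds = [
    "Draw me a picture of a stack of mattresses with a princess sleeping on top.",
    "Those don't look like mattresses, they look more like blankets. "
    "Can you draw me the kind of box mattress you'd find in a store?",
    "I need them stacked up with a princess sleeping on top.",
    "No no, draw me at least 10 mattresses stacked on top of each other "
    "with a princess sleeping on top.",
    "This is good, but make the mattresses all have different patterns.",
]

for prompt in rounds:
    result = chat.ask(prompt)

print(result)
```

In a real chatbot you never see this bookkeeping, but it’s why “This is good, but…” makes sense to the AI on round five.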

Now search on any stock photo site for “princess and the pea” or “stacked mattresses”; chances are you won’t find anything nearly as good as you see here.

The girl you see sleeping on top of the mattresses? She doesn’t exist. No model release is needed because there is no real human in that photo.

As you can imagine, this changes everything. Instead of spending thousands of dollars for a photo shoot or $200 for a stock photo subscription, I just spent $8 and about 2 minutes of my time. 

How in the world does AI generation work? 

Imagine that you wanted to learn to draw a picture of a golden retriever. The first step would be to learn basic art techniques, like drawing basic shapes, adding texture and detail and adding shading and depth.

You’ll need to study a lot of pictures of golden retrievers to understand their structure, form and movement. And you’ll need a lot of practice and iteration before your drawing starts to look like the real thing.

That’s essentially the same way that AI models work, except in the AI world this process goes by names like “Generative Adversarial Networks” and “Diffusion Models.”

The difference is that while you probably only have a few hours a week to learn and practice, AI models can “learn and practice” around the clock and at a massive scale.

Plus, they have access to billions upon billions of images to train on, including public domain images, Creative Commons images and image data licensed to them by stock photo companies.
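
For the technically curious, here is a toy Python sketch of the “practice” a diffusion model gets. It is a simplified illustration under my own assumptions, not how DALL-E or FLUX.1 is actually built: the five-number “image” is made up, and `add_noise` and `keep` are names I chose for this example. The idea is that a picture is blended with random static, and the model is trained to guess which part is static. If it guesses well, subtracting the static gives the picture back.

```python
import math
import random


def add_noise(pixels, keep, rng):
    """Blend pixels with Gaussian static.

    keep is the share of the original signal retained:
    1.0 leaves the image untouched, 0.0 leaves pure static.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [
        math.sqrt(keep) * p + math.sqrt(1.0 - keep) * n
        for p, n in zip(pixels, noise)
    ]
    return noisy, noise


rng = random.Random(0)  # fixed seed so the run is repeatable
image = [0.0, 0.5, 1.0, 0.5, 0.0]  # a made-up 5-pixel "image"

untouched, _ = add_noise(image, keep=1.0, rng=rng)
half, half_noise = add_noise(image, keep=0.5, rng=rng)
static, static_noise = add_noise(image, keep=0.0, rng=rng)

# A trained model tries to predict the static. Here we cheat and use the
# true static to show that a perfect guess recovers the original image.
recovered = [
    (h - math.sqrt(0.5) * n) / math.sqrt(0.5)
    for h, n in zip(half, half_noise)
]

print("half-noised:", [round(p, 3) for p in half])
print("recovered:  ", [round(p, 3) for p in recovered])
```

Real image generators do this with millions of pixels, many small noise steps and a neural network doing the guessing, but the learn-by-removing-static idea is the same.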

Getting ideas from AI

Let’s get back to that hypothetical blog post I was writing.

While an image of a mattress in a store or even a cute picture of a princess sleeping on a stack of mattresses may get people’s attention, will it get them to click and scroll to read your article?

That is the whole point of the featured image.

In addition to having AI generate an image for you, you can use it to help you come up with ideas in the first place.

Let’s try this. Instead of telling the AI what to generate for us, let’s ask for advice.
