Research Guides: Artificial Intelligence and Images: How AI creates images

What is AI?

Artificial Intelligence (AI) is the simulation of human intelligence processes by machines, primarily computer systems.

Generative AI creates new content by learning patterns from existing sets of data: text, images, video files, and code, much of it scraped from the internet.

Watch the PBS video below that helps explain:

Advances in artificial intelligence raise new ethics concerns
8-minute video from PBS that explains AI

How does AI work to create images?

Artificial neural networks loosely mimic the way the brain recognizes patterns. Convolutional neural networks (ConvNets) specialize in identifying objects and patterns in visual data. Their artificial neurons are arranged in a specialized way that works in a manner loosely similar to the human visual system. Although far less complex than the human brain, a ConvNet can recognize an image in a way that resembles how humans see.

Training a ConvNet involves feeding it millions of images drawn from a database such as ImageNet or from sites such as WordPress, Blogspot, Getty Images, and Shutterstock. For example, ImageNet contains over 14 million image URLs but does not own the copyright to the images. These images are annotated by hand to specify their content, and the network is adjusted repeatedly until it learns particular classifications.

It is important to note that the machine does not see images; it sees a set of numbers. An image is broken into pixels, each represented by numbers that encode color and brightness. From patterns in those numbers, the network detects lines, edges, colors, etc.

(From: DeepDream: How Alexander Mordvintsev Excavated the Computer’s Hidden Layers)
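
The idea that an image is only numbers to the machine can be sketched in a few lines of Python. This is a minimal illustration, not how any particular ConvNet is built: the 5x5 image, the function name convolve2d, and the hand-written edge filter are all assumptions made for the example. A real ConvNet learns its filter values during training instead of having them typed in.

```python
# A tiny 5x5 grayscale "image": 0 is black, 255 is white.
# The two left columns are dark and the three right columns are bright.
image = [
    [0, 0, 255, 255, 255],
    [0, 0, 255, 255, 255],
    [0, 0, 255, 255, 255],
    [0, 0, 255, 255, 255],
    [0, 0, 255, 255, 255],
]

# A hand-written 3x3 filter (hypothetical, chosen for illustration) that
# responds strongly where brightness increases from left to right.
vertical_edge_filter = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]


def convolve2d(img, kernel):
    """Slide the kernel over the image and sum the element-wise products.

    Strictly speaking this is cross-correlation (the kernel is not flipped),
    which is the operation usually called "convolution" in ConvNets.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(img) - kh + 1
    out_w = len(img[0]) - kw + 1
    result = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            total = 0
            for ky in range(kh):
                for kx in range(kw):
                    total += img[y + ky][x + kx] * kernel[ky][kx]
            row.append(total)
        result.append(row)
    return result


feature_map = convolve2d(image, vertical_edge_filter)
for row in feature_map:
    print(row)
```

Each printed row is [765, 765, 0]: the large values mark where the filter window overlaps the dark-to-bright boundary, and the 0 marks a window that sees only uniform brightness. The machine never "sees" an edge; it only computes larger numbers where one is present.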

According to Canva's Image Generator section:

"To create AI-generated images, the machine learning model scans millions of images across the internet along with the text associated with them. The algorithms spot trends in the images and text and eventually begin to guess which image and text fit together. Once the model can predict what an image should look like from a given text, they can create entirely new images from scratch based on a new set of descriptive text users enter on the app."

To produce an image, a user enters keywords, and a model generates images based on them. Whole phrases can be used, not just single words.

For instance: "orange cat yawning on a braided rug in front of a fireplace" produced these nine images from craiyon.com in under two minutes:

"How AI turns text into images," from PBS NewsHour, explains the process.

Several years ago, AI-generated images relied on GANs (generative adversarial networks), which were fairly limited in what they could create. Now, AI models are trained on hundreds of millions of images, each paired with a descriptive text caption. The newer process, called "diffusion," starts by gradually breaking down each training image into random pixels (visual noise) that don't represent anything specific. The model then learns to reverse that process, going from the noise back to the original image. This training gives the model its background knowledge of concepts such as objects or artistic styles.
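
The "breaking down to noise" half of diffusion can be sketched in Python. This is a rough sketch under stated assumptions, not the method of any specific image generator: it uses one commonly published noising rule (keep sqrt(1 - beta) of the image and mix in sqrt(beta) of random noise at each step), and the function names, the 4-pixel image, the step count, and the beta value are all hypothetical choices for illustration. The learned reverse step, where a trained network removes the noise, is not shown.

```python
import math
import random


def add_noise_step(pixels, beta, rng):
    """Apply one noising step: shrink the image slightly and mix in Gaussian noise."""
    keep = math.sqrt(1.0 - beta)
    mix = math.sqrt(beta)
    return [keep * p + mix * rng.gauss(0.0, 1.0) for p in pixels]


def forward_diffusion(pixels, steps, beta, rng):
    """Return the list of progressively noisier images, starting with the original."""
    history = [list(pixels)]
    for _ in range(steps):
        history.append(add_noise_step(history[-1], beta, rng))
    return history


rng = random.Random(0)  # fixed seed so repeated runs give the same noise

# A flattened 2x2 "image" with pixel values scaled to the range [-1, 1]:
# two dark pixels followed by two bright pixels (assumed values).
original = [-1.0, -1.0, 1.0, 1.0]

steps, beta = 200, 0.05  # assumed settings, chosen only for the demonstration
history = forward_diffusion(original, steps, beta, rng)

# Fraction of the original image that survives after all steps.
remaining_signal = math.sqrt((1.0 - beta) ** steps)

print("original:        ", history[0])
print("after all steps: ", [round(p, 2) for p in history[-1]])
print("remaining signal:", remaining_signal)
```

With these settings, less than 1% of the original image survives, so the final list of numbers is effectively pure noise. During training, a diffusion model sees many such noisy versions and learns to predict how to undo the noise, guided by the text caption paired with each image.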

Engineered Art

According to this article from Mind Matters News, AI-generated art is not AI art, but instead engineer-generated art. Engineers build the mathematical models that train the AI. "Computer engineers are generating mind-blowing images using highly sophisticated mathematical models." This can also be compared to music: top electronic musicians are highly skilled computer engineers, as are video game sound designers.

It is important to remember that the creativity in AI art comes from a HUMAN source. Think about creativity. Humans continually come up with new, improved ideas and concepts, while AI builds on that human innovation by modeling its source material. The source has to exist first for AI to be trained on it.