What Is DALL-E and What Is It Capable Of?

March 20, 2023 • April Miller


Have you heard of DALL-E? Have you taken it for a spin? This machine learning tool is taking the AI and visual arts worlds by storm and igniting some interesting controversies.

What is DALL-E – often stylized as “DALL·E” – what is it used for, who has access, and what does it mean for the future of artificial intelligence and visual artistry?

What Is DALL-E and What Does It Do?

DALL-E is a machine learning platform, designed by OpenAI, that creates wholly unique images using nothing more than strings of text provided by the user.

Here’s an example:

Suppose the user wants a digital illustration of “an armchair in the shape of an avocado.” All they have to do is enter that string of text into DALL-E, and the AI does the rest.
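For readers who would rather script this than type into the web page, here is a minimal sketch of the same request in code. It assumes OpenAI's public image-generation endpoint and its `prompt`, `n`, and `size` fields; the helper name `build_image_request` and the placeholder key are made up for illustration. The sketch only builds the request and never sends it, so check OpenAI's current API documentation before relying on any of the details.

```python
import json
import urllib.request

# Assumption: OpenAI's public image-generation endpoint. Verify against current docs.
API_URL = "https://api.openai.com/v1/images/generations"


def build_image_request(prompt: str, api_key: str, n: int = 1,
                        size: str = "1024x1024") -> urllib.request.Request:
    """Build (but do not send) a POST request asking for `n` images of `prompt`."""
    body = json.dumps({"prompt": prompt, "n": n, "size": size}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# "YOUR_API_KEY" is a placeholder, not a working credential.
request = build_image_request("an armchair in the shape of an avocado",
                              api_key="YOUR_API_KEY")

# Sending it would look like urllib.request.urlopen(request),
# which needs a real API key and network access.
```

The text prompt is the only creative input here. The other fields just say how many images to return and at what size.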

OpenAI first announced DALL-E to the public in January 2021. The organization itself was founded in December 2015 by Elon Musk, Sam Altman, Ilya Sutskever, and others. It is a laboratory dedicated to studying AI, its capabilities, and its applications. OpenAI consists of a for-profit company – OpenAI LP – and its nonprofit parent company, OpenAI Inc.

How Does DALL-E Work?

DALL-E – now in its second iteration, called DALL-E 2 – isn’t quite a fully fledged artificial intelligence (AI), but it uses machine learning, which is one of the building blocks of AI.

The underlying foundation of DALL-E is a previous machine-learning model, also developed by OpenAI, called GPT-3. GPT-3 is a natural language processing (NLP) engine that boasts of using 175 billion parameters to analyze and realistically imitate human language.

In order to function, DALL-E’s logical and creative apparatus must be trained using an existing body of data. In this case, that data takes the form of millions of image-and-text pairs gathered from public databases on the internet.
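To make "image-and-text pairs" more concrete, here is a toy sketch of how one pair could be turned into a single training sequence. The sizes follow OpenAI's public description of the original DALL-E: up to 256 text tokens, plus a 32 x 32 grid of image tokens drawn from 8,192 possible codes, for 1,280 tokens in total. Everything else is an assumption for illustration. The two `toy_` functions are hypothetical stand-ins for the real text tokenizer and the learned image encoder, and they produce placeholder numbers rather than meaningful codes.

```python
import random

TEXT_TOKENS = 256                         # text slots per example
TEXT_VOCAB = 16384                        # size of the text vocabulary
IMAGE_GRID = 32                           # image stored as a 32 x 32 grid of codes
IMAGE_TOKENS = IMAGE_GRID * IMAGE_GRID    # 1024 image slots
IMAGE_VOCAB = 8192                        # number of possible image codes
PAD = 0                                   # hypothetical padding id


def toy_text_tokens(caption: str) -> list[int]:
    """Hypothetical stand-in for a real tokenizer: one id per word, padded to 256."""
    ids = [1 + sum(word.encode("utf-8")) % (TEXT_VOCAB - 1)
           for word in caption.lower().split()]
    ids = ids[:TEXT_TOKENS]
    return ids + [PAD] * (TEXT_TOKENS - len(ids))


def toy_image_tokens(seed: int) -> list[int]:
    """Hypothetical stand-in for a learned image encoder: random placeholder codes."""
    rng = random.Random(seed)
    return [rng.randrange(IMAGE_VOCAB) for _ in range(IMAGE_TOKENS)]


def make_training_sequence(caption: str, image_seed: int) -> list[int]:
    """Join caption tokens and image tokens into one stream, text first."""
    return toy_text_tokens(caption) + toy_image_tokens(image_seed)


sequence = make_training_sequence("an armchair in the shape of an avocado", image_seed=0)
print(len(sequence))  # 1280
```

Because the caption comes first in the stream, a model trained to predict each next token can later be handed only a caption and asked to fill in the image tokens that follow.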

According to OpenAI, DALL-E is capable of some surprising, novel, otherworldly, and sometimes downright macabre interpretations of the user’s artistic intent. OpenAI says DALL-E can:

  • Create anthropomorphized objects and animals
  • Combine disparate concepts in artistically cohesive ways
  • Render text and transform existing images to create unique works

OpenAI says DALL-E has 12 billion parameters to work with, compared to GPT-3’s 175 billion, but don’t let that fool you. OpenAI calls its capabilities “diverse,” which seems like an understatement. The results delivered by this machine learning tool have been out of this world. It’s enough to have some artists worrying about the future of their craft.

Who Can Use DALL-E Right Now?

When DALL-E was first announced, interested parties had to join a waitlist before they could gain access. In September 2022, OpenAI removed the waitlist and signaled that anybody can now use the machine learning platform.

OpenAI claims over 1.5 million people have tinkered with DALL-E so far. Several high-profile brands, including Heinz and Nestlé, have apparently been helping to test its capabilities as well.

DALL-E is free to use, but visitors have a fixed number of credits to use per month before they have to begin paying.

What Does DALL-E Mean for the Future of Visual Arts?

We mentioned that DALL-E and tools like it are becoming controversial in some circles.

The biggest story broke when an image created with a DALL-E competitor, Midjourney, won first prize in a digital art competition. Jason M. Allen, who lives in Pueblo West, Colorado, took home a blue ribbon at the Colorado State Fair for his – or rather Midjourney’s – digital painting entitled “Théâtre D’opéra Spatial.” The English translation is “Space Opera Theater.”

Allen was treated to swift backlash from his competition and vocal members of the artistic community. The artist submitted his work under the name “Jason M. Allen via Midjourney” so its provenance would be clear. Allen says he broke none of the contest’s rules and maintains “I’m not going to apologize for it.”

Still, Midjourney, DALL-E, and Stable Diffusion – all image generators powered by machine learning – continue to attract consternation from lifelong artists and others who take the long route of creating their work by hand, whether with paint and a brush or a digital graphics program.

Some artists greeted this news with barely contained frustration and even anger.

One Twitter user, reacting to Allen’s win, declared, “We’re watching the death of artistry unfold right before our eyes.”

Olga Robak, a spokesperson for the Colorado Department of Agriculture – the sponsor of the contest – cited the rules of the competition, which allow “any artistic practice that uses digital technology as part of the creative or presentation process.” Robak says the category’s judges knew Midjourney was a digital design tool, but not an AI-based one. She says they told her they’d still have given Allen the blue ribbon if they’d known in advance, and she echoes the artist’s position that he broke no competition rules.

Where Do AI Image Generators Go From Here?

In September 2022, Facebook’s parent company, Meta, announced a product inspired by DALL-E called Make-A-Video, and it’s exactly what it sounds like. Rather than giving the AI text cues so it can create a still image, users generate entire videos using strings of text.

For example: In a sample video, the prompt “a teddy bear painting a portrait” yielded an imperfect but still fairly impressive end result.

In these early stages, it’s likely that such AI-based tools could serve as important conception aids in the planning and storyboarding phases of a project. They could also provide a rough starting point for human editors and illustrators to build upon. That primitive-looking video of a teddy bear painting a portrait could look much more convincing after a professional artist gets their hands on it. For now, it’s a deeply fascinating proof of concept.

In addition to questions of fairness in art competitions, other potential controversies abound. Deepfakes are one of the most worrisome implications of unchecked and unregulated AI. Deepfakes are images and videos convincingly altered by AI to make it look like the subject said or did something they didn’t actually say or do. Could the fallout of widespread deep fakery make art contest tussles look like child’s play? It’s possible. For now, you, too, can join the thousands asking AI to open its black box and summon forth previously unseen wonders.

Image source: https://openai.com/blog/dall-e/
