I've been as fascinated with DALL·E, OpenAI's text-interpreting, image-generating software, as every other tech geek on the internet seeing pictures of Shrek in a wedding gown splattered across their Twitter feed. And although it's little more than a gimmick at this stage, the capabilities of DALL·E look wildly impressive and could feasibly open up an entirely new ecosystem of computer-aided design and art in industries ranging anywhere from fashion to video gaming. But it's really hard to gauge just how impressive the tech is when all I have seen is the 5-10 viral images tech journalists and blogs have been sharing. I don't know if the quality of these images is representative of what all users are experiencing or if journalists are cherry-picking the best examples and inflating just how magical the results are.
I wanted to find out for myself. So I submitted my name to the DALL·E access waitlist and waited patiently for weeks for the powers that be to give me access to the software.
And last week, I finally got access.
So here I am, sitting on my couch, sipping on a blood orange soda from Trader Joe's, half-asleep but anxiously ready to delve into every remote crevice of my brain to think of a bunch of twisted crap for DALL·E to try and draw.
I will be creating 10 images, and with each I will show the exact text I entered into DALL·E, the images DALL·E created, and a few words with my reaction. I'm pumped. Let's do it.
1. "Kermit the frog in a canoe rowing down Niagara Falls"
Okay, I genuinely laughed out loud when I saw these images come out. Holy crap is this freaky. Sure, the frog doesn't really look anything like Kermit, which kind of surprises me considering how consistent a look Kermit has had for over 50 years. But I was pretty skeptical this machine would pop out anything remotely close to what I asked for, and these images easily clear that bar.
2. "A parade of dinosaurs marching over the Golden Gate Bridge"
Admittedly, all of these images definitely look like someone went to the dollar section of their local Target and bought a bunch of plastic dinosaur action figures for a weird pseudo-apocalyptic photoshoot in San Francisco. But I don't think I mean that in a bad way.
3. "The Cat in the Hat doing a pirouette at his ballet recital, wearing a purple tutu"
This obviously isn't the Cat in the Hat that I was looking for. But they are certainly cats in hats, and boy oh boy are they feeling themselves. DALL·E even gave Cat #2 a cute little pirate hat and a sword! Without even asking! Just look at that KING! My heart is melting for Cat #2. I never understood furries for most of my life, but I kind of get it now. Not fully. But kind of.
4. "Two boys jumping into a lake of orange juice on a warm summer day"
DALL·E, I'll say it: you screwed this one up. I asked for a lake of orange juice. Instead you just gave me two boys at a lake showing concerning levels of excitement to be drinking orange juice. Like "You should take that child to a therapist" levels of excitement. Thank you, next.
5. "A hand-drawn picture of a bald eagle flying over the Grand Canyon while carrying the Declaration of Independence"
I am absolutely blown away by how accurately this software can replicate different artistic styles. Everything about these images looks like it was drawn by a human hand rather than generated with a computer, almost as if the imperfections in DALL·E's drawing skills actually make it look more authentic because it feels more human. It's clear by this point, though, that DALL·E has trouble with proper nouns built from very basic words, since "Cat in the Hat" and "Declaration of Independence" clearly were not understood correctly by the software.
6. "An oil painting of SpongeBob SquarePants standing on top of Mount Everest at sunrise"
This is the first time I can say confidently that DALL·E clearly understood and reproduced the character I asked for, so bravo to DALL·E. Although I must say, DALL·E's version of Mount Everest looks like Big Bird diarrhea, but so it goes.
7. "A drawing of a mouse licking a dog licking a tiger licking a buffalo licking an elephant"
I typed out this prompt 1) because I thought it was absurd and 2) because I wanted to test how well DALL·E understood how objects are related to each other in a sentence and whether or not it understood the proportions of how large each animal was. I think it failed both tests but what an utterly adorable failure this is. I am content.
8. "A 3D render of a grand piano floating underwater"
I was hoping to get something a bit more surreal here, but it's probably my fault for specifying it should be a 3D render, which likely pushes the output toward a more hyperrealistic image. Oh well. The more you know.
9. "A mugshot of Charlie Brown, black and white"
Honestly, silly me at this point for not assuming the software would see "mugshot" and generate a bunch of illustrations on coffee mugs. Fair play, DALL·E, you spicy little weasel. These are definitely not Charlie Brown but then again I don't know what 18-year-old Charlie Brown would have looked like. Maybe his hair grew longer and, judging from picture #4, he got gauges. Maybe DALL·E just knows more than we do.
10. "A colorful Van Gogh style painting of sunrise in Yosemite National Park, lots of oranges and reds"
I don't know how else to say it. These are certifiably beautiful. They're art. And I never thought I would say that coming out of this exercise. But they're genuinely art.
Believe the hype. Sure, the software isn't perfect, and it still uses plastic action figure dinosaurs as its reference for all prehistoric creatures, and it clearly believes Charlie Brown developed a serious meth habit later in life. But DALL·E easily surpassed all of my expectations for what this software was capable of. I thought I would see it purely as a gimmick and never feel inclined to return to this tool after today. But now I'm thinking about all the new graphic design possibilities this tool opens up, and my creative juices are totally flowing. And at the end of the day, that's exactly what I think good technology is supposed to be: a collection of tools that allows our already creative brains to do even more creative things. DALL·E does this. It's not perfect, and I'm sure there is an entire library of ways this software could be used nefariously. But for the time being, I'm extremely impressed.
Cheers to you, OpenAI.