“Why ChatGPT Fails at Creating Magic Eye Images”

Hey there, curious minds! Have you ever fiddled with those eye-bending Magic Eye stereograms from the ’90s? You know, those images that look like random, noisy patterns until you defocus your eyes just right and—bam!—a 3D picture leaps out at you? Yeah, those. Well, guess what? Our trusty AI pal, ChatGPT, claims it can generate them!

At least, that’s what it says.

So, buckle up because we’re diving into a hilariously misguided adventure where I tried getting ChatGPT to create some of these hidden 3D wonders. And spoiler alert: it didn’t go quite as planned.

The Magic Eye Misadventure Begins

Picture this. I’d read about folks trying to create Magic Eye images with various AI models before, but it never crossed my mind to ask ChatGPT to take a shot at it until fellow enthusiast Pippin suggested it. Intrigued, I had it generate the hidden dolphin image you see above. What cracked me up was the absolute certainty in the caption.

How does it end up being so wrong yet so confidently delivered?

Here’s the deal: ChatGPT came across plenty of writing about Magic Eye stereograms in its internet training data. When I asked it for a “magic eye 3D stereo image of a unicorn,” it relayed a fantastical, detailed description to DALL-E 3, the actual image generator. The prompt went something like this:

“A magic eye 3D stereogram that reveals a hidden image of a unicorn when viewed correctly. The stereogram should be intricate and filled with a variety of colors to ensure the 3D effect is prominent, allowing the unicorn to appear as if it is standing in a magical forest once the viewer’s eyes adjust. The background should consist of complex patterns and mystical elements to enhance the illusion of depth and the magical atmosphere, encouraging viewers to find the hidden unicorn. The unicorn itself should be designed to be a delightful surprise for those adept at viewing magic eye images, with its horn and mane clearly discernable amidst the pattern.”

Impressive, right? Well, here’s the twist. This is the image DALL-E 3 handed back to ChatGPT in response:

Imagine expecting a rich, magical forest scene and getting… well, not that.
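For contrast, a genuine stereogram is less a painting and more a calculation: every dot has to repeat at a horizontal distance that depends on how deep the hidden shape is at that spot. Here’s a minimal sketch of the classic random-dot approach in Python. To be clear, this is my own toy illustration, not anything ChatGPT or DALL-E 3 runs; the names `make_stereogram`, `period`, and `depth` are assumptions made up for the example.

```python
import random


def make_stereogram(depth, period=8, seed=0):
    """Build a random-dot stereogram (rows of 0/1 dots) from a depth map.

    depth[y][x] is a small non-negative integer: 0 for the background,
    larger for points that should appear closer. Each value must be
    smaller than `period`.
    """
    rng = random.Random(seed)
    image = []
    for depth_row in depth:
        row = []
        for x, d in enumerate(depth_row):
            if x < period:
                # The first strip of each row is pure random noise.
                row.append(rng.randint(0, 1))
            else:
                # Every later dot copies a dot to its left. Flat areas
                # repeat every `period` dots; raised areas repeat sooner,
                # and that shorter spacing is what the eyes read as depth.
                row.append(row[x - period + d])
        image.append(row)
    return image


def render(image):
    """Turn rows of 0/1 dots into printable text."""
    return "\n".join("".join("#" if dot else "." for dot in row) for row in image)
```

To try it, build a depth map that is mostly zeros with a small raised rectangle in the middle, then `print(render(make_stereogram(depth)))`. The point is that the hidden shape lives entirely in that exact dot spacing, which is very different from drawing a colorful picture that merely looks like a stereogram.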

Why the Miscommunication?

ChatGPT doesn’t have image recognition, so it can’t verify what DALL-E 3 actually spat out. It acts as if it received the perfect image every time and goes on narrating an illusion that, quite frankly, isn’t there. It’s like gesturing at a green screen and assuming a jaw-dropping 3D unicorn is prancing around on it.
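If it helps to see that handoff spelled out, here’s a rough sketch in Python. Everything in it is hypothetical: `write_image_prompt`, `caption_for`, and the stand-in image models are names I made up, and the real systems are certainly more involved. It only illustrates the shape of the problem, which is that the caption gets written from the request, not from the picture.

```python
from typing import Callable


def write_image_prompt(subject: str) -> str:
    """The text model's job: turn a request into a prompt for the image model."""
    return (
        "A magic eye 3D stereogram that reveals a hidden image of a "
        f"{subject} when viewed correctly."
    )


def caption_for(subject: str, image_model: Callable[[str], bytes]) -> str:
    """Ask the image model for a picture, then caption it without looking."""
    prompt = write_image_prompt(subject)
    image_bytes = image_model(prompt)  # passed along to the user, never inspected
    # Nothing below reads image_bytes, so the caption is built from the
    # request alone and comes out the same whatever the image contains.
    return f"Here is your Magic Eye image. Relax your eyes and a hidden {subject} will appear!"
```

Pass in two completely different stand-in image models, say one that returns `b"random noise"` and one that returns `b"a plain drawing"`, and you get exactly the same confident caption both times.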

The Gemini Angle

Just for kicks, I tried this with Google’s Gemini (previously known as Bard), which also boasts an image generation model called Imagen 2. Gemini tries its hand at Magic Eye images, too, but let’s just say subtlety isn’t its strong suit.

What’s amusing is that Gemini has an image description model it can call on. You’d think this means it would recognize its own art’s shortcomings, right? Nope. If you start a fresh chat and submit one of its “hidden” dove images, you get a straightforward description of a dove—no mention of Magic Eye illusions.

My hunch? Gemini’s image recognition kicks in only when a user uploads an image, and the rest of the time it works from text alone. That’s like having an app on your phone that only wakes up for certain tasks. Handy sometimes, but not exactly seamless.

One Model, Multiple “Apps”

This brings me to a broader point: AI models like ChatGPT and Gemini might seem like all-powerful, single programs, but they’re really like a phone filled with specialized apps. Text generation can prompt image generation, but they’re distinct entities, not an all-encompassing AI brain.

Even when ChatGPT is just generating text and technically could check its own work, it tends to assume it nailed it. Take ASCII art, for example. When I asked ChatGPT to generate some ASCII art lettering and then read it back, it amusingly misread its own creation as “ERROR.” Yep, it’s just as bad at validating ASCII art.

Wrapping Up Our 3D Illusion Fun

Generating 3D illusions and ASCII art might not be groundbreaking, but it’s a quirky reminder: these models don’t fully grasp what we’re asking for, and they don’t reliably follow the steps needed to deliver it. When ChatGPT gets new capabilities like image recognition or image description, it’s not the text model evolving; it’s just accessing another “app.”

So, if you’re up for one last 3D illusion challenge, try this: I asked for a super tricky one that might stump you at first. When you finally see it, you’ll recognize a… sea turtle!

Ready to dive into your own experiments, or have a quirky AI adventure of your own to report? Share it in the comments below! Let’s geek out together.
