OpenAI's AI Image Update Solves Text & Consistency

OpenAI just quietly dropped a massive update that might finally fix the biggest headache in AI image generation: consistency. While the world was distracted by Google’s recent momentum, the team at OpenAI released a new default model, GPT Image 1.5, and completely overhauled how users interact with visuals inside ChatGPT.

This isn’t just a minor speed boost; it represents a fundamental shift in how the platform handles creative workflows. I just watched a fantastic breakdown from Matt Wolfe, who dove deep into this release, bypassing the marketing fluff to test what the model can actually do. He spent time running side-by-side comparisons against Google’s top model, Nano Banana Pro, to see if OpenAI has reclaimed the throne. What he found was a mix of incredibly useful features that were strangely buried in the announcement and significant improvements in how the AI understands complex, multi-step instructions.

The most immediately noticeable change is the user experience. Image generation is no longer treated as a side project; it now has a dedicated “Images” tab in the sidebar. Wolfe points out that this new interface is clearly designed for people who aren’t prompt engineers. It offers a visual menu of preset styles, such as “pop art,” “sugar cookie,” “claymation,” or “doodle,” so users can click a vibe rather than struggle to describe it. But the real power lies in a feature OpenAI barely mentioned: Likeness Retention. Wolfe discovered that you can now upload a photo (like a selfie) once and instruct ChatGPT to remember that appearance for future generations. This eliminates the tedious need to upload your headshot every single time you want to generate a new thumbnail or stylized portrait.

Here is a deeper look at the three critical performance tests Wolfe ran to see how GPT Image 1.5 stacks up against Google’s model.

📌 The “Chain of Thought” Editing Test

One of the biggest frustrations with AI images is that the model often forgets what it’s doing halfway through an editing process. Wolfe ran a multi-step experiment to test this. He started with a generic stock photo of a family eating dinner. Then, he issued a series of sequential commands: add snow outside the window, age everyone up by 15 years, change the style to an oil painting, replace the kitchen with a log cabin, and finally, remove everyone except one person.

This is where GPT Image 1.5 really shone compared to Google’s model. While Google’s Nano Banana Pro generated arguably prettier, more photorealistic individual images, it completely lost the plot by the end. When asked to switch to a cabin, Google reverted the style back to a photo. When asked to remove people, it forgot the cabin environment entirely. In contrast, OpenAI’s model maintained the context through every single step. It remembered the snow, the aging (mostly), the oil painting style, and the cabin setting all the way to the final image.

The takeaway here is clear: if you need to iterate on a design without starting over, GPT Image 1.5 has significantly better “memory” for the conversation.

📌 The Text and Layout Precision Test

For designers and marketers, generating legible text has always been a nightmare. To test this, Wolfe asked both models to generate a photorealistic image of a laptop displaying a pricing page. The prompt was demanding: it needed a bold header, a comparison table with three rows, a footnote with an asterisk, and fine print legal text, all readable and correctly spelled.

The difference was night and day. OpenAI’s model nailed it. The text on the screen was crisp, the spelling was perfect, and even the tiny legal disclaimer at the bottom was legible. You could actually read the terms of service. Google’s model, on the other hand, produced what looked like a blurry, upscaled mess. While the general layout was there, the text looked like “pixel soup” that had been artificially sharpened.

For anyone building UI mockups, pitch decks, or marketing assets where text placement matters, OpenAI was the clear winner in this test.

📌 The Identity and Branding Consistency Test

Finally, the reviewer wanted to see how well the model could handle branding and specific likenesses across different styles. He uploaded a photo of himself and asked for three distinct portraits: a Renaissance painting, an 80s studio photo, and a modern cinematic film still. GPT Image 1.5 successfully produced three separate, distinct images that clearly looked like him while perfectly adhering to the requested eras. Google’s model struggled here, oddly blending the styles together into a single confusing composition.

He also ran a brand consistency test, asking for a coffee shop logo to be applied to a storefront sign, a cup, and a bag of beans. Both models actually performed well here, maintaining the logo perfectly. However, OpenAI grouped them into a single cohesive brand image, while Google split them into separate files.

The verdict is that while Google might still have a slight edge in raw photorealism for things like crowds and textures, OpenAI has built a far superior tool for getting actual work done. The ability to follow complex instructions, render perfect text, and remember context makes it the smarter choice for professionals.

If you want to see the side-by-side visual comparisons and the exact prompts used to break these models, you definitely need to watch the full breakdown!

Check out the full video here.
