Apple has made a move that has raised eyebrows across the AI research world – and not without reason. Under the playful title “Pico-Banana-400K”, the company has released a dataset of more than 400,000 images, freely available for anyone to use in research. Its purpose: to train the next generation of image-editing AI systems. The twist? Much of the work behind it was done by a Google AI.
At its heart, the project explores how machines can learn to edit images through natural language prompts – essentially turning Photoshop into something you control with words rather than clicks. To teach a model what it means to “soften the light” or “add a smile”, researchers need vast numbers of examples. That’s precisely what Apple’s new dataset provides: a huge collection of scenes, objects and edits, each paired with carefully crafted textual instructions.
To build it, Apple’s research team first turned to the public Open Images dataset, assembling a diverse set of photos featuring people, objects and text-heavy scenes. Then came Google’s Gemini-2.5-Flash, which wrote editing instructions covering 35 different types of edit – the sort of instructions real users might give when prompting an AI to modify an image: change the lighting, add a filter, imitate an artist’s style.
The edits themselves were carried out by another Google model, Nano Banana (officially Gemini 2.5 Flash Image), while a third Google system, Gemini-2.5-Pro, was tasked with judging the results. Only those image edits that accurately reflected the original prompts were counted as successes. The process became a collaborative relay between multiple AIs – a rare example of how machine systems can check and refine one another’s work.
The outcome: 257,000 successful single edits, 72,000 multi-turn sequences of consecutive edits, and 56,000 failed attempts paired with successful ones, with the failures retained deliberately. The Apple researchers argue that AI can learn as much from its mistakes as from its triumphs. The result is a teaching tool not just for producing good edits, but for understanding why others fall short.
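For readers who think in code, the relay described above can be sketched in a few lines of Python. This is only an illustration, not Apple's actual pipeline: the function names, the record fields and the retry limit are all assumptions, and the three models are replaced by toy stand-ins so the example runs on its own. What it shows is the basic shape: one model writes an instruction, a second applies it, a third judges the result, and rejected attempts are kept rather than thrown away.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EditRecord:
    """One attempted edit (hypothetical record layout)."""
    image_id: str
    instruction: str
    edited_image: str
    accepted: bool


@dataclass
class Dataset:
    successes: List[EditRecord] = field(default_factory=list)
    failures: List[EditRecord] = field(default_factory=list)


def build_dataset(
    image_ids: List[str],
    write_instruction: Callable[[str], str],   # stands in for the instruction writer
    apply_edit: Callable[[str, str], str],     # stands in for the editing model
    judge: Callable[[str, str, str], bool],    # stands in for the judging model
    max_attempts: int = 3,                     # assumed retry limit, not from the article
) -> Dataset:
    dataset = Dataset()
    for image_id in image_ids:
        instruction = write_instruction(image_id)
        for _ in range(max_attempts):
            edited = apply_edit(image_id, instruction)
            accepted = judge(image_id, instruction, edited)
            record = EditRecord(image_id, instruction, edited, accepted)
            if accepted:
                dataset.successes.append(record)
                break
            # Rejected edits are stored too, so a model can learn from them.
            dataset.failures.append(record)
    return dataset


# Toy stand-ins so the sketch runs without calling any real model.
_attempts = {}


def toy_write_instruction(image_id: str) -> str:
    return f"soften the light in {image_id}"


def toy_apply_edit(image_id: str, instruction: str) -> str:
    _attempts[image_id] = _attempts.get(image_id, 0) + 1
    return f"{image_id}#attempt{_attempts[image_id]}"


def toy_judge(image_id: str, instruction: str, edited: str) -> bool:
    # Pretend the first attempt on img-b misses the instruction.
    return edited != "img-b#attempt1"


demo = build_dataset(
    ["img-a", "img-b"], toy_write_instruction, toy_apply_edit, toy_judge
)
print(len(demo.successes), "accepted,", len(demo.failures), "rejected but kept")
```

In a real system each stand-in would be a call to a hosted model, and the judge would return a graded score rather than a simple yes or no; the point here is only that success and failure land in separate, equally permanent buckets.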
In their accompanying paper, the Apple team describes this interplay of models as a scalable framework for generating high-quality image edits. The complete dataset is now available on GitHub under a non-commercial research licence, allowing academic and experimental use but excluding commercial exploitation.
Perhaps most striking, though, is what the release symbolises. Apple, long known for its secrecy, has chosen transparency – and, in a sense, collaboration – with a rival’s technology at its core. It’s an unexpected gesture in a fiercely competitive field, hinting that the future of AI might not be defined by isolated innovation, but by systems – and companies – learning side by side.

