AI Model Collapse: Why Real Human Data is Essential for the Future of Artificial Intelligence

23. July 2023

Generative artificial intelligence (AI), represented by tools such as ChatGPT for text and Stable Diffusion for images, has spread rapidly across the digital world and is now available to a wide audience. These technologies make our digital experience more exciting and diverse, but they also carry risks, one of which a team of researchers has highlighted.

AI-generated content is increasingly populating the internet. At the same time, AI companies comb the internet for freely available data to train their language and image models. However, as a study published on arXiv, the preprint server hosted by Cornell University, emphasizes, there is a real danger when the data used to train these models were themselves produced by AI models.

This danger is called “model collapse,” a phenomenon that occurs when AI models are trained, generation after generation, on the output of earlier models. What exactly happens? In the first round, a model loses part of the actual information about the world. With every subsequent generation, the models mix what remains of the real-world information with content they have generated themselves. The result is a steadily growing distortion of reality. A text-based AI trained this way could end up sounding less and less human, the opposite of what was intended.
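The loop described above can be made concrete with a toy simulation. This is only a sketch under our own assumptions, not the experiment from the study: a one-dimensional Gaussian stands in for the "model", and the function name `simulate_collapse` and all parameter values are hypothetical choices made for illustration. Each generation is fitted only to samples drawn from the previous generation's fit.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=20, seed=0):
    """Toy model: refit a Gaussian, over and over, to its own samples.

    Returns a list of (mean, std) pairs, one per generation.
    Generation 0 is the assumed "real" distribution N(0, 1).
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # the "real world" the first model learns from
    history = [(mu, sigma)]
    for _ in range(generations):
        # Training data for this generation comes only from the previous model.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # "Training" here is just estimating mean and standard deviation.
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
        history.append((mu, sigma))
    return history


if __name__ == "__main__":
    history = simulate_collapse()
    for generation in (0, 10, 50, 100, 200):
        mean, std = history[generation]
        print(f"generation {generation:3d}: mean={mean:+.4f} std={std:.6f}")
```

In this toy setting the estimated spread tends to shrink from generation to generation, because every finite sample misses some of the rare values in the tails and nothing ever brings them back. That is a simplified picture of how information about the real distribution can be lost; real language and image models are far more complex.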

So what is the solution? Fresh, human-generated data is the key. That is easier said than done, because it is often unclear whether data found on the internet were produced by humans or by machines. The researchers underscore the need to preserve access to the original data on which models were trained and to regularly refresh the data pool with new, non-AI-generated data.
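The effect of refreshing the data pool can also be sketched in a few lines. Again, this is a toy illustration under our own assumptions rather than the researchers' method: a one-dimensional Gaussian plays the role of the model, the "real" data are assumed to come from N(0, 1), and the function name `train_with_refresh` and the parameter `real_fraction` are hypothetical.

```python
import random
import statistics


def train_with_refresh(real_fraction, generations=200, sample_size=100, seed=0):
    """Toy model: each generation trains on a mix of real and synthetic data.

    real_fraction is the share of each training pool drawn from the
    assumed real distribution N(0, 1); the rest comes from the previous
    generation's fitted Gaussian. Returns the final (mean, std).
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    n_real = round(real_fraction * sample_size)
    for _ in range(generations):
        # Fresh, human-generated data (simulated here by the real distribution).
        real = [rng.gauss(0.0, 1.0) for _ in range(n_real)]
        # Model-generated data from the previous generation.
        synthetic = [rng.gauss(mu, sigma) for _ in range(sample_size - n_real)]
        pool = real + synthetic
        mu = statistics.fmean(pool)
        sigma = statistics.pstdev(pool)
    return mu, sigma


if __name__ == "__main__":
    for fraction in (0.0, 0.1, 0.5):
        mean, std = train_with_refresh(fraction)
        print(f"real_fraction={fraction:.1f}: mean={mean:+.4f} std={std:.4f}")
```

Calling `train_with_refresh(0.0)` corresponds to training on model output alone, while larger values of `real_fraction` keep pulling the fitted distribution back toward the real one. The sketch assumes that real and synthetic samples can be told apart, which is exactly the practical difficulty described above.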

This calls for a coordinated effort from AI communities and companies to make clear which data are of human origin and which were produced by AI models. Unless such measures are taken promptly, training new AI models on genuine human data could become increasingly difficult. Our digital future hinges on how we master this challenge.
