Google Imagen Video – when text is automatically turned into high-definition video

9 October 2022

It’s hard to avoid the headlines about AI-based image creation. Image portals are banning artificially created images, and artists are winning prizes without making a single brush stroke. Now comes the next logical step beyond static images: generated moving images. Meta, OpenAI, Google and many other companies are working on their first solutions, which sometimes leads to funny results.

Imagen, unveiled in May, is already familiar to the web as Google’s way to create images from text. An extension of the tool can now be used to create video as well. As the Imagen team announced in a recent paper, the system is based on a “cascade of diffusion models.” As with the image AI, the system needs only text input, and from that input it generates high-resolution video step by step. After a neural network creates an initial video, additional processing stages follow, each improving spatial fidelity and dynamics. According to the Google team behind Imagen, up to 24 frames per second at an HD resolution of 1280×768 pixels are currently possible.
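To make the idea of a cascade more concrete, here is a minimal Python sketch. It does not run any diffusion model; it only tracks how frame count, resolution and frame rate could grow from stage to stage. The starting size, the number of stages and the upscaling factors are illustrative assumptions, chosen so that the last stage reaches the 1280×768 pixels and 24 frames per second mentioned above. They are not taken from the paper, and all names such as `VideoShape`, `Stage` and `run_cascade` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoShape:
    frames: int
    height: int
    width: int
    fps: int


@dataclass(frozen=True)
class Stage:
    """One hypothetical refinement stage of the cascade."""
    name: str
    time_factor: int = 1   # multiplies frame count and frame rate
    space_factor: int = 1  # multiplies height and width

    def apply(self, video: VideoShape) -> VideoShape:
        return VideoShape(
            frames=video.frames * self.time_factor,
            height=video.height * self.space_factor,
            width=video.width * self.space_factor,
            fps=video.fps * self.time_factor,
        )


def run_cascade(base: VideoShape, stages: list[Stage]) -> list[VideoShape]:
    """Return the video shape after the base model and after each stage."""
    shapes = [base]
    for stage in stages:
        shapes.append(stage.apply(shapes[-1]))
    return shapes


# Assumed values for illustration only: a small, low-frame-rate base video.
BASE = VideoShape(frames=16, height=48, width=80, fps=3)

# Assumed stage order and factors for illustration only.
STAGES = [
    Stage("temporal upsampling 1", time_factor=2),
    Stage("spatial upsampling 1", space_factor=4),
    Stage("temporal upsampling 2", time_factor=4),
    Stage("spatial upsampling 2", space_factor=4),
]

if __name__ == "__main__":
    labels = ["base model"] + [stage.name for stage in STAGES]
    for label, shape in zip(labels, run_cascade(BASE, STAGES)):
        print(f"{label}: {shape.frames} frames, "
              f"{shape.width}x{shape.height} px, {shape.fps} fps")
```

The point of the sketch is the division of labor: the first model only has to get a small, coarse clip right, and each later stage adds either more frames or more pixels to what it receives.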

The exact mathematical and technical parameters behind the tech company’s latest development can be read in detail in the paper. It also describes what the team learned from using the text encoder T5-XXL, which derives meaning and task from the text prompts entered.

As with the image AI Imagen, the video tool is not yet available for public testing. The team justifies this with safety concerns, since problematic content still has to be sorted out to prevent potential misuse. That seems urgently necessary: the assumption is that the model was trained on freely available images from the Internet, so explicit content must first be filtered out.

But where is the journey heading? Is it the swan song of artists and creative professionals, now that films, like artworks, can be created fully automatically? Or will it rather be a tool that opens up completely new possibilities?
