Artificial Intelligence (AI) is transforming the realms of imagination and reality alike. The recent introduction of Sora by OpenAI, a text-to-video AI generator, is a testament to this.
The latest in this line of innovation is 'Genie', an interactive 2D video game creation model unveiled by Google DeepMind. Genie is an AI model that can generate playable video games from a single image prompt or text description.
This project, developed by Google DeepMind's Open-Endedness Team, has the potential to revolutionise entertainment, game development, and even robotics. Genie, described as a 'world model', was trained on a large dataset of 200,000 hours of unlabelled video footage, primarily from 2D platformer games. Unlike models that depend on labelled actions, Genie infers the actions and interactions from the videos themselves.
Genie comprises three core components: the Video Tokenizer, the Latent Action Model, and the Dynamics Model. The Video Tokenizer processes video data into manageable units, or 'tokens'. The Latent Action Model analyses transitions between consecutive frames in the videos, identifying eight fundamental actions.
The Dynamics Model predicts the next frame in the video sequence, taking into account the current state of the game world and the action chosen by the player, and generates the subsequent visual result. Repeating this process frame by frame creates the illusion of an interactive game experience.
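The way the three components fit together can be sketched in a few lines of Python. This is a toy illustration only, not DeepMind's implementation: in Genie each component is a learned neural network, whereas the functions here (`tokenize`, `infer_latent_action`, `predict_next`, `play`), the `CHUNK` constant, and the one-dimensional "frames" are hypothetical stand-ins chosen to keep the example small. Only the figure of eight latent actions comes from the description above.

```python
from typing import List

NUM_ACTIONS = 8  # the article reports eight latent actions
CHUNK = 2        # hypothetical: number of pixels folded into one token


def tokenize(frame: List[int]) -> List[int]:
    """Toy stand-in for the Video Tokenizer: sum each chunk of pixels into one token."""
    return [sum(frame[i:i + CHUNK]) for i in range(0, len(frame), CHUNK)]


def infer_latent_action(prev_tokens: List[int], next_tokens: List[int]) -> int:
    """Toy stand-in for the Latent Action Model.

    Labels the transition between two consecutive frames by how far the
    brightest token moved, folded into one of NUM_ACTIONS discrete codes.
    """
    shift = next_tokens.index(max(next_tokens)) - prev_tokens.index(max(prev_tokens))
    return shift % NUM_ACTIONS


def predict_next(tokens: List[int], action: int) -> List[int]:
    """Toy stand-in for the Dynamics Model: rotate the tokens right by `action` positions."""
    k = action % len(tokens)
    return tokens[-k:] + tokens[:-k] if k else list(tokens)


def play(first_frame: List[int], actions: List[int]) -> List[List[int]]:
    """Roll out a 'game' from one prompt frame and a sequence of player actions."""
    tokens = tokenize(first_frame)
    rollout = [tokens]
    for action in actions:
        tokens = predict_next(tokens, action)  # each new frame depends on state + action
        rollout.append(tokens)
    return rollout
```

Calling `play([0, 0, 9, 9] + [0] * 12, [3, 7])` turns a 16-pixel prompt frame into eight tokens and then produces two further frames, one per action. In this toy setting, applying `infer_latent_action` to any two consecutive frames of the rollout recovers the action that produced the second one, which mirrors the idea of learning actions from unlabelled footage.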
Notably, Genie is still under development and comes with limitations.
However, once Genie is released, it is expected to revolutionise creativity across numerous domains. Its ability to generate interactive worlds from minimal input will open the door to exciting possibilities in the future of entertainment, education, and beyond.
Copyright©2024 Living Media India Limited. For reprint rights: Syndications Today