Just yesterday, Xbox from Microsoft rolled out Muse, which they describe as “a generative AI model crafted for gameplay ideation.” Accompanying this launch was a scholarly article on Nature.com, a blog post, and a YouTube video. Now, if you’re scratching your head over what “gameplay ideation” entails, Microsoft clarifies it as the process of generating “game visuals, controller actions, or both.” However, its practical implications seem rather constrained and don’t come anywhere near bypassing the complex phases of actual game development.
Nonetheless, some of the details surrounding it are truly captivating. The training utilized a massive scale on H100 GPUs, requiring a whopping million updates just to stretch a single second of real gameplay into an extra nine seconds of dynamic and engine-faithful simulated gameplay. Most of the training data was sourced from existing multiplayer gameplay sessions, which is worth noting.
This wasn’t as simple as running a game on one regular PC. Microsoft needed a hefty cluster of 100 Nvidia H100 GPUs, which significantly raises the stakes in terms of cost and energy consumption. And despite such extensive resources, it only drops a resolution of 300×180 pixels for those added nine seconds of gameplay simulation.
Shifting focus, one of the standout demonstrations with Muse was its ability to replicate existing props and foes within a game environment, mimicking their capabilities—it’s tough not to question this approach, considering the hefty hardware expenses and power consumption to achieve something that traditional development tools handle with ease.
There’s a certain novelty in witnessing Muse keeping the essence of object permanence and mirroring the game’s inherent behaviors, yet, for practical purposes, it doesn’t hold a candle to the effectiveness of conventional game development processes.
Despite the potential Muse might show down the line, it currently lands among numerous other projects attempting to fabricate entire gameplay experiences through AI. It maintains a noteworthy level of engine fidelity and object permanence, but the method feels inefficient for game development, testing, or playing. After delving deeply into the available literature, it’s hard to fathom why anyone would choose this route.