
Part 1: Build dynamic and engaging mobile games with multi-agent reinforcement learning

Koki Mitsunami
August 16, 2023
3 minute read time.
Part 1 of 3 Blog Series


In March 2023, the Game Developers Conference (GDC), one of the biggest events for video game developers, was held in San Francisco. Last year, we showcased a one-on-one boss battle featuring a knight character controlled by an ML-based game AI. That was a single-agent system; this year, we expanded the scope to multi-agent systems, presenting a talk and demo called "Candy Clash." Using the Unity ML-Agents Toolkit, we developed a multi-agent system in which dozens of rabbit characters work as a team, aiming to crack their opponent's egg.

In this blog series, I'll explain how we developed this game demo. Part 1 provides a general overview of the demo.

We hope that this blog series will interest many game developers in machine learning technology.

Candy Clash

In the Candy Clash demo, numerous rabbit characters split into two teams and act according to the situation, aiming to crack each other's eggs. The rabbit characters' actions are selected by their assigned neural network (NN) models, and the game was developed to demonstrate how agents behave as a group. Below is a screenshot of the game.


Figure 1. Candy Clash demo

The objective of this demo is to either crack the opponent's egg or defeat all of the opponent's rabbits. The gauge at the top of the game screen shows each egg's hit points (HP); below it, a second gauge shows the number of rabbits remaining on each team. Cannons fire toward areas with a high concentration of rabbits, either at regular intervals or when the user presses a button. The cannons are controlled by human-programmed logic rather than ML, but because they target crowded areas, they indirectly affect the game's outcome. At GDC, we ran this demo on a Pixel 7 Pro and achieved 60 fps with 100 ML agents.
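The cannons' scripted targeting could look something like the grid-count heuristic below. This is a hypothetical stand-in written for illustration; the demo's actual cannon logic is not shown in this post.

```python
from collections import Counter

# Illustrative sketch of scripted (non-ML) cannon targeting:
# bucket rabbit positions into grid cells and aim at the densest cell.
def densest_cell(rabbit_positions, cell=5.0):
    """Return the (x, z) grid cell containing the most rabbits."""
    counts = Counter((int(x // cell), int(z // cell)) for x, z in rabbit_positions)
    return counts.most_common(1)[0][0]

# Three rabbits cluster near the origin, one is far away:
positions = [(1, 1), (2, 1), (1, 2), (12, 12)]
print(densest_cell(positions))  # -> (0, 0)
```

A real implementation would also rate-limit firing and convert the chosen cell back into a world-space aim point, but the crowd-seeking idea is the same.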

ML Agents and multi-agent scenarios

The ML-Agents Toolkit enables game developers to train intelligent game AI agents within games and simulations. You can create more realistic and engaging gameplay experiences because non-player characters (NPCs) can learn from their surroundings and react more naturally to player inputs. Agents are trained through Reinforcement Learning (RL) using data gathered from their environment. This learning enables agents to improve over time and make decisions based on what they have learned. Our previous blog post introduces the basic mechanism used for ML-Agents, and Unity's official documentation gives more details.
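To make the RL loop concrete, here is a minimal tabular Q-learning sketch: an agent repeatedly acts, observes a reward, and updates its value estimates until a good policy emerges. This is a toy stand-in for what the toolkit automates; ML-Agents itself trains neural-network policies, and the corridor environment, states, and rewards here are invented for illustration.

```python
import random

# Toy RL loop: an agent on a 1-D corridor (states 0..4) learns to walk
# right toward a rewarded goal state using tabular Q-learning.
N_STATES = 5          # reaching state 4 ends the episode with reward 1
ACTIONS = [-1, +1]    # action 0 = step left, action 1 = step right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        s = 0
        while s != N_STATES - 1:
            # Epsilon-greedy: mostly exploit, occasionally explore.
            a = rng.randrange(2) if rng.random() < EPS else max((0, 1), key=lambda i: q[s][i])
            s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
            r = 1.0 if s2 == N_STATES - 1 else 0.0
            # Q-learning update toward reward plus discounted best next value.
            q[s][a] += ALPHA * (r + GAMMA * max(q[s2]) - q[s][a])
            s = s2
    return q

q = train()
# The learned greedy policy steps right in every non-goal state.
policy = [max((0, 1), key=lambda i: q[s][i]) for s in range(N_STATES - 1)]
print(policy)  # -> [1, 1, 1, 1]
```

ML-Agents replaces the hand-written table and update rule with configurable trainers and neural networks, but the underlying act-observe-update cycle is the same.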

Multi-agent systems involve multiple ML agents working together to achieve a common goal. This scenario can potentially solve problems that are difficult for single-agent systems. This approach can lead to more complex and dynamic gameplay experiences, because different agents can take on different roles and cooperate or compete with each other. For example:

  • In strategy games, ML agents can be responsible for controlling different units. This makes each game session more unpredictable and challenging.
  • In racing games, ML agents can work together to create a more competitive environment. Each agent can adopt various racing strategies to outperform the others.

By incorporating multi-agent systems, developers can create games that keep players engaged with novel experiences.

Design approaches for multi-agent systems

There are several possible approaches to developing a multi-agent system. The right one depends on:

  • The type of game you are developing
  • The resources available
  • Your implementation ideas

When the game setting is simple, you might create multiple instances of a single trained agent. In more complex settings, you might want a centralized agent that controls all the characters. In recent years, a framework called Centralized Training, Decentralized Execution (CTDE) has emerged: agents share data during training to learn optimal actions, but during inference they act independently without sharing data with each other. Another blog post explains how multi-agent training works in Unity.


Figure 2. Various possible design approaches for multi-agents
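The CTDE idea can be sketched in a few lines: all agents' experience updates one shared policy during training, while at execution time each agent queries that policy with only its own local observation. This is a toy Python stand-in with invented observations and rewards, not ML-Agents' actual implementation.

```python
from collections import defaultdict

class SharedPolicy:
    """One policy learned from all agents' pooled experience."""
    def __init__(self):
        self.value = defaultdict(float)  # (observation, action) -> estimated return

    def update(self, obs, action, reward, lr=0.1):
        # Centralized training: transitions from every agent feed the same estimates.
        self.value[(obs, action)] += lr * (reward - self.value[(obs, action)])

    def act(self, obs, actions):
        # Decentralized execution: only this agent's local observation is used.
        return max(actions, key=lambda a: self.value[(obs, a)])

policy = SharedPolicy()
ACTIONS = ["advance", "retreat"]

# Pooled experience from two different agents in the same situation:
for _ in range(20):
    policy.update("enemy_near", "retreat", reward=1.0)   # agent A's experience
    policy.update("enemy_near", "advance", reward=-1.0)  # agent B's experience

# At inference, each agent decides independently from its local view:
print(policy.act("enemy_near", ACTIONS))  # -> retreat
```

Because every agent contributes to and reads from the same policy, lessons learned by one rabbit immediately benefit the whole team, yet no inter-agent communication is needed at run time.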

In Candy Clash, the rabbit characters take on three roles:

  • Attacker: The attacker's goal is to crack the opponent's egg.
  • Defender: The defender's goal is to protect their own egg.
  • Wanderer: The wanderer's goal is to defeat the opposing rabbits.

A planner agent dynamically assigns a role to each rabbit character. Through this combination of planner and roles, the rabbits' behavior changes and adapts in real time to the game situation.


Figure 3. Our approach for Candy Clash
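As a rough illustration of the planner-plus-roles idea, a rule-based stand-in might reassign roles from coarse game state as below. The thresholds and role splits are invented for this sketch; the demo's actual planner is an agent whose decision logic is not detailed in this part.

```python
# Hypothetical rule-based planner: split rabbits between roles depending
# on which team's egg is more at risk. Purely illustrative.
def assign_roles(n_rabbits, own_egg_hp, enemy_egg_hp):
    """Return a role name for each of n_rabbits based on egg HP."""
    if own_egg_hp < enemy_egg_hp:        # we are losing: protect our egg
        split = {"Defender": 0.6, "Attacker": 0.2, "Wanderer": 0.2}
    else:                                # we are ahead: press the attack
        split = {"Attacker": 0.6, "Defender": 0.2, "Wanderer": 0.2}
    roles = []
    for role, frac in split.items():
        roles += [role] * int(n_rabbits * frac)
    roles += ["Wanderer"] * (n_rabbits - len(roles))  # fill rounding remainder
    return roles

# Our egg is at 30 HP versus the enemy's 80, so most rabbits defend:
roles = assign_roles(10, own_egg_hp=30, enemy_egg_hp=80)
print(roles.count("Defender"))  # -> 6
```

Replacing these fixed rules with a learned planner lets the role mix adapt to situations a designer never anticipated, which is what gives the demo its dynamic group behavior.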

In Part 2, I explore the agents' design in more detail.
