Why Most Game Automation Approaches Eventually Reduce to Two Models

When engineers first encounter game automation, they usually start by looking for a game-development equivalent of Selenium or Playwright.

The logic seems straightforward:

Web development has Selenium.
Mobile development has Espresso and XCUITest.
Therefore, games should have a similar standard automation tool.

In practice, such a tool still does not exist.

After working on several game projects, I’ve noticed that most solutions eventually reduce to two fundamental models:

Engine Integration
Visual Validation

Almost every other approach is either a variation of one of these models or a combination of both.

Why Game Development Ended Up in a Different Place

Web automation relies on the DOM.

Mobile automation relies on the native view hierarchy and accessibility frameworks.

In both cases, automation tools have a standardized way of interacting with the application: they can locate elements, read their state, and perform actions.

Games are different.

They may use Unity, Unreal, proprietary engines, or entirely custom UI frameworks. From the perspective of an external tool, a game often looks like nothing more than a collection of pixels.

There is:

no common interaction protocol;
no universal element tree;
no standard that works consistently across different projects.

This is one of the main reasons why the industry has never produced a true “Selenium for games.”

Model #1: Engine Integration

The first model is based on direct interaction with the game's internals rather than with its rendered output.

This is typically achieved by exposing a dedicated integration layer:

HTTP APIs;
WebSockets;
RPC;
debug commands;
custom automation drivers.

Instead of interacting with the game through the screen, automation gains access to the internal state of the system.

For example:

GetPlayerState()
GetQuestStatus()
GetInventory()
GetObjectProperty()

Compared with driving the game through the screen, this approach is usually:

faster;
more stable;
easier to maintain;
easier to debug.

However, it requires investment from the development team, which has to build the integration layer and keep it in sync with the game.

For smaller studios, building and maintaining such infrastructure may simply be too expensive.
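To make the idea of an integration layer more concrete, here is a minimal sketch in Python. It assumes a hypothetical JSON request/response protocol; the `EngineDriver` class, the `fake_game` stand-in, and the message shape are illustrative assumptions, not the API of any real engine or tool. Only the method names GetPlayerState and GetQuestStatus come from the list above.

```python
import json
from typing import Any, Callable

# Hypothetical transport: takes a JSON request string, returns a JSON response string.
# In a real project this could wrap an HTTP call, a WebSocket, or an RPC channel.
Transport = Callable[[str], str]


class EngineDriver:
    """Thin client for a hypothetical in-game automation endpoint."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def call(self, method: str, **params: Any) -> Any:
        # Serialize the request, send it, and unwrap the result or raise on error.
        request = json.dumps({"method": method, "params": params})
        response = json.loads(self._transport(request))
        if "error" in response:
            raise RuntimeError(f"{method} failed: {response['error']}")
        return response["result"]

    def get_player_state(self) -> dict:
        return self.call("GetPlayerState")

    def get_quest_status(self, quest_id: str) -> str:
        return self.call("GetQuestStatus", quest_id=quest_id)


def fake_game(request: str) -> str:
    """Stand-in for the game side so the sketch runs without a real build."""
    handlers = {
        "GetPlayerState": lambda p: {"level": 5, "hp": 100},
        "GetQuestStatus": lambda p: "completed" if p["quest_id"] == "intro" else "active",
    }
    req = json.loads(request)
    handler = handlers.get(req["method"])
    if handler is None:
        return json.dumps({"error": "unknown method"})
    return json.dumps({"result": handler(req["params"])})


driver = EngineDriver(fake_game)
player = driver.get_player_state()
intro_status = driver.get_quest_status("intro")
```

Keeping the transport behind a single callable means the same test code could, in principle, run against an HTTP endpoint, a WebSocket, or an in-process fake like the one above.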

Why Larger Teams Often Build Their Own Solutions

There are existing tools on the market.

A good example is AltTester for Unity.

For smaller Unity teams, tools like AltTester can be an excellent starting point. They allow teams to introduce automation without building an entire platform from scratch.

As projects grow, however, requirements tend to move beyond simple object interaction.

Teams eventually need to:

access internal game state;
collect diagnostic information;
integrate with internal services;
interact with game mechanics;
control testing scenarios.

At that point, a custom integration layer often provides more flexibility than a generic solution.

This is especially true for long-lived projects that involve dozens or hundreds of engineers.

Model #2: Visual Validation

The second model focuses on the visual representation of the game.

Typical techniques include:

screenshot testing;
visual regression testing;
image comparison;
OCR;
object recognition.

Visual Validation answers a simple question:

Does the game look correct to the player?

These tests help identify:

rendering issues;
visual artifacts;
missing objects;
UI defects;
incorrect content presentation.

Visual Validation is particularly important in games because a significant portion of the player experience is tied directly to what appears on the screen.
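As an illustration of the simplest of these techniques, image comparison, here is a minimal sketch in Python. It assumes screenshots are already decoded into nested lists of RGB tuples; the function names, the per-channel `tolerance`, and the `max_diff_ratio` threshold are illustrative assumptions. A real pipeline would typically rely on an imaging library and on perceptual metrics rather than raw pixel differences.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def diff_ratio(baseline: Image, actual: Image, tolerance: int = 8) -> float:
    """Return the fraction of pixels where any channel differs by more than `tolerance`."""
    if len(baseline) != len(actual) or any(
        len(b) != len(a) for b, a in zip(baseline, actual)
    ):
        raise ValueError("images must have the same dimensions")
    total = 0
    changed = 0
    for row_b, row_a in zip(baseline, actual):
        for pb, pa in zip(row_b, row_a):
            total += 1
            if any(abs(cb - ca) > tolerance for cb, ca in zip(pb, pa)):
                changed += 1
    return changed / total if total else 0.0


def looks_the_same(baseline: Image, actual: Image,
                   max_diff_ratio: float = 0.01, tolerance: int = 8) -> bool:
    """Pass when the share of changed pixels stays within the allowed budget."""
    return diff_ratio(baseline, actual, tolerance) <= max_diff_ratio


# A 4x4 gray baseline, a copy with slight noise, and a copy with one broken pixel.
baseline = [[(128, 128, 128)] * 4 for _ in range(4)]
noisy = [[(130, 127, 129)] * 4 for _ in range(4)]
broken = [row[:] for row in baseline]
broken[0][0] = (255, 0, 0)
```

The tolerance absorbs small rendering noise such as compression or anti-aliasing differences, while the ratio threshold decides how much of the frame is allowed to change before the check fails.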

Engine and Visual Are Not Competing Approaches

After learning about these two models, the natural question is:

Which one is better?

In practice, this is often the wrong question.

Engine Integration and Visual Validation solve different problems.

They are not alternatives to one another.

They validate different aspects of product quality.

Engine Integration answers:

Does the game behave correctly?

For example:

Can the player collect an artifact?
Is experience awarded correctly?
Does a quest complete successfully?
Is the economy calculated correctly?
Does the game world transition into the expected state?

Engine-based tests work with the logic and state of the system.

Visual Validation answers:

Does the game look correct?

For example:

Is the scene rendered correctly?
Are all objects visible?
Is the UI positioned correctly?
Are visual effects displayed as expected?

Visual tests work with the presentation layer of the system.

This is why one approach cannot replace the other.

From the perspective of game logic, a character may successfully collect an artifact even though the artifact model was never rendered on the screen.

Likewise, a scene may look perfect while rewards are not granted or a quest never completes.
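The two failure modes above can be sketched as a check that reports logic and presentation separately. This is only an illustration: `PickupCheck`, `check_artifact_pickup`, and the idea that an object-recognition step returns a set of labels seen on screen are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class PickupCheck:
    logic_ok: bool   # engine state says the artifact was collected
    visual_ok: bool  # the artifact was recognized on the captured screenshots

    @property
    def passed(self) -> bool:
        return self.logic_ok and self.visual_ok


def check_artifact_pickup(artifact_id: str,
                          inventory: Set[str],
                          seen_on_screen: Set[str]) -> PickupCheck:
    """Combine an engine-side check with a (hypothetical) visual detection result."""
    return PickupCheck(
        logic_ok=artifact_id in inventory,
        visual_ok=artifact_id in seen_on_screen,
    )


# Logic succeeded, but the artifact never appeared on screen.
invisible = check_artifact_pickup("amulet", inventory={"amulet"}, seen_on_screen=set())
# The scene looked right, but the item was never granted.
not_granted = check_artifact_pickup("amulet", inventory=set(), seen_on_screen={"amulet"})
```

Reporting the two results separately makes it clear which side of the product failed, instead of collapsing both into a single pass or fail.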

A Mature Automation Architecture

In successful projects, automation responsibilities are often separated into distinct layers.

Engine Driver

Provides access to the internal state of the game.

GetPlayerState()
GetObjectProperty()
GetQuestStatus()

Test Harness / Cheats

Responsible for preparing testing scenarios and game states.

SetPlayerLevel()
GrantCurrency()
UnlockCharacter()
TriggerEvent()
ResetAccount()

Visual Validation Layer

Responsible for validating visual output.

screenshot comparison;
visual regression testing;
UI validation;
layout validation.

Each layer serves a different purpose and can evolve independently.
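One way to picture this separation is as three small interfaces that a test scenario depends on. The sketch below is an assumption-laden illustration in Python: the `Harness` and `VisualValidator` names, the `run_shop_scenario` function, and the in-memory fakes are all hypothetical and exist only so the example runs on its own.

```python
from typing import List, Protocol


class EngineDriver(Protocol):
    def get_player_state(self) -> dict: ...


class Harness(Protocol):
    def set_player_level(self, level: int) -> None: ...
    def grant_currency(self, amount: int) -> None: ...


class VisualValidator(Protocol):
    def matches_baseline(self, screen_name: str) -> bool: ...


def run_shop_scenario(harness: Harness, driver: EngineDriver,
                      visual: VisualValidator) -> List[str]:
    """Prepare state, then check logic and presentation; return a list of failures."""
    harness.set_player_level(10)
    harness.grant_currency(500)
    failures: List[str] = []
    state = driver.get_player_state()
    if state.get("level") != 10 or state.get("currency", 0) < 500:
        failures.append("state: player not prepared as expected")
    if not visual.matches_baseline("shop_screen"):
        failures.append("visual: shop screen differs from baseline")
    return failures


class InMemoryGame:
    """Fake game that plays both the harness and the engine driver roles."""

    def __init__(self) -> None:
        self._state = {"level": 1, "currency": 0}

    def set_player_level(self, level: int) -> None:
        self._state["level"] = level

    def grant_currency(self, amount: int) -> None:
        self._state["currency"] += amount

    def get_player_state(self) -> dict:
        return dict(self._state)


class StaticVisual:
    """Fake visual layer that treats a fixed set of screens as matching."""

    def __init__(self, matching: set) -> None:
        self._matching = set(matching)

    def matches_baseline(self, screen_name: str) -> bool:
        return screen_name in self._matching


game = InMemoryGame()
failures = run_shop_scenario(game, game, StaticVisual({"shop_screen"}))
```

Because the scenario only depends on the three interfaces, each layer can be replaced or reimplemented without touching the tests that use it.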

Why Everything Still Reduces to These Two Models

There is nothing fundamentally new about game automation.

The same underlying ideas have existed in web, mobile, and enterprise software testing for years.

The difference is not in the approaches themselves.

The difference is the lack of a common standard.

The web converged around the DOM.

Mobile platforms converged around the native view hierarchy and accessibility frameworks.

Game development has never converged around a shared automation protocol.

As a result, most solutions eventually revolve around two core ideas:

access the internal state of the game;
validate the visual representation of the game.

In my opinion, this is one of the reasons why the industry still does not have a universal “Selenium for games.”

The problem is not a lack of tools or ideas.

The problem is that games are both complex software systems and highly visual products.

Validating those two aspects of quality still requires different approaches.