Simulation And Rival Hero AI¶

Purpose¶

This document defines the official direction for gameplay AI that must behave like a player, an enemy commander, or a rival hero while remaining deterministic and simulation-friendly.

It covers:

simulator player policies
enemy and boss combat AI structure
rival hero AI structure
policy families for balance testing
telemetry expectations for AI-driven play

Core Rule¶

Gameplay AI is not an LLM problem.

Gameplay AI should be implemented through deterministic policy systems, scripted phase logic, heuristic evaluation, and seeded randomness where explicitly permitted by the rules.

This keeps simulation reproducible: the same seed and the same inputs yield the same decisions, so combat behavior can be replayed and tested.
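As a minimal illustration of seeded randomness, the Python sketch below derives every random choice from an explicit seed and turn number. It is a sketch under stated assumptions, not the project's implementation: Python is used only for illustration, and `pick_action`, the seed-mixing constant, and the action names are hypothetical.

```python
import random


def pick_action(legal_actions, seed, turn):
    """Pick one action using an RNG derived only from (seed, turn)."""
    # A private Random instance keeps this choice independent of global
    # RNG state, so unrelated code cannot perturb the sequence.
    rng = random.Random(seed * 1_000_003 + turn)
    # Sorting removes any dependence on the caller's input ordering.
    return rng.choice(sorted(legal_actions))


actions = ["strike", "guard", "channel"]
run_a = [pick_action(actions, seed=42, turn=t) for t in range(5)]
run_b = [pick_action(actions, seed=42, turn=t) for t in range(5)]
assert run_a == run_b  # same seed and inputs, same decisions
```

Because the RNG is built from explicit inputs rather than shared state, two runs with the same seed can be compared decision by decision.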

The Simulator As A Third UX¶

EchoSpire has three major gameplay-facing execution surfaces:

Console client
Unity client
Simulation engine

The simulator is a real UX surface even though it has no player-facing presentation.

Its job is to:

act like different kinds of players
reveal balance outliers
test content safely at scale
stress rival hero and objective logic
compare policy quality under shared seeds

Policy Interface Direction¶

The current IDecisionPolicy shape is the correct starting point.

It should remain responsible for:

combat card play selection
path selection
reward selection
ability use decisions

It should be extended over time with richer context, not replaced by agent calls.
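The four responsibilities above could be sketched as follows. This is a hypothetical Python rendering for illustration only; the real IDecisionPolicy signature is not shown in this document, so every method name, parameter, and the `DecisionContext` type are assumptions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DecisionContext:
    """Hypothetical read-only snapshot; richer fields can be added over time."""
    seed: int
    turn: int
    hero_hp: int
    protectable_hp: Optional[int] = None  # None when no protectable exists


class DecisionPolicy(ABC):
    """Illustrative stand-in for IDecisionPolicy; method names are assumed."""

    @abstractmethod
    def choose_card(self, ctx: DecisionContext, playable: Sequence[str]) -> Optional[str]: ...

    @abstractmethod
    def choose_path(self, ctx: DecisionContext, paths: Sequence[str]) -> str: ...

    @abstractmethod
    def choose_reward(self, ctx: DecisionContext, rewards: Sequence[str]) -> Optional[str]: ...

    @abstractmethod
    def should_use_ability(self, ctx: DecisionContext, ability: str) -> bool: ...


class FirstOptionPolicy(DecisionPolicy):
    """Trivial deterministic policy, useful only as a wiring check."""

    def choose_card(self, ctx, playable):
        return playable[0] if playable else None

    def choose_path(self, ctx, paths):
        return paths[0]

    def choose_reward(self, ctx, rewards):
        return rewards[0] if rewards else None

    def should_use_ability(self, ctx, ability):
        return False
```

Extending the context object rather than the method list is one way to add richer information without changing the interface that every policy implements.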

Recommended future extensions include:

optional mulligan or starting-hand logic
explicit protectable-priority evaluation
relic purchase or service choice logic
retreat, risk, or contract acceptance logic when those systems exist

Player Policy Families¶

The simulator should support a portfolio of policies, not one universal AI.

Recommended baseline policy families are:

Random¶

Purpose:

lower-bound benchmark
regression sanity test
stress random interaction surfaces

Greedy Damage¶

Purpose:

maximize immediate damage each turn
reveal burst-skewed balance problems

Biases:

high damage now
low setup patience
low priority on protectable safety unless it is directly tied to victory

Greedy Survival¶

Purpose:

test defensive and sustain-oriented play
reveal stall and turtling problems

Biases:

block and sustain first
preserve HP and objective safety
delay risky lines unless payoff is obvious

Objective-Aware¶

Purpose:

evaluate protection, escort, and defense encounters properly
validate universal objective-protection tooling across classes

Biases:

protectable survival value is very high
intercept, taunt, control, repair, and escort-safe pathing are prioritized

Economy Optimizer¶

Purpose:

test long-run value behaviors
expose reward and shop balance issues

Biases:

card quality over immediate reward spikes
relic synergy
route quality
reduced waste in upgrade and service decisions

Mechanic Specialist Policies¶

Purpose:

stress specific class fantasies
validate that archetypes actually function under competent play

Recommended examples:

Anchor Bulwark
Anchor Cataclysm
Drifter Ghost
Drifter Executioner
Conduit Regulator
Conduit Singularity
Machinist Architect
Machinist Saboteur
Catalyst Alchemist
Catalyst Aberrant

Each specialist policy should apply its own scoring weights rather than require a different decision interface.

Hybrid Heuristic¶

Purpose:

serve as the best available approximation of a competent human player
become the primary balance policy once mature

Biases:

blends survival, damage, setup, objective safety, and long-run value
uses contextual scoring instead of rigid single-axis greed

Card Play Scoring Model¶

Recommended scoring dimensions for each candidate action are:

immediate damage value
survival value
protectable safety value
combo setup value
future hand quality value
class mechanic advancement value
energy efficiency
target quality
risk or self-damage burden
board-control value

Different policies should reuse the same scoring categories but assign different weights.
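A minimal sketch of shared scoring categories with per-policy weights, in Python for illustration. The dimension keys, weight values, and card names are hypothetical, and a linear weighted sum is assumed here only because it is the simplest form; the document does not prescribe a formula.

```python
# Shared scoring dimensions, mirroring the list above (key names are assumed).
DIMENSIONS = (
    "damage", "survival", "protectable_safety", "combo_setup",
    "hand_quality", "mechanic_advancement", "energy_efficiency",
    "target_quality", "risk", "board_control",
)


def score(action_values, weights):
    """Weighted sum over the shared dimensions; missing entries count as 0."""
    return sum(weights.get(d, 0.0) * action_values.get(d, 0.0) for d in DIMENSIONS)


def best_action(candidates, weights):
    """Return the highest-scoring candidate name; ties resolve by name order."""
    return max(sorted(candidates), key=lambda name: score(candidates[name], weights))


candidates = {
    "heavy_strike": {"damage": 9, "risk": 2},
    "brace": {"survival": 7},
}
# Risk is a burden, so both policies give it a negative weight.
greedy_damage = {"damage": 1.0, "survival": 0.1, "risk": -0.2}
greedy_survival = {"damage": 0.2, "survival": 1.0, "risk": -1.0}

assert best_action(candidates, greedy_damage) == "heavy_strike"
assert best_action(candidates, greedy_survival) == "brace"
```

Both policies evaluate the same candidates with the same function; only the weight table differs, which is what lets specialist policies share one decision interface.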

Objective-Aware Scoring¶

Protection and escort encounters require additional scoring terms.

Recommended terms are:

damage prevented on protectable
intercept value
repair or heal value on objective
taunt or target-redirection value
suppression value against high-threat attackers
path safety for moving escorts
checkpoint progress value

Without these terms, simulator output cannot say anything meaningful about escort and protection balance.
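One way to layer the objective terms on top of a base combat score is sketched below in Python. The term keys, weights, and action names are hypothetical; the sketch assumes the extra terms only contribute when a protectable or escort is actually present.

```python
# Extra terms for protection and escort encounters (key names are assumed).
OBJECTIVE_TERMS = (
    "damage_prevented", "intercept", "objective_repair", "redirect",
    "suppression", "path_safety", "checkpoint_progress",
)


def objective_score(action_values, weights, protectable_present):
    """Sum the objective terms, or 0 when there is nothing to protect."""
    if not protectable_present:
        return 0.0
    return sum(weights.get(t, 0.0) * action_values.get(t, 0.0) for t in OBJECTIVE_TERMS)


def total_score(base_score, action_values, weights, protectable_present):
    """Base combat score plus the objective-aware contribution."""
    return base_score + objective_score(action_values, weights, protectable_present)


weights = {"damage_prevented": 1.5, "intercept": 2.0}
intercept_values = {"damage_prevented": 6, "intercept": 1}

# With an escort on the board, intercepting (base 1.0) outranks a plain strike (base 5.0).
assert total_score(1.0, intercept_values, weights, True) == 12.0
assert total_score(5.0, {}, weights, True) == 5.0
# Without a protectable, the same intercept falls back to its base score.
assert total_score(1.0, intercept_values, weights, False) == 1.0
```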

Enemy AI Structure¶

Standard enemies should use an authored-plus-heuristic model.

Recommended structure:

encounter design defines the legal move package
enemy archetype defines target preferences and behavior bias
heuristic selector chooses the best legal move for the current state
seeded RNG breaks ties or supports intentionally variable enemy styles

Recommended enemy archetypes include:

Bruiser
Defender
Artillery
Skirmisher
Controller
Summoner
Hunter
Objective Breaker

These archetypes should influence evaluation weights rather than hardcode every turn.
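The authored-plus-heuristic structure can be sketched as below. This Python example is illustrative only: the weight tables, move names, and `select_move` helper are hypothetical, and only two of the listed archetypes are given sample weights.

```python
import random

# Archetypes bias evaluation weights instead of scripting each turn.
ARCHETYPE_WEIGHTS = {
    "Bruiser": {"damage": 1.0, "objective_damage": 0.2},
    "Objective Breaker": {"damage": 0.3, "objective_damage": 1.5},
}


def select_move(legal_moves, archetype, rng):
    """Score each authored legal move; seeded RNG is used only to break ties."""
    weights = ARCHETYPE_WEIGHTS[archetype]
    scored = {
        name: sum(weights.get(key, 0.0) * value for key, value in values.items())
        for name, values in legal_moves.items()
    }
    top = max(scored.values())
    tied = sorted(name for name, s in scored.items() if s == top)
    return tied[0] if len(tied) == 1 else rng.choice(tied)


# The encounter design supplies the legal move package.
moves = {
    "smash": {"damage": 8},
    "breach": {"damage": 2, "objective_damage": 6},
}
assert select_move(moves, "Bruiser", random.Random(1)) == "smash"
assert select_move(moves, "Objective Breaker", random.Random(1)) == "breach"

# Tied moves are resolved by the seeded RNG, so the result is reproducible.
tie = {"jab": {"damage": 5}, "hook": {"damage": 5}}
assert select_move(tie, "Bruiser", random.Random(7)) == select_move(tie, "Bruiser", random.Random(7))
```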

Boss AI Structure¶

Bosses should use scripted phase trees plus tactical selectors.

Recommended structure:

phase script defines transitions, narrative arc, and unlock conditions
each phase has its own legal action set
tactical scoring chooses the most appropriate current action
emergency rules handle protectables, lethal setup, or desperation states

This produces bosses that feel authored yet still respond to the current combat state.
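A compact sketch of a phase script combined with a tactical selector and one emergency rule, written in Python for illustration. The phase names, HP thresholds, action names, and score tables are all hypothetical, and HP fraction is assumed as the only transition condition to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    legal_actions: tuple
    leave_at_or_below: float  # boss HP fraction at which this phase ends


PHASES = (
    Phase("opening", ("cleave", "fortify"), 0.6),
    Phase("enraged", ("cleave", "rampage"), 0.25),
    Phase("desperation", ("rampage", "last_stand"), 0.0),
)


def current_phase(boss_hp_fraction):
    """Return the first phase whose exit threshold has not been crossed."""
    for phase in PHASES:
        if boss_hp_fraction > phase.leave_at_or_below:
            return phase
    return PHASES[-1]


def choose_boss_action(boss_hp_fraction, hero_hp, damage, tactical_score):
    phase = current_phase(boss_hp_fraction)
    # Emergency rule: a lethal action in the current phase overrides scoring.
    lethal = sorted(a for a in phase.legal_actions if damage.get(a, 0) >= hero_hp)
    if lethal:
        return phase.name, lethal[0]
    # Tactical selector: best-scoring legal action, ties resolved by name.
    best = max(sorted(phase.legal_actions), key=lambda a: tactical_score.get(a, 0.0))
    return phase.name, best


damage = {"cleave": 6, "fortify": 0, "rampage": 12, "last_stand": 4}
tactical = {"cleave": 2.0, "fortify": 3.0, "rampage": 1.0, "last_stand": 5.0}

assert choose_boss_action(0.9, 30, damage, tactical) == ("opening", "fortify")
assert choose_boss_action(0.9, 5, damage, tactical) == ("opening", "cleave")
assert choose_boss_action(0.1, 30, damage, tactical) == ("desperation", "last_stand")
```

The phase script decides what the boss is allowed to do, while the selector decides which of those actions fits the current state.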

Recommended boss evaluation inputs:

current phase
hero HP and block state
protectable or escort state
board width
ally or summon count
recent player pattern
whether the boss needs to force tempo change

Rival Hero AI¶

Rival Heroes should use a player-like policy model rather than ordinary enemy AI.

They are not just monsters with cards attached.

The recommended rival hero model is:

limited deck with a readable archetype
class mechanic awareness
faction passive awareness
tactical use of hero powers where allowed
policy profile that fits the rival's identity

Rival Hero Policy Families¶

Recommended rival hero styles are:

Duelist¶

favors single-target pressure
values tempo and clean trades
strongest in one-on-one threat conversion

Controller¶

slows the player down
values disruption, status effects, and board denial
protects allied supports well

Commander¶

plays around adds, Constructs, summons, or support units
values sequencing and protection of board assets

Opportunist¶

pressures low HP windows
exploits exposed protectables and weak turns
punishes greed

Nemesis¶

tailored recurring rival policy
may include authored heuristics that counter the specific campaign hero fantasy it is meant to oppose

Rival Hero Readability Rule¶

Rival Heroes must stay more readable than a full human PvP opponent.

That means:

constrained deck sizes
constrained power usage
visible priorities where possible
authored personality through policy selection rather than hidden randomness
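The constraints above could be captured as data that is checked before a rival is used in a run. The Python sketch below is hypothetical throughout: `RivalProfile`, its fields, and the card names are illustrative, and the actual limits are not specified in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RivalProfile:
    style: str                  # e.g. "Duelist", "Controller", "Commander"
    deck: tuple
    max_deck_size: int          # constrained deck size
    power_uses_per_combat: int  # constrained hero power usage

    def validate(self):
        """Return a list of readability-rule violations (empty when valid)."""
        problems = []
        if len(self.deck) > self.max_deck_size:
            problems.append("deck exceeds max_deck_size")
        if self.power_uses_per_combat < 0:
            problems.append("power_uses_per_combat must be non-negative")
        return problems


def can_use_power(profile, uses_so_far):
    """Hero power is allowed only while the per-combat budget remains."""
    return uses_so_far < profile.power_uses_per_combat


duelist = RivalProfile("Duelist", ("lunge", "riposte", "feint"), 8, 1)
assert duelist.validate() == []
assert can_use_power(duelist, 0) and not can_use_power(duelist, 1)
```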

Telemetry Requirements¶

Every AI-driven run or combat should log:

policy id
class and faction
protectable or escort presence
route choice sequence
reward choice sequence
combat turns to resolution
major mechanic triggers
objective success or failure
key lethal or failure windows

For rival heroes and bosses, telemetry should also record:

phase reached
selected action families
target categories chosen
protectable pressure rate
kill conversion rate
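A possible shape for the per-combat record is sketched below in Python. The field names follow the first list above but are assumptions, as is the choice of JSON lines as the log format; the rival and boss fields are omitted for brevity.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class CombatTelemetry:
    policy_id: str
    hero_class: str
    faction: str
    protectable_present: bool
    route_choices: list = field(default_factory=list)
    reward_choices: list = field(default_factory=list)
    turns_to_resolution: int = 0
    mechanic_triggers: list = field(default_factory=list)
    objective_succeeded: Optional[bool] = None  # None when there is no objective
    failure_windows: list = field(default_factory=list)


def to_log_line(record):
    """Serialize with sorted keys so identical records give identical lines."""
    return json.dumps(asdict(record), sort_keys=True)


record = CombatTelemetry(
    policy_id="hybrid-heuristic",
    hero_class="Anchor",
    faction="example-faction",  # placeholder value
    protectable_present=True,
    turns_to_resolution=7,
    objective_succeeded=True,
)
line = to_log_line(record)
assert json.loads(line)["policy_id"] == "hybrid-heuristic"
```

Stable serialization makes logs from two runs with the same seed directly comparable with a plain text diff.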

Balance Use Cases¶

This AI model exists to answer questions like:

does a class actually function under competent objective-aware play?
are escort missions unfair for specific classes?
do rival heroes behave distinctly enough?
are bosses producing readable pressure instead of random spikes?
are specific relics or cards only overpowered under a certain policy family?

Testing Strategy¶

Required testing layers are:

deterministic replay for each policy under shared seeds
regression tests for scoring changes
scenario tests for protection, escort, boss, and rival hero encounters
comparative simulation runs across multiple policy families
policy-versus-policy analysis for rival hero duels when applicable
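The first and fourth layers could be exercised with a harness along these lines. This is a toy Python sketch: `simulate_combat` is a hypothetical stand-in for the real simulation engine, and the two weight tables only imitate distinct policy families.

```python
import random


def simulate_combat(policy_weights, seed, turns=10):
    """Toy combat loop: all randomness comes from one RNG built from the seed."""
    rng = random.Random(seed)
    log = []
    for turn in range(turns):
        # Option values are rolled in a fixed order (strike, then guard).
        options = {"strike": rng.randint(1, 10), "guard": rng.randint(1, 10)}
        choice = max(sorted(options), key=lambda n: options[n] * policy_weights[n])
        log.append((turn, choice))
    return log


def assert_replay_is_deterministic(policy_weights, seeds):
    """Deterministic replay: the same policy and seed must give the same log."""
    for seed in seeds:
        first = simulate_combat(policy_weights, seed)
        second = simulate_combat(policy_weights, seed)
        assert first == second, f"replay diverged for seed {seed}"


def strike_rate(policy_weights, seeds):
    """Comparative metric computed over a shared seed list."""
    logs = [simulate_combat(policy_weights, s) for s in seeds]
    picks = [choice for log in logs for _, choice in log]
    return picks.count("strike") / len(picks)


aggressive = {"strike": 1.0, "guard": 0.5}
defensive = {"strike": 0.5, "guard": 1.0}
shared_seeds = list(range(20))

assert_replay_is_deterministic(aggressive, shared_seeds)
assert_replay_is_deterministic(defensive, shared_seeds)
# Both policies see identical rolls, so differences come from the policy alone.
assert strike_rate(aggressive, shared_seeds) >= strike_rate(defensive, shared_seeds)
```

Running every policy against the same seed list removes encounter variance from the comparison, which is what makes cross-policy results interpretable.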

Official Recommendation Summary¶

The canonical direction is:

treat simulation as a first-class execution surface
implement player behavior through deterministic policy families
use authored-plus-heuristic AI for normal enemies
use scripted phase trees plus tactical selection for bosses
use player-like policy logic for rival heroes
score protection and escort encounters explicitly, not as normal combat variants
use telemetry to compare policy quality, balance pressure, and encounter fairness