Terra Nova Documentation

Installation

Install Terra Nova from the official repository:

git clone https://github.com/trevormcinroe/terra_nova/

Requirements

Terra Nova has been tested with the following Python + JAX combinations on Linux:

- Python 3.9.18 + JAX 0.4.30

- Python 3.13.9 + JAX 0.8.0

Quick Start

Get started with Terra Nova in just a few lines of code:

import argparse
import os
import pickle
import jax

from sim.build import build_simulator


parser = argparse.ArgumentParser()
parser.add_argument("--seed", type=int, default=0)
parser.add_argument("--num_steps", type=int, default=300)
parser.add_argument("--map_folder", type=str, required=True)
parser.add_argument("--distributed_strategy", type=str)
parser.add_argument("--memory_length", type=int, default=1)
args = parser.parse_args()

all_maps = os.listdir(args.map_folder)

games = []

for game in all_maps:
    if ".gamestate" not in game:
        continue
    with open(f"{args.map_folder}/{game}", "rb") as f:
        gamestate = pickle.load(f)
    games.append(gamestate)

(
  env_step_fn,
  games,
  obs_spaces,
  episode_metrics,
  players_turn_id,
  obs,
  GLOBAL_MESH,
  sharding
) = build_simulator(
    games,
    args.distributed_strategy,
    jax.random.PRNGKey(args.seed),
)

for game_step in range(args.num_steps):
    for agent_step in range(6):
        actions = ...
        (
          games,
          obs_spaces,
          episode_metrics,
          new_players_turn_id,
          next_obs,
          rewards,
          done_flags,
          selected_actions
        ) = env_step_fn(
            games, actions, obs_spaces, episode_metrics, players_turn_id
        )

        players_turn_id = new_players_turn_id
        obs = next_obs

For more detail, see the file training_demo.py.

Action Space

Summary

Terra Nova's action space is made of several sub-action spaces that each control a different mechanic in the environment. Below is a high-level summary of the array shapes the environment expects to receive from each agent at each timestep. These actions should be given to the environment in the form of a nested tuple in the following order:

action = Tuple[
  Tuple[jnp.ndarray, jnp.ndarray, jnp.ndarray, jnp.ndarray],  # tuple of four arrays
  jnp.ndarray,  # array
  jnp.ndarray,  # array
  jnp.ndarray,  # array
  Tuple[jnp.ndarray, jnp.ndarray],  # tuple of two arrays
  Tuple[jnp.ndarray, jnp.ndarray]  # tuple of two arrays
]
                      

Below is further detail on the endpoint controlled by each sub-action space. Each array given to the environment should have two leading axes. The first is the number of devices (e.g., GPUs) the simulator is distributed over (num_devices). The second is the number of parallel games per device (num_games).

1

Trade Deals

These actions control the process through which agents may trade resources, embassies, etc. Only one trade deal can be offered per agent per turn.

Field	Shape	Semantics
Accept/Deny	`(num_devices, num_games, 6, 2)`	Binary {accept, deny} for each other agent's offer.
Ask (per agent)	`(num_devices, num_games, 6, 56)`	Distribution over 56 tradeables, represents what the agent would like to receive in the trade.
Offer	`(num_devices, num_games, 56)`	Distribution over 56 tradeables, represents what the given agent is offering.
Counterparty	`(num_devices, num_games, 6)`	Distribution over other agents. Pairs with Ask/Offer to form a proposal.

2

Social Policies

The actions controlling social policy selection will only be executed if the agent has generated enough culture to allow for a new selection.

Field	Shape	Semantics
Policy	`(num_devices, num_games, 54)`	Distribution over all social policies.

3

Religion

Similar to social policies, the actions controlling religious tenet selection will only be executed if the agent has generated enough faith to allow for a new selection.

Field	Shape	Semantics
Tenet	`(num_devices, num_games, 91)`	Distribution over all religious tenets.

4

Technology

Similar to religion, the actions controlling technology selection will only be executed if the agent has generated enough science to allow for a new selection. Once a technology has been selected, research towards that technology cannot be stopped.

Field	Shape	Semantics
Tech	`(num_devices, num_games, 81)`	Distribution over all technologies.

5

Units

The actions for units control both the type of action and where on the map the selected action type should occur. A unit's type (e.g., worker versus trade caravan) determines which action categories are valid. The tiles upon which a given action category can be executed depend on how many action points a given unit type has, the action category itself, the map's terrain, and several other factors.

Field	Shape	Semantics
Action Type	`(num_devices, num_games, 30, 19)`	19-category action distribution, independent per unit (30).
Target Hex	`(num_devices, num_games, 30, 2772)`	Distribution over 42×66 hexes, independent per unit (30).

6

Cities

The actions for cities control what is being made in each city as well as which tiles are being worked within the owned tiles of each city. Once a construction target has been selected, the thing being constructed in each city cannot be changed.

Field	Shape	Semantics
Worked Tiles	`(num_devices, num_games, 10, 36, 36)`	Citizen-to-tile assignment, independent per city (10)
Build Choice	`(num_devices, num_games, 10, 185)`	Distribution over buildables (buildings & units), independent per city.
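
The shapes listed in the six tables above can be collected into one nested structure and used to sanity-check an action before it is passed to env_step_fn. The sketch below is not part of Terra Nova. The helper names expected_action_shapes and check_action_shapes are hypothetical, and the order of the arrays inside the Trade Deals, Units, and Cities tuples is assumed to follow the row order of the tables above.

```python
def expected_action_shapes(num_devices, num_games):
    """Nested tuple of array shapes mirroring the action tuple layout."""
    lead = (num_devices, num_games)
    trade = (
        lead + (6, 2),    # accept/deny
        lead + (6, 56),   # ask (per agent)
        lead + (56,),     # offer
        lead + (6,),      # counterparty
    )
    policy = lead + (54,)
    tenet = lead + (91,)
    tech = lead + (81,)
    units = (lead + (30, 19), lead + (30, 42 * 66))    # action type, target hex
    cities = (lead + (10, 36, 36), lead + (10, 185))   # worked tiles, build choice
    return (trade, policy, tenet, tech, units, cities)


def check_action_shapes(action, expected):
    """Return a list of shape/structure mismatches (empty if the action matches)."""
    problems = []

    def walk(node, spec, path):
        if hasattr(node, "shape"):
            # Array leaf: compare its shape against the expected shape.
            if tuple(node.shape) != spec:
                problems.append(f"{path}: got {tuple(node.shape)}, expected {spec}")
        elif isinstance(node, tuple) and len(node) == len(spec):
            # Nested tuple: recurse into each element.
            for i, (child, child_spec) in enumerate(zip(node, spec)):
                walk(child, child_spec, f"{path}[{i}]")
        else:
            problems.append(f"{path}: unexpected structure")

    walk(action, expected, "action")
    return problems
```

Any object with a .shape attribute works as a leaf, so the same check applies to jnp.ndarray actions: check_action_shapes(action, expected_action_shapes(num_devices, num_games)) returns an empty list when every array has the documented shape.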

Detail

The actions in Terra Nova represent a preference order over outcomes. As several mechanics require sequential resolution, the most preferred outcome is not always a valid, executable action. For example, consider the following scenario:

A given agent has two cities and inputs a preference order over things to build in those two cities like:

City 1
1. Swordsman
2. Great Library
3. ...
City 2
1. Great Library
2. Worker
3. ...

Swordsmen require a connected iron resource to build, and the Great Library is a World Wonder (i.e., only one of a given agent's cities may attempt to build it at a time). If the agent has iron connected, City 1 will begin a swordsman and City 2 will begin the Great Library. However, if the agent does not have iron connected, City 1 will begin the Great Library and City 2 will begin a worker. It is due to the conditional and sequential nature of action resolution that the environment requires the full distribution over all possible actions for all controllable endpoints at each timestep.
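
The two-city scenario above can be reproduced with a toy resolver. This is only an illustration of sequential, masked preference resolution, not the environment's actual logic: the function resolve_builds is hypothetical, and it assumes cities are resolved in index order.

```python
def resolve_builds(city_preferences, can_build, exclusive):
    """Pick, city by city, the highest-ranked item that is still valid.

    city_preferences: list of per-city lists, most preferred first.
    can_build: set of items whose requirements the agent currently meets.
    exclusive: items that only one city may take (e.g., a World Wonder).
    """
    claimed = set()
    chosen = []
    for prefs in city_preferences:
        pick = None
        for item in prefs:
            if item not in can_build:
                continue  # masked out: requirement (e.g., iron) not met
            if item in exclusive and item in claimed:
                continue  # masked out: an earlier city already took it
            pick = item
            break
        if pick in exclusive:
            claimed.add(pick)
        chosen.append(pick)
    return chosen


prefs = [
    ["Swordsman", "Great Library"],  # City 1
    ["Great Library", "Worker"],     # City 2
]
wonders = {"Great Library"}

with_iron = resolve_builds(prefs, {"Swordsman", "Great Library", "Worker"}, wonders)
without_iron = resolve_builds(prefs, {"Great Library", "Worker"}, wonders)
print(with_iron)     # ['Swordsman', 'Great Library']
print(without_iron)  # ['Great Library', 'Worker']
```

The same preference order yields different executed actions depending on the game state, which is why a single argmax per city would not be enough.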

Because the environment automatically handles masking and ultimately the selection of specific actions, the user will not know the exact actions that will be executed ahead of time (without an accurate world model). To help with training, the environment will return a summary of which index was ultimately selected for each of the sub-action categories.

Executed Actions

As described in the previous section, agents must submit their actions for every endpoint as a preference order over all possible actions. To help with training, the environment returns the actions that were ultimately executed. The sub-action spaces are returned in the same order in which they are submitted.

Similar to the observation space, when information is either missing or non-applicable, values will be -1. For example, if an agent has 10 of the maximum 30 units, selected-action information for the final 20 indices of unit-related arrays will be filled with -1.
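
As a small illustration of the -1 convention, the sketch below keeps only the unit slots that hold a real executed action. It operates on the per-game slices of the unit-related arrays (length 30), given as plain Python sequences. The helper name valid_unit_actions is hypothetical, and filtering on the action-type entry alone is an assumption.

```python
def valid_unit_actions(action_types, target_hexes, fill=-1):
    """Return (unit_index, action_type, target_hex) for slots with a real action."""
    return [
        (i, a, h)
        for i, (a, h) in enumerate(zip(action_types, target_hexes))
        if a != fill  # slots without a unit are filled with -1
    ]


# An agent with 2 of the maximum 30 units: the last 28 slots are -1.
action_types = [3, 7] + [-1] * 28
target_hexes = [130, 2045] + [-1] * 28
print(valid_unit_actions(action_types, target_hexes))
```

With array inputs, the per-game slices can be converted with .tolist() first, or the same mask (action_types != -1) can be applied directly to the arrays.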

1

Trade Deals

Field	Shape	Semantics
Accept/Deny	`(num_devices, num_games, 6)`	Binary {accept, deny} for each other agent's offer.
Ask	`(num_devices, num_games)`	Integer corresponding to trade deal ask.
Offer	`(num_devices, num_games)`	Integer corresponding to trade deal offer.
Counterparty	`(num_devices, num_games)`	Integer corresponding to trade deal counterparty.

2

Social Policies

Certain actions in the game grant an agent a free social policy on occasion. Each game returns a vector of integers of shape (2,) representing the (non-free, free) selections, respectively.

Field	Shape	Semantics
Policy	`(num_devices, num_games, 2)`	Integers corresponding to social policies selected (non-free, free).

3

Religion

When founding a religion, agents must select three tenets at once. To accommodate this, every turn returns a vector of integers of shape (3,). During religion founding, the indices correspond to Founder tenet, Follower tenet, Follower tenet. When other types of tenets are selected, only the first element will correspond to a tenet integer.

Field	Shape	Semantics
Tenet	`(num_devices, num_games, 3)`	Integers corresponding to religious tenets selected.

4

Technology

Certain actions in the game grant an agent a free technology on occasion. Each game returns a vector of integers of shape (2,) representing the (non-free, free) selections, respectively.

Field	Shape	Semantics
Tech	`(num_devices, num_games, 2)`	Integers corresponding to technologies selected (non-free, free).

5

Units

Field	Shape	Semantics
Action Type	`(num_devices, num_games, 30)`	Integers corresponding to the action-categories selected.
Target Hex	`(num_devices, num_games, 30)`	Integers corresponding to the map's hex-tiles selected.
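
The Target Hex integers index the 42×66 map as a flat range of 2772 values. The sketch below converts between a flat index and a (row, col) pair. It assumes row-major ordering (index = row * 66 + col), which the text above does not state, so verify it against the simulator before relying on it. Both helper names are hypothetical.

```python
MAP_ROWS, MAP_COLS = 42, 66


def hex_index_to_rowcol(index):
    """Convert a flat hex index to (row, col); returns None for the -1 fill value."""
    if index == -1:
        return None
    if not 0 <= index < MAP_ROWS * MAP_COLS:
        raise ValueError(f"hex index out of range: {index}")
    # ASSUMPTION: row-major flattening of the (42, 66) map.
    return divmod(index, MAP_COLS)


def rowcol_to_hex_index(row, col):
    """Inverse of hex_index_to_rowcol under the same row-major assumption."""
    if not (0 <= row < MAP_ROWS and 0 <= col < MAP_COLS):
        raise ValueError(f"(row, col) out of range: {(row, col)}")
    return row * MAP_COLS + col


print(hex_index_to_rowcol(66))      # (1, 0)
print(rowcol_to_hex_index(41, 65))  # 2771
```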

6

Cities

Field	Shape	Semantics
Worked Tiles	`(num_devices, num_games, 10, 36)`	Integers corresponding to the tiles selected to be worked.
Build Choice	`(num_devices, num_games, 10)`	Integers corresponding to the buildings/units selected to be constructed.

Observation Space

The observation space in Terra Nova is large, but organized and hierarchical. Some components are in the shape of the map (i.e., a (..., 42, 66, ...) array), some are arrays, some are vectors, and some are scalars. The information also spans numerical type: some are continuous values and some are non-ordinal integers. Also, due to partial observability, any of the information could be "unknown", which is always signified with a value of -1. Here we first describe the entire layout of a Terra Nova observation (along with the data shapes). Then we detail the contents of each item in an observation.
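
Because many fields are non-ordinal integers that may also be -1, one common preprocessing choice is a one-hot encoding with an extra slot reserved for "unknown". The sketch below shows the idea on a single value. It is a suggestion for agent-side preprocessing, not something Terra Nova provides, and the helper name one_hot_with_unknown is hypothetical.

```python
def one_hot_with_unknown(value, num_classes):
    """One-hot encode a categorical integer, with a final slot for -1 (unknown)."""
    if value == -1:
        index = num_classes  # extra slot at the end marks "unknown"
    elif 0 <= value < num_classes:
        index = value
    else:
        raise ValueError(f"value {value} outside [0, {num_classes}) and not -1")
    encoding = [0] * (num_classes + 1)
    encoding[index] = 1
    return encoding


# terrain_map takes six values (0 to 5), so the encoding has seven slots.
print(one_hot_with_unknown(2, 6))   # [0, 0, 1, 0, 0, 0, 0]
print(one_hot_with_unknown(-1, 6))  # [0, 0, 0, 0, 0, 0, 1]
```

On whole arrays, the same idea can be expressed by remapping -1 to num_classes and then calling jax.nn.one_hot with num_classes + 1 classes.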

1

Main Observation

Field	Shape	Semantics
elevation_map	`(42, 66)`	Describes the elevation level of each tile. Takes values in [0, 1, 2, 3], representing ocean, flatland, hill, and mountain, respectively.
terrain_map	`(42, 66)`	Describes the terrain type of each tile. Takes values in [0, 1, 2, 3, 4, 5] representing ocean, grassland, plains, desert, tundra, and snow, respectively.
edge_river_map	`(42, 66, 6)`	Describes whether a river is on the edge of each tile. Takes values in [0, 1].
lake_map	`(42, 66)`	Describes whether a lake exists on each tile. Takes values in [0, 1].
feature_map	`(42, 66)`	Describes the feature on each tile. Takes values in [0, 1, 2, 3, 4, 5, 6] representing none, forest, jungle, marsh, oasis, floodplains, and ice, respectively.
nw_map	`(42, 66)`	Describes the presence of Natural Wonders in each tile. Takes values in [0...17], where 0 indicates no Natural Wonder and the remaining integers represent one of the Natural Wonders.
cs_ownership_map	`(42, 66)`	Describes which city state owns each tile. Takes values in [0...12], where 0 indicates no ownership and the remaining integers represent ownership by one of the city states.
units	`...`	See "Units Observation".
technologies	`(81)`	Describes the technologies researched as a boolean 1/0.
policies	`(54)`	Describes the social policies selected as a boolean 1/0.
player_cities	`...`	See "Player Cities Observation".
yield_map_players	`(42, 66, 7)`	Describes the yields given by the map's tiles.
visible_resources_map_players	`(42, 66)`	Describes the resources the agent can see with its current technology.
science_reserves	`()`	Describes the built up science, empire-wide. This science is put towards completing research.
culture_reserves	`()`	Describes the built up culture, empire-wide. This culture is put towards selecting social policies.
faith_reserves	`()`	Describes the built up faith, empire-wide. This faith is put towards selecting religious tenets and purchasing religious buildings.
is_researching	`()`	Describes the technology being researched currently as a categorical integer.
num_trade_routes	`()`	The maximum number of trade caravans the agent can maintain currently.
cs_perturn_influence	`(12)`	Describes the amount of influence the agent is exerting over each city state.
num_delegates	`(6)`	Describes the number of World Congress delegates from each agent.
culture_threshold	`()`	Describes the amount of accumulated culture required to unlock the next social policy.
religious_tenets	`(6, 91)`	Describes the religious tenets belonging to each agent's religion.
free_techs	`()`	Describes the number of technologies the agent can select for free on this turn. Only one free technology may be selected per turn.
free_policies	`()`	Describes the number of social policies the agent can select for free on this turn. Only one free policy may be selected per turn.
great_works	`(4)`	Describes the number of Great Works an agent has.
culture_info	`...`	See "Culture Observation"
improvement_additional_yield_map	`(42, 66, 7)`	Describes the yields from tile improvements in addition to the yields from the base map.
improvement_map	`(42, 66)`	Describes the improvements on the map tiles.
road_map	`(42, 66)`	Describes the presence of roads on the map tiles.
gpps	`(6)`	Describes the number of great person points toward each great person type, empire-wide.
gp_threshold	`()`	Describes the number of great person points required to generate a new great person.
golden_age_turns	`()`	Describes the number of turns remaining in the current golden age.
tourism_total	`(6, 6)`	Describes the amount of tourism generated by each agent onto each agent.
culture_total	`(6,)`	Describes the cumulative amount of culture generated.
citystate_info	`...`	See "City State Observation."
visibility_map	`(42, 66)`	Describes the visibility status of the map tiles. Takes values in [0, 1, 2] representing never seen, seen before but not currently, and currently seen, respectively.
trade_offers	`(6, 2)`	Describes the incoming trade offers from each agent in an ask/offer format.
trade_ledger	`(6, 10, 2)`	Describes the currently-active trade deals with each other agent.
trade_length_ledger	`(10)`	Describes the turns remaining for each active trade deal.
trade_gpt_adjustment	`()`	The net gold per turn adjustment due to trade deals.
trade_resource_adjustment	`(52)`	The net resource adjustment due to trade deals.
have_met	`(16)`	Describes whether the agent has met other agents and city states.
at_war	`(6, 6)`	Describes the war status between each pair of agents.
has_sacked	`(6, 6)`	Describes whether a given agent has sacked another agent's capital.
treasury	`()`	Describes how much gold the agent has built up.
happiness	`()`	Describes how much happiness an agent has, empire-wide.
most_population	`()`	The integer of the agent with the most population.
least_population	`()`	The integer of the agent with the least population.
most_crop_yield	`()`	The integer of the agent with the most food produced.
least_crop_yield	`()`	The integer of the agent with the least food produced.
most_manufactured_goods	`()`	The integer of the agent with the most production.
least_manufactured_goods	`()`	The integer of the agent with the least production.
most_gnp	`()`	The integer of the agent with the most gold per turn.
least_gnp	`()`	The integer of the agent with the least gold per turn.
most_land	`()`	The integer of the agent with the most land.
least_land	`()`	The integer of the agent with the least land.
most_soldiers	`()`	The integer of the agent with the most military.
least_soldiers	`()`	The integer of the agent with the least military.
most_approval	`()`	The integer of the agent with the most happiness.
least_approval	`()`	The integer of the agent with the least happiness.
most_literacy	`()`	The integer of the agent with the most technology researched.
least_literacy	`()`	The integer of the agent with the least technology researched.
player_id	`()`	The integer relating to the agent's id.
current_turn	`()`	The turn number.
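
For debugging, the categorical map fields can be turned back into readable names. The lookup tables below are copied from the value descriptions in the table above (elevation_map, terrain_map, feature_map, visibility_map). The helper names name_of and describe_tile are hypothetical.

```python
ELEVATION = ("ocean", "flatland", "hill", "mountain")
TERRAIN = ("ocean", "grassland", "plains", "desert", "tundra", "snow")
FEATURE = ("none", "forest", "jungle", "marsh", "oasis", "floodplains", "ice")
VISIBILITY = ("never seen", "seen before but not currently", "currently seen")


def name_of(value, names):
    """Map a categorical integer to its name; -1 means the value is unknown."""
    if value == -1:
        return "unknown"
    if not 0 <= value < len(names):
        raise ValueError(f"value {value} has no entry in {names}")
    return names[value]


def describe_tile(elevation, terrain, feature, visibility):
    """Readable summary of one tile from its four categorical integers."""
    return {
        "elevation": name_of(elevation, ELEVATION),
        "terrain": name_of(terrain, TERRAIN),
        "feature": name_of(feature, FEATURE),
        "visibility": name_of(visibility, VISIBILITY),
    }


# A forested grassland hill that is currently visible.
print(describe_tile(2, 1, 1, 2))
```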

2

Units Observation

All fields that begin with 6 (the number of players) are revealed on a full-visibility basis only. That is, when a given player can see another player's units, the information about those units is revealed. Likewise, when the seen units disappear from full visibility, that information is removed from the observation.

Field	Shape	Semantics
unit_type	`(6, 30)`	Describes the type of unit (e.g., worker versus settler). Takes values in [0...37], where 0 indicates no unit and the remaining integers represent one of the unit types.
unit_rowcol	`(6, 30, 2)`	Describes the (row, col) location of units. Takes values in [0...65].
unit_ap	`(6, 30)`	Describes the action points of a unit, which determines how far a unit can move.
engaged_for_n_turns	`(30)`	Describes for how many turns a unit is engaged in an action.
engaged_action_id	`(30)`	Describes the action category the unit is currently engaged in.
trade_to_player_int	`(30)`	If the unit is a caravan and is engaged in a traderoute, describes to which player the caravan is sent.
trade_to_city_int	`(30)`	If the unit is a caravan and is engaged in a traderoute, describes to which city the caravan is sent.
trade_from_city_int	`(30)`	If the unit is a caravan and is engaged in a traderoute, describes from which city the caravan is sent.
trade_yields	`(30, 2, 10)`	If the unit is a caravan and is engaged in a traderoute, describes the yields produced by that trade route in a from/to format.
combat_bonus_accel	`(6, 30)`	Describes the bonus the unit receives to its combat strength when attacked or attacking.
health	`(6, 30)`	Describes the current health of the unit.

3

Culture Observation

Field	Shape	Semantics
building_yields	`(10, 8)`	Describes the additional yields from buildings due to the agent's social policies.
yields_per_kill	`(8)`	Describes the yields an agent generates for each unit killed.
honor_finisher_yields_per_kill	`(8)`	Describes the yields an agent generates for each unit killed due to having completed the Honor social policy tree.
cs_resting_influence	`()`	Describes the minimum influence value an agent has over all city states.
cs_trade_route_yields	`(10, 2, 10)`	Describes the additional yields from a trade route sent from each city to a city state in the order [food, production, gold, faith, culture, science, happiness, tourism, religious pressure, influence].
additional_yield_map	`(42, 66, 7)`	Describes the yields due to social policies in addition to the yields from the base map.

4

City State Observation

Field	Shape	Semantics
religious_population	`(12, 6)`	The population count following each religion in the game for each city state.
relationships	`(12)`	Describes the agent integer of each city state's ally.
influence_level	`(12)`	Describes the agent's influence level over each city state.
cs_type	`(12,)`	Describes the type of city state.
quest_type	`(12)`	Describes the active quest for each city state.
culture_tracker_mine	`()`	Describes the amount of culture generated towards culture quests.
faith_tracker_mine	`()`	Describes the amount of faith generated towards faith quests.
tech_tracker_mine	`()`	Describes the number of technologies researched towards technology quests.
trade_tracker_mine	`(12)`	Describes the total number of turns put towards trade routes for trade route quests.
religion_tracker_mine	`()`	Describes the number of turns the agent's religion has spent as the majority population in a city state towards religion quests.
wonder_tracker_mine	`()`	Describes the number of wonders per turn the agent has built towards wonder quests.
resource_tracker_mine	`()`	Describes the number of resources the agent has improved towards resource quests.
culture_tracker_lead	`()`	Describes the value for the leader in culture quests.
faith_tracker_lead	`()`	Describes the value for the leader in faith quests.
tech_tracker_lead	`()`	Describes the value for the leader in technology quests.
trade_tracker_lead	`()`	Describes the value for the leader in trade route quests.
religion_tracker_lead	`()`	Describes the value for the leader in religion quests.
wonder_tracker_lead	`()`	Describes the value for the leader in wonder quests.
resource_tracker_lead	`()`	Describes the value for the leader in resource quests.
city_rowcols	`(12, 2)`	Describes the (row, col) location of city states.

5

Player Cities Observation

Fields beginning with 6 (the number of agents in the game) are discoverable by all agents in the game.

Field	Shape	Semantics
city_ids	`(6, 10)`	Describes the type of city. Takes values in [0, 1, 2] where 0 is no city, 1 is a capital city, and 2 is a non-capital city.
city_rowcols	`(6, 10, 2)`	Describes the (row, col) position of cities. Takes values in [0...65].
ownership_map	`(6, 10, 42, 66)`	Describes the ownership status of each tile on a per-player, player-city basis. Takes values in [0, 1, 2, 3] representing not owned, could own, does own, and city center, respectively.
yields	`(10, 7)`	Describes the yields for each city in the order [food, production, gold, faith, culture, science, happiness].
city_center_yields	`(10, 7)`	Describes the yields for each city center in the order [food, production, gold, faith, culture, science, happiness].
building_yields	`(10, 8)`	Describes the yields in total for each building per city in the order [food, production, gold, faith, culture, science, happiness, tourism].
population	`(6, 10)`	Describes the population of each city.
worked_slots	`(10, 36)`	Describes whether or not a tile is being worked in a city. Takes values in [0, 1].
specialist_slots	`(10, 6)`	Describes how many specialists are in each city in the order [artist, musician, writer, engineer, merchant, scientist].
gw_slots	`(10, 5)`	Describes the number of Great Works in each city in the order [writing, art, music, artifact].
food_reserves	`(10)`	Describes the amount of excess food each city has. This excess goes towards population growth.
growth_carryover	`(10)`	Describes the percent of excess food carried over as reserves when a city grows.
prod_reserves	`(10)`	Describes the amount of built up production a city has. This reserve goes towards construction.
prod_carryover	`(10)`	Describes the percent of excess built up production carried over after a city completes construction.
is_constructing	`(10)`	Describes the object the city is constructing as a categorical integer.
bldg_maintenance	`(10)`	Describes the amount of gold per turn each city is paying to maintain its buildings.
defense	`(6, 10)`	Describes the defensive prowess of each city.
hp	`(6, 10)`	Describes the current health of each city.
buildings_owned	`(10, 148)`	Describes the buildings each city owns. Takes values in [0, 1].
resources_owned	`(10, 52)`	Describes the resources connected to each city. Takes values in [0, 1].
additional_yield_map	`(42, 66, 7)`	Describes the yields provided by buildings in a city additional to the yields from the base map.
is_coastal	`(6, 10)`	Describes whether a city is on the map coast. Takes values in [0, 1].
religion_info	`...`	See "Religion Observation".
culture_reserves_for_border	`(10)`	Describes the accumulated culture that will be spent on expanding a city's border.
great_person_points	`(10, 6)`	Describes the build up of great person points per city in the order [artist, musician, writer, engineer, merchant, scientist].

6

Religion Observation

Field	Shape	Semantics
religious_population	`(10, 6)`	Describes the population count that follows each religion in each city.
religious_tenets_per_city	`(10, 91)`	Describes the religious tenets being applied to each city. This is determined by the religion with the majority population. Takes values in [0, 1].
building_yields	`(10, 8)`	Describes the extra yields the buildings in each city produce due to religion.
additional_yield_map	`(10, 42, 66, 7)`	Describes the yields provided by religion in a city additional to the yields from the base map.
cs_perturn_influence_cumulative	`(10, 12)`	Describes the cumulative religious pressure applied from each city onto each city state.
player_perturn_influence_cumulative	`(10, 6, 10)`	Describes the cumulative religious pressure applied from each city onto each other player's cities.

Generating Maps

Terra Nova comes with 10k+ maps by default. We also provide a map-generation utility for users interested in larger training and/or evaluation runs. The map generation process is driven by JAX's PRNG system, allowing for full determinism. To generate more maps, run the following:

CUDA_VISIBLE_DEVICES= python3 generate_maps.py --start_seed 1 --num_maps 10

We recommend running map generation on the CPU for speed. One way to accomplish this is with the preamble CUDA_VISIBLE_DEVICES=, which effectively hides CUDA devices from the Python process. The user only needs to specify --start_seed and --num_maps, which control the first attempted seed and the number of maps to generate, respectively.

Not every seed will lead to successful map generation. Terra Nova's map generation process follows a ruleset similar to that of LekMod's map generation, which leads to maps that are balanced in terms of resource and fertility distribution across the landmass. If a given seed does not produce a map that satisfies the balance constraints, the generation script automatically moves on to the next seed.
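
The seed-skipping behavior can be pictured as a simple retry loop. The sketch below is not the code in generate_maps.py. It assumes that seeds are tried consecutively starting at start_seed and that generation continues until num_maps maps have succeeded. The try_generate callback and the max_attempts guard are hypothetical.

```python
def generate_maps(start_seed, num_maps, try_generate, max_attempts=None):
    """Collect num_maps successful maps, trying start_seed, start_seed + 1, ...

    try_generate(seed) should return a map, or None when the seed fails
    the balance constraints.
    """
    maps = []
    seed = start_seed
    attempts = 0
    while len(maps) < num_maps:
        if max_attempts is not None and attempts >= max_attempts:
            break  # optional safety stop so the loop cannot run forever
        result = try_generate(seed)
        if result is not None:
            maps.append((seed, result))
        seed += 1  # move on to the next seed whether or not this one succeeded
        attempts += 1
    return maps


# Stand-in generator: pretend every third seed fails the balance check.
def fake_generate(seed):
    return f"map{seed}" if seed % 3 else None


print(generate_maps(1, 4, fake_generate))
```

Under these assumptions, the seeds that end up on disk are not necessarily contiguous, so the last seed used may be larger than start_seed + num_maps - 1.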

Recording Games

Terra Nova comes with utilities for recording games that your agents play. See below for a short snippet on how to use the recorder. For more detail, see the file recording_demo.py.

We highly recommend using a CPU-only job to perform game recording due to the memory usage of the recorder.

from game.recorder import GameStateRecorder

...

# To initialize the recorder, we need to extract a single game from the bundle.
# This requires indexing within *both* the device and
# game axis => games[device_idx][games_idx]
# This exact process will need to be repeated after each player
# takes its step within each game turn.
gamestate = jax.tree.map(lambda x: x[args.device_idx][args.games_idx], games)
recorder = GameStateRecorder.create(reference_gamestate=gamestate, num_steps=args.num_steps)
recorder = recorder.record(gamestate)


for recording_int in range(args.num_steps):
    for agent_step in range(6):
        actions = ...
        (
          games,
          obs_spaces,
          episode_metrics,
          players_turn_id,
          obs,
          rewards,
          done_flags,
          executed_actions
        ) = env_step_fn(
            games, actions, obs_spaces, episode_metrics, players_turn_id
        )

    gamestate = jax.tree.map(lambda x: x[args.device_idx][args.games_idx], games)
    recorder = recorder.record(gamestate)

recorder.save_replay(f"./renderer/saved_games/{args.save_filename}")

Viewing Games

After recording a game, we can view the replay using the Viewer. The Viewer leverages WebGL and your browser's hardware acceleration to render images to the screen. To run the viewer, open a terminal window and run the following command from within the Terra Nova folder:

bash ./run_viewer.sh

Then, the viewer can be opened in your preferred browser (we recommend Chrome) using the path printed to the terminal. The Viewer allows us to click through each turn in the replay while having full visibility into each agent's gamestate. The Viewer provides information on technologies, social policies, trade deals, victory progress, plots for various metrics, and more.
