Managers and MDP Terms

GeneLab follows the Isaac Lab-style manager-based environment pattern: the environment owns the simulation loop, while MDP behavior is split into named terms managed by specialized managers.

Why split the MDP

A robot learning environment usually contains many concerns:

action decoding
command sampling
observations
rewards
terminations
reset and interval events
curricula
metrics

Putting all of that directly in `step()` makes tasks hard to inspect and override. Manager-based configs keep each concern named, typed, and discoverable.
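As an illustration of what "named" buys you, here is a minimal sketch of a reward manager that keeps each term addressable by name. `RewardTerm` and `TinyRewardManager` are hypothetical stand-ins written for this page, not GeneLab classes, and the state is a plain dict rather than simulator tensors.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, float]


@dataclass
class RewardTerm:
    """Hypothetical term config: a function of the state plus a weight."""
    func: Callable[[State], float]
    weight: float


class TinyRewardManager:
    """Hypothetical manager: evaluates every named term and keeps them separate."""

    def __init__(self, terms: Dict[str, RewardTerm]):
        self.terms = terms

    def compute(self, state: State) -> Dict[str, float]:
        # Return the weighted contribution of each term under its own name,
        # so it can be logged or inspected individually.
        return {name: term.weight * term.func(state) for name, term in self.terms.items()}


def track_lin_vel(state: State) -> float:
    # Penalize the gap between commanded and measured velocity.
    return -abs(state["target_vel"] - state["lin_vel"])


def action_penalty(state: State) -> float:
    # Penalize large actions.
    return -state["action"] ** 2


manager = TinyRewardManager({
    "track_lin_vel": RewardTerm(track_lin_vel, weight=2.0),
    "action_penalty": RewardTerm(action_penalty, weight=0.5),
})

contributions = manager.compute({"lin_vel": 0.5, "target_vel": 1.0, "action": 2.0})
total = sum(contributions.values())
```

Because each contribution is keyed by its term name, changing a weight or dropping a term does not require touching the step loop.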

The manager set

| Manager | Config field | Role |
| --- | --- | --- |
| `ActionManager` | `actions_cfg` | Converts policy actions into simulator targets or torques. |
| `CommandManager` | `commands_cfg` | Maintains sampled goals such as target velocity or reference motion. |
| `ObservationManager` | `observations_cfg` | Computes named observation groups such as `policy` and `critic`. |
| `RewardManager` | `rewards_cfg` | Computes weighted reward terms and episode summaries. |
| `TerminationManager` | `terminations_cfg` | Separates terminated and time-out conditions. |
| `EventManager` | `events_cfg` | Runs startup, reset, and interval randomization or perturbation. |
| `CurriculumManager` | `curriculum_cfg` | Adjusts difficulty or task state at reset boundaries. |
| `MetricsManager` | `metrics_cfg` | Records non-reward diagnostics. |
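The distinction drawn by the termination manager, between an episode that failed and one that simply ran out of time, can be sketched as follows. This is an assumption-laden illustration: `TerminationTerm`, its `time_out` flag, and `compute_dones` are hypothetical names chosen for this example, and per-env values are plain Python lists instead of batched tensors.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class TerminationTerm:
    """Hypothetical term config: a per-env boolean function plus a time-out flag."""
    func: Callable[[dict], List[bool]]
    time_out: bool = False


def compute_dones(terms: Dict[str, TerminationTerm], state: dict) -> Tuple[List[bool], List[bool]]:
    """OR together all terms, routing each into the terminated or timed-out mask."""
    num_envs = state["num_envs"]
    terminated = [False] * num_envs
    timed_out = [False] * num_envs
    for term in terms.values():
        flags = term.func(state)
        target = timed_out if term.time_out else terminated
        for i, flag in enumerate(flags):
            target[i] = target[i] or flag
    return terminated, timed_out


terms = {
    # A failure condition: the base dropped below a height threshold.
    "fell_over": TerminationTerm(lambda s: [h < 0.1 for h in s["base_height"]]),
    # A time limit: not a failure, so it is flagged as a time-out.
    "time_out": TerminationTerm(
        lambda s: [n >= s["max_steps"] for n in s["step_count"]], time_out=True
    ),
}

state = {
    "num_envs": 3,
    "base_height": [0.3, 0.05, 0.3],
    "step_count": [10, 10, 500],
    "max_steps": 500,
}
terminated, timed_out = compute_dones(terms, state)
```

Keeping the two masks separate lets a learning algorithm treat a time-out differently from a failure, for example when bootstrapping value estimates.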

Runtime order

`ManagerBasedRlEnv` builds the Genesis scene first, then creates managers, binds sensors, applies startup events, and runs an initial reset. During each env step it processes actions, advances the scene for `decimation` physics ticks, refreshes state, and updates sensors. It then handles commands and events, computes rewards, metrics, and terminations, resets done envs, and finally computes observations.

This order matters because terms read shared state. For example, reward terms see the state after actions and sensor updates; reset events run before the next observation is emitted.
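The consequences of this ordering can be seen in a toy, single-environment sketch. `ToyEnv` is a hypothetical 1-D example, not the `ManagerBasedRlEnv` implementation: it omits sensors, commands, events, and metrics, and keeps only the phases needed to show that the reward reads post-physics state while the observation reads post-reset state.

```python
from typing import Tuple


class ToyEnv:
    """Hypothetical 1-D environment that mirrors the step ordering described above."""

    def __init__(self, decimation: int = 4, dt: float = 0.5, limit: float = 1.0):
        self.decimation = decimation  # physics ticks per env step
        self.dt = dt                  # duration of one physics tick
        self.limit = limit            # |pos| beyond this terminates the episode
        self.pos = 0.0
        self.velocity_target = 0.0

    def step(self, action: float) -> Tuple[float, float, bool]:
        # 1. Process the action into a simulator target.
        self.velocity_target = action
        # 2. Advance physics for `decimation` ticks.
        for _ in range(self.decimation):
            self.pos += self.velocity_target * self.dt
        # 3. The reward reads the state after the physics ticks.
        reward = -abs(self.pos)
        # 4. Terminations are evaluated on the same post-physics state.
        terminated = abs(self.pos) > self.limit
        # 5. Done envs are reset before the observation is computed.
        if terminated:
            self.pos = 0.0
        # 6. The observation therefore reflects the post-reset state.
        obs = self.pos
        return obs, reward, terminated


env = ToyEnv()
obs_a, reward_a, done_a = env.step(0.25)  # stays inside the limit
obs_b, reward_b, done_b = env.step(1.0)   # overshoots and triggers a reset
```

In the second step the reward penalizes the overshoot, yet the returned observation is already the freshly reset position, which is the behavior the ordering above implies.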

Term names are part of the interface

Term names appear in override paths and logs. A reward named `track_lin_vel` can be changed from the CLI and appears in episode summaries. Stable names make experiments easier to compare.
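To see why a renamed term breaks overrides, consider a minimal sketch of path-based overriding on a nested config. The dotted path syntax, the `weight` field, the `action_rate` term, and `apply_override` are all assumptions made for illustration; the actual GeneLab CLI syntax may differ.

```python
from typing import Any, Dict


def apply_override(cfg: Dict[str, Any], path: str, value: Any) -> None:
    """Set cfg[a][b][c] = value for a dotted path "a.b.c"; fail on unknown names."""
    *parents, leaf = path.split(".")
    node = cfg
    for key in parents:
        node = node[key]  # raises KeyError if a manager or term name is unknown
    if leaf not in node:
        raise KeyError(path)
    node[leaf] = value


# Hypothetical config layout: config field -> term name -> term parameters.
cfg = {
    "rewards_cfg": {
        "track_lin_vel": {"weight": 1.0},
        "action_rate": {"weight": -0.01},
    }
}

apply_override(cfg, "rewards_cfg.track_lin_vel.weight", 2.0)

# An override written against a term name that does not exist is rejected.
try:
    apply_override(cfg, "rewards_cfg.track_linear_velocity.weight", 2.0)
    stale_override_rejected = False
except KeyError:
    stale_override_rejected = True
```

Under this scheme the term name is effectively an address, so renaming a term invalidates saved commands and makes logged summaries harder to line up across runs.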
