MDP Terms Reference

genelab.mdp is a reusable term library. It does not define a task by itself; task configs select functions and classes from this library and wire them into managers.

Actions and commands

| Area | Public pieces |
| --- | --- |
| Actions | JointPositionActionCfg, JointPositionAction |
| Cartesian actions | DifferentialIKActionCfg, DifferentialIKAction, BinaryGripperActionCfg, BinaryGripperAction |
| Velocity commands | UniformVelocityCommandCfg, UniformVelocityCommand |
| Motion commands | MotionCommandCfg, MotionCommand, MotionLoader |

Actions convert policy outputs to simulator control. Commands hold sampled goals that observations and rewards can read.

Action terms

| Term | Action dim | Description |
| --- | --- | --- |
| JointPositionAction | matched joints | Maps policy output to joint-position targets for the selected joints. |
| DifferentialIKAction | 3 or 6 | Converts end-effector deltas into arm joint targets with damped-least-squares Jacobian IK. |
| BinaryGripperAction | 1 | Snaps matched finger joints to an open or closed target from one scalar action. |

DifferentialIKActionCfg.body_name selects the controlled link, and joint_names selects the arm joints used as IK columns. With use_orientation=False, the policy emits (dx, dy, dz); with use_orientation=True, it emits (dx, dy, dz, droll, dpitch, dyaw) as an axis-angle delta. scale caps the physical delta per control step, damping regularizes the IK solve, and max_delta_joint keeps each joint update local.
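The damped-least-squares step described above can be illustrated with a small NumPy sketch. This is not genelab's implementation: the function name dls_ik_step, the default values, and the batched array layout are assumptions chosen for illustration, and NumPy stands in for the simulator's tensors. The sketch solves dq = Jᵀ (J Jᵀ + λ² I)⁻¹ dx per environment and then clips each joint increment, which mirrors the roles of damping and max_delta_joint.

```python
import numpy as np


def dls_ik_step(jacobian, delta_pose, damping=0.05, max_delta_joint=0.1):
    """One damped-least-squares IK step for a batch of environments.

    Hypothetical helper, not part of genelab.

    jacobian:   (num_envs, task_dim, num_joints), task_dim is 3 or 6
    delta_pose: (num_envs, task_dim) desired end-effector delta
    returns:    (num_envs, num_joints) joint-position increments
    """
    task_dim = jacobian.shape[1]
    jt = np.swapaxes(jacobian, 1, 2)  # (num_envs, num_joints, task_dim)
    # Regularized Gram matrix J J^T + damping^2 I keeps the solve
    # well conditioned near singular arm configurations.
    gram = jacobian @ jt + (damping ** 2) * np.eye(task_dim)
    # Solve (J J^T + damping^2 I) y = dx, then map back with dq = J^T y.
    y = np.linalg.solve(gram, delta_pose[..., None])  # (num_envs, task_dim, 1)
    dq = (jt @ y)[..., 0]
    # Keep every joint update local, in the spirit of max_delta_joint.
    return np.clip(dq, -max_delta_joint, max_delta_joint)


# Usage: 4 environments, position-only control (3 rows), 7 arm joints.
rng = np.random.default_rng(0)
J = rng.normal(size=(4, 3, 7))
dx = np.tile(np.array([0.01, 0.0, 0.0]), (4, 1))
dq = dls_ik_step(J, dx)
print(dq.shape)  # (4, 7)
```

A larger damping value trades tracking accuracy for smoother joint motion near singularities; with damping close to zero the step reduces to the pseudoinverse solution.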

The robot articulation must set requires_jac_and_ik=True before a DifferentialIKAction term is constructed; otherwise, Genesis does not allocate the Jacobian data needed by the solver.

BinaryGripperActionCfg uses threshold to choose between closed_pos and open_pos. It is often paired with a Cartesian arm term; for example, the Franka Cartesian pick-and-place task composes DifferentialIKAction(body_name="hand") with BinaryGripperAction to expose a 4-D (dx, dy, dz, gripper) action space.
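As a rough illustration of how a 4-D action could be split and thresholded, the sketch below uses two hypothetical helpers, split_cartesian_action and binary_gripper_targets, that are not genelab functions. It assumes that an action below threshold means "close" and uses placeholder values for open_pos and closed_pos; the real term may use the opposite sign convention.

```python
import numpy as np


def split_cartesian_action(action):
    """Split a (num_envs, 4) action into arm deltas and a gripper scalar."""
    return action[:, :3], action[:, 3]


def binary_gripper_targets(gripper_action, num_fingers, threshold=0.0,
                           open_pos=0.04, closed_pos=0.0):
    """Snap every finger joint to open_pos or closed_pos.

    ASSUMPTION: values below threshold select closed_pos.
    """
    close = gripper_action.reshape(-1, 1) < threshold
    targets = np.where(close, closed_pos, open_pos)  # (num_envs, 1)
    return np.repeat(targets, num_fingers, axis=1)   # (num_envs, num_fingers)


# Usage: two environments, one opening and one closing the gripper.
action = np.array([[0.1, 0.0, -0.1, 1.0],
                   [0.0, 0.0, 0.0, -1.0]])
arm_delta, grip = split_cartesian_action(action)
fingers = binary_gripper_targets(grip, num_fingers=2)
print(arm_delta.shape, fingers.shape)  # (2, 3) (2, 2)
```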

Observations

Common observation functions include base velocity, projected gravity, relative joint position and velocity, last action, generated commands, sensor data, contact features, terrain height scans, and motion-tracking state.

Observation functions should return (num_envs, d) or (num_envs,) tensors. The observation manager handles optional noise, scaling, clipping, and group concatenation.
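The shape contract can be sketched as follows. The names joint_pos_rel, process_term, and concat_group are hypothetical, NumPy stands in for the simulator's tensors, and the noise-then-scale-then-clip order is an assumption rather than the observation manager's documented behavior.

```python
import numpy as np


def joint_pos_rel(joint_pos, default_joint_pos):
    """Observation function: returns a (num_envs, num_joints) array."""
    return joint_pos - default_joint_pos


def process_term(obs, noise_std=0.0, scale=1.0, clip=None, rng=None):
    """Apply optional noise, scaling, and clipping to one term.

    ASSUMPTION: the processing order here is illustrative only.
    """
    obs = np.asarray(obs, dtype=np.float64)
    if obs.ndim == 1:
        obs = obs[:, None]  # promote (num_envs,) to (num_envs, 1)
    if noise_std > 0.0:
        if rng is None:
            rng = np.random.default_rng()
        obs = obs + rng.normal(scale=noise_std, size=obs.shape)
    obs = obs * scale
    if clip is not None:
        obs = np.clip(obs, clip[0], clip[1])
    return obs


def concat_group(terms):
    """Concatenate processed terms along the feature axis."""
    return np.concatenate(terms, axis=1)


# Usage: a 12-joint term plus a scalar height term for 3 environments.
joint_pos = np.ones((3, 12))
default_joint_pos = np.zeros((3, 12))
root_height = np.array([0.30, 0.31, 0.29])  # shape (num_envs,)

group = concat_group([
    process_term(joint_pos_rel(joint_pos, default_joint_pos)),
    process_term(root_height, clip=(0.0, 0.3)),
])
print(group.shape)  # (3, 13)
```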

Rewards and terminations

Reward functions cover velocity tracking, action smoothness, joint acceleration, orientation, limits, foot clearance, slip, air time, self collision, angular momentum, and motion-tracking errors. Termination functions cover time-out, orientation, root height, and motion-tracking failure.

Reward functions should return (num_envs,); termination functions should return (num_envs,) boolean tensors.
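A minimal sketch of both return contracts, with NumPy standing in for tensors. The function names, the exponential tracking kernel, and the default thresholds are illustrative assumptions, not the library's definitions.

```python
import numpy as np


def track_lin_vel_xy_exp(command_xy, base_lin_vel_xy, std=0.5):
    """Reward: exponential kernel on planar velocity error, shape (num_envs,)."""
    error = np.sum((command_xy - base_lin_vel_xy) ** 2, axis=1)
    return np.exp(-error / std ** 2)


def root_height_below_minimum(root_height, minimum_height=0.2):
    """Termination: boolean array of shape (num_envs,)."""
    return root_height < minimum_height


# Usage: two environments, one tracking perfectly and one standing still.
command = np.array([[1.0, 0.0], [0.5, 0.0]])
velocity = np.array([[1.0, 0.0], [0.0, 0.0]])
reward = track_lin_vel_xy_exp(command, velocity)
done = root_height_below_minimum(np.array([0.35, 0.10]))
print(reward.shape, done.dtype)  # (2,) bool
```

Returning one value per environment keeps each term independent of the others, so a manager can weight and sum rewards or combine termination flags without reshaping.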

Events, curricula, metrics, and noise

| Area | Examples |
| --- | --- |
| Events | reset_root_state_uniform, reset_joints_to_default, push_by_setting_velocity |
| Curricula | terrain_levels_vel, commands_vel |
| Metrics | mean_action_acc, angular_momentum_mean, air_time_mean, slip_velocity_mean |
| Noise | Unoise, Gnoise |
| Domain randomization | mdp.dr.body, mdp.dr.joint, mdp.dr.geom |

NoiseCfg is defined in genelab.contracts, which lets the observation manager type-hint noise without importing genelab.mdp. Concrete noise models remain available from genelab.mdp.noise and the public facades.

Where to continue