yaw_bot is an Isaac Lab project for training, evaluating, and debugging a two-wheel balancing robot with leg joints.
Language: 简体中文
The name Yaw Bot carries two meanings:
- YAW refers to yaw-angle stability and heading control
- YAW also stands for You Always Walk
The project currently focuses on:
- balancing and standing
- forward/backward command tracking
- equivalent leg-angle mapping for a simplified parallel-leg model
- wheel-ground contact debugging
- RSL-RL based training and playback
This project is based on the open-source bipedal wheeled robot from StackForce:
In this repository, we additionally use an inverse-solution mapping function to convert the parallel-leg structure into an equivalent joint representation for training and control in Isaac Lab.
The registered task is:
Template-Yaw-Bot-Direct-v0
- a direct RL task implementation for Isaac Lab
- a robot asset and articulation config for the yaw bot
- PPO configs for RSL-RL
- training and playback scripts
- a small utility script for checking the current knee-angle mapping
- source/yaw_bot/yaw_bot/tasks/direct/yaw_bot/yaw_bot_env.py Main direct RL environment
- source/yaw_bot/yaw_bot/tasks/direct/yaw_bot/yaw_bot_env_cfg.py Environment configuration, commands, rewards, observations, terrain and disturbance switches
- source/yaw_bot/yaw_bot/tasks/direct/yaw_bot/agents/rsl_rl_ppo_cfg.py PPO runner configuration
- source/yaw_bot/yaw_bot/robots/yaw_bot_cfg.py Robot articulation and actuator definitions
- assets/robots/yaw_bot/yaw_bot.urdf Robot URDF, including wheel collision geometry
- assets/robots/yaw_bot/config.yaml Asset conversion configuration
- scripts/rsl_rl/train.py Training entry point
- scripts/rsl_rl/play.py Playback entry point
- scripts/calc_knee_angle.py Helper for computing knee angle t from a and b
- Isaac Lab installed and working
- a Python environment that can import:
  - isaaclab
  - isaaclab_tasks
  - isaaclab_rl
  - rsl_rl
- Windows or Linux
This repository is intended to live outside the main Isaac Lab repository and be installed as an editable package.
Install the extension package from the repository root:
python -m pip install -e source/yaw_bot
If you normally launch inside the Isaac Lab wrapper, use:
.\isaaclab.bat -p -m pip install -e source\yaw_bot
You can list registered environments with:
python .\scripts\list_envs.py
Look for:
Template-Yaw-Bot-Direct-v0
Basic training command:
python .\scripts\rsl_rl\train.py --task Template-Yaw-Bot-Direct-v0
Typical Isaac Lab launch form on Windows:
.\isaaclab.bat -p .\scripts\rsl_rl\train.py --task Template-Yaw-Bot-Direct-v0
Useful overrides:
- --num_envs <N>
- --max_iterations <N>
- --seed <N>
- --video
Training logs are written under:
logs/rsl_rl/yaw_bot_direct/<timestamp>/
Each run directory typically contains:
- model_*.pt
- events.out.tfevents.*
- params/env.yaml
- params/agent.yaml
- exported/
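When resuming or playing back, it is often convenient to pick the newest checkpoint in a run directory automatically. The sketch below is not part of the repository; `latest_checkpoint` is a hypothetical helper, and it assumes the number in `model_<N>.pt` is the training iteration, as the `model_999.pt` example suggests.

```python
import re
from pathlib import Path


def latest_checkpoint(run_dir):
    """Return the model_<N>.pt file with the largest N, or None if there is none."""
    best_path, best_iter = None, -1
    for path in Path(run_dir).glob("model_*.pt"):
        # Only accept purely numeric suffixes; skip names such as model_final.pt.
        match = re.fullmatch(r"model_(\d+)\.pt", path.name)
        if match and int(match.group(1)) > best_iter:
            best_path, best_iter = path, int(match.group(1))
    return best_path
```

Comparing the parsed integers avoids the lexicographic trap where `model_999.pt` would sort after `model_1000.pt`.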
Resume from a previous checkpoint with:
python .\scripts\rsl_rl\train.py --task Template-Yaw-Bot-Direct-v0 --resume --load_run <run_name> --checkpoint <model_file>
Example:
python .\scripts\rsl_rl\train.py --task Template-Yaw-Bot-Direct-v0 --resume --load_run 2026-03-22_20-04-04 --checkpoint model_999.pt
Notes:
- resuming creates a new run directory rather than overwriting the old one
- old checkpoints may fail to resume if the observation dimension changed
- some older runs may still live under:
logs/rsl_rl/cartpole_direct/
Playback command:
python .\scripts\rsl_rl\play.py --task Template-Yaw-Bot-Direct-v0 --checkpoint <checkpoint_path>
Example:
python .\scripts\rsl_rl\play.py --task Template-Yaw-Bot-Direct-v0 --checkpoint .\logs\rsl_rl\yaw_bot_direct\2026-03-22_20-04-04\model_999.pt
Current playback behavior:
- playback forces a single environment
- termination is disabled during playback
- command resampling is disabled during playback
- playback is designed for manual keyboard command input
Manual control keys in the simulation window:
- W: forward command
- S: backward command
- A: left yaw command
- D: right yaw command
- L: clear command
The terminal prints the current commanded linear and yaw speed whenever the manual command changes.
The helper script scripts/calc_knee_angle.py computes the knee angle t from branch hip angle a and mapped hip angle b.
Example:
python .\scripts\calc_knee_angle.py 10 20
This prints:
- a in degrees
- b in degrees
- t in degrees
The current policy outputs 6 actions:
- left branch hip a
- left mapped hip b
- right branch hip a
- right mapped hip b
- left wheel torque command
- right wheel torque command
Leg control:
- the policy outputs a and b
- the environment computes the semantic knee angle t = f(a, b)
- the simulated servo targets become [left_hip, left_knee, right_hip, right_knee]
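The leg-control steps above can be sketched as a small action-splitting function. This is an illustration only: `knee_from_branch_angles` is a hypothetical stand-in for the real geometry function f(a, b), and using the mapped hip angle b as the hip servo target is an assumption that the text does not confirm.

```python
def knee_from_branch_angles(a, b):
    """HYPOTHETICAL stand-in for t = f(a, b); NOT the repository's geometry."""
    return b - a


def split_actions(actions):
    """Split the 6-element action vector into servo targets and wheel torques."""
    if len(actions) != 6:
        raise ValueError("expected 6 actions")
    # Order follows the action list: left a, left b, right a, right b, wheels.
    left_a, left_b, right_a, right_b, left_wheel, right_wheel = actions
    left_t = knee_from_branch_angles(left_a, left_b)
    right_t = knee_from_branch_angles(right_a, right_b)
    # ASSUMPTION: the mapped hip angle b is what drives the hip servo.
    servo_targets = [left_b, left_t, right_b, right_t]  # [left_hip, left_knee, right_hip, right_knee]
    wheel_torques = [left_wheel, right_wheel]
    return servo_targets, wheel_torques
```

Keeping the knee computation behind a single function means the placeholder can be replaced by the actual mapping without touching the rest of the control path.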
Wheel control:
- wheels are controlled with set_joint_effort_target(...)
- wheel sign conventions are unified in the environment so semantic forward wheel motion is consistent across left and right wheels
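One way to picture the unified sign convention is a per-wheel sign vector applied on the way in (torque commands) and on the way out (measured velocities). The sketch below is hypothetical: the names and the `(1.0, -1.0)` values are assumptions, since the real signs depend on the joint axes in the URDF.

```python
# ASSUMPTION: one +1/-1 factor per wheel joint, ordered (left, right).
WHEEL_SIGNS = (1.0, -1.0)


def semantic_to_joint_effort(semantic_torques, signs=WHEEL_SIGNS):
    """Map 'positive means drive forward' torques to per-joint effort targets."""
    return [s * tau for s, tau in zip(signs, semantic_torques)]


def joint_to_semantic_velocity(joint_velocities, signs=WHEEL_SIGNS):
    """Map raw joint velocities back to the forward-positive convention."""
    return [s * w for s, w in zip(signs, joint_velocities)]
```

With this split, the policy and the reward terms only ever see forward-positive quantities, and mirrored joint axes are handled in one place.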
The real mechanism is treated as an equivalent simplified structure in simulation.
Current implementation includes:
- branch hip angle a
- mapped hip angle b
- derived knee angle t
The geometry conversion is implemented in:
The knee torque mapping function exists as a placeholder interface, but torque-equivalent actuation is not yet fully wired into the leg control path.
Current command-tracking observation size is 22.
The layout is:
- root quaternion: 4
- root angular velocity: 3
- projected gravity: 3
- velocity commands: 2
- wheel positions: 2
- wheel velocities: 2
- last actions: 6
Total: 22
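Because old checkpoints can fail to load when the observation dimension changes, it helps to make the layout explicit and check it. The following sketch mirrors the layout listed above; the part names are illustrative labels, not identifiers taken from the environment code.

```python
# (label, size) pairs in concatenation order; labels are illustrative only.
OBS_LAYOUT = (
    ("root_quat", 4),
    ("root_ang_vel", 3),
    ("projected_gravity", 3),
    ("velocity_commands", 2),
    ("wheel_pos", 2),
    ("wheel_vel", 2),
    ("last_actions", 6),
)
OBS_DIM = sum(size for _, size in OBS_LAYOUT)  # 4 + 3 + 3 + 2 + 2 + 2 + 6


def build_observation(parts):
    """Concatenate the named parts in layout order, validating each size."""
    obs = []
    for name, size in OBS_LAYOUT:
        values = list(parts[name])
        if len(values) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(values)}")
        obs.extend(values)
    return obs
```

A size check like this fails loudly at the point where a term is added or removed, rather than later when a checkpoint with a mismatched input layer is loaded.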
The observation configuration lives in:
The environment supports command-tracking mode through:
use_velocity_commands
The current command set is:
- linear x velocity command
- yaw angular velocity command
These commands are sampled inside the environment during training and written manually during playback.
Relevant config fields:
- command_lin_vel_x_range
- command_yaw_vel_range
- command_resample_time_range
- command_lin_vel_x_min_abs
- command_yaw_vel_min_abs
- command_yaw_probability
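As a rough illustration of how such fields could interact, the sketch below samples one command. The semantics are assumptions inferred from the field names only: uniform sampling within each range, small magnitudes pushed out to the `*_min_abs` value, and a nonzero yaw command drawn with probability `command_yaw_probability`. The environment's actual sampling logic may differ.

```python
import random


def enforce_min_abs(value, min_abs):
    """Push values with |value| < min_abs out to +/-min_abs, keeping the sign."""
    if abs(value) < min_abs:
        return min_abs if value >= 0.0 else -min_abs
    return value


def sample_command(cfg, rng):
    """Return (lin_vel_x, yaw_vel, hold_time) under the assumed semantics."""
    lin_x = rng.uniform(*cfg["command_lin_vel_x_range"])
    lin_x = enforce_min_abs(lin_x, cfg["command_lin_vel_x_min_abs"])
    yaw = 0.0
    # ASSUMPTION: yaw is only commanded for a fraction of the samples.
    if rng.random() < cfg["command_yaw_probability"]:
        yaw = rng.uniform(*cfg["command_yaw_vel_range"])
        yaw = enforce_min_abs(yaw, cfg["command_yaw_vel_min_abs"])
    # Time until the next resample, in seconds.
    hold_time = rng.uniform(*cfg["command_resample_time_range"])
    return lin_x, yaw, hold_time
```

Setting `command_yaw_probability` to 0 in this sketch reproduces a straight-line-only stage, which fits the staged workflow of adding yaw control later.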
The current reward structure includes:
- alive reward
- termination penalty
- body angle penalty
- angular-velocity penalties
- vertical-velocity penalty
- optional leg pose and symmetry regularization
- linear command-tracking reward
- wheel-based linear command-tracking reward
The main reward config is in:
The actual implementation is in:
The robot uses three actuator groups:
- hip_joints
- knee_joints
- wheel_joints
Current control split:
- hips and knees use position targets
- wheels use effort targets
Current actuator config is in:
Termination is based on non-wheel body contact through a ContactSensor.
Tracked links for termination include:
- Body
- L_leg1
- L_leg2
- R_leg1
- R_leg2
Wheel contact is allowed.
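The termination rule can be sketched as a check over the tracked links only. This is a simplified, framework-free illustration: the dictionary input, the 1.0 N threshold, and the wheel link name used in the usage note are assumptions, not values from the environment.

```python
import math

# Links whose ground contact ends the episode, as listed above.
TERMINATION_LINKS = ("Body", "L_leg1", "L_leg2", "R_leg1", "R_leg2")


def should_terminate(contact_forces, threshold=1.0):
    """contact_forces maps link name -> (fx, fy, fz) net contact force in newtons.

    ASSUMPTION: a link counts as 'in contact' when its force norm exceeds threshold.
    """
    for link in TERMINATION_LINKS:
        fx, fy, fz = contact_forces.get(link, (0.0, 0.0, 0.0))
        if math.sqrt(fx * fx + fy * fy + fz * fz) > threshold:
            return True
    # Links outside TERMINATION_LINKS (for example the wheels) are ignored.
    return False
```

For example, `should_terminate({"L_wheel": (0.0, 0.0, 20.0)})` is false because the hypothetical `L_wheel` link is not tracked, while any sufficiently large force on `Body` returns true.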
The environment supports these optional features:
- flat terrain
- rough terrain
- IMU noise
- random body force pulses
- random body torque pulses
These are all controlled in:
Typical workflow is staged:
- train standing or straight-line motion on flat terrain
- add forward/backward command tracking
- add yaw control
- add disturbances and rough terrain
The wheel collision geometry in the URDF is currently a cylinder, not a mesh collider.
Current wheel collision dimensions:
- diameter: 65 mm
- width: 28 mm
This is defined in:
This change was made to improve wheel-ground rolling behavior compared with mesh-based convex hull collision.
Recent versions of the environment log wheel- and command-related diagnostics to TensorBoard.
Useful tags include:
- Diagnostics/root_lin_vel_x
- Diagnostics/wheel_semantic_forward_vel
- Diagnostics/wheel_effort_cmd_abs
- Diagnostics/wheel_surface_speed
- Diagnostics/wheel_body_speed_slip_abs
- Diagnostics/lin_cmd_sign_match_rate
- Diagnostics/forward_cmd_success_rate
- Diagnostics/backward_cmd_success_rate
- Diagnostics/servo_pose_error
- Diagnostics/servo_joint_vel_sq
- Diagnostics/gravity_xy_error
- Diagnostics/root_vertical_vel_abs
These are especially useful for checking:
- whether the wheels are actually being driven
- whether forward and backward commands are learned with the correct sign
- whether wheel-ground slip is dominating
- whether the leg controller is over-constraining propulsion
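To interpret the wheel speed and slip tags, it may help to see the arithmetic they plausibly correspond to. The sketch below assumes that surface speed is wheel angular velocity times the wheel radius (32.5 mm, from the 65 mm collision diameter) and that slip is the absolute difference between surface speed and body forward speed. These definitions are assumptions; the environment's exact formulas are not given here.

```python
WHEEL_DIAMETER_M = 0.065  # from the URDF collision cylinder described earlier
WHEEL_RADIUS_M = WHEEL_DIAMETER_M / 2.0


def wheel_surface_speed(wheel_ang_vel):
    """Rim speed in m/s for a forward-positive wheel angular velocity in rad/s."""
    return wheel_ang_vel * WHEEL_RADIUS_M


def wheel_body_slip_abs(wheel_ang_vel, root_lin_vel_x):
    """Absolute mismatch between rim speed and body forward speed, in m/s."""
    return abs(wheel_surface_speed(wheel_ang_vel) - root_lin_vel_x)
```

Under these definitions a wheel spinning at 10 rad/s has a rim speed of about 0.325 m/s, so a body speed well below that points to slip rather than a weak drive command.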
- the gym task name still uses the template-style prefix Template-
- some older logs remain under the old template experiment name cartpole_direct
- yaw control is often trained later than forward/backward control, so not every checkpoint can turn
- old checkpoints may not load if observation dimensions changed
- the equivalent knee torque mapping is not yet part of the actual leg actuation path
- the repository still contains some template/example files, such as ui_extension_example.py
Useful lightweight checks from the repository root:
python .\scripts\calc_knee_angle.py 10 20
python -m py_compile .\source\yaw_bot\yaw_bot\tasks\direct\yaw_bot\yaw_bot_env.py
python -m py_compile .\source\yaw_bot\yaw_bot\tasks\direct\yaw_bot\yaw_bot_env_cfg.py
python -m py_compile .\source\yaw_bot\yaw_bot\robots\yaw_bot_cfg.py
- the intended editable install package is:
source/yaw_bot
- root-level setup.py exists, but the actual Isaac Lab extension package is under:
source/yaw_bot
- extension metadata lives in:
This repository contains code derived from the Isaac Lab project template and retains the original upstream headers in many files.