Robotics Projects

Selected Robotics Projects (Supervision & Technical Guidance)

  1. Sim-to-Real Transfer for JetBot on Maze Escaping using Safe Reinforcement Learning
  2. Safe Navigation of JetBot in Slippery Terrain via Domain Randomization
  3. Safe Reinforcement Learning for Multi-Robot Systems in Hazardous Environments

Sim-to-Real Transfer for JetBot on Maze Escaping using Safe Reinforcement Learning

  • Students: Rocca Federico, Florian Tanguy
  • Supervision: Tingting Ni, Kai Ren
  • Goal: Develop a Lagrangian-PPO framework that lets a wheeled mobile robot navigate a maze and reach the nearest exit while avoiding obstacles, using feedback on the robot's position and its position relative to obstacles and goals.
  • Work:
    • Formulated the navigation challenge as a Constrained Markov Decision Process and implemented a Lagrangian-PPO approach to enforce safety constraints.
    • Approach 1: Leveraged NVIDIA IsaacLab to train policies across multiple parallel environments on GPU to accelerate learning.
    • Approach 2: Performed system identification to build a high-fidelity simulation model of the JetBot platform, utilizing JAX for accelerated learning.
    • Developed a Sim-to-Real pipeline exporting policies to a ROS2 node for inference on a physical JetBot using OptiTrack for state estimation.
  • Outcome:
    • Simulation Performance: Both approaches achieved ~100% collision-free trajectories within their respective simulation environments.
    • Sim-to-Real Gap (IsaacLab): Policies trained in NVIDIA IsaacLab struggled to transfer to the physical robot, revealing critical gaps in the physics/actuation modeling that affected real-world safety.
    • Real-World Success (Custom simulation environment): Policies trained on the self-built simulation model bridged the reality gap, maintaining ~100% collision-free trajectories when deployed on the physical robot.
Figure: Simulation and Real World (JetBot).
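
The Lagrangian treatment of the safety constraint described above can be illustrated with a minimal sketch. It is not the project's implementation: the function names, the learning rate, and the cost limit are assumptions chosen for illustration. The idea is that a multiplier grows while the measured episode cost exceeds the allowed limit and shrinks (never below zero) once the policy is safe, and that the PPO surrogate is fed a reward advantage penalized by the cost advantage.

```python
def update_multiplier(lam, episode_cost, cost_limit, lr=0.05):
    """Projected dual ascent on the Lagrange multiplier.

    lam rises when the measured cost exceeds cost_limit and falls otherwise;
    the max(0, ...) projection keeps it non-negative.
    All argument names and defaults are hypothetical.
    """
    return max(0.0, lam + lr * (episode_cost - cost_limit))


def penalized_advantage(adv_reward, adv_cost, lam):
    """Advantage passed to the PPO surrogate: reward advantage minus
    the cost advantage weighted by the current multiplier."""
    return adv_reward - lam * adv_cost


if __name__ == "__main__":
    lam = 0.0
    # Hypothetical per-iteration average costs (e.g. collisions per episode).
    for cost in [3.0, 2.0, 0.5, 0.0]:
        lam = update_multiplier(lam, cost, cost_limit=1.0, lr=0.5)
        print(f"cost={cost:.1f} -> lambda={lam:.2f}")
```

In a full training loop the multiplier update would typically run once per policy iteration, after the cost return has been estimated from fresh rollouts.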

Safe Navigation of JetBot in Slippery Terrain via Domain Randomization

  • Student: Florian Tanguy
  • Supervision: Tingting Ni, Kai Ren
  • Goal: Develop a data-driven domain randomization framework to enable a wheeled mobile robot to navigate low-friction, slippery terrain, bridging the “Sim-to-Real” gap caused by unmodeled actuator deadzones and complex friction dynamics.
  • Work:
    • Migrated simulation to JAX for massive parallelization (4,096 environments) and differentiable physics, incorporating System Identification data to model multi-modal friction dynamics (distinct “slippery” vs. “grippy” regimes) and non-linear actuator deadzones.
    • Implemented Data-Driven Domain Randomization using Gaussian Mixture Models to sample physics parameters, ensuring the training distribution covered real-world failure modes.
    • Addressed “perceptual aliasing” (where sliding mimics stopping) by using a recurrent (LSTM) policy, which outperformed both memoryless feedforward networks and MLPs with frame stacking.
  • Outcome:
    • Achieved robust navigation across heterogeneous real-world environments, demonstrating emergent recovery behaviors (such as “wiggling”) to escape slippery terrain.
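
As a rough illustration of the data-driven domain randomization described above, the sketch below draws a friction coefficient from a two-component Gaussian mixture and applies a simple actuator deadzone. Every number and name here is an assumption for illustration (the mixture weights, means, standard deviations, and the deadzone shape are placeholders, not the identified values from the project), and it uses plain Python rather than JAX to stay self-contained.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FrictionMode:
    weight: float  # mixture weight of this regime
    mean: float    # mean friction coefficient
    std: float     # standard deviation of the coefficient


# Hypothetical "slippery" and "grippy" regimes; placeholder values only.
MODES = (
    FrictionMode(weight=0.4, mean=0.15, std=0.03),
    FrictionMode(weight=0.6, mean=0.70, std=0.10),
)


def sample_friction(rng, modes=MODES):
    """Pick a regime by mixture weight, then draw a coefficient from it.

    The result is clipped at zero so a sampled coefficient is never negative.
    """
    mode = rng.choices(modes, weights=[m.weight for m in modes], k=1)[0]
    return max(0.0, rng.gauss(mode.mean, mode.std))


def apply_deadzone(command, threshold):
    """Assumed deadzone model: commands with magnitude at or below the
    threshold produce no motion; larger commands are shifted toward zero."""
    if abs(command) <= threshold:
        return 0.0
    return command - threshold if command > 0 else command + threshold


if __name__ == "__main__":
    rng = random.Random(0)
    # One friction draw per simulated environment reset.
    print([round(sample_friction(rng), 3) for _ in range(5)])
    print(apply_deadzone(0.05, 0.1), apply_deadzone(0.5, 0.1))
```

Sampling from a mixture rather than a single uniform range keeps both regimes represented in training, which is the property the bullet above attributes to the Gaussian Mixture Model approach.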

Video: Navigation performance on slippery terrain using different policies: 1. Memoryless feedforward, 2. MLP with frame stacking, 3. LSTM.


Safe Reinforcement Learning for Multi-Robot Systems in Hazardous Environments

  • Student: Xiao Zhou
  • Supervision: Tingting Ni, Kai Ren
  • Goal: Use Constrained Multi-Agent Reinforcement Learning to enable robots to collaborate on survivor rescue missions while ensuring their own safety in hazardous environments with spreading fires and obstacles.
  • Work:
    • Formulated the rescue mission as a Constrained Decentralized Partially Observable Markov Decision Process.
    • Implemented constrained MAPPO, constrained IPPO, and Decentralized CRPO to learn cooperative policies that strictly enforce safety, ensuring robots avoid fires and collisions while rescuing survivors.
  • Outcome:
    • Achieved robust collaborative behaviors where robots successfully coordinate to locate and save survivors.
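
For readers unfamiliar with CRPO, its core switching rule can be sketched as follows. This shows only the single-agent rule and is an assumption-laden illustration: the function name, the tolerance default, and the choice of the first violated constraint are hypothetical, and the page does not describe how the decentralized variant used in the project applies the rule across robots. At each iteration the learner improves the reward if every estimated cost is within its limit (plus a tolerance); otherwise it takes a step that reduces one violated cost.

```python
def crpo_select_objective(cost_estimates, cost_limits, tolerance=0.0):
    """Decide what the next policy update should optimize.

    Returns None when all constraints hold within the tolerance (take a
    reward-improvement step), or the index of a violated constraint
    (take a cost-reduction step for that constraint instead).
    Picking the first violated constraint is an arbitrary choice here.
    """
    for i, (cost, limit) in enumerate(zip(cost_estimates, cost_limits)):
        if cost > limit + tolerance:
            return i
    return None


if __name__ == "__main__":
    # Hypothetical per-robot estimates: [fire exposure, collision rate].
    limits = [0.5, 0.5]
    for estimates in ([0.2, 0.1], [0.2, 0.9]):
        target = crpo_select_objective(estimates, limits)
        step = "reward step" if target is None else f"reduce cost {target}"
        print(estimates, "->", step)
```

Unlike the Lagrangian approach, this rule needs no multiplier to tune: the update alternates between reward and cost objectives based purely on the current constraint estimates.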

Video: Multi-robot coordination in hazardous fire environments: Red = Fire, Blue = Survivors, Green = Exits.
