Motion Control of High-Dimensional Musculoskeletal Systems with Hierarchical Model-Based Planning

Yunyue Wei¹, Shanning Zhuang¹, Vincent Zhuang², Yanan Sui¹

¹Tsinghua University ²Google DeepMind
International Conference on Learning Representations (ICLR), 2025

Paper Code (Coming soon)

Figure 1. High-dimensional musculoskeletal system control over a large collection of motion tasks.

In this paper, we aim to control high-dimensional musculoskeletal systems to achieve movement tasks.
We expect the control method to have the following properties:
• (near) real-time control generation
• training-free deployment across different tasks and environments
• robust performance under sudden model changes

We use the MS-Human-700 model as the target high-dimensional musculoskeletal system.
It is a comprehensive whole-body model consisting of 90 rigid body segments, 206 joints, and 700 muscle-tendon units.
Both the nonlinear neural-muscle-joint dynamics and the high-dimensional state and control spaces present significant challenges for controlling the system.

High-dimensional Musculoskeletal Control with Deep Reinforcement Learning

Among existing approaches to high-dimensional musculoskeletal control, deep reinforcement learning (DRL) is predominantly used, and its successes come with the following significant requirements:
• long training time to generate effective control
• reference trajectories to guide the learning process
• specific task and model to reduce training difficulty

Below we demonstrate the control performance from the current state-of-the-art DRL-based method, DynSyn, on the MS-Human-700 model, where we observe that DynSyn:
• fails to learn a natural gait without reference trajectories
• fails over unexpected terrain conditions
• fails when the model suddenly changes

Video 1. Unnatural behavior without reference trajectory

Video 2. Failure when the terrain condition changes

Video 3. Failure when the posterior leg muscles are suddenly disabled

Model Predictive Control with Morphology-aware Proportional Control (MPC²)

We propose Model Predictive Control with Morphology-aware Proportional Control (MPC²), a hierarchical model-based planning algorithm to address the challenges of high-dimensional musculoskeletal control.
Our method has two major components:
• Model predictive position controller
A sampling-based model predictive controller plans the target posture of the agent, with instant rollouts for rapid response to state changes.
• Morphology-aware proportional controller
A proportional controller adaptively coordinates the actuators to achieve the target joint positions, with gain parameters dynamically adjusted according to the system morphology.
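
The two-level structure described above can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a hypothetical one-joint system driven by four pull-only actuators, and every constant (moment arms, peak force, inertia, damping, gain, sample count, horizon) is made up for readability. Scaling each actuator's gain by its moment arm is only a stand-in for the morphology-aware gain adjustment, whose actual form is not specified on this page. The high level samples candidate target positions and scores them with short rollouts; the low level turns the chosen target into actuator commands.

```python
import random

# Hypothetical toy joint with redundant pull-only actuators (all values are assumptions).
MOMENT_ARMS = [0.04, 0.03, -0.05, -0.02]  # signed moment arms (m)
MAX_FORCE = 200.0   # peak actuator force (N)
INERTIA = 0.5       # joint inertia (kg m^2)
DAMPING = 4.0       # viscous damping (N m s / rad)
DT = 0.01           # simulation step (s)


def step(state, activations):
    """Advance the toy joint one step with semi-implicit Euler integration."""
    q, qd = state
    torque = sum(a * MAX_FORCE * r for a, r in zip(activations, MOMENT_ARMS))
    qdd = (torque - DAMPING * qd) / INERTIA
    qd = qd + DT * qdd
    q = q + DT * qd
    return (q, qd)


def proportional_controller(state, q_target, kp=20.0):
    """Low level: map a target joint position to activations in [0, 1].

    Each actuator's gain is scaled by its moment arm, so only actuators that
    pull toward the target are recruited (a stand-in for morphology awareness).
    """
    q, _ = state
    err = q_target - q
    scale = max(abs(r) for r in MOMENT_ARMS)
    return [min(1.0, max(0.0, kp * err * r / scale)) for r in MOMENT_ARMS]


def rollout_cost(state, q_target, goal, horizon):
    """Simulate the low-level controller toward q_target and accumulate cost."""
    cost = 0.0
    for _ in range(horizon):
        state = step(state, proportional_controller(state, q_target))
        cost += (state[0] - goal) ** 2 + 0.01 * state[1] ** 2
    return cost


def plan_target(state, goal, prev_target, rng, n_samples=16, horizon=20, noise=0.3):
    """High level: sample target positions around the previous plan, keep the cheapest."""
    candidates = [prev_target]
    candidates += [prev_target + rng.gauss(0.0, noise) for _ in range(n_samples)]
    return min(candidates, key=lambda qt: rollout_cost(state, qt, goal, horizon))


def run(goal=0.8, steps=300, seed=0):
    """Replan at every step and return the final (position, velocity)."""
    rng = random.Random(seed)
    state, target = (0.0, 0.0), 0.0
    for _ in range(steps):
        target = plan_target(state, goal, target, rng)
        state = step(state, proportional_controller(state, target))
    return state
```

Calling `run()` drives the toy joint toward the goal angle. The point of the split is that the planner searches over a one-dimensional target position rather than over the four-dimensional activation vector; in a system with hundreds of actuators that reduction is far larger.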

Figure 2. Algorithmic pipeline of MPC²

Musculoskeletal Control with MPC²

Full-body motion control

We show control sequences over full-body motion tasks using MPC².
Our proposed method demonstrates:
• near real-time stable control of the full-body musculoskeletal model, which no previous model-based planning algorithm has achieved
• zero-shot adaptation to different tasks and terrain conditions, which no previous DRL-based method has demonstrated on whole-body musculoskeletal systems

Video 4. Standing upright

Video 5. Walking over flat floor

Video 6. Walking over rough terrain

Video 7. Walking up and down a slope

Video 8. Walking up and down stairs

Sports imitation

We also demonstrate that the training-free, near-real-time control generation of MPC² enables efficient reward engineering, facilitating stable control of sports imitation.

Video 9. Soccer motion imitation

Adaptation to model changes

Compared to the fragile control policy of DRL-based methods, MPC² is capable of:
• zero-shot adaptation to sudden model changes
• achieving robust control even in the presence of actuator faults.

Video 10. Adaptation to disabled posterior leg muscles

Robustness to perturbation forces

MPC² adapts rapidly to sudden perturbation forces applied to the pelvis, demonstrating robust control performance.

Video 11. Consistent 100 N force from random directions

Video 12. 500 N force for 0.2 s from random directions

Control over ostrich models

We demonstrate that MPC² is capable of controlling the ostrich model (120 muscles) with the same cost function used for human model walking.

Video 13. Control over ostrich models with same cost function as human walking

Automatic cost function design

Combining Bayesian optimization with MPC², we can optimize the cost function weights to improve the forward speed of the ostrich without manual tuning.
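
As a rough illustration of this kind of loop, the sketch below runs Bayesian optimization over a single cost weight using a Gaussian-process surrogate and an upper-confidence-bound acquisition rule. Everything specific here is an assumption: the page does not state which kernel, acquisition function, or search space was used, and `forward_speed` is a hypothetical analytic stand-in for "run the controller with this weight and measure the resulting speed".

```python
import math


def rbf(a, b, length=0.2):
    """Squared-exponential kernel with unit prior variance."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def gp_posterior(xs, ys, x, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x) for a in xs]
    alpha = solve(K, ys)
    v = solve(K, k)
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    var = max(rbf(x, x) - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
    return mean, math.sqrt(var)


def forward_speed(weight):
    """Hypothetical stand-in objective; a real run would simulate the controller."""
    return 2.0 - 4.0 * (weight - 0.7) ** 2


def bayes_opt(objective, n_iters=12, beta=2.0):
    """Maximize objective over weights in [0, 1] with GP-UCB on a fixed grid."""
    grid = [i / 50 for i in range(51)]
    xs = [0.0, 1.0]                      # initial design: the two endpoints
    ys = [objective(x) for x in xs]
    for _ in range(n_iters):
        mu = sum(ys) / len(ys)
        centered = [y - mu for y in ys]  # center so the zero-mean prior is reasonable

        def ucb(x):
            m, s = gp_posterior(xs, centered, x)
            return m + beta * s

        x_next = max(grid, key=ucb)      # most promising weight under the surrogate
        xs.append(x_next)
        ys.append(objective(x_next))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]
```

`bayes_opt(forward_speed)` returns the best weight found and its objective value. Because each evaluation needs no training, only a rollout of the controller, an outer loop of this kind can afford to evaluate many candidate weights.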

Figure 5. Optimization over forward speed

Video 14. Optimized gait of ostrich (0.90 m/s to 2.08 m/s)

Figure 6. Optimization over forward speed

Video 15. Optimized gait of human (0.79 m/s to 1.24 m/s)

Center-of-mass support polygon

We demonstrate that MPC² maintains a larger support polygon than DynSyn during walking, which enhances stability.

Video 16. CoM polygon support of MPC²
area: 0.0761±0.0260m²

Video 17. CoM polygon support of DynSyn
area: 0.0582±0.0151m²
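
One common way to compute such an area, shown here as a sketch rather than as the procedure used for the numbers above, is to project the foot contact points onto the ground plane, take their convex hull, and apply the shoelace formula. The contact coordinates in the usage note are hypothetical.

```python
def convex_hull(points):
    """Convex hull of 2D points (Andrew's monotone chain), counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other, so drop it.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def support_polygon_area(contact_points):
    """Area (m^2) of the convex hull of ground-projected contact points."""
    return polygon_area(convex_hull(contact_points))
```

For example, contacts at the corners of a 0.2 m by 0.1 m rectangle plus one interior point, `[(0, 0), (0.2, 0), (0.2, 0.1), (0, 0.1), (0.1, 0.05)]`, give an area of about 0.02 m², since the interior point does not affect the hull.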

Energy consumption

We demonstrate that MPC² is capable of reducing the energy consumption by over 75% compared to DynSyn during walking.

Figure 7. Energy consumption comparison

Performance of MPC baselines

We demonstrate that the baseline MPC methods provided by MuJoCo MPC fail to achieve walking with the MS-Human-700 model.

Non-sampling-based MPC methods require long planning times because they compute derivatives of the high-dimensional dynamics, which hinders real-time decision making.

Video 18. Gradient Descent

Video 19. iLQG

Video 20. iLQS

Sampling-based MPC struggles to sample effective control sequences because of the high-dimensional action space.

Video 21. Cross Entropy

Video 22. Robust Sampling

Video 23. Sample Gradient

Performance of DRL baselines

Below we show the control performance of MPO and DEP-RL on the MS-Human-700 model (best of 3 random seeds). Both methods fail to stand or walk with the high-dimensional model.

Video 24. MPO standing

Video 25. MPO walking

Video 26. DEP-RL standing

Video 27. DEP-RL walking

Dexterous manipulation over the arm model

We demonstrate that MPC² is capable of controlling the arm musculoskeletal model (85 muscles) to rotate a cube through a sequence of target orientations.

Video 28. Dexterous manipulation over arm musculoskeletal model

Conclusion

We propose a high-dimensional control method, MPC², that is capable of:
• achieving near real-time stable control of comprehensive musculoskeletal systems
• enabling training-free full-body motion control across a wide range of motion tasks, many of which have not been achieved by state-of-the-art DRL-based methods
• adapting rapidly to sudden model changes, fully leveraging the over-actuated nature of the system to achieve robust control.

BibTeX

@inproceedings{iclr2025mpc2,
  title={Motion Control of High-Dimensional Musculoskeletal Systems with Hierarchical Model-Based Planning},
  author={Wei, Yunyue and Zhuang, Shanning and Zhuang, Vincent and Sui, Yanan},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025}
}