Abstract
Overview
We introduce DQ-Bench, the first benchmark for dynamic object grasping with quadruped robots. It supports realistic dynamics, diverse objects, multiple levels of task difficulty, and comprehensive evaluation. Building on this benchmark, we propose DQ-Net, a teacher–student framework that combines a Grasp Fusion Module with a lightweight dual-view student network for stable and efficient whole-body dynamic grasping. Extensive experiments show that DQ-Net outperforms baselines in both success rate and responsiveness.
DQ-Bench
Benchmark
- Built on Isaac Gym for high-performance simulation.
- Four difficulty levels: Low-Speed 2D → High-Speed 3D with rough terrain.
- Diverse seen/unseen YCB objects for generalization testing.
- Evaluation metrics: GSR, OSSR, TSC.
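As an illustration of how results might be broken down by difficulty level and by seen/unseen object split, here is a minimal sketch of a success-rate tally. It assumes each episode is logged as a `(level, split, success)` tuple; the level names, the log format, and the `success_rates` helper are hypothetical and are not taken from DQ-Bench, and the sketch does not reproduce the benchmark's exact metric definitions.

```python
from collections import defaultdict


def success_rates(episodes):
    """Return {(level, split): fraction of successful episodes}."""
    counts = defaultdict(lambda: [0, 0])  # key -> [successes, total]
    for level, split, success in episodes:
        counts[(level, split)][0] += int(success)
        counts[(level, split)][1] += 1
    return {key: wins / total for key, (wins, total) in counts.items()}


# Hypothetical log entries: (difficulty level, object split, grasp succeeded).
episodes = [
    ("low_speed_2d", "seen", True),
    ("low_speed_2d", "seen", False),
    ("low_speed_2d", "unseen", True),
    ("high_speed_3d", "unseen", False),
]

rates = success_rates(episodes)
for key in sorted(rates):
    print(key, rates[key])
```

Keying the tally on both level and split keeps generalization to unseen objects visible at each difficulty level instead of averaging it away.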
DQ-Net Framework
Method
DQ-Net integrates a Grasp Fusion Module (GFM) with a hierarchical teacher–student structure:
- Teacher: Uses privileged information (pose, velocity, point cloud) together with the GFM to output optimal grasp poses.
- Student: Lightweight Transformer-based network with dual-perspective depth & mask inputs.
- Low-level controller: Ensures coordinated locomotion and manipulation.
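The teacher–student idea above can be sketched with a toy example: a teacher that sees privileged state produces targets, and a student that sees only observable features is fitted to those targets. Everything here is an assumption made for illustration. The one-dimensional setting, the `LEAD` look-ahead, the linear student, and the function names are hypothetical stand-ins, not DQ-Net's GFM, its Transformer student, or its training procedure.

```python
import random

LEAD = 0.5  # hypothetical look-ahead (seconds) used by the toy teacher


def teacher_target(position, velocity):
    """Toy teacher: uses privileged velocity to aim ahead of the moving object."""
    return position + LEAD * velocity


def student_predict(weights, position, displacement):
    """Toy student: linear in observable features, no ground-truth velocity."""
    return weights[0] * position + weights[1] * displacement


def distill(steps=2000, lr=0.1, seed=0):
    """Fit the student to the teacher's outputs with plain SGD on squared error."""
    rng = random.Random(seed)
    weights = [0.0, 0.0]
    for _ in range(steps):
        position = rng.uniform(-1.0, 1.0)
        velocity = rng.uniform(-1.0, 1.0)
        # What the student can observe: displacement over one unit time step.
        displacement = velocity * 1.0
        error = (student_predict(weights, position, displacement)
                 - teacher_target(position, velocity))
        # Gradient of error**2 with respect to each weight.
        weights[0] -= lr * 2.0 * error * position
        weights[1] -= lr * 2.0 * error * displacement
    return weights


weights = distill()
print(weights)  # approaches [1.0, LEAD] as the student matches the teacher
```

The point of the split is that the privileged inputs are needed only during training; at deployment the student runs on its own observations.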
Results
Experiments
DQ-Net achieves the highest grasp success rates across all difficulty levels and unseen object categories.
Video
Demo
Citation
BibTeX
@misc{liang2025wholebodycoordinationdynamicobject,
title = {Whole-Body Coordination for Dynamic Object Grasping with Legged Manipulators},
author = {Qiwei Liang and Boyang Cai and Rongyi He and Hui Li and
Tao Teng and Haihan Duan and Changxin Huang and Runhao Zeng},
year = {2025},
eprint = {2508.08328},
archivePrefix = {arXiv},
primaryClass = {cs.RO},
url = {https://arxiv.org/abs/2508.08328}
}