Modeling complex, fine-grained hand-object interactions remains challenging, in part because dedicated datasets and specialized capture methods are scarce. Existing motion capture systems are generally limited to basic motion types, such as grasping, and to interactions with primitive rigid or articulated objects. To facilitate the exploration of intricate, dexterous in-hand manipulation of more complex objects, we present DexterCap. We first design a robust, low-cost, and high-fidelity motion capture hardware system that acquires reliable data even in the presence of self-occlusion and complex manipulation. To ensure accurate capture despite severe occlusions, we introduce a specialized patch marker paired with an effective detection and optimization pipeline. We further develop an automated data augmentation pipeline to reconstruct and refine motion data with minimal manual effort, improving both efficiency and data quality. Using this system, we create the DexterHand dataset, which includes subtle, fine-grained manipulation behaviors and interactions with multi-jointed objects such as a Rubik’s cube. By releasing the dataset and supporting source code to the community, we hope that DexterCap will facilitate further research on intricate hand-object interactions.
Cuboid 0
Cuboid 1
Cuboid 2
Cylinder
Plate
Prism
Ring
Rubik's Cube
@article{liang2025dextercap,
title = {DexterCap: An Affordable and Automated System for Capturing Dexterous Hand-Object Manipulation},
author = {Liang, Yutong and Xu, Shiyi and Zhang, Yulong and Zhan, Bowen and Zhang, He and Liu, Libin},
journal = {arXiv preprint arXiv:2601.05844},
year = {2026},
url = {https://arxiv.org/abs/2601.05844}
}