The ADL4D Recording Studio
A synchronized multi-modal capture rig for 4D human activities.
The Multi-View Capture Rig
Capturing 4D activities of daily living requires a synchronized multi-modal setup capable of handling heavy occlusion and fast motion. Our studio is designed to provide dense, consistent data for complex two-person interactions.
Hardware Specifications
Our tabletop setup is a plug-n-play system designed for rapid deployment and high-accuracy calibration. It consists of a cage-like structure instrumented with:
- 8x RealSense D435 Cameras: Providing volumetric RGB-D capture from widely spaced angles to minimize occlusion.
- 8x Optitrack Prime 13 X Cameras: Capturing precise, ground-truth 6DoF poses of objects using IR markers.
- 4x Spotlights: High-intensity lights with diffusers to ensure uniform, controlled illumination.
The “Rolling Shutter” Challenge
A critical technical hurdle in multi-view interaction capture is motion blur.
- Problem: While the IR sensors of the RealSense D435 use a global shutter, its color sensor employs a rolling shutter. Under standard indoor lighting, the long exposure this sensor needs produces significant motion blur during fast hand movements, degrading 3D reconstruction quality.
- Solution: We flood the scene with high-intensity spotlights, which lets us reduce the exposure time to a fixed 0.2 ms. This aggressive exposure setting approximates the sharpness of a global shutter, suppressing blur while maintaining stable frame rates.
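To give a feel for why the short exposure matters, the following minimal sketch estimates the length of a blur streak in pixels under a simple pinhole model. Only the 0.2 ms exposure comes from the setup described above; the hand speed, focal length, working distance, and the 16 ms "indoor" exposure are illustrative assumptions, not measurements from the rig.

```python
def motion_blur_px(speed_m_s, exposure_s, focal_px, depth_m):
    """Approximate blur streak length in pixels.

    Pinhole model, motion parallel to the image plane:
    the point travels speed * exposure metres during the exposure,
    which projects to (distance * focal_px / depth) pixels.
    """
    return speed_m_s * exposure_s * focal_px / depth_m


# Illustrative assumptions (not taken from the rig):
HAND_SPEED_M_S = 2.0   # fast hand movement
FOCAL_PX = 600.0       # focal length in pixels
DEPTH_M = 0.8          # camera-to-hand distance

# 16 ms is an assumed typical indoor exposure; 0.2 ms is the fixed setting above.
for label, exposure_s in [("assumed indoor exposure (16 ms)", 16e-3),
                          ("fixed exposure (0.2 ms)", 0.2e-3)]:
    blur = motion_blur_px(HAND_SPEED_M_S, exposure_s, FOCAL_PX, DEPTH_M)
    print(f"{label}: ~{blur:.1f} px of blur")
```

Under these assumptions the streak shrinks from tens of pixels to well below one pixel, which is the effect the spotlights are meant to achieve.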
Data Synchronization
To ensure precise temporal alignment across modalities, all sensors operate within a unified temporal framework:
- RGB-D Data: Recorded at rates between 30 and 90 FPS, depending on stream configuration.
- Optitrack Data: Captured at a high-speed 120 FPS.
Global Time Synchronization: Each sensor publishes its data stamped with a synchronized global time, established during system startup. To handle potential connection drops, the system supports dynamic ROS node re-registration, ensuring seamless reintegration without sync loss.
Post-Processing: For consistency in downstream learning tasks, the high-frequency streams are binned and downsampled to a stable frame rate between 15 and 30 FPS.
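The post-processing step can be sketched as follows. This is one plausible reading of "binned and downsampled", not the studio's actual pipeline: a common time grid is built at the target frame rate, and each stream contributes the sample whose global timestamp is nearest to each grid time. The function names `make_grid` and `nearest_indices` and the synthetic timestamps are hypothetical.

```python
from bisect import bisect_left


def make_grid(start_s, end_s, fps):
    """Evenly spaced target timestamps covering [start_s, end_s]."""
    n = int((end_s - start_s) * fps) + 1
    return [start_s + i / fps for i in range(n)]


def nearest_indices(stream_ts, grid_ts):
    """For each grid time, return the index of the nearest sample.

    stream_ts must be sorted in ascending order (global timestamps in seconds).
    """
    picked = []
    for t in grid_ts:
        j = bisect_left(stream_ts, t)
        # Compare the neighbours on both sides of the insertion point.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(stream_ts)]
        picked.append(min(candidates, key=lambda k: abs(stream_ts[k] - t)))
    return picked


# Synthetic one-second recordings: mocap at 120 FPS, RGB-D at 60 FPS.
mocap_ts = [i / 120 for i in range(121)]
rgbd_ts = [i / 60 for i in range(61)]

# Restrict the grid to the interval covered by both streams.
start = max(mocap_ts[0], rgbd_ts[0])
end = min(mocap_ts[-1], rgbd_ts[-1])
grid = make_grid(start, end, fps=30)

mocap_idx = nearest_indices(mocap_ts, grid)
rgbd_idx = nearest_indices(rgbd_ts, grid)
print(len(grid), mocap_idx[:4], rgbd_idx[:4])
```

Because every sample carries a synchronized global timestamp, the alignment reduces to a lookup per stream. A real pipeline would likely also reject grid times where the nearest sample is too far away, for example after a dropped connection.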
Academic Impact & Usage
The ADL4D Recording Studio is not just a research benchmark but a living lab. Its plug-n-play nature and high-accuracy calibration have made it a central tool for education and research at TUM.
- Research: Utilized in multiple Student Theses and PhD projects for data acquisition and algorithm testing.
- Education: In full-time use in the Project Lab Human Activity Understanding, where students use the rig to capture their own datasets and test novel computer vision algorithms.