cv

Professional Experience.

Basics

Name Partha Pratim Nath
Label Machine Learning Engineer | 3D Pose Estimation and Reconstruction
Email nath.partha@outlook.com
Phone +49 15239522871
Url https://nath-partha.github.io
Summary

Bridging 3D Vision & Language at Scale

I am a Machine Learning Engineer specializing in 3D perception methods, with over 2 years of experience refining prototypes from SOTA research. My work ranges from constructing the ADL4D dataset for complex human activity understanding to training large-scale 3D VLMs (SpatialLM, LLaMA3.2) on high-performance cloud clusters.

My technical foundation is built on three pillars: 3D human tracking, rigid/non-rigid object pose estimation, and point cloud scene reconstruction. I am currently focused on unifying these distinct fields to tackle the most complex challenges in spatial intelligence and embodied AI.

Key highlights

  • 🚀 Nvidia Inception Program
    Preparing experiments and the team for the Nvidia-DGX-Inception Program (accepted)
  • 🚀 3D Object Detection R&D
    Applying DETR fundamentals to point cloud detection tasks using local attention methods
  • 🚀 3D Vision-Language Models
    Prototyping and reverse-engineering point cloud VLMs from SOTA research
  • 📚 Teaching & Curriculum Design
    Created the Project Lab Human Activity Understanding at TUM
  • 🏆 Kurt Fischer Prize
    Markerless multi-subject hand tracking awarded the Kurt Fischer Prize

Work

  • 2024.12 - Present
    Machine Learning and Computer Vision Engineer
    Cirqular Pointcloud Analytics GmbH
    End-to-end R&D and MLOps for large-scale 3D VLM and Scan2BIM pipelines.
    • ML Ops & Distributed Training Infrastructure
      • Scaled the Pointcept training engine using standalone ZeRO optimizers and model sharding to resolve memory bottlenecks.
      • Evaluated DeepSpeed compilation strategies, identifying Pointcept module incompatibilities and workarounds.
      • Re-architected the training backend by wrapping the HuggingFace Trainer, unifying distributed strategies with custom checkpointing and synchronisation of config/source files.
      • Deployed on-premise ClearML infrastructure for experiment tracking.
      • Evaluated external libraries for unified cloud resource provisioning and training jobs.
      • Integrated multi-TB datasets for preprocessing and training in segmentation and detection tasks.
    • R&D: 3D Vision-Language & Object Detection
      • Integrated LLMs (LLaMA3.2, Qwen2) with 3D backbones (PTv3, Sonata) and reverse-engineered SOTA methods (SpatialLM, Locate3D) to build custom VLM training pipelines on A100 clusters.
      • Developed and ablated 3D-DETR/RoomFormer architectures. Tested 3D local attention encoders; reformulated losses to improve rotation regression for high-aspect-ratio objects.
      • Utilized 150k in GCP startup credits to scale data augmentation and model training experiments on A100 clusters.
    • Production Engineering & Scan2BIM
      • Implemented and containerized LiDAR panoptic segmentation pipelines for Scan2BIM/Scan2CAD tasks, deploying robust models (PTv3, Sonata) for Industry Foundation Classes.
      • Maintained production code and managed model updates for the out-of-core segmentation pipeline.
    • Visualization & Strategy
      • Technical Communication: Visualized progress using Rerun, Open3D, and high-quality GRUT (Nvidia) renders for internal presentations and the Nvidia-DGX-Inception Program.
      • Collaboration: Contributed to core research objectives regarding superpoints, object detection, and 3D reconstruction.
  • 2023.09 - 2024.09
    System Engineer: Machine Learning and Computer Vision
    RevTec Systems AG: Casinos Austria International
    Object detection and tracking in RGBD images for on-premise casino surveillance.
    • Real-time Surveillance System
      • Developed real-time surveillance software tracking currency, gestures, and equipment.
      • Reviewed and integrated external projects to handle camera calibration and drift stabilization.
      • Managed the CVML lifecycle and outreach for alpha customers (UK & ZA), providing rolling updates and leadership demos.
    • Customer Onboarding Toolkit
      • Built a customer onboarding toolkit using SAM and foundation models to generate customer-specific object models and datasets.
      • Improved legacy code and automated model preparation, reducing installation timelines from 2 weeks to less than 5 days.
      • Evaluated and integrated newer model compilation tools and edge devices for inference scaling options.
  • 2022.10 - 2023.05
    Software Engineering Intern
    Infineon Technologies AG
    Developed machine vision software for human pose estimation with 3D cameras.
    • 3D Machine Vision Development and data generation
      • Torch- and open-source-based detection and tracking | 3D multiview calibration.
      • Developed a sensor data acquisition library for training gesture detection radar pipelines with cameras.
      • Designed scalable calibration routines for multi-camera setups and prototyped RGB-only multiview data acquisition.
  • 2021.06 - 2023.08
    Research and Teaching Assistant
    TUM Chair of Media Technology
    Awarded Kurt Fischer Prize for Markerless Motion Capture research.
    • Research: Markerless Motion Capture (Kurt Fischer Prize)
      • Created a novel markerless motion capture toolkit for RGB images.
      • Created a high-fidelity human-object interaction dataset that outperformed previous contributions in pose diversity, accuracy, and the ability to robustly record very long sequences.
      • Utilised deep learning pose estimation, 3D multiview algorithms, and linear solvers to robustly estimate the 3D poses of humans in view.
      • Benchmark Tasks (3D Tracking, Hand Mesh Recovery, Hand Action Segmentation)
      • Featured: https://www.ce.cit.tum.de/en/lmt/home/ Slide 6
    • Teaching Assistantship + New Project Lab
      • Designed Course | Guided Projects | 3DML Topics (ICP, Camera Projection, Rendering) | Demo Scripts
      • Course Link: https://www.ce.cit.tum.de/en/lmt/lehre/projektpraktikum-project-lab-human-activity-understanding/
    • Multicamera Studio Setup
      • Designed a multicamera studio for RGBD streams with RealSense sensors in a streamlined setup
      • Low Latency | Extrinsics Calibration | 8-12 Cameras | Distributed ROS | OptiTrack Integration
    • VR Simulation Tool
      • UE4-based VR simulation and photorealistic data capture tool built on https://sim2realai.github.io/UnrealROX/

Education

  • 2020.10 - 2023.09
    Master
    Technical University of Munich
    School of Computation, Information and Technology
    • Kurt-Fischer €1000 Prize
  • 2016.06 - 2020.05
    Bachelor
    SRM Institute Of Science & Technology
    Electronics and Communication Engineering
    • Project: Multispectral Optics Module for a firefighting robot
    • First Class with Distinction

Skills

3D Vision & VLMs
  • PointTransformerV3
  • Sonata
  • LLaMA3.2
  • Qwen2
  • CLIP
  • SpatialLM
  • Locate3D
Deep Learning Frameworks
  • PyTorch
  • DeepSpeed
  • HuggingFace
  • PyTorch3D
  • Detectron2
  • Pointcept
  • mmLabs
Production ML & Cloud Tools
  • GCP (A100)
  • AWS (L40, A100)
  • Docker
  • Multi-node Training
  • Model Deployment
3D Understanding
  • Scan2CAD
  • Point Cloud Segmentation
  • Point Mesh Loss Functions
  • ICP
  • SLAM
  • Multiview Geometry
  • Camera Calibration
Programming & Core Tools
  • Python
  • C++
  • OpenCV
  • Open3D
  • NumPy
  • Scikit-learn
  • PyTorch3D
  • Rerun
  • CVAT
OpenSource Datasets
  • *ADL4D[1.1M]
  • ScanNet++
  • Structured3D
  • H2O3D
  • DexYCB
  • SpatialLM
  • CV4AEC

Publications

Languages

  • English: Native
  • German: Conversational
  • Misc (French, Hindi, Bengali): Beginner/Conversational

Interests

Robotics
  • Perception Stack
  • Safe Grasp/Interaction
Motion Capture
  • Marker / Markerless
  • Monocular and Multiview
  • Parametric and Non-Parametric
3D Scanning
  • Point/Mesh Reconstruction
  • Point Focused Gaussian Splatting
  • Scan Vectorisation (Scan2CAD)
Drone/Autonomous Camera Tracking
  • Point/Object/Area Tracking
  • Ground Stabilization

References

Dr Rahul Gopal Chaudhari
TUM Senior Scientist
M.Sc. Marsil Zakour
TUM Doctoral Candidate
Maximilian Strobel
Infineon, System Architect Machine Learning
Michael Winking
Infineon, Staff Engineer