Challenges in Pointcloud Vectorisation & Scan2BIM

Role: Machine Learning Engineer
Focus: Scan-to-BIM, Large Scale Pointcloud Processing, 3D VLMs

The Problem: From Raw Scans to Structured BIM

The construction industry faces a massive bottleneck: converting raw digital twins into usable data.

The Bottleneck: LiDAR capture for digital twins produces massive, unstructured point clouds. Converting these into Industry Foundation Classes (IFC) models for Building Information Modeling (BIM) is manual, slow, and error-prone.
Technical Challenges:
- Class Imbalance: Structural elements like walls and floors make up about 90% of a scene, while critical elements like pipes or valves are rare.
- Geometric Ambiguity: In noisy data, distinguishing a square column from a rectangular pipe requires context beyond local geometry.
- Scale: A single building scan can run to terabytes of data, far more than fits in standard GPU memory.
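
Two of these challenges lend themselves to a short illustration. The sketch below is not the production code; it assumes NumPy, integer class labels, and points stored as an (N, 3) array, and the function names are hypothetical. It shows inverse-frequency class weights as one common answer to class imbalance, and XY tiling as one common way to keep each processed chunk within memory.

```python
import numpy as np

def inverse_frequency_weights(labels, num_classes, eps=1e-6):
    """Per-class loss weights proportional to 1 / class frequency, normalised to mean 1.

    Assumes every class occurs at least once; an absent class would get a
    very large weight because only eps guards the division.
    """
    counts = np.bincount(labels, minlength=num_classes).astype(np.float64)
    freq = counts / counts.sum()
    weights = 1.0 / (freq + eps)
    return weights / weights.mean()

def tile_indices(points, tile_size):
    """Group point indices by the XY tile they fall in, so tiles can be processed one at a time."""
    keys = np.floor(points[:, :2] / tile_size).astype(np.int64)
    uniq, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # flat tile id per point
    order = np.argsort(inverse, kind="stable")
    splits = np.cumsum(np.bincount(inverse))[:-1]
    return {tuple(k): idx for k, idx in zip(uniq.tolist(), np.split(order, splits))}
```

The weights would typically be passed to a weighted cross-entropy loss, and each tile's indices used to load and segment one chunk at a time. Real pipelines usually add overlap between tiles to avoid cutting objects at tile borders, which this sketch leaves out.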

Transitioning from noisy, unstructured LiDAR data (Left) to structured, semantic BIM models (Right).

The core vectorisation engine converts massive point clouds into CAD primitives.

Segmentation: We deployed Point Transformer V3 (PTv3) and Sonata to handle semantic segmentation of massive point sets.
Optimization: I led the loss-function engineering effort. We reformulated 3D DETR losses with keypoint regression, specifically solving the “rotation regression” problem for thin, high-aspect-ratio objects like walls and beams.
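
To illustrate why keypoints help with rotation, here is a minimal sketch under stated assumptions; it is not the loss used in the pipeline, whose exact formulation is not described here. It assumes a thin object is represented by the two endpoints of its centre line in the XY plane, and the function names are hypothetical. A wall with yaw 0 and the same wall with yaw π are the same object, yet a direct angle loss sees a difference of π. An order-invariant endpoint loss sees none.

```python
import numpy as np

def box_to_endpoints(center, length, yaw):
    """Endpoints of the centre line of a thin box in the XY plane, shape (2, 2)."""
    center = np.asarray(center, dtype=np.float64)
    direction = np.array([np.cos(yaw), np.sin(yaw)])
    half = 0.5 * length * direction
    return np.stack([center - half, center + half])

def endpoint_loss(pred, target):
    """L1 loss between two endpoint pairs, taking the better of the two endpoint orderings."""
    direct = np.abs(pred - target).sum()
    swapped = np.abs(pred[::-1] - target).sum()
    return min(direct, swapped)

# Same physical wall, described with yaw 0 and yaw pi
wall_a = box_to_endpoints((0.0, 0.0), 4.0, 0.0)
wall_b = box_to_endpoints((0.0, 0.0), 4.0, np.pi)

angle_error = abs(0.0 - np.pi)                 # a naive angle loss penalises this heavily
keypoint_error = endpoint_loss(wall_a, wall_b)  # ~0 up to floating-point error
```

Taking the minimum over both orderings is what removes the 180° ambiguity; a full 3D version would need more keypoints and a matching step over their permutations.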

This pipeline is fed by research modules that inject “common sense” into the geometry.

VLM Integration: I pioneered the integration of Llama 3.2 and Qwen2 directly with 3D backbones. This allows the model to leverage “common sense” reasoning to resolve geometric ambiguities (e.g., “a pipe usually connects to a wall”).
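
One common pattern for connecting a 3D backbone to a language model is to project pooled scene features into the language model's embedding space and prepend them to the text tokens. The sketch below shows only that shape-level idea with NumPy and random data; it is an assumption about the general approach, not a description of the actual integration, and all names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def project_scene_tokens(point_feats, weight, bias):
    """Linear projection from 3D-backbone feature size to the language model's embedding size."""
    return point_feats @ weight + bias

def build_input_sequence(scene_tokens, text_embeds):
    """Prepend projected scene tokens to the text token embeddings."""
    return np.concatenate([scene_tokens, text_embeds], axis=0)

# Hypothetical sizes, chosen only for illustration
num_scene_tokens, feat_dim, embed_dim, num_text_tokens = 32, 64, 128, 10

point_feats = rng.normal(size=(num_scene_tokens, feat_dim))  # stand-in for backbone output
weight = rng.normal(size=(feat_dim, embed_dim)) * 0.02       # stand-in for a learned projector
bias = np.zeros(embed_dim)
text_embeds = rng.normal(size=(num_text_tokens, embed_dim))  # stand-in for embedded prompt tokens

sequence = build_input_sequence(
    project_scene_tokens(point_feats, weight, bias), text_embeds
)
```

In a real system the projector would be trained and the combined sequence fed to the language model; here the arrays only demonstrate how the two modalities end up in one token sequence.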

The Dual-Pipeline Architecture: Stable Engineering fed by Experimental Research.

My role required a rigorous scientific process to bridge the gap between academic theory and production-grade stability.

Sourcing & Evaluation: Actively monitored SOTA research streams (e.g., SpatialLM, SceneScript, Locate3D), assessing new papers weekly for their theoretical applicability to our specific domain of dense, noisy industrial scans.
Adaptation (Reverse Engineering): Translated theoretical architectures into custom internal codebases. This involved deeply reverse-engineering academic repositories to decouple core innovations from dataset-specific hacks.
Testing & Validation: Implemented robust testing frameworks using publicly released checkpoints to establish baselines before internal training.
Scaling & Confirmation: Conducted supervised tuning of promising models on A100 clusters. We empirically determined utility at scale before green-lighting models for integration into the production pipeline.
Visualization: Maintained transparency with internal stakeholders by tracking experiments and results via ClearML, Rerun, and Open3D.
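
As an example of the kind of baseline metric this process relies on, the sketch below computes per-class and mean intersection-over-union (IoU) for semantic segmentation from predicted and ground-truth labels. It assumes NumPy and integer labels in the range [0, num_classes); the function names are hypothetical and this is not the internal test framework.

```python
import numpy as np

def confusion_matrix(pred, target, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    idx = target * num_classes + pred
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def per_class_iou(pred, target, num_classes):
    """IoU per class; classes absent from both pred and target are NaN."""
    cm = confusion_matrix(pred, target, num_classes)
    tp = np.diag(cm).astype(np.float64)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    iou = np.full(num_classes, np.nan)
    present = union > 0
    iou[present] = tp[present] / union[present]
    return iou

def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that actually occur."""
    return float(np.nanmean(per_class_iou(pred, target, num_classes)))
```

Running the same function on a public checkpoint and on an internally trained model gives a like-for-like comparison. Reporting the per-class values alongside the mean matters here, since rare classes such as pipes or valves can be hidden by a good average.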

Successfully maintained a Pointcloud-to-Layout algorithm that combined segmentation, instancing, and primitive vectorisation, with models and algorithms updated continuously as the growing customer base surfaced more data and edge cases.
Drove continuous improvement of product outputs by evaluating new prototypes internally and testing them in alpha builds.
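
For readers unfamiliar with primitive vectorisation, the sketch below shows the idea in its simplest form: turning the points of one segmented wall instance into a 2D line segment with a thickness, using PCA. It assumes NumPy and that the wall's points have already been isolated and projected to the floor plane; it is an illustrative simplification with a hypothetical function name, not the production algorithm.

```python
import numpy as np

def fit_wall_segment(points_xy):
    """Fit a 2D segment (start, end, thickness) to the footprint of one wall instance."""
    pts = np.asarray(points_xy, dtype=np.float64)
    centroid = pts.mean(axis=0)
    centered = pts - centroid

    # eigh returns eigenvalues in ascending order, so the last eigenvector
    # is the direction of largest spread (the wall's long axis).
    _, eigvecs = np.linalg.eigh(np.cov(centered.T))
    axis, normal = eigvecs[:, -1], eigvecs[:, 0]

    # Extent along the long axis gives the endpoints
    t = centered @ axis
    start = centroid + t.min() * axis
    end = centroid + t.max() * axis

    # Extent along the normal gives the wall thickness
    n = centered @ normal
    thickness = n.max() - n.min()
    return start, end, thickness
```

Min and max extents are sensitive to outliers, so on noisy scans a robust variant (for example percentiles or RANSAC) would be a more realistic choice.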