CVPR 2024 · Highlight 🤩

CAD-SIGNet
CAD Language Inference from Point Clouds
using Layer-wise Sketch Instance Guided Attention

Mohammad Sadil Khan¹ · Elona Dupont¹ · Sk Aziz Ali¹,² · Kseniya Cherenkova¹,³ · Anis Kacem¹ · Djamila Aouada¹

¹SnT, University of Luxembourg · ²DFKI AV Group · ³Artec3D

Figure: Full design history recovery from an input point cloud (top-left) and CAD-SIGNet user interaction (bottom-left and right).

01 / Contribution

What We Propose

We propose CAD-SIGNet, an end-to-end trainable, auto-regressive architecture that recovers the design history of a CAD model, represented as a sequence of sketch-and-extrusion operations, from an input point cloud. The model learns visual-language representations through layer-wise cross-attention between point cloud and CAD language embeddings.

End-to-End Auto-Regressive Network

An end-to-end trainable auto-regressive network that infers CAD language given an input point cloud.


Multi-Modal Transformer Blocks

Multi-modal transformer blocks that apply layer-wise cross-attention between point cloud and CAD language embeddings.


Sketch Instance Guided Attention

A Sketch instance Guided Attention (SGA) module that guides the layer-wise cross-attention mechanism to attend to the regions of the point cloud that are relevant for predicting sketch parameters.


02 / Architecture

Method Overview

Figure: Method Overview. CAD-SIGNet (left) is composed of \(\mathbf{B}\) Multi-Modal Transformer blocks, each consisting of an \(\operatorname{LFA}\) module that extracts point features \(\mathbf{F}_{b}^v\) and an \(\operatorname{MSA}\) module that extracts token features \(\mathbf{F}_{b}^c\). An SGA module (top right) combines \(\mathbf{F}_{b}^v\) and \(\mathbf{F}_{b}^c\) for CAD visual-language learning. A sketch instance \(\mathbf{I}\) (bottom right), obtained from the predicted extrusion tokens, is used to apply a mask \(\mathbf{M}_{\text{sga}}\) during cross-attention in the SGA module when predicting sketch tokens.
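The masking idea in the caption can be illustrated with a minimal, dependency-free sketch of masked cross-attention. This is not the authors' implementation. It assumes that token features act as queries and point features act as both keys and values, it uses a single head with no learned projections, and it treats the mask as a plain list of booleans (`True` for points inside the sketch instance). The function names `softmax` and `masked_cross_attention` are hypothetical.

```python
import math


def softmax(scores):
    # Numerically stable softmax; entries equal to -inf receive weight 0.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_cross_attention(token_feats, point_feats, mask):
    """Toy single-head cross-attention (assumed layout, not the paper's code).

    token_feats: T query vectors of length d (stand-in for token features).
    point_feats: N vectors of length d, used as keys and values
                 (stand-in for point features).
    mask:        N booleans; True means the point lies in the sketch instance.
    """
    if not any(mask):
        raise ValueError("mask must keep at least one point")
    d = len(point_feats[0])
    outputs = []
    for q in token_feats:
        scores = []
        for k, keep in zip(point_feats, mask):
            dot = sum(qi * ki for qi, ki in zip(q, k))
            # Points outside the sketch instance are excluded before softmax.
            scores.append(dot / math.sqrt(d) if keep else -math.inf)
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output vector per token.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, point_feats)) for j in range(d)]
        )
    return outputs


tokens = [[1.0, 0.0]]
points = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
attended = masked_cross_attention(tokens, points, [True, True, False])
```

In this toy call the third point is masked out, so the output is a convex combination of the first two points only. A real model would add learned query, key, and value projections and multiple heads.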

03 / Results

Visual Results

We evaluated CAD-SIGNet on two reverse engineering scenarios:

(1) Design History Recovery: tested on DeepCAD, CC3D, and Fusion360 (cross-dataset).
(2) Conditional Auto-Completion from User Input.

For scenario (1), DeepCAD is used as the baseline. For scenario (2), SkexGen and HNC are used as baselines.

Task Description (Design History Recovery): Given an input point cloud, the task is to infer the CAD design sequence.
Note: All models are trained on the DeepCAD dataset.

Task Description (Conditional Auto-Completion): Given a complete point cloud and a partial CAD sequence, the task is to recover the ground-truth CAD construction history. All models are trained and tested on DeepCAD.
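Both scenarios can be viewed as the same auto-regressive loop with a different starting sequence: an empty prefix for design history recovery, and the user's partial CAD sequence for auto-completion. The sketch below illustrates that view under stated assumptions. Greedy decoding, the integer token ids, and the names `decode`, `next_token`, `START`, and `END` are all hypothetical. In a real model `next_token` would also condition on the input point cloud; here it is a toy stand-in.

```python
from typing import Callable, List, Sequence

START, END = 0, 99  # hypothetical special token ids


def decode(
    next_token: Callable[[Sequence[int]], int],
    prefix: Sequence[int] = (),
    max_len: int = 64,
) -> List[int]:
    """Extend a token sequence one token at a time until END or max_len.

    prefix: tokens already fixed by the user (empty for full recovery).
    """
    tokens = [START, *prefix]
    while len(tokens) < max_len:
        tok = next_token(tokens)  # prediction depends on all earlier tokens
        tokens.append(tok)
        if tok == END:
            break
    return tokens


def toy_next_token(tokens: Sequence[int]) -> int:
    # Stand-in for a trained network: counts up to 4, then stops.
    return tokens[-1] + 1 if tokens[-1] < 4 else END


full_recovery = decode(toy_next_token)                  # no user input
auto_completion = decode(toy_next_token, prefix=[1, 2])  # user-supplied start
```

With this toy predictor both calls return the same sequence, because the prefix happens to match what the predictor would have produced. With a trained model, the prefix would steer the completion toward the user's partial design.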

04 / Video

Paper Video


05 / Acknowledgement

Acknowledgement

The present project is supported by the National Research Fund, Luxembourg, under the BRIDGES2021/IS/16849599/FREE-3D and IF/17052459/CASCADES projects, and by Artec3D.

06 / Citation

Cite Our Work

If you find this work useful, please consider citing:


@InProceedings{Khan_2024_CVPR,
  author    = {Khan, Mohammad Sadil and Dupont, Elona and Ali, Sk Aziz and Cherenkova, Kseniya and Kacem, Anis and Aouada, Djamila},
  title     = {CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2024},
  pages     = {4713-4722}
}
