A2Z-10M+: Geometric Deep Learning with A-to-Z BRep Annotations for AI-Assisted CAD Modeling and Reverse Engineering

Pritham Kumar Jena¹,² Bhavika Baburaj¹,² Tushar Anand² Vedant Dutta²
Vineeth Ulavala² Sk Aziz Ali¹,²

¹3D Vision Group (3DVG) ²BITS Pilani, Hyderabad, India

IEEE CVPR 2026

A2Z-10M Teaser

A2Z-10M+: The largest multi-modal annotation resource for BRep learning, unifying 10 million annotations across 1 million ABC CAD models. By tightly integrating high-resolution scan-like meshes, 3D hand-drawn sketches, explicit geometric–topological BRep descriptors (co-edges, corners, surfaces), and rich mechanical-language captions and tags, A2Z bridges raw geometry with semantic understanding. The scale, diversity, and structural depth of A2Z unlock an unprecedented foundation for next-generation representation learning, cross-modal reasoning, and intelligent X-to-CAD systems.

01

Teaser

02

Contribution

We propose A2Z-10M+, the largest dataset with multimodal annotations, to advance X-to-BRep representation learning and precision engineering for applications such as product design, retrieval, and language-guided CAD retrieval and generation within eXtended Reality (XR) environments. Our main contributions are:

  1. Highly realistic 3D scan pairs for 1 million low-poly CAD models from the ABC dataset.
  2. Multi-level CAD sketches obtained by simulating different skill levels of drawing artists. Realistic deformation, tapering, hand-drawing wobble, curve openings, and stroke patterns are applied on the normalized arc-length parameterization of BRep co-edge curves.
  3. BRep annotations are transferred to scans and sketches with precision using multi-layered distance transforms between the noisy scan and the CAD model (.STEP).
  4. Textual captions and tags generated by injecting BRep metadata into a jury system of multiple Large Vision Language Models (Qwen3 and InternVL3). These 1 million textual captions and tags allow retrieval with system-level information, functional intent, form, and usability as the input query.
  5. A foundation model for BRep boundary and corner detection, trained on 300K data samples, that outperforms state-of-the-art methods by a large margin. The model is tested on numerous first-hand, industry-grade scans for precision engineering.
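As a rough illustration of the annotation transfer in contribution 3, the sketch below assigns each noisy scan point the BRep face label of its nearest CAD sample and rejects points that lie too far from the CAD surface. It is a simplified single-pass nearest-neighbor stand-in, not the multi-layered distance transform used for the dataset; the function name `transfer_labels`, the `max_dist` threshold, and the toy coordinates are all assumptions made for this example.

```python
import math

def transfer_labels(scan_points, cad_points, cad_labels, max_dist):
    """Give each scan point the label of its nearest CAD sample.

    Points farther than max_dist from every CAD sample get None.
    Brute-force search; a spatial index would be needed at real scale.
    """
    transferred = []
    for p in scan_points:
        best_label, best_d = None, float("inf")
        for q, label in zip(cad_points, cad_labels):
            d = math.dist(p, q)
            if d < best_d:
                best_label, best_d = label, d
        transferred.append(best_label if best_d <= max_dist else None)
    return transferred

# Toy CAD samples (hypothetical): two on face 0, one on face 1.
cad_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
cad_labels = [0, 0, 1]

# Noisy scan points; the last one is an outlier far from the model.
scan_points = [
    (0.05, -0.02, 0.01),
    (0.9, 0.1, 0.0),
    (0.0, 0.1, 0.95),
    (5.0, 5.0, 5.0),
]

labels = transfer_labels(scan_points, cad_points, cad_labels, max_dist=0.25)
print(labels)  # [0, 0, 1, None]
```

The distance threshold is what keeps scan noise and outliers from inheriting a label they do not belong to; how that threshold is chosen in practice is not stated on this page.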

A2Z-10M+ Dataset Overview
03

Data and Annotation Visualizer

Scan Annotated by BRep Features: CAD trimesh vertices colored by BRep face membership. Toggle between face partition, boundary curves (per edge ID), wire loops, and corner/junction overlays.
3D Sketches: Multi-level 3D hand-drawn strokes simulating five skill levels, from beginner wobble to expert precision. Each level applies progressive deformation, tapering, curve openings, and stroke patterns on arc-length parameterized BRep co-edges.
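To make the idea of perturbing an arc-length parameterized curve concrete, here is a minimal sketch. It resamples a polyline (standing in for a sampled BRep co-edge) at equal steps of normalized arc length, adds per-point Gaussian jitter as a crude "wobble", and truncates the tail to leave a curve opening. The function names, the noise model, and the parameters `wobble` and `open_frac` are assumptions for illustration; the actual stroke simulation, including tapering and skill levels, is not specified here.

```python
import math
import random

def resample_by_arc_length(polyline, n):
    """Resample a 3D polyline at n points equally spaced in arc length.

    Assumes n >= 2 and at least two input vertices.
    """
    # Cumulative arc length at each input vertex.
    cum = [0.0]
    for a, b in zip(polyline, polyline[1:]):
        cum.append(cum[-1] + math.dist(a, b))
    total = cum[-1]

    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(polyline) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = 0.0 if seg_len == 0 else (target - cum[seg]) / seg_len
        a, b = polyline[seg], polyline[seg + 1]
        out.append(tuple(a[k] + t * (b[k] - a[k]) for k in range(3)))
    return out

def sketch_stroke(polyline, n=50, wobble=0.01, open_frac=0.1, seed=0):
    """Return a wobbly, slightly open version of the input curve."""
    rng = random.Random(seed)
    pts = resample_by_arc_length(polyline, n)
    keep = n - round(open_frac * n)  # drop the tail to leave an opening
    return [
        tuple(c + rng.gauss(0.0, wobble) for c in p)
        for p in pts[:keep]
    ]

# A straight edge with unevenly spaced input vertices.
edge = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
uniform = resample_by_arc_length(edge, 5)
stroke = sketch_stroke(edge, n=50, wobble=0.01, open_frac=0.1)
print(len(uniform), len(stroke))  # 5 45
```

Working in normalized arc length means the same wobble and opening settings behave consistently across short and long edges, which is presumably why the perturbations are defined on that parameterization.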
2D Sketches: Projected 2D representations of 3D BRep geometry with full annotation support for cross-modal reasoning, retrieval, and AI-assisted reverse engineering tasks.

Interactive Visualizer — Coming Soon

Tags and Caption: Rich mechanical-language captions and semantic tags bridging raw geometry with natural language understanding, enabling cross-modal retrieval, language-guided CAD generation, and XR-based design workflows.
Turntable with Caption
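One simple way a jury of several vision-language models could agree on tags is majority voting, sketched below. This page does not describe how the jury actually reaches a decision, so the voting rule, the function name `jury_tags`, the `min_votes` threshold, and the placeholder model names and tags are all assumptions.

```python
from collections import Counter

def jury_tags(proposals, min_votes=2):
    """Keep tags proposed by at least min_votes models.

    proposals maps a model name to the list of tags it produced.
    Tags are compared case-insensitively, one vote per model per tag.
    """
    votes = Counter()
    for tags in proposals.values():
        votes.update({t.strip().lower() for t in tags})
    return sorted(t for t, c in votes.items() if c >= min_votes)

# Hypothetical outputs from three jury members for one CAD model.
proposals = {
    "model_a": ["Bracket", "mounting hole", "flange"],
    "model_b": ["bracket", "flange", "gear"],
    "model_c": ["bracket", "Mounting Hole"],
}

agreed = jury_tags(proposals)
print(agreed)  # ['bracket', 'flange', 'mounting hole']
```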
04

Results

Quantitative Results: Benchmark evaluation metrics comparing A2Z-10M+ trained models against state-of-the-art methods on BRep boundary detection, corner detection, and cross-modal retrieval tasks.
[Figures: Dataset Statistics (4 panels), Caption Tags (10 panels), Face/Edge Statistics (4 panels)]
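The specific metrics behind the benchmark are not named on this page. As an assumed example of how a per-point detection task such as boundary detection is commonly scored, the sketch below computes precision, recall, and F1 from binary predictions; the labels are toy values, not results from the paper.

```python
def precision_recall_f1(pred, truth):
    """Precision, recall, and F1 for binary per-point labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Toy labels: 1 marks a boundary point, 0 a non-boundary point.
truth = [1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 1, 0]

precision, recall, f1 = precision_recall_f1(pred, truth)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.667 0.667
```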
05

Acknowledgement

This project was primarily funded by BITS Pilani Hyderabad’s NFSG Grant (Reference N4/24/1033). We are thankful to Vinci4D.ai for providing some CAD model samples of electronic enclosures.

06

Citation

If you find our work useful, please cite:

@inproceedings{AliA2ZCVPR26,
  title     = {A2Z-10M+: Geometric Deep Learning with A-to-Z BRep Annotations for AI-Assisted CAD Modeling and Reverse Engineering},
  author    = {Jena, Pritham Kumar and Baburaj, Bhavika and Anand, Tushar and Dutta, Vedant and Ulavala, Vineeth and Ali, Sk Aziz},
  booktitle = {arXiv},
  year      = {2026}
}
07

Get in Touch

If you have questions, suggestions, or interest to share, tell us.