SLICE-100K: A Comprehensive Resource for the Future of 3D Printing and AI Integration

SLICE-100K, introduced by researchers at Iowa State and NYU, is a groundbreaking dataset of over 100,000 G-code files and CAD models, aimed at advancing 3D printing and digital manufacturing. This comprehensive resource facilitates the development of multimodal foundation models and addresses significant gaps in additive manufacturing research.

CoE-EDP, VisionRI | Updated: 10-07-2024 17:06 IST | Created: 10-07-2024 17:06 IST

Researchers at Iowa State University and New York University have introduced SLICE-100K, a groundbreaking dataset aimed at advancing additive manufacturing, particularly extrusion-based 3D printing. The dataset includes over 100,000 G-code files paired with their corresponding tessellated CAD models, LVIS categories, geometric properties, and renderings. This comprehensive collection addresses a significant gap in the availability of large, curated datasets for additive manufacturing, facilitating the development of multimodal foundation models in digital manufacturing.

Building the Ultimate 3D Printing Dataset

SLICE-100K was built from triangulated meshes derived from the Objaverse-XL and Thingi10K datasets. Data collection involved gathering STL models from Thingiverse and filtering models from Thingi10K based on specific criteria. Each model is accompanied by renderings, descriptive captions, and geometric properties, which are essential for understanding its structural characteristics. The dataset encompasses a diverse range of 3D printable objects and is intended to encourage the research community to tackle new challenges in design and manufacturing. To demonstrate its utility, the researchers fine-tuned GPT-2 on a subset of the dataset to translate G-code from the legacy Sailfish format to the modern Marlin format.

Innovating with G-code and AI

The G-code was generated with PrusaSlicer’s command line functionality, which produced high-quality G-code suitable for various 3D printers and printing conditions. To diversify the structural properties of the G-code files, an infill pattern was selected at random for each model during slicing. The patterns were Gyroid, Honeycomb, Cubic, and Grid, each of which yields different mechanical properties and print times. STL renderings were generated using modified Blender scripts from Objaverse-XL, providing multiple views of each object for comprehensive visual coverage. The models were also categorized using LVIS categories: the renderings supported object category generation, and image embeddings of the views were created with the pre-trained CLIP-ViT-L-14 model.
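The slicing step described above, with one infill pattern picked at random per model, could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the researchers' pipeline: the function name build_slice_command, the file names, and the executable name prusa-slicer are hypothetical choices made here, and the flags shown are the PrusaSlicer command line options for exporting G-code, setting the infill pattern, and choosing the G-code flavor.

```python
import random

# The four infill patterns named in the article, written as PrusaSlicer
# fill_pattern values.
INFILL_PATTERNS = ["gyroid", "honeycomb", "cubic", "grid"]


def build_slice_command(stl_path, gcode_path, rng, flavor="marlin",
                        executable="prusa-slicer"):
    """Return (command, pattern) for slicing one STL with a random infill.

    The executable name is an assumption; it differs between platforms
    and installations.
    """
    pattern = rng.choice(INFILL_PATTERNS)
    command = [
        executable,
        "--export-gcode",            # slice the model and write G-code
        "--fill-pattern", pattern,   # infill pattern chosen for this model
        "--gcode-flavor", flavor,    # e.g. "marlin" or "sailfish"
        "--output", gcode_path,
        stl_path,
    ]
    return command, pattern


rng = random.Random(0)  # seeded so the random choice is reproducible
command, pattern = build_slice_command("model.stl", "model.gcode", rng)
print(" ".join(command))
# To run the slicer for real (requires PrusaSlicer to be installed):
#   import subprocess; subprocess.run(command, check=True)
```

Building the command separately from running it keeps the random choice easy to log, so each G-code file can be stored alongside the infill pattern that produced it.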

Breaking New Ground in G-code Translation

The researchers also explored G-code translation, focusing on converting G-code from Sailfish to Marlin. They developed pre-processing methods that maintain context and created pairs of G-code segments for translation. The process involved splitting G-code layers into manageable chunks while preserving the spatial semantics of the toolpath across the two G-code flavors. The pre-processing steps, which included contour flipping and pair creation, were designed to keep the translation reliable and accurate.

Existing large language models (LLMs) were also evaluated on G-code geometric transformations and flavor translation using SLICE-100K. The study found that GPT-3.5, GPT-4, Claude-2, Llama-2-70b, and StarCoder had varying degrees of success in performing scaling operations and G-code translation. The evaluation metrics included an image-space intersection-over-union (IoU) metric for G-code fidelity, which quantifies the accuracy of a G-code generation model by comparing rendered images of G-code layers.
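To make the image-space IoU idea concrete, the sketch below rasterizes the extruding moves of a single layer onto a small top-down grid and compares two layers by the ratio of shared to total marked cells. It is a minimal illustration, not the researchers' renderer or metric implementation: the function names, the 64-cell grid, the 100 mm square extent, and the assumptions of absolute X/Y coordinates and relative E values are all choices made here for the example.

```python
import re

# Matches the X, Y and E words of a G-code move, e.g. "X10.5" or "E0.8".
_WORD = re.compile(r"([XYE])(-?\d*\.?\d+)")


def extrusion_segments(layer_lines):
    """Collect (x0, y0, x1, y1) for moves that extrude material.

    Assumes absolute X/Y coordinates and relative E values, so a
    positive E on a move means material was deposited.
    """
    x = y = None
    segments = []
    for line in layer_lines:
        code = line.split(";", 1)[0].strip()  # drop trailing comments
        if not code.startswith(("G0 ", "G1 ")):
            continue
        words = {key: float(val) for key, val in _WORD.findall(code)}
        nx, ny = words.get("X", x), words.get("Y", y)
        moved = None not in (x, y, nx, ny) and (nx, ny) != (x, y)
        if moved and words.get("E", 0.0) > 0:
            segments.append((x, y, nx, ny))
        x, y = nx, ny
    return segments


def rasterize(segments, size=64, extent=100.0):
    """Mark the grid cells touched by each segment (top-down view)."""
    grid = [[False] * size for _ in range(size)]
    scale = (size - 1) / extent  # grid cells per millimetre
    for x0, y0, x1, y1 in segments:
        # Sample densely enough that consecutive samples are under a cell apart.
        steps = int(max(abs(x1 - x0), abs(y1 - y0)) * scale) * 2 + 1
        for i in range(steps + 1):
            t = i / steps
            col = round((x0 + t * (x1 - x0)) * scale)
            row = round((y0 + t * (y1 - y0)) * scale)
            if 0 <= row < size and 0 <= col < size:
                grid[row][col] = True
    return grid


def iou(a, b):
    """Intersection over union of two equally sized boolean grids."""
    pairs = [(p, q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    union = sum(p or q for p, q in pairs)
    inter = sum(p and q for p, q in pairs)
    return inter / union if union else 1.0


# A square perimeter; the first move is a travel with no extrusion.
reference = [
    "G1 X10 Y10 F3000 ; travel",
    "G1 X50 Y10 E1.0",
    "G1 X50 Y50 E1.0",
    "G1 X10 Y50 E1.0",
    "G1 X10 Y10 E1.0",
]
candidate = reference[:-1]  # same layer with the last edge missing

ref_img = rasterize(extrusion_segments(reference))
cand_img = rasterize(extrusion_segments(candidate))
print(f"IoU: {iou(ref_img, cand_img):.3f}")
```

An identical layer scores 1.0 and a layer with no overlapping extrusion scores 0.0, so a translated layer that drops or distorts part of the toolpath is penalized in proportion to the area it gets wrong.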

The Future of Digital Manufacturing

The G-code renderer, a Python-based tool developed by the researchers, generates top-down renderings of individual layers, enabling a detailed examination of the 3D structure. This approach is essential for capturing all relevant parts of the shape, including internal structural supports that are not visible from the outside.

The researchers found that fine-tuning on even small subsets of SLICE-100K significantly improved G-code translation. For instance, fine-tuning on just five shapes led to substantial gains in translation quality, highlighting the dataset's effectiveness in reducing the translation task to simple local mappings. SLICE-100K aims to serve as a foundational resource for future innovations in manufacturing, paving the way for domain-specific foundation models. Its comprehensive nature and multimodal approach make it a valuable tool for advancing research in digital manufacturing.

Despite these advances, SLICE-100K has certain limitations. One major challenge is the difficulty of verifying the LVIS categories assigned to the models. Additionally, all models in SLICE-100K were sliced only along the default Z-direction. This uniform slicing approach may limit the dataset’s applicability to research into multi-directional slicing techniques and their impact on manufacturing outcomes. Addressing these limitations in future versions will be crucial for enhancing the dataset's utility and broadening its scope of application.

By providing a robust, curated dataset, SLICE-100K represents a significant step forward in the integration of AI and additive manufacturing, enabling the development of smarter, more efficient manufacturing processes.

  • Devdiscourse