Synthetic Data Generation for Intelligent Inspection of Structural Environments

Noshin Habib, University of Texas at El Paso


Automated detection of cracks and corrosion in pavements and industrial settings is essential to cost-effective maintenance. Deep learning has driven substantial improvement in this area. Such models, however, require large amounts of data with accurate ground truth and enough variation for the model to generalize, and such data is not widely available. Recent work has used computer graphics to create synthetic data to address this shortage, but it has been limited to specific objects, such as cars and human beings; textures and deformities within such objects are left unexplored. This study introduces an approach to synthetically produce a dataset of pavement images with cracks and a dataset of industrial images with corrosion using Unreal Engine 5, a 3D computer graphics game engine. For both datasets, a novel annotation technique is used to provide labels with pixel-level detail. To make the datasets usable with object detection algorithms, a Python script derives bounding boxes from the segmented ground truth. The aim of the datasets is not to replace real data altogether, but to save the time and resources that would be required to gather enough images with high levels of variety, without the need for manual annotation. Models are trained on the virtual datasets in combination with real data and evaluated using the You Only Look Once (YOLOv4) object detection model. Models trained on the datasets are also tested on real data to show the transferability of learning from synthetic data to real-world applications. The datasets will be publicly available so that they can easily be altered for the needs of the user.
This work provides evidence suggesting that (i) the creation of publicly available synthetic data using open-source gaming engines does not have to be limited to large objects and can significantly cut down on the time and resources needed to obtain accurately labeled data, and (ii) training on virtual data improves performance in detecting cracks and corrosion.
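
The abstract mentions deriving bounding boxes from segmented ground truth but gives no implementation details. The following is a minimal sketch of one common way to do this, not the author's script: it assumes a binary mask (nonzero = defect pixel), treats each 4-connected region as one object, and emits boxes in the normalized center/size layout used by Darknet-style YOLO label files. The function names mask_to_boxes and to_yolo are hypothetical.

```python
from collections import deque


def mask_to_boxes(mask):
    """Return one inclusive (x_min, y_min, x_max, y_max) pixel box per
    4-connected region of nonzero pixels in a 2D mask (assumed layout)."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over this region, tracking its extent.
            seen[r][c] = True
            queue = deque([(r, c)])
            x_min = x_max = c
            y_min = y_max = r
            while queue:
                y, x = queue.popleft()
                x_min, x_max = min(x_min, x), max(x_max, x)
                y_min, y_max = min(y_min, y), max(y_max, y)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes


def to_yolo(box, width, height, class_id=0):
    """Convert an inclusive pixel box to (class, x_center, y_center, w, h),
    with all four geometry values normalized to the image size."""
    x_min, y_min, x_max, y_max = box
    # +1 because both corner pixels belong to the box.
    w = (x_max - x_min + 1) / width
    h = (y_max - y_min + 1) / height
    x_center = (x_min + x_max + 1) / 2 / width
    y_center = (y_min + y_max + 1) / 2 / height
    return (class_id, x_center, y_center, w, h)


if __name__ == "__main__":
    # Toy 4x6 mask with two separate defect regions.
    toy_mask = [
        [0, 1, 1, 0, 0, 0],
        [0, 1, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1],
        [0, 0, 0, 0, 1, 1],
    ]
    for box in mask_to_boxes(toy_mask):
        print(box, to_yolo(box, width=6, height=4))
```

Thin, branching cracks may be better served by splitting a region into several smaller boxes or by filtering out tiny regions; those choices depend on the dataset and are not specified in the abstract.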

Subject Area

Mechanical engineering|Artificial intelligence

Recommended Citation

Habib, Noshin, "Synthetic Data Generation for Intelligent Inspection of Structural Environments" (2022). ETD Collection for University of Texas, El Paso. AAI30000344.