Abstract:
The study of material corrosion is an important research area: corrosion degradation of metallic structures costs up to 4% of global gross domestic product annually and poses major safety risks worldwide. Unfortunately, large-scale and timely scientific discovery of materials has been hindered by the lack of standardized corrosion experimental data in the public domain for developing machine learning models. Obtaining such data is challenging due to the expert knowledge and time required to conduct these scientific experiments and assess corrosion levels. We curate a novel dataset consisting of 600 images annotated with expert corrosion ratings obtained over 10 years of laboratory corrosion testing by materials scientists. Using this dataset, we find that non-experts, even when rigorously trained with domain guidelines to rate corrosion, fail to match expert ratings. Challenges include limited data, image artifacts, and millimeter-precision corrosion. This motivates us to explore the viability of deep learning approaches for this benchmark classification task. We study (i) convolutional neural networks equipped with rich domain-specific image augmentation techniques tuned to our data, and (ii) a recent self-supervised representation learning approach either pretrained on ImageNet or trained on our data. We demonstrate that pretrained ResNet-18 and HR-Net models with tuned augmentations can reach an accuracy of up to 0.83. With this corrosion dataset, we open the door to the design of more advanced deep learning models to support this real-world task, while driving new research to bridge computer vision and materials innovation. Our data and code are available at: https://arl.wpi.edu