Ekkohng / industrial-ml-datasets

A curated list of datasets, publicly available for machine learning research in the area of manufacturing

Industrial ML Datasets

The following is a curated list of publicly available datasets for machine learning research in the area of manufacturing.

For more information, please check our corresponding publication:

@inproceedings{jourdan_machine_2021,
 title = {Machine {Learning} for {Intelligent} {Maintenance} and {Quality} {Control}: {A} {Review} of {Existing} {Datasets} and {Corresponding} {Use} {Cases}},
 volume = {2},
 booktitle = {Proceedings of the 2nd Conference on Production Systems and Logistics},
 author = {Jourdan, Nicolas and Longard, Lukas and Biegel, Tobias and Metternich, Joachim},
 year = {2021},
}

Some additional datasets may be found here: Link

βœ”οΈ indicates a preset split between training and testing data.

🌐 indicates that the test set labels are hidden behind an evaluation server.
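The Instances column in the tables of this list uses a period as a thousands separator (the wood veneers entry, for example, lists 5.158 for its 5158 images), and some entries are open-ended, such as "> 10 Mio." (more than 10 million). The following is a minimal sketch of how such cells could be converted to integers when filtering the list programmatically. The helper name `parse_instances` is hypothetical and not part of the repository.

```python
import re
from typing import Tuple


def parse_instances(cell: str) -> Tuple[int, bool]:
    """Convert an Instances cell to (count, is_lower_bound).

    Assumptions: a period is a thousands separator ("1.062.912"),
    and "Mio." abbreviates million ("> 10 Mio.").
    """
    text = cell.strip()

    # A leading ">" marks the value as a lower bound, not an exact count.
    is_lower_bound = text.startswith(">")
    if is_lower_bound:
        text = text[1:].strip()

    # Values given in millions, e.g. "10 Mio."
    millions = re.fullmatch(r"(\d+)\s*Mio\.?", text)
    if millions:
        return int(millions.group(1)) * 1_000_000, is_lower_bound

    # Plain or period-grouped integers, e.g. "463" or "1.062.912"
    if re.fullmatch(r"\d{1,3}(?:\.\d{3})*", text):
        return int(text.replace(".", "")), is_lower_bound

    raise ValueError(f"unrecognized Instances value: {cell!r}")


if __name__ == "__main__":
    for cell in ["5.158", "1.062.912", "463", "> 10 Mio."]:
        print(cell, "->", parse_instances(cell))
```

Cells that follow a different pattern, such as the SKAB entry (34×1,200), deliberately raise a `ValueError` here so that they are handled by hand rather than silently misread.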

Predictive Maintenance and Condition Monitoring

Name Year Feature Type Feature Count Target Variable Instances Official Train/Test Split Data Source Format License Access
Wood veneers before and after drying
This dataset consists of 2579 image pairs (5158 images in total) of wood veneers before and after drying.
2021 Image >4000x4000 - 5.158 ❌ Real PNG CC BY 4.0 Link
Diesel Engine Faults Features
Fault detection based on pressure curves and vibration.
2020 Signal 84 C (4) 3.500 ❌ Synthetic MAT CC BY 4.0 Link
Degradation of a Cutting Blade
Wrapping machine process data over 12 months with a degrading cutting tool.
2019 Signal 9 C (8) / R 1.062.912 ❌ Real CSV CC BY-SA 3.0 Link
CNC Mill Tool Wear
CNC process data of wax milling with worn/unworn tools.
2018 Signal 48 C (3*2) 25.286 ❌ Real CSV CC0: Public Domain Link
Condition Monitoring of Hydraulic Systems
Test rig process data of multiple load cycles with various fault types and severity levels.
2018 Signal 17 C (5*(24)) 2.205 ❌ Real Non-Standard ? Link
Production Plant Data for Condition Monitoring
Anonymized process data of component run-to-failure experiments.
2018 Signal 26 - 228.414 ❌ Real CSV CC BY-SA 3.0 Link
Versatile Production
Popcorn production process data with multiple process steps.
2018 Signal 5-85 - 80.000 ❌ Real CSV CC BY-NC-SA 4.0 Link
Degradation Measurement of Robot Arm Position Accuracy
Target and actual values of robotic arm tool position, velocity and current for health assessment.
2017 Signal 73 - 155.000 ❌ Real CSV ? Link
APS Failure at Scania Trucks
Anonymized counters and histograms for air pressure system fault detection.
2016 Signal 170 C (2) 76.000 ✔️ Real CSV GNU General Public License Link
Maintenance of Naval Propulsion Plants
Gas turbine process data for component decay state prediction.
2016 Signal 16 R 11.934 ❌ Synthetic Non-Standard
License: Use of this dataset in publications must be acknowledged by referencing the following publication:
A. Coraddu, L. Oneto, A. Ghio, S. Savio, D. Anguita, M. Figari, Machine Learning Approaches for Improving Condition-Based Maintenance of Naval Propulsion Plants, Journal of Engineering for the Maritime Environment, 2014, DOI: 10.1177/1475090214540874, (In Press)
Link
Plant Fault Detection
Anonymized process data for plant fault detection.
2015 Signal 10 C (6) 8.938.370 ❌ Real CSV ? Link
Asset Failure and Replacement
Anonymized data for asset fault detection.
2014 Signal 1 C (2) 447.341 ✔️ 🌐 Real CSV ? Link
Maintenance Action Recommendation
Anonymized process and maintenance data of an industrial asset for maintenance action recommendation.
2013 Signal 32 C (14) 2.097.152 ✔️ 🌐 Real CSV ? Link
Anemometer Fault Detection
Anemometer measurements for fault detection.
2011 Signal 16 / 16-20 - 345.700 / 208.800 ✔️ 🌐 Real Non-Standard ? Link
Gearbox Fault Detection
Test rig accelerometer data for fault detection.
2009 Signal 3 - > 10 Mio. ❌ Real CSV ? Link
Li-Ion Battery Aging
Battery test rig data during charge and discharge cycles for degradation detection.
2008 Signal 12 - 2.167 ❌ Real MAT N/A Link
Turbofan Engine Degradation Simulation
C-MAPSS simulation sensor data of various conditions and fault modes.
2008 Signal 26 - 262.256 ✔️ Synthetic Non-Standard ? Link
Bearing
Bearing test rig accelerometer data of run-to-failure experiments.
2007 Signal 4-8 - 61.440 ❌ Real CSV ? Link
Milling
Milling process and external sensor data for tool wear detection.
2007 Signal 13 R 1.503.000 ❌ Real MAT ? Link
CWRU Bearing Data
Bearing test rig accelerometer data for fault detection.
N/A Signal 5 C (2) > 10 Mio. ❌ Real MAT ? Link

Process Monitoring

Name Year Feature Type Feature Count Target Variable Instances Official Train/Test Split Data Source Format License Access
Skoltech Anomaly Benchmark (SKAB)
Time-series data from water circulation loop testbed for evaluating anomaly detection algorithms.
2020 Signal 8 C (2) 34×1,200 ✔️ Real CSV GNU GPL v3.0 Link
High Storage System Anomaly Detection
Storage test rig process data for anomaly detection.
2018 Signal 20 C (2) 91.000 ❌ Synthetic CSV CC-BY-NC-SA 4.0 Link
Genesis Pick-and-Place Demonstrator
Material sorting test rig process data for anomaly detection.
2018 Signal 23 C (3) 32.440 ❌ Real CSV CC-BY-NC-SA 4.0 Link
Tennessee Eastman Process Simulation Dataset
Simulated chemical process data for anomaly detection with different fault types.
2017 Signal 51 C (21) / R > 10 Mio. ✔️ Synthetic RData
License: The person who owns, created, or contributed a work to the data or work provided here dedicated the work to the public domain and has waived his or her rights to the work worldwide under copyright law. You can copy, modify, distribute, and perform the work, for any lawful purpose, without asking permission.
Link
Robot Execution Failures
Force and torque measurements of an industrial robot with different erroneous operating conditions.
1999 Signal 89 C (13) 463 ❌ Real Non-Standard ? Link
Mechanical Analysis
Vibration measurements of electromechanical devices with different erroneous operating conditions.
1990 Signal 7 C (6) 209 ✔️ Real MAT ? Link
CWRU Bearing Data
Bearing test rig accelerometer data for anomaly detection.
N/A Signal 5 C (2) > 10 Mio. ❌ Real MAT ? Link

Predictive Quality and Quality Inspection

Name Year Feature Type Feature Count Target Variable Instances Official Train/Test Split Data Source Format License Access
Casting Product Quality Inspection
Grayscale images of pump impeller castings with and without defects.
2020 Image 300x300 / 512x512 C (2) 7.348 ✔️ Real JPG CC-BY-NC-ND 4.0 Link
GC10-DET Defect Location for Metal Surface
Grayscale images of metal surfaces with various defect types and corresponding bounding box annotations.
2020 Image Varying C (10) 3.570 ❌ Real JPG, XML ? Link
Mechanic Component Images
Grayscale images of air conditioner pistons with various defect types.
2020 Image 86x90 C (3) 285 ❌ Real PNG ? Link
Multi-Stage Continuous Flow Process
Anonymized process data of a production line with quality measurements of part dimensions.
2020 Signal 116 - 14.088 ❌ Real CSV ? Link
Plastic Extrusion Defects
Process data of a plastic extrusion process.
2020 Signal 470 - 226.536 ❌ Real CSV CC BY-NC-ND 4.0 Link
AITEX
Grayscale images of textile fabrics with various defect types and corresponding segmentation masks.
2019 Image 4096x256 C (13) 245 ❌ Real PNG, Mask ? Link
Deep PCB
Grayscale images of circuit boards with various defect types and corresponding bounding box annotations.
2019 Image 640x640 C (7) 1.500 ✔️ Real JPG, Mask Research purposes only Link
Severstal Steel Defect Detection
Grayscale images of steel surfaces with various defect types and corresponding segmentation polygons.
2019 Image 1600x256 C (5) 18.074 ✔️ 🌐 Real JPG, CSV ? Link
Turning Dataset for Chatter Diagnosis
Sensor data of a turning test rig with varying strengths of chatter.
2019 Signal 8 C (4) > 10 Mio. ❌ Real MAT CC BY 4.0 Link
Magnetic Tile Defect
Grayscale images of magnetic tile surfaces with various defect types and corresponding segmentation masks.
2018 Image 248x373 C (6) 1.344 ❌ Real JPG, PNG ? Link
TIG Welding
Grayscale images of a welding process with various defect types.
2018 Image 800x974 C (6) 33.254 ✔️ Real PNG, JSON CC BY-SA 4.0 Link
Mining Process
Process data of a mining process for impurity prediction in ore concentrate.
2017 Signal 24 R 737.454 ❌ Real CSV CC0: Public Domain Link
Bosch Production Line Performance
Anonymized process data of production lines with and without defects.
2016 Signal 4264 C (2) 2.368.435 ✔️ 🌐 Real CSV ? Link
WM811K Wafer Maps
Defect matrices of semiconductor wafers with various defect types.
2014 2D Defect Matrix Varying C (9) 811.457 ❌ Real MAT ? Link
NEU Surface Defect Database
Grayscale images of metal surfaces with various defect types and corresponding bounding box annotations.
2013 Image 200x200 C (6) 1.800 ❌ Real BMP, XML ? Link
Steel Plate Faults
Geometric measurements of steel plates with various defect types.
2010 Signal 27 C (7) 1.941 ❌ Real CSV ? Link
HCI Industrial Optical Inspection
Synthetic grayscale images of textured surfaces with corresponding defect ellipses.
2007 Image 512x512 C (2) 16.100 ✔️ Synthetic PNG, Non-Standard ? Link

Process Parameter Optimization

Name Year Feature Type Feature Count Instances Official Train/Test Split Data Source Format License Access
Laser Welding
Process parameter recordings for correlation with weld quality indicators such as weld depth and geometrical dimensions.
2020 Signal 13 361 ❌ Real XLS CC BY 4.0 Link
3D Printer
Process parameters of a 3D printer for correlation with print quality indicators such as roughness, tension and elongation.
2018 Signal 12 50 ❌ Real CSV ? Link
Tool Path Generation
Shape deviation measurements and corresponding simulated cutting conditions.
2018 Signal 9 4.968 ❌ Real CSV CC BY 4.0 Link
Mercedes-Benz Greener Manufacturing
Car feature configurations to be correlated with the required test time of the configurations.
2017 Signal 378 8.420 ✔️ 🌐 Real CSV ? Link
SECOM
Semiconductor process measurements and corresponding yields for determination of key factors to yield.
2008 Signal 591 1.567 ❌ Real Non-Standard ? Link
