Summary: Detecting Anomalies using Machine Learning
Parameter | Value |
---|---|
Name | Detecting Anomalies in Wafer Manufacturing |
Labeled | Yes |
Time Series | No |
Simulation | No |
Missing Values | No |
Dataset Characteristics | Multivariate |
Feature Type | Real |
Associated Tasks | Classification, Anomaly Detection |
Number of Instances | INA |
Number of Features | INA |
Date Donated | INA |
Source | Kaggle |
Detecting anomalies can be a difficult task, especially with labeled datasets, because some human bias is introduced when the final product is labeled as anomalous or good. Large manufacturing systems need to be monitored every 10 milliseconds to capture their behavior, which generates large volumes of data and gives rise to what we call the Industrial IoT (IIoT). Also, hardly any manufacturer wants to create an anomalous product. Hence, anomalies are like a needle in a haystack, which makes the dataset significantly imbalanced.
Fitting a machine learning model to such a dataset and making the model generalize can be fun. In this competition, we bring such a use case from one of India's leading manufacturers of wafers (semiconductors). The collected dataset was anonymized to hide the feature names, and it has 1558 features, which would require serious domain knowledge to understand.
However, in the era of deep learning, we challenge the data science community to come up with an anomaly detection model that generalizes well to unseen data (the test set). In this hackathon, you will create a machine learning or deep learning model to classify the anomalies correctly, using area under the curve (AUC) as the metric.
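To make the metric concrete, here is a minimal sketch of how AUC can be computed from labels and model scores, assuming the competition's AUC refers to the area under the ROC curve (the description does not spell this out). The function name `roc_auc` and the toy labels and scores are made up for illustration and are not taken from the competition data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen anomaly (label 1) gets a higher score than a randomly chosen
    normal sample (label 0). Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 anomalies among 10 samples (imbalanced).
labels = [0, 0, 0, 0, 1, 0, 0, 0, 1, 0]
scores = [0.1, 0.2, 0.05, 0.3, 0.9, 0.15, 0.6, 0.25, 0.4, 0.1]

auc = roc_auc(labels, scores)

# A "model" that gives every sample the same score (always predicts normal).
always_normal = roc_auc(labels, [0.0] * len(labels))

print(f"AUC of toy scores: {auc:.4f}")
print(f"AUC of constant scores: {always_normal:.4f}")
```

On this toy data, a model that labels everything as good would be 80% accurate yet earns an AUC of only 0.5, which is why a ranking metric such as AUC suits an imbalanced dataset better than accuracy. The pairwise loop is quadratic in the number of samples, so for real data a library implementation is the more practical choice.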
Wafer manufacturing, Sensor data, Defect detection, Anomaly detection, Manufacturing quality