This is a real-world, client-facing project covering data analysis, visualization, machine learning, and automated pattern recognition on time series measurements from air quality sensors.
- EDA: This notebook contains my EDA work, together with some data visualization. It also preprocesses the data and saves the result as a `pkl` file for use in the later notebooks.
- Point Anomaly Detection: This notebook shows how to detect point anomalies (outliers) and automatically flag them.
- Collective Anomaly Detection: This notebook shows how to detect collective anomalies (patterns) by training regression models with a sliding window and forward chaining on the time series data.
- Clustering: This notebook shows how to use clustering methods (K-Means and DBSCAN) to group the data into different events, and compares the strengths and weaknesses of each method.
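
To give a feel for the sliding window and forward chaining idea used for collective anomaly detection, here is a minimal sketch in plain Python. The function names (`make_windows`, `forward_chaining_splits`), the window width, the fold count, and the sample readings are all illustrative assumptions, not code taken from the notebooks.

```python
from typing import Iterator, List, Sequence, Tuple


def make_windows(series: Sequence[float], width: int) -> Tuple[List[List[float]], List[float]]:
    """Build (features, target) pairs: `width` past values predict the next value."""
    X, y = [], []
    for i in range(len(series) - width):
        X.append(list(series[i:i + width]))
        y.append(series[i + width])
    return X, y


def forward_chaining_splits(n_samples: int, n_folds: int) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges where training data always precedes test data."""
    block = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train = range(0, k * block)               # everything seen so far
        test = range(k * block, (k + 1) * block)  # the next unseen block
        yield train, test


# Made-up sensor readings, for illustration only.
readings = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0]
X, y = make_windows(readings, width=2)
splits = list(forward_chaining_splits(len(X), n_folds=3))

for train, test in splits:
    print(f"train on samples {list(train)}, evaluate on {list(test)}")
```

Each fold trains only on samples that come before its test block, so the model is never evaluated on data older than what it was trained on. A regression model fitted on `X` and `y` can then mark stretches where its prediction error stays high as candidate collective anomalies.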
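
Point anomaly flagging can be illustrated with a simple interquartile range (IQR) rule. This is only a sketch under assumptions: the detector actually used in the notebook is not described here, and the function name `flag_point_anomalies`, the factor `k=1.5`, and the sample values are hypothetical.

```python
import statistics
from typing import List, Sequence


def flag_point_anomalies(values: Sequence[float], k: float = 1.5) -> List[bool]:
    """Flag values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


# Made-up readings with one obvious spike, for illustration only.
sample = [12.0, 13.0, 12.5, 11.8, 12.2, 95.0, 12.9, 13.1, 12.4, 12.7]
flags = flag_point_anomalies(sample)
outliers = [v for v, is_outlier in zip(sample, flags) if is_outlier]
print(outliers)
```

The IQR rule is robust to the outliers it is trying to find, since the quartiles barely move when a single spike is added, which is why it is a common first pass before more elaborate detectors.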