Case: Machine Learning with Energy datasets

Summary:

As interest in IoT and sensors picks up steam, companies are trying to build algorithms and systems to understand consumer behavior to help them make better decisions. One such application is energy modeling. Though most consumers are aware of their aggregate energy consumption, few are aware of how and where that energy is consumed. As more equipment ships with sensors, it is becoming easier to find out which equipment and instruments consume the most power. AdaptiveAlgo Systems Inc. builds algorithms and platforms to address energy modeling challenges. The company is putting together an energy modeling solution and is interested in understanding consumer energy usage and the attributes that contribute to appliance energy usage. Its data scientists came across a recent paper and dataset and want to build machine learning models that help explain energy usage by appliances and the attributes that contribute to aggregate energy usage. With knowledge of the energy consumed by various equipment, seasonality, and attributes like temperature and humidity, a machine learning model could be used to predict aggregate energy use.

Introduction

This case study is divided into eight parts, and the work for each part is kept in its own directory. The contents of the case study are as follows:

Part	Description
Part 1: Research Paper Review	Review 3 papers and provide a Jupyter Notebook for each paper.
Part 2: Exploratory Data Analysis	Conduct EDA using Python libraries (plotly, seaborn, matplotlib etc.). Provide a PowerPoint Report with graphs and key insights.
Part 3: Feature Engineering	Conduct thorough feature analysis and use pre-processing techniques to make the data usable.
Part 4: Prediction Algorithms	Try Linear Regression, Random Forest, and Neural Networks to build prediction models using sklearn in Python. Compute RMSE, MAPE, R2, and MAE for the training and testing datasets. Recommend a model.
Part 5: Feature Selection	Understand the importance of the various features and how the features influence the output. Explore tpot, featuretools, Boruta, tsfresh.
Part 6: Model Validation and Selection	Understand hyperparameter tuning and model validation prior to model selection for production.
Part 7: Final Pipeline	Recommend a Final Model with reason. Automate the entire model from Data Ingestion to Final Model Prediction.
Part 8: Report & Model Development Methodology	Put together a comprehensive report discussing the analysis, delivered as a PDF.
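
The error metrics named in Part 4 (RMSE, i.e. the RMS error, plus MAE, MAPE, and R2) can be sketched in plain Python as below. This is a minimal illustration, not the code used in the case study: the function name `regression_metrics` is hypothetical, and skipping zero-valued targets when computing MAPE is an assumption made here to avoid division by zero. In practice, sklearn's `mean_squared_error`, `mean_absolute_error`, and `r2_score` cover most of the same ground.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (in percent) and R^2 for two equal-length sequences.

    Hypothetical helper for illustration; not taken from the case study code.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Root mean squared error and mean absolute error.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # MAPE is undefined where the true value is zero.
    # Assumption: this sketch simply skips those rows.
    nonzero = [(t, e) for t, e in zip(y_true, errors) if t != 0]
    if nonzero:
        mape = 100.0 * sum(abs(e / t) for t, e in nonzero) / len(nonzero)
    else:
        mape = float("nan")

    # R^2 = 1 - (residual sum of squares / total sum of squares).
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}
```

Calling the function once on the training split and once on the testing split gives the two sets of scores that Part 4 asks to compare; a large gap between them is a common sign of overfitting.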

Abstract

Exception Handling and Logging

INFO

Language Used : Python
Process Followed : Data Ingestion, Data Wrangling, Data Cleansing, Exploratory Data Analysis
Tools Used : Jupyter Notebook, boto3, boto, Amazon S3 bucket
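
As a rough illustration of how exception handling and logging might wrap the data ingestion step, the sketch below uses only the standard library. Everything in it is an assumption rather than the repository's actual code: the logger name `energy_pipeline`, the function `ingest_csv`, and the idea of passing the download step in as a callable named `fetch_bytes`.

```python
import csv
import io
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)
logger = logging.getLogger("energy_pipeline")  # hypothetical logger name


def ingest_csv(fetch_bytes, source_name):
    """Fetch raw CSV bytes via fetch_bytes() and parse them into a list of dict rows.

    fetch_bytes is any zero-argument callable returning bytes (hypothetical
    design, so the download step can be swapped or stubbed out in tests).
    """
    logger.info("Starting ingestion from %s", source_name)
    try:
        raw = fetch_bytes()
        rows = list(csv.DictReader(io.StringIO(raw.decode("utf-8"))))
    except Exception:
        # Log the full traceback, then re-raise so the caller can decide what to do.
        logger.exception("Ingestion from %s failed", source_name)
        raise

    if not rows:
        logger.warning("No data rows found in %s", source_name)
    logger.info("Ingested %d rows from %s", len(rows), source_name)
    return rows
```

With boto3, `fetch_bytes` could be a small function that calls `get_object` on an S3 client and reads the `Body` of the response; the bucket and key would be whatever the project actually uses and are not assumed here.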

For further details, please refer to the Assignment2_Documentation.pdf file.

priyankagagneja / Machine-Learning-with-Energy-Dataset