
Operationalizing Machine Learning

Project Overview

This project is primarily based on using AutoML in Azure Machine Learning Studio to train a model and deploy it as a web service (documented with Swagger), and then performing the same tasks with the Azure Python SDK by creating an ML pipeline.

The first part consists of creating a machine learning production model using AutoML in Azure Machine Learning Studio, then deploying the best model and consuming it with the help of Swagger UI, using the REST API endpoint and the key produced for the deployed model.

The second part of the project follows the same steps, but this time using the Azure Python SDK to create, train, and publish a pipeline. For this part, I am using the Jupyter Notebook provided. The whole procedure is explained in this README file, and the result is demonstrated in the screencast video. For both parts of the project, the data relates to direct marketing campaigns (phone calls) of a Portuguese banking institution. The classification goal is to predict whether a client will subscribe to a bank term deposit. The result of the prediction appears in column y and is either yes or no.

Architectural Diagram

Project overview and architectural diagram

Key Steps

The key steps of the project are described below:

1. Authentication:

   This step is omitted since it could not be implemented in the lab space provided by Udacity,
   because I am not authorized to create a service principal. However, I am still mentioning it here
   as it is a crucial step when using one's own Azure account.
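
   For reference, on a personal Azure account the non-interactive login could look like this minimal sketch using the azureml-core SDK (all IDs and the secret are placeholders for values from your own service principal):

```python
from azureml.core import Workspace
from azureml.core.authentication import ServicePrincipalAuthentication

# Placeholder credentials from your own service principal.
sp_auth = ServicePrincipalAuthentication(
    tenant_id="<tenant-id>",
    service_principal_id="<client-id>",
    service_principal_password="<client-secret>",
)

# Non-interactive authentication against the workspace.
ws = Workspace.get(
    name="<workspace-name>",
    subscription_id="<subscription-id>",
    resource_group="<resource-group>",
    auth=sp_auth,
)
```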

2. Automated ML Experiment:

   At this point, security is enabled and authentication is completed. This step involves creating
   an experiment using Automated ML, configuring a compute cluster, and using that cluster to run
   the experiment.
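
   For comparison, a minimal SDK sketch of the same configuration (the dataset name, cluster name, metric, and timeout are assumptions chosen to mirror the Studio settings):

```python
from azureml.core import Dataset, Experiment, Workspace
from azureml.core.compute import ComputeTarget
from azureml.train.automl import AutoMLConfig

ws = Workspace.from_config()

# Assumed names: adjust to your registered dataset and compute cluster.
dataset = Dataset.get_by_name(ws, name="bankmarketing_train")
compute_target = ComputeTarget(workspace=ws, name="automl-cluster")

automl_config = AutoMLConfig(
    task="classification",
    training_data=dataset,
    label_column_name="y",          # target: yes/no term-deposit subscription
    primary_metric="accuracy",
    n_cross_validations=5,
    experiment_timeout_minutes=30,
    compute_target=compute_target,
)

run = Experiment(ws, "automl-bankmarketing").submit(automl_config)
run.wait_for_completion(show_output=True)
```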

Registered dataset

Registered dataset detailed view

AutoML experiment configuration

AutoML experiment completed

Best model

3. Deploy the Best Model:

   After the completion of the experiment run, a summary of all the models and their metrics is
   shown, including explanations. The best model appears in the Details tab and also first in the
   Models tab. This is the model that should be selected for deployment. Deploying it exposes an
   HTTP API service that allows interaction with the model by sending data over POST requests.
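
   The deployment was done through the Studio UI; a rough SDK equivalent, with the model, environment, script, and service names as assumptions, is:

```python
from azureml.core import Environment, Workspace
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

ws = Workspace.from_config()

# Assumed name of the model registered from the AutoML run.
model = Model(ws, name="best-automl-model")

# Hypothetical scoring script and conda environment file.
env = Environment.from_conda_specification("inference-env", "conda_env.yml")
inference_config = InferenceConfig(entry_script="score.py", environment=env)

# ACI deployment with key-based authentication enabled.
deployment_config = AciWebservice.deploy_configuration(
    cpu_cores=1,
    memory_gb=1,
    auth_enabled=True,
)

service = Model.deploy(
    ws, "bankmarketing-endpoint", [model], inference_config, deployment_config
)
service.wait_for_deployment(show_output=True)
```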

Best model metrics

4. Enable Logging:

  After the deployment of the best model, I enabled Application Insights and retrieved the logs.
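
  A minimal sketch of what the logs script does (the service name is an assumption carried over from the deployment step):

```python
from azureml.core import Workspace
from azureml.core.webservice import Webservice

ws = Workspace.from_config()

# Assumed name of the deployed service.
service = Webservice(ws, name="bankmarketing-endpoint")

# Turn on Application Insights for the already-deployed service.
service.update(enable_app_insights=True)

# Retrieve and print the container logs of the endpoint.
print(service.get_logs())
```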

Details tab of the endpoint showing Application Insights enabled

Running the logs script

5. Swagger Documentation:

    This is the step where the deployed model is consumed using Swagger. Azure provides a swagger.json file for deployed models.
    The deployed model can be found in the Endpoints section, where it should be the first one on the list.
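
    The SDK exposes the location of the auto-generated swagger.json; a sketch for downloading it locally so Swagger UI can render it (service name assumed as before):

```python
import json

import requests
from azureml.core import Workspace
from azureml.core.webservice import Webservice

ws = Workspace.from_config()
service = Webservice(ws, name="bankmarketing-endpoint")  # assumed name

# Download the auto-generated Swagger (OpenAPI) document for the endpoint.
response = requests.get(service.swagger_uri)
with open("swagger.json", "w") as f:
    json.dump(response.json(), f, indent=4)
```

    The downloaded file is then served locally alongside Swagger UI so the endpoint's HTTP methods and expected payload can be inspected.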

Running the Swagger script

Swagger response and methods

6. Consume Model Endpoints:

   Once the model is deployed, I use the endpoint.py script to interact with the trained model. I run the script
   with the scoring_uri that was generated after deployment and, since I enabled authentication, the primary key
   of the service. This URI is found in the Details tab, above the Swagger URI.
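
   A sketch of such a script; the URI, key, and feature values are placeholders, and the real payload uses the full Bank Marketing feature set:

```python
import json

import requests

# Placeholders: copy these from the endpoint's Details tab after deployment.
scoring_uri = "http://<deployment>.azurecontainer.io/score"
key = "<primary-key>"

# Illustrative records; the real schema has the full set of dataset columns.
data = {"data": [
    {"age": 35, "job": "technician", "marital": "married"},
    {"age": 52, "job": "retired", "marital": "single"},
]}

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {key}",
}

# POST the records and print the yes/no predictions returned by the model.
response = requests.post(scoring_uri, data=json.dumps(data), headers=headers)
print(response.json())
```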

Endpoint script run

7. Create and Publish a Pipeline:

   In this part of the project, I use the provided Jupyter Notebook with the same keys, URI, dataset, cluster,
   and model names already created.
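
   A condensed sketch of the notebook's pipeline creation and publishing flow (step, output, and pipeline names are assumptions; automl_config is the AutoMLConfig shown in the Automated ML step above):

```python
from azureml.core import Experiment, Workspace
from azureml.pipeline.core import Pipeline, PipelineData, TrainingOutput
from azureml.pipeline.steps import AutoMLStep

ws = Workspace.from_config()
ds = ws.get_default_datastore()

# Pipeline outputs capturing the AutoML run's metrics and best model.
metrics_data = PipelineData(
    name="metrics_data", datastore=ds,
    pipeline_output_name="metrics_output",
    training_output=TrainingOutput(type="Metrics"),
)
model_data = PipelineData(
    name="model_data", datastore=ds,
    pipeline_output_name="best_model_output",
    training_output=TrainingOutput(type="Model"),
)

# Single AutoML training step reusing the earlier AutoMLConfig.
automl_step = AutoMLStep(
    name="automl_module",
    automl_config=automl_config,
    outputs=[metrics_data, model_data],
    allow_reuse=True,
)

pipeline = Pipeline(workspace=ws, steps=[automl_step])
pipeline_run = Experiment(ws, "ml-pipeline").submit(pipeline)
pipeline_run.wait_for_completion()

# Publishing gives the pipeline a REST endpoint so it can be re-run on demand.
published = pipeline_run.publish_pipeline(
    name="Bankmarketing Train",
    description="Bank Marketing training pipeline",
    version="1.0",
)
print(published.endpoint)
```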

Pipeline created

Pipeline endpoint

Published pipeline overview

Jupyter Notebook showing the RunDetails widget

Scheduled pipeline run in ML Studio

8. Documentation:

  The documentation includes:
  
  1. the screencast that shows the entire process of the working ML application.
  2. this README file that describes the project and documents the main steps.

Screen Recording

Screencast video demonstrating the entire process of the working ML application.

Standout Suggestions

  1. I explored the Bank Marketing dataset to better understand its features and granularity, and found a high class imbalance between the two classes, which can impact model performance (see the sketch after this list).
  2. Use of deep learning in AutoML training: I explored the option of enabling deep learning for model training, but community discussion advised against using that option.
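
  A quick way to see the imbalance, assuming a local CSV copy of the dataset (the file name is hypothetical):

```python
import pandas as pd

# Hypothetical local copy of the Bank Marketing training data.
df = pd.read_csv("bankmarketing_train.csv")

# Class proportions of the target column; "no" dominates heavily.
print(df["y"].value_counts(normalize=True))
```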

References

  1. Dealing with imbalanced data in Auto ML
  2. How to consume web services
  3. Swagger User interface documentation
  4. How to deploy model on Azure
