Hmsuks / PGPDS-IIIT-Bangalore

This repository contains the assignments and case studies I completed during my PG Diploma in Data Science (2018-19).

Assignment1: Movie Data Set Analysis--> In this assignment, insights such as the top movies, top actors, top directors, and highest-grossing films were drawn from a sample of movies released between 1916 and 2016, using Python.
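
As a rough illustration of this kind of ranking, here is a minimal sketch in plain Python. The sample rows, the field names (`title`, `director`, `gross`) and both helper functions are hypothetical; the actual notebook and data set may be organised quite differently.

```python
from collections import defaultdict

# Hypothetical sample rows; the real data set and its column names may differ.
movies = [
    {"title": "Movie A", "director": "Director X", "gross": 300.0},
    {"title": "Movie B", "director": "Director Y", "gross": 150.0},
    {"title": "Movie C", "director": "Director X", "gross": 90.0},
    {"title": "Movie D", "director": "Director Y", "gross": 500.0},
]


def top_grossing(rows, n):
    """Return the n rows with the highest gross, largest first."""
    return sorted(rows, key=lambda r: r["gross"], reverse=True)[:n]


def top_directors_by_total_gross(rows, n):
    """Sum gross per director and return the n largest (director, total) pairs."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["director"]] += r["gross"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    print([m["title"] for m in top_grossing(movies, 2)])
    print(top_directors_by_total_gross(movies, 1))
```

The same grouping and sorting could be written with pandas; the standard library is used here only to keep the sketch dependency-free.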

Assignment2: Startup Investment Analysis--> In this assignment, three major analyses were carried out. Investment Analysis: comparing the typical investment amounts across investment types so that an investor can choose the type best suited to their strategy. Country Analysis: identifying the countries that have received the most investment in the past. Sector Analysis: understanding the distribution of investments across the eight main sectors.
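
The investment and country analyses could be sketched roughly as below. The funding rounds, the field names (`funding_type`, `country`, `amount_usd`) and the choice of the median as the "typical" amount are all assumptions made for illustration, not details taken from the assignment.

```python
from collections import defaultdict
from statistics import median

# Hypothetical funding rounds; amounts are in USD.
rounds = [
    {"funding_type": "seed", "country": "USA", "amount_usd": 500_000},
    {"funding_type": "seed", "country": "IND", "amount_usd": 1_000_000},
    {"funding_type": "seed", "country": "GBR", "amount_usd": 800_000},
    {"funding_type": "venture", "country": "IND", "amount_usd": 5_000_000},
    {"funding_type": "venture", "country": "USA", "amount_usd": 10_000_000},
    {"funding_type": "venture", "country": "GBR", "amount_usd": 8_000_000},
    {"funding_type": "private_equity", "country": "USA", "amount_usd": 50_000_000},
    {"funding_type": "private_equity", "country": "GBR", "amount_usd": 70_000_000},
]


def typical_amount_by_type(rows):
    """Median amount per funding type (median is assumed as the 'typical' value)."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["funding_type"]].append(r["amount_usd"])
    return {ftype: median(amounts) for ftype, amounts in groups.items()}


def total_by_country(rows):
    """Total amount per country, sorted from most to least invested."""
    totals = defaultdict(int)
    for r in rows:
        totals[r["country"]] += r["amount_usd"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    print(typical_amount_by_type(rounds))
    print(total_by_country(rounds))
```

The median is used rather than the mean because a few very large rounds can distort an average; that is a design choice of this sketch.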

Assignment3: Stock Market Analysis--> The dataset was extracted from the NSE website. It contains stock price data from 1-Jan-2015 to 31-July-2018 for six stocks: Eicher Motors, Hero, Bajaj Auto, TVS Motors, Infosys and TCS. Using SQL, the datasets were combined and analysed, and a UDF was developed that takes a date as input and returns the signal (Buy/Sell/Hold) for that day for a particular stock.
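
The text does not say how the signal is derived, and the original work was done in SQL. Purely as an assumption, the sketch below uses a moving-average crossover in Python: Buy when a short moving average crosses above a long one, Sell when it crosses below, and Hold otherwise. The window sizes, prices, dates and function names are hypothetical.

```python
def moving_average(prices, window, i):
    """Mean of the `window` prices ending at index i, or None if too few rows."""
    if i + 1 < window:
        return None
    return sum(prices[i - window + 1 : i + 1]) / window


def signals(closes, short=2, long=3):
    """Label each day Buy/Sell/Hold using a short/long moving-average crossover."""
    out = []
    prev_diff = None  # short MA minus long MA on the previous usable day
    for i in range(len(closes)):
        s = moving_average(closes, short, i)
        l = moving_average(closes, long, i)
        if s is None or l is None:
            out.append("Hold")  # not enough history yet
            continue
        diff = s - l
        if prev_diff is not None and prev_diff <= 0 < diff:
            out.append("Buy")   # short MA crossed above long MA
        elif prev_diff is not None and prev_diff >= 0 > diff:
            out.append("Sell")  # short MA crossed below long MA
        else:
            out.append("Hold")
        prev_diff = diff
    return out


def signal_for_date(dates, closes, date):
    """Look up the signal for one date, mirroring the UDF's date-in, signal-out shape."""
    return signals(closes)[dates.index(date)]


if __name__ == "__main__":
    # Hypothetical closing prices for one stock on consecutive trading days.
    dates = ["2015-01-01", "2015-01-02", "2015-01-05", "2015-01-06",
             "2015-01-07", "2015-01-08", "2015-01-09", "2015-01-12"]
    closes = [10, 9, 8, 9, 11, 12, 10, 8]
    print(signal_for_date(dates, closes, "2015-01-07"))
```

Real use would call for longer windows than the tiny ones chosen here to keep the example short.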

Assignment4: Gramener Case Study--> The data contains information about past loan applicants and whether they ‘defaulted’ or not. The aim is to identify patterns that indicate whether a person is likely to default, which may be used for taking actions such as denying the loan, reducing the loan amount, or lending (to risky applicants) at a higher interest rate. In this case study, we use EDA to understand how consumer attributes and loan attributes influence the tendency to default.
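
One common EDA step for a question like this is to compare the default rate across the values of a single attribute. The sketch below shows the idea; the loan records, the `grade` and `defaulted` field names, and the helper function are hypothetical and not taken from the case study data.

```python
from collections import Counter

# Hypothetical loan records; the real data set has many more attributes.
loans = [
    {"grade": "A", "defaulted": False},
    {"grade": "A", "defaulted": False},
    {"grade": "A", "defaulted": True},
    {"grade": "A", "defaulted": False},
    {"grade": "C", "defaulted": True},
    {"grade": "C", "defaulted": False},
    {"grade": "C", "defaulted": True},
    {"grade": "C", "defaulted": False},
]


def default_rate_by(rows, attribute):
    """Fraction of loans that defaulted, for each value of `attribute`."""
    totals = Counter()
    defaults = Counter()
    for r in rows:
        totals[r[attribute]] += 1
        if r["defaulted"]:
            defaults[r[attribute]] += 1
    return {value: defaults[value] / totals[value] for value in totals}


if __name__ == "__main__":
    print(default_rate_by(loans, "grade"))
```

A clearly higher rate for one value than another suggests that the attribute is worth a closer look, although a rate computed on a small group can be misleading.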

Case Study 1: Telecom Churn Case Study--> The business objective is to predict churn in the last (i.e. the ninth) month using the data (features) from the first three months. To do this task well, it helps to understand typical customer behaviour during churn. The predictive model that you build will serve two purposes: (1) it will be used to predict whether a high-value customer will churn in the near future (i.e. the churn phase), so that the company can take action such as offering special plans or discounts on recharge; (2) it will be used to identify the variables that are strong predictors of churn, which may also indicate why customers choose to switch to other networks. In some cases, both goals can be achieved by a single machine learning model. Here, however, there is a large number of attributes, so you should try a dimensionality reduction technique such as PCA and then build a predictive model. After PCA, you can use any classification model. Also, since the churn rate is typically low (about 5-10%, a situation known as class imbalance), try using techniques that handle class imbalance.
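
One simple way to handle class imbalance is to weight each class inversely to its frequency, so that errors on the rare churn class count for more during training. The sketch below covers only that step (not PCA or the classifier) and uses the formula `n_samples / (n_classes * class_count)`, which, to the best of my knowledge, matches the "balanced" class-weight heuristic in scikit-learn. The labels and the function name are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


if __name__ == "__main__":
    # Hypothetical labels with a 10% churn rate: 0 = stayed, 1 = churned.
    labels = [0] * 90 + [1] * 10
    print(balanced_class_weights(labels))
```

Weights like these can be passed to classifiers that accept per-class weights. Resampling the training data (oversampling churners or undersampling non-churners) is an alternative approach.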


Languages

Jupyter Notebook: 100.0%