priya-aggarwal27

priya-aggarwal27's repositories

Assignment---Syntactic-Analysis

This assignment builds on an HMM-based POS tagger that implements the Viterbi algorithm and is trained on the Penn Treebank corpus. The vanilla Viterbi algorithm reached about 87% accuracy. Most of the roughly 13% loss came from unknown words (words absent from the training set, such as 'Twitter'), which were tagged incorrectly. For an unknown word, the emission probability is 0 for every candidate tag, so the algorithm arbitrarily picks the first tag. The task is to modify the Viterbi algorithm to handle unknown words using at least two techniques.
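As an illustration of the unknown-word problem described above, here is a minimal sketch of one possible fix: when a word was never seen in training, its emission probability is treated as 1 for every tag, so the transition probabilities alone decide the tag. This is an assumption about one workable technique, not necessarily the approach taken in the repository. The toy corpus, the tag names, and the helper names `train` and `viterbi` are hypothetical.

```python
from collections import Counter, defaultdict


def train(tagged_sentences):
    """Count tag transitions and word emissions from tagged sentences."""
    transitions = defaultdict(Counter)  # previous tag -> counts of next tag
    emissions = defaultdict(Counter)    # tag -> counts of (lowercased) words
    vocabulary = set()
    for sentence in tagged_sentences:
        previous = "<s>"  # sentence-start marker
        for word, tag in sentence:
            transitions[previous][tag] += 1
            emissions[tag][word.lower()] += 1
            vocabulary.add(word.lower())
            previous = tag
    return transitions, emissions, vocabulary


def viterbi(words, transitions, emissions, vocabulary):
    """Return the most likely tag sequence for a list of words."""
    tags = list(emissions)

    def p_trans(prev, tag):
        total = sum(transitions[prev].values())
        return transitions[prev][tag] / total if total else 0.0

    def p_emit(tag, word):
        if word not in vocabulary:
            # Unknown word: ignore the emission term so that the
            # transition probability alone picks the tag.
            return 1.0
        return emissions[tag][word] / sum(emissions[tag].values())

    # Each column maps tag -> (best path probability, previous tag).
    first = words[0].lower()
    best = [{t: (p_trans("<s>", t) * p_emit(t, first), None) for t in tags}]
    for word in words[1:]:
        w = word.lower()
        column = {}
        for t in tags:
            prob, prev = max((best[-1][p][0] * p_trans(p, t), p) for p in tags)
            column[t] = (prob * p_emit(t, w), prev)
        best.append(column)

    # Backtrack from the most probable final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return list(reversed(path))


# Hypothetical toy corpus; 'Twitter' never appears in it.
training = [
    [("I", "PRON"), ("use", "VERB"), ("Google", "NOUN")],
    [("They", "PRON"), ("like", "VERB"), ("Android", "NOUN")],
    [("Google", "NOUN"), ("makes", "VERB"), ("Android", "NOUN")],
]
model = train(training)
print(viterbi("I use Twitter".split(), *model))  # ['PRON', 'VERB', 'NOUN']
```

Without the fallback in `p_emit`, every path through 'Twitter' would have probability 0 and the final `max` would simply return the first tag. A second technique, such as assigning tags from suffix or capitalisation rules, could be plugged into the same function.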

Language: Jupyter Notebook

Assignment-Advanced-Regression

A US-based housing company named Surprise Housing has decided to enter the Australian market. The company uses data analytics to purchase houses at a price below their actual values and flip them on at a higher price.

Language: Jupyter Notebook

ChatBot

An Indian startup named 'Foodie' wants to build a conversational bot (chatbot) which can help users discover restaurants across several Indian cities. The main purpose of the bot is to help users discover restaurants

Language: Python

Customer-Segmentation

Online Retail is a transnational data set containing all transactions between 01/12/2010 and 09/12/2011 for a UK-based, registered, non-store online retailer. The company mainly sells unique all-occasion gifts. Many of its customers are wholesalers.

Language: Jupyter Notebook

Investment-Analysis

The objective is to identify the best sectors, countries, and a suitable investment type for making investments. The overall strategy is to invest where others are investing, implying that the 'best' sectors and countries are the ones 'where most investors are investing'.

Language: Jupyter Notebook

Lending-Club-Case-Study-EDA-

The company wants to understand the driving factors (or driver variables) behind loan default, i.e. the variables which are strong indicators of default. The company can utilise this knowledge for its portfolio and risk assessment.

Language: Jupyter Notebook

Linear-Regression

A boat-sharing system is a service in which boats are made available for city tours. The task is to model the demand for open boats using the available independent variables. Management will use the model to understand how demand varies with different features, and can adjust the business strategy accordingly to meet demand levels and customers' expectations. The model will also help management understand the demand dynamics of a new market.

Language: Jupyter Notebook

MNIST_FASHION_Dataset_CNN

Train a simple CNN on the Fashion MNIST dataset using TensorFlow Keras.

Language: Jupyter Notebook

Telecom-Churn-Case-Study

This project is based on the Indian and Southeast Asian market. The task is to analyse customer-level data from a leading telecom firm, build predictive models that identify customers at high risk of churn, and identify the main indicators of churn.

Language: Jupyter Notebook