Rochita Sundar (rochitasundar)


Company: Data Scientist @ Voices

Location: Vancouver, Canada

Rochita Sundar's repositories


This project builds and optimises a book recommendation system based on collaborative filtering, covering one memory-based approach (KNNWithMeans) and one model-based approach (Singular Value Decomposition).

Language: Jupyter Notebook | Stargazers: 15 | Issues: 1 | Issues: 0


The project applies clustering analysis (K-Means, hierarchical clustering, and visualisation after PCA) to group stocks that share similar characteristics or are minimally correlated. A diversified portfolio tends to yield higher returns and carry lower risk by tempering potential losses when the market is down.

Language: Jupyter Notebook | Stargazers: 12 | Issues: 1 | Issues: 0


The aim is to decrease the maintenance cost of generators used in wind energy production machinery. This is achieved by building various classification models, accounting for class imbalance, tuning on a user-defined cost metric (a function of the predicted true positives, false positives and false negatives), and productionising the model using pipelines.

Language: Jupyter Notebook | Stargazers: 11 | Issues: 1 | Issues: 0
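
A cost metric built from true positives, false positives and false negatives, as mentioned in the description above, could look roughly like the sketch below. This is not the repository's code: the unit costs, the function names and the sample labels are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple

# Hypothetical unit costs (assumptions, not taken from the project).
REPAIR_COST = 15_000       # true positive: failure predicted, generator repaired
INSPECTION_COST = 5_000    # false positive: generator inspected needlessly
REPLACEMENT_COST = 40_000  # false negative: failure missed, generator replaced


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, fn


def maintenance_cost(y_true: Sequence[int], y_pred: Sequence[int]) -> int:
    """Total cost implied by a set of predictions; lower is better."""
    tp, fp, fn = confusion_counts(y_true, y_pred)
    return tp * REPAIR_COST + fp * INSPECTION_COST + fn * REPLACEMENT_COST


# Made-up labels: 1 = generator failure, 0 = no failure.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
print(maintenance_cost(y_true, y_pred))
```

A function of this shape can be minimised during hyper-parameter tuning, so that a missed failure is penalised more heavily than a needless inspection.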


The aim is to find an optimal ML model (Decision Tree, Random Forest, Bagging or Boosting classifiers with hyper-parameter tuning) to predict visa statuses for work visa applicants to the US. This will help decrease the time spent processing applications (currently increasing at a rate of >9% annually) while formulating a suitable profile of candidates more likely to have the visa certified.

Language: Jupyter Notebook | Stargazers: 7 | Issues: 1 | Issues: 0


This repository contains the lab work for the Coursera course "Generative AI with Large Language Models".

Language: Jupyter Notebook | Stargazers: 5 | Issues: 1 | Issues: 0


Scraped tweets using the Twitter API (for the keyword ‘Netflix’) on an AWS EC2 instance and ingested the data into S3 via Kinesis Firehose. Used Spark ML on Databricks to build a pipeline for a sentiment classification model, and Athena and QuickSight to build a dashboard.

Language: Jupyter Notebook | Stargazers: 4 | Issues: 1 | Issues: 0


The data consists of tweets scraped using the Twitter API. The objective is sentiment labelling using a lexicon approach, text pre-processing (such as language detection, tokenisation, normalisation and vectorisation), building pipelines for text classification models for sentiment analysis, and explaining the final classifier.

Language: Jupyter Notebook | Stargazers: 3 | Issues: 1 | Issues: 0
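
As a rough illustration of the lexicon approach to sentiment labelling mentioned above, the sketch below scores a text by summing word polarities. The tiny lexicon, the tokeniser and the function names are hypothetical stand-ins; the repository may well use a published lexicon and a different scoring rule.

```python
import re
from typing import List

# Toy polarity lexicon (an assumption for illustration only).
LEXICON = {
    "love": 1, "great": 1, "good": 1,
    "hate": -1, "boring": -1, "bad": -1,
}


def tokenise(text: str) -> List[str]:
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def label_sentiment(text: str) -> str:
    """Label a text by the sign of its summed word polarities."""
    score = sum(LEXICON.get(token, 0) for token in tokenise(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for tweet in ["I love this great show", "Boring and bad", "Watching it tonight"]:
    print(tweet, "->", label_sentiment(tweet))
```

Labels produced this way can then serve as training targets for the text classification pipelines.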


This repository contains my code solutions to Udacity's coursework 'Intro to Deep Learning with PyTorch'.

Language: Jupyter Notebook | Stargazers: 2 | Issues: 2 | Issues: 0


The aim is to develop an ML-based predictive classification model (logistic regression & decision trees) to predict which hotel bookings are likely to be canceled. This is done by analysing different attributes of customers' booking details. Being able to predict accurately in advance if a booking is likely to be canceled will help formulate profitable policies for cancelations & refunds.

Language: Jupyter Notebook | Stargazers: 1 | Issues: 1 | Issues: 0


This repository contains my code solutions to DeepLearning.AI's Practical Data Science on AWS Cloud Specialization.

Language: Jupyter Notebook | Stargazers: 1 | Issues: 1 | Issues: 0


The objective is to build an ML-based solution (linear regression model) to develop a dynamic pricing strategy for used and refurbished smartphones, identifying the factors that significantly influence price.

Language: Jupyter Notebook | Stargazers: 1 | Issues: 1 | Issues: 0


The data relates to several user actions or interests recorded on two variants of landing pages for an online news portal. The objective is to analyse these interests by performing statistical analyses to determine if one variant is more effective based on chosen metrics (A/B testing).

Language: Jupyter Notebook | Stargazers: 1 | Issues: 1 | Issues: 0
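
One common way to compare two landing-page variants on a conversion-style metric is a two-proportion z-test. The sketch below is only an assumption about what such an analysis might look like: the function name and the visitor and conversion counts are made up, and the repository may rely on different tests or metrics.

```python
import math
from typing import Tuple


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> Tuple[float, float]:
    """Two-sided z-test for a difference between two conversion rates.

    Returns (z statistic, p-value) using the pooled-proportion standard error.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 21 of 50 visitors converted on the old page, 33 of 50 on the new one.
z, p = two_proportion_z_test(conv_a=21, n_a=50, conv_b=33, n_b=50)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference in conversion rates is unlikely to be due to chance alone, subject to the usual sample-size assumptions behind the normal approximation.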


Code for the online course "Deployment of Machine Learning Models"

Language: Jupyter Notebook | License: BSD-3-Clause | Stargazers: 0 | Issues: 0 | Issues: 0


Storyboard published on Tableau Public:



Streamlit Smile Detector App



This project aims to scrape the website of Vancouver Public Library using test automation software. The automated tool will scrape more than 70K records to gather information on the language collection, title, author, category, availability status and ratings of international language material, in order to draw insights.

Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0