Welcome to my data portfolio! Here, I provide a summary of my projects in the data field.
Project Link | Tools/Strategies Used | Project Description |
---|---|---|
Allocating Shipping Data Columns | Conditional Formatting, Custom Formulae, VLOOKUP, Cell Concatenation, Data Transposing | Allocated columns of missing data across workbooks. I transformed and concatenated the column data using VLOOKUP, then combined all the tables into one new sheet for reference. |
Project Two | Python, PostgreSQL, Jupyter Notebook | Designed, created, and deployed a custom data model for a dog adoption data set using Python and PostgreSQL in Jupyter Notebook. |
Project Link | Area of Analysis | Project Description |
---|---|---|
Basketball Game Ticket Data | Data Query Language, Data Manipulation Language | A simple exercise on a ticketing dataset that showcases my SQL query writing and problem-solving skills across a range of SQL challenges. |
SQL Project 4 | Health analysis | I answer business questions about patient data, such as average and median measurements per user, types of measurements for active users, and median blood pressure values for users. |
SQL Project 5 | Data cleaning, data analysis | A project close to home. Inspired by Alex Freberg, I analysed global and local Covid-19 cases and their impact on the Malaysian stock market from Jan 2020 to Jul 2021 using SQL and Tableau. |
Project Link | Area | Project Description | Libraries |
---|---|---|---|
Can we predict if a mushroom is poisonous? | EDA, Predictive Analysis, Classifying | Here, my team and I prepared and analyzed the UCI Mushroom Data Set to predict which mushroom features make a mushroom more likely to be inedible or poisonous. | sklearn, pandas, NumPy, matplotlib, seaborn |
Sentiment Analysis on Movie Reviews | EDA, Naive Bayes | My team and I used a Multinomial Naive Bayes classifier to determine whether a movie review had negative or positive sentiment. | pandas, BeautifulSoup, sklearn, matplotlib, re (regex) |
Predicting Vehicle Weight | EDA and Linear Regression | Analyzed a vehicle dataset and built linear regression models that predict the curb weight of a vehicle. | pandas, matplotlib, seaborn |
Cleaning a Wine Dataset | EDA and Imputing Data | An exercise where a partner and I studied a wine dataset, built up domain knowledge about wine, examined each numerical and categorical variable, adjusted skew, normalized the data, and imputed missing values for several variables. | pandas, matplotlib |
Decision Tree Vs. Random Forest on NY State Graduation Data | EDA, Supervised Machine Learning, Decision Trees/Random Forest | In this analysis, we constructed three different kinds of decision trees and random forest models based on feature importance analysis using Logistic Regression on our Boolean variables, trained them on subsets of our data, analysed their performance using confusion matrices, and chose the best one for prediction. | pandas, matplotlib, sklearn, Yellowbrick |
K-Nearest Neighbors and Support Vector Machines to Predict Online Purchases | EDA, KNN, SVM | We used supervised learning methods, namely K-nearest neighbors and support vector machines, in Python to predict whether online shoppers would make a purchase. | pandas, matplotlib, seaborn |
Sentiment Analysis - A Machine Learning Approach into Hideo Kojima's Divisive Platformer | EDA, Naive Bayes, Feature Engineering, Natural Language Processing | Our team performed sentiment analysis on tweets posted in anticipation of the 2019 release of Hideo Kojima's video game Death Stranding. We sourced the tweets from two libraries, preprocessed them, stored them using MongoDB, and then performed sentiment analysis. | pandas, matplotlib, pymongo, NLTK, json |
Project Link | Project Description | Dashboard Link |
---|---|---|
Tableau Project 1 | Cleansed and transformed data on privately-owned companies (start-ups) valued at over $1 billion using Python. Visualised key insights using Tableau, including the timeline of valuations, the top 10 countries and investors with the highest valuations, the most successful unicorns, and the average time it takes to reach a $1 billion valuation. | DASHBOARD LINK 1 |
Tableau Project 2 | A project close to home. Inspired by Alex Freberg, I analysed X using SQL and Tableau. | DASHBOARD LINK 2 |
Project Link | Project Description |
---|---|
Sales Dashboard | Uploaded CSV sales data and modeled it as a star schema, then built a Power BI dashboard highlighting sales information. |