Andreja Ho's repositories
Movies-ETL
For this project, I am creating an ETL (Extract, Transform, Load) pipeline using Python, regular expressions, and a SQL database. The goal is to retrieve data from different sources, clean and transform it into a useful format, and load it into the SQL database, where it is ready for further analysis. The result is an automated pipeline and a clean data set stored in a SQL database.
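As an illustration of the extract-transform-load flow described above, here is a minimal sketch. It is not the repository's code: the sample records, the `parse_dollars` and `run_pipeline` helpers, the `movies` table, and the use of an in-memory SQLite database are all assumptions chosen to keep the example self-contained.

```python
import re
import sqlite3

# Hypothetical raw records standing in for the extracted source data.
RAW_MOVIES = [
    {"title": "Movie A", "box_office": "$1.5 million"},
    {"title": "Movie B", "box_office": "$2,300,000"},
    {"title": "Movie C", "box_office": None},
]

# Two illustrative patterns for dollar amounts written in different styles.
MILLION_RE = re.compile(r"\$\s*(\d+(?:\.\d+)?)\s*million", re.IGNORECASE)
PLAIN_RE = re.compile(r"\$\s*(\d{1,3}(?:,\d{3})+|\d+)")


def parse_dollars(text):
    """Turn '$1.5 million' or '$2,300,000' into a float; return None otherwise."""
    if not text:
        return None
    match = MILLION_RE.search(text)
    if match:
        return float(match.group(1)) * 1_000_000
    match = PLAIN_RE.search(text)
    if match:
        return float(match.group(1).replace(",", ""))
    return None


def run_pipeline(records, conn):
    """Transform the records and load them into a 'movies' table."""
    rows = [(r["title"], parse_dollars(r["box_office"])) for r in records]
    conn.execute("CREATE TABLE IF NOT EXISTS movies (title TEXT, box_office REAL)")
    conn.executemany("INSERT INTO movies VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


# SQLite in memory is an assumption; any SQL database could take its place.
conn = sqlite3.connect(":memory:")
loaded = run_pipeline(RAW_MOVIES, conn)
result = conn.execute("SELECT title, box_office FROM movies ORDER BY title").fetchall()
```

Keeping the regular expressions in one small parsing function makes the transform step easy to test separately from the load step.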
Pewlett-Hackard-Analysis
In this project, I am using ERDs and schemas to design databases and writing intermediate-level SQL queries to answer important business questions for the company’s HR department. The result is a well-structured database with constraints and primary and foreign keys in place.
plotly_deploy
For this project, I am using JavaScript and Plotly.js to create an interactive dashboard for a biotechnology company. The result is a well-designed dashboard with several different charts that clearly communicate findings. The dashboard is deployed via GitHub Pages, where users can interact with it.
AB_Testing
In this project, I am performing A/B testing for the company’s new website. I am using hypothesis testing with Python and NumPy to determine the p-value, and regression models to advise whether the company should launch the new website. The result is a statistical analysis and an interpretation of the results that support the company’s decision.
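To make the p-value step concrete, here is a small sketch of a one-sided two-proportion z-test. It is an assumption about how such a test could be run, not the repository's method: the conversion counts are made up, the function name is hypothetical, and the sketch uses only the standard library rather than NumPy so that it runs anywhere.

```python
import math


def two_proportion_p_value(conv_old, n_old, conv_new, n_new):
    """One-sided test of H1: the new page converts better than the old page."""
    p_old = conv_old / n_old
    p_new = conv_new / n_new
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_old + conv_new) / (n_old + n_new)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    z = (p_new - p_old) / se
    # P(Z >= z) for a standard normal variable, via the complementary error function.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Made-up counts: identical pages, then a new page with more conversions.
z_same, p_same = two_proportion_p_value(120, 1000, 120, 1000)
z_diff, p_diff = two_proportion_p_value(120, 1000, 150, 1000)
```

A small p-value (for example, below a 0.05 significance level) would argue for rejecting the null hypothesis that the new page is no better than the old one.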
Data_Analyst_Nanodegree_Program_Portfolio
Udacity Data Analyst Nanodegree (DAND) projects portfolio. This repository contains all major projects completed in the program.
MechaCar_Statistical_Analysis
Statistical Analysis with R - Summary Statistics, T-Tests, ANOVA
Mission-to-Mars
Web-scraping with HTML and CSS
Amazon_Vine_Analysis
Analysis using PySpark, Google Colab, and AWS.
election-analysis
In this analysis, I wrote a Python script that generates a vote count per candidate and per county and creates a report for a U.S. congressional race in a Colorado precinct.
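A vote tally of this kind can be sketched as follows. The CSV layout, the column names (`county`, `candidate`), the sample ballots, and the helper names are hypothetical and are not taken from the repository.

```python
import csv
import io
from collections import Counter

# Hypothetical ballot data; a real script would read a file from disk instead.
SAMPLE_CSV = """ballot_id,county,candidate
1,Alpha,Candidate X
2,Alpha,Candidate Y
3,Beta,Candidate X
4,Beta,Candidate X
"""


def tally(csv_file):
    """Count votes per candidate and per county in a single pass."""
    candidate_votes = Counter()
    county_votes = Counter()
    for row in csv.DictReader(csv_file):
        candidate_votes[row["candidate"]] += 1
        county_votes[row["county"]] += 1
    return candidate_votes, county_votes


def build_report(candidate_votes, county_votes):
    """Format totals, vote shares, and the winner as plain text."""
    total = sum(candidate_votes.values())
    lines = [f"Total votes: {total}"]
    for county, votes in county_votes.most_common():
        lines.append(f"{county}: {votes / total:.1%} ({votes})")
    for name, votes in candidate_votes.most_common():
        lines.append(f"{name}: {votes / total:.1%} ({votes})")
    winner = candidate_votes.most_common(1)[0][0]
    lines.append(f"Winner: {winner}")
    return "\n".join(lines)


candidate_votes, county_votes = tally(io.StringIO(SAMPLE_CSV))
report = build_report(candidate_votes, county_votes)
```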
kickstarter-analysis
For this data analysis, I am using MS Excel, including interactive pivot tables and charts, conditional formatting, advanced filters, VLOOKUPs, box-and-whisker charts, and other advanced Excel formulas.
stock-analysis
VBA stock-market analysis. For this data analysis, I am using Microsoft Visual Basic for Applications, including conditional statements, for loops, static and conditional formatting, and code refactoring to improve efficiency and clarity.
bikesharing
City Bike Analysis with Tableau
Credit_Risk_Analysis
For this project, I am applying several supervised machine learning models to credit loan data in order to predict credit risk. I am using Python's scikit-learn library with logistic regression, random forest, and AdaBoost classifiers, together with resampling techniques (oversampling, undersampling, and cluster centroids), to compare the strengths and weaknesses of the models and determine how well each one classifies and predicts data.
Cryptocurrencies
Supervised ML
Data_Investigation
Extensive EDA
Exploratory_and_Explanatory_Visualizations
Data Visualization with Python, Pandas, Matplotlib and Seaborn
first-contributions
🚀✨ Help beginners to contribute to open source projects
github-slideshow
A robot powered training repository :robot:
hello-world
My first repo
Mapping_Earthquakes
Interactive earthquake maps with JavaScript and Leaflet
My_Portfolio
My Personal Page
Neural_Network_Charity_Analysis
ML Neural Network Analysis
PP_Child_Mortality
Child Mortality Analysis
PyBer_Analysis
Ride Share Analysis with Matplotlib
Weather_Trends
Moving Average and Weather Trends
Wine_Quality_Analysis
Wine quality analysis with Python and Pandas
World_Weather_Analysis
An API Weather Analysis
Wrangle_And_Analyze
In this project, I am working with Twitter data, and it’s all about our best friends: dogs! The main focus of this project is data wrangling: retrieving data from various sources and formats, such as the Twitter API, a JSON file, and a TSV file, then assessing and cleaning the data. The highlight of this project is working with various sources and files and cleaning the data using the define-code-test approach. The project yields a clean main dataset that is ready for further analysis.