KAR (KAR-NG)

KAR-NG

Geek Repo

Location: Brisbane, Australia

KAR's repositories

Cucumber_Multi-Env_LatinSquare_Field_Experiment

A multi-environment Latin Square designed trial analysed by ANOVA, Two-way ANOVA, Fully Random Model, Mixed Effect Model, and Tukey test.

Language:RStargazers:2Issues:1Issues:0

Maize_Soil_Nutrient_CRD_Glasshouse_Experiment-

A CRD system (8 treatments & 3 harvests) analysed by Shapiro-Wilk test, Q-Q plot, Levene’s test, Kruskal-Wallis test, and Dunn’s Post-hoc test.

Language:RStargazers:2Issues:1Issues:0

Oats_Variety-Fertilizer_SplitPlot_Field_Experiment

A factorial Split-plot system analysed by Shapiro-Wilk test, Levene’s test, Q-Q plot, CI plot, Mixed-Effect Model, ANOVA, and Tukey test.

Language:RStargazers:2Issues:1Issues:0

Bike-Share_Big_Data_Analysis

12 datasets, 3.7 million obs, & 13 vars were cleaned and manipulated to produce 6 graphs, a dynamic map, and statistics aimed at converting casual riders into members.

Language:RStargazers:1Issues:1Issues:0

Brisbane_Real_Estate_Sales_2020

320k obs and 11 vars cleaned and manipulated for EDA and mapping (choropleth, cluster, points) to find a new home for a Brisbane family.

Language:RStargazers:1Issues:1Issues:0

Houston_Avocado_Prices_EDA_-_Forecast

18k obs & 14 vars cleaned and manipulated for EDA, assumption tests, PP, WO, Ljung-Box, and forecasting (ETS & ARIMA) for avocado prices in the US and Houston.

Language:RStargazers:1Issues:1Issues:0

KAR-NG

My Personal Repository

Marketing_Analytics

Solved 9 biz tasks with 18 graphs and 10 statistical methods, including dummy data partitioning (RMSE & R2), stepwise model selection, multicollinearity checks (correlation, VIF), MLR, and GLM for logistic regression.

Language:RStargazers:1Issues:1Issues:0

Recommendation_of_Crop_Classes_by_Predictive_Model

Built an ML API that recommends crop classes with 99.5% accuracy. Trained 13 models, including discriminant analyses, KNN, SVMs, Naive Bayes, Decision Tree, Random Forest (RF), and Boosted RF.

Language:RStargazers:1Issues:1Issues:0

ResortHotel_versus_CityHotel

119k obs & 32 vars cleaned and manipulated to create 14 distinct graphs and statistic tables for an extensive EDA to draw insights.

Language:RStargazers:1Issues:1Issues:0

Human-Resource-Data-Mining

5 analytical tasks completed using VAT-validated Gower-PAM clustering, Correspondence Analysis (CA), Asym-Biplot, Multiple Correspondence Analysis (MCA), Chi-Squared test, Regression, and predictive classification models with KNN, SVM, and Random Forest.

Language:RStargazers:0Issues:1Issues:0

Life-Expectancy-Statistical-Analysis-WHO-

Statistically answered 8 research questions using Multiple Factor Analysis (MFA), Principal Component Analysis (PCA), Multiple Linear Regression, Welch's t-test, Wilcoxon signed-rank test, and Longitudinal Multilevel Mixed-effect Modeling with time trajectories.

Language:HTMLStargazers:0Issues:1Issues:0

Analysis-of-Titanic-Mortality

Data manipulation, imputation, feature engineering, and machine learning algorithms (K-nearest neighbour, random forest, and extreme gradient boosting) were applied to clean the dataset. A final, fully cleaned dataset was produced for data visualisation to understand the trends in the tragedy.

Language:HTMLStargazers:0Issues:1Issues:0

Credit-Card-Market-Segmentation

The VEV model from Mclust performed best among 5 clustering algorithms and detected 8 distinct groups of users. Data was cleaned, standardized, and feature-selected; PCA biplots, ggplot, radar plots, and parallel coordinate plots were applied for EDA.

Language:RStargazers:0Issues:1Issues:0

Dirty-Data-Challenge-

Clean, manipulate, transform, and join 4 messy datasets

Language:RStargazers:0Issues:1Issues:0

ecar

ecar

Language:HTMLStargazers:0Issues:1Issues:0

Food-Poison-Survey-Analysis-using-Multiple-Correspondence-Analysis

This project applies multiple correspondence analysis (MCA), using scree plots, variable plots, individual plots, biplots, squared cosine (cos2), and contribution statistics (contrib), to detect trends in a multivariate food poisoning survey dataset and identify the food most likely to have caused the poisoning. MCA is one of the principal component methods, which belong to the "unsupervised" branch of machine learning.

Stargazers:0Issues:1Issues:0

Loan-EDA-and-Machine-Learning-Prediction

Solved 7 business tasks and identified statistically important variables related to loan applications. Many plots were produced during EDA and machine learning. Models built include logistic regression, decision tree, bootstrap aggregating, random forest, and fine-tuned extreme gradient boosting.

Language:RStargazers:0Issues:1Issues:0

nasa

nasa

Stargazers:0Issues:1Issues:0

pima

pima

Stargazers:0Issues:1Issues:0

Predicting-House-Prices-in-Boston_UniqueVersion

Extracted statistical relationships between house prices and many factors, and deployed into production the Random Forest model (90% R2) that outperformed MLR, Lasso, PLS, KNN, and DT.

Language:RStargazers:0Issues:1Issues:0

regression

regressionbook

Language:HTMLStargazers:0Issues:1Issues:0

Sales-of-Summer-Clothes-in-E-commerce-

Solved 9 analysis tasks and identified the most important variables driving the success of clothes sales, using 22 plots, multiple linear regression, and random forest.

Language:RStargazers:0Issues:1Issues:0

SimpleTalkDemo_R

Demo data and R script for Simple Talk article

Language:RStargazers:0Issues:0Issues:0

soil

soil

Stargazers:0Issues:1Issues:0
Stargazers:0Issues:1Issues:0
Language:JavaScriptStargazers:0Issues:1Issues:0

superstore.sales

superstore.sales

Language:HTMLStargazers:0Issues:1Issues:0