kennethleungty / kennethleungty

Data Science Portfolio

👋 Hello, I'm Kenneth Leung

  • Thanks for popping by! As an avid learner, bold builder, curious explorer, and driven doer with a bias towards action, I enjoy seeking and solving meaningful problems with data and technology while having fun at the same time.
  • I welcome you to join me on a learning journey! Follow me on GitHub, Medium, and LinkedIn for a great dose of practical educational data science content.
  • You can find my data science portfolio below, where every project and article was born out of inspiration, curiosity, and motivation. Feel free to reach out for a chat on topics common to both of us!
  • 👨‍🔧 Currently working on: (i) Applied Generative AI Use Cases, and (ii) Compilation of high-profile ML failures: Failed-ML. If you're keen to join me in contributing, let's connect!

How to reach me

Portfolio Contents

  1. Computer Vision
  2. Database Management
  3. Data Extraction and Web Scraping
  4. Data Science Certification Guides
  5. Data Science Toolkit
  6. Data Science in the Real World
  7. Generative AI
  8. Insights from Data Science Talks
  9. Machine Learning
  10. MLOps
  11. Natural Language Processing
  12. Networks and Graphs
  13. Sports Analytics
  14. Visualization
  15. Web Development
  16. Web3 and Metaverse
  17. Writing for DataCamp
  18. Writing Tips

Projects with ⭐ are my personal favourites, so do check them out!


Computer Vision 👁️

Title Article Repo
Classifying Images of Alcoholic Beverages with fast.ai v2 🔗 🔗
Russian Car Plate Detection with OpenCV and TesseractOCR 🔗 🔗
Evaluate OCR Output Quality with Character Error Rate (CER) and Word Error Rate (WER) 🔗 🔗
Top Python libraries for Image Augmentation in Computer Vision 🔗 🔗
⭐ PyTorch Ignite Tutorial - Classifying Tiny ImageNet with EfficientNet 🔗 🔗
Practical Guide to Transfer Learning in TensorFlow for Multiclass Image Classification 🔗 🔗

Database Management 🗄️

Title Article Repo
⭐ Definitive Guide to Creating a SQL Database on Cloud with AWS and Python 🔗 🔗
PyMySQL - Connecting Python and SQL for Data Science 🔗 🔗

Data Extraction and Web Scraping 🧰

Title Article Repo
Using OneMap API to extract Singapore postal codes, coordinates and travel distance - 🔗
A Detailed Web Scraping Walkthrough Using Python and Selenium 🔗 🔗
⭐ How to Web Scrape Wikipedia using LangChain Agents and Tools with OpenAI's LLMs and Function Calling 🔗 🔗

Data Science Certification Guides 👨‍🎓

Title Article Repo
3 Steps to Get AWS Cloud Practitioner Certified in 2 Weeks 🔗 🔗
3 Steps to Get Tableau Desktop Certified in 2 Weeks 🔗 -
⭐ No-Frills Guide to Passing the AWS Certified Machine Learning Specialty Exam 🔗 -

Data Science Toolkit 🛠️

Title Article Repo
Common Python codes for Data Wrangling - 🔗
Enhance your Python code's readability with pycodestyle 🔗 -
Free Resources for Generating Realistic Fake Data 🔗 -
Most Starred and Forked GitHub Repos for Data Science and Python 🔗 -
Most Starred and Forked GitHub Repos for Data Science and R 🔗 -
Automatically Generate Machine Learning Code with Just a Few Clicks 🔗 -
Read and Modify Image Metadata with Python 🔗 🔗
Top Tips to Google Search Like a Seasoned Data Scientist 🔗 -
How to Swap Day and Month of Incorrectly Formatted Excel Dates 🔗 -

Data Science in the Real World 🌏

Title Article Repo
Exploring Illegal Drugs in Singapore - A Data Perspective 🔗 🔗
Pharmacokinetic Modeling of Drug Concentration Trajectories using Ordinary Differential Equations (ODE) and Global Optimization with Differential Evolution - 🔗
Healthcare's AI Future - In Conversation with Andrew Ng and Fei-Fei Li 🔗 -
Real-World Data Science Use Cases in the Insurance Industry 🔗 -
⭐ Failed-ML: Compilation of high-profile real-world examples of failed machine learning projects 🔗 🔗

Generative AI 🤖

Title Article Repo
Generative AI Pharmacist - Macy 🔗 🔗
⭐ ChatPod - Q&A over your Podcasts with Whisper, FAISS, and LangChain 🔗 🔗
⭐ Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A 🔗 🔗
Domain LLMs - Compilation of Customized LLMs for Specific Domains and Industries - 🔗
⭐ Text-to-Audio Generation with Bark, Clearly Explained 🔗 🔗
Guide to ChatGPT's Advanced Settings - Top P, Frequency Penalties, Temperature, and More 🔗 -

Insights from Data Science Talks 👨‍🏫

Title Article Repo
Bridging AI's Proof-of-Concept to Production Gap - Insights from Andrew Ng 🔗 -

Machine Learning 🎰

Title Article Repo
Exploring Condominium Rental Prices with Web Scraping and Exploratory Data Analysis 🔗 🔗
Using Ensemble Regressors to Predict Condominium Rental Prices 🔗 🔗
The Dying ReLU Problem, Clearly Explained 🔗 -
Why Bootstrapping Actually Works 🔗 -
⭐ Assumptions of Logistic Regression, Clearly Explained 🔗 🔗
Data-Centric AI Competition - Tips and Tricks of a Top 5% Finish 🔗 🔗
Credit Card Fraud Detection with AutoXGB 🔗 🔗
⭐ Micro, Macro & Weighted Averages of F1 Score, Clearly Explained 🔗 -
Principal Component Regression - Clearly Explained and Implemented 🔗 🔗
⭐ Feature Selection with Simulated Annealing in Python, Clearly Explained 🔗 🔗
Quick Primer on Types of Missing Data and Imputation Techniques 🔗 -
Imputation of Missing Data in Tables with DataWig 🔗 🔗

MLOps - Machine Learning Operations 👨‍🔧

Title Article Repo
Key Learning Points from MLOps Specialization - Course 1/4 🔗 🔗
Key Learning Points from MLOps Specialization - Course 2/4 🔗 🔗
Key Learning Points from MLOps Specialization - Course 3/4 🔗 🔗
Key Learning Points from MLOps Specialization - Course 4/4 🔗 🔗
⭐ End-to-End AutoML Pipeline with H2O AutoML, MLflow, FastAPI, and Streamlit for Insurance Cross-Sell 🔗 🔗
⭐ How to Dockerize Machine Learning Applications Built with H2O, MLflow, FastAPI, and Streamlit 🔗 🔗
⭐ Building and Managing an Isolation Forest Anomaly Detection Pipeline with Kedro 🔗 🔗

Natural Language Processing 📑

Title Article Repo
COVID-19 Vaccine - What's the Public Sentiment? 🔗 🔗
Keyword Extraction and Analysis Pipeline with KeyBERT and Taipy 🔗 🔗

Networks and Graphs 🌐

Title Article Repo
⭐ Network Analysis and Visualization of Drug-Drug Interactions 🔗 🔗
How to Deploy Interactive Pyvis Network Graphs on Streamlit 🔗 🔗
A No-Code Approach to Building Knowledge Graphs 🔗 🔗

Sports Analytics ⚽

Title Article Repo
⭐ Analyzing English Premier League VAR Football Decisions 🔗 🔗
Combining Python and R for FIFA Football World Ranking Analysis 🔗 🔗

Visualization 📈

Title Article Repo
Uniform Singapore Energy Price and Demand Forecast Dashboard (with Plotly Dash) - 🔗
Visualizing Fortune 500 Companies in a Bar Chart Race 🔗 🔗
How to Easily Draw Neural Network Architecture Diagrams 🔗 🔗

Web Development 🖥️

Title Article Repo
⭐ Post COVID-19 Vaccination Wait-Time Tracker (with Python Flask) 🔗 🔗
From HTTP to HTTPS - Easily Secure Flask Web Apps With Talisman 🔗 -
⭐ Food King Directory (in collaboration with Night Owl Cinematics) 🔗 🔗

Web3 and Metaverse 👨‍💻

Title Article Repo
The Web3 / Metaverse Glossary - A Keyword Guide to the Tech Future 🔗 -

Writing for DataCamp ✍️

Title Article Repo
⭐ What Mature Data Infrastructure Looks Like 🔗 -
Democratizing Data in Government Agencies 🔗 -
A Survey Into Data Governance Tools 🔗 -
Scaling Data Science With Data Governance 🔗 -
3 Reasons Why All Teams Should Learn SQL 🔗 -
3 Reasons Why All Teams Should Learn R 🔗 -
How Tableau Helps Your Organization Achieve Greater Data Insights 🔗 -
How PowerBI Helps Your Organization Achieve Greater Data Insights 🔗 -

Writing Tips 📜

Title Article Repo
Create a Clickable Table of Contents for Your Medium Posts 🔗 -