Catherine Shen's repositories
Git-Influencer
Insight Data Engineering project: a platform built on HDFS, Spark, and Airflow to help you find social influencers in the GitHub network.
Realtime-Stock-Monitoring
Real-time stock data monitoring platform - a practice project using Kafka, Cassandra, and Spark.
AWS_SageMaker
Personal guide and examples to learn and use AWS SageMaker to deploy your ML model at scale.
Multithreading_python
Tutorials and collections on multithreading and async in Python.
Scala-Spark
Spark Streaming and Machine Learning with Scala.
algorithms-1
Minimal examples of data structures and algorithms in Python
awesome-python
A curated list of awesome Python frameworks, libraries, software and resources
AWS_solutionArchitecture
Preparation materials for the AWS Solutions Architect exams.
twitter_sentimentClassification_pipeline
A Twitter sentiment classification pipeline that generates sentiment scores using a model trained on the twitter140 dataset.
ETL_with_airflow
Self-edited Airflow tutorial based on the "ETL Best Practices with Airflow" repository.
Presto_Hands_on_tutorials
Collections and sample code for learning PrestoDB.
Airflow_Datapipeline
Airflow cheatsheet and tips for work scheduling.
cloud-bigtable-examples
Examples of how to use Cloud Bigtable both with GCE map/reduce and in standalone applications.
DataEngineering-Daily-Reading
A collection of great tech posts from daily reading.
examples
A repository to host extended examples and tutorials
Flask_blog
A simple blog written in Python.
istio
Connect, secure, control, and observe services.
kafka-streams-course
Learn Kafka Streams with several examples!
mongo-python-driver
PyMongo - the Python driver for MongoDB
nosqlclient
Cross-platform, self-hosted, easy-to-use MongoDB management tool - formerly Mongoclient
pubsub
This repository contains open-source projects managed by the owners of Google Cloud Pub/Sub.
PyGithub
Typed interactions with the GitHub API v3
SQL-Python-api
SQL practice from Leetcode and other sources