YI SHENG CHAN's repositories
LDA_RecEngine
An implementation of an LDA-based recommender system
amazon-redshift-utils
Amazon Redshift Utils contains utilities, scripts, and views that are useful in a Redshift environment
awesome-machine-learning
A curated list of awesome Machine Learning frameworks, libraries and software.
azure-sdk-for-python
Microsoft Azure SDK for Python
CS100.1x
BerkeleyX CS100.1x "Introduction to Big Data with Apache Spark"
ctci
Cracking the Coding Interview
database_cleaner
Strategies for cleaning databases in Ruby. Can be used to ensure a clean state for testing.
draft-ietf-ppm-dap
This document describes the Distributed Aggregation Protocol (DAP) being developed by the PPM working group at the IETF.
EM_tossing_coins
An implementation of a classic Expectation-Maximization illustration
Facebook-Recruiting
Predict if an online bid is made by a machine or a human
ggplot
ggplot for Python
horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
ipython-notebooks
A collection of IPython notebooks covering various topics.
kafka-python
Python client for Apache Kafka
LeetCode-Sol-Res
Clean, Understandable Solutions and Resources for LeetCode Online Judge Algorithms Problems.
mahout_HBase_in_pyspark
This code snippet shows how to use Mahout's spark-itemsimilarity API to build an item-based recommender engine in PySpark
minhash
Clustering using the MinHash technique
mlxtend
A library of extension and helper modules for Python's data analysis and machine learning libraries.
pyLDAvis
Python library for interactive topic model visualization. Port of the R LDAvis package.
Qix
Node.js, Golang, Machine Learning, PostgreSQL, Deep Learning
ray
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
sbt-scoverage
sbt plugin for scoverage
scalac-scoverage-plugin
Scoverage Scala Code Coverage Core Libs
scoozie
Scala DSL on top of Oozie XML
segment
A tool to segment text based on frequencies and the Viterbi algorithm, e.g. "#TheBoyWhoDied" => ['#', 'The', 'Boy', 'Who', 'Died']
streaming-matrix-factorization
Distributed Streaming Matrix Factorization implemented on Spark for Recommendation Systems
vowpal_wabbit
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
wormhole
Portable, scalable, and reliable distributed machine learning, supporting various platforms including Hadoop YARN, MPI, etc.
xgboost
eXtreme Gradient Boosting (GBDT or GBRT) Library for large-scale (distributed) machine learning