Simon Lee's repositories
algebird
Abstract Algebra for Scala
articles
Professional articles edited and compiled by Datartisan
BaiduPCS
A command-line (terminal) utility for Baidu Network Disk.
Blog
Blog (résumé included)
dotvim
personal vim configurations and plugins
dr-elephant
Performance monitoring and tuning tool for Apache Hadoop
finagle
A fault tolerant, protocol-agnostic RPC system
fix_oralib_osx
Fix dependent Oracle libraries in Oracle Instant Client packages for OS X 10.11 El Capitan
flanker
Python email address and Mime parsing library
incubator-airflow
Apache Airflow (Incubating)
insuranceQA-cnn-lstm
TensorFlow and Theano CNN code for insurance QA (question-answer matching)
jupyter-scala
Lightweight Scala kernel for Jupyter / IPython 3
kg-beijing
Beijing Knowledge Graph Study Group
luigi
Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
metrics
Capturing JVM- and application-level metrics. So you know what's going on.
new-coder
New Coder tutorials
nlp-lang
This project is a base package that wraps the tools commonly used in most NLP projects.
objenesis
Okay, it's pretty easy to instantiate objects in Java through standard reflection. However there are many cases where you need to go beyond what reflection provides. For example, if there's no public constructor, you want to bypass the constructor code, or set final fields. There are numerous clever (but fiddly) approaches to getting around this and this library provides a simple way to get at them. You will find the official site here.
Programming-Collective-Intelligence
Examples from Programming Collective Intelligence
scalding
A Scala API for Cascading
sent-conv-torch
Text classification using a convolutional neural network.
snowflake
Snowflake is a network service for generating unique ID numbers at high scale with some simple guarantees.
spark-corenlp
CoreNLP wrapper for Spark
spark-solr
Tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ.
spark_lagou
Data repository for the article series "Analyzing Lagou Job Postings with Spark".
stream-lib
Stream summarizer and cardinality estimator.
The-Art-Of-Programming-By-July
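This GitHub repository largely stopped being updated in June 2014. The complete, polished print edition, 《编程之法:面试和算法心得》 (The Method of Programming: Interview and Algorithm Insights), is now on sale at JD.com and Dangdang.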
tpot
A Python tool that automatically creates and optimizes machine learning pipelines using genetic programming.
zipkin
Zipkin is a distributed tracing system