Pavel's repositories
ipytelegram
IPython magic for Telegram notifications
caltech-ml-class
Caltech Machine Learning Class: learning from data
titanic-cnn
Titanic: 1D convolution on learned character embeddings
stackoverflow
Kaggle competition: Predict Closed Questions on Stack Overflow (https://www.kaggle.com/c/predict-closed-questions-on-stack-overflow)
YouTokenToMe
Unsupervised text tokenizer focused on computational efficiency
allen-ai-challenge
7th place @ The Allen AI Science Challenge, solution by Team Generation Gap
xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Flink and DataFlow.
coursera-data-analysis
Coursera Data Analysis code
diagrams
Charts in JSON
dist-keras
Distributed deep learning with Keras and Apache Spark.
hyperopt
Distributed Asynchronous Hyperparameter Optimization in Python
julialang.github.com
Julia Project web site.
kaggle-rainicorn
Allstate Purchase Prediction Challenge @ Kaggle
la4j
My fork of la4j - Linear Algebra for Java
octo-diagrams
Test repo for diagrams.
scalacaster
Purely Functional Algorithms and Data Structures in Scala
scikit-learn
My fork of scikit-learn: machine learning in Python
smart_open
Utils for streaming large files (S3, HDFS, gzip, bz2...)
t-digest-java
A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means
word2gauss
Gaussian word embeddings