Masaki Rikitoku's repositories
HLLSample
Sample code for the HyperLogLog sketch algorithm
spaCy
💫 Industrial-strength Natural Language Processing (NLP) with Python and Cython
feast
Feature Store for Machine Learning
nbviewer.js
Client side rendering of Jupyter notebooks
smile
Statistical Machine Intelligence & Learning Engine
jsoniter-scala
Scala macros for compile-time generation of safe and ultra-fast JSON codecs
timeseers
Time should be taken seer-iously
EconML
ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.
causalml
Uplift modeling and causal inference with machine learning algorithms
lit
The Language Interpretability Tool: Interactively analyze NLP models for model understanding in an extensible and framework-agnostic interface.
spark-nlp
State of the Art Natural Language Processing
pandas_redshift
Load data from Redshift into a pandas DataFrame and vice versa.
spark-excel
A Spark plugin for reading Excel files via Apache POI
angel
A Flexible and Powerful Parameter Server for large-scale machine learning
ngboost
Natural Gradient Boosting for Probabilistic Prediction
sona
Spark On Angel, arming Spark with a powerful Parameter Server, which enables Spark to train very big models.
nbviewer
nbconvert as a web service: Render Jupyter Notebooks as static web pages
mmlspark
Microsoft Machine Learning for Apache Spark
aws-data-wrangler
Utility belt to handle data on AWS.
PyTorch-On-Angel
PyTorch On Angel, arming PyTorch with a powerful Parameter Server, which enables PyTorch to train very big models.
BlingFire
A lightning fast Finite State machine and REgular expression manipulation library.
aws-glue-data-catalog-client-for-apache-hive-metastore
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore-compatible metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-the-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog as an external Hive Metastore. It serves as a reference implementation for building a Hive Metastore-compatible client that connects to the AWS Glue Data Catalog. It may be ported to other Hive Metastore-compatible platforms, such as other Hadoop and Apache Spark distributions.
aws-deepracer-workshops
DeepRacer workshop content
django_bootstrap_skelton
Django project skeleton featuring Bootstrap
CoreNLP
Stanford CoreNLP: A Java suite of core NLP tools.
amazon-forecast-samples
Notebooks and examples on how to onboard and use various features of Amazon Forecast.
sparkprophet
Sample application running fbprophet using Spark
liblinear-java
Java version of LIBLINEAR