Xiangrui Meng's repositories
spark-corenlp
A Stanford CoreNLP wrapper for the Spark ML pipeline API
pyspark-xgboost
This feature was merged into XGBoost master. See https://github.com/dmlc/xgboost/pull/8020. If you want to try out this feature, please build from XGBoost master and report issues at https://github.com/dmlc/xgboost/issues.
bazel-toolchain
LLVM toolchain for bazel
cloudpickle
Extended pickling support for Python objects
containers
Sample base images for Databricks Container Services
databricks-cli
Command Line Interface for Databricks
gpt-2
Code for the paper "Language Models are Unsupervised Multitask Learners"
hyperopt
Distributed Asynchronous Hyperparameter Optimization in Python
joblib-spark
Joblib Spark backend
keras
Deep Learning for humans
langchain
⚡ Building applications with LLMs through composability ⚡
mleap
MLeap: Deploy Spark Pipelines to Production
mlflow
Open source platform for the machine learning lifecycle
morpheus
Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.
petastorm
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.
spark-deep-learning
Deep Learning Pipelines for Apache Spark
spark-website
Mirror of Apache Spark Website
tensorflow
An Open Source Machine Learning Framework for Everyone
tensorflow_recipes
TensorFlow conda recipes
tensorframes
TensorFlow wrapper for DataFrames on Apache Spark
training-1
Reference implementations of training benchmarks
xgboost
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Dask, Flink and DataFlow.