linjiuning

linjiuning's starred repositories

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Language: Python | License: Apache-2.0 | Stargazers: 130010 | Issues: 1119 | Issues: 15368

spark

Apache Spark - A unified analytics engine for large-scale data processing

Language: Scala | License: Apache-2.0 | Stargazers: 39032 | Issues: 2027 | Issues: 0

tidb

TiDB is an open-source, cloud-native, distributed, MySQL-compatible database for elastic scale and real-time analytics. Try AI-powered Chat2Query free at: https://www.pingcap.com/tidb-serverless/

Language: Go | License: Apache-2.0 | Stargazers: 36661 | Issues: 1261 | Issues: 18604

ray

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Language: Python | License: Apache-2.0 | Stargazers: 32361 | Issues: 475 | Issues: 18100

mlflow

Open source platform for the machine learning lifecycle

Language: Python | License: Apache-2.0 | Stargazers: 18051 | Issues: 293 | Issues: 3765

gensim

Topic Modelling for Humans

Language: Python | License: LGPL-2.1 | Stargazers: 15479 | Issues: 433 | Issues: 1847

tikv

Distributed transactional key-value database, originally created to complement TiDB

Language: Rust | License: Apache-2.0 | Stargazers: 14838 | Issues: 308 | Issues: 5098

dgl

Python package built to ease deep learning on graphs, on top of existing DL frameworks.

Language: Python | License: Apache-2.0 | Stargazers: 13264 | Issues: 173 | Issues: 2733

allennlp

An open-source NLP research library, built on PyTorch.

Language: Python | License: Apache-2.0 | Stargazers: 11721 | Issues: 280 | Issues: 2557

CoreNLP

CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.

Language: Java | License: GPL-3.0 | Stargazers: 9587 | Issues: 487 | Issues: 1116

h2o-3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 6826 | Issues: 386 | Issues: 9394

ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.

Language: Python | License: Apache-2.0 | Stargazers: 6330 | Issues: 247 | Issues: 2452

smile

Statistical Machine Intelligence & Learning Engine

Language: Java | License: NOASSERTION | Stargazers: 5987 | Issues: 270 | Issues: 607

bert4keras

Keras implementation of transformers for humans

Language: Python | License: Apache-2.0 | Stargazers: 5336 | Issues: 73 | Issues: 489

sqlflow

Brings SQL and AI together.

Language: Go | License: Apache-2.0 | Stargazers: 5054 | Issues: 170 | Issues: 1025

aerosolve

A machine learning package built for humans.

Language: Scala | License: Apache-2.0 | Stargazers: 4795 | Issues: 352 | Issues: 19

tablesaw

Java dataframe and visualization library

Language: Java | License: Apache-2.0 | Stargazers: 3497 | Issues: 142 | Issues: 725

euler

A distributed graph deep learning framework.

Language: C++ | License: Apache-2.0 | Stargazers: 2886 | Issues: 139 | Issues: 317

mahout

Mirror of Apache Mahout

Language: HTML | License: Apache-2.0 | Stargazers: 2128 | Issues: 233 | Issues: 0

DeepIE

DeepIE: Deep Learning for Information Extraction

plato

Plato: Tencent's high-performance distributed graph computing framework

Language: C++ | License: NOASSERTION | Stargazers: 1895 | Issues: 80 | Issues: 135

byzer-lang

Byzer (formerly MLSQL): A low-code open-source programming language for data pipelines, analytics, and AI.

Language: Scala | License: Apache-2.0 | Stargazers: 1825 | Issues: 117 | Issues: 586

graph-learn

An Industrial Graph Neural Network Framework

Language: C++ | License: Apache-2.0 | Stargazers: 1275 | Issues: 50 | Issues: 139

AutoPhrase

AutoPhrase: Automated Phrase Mining from Massive Text Corpora

Language: C++ | License: Apache-2.0 | Stargazers: 1167 | Issues: 39 | Issues: 82

Mallet

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

Language: Java | License: NOASSERTION | Stargazers: 974 | Issues: 85 | Issues: 130

joinery

Data frames for Java

Language: Java | License: GPL-3.0 | Stargazers: 692 | Issues: 43 | Issues: 83

generative-recommenders

Repository hosting code used to reproduce results in "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).

Language: Python | License: Apache-2.0 | Stargazers: 566 | Issues: 24 | Issues: 34

Language: C++ | License: Apache-2.0 | Stargazers: 315 | Issues: 27 | Issues: 37

spark-notes

Deep Dive into Apache Spark: an in-depth study of the Spark source code

pyjava

This library is an ongoing effort to enable data exchange between Java/Scala and Python. PyJava uses Apache Arrow as the data exchange format.