craig pfeifer (craigpfeifer)

Company: @Lightning-AI

Location: Detroit, MI

Home Page: https://www.linkedin.com/in/craigpfeifer

Twitter: @acraigpfeifer

craig pfeifer's starred repositories

streamlit

Streamlit — A faster way to build and share data apps.

Language: Python | License: Apache-2.0 | Stargazers: 33635 | Issues: 311 | Issues: 4370

jax

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Language: Python | License: Apache-2.0 | Stargazers: 29315 | Issues: 330 | Issues: 5374

pytorch-lightning

Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.

Language: Python | License: Apache-2.0 | Stargazers: 27607 | Issues: 247 | Issues: 6983

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 23847 | Issues: 221 | Issues: 3660

haystack

LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

Language: Python | License: Apache-2.0 | Stargazers: 14870 | Issues: 132 | Issues: 3388

sentence-transformers

Multilingual Sentence & Image Embeddings with BERT

Language: Python | License: Apache-2.0 | Stargazers: 14546 | Issues: 134 | Issues: 2072

tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Language: Python | License: Apache-2.0 | Stargazers: 11463 | Issues: 382 | Issues: 3324

pygwalker

PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis

Language: Python | License: Apache-2.0 | Stargazers: 10824 | Issues: 67 | Issues: 190

sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.

Language: C++ | License: Apache-2.0 | Stargazers: 9892 | Issues: 124 | Issues: 733

kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.

Language: Python | License: Apache-2.0 | Stargazers: 9531 | Issues: 106 | Issues: 1876

dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Language: Python | License: Apache-2.0 | Stargazers: 9385 | Issues: 140 | Issues: 5289

litgpt

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Language: Python | License: Apache-2.0 | Stargazers: 9109 | Issues: 87 | Issues: 711

aim

Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.

Language: Python | License: Apache-2.0 | Stargazers: 5034 | Issues: 44 | Issues: 1002

simpletransformers

Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI

Language: Python | License: Apache-2.0 | Stargazers: 4048 | Issues: 64 | Issues: 1119

phoenix

AI Observability & Evaluation

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 3150 | Issues: 29 | Issues: 1930

kenlm

KenLM: Faster and Smaller Language Model Queries

Language: C++ | License: NOASSERTION | Stargazers: 2458 | Issues: 69 | Issues: 366

jackson

🔥 Streamline your web application's authentication with Jackson, an SSO service supporting SAML and OpenID Connect protocols. Beyond enterprise-grade Single Sign-On, it also supports Directory Sync via the SCIM 2.0 protocol for automatic user and group provisioning/de-provisioning. 🤩

Language: TypeScript | License: Apache-2.0 | Stargazers: 1708 | Issues: 14 | Issues: 135

detext

DeText: A Deep Neural Text Understanding Framework for Ranking and Classification Tasks

Language: Python | License: BSD-2-Clause | Stargazers: 1261 | Issues: 38 | Issues: 17

pynndescent

A Python nearest neighbor descent for approximate nearest neighbors

Language: Python | License: BSD-2-Clause | Stargazers: 872 | Issues: 14 | Issues: 133

docassemble

A free, open-source expert system for guided interviews and document assembly, based on Python, YAML, and Markdown.

Language: Python | License: MIT | Stargazers: 753 | Issues: 49 | Issues: 365

bleurt

BLEURT is a metric for Natural Language Generation based on transfer learning.

Language: Python | License: Apache-2.0 | Stargazers: 673 | Issues: 13 | Issues: 51

vectorhub

Vector Hub - Library for easy discovery, and consumption of State-of-the-art models to turn data into vectors. (text2vec, image2vec, video2vec, graph2vec, bert, inception, etc)

Language: Python | License: Apache-2.0 | Stargazers: 551 | Issues: 18 | Issues: 19

ETM

Topic Modeling in Embedding Spaces

Language: Python | License: MIT | Stargazers: 536 | Issues: 14 | Issues: 36

acl-anthology

Data and software for building the ACL Anthology.

Language: Python | License: Apache-2.0 | Stargazers: 384 | Issues: 21 | Issues: 2121

litdata

Transform datasets at scale. Optimize datasets for fast AI model training.

Language: Python | License: Apache-2.0 | Stargazers: 264 | Issues: 16 | Issues: 82

Topic_Modelling_Beyond_Tokens

Investigating how to extract meaningful topic names from textual data

Language: Jupyter Notebook | Stargazers: 19 | Issues: 1 | Issues: 1

PAL-A-tool-for-Pre-annotation-and-Active-Learning

PAL: A tool for Pre-annotation and Active Learning

Language: Python | License: BSD-2-Clause | Stargazers: 18 | Issues: 4 | Issues: 0

rhapsode

Advanced desktop search/corpus exploration prototype

Language: Java | License: NOASSERTION | Stargazers: 7 | Issues: 2 | Issues: 0