ghbacct's starred repositories

BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.

Language: Python · License: MIT · Stargazers: 5791 · Issues: 0

Top2Vec

Top2Vec learns jointly embedded topic, document and word vectors.

Language: Python · License: BSD-3-Clause · Stargazers: 2887 · Issues: 0

texthero

Text preprocessing, representation and visualization from zero to hero.

Language: Python · License: MIT · Stargazers: 2878 · Issues: 0

Language: TypeScript · Stargazers: 252 · Issues: 0

prodigy-tui

A textual TUI for Prodigy

Language: CSS · License: NOASSERTION · Stargazers: 14 · Issues: 0

machine-learning-for-software-engineers

A complete daily plan for studying to become a machine learning engineer.

License: CC-BY-SA-4.0 · Stargazers: 27944 · Issues: 0

100-Days-Of-ML-Code

100 Days of ML Coding

License: MIT · Stargazers: 43910 · Issues: 0

data-science-interviews

Data science interview questions and answers

Language: HTML · License: CC-BY-4.0 · Stargazers: 8444 · Issues: 0

Language: Jupyter Notebook · License: MIT · Stargazers: 2211 · Issues: 0

dockerLLM

TheBloke's Dockerfiles

Language: Shell · License: MIT · Stargazers: 291 · Issues: 0

aici

AICI: Prompts as (Wasm) Programs

Language: Rust · License: MIT · Stargazers: 1855 · Issues: 0

Daft

Distributed DataFrame for Python designed for the cloud, powered by Rust

Language: Rust · License: Apache-2.0 · Stargazers: 1874 · Issues: 0

drawdata

Draw datasets from within Jupyter.

Language: JavaScript · License: MIT · Stargazers: 742 · Issues: 0

feature-engineering-az

Source for book "Feature Engineering A-Z"

Language: HTML · Stargazers: 61 · Issues: 0

hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.

Language: Jupyter Notebook · License: BSD-3-Clause-Clear · Stargazers: 1564 · Issues: 0

llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 34024 · Issues: 0

mergekit

Tools for merging pretrained large language models.

Language: Python · License: LGPL-3.0 · Stargazers: 4067 · Issues: 0

sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.

Language: Python · License: Apache-2.0 · Stargazers: 2811 · Issues: 0

Flowise

Drag & drop UI to build your customized LLM flow

Language: TypeScript · License: Apache-2.0 · Stargazers: 27386 · Issues: 0

self-rag

This includes the original implementation of "SELF-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection" by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.

Language: Python · License: MIT · Stargazers: 1626 · Issues: 0

datatrove

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Language: Python · License: Apache-2.0 · Stargazers: 1768 · Issues: 0

instructor

Structured outputs for LLMs

Language: Python · License: MIT · Stargazers: 6606 · Issues: 0

shaderunner

Ctrl + F but fancy.

Language: TypeScript · License: MIT · Stargazers: 10 · Issues: 0

Language: Python · License: MIT · Stargazers: 7 · Issues: 0

ML-Papers-Explained

Explanations of key concepts in ML

Stargazers: 6773 · Issues: 0

projects

🪐 End-to-end NLP workflows from prototype to production

Language: Python · License: MIT · Stargazers: 1273 · Issues: 0

nlpaug

Data augmentation for NLP

Language: Jupyter Notebook · License: MIT · Stargazers: 4350 · Issues: 0

ml-engineering

Machine Learning Engineering Open Book

Language: Python · License: CC-BY-SA-4.0 · Stargazers: 10195 · Issues: 0

LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Language: Python · License: MIT · Stargazers: 9754 · Issues: 0

useb

Heterogeneous, Task- and Domain-Specific Benchmark for Unsupervised Sentence Embeddings used in the TSDAE paper: https://arxiv.org/abs/2104.06979.

Language: Python · License: Apache-2.0 · Stargazers: 32 · Issues: 0