David Macêdo, PhD (dlmacedo)

Company: CIn

Location: UFPE

Home Page: https://dlmacedo.com

Twitter: @david_macedo

David Macêdo, PhD's starred repositories

llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 32733 | Issues: 340 | Issues: 61

ollama

Get up and running with Llama 2, Mistral, and other large language models locally.

stanford_alpaca

Code and documentation to train Stanford's Alpaca models, and generate the data.

Language: Python | License: Apache-2.0 | Stargazers: 29018 | Issues: 341 | Issues: 267

tinygrad

You like pytorch? You like micrograd? You love tinygrad! ❤️

Language: Python | License: MIT | Stargazers: 24592 | Issues: 267 | Issues: 628

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 20742 | Issues: 195 | Issues: 2962

LLMs-from-scratch

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 18956 | Issues: 220 | Issues: 41

gpt-crawler

Crawl a site to generate knowledge files to create your own custom GPT from a URL

Language: TypeScript | License: ISC | Stargazers: 18026 | Issues: 118 | Issues: 110

haystack

LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

Language: Python | License: Apache-2.0 | Stargazers: 14224 | Issues: 129 | Issues: 3278

DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

Language: Jupyter Notebook | Stargazers: 12795 | Issues: 296 | Issues: 820

dspy

DSPy: The framework for programming—not prompting—foundation models

Language: Python | License: MIT | Stargazers: 12540 | Issues: 116 | Issues: 507

Qwen

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Language: Python | License: Apache-2.0 | Stargazers: 11957 | Issues: 96 | Issues: 1018

latent-diffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Language: Jupyter Notebook | License: MIT | Stargazers: 10874 | Issues: 97 | Issues: 333

taipy

Turns Data and AI algorithms into production-ready web applications in no time.

Language: Python | License: Apache-2.0 | Stargazers: 9456 | Issues: 61 | Issues: 570

streaming-llm

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Language: Python | License: MIT | Stargazers: 6299 | Issues: 61 | Issues: 76

BERTopic

Leveraging BERT and c-TF-IDF to create easily interpretable topics.

Language: Python | License: MIT | Stargazers: 5686 | Issues: 52 | Issues: 1625

gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Language: Python | License: BSD-3-Clause | Stargazers: 5267 | Issues: 61 | Issues: 87

amazon-dsstne

Deep Scalable Sparse Tensor Network Engine (DSSTNE) is an Amazon-developed library for building Deep Learning (DL) machine learning (ML) models.

Language: C++ | License: Apache-2.0 | Stargazers: 4414 | Issues: 341 | Issues: 108

Anima

33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 3401 | Issues: 98 | Issues: 131

KeyBERT

Minimal keyword extraction with BERT

Language: Python | License: MIT | Stargazers: 3277 | Issues: 32 | Issues: 191

Top2Vec

Top2Vec learns jointly embedded topic, document and word vectors.

Language: Python | License: BSD-3-Clause | Stargazers: 2864 | Issues: 38 | Issues: 327

ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)

Language: Python | License: MIT | Stargazers: 2587 | Issues: 41 | Issues: 250

RAGatouille

Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.

Language: Python | License: Apache-2.0 | Stargazers: 2302 | Issues: 22 | Issues: 150

sparrow

Data processing with ML and LLM

Language: Python | License: GPL-3.0 | Stargazers: 2149 | Issues: 35 | Issues: 50

dialoqbase

Create chatbots with ease

Language: TypeScript | License: MIT | Stargazers: 1491 | Issues: 25 | Issues: 152

ATLAS

A principled instruction benchmark on formulating effective queries and prompts for large language models (LLMs). Our paper: https://arxiv.org/abs/2312.16171

Language: Python | License: Apache-2.0 | Stargazers: 816 | Issues: 20 | Issues: 7

tab-ddpm

[ICML 2023] The official implementation of the paper "TabDDPM: Modelling Tabular Data with Diffusion Models"

Language: Python | License: MIT | Stargazers: 338 | Issues: 6 | Issues: 33

Language: Python | License: Apache-2.0 | Stargazers: 256 | Issues: 1 | Issues: 7

ForestDiffusion

Generating and Imputing Tabular Data via Diffusion and Flow XGBoost Models