Shuji SUZUKI's starred repositories
private-gpt
Interact with your documents using the power of GPT, 100% privately, no data leaks
devika
Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.
azure-search-openai-demo
A sample app for the Retrieval-Augmented Generation pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
trafilatura
Python & command-line tool to gather text on the Web: Crawling & scraping, content extraction, metadata. TXT, Markdown, CSV & XML output.
minixfromscratch
Development and compilation setup for the book versions of MINIX (2.0.0 and 3.1.0) on QEMU
MEGABYTE-pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in PyTorch
chatgpt-arxiv-extension
A browser extension that enhances search engines with ChatGPT
buffer-of-thought-llm
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
SimpleGPUHashTable
A simple GPU hash table implemented in CUDA using lock-free techniques
distributed-faiss
A library for building and serving multi-node distributed faiss indices.
pdf-translator
pdf-translator translates English PDF files into Japanese, preserving the original layout.
Score-Entropy-Discrete-Diffusion
[ICML 2024 Oral] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)
the-stack-v2
Code for the curation of The Stack v2 and StarCoder2 training data