Takumi Ito (taku-ito)

Company: Langsmith Inc. / Tohoku NLP Lab

Home Page: https://www.takumi-ito.com/

Takumi Ito's starred repositories

persona-hub

Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"

Language: Python | Stargazers: 641 | Issues: 0

mesop

Build delightful web apps quickly in Python

Language: Python | License: Apache-2.0 | Stargazers: 4812 | Issues: 0

outlines

Structured Text Generation

Language: Python | License: Apache-2.0 | Stargazers: 7371 | Issues: 0

instructor

structured outputs for llms

Language: Python | License: MIT | Stargazers: 6867 | Issues: 0

distilabel

⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.

Language: Python | License: Apache-2.0 | Stargazers: 1200 | Issues: 0

artkit

Automated prompt-based testing and evaluation of Gen AI applications

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 96 | Issues: 0

LLM-eval-survey

The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".

Stargazers: 1336 | Issues: 0

LLMDataHub

A quick guide (especially) for trending instruction finetuning datasets

License: MIT | Stargazers: 2292 | Issues: 0

mindsdb

The platform for building AI from enterprise data

Language: Python | License: NOASSERTION | Stargazers: 25851 | Issues: 0

Adala

Adala: Autonomous DAta (Labeling) Agent framework

Language: Python | License: Apache-2.0 | Stargazers: 874 | Issues: 0

llm-app

Dynamic RAG for enterprise. Ready to run with Docker, ⚡ in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.

License: MIT | Stargazers: 3395 | Issues: 0

langkit

🔍 LangKit: An open-source toolkit for monitoring Large Language Models (LLMs). 📚 Extracts signals from prompts & responses, ensuring safety & security. 🛡️ Features include text quality, relevance metrics, & sentiment analysis. 📊 A comprehensive tool for LLM observability. 👀

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 785 | Issues: 0

llamafile

Distribute and run LLMs with a single file.

Language: C++ | License: NOASSERTION | Stargazers: 17862 | Issues: 0

unsloth

Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Language: Python | License: Apache-2.0 | Stargazers: 13227 | Issues: 0

bark

🔊 Text-Prompted Generative Audio Model

Language: Jupyter Notebook | License: MIT | Stargazers: 33977 | Issues: 0

langfuse

🪢 Open source LLM engineering platform: Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, LiteLLM, and more. 🍊 YC W23

Language: TypeScript | License: NOASSERTION | Stargazers: 4951 | Issues: 0

ultrajson

Ultra fast JSON decoder and encoder written in C with Python bindings

Language: C | License: NOASSERTION | Stargazers: 4287 | Issues: 0
Language: C | License: NOASSERTION | Stargazers: 418 | Issues: 0

text-dedup

All-in-one text de-duplication

Language: Python | License: Apache-2.0 | Stargazers: 555 | Issues: 0

preprocess

Corpus preprocessing

Language: C++ | License: NOASSERTION | Stargazers: 93 | Issues: 0

AlignScore

ACL2023 - AlignScore, a metric for factual consistency evaluation.

Language: Python | License: MIT | Stargazers: 99 | Issues: 0

optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools

Language: Python | License: Apache-2.0 | Stargazers: 2348 | Issues: 0

J-UniMorph

Dataset of UniMorph in Japanese

Language: JavaScript | License: CC-BY-4.0 | Stargazers: 4 | Issues: 0

DataDreamer

DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤

Language: Python | License: MIT | Stargazers: 748 | Issues: 0

Appraise

Appraise code used as part of WMT21 human evaluation campaign

Language: Python | License: BSD-3-Clause | Stargazers: 22 | Issues: 0

TransformerLens

A library for mechanistic interpretability of GPT-style language models

Language: Python | License: MIT | Stargazers: 1258 | Issues: 0

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Language: Python | License: BSD-2-Clause | Stargazers: 10244 | Issues: 0

uptrain

UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform root cause analysis on failure cases and give insights on how to resolve them.

Language: Python | License: Apache-2.0 | Stargazers: 2117 | Issues: 0

reflex

🕸️ Web apps in pure Python 🐍

Language: Python | License: Apache-2.0 | Stargazers: 18211 | Issues: 0

reactpy

It's React, but in Python

Language: Python | License: MIT | Stargazers: 7779 | Issues: 0