Yutaro's starred repositories

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Language: Python, License: Apache-2.0, Stargazers: 34089, Issues: 340, Issues: 2664

semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps

reflex

🕸️ Web apps in pure Python 🐍

Language: Python, License: Apache-2.0, Stargazers: 18244, Issues: 145, Issues: 1489

candle

Minimalist ML framework for Rust

Language: Rust, License: Apache-2.0, Stargazers: 14744, Issues: 147, Issues: 639

mage-ai

🧙 Build, run, and manage data pipelines for integrating and transforming data.

Language: Python, License: Apache-2.0, Stargazers: 7528, Issues: 60, Issues: 758

bandit

Bandit is a tool designed to find common security issues in Python code.

Language: Python, License: Apache-2.0, Stargazers: 6198, Issues: 64, Issues: 638

OpenMetadata

OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.

Language: TypeScript, License: Apache-2.0, Stargazers: 4935, Issues: 46, Issues: 6834

mountpoint-s3

A simple, high-throughput file client for mounting an Amazon S3 bucket as a local file system.

Language: Rust, License: Apache-2.0, Stargazers: 4304, Issues: 45, Issues: 301

YouPlot

A command-line tool that draws plots in the terminal.

Language: Ruby, License: MIT, Stargazers: 4071, Issues: 22, Issues: 27

Awesome-LLMOps

An awesome & curated list of best LLMOps tools for developers

Language: Shell, License: CC0-1.0, Stargazers: 3522, Issues: 61, Issues: 8

flit

Simplified packaging of Python modules

Language: Python, License: BSD-3-Clause, Stargazers: 2143, Issues: 34, Issues: 395

setup-python

Set up your GitHub Actions workflow with a specific version of Python

Language: TypeScript, License: MIT, Stargazers: 1613, Issues: 39, Issues: 554

data-engineering-wiki

The best place to learn data engineering. Built and maintained by the data engineering community.

Language: CSS, License: CC0-1.0, Stargazers: 1255, Issues: 27, Issues: 26

MOSS-RLHF

MOSS-RLHF

Language: Python, License: Apache-2.0, Stargazers: 1240, Issues: 34, Issues: 51

streamsync

No-code in the front, Python in the back. An open-source framework for creating data apps.

Language: Vue, License: Apache-2.0, Stargazers: 1111, Issues: 23, Issues: 141

GPyOpt

Gaussian Process Optimization using GPy

Language: Jupyter Notebook, License: BSD-3-Clause, Stargazers: 927, Issues: 44, Issues: 291

japanese-addresses

Open data set of address data for all of Japan at the town and chōme level (277,191 entries)

Language: JavaScript, License: MIT, Stargazers: 700, Issues: 20, Issues: 74

Language: Jupyter Notebook, License: Apache-2.0, Stargazers: 600, Issues: 10, Issues: 1

Long-Context

This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and benchmark tasks that evaluate a model’s information retrieval capabilities with context expansion. We also include key experimental results and instructions for reproducing and building on them.

Language: Python, License: Apache-2.0, Stargazers: 571, Issues: 13, Issues: 6

matplotlib-venn

Area-weighted venn-diagrams for Python/matplotlib

Language: Jupyter Notebook, License: MIT, Stargazers: 500, Issues: 11, Issues: 68

MetaGym

Collection of Reinforcement Learning / Meta Reinforcement Learning Environments.

Language: Python, License: Apache-2.0, Stargazers: 273, Issues: 10, Issues: 29

sveltris

Piece together any framework with Svelte (like Tetris)

wasminspect

An interactive debugger for WebAssembly

Language: Rust, License: MIT, Stargazers: 135, Issues: 7, Issues: 6

datavault4dbt

Scalefree's dbt package for a Data Vault 2.0 implementation congruent to the original Data Vault 2.0 definition by Dan Linstedt including the Staging Area, DV2.0 main entities, PITs and Snapshot Tables.

Language: PLSQL, License: Apache-2.0, Stargazers: 129, Issues: 19, Issues: 56

HojiChar

The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.

Language: Python, License: Apache-2.0, Stargazers: 112, Issues: 4, Issues: 2

pams

PAMS: Platform for Artificial Market Simulations

Language: Python, License: EPL-1.0, Stargazers: 39, Issues: 4, Issues: 27

timesy

A social technical note application for developers

Language: CSS, License: MIT, Stargazers: 36, Issues: 1, Issues: 20

QAmeleon

QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning PaLM with only five examples per language. We use the synthetic data to finetune downstream QA models leading to improved accuracy in comparison to English-only and translation-based baselines.

instruction_ja

Japanese instruction data

Language: Python, License: MIT, Stargazers: 22, Issues: 3, Issues: 0