Beast code in Giters

Lirong Jian's repositories

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python | License: MIT | 100

tigris

Tigris is a modern, scalable backend for building real-time websites and apps.

Language: Go | License: Apache-2.0 | 100

alpa

Auto parallelization for large-scale neural networks

Language: Python | License: Apache-2.0 | 000

antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

Language: Java | License: NOASSERTION | 000

Apache Arrow is a columnar in-memory analytics layer designed to accelerate big data. It houses a set of canonical in-memory representations of flat and hierarchical data along with multiple language-bindings for structure manipulation. It also provides IPC and common algorithm implementations.

Language: C++ | License: Apache-2.0 | 000

BQconvert

BigQuery Schema Conversion Tool

License: NOASSERTION | 000

c-store

C-Store: A column-oriented DBMS prototype (frozen)

000

ClickBench

ClickBench: a Benchmark For Analytical Databases

License: NOASSERTION | 000

cylon

Cylon is a fast, scalable, distributed-memory, data-parallel library for processing structured data.

License: Apache-2.0 | 000

diagrams

Diagram as Code for prototyping cloud system architectures

License: MIT | 000

dsb

The DSB benchmark is designed for evaluating both workload-driven and traditional database systems on modern decision support workloads. DSB is adapted from the widely-used industrial-standard TPC-DS benchmark. It enhances the TPC-DS benchmark with complex data distribution and challenging yet semantically meaningful query templates. DSB also introduces configurable and dynamic workloads to assess the adaptability of database systems. Since workload-driven and traditional database systems have different performance dimensions, including the additional resources required for tuning and maintaining the systems, we provide guidelines on evaluation methodology and metrics to report.

License: MIT | 000

juicefs

A distributed POSIX file system built on top of Redis and S3.

License: AGPL-3.0 | 000

Jungle

An embedded key-value store library specialized for building state machines and log stores

License: Apache-2.0 | 000

llama2_aided_tesseract

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections, complete with options for text validation and hallucination filtering.

000