Lirong Jian (jianlirong)

Company: @HashDataInc

Location: Beijing, China

Lirong Jian's repositories

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python | License: MIT | Stargazers: 1 | Issues: 0

tigris

Tigris is a modern, scalable backend for building real-time websites and apps.

Language: Go | License: Apache-2.0 | Stargazers: 1 | Issues: 0

alpa

Auto parallelization for large-scale neural networks

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

antlr4

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

Language: Java | License: NOASSERTION | Stargazers: 0 | Issues: 0

arrow

Apache Arrow is a columnar in-memory analytics layer designed to accelerate big data. It houses a set of canonical in-memory representations of flat and hierarchical data along with multiple language-bindings for structure manipulation. It also provides IPC and common algorithm implementations.

Language: C++ | License: Apache-2.0 | Stargazers: 0 | Issues: 0

BQconvert

BigQuery Schema Conversion Tool

License: NOASSERTION | Stargazers: 0 | Issues: 0

c-store

C-Store: A column-oriented DBMS prototype (frozen)

Stargazers: 0 | Issues: 0

ClickBench

ClickBench: a Benchmark For Analytical Databases

License: NOASSERTION | Stargazers: 0 | Issues: 0

cylon

Cylon is a fast, scalable distributed memory data parallel library for processing structured data

License: Apache-2.0 | Stargazers: 0 | Issues: 0

diagrams

Diagram as Code for prototyping cloud system architectures

License: MIT | Stargazers: 0 | Issues: 0

dsb

The DSB benchmark is designed for evaluating both workload-driven and traditional database systems on modern decision support workloads. DSB is adapted from the widely used industrial-standard TPC-DS benchmark. It enhances the TPC-DS benchmark with complex data distributions and challenging yet semantically meaningful query templates. DSB also introduces configurable and dynamic workloads to assess the adaptability of database systems. Since workload-driven and traditional database systems have different performance dimensions, including the additional resources required for tuning and maintaining the systems, we provide guidelines on evaluation methodology and metrics to report.

License: MIT | Stargazers: 0 | Issues: 0

juicefs

A distributed POSIX file system built on top of Redis and S3.

License: AGPL-3.0 | Stargazers: 0 | Issues: 0

Jungle

An embedded key-value store library specialized for building state machines and log stores

License: Apache-2.0 | Stargazers: 0 | Issues: 0

llama2_aided_tesseract

Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections, complete with options for text validation and hallucination filtering.

Stargazers: 0 | Issues: 0

llmperf

LLMPerf is a library for validating and benchmarking LLMs

License: Apache-2.0 | Stargazers: 0 | Issues: 0

lux

Automatically visualize your pandas dataframe via a single print! 📊 💡

License: Apache-2.0 | Stargazers: 0 | Issues: 0

magika

Detect file content types with deep learning

License: Apache-2.0 | Stargazers: 0 | Issues: 0

MediaCrawler

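Crawlers for Xiaohongshu notes and comments, Douyin videos and comments, Kuaishou videos and comments, Bilibili videos and comments, and Weibo posts and comments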

License: Apache-2.0 | Stargazers: 0 | Issues: 0

modin

Modin: Speed up your Pandas workflows by changing a single line of code

License: Apache-2.0 | Stargazers: 0 | Issues: 0

neon

Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, branching, and bottomless storage.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

OpenLineage

An Open Standard for lineage metadata collection

License: Apache-2.0 | Stargazers: 0 | Issues: 0

orioledb

OrioleDB – building a modern cloud-native storage engine (... and solving some PostgreSQL wicked problems)

License: NOASSERTION | Stargazers: 0 | Issues: 0

proton

A unified streaming and historical data processing engine in one single binary, powered by ClickHouse

License: Apache-2.0 | Stargazers: 0 | Issues: 0

queryparser

Parsing and analysis of Vertica, Hive, and Presto SQL.

License: MIT | Stargazers: 0 | Issues: 0

sqlancer

Detecting Logic Bugs in DBMS

License: MIT | Stargazers: 0 | Issues: 0

sqlsmith

A random SQL query generator

License: GPL-3.0 | Stargazers: 0 | Issues: 0

system-design-resources

These are the best resources for System Design on the Internet

License: GPL-3.0 | Stargazers: 0 | Issues: 0

timely-dataflow

A modular implementation of timely dataflow in Rust

License: MIT | Stargazers: 0 | Issues: 0

ucx

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)

License: NOASSERTION | Stargazers: 0 | Issues: 0

velox

A new C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.

License: Apache-2.0 | Stargazers: 0 | Issues: 0