Rohit Rastogi (rohitrastogi)

Company: Truera

Location: San Francisco, CA

Rohit Rastogi's starred repositories

sqlmesh

Efficient data transformation and modeling framework that is backwards compatible with dbt.

Language: Python | License: Apache-2.0 | Stargazers: 1508 | Issues: 0

influxdb

Scalable datastore for metrics, events, and real-time analytics

Language: Rust | License: Apache-2.0 | Stargazers: 28281 | Issues: 0

iceberg-rust

Apache Iceberg

Language: Rust | License: Apache-2.0 | Stargazers: 520 | Issues: 0

iceberg-rust

Rust implementation of Apache Iceberg with integration for Datafusion

Language: Rust | License: Apache-2.0 | Stargazers: 77 | Issues: 0

roapi

Create full-fledged APIs for slowly moving datasets without writing a single line of code.

Language: Rust | License: Apache-2.0 | Stargazers: 3159 | Issues: 0

Scrapegraph-ai

Python scraper based on AI

Language: Python | License: MIT | Stargazers: 13353 | Issues: 0

firecrawl

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.

Language: TypeScript | License: AGPL-3.0 | Stargazers: 8039 | Issues: 0

substrait

A cross-platform way to express data transformation, relational algebra, standardized record expression and plans.

Language: Python | License: Apache-2.0 | Stargazers: 1106 | Issues: 0

spiceai

A self-hostable CDN for databases. Spice provides a unified SQL query interface and portable runtime to locally materialize, accelerate, and query datasets from any database, data warehouse, or data lake.

Language: Rust | License: Apache-2.0 | Stargazers: 1729 | Issues: 0

nimble

New file format for storage of large columnar datasets.

Language: C++ | License: Apache-2.0 | Stargazers: 398 | Issues: 0

prost-arrow

prost-arrow derives arrow array builders for protobuf messages generated by prost

Language: Rust | License: Apache-2.0 | Stargazers: 3 | Issues: 0

RemoteShuffleService

Remote shuffle service for Apache Spark to store shuffle data on remote servers.

Language: Java | License: NOASSERTION | Stargazers: 319 | Issues: 0

ray-sql

Distributed SQL Query Engine in Python using Ray

Language: Rust | License: Apache-2.0 | Stargazers: 219 | Issues: 0

incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.

Language: Scala | License: Apache-2.0 | Stargazers: 1082 | Issues: 0

s3xz-caching-solution

Reference Architecture to automate the use of S3 Express One Zone as a caching layer for S3 Regional Buckets.

Language: Python | License: MIT-0 | Stargazers: 8 | Issues: 0

chronon

Chronon is a data platform for serving AI/ML applications.

Language: Scala | License: Apache-2.0 | Stargazers: 675 | Issues: 0

Daft

Distributed DataFrame for Python designed for the cloud, powered by Rust

Language: Rust | License: Apache-2.0 | Stargazers: 1891 | Issues: 0

ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python | License: Apache-2.0 | Stargazers: 12373 | Issues: 0

joey

baby quokka

Language: Python | License: AGPL-3.0 | Stargazers: 3 | Issues: 0

parseable

Open-source Elasticsearch alternative. Parseable helps you search and get insights from your logs in the simplest way possible.

Language: Rust | License: AGPL-3.0 | Stargazers: 1793 | Issues: 0

incubator-xtable

Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

Language: Java | License: Apache-2.0 | Stargazers: 791 | Issues: 0

Hoptimator

Multi-hop declarative data pipelines

Language: Java | License: BSD-2-Clause | Stargazers: 81 | Issues: 0

superlinked

A compute framework for turning complex data into vectors. Build multimodal vectors with ease and define weights at query time so you don't need a custom reranking algorithm to optimise results. Go straight from notebook to production with the same SDK.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 356 | Issues: 0

dtm

A distributed transaction framework that supports the workflow, saga, TCC, XA, 2-phase message, and outbox patterns, and works with many languages.

Language: Go | License: BSD-3-Clause | Stargazers: 9925 | Issues: 0

dspy

DSPy: The framework for programming—not prompting—foundation models

Language: Python | License: MIT | Stargazers: 14625 | Issues: 0

perspective

A data visualization and analytics component, especially well-suited for large and/or streaming datasets.

Language: C++ | License: Apache-2.0 | Stargazers: 8009 | Issues: 0

chdb

chDB is an in-process OLAP SQL Engine 🚀 powered by ClickHouse

Language: C++ | License: Apache-2.0 | Stargazers: 1896 | Issues: 0

datafusion-comet

Apache DataFusion Comet Spark Accelerator

Language: Rust | License: Apache-2.0 | Stargazers: 678 | Issues: 0

datahub

The Metadata Platform for your Data Stack

Language: Java | License: Apache-2.0 | Stargazers: 9484 | Issues: 0

openhouse

Open Control Plane for Tables in Data Lakehouse

Language: Java | License: BSD-2-Clause | Stargazers: 274 | Issues: 0