amihajlovic / hydra

A modern, open source replacement for enterprise data warehouses

Hydra - the open source data warehouse

The world’s fastest Postgres for analytics

Hydra is a modern, open source replacement for enterprise data warehouses. It’s fast and feature-rich so devs can build better analytics, quicker.

Hydra adds an open source columnar engine to Postgres, delivering 23X faster queries, better cache hit rates, and better scalability than basic Postgres. Compared to traditional warehouses, Hydra delivers 1500X more throughput to enable realtime analytical workloads.

💪 Benchmarks

Results in seconds, smaller is better.
Review ClickBench for comprehensive results and the list of 42 queries tested.

This benchmark represents typical workloads in the following areas: clickstream and traffic analysis, web analytics, machine-generated data, structured logs, and events data. It covers the typical queries in ad-hoc analytics and real-time dashboards.

Transactions / Second (TPS)

Hydra delivers 1500X more throughput than traditional warehouses to enable realtime analytical workloads. This is accomplished with transactional heap tables.

        Hydra   Redshift
TPS     21988   15
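
As a quick sanity check on the headline figure, the ratio of the two TPS values in the table can be computed directly. This is only arithmetic on the numbers quoted above; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# TPS figures copied from the table above.
hydra_tps=21988
redshift_tps=15

# awk does the floating-point division; printf rounds to the nearest integer.
ratio=$(awk -v h="$hydra_tps" -v r="$redshift_tps" 'BEGIN { printf "%.0f", h / r }')

echo "Hydra sustains roughly ${ratio}x the TPS of Redshift in this test"
```

The ratio comes out at about 1466, which the text rounds to 1500X.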

View detailed results

🚀 Quick Start

Run Hydra locally

The Hydra Docker image is a drop-in replacement for the postgres Docker image.
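
A minimal sketch of what "drop-in" can mean in practice: the container is started with the same environment variable and port mapping one would use for the stock postgres image. The image name `ghcr.io/hydradatabase/hydra`, the container name `hydra-local`, and the helper `run_hydra` are assumptions for illustration; check the project's documentation for the current image name and tag.

```shell
#!/usr/bin/env bash
# Assumption: the Hydra image is published under this name; override if not.
HYDRA_IMAGE="${HYDRA_IMAGE:-ghcr.io/hydradatabase/hydra:latest}"

# Start a container the same way as the stock postgres image:
# POSTGRES_PASSWORD sets the superuser password, 5432 is the server port.
# The default password matches the connection string used in this README.
run_hydra() {
  docker run --rm -d \
    --name hydra-local \
    -e POSTGRES_PASSWORD="${POSTGRES_PASSWORD:-hydra}" \
    -p 127.0.0.1:5432:5432 \
    "$HYDRA_IMAGE"
}

# Uncomment to start the container:
# run_hydra
```

Binding the port to 127.0.0.1 keeps the test instance reachable only from the local machine.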

You can also try out Hydra locally using Docker Compose.

git clone https://github.com/hydradatabase/hydra && cd hydra
cp .env.example .env
docker compose up
psql postgres://postgres:hydra@127.0.0.1:5432

Use Hydra Cloud

Hydra Cloud is the fastest and most reliable way to run Hydra. It is a cloud-based data warehouse that allows you to consolidate data from various sources into a single, unified system. It provides a user-friendly interface for automated data ingestion and transformation.

Hydra Cloud provides a scalable and secure cloud environment where automatic backups, resource scaling, high availability, point-in-time recovery, and more are available instantly with every new database.

Sign up for Hydra Cloud and get a free, managed database.

🎁 Features

🐘 hosted postgres database - docs
📊 columnar store with updates and deletes - docs
🔀 query parallelization
🔍 vectorized execution of WHERE clauses
🌐 external tables - docs

Read documentation on using Hydra’s columnar table access method.
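
A minimal sketch of what using the columnar access method can look like, assuming the local instance from the Quick Start is running and that the access method is named `columnar`. The table names, column names, and the helpers `columnar_demo_sql` and `run_columnar_demo` are hypothetical.

```shell
#!/usr/bin/env bash
# Connection string from the Quick Start section.
HYDRA_URI="${HYDRA_URI:-postgres://postgres:hydra@127.0.0.1:5432}"

# Emit illustrative SQL. Table and column names are hypothetical.
columnar_demo_sql() {
  cat <<'SQL'
-- Analytical table stored column by column.
CREATE TABLE page_views (
    viewed_at  timestamptz NOT NULL,
    user_id    bigint      NOT NULL,
    url        text        NOT NULL
) USING columnar;

-- Transactional table stored row by row (standard Postgres heap).
CREATE TABLE accounts (
    id    bigint PRIMARY KEY,
    email text NOT NULL
) USING heap;

INSERT INTO page_views VALUES (now(), 1, '/home'), (now(), 2, '/pricing');

-- Updates and deletes are listed as supported on columnar tables.
UPDATE page_views SET url = '/' WHERE url = '/home';
DELETE FROM page_views WHERE user_id = 2;

-- A typical analytical query: aggregate over a single column.
SELECT url, count(*) AS views FROM page_views GROUP BY url;
SQL
}

# Pipe the statements into psql, stopping at the first error.
run_columnar_demo() {
  columnar_demo_sql | psql "$HYDRA_URI" -v ON_ERROR_STOP=1
}

# Uncomment to run against a live instance:
# run_columnar_demo
```

Keeping wide, scan-heavy tables columnar and small, frequently updated tables on heap is one reasonable split; the documentation covers the trade-offs in detail.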

🤝 Community and Status

DEVELOPERS.md for contributing and building the image.
Discord discussion with the Community and Hydra team
GitHub Discussions for longer topics
GitHub Issues for bugs and missing features
Blog for latest announcements, tutorials, product updates
@hydradatabase for the tweets, memes, and social posts
Docs for Hydra features and warehouse ops

  • Private Alpha: Limited to select design partners
  • Public Beta: Talk with Hydra team to learn more
  • Hydra 1.0 Release: Generally Available (GA) and ready for production use

Coming Soon

Watch releases of this repo to get notified of updates.

  • 🧹 vacuum stripe optimizations and space reclamation
  • 🏎️ vectorized execution of aggregate functions
  • 🚅 use of SIMD in vectorized execution
  • ↔️ separation of compute and storage

πŸ“ License

Hydra is only possible by building on the shoulders of giants.

The code in this repo is licensed under:

The Docker image is built on the Postgres Docker image, which contains a large number of open source projects, including:

  • Postgres - the Postgres license
  • Debian or Alpine Linux image, depending on the image used
  • Hydra includes the following additional software in the image:
    • multicorn - BSD license
    • mysql_fdw - MIT-style license
    • parquet_s3_fdw - MIT-style license
    • pgsql-http - MIT license

As for any pre-built image usage, it is the image user's responsibility to ensure that any use of this image complies with any relevant licenses for all software contained within.

License: Apache License 2.0


Languages

C 74.3%, PLpgSQL 14.0%, Python 4.8%, Go 3.1%, Makefile 1.2%, Shell 1.0%, HCL 0.9%, M4 0.5%, Dockerfile 0.2%, Ruby 0.1%