aiical / proton

A unified streaming and historical data analytics database in a single binary, powered by ClickHouse.

Home Page: https://timeplus.com

Proton – open source, unified streaming and data processing engine for real-time analytics

Introduction · Architecture · Get Started · What's next · Documentation · Contributing · Need help?

Introduction

Proton is a unified streaming and historical data analytics database in a single binary. It helps data engineers and platform engineers solve complex real-time analytics use cases, and powers the Timeplus streaming analytics platform.

Proton extends the historical data, storage, and computing functionality of the popular ClickHouse project with streaming and OLAP data processing.

Why use Proton?

  • A unified, lightweight engine to connect streaming and historical data processing tasks with efficiency and robust performance.
  • A smooth developer experience with powerful streaming and analytical functionality.
  • Flexible deployments with Proton's single binary and no external service dependencies.
  • Low total cost of ownership compared to other analytical frameworks.

Proton also has built-in support for the following streaming and analytical functionality:

| Functionality | Description |
| --- | --- |
| Data transformation | Scrub sensitive fields, derive new columns from raw data, or convert identifiers to human-readable information. |
| Joining streams | Combine data from different sources to add freshness to the resulting stream. |
| Aggregating streams | Developer-friendly functions to derive insights from streaming and historical data. |
| Windowed stream processing (tumble / hop / session) | Collect insightful snapshots of streaming data. |
| Substreams | Maintain separate watermarks and streaming windows. |
| Data revision processing (changelog) | Create and manage non-append streams with primary keys and change data capture (CDC) semantics. |
| Federated streaming queries | Query streaming data in external systems (e.g. Kafka) without duplicating it. |
| Materialized views | Create long-running and internally-stored queries. |

Architecture

See our architecture doc for technical details and the FAQ for more information on the various editions of Proton, how it's related to ClickHouse, and why we chose Apache License 2.0.

Get started

Single Binary

If you’re an Apache Kafka or Redpanda user, you can install Proton as a single binary via:

curl -sSf https://raw.githubusercontent.com/timeplus-io/proton/develop/install.sh | sh

This installs the Proton binary in the current folder. Start the server with proton server start, then open a new terminal window and run proton client to start the SQL shell.

Mac users can also use Homebrew to install, upgrade, and uninstall Proton:

brew tap timeplus-io/timeplus
brew install proton

Next, create an external stream in Proton with SQL to consume data from your Kafka or Redpanda cluster. Follow this tutorial for SQL snippets.
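
As a rough illustration of what such an external stream can look like, here is a minimal sketch. The stream name, broker address, and topic name are hypothetical placeholders, not values taken from the tutorial, so replace them with the ones for your own cluster.

```sql
-- Hypothetical stream name, broker address, and topic: replace with your own.
-- Each Kafka message is exposed as a single string column named `raw`.
CREATE EXTERNAL STREAM frontend_events(raw string)
SETTINGS type='kafka',
         brokers='localhost:9092',
         topic='frontend-events';

-- Streaming query: prints new messages as they arrive in the topic.
SELECT raw FROM frontend_events;
```

The external stream only points at the topic, so the data stays in Kafka or Redpanda and is read when a query runs against the stream.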

Docker Compose

If you don’t want to set up Kafka or Redpanda, you can use the docker-compose.yml file in examples/carsharing. Download the file to a local folder. Make sure you have Docker Engine or Docker Desktop installed. Use docker compose up to start the demonstration stack.

Next, you can open the shell of the Proton container and run your first streaming SQL. To print out the new data being generated, you can run the following sample SQL:

select * from car_live_data

To get the total number of events in the historical store, you can run the following SQL:

select count() from table(car_live_data)

To show the running number of events, emitted at regular intervals (every 2 seconds by default), you can run:

select count() from car_live_data

Congratulations! You have successfully installed Proton and run queries for both historical and streaming analytics.

Docker

With Docker Engine installed on your local machine, pull and run the latest version of the Proton Docker image.

docker run -d --pull always --name proton ghcr.io/timeplus-io/proton:latest

Connect to your proton container and run the proton-client tool to connect to the local Proton server:

docker exec -it proton proton-client -n

If you stop the container and want to start it again, run docker start proton.

Query a test stream

From proton-client, run the following SQL to create a stream of random data:

-- Create a stream with random data.
CREATE RANDOM STREAM devices(device string default 'device'||to_string(rand()%4), temperature float default rand()%1000/10);

-- Run the long-running stream query.
SELECT device, count(*), min(temperature), max(temperature) FROM devices GROUP BY device;

You should see data like the following:

┌─device──┬─count()─┬─min(temperature)─┬─max(temperature)─┐
│ device0 │    2256 │                0 │             99.6 │
│ device1 │    2260 │              0.1 │             99.7 │
│ device3 │    2259 │              0.3 │             99.9 │
│ device2 │    2225 │              0.2 │             99.8 │
└─────────┴─────────┴──────────────────┴──────────────────┘
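
The windowed processing and materialized views listed in the feature table can be tried on the same devices stream. The following is a minimal sketch rather than an official example: the 5-second window size and the view name device_stats_mv are illustrative assumptions.

```sql
-- Count events and track the peak temperature per device
-- in 5-second tumbling windows over the random `devices` stream.
SELECT window_start, device, count(*) AS events, max(temperature) AS max_temp
FROM tumble(devices, 5s)
GROUP BY window_start, device;

-- Keep the same aggregation running in the background as a
-- materialized view (the view name is hypothetical).
CREATE MATERIALIZED VIEW device_stats_mv AS
SELECT window_start, device, count(*) AS events, max(temperature) AS max_temp
FROM tumble(devices, 5s)
GROUP BY window_start, device;

-- Read the results the view has stored so far (historical query).
SELECT * FROM table(device_stats_mv);
```

The first statement is a streaming query and keeps running until you cancel it (for example with Ctrl+C), so run the statements one at a time.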

What's next?

Now that you're running Proton and have created your first stream and run your first streaming query, you can explore reading and writing data from Apache Kafka with External Streams, or view the Proton documentation to explore additional capabilities.

To see more examples of using Proton, check out the examples folder.

The following drivers are available:

Integrations with other systems:

Get more with Timeplus

To access more features, such as sources, sinks, dashboards, alerts, and data lineage, create a workspace at Timeplus Cloud or try the live demo with pre-built live data and dashboards.

Documentation

We publish full documentation for Proton at docs.timeplus.com alongside documentation for the Timeplus (Cloud and Enterprise) platform.

We also have an FAQ detailing how we chose Apache License 2.0, how Proton is related to ClickHouse, what features are available in Proton versus Timeplus, and more.

Contributing

We welcome your contributions! If you are looking for issues to work on, try looking at the issue list.

Please see the wiki for more details, and BUILD.md to compile Proton on different platforms.

We also encourage you to join the Timeplus Community Slack to ask questions and meet other active contributors from Timeplus and beyond.

Need help?

Join the Timeplus Community Slack to connect with Timeplus engineers and other Proton users.

For filing bugs, suggesting improvements, or requesting new features, see the open issues here on GitHub.

Licensing

Proton uses Apache License 2.0. See the LICENSE file for details.
