winningsix / BDTK

A modular acceleration toolkit for big data analytic engines

Home Page: https://intel.github.io/BDTK/


Introduction

Big Data Analytic Toolkit (BDTK) is a set of acceleration libraries that aim to optimize big data analytic frameworks.

The following diagram shows the design architecture. (Figure: BDTK-INTRODUCTION)

Major components of the project include:

  • Cider:

    A modular, general-purpose Just-In-Time (JIT) compiler for data analytic query engines. It uses Substrait as a protocol, which allows it to support multiple front-end engines. It currently provides an LLVM-based implementation based on HeavyDB.

  • Velox Plugin:

    A bridge that enables Big Data Analytic Toolkit on Velox. It introduces a hybrid execution mode that combines compilation with vectorization (which already exists in Velox). It works as a Velox plugin, without changes to Velox code.

  • Intel Codec Library:

    Intel Codec Library for BigData provides a compression and decompression library for Apache Hadoop/Spark that makes use of acceleration hardware for compression and decompression.
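To make the idea of a hybrid execution mode more concrete, here is a minimal conceptual sketch in C++. It is not taken from BDTK or Velox: `ExecMode`, `PlanNode`, `chooseMode`, and `planModes` are hypothetical names, and the assumption that the choice is made per operator from a known set of JIT-supported operators is purely illustrative. The sketch only shows one way a plan could be split between a compiled path and a vectorized fallback.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical execution paths; illustrative names, not BDTK or Velox APIs.
enum class ExecMode { Compiled, Vectorized };

// Hypothetical plan node that carries only an operator name.
struct PlanNode {
  std::string op;  // e.g. "filter", "project"
};

// Assumption: the set of operators the JIT path can handle is known up front.
// Operators in that set go to the compiled path; the rest fall back to the
// vectorized path.
inline ExecMode chooseMode(const PlanNode& node,
                           const std::unordered_set<std::string>& jitSupported) {
  return jitSupported.count(node.op) > 0 ? ExecMode::Compiled
                                         : ExecMode::Vectorized;
}

// Assigns an execution mode to every node of a linear plan, in order.
inline std::vector<ExecMode> planModes(
    const std::vector<PlanNode>& plan,
    const std::unordered_set<std::string>& jitSupported) {
  std::vector<ExecMode> modes;
  modes.reserve(plan.size());
  for (const auto& node : plan) {
    modes.push_back(chooseMode(node, jitSupported));
  }
  return modes;
}
```

With a supported set of `{"filter", "project"}` and a plan of `filter`, `window`, `project`, this sketch would mark the first and last nodes as `Compiled` and the middle one as `Vectorized`. How the actual plugin decides between the two paths is described in the project documentation, not here.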

Cider & Velox Plugin

Major API Example

How to build

How to Enable in Presto

Code Of Conduct

Online Documentation

You can find all the Big Data Analytic Toolkit documents on the project web page.

License

Big Data Analytic Toolkit is licensed under the Apache 2.0 License. A copy of the license can be found here.

Languages

C++ 95.9%, CMake 1.6%, Python 1.2%, C 0.7%, Shell 0.4%, Makefile 0.1%, Dockerfile 0.1%, Batchfile 0.0%