Nurran / byzer-lang

Byzer (formerly MLSQL): A low-code, open-source programming language for data pipelines, analytics and AI.

Home Page: https://www.byzer.org

Byzer-Lang

Byzer (formerly MLSQL) is a low-code, open-source, distributed programming language for data pipelines, analytics and AI, designed to run in a cloud-native way.

Design principle: Everything is a table. Byzer is a SQL-like language that simplifies data pipelines, analytics and AI by combining SQL with built-in algorithms and extensions.

We believe that when everything is a table, a simple yet powerful SQL-like language can significantly reduce the human effort of data development, without switching between different tools.

Byzer Architecture

Byzer-lang Arch

Byzer Code Example

load hive.`raw.stripe_discounts` as discounts;
load hive.`raw.stripe_invoice_items` as invoice_items;

select
        invoice_items.*,
        case
            when discounts.discount_type = 'percent'
                then amount * (1.0 - cast(discounts.discount_value as float) / 100)
            else amount - discounts.discount_value
        end as discounted_amount

    from invoice_items

    left outer join discounts
        on invoice_items.customer_id = discounts.customer_id
        and invoice_items.invoice_date > discounts.discount_start
        and (invoice_items.invoice_date < discounts.discount_end
             or discounts.discount_end is null)
as joined;



select

        id,
        invoice_id,
        customer_id,
        coalesce(discounted_amount, amount) as discounted_amount,
        currency,
        description,
        created_at,
        deleted_at

    from joined
as final;



set allColumns = "all,wow";

!if ''' split(:allColumns,",")[0] == "all" ''';
   select * from final as final2;
!else;
   select id,invoice_id from final as final2;
!fi;

select * from final2 as output;

Official WebSite

https://www.byzer.org

Notebook Support

byzer-notebook

VSCode Extension (macOS, Linux, Windows)

VSCode IDE Extension

More documentation about the byzer-lang VSCode extension (Chinese version)

Docker Sandbox (With Notebook)

export MYSQL_PASSWORD=${1:-root}
export SPARK_VERSION=${SPARK_VERSION:-3.1.1}
export MLSQL_VERSION=${MLSQL_VERSION:-2.2.0-SNAPSHOT}

docker run -d \
-p 3306:3306 \
-p 9002:9002 \
-p 9003:9003 \
-e MYSQL_ROOT_HOST=% \
-e MYSQL_ROOT_PASSWORD="${MYSQL_PASSWORD}" \
--name mlsql-sandbox-${SPARK_VERSION}-${MLSQL_VERSION} \
mlsql-sandbox:${SPARK_VERSION}-${MLSQL_VERSION}

Then you can visit http://127.0.0.1:9002.
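The container may need some time before the notebook answers on port 9002. As a minimal sketch, assuming `curl` is installed, a small polling helper can wait for the URL to respond. The function name `wait_for_url` and its retry count are hypothetical and not part of the sandbox image.

```shell
# Hypothetical helper (not shipped with the sandbox): poll a URL until it answers.
# Usage: wait_for_url <url> [attempts]   (one attempt per second, default 30)
wait_for_url() {
  url="$1"
  attempts="${2:-30}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s silences progress output, -o /dev/null discards the body,
    # --max-time 2 bounds each attempt to two seconds.
    if curl -s -o /dev/null --max-time 2 "$url"; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "gave up waiting for $url" >&2
  return 1
}
```

After `docker run`, calling `wait_for_url http://127.0.0.1:9002` returns 0 as soon as any HTTP response arrives and 1 if the attempts run out.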

Download Byzer

  • The latest stable version is release-2.2.0
  • You can download it from the Byzer Download Website
  • Spark 2.4.3 and 3.1.1 have been tested

Naming Convention

mlsql-engine_${spark_major_version}-${mlsql_version}.tgz

## Pre-built for Spark 2.4.3
mlsql-engine_2.4-2.1.0.tar.gz

## Pre-built for Spark 3.1.1
mlsql-engine_3.0-2.1.0.tar.gz
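The naming pattern can be sketched as a small shell function. This is only an illustration: the function name `engine_tarball_name` is hypothetical, the mapping of Spark 2.4.x to `2.4` and Spark 3.x to `3.0` is inferred from the two examples above, and the `.tar.gz` suffix follows those examples rather than the `.tgz` shown in the pattern line.

```shell
# Hypothetical helper: derive the pre-built tarball name for a Spark version.
# Assumption: Spark 2.4.x maps to "2.4" and Spark 3.x maps to "3.0",
# as suggested by the two pre-built examples in this README.
engine_tarball_name() {
  spark_version="$1"   # e.g. 2.4.3 or 3.1.1
  mlsql_version="$2"   # e.g. 2.1.0
  case "$spark_version" in
    2.4*) spark_major="2.4" ;;
    3.*)  spark_major="3.0" ;;
    *)    echo "untested Spark version: $spark_version" >&2; return 1 ;;
  esac
  echo "mlsql-engine_${spark_major}-${mlsql_version}.tar.gz"
}

engine_tarball_name 3.1.1 2.1.0   # prints mlsql-engine_3.0-2.1.0.tar.gz
```

Rejecting unknown Spark versions instead of guessing a prefix keeps the helper in line with the versions listed as tested.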

Building a Distribution

Prerequisites

  • JDK 8+
  • Maven
  • Linux or MacOS

Downloading Source Code

## Clone the code base
git clone https://github.com/byzer-org/byzer-lang.git
cd byzer-lang

Building Spark 2.4.3 Bundle

export MLSQL_SPARK_VERSION=2.4
./dev/make-distribution.sh

Building Spark 3.1.1 Bundle

export MLSQL_SPARK_VERSION=3.0
./dev/make-distribution.sh

Building without Chinese Analyzer

## Chinese analyzer is enabled by default.
export ENABLE_CHINESE_ANALYZER=false
./dev/make-distribution.sh <spark_version>

Deploying

  1. Download or build a distribution.
  2. Install Spark and set the environment variable SPARK_HOME. Make sure the Spark version matches that of MLSQL.
  3. Deploy the tgz:
     • Set the environment variable MLSQL_HOME
     • Copy the distribution tarball over and untar it
  4. Start Byzer in local mode:

cd $MLSQL_HOME
## Run process in background
nohup ./bin/start-local.sh > ./local_mlsql.log 2>&1 &

  5. Open a browser and go to http://localhost:9003. Have fun.
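Before starting Byzer, it may help to confirm that the environment described in the steps above is in place. The following is a minimal pre-flight sketch. The function name `check_deploy_env` is hypothetical and not part of the distribution; it relies only on the `SPARK_HOME` and `MLSQL_HOME` variables and the `bin/start-local.sh` script mentioned in this README.

```shell
# Hypothetical pre-flight check for the deployment steps (not part of the distribution).
check_deploy_env() {
  if [ -z "${SPARK_HOME:-}" ] || [ ! -d "$SPARK_HOME" ]; then
    echo "SPARK_HOME is not set or is not a directory" >&2
    return 1
  fi
  if [ -z "${MLSQL_HOME:-}" ] || [ ! -d "$MLSQL_HOME" ]; then
    echo "MLSQL_HOME is not set or is not a directory" >&2
    return 1
  fi
  # The start script is expected under bin/ of the untarred distribution.
  if [ ! -x "$MLSQL_HOME/bin/start-local.sh" ]; then
    echo "start-local.sh not found or not executable under $MLSQL_HOME/bin" >&2
    return 1
  fi
  echo "environment looks ready"
}
```

It can be chained in front of the start command, for example `check_deploy_env && cd "$MLSQL_HOME"`, so a missing variable is reported before anything is launched. The check does not verify that the Spark version matches the distribution.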

Directory structure

|-- mlsql
    |-- bin        
    |-- conf       
    |-- data       
    |-- examples   
    |-- libs       
    |-- README.md  
    |-- LICENSE
    |-- RELEASE

How to contribute to Byzer-Lang

If you are planning to contribute to this repository, please create an issue on our Issues page first, even if the topic is not related to the source code itself (e.g., documentation, new ideas and proposals).

This is an active open source project for everyone, and we are always open to people who want to use this system or contribute to it.

For more details about how to contribute to the Byzer Org, please refer to How to Contribute.

Contributors

Made with contrib.rocks.

WeChat Group

Scan the QR code to add the K小助 WeChat account. Once the contact has been added, send the five letters mlsql to join the group.

About

License: Apache License 2.0

Languages

  • JavaScript 53.7%
  • Scala 14.7%
  • CSS 12.4%
  • HTML 10.2%
  • Less 5.4%
  • Java 1.9%
  • Python 1.4%
  • Shell 0.3%
  • Roff 0.0%
  • Dockerfile 0.0%
  • ANTLR 0.0%
  • Batchfile 0.0%