scramjetorg / framework-js

Simple yet powerful live data computation framework.

Home Page: https://www.scramjet.org

Scramjet Framework TypeScript

⭐ Star us on GitHub — it motivates us a lot! 🚀

Scramjet Framework

Scramjet is a simple reactive stream programming framework. Code is written by chaining functions that transform the streamed data, including the well-known map, filter and reduce.

The main advantage of Scramjet is running asynchronous operations on your data streams concurrently. It allows you to perform transformations both synchronously and asynchronously using the same API, so you can "map" your stream from whatever source and call any number of APIs consecutively.
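To illustrate the idea of one API serving both kinds of callbacks, here is a minimal sketch in plain TypeScript. It is not Scramjet's implementation: `mapChunks`, `collect`, `syncLength` and `asyncLength` are hypothetical names, and this simplified version processes chunks one at a time rather than concurrently.

```typescript
// Hypothetical helpers for illustration only; they are not part of Scramjet.

// A callback may return a plain value or a Promise of one.
type Mapper = (chunk: string) => number | Promise<number>;

async function* mapChunks(
    source: Iterable<string> | AsyncIterable<string>,
    callback: Mapper
): AsyncGenerator<number> {
    for await (const chunk of source) {
        // Awaiting a plain value is harmless, so sync and async
        // callbacks share a single code path.
        yield await callback(chunk);
    }
}

// Drains an async iterable into an array.
async function collect(source: AsyncIterable<number>): Promise<number[]> {
    const result: number[] = [];

    for await (const chunk of source) {
        result.push(chunk);
    }

    return result;
}

// A synchronous and an asynchronous callback computing the same value.
const syncLength: Mapper = chunk => chunk.length;
const asyncLength: Mapper = async chunk => chunk.length;
```

Calling `collect(mapChunks(["foo", "quux"], syncLength))` and `collect(mapChunks(["foo", "quux"], asyncLength))` both resolve to the same array of lengths.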

This is a pre-release of the next major version (v5) of JavaScript Scramjet Framework.

We are open to your feedback! We encourage you to report issues with any ideas, suggestions and features you would like to see in this version. You can also upvote (+1) existing ones to show us the direction we should take in developing Scramjet Framework.

Not interested in JavaScript/TypeScript version? Check out Scramjet Framework in Python!

Installation

Simply run:

npm i @scramjet/framework

Then you can import it in your JS/TS code like this:

sample-file.ts

import { DataStream } from "@scramjet/framework";

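You can also use the nightly build as an npm dependency by referring to the nightly branch (the latest build) of this repository. Note that with the dependency key shown here, the package is imported as "scramjet" rather than "@scramjet/framework":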

package.json

{
    "dependencies": {
        "scramjet": "scramjetorg/framework-js#nightly"
    }
}

After adding Scramjet Framework as a dependency, install it via npm (or a similar tool):

npm i

You can also build Scramjet Framework yourself. Please refer to the Development Setup section for more details.

Usage

Scramjet streams look and behave much like native nodejs streams, and like streams in programming languages in general. They let you operate on streams of data (where each separate piece of data is called a chunk) and process it through transforms such as mapping or filtering.

Let's take a look at how to create and operate on Scramjet streams.

If you would like to dive deeper, please refer to the streams source files.

Creating Scramjet streams

The basic way to create Scramjet streams is the from() static method. It accepts iterables (both sync and async) and native nodejs streams. An iterable can be a simple array, a generator or anything else iterable:

import { DataStream } from "scramjet";

const stream = DataStream.from(["foo", "bar", "baz"]);

Scramjet streams are asynchronous iterables themselves, which means one stream can be created from another:

import { DataStream } from "scramjet";

const stream1 = DataStream.from(["foo", "bar", "baz"]);
const stream2 = DataStream.from(stream1);

They can also be created from native nodejs Readables:

import { createReadStream } from "fs";
import { DataStream } from "scramjet";

const stream = DataStream.from(createReadStream("path/to/file"));
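Why one stream can serve as the source of another becomes clearer with a simplified model. The sketch below is an assumption-laden illustration, not Scramjet's code: `TinyStream` and `letters` are hypothetical names, and only string chunks are handled. Anything usable in a `for await` loop is accepted by `from()`, and because the class implements `Symbol.asyncIterator` itself, its instances qualify as well.

```typescript
// "TinyStream" is a hypothetical, heavily simplified stand-in;
// it is not Scramjet's DataStream.
class TinyStream implements AsyncIterable<string> {
    private readonly source: AsyncIterable<string>;

    private constructor(source: AsyncIterable<string>) {
        this.source = source;
    }

    // Accepts anything "for await" can consume: arrays, generators,
    // async generators and other TinyStream instances.
    static from(input: Iterable<string> | AsyncIterable<string>): TinyStream {
        async function* wrap(): AsyncGenerator<string> {
            for await (const chunk of input) {
                yield chunk;
            }
        }

        return new TinyStream(wrap());
    }

    // Implementing this method is what makes an object an async iterable.
    [Symbol.asyncIterator](): AsyncIterator<string> {
        return this.source[Symbol.asyncIterator]();
    }

    // Drains the stream into an array.
    async toArray(): Promise<string[]> {
        const chunks: string[] = [];

        for await (const chunk of this) {
            chunks.push(chunk);
        }

        return chunks;
    }
}

// A plain synchronous generator used as a source.
function* letters(): Generator<string> {
    yield "a";
    yield "b";
}
```

With this model, `TinyStream.from(letters())` and `TinyStream.from(TinyStream.from(["foo"]))` are both valid. Native nodejs Readables also implement `Symbol.asyncIterator`, which is why the same kind of loop can consume them.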

A more "manual" approach is to create streams using the constructor:

import { DataStream } from "scramjet";

const stream = new DataStream();

This approach is useful when you need to write data to a stream manually or use it as a pipe destination:

import { DataStream } from "scramjet";

const stream = new DataStream();
stream.write("foo");

const stream2 = new DataStream();
stream.pipe(stream2);

Getting data from Scramjet streams

Just as there are methods for creating Scramjet streams, there are specific methods for getting data out of them. These are sometimes called sink methods, as they allow data to flow through and out of the stream. Because these methods need to wait for the stream to end, they return a Promise which must be awaited and which resolves when all data from the source has been processed.

import { DataStream } from "scramjet";

const stream1 = DataStream.from(["foo", "bar", "baz"]);
await stream1.toArray(); // ["foo", "bar", "baz"]

const stream2 = DataStream.from(["foo", "bar", "baz"]);
await stream2.toFile("path/to/file"); // Writes to a file, resolves when done.

const stream3 = DataStream.from(["foo", "bar", "baz"]);
await stream3.reduce(
    (prev, curr) => `${ prev }-${ curr }`,
    ""
); // "-foo-bar-baz" (the initial "" is joined with the first chunk)

As Scramjet streams are asynchronous iterables, they can also be iterated directly:

import { DataStream } from "scramjet";

const stream = DataStream.from(["foo", "bar", "baz"]);

for await (const chunk of stream) {
    console.log(chunk);
}
// Logs:
// "foo"
// "bar"
// "baz"

Similar to writing, there is also a more "manual" way of reading from streams, using the .read() method:

import { DataStream } from "scramjet";

const stream = DataStream.from(["foo", "bar", "baz"]);

await stream.read(); // "foo"
await stream.read(); // "bar"

The .read() method returns a Promise which waits until there is something ready to be read from the stream.
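The waiting behavior of a read can be pictured with a small queue. This is only a sketch under stated assumptions: `TinyQueue` is a hypothetical class, it handles string chunks only, and it ignores end-of-stream handling. It is not how Scramjet implements .read() internally.

```typescript
// "TinyQueue" is a hypothetical illustration, not Scramjet's internals.
class TinyQueue {
    // Chunks written but not yet read.
    private readonly chunks: string[] = [];
    // Resolvers of reads that are still waiting for data.
    private readonly waiting: Array<(chunk: string) => void> = [];

    write(chunk: string): void {
        const resolve = this.waiting.shift();

        if (resolve) {
            // A reader is already waiting: hand the chunk over directly.
            resolve(chunk);
        } else {
            // Nobody is reading yet: buffer the chunk.
            this.chunks.push(chunk);
        }
    }

    read(): Promise<string> {
        if (this.chunks.length > 0) {
            // Data is ready: resolve immediately.
            return Promise.resolve(this.chunks.shift() as string);
        }

        // No data yet: the promise stays pending until the next write().
        return new Promise<string>(resolve => {
            this.waiting.push(resolve);
        });
    }
}
```

A read issued before any write simply stays pending, and it resolves with the chunk passed to the next `write()` call.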

Basic operations

The whole idea of stream processing is the ability to quickly and efficiently transform data as it flows through the stream. Let's take a look at the basic operations (called transforms) and what they do:

Mapping

Mapping stream data is basically the same as mapping an array. It maps each chunk to a new value:

import { DataStream } from "scramjet";

DataStream
    .from(["foo", "bar", "baz"])
    .map(chunk => chunk.repeat(2))
    .toArray(); // ["foofoo", "barbar", "bazbaz"]

The result of the map transform can be of a different type than the initial chunks:

import { DataStream } from "scramjet";

DataStream
    .from(["foo", "bar", "baz"])
    .map(chunk => chunk.charCodeAt(0))
    .toArray(); // [102, 98, 98]

DataStream
    .from(["foo", "bar", "baz"])
    .map(chunk => chunk.split(""))
    .toArray(); // [["f", "o", "o"], ["b", "a", "r"], ["b", "a", "z"]]

Filtering

Filtering removes any unnecessary chunks from the stream:

import { DataStream } from "scramjet";

DataStream
    .from([1, 2, 3, 4, 5, 6])
    .filter(chunk => chunk % 2 === 0)
    .toArray(); // [2, 4, 6]

Grouping

Batching groups chunks into arrays, effectively changing the number of chunks flowing through the stream:

import { DataStream } from "scramjet";

DataStream
    .from([1, 2, 3, 4, 5, 6, 7, 8])
    .batch(chunk => chunk % 2 === 0)
    .toArray(); // [[1, 2], [3, 4], [5, 6], [7, 8]]

Whenever the callback function passed to .batch() returns true, a new group is emitted.
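The grouping rule can be modeled on a plain array. The `batchBy` function below is a hypothetical helper written for this illustration, not Scramjet code. How leftover chunks are treated when the input ends is an assumption here: the sketch emits them as a final, shorter group.

```typescript
// Hypothetical helper mirroring the described behavior on a plain array.
function batchBy(
    chunks: number[],
    callback: (chunk: number) => boolean
): number[][] {
    const groups: number[][] = [];
    let current: number[] = [];

    for (const chunk of chunks) {
        current.push(chunk);

        if (callback(chunk)) {
            // The callback returned true: close the current group
            // (including this chunk) and start a new one.
            groups.push(current);
            current = [];
        }
    }

    // Assumption: leftover chunks form a final, shorter group.
    if (current.length > 0) {
        groups.push(current);
    }

    return groups;
}
```

For example, `batchBy([1, 2, 3, 4, 5, 6, 7, 8], chunk => chunk % 2 === 0)` returns `[[1, 2], [3, 4], [5, 6], [7, 8]]`, because every even chunk closes the group it belongs to.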

Flattening

The opposite of batching is flattening. At the moment, Scramjet streams provide the .flatMap() method, which first maps chunks and then flattens the resulting arrays:

import { DataStream } from "scramjet";

DataStream
    .from(["foo", "bar", "baz"])
    .flatMap(chunk => chunk.split(""))
    .toArray(); // ["f", "o", "o", "b", "a", "r", "b", "a", "z"]

It can also be used just to flatten the stream, by providing a callback which simply passes values through:

import { DataStream } from "scramjet";

DataStream
    .from([1, 2, 3, 4, 5, 6, 7, 8])
    .batch(chunk => chunk % 2 === 0)
    .flatMap(chunk => chunk)
    .toArray(); // [1, 2, 3, 4, 5, 6, 7, 8]

Piping

Piping is essential for operating on streams. Scramjet streams can be used both as a pipe source and as a pipe destination. They can also be combined with native nodejs streams, with a native stream as either the pipe source or the destination.

import { DataStream } from "scramjet";

const stream1 = DataStream.from([1, 2, 3, 4, 5, 6, 7, 8]);
const stream2 = new DataStream();

stream1.pipe(stream2); // All data flowing through "stream1" will be passed to "stream2".

import { createReadStream } from "fs";
import { DataStream } from "scramjet";

const readStream = createReadStream("path/to/file");
const scramjetStream = new DataStream();

readStream.pipe(scramjetStream); // All file contents read by native nodejs stream will be passed to "scramjetStream".

import { createWriteStream } from "fs";
import { DataStream } from "scramjet";

const scramjetStream = DataStream.from([1, 2, 3, 4, 5, 6, 7, 8]);

scramjetStream.pipe(createWriteStream("path/to/file")); // All data flowing through "scramjetStream" will be written to a file via native nodejs stream.
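For readers less familiar with the native side of this interplay, the following sketch uses only nodejs built-ins (no Scramjet) to show what pipe() does between a source and a destination. The names `createCollector` and `pipeNumbers` are hypothetical and exist only for this illustration.

```typescript
import { once } from "events";
import { Readable, Writable } from "stream";

// Builds a destination that stores every chunk it receives;
// it stands in for any pipe destination.
function createCollector(received: number[]): Writable {
    return new Writable({
        // Accept arbitrary values, not only strings and Buffers.
        objectMode: true,
        write(chunk: number, _encoding, done) {
            received.push(chunk);
            // Signal that the chunk was handled so the next one can flow.
            done();
        }
    });
}

async function pipeNumbers(numbers: number[]): Promise<number[]> {
    const received: number[] = [];
    const destination = createCollector(received);

    // Readable.from() turns an iterable into a native readable stream.
    Readable.from(numbers).pipe(destination);

    // "finish" is emitted once the source has ended and all chunks were written.
    await once(destination, "finish");

    return received;
}
```

Here `pipeNumbers([1, 2, 3])` resolves with the chunks in the order they were written to the destination.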

Requesting Features

Anything missing? Or maybe there is something which would make using Scramjet Framework easier or more efficient? Don't hesitate to file a new feature request! We really appreciate all feedback.

Reporting Bugs

If you have found a bug or inconsistent or confusing behavior, please file a new bug report.

Contributing

You can contribute to this project by giving us feedback (reporting bugs and requesting features) and also by writing code yourself! We have some introductory issues labeled with good first issue which should be a perfect starter.

The easiest way is to create a fork of this repository and then create a pull request with all your changes. In most cases, you should branch from and target the main branch.

Please refer to the Development Setup section on how to set up this project.

Development Setup

Project setup

  1. Install nodejs (14.x). Refer to the official docs. Alternatively, you may use a Node version manager like nvm.

  2. Clone this repository:

git clone git@github.com:scramjetorg/framework-js.git

  3. Install project dependencies:

npm i

Commands

There are multiple npm commands available which help run tests, build the project and assist during development.

Running tests

npm run test

Runs all tests from test directory. It runs build internally so it doesn't have to be run manually.

npm run test:unit[:w]

Runs all unit tests (test/unit directory). It runs build internally so it doesn't have to be run manually. When run with :w it will watch for changes, then rebuild and rerun the tests automatically. To run unit tests without rebuilding the project use npm run test:run:unit.

npm run test:unit:d -- build/test/.../test.js [--host ...] [--port ...]

Runs the specified test file in debug mode. It runs build internally so it doesn't have to be run manually. This is the same as running npm run build && npx ava debug --break build/test/.../test.js [--host ...] [--port ...]. The test can then be inspected, e.g. via Chrome inspector by going to chrome://inspect.

npm run test:bdd

Runs all BDD tests (test/bdd directory). It runs build internally so it doesn't have to be run manually. To run BDD tests without rebuilding the project use npm run test:run:bdd.

Running single test file or specific tests

A single test file can be run by passing its path to the test command:

npm run test:unit -- build/test/ifca/common.spec.js

Specific test cases can be run using the -m (match) option:

npm run test:unit -- -m "*default*"

Both can be mixed to run specific tests from a given file or folder:

npm run test:unit -- build/test/ifca/common.spec.js -m "*default*"

Building the project

npm run build[:w]

Transpiles .ts sources and tests (src and test directories) and outputs JS files to build directory. When run with :w it will watch for changes and rebuild automatically.

npm run dist

Builds dist files - similar to build but skips test directory and additionally generates source maps.

Miscellaneous

npm run lint

Lints src and test directories. Used as a pre-commit hook.

npm run lint:f

Fixes lint warnings/errors in src and test files.

npm run coverage

Checks code coverage, generates an HTML report and serves it on port 8080.

npm run coverage:check

Checks code coverage. Will fail if it is below a threshold defined in package.json. Useful as a CI job.

npm run coverage:generate

About

License: MIT License

