mingrammer / nuclio

High-Performance Serverless event and data processing framework

nuclio — "Serverless" for Real-Time Events and Data Processing

nuclio is a new serverless project, derived from iguazio's elastic data life-cycle management service for high-performance events and data processing. nuclio is being extended to support a large variety of event and data sources. You can use nuclio as a standalone binary (for example, for IoT devices), package it within a Docker container, or integrate it with a container orchestrator like Kubernetes.

nuclio is extremely fast. A single function instance can process hundreds of thousands of HTTP requests or data records per second. This is 10–100 times faster than some other frameworks. See nuclio Architecture to learn how it works.

A nuclio technical presentation is available on SlideShare, along with a video recording that includes a demo.

Note: nuclio is still under development, and is not recommended for production use.

Why Another "serverless" Project?

We considered existing cloud and open-source serverless solutions, but none addressed our needs:

  • Real-time processing with minimal CPU and I/O overhead and maximum parallelism

  • Native integration with a large variety of data and event sources, and processing models

  • Abstraction of data resources from the function code, to support code portability, simplicity, and data-path acceleration

  • Simple debugging, regression testing, and multi-versioned CI/CD pipelines

  • Portability across low-power devices, laptops, on-prem clusters, and public clouds

We designed nuclio to be extendable, using a modular and layered approach. We hope many will join us in developing new modules and integrations with more event and data sources, developer tools, and cloud platforms.

Getting Started With nuclio

The simplest way to explore nuclio is to run the nuclio playground (all you need is Docker):

docker run -p 8070:8070 -v /var/run/docker.sock:/var/run/docker.sock nuclio/playground

Browse to http://localhost:8070, then deploy one of the example functions or write your own. You can then head over to the nuclio SDK repository for a complete step-by-step guide to using nuclio over Kubernetes with nuctl, nuclio's command-line interface.

nuclio High-Level Architecture

Function Processors
A processor listens on one or more event sources (for example, HTTP, Message Queue, Stream), and executes user functions with one or more parallel workers. The workers use language-specific runtimes to execute the function (via native calls, SHMEM, or shell). Processors use abstract interfaces to integrate with platform facilities for logging, monitoring, and configuration, allowing for greater portability and extensibility (such as logging to a screen, file, or log stream).
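
To make the processor model concrete, here is a minimal Python sketch of one processor fanning events out to a fixed pool of parallel workers. It is an illustration only, not nuclio's implementation: `run_processor`, `num_workers`, and the in-memory queue standing in for an event source are all hypothetical names.

```python
import queue
import threading


def run_processor(events, handler, num_workers=4):
    """Run handler over events with num_workers parallel workers.

    Returns the handler results in the same order as the input events.
    """
    events = list(events)
    pending = queue.Queue()  # stands in for an event source (assumption)
    for index, event in enumerate(events):
        pending.put((index, event))

    results = [None] * len(events)

    def worker():
        # Each worker pulls events until the source is drained.
        while True:
            try:
                index, event = pending.get_nowait()
            except queue.Empty:
                return
            results[index] = handler(event)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run_processor(["a", "b", "c"], str.upper, num_workers=2))
```

Each result is written to the slot of the event that produced it, so the output order does not depend on which worker finishes first.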
Event Sources
Functions can be invoked through a variety of event sources (such as HTTP, RabbitMQ, Kafka, Kinesis, NATS, DynamoDB, iguazio v3io, or schedule), which are defined in the function specification.
Event sources are divided into several event classes (req/rep, async, stream, polling), which define the sources' behavior.
Different event sources can plug seamlessly into the same function without sacrificing performance, allowing for portability, code reuse, and flexibility.
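
The sketch below shows how one function might serve several event sources unchanged. Everything in it is assumed for illustration: the specification layout, the field names (`kind`, `port`, `topic`), the `Event` class, and the mapping of source kinds to event classes are hypothetical and are not taken from nuclio's actual specification format.

```python
from dataclasses import dataclass

# Assumed mapping of source kinds to event classes (illustrative only).
SOURCE_CLASSES = {
    "http": "req/rep",
    "rabbitmq": "async",
    "kafka": "stream",
    "kinesis": "stream",
}

# Hypothetical function specification listing two event sources.
FUNCTION_SPEC = {
    "handler": "shout",
    "event_sources": [
        {"kind": "http", "port": 8080},
        {"kind": "kafka", "topic": "orders"},
    ],
}


@dataclass
class Event:
    source_kind: str
    source_class: str
    body: bytes


def make_event(source, body):
    """Wrap a raw payload from any configured source in a common Event."""
    kind = source["kind"]
    return Event(kind, SOURCE_CLASSES[kind], body)


def shout(context, event):
    """User function: it sees only the Event, never the concrete source."""
    return event.body.upper()


if __name__ == "__main__":
    for source in FUNCTION_SPEC["event_sources"]:
        event = make_event(source, b"hello")
        print(event.source_kind, event.source_class, shout(None, event))
```

Because the handler depends only on the common event shape, adding a source to the specification requires no change to the function code.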
Data Bindings
Data-binding rules allow users to specify persistent input/output data resources to be used by the function. (Data connections are preserved between executions.) Bound data can be in the form of files, objects, records, messages, etc.
The function specification may include an array of data-binding rules, each specifying the data resource and its credentials and usage parameters.
Data-binding abstraction allows using the same function with different data sources of the same type, and enables function portability.
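
As a rough sketch of the idea, the example below declares an array of binding rules and has the function refer to a resource only by its binding name. The rule fields (`name`, `url`, `credentials`, `mode`), the `InMemoryStore` class, and the `memory://` URL are all hypothetical stand-ins, not nuclio's real binding format.

```python
class InMemoryStore:
    """Hypothetical stand-in for a bound data resource (table, object store, ...)."""

    def __init__(self, url, credentials=None):
        self.url = url
        self.credentials = credentials
        self._items = {}

    def get(self, key, default=None):
        return self._items.get(key, default)

    def put(self, key, value):
        self._items[key] = value


# Hypothetical data-binding rules: resource, credentials, usage parameters.
DATA_BINDINGS = [
    {
        "name": "visits",
        "url": "memory://dev/visits",
        "credentials": {"user": "dev"},
        "mode": "read-write",
    },
]


def open_bindings(rules):
    """Open each bound resource once; the result is reused across invocations."""
    return {
        rule["name"]: InMemoryStore(rule["url"], rule.get("credentials"))
        for rule in rules
    }


def count_visit(bindings, user):
    """User function: refers to the resource only by its binding name."""
    store = bindings["visits"]
    total = store.get(user, 0) + 1
    store.put(user, total)
    return total


if __name__ == "__main__":
    bindings = open_bindings(DATA_BINDINGS)
    print(count_visit(bindings, "ann"), count_visit(bindings, "ann"))
```

Pointing the `visits` rule at a different resource of the same type changes only the rule, while `count_visit` stays as written. The bindings are opened once and reused, which mirrors the note that data connections are preserved between executions.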
Playground
The playground is a standalone container microservice accessed through HTTP. It presents a code-editor UI for editing, deploying, and testing functions, and is the most user-friendly way to work with nuclio. The playground container includes a version of the builder.
nuctl CLI
nuctl is nuclio's command-line interface (CLI). It lets users list, create, build, update, execute, and delete functions.
Controller
A controller accepts function and event-source specifications, invokes builders and processors through an orchestration platform (such as Kubernetes), and manages function elasticity, life cycle, and versions.
Builder
A builder receives raw code and optional build instructions and dependencies, and generates the function artifact — a binary file or a Docker container image, which the builder can also push to a specified image repository.
The builder can run in the context of the CLI or as a separate service for automated development pipelines.
Dealer
A dealer is used with streaming and batch jobs to distribute a set of tasks or data partitions/shards among the available function instances, and guarantee that all tasks are completed successfully. For example, if a function reads from a message stream with 20 partitions, the dealer will guarantee that the partitions are distributed evenly across workers, taking into account the number of function instances and failures.
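
The partition example can be sketched as a simple round-robin assignment. This is only one possible strategy, assumed here for illustration; the function names and instance labels are hypothetical, and the sketch does not track whether tasks complete successfully.

```python
def assign_partitions(partitions, instances):
    """Spread partitions round-robin so instance loads differ by at most one."""
    if not instances:
        raise ValueError("no function instances available")
    assignment = {instance: [] for instance in instances}
    for index, partition in enumerate(partitions):
        owner = instances[index % len(instances)]
        assignment[owner].append(partition)
    return assignment


def reassign_after_failure(partitions, instances, failed):
    """Recompute the assignment over the instances that are still alive."""
    alive = [instance for instance in instances if instance != failed]
    return assign_partitions(partitions, alive)


if __name__ == "__main__":
    partitions = list(range(20))
    instances = ["fn-0", "fn-1", "fn-2"]
    before = assign_partitions(partitions, instances)
    after = reassign_after_failure(partitions, instances, failed="fn-1")
    print({name: len(owned) for name, owned in before.items()})
    print({name: len(owned) for name, owned in after.items()})
```

With 20 partitions and three instances the loads are 7, 7, and 6. After `fn-1` fails, the two remaining instances own 10 partitions each.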
nuclio SDK
The nuclio SDK is used by function developers to write, test, and submit their code, without the need for the entire nuclio source tree.

For more information about the nuclio architecture, see nuclio Architecture.

nuclio Function Examples

The function demonstrated below uses the Event and Context interfaces to handle inputs and logs, and returns a structured HTTP response. It can also return a simple string.

In Go

package handler

import (
    "github.com/nuclio/nuclio-sdk"
)

// Handler logs the request path and returns a structured HTTP response.
func Handler(context *nuclio.Context, event nuclio.Event) (interface{}, error) {
    // Log through the logger supplied by the context
    context.Logger.Info("Request received: %s", event.GetPath())

    return nuclio.Response{
        StatusCode:  200,
        ContentType: "application/text",
        Body:        []byte("Response from handler"),
    }, nil
}

In Python

def handler(context, event):
    response_body = f'Got {event.method} to {event.path} with "{event.body}"'

    # log with debug severity
    context.logger.debug('This is a debug level message')

    # just return a response instance
    return context.Response(body=response_body,
                            headers=None,
                            content_type='text/plain',
                            status_code=201)
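
For quick local debugging, a handler like the Python one above can be exercised without a running nuclio instance by passing it stub objects. The sketch below is an assumption-laden illustration: `make_stub_context` and the `SimpleNamespace` stubs are hypothetical and mimic only the attributes this handler touches, not the real nuclio context or event types.

```python
import logging
from types import SimpleNamespace


def handler(context, event):
    # Same body as the Python example above.
    response_body = f'Got {event.method} to {event.path} with "{event.body}"'
    context.logger.debug('This is a debug level message')
    return context.Response(body=response_body,
                            headers=None,
                            content_type='text/plain',
                            status_code=201)


def make_stub_context():
    """Build a hypothetical context exposing only what the handler uses."""
    def response(body, headers, content_type, status_code):
        return SimpleNamespace(body=body, headers=headers,
                               content_type=content_type,
                               status_code=status_code)
    return SimpleNamespace(logger=logging.getLogger("handler-test"),
                           Response=response)


if __name__ == "__main__":
    event = SimpleNamespace(method="GET", path="/hello", body="hi")
    result = handler(make_stub_context(), event)
    print(result.status_code, result.body)
```

The same stubs can back a regression test that asserts on the returned status code and body.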

More Details and Links

For more questions and help, use the nuclio Slack channel.


License: Apache License 2.0

