composable-logs / composable-logs

Python library to run ML/data pipelines on stateless compute infrastructure (that may be ephemeral or serverless). Please see the documentation site for more details and a demo:

Home Page: https://composable-logs.github.io/composable-logs

Composable Logs

Composable Logs is a Python library to run ML/data workflows on stateless compute infrastructure (that may be ephemeral or serverless).

In particular, using Composable Logs one can do ML experiment tracking without a dedicated tracking server (and database) to record ML metrics, models or artifacts. Instead, these are emitted using the OpenTelemetry standard for logging. This is an open standard in software engineering with growing support.

It can be useful to think of the logs emitted by Composable Logs as somewhat similar to the logs emitted by unit test frameworks (for example, in the JUnit format).

For example, log events emitted from Composable Logs can be directed to a JSON file, or sent to any log storage that supports OpenTelemetry (span) events. In either case, one does not need a separate tracking service just for ML experiments.
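
As a rough illustration of this idea (and not the Composable Logs API), the following minimal sketch uses only the Python standard library to record ML metrics as events on a span-like record and append that record to a JSON Lines file. The helper names `log_span` and `read_spans` and the field names are hypothetical, only loosely modeled on OpenTelemetry span data; the actual log format is described on the documentation site.

```python
import json
import time
from pathlib import Path


def log_span(path, name, metrics):
    """Append one span-like record with metric events to a JSON Lines file.

    Hypothetical helper for illustration only; not part of Composable Logs
    and not the exact OpenTelemetry schema.
    """
    span = {
        "name": name,
        "start_time_unix_nano": time.time_ns(),
        # Each metric becomes one event attached to the span.
        "events": [
            {"name": "metric", "attributes": {"key": key, "value": value}}
            for key, value in metrics.items()
        ],
        "end_time_unix_nano": time.time_ns(),
    }
    # Append mode: earlier records in the file are never rewritten.
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(span) + "\n")
    return span


def read_spans(path):
    """Read all span records back from the JSON Lines file."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Calling `log_span("spans.jsonl", "train-model", {"accuracy": 0.93})` appends one line to the file, and `read_spans("spans.jsonl")` returns every record written so far. Because the file is append-only plain text, it can be stored as a build artifact and processed later without any running service.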

A captured JSON log can be converted into a static website based on MLflow.

Composable Logs uses the Ray framework for parallel task execution.

For more details:

Documentation and architecture

Live demo

  • Using Composable Logs one can run an ML training pipeline using only a free GitHub account. This uses:

    • GitHub Actions: to trigger the ML pipeline daily and for each PR.
    • Build artifacts: to store OpenTelemetry logs of past runs.
    • GitHub Pages: to host a static website for reporting on past runs.

    The static website is rebuilt after each pipeline run (by extracting relevant data from past OpenTelemetry logs). This uses a fork of MLflow that can be deployed as a static website, https://github.com/composable-logs/mlflow.

  • Code for the pipeline (MIT): https://github.com/composable-logs/mnist-digits-demo-pipeline
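
The rebuild step in the demo (extracting data from logs of past runs to produce a static report) can be sketched as follows. This is a minimal illustration under stated assumptions, not how the demo pipeline or the MLflow fork is implemented: it assumes, hypothetically, that each past run is stored as one JSON file containing a list of span dicts, and that metrics appear as span attributes whose keys start with `metric.`. The function names `collect_metrics` and `render_html` are made up for this example.

```python
import html
import json
from pathlib import Path


def collect_metrics(log_dir):
    """Return {run_name: {metric_name: value}} for all *.json logs in log_dir.

    Assumes (hypothetically) that each file holds a list of span dicts and
    that metrics are stored as attributes with a "metric." key prefix.
    """
    prefix = "metric."
    summary = {}
    for log_file in sorted(Path(log_dir).glob("*.json")):
        spans = json.loads(log_file.read_text(encoding="utf-8"))
        metrics = {}
        for span in spans:
            for key, value in span.get("attributes", {}).items():
                if key.startswith(prefix):
                    metrics[key[len(prefix):]] = value
        # The file name (without extension) identifies the run.
        summary[log_file.stem] = metrics
    return summary


def render_html(summary):
    """Render the summary as a minimal static HTML table."""
    rows = []
    for run, metrics in summary.items():
        for key, value in sorted(metrics.items()):
            rows.append(
                f"<tr><td>{html.escape(run)}</td>"
                f"<td>{html.escape(key)}</td>"
                f"<td>{html.escape(str(value))}</td></tr>"
            )
    return "<table>\n" + "\n".join(rows) + "\n</table>"
```

Since the report is computed entirely from files, the same step can run in any CI job that has access to the stored logs, and its output can be served by a static host.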

Public roadmap and planning

Install via PyPI

Latest release
Snapshot of latest commit to main branch

Any feedback/ideas welcome!

License

(c) Matias Dahl, MIT, see LICENSE.md.

(Note: In January 2023 this project was renamed from pynb-dag-runner to composable-logs.)
