Monitor deep learning model training and hardware usage from your mobile phone.

🔥 Features

Monitor running experiments from a mobile phone (or laptop)
Monitor hardware usage on any computer with a single command
Integrate with just 2 lines of code (see examples below)
Keep track of experiments, including information such as git commits, configurations and hyper-parameters
Keep TensorBoard logs organized
Save and load checkpoints
API for custom visualizations
Pretty logs of training progress
Change hyper-parameters while the model is training
Open source! We also have a small hosted server for the mobile web app

Installation

You can install this package using pip.

pip install labml

PyTorch example

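from labml import tracker, experiment

# `conf` is a dictionary of configurations/hyper-parameters and
# `train()` is your own training step; both are defined elsewhere
with experiment.record(name='sample', exp_conf=conf):
    for i in range(50):
        loss, accuracy = train()
        # Log the metrics for step `i`
        tracker.save(i, {'loss': loss, 'accuracy': accuracy})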
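
The snippet above assumes that `conf` and `train()` already exist. A minimal stand-in for trying it out might look like the following. The `conf` keys and the synthetic `train()` function are hypothetical placeholders, not part of labml, and should be replaced with a real configuration and training step.

```python
import itertools
import random

# Hypothetical hyper-parameters; replace with your own configuration
conf = {'batch_size': 20, 'learning_rate': 1e-3}

# Counts how many times train() has been called
_steps = itertools.count()


def train():
    """Stand-in for one training step: returns a synthetic (loss, accuracy) pair."""
    step = next(_steps)
    decay = 0.99 ** step
    # Loss decays towards zero, with a little noise added
    loss = decay + random.random() / 10
    # Accuracy rises towards one, capped at 1.0
    accuracy = min(1.0, 1 - decay + random.random() / 10)
    return loss, accuracy
```

With these two definitions placed before the `with experiment.record(...)` block, the loop has values to log on every step.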

PyTorch Lightning example

import pytorch_lightning as pl
from labml import experiment
from labml.utils.lightening import LabMLLighteningLogger

# `model`, `data_loader` and `conf` are assumed to be defined elsewhere
trainer = pl.Trainer(gpus=1, max_epochs=5, progress_bar_refresh_rate=20, logger=LabMLLighteningLogger())

with experiment.record(name='sample', exp_conf=conf, disable_screen=True):
    trainer.fit(model, data_loader)

TensorFlow 2.X Keras example

from labml import experiment
from labml.utils.keras import LabMLKerasCallback

# `model`, `conf` and the train/test arrays are assumed to be defined elsewhere
with experiment.record(name='sample', exp_conf=conf):
    for i in range(50):
        model.fit(x_train, y_train, epochs=conf['epochs'], validation_data=(x_test, y_test),
                  callbacks=[LabMLKerasCallback()], verbose=None)

📚 Documentation

Guides

🖥 Screenshots

Formatted training loop output

Custom visualizations based on TensorBoard logs

Tools

Hosting your own experiments server

# Install the package
pip install labml-app

# Start the server
labml app-server

Training models on cloud

# Install the package
pip install labml_remote

# Initialize the project
labml_remote init

# Add cloud server(s) to .remote/configs.yaml

# Prepare the remote server(s)
labml_remote prepare

# Start a PyTorch distributed training job
labml_remote helper-torch-launch --cmd 'train.py' --nproc-per-node 2 --env GLOO_SOCKET_IFNAME enp1s0

Monitoring hardware usage

# Install packages and dependencies
pip install labml psutil py3nvml

# Start monitoring
labml monitor

Other Guides

Setting up a local Ubuntu workstation for deep learning

Setting up a cloud computer for deep learning

Citing

If you use LabML for academic research, please cite the library using the following BibTeX entry.

@misc{labml,
 author = {Varuna Jayasiri and Nipun Wijerathne},
 title = {labml.ai: A library to organize machine learning experiments},
 year = {2020},
 url = {https://labml.ai/},
}

About

🔎 Monitor deep learning model training and hardware usage from your mobile phone 📱

https://labml.ai

MIT License
