dlwh / levanter-midi

Experiments with Jax and TPUs for Foundation Models - modified tokens for lakh

Levanter

You could not prevent a thunderstorm, but you could use the electricity; you could not direct the wind, but you could trim your sail so as to propel your vessel as you pleased, no matter which way the wind blew.
— Cora L. V. Hatch

Levanter is a library for training foundation models, built on Jax and Equinox and created by Stanford's Center for Research on Foundation Models (CRFM).

Haliax

Though you don’t seem to be much for listening, it’s best to be careful. If you managed to catch hold of even just a piece of my name, you’d have all manner of power over me.
— Patrick Rothfuss, The Name of the Wind

Haliax is a module (currently) inside Levanter for named tensors, modeled on Alexander Rush's Tensor Considered Harmful. It's designed to work with Jax and Equinox to make constructing distributed models easier.
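To make the named-tensor idea concrete, here is a minimal pure-Python sketch. It is not Haliax's API: the NamedTensor class, its full and sum methods, and the axis names "batch" and "embed" are all hypothetical, chosen only to show that an operation can pick its axis by name rather than by position.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class NamedTensor:
    """Toy dense tensor with named axes (hypothetical; not Haliax's API)."""

    axes: Tuple[Tuple[str, int], ...]   # ordered (name, size) pairs
    data: Dict[Tuple[int, ...], float]  # positional index tuple -> value

    @classmethod
    def full(cls, axes, fn: Callable[..., float]) -> "NamedTensor":
        """Build a tensor by calling fn(**named_index) at every position."""
        names = [name for name, _ in axes]
        data = {}
        for idx in product(*(range(size) for _, size in axes)):
            data[idx] = fn(**dict(zip(names, idx)))
        return cls(tuple(axes), data)

    def sum(self, axis: str) -> "NamedTensor":
        """Reduce over the axis called `axis`, wherever it sits in the layout."""
        names = [name for name, _ in self.axes]
        pos = names.index(axis)  # raises ValueError for an unknown axis name
        kept = tuple(a for a in self.axes if a[0] != axis)
        out: Dict[Tuple[int, ...], float] = {}
        for idx, value in self.data.items():
            key = idx[:pos] + idx[pos + 1:]
            out[key] = out.get(key, 0.0) + value
        return NamedTensor(kept, out)


def value(batch: int, embed: int) -> float:
    return float(batch * 10 + embed)


# The same values stored in two different axis orders.
x = NamedTensor.full((("batch", 2), ("embed", 3)), value)
y = NamedTensor.full((("embed", 3), ("batch", 2)), value)

# Summing "by name" gives the same result regardless of memory layout.
print(x.sum("embed").data)
print(y.sum("embed").data)
```

Because the reduction is addressed by name, both layouts produce the same per-batch totals, which is the property that makes named axes convenient when describing how a model is partitioned.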

Getting Started with Levanter

Installation

First, install the appropriate version of Jax for your system. See Jax's installation instructions, since the steps vary from platform to platform.

If you're using a TPU, more complete setup documentation is available in the TPU Getting Started guide.

Now clone this repository and install it with pip:

git clone https://github.com/stanford-crfm/levanter.git
cd levanter
pip install -e .
wandb login  # optional, we use wandb for logging

TODO: put things on PyPI, etc.

Training a GPT2-nano

As a kind of hello world, here's how you can train a GPT-2 "nano"-sized model on a small dataset.

python examples/gpt2_example.py --config_path config/gpt2_nano.yaml

This will train a GPT2-nano model on the WikiText-2 dataset. You can change the dataset by changing the dataset field in the config file.

The config file is a Pyrallis config file. Pyrallis is yet another YAML-to-dataclass library. You can use --help or poke around the other configs to see all the options available to you.
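As a rough illustration of what "YAML-to-dataclass" means, the sketch below maps an already-parsed config (a plain dict, as a YAML loader would produce) onto nested dataclasses using only the standard library. It does not use Pyrallis itself, and the class names, field names, and values (DatasetConfig, TrainConfig, name, num_train_steps, seed, "my-corpus") are hypothetical; they are not Levanter's actual config schema.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from typing import Any, Dict


@dataclass
class DatasetConfig:
    name: str = "some-dataset"  # hypothetical field and default


@dataclass
class TrainConfig:
    dataset: DatasetConfig = field(default_factory=DatasetConfig)
    num_train_steps: int = 1000  # hypothetical field
    seed: int = 0                # hypothetical field


def from_dict(cls: type, raw: Dict[str, Any]) -> Any:
    """Build dataclass `cls` from a nested dict, rejecting unknown keys."""
    known = {f.name: f for f in fields(cls)}
    kwargs = {}
    for key, val in raw.items():
        if key not in known:
            raise KeyError(f"unknown option {key!r} for {cls.__name__}")
        field_type = known[key].type
        # Recurse when the field is itself a dataclass and the value is a mapping.
        if is_dataclass(field_type) and isinstance(val, dict):
            val = from_dict(field_type, val)
        kwargs[key] = val
    return cls(**kwargs)


# Stand-in for the result of parsing a YAML file such as:
#   dataset:
#     name: my-corpus
#   num_train_steps: 100
raw = {"dataset": {"name": "my-corpus"}, "num_train_steps": 100}

cfg = from_dict(TrainConfig, raw)
print(cfg)
```

Options missing from the file fall back to the dataclass defaults, and a misspelled key fails loudly instead of being silently ignored, which is the main appeal of typed configs over raw dictionaries.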

Training on a TPU Cloud VM

Please see the TPU Getting Started guide for more information on how to set up a TPU Cloud VM and run Levanter there.

Training with CUDA

Please see the CUDA Getting Started guide for more information on how to set up a CUDA environment and run Levanter there.

Understanding Levanter and Haliax

Please see the Overview guide for more information on how Levanter and Haliax work, their inspirations, etc.

Contributing

We welcome contributions! Please see CONTRIBUTING.md for more information.

License

Levanter is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.
