EasyLLM

Built upon Megatron-DeepSpeed and the HuggingFace Trainer, EasyLLM reorganizes the training code with a focus on usability while preserving training efficiency.

Install

  • Install the Python requirements

    pip install -r requirements.txt

    Other dependencies:

    • flash-attn (for dropout_layer_norm); you may need to compile it yourself (see the build sketch after this list)
  • Clone DeepSpeed and add it to your PYTHONPATH

    export PYTHONPATH=/path/to/DeepSpeed:$PYTHONPATH
  • Install the package in development mode

    pip install -e . -v
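
The dropout_layer_norm kernel is not part of the base flash-attn wheel; it is a separate CUDA extension inside the flash-attention repository. A minimal build sketch, assuming the Dao-AILab/flash-attention repository layout (nothing here is EasyLLM-specific):

    # build flash-attn and its fused layer-norm extension from source
    git clone https://github.com/Dao-AILab/flash-attention.git
    cd flash-attention
    pip install .                # flash-attn itself
    cd csrc/layer_norm
    pip install .                # provides the dropout_layer_norm module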

Train

Train Example

Infer and Eval

Infer Example

Supported Models

  • qwen (14b)
  • internlm (7b-20b)
  • baichuan1/2 (7b/13b)
  • llama1/2 (7b/13b/70b)

Model Example

Data

Data Example

3D Parallel Config Setting

Parallel Example
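
For orientation, here is a sketch of how the three parallel dimensions compose in a Megatron-DeepSpeed-style launch. The deepspeed launcher flags and the Megatron-style parallel-size arguments below are standard in those upstream projects; the entry point train.py and the exact argument names EasyLLM expects are assumptions of this sketch:

    # hypothetical launch: 4 nodes x 8 GPUs = 32 GPUs in total;
    # tensor-parallel 2 x pipeline-parallel 4 leaves 32 / (2 * 4) = 4-way data parallelism
    deepspeed --num_nodes 4 --num_gpus 8 train.py \
        --tensor-model-parallel-size 2 \
        --pipeline-model-parallel-size 4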

Speed Benchmark

Speed Benchmark

Dynamic Checkpoint

To balance training time and memory use, EasyLLM supports Dynamic Checkpoint: based on the input token size, it enables activation checkpointing for only a subset of layers. The configuration file settings are as follows:

Dynamic Checkpoint Example
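
As an illustration only, here is a hypothetical fragment of such a configuration; the key names (dynamic_checkpoint, size_map) are assumptions for this sketch, not EasyLLM's documented schema. The idea is to map input token sizes to how many layers get activation checkpointing:

    # hypothetical fragment; key names are illustrative, not EasyLLM's schema
    dynamic_checkpoint:
      enabled: true
      size_map:      # input token size -> number of layers to checkpoint
        1024: 0      # short batches: no checkpointing needed
        4096: 16     # medium batches: checkpoint half of a 32-layer model
        8192: 32     # long batches: checkpoint every layer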

License

This repository is released under the Apache-2.0 license.

Acknowledgement

We learned a lot from projects such as Megatron-DeepSpeed, DeepSpeed, and the HuggingFace Trainer when developing EasyLLM.
