Building a LLaMA fine-tuning script from scratch using PyTorch and the transformers API, with support for four optional features: gradient checkpointing, mixed precision, data parallelism, and tensor parallelism. Avoid using ColossalAI, Megatron, or DeepSpeed. Referring to existing code is allowed.
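A minimal sketch of what such a script might look like, under assumptions not in the task statement: the checkpoint name, toy dataset, and hyperparameters are placeholders; tensor parallelism is left as a labeled stub (a from-scratch version would shard the attention and MLP projection matrices across ranks, Megatron-style); and the data-parallel path assumes a torchrun launch.

```python
# llama_finetune.py -- hypothetical sketch; all names/hyperparameters are placeholders.
import argparse
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset
from transformers import AutoModelForCausalLM, AutoTokenizer


def parse_args():
    p = argparse.ArgumentParser()
    p.add_argument("--gradient-checkpointing", action="store_true")
    p.add_argument("--mixed-precision", action="store_true")
    p.add_argument("--data-parallel", action="store_true")
    p.add_argument("--tensor-parallel", action="store_true")
    p.add_argument("--model", default="meta-llama/Llama-2-7b-hf")  # placeholder checkpoint
    return p.parse_args()


def main():
    args = parse_args()

    # Data parallelism: one process per GPU, launched via torchrun, which sets LOCAL_RANK.
    local_rank = int(os.environ.get("LOCAL_RANK", 0))
    if args.data_parallel:
        dist.init_process_group(backend="nccl")
        torch.cuda.set_device(local_rank)
    device = torch.device("cuda", local_rank)

    tokenizer = AutoTokenizer.from_pretrained(args.model)
    tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token  # LLaMA has no pad token
    model = AutoModelForCausalLM.from_pretrained(args.model).to(device)

    # Gradient checkpointing: recompute activations during backward to cut memory.
    if args.gradient_checkpointing:
        model.config.use_cache = False  # the KV cache is incompatible with checkpointing
        # Non-reentrant checkpointing coexists better with DDP (transformers >= 4.35).
        model.gradient_checkpointing_enable(
            gradient_checkpointing_kwargs={"use_reentrant": False}
        )

    # Tensor parallelism: stubbed here. A from-scratch version would shard each
    # layer's q/k/v/gate/up projections column-wise and the o/down projections
    # row-wise across ranks (the Megatron-style layout), with an all-reduce after
    # the row-wise matmuls; torch.distributed.tensor.parallel offers building
    # blocks for this in recent PyTorch releases.
    if args.tensor_parallel:
        raise NotImplementedError("sharding plan omitted in this sketch")

    if args.data_parallel:
        model = DDP(model, device_ids=[local_rank])

    # Toy corpus: a few repeated strings stand in for a real fine-tuning dataset.
    enc = tokenizer(["Hello, world!"] * 64, return_tensors="pt", padding=True)
    dataset = TensorDataset(enc["input_ids"], enc["attention_mask"])
    sampler = DistributedSampler(dataset) if args.data_parallel else None
    loader = DataLoader(dataset, batch_size=4, sampler=sampler, shuffle=sampler is None)

    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    model.train()
    for epoch in range(1):
        if sampler is not None:
            sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for input_ids, attention_mask in loader:
            input_ids = input_ids.to(device)
            attention_mask = attention_mask.to(device)
            # Mixed precision: autocast to bf16; bf16 keeps fp32's exponent
            # range, so no GradScaler is needed (fp16 would require one).
            with torch.autocast("cuda", dtype=torch.bfloat16,
                                enabled=args.mixed_precision):
                loss = model(input_ids=input_ids, attention_mask=attention_mask,
                             labels=input_ids).loss  # causal LM: labels = inputs
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

    if args.data_parallel:
        dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

A hypothetical multi-GPU launch would look like `torchrun --nproc_per_node=4 llama_finetune.py --data-parallel --mixed-precision --gradient-checkpointing`; single-GPU runs work with plain `python` and no flags.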