Implement an LLM using NVIDIA FP8 support.
Inspired by Peng, H., et al., "FP8-LM: Training FP8 Large Language Models", arXiv:2310.18313v2, and Karpathy's llm.c.
- Load tokens
GNU Affero General Public License v3.0