raymondhs / mlx-gpt

GPT-2 implementation in MLX

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

MLX-GPT

In this repo, I ported Andrej Karpathy's NanoGPT implementation of GPT-2 into MLX. This was done both for personal learning and to explore the feasibility of running (and training) large language models (LLMs) on consumer hardware like a MacBook.

Requirements

To install MLX using pip, these are needed:

  • Machine with M-series chip
  • MacOS >= 13.0
  • Python between 3.8-3.11

Additionally, these packages are needed:

  • mlx
  • tiktoken (for using GPT-2's tokenizer)
  • transformers, torch (for loading HuggingFace's GPT-2 models)

Usage

To run the script and generate text, use the following command:

python mlx_gpt.py

By default, this will load the smallest (124M parameters) GPT-2 model from HuggingFace, and run some generation:

Here's some example output I got:

> I'm a language model, we're trying to solve the problem like a real human and we're trying to understand the human anatomy and know. And
> I'm a language model, but I like the way that people define what they write or do. And as a translator I write it myself.
> I'm a language model, not some advanced computer, and I'm talking about more than just writing numbers. I'm talking about the development of an
> I'm a language model, not a language model," says the study's lead author, Dr. Robert W. Smith of Yale University. The scientists
> I'm a language model, so you can see in the examples what the type of the current data is. That makes it harder to learn something of

You can modify the script to customize the input prompt and generation parameters such as model type and maximum length.

Training

WIP

References

About

GPT-2 implementation in MLX


Languages

Language:Python 100.0%