The only requirement is NVIDIA Docker.
git clone https://github.com/Danny-Dasilva/gpt2.git
cd gpt2
sudo docker build ./ -t gpt2
GPT_DIR=$PWD
sudo docker run --name create-text \
--rm -it --privileged -p 6006:6006 \
--mount type=bind,src=${GPT_DIR},dst=/home/gpt2/local \
--gpus all \
gpt2
sudo docker exec -it create-text /bin/bash
Put your .txt file in ${GPT_DIR} (the cloned repo).
The following examples use the example file training.txt.
Examples of how to format the .txt file are provided below.
In the Docker container:
python3 encode.py local/training.txt training.npz
The run name here is run1; it will be used as the model name later.
python3 train.py --dataset training.npz --run_name run1
The model will save automatically at 1000 steps, but if you hit Ctrl + C it will save the most recent step. Sometimes, if you train for too short a time, you will get a ValueError("Can't load save_path when it is None.").
I think it has something to do with the tokens not delimiting correctly because the model is not trained enough.
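A quick way to narrow this error down is to check whether a checkpoint was actually written before trying to generate. The message itself is raised by TensorFlow's Saver.restore when it is handed None as the path, which is what happens when no saved checkpoint is found in the model directory. The sketch below looks for the small "checkpoint" state file that TensorFlow writes next to the saved weights. The function name has_checkpoint and the directory you pass to it (for example something like checkpoint/run1) are assumptions for illustration; the real location depends on how train.py in this repo is set up.

```python
import os


def has_checkpoint(run_dir):
    """Return True if run_dir contains a non-empty TensorFlow "checkpoint" state file.

    run_dir is a hypothetical path; point it at wherever train.py saves run1.
    """
    # TensorFlow records the most recent save in a text file named "checkpoint".
    state_file = os.path.join(run_dir, "checkpoint")
    return os.path.isfile(state_file) and os.path.getsize(state_file) > 0
```

If this returns False for your run directory, nothing was saved yet, so train longer (or stop with Ctrl + C so a save is triggered) before running the generate scripts.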
The command below generates unconditional samples, i.e. text generated without a prompt:
python3 generate_unconditional_samples.py --top_k 40 --model_name run1
The command below will ask for a model prompt:
python3 interactive_conditional_samples.py --top_k 40 --model_name run1 --length 25
- top_k: Integer value controlling diversity. 1 means only 1 word is considered for each step (token), resulting in deterministic completions, while 40 means 40 words are considered at each step. 0 (default) is a special setting meaning no restrictions. 40 is generally a good value.
- temperature: Float value controlling randomness in the Boltzmann distribution. Lower temperature results in less random completions. As the temperature approaches zero, the model becomes deterministic and repetitive. Higher temperature results in more random completions. The default value is 1.
- length: Number of tokens included in each sample before a new sample is generated.
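To make the effect of top_k and temperature concrete, here is a minimal sketch of picking one token from a list of raw scores. It is not the sampling code from this repo; the function name sample_token and the toy scores are assumptions for illustration only.

```python
import math
import random


def sample_token(logits, top_k=0, temperature=1.0, rng=random):
    """Pick one token index from raw scores using top-k filtering and temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    indices = list(range(len(logits)))
    if top_k > 0:
        # Keep only the k highest-scoring tokens; top_k=0 means no restriction.
        indices = sorted(indices, key=lambda i: logits[i], reverse=True)[:top_k]
    # Lower temperature sharpens the distribution, higher temperature flattens it.
    scaled = [logits[i] / temperature for i in indices]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(indices, weights=weights, k=1)[0]
```

With top_k=1 the function always returns the index of the highest score, which is why that setting gives deterministic completions. With top_k=40 the choice is spread over the 40 best candidates, weighted by their temperature-scaled scores.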
The original example delimits texts with <|endoftext|>, but you can use any delimiter you like.
Files should look like the example below:
Example sentence or paragraph of text
<|endoftext|>
Another text thing
<|endoftext|>
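If your texts live in a list rather than a ready-made file, the layout above can be produced with a few lines of Python. This is only a sketch: the helper name build_training_text is made up for illustration, and it assumes you keep <|endoftext|> as the delimiter.

```python
DELIMITER = "<|endoftext|>"  # delimiter used in the example layout above


def build_training_text(samples, delimiter=DELIMITER):
    """Join samples so that each one is followed by the delimiter on its own line."""
    parts = []
    for sample in samples:
        text = sample.strip()
        if text:  # skip empty samples so no stray delimiters are written
            parts.append(f"{text}\n{delimiter}\n")
    return "".join(parts)
```

Write the returned string to training.txt in the cloned repo, for example with open("training.txt", "w", encoding="utf-8"), and then run encode.py on it as shown earlier.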