ritzfy / toy-part

Yet another Toy Pretrain(able) Autoregressive Transformer

Story Generator using Pretrained Autoregressive Transformer Model

Overview

This is a Python implementation of a story generator built on a small autoregressive transformer. The model is trained on the TinyStoriesV2 dataset and completes stories from a given prompt.

Features

  • Generates stories based on a given prompt
  • Uses a transformer model to generate text
  • Includes data loading and preprocessing utilities
  • Supports training and evaluation of the model

Technical Details

  • The model is implemented in PyTorch and uses torch's scaled dot-product attention for its multi-head attention (see the sketch after this list)
  • Text is tokenized with tiktoken
  • Training runs in a custom loop with learning rate warmup followed by cosine annealing (a schedule sketch also follows this list)
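
As a rough illustration of the attention and tokenization pieces, the sketch below builds a causal multi-head attention module on top of torch.nn.functional.scaled_dot_product_attention and encodes a prompt with tiktoken. The class and parameter names (CausalSelfAttention, n_embd, n_head) and the "gpt2" encoding are illustrative assumptions, not necessarily what this repository uses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import tiktoken

class CausalSelfAttention(nn.Module):
    """Multi-head self-attention using torch's fused SDPA kernel.

    Hypothetical sketch; n_embd / n_head are illustrative names,
    not necessarily the ones used in this repository.
    """
    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd)   # joint Q, K, V projection
        self.proj = nn.Linear(n_embd, n_embd)      # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # reshape to (B, n_head, T, head_dim) so attention runs per head
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        # is_causal=True applies the autoregressive (lower-triangular) mask
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(y)

# Tokenization with tiktoken (the "gpt2" encoding is an assumption)
enc = tiktoken.get_encoding("gpt2")
ids = torch.tensor([enc.encode("Once upon a time")])
print(ids.shape)  # (1, number_of_tokens)
```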

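The warmup-plus-cosine-annealing schedule can be expressed as a small helper that the training loop calls each step. The constants below (max_lr, min_lr, warmup_steps, max_steps) are placeholder values for illustration, not the settings actually used for training.

```python
import math

def get_lr(step: int, max_lr: float = 3e-4, min_lr: float = 3e-5,
           warmup_steps: int = 200, max_steps: int = 5000) -> float:
    """Linear warmup followed by cosine decay down to min_lr.

    All constants are illustrative defaults, not the repo's settings.
    """
    if step < warmup_steps:
        return max_lr * (step + 1) / warmup_steps        # linear warmup
    if step >= max_steps:
        return min_lr                                    # floor after decay
    progress = (step - warmup_steps) / (max_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))  # goes from 1 to 0
    return min_lr + cosine * (max_lr - min_lr)

# Inside the training loop (sketch):
# for step in range(max_steps):
#     lr = get_lr(step)
#     for group in optimizer.param_groups:
#         group["lr"] = lr
#     ...
```
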
How to Use

  • Install the required dependencies using pip install -r requirements.txt
  • Download your dataset and place it in the data directory
  • Train the model using python main.py
  • Generate stories using python generate.py (a minimal sampling sketch follows this list)
  • Launch the Streamlit app using python app.py
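
Generation is autoregressive: the model repeatedly predicts the next token and appends it to the prompt. The sketch below shows that general pattern; the `model` object, the sampling parameters, and the "gpt2" encoding are hypothetical stand-ins and not the repo's actual generate.py interface.

```python
import torch
import torch.nn.functional as F
import tiktoken

@torch.no_grad()
def complete(model, prompt: str, max_new_tokens: int = 200,
             temperature: float = 0.8, top_k: int = 50) -> str:
    """Sampling loop; `model` is a hypothetical stand-in that maps
    token ids (B, T) to next-token logits (B, T, vocab_size)."""
    enc = tiktoken.get_encoding("gpt2")
    ids = torch.tensor([enc.encode(prompt)])
    for _ in range(max_new_tokens):
        logits = model(ids)[:, -1, :] / temperature      # last-position logits
        if top_k is not None:
            v, _ = torch.topk(logits, top_k)
            logits[logits < v[:, [-1]]] = -float("inf")  # keep only the top-k
        probs = F.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=1)
    return enc.decode(ids[0].tolist())
```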

Future Improvements

  • Implement control using a configuration file.
  • Explore different model architectures and hyperparameters.
  • Integrate larger and more diverse datasets for training.
  • Add functionality for user-specified story themes or genres.

Author

Ritav Jash

License

This project is licensed under the MIT License.
