Adminixtrator / gpt-2

GPT-2 model 345M



GPT-2 - a Transformer-based language model and the successor to GPT - has shown unprecedented performance in language modeling, owing largely to having over an order of magnitude more parameters. While GPT-2's performance on QA with no task-specific training is still embryonic, it suggests that an unsupervised language model can contribute to QA performance through fine-tuning.

  • This repo includes an experiment in fine-tuning GPT-2 345M for Question Answering (QA), and runs the model on the Stanford Question Answering Dataset 2.0 (SQuAD 2.0); an abridged record is sketched below.
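
For reference, each SQuAD 2.0 record pairs a context paragraph with a list of questions; version 2.0 adds unanswerable questions flagged with is_impossible. The record below is abridged and illustrative, not copied verbatim from the dataset files:

# Abridged, illustrative SQuAD 2.0 record: one context paragraph with
# an answerable and an unanswerable question.
example = {
    "context": "The Normans were the people who in the 10th and 11th "
               "centuries gave their name to Normandy, a region in France.",
    "qas": [
        {"question": "In what country is Normandy located?",
         "is_impossible": False,
         "answers": [{"text": "France", "answer_start": 104}]},
        {"question": "Who ruled the duchy of Normandy in 1204?",
         "is_impossible": True,
         "answers": []},
    ],
}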

📃 Testing

1. Open your terminal and clone this repository:

$ git clone https://github.com/Adminixtrator/gpt-2.git

2. Download the 345M model

# from your notebook
!python3 download_model.py 345M
# use the %env magic: '!export' would only set the variable in a throwaway subshell
%env PYTHONIOENCODING=UTF-8
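
As an optional sanity check, the snippet below is a sketch that assumes download_model.py keeps the upstream openai/gpt-2 layout, dropping the checkpoint and tokenizer files into models/345M/:

# Sketch of a post-download check (file list assumes the upstream
# openai/gpt-2 download layout under models/345M/).
import os
expected = ['checkpoint', 'encoder.json', 'hparams.json',
            'model.ckpt.data-00000-of-00001', 'model.ckpt.index',
            'model.ckpt.meta', 'vocab.bpe']
missing = [f for f in expected
           if not os.path.exists(os.path.join('models', '345M', f))]
print('All model files present' if not missing else 'Missing: %s' % missing)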

3. Change directory

import os
os.chdir('src')  # run this from the root of the cloned gpt-2 repo

4. Install regex

$ pip install regex

5. Run the model

# from your notebook (IPython run magic; from a shell: python3 Test_GPT2.py)
%run Test_GPT2.py

See the Colab Notebook if you run into issues testing the model or working with SQuAD.
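
For a feel of what a QA test run does, here is a minimal zero-shot sketch. It assumes the upstream openai/gpt-2 modules (model.py, sample.py, encoder.py) in src/ and TensorFlow 1.x; the prompt template, paths, and sampling parameters are illustrative, not necessarily what Test_GPT2.py uses:

# Minimal zero-shot QA sketch (assumes the upstream openai/gpt-2 src
# modules and TensorFlow 1.x; knobs and paths are illustrative).
import json
import tensorflow as tf
import model, sample, encoder  # importable when run from inside src/

models_dir = '../models'
enc = encoder.get_encoder('345M', models_dir)
hparams = model.default_hparams()
with open(models_dir + '/345M/hparams.json') as f:
    hparams.override_from_dict(json.load(f))

# SQuAD-style prompt: context, then "Q: ... A:" so the model completes the answer.
prompt = ("The Normans gave their name to Normandy, a region in France.\n"
          "Q: In what country is Normandy located?\nA:")

with tf.Session(graph=tf.Graph()) as sess:
    context = tf.placeholder(tf.int32, [1, None])
    output = sample.sample_sequence(
        hparams=hparams, length=20, context=context,
        batch_size=1, temperature=0.7, top_k=40)
    saver = tf.train.Saver()
    saver.restore(sess, tf.train.latest_checkpoint(models_dir + '/345M'))
    tokens = enc.encode(prompt)
    out = sess.run(output, feed_dict={context: [tokens]})
    print(enc.decode(out[0, len(tokens):]))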

Happy Developing!


Challenges faced

The major issue was fine-tuning the model with BERT on the Stanford Question Answering Dataset (SQuAD): most online sources offered no worked sample from which to understand what goes on during fine-tuning. One workaround is sketched below.
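
The following preprocessing sketch (an assumption, not necessarily what this repo does) flattens SQuAD's JSON into plain "context / Q: / A:" text that a language model can be fine-tuned on directly:

# Sketch: flatten SQuAD-format JSON into "context / Q: / A:" text for
# LM fine-tuning. Assumes the standard SQuAD file layout
# (data -> paragraphs -> qas); the prompt template is an assumption.
import json

def squad_to_text(path):
    with open(path) as f:
        squad = json.load(f)
    chunks = []
    for article in squad['data']:
        for para in article['paragraphs']:
            for qa in para['qas']:
                if qa.get('is_impossible'):
                    continue  # skip unanswerable questions
                answer = qa['answers'][0]['text']
                chunks.append('%s\nQ: %s\nA: %s\n'
                              % (para['context'], qa['question'], answer))
    return '\n'.join(chunks)

# e.g.: open('train.txt', 'w').write(squad_to_text('train-v2.0.json'))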

Requirements

Assuming these pins are kept in a requirements.txt at the repo root, they can be installed in one step with pip install -r requirements.txt.

fire>=0.1.3         # Python Fire (CLI)
regex==2017.4.5     # For OpenAI GPT
requests==2.21.0    # Used for downloading models over HTTP
tqdm==4.31.1        # Progress bars in model download and training scripts
torch>=0.4.1        # PyTorch
boto3               # Accessing files from S3 directly

REFERENCE - SQuAD (Stanford Question Answering Dataset): https://rajpurkar.github.io/SQuAD-explorer/



License: MIT

