Kleo-Karap / QA_systems

NLU_NLG Winter Semester

Zero-One-Few-Shot Question Answering

M.Sc. in Language Technology, M915-NLG_NLU, Assignment 2: Development and Comparison of N-shot QA systems on SQuAD v1 with a pretrained generative language model.

This is a useful repo for experimenting with the zero-, one-, and few-shot learning schemes for question answering using LLMs. Flan-T5 Small is used here due to its reported few-shot capabilities.
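
A minimal inference sketch, assuming the Hugging Face checkpoint name google/flan-t5-small and a simple context/question prompt layout (the repo's actual template is shown in the next section):

```python
# Zero-shot answer generation with Flan-T5 Small (assumed checkpoint: google/flan-t5-small).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

context = "Super Bowl 50 was an American football game played on February 7, 2016."
question = "When was Super Bowl 50 played?"

# Zero-shot: context and question only, no instructions or demonstrations.
prompt = f"{context}\nQuestion: {question}\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```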

Prompt template

(Prompt template shown as an image in the repository.)
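
Since the exact template is only available as an image, the sketch below uses a hypothetical Context/Question/Answer layout for assembling a k-shot prompt from demonstrations; the field names and wording are assumptions, not the repo's verbatim template.

```python
# Hypothetical k-shot prompt builder; each demonstration is assumed to be a dict
# with "context", "question", and "answer" keys.
def build_prompt(demonstrations, context, question):
    """Concatenate k demonstrations followed by the target example."""
    parts = []
    for demo in demonstrations:
        parts.append(
            f"Context: {demo['context']}\n"
            f"Question: {demo['question']}\n"
            f"Answer: {demo['answer']}\n"
        )
    parts.append(f"Context: {context}\nQuestion: {question}\nAnswer:")
    return "\n".join(parts)
```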

Experiment settings and results

  1. Zero-shot: no instructions
  2. One-shot: a single demonstration using the prompt template
  3. Few-shot: 9 random shots after fine-tuning
  4. Exp_Few-shot: 11 shots from question classification into 11 question types ("What", "How many", "How", "Whom", "Whose", "Who", "When", "Which", "Where", "Why", "Be/Do/etc.", "Other"); see the classification sketch below
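
For the Exp_Few-shot setting, a minimal sketch of how questions could be bucketed into the types above by their leading words; this is a plausible rule-based classifier, not necessarily the notebook's exact logic:

```python
# Assign each SQuAD question to one of the coarse question types listed above.
QUESTION_TYPES = [
    "how many", "how", "what", "whom", "whose", "who",
    "when", "which", "where", "why",
]

def question_type(question: str) -> str:
    q = question.strip().lower()
    for qtype in QUESTION_TYPES:          # "how many" is checked before "how", etc.
        if q.startswith(qtype):
            return qtype
    first = q.split()[0] if q else ""
    if first in {"is", "are", "was", "were", "do", "does", "did", "can", "will"}:
        return "be/do/etc."               # auxiliary-verb questions
    return "other"
```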

Evaluation on the official dev set with typical QA metrics: Exact Match and F1 score.
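
These are the standard SQuAD-style definitions; the sketch below shows the usual normalization and token-overlap computation, not necessarily the notebook's exact code:

```python
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, remove punctuation and articles, collapse whitespace."""
    text = "".join(ch for ch in text.lower() if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold: str) -> int:
    return int(normalize(prediction) == normalize(gold))

def f1(prediction: str, gold: str) -> float:
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```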

Contributor Expectations

  • Increase the gap between the k-values of shots (e.g., 10, 20, 30) to check for larger differences in the evaluation metrics
  • Instead of random sampling, use k demonstrations from one single Wikipedia article (e.g., Super Bowl, Nikola_Tesla, or any other specific document) in order to see whether the model picks up more patterns from a single document and learns to discern patterns based on the posed question (pay attention to overfitting!); a sampling sketch is given after this list
  • Try larger versions of Flan-T5 or any other LLM (e.g., a decoder-only model like GPT-2)
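
A sketch for the single-article suggestion above, assuming the Hugging Face "squad" dataset; the example title string and the choice of split are assumptions:

```python
# Sample k demonstrations from a single SQuAD v1 article, identified by its Wikipedia title.
from datasets import load_dataset

def sample_from_article(title: str, k: int, split: str = "train", seed: int = 0):
    data = load_dataset("squad", split=split)
    article = data.filter(lambda ex: ex["title"] == title)
    return article.shuffle(seed=seed).select(range(min(k, len(article))))

# Hypothetical usage; "Nikola_Tesla" appears as a title in the SQuAD v1 dev set.
demos = sample_from_article("Nikola_Tesla", k=9, split="validation")
```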

About

NLU_NLG Winter Semester

License: MIT License


Languages

Language: Jupyter Notebook 100.0%