Kleo-Karap / QA_systems

NLU_NLG Winter Semester

Zero-One-Few-Shot Question Answering

M.Sc. in Language Technology, M915-NLG_NLU, Assignment 2: Development and Comparison of N-shot QA systems on SQuAD v1 with a pretrained generative language model.

This is a useful repo for experimenting with the zero-, one-, and few-shot learning schemes for question answering using LLMs. Flan-T5 Small is used here due to its reported few-shot capabilities.
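
A minimal inference sketch, assuming the Hugging Face checkpoint name google/flan-t5-small and a simple context/question prompt layout (the repo's actual template is shown in the next section):

```python
# Zero-shot answer generation with Flan-T5 Small (assumed checkpoint: google/flan-t5-small).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

context = "Super Bowl 50 was an American football game played on February 7, 2016."
question = "When was Super Bowl 50 played?"

# Zero-shot: context and question only, no instructions or demonstrations.
prompt = f"{context}\nQuestion: {question}\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```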

Prompt template

(Prompt template shown as an image in the repository.)
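
Since the exact template is only available as an image, the sketch below uses a hypothetical Context/Question/Answer layout for assembling a k-shot prompt from demonstrations; the field names and wording are assumptions, not the repo's verbatim template.

```python
# Hypothetical k-shot prompt builder; each demonstration is assumed to be a dict
# with "context", "question", and "answer" keys.
def build_prompt(demonstrations, context, question):
    """Concatenate k demonstrations followed by the target example."""
    parts = []
    for demo in demonstrations:
        parts.append(
            f"Context: {demo['context']}\n"
            f"Question: {demo['question']}\n"
            f"Answer: {demo['answer']}\n"
        )
    parts.append(f"Context: {context}\nQuestion: {question}\nAnswer:")
    return "\n".join(parts)
```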

Experiment settings and results

  1. Zero-shot: no instructions
  2. One-shot: a single demonstration using the prompt template
  3. Few-shot: 9 random shots after fine-tuning
  4. Exp_Few-shot: 11 shots from question classification into 11 question types ("What", "How many", "How", "Whom", "Whose", "Who", "When", "Which", "Where", "Why", "Be/Do/etc.", "Other"); see the classification sketch below
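
For the Exp_Few-shot setting, a minimal sketch of how questions could be bucketed into the types above by their leading words; this is a plausible rule-based classifier, not necessarily the notebook's exact logic:

```python
# Assign each SQuAD question to one of the coarse question types listed above.
QUESTION_TYPES = [
    "how many", "how", "what", "whom", "whose", "who",
    "when", "which", "where", "why",
]

def question_type(question: str) -> str:
    q = question.strip().lower()
    for qtype in QUESTION_TYPES:          # "how many" is checked before "how", etc.
        if q.startswith(qtype):
            return qtype
    first = q.split()[0] if q else ""
    if first in {"is", "are", "was", "were", "do", "does", "did", "can", "will"}:
        return "be/do/etc."               # auxiliary-verb questions
    return "other"
```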

Evaluation on the official dev set with typical QA metrics: Exact Match and F1 score.
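
These are the standard SQuAD-style definitions; the sketch below shows the usual normalization and token-overlap computation, not necessarily the notebook's exact code:

```python
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, remove punctuation and articles, collapse whitespace."""
    text = "".join(ch for ch in text.lower() if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold: str) -> int:
    return int(normalize(prediction) == normalize(gold))

def f1(prediction: str, gold: str) -> float:
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```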

Contributor Expectations

  • Increase the gap between the k-values of shots (e.g., 10, 20, 30) to check for larger differences in the evaluation metrics
  • Instead of random sampling, use k demonstrations from one single Wikipedia article (e.g., Super Bowl, Nikola_Tesla, or any other specific document) in order to see whether the model picks up more patterns from a single document and learns to discern patterns based on the posed question (pay attention to overfitting!); a sampling sketch is given after this list
  • Try larger versions of Flan-T5 or any other LLM (e.g., a decoder-only model like GPT-2)
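
A sketch for the single-article suggestion above, assuming the Hugging Face "squad" dataset; the example title string and the choice of split are assumptions:

```python
# Sample k demonstrations from a single SQuAD v1 article, identified by its Wikipedia title.
from datasets import load_dataset

def sample_from_article(title: str, k: int, split: str = "train", seed: int = 0):
    data = load_dataset("squad", split=split)
    article = data.filter(lambda ex: ex["title"] == title)
    return article.shuffle(seed=seed).select(range(min(k, len(article))))

# Hypothetical usage; "Nikola_Tesla" appears as a title in the SQuAD v1 dev set.
demos = sample_from_article("Nikola_Tesla", k=9, split="validation")
```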

About

NLU_NLG Winter Semester

License: MIT License


Languages

Language: Jupyter Notebook 100.0%