WikiQA-based Open Domain Question Answering

Advanced NLP major project: Open domain question answering on WikiQA corpus

In this project, we study the problem of Answer Selection, a subproblem in Open Domain Question Answering using the WikiQA dataset. We implement three methods on the problem. Our literature survey, analysis and findings can be found under the reports folder as well as the presentation link shared above.

To try out the code here:

Attentive Pooling Networks

This notebook contains the implementation of Attentive Pooling Networks using Bi-LSTMs and CNNs. This architecture was proposed in this paper for computation of question and answer representations with mutual influence in each.

BERT Finetuning

A simple BERT finetuning beats the baseline established by Attentive Pooling Networks. This notebook can be used to try the BERT-based modeling of the problem.

TANDA

The Transfer AND Adapt (TANDA) strategy, which is currently a SoTA on the problem, proposed a 2-stage finetuning on top of a pretrained architecture like BERT:

Transferring the capability of the model with the use of a large, generic task-specific dataset to learn the task
Domain-adaptive finetuning on the downstream task-specific dataset for few shot learning This notebook contains the code for the same. A runnable Python script can be found here.

About

ANLP major project: Open domain question answering on WikiQA corpus

Languages

Language:Jupyter Notebook 98.5%Language:Python 1.5%