BERT based Question Answering Model utilising Wikipedia pages as context

The model uses BERT to find the span of the answer within the context for a given question-context pair.

Consider a sample question below:

The question and the context are passed together as input to the BERT model. From the model's output, two integer values are obtained: the start index and the end index of the most likely answer within the given context.

What are answer_start_scores and answer_end_scores?

For an answer span, answer_start_scores and answer_end_scores contain one score per input token, expressing how likely that token is to be the first token and the last token of the answer span, respectively.

`start_scores` for the given question

`end_scores` for the given question

How to select the answer span?

We take the position of the maximum value in answer_start_scores as the starting index of the answer span, and similarly the position of the maximum value in answer_end_scores as the ending index of the answer span.
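The selection rule above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the helper names `select_span` and `extract_answer`, the token list, and the score values are all made up for the example. It also assumes that an end index smaller than the start index should be treated as "no answer", which the text above does not specify.

```python
from typing import List, Tuple


def select_span(start_scores: List[float], end_scores: List[float]) -> Tuple[int, int]:
    """Return the positions of the largest start score and the largest end score."""
    start_index = max(range(len(start_scores)), key=start_scores.__getitem__)
    end_index = max(range(len(end_scores)), key=end_scores.__getitem__)
    return start_index, end_index


def extract_answer(tokens: List[str], start_scores: List[float], end_scores: List[float]) -> str:
    """Join the tokens between the selected start and end positions (inclusive)."""
    start, end = select_span(start_scores, end_scores)
    # Assumption: the two maxima are chosen independently, so the end can
    # precede the start; this sketch returns an empty answer in that case.
    if end < start:
        return ""
    return " ".join(tokens[start:end + 1])


# Illustrative (made-up) tokens and scores for a question + context input.
tokens = ["[CLS]", "who", "wrote", "hamlet", "[SEP]",
          "hamlet", "was", "written", "by", "william", "shakespeare", "[SEP]"]
answer_start_scores = [0.1, -1.2, -0.8, -0.5, -2.0, 0.3, -0.4, 0.2, 0.9, 6.1, 2.4, -1.0]
answer_end_scores = [0.0, -1.5, -1.1, -0.2, -1.8, 0.1, -0.6, -0.3, 0.4, 1.9, 6.8, -0.7]

print(select_span(answer_start_scores, answer_end_scores))
print(extract_answer(tokens, answer_start_scores, answer_end_scores))
```

With real model output, the two score lists would come from the model and the tokens from its tokenizer; only the index selection is shown here.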

The answer for the given sample question is retrieved as follows:

Is there a way to get the answer without providing the context?

Yes. We retrieve Wikipedia documents based on the user's query and use these documents as the context.

How the documents are retrieved

Based on the question given by the user, the model first fetches the top 10 documents from Wikipedia using the Wiki library in Python. Of these 10 documents, only the first 2 are used as the context to find the answer.
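A rough sketch of this "fetch 10, keep 2" step follows. The function `retrieve_contexts` and its parameters are hypothetical, and the search and fetch steps are passed in as callables so the example runs without network access. The text above only says "Wiki library"; if that refers to the `wikipedia` package on PyPI (an assumption), its `wikipedia.search` and `wikipedia.page(...).content` calls could be plugged in as shown in the comments.

```python
from typing import Callable, Dict, List


def retrieve_contexts(
    question: str,
    search: Callable[[str, int], List[str]],
    fetch_text: Callable[[str], str],
    n_results: int = 10,
    n_used: int = 2,
) -> List[str]:
    """Search for up to n_results page titles and return the text of the first n_used."""
    titles = search(question, n_results)
    return [fetch_text(title) for title in titles[:n_used]]


# Assumption: with the `wikipedia` package installed, one could pass
#   search=lambda q, n: wikipedia.search(q, results=n)
#   fetch_text=lambda title: wikipedia.page(title).content
# Below, a tiny in-memory stand-in (made-up data) is used instead.
fake_pages: Dict[str, str] = {
    "Hamlet": "Hamlet is a tragedy written by William Shakespeare.",
    "William Shakespeare": "William Shakespeare was an English playwright.",
    "Denmark": "Denmark is a Nordic country.",
}


def fake_search(query: str, n: int) -> List[str]:
    # Ignores the query and returns the stored titles in a fixed order.
    return list(fake_pages)[:n]


contexts = retrieve_contexts("Who wrote Hamlet?", fake_search, fake_pages.__getitem__)
print(len(contexts))
```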

A sample question for which documents are retrieved

The question is the same as the previous one, except that the user no longer provides a context.

The image below contains the documents retrieved by the Wiki library for a particular question

But the length of the Wikipedia article is greater than the token limit!

The BERT model cannot accept an input (question + context) longer than 512 tokens. Hence, each retrieved Wikipedia document is broken into chunks small enough that the question plus one chunk fits within 512 tokens, and the chunks are fed to the model one after another as the context, each along with the question.
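One way to picture the chunking is the sketch below. It is an illustration under stated assumptions rather than the repository's implementation: the name `chunk_context` is hypothetical, the context is assumed to be tokenized already, and the special tokens a BERT tokenizer adds (such as `[CLS]` and `[SEP]`) are ignored, so a real implementation would need to reserve a few extra positions.

```python
from typing import List

MAX_INPUT_TOKENS = 512  # limit on question + context stated in the text


def chunk_context(
    context_tokens: List[str],
    question_length: int,
    max_len: int = MAX_INPUT_TOKENS,
) -> List[List[str]]:
    """Split context tokens into consecutive chunks of at most max_len - question_length."""
    budget = max_len - question_length
    if budget <= 0:
        raise ValueError("the question alone fills the whole input")
    return [
        context_tokens[i:i + budget]
        for i in range(0, len(context_tokens), budget)
    ]


# Made-up example: a 1200-token article and a 12-token question.
article_tokens = [f"tok{i}" for i in range(1200)]
chunks = chunk_context(article_tokens, question_length=12)
print([len(chunk) for chunk in chunks])
```

Each chunk would then be paired with the question and scored separately, as described above.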

Length of each chunk

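Each document yields n answers, where n is the number of chunks for that document. Roughly, n = ceil(l / (512 - q)), where l is the length of the Wikipedia article and q is the length of the question, both in tokens. For example, an article of 2,000 tokens and a question of 12 tokens give ceil(2000 / 500) = 4 chunks.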

One specific best answer from all the chunks for a particular document

For each chunk, max_start_score and max_end_score are calculated. The two are then summed, and the chunk with the largest sum is considered to offer the best possible answer for that document.
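The ranking rule can be sketched as follows. The `ChunkResult` type, the function name `best_chunk_answer`, and the sample answers and scores are hypothetical and exist only to make the example self-contained; the sum-of-maxima criterion is the one described above.

```python
from typing import List, NamedTuple


class ChunkResult(NamedTuple):
    answer: str
    max_start_score: float
    max_end_score: float


def best_chunk_answer(results: List[ChunkResult]) -> ChunkResult:
    """Return the chunk result with the largest max_start_score + max_end_score."""
    if not results:
        raise ValueError("no chunk results to choose from")
    return max(results, key=lambda r: r.max_start_score + r.max_end_score)


# Made-up per-chunk results for one document.
chunk_results = [
    ChunkResult("in 1599", 3.2, 2.9),
    ChunkResult("william shakespeare", 6.1, 6.8),
    ChunkResult("denmark", 1.4, 4.0),
]

best = best_chunk_answer(chunk_results)
print(best.answer)
```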

The Final Model

The final model returns only two answers, one from each of the top two documents fetched.

neeraj2681 / Question_Answering_Model


