Keyrat06 / Question_Answering

6.864 final project implementing question answering network described in https://arxiv.org/pdf/1512.05726.pdf.

Question Answering

In Natural Language Processing, Question Answering is an important type of Information Retrieval task. Online question answering forums such as those managed by StackExchange allow users to post a question on a subject, and the community responds with suitable answers. In the last few years, there has been a rise in their popularity and a corresponding explosion in the number of their users. Without an effective automated way to find and reuse answers to previously posted questions, the community has to repeatedly spend time and energy answering the same question. In this paper, we first explore a method for finding a question related to a posed question, given supervised data from the AskUbuntu forum. We then explore methods for transferring the learned model to the AskAndroid forum, where we do not have supervised data.

Question Retrieval

  1. CNN Model
    • This file contains the interfacing code for our Question Retrieval CNN_Model
  2. util
    • Probably the most important file in this repository! It contains the engine code for all models as well as the training code and all helper functions.
  3. LSTM Model
    • This file contains the interfacing code for our Question Retrieval LSTM model
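
As a rough illustration of the retrieval step shared by the CNN and LSTM models above: once an encoder has turned each question into a fixed-size vector, candidates can be ranked by their similarity to the query. The sketch below assumes cosine similarity as the scoring function and uses hypothetical names (`cosine_similarity`, `rank_candidates`); it does not reproduce the code in util.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(query_vec, candidate_vecs):
    """Return candidate indices ordered from most to least similar to the query.

    The vectors are assumed to come from a question encoder (CNN or LSTM);
    the encoder itself is outside the scope of this sketch.
    """
    scores = [cosine_similarity(query_vec, c) for c in candidate_vecs]
    return sorted(range(len(candidate_vecs)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    query = [1.0, 0.0]
    candidates = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]
    print(rank_candidates(query, candidates))
```

The ranked indices can then be compared against the annotated similar questions to compute retrieval metrics.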

Domain Adaptation

  1. bm25 & tf-idf
    • This file contains the interfacing code for our in-house bm25 and tf-idf baselines (the implementation itself lives in util)
  2. Direct Transfer
    • This file contains the interfacing code for our Direct Transfer for both LSTM and CNN models
  3. Adversarial Domain Adaptation
    • This file contains the interfacing code for our Adversarial Domain Adaptation code
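
For readers unfamiliar with the bm25 baseline listed above, here is a minimal sketch of Okapi BM25 scoring over pre-tokenized questions. It is not the implementation in util: the function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, and the smoothed (non-negative) IDF variant are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score every tokenized document against the query with Okapi BM25.

    k1 and b are common default values, not values taken from this repository.
    """
    n_docs = len(docs_tokens)
    if n_docs == 0:
        return []
    avg_len = sum(len(d) for d in docs_tokens) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for d in docs_tokens:
        doc_freq.update(set(d))

    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / denom
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = [
        ["install", "wifi", "driver"],
        ["change", "wallpaper"],
        ["wifi", "wifi", "not", "working"],
    ]
    print(bm25_scores(["wifi", "driver"], docs))
```

Because BM25 needs no labeled pairs, it can be run directly on the target forum as an unsupervised baseline.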

Our Adversarial Domain Adaptation approach was heavily influenced by "Unsupervised Domain Adaptation by Backpropagation". The adversarial model we used is shown in a figure taken from that paper (image not included here).
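
The central idea of that paper is a gradient reversal layer placed between the feature encoder and a domain classifier: it is the identity on the forward pass and multiplies the gradient by a negative factor on the backward pass, so the encoder learns features the domain classifier cannot separate. The sketch below shows this with manual forward and backward methods in plain Python, plus the schedule the paper proposes for the factor. The names `GradientReversal` and `grl_lambda` are hypothetical, and the repository may wire this into its deep learning framework differently.

```python
import math


def grl_lambda(progress, gamma=10.0):
    """Reversal strength as training progress goes from 0.0 to 1.0.

    Follows the schedule 2 / (1 + exp(-gamma * p)) - 1 with gamma = 10
    from the paper; whether this repository uses it is an assumption.
    """
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0


class GradientReversal:
    """Identity on the forward pass, negated and scaled gradient on the backward pass."""

    def __init__(self, lam):
        self.lam = lam

    def forward(self, features):
        # Encoder features reach the domain classifier unchanged.
        return list(features)

    def backward(self, grad_from_domain_classifier):
        # The encoder receives the opposite gradient, pushing it toward
        # features that make the two domains harder to tell apart.
        return [-self.lam * g for g in grad_from_domain_classifier]


if __name__ == "__main__":
    layer = GradientReversal(lam=grl_lambda(0.5))
    print(layer.forward([1.0, 2.0]))
    print(layer.backward([0.2, -0.4]))
```

Starting with a factor near zero lets the encoder first learn the retrieval task before the adversarial signal is applied at full strength.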

Notes

  1. To run this code you must export PYTHONPATH=
  2. qa contains code gathered from "Denoising Bodies to Titles: Retrieving Similar Questions with Recurrent Convolutional Models", which heavily influenced the code in this repository
  3. data contains data gathered from https://github.com/taolei87/askubuntu and https://github.com/jiangfeng1124/Android. You also need to download the Stanford GloVe embeddings, which were too large to put in a git repository

Languages

Language:Python 100.0%