5hirish / adam_qas

ADAM - A Question Answering System. Inspired by IBM Watson

Home Page: http://www.shirishkadam.com/

Extracting data from structured information extracted from Wikipedia

5hirish opened this issue

Your Environment

$ python -m qas.sys_info
# To get the system info, paste the output of the above command here
  • Question you were trying to ask:

Has this issue been closed? I believe I can contribute effectively to this issue.

@meghanabhange No, this issue is not closed yet. I just no longer have time to maintain this project, but I can help you get started if you are interested.

Okay, great.
So, to get started, was there a specific measurable result you had in mind that could be achieved by resolving this issue?

Any given Wikipedia page has both structured and unstructured information. Consider this example question: 'How many career points does Sebastian Vettel have?' The answer is stored in structured (tabular) form, not in the text of Sebastian's Wikipedia page. Wiki: Sebastian Vettel. As far as I can remember, this project can already extract data from tables (horizontal/vertical) in a key-value format and store it in Elasticsearch. It just doesn't know how to query it.
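To make the gap concrete, here is a minimal sketch of matching a question against one extracted key-value record. Everything in it is an assumption for illustration: the record layout (`title`, `infobox`), the placeholder values, the stopword list, and the function names are hypothetical and are not taken from the adam_qas codebase.

```python
import re

# Hypothetical record in the key-value shape described above.
# The title and all values are placeholders, not real data.
RECORD = {
    "title": "Example Driver",
    "infobox": {
        "Nationality": "Exampleland",
        "Career points": "1234",
        "First win": "2001 Example Grand Prix",
    },
}

# Small illustrative stopword list; a real module would use a proper one.
STOPWORDS = {"how", "many", "does", "do", "have", "has", "the", "is", "what", "of", "a", "an"}


def tokens(text):
    """Return the set of lowercase word tokens in text, minus stopwords."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def answer_from_record(question, record):
    """Pick the infobox key whose tokens overlap most with the question.

    Returns a (key, value) tuple, or None when no key shares a token
    with the question.
    """
    # Drop the entity name so it does not influence key matching.
    question_tokens = tokens(question) - tokens(record["title"])
    best_key, best_score = None, 0.0
    for key in record["infobox"]:
        key_tokens = tokens(key)
        if not key_tokens:
            continue
        # Fraction of the key's tokens that also appear in the question.
        score = len(question_tokens & key_tokens) / len(key_tokens)
        if score > best_score:
            best_key, best_score = key, score
    if best_key is None:
        return None
    return best_key, record["infobox"][best_key]
```

Calling `answer_from_record("How many career points does Example Driver have?", RECORD)` selects the "Career points" entry. Plain token overlap is only a starting point; synonyms and paraphrased questions would need something stronger.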

If you are up for the challenge, I suggest you create a separate module to query the structured info, as the unstructured one needs a lot of performance fixes and improvements too. I can help you with Elasticsearch, or if you are blocked anywhere in the codebase.
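A separate query module could start as small as a function that builds the Elasticsearch request body. The sketch below assumes each extracted table row is indexed as its own document with `title`, `key`, and `value` fields. That mapping and the function name are hypothetical, so adjust them to whatever the project actually stores.

```python
def build_structured_query(entity, attribute, size=1):
    """Build an Elasticsearch query DSL body for one key-value lookup.

    Assumes (hypothetically) one document per table row, with the page
    title in `title` and the row label in `key`.
    """
    return {
        "size": size,
        "query": {
            "bool": {
                # Both clauses must match: the right page and the right row.
                "must": [
                    {"match": {"title": entity}},
                    {"match": {"key": attribute}},
                ]
            }
        },
    }


# Placeholder entity and attribute, for illustration only.
body = build_structured_query("Example Driver", "career points")
```

The resulting dictionary can be handed to the Elasticsearch client's search call, and the `value` field of the top hit would be the candidate answer. Keeping query construction in a pure function like this makes it testable without a running cluster.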