tbmihailov / qa_datasets_converter

Format converter from one type of QA task dataset to another

Dataset Converter for Question-Answering (QA) Tasks

A dataset converter for natural language processing tasks such as question answering (QA): it converts datasets from one format to another.

QA Dataset Paper & Data :

Supported Formats :

Source              Destination   Status
QAngaroo            SQuAD         completed
MCTest              SQuAD         completed
WikiQA              SQuAD         completed
InsuranceQA v1      SQuAD         completed
InsuranceQA v2      SQuAD         completed
TriviaQA            SQuAD         completed
NarrativeQA         SQuAD         completed
MS MARCO            SQuAD         completed
MS MARCO v2         SQuAD         completed
WikiMovies          SQuAD         on hold
Simple Questions    SQuAD         on hold
Ubuntu Corpus v2    SQuAD         completed
NewsQA              SQuAD         completed
SQuAD               MatchZoo      completed
Quasar-T            SQuAD         completed
Quasar-S            SQuAD         completed
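
Since most of the conversions listed here target SQuAD, a minimal sketch of that destination layout may be useful. The helper `to_squad` and the sample record are hypothetical and are not taken from this repository; the nested keys follow the publicly documented SQuAD v1.1 JSON layout, and the choice to skip records whose answer is not a literal span of the context is an assumption made only for this sketch.

```python
import json


def to_squad(records, version="1.1"):
    """Wrap flat QA records in a SQuAD v1.1-style dictionary.

    Hypothetical helper for illustration; each record is assumed to be a dict
    with the keys "title", "context", "question" and "answer".
    """
    data = []
    for i, rec in enumerate(records):
        # SQuAD stores the character offset of the answer inside the context.
        start = rec["context"].find(rec["answer"])
        if start < 0:
            # Assumption: drop records whose answer is not a span of the context.
            continue
        data.append({
            "title": rec["title"],
            "paragraphs": [{
                "context": rec["context"],
                "qas": [{
                    "id": str(i),
                    "question": rec["question"],
                    "answers": [{"text": rec["answer"], "answer_start": start}],
                }],
            }],
        })
    return {"version": version, "data": data}


sample = [{
    "title": "demo",
    "context": "Paris is the capital of France.",
    "question": "What is the capital of France?",
    "answer": "Paris",
}]
squad = to_squad(sample)
print(json.dumps(squad, indent=2))
```

Source formats that do not provide answer spans (for example, answer-selection datasets) would need a different mapping than the one sketched here.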

Example Call :

A sample call for each format type can be found in the executor.py file, for example:

python executor.py \
  --log_path="~/log.log" \
  --data_path="~/data/" \
  --from_files="source:question.train.token_idx.label,voc:vocabulary,answer:answers.label.token_idx" \
  --from_format="insuranceqa" \
  --to_format="squad" \
  --to_file_name="filename.what"  # the output file is renamed to "[from_to]_filename.what"
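
The `--from_files` value in this call reads as comma-separated `key:filename` pairs, and the trailing comment describes a `[from_to]` prefix on the output name. The sketch below shows one way such arguments could be interpreted. Both `parse_from_files` and `output_name` are hypothetical helpers, not functions from executor.py, and the exact prefix the tool produces is an assumption.

```python
def parse_from_files(spec):
    """Split a 'key:file,key:file' string into a dict (hypothetical helper)."""
    mapping = {}
    for pair in spec.split(","):
        # partition keeps any further ':' characters inside the file name.
        key, _, filename = pair.partition(":")
        mapping[key.strip()] = filename.strip()
    return mapping


def output_name(from_format, to_format, file_name):
    """Build a prefixed output name.

    Assumption: '[from_to]' means '<from>_to_<to>'; the real tool may differ.
    """
    return f"{from_format}_to_{to_format}_{file_name}"


files = parse_from_files(
    "source:question.train.token_idx.label,voc:vocabulary,answer:answers.label.token_idx"
)
print(files["source"], files["voc"], files["answer"])
print(output_name("insuranceqa", "squad", "filename.what"))
```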

License: MIT License


Languages

Language:Python 100.0%