carolinlawrence / scripts_nlmaps

Various scripts to be used in conjunction with the NLmaps corpus

NLmaps Scripts

A few scripts that help with processing the NLmaps corpus.

Linearisation and Reversing it

Original NLmaps queries are written in a bracket form which can be linearised into individual tokens by taking a pre-order tree traversal. For example, the query

query(north(area(keyval('name','Paris')),nwr(keyval('building','cathedral'))),qtype(latlong))

can be linearised to

query@2 north@2 area@1 keyval@2 name@0 Paris@s nwr@1 keyval@2 building@0 cathedral@s qtype@1 latlong@0

Each token carries a suffix: a number gives the count of arguments the token takes (@0, @1, @2), and @s marks a string value. We provide two scripts that handle the conversion, one for each direction. The idea of linearisation was originally presented by Andreas et al. (2013), so some code in this repo closely resembles code from their repo smt-semparse, modified for the NLmaps corpus.
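As a rough illustration of the pre-order traversal described above, the following is a minimal sketch, not the code in linearise.py. The names TOKEN_RE, parse and linearise are hypothetical. It assumes, based only on the single example above, that a quoted string in the first argument position is emitted with @0 and a quoted string in any later position with @s. It does not handle string values that contain spaces or quotes.

```python
import re

# A token is a single-quoted string, a bare symbol, or one of "(", ")", ",".
TOKEN_RE = re.compile(r"'([^']*)'|([^\s(),']+)|([(),])")


def parse(query):
    """Parse a bracketed query into nested (label, quoted, children) tuples."""
    tokens = list(TOKEN_RE.finditer(query))
    pos = 0

    def node():
        nonlocal pos
        match = tokens[pos]
        pos += 1
        if match.group(1) is not None:      # quoted string: always a leaf
            return (match.group(1), True, [])
        label, children = match.group(2), []
        if pos < len(tokens) and tokens[pos].group(3) == "(":
            pos += 1                        # consume "("
            while True:
                children.append(node())
                closing = tokens[pos].group(3) == ")"
                pos += 1                    # consume "," or ")"
                if closing:
                    break
        return (label, False, children)

    return node()


def linearise(query):
    """Emit one token per tree node in pre-order, suffixed with @arity or @s."""
    out = []

    def visit(tree, index):
        label, quoted, children = tree
        if quoted and index > 0:
            # Assumption: quoted strings after the first argument are values.
            out.append(label + "@s")
        else:
            out.append(f"{label}@{len(children)}")
        for i, child in enumerate(children):
            visit(child, i)

    visit(parse(query), 0)
    return " ".join(out)
```

Calling linearise on the example query above reproduces the linearised token sequence shown there.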

To linearise a file of NLmaps queries, use:

python linearise.py -i input_file -o output_file

and to reverse the linearisation, use:

python functionalise.py -i input_file -o output_file
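The reverse direction can be sketched in a similar way: because every token states how many arguments follow it, the tree can be rebuilt by consuming tokens left to right. This is a minimal sketch, not the code in functionalise.py. The names QUOTED_ARGS and functionalise are hypothetical, and the sketch assumes that only the arguments of keyval are written in quotes, which is inferred from the example query alone.

```python
# Assumption: arguments of these functions are quoted in the bracket form.
QUOTED_ARGS = {"keyval"}


def functionalise(linearised):
    """Rebuild the bracketed query from a whitespace-separated token sequence."""
    tokens = linearised.split()
    pos = 0

    def build(parent):
        nonlocal pos
        # Split on the last "@" so labels containing "@" stay intact.
        label, _, suffix = tokens[pos].rpartition("@")
        pos += 1
        if suffix == "s" or (suffix == "0" and parent in QUOTED_ARGS):
            return f"'{label}'"
        arity = int(suffix)
        if arity == 0:
            return label
        # Each argument is itself a subtree that starts at the current position.
        args = [build(label) for _ in range(arity)]
        return f"{label}({','.join(args)})"

    return build(None)
```

Applied to the linearised example above, this returns the original bracketed query.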

Evaluation

Output for NLmaps can be evaluated either at the query sequence level or at the answer level, where the queries are executed against an instance of the OpenStreetMap database and the returned answers are compared.

To validate at the sequence level, use:

python seq_eval.py -i suggested_queries_file -g gold_queries_file

and to validate at the answer level, use:

python eval.py -i suggested_answers_file -g gold_answers_file
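This README does not say which metric seq_eval.py reports. One plausible sequence-level measure is exact-match accuracy, where a suggested query counts as correct only if it is identical to the gold query on the same line. The sketch below shows that measure under this assumption; the function name sequence_accuracy is hypothetical and is not taken from the script.

```python
def sequence_accuracy(suggested, gold):
    """Return the fraction of suggested queries that exactly match the gold query."""
    if len(suggested) != len(gold):
        raise ValueError("both inputs must contain the same number of queries")
    if not gold:
        return 0.0
    # Surrounding whitespace (such as trailing newlines) is ignored.
    correct = sum(s.strip() == g.strip() for s, g in zip(suggested, gold))
    return correct / len(gold)
```

The two arguments would typically be the lines read from the suggested and gold query files.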

Validating at the answer level requires an installed instance of the OpenStreetMap database as well as overpass-nlmaps.

Answers can then be generated using:

./query_db -d $DB_DIR -a answer_file -f query_file
