sfschouten / exploiting-ambiguity

Reasoning about Ambiguous Definite Descriptions

This repository contains the code used for data collection and experimentation.

The benchmark data and the answers given by each model can be found under benchmark/.
Each .jsonl file contains one JSON object per line, in the following format:

{
  "label_nr":      1|2,                  The option corresponding to the correct answer.
  "label_name":    "de dicto"|"de re",   Correct answer class name. 
  "messages":      ["..."],              The list of messages to be sent to the LLM (one for direct prompting, two for chain-of-thought prompting).
  "entity":        "...",                The name of the 'main entity'.
  "property":      "...",                The property ascribed to the definite description.
  "prompt_style":  "...",                A key into PROMPT_DICTIONARY in create_fragment.py.

The fields above make up the benchmark itself; the fields below hold the answers given by the models.

  "responses":     ["..."],              The replies given by the LLMs in response to the messages.
  "results": {
    "choice":      1|2,                  The option chosen by the model.
    "explanation": "..."                 The explanation provided by the model for why it chose the option that it did.
  }
}
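
As an illustration of how these fields fit together, the following is a minimal sketch (not part of the repository) that reads one of the .jsonl files and computes, per label_name, the fraction of records where results.choice equals label_nr. The function names load_records and accuracy_by_label are hypothetical, and the sketch assumes that a record without a "results" field should count as an incorrect answer.

```python
import json
from collections import Counter
from pathlib import Path


def load_records(path):
    """Yield one dict per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def accuracy_by_label(records):
    """Map each label_name to the fraction of records answered correctly.

    A record is correct when results.choice equals label_nr.
    Assumption: records with no "results" field count as incorrect.
    """
    correct, total = Counter(), Counter()
    for record in records:
        label = record["label_name"]
        total[label] += 1
        choice = (record.get("results") or {}).get("choice")
        if choice == record["label_nr"]:
            correct[label] += 1
    return {label: correct[label] / total[label] for label in total}
```

To use it, pass the path of any file under benchmark/ to load_records and feed the result to accuracy_by_label; the returned dictionary has one entry per label_name found in the file ("de dicto" and "de re" in the format above).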

Languages

Python 97.2%, Shell 2.8%