forrestdavis / LanguageIllusions

Repository for CoNLL 2023 Paper

Language Illusions

Repository for "Can Language Models Be Tricked by Language Illusions? Easier with Syntax, Harder with Semantics" by Yuhan Zhang, Ted Gibson, and Forrest Davis, to appear in CoNLL 2023.

Structure of the repository

The directory data includes three files, one per illusion. Each file stores the sentences we tested and the corresponding metrics (whole-sentence perplexity and critical-region surprisal) computed from each language model (BERT, RoBERTa, GPT-2, and GPT-3).
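For reference, the two metrics relate in a standard way: a token's surprisal is the negative log probability of that token given its context, and whole-sentence perplexity is the exponentiated mean surprisal over the sentence's tokens. A minimal sketch of that relationship (illustrative only; this is not the paper's pipeline, and the per-token probabilities are assumed to come from a language model):

```python
import math

def surprisals(token_probs):
    # Surprisal (in bits) of each token: -log2 p(token | context).
    return [-math.log2(p) for p in token_probs]

def sentence_perplexity(token_probs):
    # Perplexity = 2 ** (mean surprisal), i.e., the exponentiated
    # average negative log probability over the sentence.
    s = surprisals(token_probs)
    return 2 ** (sum(s) / len(s))

# Example with made-up probabilities: 0.5 -> 1 bit, 0.25 -> 2 bits,
# so perplexity is 2 ** 1.5.
probs = [0.5, 0.25]
print(surprisals(probs))          # [1.0, 2.0]
print(sentence_perplexity(probs)) # ~2.828
```

A critical-region surprisal is then just the surprisal (or summed surprisals) of the tokens in the region of interest, rather than an average over the whole sentence.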

The directory processing_scripts contains four R scripts that reproduce our statistical analyses.

The conference paper is included as paper_CoNLL_2023.

Acknowledgement

We thank Ankana Saha, Carina Kauf, and Hayley Ross for helpful discussions about the project. Any errors are our own, and we appreciate your feedback!

Corresponding email: yuz551@g.harvard.edu.
