

Definition Generation for Lexical Semantic Change Detection (LSCD)

This repository contains the code and some data for the paper Definition generation for lexical semantic change detection (ACL Findings 2024) by Mariia Fedorova, Andrey Kutuzov and Yves Scherrer.

Repository structure

├── definition_generation   # generating definitions
├── definition_embeddings   # APD and PRT experiments (Section 4.2 in the paper)
├── embed_definitions       # generating embeddings of the definitions (Section 4.2 in the paper)
├── src                     # other experiments (definitions-as-strings, etc.) and evaluation; see its README for details (Sections 4.1 and 4.3 in the paper)
└── generated_definitions   # prompts and definitions generated by us

Obtaining the data

Lists of words and ground truth

The target word lists and the ground truth scores are in src/data/.

Diachronic corpora

Sampled usage examples with prompts and generated definitions can be found in generated_definitions/.

The usage examples were sampled from the following resources:

  • English
  • Norwegian: NBDigital corpus and Norsk aviskorpus (available under CC-BY-NC)
  • Russian: the corpora's license does not allow publishing them, so we could only release the prompts and definitions without usage examples. Any other corpus may be used instead (although the results may then differ).

Definition generation and evaluation

cd definition_generation
git clone git@github.com:ltgoslo/definition_modeling.git
./definition_generation_pipeline.sh ${}

Read about the generation parameters in the README file.
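
For orientation, the sketch below shows what contextualized definition generation boils down to: prompting a sequence-to-sequence model with a usage example and a question about the target word. The model name and the prompt wording here are placeholders; the actual models, prompts and decoding parameters are described in the definition_modeling README.

# Minimal sketch of contextualized definition generation (placeholder model and prompt;
# see the definition_modeling README for the actual ones).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/flan-t5-base"  # placeholder, not necessarily a model used in the paper
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

usage = "She sat on the bank of the river and watched the boats."
prompt = f"{usage} What is the definition of bank?"  # placeholder prompt format

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))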

Reproducing evaluation of LSCD performance with definition embeddings obtained with different decoding strategies (Table 3)

WARNING

The scripts in definition_embeddings/ are SLURM scripts that load cluster-specific modules. For easier use in generic settings, the module use and module load commands are commented out. (APD and PRT themselves do not require a GPU, but computing Sentence Transformers embeddings for all usage examples in reasonable time does.)

To reproduce the whole experiment, first create Sentence Transformers embeddings of the usage examples with embed_definitions/embed_definitions.py (embed_definitions/embeddings.slurm shows an example of running it on a cluster; remember to replace the account name and the loaded modules), then run compute_scores.sh to compute the per-word change scores.

cd embed_definitions
./embeddings.slurm
cd ../definition_embeddings
./compute_scores.sh
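
The change measures themselves are simple; the sketch below (not the repository's exact code) illustrates APD and PRT over Sentence Transformers embeddings of a target word's texts from two time periods. The model name and the example sentences are placeholders, and the PRT formulation shown here (cosine distance between prototypes) is one common variant.

# Minimal sketch of APD and PRT for one target word (placeholder model and sentences).
import numpy as np
from sentence_transformers import SentenceTransformer

texts_t1 = ["sentence with the target word from the first period", "another one"]
texts_t2 = ["sentence with the target word from the second period", "another one"]

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model name
emb1 = model.encode(texts_t1, normalize_embeddings=True)
emb2 = model.encode(texts_t2, normalize_embeddings=True)

# APD: average pairwise cosine distance between items from the two periods.
apd = float(np.mean(1.0 - emb1 @ emb2.T))

# PRT: cosine distance between the prototype (mean) embeddings of the two periods.
proto1, proto2 = emb1.mean(axis=0), emb2.mean(axis=0)
prt = 1.0 - float(proto1 @ proto2 / (np.linalg.norm(proto1) * np.linalg.norm(proto2)))

print(f"APD: {apd:.3f}  PRT: {prt:.3f}")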

Then run evaluation:

./evaluate.sh

Reproducing evaluation of LSCD performance with merged definitions obtained with different decoding strategies (Table 4)

./merge_all.sh
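
As a reminder of the underlying idea, each generated definition acts as a discrete sense label, and change is measured by comparing the label distributions of a target word in the two periods. The sketch below illustrates such a comparison with Jensen-Shannon distance; it is a generic illustration, not the exact scoring implemented in merge_all.sh.

# Generic sketch: compare definition ("sense") distributions between two periods.
from collections import Counter
import numpy as np
from scipy.spatial.distance import jensenshannon

defs_t1 = ["a financial institution", "a financial institution", "sloping land beside a river"]
defs_t2 = ["a financial institution", "a financial institution", "a financial institution"]

senses = sorted(set(defs_t1) | set(defs_t2))
c1, c2 = Counter(defs_t1), Counter(defs_t2)
p = np.array([c1[s] for s in senses], dtype=float) / len(defs_t1)
q = np.array([c2[s] for s in senses], dtype=float) / len(defs_t2)

change_score = jensenshannon(p, q)  # 0 means identical sense distributions
print(f"change score: {change_score:.3f}")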

Reproducing evaluation of both baseline and merged definitions LSCD in one run

This assumes that you already have all your predictions in src/predictions/.

cd src/analysis/
python eval_all.py

This will create src/analysis/result.txt with Spearman correlation scores and p-values for all methods. Insignificant correlations will be highlighted.
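
If you only need the evaluation step for a single method, the core computation is a Spearman correlation between predicted and gold change scores. The sketch below mirrors what result.txt reports; the file names and column layout are hypothetical.

# Minimal sketch of the evaluation step (hypothetical file names and columns).
import pandas as pd
from scipy.stats import spearmanr

gold = pd.read_csv("gold_scores.tsv", sep="\t")       # columns: word, score (hypothetical)
pred = pd.read_csv("predicted_scores.tsv", sep="\t")  # columns: word, score (hypothetical)

merged = gold.merge(pred, on="word", suffixes=("_gold", "_pred"))
rho, pvalue = spearmanr(merged["score_gold"], merged["score_pred"])

flag = "" if pvalue < 0.05 else "  (not significant)"
print(f"Spearman rho = {rho:.3f}, p = {pvalue:.4f}{flag}")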

Reproducing Figure 1

src/analysis/graphs.ipynb

Citation

@inproceedings{fedorova-etal-2024-definition,
    title = "Definition generation for lexical semantic change detection",
    author = "Fedorova, Mariia  and
      Kutuzov, Andrey  and
      Scherrer, Yves",
    editor = "Ku, Lun-Wei  and
      Martins, Andre  and
      Srikumar, Vivek",
    booktitle = "Findings of the Association for Computational Linguistics ACL 2024",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand and virtual meeting",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-acl.339",
    pages = "5712--5724",
    abstract = "We use contextualized word definitions generated by large language models as semantic representations in the task of diachronic lexical semantic change detection (LSCD). In short, generated definitions are used as {`}senses{'}, and the change score of a target word is retrieved by comparing their distributions in two time periods under comparison. On the material of five datasets and three languages, we show that generated definitions are indeed specific and general enough to convey a signal sufficient to rank sets of words by the degree of their semantic change over time. Our approach is on par with or outperforms prior non-supervised sense-based LSCD methods. At the same time, it preserves interpretability and allows to inspect the reasons behind a specific shift in terms of discrete definitions-as-senses. This is another step in the direction of explainable semantic change modeling.",
}

License

GNU General Public License v3.0

