There are 2 repositories under the multiword-expressions topic.
Detect common phrases in large amounts of text using a data-driven approach. Discovered phrases can be of arbitrary length. Can be used for languages other than English.
Code for NAACL 2019 paper: "Bridging the Gap: Attending to Discontinuity in Identification of Multiword Expressions"
Data for the DiMSUM shared task at SemEval 2016.
A Python package for Exploratory Data Analysis (EDA) for text-based data.
Comparison of various noun compound embeddings.
Data and code for the paper "ID10M: Idiom Identification in 10 Languages" (NAACL 2022).
Data and code for the paper "NER4ID at SemEval-2022 Task 2: Named Entity Recognition for Idiomaticity Detection".
Accompanying code for the paper prepared for the Joint Workshop on Multiword Expressions and Universal Dependencies (MWE-UD 2024), 25 May 2024.
Rigor-Mortis is an online game with a purpose (GWAP) in which players find multiword expressions in French sentences.
Python implementation of Substitution-driven Measures of Association.
Java implementation of substitution-driven measures of association that can be used to identify multiword expressions (MWEs).