nrc-cnrc / MSLC

Data and figures for MSLC — Données et figures pour MSLC

MSLC23 Data

Overview

This repository contains data and additional figures associated with the paper Metric Score Landscape Challenge (MSLC23): Understanding Metrics’ Performance on a Wider Landscape of Translation Quality by Chi-kiu Lo, Samuel Larkin, and Rebecca Knowles, published at WMT 2023.

Data

Within the data/ directory, you will find several subdirectories. Their layout is designed to be similar to the one used by MTME (https://github.com/google-research/mt-metrics-eval#specification). They contain the MT output from the systems in the MSLC23 challenge set, as well as the scores that the participating metrics assigned to that output. For en-de and zh-en, the reference used is refA, while for en-he and he-en the reference used is refB. The source and reference text are available at https://github.com/wmt-conference/wmt23-news-systems.

  • documents/SRC-TRG.docs: For the given SRC-TRG language pair, this file contains tab-separated data, where the first column is the domain and the second is the document name. Our data consists only of the news-domain data from the WMT test sets. These document IDs can be used to match the system outputs to the source and reference data (available at https://github.com/wmt-conference/wmt23-news-systems).
    • Per-line contents: DOMAIN DOCUMENT
    • Line order matches WMT source and reference
  • system-outputs/SRC-TRG/SYSNAME.txt: For the given SRC-TRG language pair and an MT system (SYSNAME), this file contains the MT system's output, covering the same set of lines, in the same order, as listed in the documents/SRC-TRG.docs file. These translations were produced by the MT systems trained for this paper.
    • Per-line contents: Translation
    • Line order matches documents/SRC-TRG.docs
  • metric-scores/SRC-TRG/METRICNAME-REF.seg.score: For the given SRC-TRG language pair, metric METRICNAME and the reference set REF (one of refA, refB, or, in the case of referenceless metrics, src) used by the metric, these files contain tab-separated data. The three columns are DOMAIN (always news), SYSNAME (the name of the MT system), and SCORE (the segment-level score). These metric scores were generated by participants in the Metrics shared task.
    • Per-line contents: DOMAIN SYSNAME SCORE
    • For each SYSNAME, the line order matches documents/SRC-TRG.docs
  • mapping/SRC-TRG.tsv: For the given SRC-TRG language pair, this file maps each MT system name (its checkpoint ID) to the letter name used for that system in the paper.
    • Per-line contents: SYSNAME LETTER
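
The file layouts described above can be read with a few lines of Python. The following is a minimal sketch, not part of the repository: the function names are hypothetical, the mapping file is assumed to be tab-separated (as its .tsv extension suggests), and a score written as the literal string None is assumed to mark a missing value, following the MTME convention.

```python
from collections import defaultdict


def read_tsv(path, n_columns):
    """Read a tab-separated file and check that every row has n_columns fields."""
    rows = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # tolerate blank lines, e.g. at the end of the file
            fields = line.rstrip("\n").split("\t")
            if len(fields) != n_columns:
                raise ValueError(
                    f"{path}:{line_number}: expected {n_columns} columns, "
                    f"found {len(fields)}"
                )
            rows.append(fields)
    return rows


def read_docs(path):
    """documents/SRC-TRG.docs -> list of (DOMAIN, DOCUMENT), one per segment."""
    return [tuple(row) for row in read_tsv(path, 2)]


def read_mapping(path):
    """mapping/SRC-TRG.tsv -> {SYSNAME: LETTER} (assumed tab-separated)."""
    return {sysname: letter for sysname, letter in read_tsv(path, 2)}


def read_seg_scores(path):
    """METRICNAME-REF.seg.score -> {SYSNAME: [SCORE, ...]} in file order.

    Assumption: a score of "None" marks a missing value and becomes None.
    """
    scores = defaultdict(list)
    for _domain, sysname, score in read_tsv(path, 3):
        scores[sysname].append(None if score == "None" else float(score))
    return dict(scores)


def check_alignment(scores, docs):
    """Verify each system has exactly one score per line of the .docs file."""
    for sysname, values in scores.items():
        if len(values) != len(docs):
            raise ValueError(
                f"{sysname}: {len(values)} scores for {len(docs)} segments"
            )
```

Because each system's scores follow the line order of documents/SRC-TRG.docs, the i-th score for a system corresponds to the i-th line of that file, and calling check_alignment(read_seg_scores(...), read_docs(...)) is a quick sanity check before joining scores with document IDs or with the letter names from read_mapping.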

Figures

The figs/ directory contains additional figures that were too large to include in the paper. They are identified by source and target language. The diagonal entries show stacked histograms of segment scores across the challenge set (cool colours/bottom) and submitted WMT systems (warm colours/top). The off-diagonal entries are scatterplots where each point is a single segment positioned according to the score assigned to it by row and column metrics; each point is coloured according to the MT system that produced it. Where available, we include MQM scores and restrict the figures to the subset of data that received such annotations.

Licence

The contents of this repository are released under a CC-BY 4.0 licence.

Citing this work

If you choose to use this data, please cite:

@InProceedings{lo-larkin-knowles:2023:WMT,
  author    = {Lo, Chi-kiu  and  Larkin, Samuel  and  Knowles, Rebecca},
  title     = {Metric Score Landscape Challenge (MSLC23): Understanding Metrics' Performance on a Wider Landscape of Translation Quality},
  booktitle      = {Proceedings of the Eighth Conference on Machine Translation},
  month          = {December},
  year           = {2023},
  address        = {Singapore},
  publisher      = {Association for Computational Linguistics},
  pages     = {776--799},
  url       = {https://aclanthology.org/2023.wmt-1.65}
}
