ZurichNLP

Data and code accompanying the paper "As Little as Possible, as Much as Necessary: Detecting Over- and Undertranslations with Contrastive Conditioning" (ACL 2022)

Language: Python

understanding-mbr

Language: Shell, License: MIT

segtest

A Test Suite for Morphological Phenomena in Neural Machine Translation

Language: Shell, License: MIT

BLESS

Code for the EMNLP 2023 paper "BLESS: Benchmarking Large Language Models on Sentence Simplification"

Language: Jupyter Notebook, License: MIT

mbr-sensitivity

Data and code for the paper "Identifying Weaknesses in Machine Translation Metrics Through Minimum Bayes Risk Decoding: A Case Study for COMET"

Language: Python, License: MIT

sdg_swisstext_2024_sharedtask

Repository for the data and evaluation of the 2024 Shared Task on SDG classification held by the Swiss Text Conference.

Language: Python, License: AGPL-3.0

translation-direction-detection

Unsupervised translation direction detection using NMT systems

Language: Python, License: MIT

20Minuten

Language: Jupyter Notebook

acl2020-historical-text-normalization

Code for the ACL 2020 paper "Semi-supervised Contextual Historical Text Normalization" by Peter Makarov and Simon Clematide

Language: Python

contrastive-conditioning

Code and data accompanying the paper "Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias"

Language: Python, License: MIT

distil-lingeval

Data and code accompanying the paper "On the Limits of Minimal Pairs in Contrastive Evaluation"

Language: Python, License: MIT

MultiPivotNMT

The implementation of "Investigating Multi-Pivot Ensembling with Massively Multilingual Machine Translation Models"

Language: Python, License: MIT

recognizing-semantic-differences

Code for the paper "Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents"

Language: Python, License: MIT

specific_hospo_respo

Code for hospitality review response generation

Language: Jupyter Notebook

swiss-german-text-encoders

Code for the paper "Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect"

Language: Python, License: MIT

voting-booklet-bias

Code for the paper "Voting Booklet Bias: Stance Detection in Swiss Federal Communication"

Language: Jupyter Notebook

romanisation-transfer

Code for the paper "On Romanization for Model Transfer Between Scripts in Neural Machine Translation"

Language: Mathematica

understanding-ctx-aug

Code for the ACL Findings 2023 paper "Uncovering Hidden Consequences of Pre-training Objectives in Sequence-to-Sequence Models" (Kew & Sennrich, 2023)

Language: Jupyter Notebook

llm-response-stability

Data and code for the paper "Yes, no, maybe? Revisiting language models' response stability under paraphrasing for the assessment of political leaning"

Language: Python, License: GPL-3.0

SimpleFUDGE

Code for the paper "Target-Level Sentence Simplification as Controlled Paraphrasing" (TSAR 2022)

Language: Jupyter Notebook

simplewiki-data-acquisition

Language: Python

sockeye

Sequence-to-sequence framework with a focus on Neural Machine Translation based on Apache MXNet

Language: Python, License: Apache-2.0

transformers

🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.

Language: Python, License: Apache-2.0

window_audio_segmentation

Code and data for the paper "Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation"

Language: Python, License: MIT