ZurichNLP's repositories
ContraDecode
The implementation of "Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Decoding"
multilingual-instruction-tuning
Code and data for the paper "Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?"
coverage-contrastive-conditioning
Data and code accompanying the paper "As Little as Possible, as Much as Necessary: Detecting Over- and Undertranslations with Contrastive Conditioning" (ACL 2022)
mbr-sensitivity
Data and code for the paper "Identifying Weaknesses in Machine Translation Metrics Through Minimum Bayes Risk Decoding: A Case Study for COMET"
sdg_swisstext_2024_sharedtask
Repository for data and evaluation of the 2024 Shared Task on SDG classification held by the Swiss Text Conference.
translation-direction-detection
Unsupervised translation direction detection using NMT systems
acl2020-historical-text-normalization
Code for the ACL 2020 paper "Semi-supervised Contextual Historical Text Normalization" by Peter Makarov and Simon Clematide
contrastive-conditioning
Code and data accompanying the paper "Contrastive Conditioning for Assessing Disambiguation in MT: A Case Study of Distilled Bias"
distil-lingeval
Data and code accompanying the paper "On the Limits of Minimal Pairs in Contrastive Evaluation"
MultiPivotNMT
The implementation of "Investigating Multi-Pivot Ensembling with Massively Multilingual Machine Translation Models"
recognizing-semantic-differences
Code for the paper "Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents"
specific_hospo_respo
Code for hospitality review response generation
swiss-german-text-encoders
Code for the paper "Modular Adaptation of Multilingual Encoders to Written Swiss German Dialect"
voting-booklet-bias
Code for the paper "Voting Booklet Bias: Stance Detection in Swiss Federal Communication"
romanisation-transfer
Code for the paper "On Romanization for Model Transfer Between Scripts in Neural Machine Translation"
understanding-ctx-aug
Code for the 2023 ACL Findings paper "Uncovering Hidden Consequences of Pre-training Objectives in Sequence-to-Sequence Models" (Kew & Sennrich, 2023)
llm-response-stability
Data and code for the paper "Yes, no, maybe? Revisiting language models' response stability under paraphrasing for the assessment of political leaning"
SimpleFUDGE
Code for the paper "Target-Level Sentence Simplification as Controlled Paraphrasing" (TSAR 2022)
transformers
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.
window_audio_segmentation
Code and data for the paper "Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation"