There are 2 repositories under the multilingual-models topic.
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
A Laravel package for multilingual models
A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021.
PaddleOCR inference in PyTorch. Converted from [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian-English. For latest Indic-BERT v2, check: https://github.com/AI4Bharat/IndicBERT
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.
This repository contains the official release of the BanglaBERT model, along with the associated downstream finetuning code and datasets, introduced in the paper "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla", accepted at Findings of NAACL 2022.
Multilingual Generative Pretrained Model
WangChanGLM 🐘 - The Multilingual Instruction-Following Model
LANGBRIDGE: Multilingual Reasoning Without Multilingual Supervision
Do Multilingual Language Models Think Better in English?
Toxic Language Detection in Social Media for Brazilian Portuguese: New Dataset and Multilingual Analysis
Master's thesis with code investigating methods for incorporating long-context reasoning into low-resource languages without pre-training from scratch. We investigated whether multilingual models could inherit these properties by converting them into an Efficient Transformer (such as the Longformer architecture).
Video Search with CLIP
PyTorch implementation of sentiment analysis of long texts written in Serbian (a low-resource language), using the pretrained multilingual RoBERTa-based model (XLM-R) on a small dataset.
The multilingual language model XLM-R fine-tuned for metaphor detection on a token-level using Huggingface
Dataset: Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society
Official Repository for the paper titled "Meta-Learning for Effective Multi-task and Multilingual Modelling" accepted at EACL 2021
NLP deep learning model for multilingual toxicity detection in text 📚
Multilingual Speech-to-Speech (STS) Translator: the first code-mixed English-Arabic to Bangla-Arabic speech translator.
On Bilingual Lexicon Induction with Large Language Models (EMNLP 2023). Keywords: Bilingual Lexicon Induction, Word Translation, Large Language Models, LLMs.
Code for the shared task on homophobia/transphobia detection at LT-EDI Workshop @ ACL 2022
Multilingual-StyleCLIP is a model that can edit StyleGAN2's images with a multilingual text prompt.
Evaluating the Efficacy of Summarization Evaluation across Languages. In Findings of ACL 2021.
Code for "Multilingual Sentiment Elicitation System for Social Media Data" @ IEEE Intelligent Systems
This repository offers an evaluation of machine translation models for healthcare, focusing on languages like Telugu, Hindi, Arabic, and Swahili. It emphasizes accuracy and medical terminology, aiming to enhance medical communication across diverse languages. The dataset used in evaluation is provided.
Code for a master's thesis investigating approaches to building a multilingual, knowledge-grounded dialogue system via cross-task and cross-lingual transfer learning.
BERT classification of Myers-Briggs personality types based on Twitter tweets in four different European languages.
This repo contains the annotations and other artifacts of the paper "In What Languages are Generative Language Models the Most Formal? Analyzing Formality Distribution across Languages".
LLMs for Low Resource Languages in Multilingual, Multimodal and Dialectal Settings
This repository contains a Python script that uses a pre-trained mBART (Multilingual Bidirectional and Auto-Regressive Transformer) model to perform multilingual translation between several languages. The model was trained on multiple language pairs using data parallelism, allowing it to learn representations across all languages simultaneously.