There are 5 repositories under the code-switching topic.
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
A curated list of research papers and resources on code-switching
Implementation of meta-transfer-learning for ASR and LM (ACL 2020)
This tool helps automatically generate grammatically valid synthetic code-mixed data by using linguistic theories such as Equivalence Constraint Theory and Matrix Language Theory.
Natural Language Processing
Multilingual Meta-Embeddings for Named Entity Recognition (RepL4NLP & EMNLP 2019)
CodeSwitch is an NLP tool that can be used for language identification, POS tagging, named entity recognition, and sentiment analysis of code-mixed data.
PyTorch implementation of CS-Tacotron, an end-to-end generative TTS model for code-switching speech synthesis.
Code-switching patterns can be an effective route to improve performance of downstream NLP applications: A case study of humour, sarcasm and hate speech detection
Code-Switching Language Modeling using Syntax-Aware Multi-Task Learning (CALCS 2018, ACL)
A sequence tagging model with active learning
Repository containing code for abusive tweet detection, location detection, and gender detection
Code repository for the ACL 2020 paper "Multi-label and Multilingual News Framing Analysis"
Jopara (Guarani-dominant mixed with Spanish) sentiment analysis corpus
[EMNLP 2023] Official repository of the paper titled "Detecting Propaganda Techniques in Code-Switched Social Media Text"
A socket script to obtain the Chinese phone sequence for any English word
Implementation of a deep learning model (BiLSTM) to detect code-switching
Official repository for the paper titled "From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text" accepted at ACL 2021
Chrome extension for translating highlighted English text into Chinglish (a Chinese + English hybrid)
A simple UI to translate text written in romanised Hindi into a fully English sentence
Mixed Speech with Korean and English Dataset
This repository contains crowdsourced universal part-of-speech tags for the Miami Bangor corpus.
Code-Switching Sentence Generation by Generative Adversarial Networks and its Application to Data Augmentation. (Interspeech 2019)
Code-switched data generation based on part-of-speech tags and language modeling of the generated text.
Code-switching analysis using linguistic techniques such as part-of-speech tagging
Language detection in code-switching for es/en/zh speakers
Japanese Speaking English Speech Dataset
Data Analysis Toolkit for On the Margins, LLC
Tweet ids for code-mixed Russian-German and Russian-Hebrew tweets
Kolloqe Input Component with code-switching support between Sinhala and English attachable via <script> tags
Python program for detecting unintentional bilingual and translation instances in NLP datasets.
A Centralized Frenglish Benchmark from Naturally Occurring Code-Switching and Code-Mixing