BUET CSE NLP Group (csebuetnlp)

Company: Bangladesh University of Engineering and Technology

Location: Bangladesh

Home Page: csebuetnlp.github.io

BUET CSE NLP Group's repositories

xl-sum

This repository contains the code, data, and models of the paper titled "XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages" published in Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.

banglabert

This repository contains the official release of the model "BanglaBERT" and the associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla", accepted in Findings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: NAACL-2022.

banglanmt

This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16 - November 20, 2020.

BanglaNLG

This repository contains the official release of the model "BanglaT5" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaNLG: Benchmarks and Resources for Evaluating Low-Resource Natural Language Generation in Bangla".

CoDesc

A large parallel dataset of 4.2M Java source code snippets paired with their natural-language descriptions, collected from code search and code summarization studies.

Language: Python | License: MIT | Stargazers: 48 | Issues: 1 | Issues: 0

CrossSum

This repository contains the code, data, and models of the paper titled "CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs" published in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL’23), July 9-14, 2023.

normalizer

This Python module is an easy-to-use port of the text normalization used in the paper "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation". It is intended for normalizing and cleaning Bengali and English text.

banglaparaphrase

This repository contains the code, data, and associated models of the paper titled "BanglaParaphrase: A High-Quality Bangla Paraphrase Dataset", accepted in Proceedings of the Asia-Pacific Chapter of the Association for Computational Linguistics: AACL 2022.

Language: Python | Stargazers: 12 | Issues: 1 | Issues: 0

IllusionVQA

This repository contains the data and code of the paper titled "IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models".

Language: Jupyter Notebook | Stargazers: 3 | Issues: 0 | Issues: 0

TransCoder

Public release of the TransCoder research project: https://arxiv.org/pdf/2006.03511.pdf

Language: Python | License: NOASSERTION | Stargazers: 2 | Issues: 0 | Issues: 0