There are 2 repositories under the bengali-nlp topic.
This repository contains the official release of the model "BanglaBERT" and associated downstream finetuning code and datasets introduced in the paper titled "BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla", accepted in Findings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: NAACL-2022.
Fully Configurable RAG Pipeline for Bengali Language RAG Applications. Supports both Local and Huggingface Models, Built with Langchain.
Dataset for identifying potential hate speech (e.g., political, religious, personal, gender abusive, geopolitical, etc.) in the under-resourced Bengali language.
Bengali transformer using transformers
Fine-tuned Llama 3 models for context-based question answering in the Bengali language.
[AAAI 2021] - Simple or Complex? Learning to Predict Readability of Bengali Texts.
A Python package for translating emoji and emoticons into Bengali text for NLP tasks.
Machine Learning approach to Bengali Corpus POS (Parts of Speech) Tagging using BNLP (Bengali Natural Language Processing) Toolkit. This is the Minor Project Presentation at Heritage Institute of Technology under the mentorship of Prof. Sandipan Ganguly.
[EACL 2021] - Unsupervised Abstractive Summarization of Bengali Text Documents.
Bengali News Summarization - BengaliGPT & T5
This repository contains Bengali text visualization using a Word2Vec model. A mini project under the mentorship of Prof. Sandipan Ganguly, HIT-K.
Fine-tune mBart 50 for Bengali Sentence Error Correction
Machine Learning approach to Bengali corpus NER (Named Entity Recognition) using BNLP. A mini project under the mentorship of Prof. Sandipan Ganguly, HIT-K.
Machine Learning approach to Bengali Corpus POS Tagging using BNLTK. This is an experimental project under the mentorship of Prof. Sandipan Ganguly, HIT-K.
Bengali POS Tagging using Indian Corpus through NLTK. A sample test applying POS Tagging under the supervision of Prof. Sandipan Ganguly, HIT-K.
Harnessing large language models over transformer models for detecting Bengali depressive social media text: A comprehensive study
Repository to track state-of-the-art research progress in Bengali natural language processing for the most common tasks
AI project to detect abusive comments in social media.
In this project we have tried to do multi-label hate-speech classification in the Bengali and Hindi languages using fill-mask transformer models.
Tool to generate lists of Bengali words and transcriptions matching given phonological descriptions
Bangla Language Computing Research
💬 Classify Bengali comments as positive or negative. It provides valuable insights for social media to improve satisfaction based on sentiment analysis.
Automatic Subtitle Generation for Bengali Multimedia Using Deep Learning.
This is the project page of the dataset for the binary and multi-label classification of irony in Bengali Tweets.
Bengali Geonames python library
This repository contains code for classifying hate speech in Bengali audio.
Bengali Misogyny Identification with Deep Learning and LIME.
This research is based on Optical Character Recognition (OCR). In this study, a system will be designed to recognise handwritten Bengali numerals. The dataset of numerals will be collected from a cloud-based repository. Data will be preprocessed first and then randomly split into training and testing sets. A Convolutional Neural Network (CNN) will be designed and trained on the training set of handwritten Bengali numerals. After training, the CNN will be evaluated on the testing set for recognition accuracy.