Khuyagbaatar Batsuren (kbatsuren)

Company: National University of Mongolia

Twitter: @khuyagbaatar_b

Khuyagbaatar Batsuren's starred repositories

Motliere

A Python package to segment French words into morphological subwords.

Language: Python | Stargazers: 4 | Issues: 0

zett

Code for Zero-Shot Tokenizer Transfer

Language: Python | Stargazers: 105 | Issues: 0

lang-khk

Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Halh Mongolian language

Language: Text | License: NOASSERTION | Stargazers: 3 | Issues: 0

pyfoma

Python Finite-State Toolkit

Language: Python | License: Apache-2.0 | Stargazers: 37 | Issues: 0

InfiniTransformer

Unofficial PyTorch/🤗Transformers(Gemma/Llama3) implementation of Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Language: Python | License: MIT | Stargazers: 326 | Issues: 0

Language: Python | Stargazers: 1 | Issues: 0

lexd

A lexicon compiler for non-suffixational morphologies

Language: C++ | License: GPL-3.0 | Stargazers: 11 | Issues: 0

wtpsplit

Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.

Language: Python | License: MIT | Stargazers: 656 | Issues: 0

ltg-bert

LTG-Bert

Language: Python | License: GPL-3.0 | Stargazers: 25 | Issues: 0

Language: Python | Stargazers: 18 | Issues: 0

ml-tutorials

Materials for use in the "Machine Learning" online meeting

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 4 | Issues: 0

albert-mongolian

ALBERT trained on Mongolian text corpus

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 18 | Issues: 0

SWOWEN-2018

English Small World of Words SWOWEN-2018

Language: R | Stargazers: 64 | Issues: 0

genbench_cbt_2023

The official Genbench Collaborative Benchmarking Task repository 2023 (Archived)

Language: Python | License: NOASSERTION | Stargazers: 14 | Issues: 0

length-generalization

Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023

Language: Python | License: MIT | Stargazers: 121 | Issues: 0

baseline-pretraining

Code for pre-training BabyLM baseline models.

Language: Python | Stargazers: 12 | Issues: 0

Language: Scilab | Stargazers: 10 | Issues: 0

2023glossingST

A repo for the 2023 SIGMORPHON glossing shared task

Language: Python | Stargazers: 9 | Issues: 0

WAX

The repository describing a novel dataset for word association explanations

Language: Python | Stargazers: 10 | Issues: 0

kuzdra

A nonce sentence translated into 100 languages

Stargazers: 3 | Issues: 0

norare-cldf

CLDF dataset of norare-data

Language: TeX | License: CC-BY-4.0 | Stargazers: 1 | Issues: 0

bilingual-abstracts-corpus

Bilingual corpus of scientific abstracts from ÚFAL Charles University publications.

Language: Python | Stargazers: 1 | Issues: 0

Language: Shell | Stargazers: 1 | Issues: 0

Negative-Precedent-in-Legal-Outcome-Prediction

Code used in the paper "On the Role of Negative Precedent in Legal Outcome Prediction"

Language: Python | License: MIT | Stargazers: 6 | Issues: 0

superbizarre

Code and data for "Superbizarre Is Not Superb: Derivational Morphology Improves BERT's Interpretation of Complex Words"

Language: Python | Stargazers: 15 | Issues: 0

text_characterization_toolkit

A library for computing diverse text characteristics and using them to analyze data sets and models with ease.

Language: Python | License: MIT | Stargazers: 39 | Issues: 0

2022SegmentationST

SIGMORPHON 2022 Shared Task on Morpheme Segmentation

Language: Python | Stargazers: 1 | Issues: 0

morphoeval

Evaluation for unsupervised morphological analysis and segmentation

Language: Python | License: MIT | Stargazers: 2 | Issues: 0

Language: Python | Stargazers: 2 | Issues: 0

2022SegmentationST

SIGMORPHON 2022 Shared Task on Morpheme Segmentation

Language: Jupyter Notebook | Stargazers: 23 | Issues: 0