microsoft/DeBERTa
The implementation of DeBERTa
Stargazers: 1874 · Watchers: 42 · Issues: 121 · Forks: 211
microsoft/DeBERTa Issues
- Fine-tune DeBERTa v3 language model, worthwhile endeavour? (updated 14 days ago · 5 comments)
- Generator weights (updated a month ago)
- Deberta-v3-base Generator model (updated a month ago · 2 comments)
- How can I evaluate COPA dataset? (updated 4 months ago)
- Reason for missing values in table for the Roberta-base, mrpc entry (updated 5 months ago)
- Evaluation hangs for distributed MLM task (updated 5 months ago · 7 comments)
- No assert: Training does not start when using different tokenizer/ tokenized-data (updated 5 months ago)
- Inference gives different results when using multiple gpus (distributed mode) vs just one gpu (not distributed mode) (updated 5 months ago)
- Model is not initialized correctly when path to a pretrained model is provided via `pre_trained` (updated 5 months ago)
- Question regarding symmetric KL Loss (updated 6 months ago)
- EOF error while running the rtd.sh script (updated 7 months ago · 1 comment)
- Load deberta-v3-large but got deberta-v2 model (updated 9 months ago · 2 comments)
- out of memory (updated 9 months ago · 18 comments)
- Trying to initialize model "large" (updated 10 months ago)
- How to pretrain mDeBERTa ? (updated 10 months ago · 24 comments)
- Trying to run rtd_task.py on Windows (updated 10 months ago · 1 comment)
- Eligibility for Commercial Use (closed a year ago · 1 comment)
- When calculating Qr, why is the W of content used instead of the W of position used? (updated a year ago)
- Error when running the example code for pretraining the rtd model. (updated a year ago · 15 comments)
- Install fails due to use of deprecated `sklearn` package (updated a year ago)
- AssertionError: RTD is not registed. (closed a year ago · 1 comment)
- n/a (closed a year ago)
- No module named 'torch._six' (closed a year ago · 2 comments)
- mDeBERTa Generator model (closed a year ago · 3 comments)
- effectiveness of RTD (updated a year ago)
- Info on Deberta-v2-xlarge training infra (updated a year ago)
- Microsoft (updated a year ago)
- This model for MLM is waste of time, why did you even made it if it cannot be used? (closed a year ago · 6 comments)
- How to pretrain DeBERTa v3 ?? (closed a year ago · 2 comments)
- Where is the Gradient-Disentangled Embedding Sharing(GDES) part in the code? (closed a year ago · 3 comments)
- Code about deberta_v3 (closed a year ago · 1 comment)
- which version is torch ? (closed a year ago)
- Generator Model (closed a year ago · 1 comment)
- Convert DeBERTa model to ONNX with mixed precision (updated a year ago)
- why vocab.txt and tokenizer.json not in pretrained model in huggingface ?? (updated a year ago · 1 comment)
- AssertionError: [] in google coab (updated 2 years ago)
- Can you upload the code finetuned in SQuad 2.0? Thank you very much. (updated 2 years ago)
- mDeBERTa large (updated 2 years ago)
- Can you tell me which token represents the overall representation of the sentence in the task of feature-extraction? The first token or the last token? (updated 2 years ago)
- Can't run bash commands in /DeBERTa/experiments/glue/ (closed 2 years ago)
- DeBERTaV3 small & xsmall pre-training configuration? (updated 2 years ago · 2 comments)
- Why does the size of DeBERTaV3 double on disk after finetuning? (closed 2 years ago · 2 comments)
- Embedding layer vocab size not match to tokenizer length (updated 2 years ago · 1 comment)
- where is ENHANCED MASK DECODER ACCOUNTS part in code? (closed 2 years ago · 1 comment)
- DeXLNeta (updated 2 years ago)
- Pre-training times: v2 vs. v3 (updated 2 years ago · 1 comment)
- AttributeError: 'DebertaV2Tokenizer' object has no attribute 'get_vocab_size' (updated 2 years ago)
- How to use this model for MLM task? (updated 2 years ago)
- Release source distribution through PyPI or GitHub releases (updated 2 years ago · 1 comment)
- Pretrained RTD Model ? (updated 2 years ago)