microsoft/DeBERTa
The implementation of DeBERTa
Stargazers: 1874 · Watchers: 42 · Issues: 121 · Forks: 211
microsoft/DeBERTa Issues
- Fine-tune DeBERTa v3 language model, worthwhile endeavour? (updated 14 days ago · 5 comments)
- Generator weights (updated a month ago)
- Deberta-v3-base Generator model (updated a month ago · 2 comments)
- How can I evaluate COPA dataset? (updated 4 months ago)
- Reason for missing values in table for the Roberta-base, mrpc entry (updated 5 months ago)
- Evaluation hangs for distributed MLM task (updated 5 months ago · 7 comments)
- No assert: Training does not start when using different tokenizer/ tokenized-data (updated 5 months ago)
- Inference gives different results when using multiple gpus (distributed mode) vs just one gpu (not distributed mode) (updated 5 months ago)
- Model is not initialized correctly when path to a pretrained model is provided via `pre_trained` (updated 5 months ago)
- Question regarding symmetric KL Loss (updated 6 months ago)
- EOF error while running the rtd.sh script (updated 7 months ago · 1 comment)
- Load deberta-v3-large but got deberta-v2 model (updated 9 months ago · 2 comments)
- out of memory (updated 9 months ago · 18 comments)
- Trying to initialize model "large" (updated 10 months ago)
- How to pretrain mDeBERTa ? (updated 10 months ago · 24 comments)
- Trying to run rtd_task.py on Windows (updated 10 months ago · 1 comment)
- Eligibility for Commercial Use (closed a year ago · 1 comment)
- When calculating Qr, why is the W of content used instead of the W of position used? (updated a year ago)
- Error when running the example code for pretraining the rtd model. (updated a year ago · 15 comments)
- Install fails due to use of deprecated `sklearn` package (updated a year ago)
- AssertionError: RTD is not registed. (closed a year ago · 1 comment)
- n/a (closed a year ago)
- No module named 'torch._six' (closed a year ago · 2 comments)
- mDeBERTa Generator model (closed a year ago · 3 comments)
- effectiveness of RTD (updated a year ago)
- Info on Deberta-v2-xlarge training infra (updated a year ago)
- Microsoft (updated a year ago)
- This model for MLM is waste of time, why did you even made it if it cannot be used? (closed a year ago · 6 comments)
- How to pretrain DeBERTa v3 ?? (closed a year ago · 2 comments)
- Where is the Gradient-Disentangled Embedding Sharing(GDES) part in the code? (closed a year ago · 3 comments)
- Code about deberta_v3 (closed a year ago · 1 comment)
- which version is torch ? (closed a year ago)
- Generator Model (closed a year ago · 1 comment)
- Convert DeBERTa model to ONNX with mixed precision (updated a year ago)
- why vocab.txt and tokenizer.json not in pretrained model in huggingface ?? (updated a year ago · 1 comment)
- AssertionError: [] in google coab (updated 2 years ago)
- Can you upload the code finetuned in SQuad 2.0? Thank you very much. (updated 2 years ago)
- mDeBERTa large (updated 2 years ago)
- Can you tell me which token represents the overall representation of the sentence in the task of feature-extraction? The first token or the last token? (updated 2 years ago)
- Can't run bash commands in /DeBERTa/experiments/glue/ (closed 2 years ago)
- DeBERTaV3 small & xsmall pre-training configuration? (updated 2 years ago · 2 comments)
- Why does the size of DeBERTaV3 double on disk after finetuning? (closed 2 years ago · 2 comments)
- Embedding layer vocab size not match to tokenizer length (updated 2 years ago · 1 comment)
- where is ENHANCED MASK DECODER ACCOUNTS part in code? (closed 2 years ago · 1 comment)
- DeXLNeta (updated 2 years ago)
- Pre-training times: v2 vs. v3 (updated 2 years ago · 1 comment)
- AttributeError: 'DebertaV2Tokenizer' object has no attribute 'get_vocab_size' (updated 2 years ago)
- How to use this model for MLM task? (updated 2 years ago)
- Release source distribution through PyPI or GitHub releases (updated 2 years ago · 1 comment)
- Pretrained RTD Model ? (updated 2 years ago)