AI4Bharat / Indic-BERT-v1

Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian-English. For latest Indic-BERT v2, check: https://github.com/AI4Bharat/IndicBERT

Home Page: https://indicnlp.ai4bharat.org

Error during downloading the en-indic dataset

Subhashree-Tripathy opened this issue

[Image: screenshot of the error]
I'm getting the above error while trying to download the en-indic dataset.

I'm getting the same error too. Requesting the authors to kindly resolve this; I'm very excited to try indic-bert. @divkakwani

Hey guys,

Sorry for this.

We have an issue with the GCP bucket links; it will most likely be resolved next week.

@Subhashree-Tripathy Which dataset were you trying to download? Can you paste the link here? I'll check whether we have a backup.

@pranavraikote To use the indicbert model, can you try loading it from Hugging Face:

from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('ai4bharat/indic-bert')
model = AutoModel.from_pretrained('ai4bharat/indic-bert')

I was having an issue with the above link, but I checked again and it's working now:
https://indicnlp.ai4bharat.org/samanantar/#downloads
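Since the link seems to fail only some of the time, retrying the download may be enough to get past it. Here is a minimal sketch using only the Python standard library. The function name `download_with_retry`, its parameters, and the `fetch` hook are my own illustrative choices and not part of any AI4Bharat tooling; pass in whichever archive link you copy from the downloads page.

```python
import time
import urllib.request


def download_with_retry(url, dest, attempts=3, delay=2.0, fetch=None):
    """Download `url` to the local path `dest`, retrying on network errors.

    `fetch` is a hypothetical hook taking (url, dest). It defaults to
    urllib.request.urlretrieve and exists so the retry logic can be
    exercised without network access.
    Returns the number of attempts that were needed.
    """
    fetch = fetch or urllib.request.urlretrieve
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            fetch(url, dest)
            return attempt
        except OSError as err:
            # urllib.error.URLError and HTTPError are subclasses of OSError,
            # so this covers HTTP errors, timeouts and connection resets.
            last_error = err
            if attempt < attempts:
                time.sleep(delay * attempt)  # wait a little longer each time
    raise RuntimeError(f"download failed after {attempts} attempts") from last_error
```

If the download still fails after every attempt, the `RuntimeError` carries the last underlying error as its cause, which is probably more useful to paste here than a screenshot.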