google / maxtext

A simple, performant and scalable Jax LLM!

`hf_access_token` only effective for loading gated datasets, not gated tokenizers

jmschndev opened this issue

It seems `hf_access_token` only takes effect when loading a gated dataset, not when loading a gated tokenizer.

One can work around this by modifying `preprocess_dataset()` in `_hf_data_processing.py` to explicitly call `huggingface_hub.login(token=config.hf_access_token)` if a token is present.
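A minimal sketch of that workaround, in case it helps. The helper name `maybe_hf_login` and its `login_fn` parameter are hypothetical and not part of MaxText; I'm also assuming an unset token shows up as an empty string or `None` on the config. Where exactly the call belongs inside `preprocess_dataset()` is not shown here, only that it has to run before the tokenizer is loaded.

```python
from types import SimpleNamespace
from typing import Callable, Optional


def maybe_hf_login(config, login_fn: Optional[Callable[..., None]] = None) -> bool:
    """Log in to the Hugging Face Hub if `config` carries a non-empty token.

    Hypothetical helper (not part of MaxText). Returns True if a login
    was attempted, False if no token was present.
    """
    # Assumption: an unset token is "" or None on the config object.
    token = getattr(config, "hf_access_token", None)
    if not token:
        return False
    if login_fn is None:
        # Imported lazily so the helper can be exercised without the package.
        from huggingface_hub import login as login_fn
    # huggingface_hub.login stores the token so later Hub requests
    # (including tokenizer downloads) are authenticated.
    login_fn(token=token)
    return True


# Dry run with a stub in place of the real login, so nothing hits the network.
seen_tokens = []
demo_config = SimpleNamespace(hf_access_token="hf_dummy_token")
attempted = maybe_hf_login(demo_config, login_fn=lambda token: seen_tokens.append(token))
```

Inside `preprocess_dataset()` this would be called as `maybe_hf_login(config)` before any tokenizer is loaded, which falls back to the real `huggingface_hub.login`.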