THUDM / SwissArmyTransformer

SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.

Home Page: https://THUDM.github.io/SwissArmyTransformer

Streaming datasets are not supported

af-74413592 opened this issue · comments

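visualglm only provides FewshotData, and loading the data directly into memory runs out of memory. I changed the loading code to
large_dataset_streamed = load_dataset("json", data_files=path, split="train", streaming=True)
dataset = large_dataset_streamed.map(datapreprocess)
and found that streaming datasets are not supported either.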

Streaming is supported. You only need to pass the --iterable-dataset argument in the training script.
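
For readers unfamiliar with the distinction, here is a minimal sketch of what a streamed (iterable) dataset does compared with one loaded fully into memory. It uses only the Python standard library and is not SwissArmyTransformer's implementation of --iterable-dataset. The JSON Lines file layout, the `prompt` field, the `stream_jsonl` helper, and the body of `datapreprocess` are all assumptions made for illustration.

```python
import json
import os
import tempfile
from typing import Callable, Iterator


def stream_jsonl(path: str, preprocess: Callable[[dict], dict]) -> Iterator[dict]:
    """Yield preprocessed records one at a time instead of building a list."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield preprocess(json.loads(line))


def datapreprocess(example: dict) -> dict:
    # Hypothetical stand-in for the preprocessing function named in the question.
    return {**example, "prompt_len": len(example["prompt"])}


# Write a tiny JSON Lines file so the sketch runs end to end.
with tempfile.NamedTemporaryFile(
    "w", suffix=".jsonl", delete=False, encoding="utf-8"
) as tmp:
    for prompt in ["describe the image", "what is shown?"]:
        tmp.write(json.dumps({"prompt": prompt}) + "\n")
    path = tmp.name

records = stream_jsonl(path, datapreprocess)  # nothing has been read yet
first = next(records)                         # reads and processes one line only
rest = list(records)                          # consumes the remaining lines
os.remove(path)
```

Because records are produced lazily, memory use depends on one record at a time rather than on the size of the file. The trade-off is that a streamed dataset has no length and no random access, which is why a training loop has to be told to treat it as iterable.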

Thanks.