mesolitica / malaya

Natural Language Toolkit for Malaysian language, https://malaya.readthedocs.io/

base

huseinzol05 opened this issue

Training with a 2048 context length, 0.15 masking probability, and 3.0 mean span length does not converge; not sure why.

Going to use the default settings instead: 512 context length and the same batch size as nanoT5.
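
For reference, the two configurations can be compared by their span-corruption arithmetic. This is a minimal sketch, assuming the usual T5-style accounting (one sentinel token per masked span plus one closing EOS token) and assuming the context length is counted in raw tokens before corruption. `span_corruption_lengths` is a hypothetical helper written for illustration, not a function from malaya or nanoT5.

```python
def span_corruption_lengths(tokens_length, noise_density=0.15, mean_span_length=3.0):
    """Return (noise_tokens, noise_spans, input_length, target_length).

    Assumes T5-style span corruption: each masked span is replaced by a
    single sentinel in the encoder input, and the decoder target lists each
    sentinel followed by the tokens it replaced. Both sequences end with EOS.
    """
    # Number of tokens selected for masking.
    num_noise_tokens = int(round(tokens_length * noise_density))
    num_nonnoise_tokens = tokens_length - num_noise_tokens
    # Number of contiguous masked spans implied by the mean span length.
    num_noise_spans = int(round(num_noise_tokens / mean_span_length))
    # Encoder input: kept tokens + one sentinel per span + EOS.
    input_length = num_nonnoise_tokens + num_noise_spans + 1
    # Decoder target: masked tokens + one sentinel per span + EOS.
    target_length = num_noise_tokens + num_noise_spans + 1
    return num_noise_tokens, num_noise_spans, input_length, target_length


if __name__ == "__main__":
    # Compare the 2048-token setup with the 512-token default.
    for length in (2048, 512):
        print(length, span_corruption_lengths(length))
```

Under these assumptions, the 2048-token setup produces decoder targets roughly four times longer than the 512-token one, so keeping the same batch size means roughly four times as many tokens per step.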