google / sentencepiece

Unsupervised text tokenizer for Neural Network-based text generation.

Evaluate Profile-Guided Optimization (PGO)

zamazan4ik opened this issue

Hi!

I am evaluating how well Profile-Guided Optimization (PGO) applies to different kinds of software; all my results are available in my repo. In my experience, PGO improves performance in many scenarios.

Recently I ran PGO tests on the HuggingFace Tokenizers project; the results are located here. Since the results are quite promising (up to a 20% performance improvement in some scenarios), I think it would be interesting to run the same PGO benchmarks for SentencePiece. As far as I understand from the technical highlights, performance is one of the project's goals.

Has anyone tried to optimize SentencePiece performance with PGO before? If so, could you please share the benchmark results? If not, is there an established methodology, make command, or anything else for running such benchmarks?

Thanks in advance.