Adding reasoning about `batchify` in language model example

Question

HallerPatrick opened this issue 2 years ago · comments

📚 Documentation

The comment/documentation of the function `batchify` in `word_language_model/main.py` explains how the sequence is rearranged into columns, and justifies this as "efficient batch processing".
To me it is not clear why that would help. It even confused me the first time I looked at it and tried to debug it. I would love a little more explanation, or a reference where I can read more about it.

Hope I used the right issue template here...

Greetings,
Patrick

Anuvabh Dutt · Answer 1 · Wed May 04 2022 03:12:42 GMT+0800 (China Standard Time)

I found these:

I'm assuming that this is due to the underlying CUDA kernel being better suited to batches that have the sequence dimension first, followed by the batch dimension.
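
To make the column layout concrete, here is a minimal pure-Python sketch of the rearrangement. It is not the code from `word_language_model/main.py`: it uses nested lists instead of tensors, and the function body and the alphabet input are assumptions chosen only for illustration.

```python
def batchify(tokens, bsz):
    """Arrange a flat token sequence into bsz columns (illustrative sketch)."""
    # Number of rows (time steps) that fit evenly into bsz columns.
    nbatch = len(tokens) // bsz
    # Drop the trailing tokens that would not fill a complete row.
    trimmed = tokens[: nbatch * bsz]
    # Cut the sequence into bsz contiguous chunks; each chunk becomes a column.
    chunks = [trimmed[i * nbatch:(i + 1) * nbatch] for i in range(bsz)]
    # Transpose so that row t holds the t-th token of every chunk.
    return [list(row) for row in zip(*chunks)]


# The 26 letters stand in for a token stream; "y" and "z" are trimmed off.
rows = batchify(list("abcdefghijklmnopqrstuvwxyz"), 4)
for row in rows:
    print(" ".join(row))
```

Reading down a column gives a contiguous piece of the original text (`a b c d e f`, `g h i j k l`, and so on), while each row holds one time step for all four columns. So a single forward step can process `bsz` independent streams at once, and the result is laid out as (sequence, batch), which is the ordering the answer above refers to.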