lucidrains / DALLE-pytorch

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in PyTorch


About OpenAI DALL-E

Asthestarsfalll opened this issue

Hi, this is fantastic work!
I'm wondering what the network settings of the original OpenAI DALL-E are, especially for the Transformer it uses.
I can't figure out the meaning of the parameters `stable`, `causal`, `rotary_emb`, and `shared_attn_ids`.
All I know from the paper is that the image transformer uses 64 attention layers, each with 62 attention heads and a per-head state size of 64, which implies a model dimension of 62 × 64 = 3968.
I also notice that you mention sparse attention, but I can't find any details about it in the blog post.
If I want to train a DALL-E matching the original OpenAI version, how should I choose the parameters?
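For concreteness, here is roughly how I'd map the paper's numbers onto this repo's `DALLE` class, going by the README. This is just a sketch: the dVAE settings below are placeholders, and I'm not sure this exact set of kwargs matches the latest version.

```python
from dalle_pytorch import DiscreteVAE, DALLE

# placeholder dVAE -- per the paper, the dVAE maps 256x256 images to a
# 32x32 grid of tokens drawn from an 8192-entry codebook
vae = DiscreteVAE(
    image_size = 256,
    num_layers = 3,
    num_tokens = 8192,
    codebook_dim = 512,
    hidden_dim = 64
)

# paper-scale transformer: 64 layers, 62 heads, 64 dims per head,
# so dim = 62 * 64 = 3968 (~12B parameters, far beyond a single GPU)
dalle = DALLE(
    dim = 3968,
    vae = vae,
    num_text_tokens = 16384,  # the paper's BPE text vocabulary size
    text_seq_len = 256,       # the paper caps captions at 256 BPE tokens
    depth = 64,
    heads = 62,
    dim_head = 64,
    reversible = True,        # trades compute for memory at this depth
    attn_types = ('full', 'axial_row', 'axial_col', 'conv_like')  # the sparse mix from the README
)
```

Is my understanding of those numbers correct?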
And are there any pretrained models or checkpoints for it?
Thanks in advance for your help!