zhuzilin/ring-flash-attention
Ring attention implementation with flash attention
Stargazers: 621 · Watchers: 12 · Issues: 38 · Forks: 52
zhuzilin/ring-flash-attention Issues
ring flash attention with BPT · Closed 13 days ago · 3 comments
[Feature Request] Support `window_size` · Updated a month ago · 4 comments
Precision issue · Closed 2 months ago · 3 comments
Compatibility Breakage Due to Flash Attention API Update (v2.7.0) · Updated a month ago
Perf on 8*H800 · Closed 2 months ago · 1 comment
Does ring-attn not support dropout? · Closed 2 months ago · 3 comments
How ring attention is implemented · Closed 2 months ago · 12 comments
flash attention version · Closed 2 months ago · 11 comments
Got error in ZigZagRingFlashAttnVarlenFunc · Closed 2 months ago · 4 comments
Numerical errors in backward · Updated 2 months ago · 4 comments
verify causal masking · Updated 3 months ago · 6 comments
Does it support causal attention? · Closed 10 months ago · 1 comment
Error when increasing the sequence length · Updated 3 months ago
Is it necessary to update the global maximum? · Closed 8 months ago · 2 comments
Will llama3_flash_attn suffer from the imbalance issue, too? · Closed 3 months ago · 2 comments
[Question] For the varlen version of ring attention, is there a way to lift the restriction that every subsequence must be divisible by the degree? · Closed 4 months ago · 2 comments
Benchmark Question · Closed 4 months ago · 2 comments
Error when running the code · Updated 4 months ago · 4 comments
[Feature Request] Balancing computation with zigzag blocking · Closed 10 months ago · 3 comments
Multi-node training speed issue · Updated 5 months ago · 3 comments
What is this piece of ring attention code computing? · Closed 6 months ago
mask to zigzag attention · Closed 6 months ago · 1 comment
Bugs when using zigzag_ring_flash_attn: RuntimeError: Number of requests do not match number of collectives · Updated 7 months ago
Multi-GPU qkv dimension issue · Closed 7 months ago · 2 comments
Thanks for your great work and here are my test results! · Closed 8 months ago · 8 comments
Is there a ring_flash_attn_func test example? · Closed 8 months ago · 4 comments
How is ring attention applied during autoregressive decoding (decoding tokens one by one)? · Closed 8 months ago · 1 comment
stripe_flash_attn_varlen_func · Closed 8 months ago · 1 comment
large memory usage · Updated 9 months ago · 5 comments
test on 8*A800 · Closed 9 months ago · 3 comments
Question about TP and the final aggregation of block-wise results · Closed 10 months ago · 1 comment
Question about updating lse · Closed 10 months ago · 2 comments
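Several issues in this list ("Question about updating lse", "Numerical errors in backward", and the question about updating the global maximum) concern how ring attention combines partial softmax results as key/value blocks rotate around the ring. A minimal NumPy sketch of the standard log-sum-exp merge used by such implementations; this is an illustration of the math only, not this repo's code, and the function names are made up:

```python
import numpy as np

def merge_blocks(out1, lse1, out2, lse2):
    # Merge two partial attention outputs using their log-sum-exp stats.
    # out_i: (seq, dim) block-local attention outputs
    # lse_i: (seq,) log-sum-exp of that block's attention scores
    lse = np.logaddexp(lse1, lse2)        # combined softmax normalizer
    w1 = np.exp(lse1 - lse)[:, None]      # rescale factor for block 1
    w2 = np.exp(lse2 - lse)[:, None]      # rescale factor for block 2
    return w1 * out1 + w2 * out2, lse

def attn(q, k, v):
    # Plain (non-causal) softmax attention, returning output and lse.
    s = q @ k.T
    lse = np.log(np.exp(s).sum(-1))
    return np.exp(s - lse[:, None]) @ v, lse

# Check: merging two blocks equals attention over the concatenated blocks.
rng = np.random.default_rng(0)
q = rng.standard_normal((4, 8))
k1, v1 = rng.standard_normal((6, 8)), rng.standard_normal((6, 8))
k2, v2 = rng.standard_normal((6, 8)), rng.standard_normal((6, 8))

o1, l1 = attn(q, k1, v1)
o2, l2 = attn(q, k2, v2)
merged, _ = merge_blocks(o1, l1, o2, l2)
full, _ = attn(q, np.concatenate([k1, k2]), np.concatenate([v1, v2]))
assert np.allclose(merged, full)
```

Because the merge is expressed through the log-sum-exp rather than raw exponentials, it stays numerically stable without tracking a separate running maximum, which is the point raised in the "global maximum" issue above.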
Ring attention performance on 4x A100 is underwhelming · Closed 10 months ago · 1 comment
ring attention with varlen · Closed 10 months ago · 2 comments
Is there a FlashAttnVarlenFunc version? · Closed 10 months ago · 5 comments
Great work · Updated 10 months ago