srush/llama2.rs
A fast llama2 decoder in pure Rust.
Stargazers: 1006
Watchers: 11
Issues: 21
Forks: 56
srush/llama2.rs Issues
Support fast GPU processing with Triton (Updated 2 months ago)
Speed comparison (Updated 6 months ago, 2 comments)
The generation speed is superb, but the context is being truncated (Closed 10 months ago, 18 comments)
Exported Models do not load (Closed 10 months ago, 2 comments)
Where is the requirements.export.txt? (Closed 10 months ago, 2 comments)
Tensor has shape torch.Size([448, 1024]) ... this looks incorrect (Updated a year ago, 9 comments)
How to run baby llama? (Updated a year ago, 1 comment)
CodeLlama support (Updated a year ago, 1 comment)
no `TransformerWeights` in `model` (Closed a year ago, 3 comments)
Fabulous! Does it support LLaMA 1 and its derivatives? (Updated a year ago, 2 comments)
Quick Code Review: Auto-vectorization (Updated a year ago, 8 comments)
Why do qzeros need 1 added when unmasked? (Closed a year ago, 2 comments)
Non-mmap'ed weights (Closed a year ago, 2 comments)
License? (Closed a year ago, 1 comment)
README commands don't work (Closed a year ago, 1 comment)
Some llama2 finetunes don't seem to work (Updated a year ago, 2 comments)
Python Versions (Closed a year ago, 4 comments)
Unable to export LLaMa2 model to bin file (Updated a year ago, 1 comment)
Quick review (Updated a year ago, 2 comments)
Minor nitpicks, from a fellow Rust newbie :) (Updated a year ago, 1 comment)
Nice work, some questions (Updated a year ago, 3 comments)