DachengLi1 / LongChat

Official repository for LongChat and LongEval

DachengLi1/LongChat Issues

lmsys/longchat-7b-v1.5-32k is a base model or a aligned model?
Updated 4 months ago
How to prepare the training data
Updated 6 months ago2
Update Anthropic Client
Updated 7 months ago2
Inference is very slow on long text input
Updated 9 months ago1
license
Closed 9 months ago1
How was the 18k dataset prepared?
Closed a year ago3
dummy conversations seem to be short
Closed 10 months ago2
OOM issue
Closed 10 months ago4
Add support for flash attention with use_cache
Updated 10 months ago1
Hi, using xformers monkey patch training llama2 got loss explosion
Updated a year ago
Why the use of flash attention in the inference stage will lead to slower？
Closed a year ago2
flash attention rename
Closed a year ago1
Do you support Llama-2-13b model data？
Updated a year ago
train ValueError
Updated a year ago
flash_attn installed, but got ImportErrorImportError
Closed a year ago4
Output token limit
Updated a year ago
Maybe a bug in the preprocess?
Updated a year ago3
About the print message
Updated a year ago2
About the learning rate
Updated a year ago1
Xformers Monkey Patch Compatibility
Updated a year ago1
Longchat inference configuration
Updated a year ago1
longchat-13b-16k chat not work
Updated a year ago9
Can inference be run on consumer hardware?
Updated a year ago8
OutOfMemoryError: CUDA out of memory.
Updated a year ago5
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
Updated a year ago
Web GUI for longchat
Updated a year ago3
The purpose of pretrain script?
Closed a year ago2
Monkey Patch Xformers use `past_key_value` but `use_cache` can't be `True`?
Closed a year ago9
Support for other model like Baichuan
Updated a year ago
why not reuse fschat code?
Closed a year ago8
Will it support qlora?
Updated a year ago1
Add scripts on querying closed sourced models
Closed a year ago1
Add scripts to generate more testcases
Closed a year ago1
How to use 3090 to train 16k model?
Updated a year ago7
Multi-node training?
Closed a year ago1
Load the model for inference?
Closed a year ago4
unsupervised pre-training on the model
Closed a year ago5