FasterDecoding / Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

https://sites.google.com/view/medusa-llm

FasterDecoding/Medusa Issues

jinja2.exceptions.UndefinedError: dict object has no element 0
Updated a month ago · 1 comment
Training code is not working
Updated a month ago · 2 comments
[Retraining] Use Liger Kernel to avoid multi-head logits materialization and scale the context length by N times
Updated a month ago · 1 comment
Training Medusa heads
Updated a month ago · 6 comments
Instruct data format
Updated 2 months ago
Are Medusa Heads computed in parallel or serially?
Updated 2 months ago
updated medusa models in huggingface?
Updated 2 months ago
[ISSUE] The Pull Request at https://github.com/FasterDecoding/Medusa/pull/97 from Narsil/medusa2 needs to be rolled back.
Updated 3 months ago
do you support Amd gpu -- rocm ??
Closed 3 months ago
Errors occurred during the environment and training
Closed 3 months ago · 2 comments
Some questions about sampling strategy
Closed a year ago · 3 comments
Using Medusa with Whisper
Updated 4 months ago · 5 comments
Does Medusa support beam search decoding strategy?
Updated 4 months ago
The implementation of stage 2 with axolotl
Updated 4 months ago
PPL compute
Updated 4 months ago
Token-wise the same generalization?
Closed 4 months ago · 2 comments
Containerization with Dockerfile to setup medusa
Updated 5 months ago
Conversation roles must alternate user/assistant/user/assistant/
Updated 5 months ago
How to use the finetuned mistal model for inference with Medusa
Updated 5 months ago · 7 comments
Medusa Training Loss
Updated 5 months ago · 5 comments
[bug] fix preprocess function
Updated 5 months ago
ImportError: cannot import name 'is_flash_attn_available' from 'transformers.utils'
Updated 6 months ago · 1 comment
Is there no way to inference without training?
Updated 6 months ago · 3 comments
Is there a bug in gen_model_answer_baseline.py?
Updated 6 months ago · 1 comment
train medusa stage-2
Updated 6 months ago · 1 comment
mistral.json
Updated 6 months ago
which dataset should i use when training medusa heads with llama2 7b
Updated 6 months ago
Why medusa-2 train llama2 with no such great improvement?
Updated 6 months ago · 2 comments
Cant it support chatgllm?
Updated 6 months ago
HYDRA support?
Updated 7 months ago
Misleading Name LLM Name MEDUSA
Updated 7 months ago
about Medusa mask details
Closed 7 months ago
release medusa-llm v1.0
Closed 7 months ago · 1 comment
[Dynamic Batching] Concerns about whether features are not supported using Medusa
Updated 7 months ago
Encounter an CUDA error when set Medusa head
Updated 7 months ago
Why the speed up of Medusa 1 on vicuna changed?
Closed 8 months ago · 2 comments
deepspeed support
Updated 8 months ago
medusa-2 HF repo has no 'medusa_num_heads' in config
Closed 8 months ago · 1 comment
Medusa 1 and 2 speed up
Closed 8 months ago · 2 comments
OSError
Updated 8 months ago · 3 comments
About changing LLM from LLAMA to LLAMA-2
Closed 8 months ago · 2 comments
AttributeError: 'LlamaForCausalLM' object has no attribute 'medusa_head'
Closed 8 months ago · 2 comments
how did you construct the sparse tree architecture
Closed 8 months ago · 2 comments
Sparse candidate generation confusion
Closed 8 months ago · 6 comments
Question about Heads warmup
Updated 8 months ago · 1 comment
Clarifications on Models + Batch Size
Closed 9 months ago · 5 comments
Can I make an AWQ quantization?
Closed 9 months ago · 1 comment
Results for different configs
Closed a year ago · 8 comments
How to load finetune checkpoint files directly?
Closed a year ago
AttributeError: 'LlamaForCausalLM' object has no attribute 'medusa_head'
Closed a year ago