ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models

Home Page:http://ludwig.ai

Remove target_module hardcoding for Mixtral model

arnavgarg1 opened this issue · comments

PEFT 0.7.1 doesn't include Mixtral in its default LoRA target module mapping: https://github.com/huggingface/peft/blob/8665e2b5719faa4e4b91749ddec09442927b53e0/src/peft/utils/constants.py#L49

For now, PR #3852 adds a fallback mechanism that defaults to the q_proj and v_proj linear layers when LoRA is used with either Mixtral or Mixtral Instruct.
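For reference, a minimal sketch of what such a fallback could look like, assuming PEFT's `TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING` constant and a standard `LoraConfig`. The `resolve_lora_target_modules` helper and the `"mixtral"` model_type key are illustrative assumptions, not the exact implementation in #3852:

```python
from typing import List, Optional

from peft import LoraConfig
from peft.utils import TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING

# Fallback target modules for Mixtral until PEFT ships an official mapping.
MIXTRAL_FALLBACK_TARGET_MODULES = ["q_proj", "v_proj"]


def resolve_lora_target_modules(model_type: str) -> Optional[List[str]]:
    """Use PEFT's default mapping when available, otherwise fall back for Mixtral."""
    if model_type in TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING:
        return list(TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING[model_type])
    if model_type == "mixtral":
        # Hardcoded fallback: target the attention query/value projections.
        return MIXTRAL_FALLBACK_TARGET_MODULES
    return None  # let the caller require explicit target_modules


lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=resolve_lora_target_modules("mixtral"),
)
```

Targeting q_proj and v_proj mirrors the defaults PEFT uses for similar decoder-only architectures, which is why they are a reasonable stopgap for Mixtral's attention layers.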

Once PEFT adds official support for Mixtral, we should remove this hardcoding.