OpenGVLab / LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Can't find forward_only function

Nieysh opened this issue

I'm trying to use alpaca_finetuning_v1/llama to autoregressively generate text for validation during finetuning. However, alpaca_finetuning_v1/llama/generation.py calls, at line 42:
logits = self.model.forward_only(tokens[:, prev_pos:cur_pos], prev_pos)
and I can't find a forward_only function in alpaca_finetuning_v1/llama/model.py.
Could you please release the forward_only function? Or is there another way to use alpaca_finetuning_v1/llama/model.py to generate text for validation? Loading the trained model through the root-level llama/model.py during training is not convenient.

Thanks for pointing out our bug! Actually, alpaca_finetuning_v1 only supports training on the Alpaca dataset. For inference, you can use the model in our main directory: https://github.com/OpenGVLab/LLaMA-Adapter/blob/main/llama/model.py, in which the forward function is used for inference.

Thanks! But I found that the model.py at https://github.com/OpenGVLab/LLaMA-Adapter/blob/main/llama/model.py is different from the one under alpaca_finetuning_v1/llama/model.py. I tried to use its forward function as the forward_only function so that I could run inference while training, but it failed because alpaca_finetuning_v1/llama/model.py does not seem to handle the situation "when seq_len > 1 and mask is None".

You can add the following function to alpaca_finetuning_v1/llama/model.py:

@torch.inference_mode()
def forward_inference(self, tokens: torch.Tensor, start_pos: int):
    _bsz, seqlen = tokens.shape
    h = self.tok_embeddings(tokens)
    # Slice the rotary-embedding frequencies for the current segment.
    self.freqs_cis = self.freqs_cis.to(h.device)
    freqs_cis = self.freqs_cis[start_pos : start_pos + seqlen]
    # Recover the per-layer adapter prompts from the shared query embedding.
    if self.adapter_len * self.adapter_layer > 0:
        adapter = self.adapter_query.weight.reshape(-1, self.adapter_len, self.params.dim).unsqueeze(1)
    # A causal mask is only needed when more than one token is processed;
    # during incremental decoding (seqlen == 1) no mask is used.
    if seqlen == 1:
        mask = None
    elif start_pos == 0:
        mask = torch.full((1, 1, seqlen, seqlen), float("-inf"), device=tokens.device)
        mask = torch.triu(mask, diagonal=1).type_as(h)
    else:
        raise NotImplementedError()
    for i, layer in enumerate(self.layers):
        # Adapters are attached only to the last self.adapter_layer layers.
        adapter_index = i - (len(self.layers) - self.adapter_layer)
        h = layer(h, start_pos, freqs_cis, mask, adapter[adapter_index].half() if adapter_index >= 0 else None)
    h = self.norm(h)
    # Only the logits of the last position are needed for next-token prediction.
    output = self.output(h[:, -1, :])
    return output.float()

def enable_cache(self):
    for layer in self.layers:
        layer.attention.enable_cache()

def disable_cache(self):
    for layer in self.layers:
        layer.attention.disable_cache()
We will then update the code.
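Note that alpaca_finetuning_v1/llama/generation.py (line 42) calls self.model.forward_only, while the snippet above names the method forward_inference. A minimal sketch to bridge the two, assuming you keep the name above, is a class-level alias so the existing call resolves without renaming anything:

# Sketch only: place this inside the same Transformer class in
# alpaca_finetuning_v1/llama/model.py, right after the definition of
# forward_inference above, so generation.py's self.model.forward_only
# call keeps working.
forward_only = forward_inference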

Thanks! I added it but still get the following error. It seems that the attention module can't handle mask == None:
Traceback (most recent call last):
  File "example_test_infer.py", line 114, in <module>
    fire.Fire(main)
  File "/data1/nieyunshuang/nys_new/miniconda3/envs/navllmsig/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/data1/nieyunshuang/nys_new/miniconda3/envs/navllmsig/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/data1/nieyunshuang/nys_new/miniconda3/envs/navllmsig/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "example_test_infer.py", line 106, in main
    results = generator.generate(prompts, max_gen_len=512, temperature=temperature, top_p=top_p)
  File "/data1/nieyunshuang/nys_new/LLaMA-Adapter/alpaca_finetuning_v1/llama/generation.py", line 42, in generate
    logits = self.model.forward_only(tokens[:, prev_pos:cur_pos], prev_pos)
  File "/data1/nieyunshuang/nys_new/LLaMA-Adapter/alpaca_finetuning_v1/llama/model.py", line 241, in forward_only
    h = layer(h, start_pos, freqs_cis, mask, adapter[adapter_index].half() if adapter_index >= 0 else None)
  File "/data1/nieyunshuang/nys_new/miniconda3/envs/navllmsig/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/data1/nieyunshuang/nys_new/LLaMA-Adapter/alpaca_finetuning_v1/llama/model.py", line 167, in forward
    h = x + self.attention.forward(self.attention_norm(x), start_pos, freqs_cis, mask, adapter)
  File "/data1/nieyunshuang/nys_new/LLaMA-Adapter/alpaca_finetuning_v1/llama/model.py", line 106, in forward
    mask = torch.cat([extra_mask, mask], dim=-1)
TypeError: expected Tensor as element 1 in argument 0, but got NoneType
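
For reference, the TypeError is raised in the cache branch of Attention.forward (alpaca_finetuning_v1/llama/model.py, line 106), where a mask for the cached prefix is concatenated without checking whether mask is None (it is None whenever seqlen == 1). One possible guard, sketched under the assumption that extra_mask covers the cached positions and that the later score computation skips the mask when it is None (as the original LLaMA attention does), is:

# Sketch of a None-safe concatenation around line 106 of
# alpaca_finetuning_v1/llama/model.py; only extra_mask and mask are taken
# from the traceback above, the rest is an assumption about that file.
if mask is not None:
    mask = torch.cat([extra_mask, mask], dim=-1)
# With seqlen == 1 the single query token may attend to every cached key,
# so no causal mask is needed; any later "scores = scores + mask" must
# then also be guarded with "if mask is not None".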