AkariAsai / self-rag

This repository contains the original implementation of SELF-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.

Home Page: https://selfrag.github.io/


The selfrag_llama2_7b model does not produce output as in the example

leejaehoon1830 opened this issue

I ran the settings provided in the example at https://huggingface.co/selfrag/selfrag_llama2_7b, but every prediction came back empty (the script just printed 'Model prediction: ' followed by nothing). However, when using the model at https://huggingface.co/selfrag/self_rag_critic, the results come out as expected.


from transformers import AutoTokenizer, AutoModelForCausalLM
from vllm import LLM, SamplingParams

# Load the Self-RAG 7B model with vLLM in half precision.
model = LLM("selfrag/selfrag_llama2_7b", download_dir="/gscratch/h2lab/akari/model_cache", dtype="half")
# Greedy decoding; skip_special_tokens=False keeps the reflection tokens visible in the output.
sampling_params = SamplingParams(temperature=0.0, top_p=1.0, max_tokens=100, skip_special_tokens=False)

def format_prompt(input, paragraph=None):
    prompt = "### Instruction:\n{0}\n\n### Response:\n".format(input)
    if paragraph is not None:
        prompt += "[Retrieval]<paragraph>{0}</paragraph>".format(paragraph)
    return prompt

query_1 = "Leave odd one out: twitter, instagram, whatsapp."
query_2 = "Can you tell me the difference between llamas and alpacas?"
queries = [query_1, query_2]

# Generate without a retrieved passage; the model decides whether retrieval is needed.
preds = model.generate([format_prompt(query) for query in queries], sampling_params)
for pred in preds:
    print("Model prediction: {0}".format(pred.outputs[0].text))

Expected result for the first query:
Model prediction: Twitter, Instagram, and WhatsApp are all social media platforms.[No Retrieval]WhatsApp is the odd one out because it is a messaging app, while Twitter and Instagram are primarily used for sharing photos and videos.[Utility:5] (this query doesn't require factual grounding; just skip retrieval and do normal instruction-following generation)
=> But I got a blank result.

Expected result for the second query:
Model prediction: Sure![Retrieval]<paragraph> ... (this query requires factual grounding; call a retriever)
=> But I got a blank result.
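Since the expected outputs hinge on reflection tokens such as [No Retrieval] and [Utility:5], one sanity check (a sketch added here, not from the original thread) is to confirm those tokens actually exist in the tokenizer's vocabulary rather than mapping to the unknown token:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("selfrag/selfrag_llama2_7b")
# Each reflection token should have its own ID; if any of them equals
# tokenizer.unk_token_id, the special tokens were not loaded correctly.
for token in ["[Retrieval]", "[No Retrieval]", "[Relevant]", "[Fully supported]", "[Utility:5]"]:
    print(token, tokenizer.convert_tokens_to_ids(token))
print("unk id:", tokenizer.unk_token_id)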

Generate with a retrieved passage:

# Generate with the retrieved passage appended after the [Retrieval] token.
prompt = format_prompt("Can you tell me the difference between llamas and alpacas?", paragraph="The alpaca (Lama pacos) is a species of South American camelid mammal. It is similar to, and often confused with, the llama. Alpacas are considerably smaller than llamas, and unlike llamas, they were not bred to be working animals, but were bred specifically for their fiber.")
preds = model.generate([prompt], sampling_params)
print([pred.outputs[0].text for pred in preds])

Expected result:
['[Relevant]Alpacas are considerably smaller than llamas, and unlike llamas, they were not bred to be working animals, but were bred specifically for their fiber.[Fully supported][Utility:5]']
=> But I got a blank result.
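One way to isolate whether the blank outputs come from the checkpoint itself or from vLLM (a minimal cross-check sketch, not part of the original report) is to run the same prompt through plain transformers and compare:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("selfrag/selfrag_llama2_7b")
hf_model = AutoModelForCausalLM.from_pretrained("selfrag/selfrag_llama2_7b", torch_dtype=torch.float16, device_map="auto")
inputs = tokenizer(prompt, return_tensors="pt").to(hf_model.device)
output_ids = hf_model.generate(**inputs, max_new_tokens=100, do_sample=False)
# Decode only the newly generated tokens, keeping the special reflection tokens.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False))

If this prints sensible text with reflection tokens, the problem is likely on the vLLM side rather than with the checkpoint.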


I'm curious: what exactly is your question? Maybe I can provide some help.

My problem is similar to #30.

Well, I got <unk>; all of the generated text is <unk>.
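To pin down where the <unk>s come from, a minimal diagnostic sketch (assuming preds is the vLLM output from the script above) is to inspect the raw generated token IDs instead of the detokenized text:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("selfrag/selfrag_llama2_7b")
for pred in preds:
    ids = pred.outputs[0].token_ids  # raw token IDs produced by vLLM
    print(ids)
    print(tokenizer.convert_ids_to_tokens(ids))  # shows which IDs decode to <unk>

If the IDs look valid but decode to <unk>, the vocabulary loaded by vLLM is suspect; if every ID is the unk ID itself, the generation is degenerate.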

Do you mind providing the vLLM version? This isn't directly about Self-RAG, but I've recently encountered similar issues when loading Mixtral models (e.g., the outputs are all blank), and I wonder if this happens due to some vLLM-side issue...
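For reference, a quick way to report the installed version (a minimal sketch, assuming vllm is importable):

import vllm
print(vllm.__version__)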