sleeepeer / PoisonedRAG

[USENIX Security 2025] PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation of Large Language Models

Home Page: https://arxiv.org/abs/2402.07867


Question about replicating the code

Zongdanyang opened this issue · comments

commented

Hello, I am very interested in your paper, so I have recently been replicating the code you provided. When I tried to reproduce the results with top_k=50, I encountered the error: RuntimeError: The size of tensor a (4096) must match the size of tensor b (7933) at non-singleton dimension 3. Is there a fixed dimension related to k somewhere, and where should I modify it? Thank you.

Hi, thanks for pointing this out. Could you please specify your experimental setting (e.g., which LLM you used)? The dimension issue seems to come from a local model (Llama-2 or Vicuna).

commented

I used the Llama-2-7b-chat-hf model. Below are my experimental settings:
test_params = {
    # beir_info
    'eval_model_code': "contriever",
    'eval_dataset': "nq",
    'split': "test",
    'query_results_dir': 'main',

    # LLM setting
    'model_name': 'llama7b',
    'use_truth': False,
    'top_k': 50,
    'gpu_id': 1,

    # attack
    'attack_method': 'LM_targeted',
    'adv_per_query': 5,
    'score_function': 'dot',
    'repeat_times': 10,
    'M': 10,
    'seed': 12,

    'note': None
}
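
For reference, a parameter dict like this is usually forwarded to the repo's evaluation script as command-line flags. Below is a minimal sketch of that pattern, assuming a main.py entry point whose flag names match the keys above (both the script name and the flag names are assumptions; check the repo's own runner for the real ones):

import subprocess

def run_experiment(params):
    # Build "--key value" flags from the dict, skipping keys whose value is None.
    # "main.py" is a hypothetical entry point; the repo's actual script may differ.
    cmd = ["python3", "main.py"]
    for key, value in params.items():
        if value is None:
            continue
        cmd += [f"--{key}", str(value)]
    subprocess.run(cmd, check=True)

run_experiment(test_params)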

I replicated your setting and found that Llama-2-7b-chat-hf has a maximum context length of 4096 tokens. When top_k=50, the assembled context can be very long (7933 tokens in your case). Our top_k=50 results in the Appendix (Figures 21-26) were obtained with PaLM-2 (our default model), which is an online API and supports longer context input, so I suggest testing top_k=50 with PaLM-2.

Here is a screenshot from Llama-2-7b's HuggingFace page, which indicates that Llama-2 only supports a 4k context length (i.e., 4096 tokens).
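
For anyone hitting the same error, a quick way to confirm that the prompt is overflowing the context window is to count the prompt tokens before generation. Here is a minimal sketch, assuming a HuggingFace tokenizer and a prompt built by concatenating the question with the top-k retrieved passages (the prompt template and variable names are illustrative, not the repo's exact code):

from transformers import AutoTokenizer

# Llama-2 models accept at most 4096 tokens (max_position_embeddings = 4096).
MAX_CTX = 4096

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

def prompt_token_count(question, retrieved_passages):
    # Illustrative prompt assembly; the repo's actual template may differ.
    prompt = question + "\n\nContexts:\n" + "\n".join(retrieved_passages)
    n_tokens = len(tokenizer(prompt)["input_ids"])
    if n_tokens > MAX_CTX:
        print(f"Prompt is {n_tokens} tokens, over the {MAX_CTX}-token limit: "
              "reduce top_k or switch to a longer-context model.")
    return n_tokens

With top_k=50 the concatenated context reaches roughly 7933 tokens in the setting above, which is exactly the mismatch against 4096 reported in the RuntimeError.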

commented

Thank you for your suggestion!

You're welcome! If you have further questions, feel free to contact me.