composable-models / llm_multiagent_debate

ICML 2024: Improving Factuality and Reasoning in Language Models through Multiagent Debate


question about comparison to self-consistency

TianHongZXY opened this issue

Hi, very interesting work! I have a question about the comparison to self-consistency. In Table 1 you include Multi-Agent (Majority) as a baseline, which I believe serves as self-consistency. Since it uses 3 agents, it samples a total of 3 solutions, whereas your method uses 3 agents debating for 2 rounds, which means you sample 6 solutions (3 in the first round and 3 in the second). Moreover, during the second round each agent observes all the agents' previous solutions, so its chain-of-thought context is roughly 3 times longer. That extra cost could instead pay for sampling more solutions with self-consistency. I wonder whether this is a fair comparison, or whether you actually sample more than 3 solutions for Multi-Agent (Majority); see the sketch below for the accounting I have in mind. BTW, why did you omit the comparison to self-consistency in Table 2? Thank you~
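
To make the cost accounting concrete, here is a rough sketch of the two procedures as I understand them (this is not your repo's code; `llm` is a hypothetical stand-in for a single forward pass):

```python
from collections import Counter
from typing import Callable

LLMFn = Callable[[str], str]  # one forward pass: prompt -> completion

def self_consistency(llm: LLMFn, question: str, n_samples: int = 3) -> str:
    # n_samples independent completions, then a majority vote -> n_samples calls
    answers = [llm(question) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

def two_round_debate(llm: LLMFn, question: str, n_agents: int = 3) -> list[str]:
    # Round 1: each agent answers independently           -> n_agents calls
    round1 = [llm(question) for _ in range(n_agents)]
    # Round 2: each agent re-answers after reading every round-1 answer,
    # so the prompt is roughly n_agents times longer      -> n_agents more calls
    context = question + "\nOther agents answered:\n" + "\n".join(round1)
    round2 = [llm(context) for _ in range(n_agents)]
    return round2  # 2 * n_agents calls in total (6 for 3 agents, 2 rounds)
```

Under this accounting, a matched-compute self-consistency baseline would sample at least 6 solutions rather than 3.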

Hi -- we're planning to add comparisons with self-consistency using more forward passes soon.

In terms of comparisons with self-consistency in Table 2 -- there we generate bullet-point biographies as answers, where there is no clear way to take a majority vote across different free-form responses, so we omitted the comparison.
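
As a toy illustration of that point (a sketch, not the paper's evaluation code; `majority_vote` is a hypothetical helper), a string-level vote only behaves sensibly when answers are short and directly comparable:

```python
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    # A string-level majority vote is only meaningful when answers are
    # directly comparable, e.g. final numeric answers to a math problem.
    return Counter(answers).most_common(1)[0][0]

# Works for exact-match answers:
print(majority_vote(["42", "42", "41"]))  # -> 42

# For free-form bullet-point biographies, every sample is a distinct string,
# so each answer appears once and the "vote" just returns the first one:
print(majority_vote([
    "- Born in 1975\n- Studied physics",
    "- Born 1975 in Boston\n- Physicist",
    "- A physicist born in the mid-1970s",
]))
```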

Thank you for answering -- I'm hoping to see more details of the experimental setup.