guosyjlu / DS-Agent

Official implementation of "DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning" in ICML'24

Question about the model performance measure

KevinZ-01 opened this issue

Hello, which models take part in the ranking behind the mean rank and best rank reported in the paper? Is it every publicly available model on that dataset? Thanks!

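Hello, the ranking covers the 5 models produced by each baseline. For example, in the development stage the participants are ResearchAgent w/ GPT-3.5, ResearchAgent w/ GPT-4, DS-Agent w/ GPT-3.5, and DS-Agent w/ GPT-4, so 4 agents * 5 repeated runs = 20 models are ranked together.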
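To make the pooling concrete, here is a minimal sketch of how mean rank and best rank could be computed for one task when all 20 models are ranked together. It is not taken from the DS-Agent repository: the function names `rank_models` and `summarize` are hypothetical, the scores are made-up placeholders, and the choices that a higher score is better and that ties are broken by sort order are assumptions.

```python
from statistics import mean


def rank_models(scores, higher_is_better=True):
    """Pool every run of every agent and assign ranks (1 = best).

    scores: dict mapping agent name -> list of per-run scores.
    Returns a dict mapping agent name -> list of ranks, one per run.
    Ties are broken by sort order (an assumption, not the paper's rule).
    """
    pooled = [
        (score, agent, run_idx)
        for agent, runs in scores.items()
        for run_idx, score in enumerate(runs)
    ]
    pooled.sort(key=lambda item: item[0], reverse=higher_is_better)

    ranks = {agent: [0] * len(runs) for agent, runs in scores.items()}
    for position, (_, agent, run_idx) in enumerate(pooled, start=1):
        ranks[agent][run_idx] = position
    return ranks


def summarize(ranks):
    """Mean rank and best (lowest) rank per agent."""
    return {
        agent: {"mean_rank": mean(r), "best_rank": min(r)}
        for agent, r in ranks.items()
    }


# Hypothetical scores for a single task: 4 agents * 5 runs = 20 models.
scores = {
    "ResearchAgent w/ GPT-3.5": [0.60, 0.62, 0.58, 0.61, 0.59],
    "ResearchAgent w/ GPT-4": [0.70, 0.72, 0.68, 0.71, 0.69],
    "DS-Agent w/ GPT-3.5": [0.80, 0.82, 0.78, 0.81, 0.79],
    "DS-Agent w/ GPT-4": [0.90, 0.92, 0.88, 0.91, 0.89],
}

ranks = rank_models(scores)
summary = summarize(ranks)
for agent, stats in summary.items():
    print(agent, stats)
```

Because the ranks are assigned over the pooled list, an agent's best rank is the position of its single strongest run among all 20 models, while its mean rank averages the positions of its 5 runs.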

Thanks!