KwaiKEG / KwaiAgents

A generalized information-seeking agent system with Large Language Models (LLMs).

Problem with the ReACT data in the training set (KAgentInstruct)

Emperorizzis opened this issue

Hi! I noticed that you recently open-sourced the KAgentInstruct training data. Thank you for releasing it!

In the ReACT category of the training data, I found the following problem:
the tools defined in the prompt are inconsistent with each other.

Answer the following questions as best you can. You have access to the following tools:

[{'name': 'open_weather_map', 'description': 'A tool for fetching current weather information for a specified location. Input should be a location string (e.g. London,GB).', 'parameters': {'location': {'type': 'string', 'description': 'Location to search.', 'required': True}}}, {'name': 'query_powerbi', 'description': 'A tool for querying a dataset based on a detailed question input. It will try to answer the question using the dataset and, if it cannot, it will prompt for clarification.', 'parameters': {'tool_input': {'type': 'string', 'description': 'Detailed question to search.', 'required': False}}}, {'name': 'submit_file', 'description': 'A tool to submit a file once all steps are complete.', 'parameters': {}}, {'name': 'rewrite_sql', 'description': 'A tool for rewriting an input SQL query.', 'parameters': {'sql': {'type': 'string', 'description': 'The SQL query to rewrite.', 'required': True}}}, {'name': 'get_qa', 'description': 'A tool for answering questions based on an input text. 
Can be used to process text from an image.', 'parameters': {'input': {'type': 'string', 'description': 'The input text.', 'required': True}}}, {'name': 'check_availability', 'description': 'A tool for checking the availability of a property based on its ID.', 'parameters': {'propertyId': {'type': 'string', 'description': 'The ID of the property.', 'required': True}}}, {'name': 'sleep', 'description': 'A tool for making the agent sleep for a specified number of seconds.', 'parameters': {'sleep_time': {'type': 'number', 'description': 'The number of seconds to sleep.', 'required': True}}}, {'name': 'searx_search', 'description': 'A tool for meta-searching, useful for retrieving up-to-date information based on a search query.', 'parameters': {'query': {'type': 'string', 'description': 'The search query.', 'required': True}}}, {'name': 'get_elements', 'description': 'A tool for retrieving URL(s) based on a CSS selector and (optional) attribute(s).', 'parameters': {'selector': {'type': 'string', 'description': 'A CSS selector, such as "*", "div", "p", "a", #id, or .classname.', 'required': True}, 'attributes': {'type': 'array', 'description': 'An optional set of attributes to retrieve for each element.', 'items': {'type': 'string', 'description': 'An attribute to retrieve for each element'}, 'required': False}}}, {'name': 'rent_estimate', 'description': 'A tool for estimating rent for a specified property.', 'parameters': {'property_type': {'type': 'string', 'description': 'The type of the property (SingleFamily, Condo, MultiFamily, Townhouse, or Apartment).', 'required': True}, 'long': {'type': 'number', 'description': 'The longitude of the property.', 'required': True}, 'lat': {'type': 'number', 'description': 'The latitude of the property.', 'required': True}, 'd': {'type': 'number', 'description': 'The diameter in miles.', 'required': True}, 'beds': {'type': 'number', 'description': 'The number of bedrooms in the property.', 'required': True}, 'bath': {'type': 
'number', 'description': 'The number of bathrooms in the property.', 'required': True}, 'sqftMin': {'type': 'number', 'description': 'The minimum square footage of the property.', 'required': True}, 'sqftMax': {'type': 'number', 'description': 'The maximum square footage of the property.', 'required': True}, 'address': {'type': 'string', 'description': 'The address of the property.', 'required': True}}}, {'name': 'no_function', 'description': 'A placeholder function indicating that no appropriate tool exists with the given parameters.', 'parameters': {}}]

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [duckduckgo_search, Wikipedia, Calculator]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: 请给我一些建筑设计方面的创意灵感
Thought:

For example, in the sample above, the tools are defined as:

[{'name': 'open_weather_map', 'description': 'A tool for fetching current weather information for a specified location. Input should be a location string (e.g. London,GB).', 'parameters': {'location': {'type': 'string', 'description': 'Location to search.', 'required': True}}}, {'name': 'query_powerbi', 'description': 'A tool for querying a dataset based on a detailed question input. It will try to answer the question using the dataset and, if it cannot, it will prompt for clarification.', 'parameters': {'tool_input': {'type': 'string', 'description': 'Detailed question to search.', 'required': False}}}, {'name': 'submit_file', 'description': 'A tool to submit a file once all steps are complete.', 'parameters': {}}, {'name': 'rewrite_sql', 'description': 'A tool for rewriting an input SQL query.', 'parameters': {'sql': {'type': 'string', 'description': 'The SQL query to rewrite.', 'required': True}}}, {'name': 'get_qa', 'description': 'A tool for answering questions based on an input text. 
Can be used to process text from an image.', 'parameters': {'input': {'type': 'string', 'description': 'The input text.', 'required': True}}}, {'name': 'check_availability', 'description': 'A tool for checking the availability of a property based on its ID.', 'parameters': {'propertyId': {'type': 'string', 'description': 'The ID of the property.', 'required': True}}}, {'name': 'sleep', 'description': 'A tool for making the agent sleep for a specified number of seconds.', 'parameters': {'sleep_time': {'type': 'number', 'description': 'The number of seconds to sleep.', 'required': True}}}, {'name': 'searx_search', 'description': 'A tool for meta-searching, useful for retrieving up-to-date information based on a search query.', 'parameters': {'query': {'type': 'string', 'description': 'The search query.', 'required': True}}}, {'name': 'get_elements', 'description': 'A tool for retrieving URL(s) based on a CSS selector and (optional) attribute(s).', 'parameters': {'selector': {'type': 'string', 'description': 'A CSS selector, such as "*", "div", "p", "a", #id, or .classname.', 'required': True}, 'attributes': {'type': 'array', 'description': 'An optional set of attributes to retrieve for each element.', 'items': {'type': 'string', 'description': 'An attribute to retrieve for each element'}, 'required': False}}}, {'name': 'rent_estimate', 'description': 'A tool for estimating rent for a specified property.', 'parameters': {'property_type': {'type': 'string', 'description': 'The type of the property (SingleFamily, Condo, MultiFamily, Townhouse, or Apartment).', 'required': True}, 'long': {'type': 'number', 'description': 'The longitude of the property.', 'required': True}, 'lat': {'type': 'number', 'description': 'The latitude of the property.', 'required': True}, 'd': {'type': 'number', 'description': 'The diameter in miles.', 'required': True}, 'beds': {'type': 'number', 'description': 'The number of bedrooms in the property.', 'required': True}, 'bath': {'type': 
'number', 'description': 'The number of bathrooms in the property.', 'required': True}, 'sqftMin': {'type': 'number', 'description': 'The minimum square footage of the property.', 'required': True}, 'sqftMax': {'type': 'number', 'description': 'The maximum square footage of the property.', 'required': True}, 'address': {'type': 'string', 'description': 'The address of the property.', 'required': True}}}, {'name': 'no_function', 'description': 'A placeholder function indicating that no appropriate tool exists with the given parameters.', 'parameters': {}}]

Yet the format instructions in the same prompt say:

... ...
Action: the action to take, should be one of [duckduckgo_search, Wikipedia, Calculator]
... ...

This seems to be fairly common across the ReACT samples, not an isolated case.
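A minimal sketch of how such samples could be flagged, assuming every ReACT prompt follows the same layout as the sample quoted in this issue: a Python-style list of tool dicts, followed by an `Action:` line containing `should be one of [...]`. The function names here are hypothetical and are not part of KwaiAgents or KAgentInstruct.

```python
import re

# Matches the bracketed tool names in the "Action:" format line.
ACTION_LIST = re.compile(r"should be one of \[([^\]]*)\]")
# Matches the 'name' field that opens each tool dict in the tool list.
TOOL_NAME = re.compile(r"\{'name': '([^']+)'")


def declared_tool_names(prompt):
    """Return the set of tool names defined in the prompt's tool list."""
    return set(TOOL_NAME.findall(prompt))


def action_tool_names(prompt):
    """Return the set of tool names listed in the 'should be one of' line."""
    match = ACTION_LIST.search(prompt)
    if match is None:
        return set()
    return {name.strip() for name in match.group(1).split(",") if name.strip()}


def tools_mismatch(prompt):
    """True when the defined tools and the allowed actions differ."""
    return declared_tool_names(prompt) != action_tool_names(prompt)


# Shortened stand-in for the sample prompt quoted in this issue.
sample = (
    "You have access to the following tools:\n\n"
    "[{'name': 'open_weather_map', 'description': 'x', 'parameters': {}}, "
    "{'name': 'searx_search', 'description': 'y', 'parameters': {}}]\n\n"
    "Action: the action to take, should be one of "
    "[duckduckgo_search, Wikipedia, Calculator]\n"
)

print(tools_mismatch(sample))
```

Running `tools_mismatch` over every ReACT prompt in the dataset and counting the `True` results would give a rough measure of how widespread the inconsistency is. The regular expressions are tied to the quoting style seen in this one sample, so prompts serialized differently (for example as JSON with double quotes) would need adjusted patterns.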

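Would it be possible to upload a corrected version of the training data? Thanks again for open-sourcing this, and I look forward to your reply!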

commented

Thanks for catching this. The latest data has been uploaded: https://huggingface.co/datasets/kwaikeg/KAgentInstruct