run-llama / llama_index

LlamaIndex is a data framework for your LLM applications

Home Page: https://docs.llamaindex.ai


[Question]: How to create a multi-document agent using function calling with Bedrock LLMs

noviljohnson opened this issue · comments

Question Validation

  • I have searched both the documentation and discord for an answer.

Question

I want to create multi-document agents using function calling, as shown here: structuredPlannerAgent, but with Bedrock LLMs ("anthropic.claude-v2",
"anthropic.claude-3-sonnet-20240229-v1:0").

However, I am getting an error that these models don't support function calling.
I checked, and Settings.llm.metadata.is_function_calling_model is False.
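
For reference, this is roughly how I set the LLM and checked the flag (a minimal sketch; it assumes the llama-index-llms-bedrock integration package is installed):

from llama_index.core import Settings
from llama_index.llms.bedrock import Bedrock

# point the global Settings at a Bedrock-hosted Claude model
Settings.llm = Bedrock(model="anthropic.claude-3-sonnet-20240229-v1:0")

# the Bedrock class reports no function-calling support
print(Settings.llm.metadata.is_function_calling_model)  # False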

  1. Is it possible to set Settings.llm.metadata.is_function_calling_model to True and use the above-mentioned models for function calling?
  2. Or can you suggest some Bedrock LLMs that support function calling, for use in a StructuredPlannerAgent?
  3. Or is there any other way to use the above-mentioned models to create multi-document agents?

I tried this Multidocagents as well, but with ReActAgent instead of OpenAIAgent,
and the output is not as expected. I can't use the OpenAI API and have to go with Bedrock LLMs.

Basically, I am trying to create the following workflow:
(agents workflow diagram)

Hey @noviljohnson! 😄 Long time no type, huh? I hope you've been doing great since our last chat. 🚀 Let me dive into your query about multi document agents and bedrock llms, and I'll get back to you with some insights in a jiffy!

logan-markewich commented

You need to use the ReActAgentWorker, since the Bedrock LLM class doesn't implement the FunctionCallingLLM interface.

logan-markewich commented
from llama_index.core.agent import (
    StructuredPlannerAgent,
    ReActAgentWorker,
)

# lyft_tool and uber_tool are QueryEngineTools built beforehand,
# as in the LlamaIndex StructuredPlannerAgent example

# create the react agent worker for reasoning
worker = ReActAgentWorker.from_tools(
    [lyft_tool, uber_tool], verbose=True
)

# wrap the worker in the top-level planner
agent = StructuredPlannerAgent(
    worker, tools=[lyft_tool, uber_tool], verbose=True
)
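
Note that the global LLM should point at Bedrock before the worker and agent above are constructed. A hedged usage sketch (model name as given in the question):

from llama_index.core import Settings
from llama_index.llms.bedrock import Bedrock

# set the Bedrock model as the default *before* building the
# worker/agent above; ReAct needs only prompting, not function calling
Settings.llm = Bedrock(model="anthropic.claude-3-sonnet-20240229-v1:0")

# the planner decomposes the task into sub-tasks and the ReAct
# worker executes each one against the tools
response = agent.chat("Compare Lyft's and Uber's 2021 revenue growth.")
print(str(response))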

To enable function calling for "anthropic.claude-v2" and "anthropic.claude-3-sonnet-20240229-v1:0", or to find alternative models that support it for a multi-document agent architecture, it's essential to ensure the model is compatible with the function calling API. LlamaIndex's FunctionCallingAgent class provides a framework for this, but your specified models must be compatible with it or adapted to fit it.

  1. Model Compatibility: First, verify if your chosen models can integrate with the FunctionCallingAgent. This step is crucial for enabling function calling capabilities.

  2. Adapting Models: If the models do not support function calling out of the box, you may need to adapt them to work with the FunctionCallingAgent framework. This could involve wrapping your models to fit the expected interface for function calling.

  3. Configuring FunctionCallingAgent: Set up the FunctionCallingAgent's parameters to control how your LLM interacts with tools and manages tool calls based on the chat context and user messages (a minimal sketch follows at the end of this comment).

  4. Integration with ReActAgentWorker: The principles of the FunctionCallingAgent can also be adapted to a structured planner setup, though this may require customization to fit the planning and execution flow of ReActAgentWorker.

  5. Error Handling and Parallel Calls Management: Implement logic for error handling and managing parallel tool calls effectively, based on the capabilities of your chosen LLMs and the requirements of your application.

For detailed implementation and further guidance, refer to the LlamaIndex documentation and source code, specifically the sections related to the FunctionCallingAgent class and its integration with agent workers. This approach will ensure a seamless operation of your multi-document agent architecture within the constraints and capabilities of the Bedrock LLMs you wish to use.
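
For reference, a minimal FunctionCallingAgent sketch (illustrative only; it requires an LLM whose metadata reports is_function_calling_model=True, which the Bedrock class discussed here does not):

from llama_index.core.agent import FunctionCallingAgent

# only works with an LLM that implements the FunctionCallingLLM
# interface (e.g. OpenAI); shown here purely for comparison
agent = FunctionCallingAgent.from_tools(
    [lyft_tool, uber_tool],
    verbose=True,
)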


Hi @logan-markewich,
thank you!

It worked with ReActAgentWorker.

But I am getting responses sometimes in English and sometimes in a different language (maybe French/Latin/Spanish 😂).
Could you tell me why this is happening, or whether I should change anything?

This is the output I got (originally in Spanish):

The paper proposes Quantized Low Rank Adapters (QLORA) as an efficient method for fine-tuning quantized large language models (LLMs) for specific tasks. QLORA adds a small number of task-specific parameters to a quantized LLM in the form of low-rank adapters. This allows efficient fine-tuning without retraining or changing the base model. Quantizing the adapters makes inference very fast and memory-efficient compared to full-precision adapters. Experiments show that QLOR

But in the thought/action plan print statements I see English:

Answer: Based on the summary from the reinforcement_learning tool, some key points of the paper on a standardized reinforcement learning framework for AAM contingency management are:

  • It proposes using reinforcement learning and an MDP formulation for real-time decision making and contingency management in AAM.

  • The MDP is integrated into the AAM-Gym simulation framework to enable rapid prototyping and benchmarking of algorithms.

  • This provides a way to evaluate existing systems and enables future research on RL algorithms for AAM contingency management.

  • The paper includes baseline results and metrics using the framework to demonstrate its capabilities.

Thanks 😊

Hi,
after experimenting for some time, I am now getting all responses in English.
But it is taking 1.5 to 2 minutes to return a response. Is this a normal response time?
Is there a way to reduce the response time?

Thank you.

commented

The response time depends on the LLM being used, how many LLM calls are being made, and how much text the LLM is writing.

Typically there is:

  • 1 LLM call to either call a tool or respond
  • 1 tool call (which could involve 1+ LLM calls)
  • 1 LLM call to either respond using the tool output or call another tool
  • and so on
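
If you want to see where the time goes, one option is the LlamaDebugHandler callback, which traces each LLM call (a sketch, assuming the agent built earlier; attach the handler before constructing the agent):

from llama_index.core import Settings
from llama_index.core.callbacks import (
    CallbackManager,
    CBEventType,
    LlamaDebugHandler,
)

# attach a debug handler before building the agent so every
# component inherits it; each LLM event is one Bedrock round trip
debug = LlamaDebugHandler(print_trace_on_end=True)
Settings.callback_manager = CallbackManager([debug])

response = agent.chat("Summarize the QLORA paper.")

# count the LLM calls behind this single response
print(len(debug.get_event_pairs(CBEventType.LLM)), "LLM calls")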