AI-Engineer-Foundation / agent-protocol

Common interface for interacting with AI agents. The protocol is tech stack agnostic - you can use it with any framework for building agents.

Home Page: https://agentprotocol.ai

Add an Endpoint to get Agent Info

ntindle opened this issue · comments

Is your feature request related to a problem? Please describe.
I've run into a case where I need to display different details depending on the agent.

Things such as:

  • Name
  • Version

Describe the solution you'd like
I would like a GET endpoint added to the protocol to request information about the agent.

Describe alternatives you've considered
Requiring this information to be queryable outside the agent protocol is possible, but non-linear.

Other fields we discussed:

  • Author
  • Help Text
  • Continue Message: "yes", "continue", or something like that

I believe we've settled on the following format for a route existing at /ap/v1/agent/info:

{
    "name": "My Agent",
    "description": "General purpose agent.",
    "version": "1.0.0",
    "protocol": "1.0",
    "github": "https://github.com/myagent/myagent",
    "url": "https://myagent.com",
    "docs": "https://myagent.com/docs",
    "issues": "https://github.com/myagent/myagent/issues"
}
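
A minimal sketch of how an agent might serve this payload, assuming a plain routing function rather than any particular web framework. The names AGENT_INFO, INFO_ROUTE, and handle_get are hypothetical and not part of the protocol; only the route and the field names come from the format above.

```python
import json

# Hypothetical agent metadata; the field names follow the JSON format above.
AGENT_INFO = {
    "name": "My Agent",
    "description": "General purpose agent.",
    "version": "1.0.0",
    "protocol": "1.0",
    "github": "https://github.com/myagent/myagent",
    "url": "https://myagent.com",
    "docs": "https://myagent.com/docs",
    "issues": "https://github.com/myagent/myagent/issues",
}

INFO_ROUTE = "/ap/v1/agent/info"


def handle_get(path: str) -> tuple[int, str]:
    """Return (HTTP status, JSON body) for a GET request to `path`."""
    if path == INFO_ROUTE:
        return 200, json.dumps(AGENT_INFO)
    return 404, json.dumps({"error": "not found"})
```

The function can be wired into whatever HTTP server the agent already uses; it only decides the status code and body.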

In addition to this schema, we discussed having a config_options array of objects with the properties type, default, description, and options (optional), where type is the type of the input and options is an array of values of that type showing what is available. Clients would use this config_options array as a reference for the config options they send with tasks and steps.

We should also have a "schema_plugin" array for the plugin system.

Here is the AgentInfo object I had been thinking of using for the spec:

    AgentInfo:
      type: object
      properties:
        name:
          description: Name of the agent.
          type: string
          example: My Agent
        description:
          description: Description of the agent.
          type: string
          example: My agent is the best agent.
        version:
          description: Version of the agent.
          type: string
          example: 1.0.0
        protocol_version:
          description: Version of the agent protocol.
          type: string
          example: 1
        github:
          description: GitHub repository of the agent.
          type: string
          example: 'https://github.com/AI-Engineers-Foundation/agent-protocol'
        url:
          description: URL of the agent.
          type: string
          example: 'https://my-agent.com'
        docs:
          description: Link to the documentation of the agent.
          type: string
          example: 'https://my-agent.com/docs'
        issues:
          description: Link to the issues of the agent.
          type: string
          example: 'https://github.com/AI-Engineers-Foundation/agent-protocol/issues'
        config_options:
          description: List of configuration options for the agent's tasks and steps. The config is a user-defined set of key/value pairs where the values are standard but the keys are not.
          type: object
          example: |-
            {
            "debug": {
            "type": "boolean",
            "default": false,
            "description": "Whether to run the agent in debug mode."
            },
            "model": {
            "type": "string",
            "default": "gpt-4",
            "description": "The model in which the agent's tasks should run."
            }
            }
          additionalProperties:
            type: object
            properties:
              type:
                description: The type of the value.
                type: string
                enum:
                  - string
                  - integer
                  - float
                  - boolean
                  - list
                  - dict
              default:
                description: The default value of the config option.
                type: string
              description:
                description: 'A description of the value with type, default value, and description.'
                type: string
              options:
                description: A list of options for the config option.
                type: array
                items:
                  oneOf:
                    - type: string
                    - type: integer
                    - type: number
                    - type: boolean
                    - type: object
                    - type: array
                      items:
                        oneOf:
                          - type: string
                          - type: integer
                          - type: number
                          - type: boolean
                          - type: object
            required:
              - type
              - default
              - description
            example: |-
              {
              "type": "string",
              "default": "gpt-4",
              "description": "Model for the agent's steps to use.",
              "options": ["gpt-4", "gpt-3.5-turbo", "gpt-3.5-turbo-16k"]
              }
            description: 'A description of the value with type, default value, and description.'
      required:
        - name
        - version
        - protocol_version
        - config_options
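
To make the required fields and the config option shape concrete, here is a rough sketch of a check a client could run on a response. It is hand-written against the AgentInfo object above, not generated from it, and the function name validate_agent_info is hypothetical.

```python
# Values copied from the AgentInfo proposal above.
ALLOWED_TYPES = ("string", "integer", "float", "boolean", "list", "dict")
REQUIRED_INFO_FIELDS = ("name", "version", "protocol_version", "config_options")
REQUIRED_OPTION_FIELDS = ("type", "default", "description")


def validate_agent_info(info: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload passed."""
    problems = []
    for field in REQUIRED_INFO_FIELDS:
        if field not in info:
            problems.append(f"missing required field: {field}")

    config_options = info.get("config_options", {})
    if not isinstance(config_options, dict):
        problems.append("config_options must be an object")
        return problems

    # Keys are user-defined; only the shape of each value is checked.
    for key, option in config_options.items():
        if not isinstance(option, dict):
            problems.append(f"{key}: option must be an object")
            continue
        for field in REQUIRED_OPTION_FIELDS:
            if field not in option:
                problems.append(f"{key}: missing {field}")
        if "type" in option and option["type"] not in ALLOWED_TYPES:
            problems.append(f"{key}: unknown type {option['type']!r}")
        if "options" in option and not isinstance(option["options"], list):
            problems.append(f"{key}: options must be a list")
    return problems
```

A real implementation would more likely validate against the OpenAPI schema directly; this version only illustrates which parts are fixed and which are left to the agent.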

I like this proposal and agree with the need to standardize how some subset of Agent metadata is provided. However, I could see a situation where some of the proposed metadata (i.e. agent name, github, issues, url, docs) is considered proprietary or sensitive and gated off to authenticated users. In that case, the Agent implementer would have to either 1) break conformance with the spec by providing less than the full Agent info, or 2) require authentication before the full metadata can be retrieved.

I think there is a happy middle ground where things like version and protocol_version might be supplied (leaving the Agent implementers to associate the rest of the metadata with the specific version released/deployed). I think something like config_options is especially sensitive if this endpoint is always open, and I am not sure I understand the need for a duplicate url property when this endpoint is being served by a deployed Agent.

I also have a concern about the default assumption that github will always be the version control platform/repository of an Agent that conforms to the Agent Protocol, or that issues will be publicly available. I think it's quite likely there will be paid vendor Agents hosted that should ideally conform to the Agent Protocol spec to prevent vendor lock-in but who will not want to conform if the spec is too prescriptive about implementation details that may not make sense for them.

Perhaps it would be worth discussing a limit to the protocol's provided top-level info of the below?

  1. agent version (required)
  2. protocol version (required)
  3. name (optional)
  4. description (optional)
  5. additional properties (optional)
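
As a sketch of what that reduced shape could look like in code, assuming the snake_case field names used elsewhere in this thread: the class name MinimalAgentInfo and its to_dict method are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class MinimalAgentInfo:
    # Required by the list above.
    version: str
    protocol_version: str
    # Optional; omitted from the output when unset.
    name: Optional[str] = None
    description: Optional[str] = None
    # Free-form extras such as a repository or docs link.
    additional_properties: dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict[str, Any]:
        # Start from the extras so the core fields always win on a key clash.
        out: dict[str, Any] = dict(self.additional_properties)
        out["version"] = self.version
        out["protocol_version"] = self.protocol_version
        if self.name is not None:
            out["name"] = self.name
        if self.description is not None:
            out["description"] = self.description
        return out
```

A gated vendor agent could return only the two required fields, while an open source agent could put its repository and issue links in the additional properties.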

I derived those thinking about a universally applicable protocol relevant to all of the below, and believe that a protocol that is follow-able will be one that supports all of these:

  • open source hosted Agents
  • paid/vendor-hosted Agents
  • unauthenticated requesters of Agents
  • authenticated requesters of Agents
  • kubernetes/distributed systems-level requesters of Agents
  • github/gitlab/bitbucket/etc repository-using builders of Agents

The proposed endpoint /ap/v1/agent/info with the following body makes sense:

{
  "name": "My Agent",
  "description": "General purpose agent.",
  "version": "1.0.0",
  "protocol": "1.0",
  "git": "https://myselfhostedgit.com/myagent/myagent",
  "url": "https://myagent.com",
  "docs": "https://myagent.com/docs",
  "issues": "https://github.com/myagent/myagent/issues"
}

+1 on not assuming that github is the only place where the code can exist.

There is also a proposal to introduce the plugin system under the info endpoint. However, I would argue that the protocol does not need to be aware of plugins. The dependency should only go one way: plugins depend on the protocol, and the protocol does not depend on plugins.

In any case, no plugin scenarios have been raised in the plugin issue thread except for auth.

Auth

I would propose to support auth in the core agent protocol.

Let’s consider some common authorization methods:

  • None - well none, client does not pass anything in the Authorization header.
  • Basic - client sends HTTP requests with an Authorization header that contains the word Basic followed by a space and a base64-encoded string username:password.
  • Bearer Token (or Token Authentication) - client sends an HTTP request with an Authorization header containing the word Bearer followed by a space and a token.
  • JWT - client sends an HTTP request with an Authorization header containing the word Bearer followed by a space and a JWT token.
  • OAuth2.0 - client sends an HTTP request with an Authorization header containing the word Bearer followed by a space and an OAuth token.
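
The header formats listed above can be sketched as follows. The function name authorization_header and the method labels ("basic", "bearer_token", "jwt", "oauth2") are assumptions for illustration, not something the protocol defines.

```python
import base64
from typing import Optional


def authorization_header(
    method: Optional[str],
    *,
    username: str = "",
    password: str = "",
    token: str = "",
) -> dict[str, str]:
    """Build the HTTP headers for one of the auth methods listed above."""
    if method is None:
        # No auth: the client sends no Authorization header at all.
        return {}
    if method == "basic":
        # "Basic" + space + base64("username:password")
        credentials = f"{username}:{password}".encode("utf-8")
        encoded = base64.b64encode(credentials).decode("ascii")
        return {"Authorization": f"Basic {encoded}"}
    if method in ("bearer_token", "jwt", "oauth2"):
        # All three travel the same way: "Bearer" + space + token.
        return {"Authorization": f"Bearer {token}"}
    raise ValueError(f"unsupported auth method: {method!r}")
```

Note that the three token-based methods are indistinguishable on the wire; they differ only in how the token is issued and verified.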

In the real world there are alternative ways to send the authorization token. The two alternatives to the header that I see are a query parameter (?auth_token=mock_token) and the payload of a POST request. As Agent Protocol already has both GET and POST requests, supporting these would introduce unnecessary complexity, so I would conclude that the most logical path forward is to send the authorization in the header.

By including the Authorization header in the Agent Protocol we could support all (?) of the common authorization methods mentioned above. In addition, as this is an optional field, it can already be added on the agent implementation side without breaking anything.

What is needed would be a way for the agent to say that the Authorization is needed. I would propose using the config_options for this:

{
  "config_options": {
    "auth": {
      "type": "string",
      "default": null,
      "description": "The authentication method to use.",
      "options": [
        null,
        "basic",
        "bearer_token",
        "oauth2",
        "jwt"
      ]
    }
  }
}

For example if the agent is using JWT then the /info endpoint would return:

{
  "version": "1.0.0",
  "protocol": "1.0",
  "config_options": {
    "auth": "jwt"
  }
}
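
On the client side, reading the advertised method could look roughly like this. The helper name advertised_auth is hypothetical, and accepting both the plain string form and the full option object (falling back to its default) is an assumption, since the exact response shape is still under discussion here.

```python
from typing import Optional


def advertised_auth(info: dict) -> Optional[str]:
    """Return the auth method named in an /info response, or None if absent."""
    config_options = info.get("config_options") or {}
    auth = config_options.get("auth")
    if isinstance(auth, dict):
        # Full option object: fall back to its declared default.
        return auth.get("default")
    return auth
```

A client would call this once after fetching /info and then decide whether it needs to obtain credentials before creating tasks.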

Plugin

What we at https://agentwallet.ai/ are building is essentially a payment plugin. The most convenient way to pay the Agent is through our platform. There would be no need for any protocol or agent code changes. The payment plugin would sit in front of the agent protocol. This is my understanding of how plugins should work: “adds extra functionality to the existing software without modifying the existing software”

Proposed solution to include Auth into Agent Protocol: #80

Really like this perspective.

Discussed with the community:

  • we are giving a "Green Light" on this
  • It would help to include some documentation on how to acquire these tokens for the various auth types: jwt, oauth2, etc. (this could be included in the description for auth)

@hackgoofer Thanks for the update!

  1. Do I understand correctly that the green light is for the Auth approach, not the /info endpoint? (Would agree as these are 2 separate things now)
  2. Do you have some ideas on how to provide information on auth? I think it probably makes sense to go over the different scenarios.

What I imagine:

  • completely private - I don't want anyone to know how to access the auth credentials. The agent is running inside kubernetes or my gated system -> No documentation on how to get tokens
  • paid - you can sign up for my agent on my company website and get an API key etc. -> Documentation on how to get tokens is in my developer docs.
  • automatic token generation - I want users/clients to register, which can be done automatically, so that I can allow only 1 task per access token to run at a time to protect myself from DDoS (or something similar) -> I think this is the scenario where the protocol needs to document how to get tokens
  • no auth - no docs

Maybe the community can share some ideas here on how they would like to use Auth?

My understanding is the three enclosed topics have all been green lit. I'm working with @jzanecook on the info and config object details and RFCs.

@KasparPeterson would you mind writing up an RFC for auth?

You could consider serving info from a .well-known/ path (maybe .well-known/agent?).
I was involved in a project that had payment pointers, which seem comparable to agent info endpoints:
https://paymentpointers.org/

The payment pointer/wallet endpoint also specified an authServer (but the only auth supported was GNAP) as well as a resourceServer where the existing agent protocol endpoints could live.
https://openpayments.guide/apis/wallet-address-server/operations/get-wallet-address/