AI-Engineer-Foundation / agent-protocol

Common interface for interacting with AI agents. The protocol is tech stack agnostic - you can use it with any framework for building agents.

Home Page: https://agentprotocol.ai


Plugins Feature Discussion Thread

jzanecook opened this issue

Initially, we had discussed creating a plugin system with an Agentfile that would be placed at the root folder of the agent. However, with the addition of the info endpoint and specifically the idea of config_options within that route, what if plugins were displayed within that endpoint instead of in a separate file?

The idea here is essentially to point to an external resource that we could pull, which would define whatever extensions to the protocol exist, including detailed specs and any other relevant info.

This might require writing another spec for agent-protocol-plugin, so I'm curious what the general opinion is.
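
To make the proposal concrete, here is one possible shape for such an info response, as a minimal sketch only. Apart from config_options, which is mentioned above, every field name (plugins, name, version, spec_url) and the example URL are hypothetical; the thread does not define a schema.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class PluginRef:
    """Hypothetical pointer to an externally hosted plugin spec."""
    name: str
    version: str
    spec_url: str  # assumed: where the detailed spec could be pulled from


@dataclass
class AgentInfo:
    """Hypothetical info-endpoint payload; all field names are assumptions."""
    config_options: dict = field(default_factory=dict)
    plugins: list = field(default_factory=list)


def build_info() -> str:
    """Serialize an example info payload that lists one plugin."""
    info = AgentInfo(
        config_options={"debug": False},
        plugins=[
            PluginRef(
                name="example-plugin",
                version="0.1.0",
                spec_url="https://example.com/plugins/example-plugin/openapi.yaml",
            )
        ],
    )
    # asdict() recurses into the nested PluginRef entries.
    return json.dumps(asdict(info), indent=2)


if __name__ == "__main__":
    print(build_info())
```

With a shape like this, no separate Agentfile would be needed: the plugin list travels with the info response, and each entry says where its full spec lives.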

I like this idea. It may be good to hear your proposed schema in the info endpoint for the plugin system.

I can see two categories of extensions:

Easy: This involves adding new endpoints under a fresh OpenAPI specification file. Users can implement any number of these, and then specify which ones they've adopted in an 'api/extensions' endpoint or something similar.
Hard: This pertains to introducing new parameters to existing endpoints, like 'user_id'. I'm uncertain about a clean method for this.

As with the OpenAPI specification itself, developing the SDK is straightforward for the first type but challenging for the second. The easy extension type can be implemented by setting up a FastAPI router for each extension. You can then integrate as many extension routers as needed into your implementation.
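
The router-per-extension idea can be sketched without any framework. The following is a dependency-free stand-in, not FastAPI code: ExtensionRouter, Agent, and the user_profiles extension are hypothetical names. In FastAPI the corresponding pieces would be APIRouter and app.include_router(...).

```python
from typing import Callable, Dict, List, Tuple

Handler = Callable[[], dict]


class ExtensionRouter:
    """Groups the endpoints contributed by one extension (hypothetical)."""

    def __init__(self, name: str, prefix: str) -> None:
        self.name = name
        self.prefix = prefix
        self.routes: Dict[Tuple[str, str], Handler] = {}

    def get(self, path: str) -> Callable[[Handler], Handler]:
        # Decorator that registers a GET handler under this router's prefix.
        def register(handler: Handler) -> Handler:
            self.routes[("GET", self.prefix + path)] = handler
            return handler
        return register


class Agent:
    """Core app that mounts whichever extension routers it adopts."""

    def __init__(self) -> None:
        self.routes: Dict[Tuple[str, str], Handler] = {}
        self.extensions: List[str] = []

    def include_router(self, router: ExtensionRouter) -> None:
        # Copy the extension's routes in and remember that it was adopted.
        self.routes.update(router.routes)
        self.extensions.append(router.name)

    def handle(self, method: str, path: str) -> dict:
        return self.routes[(method, path)]()


# One router per extension; the extension name and path are made up.
profiles = ExtensionRouter(name="user_profiles", prefix="/ext/user_profiles")


@profiles.get("/me")
def current_profile() -> dict:
    return {"user_id": "demo"}


agent = Agent()
agent.include_router(profiles)
```

The agent's extensions list is the kind of data an 'api/extensions'-style endpoint could return. Note that this only covers the easy category: adding a parameter such as 'user_id' to an existing core endpoint is not expressible as an extra router.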

I'm still figuring out a solution for the second type.

Well, there's an even harder one: what if you wanted to implement a whole different type of server? A real-time websocket or GraphQL server, a stream of some kind, or any number of crazy configurations an agent might implement. A plugin system is good for that, but the issue is how it would work and how it could do so while also minimizing development time. Somebody mentioned before that they wanted websockets, and that's just one person. What if I wanted my agent to live stream video/audio/data to the client? A REST API isn't going to be able to cover that, so what would a plugin look like then?

A simple extension to the REST API standard is relatively easy to implement, but if the plugin needs to be more than just an extension to the OpenAPI spec, it becomes significantly more complex.

How could we go about creating a plugin system that would allow agents to implement complex architectures as well as allow clients to easily implement those architectures without an excessive amount of code?

What are example plugin scenarios? I heard "auth" but that probably needs to be baked into the core protocol... Other concrete examples???

And how would a caller discover which plugins the Agent implements?

A caller can discover them through the "info" endpoint. #39
With the protocol, we are trying to provide a minimum set of functionality while also encouraging third parties to contribute protocol extensions such as auth. I can imagine auth being implemented/needed differently by different companies, which is why we think it is good to have it be an extension rather than baked into the core protocol.
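
On the caller side, discovery could then be a simple check against the info response. This is a sketch under assumptions: the plugins key, the name field, and the extension names are hypothetical, since the thread does not fix a schema.

```python
import json
from typing import Iterable, Set


def supported_extensions(info_json: str) -> Set[str]:
    """Return extension names advertised in a (hypothetical) info payload."""
    info = json.loads(info_json)
    # Assumed shape: {"plugins": [{"name": "..."}, ...]}.
    # A missing key is treated as "core protocol only".
    return {plugin["name"] for plugin in info.get("plugins", [])}


def missing_extensions(info_json: str, required: Iterable[str]) -> Set[str]:
    """Extensions the caller needs but the agent does not advertise."""
    return set(required) - supported_extensions(info_json)


# Stand-in for the body returned by the agent's info endpoint.
sample = json.dumps({"plugins": [{"name": "auth-api-key"}]})

if __name__ == "__main__":
    print(missing_extensions(sample, ["auth-api-key", "websocket"]))
```

A caller could run this check once after connecting and either fall back to core endpoints or refuse to proceed when something it requires, such as a particular auth extension, is missing.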