amundsen-io / amundsen

Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.

Home Page: https://www.amundsen.io/amundsen/

Feature Proposal: OpenAPI/Swagger ingestion support

snowman2 opened this issue

Expected Behavior or Use Case

What are your thoughts on adding support for ingesting sources that are APIs documented with OpenAPI/Swagger?

For example, DataHub supports ingesting OpenAPI/Swagger sources (ref).

Service or Ingestion ETL

This would be an ingestion component.

Possible Implementation

This could extend:

from databuilder.extractor.base_extractor import Extractor

The properties could map to ColumnMetadata and each path could map to TableMetadata.
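To make the proposed mapping concrete, here is a minimal sketch of walking an OpenAPI document and emitting one table per path with one column per response property. It is self-contained on purpose: `TableSketch` and `ColumnSketch` are hypothetical stand-ins for databuilder's TableMetadata and ColumnMetadata, and `iter_tables` and `EXAMPLE_SPEC` are illustrative names, not part of Amundsen. It assumes an OpenAPI 3 layout, reads only the GET 200 `application/json` response, and does not resolve `$ref` pointers.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List


@dataclass
class ColumnSketch:
    """Hypothetical, simplified stand-in for databuilder's ColumnMetadata."""
    name: str
    description: str
    col_type: str
    sort_order: int


@dataclass
class TableSketch:
    """Hypothetical, simplified stand-in for databuilder's TableMetadata."""
    name: str
    description: str
    columns: List[ColumnSketch] = field(default_factory=list)


def iter_tables(spec: Dict[str, Any]) -> Iterator[TableSketch]:
    """Yield one table per path, with one column per GET 200 response property."""
    for path, operations in spec.get("paths", {}).items():
        get_op = operations.get("get", {})
        # Assumption: inline schema under responses -> 200 -> application/json.
        schema = (
            get_op.get("responses", {})
            .get("200", {})
            .get("content", {})
            .get("application/json", {})
            .get("schema", {})
        )
        columns = [
            ColumnSketch(
                name=prop_name,
                description=prop.get("description", ""),
                col_type=prop.get("type", "unknown"),
                sort_order=index,
            )
            for index, (prop_name, prop) in enumerate(
                schema.get("properties", {}).items()
            )
        ]
        yield TableSketch(
            name=path,
            description=get_op.get("summary", ""),
            columns=columns,
        )


# Made-up spec fragment, used only to exercise the sketch.
EXAMPLE_SPEC = {
    "paths": {
        "path/to/endpoint/one/": {
            "get": {
                "summary": "Example endpoint",
                "responses": {
                    "200": {
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {
                                        "id": {"type": "integer", "description": "Row id"},
                                        "label": {"type": "string"},
                                    },
                                }
                            }
                        }
                    }
                },
            }
        }
    }
}

tables = list(iter_tables(EXAMPLE_SPEC))
```

In a real extractor these records would presumably be built inside a subclass of the `Extractor` base class imported above and returned one at a time; that wiring is left out here so the sketch runs without databuilder installed.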

The issue we have run into is that the table name is used in the URL of the source. If you pass in an endpoint path as the table name, such as path/to/endpoint/one/, the frontend fails silently. As a workaround, we replaced / with . and it works like so: path.to.endpoint.one.
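The workaround can be sketched as a small helper. The function name `endpoint_path_to_table_name` is hypothetical, and dropping leading and trailing slashes (rather than turning them into dots) is an assumption made so the result matches the example above.

```python
def endpoint_path_to_table_name(path: str) -> str:
    """Turn an endpoint path into a dot-separated name with no slashes."""
    # Assumption: empty segments from leading, trailing, or doubled
    # slashes are discarded instead of becoming stray dots.
    segments = [segment for segment in path.split("/") if segment]
    return ".".join(segments)


table_name = endpoint_path_to_table_name("path/to/endpoint/one/")
```

One caveat of this approach is that the mapping is not reversible if an endpoint path itself contains dots, so the original path may need to be stored separately, for example in the table description.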

There are other similar setups where the schema does not seem to fit and this may require looking into other changes as well.

Recommendations for a path forward and how best to implement this are welcome. If you are interested in this, we could submit a PR and start a discussion for how best to move forward.

Context

We have several internal APIs with OpenAPI/Swagger endpoints, and this would help automate building a central catalog of the data available as well as where to find it.

Thanks for opening your first issue here!

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.