pulumi / registry

The global index of everything you can do with Pulumi.

Home Page: https://www.pulumi.com/registry

Move package API docs from `docs` to this repo to facilitate a single place of ownership

praneetloke opened this issue · comments

We want to move all API docs and how-to guides from the docs repo to the registry repo, into their respective package folders. We are looking at ways to still run make serve (for development within the repo) without having Hugo kill our dev machines. To that end, @cnunciato will look into excluding the api-docs folders from Hugo processing in dev mode. We would add an alternate make target, one that involves doing a build, for devs who wish to see the API docs locally. So there would still be a way for devs to "see" the API docs locally, which today is not possible unless they copy a package's api-docs folder over from docs to here just to see something work end-to-end. This would be an improvement over what we have today.

Then we would open a PR in the docs repo to delete all folders related to API docs and how-to guides, that is, content/registry/**/* as well as static/registry. We would also move some of the tools/ over to the registry repo. This means we'll need to update the existing docs generation workflow in the docs repo to "redirect" the repository_dispatch event to the registry repo, where we will have the exact same workflow. This will buy us time to update all of our providers so that, when they release new versions, they post a repository_dispatch to the registry repo instead. That way we don't break the mechanism that is in place today.
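To make the "redirect" idea concrete, here is a minimal sketch of re-posting a received event to another repo through GitHub's REST endpoint for creating a repository dispatch event (POST /repos/{owner}/{repo}/dispatches). The function name, the example event type, and the payload keys are hypothetical and are not taken from the actual workflows. The sketch only builds the request; it does not send it.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_forward_request(event_type, client_payload, token,
                          target_repo="pulumi/registry"):
    """Build (but do not send) a repository_dispatch request for target_repo.

    event_type and client_payload are passed through unchanged, so the
    receiving workflow sees the same event the original workflow received.
    """
    body = json.dumps({
        "event_type": event_type,
        "client_payload": client_payload,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{GITHUB_API}/repos/{target_repo}/dispatches",
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # "example-event" and the payload below are placeholders, not real event names.
    req = build_forward_request("example-event", {"project": "example"}, token="<token>")
    print(req.get_method(), req.full_url)
```

Sending the request would be a matter of passing it to urllib.request.urlopen with a token that is allowed to dispatch events to the target repo.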

This helps the self-service contribution story (#206) immensely as everything related to packages would be housed in the same repository. It would also make it easy for us to author automation that does the right things.

UPDATE: We eventually ended up doing something entirely different from the above. See #237 (comment).

As part of this, we should update the README in a couple of ways:

  • Remove the note about this work being still in progress (under "About this repository")
  • Ensure the rest of the README is accurate in light of what's changed

@cnunciato and I started working on this. One thing we've discovered is that there doesn't seem to be a way to tell Hugo to exclude certain dirs from its "watch" (livereload) feature. We were able to use the ignoreFiles config property to make Hugo skip processing the api-docs folder, but hit a snag with livereload. So we are looking to change the approach slightly by putting the api-docs in a separate Hugo module in this repo, alongside the _default module. That should let us keep all api-docs in the same repo while having Hugo config files that selectively load specific modules from this repo. It should also allow us to have PR previews in this repo that contain the full set of API docs. Christian is working on this part while I am working on updating the necessary GH workflows to ensure that everything continues to work once we move all API docs from the docs repo.

Last week, @cnunciato and I tried to pull updates from our PR branch for this work into a docs repo PR branch, and hit a problem: the registry repo module ended up being too large for Go's module system. The error we hit was the infamous "module source tree too big". In re-thinking why we generate the API docs at all, we came up with an alternative proposal that actually solves a few more things about our current process of API docs generation.

We would no longer generate API docs ahead of time (AOT) and check them into a repository, since they don't serve any purpose other than being included in a build. Instead, we think we can defer the generation of the API docs to just-in-time by relying on the metadata files as the source artifact that tells us which packages the API docs need to be built for. To be precise, at the time of the site build we could invoke resourcedocsgen to walk the metadata files dir and simply run the generator for every one of those packages. Almost all of the information needed to generate documentation is in those files. The only two pieces of information that we do not persist today are the package repo and the path to the Pulumi schema file in that repo. Those can be easily added.
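As a rough illustration of the just-in-time idea above, the sketch below walks a directory of metadata files and calls a generator once per package. It makes no claims about resourcedocsgen's actual commands or flags: the generate callable stands in for whatever invokes the real generator, and the assumptions that metadata files are YAML and that the file name matches the package name are made only for this example.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def find_metadata_files(metadata_dir: Path,
                        suffixes: Iterable[str] = (".yaml", ".yml")) -> List[Path]:
    """Return the metadata files directly inside metadata_dir, in sorted order."""
    wanted = set(suffixes)
    return sorted(p for p in metadata_dir.iterdir()
                  if p.is_file() and p.suffix in wanted)


def generate_all(metadata_dir: Path, generate: Callable[[Path], None]) -> List[str]:
    """Run generate once per metadata file and return the package names handled.

    generate is a stand-in (hypothetical) for invoking the real docs generator.
    The package name is assumed to be the file name without its extension.
    """
    built = []
    for metadata_file in find_metadata_files(metadata_dir):
        generate(metadata_file)
        built.append(metadata_file.stem)
    return built


if __name__ == "__main__":
    # Demo against a throwaway directory with made-up package names.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("aws.yaml", "random.yaml"):
            (root / name).write_text("")
        print(generate_all(root, lambda path: print("would generate docs for", path.name)))
```

Because the package list is derived from whatever metadata files exist at build time, there is no separate list of packages to keep in sync.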

The work to be done with the new proposal would be:

  • Update resourcedocsgen with a new command that would read the metadata files from the registry's data folder for packages and run the generator for every one of them
  • Update the existing metadata generator command in resourcedocsgen to also persist the package repo and path to the package's schema file in that repo (IFF different from the default location)
  • Just as before, update the existing docs generator workflow to not run the API docs generator anymore. Instead, have it inform the registry of the updated package to facilitate transitioning packages to post repository_dispatch directly to the registry repo
  • Update the docs repo push GHA workflow to run the new resourcedocsgen command as part of the build process to get the web property updated
  • Update the registry repo PR workflow to generate API docs for two providers (we can add more, or all of them, if we'd like later)
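The second item in the list above (persist the schema path only when it differs from the default) could look roughly like the following. All field names and the default path are hypothetical placeholders chosen for this sketch; the real metadata format and default schema location are not stated in this issue.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder only: the real default schema location is not given in this thread.
DEFAULT_SCHEMA_PATH = "path/to/default/schema.json"


@dataclass
class PackageMetadata:
    name: str
    repo_url: str
    # None means "use the default location".
    schema_file_path: Optional[str] = None

    def to_record(self) -> dict:
        """Return the dict to persist, omitting the schema path when it is the default."""
        record = {"name": self.name, "repo_url": self.repo_url}
        if self.schema_file_path and self.schema_file_path != DEFAULT_SCHEMA_PATH:
            record["schema_file_path"] = self.schema_file_path
        return record

    def resolved_schema_path(self) -> str:
        """Return the path the generator should read, falling back to the default."""
        return self.schema_file_path or DEFAULT_SCHEMA_PATH


if __name__ == "__main__":
    pkg = PackageMetadata("example", "https://github.com/example/pulumi-example")
    print(pkg.to_record(), pkg.resolved_schema_path())
```

Keeping the field optional means existing metadata files that predate the change would still resolve to the default location without being rewritten.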

As part of this, I suggest that we don't move the tools from docs and leave them there. This does mean that the registry repo itself would need to clone the docs repo in order to run the docs generator for the two providers during a PR build, as well as to generate metadata files. We could think about re-organizing those tools separately from this effort, in a way that makes sense.

Benefits

The above suggestions would still maintain the benefit of not having several push builds queue up every time a new provider is released. We would still be drastically decreasing the number of builds running at any given time, since we would batch up all changes made to the registry and pick them up in a scheduled workflow in docs.

We would be able to close #105 with the above suggestion. That means anytime we want to rebuild API docs we no longer need to rely on static lists in our scripts. We would use the registry's package listing as the source of truth for which packages we need to regenerate API docs.

Potential downsides

Christian brought up a point about a potential issue with this approach. It's that we would be adding ~15 mins to a site deployment because of the API docs generation. We would still be net-green in terms of the number of PR builds enqueued for deployment compared to today. It'll be far fewer builds but each build would be slower by ~15 mins compared to today.

Thanks for writing all this up, Praneet! And yep, that's right -- it'll add ~15 minutes to each deployment and PR build, which means folks writing content in repos like pulumi-hugo will need to wait an additional 30 minutes or so for those changes to be published.

The local-dev experience will also be affected, in that devs working on registry won't be able to pull in already-built docs from a local clone of pulumi/docs, but will have to run resourcedocsgen locally first and then copy them in. Not a super-common thing of course, but something to be aware of, as it's one of the bigger pain points we have in the local-dev workflow today.

This is definitely a step in the right direction, though, and I'm sure we'll be able to iron out these wrinkles as we go.

Another benefit of the approach, if I'm understanding it correctly, is that it would remove the confusion that exists today where someone tries to submit a PR against the auto-generated content.

@praneetloke do we have any outstanding work here?

Nope, not in this issue. I do want to track some follow-up work for later. Feel free to close this. In fact, as a bonus we can close #105 as well :)

I'll link to this issue when I get those follow-up ones created.