microsoft / win32metadata

Tooling to generate metadata for Win32 APIs in the Windows SDK.

Add API set information to functions where applicable

ChrisDenton opened this issue

I think it might be useful for the metadata to contain information on API sets on relevant functions. E.g.:
[ApiSet(api-ms-win-shell-shellfolders-l1-1-0)] or a similar attribute.

Some people have requested that Rust's standard library use API sets rather than the "old" DLLs for a variety of reasons:

  1. API sets are always loaded from a system DLL, even if they're not KnownDLLs. This can matter for some types of applications (e.g. standalone applications that may be run from the user's Downloads folder) that want secure loading of DLLs. There are other ways to mitigate that, but API sets have the advantage of inherently more secure DLL loading.
  2. It allows loading fewer DLLs. Some API sets redirect to DLLs that are more tightly scoped but aren't necessarily stable between Windows versions/SKUs.
  3. The delay-load linker feature works per DLL, so being more fine-grained can be useful. E.g. using delay load on kernel32 is often not that useful.

I am currently exploring the options here and not committing to anything yet, but either way I think it would be nice if the metadata had some more data, especially given the lack of up-to-date documentation on API sets.

Not sure where we would get authoritative apiset information from. The SDK, docs, and libs don't even have the right information. 🤔

Hmm...

Well #1928 suggests that's not an issue unique to apisets 😛

But I'd suggest that getting it from the import libs would be the best option. If they're wrong, then that's an issue that needs fixing on their end.

  • We have some internal requests for this as well, so that we can build Rust binaries for stripped-down versions of Windows that don't necessarily have traditional DLLs like kernel32.
  • We have some internal apiset.xml files that provide the mapping. They're meant to be used to generate umbrella libs like onecoreuap.lib but perhaps those libs don't quite get the right information.
  • I'd be happy to just switch to this rather than keeping two sets of library names for each function.

cc @dpaoliello

I've avoided apisets for a long time because I couldn't see how they benefited developers directly. Recently I received a few different requests to look into them and so I finally had a good conversation with the engineers in Windows responsible for apisets, reverse forwarders, umbrella libs, and all the rest. After much back and forth it was finally concluded that indeed there is no direct value to the developer. It's just not the right abstraction or way to express library dependencies in Windows. They serve an important role but not one that we should interact with directly.

There are some APIs that only live in an apiset because that's the only library name we've been given. In such cases that's just what we use, and that's fine. But there is no benefit in trying to get to the apiset behind a "legacy" library name like user32.dll, because the apiset in many cases just redirects back to the "legacy" DLL anyway. If an API is ever moved, as is sometimes the case with APIs in kernel32.dll for example, there will be forwarders in place to fix it up. There are even old APIs like SetCursorPos that show up in six different apisets, and nobody can tell me which to use. There are also rumblings of the apiset names changing in the future, making it even more problematic.

The best advice we can give is to stick with the traditional library names like user32.dll, kernel32.dll, and so on.

Thanks Kenny! I'm more than happy to go with the expert opinion on this.

The only two concrete DLLs that were mentioned to me where apisets do make a difference are kernelbase.dll (instead of kernel32.dll) and windows.storage.dll (instead of shell32.dll).

I'll close this issue now as it doesn't seem useful to explore this any further, but I don't mind reopening it should something change in the future.

And there will undoubtedly be some cases where we need to use a more appropriate library name and that's certainly fine to consider on a case-by-case basis.

The main bit of work here is to be more consistent. Right now I gather we're scraping the umbrella libs like onecoreuap but those libs have a mishmash of traditional and apiset library names. We should at least be consistent and stick with the traditional library names where applicable.
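As a rough illustration of where a consistency pass could start, here is a minimal Rust sketch that separates apiset names from traditional library names. It assumes apiset names can be recognised by an `api-` or `ext-` prefix. `is_api_set_name` and `partition_libraries` are hypothetical helpers for this sketch, not part of win32metadata or any existing tool.

```rust
/// Returns true when `library` looks like an apiset contract name rather than
/// a traditional DLL name.
/// Assumption: apiset names begin with `api-` or `ext-` (case-insensitive).
fn is_api_set_name(library: &str) -> bool {
    let lower = library.to_ascii_lowercase();
    lower.starts_with("api-") || lower.starts_with("ext-")
}

/// Splits library names into (traditional, apiset) groups, preserving order.
fn partition_libraries<'a>(libraries: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
    libraries
        .iter()
        .copied()
        .partition(|name| !is_api_set_name(name))
}

fn main() {
    // Names taken from this thread; the list itself is only illustrative.
    let libraries = [
        "kernel32.dll",
        "api-ms-win-appmodel-runtime-l1-1-1.dll",
        "user32.dll",
        "api-ms-win-shell-shellfolders-l1-1-0",
    ];

    let (traditional, api_sets) = partition_libraries(&libraries);
    println!("traditional: {traditional:?}");
    println!("apisets:     {api_sets:?}");
}
```

A prefix check like this only flags candidates. Deciding which traditional name should replace an apiset name still needs an authoritative mapping, and APIs that only live in an apiset would have to be left alone.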

GetStagedPackageOrigin for example is reported to come from api-ms-win-appmodel-runtime-l1-1-1.dll but it should in fact be kernelbase.dll as per the docs:

https://learn.microsoft.com/en-us/windows/win32/api/appmodel/nf-appmodel-getstagedpackageorigin

In fact, the internal database (WCD) where all of this is tracked states that Kernelbase.dll is the "preferred module".

Scanning the import libs in the Windows SDK (10.0.26100.0), it seems that GetStagedPackageOrigin is only ever imported under the apiset DLL name. So this sounds like it needs a coordinated fix so that both metadata and import libs agree?
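To make the "do they agree?" question concrete, here is a minimal Rust sketch that compares scanned (function, library) pairs against a table of preferred modules and reports the differences. The data layout is an assumption for illustration. `normalize` and `find_mismatches` are hypothetical helpers, and the sketch does not read real import libs or the internal database.

```rust
use std::collections::BTreeMap;

/// Lowercases a library name and drops a trailing `.dll`, so that names
/// compare equal regardless of case or extension.
fn normalize(library: &str) -> String {
    let lower = library.to_ascii_lowercase();
    let stem = lower.strip_suffix(".dll").unwrap_or(lower.as_str());
    stem.to_string()
}

/// Returns (function, scanned library, preferred library) for every scanned
/// import whose library differs from the preferred module for that function.
/// Functions without a preferred entry are skipped.
fn find_mismatches<'a>(
    scanned: &[(&'a str, &'a str)],
    preferred: &BTreeMap<&str, &'a str>,
) -> Vec<(&'a str, &'a str, &'a str)> {
    scanned
        .iter()
        .filter_map(|&(function, library)| {
            let wanted = *preferred.get(function)?;
            (normalize(library) != normalize(wanted)).then_some((function, library, wanted))
        })
        .collect()
}

fn main() {
    // Library name reported for this function, as quoted in this thread.
    let scanned = [(
        "GetStagedPackageOrigin",
        "api-ms-win-appmodel-runtime-l1-1-1.dll",
    )];
    // Preferred module for this function, as quoted in this thread.
    let preferred = BTreeMap::from([("GetStagedPackageOrigin", "Kernelbase.dll")]);

    for (function, found, wanted) in find_mismatches(&scanned, &preferred) {
        println!("{function}: scanned {found}, preferred {wanted}");
    }
}
```

A report like this would only show where the two sources disagree. Which side gets changed is still the coordinated decision discussed here.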

Ideally the umbrella libs would just use the conventional library names as well.