pypa / pip

The Python package installer

Home Page:https://pip.pypa.io/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Config settings support in PEP 517

pfmoore opened this issue · comments

PEP 517 provides a method, config settings, for supplying arbitrary configuration to a build backend. There are no defined semantics for this argument, although there is an example in the PEP showing how pip "might" map command line arguments onto config_settings.

The setuptools backend appears to implement a part of this suggested interface (it processes a --global-option key in essentially the way the PEP implies). The flit backend completely ignores config_settings.

Pip needs a user interface (command line options) to allow users to supply config settings to a backend - but without any agreed semantics, there's probably not much we can do beyond allowing users to specify key/value pairs. It's possible that PEP 518 (or some similar standard) should be extended to allow projects to specify config_settings in the project build metadata?

Finally, there's pip's --python-tag option. This currently maps directly to a specific setuptools command line option. Due to the lack of common semantics, there's currently no way to support this under PEP 517 in a backend-neutral way. It may be worth (for the fallback use of the setuptools backend only? as a "better than nothing" approach for all backends?) mapping it to

config_settings = {'--global-option': ['--python-tag', python_tag]}

So, actions to consider:

  • Standardising any semantics for config_settings (my view: out of scope for pip)
  • Letting projects specify config settings in pyproject.toml (my view: needs a standard, out of scope for pip)
  • Command line interface for config_settings (my view: a minimal --build-settings key:value for now)
  • Config settings in requirements.txt (my view: allow --build-settings in there)
  • --python-tag (my view: translate to --global-option for now, review later)

Also, apart from the short-term approach for --python-tag, I propose not implementing any of this until after base PEP 517 support is released. Without more comprehensive buy in from backends, it's hard to see what form config_settings usage will ultimately take, and it's backends that should drive this, not frontends.

Command line interface for config_settings (my view: a minimal --build-settings key:value for now)

The config_settings is likely to be specific to a package, so we might want to directly provide: --build-settings package:key:value ?

I'm just going to go ahead and pin this, because we should really remember that this has to be done still.

I still don't think this is urgent. There's been little or no movement on backends making use of the config settings options, and essentially no demand from pip's end users. I'm inclined to think that it might have been better if PEP 517 hadn't included config_settings, treating it as YAGNI until genuine use cases came up.

If implemented this would give us a route to deprecate and remove the special cases we have for --install-option, --build-option, and --global-option in several places in the code. It is blocked on pypa/setuptools#1928, though, since the way the setuptools backend interprets the options is not currently compatible with the way we do. This also gives us a path to address #2677.

I'm inclined to think that it might have been better if PEP 517 hadn't included config_settings, treating it as YAGNI until genuine use cases came up.

Perhaps, but we've crossed that bridge now. :(

TBH, I don't think most users would discover that unimplemented functionality that is "only documented in a PEP", which could be the reason for the lack of end-user demand -- I don't know that I want something, if I don't know that thing exists.

What about: --config-setting=pkg:{"setting1": "value1"}? That is:

  • name of the project
  • a colon
  • a JSON object containing the config settings to be used for that package

and we can accept it multiple times.

This gives us a few nice properties:

  1. supports arbitrary keys/values without us needing to define a mini-language or escape characters
  2. values that are "invalid" package names can be used for communicating non-package-specific config settings (e.g. __global__ or *)
  3. PEP 517 currently states that frontends should provide a way for users to provide arbitrary string key and string value parameters, but if this is ever updated in the future to allow more complex nested structures then it will be straightforward to support it
  4. we can easily distinguish between a package being specified and not (whether the value starts with {). This would let us support a shorter syntax in requirements files or simple command-lines without ambiguity, like pkg --config-setting={"setting1": "value1"}
  5. using JSON gives us automatic support for unicode escapes (\uXXXX), which may be useful since ensuring the terminal properly accepts (and apps properly read) unicode input may be a challenge on some platforms or environments

There are also a few downsides:

  1. JSON requires double-quotes, which on the Windows command-prompt will require being escaped (by putting them twice)
  2. Opportunity for usage errors on shells that support escape characters. If the shell expands \n in an argument for example, we could get
    {"setting1":"va
    lue1"}
    
    when the user intended
    {"setting1":"va\nlue1"}
    
  3. Requirements files, which don't do any of the above interpretation, may look different than the equivalent command-line on the user's platform

This could be mitigated by only supporting --config-setting in requirements files.

A few other things to consider, in general:

  1. using the config settings applicable to a project as part of our wheel cache key
  2. explicitly failing if config settings are provided but the resolved project doesn't use PEP 517
  3. explicitly failing if config settings are provided that do not match any given project (probably not a great idea, given conditional dependencies)
  4. normalizing the project name before comparing against resolved packages
  5. "config setting" may not be so meaningful to end users, so we should consider alternate argument names

I just stumbled upon this issue when I tried to pass some extra arguments to the setuptools backend.
It took a while until I released that the config settings have not been implemented at all in pip.

Is there a specific reason why build_options and global_options is not passed on as suggested in the PEP? Are you waiting for a more complete design?

Potentially, the documentation could be expanded as well. It still reads as if --build-option and --global-option have the effects described in the PEP.

https://github.com/pypa/pip/blob/master/docs/html/reference/pip_wheel.rst#customising-the-build

On a related note, the new config UI should probably be designed to still allow installing wheels of PEP 518 build dependencies (?).

As for my use case: I have a Cython project and have implemented an extra command to transpile the .pyx sources to the .cpp outputs. Performing this operation in the isolated build environment simplifies the build process - it's one pip wheel call. It also avoids another tool for dependency management during building, as the Cython dependency is declared through build-system/requires.

@pskopnik I think it's a matter of designing + implementing the behavior in pip's implementation.

I'm looking at (finally) implementing this functionality, at least in part in anticipation of the setuptools PEP 660 support having two modes which the user could select via a config option.

Reviewing the discussion here, what I propose is as follows:

  1. Add an option, --config-settings, to the install and wheel commands. The existing --(build|install)-options flags are only for those commands. --global-options also applies to download, but I'm going to say YAGNI on that (for now, at least).
  2. It will be possible to include --config-settings in a requirements file.
  3. If --config-settings is specified, all of --(build|install|global)-options will be ignored with a warning.
  4. The syntax will be --config-settings KEY=VALUE. This can be repeated multiple times, with later values overriding earlier ones if KEY is the same. Note that PEP 517 states that frontends only need to be able to provide string valued keys and values. So I'm dismissing the idea of letting values be arbitrary JSON, or having special syntax for boolean-valued settings.
  5. (Not 100% sure about this one) The syntax --config-settings KEY= will remove KEY from the config dictionary. May be useful for overriding settings froma config file, but I'd happily drop this (in favour of just setting the value to "" following point 3) if anyone said YAGNI.
  6. The settings will apply to all backend calls. I though there was something that said backends should ignore unrecognised settings, but I can't find it right now. In any case, I can't think of any way of sanely allowing the user to say which hooks or backends a particular config setting is to be supplied to, so I think it's a reasonable thing to hope for.
  7. The settings will be passed for all packages - if there's a need to supply per-package config settings, put them in a requirements file. I'm calling YAGNI on having special syntax for this on the command line.

In general, behaviour will follow --(build|install)-options.

I'd appreciate comments on the design, and on the feasibility of the proposal. However if anyone wants to propose something more complex (such as per-call settings as per item 6, or per-package settings as per 7) then please be prepared to suggest a modified syntax that covers it. The reason I'm keeping things simple is precisely because I don't know how to make a good UI for more complicated possibilities, so saying "you need to cover X" without explaining how, will probably just result in this getting stalled.

Also, if backend developers want to suggest constraints or features that they need, I'd appreciate such suggestions coming with a proposed syntax (same reason).

/cc @abravalheri for views on whether this will work for setuptools.

I prefer interpreting KEY= as setting an empty string, but mostly for consistency reasons, and also since I feel being able to set a value to an empty string is more important than unset a previously set one. If we’re going to implement unsetting a key, it should be something else (e.g. KEY with the equal sign).

The rest of the points all sound good to me in general.

Add an option, --config-settings, to the install and wheel commands. The existing --(build|install)-options flags are only for those commands.

Does --install-options only apply for setup.py install, or are they also passed to bdist_wheel? I’m mostly just curious here, since this does not change anything to the outlined plan—it would’ve if we’re only deprecating the direct counterpart setuptools-specific options, but not since we are going to deprecate all three anyway.

One point to the semantic. Does setting --config-settings also force building from source? I seem to recall this is what the setuptools-specific options do. We might also need to more clearly indicate this is only going to be passed to the build backend (especially if this forces building from source) in the option name, maybe --build-config-setting (yet another minor point, I think the option should be singular since it only allows specifying one key).

I prefer interpreting KEY= as setting an empty string, but mostly for consistency reasons, and also since I feel being able to set a value to an empty string is more important than unset a previously set one.

That makes sense to me. I'm happy to do this in place of my original proposal, and treat unsetting a key as YAGNI (for now at least).

Does --install-options only apply for setup.py install, or are they also passed to bdist_wheel?

As far as I can tell, --install-options is for install, and --build-options is for pip wheel, but I didn't look very closely into the details (as my proposal is effectively to treat them as legacy and disable them if the new option is specified).

Does setting --config-settings also force building from source?

I don't intend that it does. I would expect that the settings would be used if a PEP 517 hook is called, but would not change the decision as to whether to call one. If you want to ensure building from source, you should use --no-binary.

This means that specifying config settings might do nothing, and could end up using a wheel built with different settings. My view is that's a rare enough situation that I don't want to worry about it for now. After all, there's nothing in a built wheel to say what config settings were used, so there's nothing we could do about it anyway1. On the same principle, I don't intend to try to warn if config settings get ignored (that is possible, but would require adding checks to a bunch of code paths that don't currently need to care about config settings). Basically, if you want to be sure, use --no-binary, otherwise settings will be used if needed, ignored otherwise. We can revisit this choice later, based on real-life experience. But for now I expect use of config settings to be so rare as to make taking anything other than the easiest choices a waste of effort.

Footnotes

  1. I personally think that two wheels built with different config settings probably have to be functionally identical anyway - there's nothing explicit to that effect in any specs I can find, but I'd expect any backend that did make them different would break far too many assumptions to be useful.

Sounds good to me. The problem that a wheel may have mismatching config settings can probably be better resolved if we could record the config settings in wheel metadata, which would also help with other wheel-tagging problems such as GPU information. For now, we can perhaps rely on documentation.

Thank you very much @pfmoore for looking into this.
In general I like the design, I think it makes sense and I think it is useful for the PEP 660 scenario.

As discussed in https://discuss.python.org/t/pep-660-and-setuptools/14855/18, I also agree with expecting backends to ignore the values of KEY they don't recognize.

Is there any plans to include a handy alias like build does (e.g. -C)?


Just one clarification:

Would VALUE be able contain contains spaces? I did a quick test and sys.argv already seems to support that:

python3 test.py arg="with multiple words"
# sys.argv => ['test.py', 'arg=with multiple words']
python3 test.py arg='with multiple words'
# sys.argv => ['test.py', 'arg=with multiple words']

Is there any plans to include a handy alias like build does (e.g. -C)?

It's not hard to add an alias, but I'm reluctant to use up a 1-character option for something this uncommon. I suggest not doing so initially, and if users complain that it's too verbose, we can add an abbreviation later.

Would VALUE be able contain contains spaces?

That's essentially down to your shell's (plus the C runtime's on Windows), quoting rules, but basically yes:

pip install --config-settings "OPTION=SOME VALUE WITH SPACES"

should work pretty well everywhere. Single quotes, or just quoting the value rather than the whole OPTION=VALUE, would be less portable1. For ease of use I'd tend to recommend that config setting names and values avoid spaces or punctuation characters, but that's just my personal view.

Footnotes

  1. I'd briefly considered allowing the user to specify config settings via JSON, but correctly quoting JSON on the command line in Powershell is a pain, so I dropped the idea as a bad UI.

Draft PR now available as #11059

That aforementioned PR has landed. :)