aio-libs / aiohttp-devtools

dev tools for aiohttp

Configuration files: ini, json, yaml or python

samuelcolvin opened this issue · comments

The project created by the start sub-command will need to employ some kind of config file. While people can of course change it, the default used will influence what people use and should therefore make sense.

It should be noted that most of the cleanest "12-factor" deployment pipelines (eg. heroku, Docker) use environment variables to pass sensitive values to apps, so the configuration logic will either need to allow overriding of variables from environment variables or allow environment variables to be referenced from the configuration.

There are a number of options:

ini

Advantages:

  • pythonic - .ini parsing is included in the python stdlib and used in setup.cfg etc.
  • no extra requirements

Disadvantages:

  • not as widely used as other formats and therefore less well understood; it is also hard to describe complex objects with
  • can't access environment variables

json

Advantages:

  • widely understood
  • no extra requirements - included in python stdlib

Disadvantages:

  • comments not possible
  • easy to make syntax mistakes: trailing comma, double quotes vs single quotes
  • can't access environment variables
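A quick illustration of the JSON pitfalls listed above, using only the stdlib json module. The sample strings and the json_errors helper are made up for this example; the exact error messages vary between Python versions, so they are printed rather than asserted.

```python
import json

# Hypothetical config snippets, each containing one of the mistakes above.
samples = {
    "trailing comma": '{"port": 8000,}',
    "single quotes": "{'port': 8000}",
    "comment": '{"port": 8000} // dev port',
}


def json_errors(texts):
    """Return {label: error message} for every text that fails to parse."""
    errors = {}
    for label, text in texts.items():
        try:
            json.loads(text)
        except json.JSONDecodeError as exc:
            errors[label] = exc.msg
    return errors


errors = json_errors(samples)
for label, msg in errors.items():
    print("{}: {}".format(label, msg))
```

All three samples are rejected by the strict parser, which is what makes hand-edited JSON config files error-prone.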

yaml

Advantages:

  • widely used

Disadvantages:

  • extra requirement; pyyaml also often requires compiling unless you're lucky and can use the manylinux binary
  • not well documented, the pyyaml docs are confusing and it's not always clear which other docs you should use
  • can't access environment variables
  • for those who haven't used it yaml is yet another thing to learn, and not always that intuitive - partly because it's so relaxed and partly as there are multiple ways to do things (eg. lists).

The aiohttp docs discuss this issue and recommend yaml.

python

Advantages:

  • arguably pythonic
  • no extra requirements
  • allows arbitrary logic while configuring
  • allows access to environment variables inside configuration
  • used by django and an option in any other project so fairly well understood

Disadvantages:

  • allows arbitrary logic while configuring: logic vs config can get confused
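As a rough sketch of what a python config with environment access might look like (the APP_* variable names, the setting names and the load_settings helper are all assumptions for illustration, not something the start sub-command generates):

```python
import os


def load_settings(environ=None):
    """Build a settings dict, letting environment variables override defaults."""
    environ = os.environ if environ is None else environ

    settings = {
        "DEBUG": environ.get("APP_DEBUG", "false").lower() in ("1", "true", "yes"),
        "DB_HOST": environ.get("APP_DB_HOST", "localhost"),
        "DB_PORT": int(environ.get("APP_DB_PORT", "5432")),
        "DB_NAME": environ.get("APP_DB_NAME", "app"),
    }
    # Arbitrary logic is allowed, which is both the advantage and the risk:
    # here one value is derived from two others.
    settings["DB_ADDRESS"] = "{DB_HOST}:{DB_PORT}".format(**settings)
    return settings


# Passing a plain dict instead of os.environ keeps the example deterministic.
settings = load_settings({"APP_DB_PORT": "6543"})
print(settings["DB_ADDRESS"])  # localhost:6543
```

The derived DB_ADDRESS value shows both sides of the trade-off: it is convenient, but it is also the point where logic starts to creep into configuration.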

(let's ignore the likes of toml and xml for brevity)

It would in theory be possible to support multiple formats, however I think that would lead to a cognitive burden both in development and usage.

I'm happy with either yaml or python but I would suggest python as it's the most "vanilla" and easy way to go. Also if people really want to use another config file they can easily load it inside the python configuration.

@asvetlov ?

I vote on yaml.
Please also think about config validation via tools like trafaret or json-schema.
See https://github.com/tailhook/trafaret_config as an example.

Cerberus is good for stuff like that, and pretty stable.

How would you suggest dealing with environment variables?

Could do something like:

db_password:
  env: DBPASS
  default: foobar

Or just update the config object based on a defined set of defaults?
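The env/default mapping suggested above could be resolved after loading with something like the following. This is only a sketch: resolve_env is a hypothetical helper, and it works on the already-parsed config (plain dicts and lists), so it does not depend on which yaml library did the parsing.

```python
import os


def resolve_env(node, environ=None):
    """Recursively replace {'env': NAME, 'default': VALUE} mappings."""
    environ = os.environ if environ is None else environ
    if isinstance(node, dict):
        # A mapping counts as an env reference only if it has an 'env' key
        # and nothing besides 'env' and 'default'.
        if "env" in node and set(node) <= {"env", "default"}:
            name = node["env"]
            if name in environ:
                return environ[name]
            if "default" in node:
                return node["default"]
            raise KeyError("environment variable {} is not set".format(name))
        return {key: resolve_env(value, environ) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_env(item, environ) for item in node]
    return node


# The parsed form of the yaml snippet above, plus one ordinary key.
parsed = {"db_password": {"env": "DBPASS", "default": "foobar"}, "port": 8000}
print(resolve_env(parsed, environ={}))
print(resolve_env(parsed, environ={"DBPASS": "s3cret"}))
```

Note that values taken from the environment stay strings here; type conversion would have to be left to a later validation step.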

The more logic like this is required, the more #3 becomes necessary.

A vote for python, primarily for its transparency in environment access. See for example gunicorn's config file format.

Though I have a soft spot for yaml, I think that environment variables should be supported out-of-the-box with this tool without requiring a specific syntax in the configuration file (such as ${VAR}: https://github.com/sseg/heroku-aiohttp-web/blob/master/web/utils/settings.py#L5-L16).
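One way to get environment support without any special syntax in the config file is to let variables override keys by name. A minimal sketch, assuming a flat config and an APP_ prefix; both the prefix and the apply_env_overrides helper are invented for this example:

```python
import os


def _cast(raw, default):
    """Convert an environment string to the type of the default value."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(default, bool):
        return raw.strip().lower() in ("1", "true", "yes", "on")
    if isinstance(default, (int, float)):
        return type(default)(raw)
    return raw


def apply_env_overrides(defaults, environ=None, prefix="APP_"):
    """Return a copy of defaults where PREFIX + KEY variables take precedence."""
    environ = os.environ if environ is None else environ
    config = dict(defaults)
    for key, default in defaults.items():
        env_name = prefix + key.upper()
        if env_name in environ:
            config[key] = _cast(environ[env_name], default)
    return config


defaults = {"debug": False, "port": 8000, "db_password": "foobar"}
fake_env = {"APP_PORT": "9000", "APP_DEBUG": "true", "UNRELATED": "x"}
print(apply_env_overrides(defaults, fake_env))
```

Only keys that already exist in the defaults can be overridden, so the config file remains the single list of available settings.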

@samuelcolvin Cerberus is not older than aiohttp itself :)
If you need a really stable solution there is colander.
I have used trafaret for five years and am pretty happy with its very straightforward schema definition DSL.

Maybe we need a pluggable validator -- but I'm pretty sure that the project requires config file validation out of the box. Maybe not in the very first version though -- we can live without validation for a while, but keep this requirement in mind.

Also please take a second look at https://github.com/tailhook/trafaret_config.
Yes, it uses trafaret for schema specification -- but I've mentioned trafaret_config for another reason. It extends the yaml loader to generate errors with line numbers, e.g.:

bad.yaml:2: smtp.port: value can't be converted to int
bad.yaml:3: smtp.ssl_port: value can't be converted to int
bad.yaml:4: port: value can't be converted to int

That's what we must have. I'm pretty sure a similar report could be implemented on top of any validator.
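To give an idea of how such a report could be produced independently of the validator, here is a stdlib-only sketch that collects every error with its dotted key path. It is not trafaret_config's API: the validate function and the schema shape (nested dicts of converter callables) are assumptions, and line numbers are left out because they need hooks into the yaml loader.

```python
def validate(config, schema, path=""):
    """Return a list of 'dotted.key: message' strings, empty if config is valid."""
    errors = []
    for key, expected in schema.items():
        dotted = "{}.{}".format(path, key) if path else key
        if key not in config:
            errors.append("{}: is required".format(dotted))
            continue
        value = config[key]
        if isinstance(expected, dict):
            # Nested section: recurse and keep collecting instead of stopping.
            if isinstance(value, dict):
                errors.extend(validate(value, expected, dotted))
            else:
                errors.append("{}: value is not a mapping".format(dotted))
            continue
        try:
            expected(value)
        except (TypeError, ValueError):
            errors.append("{}: value can't be converted to {}".format(
                dotted, expected.__name__))
    return errors


schema = {"smtp": {"port": int, "ssl_port": int}, "port": int}
bad = {"smtp": {"port": "???", "ssl_port": "what?"}, "port": "unknown"}
for line in validate(bad, schema):
    print(line)
```

Collecting all errors in one pass, rather than raising on the first one, is what makes the report useful when a config file has several mistakes.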

@sseg python is the worst available option, sorry.
When you configure nginx you don't edit a config.c file and recompile nginx after every config change, right?

python isn't compiled :-)

I've been using django's settings.py as the closest equivalent to a config file on a large project for the last two years and it's worked well.

Also a major advantage of a python config is how simple it would be to understand and extend. That extension could include loading a .yaml file.

I'm happy with yaml but we do need a good solution for environment variables. What would you suggest?

The logic on the start command I'm building would make it very easy to add options for multiple types of config.

Most likely different people have different preferences and different types of solution will work better with different setups.

Let's leave this for now and discuss when we have some concrete options.

python is a schema-less solution with no validation.

I've run out of time and need to take a pause.
Sorry, I most likely won't be available until after tomorrow.

@samuelcolvin please go ahead but don't invest too much time into config file polishing.

yaml config seems to be working well.

Good post @samuelcolvin, I'm also looking for the best config approach in Python. I've been using config.py in our project (Test Automation Harness), here are my observations so far:
Pros:

  • simple to implement, easy for non-programmers to understand
  • can have comments
  • less code in your main program to access the values
  • supports all python data structures

Cons:

  • too flexible, need rules to keep it for config only

I'd say for small to medium applications, a Python config file has its place.

After much thought on this exact question I built pydantic to solve this exact problem (as well as some others).

It allows you to use a simple python file to define settings but with advanced validation.

See here for an example of using pydantic for settings management.

To use environment variables in yaml configs you can do something like this:

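import os

import yaml


def construct_yaml_env(self, node):
    # The tag value has the form "ENV_VARIABLE[, DEFAULT_VAL_OPTIONAL]".
    value = [x.strip() for x in self.construct_scalar(node).split(',')]
    if not (3 > len(value) > 0) or not value[0]:
        raise yaml.YAMLError(
            'invalid value "!!env {}", expected '
            '"!!env ENV_VARIABLE, DEFAULT_VAL_OPTIONAL"'.format(
                self.construct_scalar(node)))

    if len(value) == 1:
        # No default given: the variable must be set (KeyError otherwise).
        value = os.environ[value[0]]
    else:
        value = os.environ.get(*value)

    # Parse the string as yaml so "true", "null", "456" etc. get real types.
    return yaml.safe_load(value)


# With Loader omitted, PyYAML (5.1+) registers the constructor on Loader,
# FullLoader and UnsafeLoader, but not on SafeLoader.
yaml.add_constructor(u'tag:yaml.org,2002:env', construct_yaml_env)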

Tests:

_template = """
var1: !!env VAR1, true 
var2: !!env VAR2, False
var3: !!env VAR3, null
var4: !!env VAR4, 456
"""


def test_env_default_type():
    # Assumes VAR1..VAR4 are not set in the environment, so defaults apply.
    data = yaml.load(_template, Loader=yaml.FullLoader)

    assert data['var1'] is True
    assert data['var2'] is False
    assert data['var3'] is None
    assert isinstance(data['var4'], int)


def test_env_type():
    os.environ['VAR1'] = 'false'
    os.environ['VAR2'] = 'True'
    os.environ['VAR3'] = 'null'
    os.environ['VAR4'] = '4.5'

    data = yaml.load(_template, Loader=yaml.FullLoader)

    assert data['var1'] is False
    assert data['var2'] is True
    assert data['var3'] is None
    assert isinstance(data['var4'], float)