salt-formulas / reclass

A recursive external node classifier for automation tools like Ansible, Puppet, and Salt

Wrong logic for reclass _param checks

jiribroulik opened this issue

@epcim @AndrewPickford

Currently reclass appears to check that every _param is defined for a node. The check should instead happen when the pillar is applied (that is, at the end), and only for the parameters that end up being used. Steps to reproduce:

Have this pillar:

parameters:
  _param:
    salt_glusterfs_service_host: ${_param:glusterfs_service_host}
    glusterfs_node01_address: ${_param:cluster_node01_address}
    glusterfs_node02_address: ${_param:cluster_node02_address}
    glusterfs_node03_address: ${_param:cluster_node03_address}
  glusterfs:
    client:
      volumes:
        salt_pki:
          path: /srv/salt/pki
          server: ${_param:salt_glusterfs_service_host}
          opts: "defaults,backup-volfile-servers=${_param:glusterfs_node01_address}:${_param:glusterfs_node02_address}:${_param:glusterfs_node03_address}"

In reality you only care about glusterfs_node01_address, glusterfs_node02_address and glusterfs_node03_address, because those are what the last line uses. Yet reclass reports an error for cluster_node01_address, cluster_node02_address and cluster_node03_address, which are only intermediate ('in the middle') params that are never used. So it should not report an error.

Can this be fixed please?

I simulated the issue on this simple example: https://github.com/epcim/reclass-issue14

➜  reclass git:(master) tree
.
├── classes
│   ├── first.yml
│   ├── second.yml
│   └── third.yml
├── nodes
│   └── dontpanic.yml
└── reclass-config.yml

2 directories, 5 files
➜  reclass git:(master) cat classes/second.yml 

classes:
- first

parameters:
  _param:
    # aaa: dummy
    aaa: ${_param:yyy}
    ccc: ${_param:aaa}
  mykey: ${_param:ccc}

➜  reclass git:(master) cat classes/third.yml 

classes:
  - second

parameters:
  _param:
    ccc: 444
  mykey: ${_param:ccc}


➜  reclass git:(master) reclass --nodeinfo dontpanic                                  
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/reclass/values/refitem.py", line 53, in _resolve
    return path.get_value(context)
  File "/usr/lib/python2.7/dist-packages/reclass/utils/dictpath.py", line 128, in get_value
    return self._get_innermost_container(base)[self._get_key()]
KeyError: 'yyy'

-> dontpanic
   Cannot resolve ${_param:yyy}, at _param:aaa, in yaml_fs:///etc/reclass/classes/second.yml


If I've understood the issue correctly, this arises from how reclass merges parameters or, for scalar values, overwrites them. In @epcim's example the logic for resolving _param:ccc first builds a list, [${_param:aaa}, 444], and then tries to resolve each list element in order. So reclass first tries to resolve _param:aaa, which looks for _param:yyy, which isn't present, and so the whole thing fails.

This is perfectly reasonable behaviour when merging lists and dicts, as reclass needs to resolve each parameter involved in order to merge them together. But it does produce this edge case for scalars, where the final value depends only on the last parameter in the list and not on any of the preceding values.

Note that even if _param:ccc in the example were resolved to 444 without generating an error, an error would still occur, because reclass would still try to resolve _param:aaa, which would still fail.
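The failure described above can be reproduced with a toy model. This is only a sketch of the in-order behaviour as described in this comment, not reclass's actual code: the `params` layout, the `resolve` helper and the shortened `${name}` reference syntax are all assumptions made for illustration.

```python
import re

# Matches a value that consists of a single reference such as "${aaa}".
REF = re.compile(r"\$\{([^}]+)\}")

# Toy model (assumption): each parameter maps to the list of values
# assigned to it, earliest class first.
params = {
    "aaa": ["${yyy}"],
    "ccc": ["${aaa}", 444],
}

def resolve(name, params):
    """Resolve every value assigned to `name` in order; the last one wins."""
    if name not in params:
        raise KeyError(name)
    result = None
    for value in params[name]:
        if isinstance(value, str):
            match = REF.fullmatch(value)
            if match:
                value = resolve(match.group(1), params)
        result = value  # scalars: a later value overwrites an earlier one
    return result

try:
    resolve("ccc", params)
    outcome = "resolved"
except KeyError as exc:
    outcome = f"cannot resolve {exc.args[0]}"

print(outcome)  # cannot resolve yyy
```

Although the final value of `ccc` is 444, the first list element is resolved first and drags in the missing `yyy`.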

To ignore unneeded parameters during a merge/overwrite, reclass would need to resolve the parameter list in reverse order and have some logic for stopping on a failed resolve. If the final element is a scalar and all the resolvable previous elements are scalars, it would just use the final scalar value. For lists and dicts, however, once the elements are resolved the merge would still need to start from the first element in the parameter list.
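A minimal sketch of that reverse-order idea, under the same toy assumptions as before (a `params` dict of value lists, whole-value `${name}` references). The names `Unresolved`, `resolve_value` and `resolve_lazy` are hypothetical, and the dict merge is deliberately shallow.

```python
import re

REF = re.compile(r"\$\{([^}]+)\}")

class Unresolved(Exception):
    """Raised when a referenced parameter does not exist (hypothetical)."""

def resolve_value(value, params):
    if isinstance(value, str):
        match = REF.fullmatch(value)
        if match:
            return resolve_lazy(match.group(1), params)
    return value

def resolve_lazy(name, params):
    if name not in params:
        raise Unresolved(name)
    resolved = []   # collected newest-first
    failure = None
    for value in reversed(params[name]):
        try:
            resolved.append(resolve_value(value, params))
        except Unresolved as exc:
            failure = exc  # stop at the first failed resolve
            break
    if not resolved:
        raise failure  # even the final value could not be resolved
    if all(not isinstance(v, (list, dict)) for v in resolved):
        return resolved[0]  # only scalars seen: the final scalar wins
    if failure is not None:
        raise failure  # containers need every element to merge
    resolved.reverse()  # merge starting from the first element
    merged = resolved[0]
    for nxt in resolved[1:]:
        if isinstance(merged, list) and isinstance(nxt, list):
            merged = merged + nxt
        elif isinstance(merged, dict) and isinstance(nxt, dict):
            merged = {**merged, **nxt}  # shallow merge only
        else:
            merged = nxt
    return merged

params = {
    "aaa": ["${yyy}"],
    "ccc": ["${aaa}", 444],
    "lst": [[1, 2], [3]],
}
print(resolve_lazy("ccc", params))  # 444
print(resolve_lazy("lst", params))  # [1, 2, 3]
```

Resolving `aaa` directly still raises `Unresolved`, which matches the note that the unused parameter remains a separate problem.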

I'm not sure whether that is the correct thing to do, or whether the current behaviour is preferable and the way the _param parameters are organised should be changed instead. I don't use the _param reclass organisation myself, so it's not a problem I've run into.

For the second issue, the _param parameters that would fail but are not needed (_param:aaa), it would be reasonably clean to add an option to supply a regex and only resolve the parameters matching the regex, plus any parameters that those matched parameters depend on. The dependencies could include parameters that do not match the original regex.
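As a rough illustration of the regex idea, the selection step could look like the sketch below. It assumes a flat dict with one final value per parameter; `needed_parameters` is a hypothetical helper, not an existing reclass option.

```python
import re

# Finds every "${name}" reference inside a string value.
REF = re.compile(r"\$\{([^}]+)\}")

def needed_parameters(params, pattern):
    """Return names matching `pattern` plus everything they reference,
    followed transitively."""
    selector = re.compile(pattern)
    pending = [name for name in params if selector.search(name)]
    needed = set()
    while pending:
        name = pending.pop()
        if name in needed:
            continue
        needed.add(name)
        value = params.get(name)
        if isinstance(value, str):
            # Dependencies are queued even if they do not match the regex.
            pending.extend(REF.findall(value))
    return needed

params = {
    "aaa": "${yyy}",   # broken, but nothing selected depends on it
    "ccc": 444,
    "mykey": "${ccc}",
}
print(sorted(needed_parameters(params, r"^mykey$")))  # ['ccc', 'mykey']
```

Only the returned names would then be resolved, so the broken `aaa` is never touched.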

The original reclass did the interpolation in a different way, which is why the issue is relevant. The ways to resolve this might be: A) a better algorithm, B) accept the current, backward-incompatible behaviour, C) fix our models on all levels, D) a workaround that does not throw an exception but stores "UNKNOWN" as the value.

Unless someone volunteers to rewrite it according to A), I would go with option D, made conditional. We could still throw an error at the end if, in the last loop over the top-level structure, an UNKNOWN value was left unresolved. That would also allow us to summarise all missing interpolations in one error output.
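Option D might be sketched as follows. This is a toy with a flat `params` dict and no cycle detection; `interpolate`, `report` and the message wording are assumptions for illustration, not the patch that was eventually written.

```python
import re

REF = re.compile(r"\$\{([^}]+)\}")
UNKNOWN = "UNKNOWN"

def interpolate(params):
    """Resolve all references; store UNKNOWN for missing ones and
    record each miss as (reference, parameter that used it)."""
    missing = []
    cache = {}

    def lookup(name, origin):
        if name not in params:
            missing.append((name, origin))
            return UNKNOWN
        if name not in cache:
            cache[name] = expand(params[name], name)
        return cache[name]

    def expand(value, origin):
        if not isinstance(value, str):
            return value
        whole = REF.fullmatch(value)
        if whole:
            return lookup(whole.group(1), origin)
        # References embedded in a longer string are substituted as text.
        return REF.sub(lambda m: str(lookup(m.group(1), origin)), value)

    resolved = {name: lookup(name, name) for name in params}
    return resolved, missing

def report(missing):
    """Summarise every missing reference in one message."""
    return "\n".join(
        f"Cannot resolve ${{{ref}}}, at {origin}" for ref, origin in missing
    )

resolved, missing = interpolate({
    "aaa": "${yyy}",
    "ccc": "${aaa}",
    "ddd": "${zzz}:${yyy}",
})
print(resolved["ddd"])  # UNKNOWN:UNKNOWN
print(report(missing))
```

A caller could raise with `report(missing)` only when an UNKNOWN value survives into the final output, and otherwise just log it.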

The root cause of the difference is a set of changes in my fork that stem from how the original reclass treated merging references. In the original reclass, references are first merged (a reference simply overwrites a previous reference) and then evaluated. In my fork, references are first evaluated and then merged. This can be seen with the following:

nodes/node1.yml:

classes:
  - test1
  - test2
  - test3

classes/test1.yml:

parameters:
  a:
    - 1
    - 2
    - 3
  b:
    - 4
    - 5
    - 6

classes/test2.yml:

parameters:
  c: ${a}

classes/test3.yml:

parameters:
  c: ${b}

With the original reclass the parameter c evaluates to the list [4,5,6]; with my fork it evaluates to [1,2,3,4,5,6].
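The two orderings can be mimicked in a few lines. This is a sketch of the two strategies as described here, not code from either implementation, and the function names are hypothetical; it assumes every value assigned to c is a reference to a list.

```python
def deref(value, params):
    """Replace a whole-value reference such as '${a}' with the value of a."""
    if isinstance(value, str) and value.startswith("${") and value.endswith("}"):
        return params[value[2:-1]]
    return value

def merge_then_evaluate(values, params):
    # Original style: a later reference overwrites an earlier one,
    # and only the surviving reference is evaluated.
    return deref(values[-1], params)

def evaluate_then_merge(values, params):
    # Fork style: every reference is evaluated first, then the
    # resulting lists are merged in order.
    merged = []
    for value in values:
        merged = merged + deref(value, params)
    return merged

params = {"a": [1, 2, 3], "b": [4, 5, 6]}
c_values = ["${a}", "${b}"]  # from test2.yml, then test3.yml

print(merge_then_evaluate(c_values, params))  # [4, 5, 6]
print(evaluate_then_merge(c_values, params))  # [1, 2, 3, 4, 5, 6]
```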

@epcim - Merging that behaves more like the original reclass is doable as a runtime option (it has to be optional, as I need the new reference merging style). But it's bound to have some oddities and differences compared with the original reclass.

How about the following parameter organisation to fix the errors:

parameters:
  _param:
    cluster_node_addresses: {}
    glusterfs_node_addresses: ${_param:cluster_node_addresses}

    test: ${_param:glusterfs_node_addresses:node01}

With cluster_node_addresses and glusterfs_node_addresses as dictionaries, default values can be written to cluster_node_addresses and used by glusterfs_node_addresses. Merging an empty dictionary onto cluster_node_addresses gives a reasonable error if node addresses are missing:

-> node1
   Cannot resolve ${_param:glusterfs_node_addresses:node01}, at _param:test, in yaml_fs:///home/test/reclass/test8/classes/test2.yml

Node addresses can also be written directly into the glusterfs_node_addresses dict after it is merged with ${_param:cluster_node_addresses}, so that they overwrite the values from cluster_node_addresses without changing the values in cluster_node_addresses itself.

Note that this will not work with the original reclass.
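The dictionary-based organisation above can be modelled roughly like this. It is a toy: `get_path` is a hypothetical helper standing in for colon-delimited lookups, the IP addresses are made up, and the merge is a plain shallow dict merge.

```python
def get_path(tree, path, delimiter=":"):
    """Walk a nested dict using a delimited path such as 'a:b:c'."""
    node = tree
    for key in path.split(delimiter):
        node = node[key]  # raises KeyError on the first missing key
    return node

# Default: an empty dict, so a missing address fails at the point of use.
cluster = {}
pillar = {"_param": {"glusterfs_node_addresses": dict(cluster)}}
try:
    get_path(pillar, "_param:glusterfs_node_addresses:node01")
    error = None
except KeyError as exc:
    error = exc.args[0]
print(error)  # node01

# Addresses written into the cluster dict flow through to glusterfs,
# and a direct override changes glusterfs only.
cluster = {"node01": "10.0.0.1", "node02": "10.0.0.2"}  # made-up values
glusterfs = {**cluster, "node01": "10.0.0.11"}
print(glusterfs["node01"], cluster["node01"])  # 10.0.0.11 10.0.0.1
```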

I lost my week-old comment. I fully agree with Andrew: this is not a bug, and there is a way to fix the behaviour on our side, which is a great suggestion that we should implement. tl;dr - saltclass pillar simply passes ${not:found:option} through if it is not interpolated.

We could optionally do the same (though we probably still want to keep throwing an error), as I fear what would happen if such values were passed to system/network configs and then executed :(.

I will send a patch next week, hopefully. In the end we need a workaround first, as changing our shared system models for backward compatibility will take much longer.

@AndrewPickford and others, please review the proposed fix, plus a feature that prints all missed references at once.

@epcim I've been swamped with a batch system upgrade, so I will try out the proposed changes next week.

@AndrewPickford can you have a quick look today? I would like to merge it quite soon so we can move on.

Resolved by #18