indigo-iam / wlcg-jwt-compliance-tests

Prototype WLCG TPC testsuite using JWT authN/Z

An odd authorisation policy?

paulmillar opened this issue

The setup description in README.md says:

Write access (with the exclusion of the /protected folder) is granted to any client presenting a valid WLCG VO proxy

At first, I thought the inclusion of the word "proxy" (in "a valid WLCG VO proxy") was a mistake, that all members of the WLCG VO have write access. However, one of the compliance tests verifies that a JWT token (with no group or authz claims) does not have write access.

These two (the VOMS write requirement and the test), taken together, lead to an odd scenario. Consider someone who is a member of the WLCG VO, but who is not a member of any (sub)groups therein. That person would have read-write access if they authenticate via X.509+VOMS, but would have read-only access if they authenticate via a JWT.

This makes no sense: either the person is granted write access, or they are not. The choice of authentication technology should not matter. When it does matter, this is often because of misconfiguration or limitations in the underlying software.

Therefore, it seems rather perverse (at least, to me) that a compliance test suite would require such an odd ("broken") setup.

Instead, if a subset of users should have write access then that subgroup should be expressed as a group/FQAN, and not by the user's choice of authentication technology.

Picking up the thread from the email (I think this ticket is the right place...).

I would suggest that the token permissions are too permissive still. This is not an identity token similar to the X.509 proxy authorization mechanisms, but an access token where the authorization is either from a positive presence of a group or a capability.

So:

  1. If the session is authenticated with a WLCG VO proxy (that is, a VOMS extension signed by the WLCG VO), you should have read/write access. This is the closest analogy to the group attribute.
  2. If the request contains a token with a group attribute with any group, you should get read/write access.
  3. If the request contains a token with a capability asserting particular read/write access, that capability should be honored.

The access for a given request should be the union of the authorizations granted by (1), (2), and (3).
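As a rough illustration of the union rule above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Request` class, the `granted_operations` function, the mapping of `storage.read` to "read" and of `storage.create`/`storage.modify` to "write", and the prefix matching on the scope path. It also ignores the /protected exclusion mentioned in the README. It is not how any particular storage system implements this.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Request:
    """Hypothetical view of the credentials attached to one request."""
    voms_vo: Optional[str] = None                            # VO that signed the VOMS extension, if any
    token_groups: List[str] = field(default_factory=list)    # groups asserted in the token
    token_scopes: List[str] = field(default_factory=list)    # e.g. "storage.read:/data"


# Assumed mapping from scope names to the coarse operations used here.
SCOPE_TO_OPERATION = {
    "storage.read": "read",
    "storage.create": "write",
    "storage.modify": "write",
}


def _covers(base: str, path: str) -> bool:
    """True if `path` is `base` itself or lies underneath it."""
    return path == base or path.startswith(base.rstrip("/") + "/")


def granted_operations(req: Request, path: str) -> Set[str]:
    """Union of the authorizations granted by rules (1), (2) and (3)."""
    ops: Set[str] = set()
    # (1) VOMS extension signed by the WLCG VO: read/write.
    if req.voms_vo == "wlcg":
        ops |= {"read", "write"}
    # (2) Token asserting any group: read/write.
    if req.token_groups:
        ops |= {"read", "write"}
    # (3) Token capabilities: honour each one for the paths it covers.
    for scope in req.token_scopes:
        name, _, base = scope.partition(":")
        operation = SCOPE_TO_OPERATION.get(name)
        if operation and _covers(base or "/", path):
            ops.add(operation)
    return ops
```

With this sketch, a request carrying a token with neither groups nor storage scopes ends up with an empty set, while a WLCG VOMS proxy or any group assertion yields both "read" and "write".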

For me, there are two related but distinct things at play here.

First, there's the theoretical "correct behaviour" of any storage system. Given the possible ways in which the client may attempt to authorise the request, what should happen? There are anywhere between one and three groups of information here: the VOMS FQANs (if X.509 client authentication is used), the group membership in the token (if a JWT is used) and scope-based explicit AuthZ in the token (again, if a token is used). These different groups may have conflicting assertions, so getting this "right" (supporting the desired use-cases) may be challenging.

For the most part, the JWT profile document should describe correct behaviour here.

We may wish to review that document; for example, to remove an ambiguity (or possibly to change our mind). For example, I'm not sure to what extent the JWT profile document describes how a service is expected to behave if the client both uses X.509 client authentication and attempts to authorise a request using a JWT (a quick scan suggests this is missing). Another possible omission is the behaviour of unauthenticated (the "anonymous" user) access; is that "in scope" for this document?

The important point here is that the JWT document exists independently of this test suite. The document is (IMHO) the final arbiter in any contention: if a test from the suite checks something that isn't in the document then the test is wrong.

There is, however, a second issue.

Part of the JWT profile document details how group membership is asserted. This is intentionally similar to how VOMS currently operates (something the JWT profile document makes explicit). Therefore, under certain circumstances, the behaviour will depend on how the storage system is configured; in particular, how the storage system's permissions are set up. The test suite cannot simply run on a virgin system, but requires that the storage is carefully configured ("just so"), as documented in the README.md file.

In a way, it shouldn't matter exactly how the test suite requires the storage service to be configured, provided it is self-consistent. However, I believe the currently required setup is somewhat different from how storage is currently deployed in WLCG. This may be inevitable, in order to catch edge-cases, but it might be worth considering whether such deployments are easily achievable.

The risk here is that additional functionality is developed that is only used by the test suite (never used in production).

Purely to illustrate this point, here is a constructed, hypothetical but concrete example. The test suite could require that storage grants additional privileges if the DN does not contain the same character more than once (if the DN contains an "a" then there is only one). To fully support this requirement, storage developers would likely need to develop custom code to check whether this condition is met. However, such an enhancement would likely only be "used" by the test suite.

As a somewhat concrete suggestion, perhaps we could start cataloguing scenarios ("use-cases") that we wish to support. These could be ones that exist already (e.g., a production user writes data but a regular user cannot modify it; users can upload output from jobs that all members of the VO can read), but that catalogue might include new ones (e.g., public-access data, group-specific embargoed data, "personal" home directories, etc). This catalogue would also need to describe how users are authenticating (via VOMS? using a JWT with group-membership? with AuthZ? A combination? If so, what?). Although various ideas and use-cases have come up during the WLCG-AuthZ WG discussions, I believe this catalogue (as a concrete document) is currently missing.

I think having such a catalogue would be generally helpful.

It would help "ground" the JWT Profile document, both in checking that we can do everything we want to, and to explain why things are done the way they are. The use-case catalogue could also help identify how storage should be configured; for example, by saying directory-A supports use-cases 1, 2 and 3 while directory-B supports use-cases 4 and 5.
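To make the idea of tying directories to use-cases a little more tangible, here is a small Python sketch of what a machine-readable catalogue could look like. The `UseCase` class, the numbering, the `auth_methods` labels and the assignment of use-cases to directory-A and directory-B are all placeholders invented for illustration; no such catalogue exists yet, and the real one might well be a plain document rather than code.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class UseCase:
    """One catalogue entry (hypothetical structure)."""
    number: int
    description: str
    auth_methods: Tuple[str, ...]  # placeholder labels for how users authenticate


# Descriptions are taken from the discussion; numbers and auth methods are assumed.
CATALOGUE: Dict[int, UseCase] = {
    1: UseCase(1, "production user writes data; regular users cannot modify it",
               ("voms", "jwt-groups")),
    2: UseCase(2, "job output uploaded by a user is readable by all VO members",
               ("voms", "jwt-groups")),
    4: UseCase(4, "public-access data", ("anonymous",)),
}

# Which use-cases each directory on the test endpoint is meant to support.
DIRECTORY_USE_CASES: Dict[str, List[int]] = {
    "directory-A": [1, 2],
    "directory-B": [4],
}


def use_cases_for(directory: str) -> List[UseCase]:
    """Look up the catalogue entries a directory is configured to support."""
    return [CATALOGUE[n] for n in DIRECTORY_USE_CASES.get(directory, [])]
```

A test could then state which use-case it exercises, which would make it easier to see why the storage needs a given configuration.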

We may still (additionally) need the storage to be configured in some odd-ball way, to tease out some edge-cases, but with a more use-case driven approach, I think those would be cleaner and clearer.

(Just my 2c worth).

Modified tests to match new permission requirements:

  1. If the session is authenticated with a WLCG VO proxy (that is, a VOMS extension signed by the WLCG VO), you should have read/write access. This is the closest analogy to the group attribute.

  2. If the request contains a token with a group attribute with any group, you should get read/write access.

  3. If the request contains a token with a capability asserting particular read/write access, that capability should be honored.

In particular, we now test that read/write access within the wlcg SA is allowed for clients presenting a valid JWT token issued by WLCG IAM with a group /wlcg in the wlcg.groups claim.
Moreover, we also test that read/write access is denied to WLCG clients without wlcg.groups or storage.* scope claims.

Capability-based tests (point 3.) were already in place.
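For readers who want to see the two group-based expectations side by side, here is a minimal Python sketch over decoded token claims. It is not the test suite's own code: the function name `expected_read_write` is hypothetical, and it assumes that groups appear as a list under `wlcg.groups` and that scopes appear as a space-separated string under `scope`.

```python
from typing import Optional


def expected_read_write(claims: dict) -> Optional[bool]:
    """Expected read/write outcome in the wlcg SA for the two group-based tests.

    Returns True (allowed), False (denied), or None when the token falls
    outside these two tests (for example, capability-only tokens).
    """
    groups = claims.get("wlcg.groups", [])
    scopes = claims.get("scope", "").split()
    if "/wlcg" in groups:
        return True
    if not groups and not any(s.startswith("storage.") for s in scopes):
        return False
    return None


# Example payloads (illustrative values only).
with_group = {"wlcg.groups": ["/wlcg"], "scope": "openid"}
bare_token = {"scope": "openid profile"}
capability_only = {"scope": "storage.read:/"}
```

Here `with_group` is expected to be allowed, `bare_token` is expected to be denied, and `capability_only` is left to the capability-based tests.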

PR #32