gdcc / xoai

OAI-PMH Java Toolkit

Investigate adding a nonce to the resumption tokens

poikilotherm opened this issue · comments

  • The spec says resumption tokens must be idempotent.
  • By default, we simply encode the request parameters with Base64.
  • This opens the possibility of crafting stupid requests, etc.
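To illustrate the last two points, here is a minimal sketch of why a Base64-only token can be crafted by anyone. The parameter layout (`set=...&offset=...`) and the class name are assumptions made for illustration; the actual serialization used by the library may differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch: not the library's real token format.
final class PlainTokenSketch {

    // Encode request parameters as URL-safe Base64 without padding.
    static String encode(String params) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(params.getBytes(StandardCharsets.UTF_8));
    }

    // Decode a token back into its request parameters.
    static String decode(String token) {
        return new String(Base64.getUrlDecoder().decode(token), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A token the repository might have issued (assumed layout).
        String issued = encode("metadataPrefix=oai_dc&set=articles&offset=100");
        // A token a client made up on its own; nothing distinguishes it from a real one.
        String forged = encode("metadataPrefix=oai_dc&set=articles&offset=987654");

        System.out.println(decode(issued));
        System.out.println(decode(forged));
    }
}
```

Base64 is an encoding, not a secret, so the server has no way to tell the two tokens apart.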

Adding a nonce when the token is created means it can be used to identify cached entries coming from the same origin. An implementing repository could discard any requests with an unrecognized nonce.

Of course, the nonce would need a shared secret between the library and the repository - otherwise there won't be a common ground to identify valid new tokens crafted by the library and sent to the repository. As there already is RepositoryConfiguration, this is an easy thing to add.

The secret should default to a random string, so it gets re-initialized on every application start. That expires old tokens and avoids at least the very stupid replay attacks. An application using the library may choose to provide a static key, schedule rotation etc - that's up to their DevSecOps. Applications storing tokens in a database can use the most recent secret to identify old entries.
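A rough sketch of how this could look, under stated assumptions: the issue does not name a construction, so HMAC-SHA256 over the encoded parameters is used here as the "nonce", and the class `SignedTokenSketch` and its methods are hypothetical rather than part of the library or of `RepositoryConfiguration`.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch: names and token layout are assumptions, not library API.
final class SignedTokenSketch {
    private final byte[] secret;

    // Default: a fresh random secret per start, so tokens from earlier runs stop validating.
    SignedTokenSketch() {
        this(randomSecret());
    }

    // An application may supply a static or rotated key instead.
    SignedTokenSketch(byte[] secret) {
        this.secret = secret.clone();
    }

    static byte[] randomSecret() {
        byte[] key = new byte[32];
        new SecureRandom().nextBytes(key);
        return key;
    }

    private byte[] tag(byte[] payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(payload);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Same parameters and same secret always give the same token.
    String issue(String params) {
        byte[] payload = params.getBytes(StandardCharsets.UTF_8);
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        return enc.encodeToString(payload) + "." + enc.encodeToString(tag(payload));
    }

    // Returns the parameters, or null if the token was not issued with this secret.
    String verify(String token) {
        String[] parts = token.split("\\.", -1);
        if (parts.length != 2) {
            return null;
        }
        try {
            byte[] payload = Base64.getUrlDecoder().decode(parts[0]);
            byte[] given = Base64.getUrlDecoder().decode(parts[1]);
            // Constant-time comparison of the expected and the supplied tag.
            return MessageDigest.isEqual(tag(payload), given)
                    ? new String(payload, StandardCharsets.UTF_8)
                    : null;
        } catch (IllegalArgumentException e) {
            return null; // not valid Base64
        }
    }
}
```

Because the tag is derived deterministically from the parameters, re-issuing a token for the same request yields the same string, which should keep the idempotency requirement intact. A second instance created with a different secret (for example after a restart) rejects the token.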

I agree that this may be something we want to consider adding. Though it may not be the highest priority thing to address.

At the very least, I don't think the "opens the possibility for crafting stupid requests" part is a big opening for abuse. The stupidest request somebody can make is to ask for the configured max. number of records at a random offset; but why would anyone want to do that to us, what's in it for them? As a DoS attack, just to annoy us/waste our CPU cycles, anyone can just run ListRecords repeatedly on a set with more than the max. number of records, getting the first page of records, which does not require any resumption token. (?)

There is a different issue that this could address; it's not about abuse, but about ensuring that the set in question is "fresh" and hasn't changed since the token was issued. It could potentially be solved by being able to invalidate all the outstanding resumption tokens for a given set once the set has been updated. This can be as simple as a randomly generated string saved in the database for every set, used to generate tokens, and updated every time the contents of the set are modified... But I do believe that there is also a simpler solution that would not require maintaining this token state (but I need to confirm that).
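The per-set string idea could be sketched roughly as follows. Everything here is an assumption for illustration: the class `SetVersionSketch`, the in-memory map standing in for the database column, and the plain-text `set|offset|version` token layout (which relies on `|` not appearing in a set name).

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: an in-memory map stands in for the per-set database value.
final class SetVersionSketch {
    private final Map<String, String> versions = new ConcurrentHashMap<>();

    // Random string currently associated with the set, created on first use.
    String currentVersion(String setSpec) {
        return versions.computeIfAbsent(setSpec, s -> UUID.randomUUID().toString());
    }

    // Call whenever the set's contents change; outstanding tokens become stale.
    void markModified(String setSpec) {
        versions.put(setSpec, UUID.randomUUID().toString());
    }

    // Token carries the version that was current when it was issued.
    String issue(String setSpec, int offset) {
        return setSpec + "|" + offset + "|" + currentVersion(setSpec);
    }

    // A token is fresh only if its version still matches the set's current one.
    boolean isFresh(String token) {
        String[] parts = token.split("\\|", -1);
        return parts.length == 3 && parts[2].equals(versions.get(parts[0]));
    }
}
```

A harvester holding a token issued before `markModified("articles")` would have it rejected and would need to restart the list request.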