unixcharles / acme-client

A Ruby client for Let's Encrypt's ACME protocol.

Nonces expiration and Acme::Client::Error::BadNonce

cema-sp opened this issue

Greetings!
We are using acme-client to update certificates of managed domains in an automated way.
We have created a daemon that stays in memory and periodically orders certificates. It memoizes the client instance and re-uses it on every scheduled run.

The client stores anti-replay nonces in memory and uses them on demand.

What we have noticed is that Let's Encrypt nonces could (most probably) expire after approximately 24 hours, which results in the following error:

Acme::Client::Error::BadNonce: JWS has an invalid anti-replay nonce

For that reason we had to stop memoizing the client.

Do you think it would be useful to allow acme-client users to provide something like a "nonce store instance"? It could be optional and configurable, and would allow one to store nonces keeping in mind their expiration.

# lib/acme/client.rb
...
  def initialize(jwk: nil, kid: nil, private_key: nil, directory: DEFAULT_DIRECTORY, connection_options: {}, nonce_store: [])
    ...
    @nonces ||= nonce_store
  end
...

# my_client.rb

my_store = StoreWithExpiration.new
client = Acme::Client.new(nonce_store: my_store)
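For illustration, a minimal sketch of what such a store could look like. Everything here is an assumption: `StoreWithExpiration` is the hypothetical class named above, the `nonce_store:` option does not exist in acme-client, the sketch assumes the client would only call `<<` and `pop` on the store, and the default `ttl` of 600 seconds is an arbitrary placeholder rather than a known nonce lifetime.

```ruby
# Hypothetical nonce store that silently drops nonces older than `ttl` seconds.
# The clock is injectable so the behaviour can be checked without sleeping.
class StoreWithExpiration
  def initialize(ttl: 600, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl
    @clock = clock
    @entries = [] # [nonce, stored_at] pairs, oldest first
  end

  # Store a nonce together with the time it was received.
  def <<(nonce)
    @entries << [nonce, @clock.call]
    self
  end

  # Return the most recently stored nonce that has not expired, or nil.
  def pop
    prune
    entry = @entries.pop
    entry && entry.first
  end

  def empty?
    prune
    @entries.empty?
  end

  private

  # Discard every nonce stored more than `ttl` seconds ago.
  def prune
    cutoff = @clock.call - @ttl
    @entries.reject! { |_, stored_at| stored_at < cutoff }
  end
end
```

When `pop` returns nil, the caller would have to fetch a fresh nonce from the server before signing the next request.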

What about just retrying on Acme::Client::Error::BadNonce?

Nonce expiration is not really part of the spec, so I prefer to make no assumptions about it and consider it an implementation detail. I would generally consider BadNonce as retry-able.
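A minimal sketch of that retry approach, under stated assumptions: `retry_on` and `BadNonceStandIn` are hypothetical names introduced here so the snippet runs without the gem, and the limit of three attempts is arbitrary. With acme-client loaded, one would pass `Acme::Client::Error::BadNonce` as the error class and put the failing client call inside the block.

```ruby
# Stand-in for Acme::Client::Error::BadNonce so this sketch is self-contained.
class BadNonceStandIn < StandardError; end

# Run the block, retrying when it raises `error_class`.
# Gives up and re-raises once `max_attempts` attempts have failed.
def retry_on(error_class, max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue error_class
    raise if attempts >= max_attempts
    retry
  end
end

# Demo: the first call fails with a stale nonce, the second succeeds.
calls = 0
result = retry_on(BadNonceStandIn) do
  calls += 1
  raise BadNonceStandIn, "JWS has an invalid anti-replay nonce" if calls == 1
  :ok
end
```

Bounding the number of attempts keeps a persistent failure from looping forever; the final error is re-raised unchanged so the caller still sees it.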

@unixcharles Thank you for answering, I'll try this approach.

Nonce expiration is not really part of the spec, so I prefer to make no assumptions about it and consider it an implementation detail. I would generally consider BadNonce as retry-able.

Indeed, and retrying on badNonce is in the spec!

An error response with the "badNonce" error type MUST include a Replay-Nonce header with a fresh nonce. On receiving such a response, a client SHOULD retry the request using the new nonce.

👍

What we have noticed is that Let's Encrypt nonces could (most probably) expire after approximately 24 hours, which results in the following error

Speaking specifically to Let's Encrypt, the time it takes for a nonce to fall out of the active pool is a byproduct of traffic volume, so it's very difficult to firmly establish a lifetime. Today you might see ~24 hours and tomorrow it could be 10 minutes. The other thing is that nonces are per-datacenter. If we swap active datacenters during maintenance, or if load balancing changes, a previously fetched nonce may suddenly be invalid (and retrying is the best option).

Hope that extra detail helps!

@cpu Thank you, that's super helpful 👍 I believe the issue could be closed.