go-jose / go-jose

An implementation of JOSE standards (JWE, JWS, JWT) in Go

Base64url without padding

jsha opened this issue

We're working on upgrading letsencrypt/boulder to v3. In the process I noticed PR #3, which changes go-jose so it accepts base64url inputs with any number of padding characters (=).

As justification, it cites the example implementation of base64url without padding from RFC 7515, Appendix C. However, I think that interpretation is incorrect. I'll reproduce the example here for convenience:

     static byte [] base64urldecode(string arg)
     {
       string s = arg;
       s = s.Replace('-', '+'); // 62nd char of encoding
       s = s.Replace('_', '/'); // 63rd char of encoding
       switch (s.Length % 4) // Pad with trailing '='s
       {
         case 0: break; // No pad chars in this case
         case 2: s += "=="; break; // Two pad chars
         case 3: s += "="; break; // One pad char
         default: throw new System.Exception(
           "Illegal base64url string!");
       }
       return Convert.FromBase64String(s); // Standard base64 decoder
     }

This code adds exactly the right amount of padding to an unpadded string, then hands the now-padded string to C#'s Convert.FromBase64String function, which itself verifies that the padding is correct.

I think there is no support in RFC 7515 for arbitrarily padded base64url strings. I'd like to propose reverting PR #3.

I think there are actually three different possible behaviors here:

  1. Be very strict. RFC 7515, Section 2 says that the phrase "Base64url Encoding" means "without padding". Therefore input base64url strings must not have padding, and any that do must be rejected. This feels overly strict for two reasons: as PR #3 says, padding shows up in the wild; and RFC 7515 itself is actually pretty vague on this point, saying only "as permitted by Section 3.2", a section that then says nothing about whether padding is allowed.

  2. Strip all padding before processing. This is what PR #3 implemented. But it is perhaps overly lax, as it also accepts inputs with too little (e.g. = instead of ==) or too much (e.g. ======) padding.

  3. Accept either unpadded or correctly-padded base64url. This seems to me like the best compromise: do exactly what the RFC says, but also accept the correctly-padded form because it's been observed in the wild and is a reasonable mistake to make. However, it is slightly more complex to implement than either of the above, since it requires testing whether the correct amount of padding is present, rather than blindly correcting for it. Maybe the best implementation of this approach would be something like this:

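import (
	"encoding/base64"
	"strings"
)

// base64URLDecode accepts unpadded or correctly padded base64url input.
func base64URLDecode(value string) ([]byte, error) {
	if strings.HasSuffix(value, "=") {
		return base64.URLEncoding.DecodeString(value)
	}
	return base64.RawURLEncoding.DecodeString(value)
}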
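To make the difference between the three behaviors concrete, here is a minimal sketch that runs the same inputs through each of them. The function names decodeStrict, decodeStripAll and decodeEither are hypothetical and are not go-jose's actual code; decodeStripAll only approximates what PR #3 is described as doing, on the assumption that it trims every trailing = before decoding.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// decodeStrict sketches option 1: any padding is rejected.
func decodeStrict(s string) ([]byte, error) {
	return base64.RawURLEncoding.DecodeString(s)
}

// decodeStripAll sketches option 2 (an assumption about PR #3's approach):
// every trailing '=' is removed before decoding.
func decodeStripAll(s string) ([]byte, error) {
	return base64.RawURLEncoding.DecodeString(strings.TrimRight(s, "="))
}

// decodeEither sketches option 3: unpadded or correctly padded input only.
func decodeEither(s string) ([]byte, error) {
	if strings.HasSuffix(s, "=") {
		// The padding-aware decoder rejects a wrong number of '=' characters.
		return base64.URLEncoding.DecodeString(s)
	}
	return base64.RawURLEncoding.DecodeString(s)
}

func main() {
	// "QQ" is the unpadded encoding of the single byte 'A'.
	inputs := []string{"QQ", "QQ==", "QQ=", "QQ======"}
	for _, in := range inputs {
		_, e1 := decodeStrict(in)
		_, e2 := decodeStripAll(in)
		_, e3 := decodeEither(in)
		fmt.Printf("%-10q strict=%-5t stripAll=%-5t either=%t\n",
			in, e1 == nil, e2 == nil, e3 == nil)
	}
}
```

With Go's standard encoding/base64 package, only decodeStripAll should accept the under-padded QQ= and the over-padded QQ======, while decodeEither accepts exactly QQ and QQ==.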

(Notably, the sample code in RFC 7515, Appendix C does none of these: it attempts to do (2), but in fact is perfectly willing to accept a string with a single padding character that should have had two padding characters.)
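The observation about the Appendix C sample can be checked with a rough Go port. This is only a sketch: rfcSampleDecode is a hypothetical name, and it assumes that Go's base64.StdEncoding is about as strict on padding as C#'s Convert.FromBase64String, which may not hold in every corner case.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// rfcSampleDecode is a hypothetical Go port of the RFC 7515 Appendix C sample.
func rfcSampleDecode(arg string) ([]byte, error) {
	s := strings.ReplaceAll(arg, "-", "+") // 62nd char of encoding
	s = strings.ReplaceAll(s, "_", "/")    // 63rd char of encoding
	switch len(s) % 4 {                    // pad with trailing '='s
	case 0: // no pad chars in this case
	case 2:
		s += "==" // two pad chars
	case 3:
		s += "=" // one pad char
	default:
		return nil, errors.New("illegal base64url string")
	}
	// Standard, padding-aware base64 decoder.
	return base64.StdEncoding.DecodeString(s)
}

func main() {
	for _, in := range []string{"QQ", "QQ==", "QQ=", "QQ======", "Q"} {
		out, err := rfcSampleDecode(in)
		fmt.Printf("%q -> %q, err=%v\n", in, out, err)
	}
}
```

In this port, the under-padded QQ= has length 3, so one = is appended and the result QQ== decodes cleanly. The over-padded QQ====== has a length divisible by 4, so it is passed through unchanged and rejected by the standard decoder.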

@jsha Can we close this with the v4 release?