Unclear explanation of GOLDENDOODLE
xdrr opened this issue
The exact definition of GOLDENDOODLE in this tool and the associated blog posts is unclear. I can't seem to determine what distinct vulnerability GOLDENDOODLE represents and how we can prevent it.
In padcheck.go, the comment above the check for GOLDENDOODLE says "Distinct error for valid padding with invalid MAC" whereas Zombie Poodle says "Unique error on invalid padding with valid MAC" but as you can see, these definitions are identical (A != B | B != A).
In this blog post it says "GOLDENDOODLE is the name I’ve given for exploiting modern TLS stacks using the classic CBC padding oracle technique described by Serge Vaudenay in 2002" which seems to imply that GOLDENDOODLE is a pet name for Vaudenay's research. It continues by saying that by reducing the set of characters guessed to application-specific characteristics, the attack can be performed faster. This is true of any padding oracle attack.
In the blog post describing how GOLDENDOODLE was found it talks about there needing to be a lack of MAC validation but doesn't describe how this is relevant to a padding oracle. Some implementations of TLS (like in OpenSSL) check the MAC after padding, which, for the purposes of a padding oracle, is the same as not validating the MAC.
It would be great if further clarification could be provided on what GOLDENDOODLE is and how it can be exploited. For example, could a toy example of the script used to exploit the Cisco ASA bug (described here) be provided to demonstrate GOLDENDOODLE?
Thank you for your interest in this research. I will try to respond to your points/questions in order:
In padcheck.go, the comment above the check for GOLDENDOODLE says "Distinct error for valid padding with invalid MAC" whereas Zombie Poodle says "Unique error on invalid padding with valid MAC" but as you can see, these definitions are identical (A != B | B != A).
The distinction between these test cases is subtle but important. The easiest way to think about this is that any modified ciphertext which decrypts to plaintext ending with \x00 is going to be treated as valid 0-length padding. If the stack reveals that the padding was well-formed independent of MAC validation, the attacker can make more extensive manipulations to the ciphertext. The attacker can more freely rearrange ciphertext blocks. It is no longer strictly necessary to craft HTTP requests aligned to the cipher's block size because you don't need a full block of padding.
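To make "well-formed padding" concrete, here is a minimal sketch of a TLS-style CBC padding check. The function name `tls_padding_ok` is made up for illustration and is not taken from padcheck.go; a real stack would also need room for the MAC in front of the padding, which this sketch ignores.

```python
def tls_padding_ok(plaintext: bytes) -> bool:
    """Return True if the plaintext ends in well-formed TLS CBC padding.

    The last byte n is the padding length; it must be preceded by n more
    bytes that all have the value n (so n + 1 identical bytes in total).
    """
    if not plaintext:
        return False
    n = plaintext[-1]
    if n + 1 > len(plaintext):
        return False
    return plaintext[-(n + 1):] == bytes([n]) * (n + 1)


# Anything ending in \x00 counts as valid zero-length padding.
print(tls_padding_ok(b"data\x00"))
print(tls_padding_ok(b"data\x01\x01"))
print(tls_padding_ok(b"data\x05\x01"))
```

The first two calls are accepted and the third is rejected, because a final \x01 requires the byte before it to be \x01 as well.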
In the case of POODLE/ZP, the attacker relocates a ciphertext block and observes the result. In GD, the attacker relocates the ciphertext block of interest and then sets bytes in the previous ciphertext block based on a guess for the plaintext. If the attacker's guess is correct, the padding will be valid. Multiple bytes can be tested simultaneously with this technique and the tests are rather deterministic. Further modifications like this in a POODLE/ZP attack would corrupt the MAC and leave the attacker uninformed as to whether the padding was valid.
In this blog post it says "GOLDENDOODLE is the name I’ve given for exploiting modern TLS stacks using the classic CBC padding oracle technique described by Serge Vaudenay in 2002" which seems to imply that GOLDENDOODLE is a pet name for Vaudenay's research.
Yes! GOLDENDOODLE is a pet name for the process of applying Vaudenay's research to attacking HTTPS connections when the target implementation contains certain errors.
It continues by saying that by reducing the set of characters guessed to application-specific characteristics, the attack can be performed faster. This is true of any padding oracle attack.
The meaningful bit about this is the fact that GD allows the attacker to "guess" values, which opens the door to deterministic brute-forcing. When exploiting POODLE, the attacker relocates a ciphertext block and observes the response, so it is effectively like rolling a die with 256 sides. (Attacker gets a plaintext byte each time it lands on 16.) With GD, the attacker can fully craft the chain of ciphertext blocks to test whether a specific byte or set of bytes decrypts to a specific value. Rather than rolling dice with each tampered request, the GD attacker surgically crafts messages to test possible decryption values.
In a GD attack, having a limited set of possible plaintext bytes allows the attacker to be more effective. (Don't need to test for non-ASCII bytes in a session cookie.) In a POODLE attack, this has no impact since the attack is bound by luck.
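As a rough back-of-the-envelope illustration of why the alphabet size matters, the sketch below computes the mean number of sequential guesses needed to hit one byte. It assumes the byte is uniformly distributed over the alphabet, that each guess costs one tampered request, and that the oracle gives no false positives; none of these numbers are measurements from the research, and the 256 figure for POODLE is simply the per-byte average quoted in this thread.

```python
def expected_guesses(alphabet_size: int) -> float:
    """Mean number of guesses to find one value when trying candidates in turn."""
    return (alphabet_size + 1) / 2


POODLE_AVG_REQUESTS = 256  # per-byte average quoted in this thread

for label, size in [("any byte value", 256), ("printable ASCII", 95), ("hex digits", 16)]:
    print(f"{label:16} ~{expected_guesses(size):5.1f} guesses per byte "
          f"(POODLE: ~{POODLE_AVG_REQUESTS})")
```

Under these assumptions a hex-only session cookie needs about 8.5 guesses per byte on average, while the POODLE figure stays the same whatever the alphabet is.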
In the blog post describing how GOLDENDOODLE was found it talks about there needing to be a lack of MAC validation but doesn't describe how this is relevant to a padding oracle. Some implementations of TLS (like in OpenSSL) check the MAC after padding, which, for the purposes of a padding oracle, is the same as not validating the MAC.
I hope this has been clarified already by my response. The lack of MAC validation is relevant to HTTPS padding oracle exploitation because the attacker can modify the bytes from the original MAC to seed specific guessed values into the CBC decryption. I'm not sure I follow what you mean about OpenSSL, but I think what you are saying would be more relevant to timing-based padding oracles. (It is ultimately treating the connection the same whether it found the padding or the MAC invalid -- timing might be different, but I intentionally was not researching timing-based oracles.)
It would be great if further clarification could be provided on what GOLDENDOODLE is and how it can be exploited. For example, could a toy example of the script used to exploit the Cisco ASA bug (described here) be provided to demonstrate GOLDENDOODLE?
Unfortunately, the toy example from that blog post was lost due to hardware failure. There may be some additional information of interest in this report or in my Black Hat Asia presentation.
Thanks for the detailed explanation. I think I almost have it but I'm missing a few key points.
It is no longer strictly necessary to craft HTTP requests aligned to the cipher's block size because you don't need a full block of padding.
Can you explain further why this is not necessary, possibly with an example?
Multiple bytes can be tested simultaneously with this technique and the tests are rather deterministic.
Can you clarify how you're able to test multiple bytes concurrently, and whether this is more effective than testing one byte at a time?
The lack of MAC validation is relevant to HTTPS padding oracle exploitation because the attacker can modify the bytes from the original MAC to seed specific guessed values into the CBC decryption.
Can you give an example of how modifying the MAC bytes (usually 20 bytes preceding the padding) helps a padding oracle?
Thanks.
Bump.
Can I please get an update? I'm still trying to understand these vulnerabilities for the services I operate.
Can you explain further why this is not necessary, possibly with an example?
I feel that the example of this is quite clearly illustrated in the presentation. I'm not sure how much more tangible I can make it, but here goes... In POODLE, the attacker must produce a ciphertext where one block is completely filled with padding. The attacker will then duplicate a targeted ciphertext block known to contain some secret and use it to replace the block of padding. When the receiving station decrypts this ciphertext, the interesting ciphertext block gets decrypted by the block cipher and combined via XOR with the previous ciphertext block as part of the vulnerable CBC process. If the result of this XOR is the length value of padding and the receiving station does not validate the padding byte values, it will strip the padding bytes and find a valid MAC. The attacker observes this and the last byte of the targeted ciphertext block can be calculated. The attacker will have to do this an average of 256 times for each byte they are looking for due to the randomness of the encryption.
Now consider the case where the MAC is not being checked. The attacker can duplicate the targeted ciphertext block over the last block, then use a "guess" about the value of its last byte (or bytes) to set the last byte (or bytes) of the previous ciphertext block (generally containing the MAC or MAC'd data portion). The bytes are chosen so that, if the guess is correct, the decryption ends in \x00 (or \x01\x01 or \x02\x02\x02, etc.) and the "padding" is well-formed.
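The one-byte version of this can be sketched end to end. Everything below is a toy under stated assumptions, not the lost exploit script: the block cipher is a small hash-based Feistel construction standing in for AES, the record layout and cookie are invented, and `oracle` models a hypothetical peer that reveals padding validity and never checks the MAC.

```python
import hashlib

BLOCK = 16


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key, rnd, half):
    # Round function of the toy Feistel cipher (a stand-in for AES).
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:BLOCK // 2]


def enc_block(key, block):
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, xor(left, _round(key, rnd, right))
    return left + right


def dec_block(key, block):
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _round(key, rnd, left)), left
    return left + right


def cbc_encrypt(key, iv, plaintext):
    blocks, prev = [], iv
    for i in range(0, len(plaintext), BLOCK):
        prev = enc_block(key, xor(plaintext[i:i + BLOCK], prev))
        blocks.append(prev)
    return blocks


def cbc_decrypt(key, iv, blocks):
    out, prev = b"", iv
    for block in blocks:
        out += xor(dec_block(key, block), prev)
        prev = block
    return out


def padding_ok(plaintext):
    n = plaintext[-1]
    return n < len(plaintext) and plaintext[-(n + 1):] == bytes([n]) * (n + 1)


KEY = b"toy key, not a real TLS secret"
IV = bytes(BLOCK)


def oracle(blocks):
    # Hypothetical vulnerable peer: reports padding validity, ignores the MAC.
    return padding_ok(cbc_decrypt(KEY, IV, blocks))


# Invented record: request line, cookie block, then stand-in MAC bytes and padding.
record = cbc_encrypt(
    KEY, IV,
    b"GET / HTTP/1.1\r\n" + b"Cookie: id=c0de7" + b"M" * 16 + b"M" * 15 + b"\x00",
)


def guess_last_byte(record, target, alphabet):
    prev = record[target - 1]  # block that originally preceded the target
    hits = []
    for g in alphabet:
        tampered = bytearray(record[-2])    # former MAC block, free to modify
        tampered[-1] = prev[-1] ^ g ^ 0x00  # correct guess => plaintext ends in \x00
        forged = record[:-2] + [bytes(tampered), record[target]]
        if oracle(forged):
            hits.append(g)
    return hits


hits = guess_last_byte(record, target=1, alphabet=b"0123456789abcdef")
print(bytes(hits))
```

The true byte `7` is always among the hits, because a correct guess forces the final byte to \x00. A wrong guess can occasionally be accepted too when the untouched bytes happen to form longer valid padding, so the result is a candidate list rather than a guaranteed single answer.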
Can you clarify how you're able to test multiple bytes concurrently, and whether this is more effective than testing one byte at a time?
Hopefully the explanation above clarifies how to test multiple bytes. Basically, it is just a matter of setting fixed values for more bytes of the second-to-last ciphertext block so that the resulting padding is well-formed if the guess is correct. This does have some benefit of error correction, since it is possible (although far less likely) to have a false positive on the oracle with GOLDENDOODLE. For example, consider if the attacker is trying to force the padding length to be \x00 but in reality it resulted in \x01, and the byte before that randomly was also \x01, so the padding is "valid" again. As the "guessed" byte string becomes longer, the chance of this false positive decreases considerably.
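A small model of the XOR arithmetic may help here. It deliberately skips the block cipher and only tracks what the peer sees at the end of the forged record: each tampered byte comes out as secret XOR guess XOR pad. The cookie tail and the helper name `forged_tail` are invented for illustration.

```python
def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def padding_ok(plaintext):
    n = plaintext[-1]
    return n < len(plaintext) and plaintext[-(n + 1):] == bytes([n]) * (n + 1)


def forged_tail(secret_tail, guess):
    """Bytes the peer decrypts at the end of the record after tampering.

    Models CBC only: pad = len(guess) - 1 is the padding value being forced,
    so a fully correct guess yields len(guess) bytes that all equal pad.
    """
    pad = len(guess) - 1
    return xor(xor(secret_tail, guess), bytes([pad]) * len(guess))


secret = b"0de7"  # hypothetical last four bytes of the target block

# Four-byte guess: a correct guess decrypts to \x03\x03\x03\x03.
print(padding_ok(forged_tail(secret, b"0de7")))

# One wrong byte among the four breaks the run of \x03 bytes.
print(padding_ok(forged_tail(secret, b"1de7")))

# The false positive described above: a one-byte guess of "6" yields \x01,
# and the untouched byte before it happens to decrypt to \x01 as well.
print(padding_ok(b"\x01" + forged_tail(secret[-1:], b"6")))

# With any other neighbouring byte the same wrong guess is rejected.
print(padding_ok(b"\x9c" + forged_tail(secret[-1:], b"6")))
```

The first and third checks are accepted and the other two are rejected. In the one-byte test the neighbouring byte is left to chance, whereas a longer guess sets that byte explicitly.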
Can you give an example of how modifying the MAC bytes (usually 20 bytes preceding the padding) helps a padding oracle?
The bottom line in all of this is that we have an unauthenticated data stream and so all parts can be altered without detection. The attacker can make precise changes to improve the chances of a positive oracle response because of how CBC chains the blocks together with XOR to produce plaintext which is then "parsed" as padding.
I've reviewed your comments and the slides for the BH Asia presentation. I've provided formal definitions of your vulnerabilities below based on my new understanding.
Can you confirm these definitions are accurate and complete?
Zombie Poodle: any implementation of TLS that reveals (by a response discrepancy) whether the padding was valid.
GOLDENDOODLE: your idea to use Vaudenay's guessing approach to more quickly guess padding bytes, in implementations that do not validate the MAC.
Yes, this is generally accurate and complete. The names are not a particularly scientific way of classifying these flaws but I did find that it made it easier to communicate the slightly different degrees of risk associated with different server behaviors when dealing with product vendors and site admins.