gchq / CyberChef

The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis

Home Page: https://gchq.github.io/CyberChef

Bug report: From Base64 in Strict mode with URL safe alphabet doesn't work

Ihor0k opened this issue

Describe the bug
From Base64 with URL safe alphabet doesn't work in Strict mode. It shows Error: Base64 input contains non-alphabet char(s)

To Reproduce
Input: "test"
Recipe:

  • To Base64 (URL safe alphabet)
  • From Base64 (URL safe alphabet, Strict mode)

Output: "Error: Base64 input contains non-alphabet char(s)"

https://gchq.github.io/CyberChef/#recipe=To_Base64('A-Za-z0-9-_')From_Base64('A-Za-z0-9-_',false,true)&input=dGVzdA

Expected behaviour
Expected output: "test"

CyberChef version: 10.5.2

For those who just want a fix: Include a padding character in the alphabet. See here for a fixed version of @Ihor0k's example.

Now, the details. First, the error message is misleading: the input string clearly contains no non-alphabet characters. What actually happens is that, when the padding is missing, the decoding loop keeps reading characters past the end of the string:

while (i < data.length) {
    // Including `|| null` forces empty strings to null so that indexOf returns -1 instead of 0
    enc1 = alphabet.indexOf(data.charAt(i++) || null);
    enc2 = alphabet.indexOf(data.charAt(i++) || null);
    enc3 = alphabet.indexOf(data.charAt(i++) || null);
    enc4 = alphabet.indexOf(data.charAt(i++) || null);

    if (strictMode && (enc1 < 0 || enc2 < 0 || enc3 < 0 || enc4 < 0)) {
        throw new OperationError("Error: Base64 input contains non-alphabet char(s)");
    }
    // ...
}

Past the end of the string, data.charAt(i++) returns an empty string. Without a guard, alphabet.indexOf("") would return 0, which would make it look as if the character were the first character of the alphabet (A). As the code comment notes, including || null prevents this: indexOf then returns -1. The next bit of code, when strictMode is enabled, checks whether any non-alphabet character was seen. That check is also satisfied by these past-the-end reads, because they too make indexOf return -1. This explains the misleading error message.
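The indexOf behaviour described above can be checked in isolation. The snippet below is a minimal sketch, not CyberChef code: it assumes the URL safe alphabet expands to the 64-character string shown, and the variable names (pastEnd, withoutGuard, withGuard) are made up for illustration.

```javascript
// Expanded URL safe alphabet without a padding character (64 characters).
// Assumption: this is what 'A-Za-z0-9-_' expands to.
const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// "test" encoded without padding: 6 characters, so not a multiple of 4.
const data = "dGVzdA";

// charAt past the end of the string yields "", not undefined.
const pastEnd = data.charAt(6);

// An empty search string matches at position 0, which would look like "A".
const withoutGuard = alphabet.indexOf(pastEnd);

// `"" || null` evaluates to null. indexOf coerces null to the string "null",
// which does not occur in the alphabet, so the result is -1.
const withGuard = alphabet.indexOf(pastEnd || null);

console.log(JSON.stringify(pastEnd), withoutGuard, withGuard); // "" 0 -1
```

The -1 in the last column is the same value a genuine non-alphabet character would produce, which is why the strict-mode check cannot tell the two cases apart.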

There are multiple ways to fix this, depending on what strictMode should enforce. Currently strictMode is a bit vague about what it is trying to accomplish. If strictMode is meant to check for correct data length, it should probably also require you to specify a padding character. If strictMode should also work without a padding character, the non-alphabet character check should be reworked so that it distinguishes two cases: data.charAt(i++) returning an empty string, and data.charAt(i++) returning a non-alphabet character. In the case of an empty string, just continue. In the case of a non-alphabet character, throw the error as it does now.
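One possible shape for the second option, where strict mode works without a padding character, is sketched below. This is an illustration under stated assumptions and not the actual CyberChef implementation: strictIndex and fromBase64StrictSketch are hypothetical names, OperationError is a local stand-in for CyberChef's class, the alphabet is assumed to contain no padding character, and rejecting a single leftover character is my own assumption about what strict mode would want.

```javascript
// Local stand-in for CyberChef's OperationError so the sketch is self-contained.
class OperationError extends Error {}

// Assumed expansion of the URL safe alphabet, without a padding character.
const URL_SAFE = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Hypothetical helper: returns the alphabet index of the character at `pos`,
// null when `pos` is past the end of the input, and throws for a character
// that is present but not in the alphabet.
function strictIndex(data, pos, alphabet) {
    const ch = data.charAt(pos);
    if (ch === "") return null; // past the end: padding is simply missing
    const idx = alphabet.indexOf(ch);
    if (idx < 0) {
        throw new OperationError("Error: Base64 input contains non-alphabet char(s)");
    }
    return idx;
}

// Hypothetical strict decoder for unpadded input.
function fromBase64StrictSketch(data, alphabet = URL_SAFE) {
    const bytes = [];
    for (let i = 0; i < data.length; i += 4) {
        const enc1 = strictIndex(data, i, alphabet);
        const enc2 = strictIndex(data, i + 1, alphabet);
        const enc3 = strictIndex(data, i + 2, alphabet);
        const enc4 = strictIndex(data, i + 3, alphabet);

        // Assumption: a single leftover character cannot encode a whole byte,
        // so it is treated as a length error rather than silently dropped.
        if (enc2 === null) {
            throw new OperationError("Error: Invalid Base64 input length");
        }

        bytes.push((enc1 << 2) | (enc2 >> 4));
        if (enc3 !== null) bytes.push(((enc2 & 15) << 4) | (enc3 >> 2));
        if (enc4 !== null) bytes.push(((enc3 & 3) << 6) | enc4);
    }
    return String.fromCharCode(...bytes);
}

console.log(fromBase64StrictSketch("dGVzdA")); // test
```

With this split, the unpadded input from the report decodes to "test", while an input such as "dGV$dA" still raises the non-alphabet error.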

One more thing: removeNonAlphChars only matters when strictMode is enabled. When strictMode isn't enabled, removeNonAlphChars does nothing, because the non-alphabet characters are ignored anyway. I don't know if conditional arguments are possible in CyberChef, but this would be a good use case for them.