ReneNyffenegger / cpp-base64

base64 encoding and decoding with c++

Padding of urls

jlcordeiro opened this issue · comments

This isn't so much a bug report as a heads-up or question about URL padding. The current implementation seems to pad URL-safe output with '.' rather than '='.
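For anyone who needs output that other tools accept in the meantime, a minimal workaround sketch follows. Both helper names are hypothetical and not part of cpp-base64; the sketch assumes the only difference is the trailing pad character, which is safe to rewrite because '.' is not in the URL-safe alphabet.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of cpp-base64: rewrites trailing '.'
// padding to '=' so the result matches RFC 4648 style padding.
std::string dots_to_equals(std::string encoded) {
    // Padding only appears at the end of the string, so walk backwards
    // and stop at the first character that is not '.'.
    for (std::size_t i = encoded.size(); i > 0 && encoded[i - 1] == '.'; --i) {
        encoded[i - 1] = '=';
    }
    return encoded;
}

// Hypothetical helper: drops trailing padding entirely ('.' or '=').
std::string strip_padding(std::string encoded) {
    while (!encoded.empty() &&
           (encoded.back() == '.' || encoded.back() == '=')) {
        encoded.pop_back();
    }
    return encoded;
}
```

For example, `dots_to_equals("TQ..")` gives `"TQ=="` and `strip_padding("TWE.")` gives `"TWE"`.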

However, every online tool I found either used '=' or did not pad at all, so I looked up the spec (RFC 4648), which says:

[3.2](https://www.rfc-editor.org/rfc/rfc4648#section-3.2).  Padding of Encoded Data

   In some circumstances, the use of padding ("=") in base-encoded data
   is not required or used.  In the general case, when assumptions about
   the size of transported data cannot be made, padding is required to
   yield correct decoded data.
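To illustrate why padding can be skipped when the whole string is available: the pad count follows from the length alone. This is a rough sketch with hypothetical helper names, assuming the receiver sees the complete encoded text.

```cpp
#include <cstddef>
#include <string>

// Number of pad characters an encoder emits for `input_bytes` bytes of
// raw data: 0 when the length is a multiple of 3, otherwise 2 or 1.
std::size_t pad_count(std::size_t input_bytes) {
    return (3 - input_bytes % 3) % 3;
}

// Hypothetical helper: re-appends '=' to unpadded base64 text until its
// length is a multiple of 4. A length of 4k + 1 is never valid base64,
// so callers should reject such input before calling this.
std::string restore_padding(std::string unpadded) {
    while (unpadded.size() % 4 != 0) {
        unpadded.push_back('=');
    }
    return unpadded;
}
```

So `restore_padding("TQ")` gives `"TQ=="`, matching `pad_count(1) == 2` for the one-byte input "M".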

And then

[5](https://www.rfc-editor.org/rfc/rfc4648#section-5).  Base 64 Encoding with URL and Filename Safe Alphabet

   The Base 64 encoding with an URL and filename safe alphabet has been
   used in [[12](https://www.rfc-editor.org/rfc/rfc4648#ref-12)].

   An alternative alphabet has been suggested that would use "~" as the
   63rd character.  Since the "~" character has special meaning in some
   file system environments, the encoding described in this section is
   recommended instead.  The remaining unreserved URI character is ".",
   but some file system environments do not permit multiple "." in a
   filename, thus making the "." character unattractive as well.

   The pad character "=" is typically percent-encoded when used in an
   URI [[9](https://www.rfc-editor.org/rfc/rfc4648#ref-9)], but if the data length is known implicitly, this can be
   avoided by skipping the padding; see [section 3.2](https://www.rfc-editor.org/rfc/rfc4648#section-3.2).

   This encoding may be referred to as "base64url".  This encoding
   should not be regarded as the same as the "base64" encoding and
   should not be referred to as only "base64".  Unless clarified
   otherwise, "base64" refers to the base 64 in the previous section.

   This encoding is technically identical to the previous one, except
   for the 62:nd and 63:rd alphabet character, as indicated in Table 2.

        Table 2: The "URL and Filename safe" Base 64 Alphabet

     Value Encoding  Value Encoding  Value Encoding  Value Encoding
         0 A            17 R            34 i            51 z
         1 B            18 S            35 j            52 0
         2 C            19 T            36 k            53 1
         3 D            20 U            37 l            54 2
         4 E            21 V            38 m            55 3
         5 F            22 W            39 n            56 4
         6 G            23 X            40 o            57 5
         7 H            24 Y            41 p            58 6
         8 I            25 Z            42 q            59 7
         9 J            26 a            43 r            60 8
        10 K            27 b            44 s            61 9
        11 L            28 c            45 t            62 - (minus)
        12 M            29 d            46 u            63 _
        13 N            30 e            47 v           (underline)
        14 O            31 f            48 w
        15 P            32 g            49 x
        16 Q            33 h            50 y         (pad) =
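Put together, an encoder following the quoted section could look roughly like this. It is a standalone sketch, not the library's code: the function name and the `pad` parameter are assumptions made for illustration. It uses the Table 2 alphabet and either pads with '=' or omits padding.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Alphabet from RFC 4648 Table 2: values 62 and 63 are '-' and '_'.
static const char kUrlAlphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_";

// Hypothetical function; `pad` selects '=' padding (true) or none (false).
std::string base64url_encode(const std::string& in, bool pad = true) {
    std::string out;
    out.reserve((in.size() + 2) / 3 * 4);

    // Reads byte `idx` as an unsigned value, or 0 past the end of input.
    auto byte_at = [&in](std::size_t idx) -> std::uint32_t {
        return idx < in.size() ? static_cast<unsigned char>(in[idx]) : 0u;
    };

    for (std::size_t i = 0; i < in.size(); i += 3) {
        // Pack up to three bytes into a 24-bit group.
        std::uint32_t n =
            (byte_at(i) << 16) | (byte_at(i + 1) << 8) | byte_at(i + 2);
        std::size_t rest = in.size() - i;  // bytes available in this group

        out.push_back(kUrlAlphabet[(n >> 18) & 63]);
        out.push_back(kUrlAlphabet[(n >> 12) & 63]);
        if (rest >= 2) {
            out.push_back(kUrlAlphabet[(n >> 6) & 63]);
        } else if (pad) {
            out.push_back('=');
        }
        if (rest >= 3) {
            out.push_back(kUrlAlphabet[n & 63]);
        } else if (pad) {
            out.push_back('=');
        }
    }
    return out;
}
```

With this sketch, `base64url_encode("M")` gives `"TQ=="`, `base64url_encode("Ma", false)` gives `"TWE"`, and the two bytes `0xFB 0xFF` encode to `"-_8="` where standard base64 would produce `"+/8="`.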

agree