Turbo Base64:Fastest Base64 Scalar+SIMD

Fastest Base64 SIMD Encoding library

🆕 (2022.02) the fastest now more faster
🆕 (2022.02) improved speed for short strings
100% C (C++ headers), as simple as memcpy
No other base64 library encode or decode faster
✨ Scalar can be faster than other SSE or ARM Neon based base64 libraries
SSE faster than other SSE/AVX/AVX2! base64 library
Fastest AVX2 implementation
TurboBase64 AVX2 decoding up to ~2x faster than other AVX2 libs.
TurboBase64 is 3-4 times faster than other libs for short strings
Fastest ARM Neon base64
👍 Dynamic CPU detection and JIT scalar/sse/avx/avx2 switching
Base64 robust error checking, optimized for long+short strings
Portable library, 32/64 bits, SSE/AVX/AVX2, ARM Neon, Power9 Altivec
OS:Linux amd64, arm64, Power9, MacOs+Apple M1, s390x. Windows: Mingw, visual c++
Big endian + Little endian
Ready and simple to use library, no armada of files, no hassles dependencies
LICENSE GPL 3

Benchmark incl. the best SIMD Base64 libs:

Single thread
Including base64 error checking
Small file + realistic and practical (no PURE cache) benchmark
Unlike other benchmarks, the best of the best scalar+simd libraries are included
all libraries with the latest version

Benchmark Intel CPU: i7-9700k 3.6GHz gcc 11.2

E Size	ratio%	E MB/s	D MB/s	Name	50,000 bytes - 2022.02
67380	133.3	32497	34335	tb64v256	Turbo Base64 avx2
67380	133.3	27789	22264	b64avx2	Base64 avx2
67380	133.3	25305	21980	fb64avx2	Fastbase64 avx2

67380	133.3	17064	20504	tb64v128a	Turbo Base64 avx
67380	133.3	15989	18809	tb64v128	Turbo Base64 sse
67380	133.3	15820	13078	b64avx	Base64 avx
67380	133.3	15322	11302	b64sse	Base64 sse41
50000	100.0	47593	47623	memcpy

E Size	ratio%	E MB/s	D MB/s	Name	1 MB - 2022.02
1333336	133.3	29086	29748	tb64v256	Turbo Base64 avx2
1333336	133.3	26153	22515	b64avx2	Base64 avx2
1333336	133.3	23686	21231	fb64avx2	Fastbase64 avx2

1333336	133.3	16897	20215	tb64v128a	Turbo Base64 avx
1333336	133.3	15932	18749	tb64v128	Turbo Base64 sse
1333336	133.3	15537	12959	b64avx	Base64 avx
1333336	133.3	15135	11304	b64sse	Base64 sse41

1333336	133.3	6546	5473	TB64x	Turbo Base64 scalar
1333336	133.3	6495	4454	b64plain	Base64 plain
1333336	133.3	1908	2752	TB64s	Turbo Base64 scalar
1333336	133.3	2541	4289	chrome	Google Chrome base64
1333336	133.3	2670	2299	fb64plain	FastBase64 plain
1333334	135.4	1754	219	linux	Linux base64
1000000	100.0	28688	28656	memcpy

TurboBase64 vs. Base64 for short strings (incl. checking)

String length	E MB/s	D MB/s	Name	50,000 bytes - short strings 2022.02
4 - 16	2330	2161	TB64avx2	Turbo Base64 avx2
	891	734	b64avx2	Base64 avx2
8 - 32	3963	3570	TB64avx2	Turbo Base64 avx2
	1348	943	b64avx2	Base64 avx2
16 - 64	6881	5937	TB64avx2	Turbo Base64 avx2
	2509	1488	b64avx2	Base64 avx2
32 - 128	10946	8880	TB64avx2	Turbo Base64 avx2
	4902	2777	b64avx2	Base64 avx2

Benchmark ARM Neon: ARMv8 A73-ODROID-N2 1.8GHz (clang 6.0)

E Size	ratio%	E MB/s	D MB/s	Name	30MB binary 2019.12
40000000	133.3	2026	1650	TB64neon	Turbo Base64 Neon
40000000	133.3	1795	1285	b64neon64	Base64 Neon
40000000	133.3	1270	1095	TB64x	Turbo Base64 scalar
40000000	133.3	695	965	TB64s	Turbo Base64 scalar
40000000	133.3	512	782	fb64neon	Fastbase64 SIMD Neon
40000000	133.3	565	460	Chrome	Google Chrome base64
40000000	133.3	642	614	b64plain	Base64 plain
40000000	133.3	506	548	fb64plain	Fastbase64 plain
40500000	135.4	314	91	Linux	Linux base64
30000000	100.0	3820	3834	memcpy

(bold = pareto in category) MB=1.000.000
(E/D) : Encode/Decode
Timmings are respectively relative to the base64 output/input in encode/decode.

Compile: (Download or clone Turbo Base64 SIMD)

    git clone git://github.com/powturbo/TurboBase64.git
    make

Usage: (Benchmark App)

    ./tb64app file 
    or  
    ./tb64app

Function usage:

static inline unsigned turbob64len(unsigned n)
Base64 output length after encoding

unsigned tb64enc(const unsigned char *in, unsigned inlen, unsigned char *out)
Encode binary input 'in' buffer into base64 string 'out'
with automatic cpu detection for simd and switch (sse/avx2/scalar
in : Input buffer to encode
inlen : Length in bytes of input buffer
out : Output buffer
return value: Length of output buffer
Remark : byte 'zero' is not written to end of output stream
Caller must add 0 (out[outlen] = 0) for a null terminated string

unsigned tb64dec(const unsigned char *in, unsigned inlen, unsigned char *out)
Decode base64 input 'in' buffer into binary buffer 'out'
in : input buffer to decode
inlen : length in bytes of input buffer
out : output buffer
return value: >0 output buffer length
0 Error (invalid base64 input or input length = 0)

Environment:

OS/Compiler (32 + 64 bits):

Windows: Visual C++ (2017)
Windows: MinGW-w64 makefile
Linux amd/intel: GNU GCC (>=4.6)
Linux amd/intel: Clang (>=3.2)
Linux arm: aarch64 ARMv8 Neon: gcc (>=6.3)
Linux arm: aarch64 ARMv8 Neon: clang (>=6.0)
MaxOS: XCode (>=9), apple M1
PowerPC ppc64le: gcc (>=8.0) incl. SIMD Altivec

References:

* SIMD Base64 publications:

Last update: 15 FEB 2022

About

Turbo Base64 - Fastest Base64 SIMD/Neon/Altivec

GNU General Public License v3.0

Languages

Language:C 98.2%Language:Makefile 1.8%