This patch enables additional compiler optimization and tuning for kernel builds by adding more micro-architecture options, accessible under:
Processor type and features --->
Processor family --->
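The kernel uses its own set of CFLAGS, called KCFLAGS. The table below lists each added option with the GCC flag it enables and the minimum compiler versions required: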
CPU Family | GCC Optimization | Min GCC Ver | Min Clang Ver |
---|---|---|---|
Native optimizations autodetected by GCC | -march=native | 4.2 | 3.8 |
Generic 64-bit level v2 | -march=x86-64-v2 | 11.1 | 12.0 |
Generic 64-bit level v3 | -march=x86-64-v3 | 11.1 | 12.0 |
Generic 64-bit level v4 | -march=x86-64-v4 | 11.1 | 12.0 |
AMD Improved K8-family | -march=k8-sse3 | 9.3 | 9.0 |
AMD K10-family | -march=amdfam10 | 9.3 | 9.0 |
AMD Family 10h (Barcelona) | -march=barcelona | 9.3 | 9.0 |
AMD Family 14h (Bobcat) | -march=btver1 | 9.3 | 9.0 |
AMD Family 16h (Jaguar) | -march=btver2 | 9.3 | 9.0 |
AMD Family 15h (Bulldozer) | -march=bdver1 | 9.3 | 9.0 |
AMD Family 15h (Piledriver) | -march=bdver2 | 9.3 | 9.0 |
AMD Family 15h (Steamroller) | -march=bdver3 | 9.3 | 9.0 |
AMD Family 15h (Excavator) | -march=bdver4 | 9.3 | 9.0 |
AMD Family 17h (Zen) | -march=znver1 | 9.3 | 9.0 |
AMD Family 17h (Zen 2) | -march=znver2 | 9.3 | 9.0 |
AMD Family 19h (Zen 3) | -march=znver3 | 10.3 | 12.0 |
Intel Bonnell family Atom | -march=bonnell | 9.3 | 9.0 |
Intel Silvermont family Atom | -march=silvermont | 9.3 | |
Intel Goldmont family Atom (Apollo Lake and Denverton) | -march=goldmont | 9.3 | 9.0 |
Intel Goldmont Plus family Atom (Gemini Lake) | -march=goldmont-plus | 9.3 | |
Intel 1st Gen Core i3/i5/i7-family (Nehalem) | -march=nehalem | 9.3 | 9.0 |
Intel 1.5 Gen Core i3/i5/i7-family (Westmere) | -march=westmere | 9.3 | 9.0 |
Intel 2nd Gen Core i3/i5/i7-family (Sandybridge) | -march=sandybridge | 9.3 | 9.0 |
Intel 3rd Gen Core i3/i5/i7-family (Ivybridge) | -march=ivybridge | 9.3 | 9.0 |
Intel 4th Gen Core i3/i5/i7-family (Haswell) | -march=haswell | 9.3 | 9.0 |
Intel 5th Gen Core i3/i5/i7-family (Broadwell) | -march=broadwell | 9.3 | 9.0 |
Intel 6th Gen Core i3/i5/i7-family (Skylake) | -march=skylake | 9.3 | 9.0 |
Intel 6th Gen Core i7/i9-family (Skylake X) | -march=skylake-avx512 | 9.3 | 9.0 |
Intel 8th Gen Core i3/i5/i7-family (Cannon Lake) | -march=cannonlake | 9.3 | 9.0 |
Intel 10th Gen Core i7/i9-family (Ice Lake) | -march=icelake-client | 9.3 | 9.0 |
Intel Xeon (Cascade Lake) | -march=cascadelake | 10.2 | 10.0 |
Intel Xeon (Cooper Lake) | -march=cooperlake | 10.2 | 10.0 |
Intel 3rd Gen 10nm++ i3/i5/i7/i9-family (Tiger Lake) | -march=tigerlake | 10.2 | 10.0 |
Intel 3rd Gen 10nm++ Xeon (Sapphire Rapids) | -march=sapphirerapids | 11.1 | 12.0 |
Intel 11th Gen i3/i5/i7/i9-family (Rocket Lake) | -march=rocketlake | 11.1 | 12.0 |
Intel 12th Gen i3/i5/i7/i9-family (Alder Lake) | -march=alderlake | 11.1 | 12.0 |
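Before picking an option, it can help to confirm that the installed compiler meets the minimum version in the table. The following is a minimal sketch, not part of the patch: the `MIN_GCC` dictionary copies only a handful of rows from the table above, and the helper names (`parse_version`, `gcc_supports`, `installed_gcc_version`) are hypothetical. It assumes a GCC new enough to understand `-dumpfullversion` (GCC 7 or later).

```python
import re
import subprocess

# Minimum GCC versions, copied from a few rows of the table above.
# This is an illustrative subset, not the full list.
MIN_GCC = {
    "x86-64-v3": (11, 1),
    "znver2": (9, 3),
    "znver3": (10, 3),
    "skylake": (9, 3),
    "alderlake": (11, 1),
}


def parse_version(text):
    """Turn a string such as '11.2.0' into a tuple of ints, here (11, 2, 0)."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def gcc_supports(march, gcc_version):
    """Return True if gcc_version meets the table's minimum for this -march value."""
    # Compare only major.minor, which is the granularity the table uses.
    return parse_version(gcc_version)[:2] >= MIN_GCC[march]


def installed_gcc_version(gcc="gcc"):
    """Ask the compiler for its full version string (assumes GCC 7 or later)."""
    result = subprocess.run(
        [gcc, "-dumpfullversion"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

For example, `gcc_supports("znver3", installed_gcc_version())` would report whether the local GCC satisfies the 10.3 minimum listed for Zen 3. The check only mirrors the table; it does not ask the compiler whether it accepts the flag.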
Three different machines were tested with a make-based benchmark, each running a generic x86-64 kernel and then an otherwise identical kernel built with the optimized GCC options.
There are small but real speed increases from running with this patch, as judged by a make-based endpoint. The increases are on par with the speed increase that the upstream-sanctioned core2 option gives users, so not including additional options seems somewhat arbitrary to me.
- Three test machines: Intel Xeon X3360, Intel i7-2620M, Intel Core i7-3660K.
- All ran the make benchmark (linked below) 35 times while booted into a 'generic' kernel. Then all ran the same make benchmark 35 times after booting into an optimized kernel. Below are the optimizations chosen for each machine.
  - X3360 = core2
  - i7-2620M = sandybridge
  - i7-3660K = ivybridge
- Results were analyzed for statistical significance via ANOVA plots that clearly show statistically significant albeit small differences.
- All the assumptions for ANOVA are met:
  - Data are normally distributed, as shown in the normal quantile plots.
  - The population variances are fairly equal (Levene and Bartlett tests).
- The ANOVA plots clearly show significance.
- Pair-wise analysis by Tukey-Kramer shows significance at the 0.05 level for all CPUs compared.
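To make the ANOVA step concrete, here is a rough sketch of how a one-way ANOVA F statistic could be computed from two or more groups of timings. It is not the analysis used for the results above (those came from ANOVA plots); the function name `anova_f` is hypothetical, and it stops at the F statistic rather than deriving a p-value.

```python
from statistics import mean


def anova_f(*groups):
    """Return the one-way ANOVA F statistic for two or more groups of timings."""
    k = len(groups)                              # number of groups
    n_total = sum(len(g) for g in groups)        # total number of observations
    grand_mean = mean([x for g in groups for x in g])

    # Between-group sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within
```

With 35 generic and 35 optimized timings for one machine, `anova_f(generic, optimized)` gives the ratio of between-group to within-group variance; a larger value means the two kernels differ by more than run-to-run noise would explain.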
Below are the differences in median values:
CPU | Difference in median value |
---|---|
core2 | +87.5 ms |
sandybridge | +79.7 ms |
ivybridge | +257.2 ms |
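As an illustration of how such a median difference could be obtained, the sketch below times a command repeatedly and subtracts the two medians. It is not the bench script linked below; the function names are hypothetical, and the sign convention (positive means the optimized kernel finished faster) is an assumption, since the table does not state it.

```python
import subprocess
import time
from statistics import median


def time_runs(command, runs=35):
    """Run command `runs` times and return each wall-clock duration in milliseconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def median_difference_ms(generic, optimized):
    """Median of the generic-kernel runs minus median of the optimized-kernel runs.

    Assumed convention: a positive result means the optimized kernel was faster.
    """
    return median(generic) - median(optimized)
```

In practice the two lists would be collected in separate boots, one per kernel, by calling something like `time_runs(["make", "-j4"])` in a prepared source tree and saving the results; the `make` arguments here are placeholders.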
- Bash script that controls the benchmark: https://github.com/graysky2/bin/blob/master/bench
- Log file generated by script: http://repo-ck.com/bench/compile_time_optimization.txt.gz
- Original author: jeroen AT linuxforge DOT net
- Link to original version: http://www.linuxforge.net/docs/linux/linux-gcc.php
Support for older versions of the Linux kernel and of GCC can be found in the outdated_versions directory.