Counting CPUs is extremely slow on large systems

Question

fweimer opened this issue 4 years ago

On large systems, this fragment from src/journal.sh seems to be responsible for most of the delay in starting new tasks:

    local line size
    # CPU info
    if [ -f "/proc/cpuinfo" ]; then
        local count=0
        local type="unknown"
        local cpu_regex="^model\sname.*: (.*)$"
        while read -r line; do
            if [[ "$line" =~ $cpu_regex ]]; then
                type="${BASH_REMATCH[1]}"
                let count++
            fi
        done < "/proc/cpuinfo"
        __INTERNAL_WriteToMetafile hw_cpu -- "$count x $type"
        __INTERNAL_LogText "    CPUs          : $count x $type" 2> /dev/null
    fi

bash reads /proc/cpuinfo in pieces of 128 bytes, and the kernel really does not like that because of the way the proc file system is implemented. (I tried bash-4.4.19-10.el8.x86_64 and kernel-4.18.0-193.el8.x86_64.) The reported counts are wrong as well, for some reason.

To put the performance in perspective, running this modified script

    if [ -f "/proc/cpuinfo" ]; then
        count=0
        type="unknown"
        cpu_regex="^model\sname.*: (.*)$"
        while read -r line; do
            if [[ "$line" =~ $cpu_regex ]]; then
                type="${BASH_REMATCH[1]}"
                let count++
            fi
        done < "/proc/cpuinfo"
        echo hw_cpu -- "$count x $type"
        echo "    CPUs          : $count x $type" 2> /dev/null
    fi

results on an unloaded system in:

$ time bash cpu-script
hw_cpu -- 256 x AMD EPYC 7742 64-Core Processor
    CPUs          : 256 x AMD EPYC 7742 64-Core Processor

real    0m16.720s
user    0m0.095s
sys     0m16.565s

It's much worse on larger systems.

Would it be acceptable to use nproc? Alternatively, copying the data to a temporary file first and parsing that copy would help as well, because it eliminates the source of the super-linear slowdown.
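Both ideas could look roughly like the sketch below. It is only an illustration, not a patch against src/journal.sh: the helper names cpu_count and cpu_summary are made up, the regex is written with POSIX character classes instead of \s, and using nproc --all (installed processors) rather than plain nproc (processors available to the current process) is an assumption about which number the journal should report. The second helper relies on cat typically reading in much larger chunks than bash does, so /proc/cpuinfo is only traversed once.

```shell
#!/bin/bash
# Sketch only: cpu_count and cpu_summary are hypothetical helpers,
# not functions that exist in src/journal.sh.

# Option 1: ask nproc (GNU coreutils) instead of parsing /proc/cpuinfo.
# --all reports installed processors; fall back to getconf if nproc
# is not available.
cpu_count() {
    nproc --all 2>/dev/null || getconf _NPROCESSORS_ONLN
}

# Option 2: copy /proc/cpuinfo to a regular temporary file first, then
# run the same line-by-line parsing as before against the copy.
# The optional path argument exists only so the parsing can be tried
# on a saved file.
cpu_summary() {
    local cpuinfo="${1:-/proc/cpuinfo}"
    local count=0
    local type="unknown"
    local line tmp
    local cpu_regex='^model[[:space:]]+name[[:space:]]*: (.*)$'

    [ -r "$cpuinfo" ] || return 1
    tmp=$(mktemp) || return 1

    # One pass over the proc file by an external tool.
    cat "$cpuinfo" > "$tmp"

    # bash's small reads now hit an ordinary file, not procfs.
    while IFS= read -r line; do
        if [[ "$line" =~ $cpu_regex ]]; then
            type="${BASH_REMATCH[1]}"
            count=$((count + 1))
        fi
    done < "$tmp"

    rm -f "$tmp"
    echo "$count x $type"
}
```

Called without an argument, cpu_summary prints the same "count x type" string the original fragment builds, so it could feed __INTERNAL_WriteToMetafile and __INTERNAL_LogText unchanged. Whether this also fixes the wrong counts has not been checked.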

Florian Weimer · Answer 1 · Fri May 29 2020 23:16:34 GMT+0800 (China Standard Time)

Here's the output from a larger system:

$ time bash cpu-script
hw_cpu -- 136 x Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz
    CPUs          : 136 x Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz

real	4m32.753s
user	0m0.270s
sys	4m31.619s

The actual CPU count is 448 for this system.