Const-me / CbrtPs

A function to compute the FP32 cube root with SIMD on PCs

This small project implements a single function, cbrt_ps.
The function computes the cube root of 4 FP32 values held in a vector register.
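To make the per-lane behavior concrete, here is a minimal scalar sketch of what "cube root of 4 FP32 values" means. The name cbrt4_reference is hypothetical and is not part of CbrtPs.hpp; it uses std::cbrt from the standard library rather than the SIMD algorithm in the header, so it is only useful as a baseline to compare results against.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical helper, NOT part of CbrtPs.hpp.
// Computes the cube root of each of the 4 lanes with the scalar std::cbrt,
// which is what a 4-wide vector cube root is expected to approximate.
inline std::array<float, 4> cbrt4_reference(const std::array<float, 4>& v)
{
    std::array<float, 4> r{};
    for (std::size_t i = 0; i < v.size(); i++)
        r[i] = std::cbrt(v[i]);
    return r;
}
```

A test harness could store the output of cbrt_ps into a float array and compare it lane by lane against this reference with a small tolerance. How the vector version treats special inputs (negative values, zeros, denormals, infinities) is not stated here, so check the header before relying on any particular behavior.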

The implementation requires the SSE 4.1 instruction set, and can optionally use AVX 1 if available. There is no NEON version so far.

The implementation is OS-agnostic and should compile on any OS, as long as the target CPU supports SSE 4.1.
It was tested with Visual Studio 2022 on Windows 10, and with GCC 7.4 on Linux.
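If you want your build to record whether the required instruction sets are enabled, a compile-time check along these lines is one option. This is a sketch, not something the project ships: the constant names are made up for illustration, while the macros are the compilers' own predefined ones.

```cpp
// Hypothetical compile-time flags, NOT part of CbrtPs.hpp.
// GCC and Clang define __SSE4_1__ when SSE 4.1 is enabled (e.g. with -msse4.1).
// MSVC has no SSE 4.1 macro; it defines __AVX__ under /arch:AVX or higher,
// and any CPU with AVX also supports SSE 4.1.
#if defined(__SSE4_1__) || defined(__AVX__)
constexpr bool kTargetHasSse41 = true;
#else
constexpr bool kTargetHasSse41 = false;
#endif

// AVX 1 is the optional instruction set mentioned above.
#if defined(__AVX__)
constexpr bool kTargetHasAvx = true;
#else
constexpr bool kTargetHasAvx = false;
#endif
```

On MSVC this check is conservative: that compiler accepts SSE 4.1 intrinsics without any /arch switch, so kTargetHasSse41 can be false even though the code would compile. With GCC, SSE 4.1 has to be enabled explicitly, for example with -msse4.1 or a suitable -march value.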

The implementation can be trivially generalized to 32-byte AVX vectors.

Usage

Copy the CbrtPs.hpp header into your project, and include it in your source or header files.

As the comment in that header shows, it comes under the copy-paste-friendly terms of the MIT license.
