Fast CPU and GPU Python implementations of Improved Kernel Partial Least Squares (PLS) by Dayal and MacGregor (1997) and Fast Partition-Based Cross-Validation With Centering and Scaling for XTX and XTY by Engstrøm and Jensen (2025).
It also computes the training-set matrices (X^T * W * X) and (X^T * W * Y), or (X^T * X) and (X^T * Y), for each fold in a cross-validation setting, using the fast algorithms by Engstrøm and Jensen (2025).
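
As a rough illustration of the partition-based idea, the sketch below computes the full-data cross-products once and obtains each fold's training-set matrices by subtracting that fold's validation contribution. It is a minimal NumPy sketch, not this package's API: the function name `fold_cross_products` is hypothetical, W is assumed to be a diagonal matrix of per-sample weights, and the centering and scaling handled by Engstrøm and Jensen (2025) are omitted.

```python
import numpy as np


def fold_cross_products(X, Y, folds, weights=None):
    """Return {fold: (XTWX_train, XTWY_train)} for every fold label.

    Hypothetical helper for illustration only. `weights` is assumed to be
    the diagonal of W; passing None gives the unweighted X^T X and X^T Y.
    No centering or scaling is applied.
    """
    n = X.shape[0]
    if weights is None:
        weights = np.ones(n)
    Xw = X * weights[:, None]  # rows of X scaled by their weights, i.e. W @ X

    # Full-data cross-products, computed once.
    XTWX_full = Xw.T @ X
    XTWY_full = Xw.T @ Y

    result = {}
    for f in np.unique(folds):
        val = folds == f
        # Contribution of the validation rows only.
        XTWX_val = Xw[val].T @ X[val]
        XTWY_val = Xw[val].T @ Y[val]
        # Training-set matrices = full matrices minus validation part.
        result[int(f)] = (XTWX_full - XTWX_val, XTWY_full - XTWY_val)
    return result


rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))
Y = rng.normal(size=(12, 2))
w = rng.uniform(0.5, 2.0, size=12)  # assumed per-sample weights
folds = np.arange(12) % 3           # three folds of four samples each

weighted = fold_cross_products(X, Y, folds, w)
unweighted = fold_cross_products(X, Y, folds)
```

Each validation row is touched once across all folds, so the per-fold work scales with the size of the validation partition rather than with the size of the training set.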