unslothai / hyperlearn

2-2000x faster ML algos, 50% less memory usage, works on all hardware - new and old.

Home Page:https://hyperlearn.readthedocs.io/en/latest/index.html

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Shocking Confusing Speed / Timing results of Algorithms (Sklearn, Numpy, Scipy, Pytorch, Numba) | Prelim results

danielhanchen opened this issue · comments

Anyways, I didn't update the code a lot, but that's because I was busily testing and finding out which algos were the most stable and best.

Key findings for N = 5,000 P = 6,000 [more features than N near square matrix]

  1. For pseudoinverse, (used in Linear Reg, Ridge Reg, lots of other algos), JIT, Scipy MKL, PinvH, Pinv2 and HyperLearn's Pinv are very similar. PyTorch's is clearly problematic, having close to over x4 slower than Scipy MKL.

  2. For Eigh (used in PCA, LDA, QDA, other algos), Sklearn's PCA utilises SVD. Clearly, not a good idea, since it is much better to compute the eigenvec / eigenval on XTX. JIT Eigh is the clear winner at 14.5 seconds on XTX, whilst Numpy is 2x slower. Torch likewise is slower once again...

  3. So, for PCA, a speedup of 3 times is seem if using JIT compiled Eigh when compared to Sklearn's PCA

  4. To solve X*theta = y, Torch GELS is super unstable. Like really. If you use Torch GELS, don't forget to call theta_hat[np.isnan(theta_hat) | np.isinf(theta_hat)] = 0, or else results are problematic. All other algos have very similar MSEs, and HyperLearn's Regularized Cholesky Solve takes a mere 0.635 seconds when compared to say using Sklearn's next fastest Ridge Solve (via cholesky) by over 100% (after considering matrix multiplication time) --> HyperLearn 2.89s vs 4.53s Sklearn.

So to conclude:

  1. HyperLearn's Pseudoinverse has no speed improvement

  2. HyperLearn's PCA will have over 2 times speed boost. (200% improvement)

  3. HyperLearn's Linear Solvers will be over 1 times faster. (100% improvement)

Help make HyperLearn better! All contributors are welcome, as this is truly an overwhelming project...

drawing

Note - I've submitted an issue to PyTorch --> pytorch/pytorch#11174