NVIDIA / MatX

An efficient C++17 GPU numerical computing library with Python-like syntax

Home Page: https://nvidia.github.io/MatX

[FEA] Add `outer()` for outer product

cliffburdick opened this issue

Use cuBLAS *ger functions
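
For reference, the real-valued cuBLAS `ger` routines perform a rank-1 update rather than a plain outer product (complex types use the `geru`/`gerc` variants). With scalar $\alpha$, vectors $x$ and $y$, and matrix $A$:

$$
A \leftarrow \alpha \, x \, y^{T} + A
$$

So a pure outer product through this path would presumably need $\alpha = 1$ and an output $A$ that starts at zero. How `outer()` would handle that is an assumption here, not something this issue specifies.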

If we have something like outer_impl(A, X, Y), where A is batch_dims x N x M, X is batch_dims x N, and Y is batch_dims x M, then we should be able to do this pretty easily: have outer internally clone X to batch_dims x N x 1 and Y to batch_dims x 1 x M, then call matmul_impl.
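
To make the shape argument concrete, here is a minimal host-side sketch in plain C++ (no MatX or cuBLAS calls). The names `batched_matmul_ref` and `batched_outer_ref` are hypothetical and only illustrate the idea. It assumes a single flattened batch dimension and row-major storage, and it shows that a batched matmul with inner dimension 1 yields the outer product without any data movement.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper, NOT part of MatX.
// Batched row-major matmul: C[b] (n x m) = A[b] (n x k) * B[b] (k x m).
std::vector<float> batched_matmul_ref(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      std::size_t batch, std::size_t n,
                                      std::size_t k, std::size_t m) {
  std::vector<float> c(batch * n * m, 0.0f);
  for (std::size_t bi = 0; bi < batch; ++bi) {
    for (std::size_t i = 0; i < n; ++i) {
      for (std::size_t j = 0; j < m; ++j) {
        float acc = 0.0f;
        for (std::size_t p = 0; p < k; ++p) {
          acc += a[(bi * n + i) * k + p] * b[(bi * k + p) * m + j];
        }
        c[(bi * n + i) * m + j] = acc;
      }
    }
  }
  return c;
}

// Hypothetical reference helper, NOT part of MatX.
// Outer product expressed as a matmul: X (batch x N) is viewed as
// batch x N x 1 and Y (batch x M) as batch x 1 x M. With k == 1 the
// flat buffers already have the right layout, so no copy is needed.
std::vector<float> batched_outer_ref(const std::vector<float>& x,
                                     const std::vector<float>& y,
                                     std::size_t batch, std::size_t n,
                                     std::size_t m) {
  assert(x.size() == batch * n);
  assert(y.size() == batch * m);
  return batched_matmul_ref(x, y, batch, n, /*k=*/1, m);
}
```

Calling `batched_outer_ref(x, y, batch, N, M)` returns a flat buffer of `batch * N * M` values where element `(b, i, j)` equals `x[b][i] * y[b][j]`. In the real implementation the reshape would presumably be a zero-copy view on the operator, as suggested above, rather than a host loop.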

Note we should probably also support the axis parameter to match matmul/matvec.