lucidrains / hamburger-pytorch

Pytorch implementation of the hamburger module from the ICLR 2021 paper "Is Attention Better Than Matrix Decomposition"

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

🍔 - Pytorch

Pytorch implementation of the hamburger module from the ICLR 2021 paper Is Attention Better Than Matrix Decomposition?. Following Betteridge's law, the answer according to the paper is "No" for segmentation and GANs.

This repository will contain the NMF-MU (nonnegative matrix factorization w/ multiplicative update) module sandwiched by linear projections.

Update: I tried this, but did not get better results than just using linear attention

Install

$ pip install hamburger-pytorch

Usage

import torch
from hamburger_pytorch import Hamburger

hamburger = Hamburger(
    dim = 512,       # input dimension
    n = 32 * 32,     # n will be size of the sequence, in this case, height times width of the images
    ratio = 8,       # matrix factorization ratio, recommended to be at 8
    K = 6            # number of iterations, optimal at 6 as shown in paper
)

x = torch.randn(1, 512, 32, 32)
hamburger(x) + x # (1, 512, 32, 32)

Citations

@inproceedings{
    anonymous2021is,
    title={Is Attention Better Than Matrix Decomposition?},
    author={Anonymous},
    booktitle={Submitted to International Conference on Learning Representations},
    year={2021},
    url={https://openreview.net/forum?id=1FvkSpWosOl},
    note={under review}
}

About

Pytorch implementation of the hamburger module from the ICLR 2021 paper "Is Attention Better Than Matrix Decomposition"

License:MIT License


Languages

Language:Python 100.0%