nilesh2797 / perceiver-lm

Unofficial PyTorch implementation of PerceiverIO for language modeling

Perceiver LM

Contains a PyTorch implementation of PerceiverLM (Language Modelling Perceiver), adapted from the PerceiverIO model in esceptico's perceiver-io repo. The PerceiverLM model in this repo is a functionally exact PyTorch replica of the original JAX implementation of PerceiverIO. A code snippet for loading DeepMind's pretrained checkpoint can be found in this notebook.

Perceiver IO

Unofficial implementation of Perceiver IO: A General Architecture for Structured Inputs & Outputs

Installation

From PyPI

pip install -U perceiver-io-pytorch

Usage

import torch

from perceiver_io.decoders import PerceiverDecoder
from perceiver_io.encoder import PerceiverEncoder
from perceiver_io import PerceiverIO

num_latents = 128
latent_dim = 256
input_dim = 64

decoder_query_dim = 4

encoder = PerceiverEncoder(
    num_latents=num_latents,
    latent_dim=latent_dim,
    input_dim=input_dim,
    num_self_attn_per_block=8,
    num_blocks=1
)
decoder = PerceiverDecoder(
    latent_dim=latent_dim,
    query_dim=decoder_query_dim
)
perceiver = PerceiverIO(encoder, decoder)

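# Random inputs: (batch, sequence length, input_dim)
inputs = torch.randn(2, 16, input_dim)
# One query vector per desired output element: (batch, number of queries, query_dim)
output_query = torch.randn(2, 3, decoder_query_dim)

perceiver(inputs, output_query)  # shape = (2, 3, 4)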
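
To make the shapes in the usage example easier to follow, here is a minimal pure-Python sketch that traces them through the encoder and decoder. `PerceiverShapes` is a hypothetical helper written for illustration; it is not part of `perceiver_io`. It assumes the standard Perceiver IO layout: the encoder cross-attends a fixed latent array to the inputs, the latents are refined by self-attention, and the decoder cross-attends the output queries to the latents. The decoder output width is taken from the shape comment in the usage above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerceiverShapes:
    """Hypothetical shape tracer; not part of the perceiver_io package."""

    batch: int
    seq_len: int      # number of input elements (M)
    input_dim: int
    num_latents: int  # size of the latent array (N)
    latent_dim: int
    num_queries: int  # number of output queries (K)
    query_dim: int

    def encoder_output(self):
        # Assumption: the encoder returns one latent_dim vector per latent,
        # independent of the input sequence length.
        return (self.batch, self.num_latents, self.latent_dim)

    def decoder_output(self):
        # One output vector per query, matching the (2, 3, 4) comment above.
        return (self.batch, self.num_queries, self.query_dim)

    def attention_scores(self, num_self_attn_layers):
        # Pairwise attention scores per example and per head.
        encode = self.num_latents * self.seq_len                 # latents attend to inputs
        process = num_self_attn_layers * self.num_latents ** 2   # latent self-attention
        decode = self.num_queries * self.num_latents             # queries attend to latents
        return encode + process + decode


# Values copied from the usage example (num_self_attn_per_block=8, num_blocks=1).
shapes = PerceiverShapes(
    batch=2, seq_len=16, input_dim=64,
    num_latents=128, latent_dim=256,
    num_queries=3, query_dim=4,
)
print(shapes.encoder_output())
print(shapes.decoder_output())
print(shapes.attention_scores(num_self_attn_layers=8 * 1))
```

Only the first and last terms of `attention_scores` depend on the input length and the number of queries, and both grow linearly. The quadratic term depends only on `num_latents`, which is why the latent bottleneck keeps long inputs tractable.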

List of implemented decoders

  • ProjectionDecoder
  • ClassificationDecoder
  • PerceiverDecoder

Example architectures:

Citation

@misc{jaegle2021perceiver,
    title   = {Perceiver IO: A General Architecture for Structured Inputs & Outputs},
    author  = {Andrew Jaegle and Sebastian Borgeaud and Jean-Baptiste Alayrac and Carl Doersch and Catalin Ionescu and David Ding and Skanda Koppula and Andrew Brock and Evan Shelhamer and Olivier Hénaff and Matthew M. Botvinick and Andrew Zisserman and Oriol Vinyals and João Carreira},
    year    = {2021},
    eprint  = {2107.14795},
    archivePrefix = {arXiv},
    primaryClass = {cs.LG}
}

License: MIT License