Architecture
lucidrains opened this issue
Share what you think you know about the architecture here
Here is a democratized implementation of the AlphaFold protein distance prediction network:
https://github.com/dellacortelab/prospr
https://moalquraishi.wordpress.com/2020/12/08/alphafold2-casp14-it-feels-like-ones-child-has-left-home/ Mohammed AlQuraishi believes it is based on SE(3)-Transformers as well. We'll go by that.
After reviewing where we are with equivariant attention networks, they seem costly and brittle (at least the current implementations). Why can't we just work off the distogram https://github.com/lucidrains/molecule-attention-transformer and get the equivariance for free?
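To make the "equivariance for free" point concrete, here is a minimal sketch (my own illustration, not code from the repo) showing that a distogram built from pairwise distances is unchanged by any rotation and translation of the coordinates:

```python
import torch

# A distogram (matrix of pairwise distances) is unchanged by any rotation +
# translation of the coordinates, so a network consuming it is SE(3)-invariant
# "for free". Shapes and values below are arbitrary.
coords = torch.randn(64, 3)                      # 64 residues, 3D coordinates

q, _ = torch.linalg.qr(torch.randn(3, 3))        # random orthogonal matrix
transformed = coords @ q.T + torch.randn(3)      # rotate (or reflect) + translate

dist   = torch.cdist(coords, coords)             # (64, 64) distogram
dist_t = torch.cdist(transformed, transformed)   # distogram after the transform

assert torch.allclose(dist, dist_t, atol=1e-5)   # identical up to float error
```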
There is the possibility that some iterative refinement process is going on, similar to Xu's paper, most likely with Graph Attention Networks https://www.biorxiv.org/content/10.1101/2020.12.10.419994v1.full.pdf
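As a toy illustration of what such an iterative refinement loop could look like (purely my sketch of the general idea, not the architecture from the paper), attention is restricted to the edges of an assumed contact graph and the result is added back as a residual for a few steps:

```python
import torch
import torch.nn.functional as F

# Toy sketch of iterative refinement with a graph-attention-style update.
n, d, steps = 64, 32, 4
feats = torch.randn(n, d)                         # per-residue features
adj = (torch.rand(n, n) < 0.1).float()            # hypothetical sparse contact graph
adj.fill_diagonal_(1.)                            # every node sees itself

w_q, w_k, w_v = (torch.randn(d, d) / d ** 0.5 for _ in range(3))

for _ in range(steps):
    q, k, v = feats @ w_q, feats @ w_k, feats @ w_v
    logits = (q @ k.T) / d ** 0.5
    logits = logits.masked_fill(adj == 0, float('-inf'))   # attend only along edges
    feats = feats + F.softmax(logits, dim=-1) @ v           # residual refinement step
```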
Readying something better than GAT https://github.com/lucidrains/adjacent-attention-network
https://arxiv.org/abs/2012.10885 Lie Transformers may be an alternative to SE(3)-Transformers for equivariant attention. It will be built here, just in case it is needed: https://github.com/lucidrains/lie-transformer-pytorch
Hi there! I would like to actively contribute to/guide this project! I have a background in AI/ML as well as Medicine and Physics (plus I did a mini repro of AlphaFold1 2 years ago).
Here is a good piece of commentary about the architecture from one of the CASP staff members: https://moalquraishi.wordpress.com/2020/12/08/alphafold2-casp14-it-feels-like-ones-child-has-left-home/#s3
I could provide some help understanding the features if someone has trouble with that.
@lucidrains I want to push this project forward. I've asked for it on Eleuther; if there's any way to move this, I'll be glad to help!
@hypnopump Hi Eric, would be happy to have you on board!
@hypnopump Yup, I'm working on equivariant self-attention this month :) Will get it done 👍
@lucidrains cool! Anything I can help with, just say it
@hypnopump want to chat in off-topic in the Eleuther Discord?
SE3 Transformers is completed! https://github.com/lucidrains/se3-transformer-pytorch
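Rough usage sketch, going from my reading of the repo's README at the time; the constructor arguments and shapes are assumptions and may differ from the current API:

```python
import torch
from se3_transformer_pytorch import SE3Transformer

# Assumed argument names / shapes, for illustration only.
model = SE3Transformer(
    dim = 64,
    heads = 4,
    depth = 2,
    num_degrees = 2
)

feats = torch.randn(1, 32, 64)        # per-residue (type-0) features
coors = torch.randn(1, 32, 3)         # 3D coordinates
mask  = torch.ones(1, 32).bool()

out = model(feats, coors, mask)       # refined type-0 features
```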
Will be working on getting reversible networks working with the main trunk (cross attention between two transformers). If it works, it means we can scale to any depth with no memory cost. Should be done by end of week
Reversible networks are done, which means we can now scale the main trunk to any depth with only the memory cost of one layer
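For anyone wondering why reversibility removes the activation memory cost, here is a minimal RevNet-style coupling sketch (my illustration; f and g stand in for the attention / feedforward blocks): the inputs can be recomputed exactly from the outputs during the backward pass, so activations never need to be stored.

```python
import torch

# Minimal reversible coupling: x1, x2 are recoverable exactly from y1, y2.
d = 16
f = torch.nn.Linear(d, d)
g = torch.nn.Linear(d, d)

x1, x2 = torch.randn(2, 8, d)

y1 = x1 + f(x2)        # forward coupling
y2 = x2 + g(y1)

x2_rec = y2 - g(y1)    # inverse: recompute inputs from outputs alone
x1_rec = y1 - f(x2_rec)

assert torch.allclose(x1, x1_rec) and torch.allclose(x2, x2_rec)
```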
Next up, bringing in sparse attention techniques + relative positional encoding
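For reference, a minimal sketch of additive relative positional bias (in the style of Shaw et al. / T5; this is my illustration of the idea, not the code that will land in the repo):

```python
import torch

# One learned scalar per (head, relative offset), added to the attention
# logits before the softmax.
seq_len, heads = 8, 2
rel_bias = torch.randn(heads, 2 * seq_len - 1)            # offsets in [-(L-1), L-1]

pos = torch.arange(seq_len)
rel_idx = pos[None, :] - pos[:, None] + (seq_len - 1)     # (L, L) offset table, >= 0

logits = torch.randn(heads, seq_len, seq_len)             # q·k attention logits
logits = logits + rel_bias[:, rel_idx]                    # (heads, L, L) bias lookup
attn = logits.softmax(dim=-1)
```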
Ok, sparse attention is done. I'll also bring in sparse attention in a convolutional pattern (https://github.com/lucidrains/DALLE-pytorch/blob/main/dalle_pytorch/attention.py#L75) some time next week
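A minimal sketch of what a local-window ("convolutional") sparse attention pattern looks like (my simplification of the pattern linked above; the window size is an arbitrary assumption):

```python
import torch

# Each position may only attend to neighbors within a fixed window.
seq_len, window = 16, 3

pos = torch.arange(seq_len)
mask = (pos[None, :] - pos[:, None]).abs() <= window      # (L, L) allowed pairs

logits = torch.randn(seq_len, seq_len)
logits = logits.masked_fill(~mask, float('-inf'))         # block out-of-window pairs
attn = logits.softmax(dim=-1)                             # rows normalize over the window
```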
Lie Transformers is confirmed to be correct by one of the authors, so we have two equivariant attention solutions at our disposal
Closing because I think we are close