pmiller10 / cambridge-ai

SIPB Deep Learning Group

The schedule of readings for the SIPB/Cambridge AI Deep Learning Group. If you have any papers you'd like to discuss, please either make a pull request or send an email to the group and we'll add it. Papers with available implementations are strongly preferred.

Suggested Papers:

Schedule:

Date Paper Implementation
6.16.22 Sharpness-Aware Minimization for Efficiently Improving Generalization google-research/sam
5.26.22 Neural Tangent Kernel: Convergence and Generalization in Neural Networks
4.28.22 A Modern Self-Referential Weight Matrix That Learns to Modify Itself IDSIA/modern-srwm
4.14.22 Hierarchical Perceiver
3.24.22 Dual Diffusion Implicit Bridges for Image-to-Image Translation
3.10.22 Understanding Generalization through Visualizations wronnyhuang/gen-viz
2.17.22 Divide and Contrast: Self-supervised Learning from Uncurated Data
2.10.22 Investigating Human Priors for Playing Video Games rach0012/humanRL_prior_games
1.27.22 data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language pytorch/data2vec
1.20.22 Consistent Video Depth Estimation facebookresearch/consistent_depth
1.13.22 Masked Autoencoders Are Scalable Vision Learners
12.02.21 Training Verifiers to Solve Math Word Problems
11.18.21 (StyleGAN3) Alias-Free Generative Adversarial Networks NVlabs/stylegan3
11.04.21 Do Vision Transformers See Like Convolutional Neural Networks?
10.21.21 CoBERL: Contrastive BERT for Reinforcement Learning
10.14.21 WarpedGANSpace: Finding non-linear RBF paths in GAN latent space chi0tzp/WarpedGANSpace
10.06.21 RAFT: Recurrent All-Pairs Field Transforms for Optical Flow princeton-vl/RAFT
9.16.21 Bootstrapped Meta-Learning
9.09.21 Program Synthesis with Large Language Models
8.19.21 Perceiver IO: A General Architecture for Structured Inputs & Outputs deepmind/perceiver
8.12.21 Reward is enough
8.05.21 Learning Compositional Rules via Neural Program Synthesis mtensor/rulesynthesis
6.24.21 Thinking Like Transformers
6.17.21 Equilibrium Propagation: Bridging the Gap between Energy-Based Models and Backpropagation
6.10.21 Unsupervised Learning by Competing Hidden Units
5.27.21 Pay Attention to MLPs
5.20.21 Memory Based Trajectory-conditioned Policies for Learning from Sparse Rewards
5.13.21 Emerging Properties in Self-Supervised Vision Transformers
5.06.21 Implicit Neural Representations with Periodic Activation Functions vsitzmann/siren
4.29.21 How to represent part-whole hierarchies in a neural network lucidrains/glom-pytorch, RedRyan111/GLOM, ArneBinder/GlomImpl
4.15.21 Perceiver: General Perception with Iterative Attention
4.01.21 Synthetic Returns for Long-Term Credit Assignment
3.25.21 The Pitfalls of Simplicity Bias in Neural Networks
3.18.21 Bootstrap your own latent: A new approach to self-supervised Learning
3.11.21 Meta Learning Backpropagation And Improving It
3.04.21 Taming Transformers for High-Resolution Image Synthesis CompVis/taming-transformers
2.18.21 Pre-training without Natural Images hirokatsukataoka16/FractalDB-Pretrained-ResNet-PyTorch
2.11.21 Revisiting Locally Supervised Learning: an Alternative to End-to-end Training blackfeather-wang/InfoPro-Pytorch
2.04.21 Neural Power Units
1.28.21 Representation Learning via Invariant Causal Mechanisms
1.21.21 γ-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction JannerM/gamma-models
1.14.21 Improving Generalisation for Temporal Difference Learning: The Successor Representation
12.17.20 Learning Associative Inference Using Fast Weight Memory
Hopfield Networks cycle ends
12.10.20 Hopfield Networks is All You Need ml-jku/hopfield-layers
12.03.20 On a model of associative memory with huge storage capacity
11.19.20 Dense Associative Memory for Pattern Recognition
11.12.20 Neural Networks and Physical Systems with Emergent Collective Computational Abilities (= "the Hopfield Networks paper")
Hopfield Networks cycle of papers - from the original paper on Hopfield networks to "Hopfield Networks is All You Need"; a minimal sketch of the classical model appears after the schedule
11.05.20 Training Generative Adversarial Networks with Limited Data NVlabs/stylegan2-ada
10.29.20 Memories from patterns: Attractor and integrator networks in the brain
10.15.20 Entities as Experts: Sparse Memory Access with Entity Supervision
10.08.20 A Primer in BERTology: What we know about how BERT works
10.01.20 It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners timoschick/pet
9.24.20 End-to-End Object Detection with Transformers facebookresearch/detr
9.17.20 Gated Linear Networks
7.23.20 A Random Matrix Perspective on Mixtures of Nonlinearities for Deep Learning
7.02.20 DreamCoder: Building interpretable hierarchical knowledge representations with wake-sleep Bayesian program learning ellisk42/ec
6.18.20 SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver locuslab/SATNet
6.4.20 Adaptive Attention Span in Transformers
5.28.20 Complexity control by gradient descent in deep networks
5.21.20 What Can Learned Intrinsic Rewards Capture?
5.14.20 COMET: Commonsense Transformers for Automatic Knowledge Graph Construction
5.7.20 Write, Execute, Assess: Program Synthesis With a REPL flxsosa/ProgramSearch
4.23.20 Graph Representations for Higher-Order Logic and Theorem Proving
4.16.20 Mathematical Reasoning in Latent Space
4.9.20 MEMO: A Deep Network for Flexible Combination of Episodic Memories
4.2.20 Creating High Resolution Images with a Latent Adversarial Generator
3.26.20 Invertible Residual Networks
3.5.20 Value-driven Hindsight Modelling
2.27.20 Analyzing and Improving the Image Quality of StyleGAN
2.13.20 Axiomatic Attribution for Deep Networks
2.6.20 Automated curricula through setter-solver interactions
1.30.20 Protein structure prediction ... deepmind
1.23.20 Putting An End to End-to-End: Gradient-Isolated Learning of Representations
1.16.20 Normalizing Flows: An Introduction and Review of Current Methods
12.19.19 Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
12.5.19 On the Measure of Intelligence
11.21.19 Understanding the Neural Tangent Kernel rajatvd
11.14.19 XLNet: Generalized Autoregressive Pretraining for Language Understanding
11.7.19 Learning to Predict Without Looking Ahead: World Models Without Forward Prediction
10.31.19 Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
10.24.19 N-BEATS: Neural basis expansion analysis for interpretable time series forecasting
10.17.19 Unsupervised Doodling and Painting with Improved SPIRAL
10.10.19 Adversarial Robustness as a Prior for Learned Representations MadryLab
10.3.19 Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks
9.26.19 Image Transformer
9.19.19 Generating Diverse High-Fidelity Images with VQ-VAE-2
9.12.19 Neural Discrete Representation Learning
9.5.19 Neural Text Generation with Unlikelihood Training
8.29.19 Learning Representations by Maximizing Mutual Information Across Views
break; meetings switched from Tuesdays to Thursdays after the break
6.11.19 BERT Rediscovers the Classical NLP Pipeline
6.4.19 Semantic Visual Localization
5.28.19 AlgoNet: C^∞ Smooth Algorithmic Neural Networks
5.14.19 Unsupervised Data Augmentation for Consistency Training
4.30.19 Augmented Neural ODEs
4.9.19 Wasserstein Dependency Measure for Representation Learning
4.2.19 Leveraging Knowledge Bases in LSTMs for Improving Machine Reading
3.26.19 Meta Particle Flow for Sequential Bayesian Inference
3.19.19 A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms
3.12.19 The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
2.26.19 Language Models are Unsupervised Multitask Learners openai
2.19.19 Learning to Understand Goal Specifications by Modelling Reward
1.29.19 GamePad: A Learning Environment for Theorem Proving
1.15.19 Matrix capsules with EM routing
12.4.18 Optimizing Agent Behavior over Long Time Scales by Transporting Value
11.27.18 Embedding Logical Queries on Knowledge Graphs williamleif
11.20.18 Large-Scale Study of Curiosity-Driven Learning openai
11.13.18 Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding nke001
11.6.18 Generalizing Hamiltonian Monte Carlo with Neural Networks brain-research
10.23.18 A Conceptual Introduction to Hamiltonian Monte Carlo
10.16.18 MaskGAN: Better Text Generation via Filling in the ______
10.9.18 Large Scale GAN Training for High Fidelity Natural Image Synthesis
10.2.18 Improving Variational Inference with Inverse Autoregressive Flow
9.25.18 Artificial Intelligence - The Revolution Hasn’t Happened Yet
9.18.18 Learning deep representations by mutual information estimation and maximization
9.11.18 The Variational Homoencoder: Learning to learn high capacity generative models from few examples insperatum
9.4.18 Towards Conceptual Compression geosada
8.28.18 Vector-based navigation using grid-like representations in artificial agents deepmind
break in maintaining this file; backfilled on April 10, 2020
8.21.18 Universal Transformers tensorflow
8.14.18 Neural Arithmetic Logic Units gautam1858
8.7.18 Neural Scene Representation and Rendering
7.31.18 Measuring Abstract Reasoning in Neural Networks
6.26.18 Improving Language Understanding by Generative Pre-Training openai
6.19.18 Associative Compression Networks for Representation Learning
6.12.18 On Characterizing the Capacity of Neural Networks using Algebraic Topology
6.5.18 Causal Effect Inference with Deep Latent-Variable Models AMLab
5.29.18 ML beyond Curve Fitting
5.22.18 Synthesizing Programs for Images using Reinforced Adversarial Learning
5.15.18 TensorFlow Overview r1.8
5.8.18 Compositional Attention Networks for Machine Reasoning stanfordnlp
4.24.18 The Annotated Transformer
4.3.18 How Developers Iterate on Machine Learning Workflows
3.27.18 Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
3.20.18 Attention Is All You Need tensor2tensor
3.6.18 Generating Wikipedia by Summarizing Long Sequences wikisum, per this gist
2.27.18 AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks StackGAN-v2
2.20.18 Information Dropout InformationDropout, official implementation
2.13.18 Nested LSTMs Nested-LSTM
2.6.18 Deep vs. Shallow Networks: An Approximation Theory Perspective
1.30.18 The Case for Learned Index Structures
1.23.18 Visualizing the Loss Landscape of Neural Nets
1.16.18 Go for a Walk and Arrive at the Answer; RelNet: End-to-End Modeling of Entities & Relations
1.9.18 Intro to Coq
12.12.17 Chains of Reasoning over Entities, Relations, and Text using Recurrent Neural Networks (ChainsofReasoning)
12.5.17 Stochastic Neural Networks for Hierarchical Reinforcement Learning snn4hrl
11.28.17 Emergent Complexity via Multi-Agent Competition (blog post) multiagent-competition
11.14.17 Mastering the game of Go without human knowledge
11.7.17 Meta-Learning with Memory-Augmented Neural Networks ntm-meta-learning
10.24.17 Poincaré Embeddings for Learning Hierarchical Representations poincare_embeddings
10.17.17 What does Attention in Neural Machine Translation Pay Attention to?
10.10.17 Zero-Shot Learning Through Cross-Modal Transfer zslearning
9.26.17 Variational Boosting: Iteratively Refining Posterior Approximations vboost
9.19.17 Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks cbfinn
9.12.17 Neuroscience-inspired AI
9.5.17 Recurrent Dropout Without Memory Loss rnn_cell_mulint_modern.py
8.29.17 Deep Transfer Learning with Joint Adaptation Networks jmmd.{cpp,hpp}
8.22.17 Designing Neural Network Architectures using Reinforcement Learning metaqnn
8.15.17 Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences plstm
8.8.17 Hyper Networks otoro blog
8.1.17 Full-Capacity Unitary Recurrent Neural Networks complex_RNN, urnn
7.25.17 Decoupled Neural Interfaces using Synthetic Gradients & follow-up dni.pytorch
7.18.17 A simple neural network module for relational reasoning relation-network
7.11.17 Speaker diarization using deep neural network embeddings
6.20.17 Neural Episodic Control PFCM
6.13.17 Lie-Access Neural Turing Machines harvardnlp
6.6.17 Artistic style transfer for videos artistic video
5.30.17 High-Dimensional Continuous Control Using Generalized Advantage Estimation modular_rl
5.23.17 Emergence of Grounded Compositional Language in Multi-Agent Populations
5.16.17 Trust Region Policy Optimization modular_rl
5.9.17 Improved Training of Wasserstein GANs code
5.4.17 Using Fast Weights to Attend to the Recent Past
4.25.17 Strategic Attentive Writer for Learning Macro-Actions
4.18.17 Massive Exploration of Neural Machine Translation Architectures
4.4.17 End to End Learning for Self-Driving Cars
3.28.17 Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning
3.21.17 Image-to-Image Translation with Conditional Adversarial Networks
3.7.17 Neural Programmer-Interpreters
2.14.17 Wasserstein GAN
2.7.17 Towards Principled Methods for Training GANs
1.31.17 Mastering the Game of Go with Deep Neural Networks and Tree Search
1.24.17 Understanding Deep Learning Requires Rethinking Generalization
1.17.17 Neural Semantic Encoders
12.21.16 StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
12.14.16 Key-Value Memory Networks for Directly Reading Documents
12.7.16 InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
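
To make the Hopfield Networks cycle concrete, here is a minimal sketch of the classical binary Hopfield network (Hebbian outer-product storage plus sign-function updates) from the original paper in that cycle. The function names, network size, and random patterns are illustrative and are not taken from any of the implementations listed above.

```python
import numpy as np

def store(patterns):
    """Hebbian rule: W = (1/N) * sum_p x_p x_p^T, with no self-connections."""
    n = patterns.shape[1]
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)          # zero the diagonal
    return W

def recall(W, probe, steps=10):
    """Synchronous sign updates until the state stops changing."""
    x = probe.copy()
    for _ in range(steps):
        new = np.sign(W @ x)
        new[new == 0] = 1             # break ties deterministically
        if np.array_equal(new, x):    # reached a fixed point (attractor)
            break
        x = new
    return x

# Store two random +/-1 patterns, then recover one from a corrupted probe.
rng = np.random.default_rng(0)
patterns = rng.choice([-1.0, 1.0], size=(2, 100))
W = store(patterns)
probe = patterns[0].copy()
probe[:20] *= -1                      # flip 20% of the bits
print(np.array_equal(recall(W, probe), patterns[0]))  # usually True
```

With far fewer stored patterns than neurons, the corrupted probe typically falls back into the stored attractor; the "huge storage capacity" and "All You Need" papers in the cycle replace this quadratic energy with sharper energy functions to store many more patterns.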
