rssqian / AIGC_Resources

A collection of the most useful AIGC tools, materials, publications, and reports

Foundation Papers

| Title | Model | Date | Code | Organization |
|---|---|---|---|---|
| Attention Is All You Need | Transformer | Dec 2017 | | Google |
| Improving Language Understanding by Generative Pre-Training | GPT | Jun 2018 | | OpenAI |
| BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | BERT | May 2019 | | Google |
| On the Opportunities and Risks of Foundation Models | | Jul 2022 | | Center for Research on Foundation Models (CRFM) & Stanford Institute for Human-Centered Artificial Intelligence (HAI) |
| Language Models are Unsupervised Multitask Learners | GPT-2 | Feb 2019 | Code | OpenAI |
| Learning Transferable Visual Models From Natural Language Supervision | CLIP | Feb 2021 | Code | OpenAI |
| Evaluating Large Language Models Trained on Code | Codex | Jul 2021 | | OpenAI |
| Competition-Level Code Generation with AlphaCode | AlphaCode | Feb 2022 | | DeepMind |
| Adding Conditional Control to Text-to-Image Diffusion Models | ControlNet | Feb 2023 | Code | Stanford University |
| CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis | CodeGen | March 2022 | Code | Salesforce |

Recent Papers

| Title | Short Name | Date | Institution | Code (if available) |
|---|---|---|---|---|
| Training language models to follow instructions with human feedback | InstructGPT | March 2022 | OpenAI | |
| High-Resolution Image Synthesis with Latent Diffusion Models | Stable Diffusion | April 2022 | Heidelberg University & Runway | |
| Hierarchical Text-Conditional Image Generation with CLIP Latents | DALL·E 2 | April 2022 | OpenAI | |
| Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback | RLHF | April 2022 | Anthropic | |
| Language Models are Few-Shot Learners | GPT-3 | May 2020 | OpenAI | |
| WebGPT: Browser-assisted question-answering with human feedback | WebGPT | Dec 2021 | OpenAI | |
| Robust Speech Recognition via Large-Scale Weak Supervision | Whisper | Sep 2022 | OpenAI | Code |
| LLaMA: Open and Efficient Foundation Language Models | LLaMA | Feb 2023 | Meta | |
| Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models | Visual ChatGPT | March 2023 | Microsoft | Code |
| Consistency Models | | March 2023 | OpenAI | Code |
| Language Is Not All You Need: Aligning Perception with Language Models | Kosmos-1 | March 2023 | Microsoft | |
| GPT-4 Technical Report | GPT-4 | March 2023 | OpenAI | |
| BloombergGPT: A Large Language Model for Finance | BloombergGPT | March 2023 | Bloomberg | |
| HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face | HuggingGPT | April 2023 | Zhejiang University & Microsoft | Code |
| Segment Anything | SAM | April 2023 | Meta | Code |
| Instruction Tuning with GPT-4 | | April 2023 | Microsoft | Code |
| Generative Agents: Interactive Simulacra of Human Behavior | | April 2023 | Stanford & Google | |
| Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca | Chinese LLaMA | April 2023 | iFLYTEK & HIT | Code |
| Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | Pythia | April 2023 | EleutherAI | |
| DINOv2: Learning Robust Visual Features without Supervision | DINOv2 | April 2023 | Meta | |
| Shap·E: Generating Conditional 3D Implicit Functions | Shap·E | May 2023 | OpenAI | Code |
| StarCoder: may the source be with you | StarCoder | May 2023 | Many | Code |
| Large Language Models are Zero-Shot Rankers for Recommender Systems | | May 2023 | Renmin University, WeChat, UC San Diego | |
| IMAGEBIND: One Embedding Space To Bind Them All | IMAGEBIND | May 2023 | Meta | |
| Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold | DragGAN | May 2023 | Many | Code |
| VOYAGER: An Open-Ended Embodied Agent with Large Language Models | VOYAGER | May 2023 | NVIDIA and Many | Code |
| Large Language Models as Tool Makers | | May 2023 | Google DeepMind, Princeton University, Stanford University | Code |
| SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions | SELF-INSTRUCT | May 2023 | Many | Code |
| LIMA: Less Is More for Alignment | LIMA | May 2023 | Meta, CMU and Many | |
| GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction | GPT4Tools | May 2023 | Tsinghua University and Many | Code |
| UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild | UniControl | May 2023 | Salesforce, Stanford University, Northeastern University | Code |
| QLoRA: Efficient Finetuning of Quantized LLMs | QLoRA | May 2023 | University of Washington | Code |
| AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback | AlpacaFarm | May 2023 | Stanford University | |
| STEVE-1: A Generative Model for Text-to-Behavior in Minecraft | STEVE-1 | June 2023 | University of Toronto | |
| MIND2WEB: Towards a Generalist Agent for the Web | MIND2WEB | June 2023 | Ohio State | Code |
| StyleDrop: Text-to-Image Generation in Any Style | StyleDrop | June 2023 | Google | Code |
| Simple and Controllable Music Generation | MusicGen | June 2023 | Meta | Code |
| Orca: Progressive Learning from Complex Explanation Traces of GPT-4 | Orca | June 2023 | Microsoft | |
| TryOnDiffusion: A Tale of Two UNets | TryOnDiffusion | June 2023 | University of Washington, Google | |
| WizardLM: Empowering Large Language Models to Follow Complex Instructions | WizardLM | June 2023 | Microsoft, Peking University | Code |
| Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale | Voicebox | June 2023 | Meta | |
| DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing | DragDiffusion | June 2023 | National University of Singapore & ByteDance | |

Important Reports

| Report | Link | Date | Institution |
|---|---|---|---|
| Stanford AI Index Report 2023 | Link | April 2023 | Stanford |
| Sparks of Artificial General Intelligence: Early experiments with GPT-4 | Link | March 2023 | Microsoft |
| A Survey of Large Language Models | Link | April 2023 | Renmin University, China & University of Montreal, Canada |
| Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond | Link | April 2023 | Amazon & many others |
| A Cookbook of Self-Supervised Learning | Link | April 2023 | Meta & many others |
| Let's Verify Step by Step | Link | May 2023 | OpenAI |
| A Survey on Segment Anything Model (SAM): Vision Foundation Model Meets Prompt Engineering | Link | May 2023 | Kyung Hee University and many |
| A Comprehensive Survey on Segment Anything Model for Vision and Beyond | Link | May 2023 | Hong Kong University of Science and Technology and many |
| On the Design Fundamentals of Diffusion Models: A Survey | Link | June 2023 | Durham University |
| Open LLM Leaderboard | Link | Updated in real time | Hugging Face |

Important Projects

Midjourney

Alpaca Open Source Code Stanford March 2023

Dolly Open Source Code Databricks March 2023 Note: licensed for commercial use

Vicuna Open Source Code UC Berkeley, CMU, Stanford, and UC San Diego March 2023

ChatPDF March 2023

Bard Google March 2023

Langchain Community Effort March 2023

Microsoft 365 Copilot Microsoft March 2023

AutoGPT Community Effort April 2023

Grounded SAM IDEA April 2023

DeepSpeed Chat Microsoft April 2023

AgentGPT Community Effort April 2023

MiniGPT-4 King Abdullah University of Science and Technology April 2023

DeepFloyd IF Stability.ai April 2023

OpenLLaMA Berkeley May 2023

SoftVC VITS Singing Voice Conversion Community May 2023

Falcon TII (Technology Innovation Institute) May 2023

UltraLM Tsinghua University June 2023

AIGC Courses

COS597G Understanding Large Language Models Princeton 2022

CS324 Large Language Models Stanford 2023

ChatGPT, LangChain and Data Science Short Courses DeepLearning.AI Jun 2023

Large Multimodal Models: Notes on CVPR 2023 Tutorial Microsoft Jun 2023

Very Useful Source Code

OpenAI Cookbook
LlamaIndex
PrivateGPT
llama.cpp

Main LLM Development Tips, Updated June 21, 2023

1. Data is still king - LLMs are great, but if you don't have quality, clean data, you won't go far.

2. Smaller models can be just as good as larger general models at specific tasks. And cheaper!

3. Fine-tuning is becoming cheaper.

4. Evaluation of LLMs is very hard - feels very subjective still.

5. Managed APIs are expensive.

6. "Traditional" ML isn't going anywhere.

7. Memory matters - for both serving and training.

8. Information retrieval w/ vector databases is becoming a standard pattern.

9. Start w/ prompt engineering and push that to its limits before fine-tuning w/ smaller models.

10. Use agents/chains only when necessary. They are unruly.

11. Latency is critical for a good user experience.

12. Privacy is critical.
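Tips 8 and 9 together describe the common retrieval-augmented pattern: embed your documents, find the ones nearest a query, and paste them into the prompt as context before reaching for fine-tuning. A minimal sketch of the retrieval step, using a toy bag-of-words embedding and an in-memory list as stand-ins for a real embedding model and vector database (the names `embed`, `cosine`, and `retrieve` are illustrative, not any specific library's API):

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words term-frequency vector.
    A real system would call an embedding model here."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "Fine-tuning adapts a pretrained model to a narrow task.",
    "Vector databases index embeddings for similarity search.",
    "Prompt engineering shapes model behavior without training.",
]

# The retrieved passages would normally be inserted into the LLM prompt as context.
print(retrieve("how do embeddings and similarity search work", docs, k=1))
```

Swapping in a real embedding model and an approximate-nearest-neighbor index changes only `embed` and the search inside `retrieve`; the overall shape of the pattern stays the same.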
