rssqian / AIGC_Resources

A collection of the most useful AIGC tools, materials, publications, and reports

Foundation Papers

| Title | Model | Date | Code | Organization |
|---|---|---|---|---|
| Attention Is All You Need | Transformer | Dec 2017 | | Google |
| Improving Language Understanding by Generative Pre-Training | GPT | Jun 2018 | | OpenAI |
| BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | BERT | May 2019 | | Google |
| On the Opportunities and Risks of Foundation Models | | Jul 2022 | | Center for Research on Foundation Models (CRFM) & Stanford Institute for Human-Centered Artificial Intelligence (HAI) |
| Language Models are Unsupervised Multitask Learners | GPT-2 | Feb 2019 | Code | OpenAI |
| Learning Transferable Visual Models From Natural Language Supervision | CLIP | Feb 2021 | Code | OpenAI |
| Evaluating Large Language Models Trained on Code | Codex | Jul 2021 | | OpenAI |
| Competition-Level Code Generation with AlphaCode | AlphaCode | Feb 2022 | | DeepMind |
| Adding Conditional Control to Text-to-Image Diffusion Models | ControlNet | Feb 2023 | Code | Stanford University |
| CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis | CodeGen | March 2022 | Code | Salesforce |

Recent Papers

| Title | Short Name | Date | Institution | Code (if available) |
|---|---|---|---|---|
| Training language models to follow instructions with human feedback | InstructGPT | March 2022 | OpenAI | |
| High-Resolution Image Synthesis with Latent Diffusion Models | Stable Diffusion | April 2022 | Heidelberg University & Runway | |
| Hierarchical Text-Conditional Image Generation with CLIP Latents | DALL·E 2 | April 2022 | OpenAI | |
| Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback | RLHF | April 2022 | Anthropic | |
| Language Models are Few-Shot Learners | GPT-3 | May 2020 | OpenAI | |
| WebGPT: Browser-assisted question-answering with human feedback | WebGPT | Dec 2021 | OpenAI | |
| Robust Speech Recognition via Large-Scale Weak Supervision | Whisper | Sep 2022 | OpenAI | Code |
| LLaMA: Open and Efficient Foundation Language Models | LLaMA | Feb 2023 | Meta | |
| Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models | Visual ChatGPT | March 2023 | Microsoft | Code |
| Consistency Models | | March 2023 | OpenAI | Code |
| Language Is Not All You Need: Aligning Perception with Language Models | Kosmos-1 | March 2023 | Microsoft | |
| GPT-4 Technical Report | GPT-4 | March 2023 | OpenAI | |
| BloombergGPT: A Large Language Model for Finance | BloombergGPT | March 2023 | Bloomberg | |
| HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face | HuggingGPT | April 2023 | Zhejiang University & Microsoft | Code |
| Segment Anything | SAM | April 2023 | Meta | Code |
| Instruction Tuning with GPT-4 | | April 2023 | Microsoft | Code |
| Generative Agents: Interactive Simulacra of Human Behavior | | April 2023 | Stanford & Google | |
| Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca | Chinese LLaMA | April 2023 | iFLYTEK & HIT | Code |
| Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling | Pythia | April 2023 | EleutherAI | |
| DINOv2: Learning Robust Visual Features without Supervision | DINOv2 | April 2023 | Meta | |
| Shap·E: Generating Conditional 3D Implicit Functions | Shap·E | May 2023 | OpenAI | Code |
| StarCoder: may the source be with you | StarCoder | May 2023 | Many | Code |
| Large Language Models are Zero-Shot Rankers for Recommender Systems | | May 2023 | Renmin University, WeChat, UC San Diego | |
| IMAGEBIND: One Embedding Space To Bind Them All | IMAGEBIND | May 2023 | Meta | |
| Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold | DragGAN | May 2023 | Many | Code |
| VOYAGER: An Open-Ended Embodied Agent with Large Language Models | VOYAGER | May 2023 | NVIDIA and Many | Code |
| Large Language Models as Tool Makers | | May 2023 | Google DeepMind, Princeton University, Stanford University | Code |
| SELF-INSTRUCT: Aligning Language Models with Self-Generated Instructions | SELF-INSTRUCT | May 2023 | Many | Code |
| LIMA: Less Is More for Alignment | LIMA | May 2023 | Meta, CMU and Many | |
| GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction | GPT4Tools | May 2023 | Tsinghua University and Many | Code |
| UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild | UniControl | May 2023 | Salesforce, Stanford University, Northeastern University | Code |
| QLoRA: Efficient Finetuning of Quantized LLMs | QLoRA | May 2023 | University of Washington | Code |
| AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback | AlpacaFarm | May 2023 | Stanford University | |
| STEVE-1: A Generative Model for Text-to-Behavior in Minecraft | STEVE-1 | June 2023 | University of Toronto | |
| MIND2WEB: Towards a Generalist Agent for the Web | MIND2WEB | June 2023 | Ohio State | Code |
| StyleDrop: Text-to-Image Generation in Any Style | StyleDrop | June 2023 | Google | Code |
| Simple and Controllable Music Generation | MusicGen | June 2023 | Meta | Code |
| Orca: Progressive Learning from Complex Explanation Traces of GPT-4 | Orca | June 2023 | Microsoft | |
| TryOnDiffusion: A Tale of Two UNets | TryOnDiffusion | June 2023 | University of Washington, Google | |
| WizardLM: Empowering Large Language Models to Follow Complex Instructions | WizardLM | June 2023 | Microsoft, Peking University | Code |
| Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale | Voicebox | June 2023 | Meta | |
| DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing | DragDiffusion | June 2023 | National University of Singapore & ByteDance | |

Important Reports

| Report | Link | Date | Institution |
|---|---|---|---|
| Stanford AI Index Report 2023 | Link | April 2023 | Stanford |
| Sparks of Artificial General Intelligence: Early experiments with GPT-4 | Link | March 2023 | Microsoft |
| A Survey of Large Language Models | Link | April 2023 | Renmin University, China & University of Montreal, Canada |
| Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond | Link | April 2023 | Amazon & many others |
| A Cookbook of Self-Supervised Learning | Link | April 2023 | Meta & many others |
| Let's Verify Step by Step | Link | May 2023 | OpenAI |
| A Survey on Segment Anything Model (SAM): Vision Foundation Model Meets Prompt Engineering | Link | May 2023 | Kyung Hee University and many |
| A Comprehensive Survey on Segment Anything Model for Vision and Beyond | Link | May 2023 | Hong Kong University of Science and Technology and many |
| On the Design Fundamentals of Diffusion Models: A Survey | Link | June 2023 | Durham University |
| Open LLM Leaderboard | Link | Updated in real time | Hugging Face |

Important Projects

Midjourney

Alpaca Open Source Code Stanford March 2023

Dolly Open Source Code Databricks March 2023 Note: licensed for commercial use

Vicuna Open Source Code UC Berkeley, CMU, Stanford, and UC San Diego March 2023

ChatPDF March 2023

Bard Google March 2023

Langchain Community Effort March 2023

Microsoft 365 Copilot Microsoft March 2023

AutoGPT Community Effort April 2023

Grounded SAM IDEA April 2023

DeepSpeed Chat Microsoft April 2023

AgentGPT Community Effort April 2023

MiniGPT-4 King Abdullah University of Science and Technology April 2023

DeepFloyd IF Stability.ai April 2023

OpenLLaMA Berkeley May 2023

SoftVC VITS Singing Voice Conversion Community May 2023

Falcon TII (Technology Innovation Institute) May 2023

UltraLM Tsinghua University June 2023

AIGC Courses

COS597G Understanding Large Language Models Princeton 2022

CS324 Large Language Models Stanford 2023

ChatGPT, LangChain and Data Science Short Courses DeepLearning.AI Jun 2023

Large Multimodal Models: Notes on CVPR 2023 Tutorial Microsoft Jun 2023

Very Useful Source Code

OpenAI Cookbook
LlamaIndex
PrivateGPT
llama.cpp

Main LLM Development Tips, Updated June 21, 2023

1. Data is still king - LLMs are great, but if you don't have quality, clean data, you won't go far.

2. Smaller models can be just as good as larger general models at specific tasks. And cheaper!

3. Fine-tuning is becoming cheaper.

4. Evaluation of LLMs is very hard - feels very subjective still.

5. Managed APIs are expensive.

6. "Traditional" ML isn't going anywhere.

7. Memory matters - for both serving and training.

8. Information retrieval w/ vector databases is becoming a standard pattern.

9. Start w/ prompt engineering and push that to its limits before fine-tuning w/ smaller models.

10. Use agents/chains only when necessary. They are unruly.

11. Latency is critical for a good user experience.

12. Privacy is critical.
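Tips 8 and 9 together describe the common retrieval-augmented pattern: embed your documents, find the ones nearest a query, and paste them into the prompt as context before reaching for fine-tuning. A minimal sketch of the retrieval step, using a toy bag-of-words embedding and an in-memory list as stand-ins for a real embedding model and vector database (the names `embed`, `cosine`, and `retrieve` are illustrative, not any specific library's API):

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words term-frequency vector.
    A real system would call an embedding model here."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "Fine-tuning adapts a pretrained model to a narrow task.",
    "Vector databases index embeddings for similarity search.",
    "Prompt engineering shapes model behavior without training.",
]

# The retrieved passages would normally be inserted into the LLM prompt as context.
print(retrieve("how do embeddings and similarity search work", docs, k=1))
```

Swapping in a real embedding model and an approximate-nearest-neighbor index changes only `embed` and the search inside `retrieve`; the overall shape of the pattern stays the same.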
