
Awesome-Reasoning-Foundation-Models


A curated list of awesome large AI models, or foundation models, for reasoning. We organize current foundation models into three categories: language foundation models, vision foundation models, and multimodal foundation models. We further cover how foundation models are applied to reasoning tasks, including commonsense, mathematical, logical, causal, visual, audio, multimodal, and embodied reasoning, and we summarize the reasoning techniques involved.

We welcome contributions of additional resources to this repository. If you would like to contribute, please submit a pull request!

Table of Contents

0 Survey


This repository is primarily based on the following paper:

Reasoning with Foundation Models: Concepts, Methodologies, and Outlook

Jiankai Sun, Chuanyang Zheng, Enze Xie, Zhengying Liu, Ruihang Chu, Jianing Qiu, Jiaqi Xu, Mingyu Ding, Hongyang Li, Mengzhe Geng, Yue Wu, Wenhai Wang, Junsong Chen, Xiaozhe Ren, Jie Fu, Junxian He, Wu Yuan, Qi Liu, Xihui Liu, Yu Li, Hao Dong, Yu Cheng, Ming Zhang, Pheng Ann Heng, Jifeng Dai, Ping Luo, Jingdong Wang, Jirong Wen, Xipeng Qiu, Yike Guo, Hui Xiong, Qun Liu, and Zhenguo Li

If you find this repository helpful, please consider citing:

@article{sun2023reasoning,
  title={Reasoning with Foundation Models: Concepts, Methodologies, and Outlook},
  author={Sun, Jiankai and Zheng, Chuanyang and Xie, Enze and Liu, Zhengying and others},
  journal={arXiv preprint},
  year={2023}
}

1 Relevant Surveys and Links

  • The Rise and Potential of Large Language Model Based Agents: A Survey - [arXiv] [Link]

  • Multimodal Foundation Models: From Specialists to General-Purpose Assistants - [arXiv]

  • A Survey on Multimodal Large Language Models - [arXiv] [Link]

  • Interactive Natural Language Processing - [arXiv] [Link]

  • A Survey of Large Language Models - [arXiv] [Link]

  • Self-Supervised Multimodal Learning: A Survey - [arXiv] [Link]

  • Large AI Models in Health Informatics: Applications, Challenges, and the Future - [arXiv] [Paper] [Link]

  • Towards Reasoning in Large Language Models: A Survey - [arXiv] [Paper] [Link]

  • Reasoning with Language Model Prompting: A Survey - [arXiv] [Paper] [Link]

  • Awesome Multimodal Reasoning - [Link]

2 Foundation Models


2.1 Language Foundation Models

2.2 Vision Foundation Models

2.3 Multimodal Foundation Models

2.4 Reasoning Applications

3 Reasoning Tasks

3.1 Commonsense Reasoning

3.1.1 Commonsense Question and Answering (QA)

3.1.2 Physical Commonsense Reasoning

3.1.3 Spatial Commonsense Reasoning

Benchmarks, Datasets, and Metrics

3.2 Mathematical Reasoning

3.2.1 Arithmetic Reasoning

3.2.2 Geometry Reasoning

3.2.3 Theorem Proving
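
Formal theorem proving asks a model to produce a machine-checkable proof of a given statement. As a toy illustration (not drawn from any specific benchmark), here is the kind of statement/proof pair a Lean 4 prover is expected to emit; the statement is given, and the model must supply the proof:

-- Toy Lean 4 examples of the statement/proof pairs that proving tasks target.
example : 2 + 2 = 4 := rfl                 -- closed by computation

theorem add_comm' (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b                         -- closed by appealing to a library lemma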

3.2.4 Scientific Reasoning

Benchmarks, Datasets, and Metrics

3.3 Logical Reasoning

3.3.1 Propositional Logic

  • 2022/09 | Propositional Reasoning via Neural Transformer Language Models - [Paper]
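
To make the task concrete, propositional reasoning amounts to deciding entailment over truth assignments. A brute-force checker sketch follows; it is a toy reference for what the task asks of a model, with formulas encoded as Python lambdas rather than a real formula language:

# Toy brute-force propositional entailment checker.
from itertools import product

def entails(premises, conclusion, n_vars):
    """True iff every assignment satisfying all premises also satisfies the conclusion."""
    for values in product([False, True], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True

# Modus ponens: from (p -> q) and p, infer q.
print(entails([lambda p, q: (not p) or q, lambda p, q: p],
              lambda p, q: q, n_vars=2))  # True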

3.3.2 Predicate Logic

Benchmarks, Datasets, and Metrics

3.4 Causal Reasoning

3.4.1 Counterfactual Reasoning

Benchmarks, Datasets, and Metrics

3.5 Visual Reasoning

3.5.1 3D Reasoning

Benchmarks, Datasets, and Metrics

3.6 Audio Reasoning

3.6.1 Speech

Benchmarks, Datasets, and Metrics

3.7 Multimodal Reasoning

3.7.1 Alignment

3.7.2 Generation

3.7.3 Multimodal Understanding

Benchmarks, Datasets, and Metrics

3.8 Embodied Reasoning

3.8.1 Introspective Reasoning

3.8.2 Extrospective Reasoning

3.8.3 Multi-agent Reasoning

3.8.4 Driving Reasoning

Benchmarks, Datasets, and Metrics

3.9 Other Tasks and Applications

3.9.1 Theory of Mind (ToM)

3.9.2 LLMs for Weather Prediction

  • 2022/09 | MetNet-2 | Deep learning for twelve hour precipitation forecasts - [Paper]

  • 2023/07 | Pangu-Weather | Accurate medium-range global weather forecasting with 3D neural networks - [Paper]

3.9.3 Abstract Reasoning

3.9.4 Defeasible Reasoning

3.9.5 Medical Reasoning

3.9.6 Bioinformatics Reasoning

3.9.7 Long-Chain Reasoning

4 Reasoning Techniques

4.1 Pre-Training

4.1.1 Data

a. Data - Text
b. Data - Image
c. Data - Multimodality

4.1.2 Network Architecture

a. Encoder-Decoder
b. Decoder-Only (see the sketch after this list)
c. CLIP Variants
d. Others
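
Decoder-only architectures (item b) dominate current language foundation models. A minimal sketch of one pre-norm decoder block, assuming PyTorch; the sizes are illustrative, and real models stack dozens of such blocks with positional information and tied embeddings:

import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    # One pre-norm block: causal self-attention then an MLP, each with a residual.
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):  # x: (batch, seq, dim)
        seq = x.size(1)
        causal = torch.triu(torch.ones(seq, seq, dtype=torch.bool), diagonal=1)
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=causal)  # True = masked (future tokens)
        x = x + attn_out
        return x + self.mlp(self.norm2(x))

out = DecoderBlock()(torch.randn(2, 10, 256))  # -> (2, 10, 256)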

4.2 Fine-Tuning

4.2.1 Data

4.2.2 Parameter-Efficient Fine-tuning

a. Adapter Tuning
b. Low-Rank Adaptation (see the sketch after this list)
c. Prompt Tuning
d. Partial Parameter Tuning
e. Mixture-of-Modality Adaption
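
Among these, Low-Rank Adaptation (LoRA, item b) is probably the most widely used. A minimal sketch, assuming PyTorch; the rank r and scaling alpha are illustrative hyperparameters, and real implementations (e.g., the peft library) add dropout and weight merging:

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    # Frozen dense layer plus a trainable low-rank update: W + (alpha / r) * B @ A.
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pretrained weights stay frozen
        self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)  # down-projection
        self.B = nn.Parameter(torch.zeros(out_features, r))        # up-projection, zero init
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

y = LoRALinear(768, 768)(torch.randn(2, 768))  # only A and B receive gradients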

4.3 Alignment Training

4.3.1 Data

a. Data - Human
b. Data - Synthesis

4.3.2 Training Pipeline

a. Online Human Preference Training
b. Offline Human Preference Training
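
As one representative offline method, Direct Preference Optimization (DPO) trains on preference pairs without a separate reward model. A minimal sketch of the loss, assuming per-sequence log-probabilities have already been computed under the policy and a frozen reference model; beta is an illustrative hyperparameter:

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    # Push the policy to widen its chosen-vs-rejected log-prob margin
    # relative to the frozen reference model.
    policy_margin = policy_chosen - policy_rejected
    ref_margin = ref_chosen - ref_rejected
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

loss = dpo_loss(torch.tensor([-4.0]), torch.tensor([-6.0]),
                torch.tensor([-5.0]), torch.tensor([-5.5]))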

4.4 Mixture of Experts (MoE)
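
A sparse MoE layer routes each token to a small subset of expert networks, so model capacity grows without a matching increase in per-token compute. A toy top-k router sketch, assuming PyTorch; the dense Python loop is for clarity rather than efficiency, and real systems add load-balancing losses:

import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, dim=256, n_experts=4, k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts))
        self.gate = nn.Linear(dim, n_experts)  # learned router
        self.k = k

    def forward(self, x):                      # x: (tokens, dim)
        weights, idx = self.gate(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e       # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

out = TopKMoE()(torch.randn(8, 256))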

4.5 In-Context Learning

4.5.1 Demonstration Example Selection

a. Prior-Knowledge Approach
b. Retrieval Approach
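
A retrieval approach (item b) picks the demonstrations most similar to the query. A minimal sketch; the embed function here is a hypothetical bag-of-words stand-in for a real sentence encoder such as SBERT:

import numpy as np

def embed(text):
    # Hypothetical stand-in for a real sentence encoder.
    vocab = ["sum", "difference", "product", "speed", "price", "area"]
    return np.array([text.lower().count(w) for w in vocab], dtype=float)

def select_demonstrations(query, pool, k=2):
    """Return the k pool examples most cosine-similar to the query."""
    q = embed(query)
    def cosine(v):
        return float(q @ v / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9))
    return sorted(pool, key=lambda ex: cosine(embed(ex)), reverse=True)[:k]

pool = ["What is the sum of 3 and 5?", "Find the area of a 2x4 rectangle.",
        "A car's speed is 60 km/h; how far does it go in 2 h?"]
print(select_demonstrations("Compute the sum of 7 and 9.", pool))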

4.5.2 Chain-of-Thought

a. Zero-Shot CoT
b. Few-Shot CoT
c. Multiple Paths Aggregation
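
A sketch combining the zero-shot CoT trigger (item a) with self-consistency-style aggregation over multiple sampled paths (item c). The generate function is a hypothetical stand-in for a sampling-enabled LLM call; it returns a canned output here so the example runs:

from collections import Counter
import re

def generate(prompt, temperature=0.7):
    # Hypothetical model call; replace with a real LLM API.
    return "There are 3 boxes with 4 pens each, so 3 * 4 = 12. The answer is 12."

def self_consistency(question, n_paths=5):
    prompt = f"Q: {question}\nA: Let's think step by step."   # zero-shot CoT trigger
    answers = []
    for _ in range(n_paths):                                  # sample diverse paths
        path = generate(prompt, temperature=0.7)
        m = re.search(r"answer is\s*(-?\d+)", path)
        if m:
            answers.append(m.group(1))
    return Counter(answers).most_common(1)[0][0]              # majority vote

print(self_consistency("A shop has 3 boxes of 4 pens. How many pens?"))  # 12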

4.5.3 Multi-Round Prompting

a. Learned Refiners
b. Prompted Refiners
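
A prompted-refiner loop in the spirit of Self-Refine (item b): the same model drafts, critiques, and revises. call_llm is a hypothetical stand-in for a real model API and returns canned text here so the sketch runs:

def call_llm(prompt):
    # Hypothetical model call; replace with a real LLM API.
    return "Looks correct." if "feedback" in prompt.lower() else "Draft solution."

def refine(task, max_rounds=3):
    draft = call_llm(f"Solve the task:\n{task}")
    for _ in range(max_rounds):
        feedback = call_llm(f"Task: {task}\nDraft: {draft}\nGive concise feedback.")
        if "looks correct" in feedback.lower():
            break                              # stop once the critic is satisfied
        draft = call_llm(f"Task: {task}\nDraft: {draft}\nFeedback: {feedback}\nRevise.")
    return draft

print(refine("Summarize the benefits of unit tests."))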

4.6 Autonomous Agent
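
Agent frameworks typically interleave model reasoning with tool calls, ReAct-style, until a final answer is produced. A toy loop sketch; llm and the tool registry are hypothetical stand-ins for a real model and real tools:

def llm(transcript):
    # Hypothetical policy; a real agent would call an LLM here.
    return ('Action: calculator["2+2"]' if "calculator" not in transcript
            else "Final Answer: 4")

TOOLS = {"calculator": lambda expr: str(eval(expr))}  # toy tool registry

def run_agent(task, max_steps=5):
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        step = llm(transcript)
        if step.startswith("Final Answer:"):
            return step.removeprefix("Final Answer:").strip()
        name, arg = step.removeprefix("Action: ").split("[", 1)
        observation = TOOLS[name](arg.strip('"]'))        # execute the tool call
        transcript += f"{step}\nObservation: {observation}\n"
    return "no answer"

print(run_agent("What is 2+2?"))  # 4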

License

This repository is released under the MIT License.