hongyunnchen / dl4math

Reading list for research topics in mathematical reasoning and artificial intelligence

DL4MATH - Reading List

🧰 Resources

Related Surveys

  • A Survey of Question Answering for Math and Science Problem, arXiv:1705.04530 [paper]
  • The Gap of Semantic Parsing: A Survey on Automatic Math Word Problem Solvers, TPAMI 2019 [paper]
  • Representing Numbers in NLP: a Survey and a Vision, NAACL 2021 [paper]
  • Survey on Mathematical Word Problem Solving Using Natural Language Processing, ICIICT 2021 [paper]
  • A Survey in Mathematical Language Processing, arXiv:2205.15231 [paper]
  • Partial Differential Equations Meet Deep Neural Networks: A Survey, arXiv:2211.05567 [paper]

Workshops

  • The 1st MATH-AI Workshop: the Role of Mathematical Reasoning in General Artificial Intelligence, ICLR 2021 [website]
  • Math AI for Education: Bridging the Gap Between Research and Smart Education (MATHAI4ED), NeurIPS 2021 [website]
  • The 1st Workshop on Mathematical Natural Language Processing, EMNLP 2022 [website]
  • 🔥 The 2nd MATH-AI Workshop: Toward Human-Level Mathematical Reasoning, NeurIPS 2022 [website]

Talks

  • Computer Scientist Explains One Concept in 5 Levels of Difficulty, 2022 [YouTube]

🎨 Mathematical Reasoning Benchmarks

Math Word Problems (MWP)

  • [AI2] Learning to Solve Arithmetic Word Problems with Verb Categorization, EMNLP 2014 [paper]
  • [Alg514] Learning to automatically solve algebra word problems, ACL 2014 [paper]
  • [IL] Reasoning about Quantities in Natural Language, TACL 2015 [paper]
  • [SingleEQ] Parsing Algebraic Word Problems into Equations, TACL 2015 [paper]
  • [DRAW] Draw: A challenging and diverse algebra word problem set, 2015 [paper]
  • [Dolphin1878] Automatically solving number word problems by semantic parsing and reasoning, EMNLP 2015 [paper]
  • [Dolphin18K] How well do computers solve math word problems? large-scale dataset construction and evaluation, ACL 2016 [paper]
  • [MAWPS] MAWPS: A math word problem repository, NAACL-HLT 2016 [paper]
  • [AllArith] Unit dependency graph and its application to arithmetic word problem solving, AAAI 2017 [paper]
  • [DRAW-1K] Annotating Derivations: A New Evaluation Strategy and Dataset for Algebra Word Problems, ACL 2017 [paper]
  • [Math23K] Deep neural solver for math word problems, EMNLP 2017 [paper]
  • [AQuA-RAT] Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems, ACL 2017 [paper]
  • [Aggregate] Mapping to Declarative Knowledge for Word Problem Solving, TACL 2018 [paper]
  • [MathQA] MathQA: Towards interpretable math word problem solving with operation-based formalisms, NAACL-HLT 2019 [paper]
  • [ASDiv] A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers, ACL 2020 [paper]
  • [Ape210K] Ape210k: A large-scale and template-rich dataset of math word problems, arXiv:2009.11506 [paper]
  • [SVAMP] Are NLP Models really able to Solve Simple Math Word Problems?, NAACL-HLT 2021 [paper]
  • 🔥 [IconQA] IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning, NeurIPS 2021 (Datasets and Benchmarks) [paper]
  • 🔥 [GSM8K] Training verifiers to solve math word problems, arXiv:2110.14168 [paper]
  • [MathQA-Python] Program synthesis with large language models, arXiv:2108.07732 [paper]
  • [ArMATH] ArMATH: a Dataset for Solving Arabic Math Word Problems, LREC 2022 [paper]
  • 🔥 [TabMWP] Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning, arXiv:2209.14610, 2022 [paper]

Geometry Problem Solving (GPS)

  • [GEOS] Solving geometry problems: Combining text and diagram interpretation, EMNLP 2015 [paper]
  • [GeoShader] Synthesis of solutions for shaded area geometry problems, The Thirtieth International Flairs Conference, 2017 [paper]
  • [GEOS-OS] Learning to solve geometry problems from natural language demonstrations in textbooks, Proceedings of the 6th Joint Conference on Lexical and Computational Semantics, 2017 [paper]
  • [GEOS++] From textbooks to knowledge: A case study in harvesting axiomatic knowledge from textbooks to solve geometry problems, EMNLP 2017 [paper]
  • [GeoQA] GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning, Findings of ACL 2021 [paper]
  • 🔥 [Geometry3K] Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning, ACL 2021 [paper]
  • 🔥 [UniGeo] UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression, EMNLP 2022 [paper]
  • [GeoRE] GeoRE: A Relation Extraction Dataset for Chinese Geometry Problems, NeurIPS 2021 MATHAI4ED Workshop [paper]
  • [GeoQA+] An Augmented Benchmark Dataset for Geometric Question Answering through Dual Parallel Text Encoding, COLING 2022 [paper]

Theorem Proving (TP)

  • [HOList] HOList: An environment for machine learning of higher order logic theorem proving, ICML 2019 [paper]
  • [INT] INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving, ICLR 2021 [paper]
  • 🔥 NaturalProofs: Mathematical Theorem Proving in Natural Language, NeurIPS 2021 (Datasets and Benchmarks) [paper]
  • 🔥 [MiniF2F] MiniF2F: a cross-system benchmark for formal Olympiad-level mathematics, ICLR 2022 [paper]

Math Question Answering (MathQA)

  • [Fermi] How Much Coffee Was Consumed During EMNLP 2019? Fermi Problems: A New Reasoning Challenge for AI, EMNLP 2020 [paper]
  • [TAT-QA] TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance, ACL-IJCNLP 2021 [paper]
  • [FinQA] FinQA: A Dataset of Numerical Reasoning over Financial Data, EMNLP 2021 [paper]
  • [NumGLUE] NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks, ACL 2022 [paper]
  • [MultiHiertt] MultiHiertt: Numerical Reasoning over Multi Hierarchical Tabular and Textual Data, ACL 2022 [paper]
  • 🔥 Lila: A Unified Benchmark for Mathematical Reasoning, EMNLP 2022 [paper]

Other Math Tasks

  • [TextbookQA] Are You Smarter Than A Sixth Grader? Textbook Question Answering for Multimodal Machine Comprehension, CVPR 2017 [paper]
  • [FigureQA] FigureQA: An annotated figure dataset for visual reasoning, arXiv:1710.07300 [paper]
  • [DVQA] DVQA: Understanding data visualizations via question answering, CVPR 2018 [paper]
  • [RAVEN] RAVEN: A dataset for relational and analogical visual reasoning, CVPR 2019 [paper]
  • [MNS] Machine Number Sense: A Dataset of Visual Arithmetic Problems for Abstract and Relational Reasoning, AAAI 2020 [paper]
  • [P3] Programming Puzzles, NeurIPS 2021 (Datasets and Benchmarks) [paper]
  • [IsarStep] IsarStep: a Benchmark for High-level Mathematical Reasoning, ICLR 2021 [paper]
  • [PhysNLU] PhysNLU: A Language Resource for Evaluating Natural Language Understanding and Explanation Coherence in Physics, 2022 [paper]
  • [ScienceQA] Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering, NeurIPS 2022 [paper]
  • [PGDP5K] PGDP5K: A Diagram Parsing Dataset for Plane Geometry Problems, arXiv:2205.0994 [paper]
  • [ConvFinQA] ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering, arXiv:2210.03849 [paper]
  • [APPS, code generation] Measuring Coding Challenge Competence With APPS, NeurIPS 2021 (Datasets and Benchmarks) [paper]

🧩 Neural Networks for Math

Neural Math Word Problem Solving

  • [symbolic reasoning] Semantic parsing of pre-university math problems, ACL 2017 [paper]
  • [Equation templates] Learning fine-grained expressions to solve math word problems, EMNLP 2017 [paper]
  • [Dependency Graph] Unit dependency graph and its application to arithmetic word problem solving, AAAI 2017 [paper]
  • Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems, ACL 2017 [paper]
  • [expression tree] Translating a math word problem to an expression tree, EMNLP 2018 [paper]
  • [logical reasoning] Mapping to declarative knowledge for word problem solving, TACL 2018 [paper]
  • [equation templates] Template-based math word problem solvers with recursive neural networks, AAAI 2019 [paper]
  • [expression tree] Semantically-Aligned Universal Tree-Structured Solver for Math Word Problems, EMNLP 2020 [paper]
  • [Weak Supervision] Learning by Fixing: Solving Math Word Problems with Weak Supervision, AAAI 2021 [paper]
  • Solving Math Word Problems with Teacher Supervision, IJCAI 2021 [paper]
  • Analogical Math Word Problems Solving with Enhanced Problem-Solution Association, EMNLP 2022 [paper]

Neural Geometry Solving

  • Synthesis of geometry proof problems, AAAI 2014 [paper]
  • Diagram understanding in geometry questions, AAAI 2014 [paper]
  • Retrieving geometric information from images: the case of hand-drawn diagrams, Data Mining and Knowledge Discovery 2017 [paper]
  • Automatic understanding and formalization of natural language geometry problems using syntax-semantics models, International Journal of Innovative Computing, Information and Control 2018 [paper]
  • A Framework for Solving Explicit Arithmetic Word Problems and Proving Plane Geometry Theorems, International Journal of Pattern Recognition and Artificial Intelligence 2019 [paper]
  • [Knowledge] Discourse in multimedia: A case study in extracting geometry knowledge from textbooks, Computational Linguistics, 2020 [paper]

Neural Theorem Proving

  • DeepMath - Deep Sequence Models for Premise Selection, NeurIPS 2016 [paper]
  • Deep network guided proof search, arXiv:1701.06972 [paper]
  • Graph representations for higher-order logic and theorem proving, AAAI 2020 [paper]
  • Neural Theorem Proving on Inequality Problems, AITP 2020 [paper]
  • Latent Action Space for Efficient Planning in Theorem Proving, 2021 [paper]
  • Learning to Give Checkable Answers with Prover-Verifier Games, arXiv:2108.12099 [paper]
  • REFACTOR: Learning to Extract Theorems from Proofs, 2022 [paper]

Neural Networks for MathQA

  • Combining retrieval, statistics, and inference to answer elementary science questions, AAAI 2016 [paper]

Neural Networks for Other Math Tasks

  • 🔥 Advancing mathematics by guiding human intuition with AI, Nature 2021 [paper]
  • Symbolic Brittleness in Sequence Models: on Systematic Generalization in Symbolic Mathematics, AAAI 2022 [paper]
  • 🔥 Discovering faster matrix multiplication algorithms with reinforcement learning, Nature 2022 [paper]

📜 Pre-trained Models for Math

Pre-trained Language Models (PTLMs)

  • [GPT-2] Language models are unsupervised multitask learners, 2019 [paper]
  • [UnifiedQA] UNIFIEDQA: Crossing Format Boundaries with a Single QA System, EMNLP 2020 [paper]

Language Models for MWPs

  • Lime: Learning inductive bias for primitives of mathematical reasoning, ICML 2021 [paper]
  • 🔥 [IconQA] IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning, NeurIPS 2021 (Datasets and Benchmarks) [paper]
  • MWP-BERT: Numeracy-Augmented Pre-training for Math Word Problem Solving, Findings of NAACL 2022 [paper]
  • TAPEX: Table Pre-training via Learning a Neural SQL Executor, ICLR 2022 [paper]
  • Insights into Pre-training via Simpler Synthetic Tasks, NeurIPS 2022 [paper]
  • Learning from Self-Sampled Correct and Partially-Correct Programs, arXiv:2205.14318 [paper]
  • Solving quantitative reasoning problems with language models, arXiv:2206.14858 [paper]

Language Models for Geometry Solvers

  • 🔥 [Inter-GPS] Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning, ACL 2021 [paper]
  • 🔥 [UniGeo] UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression, EMNLP 2022 [paper]

Language Models for Theorem Proving

  • Generative Language Modeling for Automated Theorem Proving, arXiv:2009.03393 [paper]
  • HyperTree Proof Search for Neural Theorem Proving, arXiv:2205.11491 [paper]
  • Proof Artifact Co-training for Theorem Proving with Language Models, ICLR 2022 [paper]
  • Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers, NeurIPS 2022 [paper]
  • [LISA] LISA: Language models of ISAbelle proofs, AITP 2021 [paper]

Language Models for MathQA

  • From 'F' to 'A' on the NY Regents Science Exams: An Overview of the Aristo Project, arXiv:1909.01958 [paper]
  • Injecting Numerical Reasoning Skills into Language Models, ACL 2020 [paper]
  • Injecting Numerical Reasoning Skills into Knowledge Base Question Answering Models, arXiv:2112.06109 [paper]

Language Models for Other Math Tasks

  • Linear algebra with transformers, TMLR 2022 [paper]
  • Show Your Work: Scratchpads for Intermediate Computation with Language Models, arXiv:2112.00114 [paper]

🌠 In-context Learning with LLMs for Math

Large Language Models (100B+)

  • 🔥 [GPT-3] Language models are few-shot learners, NeurIPS 2020 [paper]
  • 🔥 [Codex] Evaluating large language models trained on code, arXiv:2107.03374 [paper]
  • 🔥 [PaLM] PaLM: Scaling Language Modeling with Pathways, arXiv:2204.02311 [paper]

Prompt Learning for MWPs

  • Calibrate before use: Improving few-shot performance of language models, ICML 2021 [paper]
  • Emergent Abilities of Large Language Models, Transactions on Machine Learning Research 2022 [paper]
  • Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity, ACL 2022 [paper]
  • What Makes Good In-Context Examples for GPT-3?, The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures 2022 [paper]
  • 🔥 [CoT] Chain of thought prompting elicits reasoning in large language models, arXiv:2201.11903 [paper]
  • 🔥 Self-consistency improves chain of thought reasoning in language models, arXiv:2203.11171 [paper]
  • Least-to-Most Prompting Enables Complex Reasoning in Large Language Models, arXiv:2205.10625 [paper]
  • 🔥 [Zero-shot CoT] Large Language Models are Zero-Shot Reasoners, arXiv:2205.11916 [paper]
  • 🔥 [CoT GPT-3 + RL] Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning, arXiv:2209.14610, 2022 [paper]
  • 🔥 Language models are multilingual chain-of-thought reasoners, arXiv:2210.03057 [paper]
  • Automatic Chain of Thought Prompting in Large Language Models, arXiv:2210.03493 [paper]
  • Large Language Models are few(1)-shot Table Reasoners, arXiv:2210.06710 [paper]
  • Challenging BIG-Bench tasks and whether chain-of-thought can solve them, arXiv:2210.09261 [paper]
  • Scaling Instruction-Finetuned Language Models, arXiv:2210.11416 [paper]

Prompt Learning for Proving

  • [PaLM, Codex] Autoformalization with Large Language Models, NeurIPS 2022 [paper]
  • [Codex] Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs, arXiv:2210.12283 [paper]

Prompt Learning for MathQA

  • 🔥 A Neural Network Solves, Explains, and Generates University Math Problems by Program Synthesis and Few-Shot Learning at Human Level, PNAS 2022 [paper]
  • 🔥 Minerva: Solving Quantitative Reasoning Problems with Language Models, NeurIPS 2022 [paper]

♣️ Other Methods for Math

Early Methods

  • Empirical explorations of the geometry theorem machine, Western Joint IRE-AIEE-ACM Computer Conference 1960 [paper]
  • Basic principles of mechanical theorem proving in elementary geometries, Journal of Automated Reasoning 1986 [paper]
  • Automated generation of readable proofs with geometric invariants, Journal of Automated Reasoning 1996 [paper]
  • My computer is an honor student - but how intelligent is it? Standardized tests as a measure of AI, AI Magazine 2016 [paper]

Symbolic Methods

  • Learning pipelines with limited data and domain knowledge: A study in parsing physics problems, NeurIPS 2018 [paper]
  • Automatically proving plane geometry theorems stated by text and diagram, International Journal of Pattern Recognition and Artificial Intelligence 2019 [paper]

Pure ML Methods

  • Classification and Clustering of arXiv Documents, Sections, and Abstracts, Comparing Encodings of Natural and Mathematical Language, JCDL 2020 [paper]

License:MIT License