Wind-Gone / awesome-ai4db-paper

AI4DB Papers

AI4DB Paper Sets

New PRs are welcome. Please follow the established format: paperName (with link) [MeetingName Year]

If the paper has open-source code, please include its GitHub link in the MeetingName part.

Learning-based Query Optimization

Cardinality Estimation

Survey

  1. Cardinality Estimation: An Experimental Survey [VLDB 17]
  2. Are We Ready For Learned Cardinality Estimation? [VLDB 21]
  3. Cardinality Estimation in DBMS: A Comprehensive Benchmark Evaluation [VLDB 21]
  4. Learned cardinality estimation: A design space exploration and a comparative evaluation [VLDB 22]
  5. Learned Cardinality Estimation: An In-depth Study [SIGMOD 22]

Query-Driven

Single-Table
  1. Selectivity estimation for range predicates using lightweight models [VLDB 19]
  2. Deep learning models for selectivity estimation of multiattribute queries [SIGMOD 20]
Multi-Tables
  1. Learned Cardinalities: Estimating Correlated Joins with Deep Learning [CIDR 19]
  2. An End-to-End Learning-based Cost Estimator [VLDB 19]
  3. Flow-Loss: Learning Cardinality Estimates That Matter [VLDB 21]
  4. Speeding Up End-to-end Query Execution via Learning-based Progressive Cardinality Estimation [SIGMOD 23]
  5. Robust Query Driven Cardinality Estimation under Changing Workloads [VLDB 23]
  6. AutoCE: An Accurate and Efficient Model Advisor for Learned Cardinality Estimation [ICDE 23]
  7. ASM: Harmonizing Autoregressive Model, Sampling, and Multi-dimensional Statistics Merging for Cardinality Estimation [SIGMOD 24]

Data-Driven

Single-Table
  1. Self-tuning, GPU-accelerated kernel density models for multidimensional selectivity estimation [SIGMOD 15]
  2. Deep Unsupervised Cardinality Estimation [VLDB 19]
  3. Quicksel: Quick selectivity learning with mixture models [SIGMOD 20]
  4. Pre-training Summarization Models of Structured Datasets for Cardinality Estimation [VLDB 22]
Multi-Tables
  1. DeepDB: Learn from Data, not from Queries! [VLDB 20]
  2. NeuroCard: One Cardinality Estimator for All Tables [VLDB 21]
  3. FLAT: Fast, Lightweight and Accurate Method for Cardinality Estimation [VLDB 21]
  4. BayesCard: Revitilizing Bayesian Frameworks for Cardinality Estimation [arXiv 21]
  5. Glue: Adaptively Merging Single Table Cardinality to Estimate Join Query Size [arXiv 21]
  6. Fauce: fast and accurate deep ensembles with uncertainty for cardinality estimation [VLDB 21]
  7. FACE: a normalizing flow based cardinality estimator [VLDB 22]
  8. FactorJoin: A New Cardinality Estimation Framework for Join Queries [SIGMOD 22] (Bounded)

Hybrid

  1. A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation [SIGMOD 21]

Pretrain

  1. PRICE: A Pretrained Model for Cross-Database Cardinality Estimation [arXiv 24]

Plan Hints

  1. Bao: Making Learned Query Optimization Practical [SIGMOD 21]

Cost Model

  1. Cost-based or Learning-based? A Hybrid Query Optimizer for Query Plan Selection [VLDB 22]
  2. Lero: A Learning-to-Rank Query Optimizer [VLDB 23]

SQL Embedding

  1. PreQR: Pre-training Representation for SQL Understanding [SIGMOD 22]
  2. LearnedSQLGen: Constraint-aware SQL Generation using Reinforcement Learning [SIGMOD 22]

Join Order

  1. Learning to Optimize Join Queries With Deep Reinforcement Learning [SIGMOD 16]
  2. Deep Reinforcement Learning for Join Order Enumeration [arXiv 18]
  3. Reinforcement Learning with Tree-LSTM for Join Order Selection [ICDE 20]

Query Rewrite

  1. A Learned Query Rewrite System using Monte Carlo Tree Search [VLDB 22]

Database Traditional Technology

Learning-based Index Design

Single-dimensional

  1. The Case for Learned Index Structures [SIGMOD 18]
  2. FITing-Tree: A Data-aware Index Structure [SIGMOD 19]
  3. ALEX: An Updatable Adaptive Learned Index [arXiv 20]
  4. The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds [VLDB 20]
  5. RadixSpline: a single-pass learned index [aiDM 20]
  6. Why Are Learned Indexes So Effective? [ICML 20]
  7. A Pluggable Learned Index Method via Sampling and Gap Insertion [arXiv 21]
  8. Updatable Learned Index with Precise Positions [VLDB 21]
  9. The next 50 years in database indexing or: the case for automatically generated index structures [VLDB 21]
  10. Tuning Hierarchical Learned Indexes on Disk and Beyond [SIGMOD 22]
  11. APEX: A High-Performance Learned Index on Persistent Memory [VLDB 22]
  12. FINEdex: A Fine-grained Learned Index Scheme for Scalable and Concurrent Memory Systems [VLDB 22]
  13. Are Updatable Learned Indexes Ready? [VLDB 22]
  14. CARMI: A Cache-Aware Learned Index with a Cost-based Construction Algorithm [VLDB 22]
  15. NFL: Robust Learned Index via Distribution Transformation [VLDB 22]
  16. Cutting Learned Index into Pieces: An In-depth Inquiry into Updatable Learned Indexes [ICDE 23]

Multi-dimensional

  1. Learning Multi-dimensional Indexes [SIGMOD 20]
  2. LISA: A Learned Index Structure for Spatial Data [SIGMOD 20]
  3. Effectively Learning Spatial Indices [VLDB 20]
  4. The ML-Index: A Multidimensional, Learned Index for Point, Range, and Nearest-Neighbor Queries [EDBT 20]
  5. Tsunami: A Learned Multi-dimensional Index for Correlated Data and Skewed Workloads [VLDB 21]
  6. NEIST: a Neural-Enhanced Index for Spatio-Temporal Queries [TKDE 21]
  7. RW-Tree: A Learned Workload-aware Framework for R-tree Construction [ICDE 22]

Learning-based Configuration Advisor

Index Advisor

  1. The Case for Automatic Database Administration using Deep Reinforcement Learning [arXiv 18]
  2. AI Meets AI: Leveraging Query Executions to Improve Index Recommendations [SIGMOD 19]
  3. Online Index Selection Using Deep Reinforcement Learning for a Cluster Database [ICDEW 20]
  4. SMARTIX: A database indexing agent based on reinforcement learning [Applied Intelligence 20]
  5. Magic mirror in my hand, which is the best in the land? An Experimental Evaluation of Index Selection Algorithms [VLDB 20]
  6. An Index Advisor Using Deep Reinforcement Learning [CIKM 20]
  7. Automated Database Indexing Using Model-Free Reinforcement Learning [ICAPS 20]
  8. DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees [ICDE 21]
  9. Index selection for NoSQL database with deep reinforcement learning [Information Sciences 21]
  10. MANTIS: Multiple Type and Attribute Index Selection using Deep Reinforcement Learning [IDEAS 21]
  11. AutoIndex: An Incremental Index Management System for Dynamic Workloads [ICDE 22]
  12. SWIRL: Selection of Workload-aware Indexes using Reinforcement Learning [EDBT 22]
  13. Indexer++: Workload-Aware Online Index Tuning with Transformers and Reinforcement Learning [SAC 22]
  14. Budget-aware Index Tuning with Reinforcement Learning [SIGMOD 22]
  15. ISUM: Efficiently Compressing Large and Complex Workloads for Scalable Index Tuning [SIGMOD 22]
  16. DISTILL: low-overhead data-driven techniques for filtering and costing indexes for scalable index tuning [VLDB 22]
  17. HMAB: Self-Driving Hierarchy of Bandits for Integrated Physical Database Design Tuning [VLDB 22]
  18. SmartIndex: An Index Advisor with Learned Cost Estimator [CIKM 22]
  19. Learned Index Benefits: Machine Learning Based Index Performance Estimation [VLDB 23]

Database Self-Tuning

  1. Automatic Database Management System Tuning Through Large-scale Machine Learning [SIGMOD 17]
  2. Deploying a Steered Query Optimizer in Production at Microsoft [SIGMOD 22]
  3. Detect, Distill and Update: Learned DB Systems Facing Out of Distribution Data [SIGMOD 23]
  4. AutoSteer: Learned Query Optimization for Any SQL Database [SIGMOD 23]
  5. Auto-WLM: Machine Learning Enhanced Workload Management in Amazon Redshift [SIGMOD 23]

LLM

  1. GPTuner: A Manual-Reading Database Tuning System via GPT-Guided Bayesian Optimization [VLDB 24]
  2. D-Bot: Database Diagnosis System using Large Language Models [VLDB 24]
  3. LLMTune: Accelerate Database Knob Tuning with Large Language Models [VLDB 24]

License: MIT License