A curated list of papers on awesome industry databases, frameworks, resources, tools, and other awesomeness for data engineers.
New PRs are welcome. Please follow the established entry format: paperName (with link) [MeetingName Year]
If the paper has open-source code, please add its GitHub link to the Meeting field.
- Progressive Partitioning for Parallelized Query Execution in Google’s Napa [VLDB 23]
- Keep Your Distributed Data Warehouse Consistent at a Minimal Cost [SIGMOD 23]
- Amazon Redshift and the Case for Simpler Data Warehouses [SIGMOD 15]
- Amazon Redshift Re-invented [SIGMOD 22]
- The Story of AWS Glue [VLDB 23]
- Auto-WLM: ML-enhanced workload management in Amazon Redshift [SIGMOD 23]
- Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service [ATC 22]
- Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent [VLDB 23]
- EmbedX: A Versatile, Efficient and Scalable Platform to Embed Both Graphs and High-Dimensional Sparse Data [VLDB 23]
- Towards General and Efficient Online Tuning for Spark [VLDB 23]
- Eigen: End-to-end Resource Optimization for Large-Scale Databases on the Cloud [VLDB 23]
- Anser: Adaptive Information Sharing Framework of AnalyticDB [VLDB 23]
- Lindorm TSDB: A Cloud-native Time-series Database for Large-scale Monitoring Systems [VLDB 23]
- Vineyard: Optimizing Data Sharing in Data-Intensive Analytics [SIGMOD 23]
- PolarDB-SCC: A Cloud-Native Database Ensuring Low Latency for Strongly Consistent Reads [VLDB 23]
- PolarDB-IMCI: A Cloud-Native HTAP Database System at Alibaba [SIGMOD 23]
- Automatic SQL Error Mitigation in Oracle [VLDB 23]
- ByteHTAP: ByteDance’s HTAP System with High Data Freshness and Strong Data Consistency [VLDB 22]
- Krypton: Real-time Serving and Analytical SQL Engine at ByteDance [VLDB 23]
- VeDB: A Software and Hardware Enabled Trusted Relational Database [SIGMOD 23]
- POLARIS: The Distributed SQL Engine in Azure Synapse [VLDB 20]
- Microsoft Purview: A System for Central Governance of Data [VLDB 23]
- OneProvenance: Efficient Extraction of Dynamic Coarse-Grained Provenance From Database Query Event Logs [VLDB 23]
- Towards Building Autonomous Data Services on Azure [SIGMOD 23]
- Presto: A Decade of SQL Analytics at Meta [SIGMOD 23]
- Disaggregating RocksDB: A Production Experience [SIGMOD 23]