db-paper-catalog

A catalog of papers in the data management area from the last five years.

System

VLDB15:Trill: A High-Performance Incremental Query Processor for Diverse Analytics

VLDB15:Distributed Architecture of Oracle Database In-memory

VLDB15:Building a Replicated Logging System with Apache Kafka

VLDB15:The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing

VLDB15:Collaborative Data Analytics with DataHub

VLDB15:AsterixDB: A Scalable, Open Source BDMS

SIGMOD15:Apache Tez: A Unifying Framework for Modeling and Building Data Processing Applications

SIGMOD15:Spark SQL: Relational Data Processing in Spark

SIGMOD15:Design and Implementation of the LogicBlox System

SIGMOD15:REEF: Retainable Evaluator Execution Framework

SIGMOD15:Large-scale Predictive Analytics in Vertica: Fast Data Transfer, Distributed Model Creation, and In-database Prediction

SIGMOD15:Oracle Workload Intelligence

SIGMOD15:On Improving User Response Times in Tableau

ICDE15:PABIRS: A Data Access Middleware for Distributed File Systems

ICDE15:Towards a Web-scale Data Management Ecosystem Demonstrated by SAP HANA

ICDE15:"Anti-Caching"-based Elastic Memory Management for Big Data

ICDE15:Accelerating Big Data Analytics With Collaborative Planning in Teradata Aster 6

VLDB14:epiC: an Extensible and Scalable System for Processing Big Data

VLDB14:WideTable: An Accelerator for Analytical Data Processing

VLDB14:Fuxi: a Fault-Tolerant Resource Management and Job Scheduling System at Internet Scale

VLDB14:Storing and Querying Tree-Structured Records in Dremel

VLDB14:Changing Engines in Midstream: A Java Stream Computational Model for Big Data Processing

VLDB14:Summingbird: A Framework for Integrating Batch and Online MapReduce Computations

VLDB14:Engineering High-Performance Database Engines

VLDB14:Realization of the Low Cost and High Performance MySQL Cloud Database

VLDB14:Anti-Caching: A New Approach to Database Management System Architecture

VLDB14:Simple, Fast, and Scalable Reachability Oracle

SIGMOD14:Sinew: A SQL System for Multi-Structured Data

SIGMOD14:Partial Results in Database Systems

SIGMOD14:Towards Unified Ad-hoc Data Processing

SIGMOD14:HAWQ: A Massively Parallel Processing SQL Engine in Hadoop

SIGMOD14:Major Technical Advancements in Apache Hive

ICDE14:Blazes: Coordination Analysis for Distributed Programs

ICDE14:DBDesigner: A Customizable Physical Design Tool for Vertica Analytic Database

ICDE14:Locality-Sensitive Operators for Parallel Main-Memory Database Clusters

ICDE13:Big Data Analytics at Facebook

ICDE13:EAGRE: Towards Scalable I/O Efficient SPARQL Query Evaluation on the Cloud

ICDE13:C-Cube: Elastic Continuous Clustering in the Cloud

VLDB13:MillWheel: Fault-Tolerant Stream Processing at Internet Scale

VLDB13:F1: A Distributed SQL Database That Scales

VLDB13:DB2 with BLU Acceleration: So Much More than Just a Column Store

VLDB13:The Quantcast File System

VLDB13:Overview of Turn Data Management Platform for Digital Advertising

VLDB13:Scuba: Diving into Data at Facebook

VLDB13:Adaptive and Big Data Scale Parallel Execution in Oracle

VLDB13:Next Generation Data Analytics at IBM Research

VLDB13:SAP HANA: The Evolution from a Modern Main-Memory Data Platform to an Enterprise Application Platform

VLDB13:Platform-as-a-Service for Data-enabled Applications

VLDB13:Facebook Data Analytics

VLDB13:The Trento Big Data Platform for Public Administration and Large Companies: Use cases and Opportunities

VLDB13:Google Data 2020 - The next challenges in big data

VLDB13:DiAl: Distributed Streaming Analytics Anywhere, Anytime

SIGMOD13:Cumulon: Optimizing Statistical Data Analysis in the Cloud

SIGMOD13:Shark: SQL and Rich Analytics at Scale

SIGMOD13:Parallel Analytics as a Service

SIGMOD13:BitWeaving: Fast Scans for Main Memory Data Processing

SIGMOD13:The "Big Data" Ecosystem at LinkedIn

SIGMOD13:On Brewing Fresh Espresso: LinkedIn's Distributed Data Serving Platform

SIGMOD13:Fast Data in the Era of Big Data: Twitter's Real-Time Related Query Suggestion Architecture

SIGMOD13:Petabyte Scale Databases and Storage Systems at Facebook

SIGMOD13:Split Query Processing in Polybase

SIGMOD13:WoW: What the World of (Data) Warehousing Can Learn from the World of Warcraft

SIGMOD13:Speeding up Database Applications with Pyxis

ICDE13:Robust Distributed Stream Processing

VLDB12:PIQL: Success-Tolerant Query Processing in the Cloud

VLDB12:SODA: Generating SQL for Business Users

VLDB12:LogBase: A Scalable Log-structured Database System in the Cloud

VLDB12:REX: Recursive, Delta-Based Data-Centric Computation

VLDB12:The Unified Logging Infrastructure for Data Analytics at Twitter

VLDB12:The Vertica Analytic Database: C-Store 7 Years Later

VLDB12:Avatara: OLAP for Web-scale Analytics Products

VLDB12:A Demonstration of DBWipes: Clean as You Query

VLDB12:ASTERIX: An Open Source System for "Big Data" Management and Analysis (Demo)

VLDB12:Model-based Integration of Past & Future in TimeTravel

SIGMOD12:NoDB: Efficient Query Execution on Raw Data Files

SIGMOD12:Amazon DynamoDB: A Seamlessly Scalable Non-Relational Datastore

SIGMOD12:Efficient Transaction Processing in SAP HANA Database--The End of a Column Store Myth

SIGMOD12:Walnut: A Unified Cloud Object Store

SIGMOD12:F1-The Fault-Tolerant Distributed RDBMS Supporting Google's Ad Business

SIGMOD12:Oracle In-Database Hadoop: When MapReduce Meets RDBMS

SIGMOD12:Optimizing Analytic Data Flows for Multiple Execution Engines

ICDE12:BestPeer++: A Peer-to-Peer Based Large-Scale Data Processing Platform

ICDE12:Vectorwise: A Vectorized Analytical DBMS

ICDE12:Earlybird: Real-Time Search at Twitter

ICDE12:Data Infrastructure at LinkedIn

VLDB11:Using Paxos to Build a Scalable, Consistent, and Highly Available Datastore

VLDB11:Online Expansion of Large-Scale Data Warehouses

VLDB11:Profiling, What-if Analysis, and Cost-based Optimization of MapReduce Programs

SIGMOD11:Efficient Processing of Data Warehousing Queries in a Split Execution Environment

ICDE11:Hyrax: A Flexible and Extensible Foundation for Data-Intensive Computing

Time Series

VLDB15:YADING: Fast Clustering of Large-Scale Time Series Data

VLDB15:General Incremental Sliding-Window Aggregation

VLDB15:Efficient Processing of Window Functions in Analytical SQL Queries

VLDB15:Gorilla: A Fast, Scalable, In-Memory Time Series Database

VLDB13:Comprehensive and Interactive Temporal Query Processing with SAP HANA

VLDB13:Storing and Processing Temporal Data in a Main Memory Column Store

Stream

SIGMOD15:Twitter Heron: Stream Processing at Scale

SIGMOD15:Persistent Data Sketching

SIGMOD15:Scalable Distributed Stream Join Processing

SIGMOD15:SCREEN: Stream Data Cleaning under Speed Constraints

SIGMOD15:CE-Storm: Confidential Elastic Processing of Data Streams

SIGMOD15:Quality-Driven Continuous Query Execution over Out-of-Order Data Streams

ICDE15:The Power of Both Choices: Practical Load Balancing for Distributed Stream Processing Engines

ICDE15:Piecewise Linear Approximation of Streaming Time Series Data with Max-error Guarantees

ICDE15:On Historical Diagnosis of Sensor Streams

ICDE15:ChronoStream: Elastic Stateful Stream Computation in the Cloud

ICDE15:Configurable Hardware-based Streaming Architecture using Online Programmable-Blocks

VLDB14:High Performance Stream Query Processing With Correlation-Aware Partitioning

SIGMOD14:Complex Event Analytics: Online Aggregation of Stream Sequence Patterns

SIGMOD14:On Complexity and Optimization of Expensive Queries in Complex Event Processing

SIGMOD14:Storm @Twitter

SIGMOD13:Photon: Fault-tolerant and Scalable Joining of Continuous Data Streams

SIGMOD13:Integrating Scale Out and Fault Tolerance in Stream Processing using Operator State Management

SIGMOD13:Quantiles over Data Streams: An Experimental Study

VLDB12:Sketch-based Querying of Distributed Sliding-Window Data Streams

VLDB12:Spinning Fast Iterative Data Flows

VLDB12:Building User-defined Runtime Adaptation Routines for Stream Processing Applications

VLDB12:MonetDB/DataCell: Online Analytics in a Streaming Column-Store

ICDE12:A General Method for Estimating Correlated Aggregates over a Data Stream

ICDE12:Accuracy-Aware Uncertain Stream Databases

VLDB11:Active Complex Event Processing over Event Streams

VLDB11:Massive Scale-out of Expensive Continuous Queries

SIGMOD11:How Soccer Players Would Do Stream Joins

ICDE11:Memory-Constrained Aggregate Computation over Data Streams

ICDE11:SQPR: Stream Query Planning with Reuse

Interactive / time constraint

VLDB15:JetScope: Reliable and Interactive Analytics at Cloud Scale

VLDB15:Towards Scalable Real-time Analytics: An Architecture for Scale-out of OLxP Workloads

VLDB15:Real-Time Analytical Processing with SQL Server

VLDB15:Real Time Analytics: Algorithms and Systems

SIGMOD15:Analytics in Motion: High Performance Event-Processing AND Real-Time Analytics in the Same Database

VLDB14:Scalable Progressive Analytics on Big Data in the Cloud

SIGMOD14:Druid: A Real-time Analytical Data Store

SIGMOD14:OceanRT: Real-Time Analytics over Large Temporal Data

ICDE14:R-Store: A Scalable Distributed System for Supporting Real-time Analytics

ICDE14:Distributed Interactive Cube Exploration

VLDB12:Optimization of Analytic Window Functions

VLDB11:Analytics for the Real-Time Web

VLDB11:UpStream: A Storage-centric Load Management System for Real-time Update Streams

Query Optimization

VLDB15:Query Optimization in Oracle 12c Database In-Memory

SIGMOD15:The Flatter, the Better: Query Compilation Based on the Flattening Transformation

SIGMOD15:An Incremental Anytime Algorithm for Multi-Objective Query Optimization

ICDE15:Automatic Tuning of Bag-of-Tasks Applications

ICDE15:Cache-Oblivious Scheduling of Shared Workloads

VLDB14:Shared Workload Optimization

VLDB14:Code Generation for Efficient Query Processing in Managed Runtimes

VLDB14:Adaptive Range Filters for Cold Data: Avoiding Trips to Siberia

SIGMOD14:Dynamically Optimizing Queries over Large Scale Data Platforms

SIGMOD14:Query Shredding: Efficient Relational Evaluation of Queries over Nested Multisets

SIGMOD14:Approximation Schemes for Many-Objective Query Optimization

SIGMOD14:Parallel In-Situ Data Processing with Speculative Loading

SIGMOD14:Orca: A Modular Query Optimizer Architecture for Big Data

SIGMOD14:Optimizing Queries over Partitioned Tables in MPP Systems

SIGMOD14:Parallel I/O Aware Query Optimization

SIGMOD14:Versatile Optimization of UDF-heavy Data Flows with Sofa

SIGMOD14:Reactive and Proactive Sharing Across Concurrent Analytical Queries

ICDE14:The Vertica Query Optimizer: The Case for Specialized Query Optimizers

ICDE14:Waste Not... Efficient Co-Processing of Relational Data

ICDE14:History-aware Query Optimization with Materialized Intermediate Views

ICDE14:Decorrelation of User Defined Function Invocations in Queries

VLDB13:PAQO: A Preference-Aware Query Optimizer for PostgreSQL

VLDB13:Learning and Intelligent Optimization: one ring to rule them all

VLDB13:Designing Query Optimizers for Big Data problems of the future

VLDB13:Continuous Cloud-Scale Query Optimization and Processing

VLDB13:Just-in-time Compilation for SQL Query Processing

VLDB13:Sharing Data and Work Across Concurrent Analytical Queries

ICDE13:Recycling in Pipelined Query Evaluation

ICDE13:Predicting Query Execution Time: Are Optimizer Cost Models Really Unusable?

ICDE13:Top Down Plan Generation: From Theory to Practice

VLDB12:SharedDB: Killing One Thousand Queries With One Stone

VLDB12:Opening the Black Boxes in Data Flow Optimization

VLDB12:PET: Reducing Database Energy Cost via Query Optimization

SIGMOD12:Holistic Optimization by Prefetching Query Results

SIGMOD12:Query Optimization in Microsoft SQL Server PDW

SIGMOD12:Recurring Job Optimization in Scope

SIGMOD12:Adaptive Optimizations of Recursive Queries in Teradata

SIGMOD12:From X100 to Vectorwise: Opportunities, Challenges and Things Most Researchers Do Not Think About

ICDE12:Learning-based Query Performance Modeling and Prediction

ICDE12:Parametric Plan Caching Using Density-Based Clustering

ICDE12:Optimization of Massive Pattern Queries by Dynamic Configuration Morphing

ICDE12:Three-Level Processing of Multiple Aggregate Continuous Queries

ICDE12:Exploiting Common Subexpressions for Cloud Query Processing

SIGMOD11:Query Optimization Techniques for Partitioned Tables

DB General

VLDB15:In-Memory Performance for Big Data

VLDB15:Smart Drill-Down: A New Data Exploration Operator

SIGMOD15:Locality-aware Partitioning in Parallel Database Systems

SIGMOD15:Rethinking SIMD Vectorization for In-Memory Databases

SIGMOD15:One Loop Does Not Fit All

SIGMOD15:DunceCap: Compiling Worst-Case Optimal Query Plans

VLDB14:Trekking Through Siberia: Managing Cold Data in a Memory-Optimized Database

SIGMOD14:Sloth: Being Lazy is a Virtue (When Issuing Database Queries)

SIGMOD14:Versatile Optimization of UDF-heavy Data Flows with Sofa

SIGMOD14:Palette: Enabling Scalable Analytics for Big-Memory, Multicore Machines

SIGMOD13:DBMS Metrology: Measuring Query Time

SIGMOD13:Reverse Engineering Complex Join Queries

SIGMOD13:Micro Adaptivity in Vectorwise

ICDE13:CPU and Cache Efficient Management of Memory-Resident Databases

ICDE13:Identifying Hot and Cold Data in Main-Memory Databases

SIGMOD12:Advanced Partitioning Techniques for Massively Distributed Computation

SIGMOD12:Efficient External-Memory Bisimulation on DAGs

ICDE12:GSLPI: A Cost-Based Query Progress Indicator

SIGMOD11:Predicting Cost Amortization for Query Services

ICDE11:Interactive SQL Query Suggestion: Making Databases User-Friendly

ICDE11:Predicting In-Memory Database Performance for Automating Cluster Management Tasks

Approximate Evaluation

VLDB14:Error-bounded Sampling for Analytics on Big Sparse Data

BlinkDB

VLDB14:A Sampling Algebra for Aggregate Estimation

SIGMOD14:The Analytical Bootstrap: a New Method for Fast Error Estimation in Approximate Query Processing

SIGMOD14:ABS: a System for Scalable Approximate Queries with Accuracy Guarantees

VLDB12:Early Accurate Results for Advanced Analytics on MapReduce

VLDB11:Structure-Aware Sampling: Flexible and Accurate Summarization

Cost and Statistics

VLDB15:Uncertainty Aware Query Execution Time Prediction

VLDB15:Multi-Objective Parametric Query Optimization

VLDB15:Join Size Estimation Subject to Filter Conditions

VLDB14:Scalable Discovery of Unique Column Combinations

SIGMOD14:Plan Bouquets: Query Processing without Selectivity Estimation

SIGMOD14:Histograms as a Side Effect of Data Movement for Big Data

SIGMOD14:Exploiting Ordered Dictionaries to Efficiently Construct Histograms with Q-Error Guarantees in SAP HANA

VLDB13:Upper and Lower Bounds on the Cost of a Map-Reduce Computation

VLDB13:Towards Predicting Query Execution Time for Concurrent and Dynamic Database Workloads

SIGMOD13:CS2: A New Database Synopsis for Query Estimation

SIGMOD13:On the Correct and Complete Enumeration of the Core Search Space

VLDB12:A Statistical Approach Towards Robust Progress Estimation

VLDB12:How to Price Shared Optimizations in the Cloud

VLDB12:Robust Estimation of Resource Consumption for SQL Queries using Statistical Techniques

VLDB12:Towards Energy-Efficient Database Cluster Design

VLDB12:Building Wavelet Histograms on Large Data in MapReduce

ICDE12:Load Balancing in MapReduce Based on Scalable Cardinality Estimates

ICDE12:Scalable and Numerically Stable Descriptive Statistics in SystemML

VLDB11:MapReduce Programming and Cost-based Optimization? Crossing this Chasm with Starfish

VLDB11:Profiling, What-if Analysis, and Cost-based Optimization of MapReduce Programs

VLDB11:Storing Matrices on Disk: Theory and Practice Revisited

Storage

SIGMOD15:ByteSlice: Pushing the Envelop of Main Memory Data Processing with a New Storage Layout

SIGMOD15:Telco Churn Prediction with Big Data

ICDE15:Oracle Database In-Memory: A Dual Format In-Memory Database

SIGMOD14:MISO: Souping Up Big Data Query Processing with a Multistore System

SIGMOD14:Durable Write Cache in Flash Memory SSD for Relational and NoSQL Databases

SIGMOD14:Fast Database Restarts at Facebook

SIGMOD14:SpongeFiles: Mitigating Data Skew in MapReduce Using Distributed Memory

SIGMOD14:Leveraging Compression in the Tableau Data Engine

VLDB13:LLAMA: A Cache/Storage Subsystem for Modern Hardware

VLDB12:A Storage Advisor for Hybrid-Store Databases

VLDB12:Relative Lempel-Ziv Factorization for Efficient Storage and Retrieval of Web Collections

ICDE12:Lookup Tables: Fine-Grained Partitioning for Distributed Databases

VLDB11:HYRISE - A Main Memory Hybrid Storage Engine

VLDB11:Column-Oriented Storage Techniques for MapReduce

VLDB11:Compression Aware Physical Database Design

ICDE11:RCFile: A Fast and Space-efficient Data Placement Structure in MapReduce-based Warehouse Systems

Index

VLDB15:Indexing Highly Dynamic Hierarchical Data

VLDB15:BF-Tree: Approximate Tree Indexing

VLDB15:Compressed Spatial Hierarchical Bitmap (cSHB) Indexes for Efficiently Processing Spatial Range Query Workloads

SIGMOD15:Holistic Indexing in Main-memory Column-stores

ICDE15:Smooth Scan: Statistics-Oblivious Access Paths

ICDE15:A Comparison of Adaptive Radix Trees and Hash Tables

ICDE15:High Performance Temporal Indexing on Modern Hardware

VLDB14:The Uncracked Pieces in Database Cracking

VLDB14:Lightweight Indexing of Observational Data in Log-Structured Storage

VLDB14:DGFIndex for Smart Grid: Enhancing Hive with a Cost-Effective Multidimensional Range Index

VLDB14:Indexing HDFS Data in PDW: Splitting the data from the index

SIGMOD14:H2O: A Hands-free Adaptive Store

SIGMOD14:Fine-grained Partitioning for Aggressive Data Skipping

SIGMOD14:Indexing for Interactive Exploration of Big Data Series

SIGMOD14:Indexing on Modern Hardware: Hekaton and Beyond

ICDE14:A Tunable Compression Framework for Bitmap Indices

VLDB13:A Performance Study of Three Disk-based Structures for Indexing and Querying Frequent Itemsets

VLDB13:A Data-adaptive and Dynamic Segmentation Index for Whole Matching on Time Series

VLDB13:Efficient Indexing for Diverse Query Results

SIGMOD13:Column Imprints: A Secondary Index Structure

SIGMOD13:Timeline Index: A Unified Data Structure for Processing Queries on Temporal Data in SAP HANA

SIGMOD13:Enhancements to SQL Server Column Stores

ICDE13:The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases

ICDE13:An Efficient and Compact Indexing Scheme for Large-scale Data Store

VLDB12:Semi-Automatic Index Tuning: Keeping DBAs in the Loop

VLDB12:Stochastic Database Cracking: Towards Robust Adaptive Indexing in Main-Memory Column-Stores

ICDE12:Making Unstructured Data SPARQL Using Semantic Indexing in Oracle Database

VLDB11:CoPhy: A Scalable, Portable, and Interactive Index Advisor for Large Workloads

VLDB11:Workload Driven Index Defragmentation

VLDB11:A Framework for Supporting DBMS-like Indexes in the Cloud

VLDB11:Merging What's Cracked, Cracking What's Merged: Adaptive Indexing in Main-Memory Column-Stores

SIGMOD11:SQL Server Column Store Indexes

ICDE11:Partitioning Techniques for Fine-grained Indexing

View

VLDB15:Stale View Cleaning: Getting Fresh Answers from Stale Materialized Views

SIGMOD15:Utilizing IDs to Accelerate Incremental View Maintenance

SIGMOD14:LINVIEW: Incremental View Maintenance for Complex Analytical Queries

ICDE13:Materialization Strategies in the Vertica Analytic Database: Lessons Learned

VLDB12:DBToaster: Higher-order Delta Processing for Dynamic, Frequently Fresh Views

Aggregation

SIGMOD15:Cache-Efficient Aggregation: Hashing Is Sorting

SIGMOD15:G-OLA: Generalized On-Line Aggregation for Interactive Analysis on Big Data

VLDB11:Online Aggregation for Large MapReduce Jobs

Join

VLDB15:Memory-Efficient Hash Joins

VLDB15:Improving Main Memory Hash Joins on Intel Xeon Phi Processors: An Experimental Approach

SIGMOD15:From Theory to Practice: Efficient Join Query Evaluation in a Parallel Database System

VLDB14:Execution Primitives for Scalable Joins and Aggregations in Map Reduce

VLDB14:Multi-Core, Main-Memory Joins: Sort vs. Hash Revisited

VLDB14:Advanced Join Strategies for Large-Scale Distributed Computation

VLDB14:Interactive Join Query Inference with JIM

VLDB12:Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems

ICDE12:Effective and Robust Pruning for Top-Down Join Enumeration Algorithms

VLDB11:Accelerating Queries with Group-By and Join by Groupjoin

ICDE11:A New, Highly Efficient and Easy To Implement Top-Down Join Enumeration Algorithm (Best Paper)

ICDE11:PrefJoin: An Efficient Preference-aware Join Operator

ETL

VLDB15:Lenses: An On-Demand Approach to ETL

VLDB14:Adaptive Query Processing on RAW Data

VLDB14:Instant Loading for Main Memory Databases

VLDB13:Lazy ETL in Action: ETL Technology Dates Scientific Data

VLDB12:Fast Updates on Read-Optimized Databases Using Multi-Core CPUs

VLDB12:MapReduce-based Dimensional ETL Made Easy

Data Integration

VLDB15:Preference-aware Integration of Temporal Data

VLDB15:Gobblin: Unifying Data Ingestion for Hadoop

SIGMOD14:Characterizing and Selecting Fresh Data Sources

VLDB13:Mosquito: Another One Bites the Data Upload Stream

VLDB13:Big Data Integration

VLDB13:Less is More: Selecting Sources Wisely for Integration

VLDB12:Dedoop: Efficient Deduplication with Hadoop

VLDB12:Entity Resolution: Theory, Practice & Open Challenges

MapReduce Optimization

VLDB15:Shared Execution of Recurring Workloads in MapReduce

VLDB15:Spatial Partitioning Techniques in SpatialHadoop

VLDB15:Tutorial: SQL-on-Hadoop Systems

ICDE15:DualTable: A Hybrid Storage Model for Update Optimization in Hive

ICDE15:HaTen2: Billion-scale Tensor Decompositions

ICDE15:Groupwise Analytics via Adaptive MapReduce

VLDB14:Multi-Query Optimization in MapReduce Framework

VLDB14:Optimization for Iterative Queries on MapReduce

VLDB14:SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures (Hive vs Tez vs Impala)

VLDB14:Understanding Insights into the Basic Structure and Essential Issues of Table Placement Methods in Clusters

SIGMOD14:Anti-Combining For MapReduce

SIGMOD14:Opportunistic Physical Design for Big Data Analytics

VLDB13:XORing Elephants: Novel Erasure Codes for Big Data

VLDB13:Optimization Strategies for A/B Testing on HADOOP

VLDB13:Piranha: Optimizing Short Jobs in Hadoop

ICDE13:SASH: Enabling Continuous Incremental Analytic Workflows on Hadoop

VLDB12:Putting Lipstick on Pig: Enabling Database-style Workflow Provenance

VLDB12:ReStore: Reusing Results of MapReduce Jobs

VLDB12:PerfXplain: Debugging MapReduce Job Performance

VLDB12:Stubby: A Transformation-based Optimizer for MapReduce Workflows

VLDB12:Only Aggressive Elephants are Fast Elephants

VLDB12:Can the Elephants Handle the NoSQL Onslaught?

VLDB12:M3R: Increased Performance for In-Memory Hadoop Jobs

VLDB12:Interactive Analytical Processing in Big Data Systems: A Cross-Industry Study of MapReduce Workloads

VLDB12:Muppet: MapReduce-Style Processing of Fast Data

VLDB12:SkewTune in Action: Mitigating Skew in MapReduce Applications

VLDB12:NoDB in Action: Adaptive Query Processing on Raw Data

VLDB12:Efficient Big Data Processing in Hadoop MapReduce

VLDB12:MapReduce Algorithms for Big Data Analysis

SIGMOD12:SkewTune: Mitigating Skew in MapReduce Applications

ICDE12:Extending Map-Reduce for Efficient Predicate-Based Sampling

VLDB11:Automatic Optimization for MapReduce Programs

VLDB11:CoHadoop: Flexible Data Placement and Its Exploitation in Hadoop

SIGMOD11:Apache Hadoop Goes Realtime at Facebook

SIGMOD11:Nova: Continuous Pig/Hadoop Workflows

SIGMOD11:A Hadoop Based Distributed Loading Approach to Parallel Data Warehouses

DB & Hardware

VLDB15:Resource Bricolage for Parallel Database Systems

VLDB15:A Performance Study of Big Data on Small Nodes

VLDB15:Deployment of Query Plans on Multicores

VLDB15:In-Cache Query Co-Processing on Coupled CPU-GPU Architectures

VLDB15:Databases and Hardware: The Beginning and Sequel of a Beautiful Friendship

VLDB15:SIMD- and Cache-Friendly Algorithm for Sorting an Array of Structures

VLDB15:Scaling Up Concurrent Main-Memory Column-Store Scans: Towards Adaptive NUMA-aware Data and Task Placement

SIGMOD15:JAFAR: Near-Data Processing for Databases

ICDE15:Evolving the Architecture of SQL Server for Modern Hardware Trends

ICDE15:In-Memory BLU Acceleration in IBM’s DB2 and dashDB: Optimized for Modern Workloads and Hardware Architectures

ICDE15:Accelerating Aggregation using Intra-cycle Parallelism

VLDB14:CPU Sharing Techniques for Performance Isolation in Multi-tenant Relational Database-as-a-Service

VLDB14:Write-limited Sorts and Joins for Persistent Memory

VLDB14:When Data Management Systems Meet Approximate Hardware: Challenges and Opportunities

VLDB14:Ibex: An Intelligent Storage Engine with Support for Advanced SQL Off-loading

VLDB14:Concurrent Analytical Query Processing with GPUs

SIGMOD14:Patience is a Virtue: Revisiting Merge and Sort on Modern Processors

SIGMOD14:Morsel-Driven Parallelism: A NUMA-Aware Query Evaluation Framework for the Many-Core Age

SIGMOD14:A Comprehensive Study of Main-Memory Partitioning and its Application to Large-Scale Comparison- and Radix-Sort

SIGMOD14:An Application-Specific Instruction Set for Accelerating Set-Oriented Database Primitives

VLDB13:Revisiting Co-Processing for Hash Joins on the Coupled CPU-GPU Architecture

VLDB13:Hardware-Oblivious Parallelism for In-Memory Column-Stores

VLDB13:The Yin and Yang of Processing Data Warehousing Queries on GPU Devices

VLDB13:Microsoft SQL Server’s Integrated Database Approach for Modern Applications and Hardware

VLDB13:Flexible Query Processor on FPGAs

VLDB13:OmniDB: Towards Portable and Efficient Query Processing on Parallel CPU/GPU Architectures

VLDB13:Why it is time for a HyPE: A Hybrid Query Processing Engine for Efficient GPU Coprocessing in DBMS

SIGMOD13:Query Processing on Smart SSDs: Opportunities and Challenges

ICDE13:Efficient Many-Core Query Execution in Main Memory Column-Stores

ICDE13:Main-Memory Hash Joins on Multi-Core CPUs: Tuning to the Underlying Hardware

VLDB12:hStorage-DB: Heterogeneity-aware Data Management to Exploit the Full Capability of Hybrid Storage Systems

VLDB12:I/O Characteristics of NoSQL Databases

VLDB11:Fast Set Intersection in Memory

VLDB11:Efficiently Compiling Efficient Query Plans for Modern Hardware

SIGMOD11:LazyFTL: A Page-level Flash Translation Layer Optimized for NAND Flash Memory

SIGMOD11:Operation-Aware Buffer Management in Flash-based Systems

SIGMOD11:Design and Evaluation of Main Memory Hash Join Algorithms for Multi-core CPUs

Tutorial

SIGMOD15:Overview of Data Exploration Techniques

SIGMOD15:Data Management in Non-Volatile Memory

SIGMOD14:How to Stop Under-Utilization and Love Multicores

SIGMOD14:Workload Management for Big Data Analytics

SIGMOD13:Data Stream Warehousing

VLDB11:New Frontiers in Business Intelligence

VLDB11:Data is Dead... Without What-if Models

VLDB11:System Co-Design and Data Management for Flash Devices