sujithjay / data-readings

Reading List in Data Systems

A list of papers, articles, and online resources I have found essential to understanding data-intensive systems and building new ones. The list is curated and maintained by Sujith Jay Nair (@sujithjay). If you think a paper belongs on this list, please submit a pull request; I will add it once I have read the paper. Please make sure the paper's subject matter concerns either i) understanding data systems, or ii) building data systems.

Data systems are defined to include:

  • Database systems
  • Data processing systems

This list is inspired by Reynold Xin's list on Database Readings, and is a work in progress.

Table of Contents

  1. Consistency and Consensus
  2. Query Processing
  3. State and Stream
  4. Database Design

Consistency and Consensus

Query Processing

State and Stream

  • Data in Flight (2010): Introduces a model of streams as a superset of the relational model. Streams add a notion of time (processing time, IMO) to the relational model. I explore a similar idea in this post. In a relational table, the data is persistent and the query is transient; in a stream, the query is persistent and the data is transient.
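
The table/stream duality in the note above can be illustrated with a minimal sketch. This is not code from the paper: the names `Table`, `Stream`, `register`, and `emit` are hypothetical, and processing time is modelled here as a simple arrival counter rather than a wall clock.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


class Table:
    """Relational side: rows persist; a query runs once and is then forgotten."""

    def __init__(self) -> None:
        self.rows: List[Row] = []

    def insert(self, row: Row) -> None:
        self.rows.append(row)

    def query(self, predicate: Predicate) -> List[Row]:
        # The query is transient: it is evaluated against stored data right now.
        return [row for row in self.rows if predicate(row)]


class Stream:
    """Stream side: queries persist; each row is seen once and is not stored."""

    def __init__(self) -> None:
        self.queries: List[Tuple[Predicate, List[Tuple[int, Row]]]] = []
        self.clock = 0  # assumption: processing time as a logical arrival counter

    def register(self, predicate: Predicate) -> List[Tuple[int, Row]]:
        # The query is persistent: it stays registered and sees all later rows.
        results: List[Tuple[int, Row]] = []
        self.queries.append((predicate, results))
        return results

    def emit(self, row: Row) -> None:
        # The data is transient: the stream keeps no copy of the row itself;
        # only matching rows land in each consumer's result list.
        self.clock += 1
        for predicate, results in self.queries:
            if predicate(row):
                results.append((self.clock, row))


# Usage: the same predicate, asked once of a table and standing on a stream.
table = Table()
table.insert({"v": 1})
table.insert({"v": 5})
snapshot = table.query(lambda r: r["v"] > 2)  # answers from data stored so far

stream = Stream()
matches = stream.register(lambda r: r["v"] > 2)
stream.emit({"v": 1})
stream.emit({"v": 5})  # appended to `matches` with processing time 2
```

In the table case, a row inserted after `query` returns is invisible to that query. In the stream case, a row emitted before `register` is invisible to the standing query.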

Database Design

License: MIT License