huachaohuang/awesome-dbdev

Awesome Database Development

Database development is interesting and challenging. There are always interesting things to learn and challenging problems to solve. You need to get a lot of things right to build a reliable, high-performance database, and it takes time, a lot of time, to think and practice. I have been working on databases for ten years. However, as the proverb goes, the more I know, the more I realize I don't know. So I collect the database development materials I have read here and review them from time to time. I hope they will also be helpful to those who share my interests.

Storage Device

Media

Interface

Operating System

Kernel

File system

  • ext4 Data Structures and Algorithms

  • The Design and Implementation of a Log-Structured File System (1991)

    This paper presents a new technique for disk storage management called a log-structured file system. A log-structured file system writes all modifications to disk sequentially in a log-like structure, thereby speeding up both file writing and crash recovery.

  • SFS: Random Write Considered Harmful in Solid State Drives (FAST, 2012)

    In this paper, we propose a new file system for SSDs, SFS. First, SFS exploits the maximum write bandwidth of SSD by taking a log-structured approach. SFS transforms all random writes at file system level to sequential ones at SSD level. Second, SFS takes a new data grouping strategy on writing, instead of the existing data separation strategy on segment cleaning. It puts the data blocks with similar update likelihood into the same segment. This minimizes the inevitable segment cleaning overhead in any log-structured file system by allowing the segments to form a sharp bimodal distribution of segment utilization.
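The log-structured idea described in the two entries above can be illustrated with a toy key-value store. This is a minimal sketch, not code from either paper: the class name `LogStructuredStore`, the record layout, and the in-memory `bytearray` standing in for a disk are all assumptions made for illustration. It shows only the core property that every modification is appended sequentially and that recovery is a single scan of the log; segment cleaning and data grouping are not modeled.

```python
import struct


class LogStructuredStore:
    """Toy append-only store (hypothetical, for illustration only)."""

    # Assumed record layout: key length, value length, key bytes, value bytes.
    HEADER = struct.Struct("<II")

    def __init__(self, log=b""):
        self.log = bytearray(log)  # stands in for the on-disk log
        self.index = {}            # key -> (value offset, value length)
        self._replay()

    def _replay(self):
        # Crash recovery: scan the log once and keep the latest record per key.
        pos = 0
        while pos < len(self.log):
            key_len, value_len = self.HEADER.unpack_from(self.log, pos)
            pos += self.HEADER.size
            key = bytes(self.log[pos:pos + key_len])
            pos += key_len
            self.index[key] = (pos, value_len)
            pos += value_len

    def put(self, key, value):
        # Updates never overwrite in place; they are appended to the tail.
        self.log += self.HEADER.pack(len(key), len(value)) + key
        self.index[key] = (len(self.log), len(value))
        self.log += value

    def get(self, key):
        offset, length = self.index[key]
        return bytes(self.log[offset:offset + length])
```

Overwriting a key leaves the old record in the log as garbage, which is why real log-structured systems need the segment cleaning that SFS tries to make cheaper.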

Modern hardware

  • What Every Programmer Should Know About Memory (2007)

    This paper explains the structure of memory subsystems in use on modern commodity hardware, illustrating why CPU caches were developed, how they work, and what programs should do to achieve optimal performance by utilizing them.

  • What Every Systems Programmer Should Know About Concurrency (2018)

    Seasoned programmers are familiar with tools like mutexes, semaphores, and condition variables. But what makes them work? How do we write concurrent code when we can’t use them, like when we’re working below the operating system in an embedded environment, or when we can’t block due to hard time constraints? And since your system transforms your code into things you didn’t write, running in orders you never asked for, how do multithreaded programs work at all? Concurrency — especially on modern hardware — is a complicated and unintuitive topic, but let’s try to cover some fundamentals.

  • Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask (2013)

    This paper presents the most exhaustive study of synchronization to date. We span multiple layers, from hardware cache-coherence protocols up to high-level concurrent software. We do so on different types of architectures, from single-socket – uniform and non-uniform – to multi-socket – directory and broadcast-based – many-cores.

Storage virtualization

Storage Engine

SQL

Transaction

Distributed Algorithm

Theorem

Papers

Links

Consensus

Papers

  • Paxos Made Simple (Lamport, 2001)

    The Paxos algorithm, when presented in plain English, is very simple.

  • Consensus on Transaction Commit (2004)

    This paper presents the Paxos Commit algorithm. Paxos Commit runs a Paxos consensus algorithm on the commit/abort decision of each participant to obtain a transaction commit protocol that uses 2F + 1 coordinators and makes progress if at least F + 1 of them are working properly.

  • Paxos Made Live - An Engineering Perspective (PODC, 2007)

    This paper presents the experience of building Chubby, a fault-tolerant storage system using the Paxos consensus algorithm.

  • There Is More Consensus in Egalitarian Parliaments (SOSP, 2013)

    This paper presents the design and implementation of Egalitarian Paxos (EPaxos), a new distributed consensus algorithm based on Paxos that achieves uniform load balancing across all replicas.

  • Paxos Quorum Leases: Fast Reads Without Sacrificing Writes (SoCC, 2014)

    This paper presents quorum leases, a technique that allows Paxos-based systems to perform consistent local reads on multiple replicas.

  • In Search of an Understandable Consensus Algorithm (USENIX ATC, 2014)

    This paper presents Raft, a consensus algorithm for managing a replicated log. Raft produces a result equivalent to Paxos, and it is as efficient as Paxos, but its structure is different from Paxos. Raft is more understandable than Paxos and also provides a better foundation for building practical systems.
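The "2F + 1 coordinators, progress with F + 1" figures in the Paxos Commit entry above follow from majority-quorum arithmetic, which Paxos and Raft also rely on. The sketch below is an illustration, not code from any of the papers. The function names are hypothetical, and the intersection check is a brute-force enumeration that is only practical for small clusters.

```python
from itertools import combinations


def nodes_needed(f):
    """Cluster size needed to tolerate f failed nodes with majority quorums."""
    return 2 * f + 1


def quorum_size(n):
    """Smallest majority of an n-node cluster."""
    return n // 2 + 1


def quorums_always_intersect(n):
    """Check by enumeration that any two majorities share at least one node."""
    quorums = [set(q) for q in combinations(range(n), quorum_size(n))]
    return all(a & b for a in quorums for b in quorums)


if __name__ == "__main__":
    for f in range(1, 4):
        n = nodes_needed(f)
        print(f, n, quorum_size(n), quorums_always_intersect(n))
```

With 2F + 1 nodes, a majority is exactly F + 1, so the cluster keeps making progress after F failures. Any two majorities overlap in at least one node, which is what lets a later decision observe an earlier one.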

Consistency

Papers

Links

Replication

Papers

Distributed System

Papers

OLTP Database

Papers

Links

OLAP Database

Papers

Books

Miscellaneous

Papers

Books

Links
