Scaling Transactional Databases with Strong Guarantees
Database replication is a common mechanism used for scaling performance and improving availability of transactional databases but past approaches have suffered from various issues including limited scalability, performance versus consistency tradeoffs, and requirements for database or application modifications. Hihooi is a replication-based middleware solution that is able to achieve workload scalability, strong consistency guarantees, and elasticity for existing transactional databases at a low cost. A novel replication algorithm enables Hihooi to propagate database modifications asynchronously to all replicas at high speeds, while ensuring that all replicas are consistent. At the same time, a fine-grained routing algorithm is used to load balance incoming transactions to available replicas in a consistent way.
The conventional practice in database replication is to apply the writes serially at the slaves, even though the master processes them in parallel. With increasing writes, however, the lag between the master and a slave node can become significant. Hihooi implements a novel replication algorithm for applying write transactions in parallel at the slaves, while maintaining strong consistency guarantees. The algorithm utilizes Hihooi’s notion of transaction read/write sets. Each transaction 𝑇 will read and/or modify some tables in a database instance, defined as the Table Read Set and the Table Write Set of 𝑇, respectively. Similarly, the Column and Row Read/Write Sets of a transaction 𝑇 denote the columns and rows (based on primary key equality) read/written by 𝑇, respectively. The read/write sets of two transactions can then be used to determine whether the transactions affect the same data items in the database, which in turn can be used to decide when to parallelize their execution. Intuitively, if two write transactions modify two different tables (or different columns/rows of the same table), we can safely execute them in parallel and let them commit in reverse order, without violating any consistency guarantees.
As a middleware system, Hihooi intercepts all incoming transactions and is tasked with routing them to the underlying database engines for execution. Unlike other systems, Hihooi implements a novel routing algorithm that utilizes read/write sets for directing read transactions to Extension DBs, even if they are not consistent with the Primary DB. The key idea is that it is safe to route a read transaction 𝑇 to an Extension DB if the tables (or columns/rows) accessed by 𝑇 will not be modified by the write transactions that have yet to execute on the Extension DB. To achieve this, Hihooi keeps track of the completed transactions that have been applied on each of the Extension DBs along with the transactions that are currently running on the Primary DB. Hence, Hihooi recognizes which tables, columns, or rows are up-to-date on each of the Extension DBs. Next, Hihooi checks which read queries are safe (from a consistency point of view) to execute on which Extension DBs. In the case where multiple Extension DBs can execute an incoming query, Hihooi will perform load balancing and send the query to the least-loaded Extension DB. Hihooi is the first middleware system able to also do this for read queries that are part of multi-statement write transactions. By default, Hihooi supports Global Strong Snapshot Isolation (GSSI). By controlling the replication and routing mechanisms, Hihooi can also offer Weak SI, Replicated SI with Primary Copy, and One-copy Serializability.