Replication in Databases and Distributed Systems

1 Uses of Data Replication

Database replication is widely used for fault-tolerance, scalability and performance. The failure of one database replica does not stop the system from working as available replicas can take over the tasks of the failed replica. Scalability can be achieved by distributing the load across all replicas, and adding new replicas should the load increase. Finally, database replication can provide fast local access, even if clients are geographically distributed, if data copies are located close to clients.

The passage above is quoted from: Database Replication (Synthesis Lectures on Data Management), pdf backup

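Data replication has three main uses:
(1) Fault-tolerance: the system as a whole does not stop when one node fails, because the remaining nodes can take over the failed node's work.
(2) Scalability: spreading the load across multiple nodes raises the total load the system can handle.
(3) Performance: clients access a nearby node, which shortens access latency.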

2 Classification of Replica Control Protocols (Four Categories)

Replica control algorithms can be categorized by two parameters: where update transactions are coordinated and when updates are sent to other replicas. In principle, a protocol can be either eager or lazy, and follow a primary copy or update anywhere approach. From there, we can derive four basic categories: eager primary copy, eager update anywhere, lazy primary copy and lazy update anywhere.
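The four categories are simply the cross product of the two parameters. A minimal sketch in Python, where the variable names `WHEN`, `WHERE` and `categories` are illustrative and not taken from the cited book:

```python
from itertools import product

# "When" updates are sent to other replicas.
WHEN = ["eager", "lazy"]
# "Where" update transactions are coordinated.
WHERE = ["primary copy", "update anywhere"]

# Each combination of the two parameters names one basic category.
categories = [f"{when} {where}" for when, where in product(WHEN, WHERE)]

for category in categories:
    print(category)
```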

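From the "where" perspective, protocols divide into:
(1) Write Master (also called Primary Copy): write requests can only be submitted to the "primary node". Reads are unrestricted and can be served by any node. In this setup the client has to distinguish reads from writes, which is commonly known as "read/write splitting".
(2) Update Anywhere (also called Multi Master): write requests can be submitted to "any node". Reads are unrestricted. The roles of the cluster nodes are transparent to the client.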
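The difference between the two approaches shows up in how requests are routed. The following is a toy sketch only: the `Cluster` class, its method names, and the choice of node 0 as primary with round-robin elsewhere are all assumptions made for illustration, not part of any real system described in the sources.

```python
import itertools


class Cluster:
    """Toy cluster: a list of node names; node 0 acts as the primary."""

    def __init__(self, nodes, update_anywhere=False):
        self.nodes = list(nodes)
        self.update_anywhere = update_anywhere
        # Hypothetical load-balancing policy: simple round-robin.
        self._round_robin = itertools.cycle(self.nodes)

    def route_read(self):
        # Reads are unrestricted in both schemes: any node may serve them.
        return next(self._round_robin)

    def route_write(self):
        if self.update_anywhere:
            # Update anywhere: any node may accept the write.
            return next(self._round_robin)
        # Primary copy: every write goes to the single primary node.
        return self.nodes[0]


primary_copy = Cluster(["a", "b", "c"])
print([primary_copy.route_write() for _ in range(3)])
print([primary_copy.route_read() for _ in range(3)])
```

With primary copy the caller must know whether a request is a read or a write before routing it, which is the "read/write splitting" mentioned above. With update anywhere both kinds of request can go through the same path.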

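From the "when" perspective, protocols divide into:
(1) Eager Replication: Detect conflict before propagation, ensures consistency. After a node applies a modification, it must wait until all other nodes have applied it as well before returning success to the client. This gives stronger consistency, but responses are comparatively slow.
(2) Lazy Replication: Propagate changes after commit, ensures maximum performance. Modified data is copied to the other nodes after a delay, so the client does not spend time waiting for replication and responses are fast.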
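The timing difference can be sketched with an in-memory primary and its secondaries. This is a simplified illustration under stated assumptions: the `Replica` and `Primary` classes and the `propagate` method are hypothetical, there is no network, no failure handling, and no conflict detection or atomic commit, all of which a real eager protocol would need.

```python
class Replica:
    """A node holding a full copy of the data as a plain dict."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Primary(Replica):
    """Primary node that replicates writes eagerly or lazily."""

    def __init__(self, name, secondaries, eager):
        super().__init__(name)
        self.secondaries = secondaries
        self.eager = eager
        self.pending = []  # updates not yet propagated (lazy mode only)

    def write(self, key, value):
        self.apply(key, value)
        if self.eager:
            # Eager: every secondary applies the update before the
            # client receives its acknowledgement.
            for replica in self.secondaries:
                replica.apply(key, value)
        else:
            # Lazy: acknowledge immediately and queue the update.
            self.pending.append((key, value))
        return "ok"

    def propagate(self):
        """Ship queued updates to the secondaries (lazy mode)."""
        for key, value in self.pending:
            for replica in self.secondaries:
                replica.apply(key, value)
        self.pending.clear()


secondary = Replica("s1")
lazy_primary = Primary("p", [secondary], eager=False)
lazy_primary.write("x", 1)
print(secondary.data)   # stale until propagate() runs
lazy_primary.propagate()
print(secondary.data)
```

In the lazy case a read from the secondary between `write` and `propagate` sees stale data, which is the consistency cost paid for the faster acknowledgement.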

The pros and cons of these four categories are shown in Figure 1 (taken from Replication in the Wild).

replication_4_pros_cons.jpg

Figure 1: Pros and cons of the four categories of replica control protocols

References:
Database Replication (Synthesis Lectures on Data Management), Chapter 3, Basic Protocols
Understanding Replication in Databases and Distributed Systems
Comparison of database replication techniques based on total order broadcast

2.1 Classification of Common Cluster Systems

Common cluster systems, classified by the replica control algorithm they use, are shown in Figure 2 (taken from Replication in the Wild).

replication_4_categories.jpg

Figure 2: Classification of common cluster systems


Author: cig01

Created: <2015-12-01 Tue 00:00>

Last updated: <2018-03-14 Wed 23:50>
