Distributed System I
整理DS笔记,突然想给Tag换个封面,去照相迎面碰见男神Daniel❤️ 开心~ 现在去请还欠一顿饭的by腿吃饭(/ω\) 待会和Katrina女神健身~假期就是这么美滋滋\(//∇//)/
第一部分主要侧重于基于分布式系统和Go语言的同步。Daniel上课手舞足蹈真可爱 /(//∇//)/❤
Classical Synchronization & Go Style Concurrency
Concurrency 一致性
Concurrency是DS的核心之一,因为在访问共享资源时要保证安全和多路传输(safe/multiplexed)。
Single Node 模式:
一些相关概念:
- Critical Section 临界区: piece of code accessing a shared resource 访问共用资源的程序片段
- Race Condition 竞争: Multiple threads of execution enter CS at the same time, update shared resource, leading to undesirable outcome 多线程同时进入临界区更新共享资源导致不可预测的结果
- Indeterminate Program: one or more Race Conditions, output ofprogram depending on ordering, non deterministic 因存在竞争输出不确定的程序(比如我的Prj一开始总是惨不忍睹)
Solution: Mutual Exclusion 互斥锁
Desired Properties: Correctness, Efficiency, Fairness …我渣翻它做什么
例子:create thread safe FIFO queue
b.Init():
Initialize values
b.Insert(x)
Insert item into queue
b.Remove()
Block until queue not empty (if necessary) Return element at head of queue
b.Flush():
Clear queue
Emmm… 觉得自己513学的略渣
版本1: mutex
b.Init():
b.sb = NewBuf()
b.mutex = 1
b.Insert(x):
b.mutex.lock()
b.sb.Insert(x)
b.mutex.unlock()
b.Remove():
b.mutex.lock()
x = b.sb.Remove() // 如果Remove在空buffer时调用?
b.mutex.unlock()
return x
b.Flush():
b.mutex.lock()
b.sb.Flush()
b.mutex.unlock()
...
b.Remove():
b.mutex.lock()
b.items.P() // items--, if items < 0 wait
x = b.sb.Remove()
b.mutex.unlock()
return x
...
空buffer调用Remove,调用P,会产生死锁
版本2: Candition Variables 条件变量
- cvars provide a sync point, one thread suspended until activated by another
- always associated with mutex
- Wait() and Signal() operations defined with cvars
cvar.Wait(): Must be called after locking mutex. Atomically: release mutex & suspend operation When resume, lock mutex (but maybe not right away) cvar.Signal(): If no thread suspended, then NO-OP Wake up (at least) one suspended thread.
b.Init():
b.sb = NewBuf()
b.mutex = 1
b.cvar = NewCond(b.mutex)
b.Insert(x):
b.mutex.lock()
b.sb.Insert(x)
b.sb.Signal()
b.mutex.unlock()
b.Remove():
b.mutex.lock()
while b.sb.Empty() {
b.cvar.wait()
}
x = b.sb.Remove()
b.mutex.unlock()
return x
b.Flush():
b.mutex.lock()
b.sb.Flush()
b.mutex.unlock()
就是注意Remove里要用while而不是if来检查当前的状态,当多个线程访问buffer时while会保证原子性,而不会发生一个release lock导致了多个retake lock出现,唤醒多个wait,造成错误。
Semaphores VS. Mutex
Mutexes: used to protect the shared resources from concurrent access. For example, Readers Writers problem.
Semaphores: used to indicate occurrence of an event to other processes. For example, producer consumer problem.
Mutex = Binary Semaphores
Concurrency VS. Parallelism
Concurrency is not parallelism, although it enables parallelism.
Program can still be concurrent butnot parallel for one processor.
*Parallelism is special case, which means different cases run in same time, while concurrency is more general, with or without parallelism, means safely sharing information multiprocesses.
Go Concurrency
- Channels
- GoRoutines
Time Synchronziation
这一节主要讲时间同步,尤其global的时间同步非常重要。介绍了一些时间同步的算法,其中最重要的是Cristian和Lamport Total.
一些概念:
Skew 时钟偏移: difference between the times on two clocks.
Clock drift rate 时钟漂移率:the difference per unit of time from some ideal referenceclock.
解决方法: synchronize clocks / ensure consistent clocks
Time synchronization techniques
对于网络通信,信息延迟是随机并且不可靠(丢包)的。对于传输延迟最多为D,发送者发送本地时间T, 接受者设置其本地时钟为T+D/2, 同步错误最大为D/2. 如果是完美的networks, 传输延迟 = d, 设置接收的本地时间为T+d/2, 误差为0.
Cristian’s Time Sync
- Process p requests time in mr and receives t in mt from S
- p sets its clock to t + RTT/2
- Accuracy ± (RTT/2 - min)
- min is estimated minimum one way delay
- the time by S’s clock when mt arrives is in the range [t+min, t + RTT - min]
Cristian算法可以保证error的范围,因此很重要。
Berkeley algorithm
所有networks内时钟取均值
Network Time Protocal (NTP) 网络时间协定
- Uses a hierarchy of time servers
- Synchronization similar to Cristian’s algorithm
- Accuracy: Local ~1ms, Global ~10ms
- All modes use UDP
- RTT = wait_time_client - server_proc_time
- Offset = ((offset + delay) + (offset - delay))/2
Lamport Clocks
Lamport Clock
对每个线程以及多线程间的通信保证时间戳单调递增,两个规则:增加,取max
- Li is incremented by 1 before each event at process pi
- When pj receives (m,t) it sets Lj := max(Lj, t) and applies rule 1 before timestamping the event receive (m).
- Causal ordering, e -> e’ implies L(e)<L(e’), converse is not true - one way
- L(b)>L(e) but b || e
Total-order Lamport clocks
多线程的时钟同步算法,唯一的区别是全局的时钟计算加入了pid。
L(e) = M * Li(e) + pid, M = maximum number of processes
Vector Clocks
- L(e) < L(e’) does not imply e happened before e’
- Exact causality, V(e) < V(e’) if and only if e → e’
- V(e) [c1, c2 …, cn], ci = # events in process i that causally precede e
- e -> e’ implies V(e)<V(e’). The converse is also true
- c || e (parallel) because neither V(c) <= V(e) nor V(e) <= V(c)
总结: 没有总结。