Architecting for high availability

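What follows is my own summary of an Amazon talk on designing highly available architectures. It contains quite a few classic, practical design principles, so I am keeping it here for future reference. The rest of the post is written in English 🙂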

This article outlines some simple principles and best practices for building a highly available web system. They come from an experience-sharing talk given at QCon London 2013 by an AWS solutions architect, mixed with my own understanding. What follows is just a snapshot for quick review in the future.

 


 

Background

Unexpected traffic spikes happen from time to time. How do we guarantee 99.99999% durability?
 

Goal

Build inherently highly available (HA) and fault-tolerant services
 

Rule

#1. Design for failure

Everything fails all the time
Avoid single points of failure 
Assume everything fails, and work backwards
Your goal: applications should continue to function
 
e.g.
Load balancing facilities: LVS, Nginx
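
To make "assume everything fails, and work backwards" a bit more concrete, here is a minimal Python sketch of client-side failover across redundant backends. It is only an illustration of the idea, not how LVS or Nginx work internally, and the names `call_with_failover`, `broken_backend` and `healthy_backend` are hypothetical.

```python
from typing import Callable, Sequence


def call_with_failover(backends: Sequence[Callable[[], str]]) -> str:
    """Try each redundant backend in order and return the first successful reply."""
    errors = []
    for backend in backends:
        try:
            return backend()
        except Exception as exc:  # any failure just means "try the next one"
            errors.append(exc)
    # Only when every redundant copy has failed does the caller see an error.
    raise RuntimeError(f"all {len(backends)} backends failed: {errors}")


# Hypothetical backends standing in for two redundant servers.
def broken_backend() -> str:
    raise ConnectionError("backend A is down")


def healthy_backend() -> str:
    return "ok from backend B"


result = call_with_failover([broken_backend, healthy_backend])
print(result)  # ok from backend B
```

With a single backend in the list, that backend is a single point of failure. With two or more, the application keeps functioning as long as one of them answers.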
 
 

#2. Multiple availability zones

e.g.
Multiple IDC
 
 

#3. Scaling

Scale up/down capacity
No capacity planning required
Zero administration
Seamless scalability
 
e.g.
MongoDB sharding
 
(I'd say this is a huge challenge, and not every system needs this kind of architecture. What engineers should think about is how to prepare a simple, easy way to scale up later, when the system runs into traffic or storage limits.)
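
As a rough picture of what sharding means, the sketch below routes each key to one of N shards by hashing it. This is my own simplified illustration and not MongoDB's implementation. The function name `shard_for` and the sample keys are made up for the example.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in [0, num_shards) using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then take the modulo.
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical keys; the same key always lands on the same shard.
keys = ["user:1001", "user:1002", "order:77"]
placement = {key: shard_for(key, 4) for key in keys}
print(placement)
```

Note that with plain modulo hashing, changing the number of shards remaps most keys. That is part of why scaling out is hard, and why it helps to decide early on a simple way to split the data.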
 
 
 

#4. Self-healing

Health check + auto scaling = Self-healing
 
e.g.
Load balancing facilities: LVS, Nginx
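
The "health check + auto scaling" equation can be sketched as one small reconciliation step: drop the instances that fail the health check, then launch replacements until the desired capacity is reached again. This is a toy model under my own assumptions. In a real system the `healthy` flag would come from a probe (for example an HTTP check), and `Instance`, `heal` and `launch_replacement` are hypothetical names.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, List


@dataclass
class Instance:
    name: str
    healthy: bool = True  # assumed to be set by some external health probe


def heal(pool: List[Instance], desired: int,
         launch: Callable[[], Instance]) -> List[Instance]:
    """Remove unhealthy instances and launch new ones up to the desired capacity."""
    survivors = [inst for inst in pool if inst.healthy]
    while len(survivors) < desired:
        survivors.append(launch())
    return survivors


ids = count(1)


def launch_replacement() -> Instance:
    # Stand-in for asking the platform to start a fresh server.
    return Instance(name=f"replacement-{next(ids)}")


pool = [Instance("web-1"), Instance("web-2", healthy=False), Instance("web-3")]
pool = heal(pool, desired=3, launch=launch_replacement)
print([inst.name for inst in pool])  # ['web-1', 'web-3', 'replacement-1']
```

Running `heal` on a schedule is what turns a plain health check into self-healing: nobody has to notice that web-2 died.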
 
 

#5. Loose coupling

The more loosely components are coupled, the bigger they scale and the more fault tolerant they become
 
e.g.
ESB, Queue, MQ etc.
CDN for static files. 
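
Here is a small sketch of loose coupling through a queue, using only Python's standard library. The producer just puts jobs on the queue and never calls the consumer directly, so either side can be slowed down, restarted or scaled on its own. An in-process `queue.Queue` stands in for a real MQ here, and the job names are made up.

```python
import queue
import threading
from typing import List


def consumer(jobs: "queue.Queue", results: List[str]) -> None:
    """Process jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work
            break
        results.append(job.upper())  # stand-in for the real processing


jobs: "queue.Queue" = queue.Queue()
results: List[str] = []

worker = threading.Thread(target=consumer, args=(jobs, results))
worker.start()

# The producer only knows about the queue, not about the worker.
for job in ["resize-image", "send-email"]:
    jobs.put(job)
jobs.put(None)

worker.join()
print(results)  # ['RESIZE-IMAGE', 'SEND-EMAIL']
```

If the consumer is slow or briefly down, jobs simply wait in the queue instead of failing the producer, which is where the extra fault tolerance comes from.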
 
 

It’s all about choice

Balance cost & HA

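One last point: the principles are only one side of the story. What matters most is weighing your own implementation and maintenance costs and making the tradeoff. Finding what fits you best is what really counts.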

 
