Architecting for high availability

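Below is my own summary of an Amazon talk on high-availability architecture design. It contains quite a few classic, practical design principles. I am keeping it for future reference, and the whole post is written in English 🙂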

This article describes some simple principles and best practices for building a highly available web system. It is drawn from an experience-sharing talk given at QCon London 2013 by an Amazon AWS solutions architect, mixed with my own understanding. What follows is just a snapshot for quick review in the future.





Unexpected traffic spikes happen from time to time. How do you guarantee 99.99999% durability?


Build inherently highly available (HA) and fault-tolerant services.


#1. Design for failure

Everything fails all the time
Avoid single points of failure 
Assume everything fails, and work backwards
Your goal: applications should continue to function
Load balancing facilities: LVS, Nginx
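
To make "assume everything fails, and work backwards" concrete, here is a minimal sketch of client-side failover. The names call_with_failover, broken and healthy are hypothetical and only for illustration. A real deployment would more likely put a load balancer such as LVS or Nginx in front of the backends.

```python
from typing import Callable, Sequence


def call_with_failover(backends: Sequence[Callable[[], str]]) -> str:
    """Try each backend in order and return the first successful response."""
    errors = []
    for backend in backends:
        try:
            return backend()
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)
    # Only reached when every backend has failed.
    raise RuntimeError(f"all {len(backends)} backends failed: {errors}")


def broken() -> str:
    # Hypothetical backend that simulates an outage.
    raise ConnectionError("primary is down")


def healthy() -> str:
    # Hypothetical backend that answers normally.
    return "ok from replica"


print(call_with_failover([broken, healthy]))
```

The caller keeps working as long as at least one backend in the list answers, which is the point of avoiding a single point of failure.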

#2. Multiple availability zones

Multiple IDCs (Internet data centers)
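
A rough way to see why multiple zones help: if each zone is up with probability a and zones fail independently, the chance that at least one is up is 1 - (1 - a)^n. The sketch below just evaluates that formula. The 99% per-zone figure is an assumed number for illustration, not one from the talk, and real zones are never perfectly independent.

```python
def combined_availability(per_zone: float, zones: int) -> float:
    """Probability that at least one of `zones` independent zones is up."""
    return 1.0 - (1.0 - per_zone) ** zones


# Assumed per-zone availability of 99%; compare 1, 2 and 3 zones.
for n in (1, 2, 3):
    print(n, f"{combined_availability(0.99, n):.6f}")
```

Under these assumptions each extra zone adds roughly two more nines, which is why spreading across data centers is such a cheap win compared with hardening a single one.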

#3. Scaling

Scale up/down capacity
No capacity planning required
Zero administration
Seamless scalability
MongoDB sharding
(I'd say that is a huge challenge. Not every system needs to implement that kind of architecture. What engineers need to think about is how to prepare a simple, easy way to scale up in the future, when the system runs into traffic or storage limits.)
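
One simple way to prepare for that day is to route every record through a single shard-picking function from the start, even while there is only one shard. Below is a minimal sketch of hash-based routing. It is not how MongoDB implements sharding internally, and shard_for is a hypothetical helper name.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index in [0, shard_count) using a stable hash."""
    # md5 is used only because it is stable across processes and machines;
    # Python's built-in hash() is randomized per process for strings.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


for user_id in ("alice", "bob", "carol"):
    print(user_id, "-> shard", shard_for(user_id, 4))
```

The trade-off of plain modulo routing is that changing shard_count remaps most keys, so it suits systems that reshard rarely. Consistent hashing or a lookup table are common alternatives when shards are added often.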

#4. Self-healing

Health check + auto scaling = Self-healing
Load balancing facilities: LVS, Nginx
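
The "health check + auto scaling" equation can be sketched as a reconcile step: drop whatever fails the health check, then launch replacements until the desired capacity is back. Everything here (Instance, launch_instance, reconcile) is a hypothetical in-memory stand-in, not a real cloud API.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, List


@dataclass
class Instance:
    name: str
    healthy: bool = True


_ids = count(1)


def launch_instance() -> Instance:
    """Hypothetical stand-in for a cloud API call that starts a server."""
    return Instance(name=f"web-{next(_ids)}")


def reconcile(fleet: List[Instance], desired: int,
              is_healthy: Callable[[Instance], bool]) -> List[Instance]:
    """Remove instances failing the health check, then refill to `desired`."""
    survivors = [inst for inst in fleet if is_healthy(inst)]
    while len(survivors) < desired:
        survivors.append(launch_instance())
    return survivors


fleet = [launch_instance() for _ in range(3)]   # web-1, web-2, web-3
fleet[1].healthy = False                        # simulate web-2 failing
fleet = reconcile(fleet, desired=3, is_healthy=lambda inst: inst.healthy)
print([inst.name for inst in fleet])
```

Running reconcile on a timer, with a real health probe in place of the lambda, gives the self-healing behaviour: the fleet drifts back to the desired size without anyone being paged.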

#5. Loose coupling

The more loosely components are coupled, the bigger they scale and the more fault tolerant they get
ESB, queues, MQ, etc.
CDN for static files. 
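
As a small illustration of decoupling through a queue, the sketch below uses Python's standard-library queue.Queue as an in-process stand-in for a real message queue. The job names and the worker function are made up for the example.

```python
import queue
import threading


def worker(jobs: queue.Queue, results: list) -> None:
    """Consume jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:          # sentinel: no more work
            jobs.task_done()
            break
        results.append(job.upper())   # pretend to process the job
        jobs.task_done()


jobs: queue.Queue = queue.Queue()
results: list = []
consumer = threading.Thread(target=worker, args=(jobs, results))
consumer.start()

# The producer only knows about the queue, not about the consumer.
for name in ["resize-image", "send-email"]:
    jobs.put(name)
jobs.put(None)

jobs.join()        # wait until every queued item has been processed
consumer.join()
print(results)
```

Because the producer never calls the consumer directly, either side can be restarted, slowed down or scaled out independently, and the queue absorbs the difference.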

It’s all about choice

Balance cost & HA


