Architecting for high availability
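The following is my summary of an Amazon talk on high availability architecture design. It contains quite a few classic, practical design principles, and I am keeping it here for future reference. The full text is written in English 🙂
This article outlines some simple principles and best practices for building a highly available web system. It is based on an experience-sharing talk on AWS solution architecture given by Amazon at QCon London 2013, mixed with my own understanding. What follows is just a snapshot for quick review in the future.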
Background
Unexpected spikes happen from time to time. How do we guarantee 99.99999% durability?
Goal
Build inherently highly available (HA) and fault-tolerant services
Rules
#1. Design for failure
Everything fails all the time
Avoid single points of failure
Assume everything fails, and work backwards
Your goal: applications should continue to function
e.g.
Load balancing facilities: LVS, Nginx
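To make "assume everything fails, and work backwards" a bit more concrete, here is a minimal sketch of client-side failover across redundant backends. It is not how LVS or Nginx work internally; the names `call_with_failover`, `broken_backend` and `healthy_backend` are hypothetical and exist only for illustration.

```python
from typing import Callable, Sequence


class AllBackendsFailed(Exception):
    """Raised when every redundant backend has failed."""


def call_with_failover(backends: Sequence[Callable[[], str]]) -> str:
    """Try each redundant backend in order and return the first success."""
    errors = []
    for backend in backends:
        try:
            return backend()
        except Exception as exc:  # assumption: any exception means the backend is down
            errors.append(exc)
    raise AllBackendsFailed(f"{len(errors)} backend(s) failed")


def broken_backend() -> str:
    # Hypothetical backend that is always down.
    raise ConnectionError("backend is down")


def healthy_backend() -> str:
    # Hypothetical backend that always answers.
    return "ok"
```

As long as at least one backend in the list is healthy, the caller keeps working, which is the point of avoiding a single point of failure.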
#2. Multiple availability zones
e.g.
Multiple IDCs (Internet data centers)
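A rough sketch of the idea, assuming a simple round-robin placement: spread instances evenly over zones so that losing one zone takes out only part of the fleet. The function name `spread_across_zones` and the zone and instance names are made up for this example.

```python
from typing import Dict, List


def spread_across_zones(instances: List[str], zones: List[str]) -> Dict[str, str]:
    """Assign instances to zones round-robin so no single zone holds them all."""
    if not zones:
        raise ValueError("at least one zone is required")
    return {name: zones[i % len(zones)] for i, name in enumerate(instances)}
```

For example, `spread_across_zones(["web-1", "web-2", "web-3", "web-4"], ["zone-a", "zone-b"])` puts two instances in each zone.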
#3. Scaling
Scale up/down capacity
No capacity planning required
Zero administration
Seamless scalability
e.g.
MongoDB sharding
(I'd say this is a huge challenge. Not every system needs to implement this kind of architecture; what engineers need to think about is how to prepare a simple, easy way to scale up later, when the system runs into traffic or storage limits.)
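One simple way to prepare for that, sketched under my own assumptions: keep all routing behind a single function that maps a key to a shard. This is plain hash-modulo routing, not MongoDB's actual sharding mechanism, and `shard_for` is a hypothetical name.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index deterministically using a stable hash."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # md5 is used only as a stable, well-distributed hash, not for security.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count
```

A system can start with `shard_count = 1` and still call this function everywhere, so the call sites do not change when more shards are added later. Note that plain modulo routing remaps most keys when `shard_count` changes, so data has to be migrated at that point.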
#4. Self-healing
Health check + auto scaling = Self-healing
e.g.
Load balancing facilities: LVS, Nginx
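The equation "health check + auto scaling = self-healing" can be sketched as a small reconcile step: drop the instances that fail the health check, then launch replacements until the fleet is back at its desired size. This is only an illustration; `Instance`, `reconcile` and `launch_instance` are hypothetical names, and a real system would probe instances over the network instead of reading a flag.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Instance:
    name: str
    healthy: bool = True  # assumption: set by some external health check


_replacement_ids = itertools.count(1)


def launch_instance() -> Instance:
    """Hypothetical launcher; a real one would call a cloud or cluster API."""
    return Instance(name=f"replacement-{next(_replacement_ids)}")


def reconcile(fleet: List[Instance], desired: int,
              launch: Callable[[], Instance]) -> List[Instance]:
    """Remove unhealthy instances, then top the fleet back up to `desired`."""
    survivors = [inst for inst in fleet if inst.healthy]  # health check
    while len(survivors) < desired:                       # scaling back up
        survivors.append(launch())
    return survivors
```

Running `reconcile` periodically gives the self-healing behavior: a failed instance is replaced without anyone stepping in.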
#5. Loose coupling
The more loosely components are coupled, the bigger they scale and the more fault tolerant they get
e.g.
ESB, Queue, MQ etc.
CDN for static files.
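As a minimal sketch of loose coupling through a queue, the producer and consumer below share nothing but the queue itself. It uses Python's in-process `queue.Queue` as a stand-in for a real message queue; `producer`, `consumer` and `run_pipeline` are hypothetical names, and the upper-casing is a placeholder for real work.

```python
import queue
import threading
from typing import List


def producer(q: queue.Queue, messages: List[str]) -> None:
    """Publish messages without knowing who consumes them."""
    for message in messages:
        q.put(message)
    q.put(None)  # sentinel: tells the consumer there is no more work


def consumer(q: queue.Queue, processed: List[str]) -> None:
    """Process messages without knowing who produced them."""
    while True:
        message = q.get()
        if message is None:
            break
        processed.append(message.upper())  # placeholder for real processing


def run_pipeline(messages: List[str]) -> List[str]:
    q: queue.Queue = queue.Queue(maxsize=2)  # small buffer between both sides
    processed: List[str] = []
    worker = threading.Thread(target=consumer, args=(q, processed))
    worker.start()
    producer(q, messages)
    worker.join()
    return processed
```

Because the two sides only agree on the message format, either one can be scaled or replaced independently. With a real queue service, messages would also wait in the queue while a consumer is down, which this in-process stand-in does not show.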
It’s all about choice
Balance cost & HA
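Finally, theory and principles are only one side of the story. What matters most is weighing the trade-offs against your own implementation and maintenance costs; finding what fits you best is what really counts.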