Scalability
Common strategies
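Scaling strategies fall into two categories:
- Vertical scaling (scaling up): moving to a more powerful machine
- Horizontal scaling (scaling out): distributing the load across multiple smaller machines
A single-node service is simpler, but high-end machines can be very expensive, so real-world systems usually combine both approaches.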
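Scaling can be triggered in two ways:
- Automatic scaling: an elastic system automatically adds computing resources when it detects a load increase. This suits load that is highly unpredictable.
- Manual scaling: a human analyzes capacity and decides to add or remove machines. This is simpler and brings fewer operational surprises.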
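To make the elastic case concrete, here is a minimal sketch of the decision an autoscaler might make on each evaluation cycle. It assumes a simple proportional rule (scale the node count by the ratio of observed to target utilization, then clamp it); the function name, the 60% target, and the node limits are all hypothetical choices for illustration, not taken from the book.

```python
def desired_nodes(current_nodes: int,
                  observed_util_pct: int,
                  target_util_pct: int = 60,
                  min_nodes: int = 1,
                  max_nodes: int = 20) -> int:
    """Return how many nodes a proportional autoscaling rule would ask for.

    Utilization is given as an integer percentage to avoid floating-point
    rounding surprises when taking the ceiling.
    """
    if current_nodes < 1:
        raise ValueError("current_nodes must be at least 1")
    if not 0 < target_util_pct <= 100:
        raise ValueError("target_util_pct must be in (0, 100]")
    if observed_util_pct < 0:
        raise ValueError("observed_util_pct must be non-negative")

    # Total work currently being done, in "node-percent" units.
    total_load = current_nodes * observed_util_pct
    # Integer ceiling division: the smallest node count that keeps
    # per-node utilization at or below the target.
    raw = (total_load + target_util_pct - 1) // target_util_pct
    # Clamp to the configured bounds (hypothetical limits).
    return max(min_nodes, min(max_nodes, raw))


if __name__ == "__main__":
    print(desired_nodes(4, 90))   # load rose: scale out
    print(desired_nodes(4, 30))   # load fell: scale in
```

In a manually scaled system, a person would make the same calculation from capacity reports and apply the change by hand, which is slower but easier to reason about.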
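Deploying stateless services across multiple machines is fairly straightforward, but taking a stateful service from a single node to a distributed setup introduces a lot of additional complexity. For this reason, databases have typically been scaled up on a single node until scaling cost or availability requirements forced them to become distributed.[1]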
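The difference between the two cases can be illustrated with a small sketch. The handler, the in-memory counter, and the round-robin dispatcher below are hypothetical stand-ins, assumed only for illustration: a stateless handler gives the same answer on any replica, while a service that keeps state in process memory gives different answers once requests are spread over several replicas.

```python
from itertools import cycle


def stateless_handler(request: dict) -> int:
    # The result depends only on the request itself,
    # so any replica can serve it interchangeably.
    return request["a"] + request["b"]


class StatefulCounter:
    """Hypothetical service that keeps per-user visit counts in memory."""

    def __init__(self) -> None:
        self.visits: dict[str, int] = {}

    def handle(self, user: str) -> int:
        self.visits[user] = self.visits.get(user, 0) + 1
        return self.visits[user]


def run_round_robin(replicas: list[StatefulCounter], users: list[str]) -> list[int]:
    # Send each request to the next replica in turn, like a naive load balancer.
    rr = cycle(replicas)
    return [next(rr).handle(user) for user in users]


if __name__ == "__main__":
    requests = ["alice"] * 4
    # One node: counts are consistent.
    print(run_round_robin([StatefulCounter()], requests))
    # Two nodes: each replica only sees half of alice's visits.
    print(run_round_robin([StatefulCounter(), StatefulCounter()], requests))
```

Fixing the second case requires moving the state somewhere shared, routing each user to a fixed node, or replicating updates between nodes, and each option adds the kind of complexity the paragraph above refers to.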
- Designing Data-Intensive Applications
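There is no one-size-fits-all scalable architecture. A good scalable architecture is built around assumptions about the load parameters: which operations are common, and which are rare?
For an early-stage product, the ability to iterate quickly matters far more than scalability. In particular, designing a scalable architecture before the actual load parameters can be known often backfires.[1:1]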
- Designing Data-Intensive Applications
Scalability is the term we use to describe a system’s ability to cope with increased load. Note, however, that it is not a one-dimensional label that we can attach to a system: it is meaningless to say “X is scalable” or “Y doesn’t scale.” Rather, discussing scalability means considering questions like “If the system grows in a particular way, what are our options for coping with the growth?” and “How can we add computing resources to handle the additional load?”[2]
People often talk of a dichotomy between scaling up (vertical scaling, moving to a more powerful machine) and scaling out (horizontal scaling, distributing the load across multiple smaller machines). Distributing load across multiple machines is also known as a shared-nothing architecture. A system that can run on a single machine is often simpler, but high-end machines can become very expensive, so very intensive workloads often can’t avoid scaling out. In reality, good architectures usually involve a pragmatic mixture of approaches: for example, using several fairly powerful machines can still be simpler and cheaper than a large number of small virtual machines.[3]
Some systems are elastic, meaning that they can automatically add computing resources when they detect a load increase, whereas other systems are scaled manually (a human analyzes the capacity and decides to add more machines to the system). An elastic system can be useful if load is highly unpredictable, but manually scaled systems are simpler and may have fewer operational surprises.[3:1]
The architecture of systems that operate at large scale is usually highly specific to the application—there is no such thing as a generic, one-size-fits-all scalable architecture (informally known as magic scaling sauce).
An architecture that scales well for a particular application is built around assumptions of which operations will be common and which will be rare—the load parameters. If those assumptions turn out to be wrong, the engineering effort for scaling is at best wasted, and at worst counterproductive. In an early-stage startup or an unproven product it’s usually more important to be able to iterate quickly on product features than it is to scale to some hypothetical future load.[1:2]