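Performance is the speed at which a system completes a given task at the required quality. Different systems use different performance metrics: for a batch processing system such as Hadoop, throughput matters most; for an online system, response time matters most.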
- Designing Data-Intensive Applications
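Performance requirements are written into SLOs and SLAs. Percentiles are the recommended way to express them, rather than the mean: for example the median, the 95th percentile (p95), the 99th percentile (p99), or even the 99.9th percentile (p999).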
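To see why percentiles are preferred over the mean, here is a minimal sketch that computes both for a made-up set of response times. The `percentile` helper uses the nearest-rank definition, which is only one of several common conventions, and the sample data is hypothetical.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing to limit floating-point rounding in the rank.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds: 98 fast requests, 2 slow ones.
response_times_ms = [20] * 98 + [1500, 3000]

mean_ms = sum(response_times_ms) / len(response_times_ms)
summary = {
    "mean": mean_ms,
    "p50": percentile(response_times_ms, 50),
    "p95": percentile(response_times_ms, 95),
    "p99": percentile(response_times_ms, 99),
    "p999": percentile(response_times_ms, 99.9),
}
print(summary)
```

In this data set the mean lands between the fast and slow groups and describes no actual request, while the median shows what a typical user sees and p99 and p999 expose the slow outliers.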
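Tail latency often shapes the user experience. As system load approaches capacity, requests start to queue and incur queueing delay: a small number of slow requests is enough to hold up the requests behind them, which is known as head-of-line blocking.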
System performance is the amount of useful work done by a system - measured by the production speed of products of a predefined quality.
In a batch processing system such as Hadoop, we usually care about throughput: the number of records we can process per second, or the total time it takes to run a job on a dataset of a certain size. In online systems, what’s usually more important is the service’s response time, that is, the time between a client sending a request and receiving a response.
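The two metrics can be sketched in a few lines. The helper names `throughput` and `measure_response_time` are hypothetical, and `send_request` stands in for whatever client call is being timed.

```python
import time


def throughput(records_processed, elapsed_seconds):
    """Records processed per second for a batch job."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return records_processed / elapsed_seconds


def measure_response_time(send_request):
    """Time one request as the client sees it: from just before sending
    to just after the response is received."""
    start = time.perf_counter()
    response = send_request()
    elapsed_seconds = time.perf_counter() - start
    return response, elapsed_seconds


# A job that processed 1,000,000 records in 50 seconds.
print(throughput(1_000_000, 50))

# Stand-in for a real network call (assumption: any zero-argument callable).
response, seconds = measure_response_time(lambda: "ok")
print(response, seconds)
```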
For example, percentiles are often used in service level objectives (SLOs) and service level agreements (SLAs), contracts that define the expected performance and availability of a service. An SLA may state that the service is considered to be up if it has a median response time of less than 200 ms and a 99th percentile under 1 s (if the response time is longer, it might as well be down), and the service may be required to be up at least 99.9% of the time. These metrics set expectations for clients of the service and allow customers to demand a refund if the SLA is not met.
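The SLA in this example could be checked roughly as follows. This is a sketch under assumptions that are not in the source: response times are grouped into measurement windows, a window counts as "up" when both latency limits hold, and availability is the fraction of windows that are up. All function and constant names are illustrative.

```python
import statistics

MEDIAN_LIMIT_MS = 200          # median response time must be below 200 ms
P99_LIMIT_MS = 1000            # 99th percentile must be below 1 s
REQUIRED_AVAILABILITY = 0.999  # service must be up at least 99.9% of the time


def window_is_up(response_times_ms):
    """A window is 'up' if both latency limits hold (needs >= 2 samples)."""
    median = statistics.median(response_times_ms)
    # quantiles(..., n=100) returns 99 cut points; index 98 is the 99th percentile.
    p99 = statistics.quantiles(response_times_ms, n=100)[98]
    return median < MEDIAN_LIMIT_MS and p99 < P99_LIMIT_MS


def availability(windows):
    """Fraction of measurement windows in which the service was up."""
    up = sum(1 for window in windows if window_is_up(window))
    return up / len(windows)


def sla_met(windows):
    return availability(windows) >= REQUIRED_AVAILABILITY


# Hypothetical data: a healthy window and one where 5% of requests take 2 s.
healthy = [100] * 100
degraded = [100] * 95 + [2000] * 5

print(sla_met([healthy] * 999 + [degraded]))      # one bad window in 1000
print(sla_met([healthy] * 998 + [degraded] * 2))  # two bad windows in 1000
```

Note that the degraded window still has a median of 100 ms; only the 99th percentile reveals the problem, which is why the SLA constrains both.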
Queueing delays often account for a large part of the response time at high percentiles. As a server can only process a small number of things in parallel (limited, for example, by its number of CPU cores), it only takes a small number of slow requests to hold up the processing of subsequent requests—an effect sometimes known as head-of-line blocking. Even if those subsequent requests are fast to process on the server, the client will see a slow overall response time due to the time waiting for the prior request to complete. Due to this effect, it is important to measure response times on the client side.
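Head-of-line blocking can be illustrated with a small simulation. The sketch below assumes the simplest possible case, a single worker serving requests in FIFO order; real servers handle a few requests in parallel, but the effect is the same once all workers are busy. The request data is made up.

```python
def simulate_fifo(requests):
    """Simulate one worker serving requests in arrival order.

    requests: list of (arrival_ms, service_ms) tuples sorted by arrival time.
    Returns the response time each client observes (queueing + service).
    """
    server_free_at = 0
    response_times = []
    for arrival_ms, service_ms in requests:
        start = max(arrival_ms, server_free_at)  # wait if the worker is busy
        finish = start + service_ms
        server_free_at = finish
        response_times.append(finish - arrival_ms)
    return response_times


# One slow request (1000 ms) followed by three fast ones (5 ms each).
requests = [(0, 1000), (10, 5), (20, 5), (30, 5)]
observed = simulate_fifo(requests)

for (arrival_ms, service_ms), response_ms in zip(requests, observed):
    print(f"service time {service_ms:>4} ms, client-side response time {response_ms} ms")
```

The three fast requests each need only 5 ms of server time, yet their clients wait close to a second. Server-side timing of the handler alone would miss this, which is the reason for measuring on the client side.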
Martin Fowler, Patterns of Enterprise Application Architecture. Addison-Wesley, 2002. ISBN 978-0-321-12742-6.
Martin Kleppmann, Designing Data-Intensive Applications, n.d., p. 13.
Martin Kleppmann, Designing Data-Intensive Applications, n.d., p. 15.