In Throughput and Response Time I described the essence of what we mean by software performance, but there are some additional meanings that are important to consider in many cases. So here is a bigger list of what we mean when we talk about performance:
- Throughput
- Response Time
- Scalability
- Stability
- Resource Utilization, e.g. memory footprint
- Overload or failure characteristics
- Capacity planning
So let's take these one by one, starting with scalability. Scalability is how well the performance of the system can be improved by adding more hardware. Every software system has performance limits, and the underlying cause is often a limit to how fast some part of the hardware can go. If the software has good scalability, you can add more of whatever is maxed out and get better performance. Often we think about scalability simply as the ability to add more servers. For example, Google's systems are highly scalable: they run their applications on thousands of servers, and as usage grows they just keep adding servers to keep up with demand. They 'scale out'. Scalability doesn't have to be only about scaling across separate servers; we may want to run some software on one individual server, but care whether it scales as we add more CPUs or more disks. Relational databases are the classic example of this: it's hard to run a cluster of database servers operating as one, so when we need better performance from a database it's often cheaper to upgrade to a more powerful server with more CPUs, RAM, and disks. Good scalability often comes with a small compromise in response time or throughput on one individual server. That is, an application designed without regard for scalability may perform better than the same application designed for high scalability when both versions run on just one server. But the scalable design can support more users, more work, or faster responses by adding more hardware, whereas the unscalable design is stuck with the performance of a single server.
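The trade-off between per-server performance and scalability can be sketched with a toy model. The numbers and the linear coordination overhead here are my own illustrative assumptions, not from the article: each added server contributes its capacity minus a small cluster-coordination cost.

```python
# Toy model: cluster throughput vs. number of servers.
# per_server and overhead_frac are illustrative assumptions.

def throughput(servers, per_server=1000.0, overhead_frac=0.03):
    """Requests/sec for a cluster, with a simple linear coordination penalty."""
    coordination = overhead_frac * (servers - 1)  # overhead grows with cluster size
    return servers * per_server * max(0.0, 1.0 - coordination)

for n in (1, 2, 4, 8):
    print(n, "servers:", round(throughput(n)), "req/s")
```

Even with the overhead, eight servers deliver far more total throughput than one, which is the point of a scalable design: per-node efficiency drops a little, but the ceiling rises with hardware.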
Stability is concerned with how the software stands up to heavy use by lots of users doing lots of work over a long period of time. Stable software crashes rarely and has few errors; unstable software crashes often and has trouble completing work correctly. Stability is tied to the other performance topics because many stability problems only show up when performance testing with lots of users and a high load on the system, so whoever does the performance testing work usually has to work on the stability of the system as well.
When looking at resource utilization, we care about how the software uses CPU, RAM, disk I/O, disk space, and network capacity. All of these hardware resources are limited, and it's desirable for our software to use them carefully. Under high load, resource utilization will be higher than when testing a single user with a single use case, so, just like stability, studying resource utilization is in the domain of performance testing.
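As a minimal sketch of measuring two of these resources, the Python standard library can report wall-clock time and peak memory for a piece of work. The workload below is a stand-in of my own; a real performance test would drive the actual application under load.

```python
import time
import tracemalloc

def profile(workload):
    """Run a callable and return (elapsed seconds, peak bytes allocated)."""
    tracemalloc.start()
    start = time.perf_counter()
    workload()
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak

# Illustrative workload: allocate a list of 100,000 squares.
elapsed, peak_bytes = profile(lambda: [x * x for x in range(100_000)])
print(f"{elapsed:.3f}s elapsed, peak {peak_bytes / 1024:.0f} KiB")
```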
Every software application has a limit to how fast it can go, how many users it can support, and so on. When the system is overloaded by going beyond these limits, the software may fail by crashing, losing data, completing work incorrectly, dropping users, etc. These are the failure modes or overload characteristics of the application. The most desirable failure mode is that work backlogs and response time goes up, while throughput stays close to 100% of its maximum. The least desirable failure mode is when the system totally locks up during an overload and throughput drops to 0. Understanding the overload characteristics is important for capacity planning.
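One common way to get a graceful failure mode is a bounded queue with load shedding: when the queue is full, excess requests are rejected rather than allowed to pile up until the system locks. This sketch uses Python's standard `queue` module; the names and sizes are illustrative, not from the article.

```python
from queue import Queue, Full

def offer(q, item):
    """Try to enqueue without blocking; return False to shed the load."""
    try:
        q.put_nowait(item)
        return True
    except Full:
        return False

# A server with room for only 3 pending requests receives 10 at once.
# Nothing is dequeued here, so the excess 7 are rejected, not backlogged.
q = Queue(maxsize=3)
accepted = sum(offer(q, i) for i in range(10))
print(accepted, "accepted,", 10 - accepted, "rejected")  # → 3 accepted, 7 rejected
```

The rejected callers get a fast error instead of an indefinite wait, and the server keeps serving the work it accepted, which is closer to the desirable failure mode described above.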
Capacity planning is preparing for the future demands on the software system: understanding the performance characteristics of the application, predicting how many users and what kind of workload the application will have, and then setting up the right hardware to run the software with the needed performance, without spending more money on hardware than you need to.
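A back-of-envelope version of that calculation: given a measured per-server throughput and a predicted peak load, how many servers do we need while keeping some headroom? All the numbers here are made-up illustrations, not figures from the article.

```python
import math

def servers_needed(peak_rps, per_server_rps, headroom=0.30):
    """Provision enough servers to keep each below (1 - headroom) utilization."""
    usable = per_server_rps * (1.0 - headroom)  # capacity we allow ourselves to use
    return math.ceil(peak_rps / usable)

# E.g. a predicted peak of 5,000 req/s against servers measured at 800 req/s,
# holding 30% headroom for spikes and failures.
print(servers_needed(peak_rps=5000, per_server_rps=800))  # → 9
```

The headroom parameter is one way to encode the overload characteristics discussed above: the worse the system behaves when saturated, the more spare capacity you buy.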