Bottlenecks - disk, CPU, synchronization, etc.

Bottlenecks are the parts of the system that limit performance. The system is made up of multiple parts that can run in parallel: for example, the disk can be doing disk IO at the same time as the CPU is running some code. But when the system is at its maximum throughput, one of those parts is running at 100% of its maximum capacity, and that part is the bottleneck; everything else is running at less than 100% because it spends some of its time waiting for the bottleneck (the sketch after the list below illustrates this). Here's a list of possible bottlenecks, ordered roughly by how likely they are to be the bottleneck in a typical business application:

  1. disk IO
  2. CPU
  3. synchronization of a shared software resource
  4. an external system your software interacts with
  5. network IO
  6. virtual memory
  7. main memory / RAM
  8. some of the users of the system
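To make that concrete, here is a minimal sketch of a two-stage pipeline in which a simulated "disk" stage and a simulated "CPU" stage run in parallel. The stage times are made-up numbers for illustration, not measurements from any real system; the slow disk stage is the bottleneck, so the total time ends up being roughly the disk time alone and the CPU stage spends most of its time waiting.

```python
import threading
import queue
import time

ITEMS = 20
DISK_TIME = 0.005   # pretend each disk read takes 5 ms (illustrative number)
CPU_TIME = 0.001    # pretend processing each item takes 1 ms (illustrative number)

buffer = queue.Queue(maxsize=4)

def disk_stage():
    # Simulated disk IO: the slower stage, so it is the bottleneck.
    for i in range(ITEMS):
        time.sleep(DISK_TIME)
        buffer.put(i)
    buffer.put(None)  # sentinel: no more items

def cpu_stage():
    # Simulated CPU work: faster than the disk, so it spends most of its time waiting.
    while True:
        item = buffer.get()
        if item is None:
            break
        time.sleep(CPU_TIME)

start = time.perf_counter()
t = threading.Thread(target=disk_stage)
t.start()
cpu_stage()
t.join()
elapsed = time.perf_counter() - start

# Total time is roughly ITEMS * DISK_TIME (~0.1 s), not the sum of both stages,
# because the CPU stage overlaps with the disk and mostly waits on it.
print(f"processed {ITEMS} items in {elapsed:.3f}s")
```

Making the CPU stage faster in this sketch would change almost nothing; only making the disk stage faster, or giving it less work per item, moves the total.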

Usually the bottleneck dominates where time is spent in the system, and hence performance can usually only be significantly improved by addressing the bottleneck (this is the down-to-earth meaning of Amdahl's Law). So finding the bottleneck is key if you want to improve performance. This is the reason for performance rule 1: measure first, optimize second. By measuring where the time is being spent you can find the bottleneck and optimize that, rather than waste time optimizing something that isn't the bottleneck.
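As a sketch of measuring first, the snippet below simply times the major phases of a task before anything gets optimized. The file name and the processing step are hypothetical placeholders, not taken from this text; the point is only that whichever phase dominates the total is the one worth optimizing.

```python
import time

def timed(label, fn, *args):
    # Run fn, print how long it took, and return its result.
    start = time.perf_counter()
    result = fn(*args)
    print(f"{label}: {time.perf_counter() - start:.3f}s")
    return result

def load(path):
    # Disk IO phase: read the whole file.
    with open(path, "rb") as f:
        return f.read()

def process(data):
    # Stand-in for the CPU-bound phase.
    return sum(data)

data = timed("disk read", load, "orders.csv")   # "orders.csv" is a hypothetical input file
total = timed("processing", process, data)
# Whichever phase dominates the total time is the bottleneck to address first.
```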

To improve performance you can either make the bottlenecked part itself go faster, or you can use the bottlenecked part more efficiently by getting the system to do more useful work for the same load on the bottleneck. For example, if disk IO is the bottleneck, you can either get faster disks or change your code to use disk IO more efficiently. If hardware is the bottleneck, replacing it rarely leads to very big performance increases, and often changing the hardware simply isn't possible. The good news is that the very large performance increases usually come from changing code to use the hardware more efficiently, because the performance problems in most systems are due to violations of the performance rules, not hardware that is too slow.
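As a sketch of what "use disk IO more efficiently" can mean, here are two hypothetical functions that do the same work; the second issues far fewer, larger reads against the same disk. The file name and chunk size are illustrative assumptions, not from this text.

```python
def count_bytes_slow(path):
    # One tiny read per byte: the disk and syscall overhead are hammered.
    count = 0
    with open(path, "rb", buffering=0) as f:
        while f.read(1):
            count += 1
    return count

def count_bytes_fast(path):
    # Large sequential reads: same result, a small fraction of the IO calls.
    count = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(1024 * 1024)   # 1 MiB at a time (arbitrary but large)
            if not chunk:
                break
            count += len(chunk)
    return count
```

The hardware is unchanged in both versions; the gain comes entirely from asking the bottlenecked part to do less wasted work.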