MarkJ's Rules for Good Software Performance

To get good performance from your software applications, your team needs to follow MarkJ's Rules for Good Software Performance:

1) Measure first and optimize second - find bottlenecks. That is, don't guess where a performance problem might be before you have one; find the slow part of the system through testing. In performance engineering it's time you have to follow: find out where the time goes and make that part faster. The slowest part of the system is its bottleneck, the part that maxes out first and limits the performance of the entire system. Making another part of the system faster while leaving the bottleneck alone isn't going to make much difference. Steve Souders gives an excellent example of this in 'High Performance Web Sites: Essential Knowledge for Front-End Engineers', explaining how Yahoo optimized page response time. They found that 10% of the page load time came from Yahoo's back end servers, and 90% came from page download and rendering time in the browser. To improve page load time there was little point trying to optimize the back end server code any further, because it only accounted for 10% of the time. Instead they spent their time finding ways to optimize page download and rendering, because that's where 90% of the time was going.
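As a rough illustration of "find out where the time goes", here is a minimal Python sketch that times each phase of a piece of work and reports each phase's share of the total. The phase names and the `time.sleep` calls are made-up stand-ins for real work, and `timed` and `report` are hypothetical helpers, not part of any library. On a real system you would more likely reach for a profiler or your monitoring tools.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(float)  # phase name -> total seconds spent


@contextmanager
def timed(phase):
    """Add the wall-clock time spent inside the block to timings[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[phase] += time.perf_counter() - start


def report(timings):
    """Return (phase, seconds, share of total) tuples, slowest phase first."""
    total = sum(timings.values())
    rows = [(phase, secs, secs / total) for phase, secs in timings.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical phases of one request; the sleeps stand in for real work.
with timed("back end"):
    time.sleep(0.01)
with timed("download and render"):
    time.sleep(0.09)

for phase, secs, share in report(timings):
    print(f"{phase:22s} {secs * 1000:7.1f} ms  {share:5.1%}")
```

The first row of the report is the candidate bottleneck, and it is the only place where optimization effort is likely to pay off.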

2) Use I/O efficiently - it's much slower than code. Inefficient use of I/O is the underlying cause of 50-75% of all performance problems in typical business software, web apps, and most software I've come across. The data the application uses is in some place that has to be accessed via disk I/O or network I/O. I/O to disk is really, really slow. I/O to the network is quite slow, and even accessing data in RAM is slower than using data in the CPU and cache. Obviously you can't just skip the I/O altogether; you have to access the data in an efficient way. This comes down to designing the software so that it gets a large amount of data with a small number of I/O calls, rather than making a large number of I/O calls that each get a small amount of data. The classic example, which is seen in software over and over again, is SQL select calls to the database inside a loop. That is, code inside a loop keeps doing another I/O to get another row from a table. This is slow. The correct way to code this is to select all the data needed in one go, then step through a single large result set inside the loop.
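The select-inside-a-loop pattern and its fix can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` module with an in-memory database, and the `orders` table and both function names are invented for the example. With an in-memory database there is no network round trip, so the difference in speed will be far smaller here than against a real database server.

```python
import sqlite3

# Hypothetical schema and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (id, customer_id, total) VALUES (?, ?, ?)",
    [(i, i % 10, float(i)) for i in range(1, 101)],
)


def totals_query_in_loop(conn, order_ids):
    """Slow pattern: one SELECT (one I/O call) per row."""
    result = {}
    for order_id in order_ids:
        row = conn.execute(
            "SELECT total FROM orders WHERE id = ?", (order_id,)
        ).fetchone()
        result[order_id] = row[0]
    return result


def totals_single_query(conn, order_ids):
    """Preferred pattern: one SELECT, then step through the result set."""
    ids = list(order_ids)
    placeholders = ",".join("?" for _ in ids)
    cursor = conn.execute(
        f"SELECT id, total FROM orders WHERE id IN ({placeholders})", ids
    )
    return {order_id: total for order_id, total in cursor}


ids = range(1, 101)
print(totals_query_in_loop(conn, ids) == totals_single_query(conn, ids))
```

Both functions return the same data; the second gets it in one call instead of a hundred. For very long ID lists a join or a range condition is usually a better fit than a huge `IN (...)` clause, since databases limit the number of bound parameters.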

3) Use shared resources sparingly - or don't share them. Shared resources usually have to be protected against concurrent use by multiple threads with synchronization locks, but the synchronization makes threads wait to use the resource. Synchronization is the underlying cause of the other 25-50% of performance problems, and concurrency contention is at the heart of scalability. If software has high concurrency contention, aka poor concurrency, then many user requests, many threads, many CPUs, etc. will all be waiting to get a lock on a shared resource before they can execute. All that waiting adds up to poor performance. Software that has low concurrency contention, aka good concurrency, allows lots of different things to go on at once in parallel without all the tasks waiting for each other. Synchronization brings thread safety to shared data, but it also increases concurrency contention.
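To make the "don't share them" option concrete, here is a small Python sketch comparing a counter that every thread updates under one lock with a design where each thread keeps its own private total and the results are combined once at the end. The function names are invented for the example. Note that CPython's global interpreter lock means this will not show a true parallel speedup; the point is the structure, where the second version has no lock for the threads to queue behind.

```python
import threading


def count_shared(n_threads, n_items):
    """Every increment takes the same lock, so threads wait on each other."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(n_items):
            with lock:  # the shared resource: one lock guarding one counter
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def count_unshared(n_threads, n_items):
    """Each thread counts privately; results are combined once at the end."""
    partials = [0] * n_threads

    def worker(index):
        local = 0
        for _ in range(n_items):
            local += 1
        partials[index] = local  # single write to a slot no other thread touches

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partials)


print(count_shared(4, 10_000), count_unshared(4, 10_000))
```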

4) Learn how you are supposed to use APIs and third party components. It seems obvious, but this is the cause of many performance problems. When you use something in a way that its original designers did not intend, you often end up tripping on rules 2, 3, or both. (This something might be a third party API, a piece of code written by your colleague, or your database.) Programmers are very creative people. A programmer's natural tendency when using an API is first to get a basic understanding of what it does, and then to dream up no end of 'clever' new ways to use it. Unfortunately, if you don't understand exactly how the insides of the component work, then once you stray away from the intended uses you end up causing the component to use inefficient patterns of access to data, to have poor concurrency, or both. A classic example in business software that uses Oracle is using dynamic SQL instead of prepared statements. Oracle is designed so that frequently used queries can execute with good concurrency if they are issued using prepared statements. The part of Oracle that gives it great performance with prepared statements also means that if you instead perform every query as a brand new piece of SQL, Oracle has high concurrency contention in the statement cache (rule 3) and performance is terrible.
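The difference between dynamic SQL and a prepared statement can be sketched like this. The example uses Python's built-in `sqlite3` module as a stand-in, so it shows only the coding pattern and not Oracle's behaviour; the `customers` table and the function names are invented for the example.

```python
import sqlite3

# Hypothetical table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Cy")],
)


def name_dynamic(conn, customer_id):
    """Pattern to avoid: the value is pasted into the SQL text, so every
    distinct id produces a brand new statement for the database to parse."""
    sql = f"SELECT name FROM customers WHERE id = {int(customer_id)}"
    return conn.execute(sql).fetchone()[0]


# One constant SQL string; only the bound value changes between calls.
LOOKUP_SQL = "SELECT name FROM customers WHERE id = ?"


def name_bound(conn, customer_id):
    """Preferred pattern: the same statement text every time, with the
    value supplied as a bound parameter."""
    return conn.execute(LOOKUP_SQL, (customer_id,)).fetchone()[0]


for customer_id in (1, 2, 3):
    print(name_dynamic(conn, customer_id), name_bound(conn, customer_id))
```

Binding parameters has a second benefit beyond performance: values are never interpreted as SQL, which closes off SQL injection.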

5) Use a realistic amount of test data and a realistic workload - don't test against an empty database. It's a classic problem that occurs again and again. The code works fine on the developer's PC, where the database contains just a few hundred rows of test data, but when the system goes into production with millions of rows in the database, performance is a disaster. Performance engineering is just too hard to get right without good testing. There will always be something you overlooked or underestimated, or some place where you broke rules 2, 3, or 4 in your code that you need to uncover. But if you test with a practically empty database then you'll fail to uncover these surprises because, guess what, your application will go really fast when the database is empty. Further, what you really have to do is some basic capacity planning (rule 6) to forecast what the workload will be, how much data there will be in the database, and how many concurrent users you'll have, and then design a test that simulates all of that.
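Here is a small sketch of why the size of the test database matters. It runs the same lookup against a table with a few hundred rows and against one with a hundred thousand. It uses Python's built-in `sqlite3` module; the `orders` table, the row counts, and the helper names are all assumptions chosen for the example. The lookup column deliberately has no index, which is exactly the kind of oversight a near-empty database hides.

```python
import sqlite3
import time


def build_db(n_rows):
    """Create an in-memory database with n_rows rows of made-up orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders "
        "(id INTEGER PRIMARY KEY, customer_email TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_email, total) VALUES (?, ?)",
        ((f"user{i}@example.com", float(i)) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def time_lookup(conn, email, repeats=20):
    """Average seconds per lookup on a column with no index (full scan)."""
    start = time.perf_counter()
    for _ in range(repeats):
        conn.execute(
            "SELECT COUNT(*) FROM orders WHERE customer_email = ?", (email,)
        ).fetchone()
    return (time.perf_counter() - start) / repeats


for n_rows in (200, 100_000):
    conn = build_db(n_rows)
    secs = time_lookup(conn, f"user{n_rows - 1}@example.com")
    print(f"{n_rows:>7} rows: {secs * 1e6:10.1f} microseconds per lookup")
    conn.close()
```

The absolute numbers depend on the machine, so none are quoted here. What matters is that the small table gives no hint of how the query behaves at production volume.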

6) Do performance engineering work throughout your project. Doing this gives you visibility into potential performance problems as you go, while you still have a chance to redesign the problem away. If you leave performance testing to the end of the project you'll discover multiple cases of breaking rules 2, 3, and 4 that will require redesign, recoding, and retesting - which will take you over budget and make you miss your delivery date. If you are introducing a new technology, design, or algorithm, it's a good idea to build a technical prototype early on and performance test it. You also need to start with some simple capacity planning: How many users? How much data? What performance characteristics do you need? What kind of hardware can you run it on? Without a rough answer to these questions you won't know whether your performance is good enough or not. If you miss, you'll either have a system that is too slow or you'll have wasted money building something that is faster than needed.
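The capacity planning questions above boil down to some back-of-the-envelope arithmetic. A minimal sketch in Python follows. Every number in it (user count, request rate, peak factor, rows per day) is a made-up assumption to be replaced with your own forecasts, and the two function names are hypothetical.

```python
def peak_requests_per_second(concurrent_users, requests_per_user_per_minute,
                             peak_factor=2.0):
    """Estimate peak load: average request rate scaled by an assumed
    peak-to-average factor."""
    average = concurrent_users * requests_per_user_per_minute / 60
    return average * peak_factor


def rows_after(years, rows_per_day, initial_rows=0):
    """Estimate table size after a number of years of steady growth."""
    return initial_rows + rows_per_day * 365 * years


# Assumed figures, for illustration only.
print("peak requests/second:", peak_requests_per_second(600, 6))
print("rows after 3 years:  ", rows_after(3, 10_000))
```

Rough figures like these are enough to set a target for the performance tests in rule 5 and to judge whether measured results are good enough.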

7) Don't optimize code while coding. Doing so is a direct violation of rule 1, but I'm restating it as the last rule because programmers often forget this and waste their time trying to 'optimize' the code they are writing as they go (me included). It's a waste because there's no point trying to optimize code until you know where the bottlenecks are, and when programmers 'optimize' code, we usually make that code harder to understand, harder to maintain, and more likely to contain bugs. This doesn't mean you shouldn't make good choices concerning performance as you go - you should pay attention to patterns of access to data and to concurrency contention. However, it's almost never necessary to 'optimize code'; code is nearly always fast enough.