Good tools and approaches for diagnosing poor performance
My company is developing a web-based data viewer application that needs a fair amount of bandwidth to work well. Recently, however, we have been changing a lot of things. For example, we changed our internal network infrastructure so that data can be hosted on separate machines connected by Gigabit Ethernet. On top of that, new versions of the application itself keep coming out, since we are still in alpha and beta testing.
Recently we made some changes that are causing poorer performance, and we want to try to identify where the problem is before we start tearing things apart. It is a very small network, and I have limited experience as an IT admin. I have a few ideas for where to start, but I would like to harvest a little wisdom from the pros first: How do you tackle/avoid similar problems? What are the most useful (Windows) tools you have used?
Solution 1:
I always follow this approach: Try to test one thing at a time.
The trusty "Scientific method" works really well for troubleshooting:
- Come up with a theory for why the app is slow.
- Devise a test that may confirm or rule out that theory.
- Repeat.
For a webapp this might mean:
- Could it be the database? Run some standalone SQL queries.
- Could it be the web server? Test the web server by fetching static pages.
- Could it be the app? Test the web server by hitting dynamic pages that don't hit a database.
- Could it be the app's interface to the DB? Test the web server by hitting dynamic pages that do hit a database.
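To compare those layers side by side, you can time the same kind of request against each one. Here is a minimal sketch in Python using only the standard library. The function names are my own, and the URLs in the trailing comment are hypothetical placeholders, so substitute pages from your own application:

```python
import statistics
import time
import urllib.request


def fetch(url):
    """Download the URL and return the number of bytes read."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return len(response.read())


def time_call(func, arg, repeats=5):
    """Call func(arg) `repeats` times and return the median wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(arg)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare_layers(urls, fetcher=fetch, repeats=5):
    """Return {label: median seconds} for each labelled URL."""
    return {label: time_call(fetcher, url, repeats) for label, url in urls.items()}


# Example call (hypothetical endpoints, not from any real application):
# timings = compare_layers({
#     "static page": "http://yourserver/static/test.html",
#     "dynamic, no database": "http://yourserver/app/ping",
#     "dynamic, with database": "http://yourserver/app/report",
# })
# for label, seconds in timings.items():
#     print(f"{label:25s} {seconds * 1000:8.1f} ms")
```

If the static page is already slow, suspect the web server or the network. If only the database-backed page is slow, look at the database or the app's queries. The median is used so that a single slow request does not skew the comparison.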
Running basic benchmarks for CPU, memory, and disk speed can also help you rule out one of those things before you go any further.
I see things like this all the time:
Backups take longer on the new server than they did on the old one.
But no one ran a basic disk benchmark, which would have shown that the old server had twice as many spindles as the new one, or a network benchmark, which would have shown that the new server's Gigabit Ethernet link was only running at 100 Mbps.
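Even a crude number is enough to catch that kind of mismatch. As a rough sketch (not a replacement for a proper benchmarking tool), the following Python function measures sequential write throughput to a directory. The function name and default sizes are arbitrary choices of mine:

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory, total_mb=256, chunk_mb=4):
    """Write about total_mb of data into `directory` and return MB/s."""
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    chunks = max(1, total_mb // chunk_mb)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(chunks):
                handle.write(chunk)
            handle.flush()
            # Force the data out of the OS cache before stopping the clock.
            os.fsync(handle.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the temporary test file
    return (chunks * chunk_mb) / elapsed
```

Run it with the same arguments on the old and the new server, for example `sequential_write_mb_per_s("D:\\data")` with whatever directory sits on the disk you care about, and compare the two figures. Copying one large file between two machines and timing it gives a similar sanity check for the network link: a link negotiated at 100 Mbps cannot move much more than about 12 MB/s.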
All that said, if this is a custom web application, the framework you are using almost certainly has a way to dump performance information to a log file, but that is more of a question for Stack Overflow.
Solution 2:
I subscribe to the "Sherlock Holmes" method of troubleshooting, aka the Binary Search Troubleshooting Method:
- Divide the problem space in half.
- Rule out one half of the problem space.
- Repeat with remaining problem space.
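Since you have a series of recent releases, one way to apply this is to bisect them. The sketch below assumes the candidates are in order, that everything before the culprit is fine, and that the last one is known to be slow. The build names and the `known_slow` set are made up for illustration; in practice `is_bad` would be you deploying that build and measuring it:

```python
def first_bad(candidates, is_bad):
    """Return the index of the first candidate for which is_bad() is True.

    Assumes every candidate before the culprit is good, every candidate
    from the culprit onward is bad, and the last candidate is known bad.
    """
    low, high = 0, len(candidates) - 1
    while low < high:
        mid = (low + high) // 2
        if is_bad(candidates[mid]):
            high = mid       # culprit is at mid or earlier
        else:
            low = mid + 1    # culprit is after mid
    return low


# Hypothetical example: six builds, slowness introduced in "beta-4".
builds = ["beta-1", "beta-2", "beta-3", "beta-4", "beta-5", "beta-6"]
known_slow = {"beta-4", "beta-5", "beta-6"}
index = first_bad(builds, lambda build: build in known_slow)
print(builds[index])  # beta-4
```

With six builds this takes three tests instead of up to five, and the saving grows quickly as the list gets longer. The same idea works for any ordered chain, such as the hops between the client and the data server.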
In my experience, you sometimes get lucky by trying some obvious things first, but once you exhaust the truly quick fixes, you need to get methodical quickly.
This method is compatible with Scientific Method and Test One Thing At A Time.