How do I know when a server "runs at speed limit"?

Solution 1:

When you're dealing with the web, it's usually not the speed of an individual request that matters, because it's not a realistic indicator of how the server will perform under load.

What you really want to know is whether your server can serve pages up fast enough for users to have a good experience under expected load conditions.

To answer that question, you should get a tool like WCAT or JMeter and simulate load against your server, then look for where the bottleneck is occurring (if there is one).

For example, say you know from Google Analytics or your stats package that the most users you have in a day is 500. You then decide that you want to support a peak of around 750 users per day. If you break that down (depending on whether your site is used 24x7 or 9x5), you might discover that the maximum number of simultaneous page requests you need to support is around 10.
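One way to do that breakdown is Little's law: average concurrency equals the arrival rate multiplied by the time each user spends on the site. The sketch below is only a rough estimate, not a sizing method. The 7-minute average session length is an assumption made up for illustration, and the function name is hypothetical; substitute the figures from your own analytics.

```python
def concurrent_users(users_per_day, active_hours_per_day, avg_session_minutes):
    """Estimate average simultaneous users via Little's law:
    concurrency = arrival rate x time each user spends on the site."""
    arrivals_per_minute = users_per_day / (active_hours_per_day * 60)
    return arrivals_per_minute * avg_session_minutes


# Assumed inputs: the 750 users/day target from the example above and a
# hypothetical 7-minute average session.
office_hours = concurrent_users(750, active_hours_per_day=9, avg_session_minutes=7)
around_the_clock = concurrent_users(750, active_hours_per_day=24, avg_session_minutes=7)

print(f"9x5 site:  ~{office_hours:.1f} simultaneous users")   # ~9.7
print(f"24x7 site: ~{around_the_clock:.1f} simultaneous users")  # ~3.6
```

Under these assumptions the 9x5 case lands close to the figure of 10 used above. This is an average, so real traffic will burst above it, which is a reason to test with some headroom.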

Then with WCAT or JMeter you run a test that has 10 users simultaneously using the site and performing various operations. If the response time is acceptable (you be the judge, but I would say less than 2 seconds for page load time), then you can stop there or continue to add users to see at what point the performance drops off.
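To make the idea concrete, here is a minimal sketch of what such a test does, using only the Python standard library. It is not a substitute for WCAT or JMeter, which handle ramp-up, think time, and reporting far better. The function names are hypothetical, and the 2-second limit is the rule of thumb from the paragraph above.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_url(url):
    """Fetch one page and discard the body; raises on HTTP or network errors."""
    with urllib.request.urlopen(url, timeout=30) as response:
        response.read()


def run_load_test(fetch, users=10, requests_per_user=5):
    """Run `users` workers in parallel, each calling `fetch` repeatedly.
    Returns the duration of every call in seconds, sorted ascending."""

    def one_user(_user_index):
        timings = []
        for _ in range(requests_per_user):
            start = time.perf_counter()
            fetch()
            timings.append(time.perf_counter() - start)
        return timings

    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))
    return sorted(t for timings in per_user for t in timings)


def summarize(timings, limit_seconds=2.0):
    """Report the median, the 95th percentile, and whether p95 meets the limit."""
    median = timings[len(timings) // 2]
    p95 = timings[min(len(timings) - 1, int(len(timings) * 0.95))]
    return {"median": median, "p95": p95, "acceptable": p95 <= limit_seconds}
```

Against a test server you own, you would call something like `summarize(run_load_test(lambda: fetch_url("http://localhost:8080/"), users=10))`, where the URL is a placeholder, and then repeat with a larger `users` value to look for the drop-off.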

Once you see the performance drop off, you then correlate that dropoff with what is going on with your server at that time. Are you getting a lot of hard page faults? Is CPU usage high? Is your DB overwhelmed?
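As a rough illustration of that correlation step, the sketch below takes one record per ramp step and reports which server metrics were out of range at the first step where response time exceeded the limit. Every number in it is invented for illustration, including the alert thresholds, and the field names are hypothetical; in practice the readings would come from your load tool and your server's performance counters.

```python
def find_dropoff(steps, limit_seconds=2.0):
    """Return the first ramp step whose response time exceeds the limit,
    or None if every step stayed within it."""
    for step in steps:
        if step["p95_seconds"] > limit_seconds:
            return step
    return None


def suspects(step, thresholds):
    """List the metrics at a step that meet or exceed their alert thresholds."""
    return [name for name, limit in thresholds.items() if step[name] >= limit]


# Invented sample readings, for illustration only (not real measurements).
ramp = [
    {"users": 10, "p95_seconds": 0.6, "cpu_percent": 35, "hard_faults_per_sec": 2, "db_queue": 0},
    {"users": 20, "p95_seconds": 0.9, "cpu_percent": 48, "hard_faults_per_sec": 5, "db_queue": 1},
    {"users": 40, "p95_seconds": 3.4, "cpu_percent": 55, "hard_faults_per_sec": 450, "db_queue": 2},
]
# Assumed thresholds; tune them to your own hardware and baseline.
thresholds = {"cpu_percent": 90, "hard_faults_per_sec": 100, "db_queue": 10}

dropoff = find_dropoff(ramp)
if dropoff is not None:
    print(dropoff["users"], suspects(dropoff, thresholds))  # 40 ['hard_faults_per_sec']
```

In this made-up run the CPU and database look healthy at the drop-off, so memory pressure is the first thing to investigate.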

Once you discover the bottleneck (if there is one), you can look at ways to reduce it. For example, if you are getting hard page faults and disk thrashing, maybe caching is the way to go.

But it's all about knowing what metrics you need to hit, testing against those, and uncovering and resolving bottlenecks.

Solution 2:

You have a lot of concepts flying around here. CPU processing power, memory, caching, reverse proxy servers, and network bandwidth are all separate things that contribute to a good, or poor, end performance result.

Purely at the network layer, a modern PC or server can handle thousands and thousands of requests in just milliseconds. But that's before you add an application onto it that "does stuff". The parts of this equation that make the biggest differences are:

  1. Application (The code)
  2. Data / Storage (SQL, etc)
  3. Network Throughput / Bandwidth

So if 1 and 2 are already great, and your pages are actually being generated and served very quickly, then 3 might be the biggest bottleneck. If 2 is horrible, then 1 will be horrible too, even if 3 is great. If 1 is the bottleneck, then no amount of optimizing 2 or 3 will make a difference.
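A quick back-of-the-envelope calculation shows why. The sketch below assumes a deliberately simplified model in which the three parts run one after another and their times just add up; the millisecond figures are invented for illustration and the function name is hypothetical.

```python
def page_time(app_ms, data_ms, network_ms):
    """Simplified model: the three stages run one after another."""
    return app_ms + data_ms + network_ms


# Invented timings in which the application code dominates.
baseline = page_time(app_ms=800, data_ms=50, network_ms=50)

faster_data_and_network = page_time(app_ms=800, data_ms=25, network_ms=25)
faster_app = page_time(app_ms=400, data_ms=50, network_ms=50)

for label, total in [
    ("baseline", baseline),
    ("data and network twice as fast", faster_data_and_network),
    ("application twice as fast", faster_app),
]:
    saved = 100 * (baseline - total) / baseline
    print(f"{label}: {total} ms ({saved:.0f}% less time)")
```

With these numbers, doubling the speed of both the data layer and the network trims the page from 900 ms to 850 ms, while doubling the speed of the application code alone brings it down to 500 ms. Measure first, then spend effort on whichever part dominates.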