It's the 99th percentile. It means that 99% of the requests should be faster than the given latency. In other words, only 1% of the requests are allowed to be slower.


We can explain it through an analogy: if 100 students are running a race, then 99 of them should complete the race within the given "latency" time.


Imagine that you are collecting performance data for your service, and the table below is the collection of results (the latency values are fictional, just to illustrate the idea).

Latency    Number of requests
1s         5
2s         5
3s         10
4s         40
5s         20
6s         15
7s         4
8s         1

The P99 latency of your service is 7s. Only 1% of the requests take longer than that. So, if you can decrease the P99 latency of your service, you increase its performance.
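
To reproduce that number, here is a minimal Python sketch using the simple "nearest-rank" definition of a percentile (monitoring tools may interpolate slightly differently, so treat it as illustrative rather than as the one true formula):

    import math

    # Expand the table above into individual latency samples (in seconds).
    table = {1: 5, 2: 5, 3: 10, 4: 40, 5: 20, 6: 15, 7: 4, 8: 1}
    samples = [latency for latency, count in table.items() for _ in range(count)]

    def percentile(values, p):
        """Nearest-rank percentile: the smallest value such that at least
        p percent of the samples are less than or equal to it."""
        ordered = sorted(values)
        rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
        return ordered[rank - 1]

    print(percentile(samples, 99))  # -> 7; only the single 8s request is slower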


Let's take an example from here:

Request latency:
    min: 0.1
    max: 7.2
    median: 0.2
    p95: 0.5
    p99: 1.3

So we can say that for 99 percent of the web requests, the latency was 1.3 or less (whether that means milliseconds, seconds, etc. depends on how your system is configured to measure latency). As @tranmq said, if we decrease the P99 latency of the service, we increase its performance.
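
If you want to produce a summary like that yourself, here is a rough sketch; the sample values are made up, and NumPy's default percentile interpolates between samples, so a real report will differ:

    import numpy as np

    # Hypothetical latency samples (unit depends on how you measure).
    latencies = np.array([0.1, 0.15, 0.2, 0.2, 0.25, 0.3, 0.5, 1.3, 7.2])

    print("min:   ", latencies.min())
    print("max:   ", latencies.max())
    print("median:", np.median(latencies))
    print("p95:   ", np.percentile(latencies, 95))
    print("p99:   ", np.percentile(latencies, 99))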

It is also worth noting p95, since a few requests can make p99 much costlier than p95, e.g. the initial requests that build caches, warm up class objects, initialize threads, etc. So p95 may be cutting out those 5% worst-case scenarios; still, within that 5%, we don't know what fraction is real noise versus genuinely worst-case inputs.
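
To make that concrete, here is a small made-up demonstration: a slow tail of about 3% of requests (say, cache building or thread initialization) shows up in p99 but is cut out by p95:

    import numpy as np

    fast = np.full(970, 0.2)   # 970 ordinary requests at ~200ms
    slow = np.full(30, 3.0)    # 30 slow warm-up requests (cache build, thread init, ...)
    latencies = np.concatenate([fast, slow])

    print(np.percentile(latencies, 95))  # 0.2 -- p95 cuts the slow tail out
    print(np.percentile(latencies, 99))  # 3.0 -- p99 is dominated by it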

Finally, we can have roughly 1% noise in our measurements (network congestion, outages, service degradation, and so on), so the p99 latency is a good representative of practically the worst case. And, almost always, our goal is to reduce the p99 latency.