What do we mean by "top percentile" or TP based latency?

When we discuss the performance of a distributed system, we use the terms tp50, tp90, tp99.99 TPS. Could someone explain what we mean by those?


tp90 is the maximum time within which 90% of requests were served. Imagine you have these request times:

10s
1000s
100s
2s

Calculating TP is very simple:

  • sort all times in ascending order: [2s, 10s, 100s, 1000s]
  • find the last item in the portion you need to calculate. For TP50 that is ceil(4*.5)=2 requests, so you need the 2nd request. For TP90 it is ceil(4*.9)=4, so you need the 4th request.
  • take the time of the item found above: TP50=10s, TP90=1000s.
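
The steps above can be sketched in a few lines of Python. This is only an illustration of the ceil-based rule described here (sometimes called the nearest-rank method). The function name tp and its parameters are made up for this example, and monitoring tools may interpolate or round differently.

```python
import math

def tp(times, percent):
    """Return the time under which `percent` % of requests were served.

    Hypothetical helper for illustration: sorts the samples, then picks
    the item at position ceil(n * percent / 100), counting from 1.
    """
    if not times:
        raise ValueError("need at least one sample")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ordered = sorted(times)                          # step 1: ascending order
    rank = math.ceil(len(ordered) * percent / 100)   # step 2: 1-based position
    return ordered[rank - 1]                         # step 3: time at that position

times = [10, 1000, 100, 2]   # seconds, from the example above
print(tp(times, 50))         # 10
print(tp(times, 90))         # 1000
```

With only four samples, TP90 and TP99.99 both land on the slowest request, so high percentiles only become meaningful with many more samples.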