What do we mean by "top percentile" or TP based latency?

When we discuss performance of a distributed system we use the terms tp50, tp90, tp99.99 TPS. Could someone explain what do we mean by those?

tp90 is a maximum time under which 90% of requests have been served. Imagine you have times:

10s
1000s
100s
2s

Calculating TP is very simple:

sort all times in ascending order: [2s, 10s, 100s, 1000s]
find latest item in portion you need to calculate. For TP50 it will ceil(4*.5)=2 requests. You need 2nd request. For TP90 it will be ceil(4*.9)=4. You need 4th request.
get time for the item found above. TP50=10s. TP90=1000s

Related