Is the $50$th percentile equal to the median?
Suppose we have the $100$ distinct integers between $1$ and $100$ inclusive. The median and fiftieth percentile can be calculated as below.
Ordering: $1, 2, 3, \ldots, 98, 99, 100$
- The median is $(50+51)/2$
- The $50$th percentile is $51$ ($51$ is greater than first $50$ elements)
I have two simple questions:
- Did I calculate the median and $50$th percentile correctly? If I calculated them correctly, why do all of the sources say the $50$th percentile is equal to the median?
- If I didn't calculate them correctly, can you help me understand my mistake(s)?
Different books and software define quantiles using different rules. As you have already discussed, one of the differences is how ties (multiplicities) are treated. These different definitions can give noticeably different results for small datasets, but when the sample size increases the differences usually become negligible.
For most practical purposes, lower quartile, median, and upper quartile are the 25th, 50th and 75th percentiles, respectively.
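As a rough illustration of the sample-size point, the sketch below uses the `type` argument of R's `quantile` function, which selects among the nine rules R documents, on evenly spaced data. The helper name `spread_of_medians` and the choice of data are assumptions made purely for illustration.

```r
# Hypothetical helper: largest minus smallest 50th percentile
# across the nine rules offered by quantile(..., type = 1:9).
spread_of_medians <- function(n) {
  x <- (1:n) / n                     # n evenly spaced values in (0, 1]
  q <- sapply(1:9, function(k) quantile(x, 0.50, type = k, names = FALSE))
  max(q) - min(q)
}

spread_small <- spread_of_medians(10)      # small dataset
spread_large <- spread_of_medians(10000)   # large dataset
c(small = spread_small, large = spread_large)
```

For these two sizes the rules disagree by about half the gap between neighbouring values, so the disagreement shrinks as the values become more densely packed.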
In R statistical software, these are the results for your 'dataset' of the numbers from 1 through 100.
x = 1:100; median(x); quantile(x, .50)
## 50.5 # median
## 50%
## 50.5 # 50th percentile
For the dataset y of the numbers from 0 through 100, both answers are 50.
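The commands for y are not shown above; a minimal sketch, assuming y is built the same way as x (the variable names `med_y` and `p50_y` are only for illustration):

```r
y <- 0:100                    # the 101 integers from 0 through 100
med_y <- median(y)            # middle (51st) ordered value
p50_y <- quantile(y, 0.50)    # R's default rule (type = 7)
med_y
p50_y
```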
Minitab statistical software also gives 50.5 and 50 as the medians for the two datasets, respectively. So Minitab and R agree on that much. But for the 50th percentile, both disagree with the rule you stated in your question. (SAS statistical software gives 50 as the 50th percentile of dataset x and 49 as the 50th percentile of dataset y.) Altogether, there are about a dozen slightly different rules in use for finding quantiles.
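To see how much the answer depends on the rule, one can loop over the `type` argument of R's `quantile`, which selects among the nine rules R documents. The function `asker_rule` below is a hypothetical encoding of the rule from the question (take the ordered value just past the first fraction p of the data); it is one reading of that rule, not a standard definition. Nothing here is meant to say which of these rules Minitab or SAS use.

```r
x <- 1:100
y <- 0:100

# Hypothetical encoding of the rule in the question:
# take the ordered value just past the first floor(n * p) values.
asker_rule <- function(v, p) sort(v)[floor(length(v) * p) + 1]

# 50th percentile of x and y under each of R's nine quantile rules.
by_type <- sapply(1:9, function(k) c(
  x = quantile(x, 0.50, type = k, names = FALSE),
  y = quantile(y, 0.50, type = k, names = FALSE)
))
colnames(by_type) <- paste0("type", 1:9)
by_type

asker_rule(x, 0.50)   # 51, as computed in the question
```

On these data, each of the nine rules returns 50 or 50.5 for x, so none reproduces the 51 from the question; for y the results range from 49 to 50.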
Here are two important bits of advice for a beginning statistics student:
(1) During your course, always use the definitions provided by your textbook or instructor.
(2) Understand that others may have slightly different rules, so don't be surprised if you see something a little different in another text, using a statistical calculator, using statistical software, or for summarized data in a journal article.