How many times do I roll an unfair die to determine its bias?
There's a way to approximate the answer using information theory. Let $X$ be the variable representing the side, and let $D$ be an outcome of a roll. Each roll tells you that much information about $X$: $$ I(X;D) = H(D) - H(D|X) = \log 96 + 95 \cdot 0.01 \log 0.01 + 0.05 \log 0.05. $$ As an approximation, the information gained from different rolls is additive; it's certainly at most additive. Since $H(X) = \log 96$ (since you have no a-priori belief about $X$), the number of rolls needed is at least $$ \frac{H(X)}{I(X;D)} \approx 115. $$
Another way to approach the same question is as follows. Suppose the die was thrown $N$ times. The good side came up approximately $N/20$ times, and all the others came up $\mathrm{Bin}(N,0.01) \approx \mathcal{N}(0.01N,0.0099N)$. So we want the probability that such a normal variable be larger than $N/20$ to be smaller than $1/95$ (since there are $95$ such variables; we're ignoring their dependence). This is $$\frac{0.05N - 0.01N}{\sqrt{0.0099N}} \approx 0.4\sqrt{N}$$ standard deviations. Looking at a table, we need roughly $2.3$ standard deviations to get a probability of $1/95$, so we need $$N \approx (2.3/0.4)^2 = 33. $$ Of course, this estimate is pretty rough (it doesn't even match our previous supposed lower bound).
The estimate in the previous paragraph is particularly problematic since in this range the convergence to the normal distribution is pretty bad (the expectation is very small, roughly $1$). In this regime, the probability is actually roughly Poissonian. Let's estimate the probability of success when $N=100$ using this approximation. The correct side is going to show up roughly $5$ times. Each other side is going to show up $P(1)$ times, so it's going to be as large as $4$ with probability $$ 1 - e^{-1} \sum_{t=0}^4 \frac{1^t}{t!} \approx 0.00366. $$ So the expected number of "contenders" is $$ 0.00366 \cdot 95 \approx 0.35.$$ Assuming independence, the probability of no contenders is roughly $$ (1 - 0.00366)^{95} \approx 0.706. $$ This approximation is only roughly vindicated by the experiments below.
In order to find a better approximation, let's take into account the roughly $P(5)$ distribution of the histogram of $X$: $$ e^{-5} \sum_{H=0}^\infty \frac{5^H}{H!} \left(e^{-1} \sum_{t=0}^{H-1} \frac{1}{t!} \right)^{95} \approx 0.527. $$ This conforms much better to the results below.
Finally, here are experimental results from one million trials, along with the Poisson approximation:
$$ \begin{array}{r|c|c} N & \text{Experiment} & \text{Approx} \\\hline 50 & 0.272323 & 0.274900 \\ 100 & 0.526765 & 0.527424 \\ 150 & 0.718004 & 0.714956 \\ 200 & 0.840143 & 0.836808 \\ 250 & 0.912917 & 0.910115 \\ 300 & 0.954097 & 0.951953 \end{array}$$
The results support the information-theoretic bound, as well as our estimates obtained using the Poisson approximation.
Note: When the success probability is $p$ and there are $T$ trials, the standard deviation is $$\sqrt{\frac{p(1-p)}{T}} \leq \frac{1}{2\sqrt{T}}.$$ In our case, $T = 10^6$ and so a standard deviation is at most $0.5\cdot 10^{-3}$. With probability $95\%$, the error is at most two standard deviations, i.e. $\pm 0.001$. So the experimental results are correct up to roughly the third digit (which may be slightly off itself).
As the other answer shows, the answer is not simple. This is because it depends on the exact sequence of rolls. You can see this using Bayesian reasoning. Let $X$ be a r.v. that indicates the biased side. Denote the value of the $i^\mathrm{th}$ role by $Y_i$. Then,
$$p(Y_1, \cdots, Y_n|X=x) = \left(\frac{1}{20}\right)^{m_x} \left(\frac{1}{100}\right)^{n-m_x}$$
where
$$m_x = \sum_{i=1}^n \mathbf{1}\left(Y_i = x\right)$$
and $\mathbf{1}$ is the indicator function. Then, we can use Bayes rule to calculate the probability of $X=x$ given a sequence of rolls.
$$p(X=x|Y_1, \cdots, Y_n) = \frac{p(Y_1, \cdots, Y_n|X=x)p(X=x)}{P(Y_1, \cdots, Y_n)} = \frac{p(Y_1, \cdots, Y_n|X=x)p(X=x)}{\sum_{x'} P(Y_1, \cdots, Y_n | X=x')p(X=x')}$$
Substituting from above and assuming all sides are uniformly equal
$$=\frac{\left(\frac{1}{20}\right)^{m_x} \left(\frac{1}{100}\right)^{n-m_x} \frac{1}{96}}{\sum_{x'=1}^{96}\left[ \left(\frac{1}{20}\right)^{m_{x'}} \left(\frac{1}{100}\right)^{n-m_{x'}}\right] \frac{1}{96}}$$
So, basically, its hard to know a priori how many rolls because it depends on how many "repeats" on all sides comes up. But, you can easily compute the probability that $X=x$ where $x$ is any side you want and stop after enough rolls.
Let's agree that the first side has $5\%$ chance, and remaining $95$ sides each have $1\%$ of occurrence. After $n$ rolls, let $N_k$ denote the random variable for the number of occurrences of $k$-th side. Then the random vector $(N_1,\ldots, N_{96})$, subject to $\sum_{k=1}^{96} N_k = n$, follows multi-nomial distribution.
I would interpret your question as a quest to determine, for $p > \frac{1}{2}$: $$ n_\mathrm{min} = \operatorname{arg min}_n \mathbb{P}(N_1 > N_2 \land N_1 > N_3 \land \ldots \land N_1 > N_{96} ) > p $$
This probability equals: $$ \begin{eqnarray} \mathbb{P}\left( \land_{k=2}^{96} N_1 > N_k \right) &=& \sum_{n_1=1}^n \mathbb{P}\left( \land_{k=2}^{96} N_k \le n_1 -1 ; N_1 = n_1 \right) \mathbb{P}\left( N_1 = n_1 \right) \\ &=& \sum_{k=0}^{n-1} \left( F\left( k+1,k,\ldots,k \right) - F\left( k,k,\ldots,k \right) \right) \end{eqnarray} $$ where $F(n_1,\ldots,n_k,\ldots,n_{96}) = \mathbb{P}\left( N_1 \le n_1, \ldots, N_k \le n_k,\ldots, N_{96} \le n_{96} \right)$.
Even when normal or Poisson approximation to multinomial cumulative distribution function is applicable, it does not simplify matter very much.
As an approximation $F$ can be replaced with product of marginal cumulative distribution functions, since correlation coefficients are on the order of $0.01$, i.e. small: $$ \mathbb{P}\left( \land_{k=2}^{96} N_1 > N_k \right) = \sum_{k=0}^{n-1} f_{\mathrm{Bi}\left(n, \frac{1}{20}\right)}(k+1) \left( F_{\mathrm{Bi}\left(n, \frac{1}{95}\right)}(k) \right)^{95} $$
I ran simulation, and plotted the probability $\mathbb{P}(\land_{k=2}^{96} N_k < N_1)$ as a function of $n$, and compared to this approximation, the agreement is rather good: