How to determine if coin comes up heads more often than tails?
Not a math student, so forgive me if the question seems trivial or if I pose it "wrong". Here goes...
Say I'm flipping a coin n times. I am not sure if it's a "fair" coin, meaning I am not sure if it will come up heads and tails each with a probability of exactly 0.5. Now, if after n throws it has come up heads exactly as many times as it has come up tails, then obviously there's nothing to indicate that the coin is not fair. But my intuition tells me that it would be improbable even for a completely fair coin to come up heads and tails exactly the same number of times given a large number of tosses. My question is this: How "off" should the result be for it to be probable that the coin is not fair? IOW, how many more tosses should come up heads rather than tails in a series of n throws before I should assume the coin is weighted?
Update
Someone mentioned Pearson's chi-square test but then for some reason deleted their answer. Can someone confirm if that is indeed the right place to look for the answer?
Solution 1:
Given your prefatory comment, I'm going to avoid talking about the normal curve and the associated variables and use as much straight probability as possible.
Let's do a side problem first. If on an A-D multiple-choice test you guess randomly, what's the probability you get exactly 8 out of 10 questions right?
Each problem you have a 25% (.25) chance of getting right and a 75% (.75) chance of getting wrong.
You want to first choose which eight problems you get right. That can be done in 10 choose 8 ways.
You want .25 to happen eight times [$(.25)^8$] and .75 to happen twice [$(.75)^2$]. This needs to be multiplied by the number of possible ways to choose the eight correct problems, hence your probability of getting 8 out of 10 right is
${10 \choose{8}}(.25)^8(.75)^2$
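As a quick check of the arithmetic, here is a minimal Python sketch of that product. The function name `binomial_pmf` is just an illustrative choice, not something from the answer above.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent tries,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Guessing on a 10-question A-D test: exactly 8 right
prob_8_of_10 = binomial_pmf(8, 10, 0.25)
print(prob_8_of_10)  # roughly 0.000386
```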
Ok, so let's say you throw a coin 3000 times. What's the probability that it comes up heads exactly 300 times? By the same logic as the above problem that would be
${3000 \choose{300}}(.5)^{300}(.5)^{2700}$
or a rather unlikely $6.92379\ldots \times 10^{-482}$.
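A number this small is below what an ordinary floating-point value can hold, so a direct product would just print 0. One way around that, sketched here in Python under the assumption that working with logarithms is acceptable, is to compute $\log_{10}$ of the probability and split it into mantissa and exponent. The helper name `log10_binomial_pmf` is made up for this illustration.

```python
from math import lgamma, log, floor

def log10_binomial_pmf(k, n, p):
    """log10 of C(n, k) * p^k * (1-p)^(n-k), computed via log-gamma
    so that very small probabilities do not underflow to zero."""
    log_comb = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    natural_log = log_comb + k * log(p) + (n - k) * log(1 - p)
    return natural_log / log(10)

log10_prob = log10_binomial_pmf(300, 3000, 0.5)
exponent = floor(log10_prob)               # power of ten
mantissa = 10 ** (log10_prob - exponent)   # leading digits
print(f"{mantissa:.5f} x 10^{exponent}")
```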
Given throwing the coin n times, the probability it comes up heads x times is
${n \choose{x}}(.5)^n$
or, if you want to ask the probability it comes up heads x times or fewer,
$\sum_{i=0}^{x}{{n \choose{i}}(.5)^n}$
so all you have to do now is decide how unlikely a result you are willing to accept.
(This is a binomial probability, if you want to read more; all the fancier methods involving an integral under the normal curve and whatnot start with this concept.)
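The cumulative sum above can be evaluated exactly with integer arithmetic. The following is a small Python sketch for a fair coin; `prob_at_most` is a hypothetical name, and the 100-toss example is only an illustration.

```python
from math import comb

def prob_at_most(x, n):
    """P(heads <= x) in n tosses of a fair coin.
    The binomial coefficients are summed as exact integers
    and divided by 2^n only at the end."""
    favourable = sum(comb(n, i) for i in range(x + 1))
    return favourable / 2**n

# How unlikely is 40 or fewer heads in 100 tosses of a fair coin?
p_low = prob_at_most(40, 100)
print(p_low)
```

If the printed probability is smaller than the threshold you decided on, the observed count is "too unlikely" for a fair coin by your own standard.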
Solution 2:
I am surprised that no one has mentioned hypothesis testing so far. Hypothesis testing lets you decide, at a chosen level of significance, whether you have sufficient evidence to reject the underlying (Null) hypothesis, or whether you do not have sufficient evidence against the Null Hypothesis and hence retain ("accept") it.
I explain the procedure below assuming that you want to determine if a coin comes up heads more often than tails. If you instead want to determine whether the coin is biased or unbiased, the same procedure applies, except that you need a two-sided test rather than a one-sided test.
In this question, your Null hypothesis is $p \leq 0.5$ while your Alternate hypothesis is $p > 0.5$, where $p$ is the probability that the coin shows up a head. Say now you want to perform your hypothesis testing at $10\%$ level of significance. Proceed as follows:
Let $n_H$ be the number of heads observed out of a total of $n$ tosses of the coin.
Take $p=0.5$ (the extreme case of the Null Hypothesis). Let $x \sim B(n,0.5)$.
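Compute $n_H^c$ as follows. Because $x$ is discrete, the tail probability usually cannot equal $0.1$ exactly, so take the smallest integer $n_H^c$ satisfying
$$P(x \geq n_H^c) \leq 0.1$$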
$n_H^c$ gives you the critical value beyond which you have sufficient evidence to reject the Null Hypothesis at $10\%$ level of significance.
i.e. if you find $n_H \geq n_H^c$, then you have sufficient evidence to reject the Null Hypothesis at $10\%$ level of significance and conclude that the coin comes up heads more often than tails.
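The one-sided procedure can be sketched in Python as below. This is only an illustration under the assumption that, since the binomial distribution is discrete, the critical value is taken as the smallest count whose upper tail probability does not exceed the significance level. The name `upper_critical_value` and the sample numbers are hypothetical.

```python
from math import comb

def upper_critical_value(n, alpha):
    """Smallest c such that P(x >= c) <= alpha for x ~ B(n, 0.5).
    Returns n + 1 if even n heads out of n would not be significant."""
    total = 2**n
    tail = 0  # number of outcomes with at least c heads
    for c in range(n, -1, -1):
        tail += comb(n, c)
        if tail / total > alpha:
            return c + 1  # the previous c was the last one within alpha
    return 0

n, n_heads = 100, 60              # assumed sample data
critical = upper_critical_value(n, 0.10)
reject_null = n_heads >= critical
print(critical, reject_null)
```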
If you want to determine whether the coin is unbiased, you need a two-sided test, as follows.
Your Null hypothesis is $p = 0.5$ while your Alternate hypothesis is $p \neq 0.5$, where $p$ is the probability that the coin shows up a head. Say now you want to perform your hypothesis testing at $10\%$ level of significance. Proceed as follows:
Let $n_H$ be the number of heads observed out of a total of $n$ tosses of the coin.
Let $x \sim B(n,0.5)$.
Compute $n_H^{c_1}$ and $n_H^{c_2}$ as follows. Because $x$ is discrete, take the pair closest to $\frac{n}{2}$ satisfying
$$P(x \leq n_H^{c_1}) + P(x \geq n_H^{c_2}) \leq 0.1$$
($n_H^{c_1}$ and $n_H^{c_2}$ are symmetric about $\frac{n}{2}$, i.e. $n_H^{c_1} + n_H^{c_2} = n$)
$n_H^{c_1}$ gives you the left critical value and $n_H^{c_2}$ gives you the right critical value.
If you find $n_H \in (n_H^{c_1},n_H^{c_2})$, then you do not have sufficient evidence against the Null Hypothesis at the $10\%$ level of significance, and hence you retain ("accept") the hypothesis that the coin is fair. Otherwise, you reject it and conclude that the coin is biased.
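The two-sided version can be sketched the same way. The code below assumes the symmetric choice $n_H^{c_1} + n_H^{c_2} = n$ described above and, because the distribution is discrete, picks the innermost pair whose combined tail probability does not exceed the significance level. The function name is hypothetical.

```python
from math import comb

def two_sided_critical_values(n, alpha):
    """Return (c1, c2) with c1 + c2 = n such that
    P(x <= c1) + P(x >= c2) <= alpha for x ~ B(n, 0.5),
    with c1 and c2 as close to n/2 as that allows."""
    total = 2**n
    tail = 0  # number of outcomes with at least c2 heads
    for c2 in range(n, -1, -1):
        tail += comb(n, c2)
        # By symmetry the lower tail has the same probability as the upper one
        if 2 * tail / total > alpha:
            return n - (c2 + 1), c2 + 1
    return n, 0

c1, c2 = two_sided_critical_values(100, 0.10)
n_heads = 55                          # assumed observation
retain_fair = c1 < n_heads < c2       # inside the open interval (c1, c2)
print(c1, c2, retain_fair)
```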
Solution 3:
This question is one of statistics (esp. statistical inference), not probability per se. Keywords: "binomial sampling"; "confidence interval for a proportion". Asking this at http://stats.stackexchange.com will get more complete answers.
The related probability fact is: if a coin has probability $p$ of coming up heads and is tossed $n$ times, then the observed number of heads will on average be $np$, but we expect the observed number to fluctuate (if the experiment with $n$ tosses is repeated many times) around the average by an amount on the order of $\sqrt{np(1-p)}$. Keywords: normal distribution, bell curve, Central Limit Theorem, binomial distribution, convergence of binomial distribution to normal (Gaussian) distribution.
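To make the size of the fluctuation concrete, here is a small Python sketch that expresses an observed head count as a number of such "typical fluctuations" away from $np$. The function name and the sample numbers are assumptions for illustration only.

```python
from math import sqrt

def heads_z_score(heads, n, p=0.5):
    """Distance of the observed head count from the mean n*p,
    measured in units of sqrt(n*p*(1-p))."""
    mean = n * p
    spread = sqrt(n * p * (1 - p))
    return (heads - mean) / spread

# 10000 tosses of a supposedly fair coin: mean 5000, spread 50
z = heads_z_score(5100, 10000)
print(z)  # 5100 heads is two spreads above the mean
```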
Solution 4:
This is a Bayesian probability problem. You can never know for certain whether or not the coin is fair because, as you point out yourself, even if the first n flips come up heads and tails equally often, you would merely have "nothing to indicate [yet] that the coin is unfair".
You can calculate the probability that the coin's probability of heads lies, say, between 49% and 51% by evaluating the following:
$I_{0.51}(h+\frac12,t+\frac12) - I_{0.49}(h+\frac12,t+\frac12)$
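where $I$ is the regularized incomplete beta function, $h$ is the number of heads observed and $t$ is the number of tails observed.
This follows from the definition of the Beta distribution as the conjugate prior of the Bernoulli distribution and the definition of the regularized incomplete beta function as its cdf. The $\frac12$ terms correspond to starting from a $\mathrm{Beta}(\frac12,\frac12)$ (Jeffreys) prior.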
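The difference of the two $I$ values is the integral of the $\mathrm{Beta}(h+\frac12, t+\frac12)$ density between the two bounds. The Python sketch below approximates that integral with Simpson's rule using only the standard library; it is a numerical stand-in for the incomplete beta function, not an exact evaluation, and the function name, step count and sample counts are assumptions.

```python
from math import lgamma, log, exp

def posterior_prob_between(h, t, lo, hi, steps=2000):
    """Approximate P(lo < p < hi) when p ~ Beta(h + 1/2, t + 1/2),
    by Simpson's rule. Requires 0 < lo < hi < 1 and an even steps."""
    a, b = h + 0.5, t + 0.5
    # log of the normalising constant 1 / B(a, b)
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)

    def density(x):
        return exp(log_norm + (a - 1) * log(x) + (b - 1) * log(1 - x))

    width = (hi - lo) / steps
    total = density(lo) + density(hi)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * density(lo + i * width)
    return total * width / 3

# Assumed data: 5000 heads and 5000 tails
prob_near_fair = posterior_prob_between(5000, 5000, 0.49, 0.51)
print(prob_near_fair)
```

With fewer tosses the same interval receives much less posterior probability, which matches the remark that the evidence only accumulates gradually.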