Suppose that three out of your 402 Facebook friends have birthdays today. What’s the probability of that happening? It’s not enough to just take (1/365)^3, because that doesn’t take into account how big your sample of Facebook friends is. To find out, we need the binomial distribution: http://en.wikipedia.org/wiki/Binomial_distribution (the probability we need is the formula just after the table of contents, with n = 402, k = 3, and p = 1/365). Plug the numbers in and you get 402!/(6 × 399!) × 1/48627125 × (364/365)^399, where 6 = 3! and 48627125 = 365^3. How on earth does one calculate this by hand? I got stuck. But then my other half (who is a mathematician and who told me I needed the binomial distribution in the first place) wrote and emailed me a little Python program to calculate this. A mathematical/programming gift. Better than chocolates! And the probability is 0.0739604817154 – or just a bit more than 7%. That’s quite rare indeed, given that with 402 friends we have more people in need of birthdays than there are spare days in the year.
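For the curious, here is a minimal sketch of how such a calculation could look in Python. It is not the original program, just my guess at an equivalent; the function name `binomial_pmf` and the variable names are made up for illustration, and it assumes every birthday is equally likely on each of 365 days (no leap years).

```python
from math import comb  # exact binomial coefficient, available in Python 3.8+


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n_friends = 402       # number of Facebook friends
k_birthdays = 3       # friends with a birthday today
p_birthday = 1 / 365  # assumption: uniform birthdays, leap years ignored

prob = binomial_pmf(k_birthdays, n_friends, p_birthday)
print(f"P(exactly {k_birthdays} birthdays today) = {prob:.4f}")
```

Using `comb` avoids ever computing the huge factorials 402! and 399! directly, which is what makes the by-hand version so painful.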
P.S. Actually, this formula is for sampling with replacement, where each draw is independent of the others – but in this case we sample one friend at a time and then discard the name out of the pile, so there is one name less each time – which means that the draws are not independent. So we actually get a hypergeometric distribution, instead of a binomial one. However, Wikipedia claims that “for N much larger than n, the binomial distribution is a good approximation, and widely used”, so I’m happy with that.
P.P.S. In the first draft I used “likelihood” as a synonym for “probability”. As you see in this one, that was WRONG. Dammit, I’m such an imprecise social scientist.