Math Help Forum

Math Help Forum Feed Site Feed

Go Back   Math Help Forum > University Math Help > Advanced Probability and Statistics
Reply
 
Thread Tools Display Modes
  #1  
Old July 17th, 2008, 11:30 AM
Newbie
 
Join Date: Jul 2008
Posts: 4
Country:
Thanks: 2
Thanked 0 Times in 0 Posts
crobbo52 is on a distinguished road
Default Birthday coincidences

Only a gathering of 23 people are needed to to make the probability of two of them having the same birthday greater than 50%. This is a well known and surprising result.

In deriving the expression for probability of birthday coincidence in a group of n people, the approach seems always to consider the probability of non-coincidence:

p 364/365 x 363/365 x 362/365 ......x (365-n)/365

n 1 2 3 n

and subtract this from 1 to find p' (coincidence).

Is there a way of deriving the formula by considering the probability of coincidence directly and not having to go via the non-coincidence route?

Chris R
Reply With Quote
Advertisement
 
  #2  
Old July 17th, 2008, 12:53 PM
Member
 
Join Date: Jul 2008
Posts: 138
Country:
Thanks: 26
Thanked 59 Times in 49 Posts
meymathis will become famous soon enough
Default

Yes, but the only way I can think of is much harder to the point of silliness when you have a nice easy way of doing it. This is because it is easy to describe the non-event. There are many many ways to have the event as n gets large.

To use card game naming: You could have just 1 pair with the same birthday, 2 pairs, 3 pairs all the way up to \text{floor}(n/2). Then you could have 1 3-of-a-kind, 1 3-of-a-kind plus 1 pair, ....

You would have to add up all the probabilities for the different events, being very careful to never allow the events to overlap.

This is horribly nasty. What's the matter with computing the complement of the event?

There may be some other way to prove it but the standard way is the only general way that I have ever seen (including some quick googleing).

Last edited by meymathis; July 17th, 2008 at 06:02 PM.
Reply With Quote
  #3  
Old July 18th, 2008, 08:36 AM
Newbie
 
Join Date: Jul 2008
Posts: 4
Country:
Thanks: 2
Thanked 0 Times in 0 Posts
crobbo52 is on a distinguished road
Default Birthday coincidences

Thank you very much for your swift reply.

My reason for putting the question is primarily a teaching or explanatory one, not one of computation.

Yes I Googled Birthday coincidence too, and noticed everyone dived in straight away looking at non-coincidence.

The problem is when you are trying to explain why the probability of coincidence is much higher than you might think at first sight, to people who may not be too strong at maths, or may have forgotten stuff. To go straight to the complementary case without saying why is going to be puzzling.

As you helpfully point out (which is what I had suspected), to approach the problem head on is complex and that it is simpler to go at it from the complement. Some sort of sentence to this effect will certainly help the reader.

As an explanation, I would still prefer a direct approach, such as:

Select one person in the group of n. The probability that another chosen person has the same birthday is 1/365. The number of possible pairings of people in the group is

n/2 (n-1). Therefore the probability that there is a coincidence among the group is

1/365 + 1/365 + .... for each pairing, total prob = n/2 (n-1) /365.

I know this is flawed but is there some first approximation approach that would be easy to follow, and preferably sound?

I have another question in a similar vein concerning coincidence in a derangement, and the formula containing e. It has been a while since I looked at it, so I will take a day or to refresh, and submit presumably under stats. I would be grateful for help there, as well.

Chris Robinson
Reply With Quote
  #4  
Old July 18th, 2008, 12:32 PM
Member
 
Join Date: Jul 2008
Posts: 138
Country:
Thanks: 26
Thanked 59 Times in 49 Posts
meymathis will become famous soon enough
Default

Well, I don't think it is puzzling, but probability was my field of study. If I were teaching it to somebody I would explain that there are a lot of different ways for the event "at least 2 people have the same birthday" to occur. But the complimentary event "no two people have the same birthday" is much easier to describe.

If there was an easier way to get the probability I would agree that the constructive approach, rather than the complimentary, might be better, but when you actually look at the constructive approach you'll find that it is actually much harder.

I think complimentary approaches are a necessary part of doing probability and it can make computations significantly easier. It is a standard thing to think about. If I need to calculate a probability of an event, I look to see if the compliment of that event is easier to calculate.

I can think of 2 ways to approximate it. The first is the plain old constructive approach, but only try to calculate the most likely parts of the event. For example if n is small, mostly you should just have some number of pairs of data (not many 3-of-a-kinds). Further, most of the probability of the event will be in the lower number of pairs. So now, this gets rid of much of the nastiness, but as you'll see, its not a bucket of roses either.

Start with the probability for 1 pair. This is actually not necessarily all that easy.
\Pr(\text{number of pairs is exactly }1\text{ with everyone else having different birthdays})

Let's count by saying the first and the second have the same birthday. And the rest are distinct. And then multiply by n \choose 2 to factor in the fact that any 2 of these people could have the same birthday.

=\frac{{n \choose 2}365\cdot 1 \cdot 364 \cdots (365-n+2)}{365^n}

Now lets do 2 pairs. This is harder. First count how many days could have the 2 pair.
{365 \choose 2} (order doesn't matter}
Given that, how many ways are there of assigning the 2 pairs to the set of people?
{n \choose 2}{n-2 \choose 2} (again, order per birthday doesn't matter)
Given that, how many ways are there of filling out the rest of the birthdays?
\mathbb{P}^{365-2}_{n-4} (here order matters)

Ok so the probability is:
\frac{{365 \choose 2}{n \choose 2} {n-2 \choose 2}\mathbb{P}^{365-2}_{n-4}}{365^n} = \frac{{n \choose 2}{n-2 \choose 2}\frac{1}{2!}365\cdots (365-n+3)}{365^n}

Let's go for 3 pairs! This last way of breaking it down I think will work in general:
\frac{{365 \choose 3}{n \choose 2}{n-2 \choose 2} {n-4 \choose 2} \mathbb{P}^{365-3}_{n-6}}{365^n}

Do you see my point? It may look more straightforward to construct. But I don't think it is.

The only other approach that you could take could give you intuition into the problem, but it is not really the right thing. It would work if medians and means were the same thing. It requires expected values of dependent random variables and I do have to make use of complimentary events again.

Let X_i be the number of people born on day i (1-365). Note that X_i has a binomial distribution with probability of success 1/365.

Now let Y_i be 1 if X_i\geq 2 and 0 otherwise. Since X_i has a binomial distribution, we can calculate
\Pr(Y_i=1)=\Pr(X_i\geq 2)=1-(\Pr(X_i=0)+\Pr(X_i=1))
=1-\left(\left(\frac{364}{365}\right)^n+n\left(\frac{1}{365}\right) \left(\frac{364}{365}\right)^{n-1}\right)

Then you can ask what is the expected value of \sum_{i=1}^{365}Y_i

Note that the Y_i are very dependent, but that doesn't change the fact that the expectation of the sum is the sum of the expectations.

\mathbf{E}\left[\sum_{i=1}^{365}Y_i\right] = \sum_{i=1}^{365}\mathbf{E}[Y_i] = \sum_{i=1}^{365}1\cdot \left(1-\left(\left(\frac{364}{365}\right)^n+n\left(\frac{1}{365}\right) \left(\frac{364}{365}\right)^{n-1}\right)\right)
=n\left(1-\left(\left(\frac{364}{365}\right)^n+n\left(\frac{1}{365}\right) \left(\frac{364}{365}\right)^{n-1}\right)\right)

Then you can plot this as a function of n and see where it goes above 1. It turns out, if I did everything right, that it goes above 1 at n=19. This is giving you the flavor of the issue but it is not the same thing as saying for what n does the probability of having 2 or more people born on the same day go above 50%. You could ask what is \Pr\left(\sum_{i=1}^{365}Y_i\geq 1\right) but this is hard because of the dependence of the Y_i's and so we're back to where we started.

Reply With Quote
The Following 2 Users Say Thank You to meymathis For This Useful Post:
Donate to MHF
  #5  
Old July 18th, 2008, 12:47 PM
Member
 
Join Date: Jul 2008
Posts: 138
Country:
Thanks: 26
Thanked 59 Times in 49 Posts
meymathis will become famous soon enough
Default

By the way, I would like to give a big shout out to Awkward that showed the "bus stop" technique (using expectations of dependent RV's).

http://www.mathhelpforum.com/math-he...627-post3.html

It just goes to show you that an old dog can learn new tricks.
Reply With Quote
The following users thank meymathis for this useful post:
Donate to MHF
  #6  
Old July 18th, 2008, 05:59 PM
Senior Member
 
Join Date: Mar 2008
Posts: 448
Country:
Thanks: 20
Thanked 202 Times in 169 Posts
awkward is just really niceawkward is just really niceawkward is just really niceawkward is just really nice
Default

Quote:
Originally Posted by meymathis View Post
By the way, I would like to give a big shout out to Awkward that showed the "bus stop" technique (using expectations of dependent RV's).

http://www.mathhelpforum.com/math-he...627-post3.html

It just goes to show you that an old dog can learn new tricks.
Ahem. I heard that! Congratulations!

Anyway, here is another way to approximate the birthday probabilities which I think can be found in Ross, "A First Course in Probability", although I'm not sure because I don't have my copy here.

Suppose we have n people in the room, so there are \binom{n}{2} pairs. For any given pair, the probability that they have the same birthday is 1/365, so the mean number of matching pairs is \mu = \binom{n}{2}/365. So far everything has been exact and we haven't made any approximation. But now let's say the pairs are "approximately independent"; if so, then we can argue that the total number of matching pairs is approximately Poisson with mean \mu.

Then for n = 23 we have \mu = 0.6932, and for a Poisson distribution we have P(0) = 0.5000, so the approximation turns out to be pretty good.

Last edited by awkward; July 18th, 2008 at 07:15 PM. Reason: Fixed a typo
Reply With Quote
The Following 2 Users Say Thank You to awkward For This Useful Post:
Donate to MHF
Reply

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are Off
Pingbacks are Off
Refbacks are Off
Forum Jump


All times are GMT -7. The time now is 04:22 AM.


Powered by vBulletin® Version 3.7.3
Copyright ©2000 - 2009, Jelsoft Enterprises Ltd.
SEO by vBSEO 3.2.0 ©2008, Crawlability, Inc.
©2005 - 2009 Math Help Forum


Math Help Forum is a community of maths forums with an emphasis on maths help in all levels of mathematics.
Register to post your math questions or just hang out and try some of our math games or visit the arcade.