August 25th, 2015 at 9:19:50 AM
permalink
This problem has been bugging me…
I have a standard 52-card deck.
I select one card at random and then put it back.
Repeat 100 times.
What is the expected number of times that the most occurring card will be selected?
By simulation, I’m getting about 5.665.
I am able to do the problem with smaller numbers (for example, rolling a dice four times). But I am having difficulties with the larger scale. It’s bugging me to no end—it feels like a more general solution should be easy. Any ideas?
I have a standard 52-card deck.
I select one card at random and then put it back.
Repeat 100 times.
What is the expected number of times that the most occurring card will be selected?
By simulation, I’m getting about 5.665.
I am able to do the problem with smaller numbers (for example, rolling a dice four times). But I am having difficulties with the larger scale. It’s bugging me to no end—it feels like a more general solution should be easy. Any ideas?
August 26th, 2015 at 1:58:17 AM
permalink
I got an approximate calculation of 5.660 by treating the numbers of times each card is drawn as independent Poisson random variables of rate 100/52. Then the cumulative distribution of the maximum becomes the cumulative Poisson distribution raised to the power of 52. From that you can take the finite difference to get the pmf of the maximum, and then use that to compute the expected value.
> k = 0:100
> F = ppois(k,100/52)^52
> f = c(F[1],diff(F))
> sum(k*f)
[1] 5.660232
If you use the cumulative binomial instead of the cumulative Poisson, the result is not quite as accurate at about 5.60. These are approximate of course because the numbers of each card are not completely independent.
If you had asked about the average maximum at the point where we had gotten all 52 cards at least once as in the coupon collector problem, that could be answered exactly using the trick of transforming the problem to a continuous time Poisson process as discussed here earlier. But in the current problem, the constraint that the number of cards must sum to 100 would cause the separate Poisson processes to not be independent.
Since we know that the the maximum M must be at least 2 from the pigeon hole principle, we can write the average as
E( M) = 2 + sum{k = 3 to 100} P(M ≥ k)
But the evaluation of P(M ≥ k) would require counting the number of ways to put 100 distinguishable balls in 52 numbered bins with at least k balls in at least 1 bin.
The second answer on this page gives an exact general formula for your case, but you would need to expand the expression.
> k = 0:100
> F = ppois(k,100/52)^52
> f = c(F[1],diff(F))
> sum(k*f)
[1] 5.660232
If you use the cumulative binomial instead of the cumulative Poisson, the result is not quite as accurate at about 5.60. These are approximate of course because the numbers of each card are not completely independent.
If you had asked about the average maximum at the point where we had gotten all 52 cards at least once as in the coupon collector problem, that could be answered exactly using the trick of transforming the problem to a continuous time Poisson process as discussed here earlier. But in the current problem, the constraint that the number of cards must sum to 100 would cause the separate Poisson processes to not be independent.
Since we know that the the maximum M must be at least 2 from the pigeon hole principle, we can write the average as
E( M) = 2 + sum{k = 3 to 100} P(M ≥ k)
But the evaluation of P(M ≥ k) would require counting the number of ways to put 100 distinguishable balls in 52 numbered bins with at least k balls in at least 1 bin.
The second answer on this page gives an exact general formula for your case, but you would need to expand the expression.