Quote:Gabes22The only problem I see in the McDonald's problem is don't they "release" the different toys at different dates. i.e. releasing toy A on June 1st and toy B on June 8th and so on and so forth, They might do it in waves to like release A, B and C on one date then D, E and F on another and so on and so forth. Knowing that information may drastically change the odds.

I have seen them do that, yes.

the simple answer is, of course, just less than 2x that of collecting at least 1 each...Quote:Ace2Has anyone ever worked the coupon collector problem, but with 2 hits required? So, for instance, what is the expected number of roulette spins for every number to have appeared at least twice ?

I have some help with this question now,

as the pdf I was using had the transition matrix populated correctly

but how to do it was by far not correct. (the main diagonal is easy using a helper column, the others are elusive right now)

I found a formula for this

in this pdf on page 14 that looks good

"By termwise integration it is easy to obtain the following exact

formula, which is a finite sum, for T2,

the average number of trials needed to collect

at least two of each of the d kinds of coupons:"

http://www.brynmawr.edu/math/people/anmyers/PAPERS/SIGEST_Coupons.pdf

but do not understand where and what m and j is (I think m=2 and j=number of states) in the formula they arrive at

(the results from the formula are correct so far)

but I never took calculus and the calculus to inclusion exclusion conversion

lost me

at this point in time

Sally

Quote:mustangsallyI found a formula for this in this pdf on page 14 that looks good

"By termwise integration it is easy to obtain the following exact formula, which is a finite sum, for T2, the average number of trials needed to collect

at least two of each of the d kinds of coupons:"

http://www.brynmawr.edu/math/people/anmyers/PAPERS/SIGEST_Coupons.pdf

but do not understand where and what m and j is (I think m=2 and j=number of states) in the formula they arrive at

(the results from the formula are correct so far)

I think m and j under the sigma sign usually means "for all nonnegative integers m and j," but in this case, it appears to be 0 <= j <= m <= d - 1, as otherwise (d-1)C(m) and (m)C(j) are undefined.

However, while I get results of 2, 11/2, 347/36, and 11259/864, I get them for d = 1 through 4 instead of 2 through 5, which makes sense, as if d = 1, it will always take 2 draws to get "each" coupon twice, while for d = 2, getting all of the coupons in a total of 2 draws happens only half of the time.

ok, I have time now to look at this again.Quote:ThatDonGuyI think m and j under the sigma sign usually means "for all nonnegative integers m and j," but in this case, it appears to be 0 <= j <= m <= d - 1, as otherwise (d-1)C(m) and (m)C(j) are undefined.

nice observation

what values did you use for m and j?Quote:ThatDonGuyHowever, while I get results of 2, 11/2, 347/36, and 11259/864,

I saw that too. I think that is a typo as the rows of results below that shows the correct values 1:4Quote:ThatDonGuyI get them for d = 1 through 4 instead of 2 through 5, <snip>

For the 38 numbers on a 00 Roulette wheel,

Excel says (once I was shown how to populate the transition matrix fast)

mean spins = 234.8326629 for 'at least 2' of each number

that is just over a 40% increase for the 'at least 1' value.

(time for a fast simulation)

now, to see if I can too get the formula to work (from the pdf I linked to)

The Markov chain method may take some time to get into R or even Pari GP

I want the distribution (as always)

Sally

Here is a small table of data so farQuote:Ace2Based on what I’ve seen the limit could be the natural log of 2. So if the average is about 7,485 tries to get all of them once, it would be that divided by 0.693 = 10,800 to get all of them twice. Just a guess.

A | B | C | D | E | F |
---|---|---|---|---|---|

coupons | at least 1 | at least 2 | B/C | C-B | E/B |

1 | 1 | 2 | 0.500 | 1 | 1 |

2 | 3 | 5.5 | 0.545 | 2.5 | 0.833333333 |

3 | 5.5 | 9.6389 | 0.571 | 4.1389 | 0.752527273 |

4 | 8.3333 | 14.189 | 0.587 | 5.8557 | 0.702686811 |

5 | 11.417 | 19.041 | 0.600 | 7.624 | 0.667776123 |

6 | 14.7 | 24.134 | 0.609 | 9.434 | 0.641768707 |

7 | 18.15 | 29.425 | 0.617 | 11.275 | 0.621212121 |

8 | 21.743 | 34.885 | 0.623 | 13.142 | 0.604424412 |

9 | 25.46 | 40.492 | 0.629 | 15.032 | 0.590416339 |

10 | 29.29 | 46.23 | 0.634 | 16.94 | 0.578354387 |

38 | 160.66 | 234.833 | 0.684 | 74.173 | 0.461676833 |

Sally

extended the data as I finished the R code for the Markov chain solution (my computer is so small and old as 8GB ram is all)Quote:mustangsallyHere is a small table of data so far

coupons | at least 1 | at least 2 | B/C | C-B | E/B |
---|---|---|---|---|---|

70 | 338.299 | 481.7105 | 0.702 | 143.4115 | 0.423919373 |

80 | 397.238 | 562.7222 | 0.706 | 165.4842 | 0.416587033 |

90 | 457.431 | 645.1401 | 0.709 | 187.7091 | 0.410355004 |

100 | 518.7378 | 728.8052 | 0.712 | 210.0674 | 0.404958729 |

110 | 581.046 | 813.5907 | 0.714 | 232.5447 | 0.400217367 |

120 | 644.264 | 899.393 | 0.716 | 255.129 | 0.396000708 |

125 | 676.1905 | 942.648 | 0.717 | 266.4575 | 0.394056852 |

130 | 708.317 | 986.1261 | 0.718 | 277.8091 | 0.392210126 |

140 | 773.14 | 1073.717 | 0.720 | 300.577 | 0.388774349 |

200 | 1175.606 | 1614.313 | 0.728 | 438.394 | 0.372908951 |

300 | 1884.799 | 2557.884 | 0.737 | 673.085 | 0.35711235 |

400 | 2627.972 | 3538.97 | 0.743 | 911 | 0.346654378 |

to see what the 'at least 1' and 'at least 2' look like together

Oh, they look to be going apart

ok,

math is done 4 today

Happy 4th of July 2018 USA!

Sally

Quote:mustangsallywhat values did you use for m and j?

For d = 3, I used the (m,j) pairs (0,0), (0,1), (0,2), (1,1), (1,2), and (2,2).

For d = 4, I used the (m,j) pairs (0,0), (0,1), (0,2), (1,1), (1,2), (2,2), (0,3), (1,3), (2,3), and (3,3).

In other words, replace the Sigma with "m,j" at the bottom with a Sigma with "m=0" at the bottom and "d-1" at the top, followed by another Sigma with "j=0" at the bottom and "m" at the top.

thank you.Quote:ThatDonGuyIn other words, replace the Sigma with "m,j" at the bottom with a Sigma with "m=0" at the bottom and "d-1" at the top, followed by another Sigma with "j=0" at the bottom and "m" at the top.

I found another pdf that shows what m and j are

problem I see is this

(j + 2)!

in the formula

Excel does 170! and below but requires the gamma() for higher factorials

R takes too long over that (and precision suffers)

I used a recursion method in R to make this

coupons | at least 1 | at least 2 | at least 3 | at least 4 | at least 5 |
---|---|---|---|---|---|

1 | 1 | 2 | 3 | 4 | 5 |

2 | 3 | 5.5 | 7.875 | 10.1875 | 12.46094 |

3 | 5.5 | 9.6389 | 13.48688 | 17.19202 | 20.80842 |

4 | 8.3333 | 14.189 | 19.56591 | 24.71023 | 29.71024 |

5 | 11.417 | 19.041 | 25.98684 | 32.60279 | 39.01481 |

6 | 14.7 | 24.13387 | 32.67716 | 40.78943 | 48.63511 |

7 | 18.15 | 29.425 | 39.58954 | 49.21796 | 58.51471 |

8 | 21.743 | 34.885 | 46.69068 | 57.85175 | 68.61409 |

9 | 25.46 | 40.492 | 53.95589 | 66.6637 | 78.90403 |

10 | 29.29 | 46.23 | 61.36614 | 75.63296 | 89.36209 |

11 | 33.21865 | 52.08425 | 68.9063 | 84.74299 | 99.97046 |

12 | 37.23853 | 58.04503 | 76.5641 | 93.98035 | 110.7147 |

13 | 41.34174 | 64.10284 | 84.32933 | 103.3339 | 121.5829 |

14 | 45.52187 | 70.25003 | 92.19342 | 112.7944 | 132.5651 |

15 | 49.77343 | 76.48005 | 100.149 | 122.3537 | 143.6525 |

16 | 54.09166 | 82.78725 | 108.1899 | 132.005 | 154.838 |

17 | 58.47239 | 89.1667 | 116.3104 | 141.7422 | 166.115 |

18 | 62.91195 | 95.61405 | 124.5058 | 151.5602 | 177.4778 |

19 | 67.40705 | 102.1254 | 132.7717 | 161.4543 | 188.9216 |

20 | 71.955 | 108.6974 | 141.1043 | 171.4202 | 200.4418 |

21 | 76.55253 | 115.3269 | 149.5002 | 181.4542 | . |

22 | 81.19789 | 122.0111 | 157.9563 | 191.553 | . |

23 | 85.8887 | 128.7475 | 166.4696 | 201.7135 | . |

24 | 90.623 | 135.5337 | 175.0378 | 211.9329 | . |

25 | 95.39895 | 142.3677 | 183.6583 | 222.2086 | . |

26 | 100.2149 | 149.2474 | 192.329 | 232.5384 | . |

27 | 105.0693 | 156.1711 | 201.0481 | 242.92 | . |

28 | 109.9608 | 163.1372 | 209.8135 | 253.3515 | . |

29 | 114.888 | 170.144 | 218.6236 | 263.831 | . |

30 | 119.85 | 177.1902 | 227.4768 | 274.3569 | . |

31 | 124.8446 | 184.2745 | 236.3717 | 284.9274 | . |

32 | 129.8718 | 191.3955 | 245.3068 | 295.5411 | . |

33 | 134.9303 | 198.552 | 254.2809 | . | . |

34 | 140.0191 | 205.7431 | 263.2927 | . | . |

35 | 145.1373 | 212.9677 | 272.3412 | . | . |

36 | 150.2841 | 220.2247 | 281.4251 | . | . |

37 | 155.4587 | 227.5133 | 290.5436 | . | . |

38 | 160.6603 | 234.833 | 299.6956 | . | . |

39 | 165.8882 | 242.1819 | 308.8803 | . | . |

40 | 171.142 | 249.5602 | 318.0967 | . | . |

this looks good for now

Sally

Quote:Ace2Can someone run the number for 1000 coupons, getting at least 2 of each ? I’d be interested to know if there’s a limit. Based on what I’ve seen the limit could be the natural log of 2. So if the average is about 7,485 tries to get all of them once, it would be that divided by 0.693 = 10,800 to get all of them twice. Just a guess.

I can't use the formula for anything above about 50, as there's either an overflow or an underflow (i.e. the number is so small that it is treated as zero) that's throwing the results off.

For those of you interested in details, it's something like this: for 51 coupons, with M = 50 and J = 25, the value involved in the sum is on the order of 10

^{18}. However, 64-bit floating point numbers as implemented by Microsoft are only accurate to about 15 digits, so it's going to be off by up to 1000, which is not good when the final sum (over all J from 0 to 50) is supposed to be around 400.

However, when I run a simulation on various numbers of coupons up to 1000, and divide the simulated average number of draws needed to get two of each by the expected number needed to get one of each, it approaches something like 1.319.

EDIT: At 5000, it's more like 1.278.

As a matter of fact, I wouldn't be surprised if the limit of the ratio is 1. If you have, say, Googolplex (that's 1 followed by googol zeroes) different coupons, there's a very, very, very good chance you will be very close to having chosen all of the coupons at least twice by the time you choose every coupon once.