Hunterhill
Hunterhill
  • Threads: 53
  • Posts: 2151
Joined: Aug 1, 2011
May 2nd, 2013 at 12:14:48 PM permalink
Lets say there is a drawing that has 50 tickets in a drum,each time two tickets will be pulled out,after the drawing, those two tickets will be returned to the drum.
How many drawings would they need to have before each ticket will have been picked? What if they pulled 4 tickets at a time?
The mountain is tall but grass grows on top of the mountain.
Mission146
Mission146
  • Threads: 142
  • Posts: 16832
Joined: May 15, 2012
May 2nd, 2013 at 12:40:00 PM permalink
Quote: Hunterhill

Lets say there is a drawing that has 50 tickets in a drum,each time two tickets will be pulled out,after the drawing, those two tickets will be returned to the drum.
How many drawings would they need to have before each ticket will have been picked? What if they pulled 4 tickets at a time?



I don't know that there's any kind of guaranteed answer because you are putting the tickets back. The probability of any one ticket being pulled is 1/50, and that failing, 1/49, so the two ways this can happen are:

(1/50 * 49/49) and (49/50 * 1/49)

So, then you would just add those two combinations of events together to get the overall probability of pulling any individual ticket, at that point, you have to decide how sure you want to be that it is going to happen.

Basically, you're more likely not to pull it than to pull it, so take 1 - ((1/50 * 49/49) + (49/50 * 1/49)) = 0.96

Okay, so you have a 96% chance of NOT pulling it in any individual trial. So, here are a few examples:

(.96)^17 = 0.49958680767981345

Which is very slightly better than a 50% chance of pulling it in 17 sets of two pulls.

(.96)^34 = 0.24958697840770694

Which is very slightly better than a 75% chance of pulling it in 34 sets of two pulls.

(.96)^57 = 0.0976023519842129

Which is very slightly better than a 90% chance of pulling it in 57 sets of two pulls.

(.96)^113 = 0.009923144909218946

Which is very slightly better than a 99% chance of pulling it in 113 sets of two pulls.

But, that's not exactly what you asked.

I'd imagine that you would take the probability of pulling any individual ticket after x pulls and take it to the power of how many tickets there are to get your overall percentage of having pulled all tickets.

For example, this would mean that after 113 Rounds (226 pulls) you would have a:

(1-.009923144909218946)^50 = 0.6073589127712696

Probability of having pulled each one at least once.

However, my reasoning could be flawed. Someone brighter than me will be around soon enough to either verify what I said or tell you the right way to go about it, as I'm not anywhere near 100% sure about this one.
https://wizardofvegas.com/forum/off-topic/gripes/11182-pet-peeves/120/#post815219
MangoJ
MangoJ
  • Threads: 10
  • Posts: 905
Joined: Mar 12, 2011
May 2nd, 2013 at 3:05:29 PM permalink
If you are interested in the expected number of draws, then this is closely related to this problem #73 and #74.

There is also a thread about it here.

However, the given formulae are for the case where you would pick one ticket at a time. I guess for two tickets there won't be much difference, but if you specifically want to know this then you could
tringlomane
tringlomane
  • Threads: 8
  • Posts: 6281
Joined: Aug 25, 2012
May 2nd, 2013 at 3:16:37 PM permalink
Quote: MangoJ


However, the given formulae are for the case where you would pick one ticket at a time. I guess for two tickets there won't be much difference, but if you specifically want to know this then you could



I am thinking this problem is significantly more difficult to solve analytically, unless I am missing something (definitely possible).
MangoJ
MangoJ
  • Threads: 10
  • Posts: 905
Joined: Mar 12, 2011
May 2nd, 2013 at 4:02:50 PM permalink
Totally agree on the significant difficulty, that's why I asked of the "two tickets at a time" is important enough. The "not much difference" was related to the actual numerical result for his problem of 50 cards. The number of draws would be slightly lower, since the second ticket drawn would never be the same as the first ticket and thus avoids colision a little bit.
Hunterhill
Hunterhill
  • Threads: 53
  • Posts: 2151
Joined: Aug 1, 2011
May 2nd, 2013 at 4:14:57 PM permalink
Quote: MangoJ

Totally agree on the significant difficulty, that's why I asked of the "two tickets at a time" is important enough. The "not much difference" was related to the actual numerical result for his problem of 50 cards. The number of draws would be slightly lower, since the second ticket drawn would never be the same as the first ticket and thus avoids colision a little bit.

Yes the two tickets part is important,and also what if there were 4 or 6 tickets at a time. I really appreciate everyones effort.
The mountain is tall but grass grows on top of the mountain.
MangoJ
MangoJ
  • Threads: 10
  • Posts: 905
Joined: Mar 12, 2011
May 3rd, 2013 at 12:19:15 AM permalink
In this case you should do:

1) understand the result of the "one ticket draw".
2) find the result for two tickets draws.
3) generalize it to k tickets.
4) check your results for k=N tickets which should be trivial.
5) Write a scientific paper about your results.
6) ...
7) Profit.
socks
socks
  • Threads: 15
  • Posts: 364
Joined: Jul 13, 2011
May 3rd, 2013 at 3:31:52 AM permalink
I just made a quick pass at it... no formal proof or math to back anything up...

and I'm getting just under 113
4 tickets = appox 56.5
6 tickets = approx 38

Global arr:Int[50]
Global count:Int = 0
Global totalcount:Int = 0

SeedRnd(MilliSecs())

Function sumarr()
Local sum:Int = 0
For i = 0 To 49
sum :+ arr
Next
Return sum
EndFunction

For i = 1 To 100000
While sumarr() < 50
arr[Rand(50)-1] = 1
arr[Rand(50)-1] = 1

count :+ 1
Wend
totalcount :+ count
count = 0
For k = 0 To 49
arr[k] = 0
Next
Next

Print
Print totalcount/100000.0


Then I realized I was allowing the same ticket to be picked twice in one draw. I added a few lines to prevent it, but everything broke, and I couldn't see the problem. At least I think it broke... I kept getting around 34...

Maybe I'll look at it again later.
kubikulann
kubikulann
  • Threads: 27
  • Posts: 905
Joined: Jun 28, 2011
May 3rd, 2013 at 5:50:05 AM permalink
This is essentially the problem of filling a "picture album" (Panini soccer players in Europe, you in the USA have the same, I guess, with pictures of base-ball players).

Here, they sell in packs of 3 (or 5). You know that a pack will contain three different pics (equivalent to the 2 mentioned in the original post).
You want to know how many packs you will need to buy in order to fill your album (=50 pics in the OP).

Actually, what you are looking for is the distribution of this number. The average is already difficult to compute, but it does not help much in practice, because variance is quite large.

When I was trying to solve the problem, I encountered too many difficulties to write it in closed from. So I made a homespun MonteCarlo experiment by buying an album and filling it. (First and last time in my life that I came interested in football :-] ) But too damn expensive to make a good sample! I tried reordering the purchased packs to see how the album was filling, but I quickly lost track.

In the simpler case where you draw one at a time (with replacement), the average is twice the number of tickets (100, in the example). The distribution is a kind of reverse Polya experiment (the urn content is changing after each draw).
The result of my experiment (3 per pack), alas! seemed to show that I needed about the same amount: twice the number of pics. But hey! no statistical significance here, so don't trust it.
Reperiet qui quaesiverit
7craps
7craps
  • Threads: 18
  • Posts: 1977
Joined: Jan 23, 2010
May 3rd, 2013 at 6:26:01 AM permalink
one method here
https://wizardofvegas.com/forum/questions-and-answers/math/9115-50-states-10-at-a-time/4/#post133407

added:
Quote: Hunterhill

How many drawings would they need to have before each ticket will have been picked?
What if they pulled 4 tickets at a time?

CM answer on page 2 is the average # of draws.
"2 tickets = 111.593945
4 tickets = 54.89879864
6 tickets = 35.98917945
8 tickets = 26.52536371
10 tickets = 20.83935048
12 tickets = 17.04175093"
observation:
The mean is real close to 224.9604 (draw one at a time) / #tickets drawn each draw then - 1

For 2 tickets each draw you would find it fails 41.26671% of the time to select each ticket at least one time
within 112 draws.
The distribution is not normal.
I agree the mean is about 111.5939436
The median is 106
The mode is 97
here is what it looks like in my Excel
I used the inclusion-exclusion formula in the pdf in the next post instead of a transition matrix
(maybe later for that as I like Markov Chains)

The probability distribution depends on how many total draws
since it is possible that even after T draws, all tickets have yet to be chosen.

miplet's method, inclusion-exclusion method in the next linked pdf
or a Markov Chain look to be the paths to take for determining the actual probabilities.

CM may have the MC already done
I also show a 0.014147 probability of having more than 200 total draws
drawing 2 tickets at a time
T       Probability    
[1,] 0
[2,] 0
[3,] 0
[4,] 0
[5,] 0
[6,] 0
[7,] 0
[8,] 0
[9,] 0
[10,] 0
[11,] 0
[12,] 0
[13,] 0
[14,] 0
[15,] 0
[16,] 0
[17,] 0
[18,] 0
[19,] 0
[20,] 0
[21,] 0
[22,] 0
[23,] 0
[24,] 0
[25,] 5.6744368e-21
[26,] 1.856235663e-18
[27,] 1.122143407e-16
[28,] 2.997096053e-15
[29,] 4.719570326e-14
[30,] 5.072375751e-13
[31,] 4.064354381e-12
[32,] 2.575160402e-11
[33,] 1.345280882e-10
[34,] 5.977566396e-10
[35,] 2.313815966e-09
[36,] 7.951182532e-09
[37,] 2.462922777e-08
[38,] 6.963305893e-08
[39,] 1.815701845e-07
[40,] 4.404987503e-07
[41,] 1.001742909e-06
[42,] 2.149129701e-06
[43,] 4.373946175e-06
[44,] 8.485737812e-06
[45,] 1.575993001e-05
[46,] 2.812506362e-05
[47,] 4.838927166e-05
[48,] 8.05017395e-05
[49,] 0.0001298406789
[50,] 0.0002035152873
[51,] 0.0003106657489
[52,] 0.0004627429875
[53,] 0.0006737489169
[54,] 0.0009604184891
[55,] 0.001342326902
[56,] 0.001841908707
[57,] 0.002484379938
[58,] 0.003297559448
[59,] 0.004311590822
[60,] 0.005558571354
[61,] 0.007072099077
[62,] 0.008886752544
[63,] 0.01103752082
[64,] 0.0135592028
[65,] 0.01648579553
[66,] 0.01984989089
[67,] 0.0236820985
[68,] 0.02801051108
[69,] 0.03286022552
[70,] 0.03825293059
[71,] 0.04420656875
[72,] 0.05073507701
[73,] 0.05784820844
[74,] 0.06555143396
[75,] 0.07384592116
[76,] 0.08272858553
[77,] 0.09219220769
[78,] 0.1022256091
[79,] 0.1128138785
[80,] 0.1239386395
[81,] 0.1355783525
[82,] 0.1477086405
[83,] 0.1603026323
[84,] 0.1733313144
[85,] 0.186763886
[86,] 0.2005681101
[87,] 0.2147106556
[88,] 0.2291574264
[89,] 0.2438738734
[90,] 0.258825287
[91,] 0.273977067
[92,] 0.2892949696
[93,] 0.3047453302
[94,] 0.3202952611
[95,] 0.3359128244
[96,] 0.3515671818
[97,] 0.3672287198
[98,] 0.3828691535
[99,] 0.3984616094
[100,] 0.4139806877
[101,] 0.4294025074
[102,] 0.4447047344
[103,] 0.4598665947
[104,] 0.4748688739
[105,] 0.4896939051
[106,] 0.5043255462 <<< here lies the median
[107,] 0.5187491479
[108,] 0.532951515
[109,] 0.5469208596
[110,] 0.5606467507
[111,] 0.5741200587
[112,] 0.5873328966
[113,] 0.6002785593
[114,] 0.612951461
[115,] 0.6253470717
[116,] 0.6374618539
[117,] 0.6492931988
[118,] 0.6608393642
[119,] 0.6720994122
[120,] 0.6830731503
[121,] 0.693761072
[122,] 0.7041643012
[123,] 0.7142845377
[124,] 0.7241240055
[125,] 0.733685403
[126,] 0.7429718561
[127,] 0.7519868732
[128,] 0.7607343035
[129,] 0.7692182968
[130,] 0.7774432665
[131,] 0.7854138545
[132,] 0.7931348988
[133,] 0.800611403
[134,] 0.8078485081
[135,] 0.8148514666
[136,] 0.8216256183
[137,] 0.8281763685
[138,] 0.8345091674
[139,] 0.8406294914
[140,] 0.8465428268
[141,] 0.8522546538
[142,] 0.8577704329
[143,] 0.8630955925
[144,] 0.8682355174
[145,] 0.8731955388
[146,] 0.8779809256
[147,] 0.882596876
[148,] 0.8870485113
[149,] 0.8913408692
[150,] 0.8954788992
[151,] 0.8994674575
[152,] 0.9033113039
[153,] 0.9070150981
[154,] 0.910583398
[155,] 0.9140206567
[156,] 0.9173312218
[157,] 0.9205193343
[158,] 0.9235891276
[159,] 0.9265446275
[160,] 0.9293897521
[161,] 0.9321283125
[162,] 0.934764013
[163,] 0.9373004524
[164,] 0.9397411246
[165,] 0.9420894201
[166,] 0.9443486277
[167,] 0.9465219353
[168,] 0.9486124326
[169,] 0.9506231118
[170,] 0.9525568703
[171,] 0.9544165123
[172,] 0.9562047508
[173,] 0.9579242096
[174,] 0.9595774258
[175,] 0.9611668511
[176,] 0.9626948549
[177,] 0.9641637257
[178,] 0.9655756736
[179,] 0.9669328323
[180,] 0.9682372612
[181,] 0.9694909478
[182,] 0.9706958093
[183,] 0.9718536953
[184,] 0.9729663892
[185,] 0.9740356107
[186,] 0.9750630173
[187,] 0.976050207
[188,] 0.9769987192
[189,] 0.9779100376
[190,] 0.9787855911
[191,] 0.9796267564
[192,] 0.9804348592
[193,] 0.981211176
[194,] 0.9819569359
[195,] 0.9826733222
[196,] 0.983361474
[197,] 0.9840224874
[198,] 0.9846574174
[199,] 0.9852672794
[200,] 0.98585305 <<<<<< (1 - this)
The R code for the above
Tmax = 200
Nc = 50 # number of coupons or tickets
r = 2 # number of coupons (tickets) drawn each time
require(Rmpfr) # a must installed R package
prob = rep(0,Tmax)

for (T in ceiling(Nc/r):Tmax) {
k = 0:Nc
prob[T] = as.double(sum((-1)^k*choose(Nc,k)*(chooseZ(Nc - k,r)/chooseZ(Nc,r))^T)) # magic formula
}

prob = as.matrix(prob)
colnames(prob) = "Probability"
cat("\n")
print(formatC(prob, digits=10),quote=FALSE)

"I love it when a plan comes together"
winsome johnny (not Win some johnny)
dwheatley
dwheatley
  • Threads: 25
  • Posts: 1246
Joined: Nov 16, 2009
May 3rd, 2013 at 8:21:55 AM permalink
The problem is an extension of the Coupon Collector's problem (from combinatorial math).

This short paper's 2nd page has the solution for k-packs. Use N=50, k=2 and then solve the resulting matrix. Not pretty.
http://www.eecs.berkeley.edu/~walid/teasers/gcj-coupons.pdf

This paper refers to the problem as the Baseball Card Collector's Problem, allowing for r-packs as well.
http://www.math.clemson.edu/~kevja/REU/2003/CouponCollectorReport.pdf

That last paper has a combinatorial expression in terms of t, the number of trials required to fill the book, middle of page 2. You could get a computer to calculate many of those probabilities, then find the expectation.
Wisdom is the quality that keeps you out of situations where you would otherwise need it
MangoJ
MangoJ
  • Threads: 10
  • Posts: 905
Joined: Mar 12, 2011
May 3rd, 2013 at 8:59:17 AM permalink
Quote: kubikulann

This is essentially the problem of filling a "picture album" (Panini soccer players in Europe, you in the USA have the same, I guess, with pictures of base-ball players).


[...]

In the simpler case where you draw one at a time (with replacement), the average is twice the number of tickets (100, in the example).



You are right and wrong. Yes it is the "picture album problem*. For one picture at a time it has already been solved, and the solution is

X = N * (1/1 + 1/2 + ... + 1/N) ~= N * (ln(N) + 0.5772)

which is quite far away from your "2*N" for any reasonable number of pictures (N).
CrystalMath
CrystalMath
  • Threads: 8
  • Posts: 1911
Joined: May 10, 2011
May 3rd, 2013 at 9:09:55 AM permalink
Quote: Hunterhill

Lets say there is a drawing that has 50 tickets in a drum,each time two tickets will be pulled out,after the drawing, those two tickets will be returned to the drum.
How many drawings would they need to have before each ticket will have been picked? What if they pulled 4 tickets at a time?


2 tickets = 111.593945
4 tickets = 54.89879864
6 tickets = 35.98917945
8 tickets = 26.52536371
10 tickets = 20.83935048
12 tickets = 17.04175093
I heart Crystal Math.
Hunterhill
Hunterhill
  • Threads: 53
  • Posts: 2151
Joined: Aug 1, 2011
May 3rd, 2013 at 10:02:28 AM permalink
Quote: CrystalMath

2 tickets = 111.593945
4 tickets = 54.89879864
6 tickets = 35.98917945
8 tickets = 26.52536371
10 tickets = 20.83935048
12 tickets = 17.04175093

Thanks Crystal. Now if I can just win that drawing.
The mountain is tall but grass grows on top of the mountain.
Ayecarumba
Ayecarumba
  • Threads: 236
  • Posts: 6763
Joined: Nov 17, 2009
May 3rd, 2013 at 12:54:27 PM permalink
Quote: CrystalMath

2 tickets = 111.593945
4 tickets = 54.89879864
6 tickets = 35.98917945
8 tickets = 26.52536371
10 tickets = 20.83935048
12 tickets = 17.04175093



However, there is no guarantee that after the above number of drawings, every ticket will have been pulled. What is the level of confidence?
Simplicity is the ultimate sophistication - Leonardo da Vinci
MangoJ
MangoJ
  • Threads: 10
  • Posts: 905
Joined: Mar 12, 2011
May 3rd, 2013 at 2:52:15 PM permalink
You would need to calculate the probability distribution, which is given in the link provided by Wheatly in Section 3.

From this probability distirbution, you can calculate every detail such as variances, confidence intervals, you name it.
CrystalMath
CrystalMath
  • Threads: 8
  • Posts: 1911
Joined: May 10, 2011
May 3rd, 2013 at 6:39:58 PM permalink
I estimate the standard deviation to be 30.65 when drawing 2 items. I only calculated the discrete probabilities up to 650 draws. You have a 1.414695% chance that it will exceed 200 draws.
I heart Crystal Math.
Mission146
Mission146
  • Threads: 142
  • Posts: 16832
Joined: May 15, 2012
May 4th, 2013 at 7:48:44 AM permalink
Quote: CrystalMath

I estimate the standard deviation to be 30.65 when drawing 2 items. I only calculated the discrete probabilities up to 650 draws. You have a 1.414695% chance that it will exceed 200 draws.



Check it out, my method was really close, or is it just a coincidence?

(.96)^200 = 0.00028460767526957515

(1-0.00028460767526957515)^50 = 0.9858683927647606

1 - 0.9858683927647606 = 0.01413160723523943---1.413161%
https://wizardofvegas.com/forum/off-topic/gripes/11182-pet-peeves/120/#post815219
MangoJ
MangoJ
  • Threads: 10
  • Posts: 905
Joined: Mar 12, 2011
May 4th, 2013 at 8:06:29 AM permalink
Quote: Mission146

or is it just a coincidence?



In a way, because you guestimated 200 tickets which is quite near the expected 223 tickets.
  • Jump to: