Ardent1
  • Threads: 5
  • Posts: 168
Joined: Dec 19, 2012
December 21st, 2012 at 3:07:27 AM permalink
I am new here. I posted a question on the Gambling Forum and a couple of people have been kind enough to try to answer it (here is my original post: https://wizardofvegas.com/forum/questions-and-answers/gambling/12336-on-standard-deviation-in-a-double-0-or-american-roulette/#post203906 ).

I really have a simple question -- I am tracking an Interblock Double 0 roulette wheel. I have observed some really rare occurrences and I was curious to find out how many standard deviations from the mean these events are.

For example, I observed a 5-number section, i.e. neighbors or 5 consecutive numbers on the wheel, that missed for 67 consecutive spins. I know the mean occurrence is 7.6 or 38/5. For it to miss 67 consecutive spins is rare. Then there is the case where a 6-number section missed for 74 consecutive spins, when the mean occurrence is 6.333 spins (38/6). That is to say, I know the mean of the hit frequency, but I don't know the std dev or dispersion around the hit frequency.

Can anyone give me the formula to solve for the std deviation of the hit frequency? So far, people are giving me the std dev for a given number of trials, but that isn't the same as the std deviation of the hit frequency, which is independent of the number of trials. 24Bingo came up with an answer (the std dev hit frequency for a 5-number section is 7.082), and when I tested it, I must have done something wrong because a 67-consecutive miss results in an 8.4 std dev event, and that can't be right.

If someone is kind enough to walk me through the calculations for a 5-number section, I can use that example to teach myself for the 6-number to 9-number sections.

Thanks.
kubikulann
  • Threads: 27
  • Posts: 905
Joined: Jun 28, 2011
December 21st, 2012 at 3:51:10 AM permalink
Quote: Ardent1

So far, people are giving me the std dev for a given number of trials, but that isn't the same as the std deviation of the hit frequency, which is independent of the number of trials.


The standard deviation is NOT independent from the number of trials. That is at the core of any statistical inference (the law of large numbers, to say it simply).

You have a binomial experiment. Two parameters: probability of hit (p) and number of trials (n). Variance of the number of hits = p(1-p)n. Variance of the hit frequency (or observed proportion) = p(1-p) / n.

In your cases, p is 5/38 or 6/38.

Problem is n. Most people, confronted with a rare event like this, compute the probability as if there had been only THAT specific succession of trials. They forget that their observation began way before the rare series began, and that the roulette has been spinning way before and will spin long after. So, what is the correct n to use?

There are more subtle models. Negative binomial gives the distribution of sequences before an event happens (here, the sequence of non-hits before a hit). But again, it assumes you began counting on the first non-hit, which is a way of biasing your figures.

So, you actually need to be aware of the total number of trials that you have been tracking (or, better, that you intend to track). Then there are distribution models for the appearance of runs (the so-called "runs tests"). I don't remember the formulas right now. Type "runs test" in a search engine.

Happy celebrations.
Reperiet qui quaesiverit
kubikulann
  • Threads: 27
  • Posts: 905
Joined: Jun 28, 2011
December 21st, 2012 at 4:13:25 AM permalink
A gross approach is like this.

You have a probability of something like p=0.0000785 of having to wait 67 trials before witnessing a hit: (33/38)^67
Dump all shorter run lengths together: 1-p = .9999215

Now your tracking can be viewed as a succession of double runs: NNN...NHH...H. (H=Hit, N=Not hit)
How many of these double runs do you need for the observed event "67" to be considered normal?
Firstly, define normal (or "not rare"; it is a probability threshold, called alpha in standard statistics)
It means you look for the number n' of trials (double runs) before an event (a 67 run) of probability p (0.0000785) happens with probability (at least) alpha: typical negative binomial.

BUT... how do you translate a number n' of double runs into a number n of spins? You need to calculate the distribution of the length of a double run, by convolving two negative binomial distributions with p=5/38 and 33/38. Not straightforward! As for simply the average length, it is the summed average of both runs, i.e. (38/5-1) + (38/33-1) = 6.7515 spins per double run.

Example: alpha=1%
Find n' such that F(n'|p) > 0.01
1 - (1-p)^n' * (p) > 0.01
n' = 58525
n= 6.7515 n' = 395130

Ouch! That means the "67" event has a 1%+ chance of happening in a series of 395130 spins. Quite rare indeed. (And actually leads me to think I must have made an error. Somebody correct me?)
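
A quick numeric check of the inequality above, offered as a hedged sketch rather than a settled answer. It takes p = (33/38)^67 and the 6.7515 spins-per-double-run average from the post at face value, and solves 1 - (1-p)^n' >= alpha directly; dropping the extra factor of p is an assumption made here, and the function name is only illustrative.

```python
import math

def double_runs_needed(alpha, p_event):
    """Smallest integer n' with 1 - (1 - p_event)**n' >= alpha."""
    return math.ceil(math.log(1.0 - alpha) / math.log(1.0 - p_event))

p_event = (33 / 38) ** 67          # chance that a given miss-run lasts 67+ spins
n_runs = double_runs_needed(0.01, p_event)
n_spins = 6.7515 * n_runs          # average double-run length taken from the post
print(p_event, n_runs, round(n_spins))
```

Under these assumptions the 1% threshold lands at a little over a hundred double runs, which is somewhat under a thousand spins. Solving (1-p)^n' = 0.01 instead (a 99% chance of seeing the event) gives a figure close to the 58525 quoted above.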
Reperiet qui quaesiverit
Ardent1
  • Threads: 5
  • Posts: 168
Joined: Dec 19, 2012
December 21st, 2012 at 4:28:24 AM permalink
Quote: kubikulann

The standard deviation is NOT independent from the number of trials. That is at the core of any statistical inference (the law of large numbers, to say it simply).



Let me make sure we are on the same page. Are you stating that the mean hit frequency is also NOT independent of the number of trials? That makes no sense. The mean hit frequency or expectation of a 5-number section is 7.6 spins regardless of the number of trials because there are 38 numbers and I am betting on 5 of them. That is true on any 1 spin -- or stated differently, the mean occurrence or expectation is a CONSTANT. The ACTUAL mean hit frequency for each session may differ from the expectation.

If the mean hit frequency is independent of the number of trials (i.e. it is a constant), then the dispersion (i.e. std dev) around this mean must also be independent of the number of trials.

I measure these cycles from the last time a 5-number section hit. I then count the consecutive misses until the spin on which the 5-number section hits again. I know EXACTLY when the cycle started and ended, and I was curious to discern how many std deviations these consecutive misses would be.

Also, since you are using the Law of Large Numbers ("LLN"), I can solve for this std dev of hit frequency using a Monte Carlo simulation (but I don't program or write code) or collect a huge amount of data, since LLN tells me the data approaches the expected value. However, I am looking for a closed-form solution.
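
A minimal Monte Carlo sketch of that idea, assuming "hit frequency" here means the number of consecutive misses counted from one hit to the next. The function name, the seed and the use of pockets 0..4 as a stand-in for the tracked section are all illustrative assumptions, not anything taken from the wheel being tracked.

```python
import random
import statistics

def simulate_wait_times(section_size, n_cycles, seed=12345):
    """Simulate n_cycles hit-to-hit cycles; return the misses counted in each."""
    rng = random.Random(seed)
    waits = []
    for _ in range(n_cycles):
        misses = 0
        # pockets 0..section_size-1 stand in for the tracked section
        while rng.randrange(38) >= section_size:
            misses += 1
        waits.append(misses)
    return waits

waits = simulate_wait_times(section_size=5, n_cycles=100_000)
print("mean misses before a hit:", statistics.fmean(waits))
print("std dev of that count:   ", statistics.pstdev(waits))
```

With enough cycles the two printed values should settle near (1-p)/p and sqrt(1-p)/p for p = 5/38, which is what a closed-form answer would give.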

Personally, I think you made the same mistake that the first two people made -- except 24Bingo caught his mistake. I am NOT asking about a population-specific std deviation measured in units associated with a specific number of trials. I can't make this any clearer -- if the mean hit frequency is a constant, then there must be a hit frequency std dev around this constant.

Thank you.

PS Here's another way to think about my problem using a single die. The frequency of "5" occurring is 1 in 6 because there are six sides (numbered 1 to 6) and only one 5. Someone now asks you what is the std dev of the "5" occurring given you know the expectation of a "5" is 1 in 6. How would you answer that -- and if your answer is the std dev depends on the number of trials, then you don't understand my question.
kubikulann
  • Threads: 27
  • Posts: 905
Joined: Jun 28, 2011
December 21st, 2012 at 4:50:55 AM permalink
Quote: Ardent1

Here's another way to think about my problem using a single die. The frequency of "5" occurring is 1 in 6 because there are six sides (numbered 1 to 6) and only one 5. Someone now asks you what is the std dev of the "5" occurring given you know the expectation of a "5" is 1 in 6. How would you answer that -- and if your answer is the std dev depends on the number of trials, then you don't understand my question.

In this question, you don't mention a number of trials. The die is thrown only once. Then n=1 and the stdev is SQRT(p(1-p)).

It appears like it is independent of any n, but it is not: it is a particular case where n=1.

Quote: Ardent1

If the mean hit frequency is independent of the number of trials (i.e. it is a constant), then the dispersion (i.e. std dev) around this mean must also be independent of the number of trials.

That is where your intuition fails you. Although the mean frequency is not affected by the sample size, the stdev IS affected. You state that it "must" be constant, but you have no proof of that. Wishful thinking. Just do the math.
Reperiet qui quaesiverit
kubikulann
  • Threads: 27
  • Posts: 905
Joined: Jun 28, 2011
December 21st, 2012 at 5:24:39 AM permalink
Hey, Ardent, I looked up your thread in Gambling and I see what is going wrong.

You speak of a hit frequency, but the term is ill-defined. We put a different meaning on it.
You understand it as the expected number of trials necessary to bring the first hit. Although this may be the way bettors talk (like when they say "the odds"), it is not a standard way of expressing things in math. Hence our misunderstanding.

The variable you are interested in is the NegBin(p=5/38;k=1) (also known as Geometric distribution).
MEAN (1-p)k / p
VARIANCE (1-p)k / p²

In your case:
MEAN 33 / 5 (Why not 38/5? Because it means the number of N before a H, which can be zero.)
VARIANCE 33*38 / 25

67 is only 3.814 stdev away from the mean.

If you are restricting yourself to the cases when you have at least one Non-hit, then the numbers change. It becomes a conditional probability function. Mean is 38/5, and the variance is unchanged at 33*38/25 because the geometric distribution is memoryless.
Reperiet qui quaesiverit
24Bingo
  • Threads: 23
  • Posts: 1348
Joined: Jul 4, 2012
December 21st, 2012 at 10:29:33 PM permalink
I think what Ardent wishes is to treat the time before one hits a given window of numbers as a random variable, and is trying to apply things he's heard about the mean and standard deviation that only apply to approximately Gaussian variables (i.e., primarily, sums of many identically distributed variables), and thinks that by computing the standard deviation differently he can apply those things to this single variable. The problem is that they just don't apply at all - there's no "standard deviation for a single trial." If the distribution is shaped differently, as this one is, things like "eight standard deviations out" just don't work, no matter what number you call the "standard deviation."

I've given you the standard deviation: it's sqrt(1-n/38)/(n/38). This comes from the definition of the standard deviation, that it's the square root of the difference between the mean of the square and the square of the mean. That's not the standard deviation "for many trials," just the standard deviation. The way you're trying to use it only works over many trials, but that's not its fault. If you want to work out probabilities, you just have to work them out directly.

The probability that a window of n numbers will not have hit after k trials is ((38-n)/38)^k. That's the bottom line. 67 consecutive misses of five numbers, therefore, has a chance of 0.00785% - low, yes, possibly even evidence of a biased wheel, but it might also just be a run of bad luck. Lot of people play roulette.
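
As a small sketch of that bottom-line formula, the snippet below evaluates ((38-n)/38)^k for the two droughts described in the opening post. The function name and the `pockets` parameter are illustrative assumptions.

```python
def prob_no_hit(section_size, spins, pockets=38):
    """Probability that a section of `section_size` pockets misses `spins` times in a row."""
    return ((pockets - section_size) / pockets) ** spins

for size, spins in [(5, 67), (6, 74)]:
    prob = prob_no_hit(size, spins)
    print(f"{size}-number section, {spins} misses: {prob:.7%} (about 1 in {1 / prob:,.0f})")
```

This is the probability for one pre-chosen section over one pre-chosen stretch of spins; it says nothing yet about how often such a stretch turns up somewhere in a long tracking session.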
The trick to poker is learning not to beat yourself up for your mistakes too much, and certainly not too little, but just the right amount.
Ardent1
  • Threads: 5
  • Posts: 168
Joined: Dec 19, 2012
December 24th, 2012 at 1:33:49 AM permalink
Quote: kubikulann

That is where your intuition fails you. Although the mean frequency is not affected by the sample size, the stdev IS affected. You state that it "must" be constant, but you have no proof of that. Wishful thinking. Just do the math.



This post is SOLELY addressing kubikulann's assertion that my intuition is flawed.

Here is my intuition explained using a fair, two-sided coin, which is much simpler than a 5-number section in a 38-number roulette wheel. I am interested in solving the std dev of the first hit (in my prior posts, I was not clear so I apologize for the poor wording).

We know the mean occurrence is a constant because there are only two outcomes (heads or tails) and one result, thus the mean occurrence is 0.5 or 1 in 2.

I would argue the std dev "of the first hit" around this mean is also a constant. Why? Because the variance is a constant, and the std dev is simply the square root of the variance.

The variance is simply the dispersion around the mean occurrence. Someone with better math skills than I can solve for the variance by looking at the following probability distribution -- here a tail would be considered a success when tossing a coin, and we are INTERESTED in the first hit or first tail after the maximum number of consecutive misses.

On the first toss, there is a 1 in 2 chance of hitting a tail
On two consecutive tosses, there is a 1 in 4 chance of hitting a tail (after missing a tail on the first toss)
On three consecutive tosses, there is a 1 in 8 chance of hitting a tail (after missing a tail on the first two tosses)
On four consecutive tosses, there is a 1 in 16 chance of hitting a tail (after missing a tail on the first three tosses)
etc ... you see the pattern or function

That is to say, the dispersion (or variance) around the mean occurrence is known ahead of time because it can be solved.

Now, if I used kubikulann's post of:
"The variable you are interested in is the NegBin(p=5/38;k=1) (also known as Geometric distribution).
MEAN (1-p)k / p
VARIANCE (1-p)k / p²"

If I plug in the numbers, I would get a mean of 1.0, a variance of 2.0, and a std dev of 1.414.

With 13 consecutive heads and then a tail appearing on the 14th toss, I can now calculate the number of std dev for this drought of "tails": (13 minus 1) / (1.414) or 8.49 std dev from the mean.
Ardent1
  • Threads: 5
  • Posts: 168
Joined: Dec 19, 2012
December 24th, 2012 at 1:58:09 AM permalink
Quote: kubikulann

You speak of a hit frequency, but the term is ill-defined. We put a different meaning on it.
You understand it as the expected number of trials necessary to bring the first hit. Although this may be the way bettors talk (like when they say "the odds"), it is not a standard way of expressing things in math. Hence our misunderstanding.



This is a gambling website -- the majority of us with inferior math skills are probably bettors. 8-)

Quote:

The variable you are interested in is the NegBin(p=5/38;k=1) (also known as Geometric distribution).
MEAN (1-p)k / p
VARIANCE (1-p)k / p²

In your case:
MEAN 33 / 5 (Why not 38/5? Because it means the number of N before a H, which can be zero.)
VARIANCE 33*38 / 25

67 is only 3.814 stdev away from the mean.



I didn't get the 3.814 std dev answer. Can you show me how you got it? I got 8.53 std dev from the mean, (67 - 6.6) / 7.082, which is nowhere near your 3.814 number.

Btw, I ran the numbers through your formula:

5-number section with 67 consecutive misses; mean of 6.6, variance of 50.16, and std dev of 7.082
6-number section with 74 consecutive misses; mean of 5.333, variance of 33.778, and std dev of 5.812
7-number section with 51 consecutive misses; mean of 4.429, variance of 24.041, and std dev of 4.903
8-number section with 49 consecutive misses; mean of 3.75, variance of 17.813, and std dev of 4.220
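
The four rows above can be reproduced with a short sketch of the NegBin(k=1) formulas quoted earlier, mean (1-p)/p and variance (1-p)/p². The function name is an illustrative assumption, and the "sd above the mean" figure is only the raw ratio (observed - mean) / sd; it is not a normal-curve probability.

```python
import math

def misses_before_hit_stats(section_size, pockets=38):
    """Mean, variance and std dev of the number of misses before the first hit."""
    p = section_size / pockets
    mean = (1 - p) / p
    variance = (1 - p) / p ** 2
    return mean, variance, math.sqrt(variance)

for size, observed in [(5, 67), (6, 74), (7, 51), (8, 49)]:
    mean, var, sd = misses_before_hit_stats(size)
    z = (observed - mean) / sd
    print(f"{size}-number: mean {mean:.3f}, variance {var:.3f}, sd {sd:.3f}, "
          f"{observed} misses is {z:.2f} sd above the mean")
```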

And unfortunately, the numbers of std dev from the mean that I've calculated are wrong since I need the steps to solve for it. Btw, the odds of a 6-number section not hitting for 74 consecutive spins are (32/38)^74 or 1 in about 333,340. However, the number of std dev from the mean is a better statistic when testing for non-randomness.

Thanks.
Ardent1
  • Threads: 5
  • Posts: 168
Joined: Dec 19, 2012
December 24th, 2012 at 2:36:45 AM permalink
Quote: 24Bingo


I think what Ardent wishes is to treat the time before one hits a given window of numbers as a random variable, and is trying to apply things he's heard about the mean and standard deviation that only apply to approximately Gaussian variables (i.e., primarily, sums of many identically distributed variables), and thinks that by computing the standard deviation differently he can apply those things to this single variable. The problem is that they just don't apply at all - there's no "standard deviation for a single trial." If the distribution is shaped differently, as this one is, things like "eight standard deviations out" just don't work, no matter what number you call the "standard deviation."



My mistake was assuming the distribution of the wait to the first hit was a normal distribution; 7craps pointed out my mistake when he wrote: "Not with a wait time dist. The mode is the first trial, the median is about 70% of the mean and the mean is not even close to the peak of the curve." In a normal distribution, the mean = median = mode such that the mean is the UNBIASED predictor; however, 7craps makes it clear mode < median < mean for this "wait time" distribution.

Quote:

I've given you the standard deviation: it's sqrt(1-n/38)/(n/38). This comes from the definition of the standard deviation, that it's the square root of the difference between the mean of the square and the square of the mean. That's not the standard deviation "for many trials," just the standard deviation. The way you're trying to use it only works over many trials, but that's not its fault.



You have to understand JUST BECAUSE you understand it doesn't mean I'll understand it. You have to walk the person through an example, especially when you are using a math equation. Had you written the formula [{1 - (n/38)} / (n/38)]^(0.5), it would have been easier to understand it -- btw, the answer to that comes out to 2.569, which IS NOT the correct 7.082 number. And why did I get 2.569? Because (1-5/38)/(5/38) gets me 6.6. In summary, if you understand something, you need to make sure the reader also understands. In your first post, you don't show the steps and I can't replicate your answers so no wonder I throw my hands up.

Quote:

The probability that a window of n numbers will not have hit after k trials is ((38-n)/38)^k. That's the bottom line. 67 consecutive misses of five numbers, therefore, has a chance of 0.00785% - low, yes, possibly even evidence of a biased wheel, but it might also just be a run of bad luck.



For a 5-number section, that works out to 1 in 12,737 trials; however, the 6-number section is 1 in about 333,340 trials and an 8-number is 1 in 107,264 trials.

I'll tell you why I am interested in std dev or std dev from the mean. It's the same logic as using correlation rather than covariance: the need to use a standardized number. Variance by itself tells me very little; std dev from the mean gives me more insight. These huge dispersions in these consecutive misses also give insights on the pseudo-random generator.

Cheers.
ybot
  • Threads: 15
  • Posts: 174
Joined: Jan 8, 2012
December 29th, 2013 at 10:29:22 AM permalink
Quote: Ardent1

I am new here. I posted a question on the Gambling Forum and a couple of people have been kind enough to try to answer it (here is my original post: https://wizardofvegas.com/forum/questions-and-answers/gambling/12336-on-standard-deviation-in-a-double-0-or-american-roulette/#post203906 ).

I really have a simple question -- I am tracking an Interblock Double 0 roulette wheel. I have observed some really rare occurrences and I was curious to find out how many standard deviations from the mean these events are.

For example, I observed a 5-number section, i.e. neighbors or 5 consecutive numbers on the wheel, that missed for 67 consecutive spins. I know the mean occurrence is 7.6 or 38/5. For it to miss 67 consecutive spins is rare. Then there is the case where a 6-number section missed for 74 consecutive spins, when the mean occurrence is 6.333 spins (38/6). That is to say, I know the mean of the hit frequency, but I don't know the std dev or dispersion around the hit frequency.

Can anyone give me the formula to solve for the std deviation of the hit frequency? So far, people are giving me the std dev for a given number of trials, but that isn't the same as the std deviation of the hit frequency, which is independent of the number of trials. 24Bingo came up with an answer (the std dev hit frequency for a 5-number section is 7.082), and when I tested it, I must have done something wrong because a 67-consecutive miss results in an 8.4 std dev event, and that can't be right.

If someone is kind enough to walk me through the calculations for a 5-number section, I can use that example to teach myself for the 6-number to 9-number sections.

Thanks.



When you miss 67 times playing 5 numbers on a 00 wheel, you faced -3.18 standard deviations.
If you played 6 numbers, it is -3.55 standard deviations.

Finding a gap by scanning the results afterwards is not the same as actually playing those 67 spins.
There are 38 ways to pick 5 or 6 consecutive numbers on the wheel.
It is easier to spot a gap by scanning than to pick a section in advance and then see 67 spins with no hits.

The way to calculate it is, for 5 numbers:

square root of (67 (spins) * (5/38 * 33/38)); this gives you the value of 1 standard deviation (2.766 hits for a 5-number section)

The arithmetic mean for 5/38 in 67 spins is 67*5/38 = 8.81 hits

To go down to 0 hits in 67 spins, you must subtract 2.766 from 8.81 repeatedly until you reach 0.

8.81/2.766=3.1871 negative standard deviations.
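
The same binomial calculation as a short sketch, in case it helps to try other section sizes or spin counts. The function name and its parameters are illustrative assumptions; the arithmetic is the sqrt(n*p*(1-p)) and n*p shown above.

```python
import math

def sd_below_expected(spins, section_size, hits=0, pockets=38):
    """How many binomial standard deviations `hits` lies below the expected count."""
    p = section_size / pockets
    expected = spins * p
    sd = math.sqrt(spins * p * (1 - p))
    return expected, sd, (expected - hits) / sd

expected, sd, z = sd_below_expected(spins=67, section_size=5)
print(f"expected {expected:.2f} hits, 1 sd = {sd:.3f} hits, 0 hits is {z:.2f} sd below")
```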
ybot
  • Threads: 15
  • Posts: 174
Joined: Jan 8, 2012
December 29th, 2013 at 10:35:50 AM permalink
Quote: kubikulann

The standard deviation is NOT independent from the number of trials. That is at the core of any statistical inference (the law of large numbers, to say it simply).

You have a binomial experiment. Two parameters: probability of hit (p) and number of trials (n). Variance of the number of hits = p(1-p)n. Variance of the hit frequency (or observed proportion) = p(1-p) / n.

Happy celebrations.



I do not know why, but I know that +/- 3 standard deviations on, for example, 200 trials means something stronger than 3 standard deviations on 2000 trials.

I'd like to understand what kubikulann meant.

BR
ybot
  • Threads: 15
  • Posts: 174
Joined: Jan 8, 2012
December 29th, 2013 at 11:08:38 AM permalink
Quote: Ardent1


With 13 consecutive heads and then a tail appearing on the 14th toss, I can now calculate the number of std dev for this drought of "tails": (13 minus 1) / (1.414) or 8.49 std dev from the mean.



The negative/positive standard deviation for 13 consecutive actual misses/hits on H or T is: square root of (13*1/2*1/2), and we get the 1 standard deviation value, 1.8 hits

The arithmetic mean for 13 trials for H and T is 6.5 hits
6.5 misses or hits / 1.8 = +/- 3.61 standard deviations
Very bad luck, but possible. We have all seen this.
7craps
  • Threads: 18
  • Posts: 1977
Joined: Jan 23, 2010
December 30th, 2013 at 8:46:12 AM permalink
Quote: kubikulann

You have a binomial experiment.
Two parameters:
probability of hit (p) and number of trials (n). Variance of the number of hits = p(1-p)n.

Variance of the hit frequency (or observed proportion) = p(1-p) / n.


Quote: ybot

I'd like to understand what kubikulann meant.

hit frequency = P (p) = probability of success
P = 5/38 = 0.1315789 or 13.15789%
Q = 1-P (33/38)
N = 100

EV = P * N ( # of hits) = 13.15789474
VAR = N*P*Q = 11.4265928
1SD (or 1 binomial standard deviation) = SQRT of VAR = 3.380324363
3SD range = 23.30 to 3.02
how about 3 to 23
how close are we?
left to the reader to verify

we are very close


now the standard deviation of P (hit frequency) as a percentage in N trials
instead of the SD as the # of hits in N trials

(P*Q)/N = VAR of P = 0.001142659
1SD of P = SQRT of VAR of P = 0.033803244
for percentage = 0.033803244 * 100 = 3.3803%
of course we already know the 1SD, so we can divide that by N

3SD range of P for 100 trials = 23.2989% to 3.0169%

are the 3SD ranges the same?
do the math and see for yourself

last example: a fair coin toss (this is easier to see why this works, at least for me)
N = 100
we know 50% chance of H (or T)
SD of P = 5.0000%
3SD range = 35% to 65%

try it out for a spin!

how about P=5/38
N=200
I get 13.1579% +/- 2.3903% (1SD)
3SD range = 20.3286% to 5.9871%
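
The worked numbers above can be checked with a short sketch like the one below. The function name and the tuple it returns are illustrative assumptions; the formulas are the N*P*Q and (P*Q)/N variances already given.

```python
import math

def three_sd_range(p, n):
    """Return (sd of hit count, sd of hit proportion, 3-sd range of the proportion)."""
    sd_count = math.sqrt(n * p * (1 - p))
    sd_prop = sd_count / n              # same as sqrt(p * (1 - p) / n)
    return sd_count, sd_prop, (p - 3 * sd_prop, p + 3 * sd_prop)

for label, p, n in [("5/38", 5 / 38, 100), ("coin", 0.5, 100), ("5/38", 5 / 38, 200)]:
    sd_count, sd_prop, (low, high) = three_sd_range(p, n)
    print(f"{label}, N={n}: 1SD = {sd_count:.4f} hits = {sd_prop:.4%}, "
          f"3SD range = {low:.4%} to {high:.4%}")
```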

Good Luck
winsome johnny (not Win some johnny)
ybot
  • Threads: 15
  • Posts: 174
Joined: Jan 8, 2012
December 30th, 2013 at 11:22:47 AM permalink
7craps, very nice information.

1 standard deviation for 5 numbers on 200 trials on a 00 roulette is 4.7805

The mean is 13.1579% * 200 = 26.31 hits
The 3 SD range (six sigma wide) is 11.96 hits to 40.65 hits

11.96/200 is 5.98%
40.65/200 is 20.32%
It is another way to get the same result

How did you get 2.3903% from 13.1579%?
7craps
  • Threads: 18
  • Posts: 1977
Joined: Jan 23, 2010
December 30th, 2013 at 12:11:33 PM permalink
Quote: ybot

1 standard deviation for 5 numbers on 200 trials on a 00 roulette is 4.7805
How did you get 2.3903% from 13.1579%?

you already know the sd for n=200
use: sd/n
4.7805 / 200
the graph

I gather the OP is looking for sd of the probability of losing streaks.

13 heads in a row and one tail would be different from
5 heads in a row, one tail and 8 heads in a row from a standard deviation standpoint.

1st is an exponential type distribution (as opposed to a binomial distribution - the 2nd)
and one must use the gamma distribution for finding confidence levels.

I know how to do that but the results to me for losing streaks are meaningless.

"Unlike with normal confidence intervals,
the confidence intervals for the mean of an exponential are not generally centered on our sample mean. "
here is how to do that by BruceZ
http://forumserver.twoplustwo.com/25/probability/confidence-interval-mean-exponential-1258388/

also can mention the negative binomial distribution (wait time)
number of failures before one success (in this example) P = 5/38
(one success is a special case that is just a geometric distribution)
we can calculate the distribution or have a program do it

but still with a mean of 7.6 and 1SD of 7.08, trying to use the normal curve
is meaningless.
here is what the graph looks like.


nothing normal about it
the mean (7.6), mode (1) and median (5) are not even close to each other as they can be in a normal distribution
 x     prob[X=x]    prob[X<x]   prob[X>=x]   prob[X<=x]    prob[X>x]

1 0.131578947 0.000000000 1.000000000 0.131578947 0.868421053
2 0.114265928 0.131578947 0.868421053 0.245844875 0.754155125
3 0.099230937 0.245844875 0.754155125 0.345075813 0.654924187
4 0.086174235 0.345075813 0.654924187 0.431250048 0.568749952
5 0.074835520 0.431250048 0.568749952 0.506085568 0.493914432
6 0.064988741 0.506085568 0.493914432 0.571074309 0.428925691
7 0.056437591 0.571074309 0.428925691 0.627511900 0.372488100
8 0.049011592 0.627511900 0.372488100 0.676523492 0.323476508
9 0.042562698 0.676523492 0.323476508 0.719086190 0.280913810
10 0.036962343 0.719086190 0.280913810 0.756048534 0.243951466
11 0.032098877 0.756048534 0.243951466 0.788147411 0.211852589
12 0.027875341 0.788147411 0.211852589 0.816022752 0.183977248
13 0.024207533 0.816022752 0.183977248 0.840230284 0.159769716
14 0.021022331 0.840230284 0.159769716 0.861252615 0.138747385
15 0.018256235 0.861252615 0.138747385 0.879508850 0.120491150
16 0.015854099 0.879508850 0.120491150 0.895362949 0.104637051
17 0.013768033 0.895362949 0.104637051 0.909130982 0.090869018
18 0.011956450 0.909130982 0.090869018 0.921087432 0.078912568
19 0.010383233 0.921087432 0.078912568 0.931470664 0.068529336
20 0.009017018 0.931470664 0.068529336 0.940487682 0.059512318
21 0.007830568 0.940487682 0.059512318 0.948318250 0.051681750
22 0.006800230 0.948318250 0.051681750 0.955118481 0.044881519
23 0.005905463 0.955118481 0.044881519 0.961023944 0.038976056
24 0.005128428 0.961023944 0.038976056 0.966152372 0.033847628
25 0.004453635 0.966152372 0.033847628 0.970606007 0.029393993
26 0.003867631 0.970606007 0.029393993 0.974473638 0.025526362
27 0.003358732 0.974473638 0.025526362 0.977832370 0.022167630
28 0.002916793 0.977832370 0.022167630 0.980749163 0.019250837
29 0.002533005 0.980749163 0.019250837 0.983282168 0.016717832
30 0.002199715 0.983282168 0.016717832 0.985481883 0.014518117
31 0.001910279 0.985481883 0.014518117 0.987392161 0.012607839
32 0.001658926 0.987392161 0.012607839 0.989051088 0.010948912
33 0.001440646 0.989051088 0.010948912 0.990491734 0.009508266
34 0.001251088 0.990491734 0.009508266 0.991742822 0.008257178
35 0.001086471 0.991742822 0.008257178 0.992829292 0.007170708
36 0.000943514 0.992829292 0.007170708 0.993772807 0.006227193
37 0.000819368 0.993772807 0.006227193 0.994592174 0.005407826
38 0.000711556 0.994592174 0.005407826 0.995303730 0.004696270
39 0.000617930 0.995303730 0.004696270 0.995921660 0.004078340
40 0.000536624 0.995921660 0.004078340 0.996458284 0.003541716
41 0.000466015 0.996458284 0.003541716 0.996924299 0.003075701
42 0.000404697 0.996924299 0.003075701 0.997328997 0.002671003
43 0.000351448 0.997328997 0.002671003 0.997680445 0.002319555
44 0.000305205 0.997680445 0.002319555 0.997985649 0.002014351
45 0.000265046 0.997985649 0.002014351 0.998250695 0.001749305
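
A table like the one above can be regenerated with a few lines, shown here as a sketch. It assumes X is the spin on which the first hit lands (so X starts at 1), and the function name is illustrative.

```python
def first_hit_table(p, max_x):
    """Rows (x, P[X=x], P[X<=x]) where X is the spin on which the first hit lands."""
    rows, cumulative = [], 0.0
    for x in range(1, max_x + 1):
        pmf = (1 - p) ** (x - 1) * p
        cumulative += pmf
        rows.append((x, pmf, cumulative))
    return rows

rows = first_hit_table(5 / 38, 45)
median = next(x for x, _, cdf in rows if cdf >= 0.5)   # first x with P[X<=x] >= 1/2
mode = max(rows, key=lambda row: row[1])[0]            # x with the largest P[X=x]
print("mode:", mode, "median:", median)
for x, pmf, cdf in rows[:5]:
    print(f"{x:2d} {pmf:.9f} {cdf:.9f}")
```

The remaining columns follow from these two: P[X<x] is the previous row's cumulative value, and the two right-tail columns are one minus the corresponding left-tail columns.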

the 67-in-a-row loss for 5/38 that was witnessed by OP
is just (33/38)^67 = 1 in 12,737
nothing that newsworthy.

less than 4SD (4SD is about 1 in 15.8k, two-sided) if one has to put a sd on it

But then, 67 in a row over the very next 67 spins
is different from 67 in a row over 167 spins
The probability rises to about 1 in 900 for at least 1 such event, the one OP calls so rare.
The more trials, the more likely the event will happen.
1000 spins it is about 1 in 103
2000 spins is about 1 in 50
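
One way to get figures like these is a small dynamic program over the current miss streak, sketched below. This is not necessarily how the numbers above were produced; the function name and the state layout are assumptions made for illustration.

```python
def prob_run_of_misses(n_spins, run_len, p_hit):
    """Probability of at least one run of `run_len` consecutive misses in `n_spins` spins."""
    q = 1.0 - p_hit
    # state[k]: probability that no full run has occurred yet and the
    # current trailing streak of misses has length k (0 <= k < run_len)
    state = [0.0] * run_len
    state[0] = 1.0
    done = 0.0
    for _ in range(n_spins):
        new = [0.0] * run_len
        new[0] = sum(state) * p_hit            # a hit resets the streak
        for k in range(run_len - 1):
            new[k + 1] = state[k] * q          # a miss extends the streak
        done += state[run_len - 1] * q         # this miss completes the run
        state = new
    return done

for n in (67, 167, 1000, 2000):
    prob = prob_run_of_misses(n, 67, 5 / 38)
    print(f"{n} spins: about 1 in {1 / prob:,.0f}")
```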

overall the average number of spins to see 67 in a row without a 5/38 success is
96,793.56717
(1-(p^r))/(q*(p^r))
p=33/38 (probability of a miss, the outcome that makes up the run)
q=5/38
r=67

1SD = 96,733.64336
take the square root of:
Var(N) = (1 - p^(1+2r) - q*(p^r)*(1+2r)) / (q^2 * p^(2r))

from the actual distribution
mode = 67
median = 67,111
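
The mean and 1SD of the wait can be checked with a short sketch of the usual waiting-time formulas for a run of r identical outcomes. It assumes the outcome that builds the run is a miss, with probability 33/38 per spin; the function and parameter names are illustrative.

```python
import math

def run_wait_stats(run_len, streak_prob):
    """Mean and std dev of the number of trials until the first run of `run_len`
    consecutive streak outcomes, each occurring with probability `streak_prob`."""
    p, q = streak_prob, 1.0 - streak_prob
    mean = (1 - p ** run_len) / (q * p ** run_len)
    variance = (1 - p ** (2 * run_len + 1) - (2 * run_len + 1) * q * p ** run_len) / (
        q ** 2 * p ** (2 * run_len)
    )
    return mean, math.sqrt(variance)

mean, sd = run_wait_stats(run_len=67, streak_prob=33 / 38)  # a streak outcome is a miss
print(f"mean wait {mean:,.2f} spins, 1SD {sd:,.2f} spins")
```

The mean and the SD come out almost equal, which is typical of a wait that behaves roughly like an exponential once the run length is long.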

we can be waiting a very long time if the run (streak) does not happen early on
winsome johnny (not Win some johnny)