Thread Rating:

RS
RS
  • Threads: 62
  • Posts: 8626
Joined: Feb 11, 2014
May 19th, 2015 at 9:55:40 AM permalink
I'm wondering if this is a valid way of finding out the probability of something. First of all, let's say I have 5 probabilities: 0.5, 0.25, 0.125, 0.0625, 0.0625. Call them A, B, C, D, and E respectively.

I could do something like this:

Quote: code


$ranNum = mt_rand(1, 16);

if ($ranNum <= 8)
// A
else if ($ranNum > 8 AND $ranNum <= 12)
// B
else if ($ranNum > 12 AND $ranNum <= 14)
// C
else if ($ranNum == 15)
// D
else if ($ranNum == 16)
// E




For the problem I'm working on, that cannot be done. So my thought process has me to doing something like the following:


if (0.5 chance)
// this happens 50% the time
else if (0.25 / 0.5 chance)
// this happens 25% the time
else if (0.125 / (1-0.5-0.25))
// this happens 12.5% the time
else if (0.0625 / (1-0.5-0.25-0.125))
// this also happens 6.25% the time
else if (0.0625 / (1-0.5-0.25-0.0125))
// this also happens 6.25% the time



IN OTHER WORDS:

Quote:


50% = 0.5 / 1

25% = 0.25 / (1-0.5)
= 0.25 / 0.5
// but we only get to this part 50% of the time

12.5% = 0.125 / (1-0.5-0.25)
= 0.125 / 0.25
// we only get here 25% of the time

6.25% = 0.0625 / (1-0.5-0.25-0.125)
= 0.0625 / (0.125)
// we only get here 12.5% of the time

6.25% = 0.0625 / (1-0.5-0.25-0.125-0.0625)
= 0.0625 / (0.0625)
= 1 = 100%
// we only get here 6.25% of the time so it has a 100% chance of occuring 6.25% of the time



other words:

100% of the time A is successful 50% of the time (A = 0.5)

50% of the time B is successful 50% of the time (B = 0.25)

etc.


I'm assuming this is OK and doesn't violate any kind of math stuff. just want to make sure before I begin.
MathExtremist
MathExtremist
  • Threads: 88
  • Posts: 6526
Joined: Aug 31, 2010
May 19th, 2015 at 10:17:02 AM permalink
If you're working with exclusive events where p(A)+p(B)+p(C)+p(D)+p(E) = 1 (which it looks like you are) then there's nothing wrong with your first method. I'd do it using reals though:

double a = 0.5, b = 0.25, c = 0.125, d = 0.0625, e = 0.0625;

double rng = genrand_real2(); // Mersenne Twister function for value on [0,1); replace with whatever you're using

if (rng < a)
{
// process event a here
}
else if (rng < (a+b))
{
// process event b here
}
else if (rng < (a + b + c))
{
// process event c here
}
etc...


Don't forget that you'll never get to the 2nd step of a conditional if the first was met, so you don't need to re-check all the prior conditions that you know you already didn't satisfy.
"In my own case, when it seemed to me after a long illness that death was close at hand, I found no little solace in playing constantly at dice." -- Girolamo Cardano, 1563
RS
RS
  • Threads: 62
  • Posts: 8626
Joined: Feb 11, 2014
May 19th, 2015 at 10:44:52 AM permalink
Quote:

Don't forget that you'll never get to the 2nd step of a conditional if the first was met, so you don't need to re-check all the prior conditions that you know you already didn't satisfy.



Right. I don't know if I included it, but I was going to put in a continue/break/next or whatever it is. I'm a little rusty on which language requires which word. Grr..


Problem with doing it that way that you described is some of the probabilities are really really tiny, like 0.00001 (0.001%) or possibly lower. Perhaps that is a better way of going about it. Not sure what kind of limitations I'd be facing regarding decimal precision. Hmm...



Hmm, maybe the best way to go about this is generating a huge random number. How would I generate a number between 1 and, say, 19 * 10^12? [The actual number would be precise down to the digit.]
MathExtremist
MathExtremist
  • Threads: 88
  • Posts: 6526
Joined: Aug 31, 2010
May 19th, 2015 at 11:01:13 AM permalink
That'll depend on your RNG implementation. 19*10^12 is far less than what a 64-bit value holds so no worries there. But if you're stuck with 32-bit values you've got some work to do.
"In my own case, when it seemed to me after a long illness that death was close at hand, I found no little solace in playing constantly at dice." -- Girolamo Cardano, 1563
charliepatrick
charliepatrick
  • Threads: 39
  • Posts: 3011
Joined: Jun 17, 2011
May 19th, 2015 at 4:03:31 PM permalink
In your particular case you can use bit manipulation by just considering a small number of bits.
If random_number AND x'00000001' = 1 (i.e. last bit is on) then Probability of this is 50%
If random_number AND x'00000003' = 2 (i.e. last two bits are 10) then Probability of this is 25%
etc.
Personally I don't use else statements and use
Found=0;
If ((found==0)&(something)) {found=1; process something;};
But then I'm not too interested in performance and prefer readability (it's my background in application software that it was more important to understand quickly what you had been trying to do rather than some really elegant way - however if you're doing things billions of times then it might need to be fast!)
RS
RS
  • Threads: 62
  • Posts: 8626
Joined: Feb 11, 2014
May 21st, 2015 at 6:28:08 AM permalink
Quote: MathExtremist

That'll depend on your RNG implementation. 19*10^12 is far less than what a 64-bit value holds so no worries there. But if you're stuck with 32-bit values you've got some work to do.



Indeed. But, how would I go about generating this random number? The highest random number I can generate (with mt_rand()), which is about 4 digits too few.
MathExtremist
MathExtremist
  • Threads: 88
  • Posts: 6526
Joined: Aug 31, 2010
May 21st, 2015 at 8:43:48 AM permalink
Either use a 64-bit RNG or, if your current 32-bit RNG doesn't suffer from serial correlation, generate two 32-bit numbers and concatenate them. But that's not really recommended, using a properly-designed 64-bit implementation is by far preferred. There are lots out there: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/SFMT/index.html
"In my own case, when it seemed to me after a long illness that death was close at hand, I found no little solace in playing constantly at dice." -- Girolamo Cardano, 1563
MangoJ
MangoJ
  • Threads: 10
  • Posts: 905
Joined: Mar 12, 2011
May 21st, 2015 at 12:06:43 PM permalink
Quote: RS

Problem with doing it that way that you described is some of the probabilities are really really tiny, like 0.00001 (0.001%) or possibly lower. Perhaps that is a better way of going about it. Not sure what kind of limitations I'd be facing regarding decimal precision. Hmm...



If you concatenate your if's like that, for performance I would test the most likely ones first, and then go down to the more rarer ones.

Other than that, if your probabilities are structured like a specific discrete probability distribution, you can try directly drawing from that distribution.
You could then use swith() or (depending on the problem) use the random variable as array index or function arguments ...
  • Jump to: