wudged
wudged
Joined: Aug 7, 2013
  • Threads: 2
  • Posts: 998
October 30th, 2013 at 11:48:10 AM permalink
Quote: Wizard

Do the advanced programmers on the forum approve of this Fisher-Yates shuffle I just wrote. As you can see, after it pulls out the card in the rn-th place in the old deck it slides everything after it one position, to keep the old deck sequential and without gaps.



void fisher_yates(int deck[], int NumCards)
{
int i,j,NewDeck[416];
unsigned int rn;
for (i=0; i<NumCards; i++)
{
rn=genrand_int32()%(NumCards-i);
NewDeck[i]=deck[rn];
for (j=rn; j<NumCards-1; j++)
deck[j]=deck[j+1];
}
for (i=0; i<NumCards; i++)
deck[i]=NewDeck[i];
}



It's been a while since I've used C++, but can't you initialize the NewDeck array using NumCards instead of 416? Algorithm looks correct with the exception of the bias ME pointed out.

Check out the modulo bias that jml24 pointed out
thecesspit
thecesspit
Joined: Apr 19, 2010
  • Threads: 53
  • Posts: 5936
October 30th, 2013 at 11:49:13 AM permalink
Quote: Wizard

I disagree. If the first random number is 0, the first card in the shuffled deck will be the first card from the original deck.



Apologies, I agree with your disagreement now, misread the code.

I'm not sure you need the shuffle down though, as you can do the whole think in place, and that avoids copying the shuffled deck back into the new Deck as well. But there might be reasons to keep both old and new.

Swap Position [0] with a random card from [0-n]. The swap Position [1] with a random card from 1-n, and repeat.
"Then you can admire the real gambler, who has neither eaten, slept, thought nor lived, he has so smarted under the scourge of his martingale, so suffered on the rack of his desire for a coup at trente-et-quarante" - Honore de Balzac, 1829
JB
Administrator
JB
Joined: Oct 14, 2009
  • Threads: 334
  • Posts: 2089
October 30th, 2013 at 11:53:50 AM permalink
I see a few things wrong with this Fisher-Yates implementation:

1) The for() loop has one iteration too many; it is better to count down from (NumCards - 1) to 1

2) The process of obtaining a random number may result in a bias because 232 % NumCards may not necessarily be zero

3) There is no need for an extra array to hold the shuffled cards; they are shuffled in place with Fisher-Yates

The following code should implement it correctly:


void fisher_yates(int deck[], int NumCards)
{
int i, r, t;

for (i = NumCards - 1; i > 0; i--)
{
do
{
r = genrand_int32();
}
while ((r < 0) || (r >= i));

t = deck[r];
deck[r] = deck[i];
deck[i] = t;
}
}

The do() loop obtains random integers repeatedly until an integer in the needed range is found. Then the card in that random position is swapped with the card in position i.
Wizard
Administrator
Wizard
Joined: Oct 14, 2009
  • Threads: 1390
  • Posts: 23442
October 30th, 2013 at 12:06:35 PM permalink
Quote: wudged

It's been a while since I've used C++, but can't you initialize the NewDeck array using NumCards instead of 416?



Visual Studio doesn't let you. I tried. Seems that it should. I just know I'm going to get an array error when I use this code later for a bigger array.

Thanks JB for your comments. The only thing I have a hard time with is waiting for the random number to be in the desired range. That could take a while, as they are drawn from 0 to 2^32-1. I know computer are fast these days, but I can remember the punch card days.

Given this large cycle, I think taking the mod of a number as big as 4,294,967,295, would result in a very negligible bias. Perhaps that is the answer to ME's question.

Here is my new and improved code, per your other suggestions.


void fisher_yates(int deck[], int NumCards)
{
int i,j,random_card;
unsigned int rn;
for (i=NumCards-1; i>=0; i--)
{
rn=genrand_int32()%(i+1);
random_card=deck[rn];
for (j=rn; j<i-1; j++)
deck[j]=deck[j+1];
deck[i]=random_card;
}
}
It's not whether you win or lose; it's whether or not you had a good bet.
AcesAndEights
AcesAndEights
Joined: Jan 5, 2012
  • Threads: 67
  • Posts: 4299
October 30th, 2013 at 12:20:18 PM permalink
Quote: Wizard

Visual Studio doesn't let you. I tried. Seems that it should. I just know I'm going to get an array error when I use this code later for a bigger array.


To define a variable-sized array you have to use a pointer and the new keyword. With a constant size array the compiler allocates that much space statically on the stack, but for a variable sized array you have to allocate memory on the heap. In C you would use malloc() which is what I was going to suggest, but I think in C++ you just use new.

reference.
"So drink gamble eat f***, because one day you will be dust." -ontariodealer
wudged
wudged
Joined: Aug 7, 2013
  • Threads: 2
  • Posts: 998
October 30th, 2013 at 12:28:31 PM permalink
What about:


void fisher_yates(int deck[], int NumCards)
{
int i,j,random_card, max;
unsigned int rn;

for (i=NumCards-1; i>0; i--)
{
max = floor(4294967295 / (i + 1)) * (i + 1);
do
{
rn = genrand_int32();
}
while ((rn < 0) || (rn >= max));

rn = rn % (i + 1);

random_card=deck[rn];
deck[rn] = deck;
deck = random_card;
}
}


This will strip off only the bias part and not spend "hours" trying to find a valid random number that avoids the bias.

(Edited to change the modulo to integer division when assigning the value to max)
JB
Administrator
JB
Joined: Oct 14, 2009
  • Threads: 334
  • Posts: 2089
October 30th, 2013 at 12:33:50 PM permalink
Quote: Wizard

Thanks JB for your comments. The only thing I have a hard time with is waiting for the random number to be in the desired range. That could take a while, as they are drawn from 0 to 2^32-1. I know computer are fast these days, but I can remember the punch card days.


I also have a problem with it...

Quote: Wizard

Given this large cycle, I think taking the mod of a number as big as 4,294,967,295, would result in a very negligible bias. Perhaps that is the answer to ME's question.


A better solution would be to generate only one random number per iteration; cast it to a double; divide it by 4294967296; multiply it by i; then cast it back to an int and put it in random_card.

Quote: Wizard

Here is my new and improved code, per your other suggestions.


In the for loop initializer, the middle condition should be i > 0 (not i >= 0). Also, moving cards about the array is unnecessary and slows down the function. Just swap the array values located in positions i and random_card.

Here's a great article about shuffling algorithms which illustrates bias using a 3-card deck:

http://www.codinghorror.com/blog/2007/12/the-danger-of-naivete.html
MathExtremist
MathExtremist
Joined: Aug 31, 2010
  • Threads: 88
  • Posts: 6526
October 30th, 2013 at 1:04:20 PM permalink
Quote: wudged


max = floor(4294967295 % (i + 1)) * (i + 1);
do
{
rn = genrand_int32();
}
while ((rn < 0) || (rn >= max));


This will strip off only the bias part and not spend "hours" trying to find a valid random number that avoids the bias


Yes, except that needs to be the integer-division operation, not the modulo operation.
max = floor(MAX_INT / (i + 1)) * (i + 1);
also, for clarity I'd either change the variable name to "toobig" or subtract 1 in the above formula and change the test to rn > max (strictly greater than).

The slight bias in the original code comes from doing a modulo operation on a range that does not evenly divide by the scaling range. The same bias is easier to see (and much larger) if you consider trying to use a 3-bit RNG to generate numbers between 0 and 2 inclusive. You want p(0) = p(1) = p(2) = 1/3, but without trimming the excess off the RNG range you end up with:

0 % 3 = 0
1 % 3 = 1
2 % 3 = 2
3 % 3 = 0
4 % 3 = 1
5 % 3 = 2
6 % 3 = 0
7 % 3 = 1


p(0) = p(1) = 3/8, while p(2) = 2/8. Whoops. The solution is to discard any RNG outcomes at the upper end of the range, in this case discarding 6 or 7, so then each outcome occurs with p=2/6.
"In my own case, when it seemed to me after a long illness that death was close at hand, I found no little solace in playing constantly at dice." -- Girolamo Cardano, 1563
wudged
wudged
Joined: Aug 7, 2013
  • Threads: 2
  • Posts: 998
October 30th, 2013 at 1:16:57 PM permalink
Ah! Yes, thank you, it should be integer division and not modulo.
Wizard
Administrator
Wizard
Joined: Oct 14, 2009
  • Threads: 1390
  • Posts: 23442
October 30th, 2013 at 1:31:37 PM permalink
Quote: JB

In the for loop initializer, the middle condition should be i > 0 (not i >= 0). Also, moving cards about the array is unnecessary and slows down the function. Just swap the array values located in positions i and random_card.



You're right, that does seem to work just as well, and I don't have to bother with the card slide. Thank you.

However, I think worrying about the bias over taking the mod of 232 is rather ridiculous for practical purposes. I'm not saying discussion about it isn't worthy, but I don't want to slow down the program over it.

I'd compare this to a video poker player memorizing an exception that increases the return by 0.000000001.

Anyway, here is version 4, I think, of my shuffle.


void fisher_yates(int deck[], int NumCards)
{
int i,hold;
unsigned int rn;
for (i=NumCards-1; i>0; i--)
{
rn=genrand_int32()%(i+1);
hold=deck[rn];
deck[rn]=deck[i];
deck[i]=hold;
}
}
It's not whether you win or lose; it's whether or not you had a good bet.

  • Jump to: