thecesspit
thecesspit
  • Threads: 53
  • Posts: 5936
Joined: Apr 19, 2010
November 7th, 2010 at 10:53:06 PM permalink
I've been updating my coding skills over the last week or so, and have taken on a toy project to help myself explore Groovy scripting further. However, this question is not really limited to groovy, but a general algorithm.design pattern question.

Say I have an ordered hash map of items, each with a weighted value: ( Nothing - 50, Pair - 25, Two Pair - 10, Trips - 5).

How would you code a method that returns a random item from that list so the frequency distribution of each item would equal to the weighted value. Ideally the method must work for any ordered map. Or even any map, as I'm not sure how strong the native map ordering is in Groovy.

I've got a solution that works, but it doesn't seem to be the most efficient way of doing it.

I'm imaging there's a couple of programmers who've hit this before and sailed through it. I'm happy to share my code if it helps.
"Then you can admire the real gambler, who has neither eaten, slept, thought nor lived, he has so smarted under the scourge of his martingale, so suffered on the rack of his desire for a coup at trente-et-quarante" - Honore de Balzac, 1829
mkl654321
mkl654321
  • Threads: 65
  • Posts: 3412
Joined: Aug 8, 2010
November 7th, 2010 at 10:58:24 PM permalink
Quote: thecesspit

I've been updating my coding skills over the last week or so, and have taken on a toy project to help myself explore Groovy scripting further. However, this question is not really limited to groovy, but a general algorithm.design pattern question.

Say I have an ordered hash map of items, each with a weighted value: ( Nothing - 50, Pair - 25, Two Pair - 10, Trips - 5).

How would you code a method that returns a random item from that list so the frequency distribution of each item would equal to the weighted value. Ideally the method must work for any ordered map. Or even any map, as I'm not sure how strong the native map ordering is in Groovy.

I've got a solution that works, but it doesn't seem to be the most efficient way of doing it.

I'm imaging there's a couple of programmers who've hit this before and sailed through it. I'm happy to share my code if it helps.



Well, twenty years ago, a programmer might have simply set up an array: nothing, nothing, nothing (x10), pair, pair (x5), two pair, two pair, trips, etc. That might still be a decent brute-force way to do it, but I'm sure there's a much more elegant algorithm available.

Do you know if video poker machines (I assume that's what you're experimenting with here) still use that same ancient C program that was running the 486 processors in 1993? I know that IGT was kind of reluctant to upgrade their programming as long as the old stuff worked, since a VP program isn't exactly taxing for the CPU.
The fact that a believer is happier than a skeptic is no more to the point than the fact that a drunken man is happier than a sober one. The happiness of credulity is a cheap and dangerous quality.---George Bernard Shaw
thecesspit
thecesspit
  • Threads: 53
  • Posts: 5936
Joined: Apr 19, 2010
November 7th, 2010 at 11:03:44 PM permalink
Quote: mkl654321

Well, twenty years ago, a programmer might have simply set up an array: nothing, nothing, nothing (x10), pair, pair (x5), two pair, two pair, trips, etc. That might still be a decent brute-force way to do it, but I'm sure there's a much more elegant algorithm available.



The weighted numbers are much larger than what I've used here, so that would rapidly get ugly. However, the hash map is small.

Quote:

Do you know if video poker machines (I assume that's what you're experimenting with here) still use that same ancient C program that was running the 486 processors in 1993? I know that IGT was kind of reluctant to upgrade their programming as long as the old stuff worked, since a VP program isn't exactly taxing for the CPU.



No idea... but I'd suspect a lot of the core code has been the same for years... writing bullet proof code is hard, and testing it to ensure it's bullet proof is why they pay me. I'd avoid paying me if I was IGT and use something that was "known good".
"Then you can admire the real gambler, who has neither eaten, slept, thought nor lived, he has so smarted under the scourge of his martingale, so suffered on the rack of his desire for a coup at trente-et-quarante" - Honore de Balzac, 1829
MathExtremist
MathExtremist
  • Threads: 88
  • Posts: 6526
Joined: Aug 31, 2010
November 7th, 2010 at 11:40:29 PM permalink
Quote: thecesspit

Say I have an ordered hash map of items, each with a weighted value: ( Nothing - 50, Pair - 25, Two Pair - 10, Trips - 5).



Abstractly, you're talking about the same concept as a weighted slot machine reel. You want to "spin" the array and display the chosen "symbol". The algorithm choice comes down to space vs. time. You can do it in constant-time if you pre-generate an array of size N = total weight, as mkl suggested, and then just do a random draw between 0 and N-1. But if you're looking for a more space-efficient algorithm then you can probably auto-generate a tree and do a log-n traverse somehow. How big is N, anyway?
"In my own case, when it seemed to me after a long illness that death was close at hand, I found no little solace in playing constantly at dice." -- Girolamo Cardano, 1563
Dween
Dween
  • Threads: 66
  • Posts: 339
Joined: Jan 24, 2010
November 8th, 2010 at 5:07:27 AM permalink
I have used weighted choices for a slot simulation as well. Come to think of it, hasn't the Wizard done the same things for his Atkins slot simulation? Nonetheless, my solution follows in some kind of pseudo-code. I tried to make everything use 1-index versus zero-index:

hands = new Array(50,25,10,5);          // This initializes an array with each of the weights.

possibilities = 0; // Initialize the total number of possibilities to zero
for x=1 to hands.length // Count through the number of different outcomes (in this case, 4)
possibilities += hands(x); // Add up all the weights.
next x

function pickhand {
randnum = int(rand()*possibilities)+1; // pick a random number from 1 to total possibilities
pick = 0; // Initialize pick to zero
do {
pick++; // Add one to the pick variable
randnum -= hands(pick); // subtract the weighted chance from the random number
} while (randnum >=0); // if the random number drops to zero or below, keep the pick
return pick;
}


In English:
Add up all the weights.
Pick a number between 1 and that total.
Subtract the first weight from your picked number.
- If you hit zero or below, that's your pick.
- If not, go to the next weight and repeat.

Example:
50, 25, 10, 5 are the weights for picks 1, 2, 3, 4.
Total of weights: 90.
Pick a number between 1 and 90. (example: 75)
Subtract weight of pick 1. (75 - 50 = 25)
Drop to zero? (No)
Subtract weight of pick 2 (25 - 25 = 0)
Drop to zero? (Yes)
Pick 2 is your weighted pick.

Example 2:
Pick a number between 1 and 90. (example: 88)
Subtract weight of pick 1 (88 - 50 = 38)
Subtract weight of pick 2 (38 - 25 = 13)
Subtract weight of pick 3 (13 - 10 = 3)
Subtract weight of pick 4 (3 - 5 = -2)
Drop to zero or below? (Yes)
Pick 4 is your weighted pick.
-Dween!
weaselman
weaselman
  • Threads: 20
  • Posts: 2349
Joined: Jul 11, 2010
November 8th, 2010 at 5:35:19 AM permalink
The question is what exactly you mean be "efficient". The naive and most straightforward algorithm (scan the map once to get the sum of weights, then get a random number, and scan again to pick a value) is constant memory , and linear time, which is pretty good.

If you want to trade memory for speed, there are things you can do to make selection in constant time. For example, use the "alias method", described here: http://prxq.wordpress.com/2006/04/17/the-alias-method/. It is really really elegant, runs in constant time, and requires linear memory.

Interestingly, the map being sorted does not help the efficiency (but hurts the memory footprint). I can't think of any good approach that would make use of the sort order (apart from naive scanning of the map, that could save half the scan on average, but would still be linear).
"When two people always agree one of them is unnecessary"
thecesspit
thecesspit
  • Threads: 53
  • Posts: 5936
Joined: Apr 19, 2010
November 8th, 2010 at 7:12:15 AM permalink
X is one million.
"Then you can admire the real gambler, who has neither eaten, slept, thought nor lived, he has so smarted under the scourge of his martingale, so suffered on the rack of his desire for a coup at trente-et-quarante" - Honore de Balzac, 1829
MathExtremist
MathExtremist
  • Threads: 88
  • Posts: 6526
Joined: Aug 31, 2010
November 8th, 2010 at 8:04:24 AM permalink
Quote: thecesspit

X is one million.


Yeah, then your best bet is to use something smarter with memory than the single-array pick. You've got a few choices - both the algorithm Dween described and the tree-traversal I suggested are listed here:

http://eli.thegreenplace.net/2010/01/22/weighted-random-generation-in-python/

More on the Alias method, including a comment section describing how it works:

http://code.activestate.com/recipes/576564-walkers-alias-method-for-random-objects-with-diffe/
"In my own case, when it seemed to me after a long illness that death was close at hand, I found no little solace in playing constantly at dice." -- Girolamo Cardano, 1563
MathExtremist
MathExtremist
  • Threads: 88
  • Posts: 6526
Joined: Aug 31, 2010
November 8th, 2010 at 8:41:37 AM permalink
One more link, this time to a Springer book:
http://cg.scs.carleton.ca/~luc/rnbookindex.html

Chapter 3 probably has a solution for what you want. I haven't read it yet, but I'll probably take the PDFs to a printer and make myself a hard copy (the author says it's okay).
"In my own case, when it seemed to me after a long illness that death was close at hand, I found no little solace in playing constantly at dice." -- Girolamo Cardano, 1563
thecesspit
thecesspit
  • Threads: 53
  • Posts: 5936
Joined: Apr 19, 2010
November 8th, 2010 at 8:59:33 AM permalink
Thanks all... Dween's algorithm makes a lot of sense... my coding skills are so out of shape I'd forgotten that a WHILE loop would cut down my steps. My solution was was iterating across the entire cumalative frequency map to find the range which matched the random number. Working the other way is much smoother.

I'm going to take a look at the other algorithms as well.

I'm peeling off the whole thing into a seperate object, so the pay table, frequency and other items will be methods and variables I can call at any time.

Cheers.
"Then you can admire the real gambler, who has neither eaten, slept, thought nor lived, he has so smarted under the scourge of his martingale, so suffered on the rack of his desire for a coup at trente-et-quarante" - Honore de Balzac, 1829
thecesspit
thecesspit
  • Threads: 53
  • Posts: 5936
Joined: Apr 19, 2010
November 9th, 2010 at 12:18:45 AM permalink
Got it sorted, despite groovy not implementing a do...while.

Managed to simulate a very simple betting system using VP, plus simulating flat bets,

It all looks about the right sort of values using my mark 1 eyeball and a calculator of expected results. Next step is a more complex system, such as Mr Singer's... having done a simple "win 5 credits or lose 400", it already returns some (for me) interesting results on win %age, average win and loss, and likelihood of bustication.. several 100 "session" runs have turned out positive, but most fail. Still, I can get a 95% session win rate. I should sell this...
"Then you can admire the real gambler, who has neither eaten, slept, thought nor lived, he has so smarted under the scourge of his martingale, so suffered on the rack of his desire for a coup at trente-et-quarante" - Honore de Balzac, 1829
weaselman
weaselman
  • Threads: 20
  • Posts: 2349
Joined: Jul 11, 2010
November 9th, 2010 at 4:25:28 AM permalink
Quote: thecesspit

but most fail. Still, I can get a 95% session win rate.


How can these two sentences be true at the same time?
"When two people always agree one of them is unnecessary"
MathExtremist
MathExtremist
  • Threads: 88
  • Posts: 6526
Joined: Aug 31, 2010
November 9th, 2010 at 6:00:20 AM permalink
I think he's contrasting groups of 100 sessions vs. just 1 session. 95% of 1-sessions win (presumably, just a little bit), but most 100-sessions do not.
"In my own case, when it seemed to me after a long illness that death was close at hand, I found no little solace in playing constantly at dice." -- Girolamo Cardano, 1563
thecesspit
thecesspit
  • Threads: 53
  • Posts: 5936
Joined: Apr 19, 2010
November 9th, 2010 at 7:30:24 AM permalink
Quote: MathExtremist

I think he's contrasting groups of 100 sessions vs. just 1 session. 95% of 1-sessions win (presumably, just a little bit), but most 100-sessions do not.

Exactly. Apologies for not being clear.
"Then you can admire the real gambler, who has neither eaten, slept, thought nor lived, he has so smarted under the scourge of his martingale, so suffered on the rack of his desire for a coup at trente-et-quarante" - Honore de Balzac, 1829
  • Jump to: