konglify
  • Threads: 28
  • Posts: 160
Joined: Aug 28, 2014
April 9th, 2015 at 12:37:11 PM permalink
Hi there,
It took me a long time to calculate the blackjack payouts using the basic strategy. From other threads in the forum, I learned that the basic strategy is based on the expected payout for each possible initial hand, so by following it we are trying to maximize our chance of winning. I have a question about what happens if we don't use the strategy table and instead proceed by trial and error.

For each starting hand, we may be able to surrender, hit, stand, double, or split. After each action, we may again be allowed some or all of those actions. So let's say we try every allowed combination of actions to find the best sequence for one hand; that gives the maximum payout for that initial hand. We can then iterate over all possible initial hands, applying the method above to each to find its best outcome. The final return to the player is the total pay divided by the total bet.

So do you think this will actually raise the chance of winning? I tried to write code to do the calculation with this method, but I end up with something like a 150% return to player.
ThatDonGuy
  • Threads: 117
  • Posts: 6218
Joined: Jun 22, 2011
April 9th, 2015 at 12:55:41 PM permalink
I'm not entirely sure what it is that you are trying to do.

I think what you are saying is, for each initial hand, determine the best actions for that hand by going through every possible deal and seeing which action (hit, stand, double, split, surrender) has the best average result, then use those actions to determine the return. If this is correct, then you should know that this is exactly what is done in order to calculate the basic strategy in the first place.
konglify
  • Threads: 28
  • Posts: 160
Joined: Aug 28, 2014
April 9th, 2015 at 1:47:17 PM permalink
Quote: ThatDonGuy

I'm not entirely sure what it is that you are trying to do.

I think what you are saying is, for each initial hand, determine the best actions for that hand by going through every possible deal and seeing which action (hit, stand, double, split, surrender) has the best average result, then use those actions to determine the return. If this is correct, then you should know that this is exactly what is done in order to calculate the basic strategy in the first place.


Well, it is a bit different from the procedure that produces the strategy. Here is what I am doing:

1) Deal two random cards to the player and one card to the dealer.
2) For the player, try every possible action (hit, stand, double, split, surrender) and find the set of actions leading to the highest pay.
3) Record the highest pay for the return-to-player calculation.
Repeat 1) to 3) 10,000,000 times to get the total pay and total bet.

The main difference between what I am doing and the procedure behind the strategy is that the latter, for each hand, tries all the cards that could possibly be dealt to find the expected value, and uses the expected value to determine the action. In my case, I don't try all possible dealt cards; instead, for the same dealt cards, I try all possible actions.

For example, let's say the deck is shuffled as 8 7 5 A 9 J Q 2 3 ...
The player is dealt 8,7 and the dealer has 5. For that initial hand, I try all actions, dealing more cards from the same deck for each action if needed, to see which one gives the highest pay, and I record that highest pay.

I don't understand why this gives a return to player of over 100%.
Romes
  • Threads: 29
  • Posts: 5600
Joined: Jul 22, 2014
April 9th, 2015 at 2:18:42 PM permalink
Quote: konglify

...for example, let's say the deck is shuffled as 8 7 5 A 9 J Q 2 3 ...
The player is dealt 8,7 and the dealer has 5. For that initial hand, I try all actions, dealing more cards from the same deck for each action if needed, to see which one gives the highest pay, and I record that highest pay.

I don't understand why this gives a return to player of over 100%.


First, the player's hand would be 8-5 and the dealer's card would be 7, if you're going to replicate a real shoe. Next, there are two things you could be doing. One sounds a lot like basic strategy, in which case you're doing something drastically wrong (perhaps in your rule definitions), because over 10 million hands you should see the EV/House Edge. I've programmed a blackjack game as well, for my particular style of play and including counting, and trust me, the return is nowhere near 150%.

Secondly, it would appear as though you 'might' be playing each hand out according to what's in the shoe and then selecting the best-case scenario (and I don't just mean optimal basic strategy play)? I don't think this is right, I'm just double checking... So in the example above you say the player has 8-7 to dealer 5... what happens next? Does your program think about hitting, look at the next card, then determine it should hit because taking the Ace from the dealer would be more profitable in the long run than appropriately staying on 15? I.e., does your program "look ahead" in the shoe to decide what to do? If so, that's obviously why it's coming back with a 150% return, and it's not something that's realistic.

What would your program do with the following case... The shoe is: 10-7-9-10-4-x-x-x-x... Where the player, according to your setup, gets 17, and the dealer has 19. Does your program notice there's a 4 if it "incorrectly" hits against basic strategy and return a Hit decision anyways, since after all taking the 4 is the most profitable play?

If your program is not "looking ahead" then it sounds like you're programming pure basic strategy... Which should have a losing EV of whatever your rules are set up to make the HE.
Playing it correctly means you've already won.
OnceDear
  • Threads: 63
  • Posts: 7471
Joined: Jun 1, 2014
April 9th, 2015 at 2:23:17 PM permalink
Hi,

It's not obvious why your numbers are wrong (and they are wrong)
Here's a few suggestions.

Are you dealing your random selections from a correctly constructed shoe? E.g. for 6 decks, are you drawing from a set of 24 aces, 24 twos, etc., up to 96 tens?

Are you correctly stopping a deal at appropriate points? E.g. stop when dealer hits 17? stop when your hand reaches 21? restrict number of splits or doubles as per normal rules?

Are you applying proper payout rules? E.g. a push with a natural doesn't pay 3:2; Ace and ten after a split does not pay 3:2, etc.

You're not doing something stupid like splitting non-pairs?

I suspect you are not accommodating the fact that the best way to play a hand, e.g. Ten, 5 will vary depending on the dealer's up card. You are not perhaps averaging your choice of 'best' across the range of potential up cards?

I'm troubled by "find the set of actions leading to the highest pay". It makes me suspect your simulation is way outside of real rules.

Random is not really the best way to do this, because it will be an estimate that is only as good as your RNG. Use your programming skills to enumerate all the hands that CAN appear and analyse those. To show it's not too stressful in terms of calculation, there are ONLY 54433 possible dealer hands. From a list of those hands, you can easily calculate the probability of any one of them occurring, pretty precisely. You could then compare your calculated probabilities to those available on many BJ sites. When you've done that for dealer hands, do the same for player hands (a single player will be fine). You could cheat and use an infinite deck and the difference to your percentages would be almost imperceptible. Or try with a single deck to get you started.
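
For anyone who wants to try the infinite-deck shortcut, here is a rough Python sketch of the dealer side only. It is not anyone's actual code from this thread. It assumes the dealer stands on all 17s and that every draw is an ace with probability 1/13 and a ten-valued card with probability 4/13. The names CARD_PROBS, add_card, dealer_outcomes and dealer_distribution are made up for the illustration.

```python
from functools import lru_cache

# Assumption: infinite deck, so ranks 1 (ace) to 9 are 1/13 each and ten-valued cards 4/13.
CARD_PROBS = {rank: 1 / 13 for rank in range(1, 10)}
CARD_PROBS[10] = 4 / 13

def add_card(total, soft, card):
    """Return (total, soft) after adding a card; soft means an ace is counted as 11."""
    total += card
    if card == 1 and not soft and total + 10 <= 21:
        total, soft = total + 10, True
    if soft and total > 21:
        total, soft = total - 10, False
    return total, soft

@lru_cache(maxsize=None)
def dealer_outcomes(total, soft):
    """Probability of each final dealer result (17..21 or 'bust'), standing on all 17s."""
    if total > 21:
        return (("bust", 1.0),)
    if total >= 17:
        return ((total, 1.0),)
    result = {}
    for card, p in CARD_PROBS.items():
        for outcome, q in dealer_outcomes(*add_card(total, soft, card)):
            result[outcome] = result.get(outcome, 0.0) + p * q
    return tuple(result.items())

def dealer_distribution(upcard):
    """Distribution of the dealer's final result given the up card (1 = ace, 10 = ten)."""
    return dict(dealer_outcomes(*add_card(0, False, upcard)))

for up in (6, 10):
    print(up, dealer_distribution(up))
```

The printed probabilities can be compared against published dealer-outcome tables. A finite shoe would need the remaining card counts carried through the recursion instead of fixed probabilities.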
Psalm 25:16 Turn to me and be gracious to me, for I am lonely and afflicted. Proverbs 18:2 A fool finds no satisfaction in trying to understand, for he would rather express his own opinion.
ThatDonGuy
  • Threads: 117
  • Posts: 6218
Joined: Jun 22, 2011
April 9th, 2015 at 4:45:21 PM permalink
Quote: konglify

Well, it is a bit different from the procedure that produces the strategy. Here is what I am doing:

1) Deal two random cards to the player and one card to the dealer.
2) For the player, try every possible action (hit, stand, double, split, surrender) and find the set of actions leading to the highest pay.
3) Record the highest pay for the return-to-player calculation.
Repeat 1) to 3) 10,000,000 times to get the total pay and total bet.

The main difference between what I am doing and the procedure behind the strategy is that the latter, for each hand, tries all the cards that could possibly be dealt to find the expected value, and uses the expected value to determine the action. In my case, I don't try all possible dealt cards; instead, for the same dealt cards, I try all possible actions.

For example, let's say the deck is shuffled as 8 7 5 A 9 J Q 2 3 ...
The player is dealt 8,7 and the dealer has 5. For that initial hand, I try all actions, dealing more cards from the same deck for each action if needed, to see which one gives the highest pay, and I record that highest pay.

I don't understand why this gives a return to player of over 100%.



Let's try a slightly different example, to see if I understand.
Suppose the deck is 8 7 9 7 K.
You are dealt the 8 and 7, and the dealer's up card is 9.
You check "stand on 15"; the dealer gets the 7 (16), then the King (bust), and you win.
You check "hit 15"; you get the 7, and bust, so you lose.
You can't split.
You check "double on 15"; you get the 7, and bust, so you lose 2.
You are recording "stand on 15", since that returns the highest value.

On the other hand, if the deck is 8 7 9 6 5 4:
You check "stand on 15": the dealer gets the 6 (15), then the 5 (20), and you lose.
You check "hit 15"; you get the 6 (21) and stand, and the dealer gets the 5 (14) and 4 (18), and you win.
You can't split.
You check "double on 15"; you get the 6, and have 21, and the dealer gets the 5 and 4, and you win 2.
You are recording "double on 15".

Note that if you combine the two hands, your total for hit = 0, for stand = 0, and for double = 0, since you win one hand and lose one hand in each case.
However, when you are counting just the best results, you count a +1 on a stand and a +2 on a double.
This is why your total > 100%. You have to add up all of the results for all of your options - not just the best ones.
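
The two example decks above can be replayed in a few lines of Python. This is only a sketch under simplifying assumptions: cards are plain values (1 for an ace, 10 for the King), "hit" means take one card and then stand, and the dealer draws to 17 or more. The helper names hand_total, settle and results are invented for the illustration.

```python
def hand_total(cards):
    """Best blackjack total; cards are values with 1 for an ace and 10 for K/Q/J."""
    total = sum(cards)
    if 1 in cards and total + 10 <= 21:
        total += 10
    return total

def settle(player, dealer_up, rest, stake):
    """Dealer draws from rest until reaching 17 or more, then the stake is settled."""
    if hand_total(player) > 21:
        return -stake
    dealer = [dealer_up]
    rest = list(rest)
    while hand_total(dealer) < 17:
        dealer.append(rest.pop(0))
    d, p = hand_total(dealer), hand_total(player)
    if d > 21 or p > d:
        return stake
    return -stake if p < d else 0

def results(deck):
    """Payoff of stand, hit (one card, then stand) and double for one fixed deck order."""
    player, up, rest = deck[:2], deck[2], deck[3:]
    return {
        "stand": settle(player, up, rest, 1),
        "hit": settle(player + rest[:1], up, rest[1:], 1),
        "double": settle(player + rest[:1], up, rest[1:], 2),
    }

decks = [[8, 7, 9, 7, 10], [8, 7, 9, 6, 5, 4]]
per_deck = [results(d) for d in decks]
totals = {a: sum(r[a] for r in per_deck) for a in ("stand", "hit", "double")}
best_only = sum(max(r.values()) for r in per_deck)
print(totals)     # every action nets 0 over the two decks
print(best_only)  # picking the best result per deck gives +3
```

Each fixed action breaks even over the pair of decks, while choosing the best result after seeing the cards books +3. That gap is the look-ahead bias.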
konglify
  • Threads: 28
  • Posts: 160
Joined: Aug 28, 2014
April 10th, 2015 at 8:12:13 AM permalink
Quote: ThatDonGuy

Let's try a slightly different example, to see if I understand.
Suppose the deck is 8 7 9 7 K.
You are dealt the 8 and 7, and the dealer's up card is 9.
You check "stand on 15"; the dealer gets the 7 (16), then the King (bust), and you win.
You check "hit 15"; you get the 7, and bust, so you lose.
You can't split.
You check "double on 15"; you get the 7, and bust, so you lose 2.
You are recording "stand on 15", since that returns the highest value.

On the other hand, if the deck is 8 7 9 6 5 4:
You check "stand on 15": the dealer gets the 6 (15), then the 5 (20), and you lose.
You check "hit 15"; you get the 6 (21) and stand, and the dealer gets the 5 (14) and 4 (18), and you win.
You can't split.
You check "double on 15"; you get the 6, and have 21, and the dealer gets the 5 and 4, and you win 2.
You are recording "double on 15".

Note that if you combine the two hands, your total for hit = 0, for stand = 0, and for double = 0, since you win one hand and lose one hand in each case.
However, when you are counting just the best results, you count a +1 on a stand and a +2 on a double.
This is why your total > 100%. You have to add up all of the results for all of your options - not just the best ones.



Thanks. That's exactly what I am doing, as in your example. If I understand you correctly, you mean that we know the initial hands for the dealer and the player, but there are many possibilities for the rest of the deck, so we need to count all the cases corresponding to the same initial hands. That is to say, we keep the first 3 cards in the deck unchanged but shuffle the rest of the deck, and for each shuffle we find the best pay and record the winning action and all the losing actions. Then, for those initial cards, we weight all the winning and losing actions to find the average contribution?

I think that's why I didn't get a pay of less than 100%. When I was designing the code, I was thinking of it as playing a real game, except that instead of making decisions based on the strategy, I assumed that the player can see through the deck. Now I understand that, since the player cannot know which cards are coming, we need to consider the probabilities instead, and so take the average.
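
A small Monte Carlo sketch of that averaging idea, for whoever wants to experiment: hold the player's 8,7 and the dealer's 9 fixed, reshuffle the other 49 cards of a single deck many times, and average each action separately. The assumptions are mine: single deck, "hit" means one card and then stand, dealer draws to 17 or more, no split or surrender. The function names are hypothetical.

```python
import random

def hand_total(cards):
    """Best blackjack total; cards are values with 1 for an ace."""
    total = sum(cards)
    return total + 10 if 1 in cards and total + 10 <= 21 else total

def settle(player, dealer_up, rest, stake):
    """Play out the dealer from rest (draw to 17 or more) and settle the stake."""
    if hand_total(player) > 21:
        return -stake
    dealer, i = [dealer_up], 0
    while hand_total(dealer) < 17:
        dealer.append(rest[i])
        i += 1
    d, p = hand_total(dealer), hand_total(player)
    if d > 21 or p > d:
        return stake
    return -stake if p < d else 0

def simulate(player, dealer_up, trials, seed=1):
    """Average payoff per action, plus the average of the per-shuffle best payoff."""
    rng = random.Random(seed)
    deck = [min(rank, 10) for rank in range(1, 14)] * 4  # single 52-card deck
    for card in player + [dealer_up]:
        deck.remove(card)                                 # the three known cards
    sums = {"stand": 0, "hit": 0, "double": 0}
    foresight = 0
    for _ in range(trials):
        rng.shuffle(deck)
        outcome = {
            "stand": settle(player, dealer_up, deck, 1),
            "hit": settle(player + deck[:1], dealer_up, deck[1:], 1),
            "double": settle(player + deck[:1], dealer_up, deck[1:], 2),
        }
        for action, value in outcome.items():
            sums[action] += value
        foresight += max(outcome.values())                # the "see through the deck" number
    return {a: s / trials for a, s in sums.items()}, foresight / trials

averages, foresight = simulate([8, 7], 9, 20000)
print(averages)   # per-action averages: what a real player can expect
print(foresight)  # inflated figure from picking the best action after the fact
```

The strategy choice is the action with the best average. The foresight figure is never lower than any of the averages, which is where an over-100% return comes from.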

The reason I started this code is that I want to find the optimal strategy instead of the basic one. So what is the main difference between optimal and basic strategy?
Romes
  • Threads: 29
  • Posts: 5600
Joined: Jul 22, 2014
April 10th, 2015 at 8:18:09 AM permalink
Quote: konglify

...The reason I started this code is that I want to find the optimal strategy instead of the basic one. So what is the main difference between optimal and basic strategy?


The main difference is "optimal strategy," where you know everyone's cards and the next cards to come, is completely pointless. You'll never play it in a casino, you'll never play it even for fun (there's no chance or skill to the game, just ABC decisions). I'm not sure why you'd want to spend a lot of hours programming something that no one would ever use, but if you're having fun doing it, then I guess that's what counts.
konglify
  • Threads: 28
  • Posts: 160
Joined: Aug 28, 2014
April 10th, 2015 at 8:19:23 AM permalink
Quote: OnceDear

Hi,

It's not obvious why your numbers are wrong (and they are wrong)
Here's a few suggestions.

Are you dealing your random selections from a correctly constructed shoe? E.g. for 6 decks, are you drawing from a set of 24 aces, 24 twos, etc., up to 96 tens?

Are you correctly stopping a deal at appropriate points? E.g. stop when dealer hits 17? stop when your hand reaches 21? restrict number of splits or doubles as per normal rules?

Are you applying proper payout rules? E.g. a push with a natural doesn't pay 3:2; Ace and ten after a split does not pay 3:2, etc.

You're not doing something stupid like splitting non-pairs?

I suspect you are not accommodating the fact that the best way to play a hand, e.g. Ten, 5 will vary depending on the dealer's up card. You are not perhaps averaging your choice of 'best' across the range of potential up cards?

I'm troubled by "find the set of actions leading to the highest pay". It makes me suspect your simulation is way outside of real rules.

Random is not really the best way to do this, because it will be an estimate that is only as good as your RNG. Use your programming skills to enumerate all the hands that CAN appear and analyse those. To show it's not too stressful in terms of calculation, there are ONLY 54433 possible dealer hands. From a list of those hands, you can easily calculate the probability of any one of them occurring, pretty precisely. You could then compare your calculated probabilities to those available on many BJ sites. When you've done that for dealer hands, do the same for player hands (a single player will be fine). You could cheat and use an infinite deck and the difference to your percentages would be almost imperceptible. Or try with a single deck to get you started.



Thanks for the reply. I wrote code that uses the basic strategy and properly deals with shuffling, splitting, hitting on soft/hard 17, etc., and I got the same result as the Wizard of Odds. I reuse that code in my new code to handle the points you raise about shuffling, splitting, etc.

I understand your point in the last statement. I just wonder whether, instead of using the basic strategy table, we can directly calculate the payout. Ultimately, what I want to find is the optimal strategy, but it seems that the way I am going about it is wrong.
Dieter
Administrator
  • Threads: 16
  • Posts: 5477
Joined: Jul 23, 2014
April 10th, 2015 at 8:29:48 AM permalink
Quote: konglify

So what is the main difference between optimal and basic strategy?



"optimal" strategy either requires next-card knowledge (extremely rare - marked cards or end decking), or card counting & index play (somewhat less rare, but still challenging).

This too is a solved problem. Almost every card counting technique suggests certain index plays - deviations from basic strategy, based on seen cards.

The first on the list is almost always insurance - there are times when it is statistically optimal to take it, and there are other times when it is not.
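
As a quick illustration of the insurance point, here is a sketch in Python. It assumes the standard 2:1 insurance payout and that you know how many ten-valued cards remain among the unseen cards. The function name insurance_ev and the second set of counts are made up for the example.

```python
from fractions import Fraction

def insurance_ev(tens_remaining, cards_remaining):
    """Expected value per unit staked on insurance, which pays 2:1 if the hole card is a ten."""
    p = Fraction(tens_remaining, cards_remaining)
    return 2 * p - (1 - p)  # equals 3p - 1, so positive only when p > 1/3

# Fresh single deck, only the dealer's ace seen: 16 tens among 51 unseen cards.
print(insurance_ev(16, 51))   # -1/17
# Ten-rich remainder (hypothetical counts): 16 tens among 40 unseen cards.
print(insurance_ev(16, 40))   # 1/5
```

So insurance only becomes worth taking once more than a third of the unseen cards are tens, which is something a count can tell you and basic strategy cannot.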
May the cards fall in your favor.
konglify
  • Threads: 28
  • Posts: 160
Joined: Aug 28, 2014
April 10th, 2015 at 8:59:53 AM permalink
Quote: Dieter

"optimal" strategy either requires next-card knowledge (extremely rare - marked cards or end decking), or card counting & index play (somewhat less rare, but still challenging).

This too is a solved problem. Almost every card counting technique suggests certain index plays - deviations from basic strategy, based on seen cards.

The first on the list is almost always insurance - there are times when it is statistically optimal to take it, and there are other times when it is not.



It makes sense. So the so-called optimal strategy means improving the chance of guessing the next card, so as to increase the expected pay for each hand? And what I did wrong was to assume the next card is known with 100% certainty, which will definitely cause a payout of over 100%. Thanks.
charliepatrick
  • Threads: 39
  • Posts: 2946
Joined: Jun 17, 2011
April 10th, 2015 at 9:55:17 AM permalink
Quote: ThatDonGuy

...record the highest pay...for 10,000,000 times...

I haven't read all the thread but in theory you need to evaluate the best course of action for all permutations of your cards and the dealer's up-card. If you were working it out for perfect strategy it's an iterative process. For instance only after you've decided what to do with (say) 5 7 9, 5 7 8, 5 7 7...5 7 A, could you go back to 5 7 and see what you should do.
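
The "work out the longer hands first, then go back" idea can be sketched with a memoised recursion. This is a simplified illustration, not a full strategy calculator. It assumes an infinite deck, a dealer who stands on all 17s, only hit and stand as options, and no special handling of dealer blackjack. All the names are invented for the example.

```python
from functools import lru_cache

# Assumption: infinite deck (1 = ace, 10 = any ten-valued card).
CARD_PROBS = {**{rank: 1 / 13 for rank in range(1, 10)}, 10: 4 / 13}

def add_card(total, soft, card):
    """Return (total, soft) after adding a card; soft means an ace is counted as 11."""
    total += card
    if card == 1 and not soft and total + 10 <= 21:
        total, soft = total + 10, True
    if soft and total > 21:
        total, soft = total - 10, False
    return total, soft

@lru_cache(maxsize=None)
def dealer_final(total, soft):
    """Tuple of (final total, probability); totals above 21 mean a dealer bust."""
    if total >= 17:
        return ((total, 1.0),)
    out = {}
    for card, p in CARD_PROBS.items():
        for final, q in dealer_final(*add_card(total, soft, card)):
            out[final] = out.get(final, 0.0) + p * q
    return tuple(out.items())

def ev_stand(total, upcard):
    """Expected value of standing on total against the dealer's up card."""
    ev = 0.0
    for final, p in dealer_final(*add_card(0, False, upcard)):
        if final > 21 or total > final:
            ev += p
        elif total < final:
            ev -= p
    return ev

@lru_cache(maxsize=None)
def ev_hit(total, soft, upcard):
    """EV of hitting once, then taking the better of stand/hit at every later decision."""
    ev = 0.0
    for card, p in CARD_PROBS.items():
        new_total, new_soft = add_card(total, soft, card)
        if new_total > 21:
            ev -= p
        else:
            # The longer hand is resolved first; only then is this hand's value known.
            ev += p * max(ev_stand(new_total, upcard), ev_hit(new_total, new_soft, upcard))
    return ev

for up in (5, 10):
    print(up, round(ev_stand(15, up), 4), round(ev_hit(15, False, up), 4))
```

Under these assumptions the recursion prefers standing on hard 15 against a 5 and hitting it against a 10, in line with the usual basic strategy charts.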

I guess your method is to record at each stage the outcome for the various runs of cards for all combinations the player might have given the start deck (and say shuffle up after each hand). The problem is 10 million hands is never enough - for instance you would only get A-A vs A about 4500 times.
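
For a rough check on how rare A-A vs A is, the expected count is just the number of hands times the probability of three aces in the first three cards. The sketch below assumes a fresh shuffle every hand. The function name and the decks parameter are made up for the illustration.

```python
def expected_aa_vs_a(hands, decks=None):
    """Expected count of player A-A against a dealer ace in `hands` freshly shuffled deals."""
    if decks is None:              # infinite deck: each card is an ace with probability 1/13
        p = (1 / 13) ** 3
    else:
        aces, cards = 4 * decks, 52 * decks
        p = (aces / cards) * ((aces - 1) / (cards - 1)) * ((aces - 2) / (cards - 2))
    return hands * p

print(round(expected_aa_vs_a(10_000_000)))            # about 4552
print(round(expected_aa_vs_a(10_000_000, decks=6)))   # about 4037
print(round(expected_aa_vs_a(10_000_000, decks=1)))   # about 1810
```

The infinite-deck figure matches the "about 4500" quoted above. With a real shoe the hand is rarer still, and those few thousand samples then have to be shared among every way of playing the hand.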

Blackjack is fairly volatile. At the moment I'm trying to work out, using basic UK strategy, what the House Edge is assuming you shuffle after every hand. Each run is 1 billion hands/shoes and I'm getting a range of results; one can work out that the average number of winning Blackjacks should be 4 532 299.

         | Expected      | # hands       | Money Won   | Money Lost  | Tie        | BlackJack (win)
Average  | .461 501 900% | 1 000 000 000 | 451 724 075 | 524 322 445 | 88 112 610 | 4 532 223
Each Run | .004 585 839  | 1 000 000 000 | 451 744 980 | 524 309 673 | 88 115 784 | 4 531 923
         | .004 623 015  | 1 000 000 000 | 451 721 225 | 524 324 873 | 88 110 953 | 4 532 042
         | .004 619 640  | 1 000 000 000 | 451 726 848 | 524 325 949 | 88 105 458 | 4 531 964
         | .004 596 668  | 1 000 000 000 | 451 721 881 | 524 318 569 | 88 104 321 | 4 533 334
         | .004 633 158  | 1 000 000 000 | 451 704 160 | 524 323 531 | 88 121 691 | 4 532 414
         | .004 592 613  | 1 000 000 000 | 451 734 302 | 524 311 161 | 88 109 542 | 4 532 283
         | .004 654 202  | 1 000 000 000 | 451 715 131 | 524 343 361 | 88 120 520 | 4 531 601

If you're having difficulties, one idea might be to restrict yourself to hands where the dealer has a 10 upcard (since there are fewer hands the dealer can make) and see whether your figures correspond. Also, I'm slightly worried that if a deck were (using your method) P=J7 D=J Deck=47..., you would work out that hitting was best, but that's only based on knowledge of the deck (for instance it's fairly easy to work through all the possibilities of the next seven cards, as x 2 A A A A y is the longest needed).
ThatDonGuy
  • Threads: 117
  • Posts: 6218
Joined: Jun 22, 2011
April 10th, 2015 at 10:25:32 AM permalink
Quote: konglify

The reason I started this code is that I want to find the optimal strategy instead of the basic one. So what is the main difference between optimal and basic strategy?


That depends on your definition of optimal strategy. If it is "determine what to do with the next card based on your hand, the dealer's up card, and the cards remaining in the deck," then not only would you need to know exactly how many of each card is left in the deck at every point, but you would need to be able to calculate the probabilities instantly at that point. Nobody is smart enough to do that in their head (and even if somebody could, he would be asked to leave for card counting), and you can't use any sort of device to calculate it for you when you are at a table.

Note that basic strategy and optimal strategy are the same thing when the deck is full - but that assumes you are the only player at that table, as any other players' cards would have to be taken into account.
Dieter
Administrator
  • Threads: 16
  • Posts: 5477
Joined: Jul 23, 2014
April 10th, 2015 at 8:18:15 PM permalink
Quote: konglify

It makes sense. So the so-called optimal strategy means improving the chance of guessing the next card, so as to increase the expected pay for each hand? And what I did wrong was to assume the next card is known with 100% certainty, which will definitely cause a payout of over 100%. Thanks.



It goes back to one of the unique traits of blackjack - the cards are not shuffled after every hand.

As cards are dealt, they are removed from the undealt deck - based on what has been played, we can infer the composition of the undealt deck, and as that composition shifts, we can make adjustments both in what plays to make in a situation (index plays) and how much is wagered (to minimize the house win and maximize the player win).

If you don't know how the undealt deck differs from the full deck, use basic strategy.

If you do know how the undealt deck differs from the full deck, an index play might be in order.
98Clubs
  • Threads: 52
  • Posts: 1728
Joined: Jun 3, 2010
April 10th, 2015 at 9:13:58 PM permalink
There are several important points made... no shuffle after every hand, maximizing the win% is NOT the goal, etc.

One thing though, the rule set does determine the win%. For example, adding Late Surrender actually reduces the win%, but also reduces the House advantage. This is because surrendering is a better play in some cases.
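
A tiny numeric sketch of the surrender point, in Python. The 20%/80% figures are hypothetical, chosen only to show the effect, and the function name is made up.

```python
def play_vs_surrender(p_win, p_lose):
    """Return (EV of playing on, EV of late surrender) for a 1-unit bet."""
    ev_play = p_win - p_lose   # pushes contribute 0
    ev_surrender = -0.5        # surrender forfeits half the bet and can never win
    return ev_play, ev_surrender

# Hypothetical hand that wins 20% and loses 80% of the time when played out.
ev_play, ev_surrender = play_vs_surrender(0.20, 0.80)
print(ev_play, ev_surrender)
```

Playing on wins one hand in five but loses about 0.6 units on average. Surrendering never wins yet loses only 0.5, so the win% drops while the expected loss shrinks.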
Some people need to reimagine their thinking.