konglify
Joined: Aug 28, 2014
  • Threads: 28
  • Posts: 160
April 9th, 2015 at 12:37:11 PM permalink
Hi there,
It took me a long time to calculate the blackjack payouts using basic strategy. From other threads in the forum, I understand that basic strategy is based on the expected payout for each possible initial hand, so by following the strategy we are trying to maximize our chance of winning. I have a question about what happens if we don't use the strategy table and instead proceed by trial and error.

For each starting hand, we may be able to surrender, hit, stand, double, or split. After each action, we may again choose among the same actions (or a subset of them). So suppose we try every allowed combination of actions to find the best sequence of actions for one hand. That gives the maximum payout for that initial hand. We could iterate over all possible initial hands, apply the above method to each one to find the best outcome, and take the final return to the player as the total pay divided by the total bet.

So do you think this will actually raise the chance of winning? I tried to write code to do the calculation based on this method, but I ended up with something like a 150% return to player.
ThatDonGuy
Joined: Jun 22, 2011
  • Threads: 99
  • Posts: 4759
April 9th, 2015 at 12:55:41 PM permalink
I'm not entirely sure what it is that you are trying to do.

I think what you are saying is, for each initial hand, determine the best actions for that hand by going through every possible deal and seeing which action (hit, stand, double, split, surrender) has the best average result, then use those actions to determine the return. If this is correct, then you should know that this is exactly what is done in order to calculate the basic strategy in the first place.
konglify
Joined: Aug 28, 2014
  • Threads: 28
  • Posts: 160
April 9th, 2015 at 1:47:17 PM permalink
Quote: ThatDonGuy

I'm not entirely sure what it is that you are trying to do.

I think what you are saying is, for each initial hand, determine the best actions for that hand by going through every possible deal and seeing which action (hit, stand, double, split, surrender) has the best average result, then use those actions to determine the return. If this is correct, then you should know that this is exactly what is done in order to calculate the basic strategy in the first place.


Well, it is a bit different from the procedure used to produce the strategy. Here is what I am doing:

1) Deal two random cards to the player and one card to the dealer
2) For the player, try every possible action (hit, stand, double, split, surrender) and find the set of actions leading to the highest pay
3) Record the highest pay for the return-to-player calculation
Repeat 1) through 3) 10,000,000 times to get the total pay and total bet

The main difference between what I am doing and the procedure for the strategy is that the latter, for each hand, tries all possible cards that could be dealt to find the expected value, and uses the expected value to determine the action. In my case, I don't try all possible dealt cards; instead, for the same dealt cards, I try all possible actions.

For example, let's say the deck is shuffled as 8 7 5 A 9 J Q 2 3 ...
The player has been dealt 8,7 and the dealer has 5. For that initial hand, I try all actions; for each action, if needed, I deal more cards from the same deck to see which one gives me the higher pay, and I record the highest pay.

I don't understand why this gives a return to player of over 100%.
Romes
Joined: Jul 22, 2014
  • Threads: 27
  • Posts: 5494
April 9th, 2015 at 2:18:42 PM permalink
Quote: konglify

...for example, let's say the deck is shuffled as 8 7 5 A 9 J Q 2 3 ...
The player has been dealt 8,7 and the dealer has 5. For that initial hand, I try all actions; for each action, if needed, I deal more cards from the same deck to see which one gives me the higher pay, and I record the highest pay.

I don't understand why this gives a return to player of over 100%.


First, the player's hand would be 8-5 and the dealer's card would be 7, if you're going to replicate a real shoe. Next, there are two things you could be doing. The first sounds a lot like basic strategy, in which case you're doing something drastically wrong (perhaps in your rule definitions), because over 10 million hands you should see the expected EV/house edge. I've programmed a blackjack game as well, for my particular style of play and including counting, and trust me, the return is nowhere near 150%.

Secondly, it would appear as though you 'might' be playing each hand out according to what's in the shoe and then selecting the best-case scenario (and I don't just mean optimal basic strategy play)? I don't think this is right; I'm just double checking. So in the example above you say the player has 8-7 against a dealer 5... what happens next? Does your program think about hitting, look at the next card, then determine it should hit because taking the Ace from the dealer would be more profitable in the long run than appropriately staying on 15? I.e., does your program "look ahead" in the shoe to decide what to do? If so, that's obviously why it's coming back with a 150% return, and it's not something that's realistic.

What would your program do with the following case? The shoe is: 10-7-9-10-4-x-x-x-x... where the player, according to your setup, gets 17, and the dealer has 19. Does your program notice that there's a 4 coming if it "incorrectly" hits against basic strategy, and return a Hit decision anyway, since after all taking the 4 is the most profitable play?

If your program is not "looking ahead", then it sounds like you're programming pure basic strategy, which should have a losing EV equal to whatever house edge your rules produce.
Playing it correctly means you've already won.
OnceDear
Administrator
Joined: Jun 1, 2014
  • Threads: 46
  • Posts: 5138
April 9th, 2015 at 2:23:17 PM permalink
Hi,

It's not obvious why your numbers are wrong (and they are wrong).
Here are a few suggestions.

Are you dealing your random selections from a correctly constructed shoe? E.g., for 6 decks, are you drawing from a set of 24 aces, 24 twos, etc., up to 96 tens?

Are you correctly stopping a deal at appropriate points? E.g., stop when the dealer reaches 17? Stop when your hand reaches 21? Restrict the number of splits or doubles as per normal rules?

Are you applying proper payout rules? E.g., a push with a natural doesn't pay 3:2; an ace and ten after a split does not pay 3:2, etc.

You're not doing something stupid like splitting non-pairs?

I suspect you are not accommodating the fact that the best way to play a hand, e.g. Ten, 5 will vary depending on the dealer's up card. You are not perhaps averaging your choice of 'best' across the range of potential up cards?

I'm troubled by "find the set of actions leading to the highest pay". It makes me suspect your simulation is way outside of real rules.

Random simulation is not really the best way to do this, because it will be an estimate that is only as good as your RNG. Use your programming skills to enumerate all the hands that CAN appear and analyse those. To show it's not too stressful in terms of calculation, there are ONLY 54433 possible dealer hands. From a list of those hands, you can easily calculate the probability of any one of those hands occurring, pretty precisely. You could then compare your calculated probabilities to those available on many BJ sites. When you've done that for dealer hands, do the same for player hands (a single player will be fine). You could cheat and use an infinite deck, and the difference in your percentages would be almost imperceptible. Or try with a single deck to get you started.
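
As a rough illustration of the infinite-deck shortcut suggested above, here is a minimal Python sketch that computes the exact distribution of the dealer's final total for each up card. It is only a sketch under stated assumptions: the dealer stands on all 17s (including soft 17), the result is not conditioned on the dealer not having blackjack, and the names `dealer_distribution`, `CARD_PROBS` and `BUST` are invented here for illustration.

```python
from fractions import Fraction
from functools import lru_cache

# Infinite deck: ranks 1 (ace) to 9 each have probability 1/13, ten-valued cards 4/13.
CARD_PROBS = {rank: Fraction(1, 13) for rank in range(1, 10)}
CARD_PROBS[10] = Fraction(4, 13)

BUST = 22  # assumption: every total over 21 is recorded as 22


@lru_cache(maxsize=None)
def dealer_distribution(hard_total, has_ace):
    """Map each final dealer total (17..21 or BUST) to its probability.

    hard_total counts every ace as 1; has_ace says whether the hand holds an ace
    that may be counted as 11 when that does not bust.
    """
    total = hard_total + 10 if has_ace and hard_total + 10 <= 21 else hard_total
    if total >= 17:  # assumption: dealer stands on all 17s, soft or hard
        return {min(total, BUST): Fraction(1)}
    dist = {}
    for card, p in CARD_PROBS.items():
        sub = dealer_distribution(hard_total + card, has_ace or card == 1)
        for final, q in sub.items():
            dist[final] = dist.get(final, Fraction(0)) + p * q
    return dist


for upcard in range(1, 11):
    dist = dealer_distribution(upcard, upcard == 1)
    print(upcard, {k: round(float(v), 4) for k, v in sorted(dist.items())})
```

Because the probabilities are exact fractions, the rows can be compared directly against published dealer-outcome tables for the same rules, which is one way to check a simulator before moving on to player hands.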
Take care out there. Spare a thought for the newly poor who were happy in their world just a few days ago, but whose whole way of life just collapsed..
ThatDonGuy
Joined: Jun 22, 2011
  • Threads: 99
  • Posts: 4759
April 9th, 2015 at 4:45:21 PM permalink
Quote: konglify

Well, it is a bit different from the procedure used to produce the strategy. Here is what I am doing:

1) Deal two random cards to the player and one card to the dealer
2) For the player, try every possible action (hit, stand, double, split, surrender) and find the set of actions leading to the highest pay
3) Record the highest pay for the return-to-player calculation
Repeat 1) through 3) 10,000,000 times to get the total pay and total bet

The main difference between what I am doing and the procedure for the strategy is that the latter, for each hand, tries all possible cards that could be dealt to find the expected value, and uses the expected value to determine the action. In my case, I don't try all possible dealt cards; instead, for the same dealt cards, I try all possible actions.

For example, let's say the deck is shuffled as 8 7 5 A 9 J Q 2 3 ...
The player has been dealt 8,7 and the dealer has 5. For that initial hand, I try all actions; for each action, if needed, I deal more cards from the same deck to see which one gives me the higher pay, and I record the highest pay.

I don't understand why this gives a return to player of over 100%.



Let's try a slightly different example, to see if I understand.
Suppose the deck is 8 7 9 7 K.
You are dealt the 8 and 7, and the dealer's up card is 9.
You check "stand on 15"; the dealer gets the 7 (16), then the King (bust), and you win.
You check "hit 15"; you get the 7, and bust, so you lose.
You can't split.
You check "double on 15"; you get the 7, and bust, so you lose 2.
You are recording "stand on 15", since that returns the highest value.

On the other hand, if the deck is 8 7 9 6 5 4:
You check "stand on 15": the dealer gets the 6 (15), then the 5 (20), and you lose.
You check "hit 15"; you get the 6 (21) and stand, and the dealer gets the 5 (14) and 4 (18), and you win.
You can't split.
You check "double on 15"; you get the 6, and have 21, and the dealer gets the 5 and 4, and you win 2.
You are recording "double on 15".

Note that if you combine the two hands, your total for hit = 0, for stand = 0, and for double = 0, since you win one hand and lose one hand in each case.
However, when you are counting just the best results, you count a +1 on a stand and a +2 on a double.
This is why your total > 100%. You have to add up all of the results for all of your options - not just the best ones.
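
The two decks above can be replayed in a few lines of Python to see the effect. This is a minimal sketch under simplifying assumptions that match the example rather than a full game: the dealer draws from the same card order after the player acts, stands on all 17s, "hit" means hit once and then stand, and splits, surrender and blackjack bonuses are left out. The helper names `hand_total` and `play` are made up for illustration.

```python
def hand_total(cards):
    """Blackjack total with aces stored as 1; one ace counts as 11 if that does not bust."""
    total = sum(cards)
    return total + 10 if 1 in cards and total + 10 <= 21 else total


def play(order, action):
    """Net result of one action on a fixed card order: player, player, dealer up card, rest."""
    player, dealer, rest = list(order[:2]), [order[2]], list(order[3:])
    bet = 1
    if action in ("hit", "double"):      # "hit" here means hit once, then stand
        player.append(rest.pop(0))
        bet = 2 if action == "double" else 1
    if hand_total(player) > 21:
        return -bet
    while hand_total(dealer) < 17:       # assumption: dealer stands on all 17s
        dealer.append(rest.pop(0))
    p, d = hand_total(player), hand_total(dealer)
    if d > 21 or p > d:
        return bet
    return -bet if p < d else 0


ACTIONS = ("stand", "hit", "double")
decks = [[8, 7, 9, 7, 10], [8, 7, 9, 6, 5, 4]]   # the King is stored as 10

results = [{a: play(deck, a) for a in ACTIONS} for deck in decks]
per_action_total = {a: sum(r[a] for r in results) for a in ACTIONS}
best_only_total = sum(max(r.values()) for r in results)

print(per_action_total)   # every action nets 0 over the two decks
print(best_only_total)    # keeping only the best action per deck nets +3
```

Committing to any single action breaks even over these two decks, while choosing the best action after seeing each deck shows a profit of 3 units on 2 units of initial bets, which is the same inflation seen in the 150% figure.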
konglify
Joined: Aug 28, 2014
  • Threads: 28
  • Posts: 160
April 10th, 2015 at 8:12:13 AM permalink
Quote: ThatDonGuy

Let's try a slightly different example, to see if I understand.
Suppose the deck is 8 7 9 7 K.
You are dealt the 8 and 7, and the dealer's up card is 9.
You check "stand on 15"; the dealer gets the 7 (16), then the King (bust), and you win.
You check "hit 15"; you get the 7, and bust, so you lose.
You can't split.
You check "double on 15"; you get the 7, and bust, so you lose 2.
You are recording "stand on 15", since that returns the highest value.

On the other hand, if the deck is 8 7 9 6 5 4:
You check "stand on 15": the dealer gets the 6 (15), then the 5 (20), and you lose.
You check "hit 15"; you get the 6 (21) and stand, and the dealer gets the 5 (14) and 4 (18), and you win.
You can't split.
You check "double on 15"; you get the 6, and have 21, and the dealer gets the 5 and 4, and you win 2.
You are recording "double on 15".

Note that if you combine the two hands, your total for hit = 0, for stand = 0, and for double = 0, since you win one hand and lose one hand in each case.
However, when you are counting just the best results, you count a +1 on a stand and a +2 on a double.
This is why your total > 100%. You have to add up all of the results for all of your options - not just the best ones.



Thanks. That's exactly what I am doing, as in your example. If I understand you correctly, you mean that we know the initial hands for the dealer and player, but there are many different possibilities for the rest of the deck, so we need to count all cases corresponding to the same initial hands. That is to say, we keep the first 3 cards in the deck unchanged but shuffle the rest of the deck; for each shuffle, we find the best pay and record the winning action and all the losing actions. Then, for those initial cards, we weight all winning and losing actions to find the average contribution?

I think that's why I didn't get a pay of less than 100%. When I was designing the code, I was thinking of it as playing a real game, but instead of making decisions based on the strategy, I assumed that the player can see through the deck. Now I understand that, since it is not possible for the player to know what cards will be in the deck, we need to consider the probabilities instead. So we need to consider the average.
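
The difference between the two ways of scoring can be sketched in Python by fixing the first three cards, reshuffling the rest, and comparing "average each action, then pick the best average" with "pick the best action per shuffle, then average". This is only an illustration under assumptions that are not from the thread: a single 52-card deck, three simplified actions (stand, hit once then stand, double), a dealer who draws after the player and stands on all 17s, and invented helper names (`hand_total`, `play`, `compare`).

```python
import random

ACTIONS = ("stand", "hit", "double")


def hand_total(cards):
    """Blackjack total with aces stored as 1; one ace counts as 11 if that does not bust."""
    total = sum(cards)
    return total + 10 if 1 in cards and total + 10 <= 21 else total


def play(order, action):
    """Net result of one action on a fixed card order: player, player, dealer up card, rest."""
    player, dealer, rest = list(order[:2]), [order[2]], list(order[3:])
    bet = 1
    if action in ("hit", "double"):      # "hit" here means hit once, then stand
        player.append(rest.pop(0))
        bet = 2 if action == "double" else 1
    if hand_total(player) > 21:
        return -bet
    while hand_total(dealer) < 17:       # assumption: dealer stands on all 17s
        dealer.append(rest.pop(0))
    p, d = hand_total(player), hand_total(dealer)
    if d > 21 or p > d:
        return bet
    return -bet if p < d else 0


def compare(first_three=(8, 7, 5), trials=20000, seed=1):
    """Return (average result per action, average of the per-shuffle best result)."""
    rest = [rank for rank in range(1, 10) for _ in range(4)] + [10] * 16  # one deck
    for card in first_three:
        rest.remove(card)                # the three dealt cards leave the deck
    rng = random.Random(seed)
    totals = {a: 0 for a in ACTIONS}
    best_total = 0
    for _ in range(trials):
        rng.shuffle(rest)
        outcome = {a: play(list(first_three) + rest, a) for a in ACTIONS}
        for a in ACTIONS:
            totals[a] += outcome[a]
        best_total += max(outcome.values())   # only possible if the deck is visible
    return {a: totals[a] / trials for a in ACTIONS}, best_total / trials


averages, look_ahead = compare()
print("average per action:", averages)
print("best single action on average:", max(averages.values()))
print("average of per-shuffle best (needs to see the deck):", look_ahead)
```

The last number can never be smaller than the best per-action average, because the maximum is taken separately for every shuffle. A strategy has to be chosen from the per-action averages, since that is all a player who cannot see the deck has to go on.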

The reason I started this code is that I want to find the optimal strategy instead of the basic one. So what is the main difference between optimal and basic strategy?
Romes
Joined: Jul 22, 2014
  • Threads: 27
  • Posts: 5494
April 10th, 2015 at 8:18:09 AM permalink
Quote: konglify

...The reason I started this code is that I want to find the optimal strategy instead of the basic one. So what is the main difference between optimal and basic strategy?


The main difference is that "optimal strategy," where you know everyone's cards and the next cards to come, is completely pointless. You'll never play it in a casino, and you'll never play it even for fun (there's no chance or skill to the game, just ABC decisions). I'm not sure why you'd want to spend a lot of hours programming something that no one would ever use, but if you're having fun doing it, then I guess that's what counts.
konglify
Joined: Aug 28, 2014
  • Threads: 28
  • Posts: 160
April 10th, 2015 at 8:19:23 AM permalink
Quote: OnceDear

Hi,

It's not obvious why your numbers are wrong (and they are wrong).
Here are a few suggestions.

Are you dealing your random selections from a correctly constructed shoe? E.g., for 6 decks, are you drawing from a set of 24 aces, 24 twos, etc., up to 96 tens?

Are you correctly stopping a deal at appropriate points? E.g., stop when the dealer reaches 17? Stop when your hand reaches 21? Restrict the number of splits or doubles as per normal rules?

Are you applying proper payout rules? E.g., a push with a natural doesn't pay 3:2; an ace and ten after a split does not pay 3:2, etc.

You're not doing something stupid like splitting non-pairs?

I suspect you are not accommodating the fact that the best way to play a hand, e.g. Ten, 5 will vary depending on the dealer's up card. You are not perhaps averaging your choice of 'best' across the range of potential up cards?

I'm troubled by "find the set of actions leading to the highest pay". It makes me suspect your simulation is way outside of real rules.

Random simulation is not really the best way to do this, because it will be an estimate that is only as good as your RNG. Use your programming skills to enumerate all the hands that CAN appear and analyse those. To show it's not too stressful in terms of calculation, there are ONLY 54433 possible dealer hands. From a list of those hands, you can easily calculate the probability of any one of those hands occurring, pretty precisely. You could then compare your calculated probabilities to those available on many BJ sites. When you've done that for dealer hands, do the same for player hands (a single player will be fine). You could cheat and use an infinite deck, and the difference in your percentages would be almost imperceptible. Or try with a single deck to get you started.



Thanks for the reply. I wrote code that uses basic strategy and properly deals with shuffling, splitting, hitting on soft/hard 17, etc., and I got the same result as the Wizard of Odds. I use that code to handle the points you raised about shuffling, splitting, etc. in my new code as well.

I understand your point in the last statement. I just wonder whether, instead of using the basic strategy table, we can calculate the payout directly. Ultimately, what I want to find is the optimal strategy, but it seems that the way I am going about it is wrong.
Dieter
Joined: Jul 23, 2014
  • Threads: 7
  • Posts: 1507
April 10th, 2015 at 8:29:48 AM permalink
Quote: konglify

So what is the main difference between optimal and basic strategy?



"optimal" strategy either requires next-card knowledge (extremely rare - marked cards or end decking), or card counting & index play (somewhat less rare, but still challenging).

This too is a solved problem. Almost every card counting technique suggests certain index plays - deviations from basic strategy, based on seen cards.

The first on the list is almost always insurance - there are times when it is statistically optimal to take it, and there are other times when it is not.
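
To make the insurance point concrete, here is a small Python sketch of the expected value of the insurance bet as a function of the unseen cards. It assumes the usual 2:1 insurance payout, and the function name `insurance_ev` and the example card counts are illustrative, not taken from any particular counting system.

```python
from fractions import Fraction


def insurance_ev(tens_remaining, cards_remaining):
    """Expected value per unit staked on insurance, assuming a 2:1 payout.

    The bet wins when the dealer's hole card is ten-valued.
    """
    p = Fraction(tens_remaining, cards_remaining)
    return 2 * p - (1 - p)   # equals 3p - 1, so the break-even point is p = 1/3


# Fresh single deck, only the dealer's ace seen: 16 tens among 51 unseen cards.
print(insurance_ev(16, 51))   # -1/17
# Hypothetical ten-rich remainder: 16 tens among 40 unseen cards.
print(insurance_ev(16, 40))   # 1/5
```

Under these assumptions the bet is profitable only when more than one third of the unseen cards are ten-valued, which is the kind of threshold an index play encodes.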
May the cards fall in your favor.
