September 29th, 2010 at 8:04:32 PM
I think most people on here are aware of the gambler's fallacy: the mistaken belief that what has happened previously affects what is going to happen next in independent trials. My question is, when is it the gambler's fallacy and when is it an actual representation of the odds, and how do you know? It's easy if we are dealing with dice or roulette, but what if the event is very hard to quantify mathematically, such as a football game? I'll give two examples:
The dice are rolled 1000 times with a point of 4 or 10. 335 times the 4 or 10 is made and 665 times it's a seven out.
In that same sample, left-handed males made the 4 or 10 21 times and sevened out 28 times; right-handed males passed 32 times and sevened out 141 times; left-handed females passed 84 times and sevened out 188 times; and right-handed females passed 198 times and sevened out 308 times.
Now, because we know the odds of dice (1:2 for making a 4 or 10 before a 7), we can see the first ratio is a pretty accurate representation of true odds.
In the second, it looks like left-handed males make a 4 or 10 at a ratio of 3:4, so if you are getting paid 2:1 it looks like a +EV bet.
It's simple to see the variables in the second example have no real connection to the outcome. In some instances though, such as football, when the handicapping is inexact and historic outcomes are the only info you can use to determine true odds, how do you know whether you are in the first example or the second (overfitting) example?
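To put numbers on the "+EV" claim above, here is a rough sketch of the arithmetic. The helper name `ev_per_unit` is just made up for illustration; the only inputs are the 21/28 split from the sample and the known dice odds (3 ways to roll a 4 or a 10 versus 6 ways to roll a 7).

```python
from fractions import Fraction

def ev_per_unit(p_win, payout):
    """Expected profit per 1 unit staked: win `payout` units with
    probability p_win, lose the 1 unit otherwise."""
    return p_win * payout - (1 - p_win)

# Observed left-handed-male frequency from the sample: 21 passes, 28 seven-outs
observed_p = Fraction(21, 21 + 28)

# True probability of making a 4 (or 10) before a 7 with fair dice:
# 3 winning combinations against 6 losing ones
true_p = Fraction(3, 3 + 6)

ev_observed = ev_per_unit(observed_p, 2)  # what the subgroup data seems to promise
ev_true = ev_per_unit(true_p, 2)          # what the dice actually deliver

print("EV if the 3:4 ratio were real:", ev_observed)
print("EV at true odds:", ev_true)
```

If the 3:4 ratio were real, the bet would return 2/7 of a unit per unit staked; at the true odds a 2:1 payout is exactly break-even. So the whole apparent edge rests on whether 49 rolls can be trusted.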
September 29th, 2010 at 10:01:22 PM
This is where common sense has to rear its ugly head.
In sports handicapping, RECENT past results matter, because the present makeup of the team is largely identical to that of the team when it produced those past results. Therefore, the handicapper can expect that the team will perform in similar fashion in the immediate future (and this, of course, goes for their opponents as well). This is reliable because the link between performance and results is clear, and has been empirically tested; the better teams usually win.
It would be faulty methodology to note, for instance, that a team has won more often on odd-numbered days than it should, and attach any meaning to that. In this case, unlike the above case, the correlation is illusory.
So the question you are really asking is, how do we tell the difference between a real correlation and an illusory one? Two ways:
1. We use common sense. We know that the sex or handedness of the dice thrower could not possibly have any influence on the outcome. As a quick and dirty heuristic, we can assume that the casino would have noticed such a bias in the outcomes and would forbid women, or left-handed people, or whomever, to throw the dice.
2. We empirically test the condition. This is a more rigorous version of the casino heuristic. As our sample size grows larger, the observed frequencies should converge toward the true probabilities. I would not consider any data set of under 100,000 rolls to be significant, for example, and would only pay attention to results that were five or more standard deviations to one side or the other. But those are simply MY criteria. The point is, any such hypothesis can be tested empirically; all you have to do is go out there and collect the data.
In the absence of the lack of a life necessary to rigorously test such a condition, I think the "common sense" methodology is perfectly acceptable.
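For anyone who does want to run the numbers, the test in point 2 could be sketched roughly like this. It assumes fair dice (a 1/3 chance of making a 4 or 10 before a 7) and uses the usual binomial standard deviation; the function names and the `GROUPS` table are made up for illustration, and the 100,000-roll and five-sigma thresholds are just the personal criteria mentioned above, not any kind of standard.

```python
import math

TRUE_P = 1 / 3  # chance of making a 4 or 10 before a 7 with fair dice

# (passes, seven-outs) for each subgroup, copied from the original post
GROUPS = {
    "left-handed males": (21, 28),
    "right-handed males": (32, 141),
    "left-handed females": (84, 188),
    "right-handed females": (198, 308),
}

def z_score(passes, seven_outs, p=TRUE_P):
    """Standard deviations between the observed passes and the binomial expectation."""
    n = passes + seven_outs
    expected = n * p
    std_dev = math.sqrt(n * p * (1 - p))
    return (passes - expected) / std_dev

def meets_criteria(passes, seven_outs, min_rolls=100_000, min_sigma=5.0):
    """True only if the sample is large enough AND the deviation is extreme enough."""
    n = passes + seven_outs
    return n >= min_rolls and abs(z_score(passes, seven_outs)) >= min_sigma

for name, (passes, seven_outs) in GROUPS.items():
    z = z_score(passes, seven_outs)
    verdict = "worth a look" if meets_criteria(passes, seven_outs) else "ignore"
    print(f"{name}: n={passes + seven_outs}, z={z:+.2f}, {verdict}")
```

On the numbers in the original post, left-handed males come out about 1.4 standard deviations above expectation on only 49 rolls, and no subgroup comes anywhere near 100,000 rolls, so by these criteria every one of the apparent patterns gets ignored.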
The fact that a believer is happier than a skeptic is no more to the point than the fact that a drunken man is happier than a sober one. The happiness of credulity is a cheap and dangerous quality.---George Bernard Shaw
September 29th, 2010 at 10:10:23 PM
I think you're asking whether past correlations are causative or merely coincidental. The old adage "correlation does not imply causation" is apropos here. A football team that loses more games on grass than turf is likely to continue to lose more games on grass than turf. The causative factor might be something like "the team practices on turf" or "the linemen are allergic to grass" or whatnot. A football team that loses more games when the weather in Mexico is stormy, well, that's not likely to be causative, and I wouldn't go looking for Mexican weather reports on the way to the sportsbook.
"In my own case, when it seemed to me after a long illness that death was close at hand, I found no little solace in playing constantly at dice."
-- Girolamo Cardano, 1563