Probabilities in Reid-Angle Race

Threads: 40
Posts: 639

Joined: Nov 23, 2009

October 29th, 2010 at 8:23:17 AM permalink

Quote: Wizard
So, if my math is right, and I'm far from sure it is, I say that Angle has an 85% chance of winning. Based on the LVRJ poll only, is my math correct?

When these polls say "4% margin of error," I always take that to be a 95% confidence interval. Knowing the confidence interval is the key to this; the rest of the calculation is easy from there. Without knowing the confidence interval that the 4% represents, I'm not sure what to do with these results.

From Wikipedia:

Quote:
Like confidence intervals, the margin of error can be defined for any desired confidence level, but usually a level of 90%, 95% or 99% is chosen (typically 95%).

--Ms. D.

"Who would have thought a good little girl like you could destroy my beautiful wickedness!"

JerryLogan


Threads: 26
Posts: 1344

Joined: Jun 28, 2010

October 29th, 2010 at 8:31:48 AM permalink

Isn't the LVRJ involved in some kind of legal issue with LVA for the unauthorized posting of one of their articles on a site without written permission?

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 8:39:09 AM permalink

Quote: DorothyGale
When these polls say "4% margin of error," I always take that to be a 95% confidence interval. Knowing the confidence interval is the key to this; the rest of the calculation is easy from there. Without knowing the confidence interval that the 4% represents, I'm not sure what to do with these results.

Thanks. So if one standard deviation is 2.06%, then there is a 95% chance that Angle's actual percentage in the election will fall within 2.06%*1.96 = 4.04% of that. So on election day, there is a 95% chance her actual share will be between 48.08% and 56.17%, or 52.13% +/- 1.96*2.06%. For other readers who may be wondering where the 1.96 comes from, there is a 95% chance of falling within 1.96 standard deviations of expectations in any random sampling.
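The arithmetic in this post can be checked with a short sketch. It assumes the 2.06% figure is the binomial standard error of Angle's two-way share among the decided respondents (625 sampled, minus the roughly 6% who chose neither candidate). That derivation is a guess on my part, not something stated in the post, and the variable names are only illustrative.

```python
import math

# Assumption: 625 respondents, 49% Angle, 45% Reid; everyone else is dropped.
sample_size = 625
angle, reid = 0.49, 0.45

decided = sample_size * (angle + reid)       # about 587.5 respondents
share = angle / (angle + reid)               # Angle's share of the two-way vote
std_err = math.sqrt(share * (1 - share) / decided)

z95 = 1.96                                   # 95% of a normal curve lies within 1.96 sd
low, high = share - z95 * std_err, share + z95 * std_err

print(f"share = {share:.2%}, sd = {std_err:.2%}")
print(f"95% range: {low:.2%} to {high:.2%}")
```

Under those assumptions the sketch lands on about 52.13% with a standard deviation near 2.06%; the last digit of the range can differ slightly depending on how the intermediate values are rounded.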

It would be nice if the papers said "The 95% margin of error is 4%," rather than just "The margin of error is 4%." How are we supposed to know they are referring to a 95% confidence interval? Why not 90%, 98%, 99%, or something else?

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

crazyiam


Threads: 0
Posts: 44

Joined: Feb 5, 2010

October 29th, 2010 at 8:46:46 AM permalink

Fivethirtyeight might be the best place for election predictions. It uses poll averaging and weighting metrics to come up with predictions. I believe the methodology uses more undecided people to increase the variance of possible results.

http://elections.nytimes.com/2010/forecasts/senate/nevada

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 8:47:35 AM permalink

Quote: JerryLogan
Isn't the LVRJ involved in some kind of legal issue with LVA for the unauthorized posting of one of their articles on a site without written permission?

Yes. Please visit the copyrighted material thread.

Also, everybody, please don't quote entire articles in this forum, especially from the LVRJ. Just small quotes, and properly attribute them.

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 9:00:10 AM permalink

Quote: crazyiam
Fivethirtyeight might be the best place for election predictions. It uses poll averaging and weighting metrics to come up with predictions. I believe the methodology uses more undecided people to increase the variance of possible results.

That is why I described those other 38 people in the poll as "pesky." It would be one thing if they were wasting their votes on a third party candidate. However, it does add more variance if they are still undecided. I wish the LVRJ would have made that clear.

Nice to see my election odds are close to those of the New York Times (77.2% Angle, 22.8% Reid). I would not expect them to match exactly, since they used a different survey.

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

matilda

Threads: 3
Posts: 317

Joined: Feb 4, 2010

October 29th, 2010 at 9:02:38 AM permalink

Quote: Wizard
Thanks. So if one standard deviation is 2.06%, then there is a 95% chance that Angle's actual percentage in the election will fall within 2.06%*1.96 = 4.04% of that. So on election day, there is a 95% chance her actual share will be between 48.08% and 56.17%, or 52.13% +/- 1.96*2.06%. For other readers who may be wondering where the 1.96 comes from, there is a 95% chance of falling within 1.96 standard deviations of expectations in any random sampling.

It would be nice if the papers said "The 95% margin of error is 4%," rather than just "The margin of error is 4%." How are we supposed to know they are referring to a 95% confidence interval? Why not 90%, 98%, 99%, or something else?

Your interpretation of the probability of a confidence interval is incorrect. The reason is that the population parameter that the sample is used to estimate is a constant. It is the sample estimate that varies according to a distribution such as the normal. Thus the correct statement is that if a large number of samples were taken, then 95% of the intervals constructed would contain the parameter being estimated. As for the probability of the parameter being contained in a single interval, it is zero or 1 depending on whether it is in the interval or not.

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 9:13:05 AM permalink

Quote: matilda
Your interpretation of the probability of a confidence interval is incorrect. The reason is that the population parameter that the sample is used to estimate is a constant. It is the sample estimate that varies according to a distribution such as the normal. Thus the correct statement is that if a large number of samples were taken, then 95% of the intervals constructed would contain the parameter being estimated. As for the probability of the parameter being contained in a single interval, it is zero or 1 depending on whether it is in the interval or not.

I'm not following you. What is the 4% margin of error telling us in this poll?

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

rdw4potus

Threads: 80
Posts: 7237

Joined: Mar 11, 2010

October 29th, 2010 at 9:18:21 AM permalink

I agree with your math. Usually, there is an additional out-clause that is included in the press release for political polls. I don't see it in the LVRJ article. There are two ways for error to be introduced into political polls. One is essentially statistical variance, which is accounted for by the 4% MOE that is stated on the poll. The other is methodological error. Political pollsters, including Mason-Dixon, randomly call phone numbers off of a list to get their survey sample. Then they use a combination of census data, exit-polling info, voter registration info, and intuition to manipulate the data. For example, any telephone poll will under-sample young voters and black voters and over-sample older voters and whites. So the pollster is left adjusting the results to more closely match the expected electorate. That usually results in a line at the bottom of the release that reads something like "in addition to the stated MOE for this poll, there is a second separate source of potential error that is less easy to quantify. This poll may be statistically sound and still vary from the election results by more than the stated MOE."

Here are the crosstabs for this poll.

"So as the clock ticked and the day passed, opportunity met preparation, and luck happened." - Maurice Clarett

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 9:28:17 AM permalink

Quote: rdw4potus
Here are the crosstabs for this poll.

Thanks. Here is what that link says, in part:

Quote: LVRJ
The margin for error, according to standards customarily used by statisticians, is no more than ±4 percentage points. This means that there is a 95 percent probability that the "true" figure would fall within that range if the entire population were sampled. The margin for error is higher for any subgroup, such as a gender or regional grouping.

That is what I was trying to say in response to Dorothy's post, which Matilda has said is incorrect.

Quote: Wizard
So if one standard deviation is 2.06%, then there is a 95% chance that Angle's actual percentage in the election will fall within 2.06%*1.96 = 4.04% of that. So on election day, there is a 95% chance her actual share will be between 48.08% and 56.17%, or 52.13% +/- 1.96*2.06%.

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

DorothyGale

Threads: 40
Posts: 639

Joined: Nov 23, 2009

October 29th, 2010 at 9:40:01 AM permalink

Here is my analysis ... here is the LVRJ poll.

Reid: 45%
Angle: 49%
Margin of error: 4%
Sample size: 625

Assuming the undecided vote broke evenly (no reason to assume this, in reality she should get slightly more of the undecided because she is slightly ahead in the poll), then Angle is at 52%. If the standard deviation is 2.06%, then a result of 50% or higher corresponds to the rhs of z = -0.97, or about an 83.4% chance that Angle will win.

I'm not happy with giving Angle a true mean of 52% (assuming the undecideds break 50/50). So, assuming that the undecideds break along the same percent as the ratios in the poll, then you need to sum an infinite series to get Angle's true mean final result.

Angle's true mean is:

49% + 49%*6% + 49%*(6%)^2 + 49%*(6%)^3 + ... = 52.2%.

In this case, a result of 50% or higher corresponds to the rhs of z = -1.07, or about an 85.8% chance that Angle will win.
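For anyone who wants to reproduce the two win probabilities, here is a minimal sketch. It assumes Angle's final share is normally distributed around the adjusted poll figure with the 2.06% standard deviation quoted earlier in the thread; the function names are made up for illustration.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def win_probability(mean: float, sd: float, threshold: float = 0.50) -> float:
    """P(share > threshold) when the share is normal with the given mean and sd."""
    z = (threshold - mean) / sd
    return 1.0 - normal_cdf(z)

sd = 0.0206  # standard deviation quoted earlier in the thread
for mean in (0.52, 0.522):
    z = (0.50 - mean) / sd
    print(f"mean {mean:.1%}: z = {z:.2f}, P(win) = {win_probability(mean, sd):.1%}")
```

The results should come out close to the 83.4% and 85.8% figures above. Small differences in the last digit are expected, since the post rounds z to two decimals before looking up the probability.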

In your original post you said:

Quote:
So, if my math is right, and I'm far from sure it is, I say that Angle has an 85% chance of winning. Based on the LVRJ poll only, is my math correct?

I'm not so sure about your math, but I sure like your conclusion.

--Ms. D.

"Who would have thought a good little girl like you could destroy my beautiful wickedness!"

matilda

Threads: 3
Posts: 317

Joined: Feb 4, 2010

October 29th, 2010 at 9:42:22 AM permalink

Quote: Wizard
I'm not following you. What is the 4% margin of error telling us in this poll?

It is saying the interval is 8% wide. If it in fact is a 95% confidence interval, then it is telling us that if a large number of samples were taken and the sample proportion was calculated for each and such an interval was calculated for each sample, then 95% of the intervals would contain the true, but unknown population proportion. Or put another way, if 1,000,000 polls were taken at the same time, then approximately 95% of the polls would be correct because the interval would contain the true proportion of voters. 5% of the polls would be wrong because the true proportion would fall outside of the interval.
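The "many polls taken at the same time" reading can be illustrated with a small simulation. This is only a sketch: the true proportion of 52% is a hypothetical value chosen for the demonstration, each poll is treated as a simple random sample of 625, and 2,000 polls stand in for the million mentioned above to keep the run short.

```python
import random

def coverage(true_p: float, n: int, polls: int, z: float = 1.96, seed: int = 1) -> float:
    """Fraction of simulated polls whose interval contains the fixed true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(polls):
        # One poll: n independent respondents, each favoring the candidate with prob true_p.
        votes = sum(rng.random() < true_p for _ in range(n))
        p_hat = votes / n
        moe = z * (p_hat * (1 - p_hat) / n) ** 0.5
        if p_hat - moe <= true_p <= p_hat + moe:
            hits += 1
    return hits / polls

print(coverage(true_p=0.52, n=625, polls=2000))
```

The true proportion never moves in this simulation; only the poll results and their intervals do. The printed fraction should land near 0.95, which is the sense in which the interval is a "95%" interval.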

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 10:04:24 AM permalink

Quote: matilda
It is saying the interval is 8% wide. If it in fact is a 95% confidence interval, then it is telling us that if a large number of samples were taken and the sample proportion was calculated for each and such an interval was calculated for each sample, then 95% of the intervals would contain the true, but unknown population proportion. Or put another way, if 1,000,000 polls were taken at the same time, then approximately 95% of the polls would be correct because the interval would contain the true proportion of voters. 5% of the polls would be wrong because the true proportion would fall outside of the interval.

First, what do you mean by "wide"? The true Gaussian curve should be infinitely wide.

Second, I don't see how what you said is different from what I wrote. I'm claiming that the true proportion of Angle voters will fall between 48.08% and 56.17% with a 95% chance. Where do you put the range?

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

matilda

Threads: 3
Posts: 317

Joined: Feb 4, 2010

October 29th, 2010 at 10:06:17 AM permalink

The LVRJ quote is also incorrect for the same reason.

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 10:12:29 AM permalink

Quote: matilda
The LVRJ quote is also incorrect for the same reason.

At least I have company. Care to quote any source that takes your side?

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

DJTeddyBear


Threads: 207
Posts: 10992

Joined: Nov 2, 2009

October 29th, 2010 at 10:24:28 AM permalink

Quote: Wizard
It would be nice if the papers said "The 95% margin of error is 4%,"

That would confuse people even more. The immediate reaction would be, "What about the last 1%?"

I never knew how they calculate the margin of error, or what it really means. I never before heard the phrase 'confidence interval'.

Does a 4% margin mean that they think, based upon the demographic analysis, that their poll sample is only a 96% accurate representation of the population?
Or does the 4% margin mean that they think 4% of the people polled might change their mind?
Or does the 4% margin mean that they think 4% of the people polled are wise-asses who intentionally gave the wrong answer?

I invented a few casino games. Info: http://www.DaveMillerGaming.com/ Superstitions are silly, childish, irrational rituals, born out of fear of the unknown. But how much does it cost to knock on wood? 😁

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 10:32:58 AM permalink

Quote: DJTeddyBear
That would confuse people even more. The immediate reaction would be, "What about the last 1%?"

I never knew how they calculate the margin of error, or what it really means. I never before heard the phrase 'confidence interval'.

Does a 4% margin mean that they think, based upon the demographic analysis, that their poll sample is only a 96% accurate representation of the population?
Or does the 4% margin mean that they think 4% of the people polled might change their mind?
Or does the 4% margin mean that they think 4% of the people polled are wise-asses who intentionally gave the wrong answer?

Matilda will give a different answer, but I claim it means that the actual results will be within the stated margin of error of the poll results 95% of the time. For example, the LVRJ said Angle got 49% in their poll, and Reid 45%. So I claim that on election day Angle will get 45% to 53% with 95% chance. This factors in third party candidates and "none of the above," which I filtered out in my previous analysis. I'm still not sure what matilda would say, if put in layman's terms.
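On the question of how the margin of error is calculated: one common textbook formula, which may or may not be exactly what the pollster used, is 1.96 standard errors of a sample proportion. The sketch below assumes a simple random sample of 625 with no weighting adjustments; the function name is illustrative.

```python
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of a z-level interval for a sample proportion p with n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

print(f"{margin_of_error(625):.2%}")          # worst case, p = 0.5
print(f"{margin_of_error(625, p=0.49):.2%}")  # using Angle's 49%
```

Both cases come out just under 4%, which is consistent with the "no more than ±4 percentage points" wording and with the figure reflecting sampling error at a 95% level.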

What I think the papers should do is just come right out and say what the probability of each candidate winning is. That is the important thing.

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

Doc


Threads: 46
Posts: 7287

Joined: Feb 27, 2010

October 29th, 2010 at 10:40:39 AM permalink

Quote: Wizard
... Second, I don't see how what you said is different from what I wrote.

Just to butt my nose in where it wasn't invited, I think the difference in interpretations may go something like this, taking it away from political polls.

Suppose our experiment were to flip a coin 100 times and determine how often it came up heads. Suppose we were to repeat that experiment a number of times. We can study that topic and learn what the true mean is and what the variance is. We can determine these numbers without ever actually flipping a coin at all. This is a population distribution.

Knowing the population mean and standard deviation, we can use this info to predict with a degree of confidence the result of an individual experiment, which would be a sample of size 1 from the population. We can also predict the variability that we would encounter in this sample mean if we conducted the experiment a number of times.

On the other hand, suppose we don't know the population distribution (as we don't know the true opinions of potential voters). If we take a sample, perhaps a fairly large one, we can use its mean to estimate the mean of the population. If we took lots of samples, we could make lots of estimates of the mean of the population, each with some degree of confidence.

I think that matilda is pointing out that it is a different thing to use one or many samples to estimate a population mean than it is to use the true population mean and standard deviation to estimate what will be found in a sample.
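The two directions described in this post can be put side by side in a rough sketch. The fair coin and 100 flips come from the example above; the observed count of 56 heads is a hypothetical result invented for the illustration.

```python
import math

n, p = 100, 0.5                          # fair coin, 100 flips per experiment

# Direction 1: known population -> predict what one experiment will show.
mean_heads = n * p                       # 50 heads expected
sd_heads = math.sqrt(n * p * (1 - p))    # standard deviation of 5 heads
predict = (mean_heads - 1.96 * sd_heads, mean_heads + 1.96 * sd_heads)

# Direction 2: one observed experiment -> estimate the unknown population value.
observed_heads = 56                      # hypothetical outcome of a single experiment
p_hat = observed_heads / n
se = math.sqrt(p_hat * (1 - p_hat) / n)
estimate = (p_hat - 1.96 * se, p_hat + 1.96 * se)

print("predicted heads range:", predict)
print("estimated P(heads) range:", estimate)
```

In the first case the randomness is in the future sample; in the second the sample is already in hand and the interval is built around it, which is the situation a poll is in.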

But I am very rusty at statistics and never really knew it all that well in the first place.

mkl654321


Threads: 65
Posts: 3412

Joined: Aug 8, 2010

October 29th, 2010 at 10:45:10 AM permalink

Quote: Wizard
Matilda will give a different answer, but I claim it means that the actual results will be within the stated margin of error of the poll results 95% of the time. For example, the LVRJ said Angle got 49% in their poll, and Reid 45%. So I claim that on election day Angle will get 45% to 53% with 95% chance. This factors in third party candidates and "none of the above," which I filtered out in my previous analysis. I'm still not sure what matilda would say, if put in layman's terms.

What I think the papers should do is just come right out and say what the probability of each candidate winning is. That is the important thing.

I think any such rigorous calculations are GIGO, because poll results are representative samples consisting of, not voters, but those who choose to answer/are solicited to answer pre-election polls. I doubt very much that the two sets--"voters" and "poll respondents"--are mutually congruent enough to make polls any more than a highly inaccurate barometer of the eventual results.

The fact that a believer is happier than a skeptic is no more to the point than the fact that a drunken man is happier than a sober one. The happiness of credulity is a cheap and dangerous quality.---George Bernard Shaw

DorothyGale

Threads: 40
Posts: 639

Joined: Nov 23, 2009

October 29th, 2010 at 10:47:20 AM permalink

Quote: Wizard
So I claim that on election day Angle will get 45% to 53% with 95% chance.

False. Your interval does not include undecided voters. The actual election day interval is that with probability 0.95, the election day final total for Angle will be between 48.2% to 56.2% (per my computation of a 52.2% mean for Angle, above).

Any final predictions must include all undecided voters. A reasonable way to cast their votes is to say they will break consistent with the polls.

This assumes that the poll question was not "If you knew that Reid murdered babies, would you vote for him, or would you vote for his opponent, Angle, who has never murdered a baby?"

--Dorothy

"Who would have thought a good little girl like you could destroy my beautiful wickedness!"

rdw4potus

Threads: 80
Posts: 7237

Joined: Mar 11, 2010

October 29th, 2010 at 10:47:52 AM permalink

Quote: Wizard
Matilda will give a different answer, but I claim it means that the actual results will be within the stated margin of error of the poll results 95% of the time. For example, the LVRJ said Angle got 49% in their poll, and Reid 45%. So I claim that on election day Angle will get 45% to 53% with 95% chance. This factors in third party candidates and "none of the above," which I filtered out in my previous analysis. I'm still not sure what matilda would say, if put in layman's terms.

What I think the papers should do is just come right out and say what the probability of each candidate winning is. That is the important thing.

If you have time to kill, check out fivethirtyeight.com. There is nobody in the industry better at what you're looking at doing than Nate Silver. I think you would especially like the regression-based approach that Silver uses.

I can tell you with a high degree of confidence that the paper will never be willing to post the win probability based on a single poll. They'd be taking a HUGE flyer that the demographic weightings in the poll were correct. They'd also be staking their name to Mason-Dixon's work. Some pollsters in recent years (Strategic Vision, Research 2000) have come into significant legal troubles for basically fabricating polling results. NOTE: The R2000 case is ongoing. I am NOT making a statement about whether or not they actually did fabricate polls, I'm simply stating that LVRJ would take significant risk by staking their reputation to the win % suggested by a single poll. *edit*: I also do not mean to imply that Mason-Dixon is anything less than an accurate and reputable institution.

"So as the clock ticked and the day passed, opportunity met preparation, and luck happened." - Maurice Clarett

DorothyGale

Threads: 40
Posts: 639

Joined: Nov 23, 2009

October 29th, 2010 at 10:52:45 AM permalink

Quote: rdw4potus
There is nobody in the industry better at what you're looking at doing than Nate Silver.

Nate has a decidedly Libertarian/Right lean to his presentation. He's smart, but he's also personally biased. This bias shows in both his results, by means of how he weights the different polls, and his analysis of those results. He's not Fox biased, but he's out there.

But if you want to see REAL bias, just read a few copies of the LVRJ. The fact that the poll is presented in the LVRJ is almost enough to discredit it, prima facie.

--Dorothy

"Who would have thought a good little girl like you could destroy my beautiful wickedness!"

rdw4potus

Threads: 80
Posts: 7237

Joined: Mar 11, 2010

October 29th, 2010 at 11:03:18 AM permalink

Quote: DorothyGale
False. Your interval does not include undecided voters. The actual election day interval is that with probability 0.95, the election day final total for Angle will be between 48.2% to 56.2% (per my computation of a 52.2% mean for Angle, above).

Any final predictions must include all undecided voters. A reasonable way to cast their votes is to say they will break consistent with the polls.

--Dorothy

Are you assuming that 100% of the "undecided" "likely" voters will show up to the polls? I'd caution against that. Anyone who says they're likely to vote but doesn't know who they're voting for 4 days before the election is at risk for not voting, leaving this question blank, or voting for "none of these." Speaking of which...is it too late for me to start campaigning for "none of these?" I think it'd have a real chance in this election...

I think, though I'm not sure, that the history also favors the incumbent with respect to undecided voters this late in the cycle. Of course, this election could easily break with that trend. But I would be willing to wager that more of the "undecideds" at this point are trying to justify voting against Reid instead of trying to justify voting for Angle.

"So as the clock ticked and the day passed, opportunity met preparation, and luck happened." - Maurice Clarett

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 11:04:03 AM permalink

Quote: mkl654321
I think any such rigorous calculations are GIGO, because poll results are representative samples consisting of, not voters, but those who choose to answer/are solicited to answer pre-election polls. I doubt very much that the two sets--"voters" and "poll respondents"--are mutually congruent enough to make polls any more than a highly inaccurate barometer of the eventual results.

I think a good poll will factor in such biases. rdw4potus already made a good post on that issue. If you don't think the LVRJ poll is accurate, I'll give you 3 to 1 on Reid.

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

rdw4potus

Threads: 80
Posts: 7237

Joined: Mar 11, 2010

October 29th, 2010 at 11:05:43 AM permalink

Quote: DorothyGale
Nate has a decidedly Libertarian/Right lean to his presentation. He's smart, but he's also personally biased. This bias shows in both his results, by means of how he weights the different polls, and his analysis of those results. He's not Fox biased, but he's out there.

But if you want to see REAL bias, just read a few copies of the LVRJ. The fact that the poll is presented in the LVRJ is almost enough to discredit it, prima facie.

--Dorothy

It's funny you say that about Nate. His personal politics are pretty far left, and he tries to take that out of his analysis. Based on your comments, it sounds like he's over-correcting.

"So as the clock ticked and the day passed, opportunity met preparation, and luck happened." - Maurice Clarett

DorothyGale

Threads: 40
Posts: 639

Joined: Nov 23, 2009

October 29th, 2010 at 11:15:54 AM permalink

Quote: rdw4potus
Are you assuming that 100% of the "undecided" "likely" voters will show up to the polls?

No, I'm saying the final results Reid+Angle = 100%, and Mr. W overlooked this when he gave his interval. It doesn't matter who shows up -- if no undecided voters showed up, then the final result would be:

Reid = 45/(45+49) = 47.87%.
Angle = 49/(45+49) = 52.13%.

So, the Angle interval is centered around 52%, not around 49%.

--Dorothy

"Who would have thought a good little girl like you could destroy my beautiful wickedness!"

DorothyGale

Threads: 40
Posts: 639

Joined: Nov 23, 2009

October 29th, 2010 at 11:20:10 AM permalink

Quote: Wizard
I'll give you 3 to 1 on Reid.

I'll take 9-to-2 -- in units of "honor," of course. By my records, you are currently up 3 units of honor on me.

--Dorothy

"Who would have thought a good little girl like you could destroy my beautiful wickedness!"

rdw4potus

Threads: 80
Posts: 7237

Joined: Mar 11, 2010

October 29th, 2010 at 11:24:01 AM permalink

Quote: Wizard
I think a good poll will factor in such biases. rdw4potus already made a good post on that issue. If you don't think the LVRJ poll is accurate, I'll give you 3 to 1 on Reid.

I will take your action at 3:1. Payable in your choice of honor units or beverages (to be claimed at WOVcon I).

I would also book bets on other close races. I'll move that to another thread.

"So as the clock ticked and the day passed, opportunity met preparation, and luck happened." - Maurice Clarett

mkl654321


Threads: 65
Posts: 3412

Joined: Aug 8, 2010

October 29th, 2010 at 11:28:18 AM permalink

Quote: Wizard
I think a good poll will factor in such biases. rdw4potus already made a good post on that issue. If you don't think the LVRJ poll is accurate, I'll give you 3 to 1 on Reid.

If a poll did collate all such potential biasing information (demographic and otherwise), then use past results to weight the collected data accordingly, then the poll would indeed be much more reliable. I think that such a realization, that all such respondents do not have equal weight, is absolutely necessary for the poll to mean anything.

Here in my neck of the woods, in 2008, an Obama win was a foregone conclusion, but the polls showed the race to be closer than it was, because students don't respond to polls, and 99.99% of students voted for Obama (a couple were stoned and punched the wrong button). In that case, the polls were so unrepresentative of what turned out to be the actual electorate, that they were utterly meaningless.

Reid is almost certainly going to win, because he has entrenched Party machinery behind him, he has massive funding available that his opponent does not, and he doubtless has a list of favors to call in longer than Santa's Christmas list. There's no way the Demos will let Reid lose--he's "too big to fail". It will also help Reid that his opponent is a raving loon. So I think you're not offering high enough odds--I would lay 5 to 1, or higher.

(By the way, I wonder if Harrah's is lobbying the state legislature to allow betting on elections, like in Great Britain. Think of the revenue...)

The fact that a believer is happier than a skeptic is no more to the point than the fact that a drunken man is happier than a sober one. The happiness of credulity is a cheap and dangerous quality.---George Bernard Shaw

rdw4potus

Threads: 80
Posts: 7237

Joined: Mar 11, 2010

October 29th, 2010 at 11:52:46 AM permalink

Quote: mkl654321

If a poll did collate all such potential biasing information (demographic and otherwise), then use past results to weight the collected data accordingly, then the poll would indeed be much more reliable. I think that such a realization, that all such respondents do not have equal weight, is absolutely necessary for the poll to mean anything.

That is what they do, using a combination of past results and demography. But everyone does it differently, and it's more art than science. How do you scientifically adjust your biased poll to reflect the electorate, when the only clues you have about what this year's electorate will look like come from your admittedly biased poll (and other similarly flawed polls)?

Also, you imply that students simply do not reply to polls. I agree that students are both more likely to be unavailable in the evening and less likely to want to participate than the general population. There are a couple other factors to consider: 1. Many pollsters do not call cell phones. They completely miss households that do not include landline phones. This demographic is dominated by young voters. 2. Most pollsters only call residential landline phones. This can miss some key groups, like networks that are associated with one main "business" number. For example, the residence halls at my alma mater all roll back up to the University's main switchboard. Thus, no on-campus student landline phones can be sampled.

"So as the clock ticked and the day passed, opportunity met preparation, and luck happened." - Maurice Clarett

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 12:44:44 PM permalink

Quote: DorothyGale
False. Your interval does not include undecided voters. The actual election day interval is that with probability 0.95, the election day final total for Angle will be between 48.2% to 56.2% (per my computation of a 52.2% mean for Angle, above).

I was trying to make things as simple as possible for DJTeddyBear. So I assumed that there were 0 undecided voters, and the other 6% were for third party candidates, or "none of the above," which is on the ballot in Nevada. As stated in other posts, I put the Angle mean at 52.13% and the 95% range at 48.08% to 56.17%.

Also, I think your 52.2% is in error.

49% + 49%*6% + 49%*(6%)^2 + 49%*(6%)^3 + ... = 52.13%, not 52.2%.
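The series can be checked numerically. The sketch below sums the first 50 terms, compares that with the closed form a/(1 - r) for a geometric series, and with the plain two-way share 49/(49+45); the variable names are illustrative.

```python
angle, reid, other = 0.49, 0.45, 0.06

# Partial sum of 49% + 49%*6% + 49%*(6%)^2 + ... (50 terms is plenty for convergence)
partial = sum(angle * other**k for k in range(50))

closed_form = angle / (1 - other)     # a / (1 - r) for a geometric series
two_way = angle / (angle + reid)      # 49 / (49 + 45)

print(f"{partial:.4%}  {closed_form:.4%}  {two_way:.4%}")
```

All three agree at about 52.13%, so splitting the other 6% in proportion to the poll is the same as simply dropping them and renormalizing.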

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

DorothyGale

Threads: 40
Posts: 639

Joined: Nov 23, 2009

October 29th, 2010 at 12:48:47 PM permalink

Agree ... the sum of the series is the same as (49/(49+45)) ... 52.13% ... given that the probability of a Reid win is about 15%, that makes true odds about 17-to-3. So, will you take 9-to-2? I'll put up 2 units of Honor on Reid.
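The odds arithmetic can be sketched with exact fractions. This assumes the 15% figure for a Reid win is taken at face value; the helper names are hypothetical.

```python
from fractions import Fraction

def fair_odds_against(p_win: Fraction) -> Fraction:
    """Fair payout per unit staked on an outcome that wins with probability p_win."""
    return (1 - p_win) / p_win

def expected_value(p_win: Fraction, payout: Fraction) -> Fraction:
    """Expected profit per unit staked when the bet pays payout-to-1."""
    return p_win * payout - (1 - p_win)

p_reid = Fraction(15, 100)
print(fair_odds_against(p_reid))      # 17/3

# Compare the two prices discussed in the thread: 3-to-1 and 9-to-2.
for payout in (Fraction(3), Fraction(9, 2)):
    print(f"{payout}-to-1 on Reid: EV per unit = {expected_value(p_reid, payout)}")
```

At a 15% win probability both prices carry a negative expectation for the Reid backer, since each pays less than the fair 17-to-3. A bettor taking either one is implicitly saying Reid's chances are better than 15%.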

--Ms. D.

"Who would have thought a good little girl like you could destroy my beautiful wickedness!"

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 12:56:56 PM permalink

Quote: DorothyGale
Agree ... the sum of the series is the same as (49/(49+45)) ... 52.13% ... given that the probability of a Reid win is about 15%, that makes true odds about 17-to-3. So, will you take 9-to-2? I'll put up 2 units of Honor on Reid.

My gut tells me that Reid's odds are better than 15%. I think the phone polling will favor Angle, and the last two major polls were done by right-leaning media. No, 3 to 1 is the best I can do.

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

matilda

Threads: 3
Posts: 317

Joined: Feb 4, 2010

October 29th, 2010 at 2:20:46 PM permalink

Quote: Wizard
At least I have company. Care to quote any source that takes your side?

Sorry for the delay in answering. Any beginning statistics textbook will give the interpretation of a confidence interval.

The random interval [X̄ − 1.96σ/√n, X̄ + 1.96σ/√n] contains the true parameter θ with 95% probability. It is wrong to say that θ lies in the interval with 95% probability...θ is not a RV!

http://ocw.mit.edu/courses/economics/14-30-introduction-to-statistical-method-in-economics-spring-2006/lecture-notes/l9.pdf

(the quote didn't translate completely)

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 2:34:20 PM permalink

Quote: matilda
Sorry for the delay in answering. Any beginning statistics textbook will give the interpretation of a confidence interval.

The random interval [X̄ − 1.96σ/√n, X̄ + 1.96σ/√n] contains the true parameter θ with 95% probability. It is wrong to say that θ lies in the interval with 95% probability...θ is not a RV!

I don't see the difference. Until we know what θ is, it is a random variable. If we knew what it was, we wouldn't need to bother having the election.

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

rdw4potus

Threads: 80
Posts: 7237

Joined: Mar 11, 2010

October 29th, 2010 at 2:41:58 PM permalink

Quote: Wizard
I don't see the difference. Until we know what θ is, it is a random variable. If we knew what it was, we wouldn't need to bother having the election.

This couldn't be a more semantic argument, but I'm going to try to make it. θ is an unknown, but not random, number. It does not move. It is more correct to say that a given poll's interval surrounds the stationary θ with 95% confidence.

"So as the clock ticked and the day passed, opportunity met preparation, and luck happened." - Maurice Clarett

scotty81

Threads: 8
Posts: 185

Joined: Feb 4, 2010

October 29th, 2010 at 3:07:51 PM permalink

Quote: Wizard
The Las Vegas Review Journal (http://www.lvrj.com/news/angle-poll-data-improve-106287803.html) just published the latest poll on the race. Here is what they have:

Reid: 45%
Angle: 49%
Margin of error: 4%
Sample size: 625

I'm not sure how the exact math would work, but here is how I have always understood "Margin of error" in layman's terms:

The margin of error represents how the results would move given a normal distribution within 3 Sigmas (standard deviations) on each side of the curve. Of course, anything is possible, but from a practical standpoint a 3 Sigma event is considered outside of the "Margin of error"

So, on one end of the spectrum, you can add 4% to Reid, and subtract 4% from Angle, and this would be considered a 3 Sigma event:

Reid: 49%
Angle: 44%

At the other end of the spectrum, you can add 4% to Angle and subtract 4% from Reid, and this would be considered a 3 Sigma event:

Reid: 41%
Angle: 53%

The normal distribution between these +/- 3 Sigma events represents the probabilities of the election outcome.

I'm not saying this is correct, just that that is what I have always been told. I have no source for this.

Prediction is very difficult, especially about the future. - Niels Bohr

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 3:44:15 PM permalink

Quote: scotty81
The margin of error represents how the results would move given a normal distribution within 3 Sigmas (standard deviations) on each side of the curve. Of course, anything is possible, but from a practical standpoint a 3 Sigma event is considered outside of the "Margin of error".

Maybe that is how it is used in some other venue, but for elections, if my understanding is correct, you only move 1.96 standard deviations in either direction.
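
As a sanity check, the poll's own numbers can be run both ways. This is a sketch that assumes the usual worst-case proportion of 50% for the standard error; the variable names are illustrative.

```python
import math

n = 625   # sample size reported for the poll
p = 0.5   # worst-case proportion, which maximizes the standard error (assumption)

# Standard error of a sample proportion: sqrt(p * (1 - p) / n).
std_error = math.sqrt(p * (1 - p) / n)

print(f"1.96 standard errors: {1.96 * std_error:.2%}")
print(f"3 standard errors:    {3 * std_error:.2%}")
```

With 625 respondents the standard error is 2%, so 1.96 standard errors is 3.92%, which rounds to the reported 4%. Three standard errors would be 6%.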

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

matilda

Threads: 3
Posts: 317

Joined: Feb 4, 2010

October 29th, 2010 at 4:40:12 PM permalink

Quote: Wizard
I don't see the difference. Until we know what θ is, it is a random variable. If we knew what it was, we wouldn't need to bother having the election.

No, it is not a variable; it is fixed in value. Let me try this: What is the probability that the dealer has a ten down given he shows an ace up? We all can calculate this. Why? Because we know the population distribution: 52 cards, 16 tens, 4 aces. If we did not know the composition of a deck of cards, we could not calculate the probability. Suppose you see ten cards on the table, but you have no idea of the deck used by this casino. You cannot calculate the probability of a ten down.

In the case of a poll, unlike the deck of cards, we do not know the population distribution because the binomial (actually it's a multinomial with undecideds) is defined by the probability of success and that parameter is unknown. If it were known, why take the poll? Therefore you cannot calculate a probability such as a 95% range because you don't know what the population distribution is.

But when you interpret the interval as you do, you are actually saying that you know what the population distribution is, for there is no way to calculate the probability without this knowledge. What you are actually doing is saying that the sampling distribution is the same as the population distribution. But it is not. If it were, then how do you account for different sampling distributions coming from samples of different sizes?

This is of course the key. The interval is the variable, dependent on which sample happens to be collected. For each possible sample an interval can be calculated. While we cannot calculate a probability about the population since the distribution is not known, we can calculate probabilities about the intervals themselves because the sampling distribution is known. We can calculate the probability that a particular interval contains a certain value. And more important for this discussion, we can calculate the percentage of intervals which contain a certain value. Therefore the correct interpretation of a 95% CI is that if all samples possible were taken and a 95% CI was constructed for each sample, then 95% of these intervals would contain the unknown, but constant, population parameter being estimated. The interval is the random variable, not the population parameter, which is a constant whether you know its value or not.
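
The repeated-sampling interpretation can be illustrated with a small simulation. This is only a sketch: the true proportion of 52% is an arbitrary assumption, the function name is made up, and the interval is the usual normal-approximation one.

```python
import math
import random

def coverage(true_p: float, n: int, polls: int, seed: int = 0) -> float:
    """Fraction of simulated polls whose 95% interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(polls):
        # One simulated poll: n independent respondents, each a "yes" with probability true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        # Normal-approximation 95% interval built from this sample alone.
        half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - half_width <= true_p <= p_hat + half_width:
            hits += 1
    return hits / polls

print(coverage(true_p=0.52, n=625, polls=2000))
```

The fixed value true_p never moves. What changes from poll to poll is the interval, and the printed fraction should land near 0.95.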

EvenBob

Threads: 441
Posts: 28676

Joined: Jul 18, 2010

October 29th, 2010 at 4:49:36 PM permalink

Oops, never mind.

"It's not called gambling if the math is on your side."

Doc

Threads: 46
Posts: 7287

Joined: Feb 27, 2010

October 29th, 2010 at 5:04:03 PM permalink

I think matilda's latest post coincides with my earlier attempt to explain the different viewpoints. I have a little difficulty though, trying to decide whether it is proper to say that 95% of the 95% confidence intervals would contain the unknown parameter of the total population. Something about the double 95% bothers me.

Again, my original limited knowledge of stat has become rusty.

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 29th, 2010 at 7:20:14 PM permalink

Quote: matilda
No, it is not a variable; it is fixed in value. Let me try this: What is the probability that the dealer has a ten down given he shows an ace up? We all can calculate this. Why? Because we know the population distribution: 52 cards, 16 tens, 4 aces. If we did not know the composition of a deck of cards, we could not calculate the probability. Suppose you see ten cards on the table, but you have no idea of the deck used by this casino. You cannot calculate the probability of a ten down.

I don't see the point of any of that. Would you say that the total points scored in the next Monday Night Football game is a random variable?

Quote: matilda
In the case of a poll, unlike the deck of cards, we do not know the population distribution because the binomial (actually it's a multinomial with undecideds) is defined by the probability of success and that parameter is unknown. If it were known, why take the poll? Therefore you cannot calculate a probability such as a 95% range because you don't know what the population distribution is.

But even you quoted (and I hate the syntax) "The random interval [X̄ − 1.96σ/√n, X̄ + 1.96σ/√n] contains the true parameter θ with 95% probability." That is all I am saying. The actual Angle share will fall between 48.08% and 56.17% with 95% probability.

Quote: matilda
But when you interpret the interval as you do, you are actually saying that you know what the population distribution is, for there is no way to calculate the probability without this knowledge. What you are actually doing is saying that the sampling distribution is the same as the population distribution. But it is not. If it were, then how do you account for different sampling distributions coming from samples of different sizes?

My confidence interval was based on the assumption that the 625 people sampled represented a fair random sampling of all Nevada voters. I think the random phone method favors Angle, but I didn't want to muddy my math with that, so I accepted the assumption. Of course, if the poll were repeated, the results would likely not be exactly the same, because who gets called is random.

Quote: matilda
This is of course the key. The interval is the variable, dependent on which sample happens to be collected. For each possible sample an interval can be calculated. While we cannot calculate a probability about the population since the distribution is not known, we can calculate probabilities about the intervals themselves because the sampling distribution is known. We can calculate the probability that a particular interval contains a certain value. And more important for this discussion, we can calculate the percentage of intervals which contain a certain value. Therefore the correct interpretation of a 95% CI is that if all samples possible were taken and a 95% CI was constructed for each sample, then 95% of these intervals would contain the unknown, but constant, population parameter being estimated. The interval is the random variable, not the population parameter, which is a constant whether you know its value or not.

I don't disagree with any of that. But I still don't retract my statement that the results on election day will fall between 48.08% and 56.17% with 95% probability. Again, even that garbled quote from MIT seems to support that. I am still having a hard time identifying our point of departure.

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

rdw4potus

Threads: 80
Posts: 7237

Joined: Mar 11, 2010

October 29th, 2010 at 10:15:27 PM permalink

Quote: Wizard
But I still don't retract my statement that the results on election day will fall between 48.08% and 56.17% with 95% probability.

The wording in the polling question is "if the election were held today...," so I think technically your CI tells you that if the election had been held from 10/25-10/27, Angle's vote share would have fallen between 48.08% and 56.17% with 95% probability. And of course, 2 days ago is very different from today - giant security threat and all...

"So as the clock ticked and the day passed, opportunity met preparation, and luck happened." - Maurice Clarett

matilda

Threads: 3
Posts: 317

Joined: Feb 4, 2010

October 31st, 2010 at 4:24:18 PM permalink

Quote: Wizard
I am still having a hard time identifying our point of departure.

The parameter to be estimated is a constant not a variable. This is a requirement of the methodology of a CI. It is analogous to setting a value in the null hypothesis in a statistical test.

Your statement is incorrect for several reasons:

You are calculating the limits of a 95% symmetrical range of the binomial probability function, with P unknown and N known.
1. You used the result of the sample to estimate the unknown P. This means you have error because the probability, using the normal, a continuous function, that this estimate is exactly equal to P is zero. Yes, you have a 95% range, but it is of a different distribution from the one you say you are measuring.
2. Since the variance of the binomial is a function of P, you cannot calculate it. You estimated the variance using the sample result and calculated your interval with the estimate. This procedure adds more error to your interval range.
3. You used the normal distribution to approximate the binomial. This is standard procedure, nothing wrong with it, but it is an approximation, a good one with a large sample, but it still introduces error.

Because of these three errors introduced by the methodology used, you cannot say that "the results on election day will fall between 48.08% and 56.17% with 95% probability." The probability is not 95% because the parameters you used, the sample results, are estimates.

However, you could say that you are 95% confident in the interval that you have constructed.

A more major error is the implicit assumption in your statement that the parameters of the population will not change between the day of the sample and election day. But that is not a problem of statistical inference and is beyond the bounds of the current discussion.

This is the last post I will make on this subject.

Matilda

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

October 31st, 2010 at 9:45:47 PM permalink

Quote: matilda
This is the last post I will make on this subject.

Then I won't bother to reply in depth. It would have been nice to see the correct way to use the LVRJ poll to construct a confidence interval. At least I tried.

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

November 1st, 2010 at 8:11:46 PM permalink

Preview of my "Ask the Wizard" question on this.

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

dwheatley

Threads: 25
Posts: 1246

Joined: Nov 16, 2009

November 2nd, 2010 at 11:57:26 AM permalink

Some people have already tried to explain the misunderstanding that arose over confidence intervals. I'll try it another way:

In short, you can't use confidence intervals to predict the outcome of an election. You can only say that 95% of confidence intervals you create through proper polling will contain the true proportion of people who support a candidate. This is not a prediction, nor can it be properly used to make an accurate prediction.

Instead, you can use this paper by Medler & Tull to answer this exact question:

POLL POSITIONS AND WIN PROBABILITIES: A STOCHASTIC MODEL OF THE ELECTORAL PROCESS

The table on page 7 says that if Candidate X received 49% of the poll vote, and candidate Y received 45% of poll vote, then Candidate X has a 90% chance of winning.

Wisdom is the quality that keeps you out of situations where you would otherwise need it

Wizard
Administrator

Threads: 1493
Posts: 26501

Joined: Oct 14, 2009

November 3rd, 2010 at 2:59:38 AM permalink

I wrote several times yesterday to an FSA actuary about the confidence interval issue. He said that if we trust the sampling, and nobody changes his mind, then it is fine to say that:

pr(48.08%<= a <=56.17%)=95%, where a = Angle's actual vote share of Reid/Angle votes.

However, he said I can't phrase that as "a has a 95% chance of falling between 48.08% and 56.17%." That seems just ridiculous to me.

If the bathtub surrounds the baby, that means the baby is in the bathtub.

"For with much wisdom comes much sorrow." -- Ecclesiastes 1:18 (NIV)

matilda