Poll: ten options, of which three received 1 vote each (33.33%) and the rest none.
3 members have voted
My question concerns the overall quality of an average if the lowest and highest data points are removed from the sample. I believe this is how judging works at the Olympics in events like figure skating.
As a concrete example, I created 1,000 sets of 10 random variables drawn from the standard normal distribution and compared the average of all 10 with the average of the middle 8. The results are inconclusive; I think I need a sample size in the millions.
What are your thoughts? The question for the poll is: with standard normal variables, do you prefer the straight mean or a trimmed mean? Multiple votes allowed.
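A minimal sketch of that experiment, assuming NumPy and an arbitrary seed (the exact setup isn't specified beyond the post above):

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, for reproducibility

n_sets, n = 1_000, 10
samples = rng.standard_normal((n_sets, n))

full_means = samples.mean(axis=1)
# Sort each set and drop its lowest and highest value, leaving the middle 8.
trimmed_means = np.sort(samples, axis=1)[:, 1:-1].mean(axis=1)

# The true mean is 0, so |estimate| is each estimator's error.
print("mean |error|, all 10  :", np.abs(full_means).mean())
print("mean |error|, middle 8:", np.abs(trimmed_means).mean())
```

With only 1,000 sets the two estimators come out nearly indistinguishable, which matches the "inconclusive" result above; pushing n_sets into the millions is a one-line change.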
For example, the super niche statistic of "typical amount of brisket for lunch eaten by Americans in April 2023" would be skewed by both Jews celebrating Passover and Muslims observing Ramadan, justifying a trimmed mean.
To give another example, arm length among Americans should follow a normal distribution, but amputees would change the data sample a lot. However, additionally trimming off people with exceptionally long arms might not be fair in this situation.
Whereas when measuring the "average time from sunrise to sunset" at a given location, a trimmed mean would never be justified.
I was doing an experiment in a lab, measuring the responses of some devices to a pulse of radiation in order to characterize them. For one of them, the data (the ratio of the output signal to the input signal) was all over the place. So I did exactly that: tossed the really weird outliers and took the mean of the rest. No big deal, right?
Then I got cornered by some of my elders and betters, who seemed to take delight in educating this rookie the hard way! "What happened here? Why do you have these strange numbers?"
"Oh, just outliers. I didn't include them in the results."
"What do you mean, outliers? Isn't it data you took?"
"Sure."
"And you designed this experiment, right?"
"Yes."
"Didn't you choose the equipment, calculate the expected precision and accuracy of the results, and take all the data yourself?"
"Yes."
"Then why do you have outliers?"
"No idea."
"In that case, you don't understand your experiment. This setup is doing something you did not intend for it to do, therefore nothing about it can be trusted. You have to figure out what's going on and fix it."
And I thought about that for a while. Makes sense. I know what the precision of this equipment is supposed to be, and if I'm getting errors outside of that then something is wrong. It could be malfunctioning, I could be using it wrong, I could have done my calculations wrong... or I have just discovered something new! Whatever it is, I've got to track it down.
It turned out to be just an alignment problem with some mounts; the setup had to be assembled more precisely than I had realized for the results to be reliable, and that really is no big deal to remedy. But it was very educational: if you've done all the preparation and you're still getting "outliers," there's something going on that you don't understand or didn't know about when you started, and your experiment is not going to answer the exact questions you think it is. So you have to fix it if you want a reliable measurement.
In the case of the house purchase, I would have recalculated the average to exclude houses that are dilapidated, because the criterion was "houses in similar condition" and I assume the house you bought was not in a dilapidated condition. I would also exclude houses that were not sold through a realtor on the open market, because that could include sales between family members at a discount that does not reflect actual market prices. On the other side I would exclude houses that have some special history that will increase their value to some, like having been owned by a celebrity.
How does one factor in the fact that the price can never be less than zero, but there's basically no limit to the top price?
Quote: Wizard
About ten years ago I purchased a house from a friend. In negotiating the value we looked at comparable houses in the neighborhood. He wanted to remove the lowest recent sale because he said the house was in terrible shape and shouldn't depress the average. I said that was fine if I got to remove the highest recent sale from the average. This is known as a "trimmed mean."
My question concerns the overall quality of an average if the lowest and highest data points are removed from the sample. I believe this is how judging works at the Olympics in events like figure skating.
As a concrete example, I created 1,000 sets of 10 random variables drawn from the standard normal distribution and compared the average of all 10 with the average of the middle 8. The results are inconclusive; I think I need a sample size in the millions.
What are your thoughts? The question for the poll is: with standard normal variables, do you prefer the straight mean or a trimmed mean? Multiple votes allowed.
link to original post
And how does something like this affect the football over-under bet, where the score can't go under zero, but it's possible to go extremely high?
I collected some more data. One million sets of ten random standard normal variables. Here are the mean distances from the true mean.
Average of all 10 = 0.003450
Average of trimmed 8 = 0.003445
Difference = 0.000004
So, the trimmed 8 is very slightly better. However, I think this is within the margin of error.
When dealing with something that doesn't require an exact figure, I found this gives a better "average" than using 100% of the data.
It's a rule of thumb I was taught as a boy.
Quote: Wizard
I collected some more data. One million sets of ten random standard normal variables. Here are the mean distances from the true mean.
Average of all 10 = 0.003450
Average of trimmed 8 = 0.003445
Difference = 0.000004
So, the trimmed 8 is very slightly better. However, I think this is within the margin of error.
link to original post
I think it would be instructive to report the standard deviation of the all-10 vs. trimmed-8 estimates over the trials.
I’ve always thought the point of trimming was to reduce the variance of outcomes given a small sample size. Of course over time the variance should balance out as reflected in the averages you report above.
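A hedged sketch of that comparison, mirroring the million-set simulation above (the seed and setup are assumptions): report the spread of each estimator across trials, not just its average distance from the true mean.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
samples = rng.standard_normal((1_000_000, 10))

full = samples.mean(axis=1)
trimmed = np.sort(samples, axis=1)[:, 1:-1].mean(axis=1)

# For the straight mean of 10 standard normals, theory gives 1/sqrt(10) ~ 0.316.
print("std of all-10 means :", full.std())
print("std of trimmed means:", trimmed.std())
```

For normal data the trimmed estimator should show the slightly larger spread, since the sample mean is the minimum-variance estimator of a normal mean.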
Non-competitive homes should be eliminated as best as possible, or their values weighted to reflect how non-competitive they actually are.
Quote: Wizard
I collected some more data. One million sets of ten random standard normal variables. Here are the mean distances from the true mean.
Average of all 10 = 0.003450
Average of trimmed 8 = 0.003445
Difference = 0.000004
So, the trimmed 8 is very slightly better. However, I think this is within the margin of error.
link to original post
Your results from simulating a normal distribution sound about right. Now try some other distributions…
One example that I thought of (that won’t balance as nicely) is wait times for a bus. Yet another example like Axel’s where you can’t go below zero, but at the other end, the sky’s the limit.
Let’s say you have a bus that is scheduled to arrive every 15 minutes. You would need 3 buses arriving 10 minutes early to “balance” the one bus that’s 30 minutes late. Let’s say that these happen to be the most extreme data points in a random sample; do you throw out 2 data points (one high and one low)?
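To see how a skewed, bounded-below distribution like this breaks the symmetry, here is a hedged sketch; the exponential lateness model is an assumption for illustration, not anything specified in the post above:

```python
import numpy as np

rng = np.random.default_rng(3)  # arbitrary seed
# Assumed model: lateness is exponential with a 5-minute mean, shifted
# so a bus can be at most 5 minutes early but arbitrarily late.
lateness = rng.exponential(scale=5.0, size=(1_000_000, 10)) - 5.0

full = lateness.mean(axis=1)
trimmed = np.sort(lateness, axis=1)[:, 1:-1].mean(axis=1)

print("true mean lateness    :", 0.0)             # 5 - 5 by construction
print("avg of straight means :", full.mean())     # ~0: unbiased
print("avg of trimmed means  :", trimmed.mean())  # below 0
```

Because the right tail is much heavier than the left, the removed maximum sits, on average, much farther from the mean than the removed minimum, so the trimmed mean comes out biased low.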
Quote: camapl
Quote: Wizard
I collected some more data. One million sets of ten random standard normal variables. Here are the mean distances from the true mean.
Average of all 10 = 0.003450
Average of trimmed 8 = 0.003445
Difference = 0.000004
So, the trimmed 8 is very slightly better. However, I think this is within the margin of error.
link to original post
Your results from simulating a normal distribution sound about right. Now try some other distributions…
One example that I thought of (that won’t balance as nicely) is wait times for a bus. Yet another example like Axel’s where you can’t go below zero, but at the other end, the sky’s the limit.
Let’s say you have a bus that is scheduled to arrive every 15 minutes. You would need 3 buses arriving 10 minutes early to “balance” the one bus that’s 30 minutes late. Let’s say that these happen to be the most extreme data points in a random sample; do you throw out 2 data points (one high and one low)?
link to original post
The thing about real-world measurements is they aren't always independent. The same road conditions affecting one bus will be affecting many of them, and buses on a route are also interchangeable. When a bus on a 15-minute schedule is 30 minutes late, the problem can't be with that bus alone: if it were, the next bus would have gone around it, and no arrival would be more than 15 minutes late. So the problem has to be on the road rather than with one particular bus, and thus you can't treat the other buses as independent of the one you know is late.
How this applies to house prices is that the same problems affecting one house in a neighborhood could be present in others. So let's say a house is discounted because it's loaded with termites and has structural damage. You can be sure that property is not the only place in town with termites. It could be argued that it's fair to include that discounted price in the mean, because it represents the chance of you, too, having or getting termites and damage. On the other hand, termites can be inspected for, and regular inspection and extermination can prevent a serious infestation. That expense is already factored into the prices of the provably termite-free homes, so if you can prove what you are buying is also termite-free, it is fair to compare it only to the other termite-free homes.
The more interesting question is the effect on small samples. My instinct tells me that trimming is better, and trimming more is better, all the way to the point where the median is a better measure than the mean.
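That instinct is testable with a quick sketch (assumed setup: sets of 10 standard normals, with progressively heavier trimming down to the median):

```python
import numpy as np

rng = np.random.default_rng(4)  # arbitrary seed
samples = np.sort(rng.standard_normal((1_000_000, 10)), axis=1)

estimators = {
    "mean of all 10": samples.mean(axis=1),
    "trimmed to 8":   samples[:, 1:-1].mean(axis=1),
    "trimmed to 4":   samples[:, 3:-3].mean(axis=1),
    "median (mid 2)": samples[:, 4:6].mean(axis=1),
}
for name, est in estimators.items():
    # True mean is 0, so the RMS of the estimates is the RMS error.
    print(f"{name}: RMS error = {np.sqrt((est ** 2).mean()):.4f}")
```

For the normal specifically, the untrimmed mean should come out best, since the sample mean is the efficient estimator there; the instinct has more force for heavy-tailed distributions, where the median can beat the mean.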
In the house example, there is certainly some nuance. While I still think that median is the best measure, you also have to take the particular house into consideration, because there is nothing that says that the particular house you are looking at is in average condition. A recently re-done roof and a new AC (I live in AZ; I assume that Nevada is similar) could easily be worth $20k or so in the first couple of years of ownership.
RE: judging in the Olympics, I think that is (at least partially) to mitigate the effects of bias and corruption, rather than the belief that one measure is statistically better than the other. In other words, it's trying to get around the fact that the scores may not be normally distributed around the "fair" score, rather than trying to come up with a good estimate of the middle of a normal distribution. Though the small sample size also leans things towards the median being the best score IMO, and some trimming is better than none.
When I used to work at (large tech company), they would regress each interviewer's scores to the mean when deciding whether to hire someone. (There were a lot of interviewers, but some are tougher than others, so if you are interviewed by 4-5 people you could easily get lucky or unlucky and draw particularly tough or easy interviewers.)
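The post doesn't say how that company actually did the adjustment; one plausible reading is z-scoring within each interviewer, sketched here with made-up names and data:

```python
from collections import defaultdict
from statistics import mean, stdev

# (interviewer, candidate, score) triples -- illustrative data only
ratings = [
    ("alice", "c1", 9), ("alice", "c2", 8), ("alice", "c3", 7),
    ("bob",   "c1", 5), ("bob",   "c2", 4), ("bob",   "c3", 3),
]

# Each interviewer's own mean and spread...
scores_by_interviewer = defaultdict(list)
for who, _, score in ratings:
    scores_by_interviewer[who].append(score)
stats = {who: (mean(s), stdev(s)) for who, s in scores_by_interviewer.items()}

# ...then express every score relative to its interviewer before averaging.
adjusted = defaultdict(list)
for who, candidate, score in ratings:
    mu, sigma = stats[who]
    adjusted[candidate].append((score - mu) / sigma)

for candidate, zs in sorted(adjusted.items()):
    print(candidate, round(mean(zs), 3))
```

With this made-up data, alice (an easy grader) and bob (a tough one) rank all three candidates identically once rescaled, so neither drags a candidate's average up or down.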

