by Jason Marshall
In the last article, we learned that there are several ways to calculate the average value of a sample of data. We focused on one such method, the arithmetic mean, and used it to calculate the average number of potato chips in a bag. Today, we’re going to continue where we left off and talk about a second quantity used to determine average values: the median.
But first, the podcast edition of this tip was sponsored by Go To Meeting. Save time and money by hosting your meetings online. Visit GoToMeeting.com/podcast and sign up for a free 45 day trial of their web conferencing solution.
Review of the Arithmetic Mean
Let’s start with a quick recap of the last article on how to calculate mean values. We imagined opening 11 fun-pack sized bags of potato chips and counting and recording the total number of chips in each bag. Our imaginary bags of chips contained 18, 15, 19, 18, 23, 17, 18, 16, 19, 34, and 17 chips. The first big statistical task we undertook was to calculate the mean number of chips in a bag. We calculated the arithmetic mean by adding up the total number of chips contained in all bags, and then dividing this number by the total number of bags. In our example, that’s 214 / 11, giving a mean value of about 19.45. Finally, just before finishing up, we noted that the arithmetic mean is not the only way to calculate an average value and that some of these other ways are more useful for analyzing certain types of problems, and I claimed that our potato chip problem is itself just such a case. But why did I make this claim?
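Before we answer that, the arithmetic above is easy to double-check. Here is a minimal Python sketch of the calculation; the variable names are just illustrative choices and not part of the original episode.

```python
# Chip counts for the 11 imaginary bags, in the order they were recorded.
chips = [18, 15, 19, 18, 23, 17, 18, 16, 19, 34, 17]

total = sum(chips)          # add up the chips in all bags
bags = len(chips)           # count the bags
mean = total / bags         # arithmetic mean = total chips / number of bags

print(total, bags, round(mean, 2))  # prints: 214 11 19.45
```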
The Problem with Arithmetic Means as Average Numbers
To find out why, let’s go back and take another look at our sample of data. First, let’s write out the number of chips in each bag in order from smallest to largest: 15, 16, 17, 17, 18, 18, 18, 19, 19, 23, 34. The mean value we calculated earlier, 19.45, logically falls between the minimum of 15 and the maximum of 34; but it’s not really in the middle of the sample like you might expect the average value to be. In fact, only 2 of the 11 numbers are larger than the mean value (23 and 34). Why is that? Well, it’s because the mean value is skewed toward a higher value since the number 34 is much larger than any of the other numbers—perhaps that particular bag was crushed, breaking the chips into a bunch of small pieces.
But the situation could be even worse: What if that bag was really crushed, and the chips were broken into really small pieces? Instead of 34 small chips, imagine the bag contained 100 tiny chips! If that were the case, the mean number of chips would jump from 19.45 to almost 25.5. It’s certainly clear that 25.5 is not a very good representation of the typical number of chips in a fun-pack bag since this supposedly average value is higher than the total number of chips in all but one of the 11 bags. The problem is that the single anomalously high value of 100 chips is throwing off our calculation of the mean. So, we need another way to measure the average value that is resistant to this type of outlying value.
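To see the jump for yourself, here is a small Python sketch that swaps the 34-chip bag for the imagined 100-chip bag and recomputes the mean. The names `crushed` and `bags_below_mean` are made up for this illustration.

```python
# Original sample, and the same sample with the 34-chip bag replaced by 100.
chips = [18, 15, 19, 18, 23, 17, 18, 16, 19, 34, 17]
crushed = [100 if count == 34 else count for count in chips]

mean_before = sum(chips) / len(chips)
mean_after = sum(crushed) / len(crushed)

# How many bags hold fewer chips than the new "average"?
bags_below_mean = sum(1 for count in crushed if count < mean_after)

print(round(mean_before, 2), round(mean_after, 2), bags_below_mean)
# prints: 19.45 25.45 10
```

Ten of the 11 bags fall below the new mean, which is the sense in which a single outlier drags the mean away from the typical bag.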
What is the Median Value?
And that’s exactly what the median value is: an outlier-resistant measure of the average value of a sample of data. In other words, it’s a value that’s similar in interpretation to the arithmetic mean, but that doesn’t get thrown off by a single crazy-big or small data point. So how is the median value actually calculated? It’s remarkably easy. The first step, which we’ve already done, is to write the data in our sample in order from smallest to largest. If we include the extremely crushed fun-pack bag of chips in our sample, this list is: 15, 16, 17, 17, 18, 18, 18, 19, 19, 23, 100. Now, the median value is simply defined to be the number in the middle. In this case, since there are 11 values, the median is the number in the middle with 5 values on either side of it. In other words, it’s 18.
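The two steps just described (sort, then take the middle value) can be sketched in a few lines of Python. The function name `median_of` is hypothetical. One assumption goes beyond this article: when the sample has an even number of values there is no single middle number, and the sketch follows the common convention of averaging the two middle values.

```python
def median_of(values):
    """Return the median: the middle value of the sorted sample."""
    ordered = sorted(values)        # step 1: order from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    mid = n // 2                    # index of the middle position
    if n % 2 == 1:
        return ordered[mid]         # odd count: exactly one middle value
    # Even count (assumption, not covered in the article): average the two
    # values that straddle the middle.
    return (ordered[mid - 1] + ordered[mid]) / 2


# The sample including the extremely crushed 100-chip bag.
crushed = [18, 15, 19, 18, 23, 17, 18, 16, 19, 100, 17]
print(median_of(crushed))  # prints: 18
```

Python's standard library also provides `statistics.median`, which should give the same answer here.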
Why the Median Value Matters in Real Life
Notice that the size of the large number 100 doesn’t impact the median value at all. In fact, that number could have been 1000, 10000, or even larger and the median value would have been exactly the same. That ability to resist outliers is exactly why the median value is such a useful and important statistic for describing many measurable quantities in the real world. Average housing prices are a good example. Why? Well, most cities tend to have lots of mid-priced houses, and a few astronomically expensive properties. Describing the average housing price in a city using the median instead of the mean statistic ensures that these few extremely expensive (and certainly atypical) properties don’t skew the overall average price to a higher value.
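Sticking with the potato chips, here is a rough Python sketch of that resistance to outliers. It uses `mean` and `median` from Python's standard `statistics` module; the names `other_bags` and `results` are invented for this example.

```python
from statistics import mean, median

# The ten ordinary bags, plus one bag whose count we vary.
other_bags = [15, 16, 17, 17, 18, 18, 18, 19, 19, 23]

results = []
for outlier in (34, 100, 1000, 10000):
    sample = other_bags + [outlier]
    results.append((outlier, round(mean(sample), 2), median(sample)))

for outlier, sample_mean, sample_median in results:
    print(outlier, sample_mean, sample_median)
```

The mean column climbs with every larger outlier, while the median column stays at 18 throughout.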
But that’s not all median values have to offer. Next time, we’ll take a look at a very cool trick that uses median averaging to make people disappear from your photographs! Who knows—it might come in handy after your next vacation, so be sure to check it out. And be sure to watch this week’s Math Dude “Video Extra!” episode on YouTube too—it’ll feature a few more tips and tricks to help you calculate median values in various situations.
Wrap Up
Okay, that’s all the math we have time for today. If you like what you’ve read today and have a few minutes to spare, can you please do me a favor and leave a review on iTunes? Thanks in advance! And while you’re there, don’t forget to subscribe to the podcast and ensure you’ll never miss a new Math Dude episode.
Thanks again to our sponsor this week, Go To Meeting. Visit GoToMeeting.com/podcast and sign up for a free 45 day trial of their online conferencing service.
If you long for more math, I have two great ways to help you get your fill. First, if you’re interested in my day-to-day thoughts about the latest math and science news, please follow me on Twitter. And second, if you’d like to get updates about the show and to interact with your fellow math fans, please become a fan of the Math Dude on Facebook. I hope to see you there!
Until next time, this is Jason Marshall with The Math Dude’s Quick and Dirty Tips to Make Math Easier. Thanks for reading, math fans!