Thursday, August 29, 2013

Statistical Interpretation

Recently, reporters have been using a variety of statistical terms that some of them seem not to fully understand.  During my education, one piece of required reading was an article warning us to watch for the ways people deliberately manipulate statistics to prove a point.

A former professor of mine, a research teacher, used to refer to the subject as "sadistics".  This was his way of expressing the difficulty of understanding the topic, as well as his aversion to it.  A lot of people feel the same way.

There are, for instance, three ways of determining the "average" of a set of figures.  They are called the mean, the median and the mode. 

The median is, literally, the middle number.  If you have 61 numbers in your sample or set, you put them in order, count down the list until you reach the thirty-first number, and that is the median.  Like the other two, it is one form of the average.

The mode, on the other hand, is the most frequent number.  If, among the same 61 numbers, most values appear only once or twice but the lowest number is repeated several times, then the lowest number is the mode.

The arithmetic mean is determined by adding (computing the sum of) all sixty-one numbers and then dividing the sum by how many numbers there are.  In our example, that is sixty-one.
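
For readers who like to see such things spelled out, the three definitions can be sketched in a few lines of Python.  This is only an illustration: the five-number list `figures` and the function names are made up for the example and are not the sample used in this post.  The rule for an even count of numbers (average the two middle ones) is the usual convention, not something covered above.

```python
from collections import Counter

def arithmetic_mean(values):
    # Sum of all the values divided by how many values there are.
    return sum(values) / len(values)

def median(values):
    # Middle value of the sorted list; with an even count,
    # the average of the two middle values.
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def mode(values):
    # Most frequent value (the first one encountered if there is a tie).
    return Counter(values).most_common(1)[0][0]

# Hypothetical figures, chosen only to illustrate the definitions.
figures = [2, 2, 3, 10, 83]

print(arithmetic_mean(figures))  # 20.0
print(median(figures))           # 3
print(mode(figures))             # 2
```

Even in this tiny list, the one large number pulls the mean far above the median and the mode.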

In the following hypothetical example, let us compute the three different averages and talk about the results.  I have numbered the set of figures, listed from largest to smallest, for ease in finding the median.

Our Hypothetical Sample

 1.  15,000,000          32.  15,000
 2.  15,000,000          33.  15,000
 3.   5,000,000          34.  15,000
 4.   5,000,000          35.  15,000
 5.   5,000,000          36.  15,000
 6.   5,000,000          37.  15,000
 7.   5,000,000          38.  15,000
 8.   5,000,000          39.  15,000
 9.     500,000          40.  15,000
10.     500,000          41.  15,000
11.     500,000          42.  15,000
12.     500,000          43.  15,000
13.     500,000          44.  15,000
14.     500,000          45.  15,000
15.     500,000          46.  15,000
16.     500,000          47.  15,000
17.     500,000          48.  15,000
18.     500,000          49.  15,000
19.      25,000          50.  15,000
20.      25,000          51.  15,000
21.      25,000          52.  15,000
22.      25,000          53.  15,000
23.      25,000          54.  15,000
24.      25,000          55.  15,000
25.      25,000          56.  15,000
26.      25,000          57.  15,000
27.      25,000          58.  15,000
28.      25,000          59.  15,000
29.      15,000          60.  15,000
30.      15,000          61.  15,000
31.      15,000

In the above example, it is easy to determine the median and the mode.  Both of them are 15,000.  The arithmetic mean, however, is quite different.  It is pulled sharply upward by the handful of large figures at the top.  The 61 figures add up to 65,745,000, and dividing by 61 gives an arithmetic mean of 1,077,786.885.  Quite a difference, isn't it?
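
For anyone who wants to check these figures, here is a short Python sketch using the standard `statistics` module.  The variable names are my own, and the counts are read off the numbered list above: two at 15,000,000, six at 5,000,000, ten at 500,000, ten at 25,000 and thirty-three at 15,000.

```python
import statistics

# Rebuild the 61-figure sample from the numbered list:
# (value, how many times it appears)
counts = [
    (15_000_000, 2),
    (5_000_000, 6),
    (500_000, 10),
    (25_000, 10),
    (15_000, 33),
]
sample = [value for value, times in counts for _ in range(times)]

print(len(sample))                        # 61
print(sum(sample))                        # 65745000
print(statistics.median(sample))          # 15000
print(statistics.mode(sample))            # 15000
print(round(statistics.mean(sample), 3))  # 1077786.885
```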

So, let's play with the figures a little.  Suppose you are trying to prove that American seniors are about to become the richest age group in the country.  Would you use the mean, the median or the mode?  The median and the mode would show a low figure, wouldn't they?  They are both $15,000.  To prove American seniors are rich, one would have to use the arithmetic mean of $1,077,786.89.  But would this be a true representation of the wealth of the people in the sample?  Not at all.  Most of the people represented by this figure have $15,000, not over $1,000,000.  In fact, it would take a person with an income of $15,000 a year nearly 72 years, with no withholding and no spending, to accumulate a sum equal to this arithmetic mean.  You dream the impossible dream.
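
The number of years is simple division.  A quick sketch in Python, taking the mean and the $15,000 figure from the sample and assuming, as in the text, no withholding and no spending (the variable names are only for illustration):

```python
import math

mean_wealth = 1_077_786.885   # arithmetic mean of the sample, as stated above
annual_income = 15_000        # the median and mode figure

years_needed = mean_wealth / annual_income
print(round(years_needed, 2))   # 71.85
print(math.ceil(years_needed))  # 72 (whole years of saving every penny)
```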

One of the worst misreadings of statistics occurs when people interpret the results of group comparison studies.  This kind of study is used a lot in medical research, including the research relating dental plaque to heart disease.  A former local news anchor interpreted those results to mean that plaque on the teeth causes heart trouble.  No can do!  The best we can say of correlational evidence is that the two things "co-relate."  We cannot assume a causal relationship from correlational data.  It could be that the plaque does cause heart trouble.  It could be that heart trouble causes plaque on the teeth.  It could be that some third factor contributes to both.  It could even be a coincidence.

Remember to check the figures of your reporters, your doctors and your editorialists when making your decisions.  All are subject to error.  A few intend to mislead.  Oh yes, and please check my figures to see if I made any errors when keying in the sample.  To err is human, or so the great writer once said.
