We have all heard the saying that there are “lies, damn lies, and statistics.” Where does the saying come from? No one knows for sure, though it dates from the 19th century. The word statistics did not exist in English until 1787, so the saying cannot be older than that. A mathematician from the University of York has tried to track down the origin of the phrase, without success:
A few years ago I thought that I had successfully tied down the origin of this quotation. I concluded that it came from Lord Courtney in 1895 as explained in the first sub-section below.
Since then I have come across a pseudonymous quotation to the same effect from 1891 (see Anonymous in 1891 below; the judge referred to may be Baron Bramwell, for whom also see below), and a very similar phrase attributed to Sir Charles Wentworth Dilke (1843–1911) in the same year (see Dilke below). A couple of years later another use of the phrase occurs (listed below under Traveling Engineers). Further, my attention has been drawn to a use of the phrase (or something very like it) by Sir Robert Giffen (1837–1910) in January 1892. Later in the same year Arthur James Balfour (later 1st Earl of Balfour) (1848–1930) and Mrs Andrew Crosse (Cornelia Augusta Hewitt Crosse) (1827–1895) employed the phrase. Details of their use of the phrase can be found below. It should be noted that even Balfour referred to it as “old” and that Giffen regarded it as a recent adaptation of an old jest about scientific experts. Slightly after that, a doctor called M Price read a paper to an 1894 gathering in which he referred to “the proverbial kinds of falsehoods, ‘lies, damned lies, and statistics.’ …”
The phrase does express the persuasive power of statistics. They have a veneer of science about them, as though they cannot be questioned. But, as many a politician or lawyer has found out, statistics can be a very effective way to tell a truth in such a way that a person reading or listening is led to a wrong conclusion. Part of the reason is that most people do not know mathematics or symbolic logic well enough to be able to analyze a statement. As a result, they may hear a statistic and not be able to work through what that statistic is really saying.
Let me give you an example of how two mathematical words, and not knowing the mathematics behind them, can let you be misled. The two words are the median and the average. Do you remember your middle school math?
In probability theory and statistics, a median is described as the numeric value separating the higher half of a sample, a population, or a probability distribution, from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one. If there is an even number of observations, then there is no single middle value; the median is then usually defined to be the mean of the two middle values.
The formal definition of a mathematical average is quite complex. For our purposes, let me just point you to your grade school math. If you wanted to find the average of two numbers, you added them together and divided by two. If you wanted to find the average of three numbers, you added them together and divided by three. If you want to find the average of 411 numbers, you add them together and divide by 411. (Statisticians call this kind of average the mean, or arithmetic mean.)
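For readers who like to check the arithmetic, the two definitions above can be written out as a minimal Python sketch. The function names `average` and `median` are my own choices for illustration, and the sketch assumes a non-empty list of plain numbers.

```python
def average(values):
    """Add the numbers together and divide by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Arrange the values from lowest to highest and pick the middle one.

    With an even number of values there is no single middle value,
    so the average of the two middle values is used instead.
    """
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


if __name__ == "__main__":
    print(average([2, 4]))       # average of two numbers
    print(median([3, 1, 2]))     # odd count: the single middle value
    print(median([1, 2, 3, 4]))  # even count: average of the two middle values
```

Python's standard library already offers `statistics.mean` and `statistics.median`, so writing the functions by hand is only meant to show that nothing mysterious is going on.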
But the results of the median and the average can be quite different. And that difference is one of the areas in mathematics that can easily be exploited by the unscrupulous. Let me give you an example. Let’s say that five obese people go on a new diet. After six months the results are tallied and it is found that one person lost 10 pounds, a second lost 20 pounds, a third lost 15 pounds, a fourth lost 25 pounds, but the fifth lost an amazing 180 pounds! Now, let’s look at the median and the average.
If I am writing advertising copy, I am going to want to use the average over the median. Why is that? Well, the median amount of weight loss is the middle of the five values once they are put in order (10, 15, 20, 25, 180), and that is 20 pounds. But if we average them, the total of 250 pounds divided by five means that after six months the participants lost an average of 50 pounds! That sounds fabulous. So, now the public relations writer can get to work promising that if you buy this product, you will lose 50 pounds on the average.
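You can do the calculation yourself. Here is a short sketch using Python's built-in `statistics` module on the five results from the diet example; the variable names are mine, chosen only for illustration.

```python
import statistics

# Pounds lost by each of the five dieters after six months.
pounds_lost = [10, 20, 15, 25, 180]

median_loss = statistics.median(pounds_lost)   # middle of 10, 15, 20, 25, 180
average_loss = statistics.mean(pounds_lost)    # 250 pounds divided by 5 people

print("Median loss:", median_loss, "pounds")
print("Average loss:", average_loss, "pounds")

# How many dieters actually lost as much as the advertised average?
reached_average = [p for p in pounds_lost if p >= average_loss]
print("Dieters who reached the average:", len(reached_average), "of", len(pounds_lost))
```

The median comes out at 20 pounds and the average at 50 pounds, and only one of the five dieters actually lost as much as the advertised average.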
Now here is my question. Is the statement that you will lose 50 pounds on the average completely true? Yes, it is. There is no doubt about it; you can do the calculations yourself. Is it misleading? Yes, it is incredibly misleading. The reality is that half of the people who buy this product will lose under 20 pounds, while half of the people will lose over 20 pounds. But even that last statement is misleading. The range and the mode are two other mathematical measures you can use to evaluate statistical data.
The range is the difference between the smallest and the largest number, so in this case, the range is 170 pounds. No advertising manager is going to allow the range to be used in this case because the range shows that there is quite a wide spread of results, which would warn you to do some more investigation. I cannot use the mode in the sample above because there are no repeating numbers. But, let’s say that the numbers had been 10, 12, 15, 20, 20, 18, 25, 23, 180, 178. The mode in this case would have been 20, which again would have warned you that the average weight lost is a completely misleading statistic. (No, I will not define the mode now.)
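The range and the mode can be checked the same way. This is a rough sketch, not a definitive treatment: `value_range` is a helper name I made up, and I use `statistics.multimode` (available in Python 3.8 and later), which returns every value tied for most frequent. When no number repeats, it simply returns all of them, which is another way of saying the mode tells you nothing.

```python
import statistics


def value_range(values):
    """Difference between the largest and the smallest number."""
    return max(values) - min(values)


# The original five dieters, and the larger sample of ten.
first_sample = [10, 20, 15, 25, 180]
second_sample = [10, 12, 15, 20, 20, 18, 25, 23, 180, 178]

first_range = value_range(first_sample)
first_modes = statistics.multimode(first_sample)    # every value ties: nothing repeats
second_modes = statistics.multimode(second_sample)  # 20 is the only repeated value
second_average = statistics.mean(second_sample)     # 501 pounds divided by 10 people

print("Range of first sample:", first_range)
print("Modes of first sample:", first_modes)
print("Modes of second sample:", second_modes)
print("Average of second sample:", second_average)
```

In the second sample the mode is 20 while the average is a little over 50, so putting the two side by side is exactly the kind of warning sign that a single quoted average hides.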
What happens in real life is that whether it is an advertising manager, a politician, a lawyer, or even a pastor trying to come up with a hard-hitting sermon, the tendency is to look for the statistic that will best support one’s viewpoint, rather than the statistic that will best explain the data. The danger is that we will read or hear a completely true statistic, but we will not have access to the numbers that were used to calculate that statistic. And rarely will the person quoting the statistic give you other statistics that would help to ensure that what the person is writing or saying is not misleading.
For instance, one wag has said that, given the statistics on all the people who have “come to Christ” in the USA, everyone in the USA must now be a believer. Most crusades will quote the statistic for people who have “come forward.” Few organizations are as honest and truthful as the Billy Graham Evangelistic Association. They also have given out the statistics on the number of people who have never made it into a church, the number of people who have come forward more than once, and so on. When one has those figures, one can more clearly evaluate the success of the BGEA. And, btw, yes they are successful, and honorable, and truthful, and to be believed.
My point is that we need to be very cautious when we hear or read statistics, so that we may really understand what the truth of the matter is.
===MORE TO COME===