30 March 2025

On Mistakes, Blunders and Errors II: Statistics and Probabilities

“It is a capital mistake to theorize before you have all the evidence. It biases the judgment.” (Sir Arthur C Doyle, “A Study in Scarlet”, 1887)

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.” (Sir Arthur C Doyle, “The Adventures of Sherlock Holmes”, 1892)

"What real and permanent tendencies there are lie hid beneath the shifting superfices of chance, as it were a desert in which the inexperienced traveller mistakes the temporary agglomerations of drifting sand for the real configuration of the ground" (Francis Y Edgeworth, 1898)

"Some of the common ways of producing a false statistical argument are to quote figures without their context, omitting the cautions as to their incompleteness, or to apply them to a group of phenomena quite different to that to which they in reality relate; to take these estimates referring to only part of a group as complete; to enumerate the events favorable to an argument, omitting the other side; and to argue hastily from effect to cause, this last error being the one most often fathered on to statistics. For all these elementary mistakes in logic, statistics is held responsible." (Sir Arthur L Bowley, "Elements of Statistics", 1901)

"If the chance of error alone were the sole basis for evaluating methods of inference, we would never reach a decision, but would merely keep increasing the sample size indefinitely." (C West Churchman, "Theory of Experimental Inference", 1948)

"There are instances of research results presented in terms of probability values of ‘statistical significance’ alone, without noting the magnitude and importance of the relationships found. These attempts to use the probability levels of significance tests as measures of the strengths of relationships are very common and very mistaken." (Leslie Kish, "Some statistical problems in research design", American Sociological Review 24, 1959)

"Poor statistics may be attributed to a number of causes. There are the mistakes which arise in the course of collecting the data, and there are those which occur when those data are being converted into manageable form for publication. Still later, mistakes arise because the conclusions drawn from the published data are wrong. The real trouble with errors which arise during the course of collecting the data is that they are the hardest to detect." (Alfred R Ilersic, "Statistics", 1959)

"The rounding of individual values comprising an aggregate can give rise to what are known as unbiased or biased errors. [...]The biased error arises because all the individual figures are reduced to the lower 1,000 [...] The unbiased error is so described since by rounding each item to the nearest 1,000 some of the approximations are greater and some smaller than the original figures. Given a large number of such approximations, the final total may therefore correspond very closely to the true or original total, since the approximations tend to offset each other. [...] With biased approximations, however, the errors are cumulative and their aggregate increases with the number of items in the series." (Alfred R Ilersic, "Statistics", 1959)

"While it is true to assert that much statistical work involves arithmetic and mathematics, it would be quite untrue to suggest that the main source of errors in statistics and their use is due to inaccurate calculations." (Alfred R Ilersic, "Statistics", 1959)

"No observations are absolutely trustworthy. In no field of observation can we entirely rule out the possibility that an observation is vitiated by a large measurement or execution error. If a reading is found to lie a very long way from its fellows in a series of replicate observations, there must be a suspicion that the deviation is caused by a blunder or gross error of some kind. [...] One sufficiently erroneous reading can wreck the whole of a statistical analysis, however many observations there are." (Francis J Anscombe, "Rejection of Outliers", Technometrics Vol. 2 (2), 1960)

"The most important and frequently stressed prescription for avoiding pitfalls in the use of economic statistics, is that one should find out before using any set of published statistics, how they have been collected, analysed and tabulated. This is especially important, as you know, when the statistics arise not from a special statistical enquiry, but are a by-product of law or administration. Only in this way can one be sure of discovering what exactly it is that the figures measure, avoid comparing the non-comparable, take account of changes in definition and coverage, and as a consequence not be misled into mistaken interpretations and analysis of the events which the statistics portray." (Ely Devons, "Essays in Economics", 1961)

"The problem of error has preoccupied philosophers since the earliest antiquity. According to the subtle remark made by a famous Greek philosopher, the man who makes a mistake is twice ignorant, for he does not know the correct answer, and he does not know that he does not know it." (Félix Borel, "Probability and Certainty", 1963)

"He who accepts statistics indiscriminately will often be duped unnecessarily. But he who distrusts statistics indiscriminately will often be ignorant unnecessarily. There is an accessible alternative between blind gullibility and blind distrust. It is possible to interpret statistics skillfully. The art of interpretation need not be monopolized by statisticians, though, of course, technical statistical knowledge helps. Many important ideas of technical statistics can be conveyed to the non-statistician without distortion or dilution. Statistical interpretation depends not only on statistical ideas but also on ordinary clear thinking. Clear thinking is not only indispensable in interpreting statistics but is often sufficient even in the absence of specific statistical knowledge. For the statistician not only death and taxes but also statistical fallacies are unavoidable. With skill, common sense, patience and above all objectivity, their frequency can be reduced and their effects minimised. But eternal vigilance is the price of freedom from serious statistical blunders." (W Allen Wallis & Harry V Roberts, "The Nature of Statistics", 1965)

"The calculus of probability can say absolutely nothing about reality [...] We have to stress this point because these attempts assume many forms and are always dangerous. In one sentence: to make a mistake of this kind leaves one inevitably faced with all sorts of fallacious arguments and contradictions whenever an attempt is made to state, on the basis of probabilistic considerations, that something must occur, or that its occurrence confirms or disproves some probabilistic assumptions." (Bruno de Finetti, "Theory of Probability", 1974)

"Mistakes arising from retrospective data analysis led to the idea of experimentation, and experience with experimentation led to the idea of controlled experiments and then to the proper design of experiments for efficiency and credibility. When someone is pushing a conclusion at you, it's a good idea to ask where it came from - was there an experiment, and if so, was it controlled and was it relevant?" (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"There are no mistakes. The events we bring upon ourselves, no matter how unpleasant, are necessary in order to learn what we need to learn; whatever steps we take, they’re necessary to reach the places we’ve chosen to go." (Richard Bach, "The Bridge across Forever", 1984)

"Correlation and causation are two quite different words, and the innumerate are more prone to mistake them than most." (John A Paulos, "Innumeracy: Mathematical Illiteracy and its Consequences", 1988)

"When you want to use some data to give the answer to a question, the first step is to formulate the question precisely by expressing it as a hypothesis. Then you consider the consequences of that hypothesis, and choose a suitable test to apply to the data. From the result of the test you accept or reject the hypothesis according to prearranged criteria. This cannot be infallible, and there is always a chance of getting the wrong answer, so you try and reduce the chance of such a mistake to a level which you consider reasonable." (Roger J Barlow, "Statistics: A guide to the use of statistical methods in the physical sciences", 1989)

"Exploratory regression methods attempt to reveal unexpected patterns, so they are ideal for a first look at the data. Unlike other regression techniques, they do not require that we specify a particular model beforehand. Thus exploratory techniques warn against mistakenly fitting a linear model when the relation is curved, a waxing curve when the relation is S-shaped, and so forth." (Lawrence C Hamilton, "Regression with Graphics: A second course in applied statistics", 1991)

"Most statistical models assume error free measurement, at least of independent (predictor) variables. However, as we all know, measurements are seldom if ever perfect. Particularly when dealing with noisy data such as questionnaire responses or processes which are difficult to measure precisely, we need to pay close attention to the effects of measurement errors. Two characteristics of measurement which are particularly important in psychological measurement are reliability and validity." (Clay Helberg, "Pitfalls of Data Analysis (or How to Avoid Lies and Damned Lies)", 1995)

"We can consider three broad classes of statistical pitfalls. The first involves sources of bias. These are conditions or circumstances which affect the external validity of statistical results. The second category is errors in methodology, which can lead to inaccurate or invalid results. The third class of problems concerns interpretation of results, or how statistical results are applied (or misapplied) to real world issues." (Clay Helberg, "Pitfalls of Data Analysis (or How to Avoid Lies and Damned Lies)", 1995) 

"This notion of 'being due' - what is sometimes called the gambler’s fallacy - is a mistake we make because we cannot help it. The problem with life is that we have to live it from the beginning, but it makes sense only when seen from the end. As a result, our whole experience is one of coming to provisional conclusions based on insufficient evidence: read ing the signs, gauging the odds." (John Haigh," Taking Chances: Winning With Probability", 1999)

"Big numbers warn us that the problem is a common one, compelling our attention, concern, and action. The media like to report statistics because numbers seem to be 'hard facts' - little nuggets of indisputable truth. [...] One common innumerate error involves not distinguishing among large numbers. [...] Because many people have trouble appreciating the differences among big numbers, they tend to uncritically accept social statistics (which often, of course, feature big numbers)." (Joel Best, "Damned Lies and Statistics: Untangling Numbers from the Media, Politicians, and Activists", 2001)

"Compound errors can begin with any of the standard sorts of bad statistics - a guess, a poor sample, an inadvertent transformation, perhaps confusion over the meaning of a complex statistic. People inevitably want to put statistics to use, to explore a number's implications. [...] The strengths and weaknesses of those original numbers should affect our confidence in the second-generation statistics." (Joel Best, "Damned Lies and Statistics: Untangling Numbers from the Media, Politicians, and Activists", 2001)

 "A major problem with many studies is that the population of interest is not adequately defined before the sample is drawn. Don’t make this mistake. A second major source of error is that the sample proves to have been drawn from a different population than was originally envisioned." (Phillip I Good & James W Hardin, "Common Errors in Statistics (and How to Avoid Them)", 2003)

"The difference between 'statistically significant' and 'not statistically significant' is not in itself necessarily statistically significant. By this, I mean more than the obvious point about arbitrary divisions, that there is essentially no difference between something significant at the 0.049 level or the 0.051 level. I have a bigger point to make. It is common in applied research–in the last couple of weeks, I have seen this mistake made in a talk by a leading political scientist and a paper by a psychologist–to compare two effects, from two different analyses, one of which is statistically significant and one which is not, and then to try to interpret/explain the difference. Without any recognition that the difference itself was not statistically significant." (Andrew Gelman, "The difference between ‘statistically significant’ and ‘not statistically significant’ is not in itself necessarily statistically significant", 2005)

"[…] an outlier is an observation that lies an 'abnormal' distance from other values in a batch of data. There are two possible explanations for the occurrence of an outlier. One is that this happens to be a rare but valid data item that is either extremely large or extremely small. The other is that it isa mistake – maybe due to a measuring or recording error." (Alan Graham, "Developing Thinking in Statistics", 2006)

"Many scientists who work not just with noise but with probability make a common mistake: They assume that a bell curve is automatically Gauss's bell curve. Empirical tests with real data can often show that such an assumption is false. The result can be a noise model that grossly misrepresents the real noise pattern. It also favors a limited view of what counts as normal versus non-normal or abnormal behavior. This assumption is especially troubling when applied to human behavior. It can also lead one to dismiss extreme data as error when in fact the data is part of a pattern." (Bart Kosko, "Noise", 2006) 

"A naive interpretation of regression to the mean is that heights, or baseball records, or other variable phenomena necessarily become more and more 'average' over time. This view is mistaken because it ignores the error in the regression predicting y from x. For any data point xi, the point prediction for its yi will be regressed toward the mean, but the actual yi that is observed will not be exactly where it is predicted. Some points end up falling closer to the mean and some fall further." (Andrew Gelman & Jennifer Hill, "Data Analysis Using Regression and Multilevel/Hierarchical Models", 2007)

"If there is an outlier there are two possibilities: The model is wrong – after all, a theory is the basis on which we decide whether a data point is an outlier (an unexpected value) or not. The value of the data point is wrong because of a failure of the apparatus or a human mistake. There is a third possibility, though: The data point might not be an actual  outlier, but part of a (legitimate) statistical fluctuation." (Manfred Drosg, "Dealing with Uncertainties: A Guide to Error Analysis", 2007)

"In error analysis the so-called 'chi-squared' is a measure of the agreement between the uncorrelated internal and the external uncertainties of a measured functional relation. The simplest such relation would be time independence. Theory of the chi-squared requires that the uncertainties be normally distributed. Nevertheless, it was found that the test can be applied to most probability distributions encountered in practice." (Manfred Drosg, "Dealing with Uncertainties: A Guide to Error Analysis", 2007)

"Another kind of error possibly related to the use of the representativeness heuristic is the gambler’s fallacy, otherwise known as the law of averages. If you are playing roulette and the last four spins of the wheel have led to the ball’s landing on black, you may think that the next ball is more likely than otherwise to land on red. This cannot be. The roulette wheel has no memory. The chance of black is just what it always is. The reason people tend to think otherwise may be that they expect the sequence of events to be representative of random sequences, and the typical random sequence at roulette does not have five blacks in a row." (Jonathan Baron, "Thinking and Deciding" 4th Ed, 2008)

"[…] humans make mistakes when they try to count large numbers in complicated systems. They make even greater errors when they attempt - as they always do - to reduce complicated systems to simple numbers." (Zachary Karabell, "The Leading Indicators: A short history of the numbers that rule our world", 2014)

"There is a growing realization that reported 'statistically significant' claims in statistical publications  are routinely mistaken. Researchers typically express the confidence in their data in terms of p-value: the probability that a perceived result is actually the result of random variation. The value of p (for 'probability') is a way of measuring the extent to which a data set provides evidence against a so-called null hypothesis. By convention, a p- value below 0.05 is considered a meaningful refutation of the null hypothesis; however, such conclusions are less solid than they appear." (Andrew Gelman & Eric Loken, "The Statistical Crisis in Science", American Scientist Vol. 102(6), 2014)

"Using a sample to estimate results in the full population is common in data analysis. But you have to be careful, because even small mistakes can quickly become big ones, given that each observation represents many others. There are also many factors you need to consider if you want to make sure your inferences are accurate." (John H Johnson & Mike Gluck, "Everydata: The misinformation hidden in the little data you consume every day", 2016)

"The central limit conjecture states that most errors are the result of many small errors and, as such, have a normal distribution. The assumption of a normal distribution for error has many advantages and has often been made in applications of statistical models." (David S Salsburg, "Errors, Blunders, and Lies: How to Tell the Difference", 2017)

"Variance is error from sensitivity to fluctuations in the training set. If our training set contains sampling or measurement error, this noise introduces variance into the resulting model. [...] Errors of variance result in overfit models: their quest for accuracy causes them to mistake noise for signal, and they adjust so well to the training data that noise leads them astray. Models that do much better on testing data than training data are overfit." (Steven S Skiena, "The Data Science Design Manual", 2017)

"Statistical models have two main components. First, a mathematical formula that expresses a deterministic, predictable component, for example the fitted straight line that enables us to make a prediction [...]. But the deterministic part of a model is not going to be a perfect representation of the observed world [...] and the difference between what the model predicts, and what actually happens, is the second component of a model and is known as the residual error - although it is important to remember that in statistical modelling, ‘error’ does not refer to a mistake, but the inevitable inability of a model to exactly represent what we observe." (David Spiegelhalter, "The Art of Statistics: Learning from Data", 2019)

"If we don’t understand the statistics, we’re likely to be badly mistaken about the way the world is. It is all too easy to convince ourselves that whatever we’ve seen with our own eyes is the whole truth; it isn’t. Understanding causation is tough even with good statistics, but hopeless without them. [...] And yet, if we understand only the statistics, we understand little. We need to be curious about the world that we see, hear, touch, and smell, as well as the world we can examine through a spreadsheet." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Premature enumeration is an equal-opportunity blunder: the most numerate among us may be just as much at risk as those who find their heads spinning at the first mention of a fraction. Indeed, if you’re confident with numbers you may be more prone than most to slicing and dicing, correlating and regressing, normalizing and rebasing, effortlessly manipulating the numbers on the spreadsheet or in the statistical package - without ever realizing that you don’t fully understand what these abstract quantities refer to. Arguably this temptation lay at the root of the last financial crisis: the sophistication of mathematical risk models obscured the question of how, exactly, risks were being measured, and whether those measurements were something you’d really want to bet your global banking system on." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)


"Always expect to find at least one error when you proofread your own statistics. If you don’t, you are probably making the same mistake twice." (Cheryl Russell)
