17 July 2021

Gary Smith - Collected Quotes

"A computer makes calculations quickly and correctly, but doesn’t ask if the calculations are meaningful or sensible. A computer just does what it is told." (Gary Smith, "Standard Deviations", 2014)

"A study that leaves out data is waving a big red flag. A decision to include or exclude data sometimes makes all the difference in the world. This decision should be based on the relevance and quality of the data, not on whether the data support or undermine a conclusion that is expected or desired." (Gary Smith, "Standard Deviations", 2014)

"A very different - and very incorrect - argument is that successes must be balanced by failures (and failures by successes) so that things average out. Every coin flip that lands heads makes tails more likely. Every red at roulette makes black more likely. […] These beliefs are all incorrect. Good luck will certainly not continue indefinitely, but do not assume that good luck makes bad luck more likely, or vice versa." (Gary Smith, "Standard Deviations", 2014)

"Another way to secure statistical significance is to use the data to discover a theory. Statistical tests assume that the researcher starts with a theory, collects data to test the theory, and reports the results - whether statistically significant or not. Many people work in the other direction, scrutinizing the data until they find a pattern and then making up a theory that fits the pattern." (Gary Smith, "Standard Deviations", 2014)

"Comparisons are the lifeblood of empirical studies. We can’t determine if a medicine, treatment, policy, or strategy is effective unless we compare it to some alternative. But watch out for superficial comparisons: comparisons of percentage changes in big numbers and small numbers, comparisons of things that have nothing in common except that they increase over time, comparisons of irrelevant data. All of these are like comparing apples to prunes." (Gary Smith, "Standard Deviations", 2014)

"Data clusters are everywhere, even in random data. Someone who looks for an explanation will inevitably find one, but a theory that fits a data cluster is not persuasive evidence. The found explanation needs to make sense and it needs to be tested with uncontaminated data." (Gary Smith, "Standard Deviations", 2014)

"Data without theory can fuel a speculative stock market bubble or create the illusion of a bubble where there is none. How do we tell the difference between a real bubble and a false alarm? You know the answer: we need a theory. Data are not enough. […] Data without theory is alluring, but misleading." (Gary Smith, "Standard Deviations", 2014)

"Don’t just do the calculations. Use common sense to see whether you are answering the correct question, the assumptions are reasonable, and the results are plausible. If a statistical argument doesn’t make sense, think about it carefully - you may discover that the argument is nonsense." (Gary Smith, "Standard Deviations", 2014)

"Graphs can help us interpret data and draw inferences. They can help us see tendencies, patterns, trends, and relationships. A picture can be worth not only a thousand words, but a thousand numbers. However, a graph is essentially descriptive - a picture meant to tell a story. As with any story, bumblers may mangle the punch line and the dishonest may lie." (Gary Smith, "Standard Deviations", 2014)

"Graphs should not be mere decoration, to amuse the easily bored. A useful graph displays data accurately and coherently, and helps us understand the data. Chartjunk, in contrast, distracts, confuses, and annoys. Chartjunk may be well-intentioned, but it is misguided. It may also be a deliberate attempt to mystify." (Gary Smith, "Standard Deviations", 2014)

"How can we tell the difference between a good theory and quackery? There are two effective antidotes: common sense and fresh data. If it is a ridiculous theory, we shouldn’t be persuaded by anything less than overwhelming evidence, and even then be skeptical. Extraordinary claims require extraordinary evidence. Unfortunately, common sense is an uncommon commodity these days, and many silly theories have been seriously promoted by honest researchers." (Gary Smith, "Standard Deviations", 2014)

"If somebody ransacks data to find a pattern, we still need a theory that makes sense. On the other hand, a theory is just a theory until it is tested with persuasive data." (Gary Smith, "Standard Deviations", 2014)

"[…] many gamblers believe in the fallacious law of averages because they are eager to find a profitable pattern in the chaos created by random chance." (Gary Smith, "Standard Deviations", 2014)

"Numbers are not inherently tedious. They can be illuminating, fascinating, even entertaining. The trouble starts when we decide that it is more important for a graph to be artistic than informative." (Gary Smith, "Standard Deviations", 2014)

"Provocative assertions are provocative precisely because they are counterintuitive - which is a very good reason for skepticism." (Gary Smith, "Standard Deviations", 2014)

"Regression does not describe changes in ability that happen as time passes […]. Regression is caused by performances fluctuating about ability, so that performances far from the mean reflect abilities that are closer to the mean." (Gary Smith, "Standard Deviations", 2014)

"Remember that even random coin flips can yield striking, even stunning, patterns that mean nothing at all. When someone shows you a pattern, no matter how impressive the person’s credentials, consider the possibility that the pattern is just a coincidence. Ask why, not what. No matter what the pattern, the question is: Why should we expect to find this pattern?" (Gary Smith, "Standard Deviations", 2014)

"Self-selection bias occurs when people choose to be in the data - for example, when people choose to go to college, marry, or have children. […] Self-selection bias is pervasive in 'observational data', where we collect data by observing what people do. Because these people chose to do what they are doing, their choices may reflect who they are. This self-selection bias could be avoided with a controlled experiment in which people are randomly assigned to groups and told what to do." (Gary Smith, "Standard Deviations", 2014)

"Self-selection bias occurs when we compare people who made different choices without thinking about why they made these choices. […] Our conclusions would be more convincing if choice was removed […]" (Gary Smith, "Standard Deviations", 2014)

"The omission of zero magnifies the ups and downs in the data, allowing us to detect changes that might otherwise be ambiguous. However, once zero has been omitted, the graph is no longer an accurate guide to the magnitude of the changes. Instead, we need to look at the actual numbers." (Gary Smith, "Standard Deviations", 2014)

"These practices - selective reporting and data pillaging - are known as data grubbing. The discovery of statistical significance by data grubbing shows little other than the researcher’s endurance. We cannot tell whether a data grubbing marathon demonstrates the validity of a useful theory or the perseverance of a determined researcher until independent tests confirm or refute the finding. But more often than not, the tests stop there. After all, you won’t become a star by confirming other people’s research, so why not spend your time discovering new theories? The data-grubbed theory consequently sits out there, untested and unchallenged." (Gary Smith, "Standard Deviations", 2014)

"We are genetically predisposed to look for patterns and to believe that the patterns we observe are meaningful. […] Don’t be fooled into thinking that a pattern is proof. We need a logical, persuasive explanation and we need to test the explanation with fresh data." (Gary Smith, "Standard Deviations", 2014)

"We are hardwired to make sense of the world around us - to notice patterns and invent theories to explain these patterns. We underestimate how easily pat - terns can be created by inexplicable random events - by good luck and bad luck." (Gary Smith, "Standard Deviations", 2014)

"We are seduced by patterns and we want explanations for these patterns. When we see a string of successes, we think that a hot hand has made success more likely. If we see a string of failures, we think a cold hand has made failure more likely. It is easy to dismiss such theories when they involve coin flips, but it is not so easy with humans. We surely have emotions and ailments that can cause our abilities to go up and down. The question is whether these fluctuations are important or trivial." (Gary Smith, "Standard Deviations", 2014)

"We naturally draw conclusions from what we see […]. We should also think about what we do not see […]. The unseen data may be just as important, or even more important, than the seen data. To avoid survivor bias, start in the past and look forward." (Gary Smith, "Standard Deviations", 2014)

"We encounter regression in many contexts - pretty much whenever we see an imperfect measure of what we are trying to measure. Standardized tests are obviously an imperfect measure of ability. [...] Each experimental score is an imperfect measure of 'ability', the benefits from the layout. To the extent there is randomness in this experiment - and there surely is - the prospective benefits from the layout that has the highest score are probably closer to the mean than was the score." (Gary Smith, "Standard Deviations", 2014)

"With fast computers and plentiful data, finding statistical significance is trivial. If you look hard enough, it can even be found in tables of random numbers." (Gary Smith, "Standard Deviations", 2014)

No comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...

On Data: Longitudinal Data

  "Longitudinal data sets are comprised of repeated observations of an outcome and a set of covariates for each of many subjects. One o...