02 October 2024

On Statisticians (2000 -)

"Things are changing. Statisticians now recognize that computer scientists are making novel contributions while computer scientists now recognize the generality of statistical theory and methodology. Clever data mining algorithms are more scalable than statisticians ever thought possible. Formal statistical theory is more pervasive than computer scientists had realized." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004)

"[...] statisticians are constantly looking out for missed nuances: a statistical average for all groups may well hide vital differences that exist between these groups. Ignoring group differences when they are present frequently portends inequitable treatment." (Kaiser Fung, "Numbers Rule Your World", 2010)

"What is so unconventional about the statistical way of thinking? First, statisticians do not care much for the popular concept of the statistical average; instead, they fixate on any deviation from the average. They worry about how large these variations are, how frequently they occur, and why they exist. [...] Second, variability does not need to be explained by reasonable causes, despite our natural desire for a rational explanation of everything; statisticians are frequently just as happy to pore over patterns of correlation. [...] Third, statisticians are constantly looking out for missed nuances: a statistical average for all groups may well hide vital differences that exist between these groups. Ignoring group differences when they are present frequently portends inequitable treatment. [...] Fourth, decisions based on statistics can be calibrated to strike a balance between two types of errors. Predictably, decision makers have an incentive to focus exclusively on minimizing any mistake that could bring about public humiliation, but statisticians point out that because of this bias, their decisions will aggravate other errors, which are unnoticed but serious. [...] Finally, statisticians follow a specific protocol known as statistical testing when deciding whether the evidence fits the crime, so to speak. Unlike some of us, they don’t believe in miracles. In other words, if the most unusual coincidence must be contrived to explain the inexplicable, they prefer leaving the crime unsolved." (Kaiser Fung, "Numbers Rule Your World", 2010)

"Diagrams furnish only approximate information. They do not add anything to the meaning of the data and, therefore, are not of much use to a statistician or research worker for further mathematical treatment or statistical analysis. On the other hand, graphs are more obvious, precise and accurate than the diagrams and are quite helpful to the statistician for the study of slopes, rates of change and estimation, (interpolation and extrapolation), wherever possible." (S C Gupta & Indra Gupta, "Business Statistics", 2013)

"Good design is an important part of any visualization, while decoration (or chart-junk) is best omitted. Statisticians should also be careful about comparing themselves to artists and designers; our goals are so different that we will fare poorly in comparison." (Hadley Wickham, "Graphical Criticism: Some Historical Notes", Journal of Computational and Graphical Statistics Vol. 22(1), 2013) 

"Missing data is the blind spot of statisticians. If they are not paying full attention, they lose track of these little details. Even when they notice, many unwittingly sway things our way. Most ranking systems ignore missing values." (Kaiser Fung, "Numbersense: How To Use Big Data To Your Advantage", 2013)

"Statisticians set a high bar when they assign a cause to an effect. [...] A model that ignores cause–effect relationships cannot attain the status of a model in the physical sciences. This is a structural limitation that no amount of data - not even Big Data - can surmount." (Kaiser Fung, "Numbersense: How To Use Big Data To Your Advantage", 2013)

"When statisticians, trained in math and probability theory, try to assess likely outcomes, they demand a plethora of data points. Even then, they recognize that unless it’s a very simple and controlled action such as flipping a coin, unforeseen variables can exert significant influence." (Zachary Karabell, "The Leading Indicators: A short history of the numbers that rule our world", 2014)

"Optimization is more than finding the best simulation results. It is itself a complex and evolving field that, subject to certain information constraints, allows data scientists, statisticians, engineers, and traders alike to perform reality checks on modeling results." (Chris Conlan, "Automated Trading with R: Quantitative Research and Platform Development", 2016)

"The tricky part is that there aren’t really any hard-and-fast rules when it comes to identifying outliers. Some economists say an outlier is anything that’s a certain distance away from the mean, but in practice it’s fairly subjective and open to interpretation. That’s why statisticians spend so much time looking at data on a case-by-case basis to determine what is - and isn’t - an outlier." (John H Johnson & Mike Gluck, "Everydata: The misinformation hidden in the little data you consume every day", 2016)

"The job of the statistician is to formulate an inventory of all those things that matter in order to obtain a representative sample. Researchers have to avoid the tendency to capture variables that are easy to identify or collect data on - sometimes the things that matter are not obvious or are difficult to measure." (Daniel J Levitin, "Weaponized Lies", 2017)

"To be any good, a sample has to be representative. A sample is representative if every person or thing in the group you’re studying has an equally likely chance of being chosen. If not, your sample is biased. […] The job of the statistician is to formulate an inventory of all those things that matter in order to obtain a representative sample. Researchers have to avoid the tendency to capture variables that are easy to identify or collect data on - sometimes the things that matter are not obvious or are difficult to measure." (Daniel J Levitin, "Weaponized Lies", 2017)

"Some scientists (e.g., econometricians) like to work with mathematical equations; others (e.g., hard-core statisticians) prefer a list of assumptions that ostensibly summarizes the structure of the diagram. Regardless of language, the model should depict, however qualitatively, the process that generates the data - in other words, the cause-effect forces that operate in the environment and shape the data generated." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"Statisticians are sometimes dismissed as bean counters. The sneering term is misleading as well as unfair. Most of the concepts that matter in policy are not like beans; they are not merely difficult to count, but difficult to define. Once you’re sure what you mean by 'bean', the bean counting itself may come more easily. But if we don’t understand the definition, then there is little point in looking at the numbers. We have fooled ourselves before we have begun." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)
