30 January 2022

On Statisticians (1975 - 1999)

"Competent statisticians will be front line troops in our war for survival-but how do we get them? I think there is now a wide readiness to agree that what we want are neither mere theorem provers nor mere users of a cookbook. A proper balance of theory and practice is needed and, most important, statisticians must learn how to be good scientists; a talent which has to be acquired by experience and example." (George E P Box, "Science and Statistics", Journal of the American Statistical Association 71, 1976)

"When the statistician looks at the outside world, he cannot, for example, rely on finding errors that are independently and identically distributed in approximately normal distributions. In particular, most economic and business data are collected serially and can be expected, therefore, to be heavily serially dependent. So is much of the data collected from the automatic instruments which are becoming so common in laboratories these days. Analysis of such data, using procedures such as standard regression analysis which assume independence, can lead to gross error. Furthermore, the possibility of contamination of the error distribution by outliers is always present and has recently received much attention. More generally, real data sets, especially if they are long, usually show inhomogeneity in the mean, the variance, or both, and it is not always possible to randomize." (George E P Box, "Some Problems of Statistics and Everyday Life", Journal of the American Statistical Association, Vol. 74 (365), 1979)

"Science usually amounts to a lot more than blind trial and error. Good statistics consists of much more than just significance tests; there are more sophisticated tools available for the analysis of results, such as confidence statements, multiple comparisons, and Bayesian analysis, to drop a few names. However, not all scientists are good statisticians, or want to be, and not all people who are called scientists by the media deserve to be so described." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"Another reason for the applied statistician to care about Bayesian inference is that consumers of statistical answers, at least interval estimates, commonly interpret them as probability statements about the possible values of parameters. Consequently, the answers statisticians provide to consumers should be capable of being interpreted as approximate Bayesian statements." (Donald B Rubin, "Bayesianly justifiable and relevant frequency calculations for the applied statistician", Annals of Statistics 12(4), 1984)

"Stepwise regression is probably the most abused computerized statistical technique ever devised. If you think you need stepwise regression to solve a particular problem you have, it is almost certain that you do not. Professional statisticians rarely use automated stepwise regression." (Leland Wilkinson, "SYSTAT", 1984)

"The result is that non-statisticians tend to place undue reliance on single ‘cookbook’ techniques, and it has for example become impossible to get results published in some medical, psychological and biological journals without reporting significance values even if of doubtful validity. It is sad that students may actually be more confused and less numerate at the end of a ‘service course’ than they were at the beginning, and more likely to overlook a descriptive approach in favor of some inferential method which may be inappropriate or incorrectly executed." (Christopher Chatfield, "The initial examination of data", Journal of the Royal Statistical Society, Series A 14, 1985)

"Too much of what all statisticians do [...] is blatantly subjective for any of us to kid ourselves or the users of our technology into believing that we have operated ‘impartially’ in any true sense. [...] We can do what seems to us most appropriate, but we can not be objective and would do well to avoid language that hints to the contrary." (Steve V Vardeman, Comment, Journal of the American Statistical Association 82, 1987)

"[In statistics] you have the fact that the concepts are not very clean. The idea of probability, of randomness, is not a clean mathematical idea. You cannot produce random numbers mathematically. They can only be produced by things like tossing dice or spinning a roulette wheel. With a formula, any formula, the number you get would be predictable and therefore not random. So as a statistician you have to rely on some conception of a world where things happen in some way at random, a conception which mathematicians don’t have." (Lucien LeCam, [interview] 1988)

"It is clear that a statistician who is involved at the start of an investigation, advises on data collection, and who knows the background and objectives, will generally make a better job of the analysis than a statistician who was called in later on." (Christopher Chatfield, "Problem solving: a statistician’s guide", 1988)

"The statistician should not always remain in his or her own office: not only is relevant information more likely to be on hand in the experimenter’s department, but in the longer term the statistician stands to gain immeasurably in understanding of agricultural problems by often visiting other departments and their laboratories and fields." (David J Finney, "Was this in your statistics textbook?", Experimental Agriculture 24, 1988)

"A little thought reveals a fact widely understood among statisticians: The null hypothesis, taken literally (and that’s the only way you can take it in formal hypothesis testing), is always false in the real world. [...] If it is false, even to a tiny degree, it must be the case that a large enough sample will produce a significant result and lead to its rejection. So if the null hypothesis is always false, what’s the big deal about rejecting it?" (Jacob Cohen, "Things I Have Learned (So Far)", American Psychologist, 1990)

"Statisticians classically asked the wrong question–and were willing to answer with a lie, one that was often a downright lie. They asked 'Are the effects of A and B different?' and they were willing to answer “no”. All we know about the world teaches us that the effects of A and B are always different–in some decimal place–for every A and B. Thus asking 'Are the effects different?' is foolish. What we should be answering first is 'Can we tell the direction in which the effects of A differ from the effects of B?' In other words, can we be confident about the direction from A to B? Is it 'up', 'down' or 'uncertain'?" (John W Tukey, "The Philosophy of Multiple Comparisons", Statistical Science 6, 1991)

"Statistics is a very powerful and persuasive mathematical tool. People put a lot of faith in printed numbers. It seems when a situation is described by assigning it a numerical value, the validity of the report increases in the mind of the viewer. It is the statistician's obligation to be aware that data in the eyes of the uninformed or poor data in the eyes of the naive viewer can be as deceptive as any falsehoods." (Theoni Pappas, "More Joy of Mathematics: Exploring mathematical insights & concepts", 1991)

"A careful and sophisticated analysis of the data is often quite useless if the statistician cannot communicate the essential features of the data to a client for whom statistics is an entirely foreign language." (Christopher J Wild, "Embracing the ‘Wider view’ of Statistics", The American Statistician 48, 1994)

"We have to teach non-statisticians to recognize where statistical expertise is required. No one else will. We teach students how to solve simple statistical problems, but how often do we make any serious effort to teach them to recognize situations that call for statistical expertise that is beyond the technical content of the course." (Christopher J Wild, "Embracing the ‘Wider view’ of Statistics", The American Statistician 48, 1994)

"Because no one becomes statistically self-sufficient after one semester of study, I try to prepare students to become intelligent consumers of the assistance that they will inevitably seek. Service courses train future clients, not future statisticians." (Michael W Tosset, "Statistical Science", 1998)

"There are aspects of statistics other than it being intellectually difficult that are barriers to learning. For one thing, statistics does not benefit from a glamorous image that motivates students to persist through tedious and frustrating lessons[...]there are no TV dramas with a good-looking statistician playing the lead, and few mothers’ chests swell with pride as they introduce their son or daughter as 'the statistician'." (Chap T Le & James R Boen, "Health and Numbers: Basic Statistical Methods", 1995)

"When an analyst selects the wrong tool, this is a misuse which usually leads to invalid conclusions. Incorrect use of even a tool as simple as the mean can lead to serious misuses. […] But all statisticians know that more complex tools do not guarantee an analysis free of misuses. Vigilance is required on every statistical level."  (Herbert F Spirer et al, "Misused Statistics" 2nd Ed, 1998)
