20 October 2019

John W Tukey - Collected Quotes

"[We] need men who can practice science - not a particular science - in a word, we need scientific generalists." (John W Tukey, "The Education of a Scientific Generalist", 1949)

"[...] the whole of modern statistics, philosophy and methods alike, is based on the principle of interpreting what did happen in terms of what might have happened." (John W Tukey, "Standard Methods of Analyzing Data, 1951)

"Just remember that not all statistics has been mathematized - and that we may not have to wait for its mathematization in order to use it." (John W Tukey, "The Growth of Experimental Design in a Research Laboratory, 1953)

"Difficulties in identifying problems have delayed statistics far more than difficulties in solving problems." (John W Tukey, Unsolved Problems of Experimental Statistics, 1954)

"The practical power of a statistical test is the product of its’ statistical power and the probability of use." (John W Tukey, A Quick, "Compact, Two Sample Test to Duckworth’s Specifications", 1959)

"Predictions, prophecies, and perhaps even guidance - those who suggested this title to me must have hoped for such-even though occasional indulgences in such actions by statisticians has undoubtedly contributed to the characterization of a statistician as a man who draws straight lines from insufficient data to foregone conclusions!" (John W Tukey, "Where do We Go From Here?", Journal of the American Statistical Association, Vol. 55 (289), 1960)

"Today one of statistics' great needs is a body of able investigators who make it clear to the intellectual world that they are scientific statisticians. and they are proud of that fact that to them mathematics is incidental, though perhaps indispensable." (John W Tukey, "Statistical and Quantitative Methodology, 1961)

"If data analysis is to be well done, much of it must be a matter of judgment, and ‘theory’ whether statistical or non-statistical, will have to guide, not command." (John W Tukey, "The Future of Data Analysis", Annals of Mathematical Statistics, Vol. 33 (1), 1962)

"If one technique of data analysis were to be exalted above all others for its ability to be revealing to the mind in connection with each of many different models, there is little doubt which one would be chosen. The simple graph has brought more information to the data analyst’s mind than any other device. It specializes in providing indications of unexpected phenomena." (John W Tukey, "The Future of Data Analysis", Annals of Mathematical Statistics Vol. 33 (1), 1962)

"The most important maxim for data analysis to heed, and one which many statisticians seem to have shunned is this: ‘Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise.’ Data analysis must progress by approximate answers, at best, since its knowledge of what the problem really is will at best be approximate." (John W Tukey, "The Future of Data Analysis", Annals of Mathematical Statistics, Vol. 33, No. 1, 1962)

"The physical sciences are used to ‘praying over’ their data, examining the same data from a variety of points of view. This process has been very rewarding, and has led to many extremely valuable insights. Without this sort of flexibility, progress in physical science would have been much slower. Flexibility in analysis is often to be had honestly at the price of a willingness not to demand that what has already been observed shall establish, or prove, what analysis suggests. In physical science generally, the results of praying over the data are thought of as something to be put to further test in another experiment, as indications rather than conclusions." (John W Tukey, "The Future of Data Analysis", Annals of Mathematical Statistics Vol. 33 (1), 1962)

"The histogram, with its columns of area proportional to number, like the bar graph, is one of the most classical of statistical graphs. Its combination with a fitted bell-shaped curve has been common since the days when the Gaussian curve entered statistics. Yet as a graphical technique it really performs quite poorly. Who is there among us who can look at a histogram-fitted Gaussian combination and tell us, reliably, whether the fit is excellent, neutral, or poor? Who can tell us, when the fit is poor, of what the poorness consists? Yet these are just the sort of questions that a good graphical technique should answer at least approximately." (John W Tukey, "The Future of Processes of Data Analysis", 1965)

"The first step in data analysis is often an omnibus step. We dare not expect otherwise, but we equally dare not forget that this step, and that step, and other step, are all omnibus steps and that we owe the users of such techniques a deep and important obligation to develop ways, often varied and competitive, of replacing omnibus procedures by ones that are more sharply focused." (John W Tukey, "The Future of Processes of Data Analysis", 1965)

"The basic general intent of data analysis is simply stated: to seek through a body of data for interesting relationships and information and to exhibit the results in such a way as to make them recognizable to the data analyzer and recordable for posterity. Its creative task is to be productively descriptive, with as much attention as possible to previous knowledge, and thus to contribute to the mysterious process called insight." (John W Tukey & Martin B Wilk, "Data Analysis and Statistics: An Expository Overview", 1966)

"Comparable objectives in data analysis are (l) to achieve more specific description of what is loosely known or suspected; (2) to find unanticipated aspects in the data, and to suggest unthought-of-models for the data's summarization and exposure; (3) to employ the data to assess the (always incomplete) adequacy of a contemplated model; (4) to provide both incentives and guidance for further analysis of the data; and (5) to keep the investigator usefully stimulated while he absorbs the feeling of his data and considers what to do next." (John W Tukey & Martin B Wilk, "Data Analysis and Statistics: An Expository Overview", 1966)

"The science and art of data analysis concerns the process of learning from quantitative records of experience. By its very nature it exists in relation to people. Thus, the techniques and the technology of data analysis must be harnessed to suit human requirements and talents. Some implications for effective data analysis are: (1) that it is essential to have convenience of interaction of people and intermediate results and (2) that at all stages of data analysis the nature and detail of output, both actual and potential, need to be matched to the capabilities of the people who use it and want it." (John W Tukey & Martin B Wilk, "Data Analysis and Statistics: An Expository Overview", 1966)

"In many instances, a picture is indeed worth a thousand words. To make this true in more diverse circumstances, much more creative effort is needed to pictorialize the output from data analysis. Naive pictures are often extremely helpful, but more sophisticated pictures can be both simple and even more informative." (John W Tukey & Martin B Wilk, "Data Analysis and Statistics: An Expository Overview", 1966)

"Data analysis must be iterative to be effective. [...] The iterative and interactive interplay of summarizing by fit and exposing by residuals is vital to effective data analysis. Summarizing and exposing are complementary and pervasive." (John W Tukey & Martin B Wilk, "Data Analysis and Statistics: An Expository Overview", 1966)

"Summarizing data is a process of constrained and partial a process that essentially and inevitably corresponds to description - some sort of fitting, though it need not necessarily involve formal criteria or well-defined computations." (John W Tukey & Martin B Wilk, "Data Analysis and Statistics: An Expository Overview", 1966)

"The typical statistician has learned from bitter experience that negative results are just as important as positive ones, sometimes more so." (John W Tukey, "A Statistician's Comment", 1967)

"It is fair to say that statistics has made its greatest progress by having to move away from certainty [...] If we really want to make progress, we need to identify our next step away from certainty." (John W Tukey, "What Have Statisticians Been Forgetting", 1967)

"Every student of the art of data analysis repeatedly needs to build upon his previous statistical knowledge and to reform that foundation through fresh insights and emphasis." (John W Tukey, "Data Analysis, Including Statistics", 1968)

"Every graph is at least an indication, by contrast with some common instances of numbers." (John W Tukey, "Data Analysis, Including Statistics", 1968)

"Nothing can substitute for relatively direct assessment of variability." (John W Tukey, "Data Analysis, Including Statistics", 1968)

"No one knows how to appraise a procedure safely except by using different bodies of data from those that determined it."  (John W Tukey, "Data Analysis, Including Statistics", 1968)

"The problems of different fields are much more alike than their practitioners think, much more alike than different." (John W Tukey, "Analyzing Data: Sanctification or Detective Work?", 1969)

"[...] bending the question to fit the analysis is to be shunned at all costs." (John W Tukey, "Analyzing Data: Sanctification or Detective Work?", 1969)

"Data analysis is in important ways an antithesis of pure mathematics." (John W Tukey, "Data Analysis, Computation and Mathematics", 1972)

"Undoubtedly, the swing to exploratory data analysis will go somewhat too far. However : It is better to ride a damped pendulum than to be stuck in the mud." (John W Tukey, "Exploratory Data Analysis as Part of a Larger Whole", 1973)

"The twin assumptions of normality of distribution and homogeneity of variance are not ever exactly fulfilled in practice, and often they do not even hold to a good approximation." (John W Tukey, "The problem of multiple comparisons", [unpublished manuscript] 1973)

"Exploratory data analysis can never be the whole story, but nothing else can serve as the foundation stone - as the first step." (John W. Tukey, "Exploratory Data Analysis", 1977)

"In data analysis, a plot of y against x may help us when we know nothing about the logical connection from x to y–even when we do not know whether or not there is one–even when we know that such a connection is impossible." (John W Tukey, "Exploratory Data Analysis", 1977)

"It is a rare thing that a specific body of data tells us as clearly as we would wish how it itself should be analyzed." (John W Tukey, "Exploratory Data Analysis", 1977)

"One thing the data analyst has to learn is how to expose himself to what his data are willing - or even anxious - to tell him. Finding clues requires looking in the right places and with the right magnifying glass." (John W. Tukey, "Exploratory Data Analysis", 1977)

"The greatest value of a picture is when it forces us to notice what we never expected to see." (John W Tukey, "Exploratory Data Analysis", 1977)

"There is no excuse for failing to plot and look." (John W Tukey, "Exploratory Data Analysis", 1977)

"There is no more reason to expect one graph to ‘tell all’ than to expect one number to do the same." (John W Tukey, "Exploratory Data Analysis", 1977)

"Unless exploratory data analysis uncovers indications, usually quantitative ones, there is likely to nothing for confirmatory data analysis to consider." (John W Tukey, "Exploratory Data Analysis", 1977)

"Whatever the data, we can try to gain understanding by straightening or by flattening. When we succeed in doing one or both, we almost always see more clearly what is going on." (John W Tukey, "Exploratory Data Analysis", 1977)

"[...] exploratory data analysis is an attitude, a state of flexibility, a willingness to look for those things that we believe are not there, as well as for those we believe might be there. Except for its emphasis on graphs, its tools are secondary to its purpose." (John W Tukey, [comment] 1979)

"There is NO question of teaching confirmatory OR exploratory - we need to teach both." (John W Tukey, "We Need Both Exploratory and Confirmatory", 1980)

"Finding the question is often more important than finding the answer." (John W Tukey, "We Need Both Exploratory and Confirmatory", 1980)

"[...] any hope that we are smart enough to find even transiently optimum solutions to our data analysis problems is doomed to failure, and, indeed, if taken seriously, will mislead us in the allocation of effort, thus wasting both intellectual and computational effort." (John W Tukey, "Choosing Techniques for the Analysis of Data", 1981)

"Detailed study of the quality of data sources is an essential part of applied work. [...] Data analysts need to understand more about the measurement processes through which their data come. To know the name by which a column of figures is headed is far from being enough." (John W Tukey, "An Overview of Techniques of Data Analysis, Emphasizing Its Exploratory Aspects", 1982)

"Exploratory data analysis, EDA, calls for a relatively free hand in exploring the data, together with dual obligations: (•) to look for all plausible alternatives and oddities - and a few implausible ones, (graphic techniques can be most helpful here) and (•) to remove each appearance that seems large enough to be meaningful - ordinarily by some form of fitting, adjustment, or standardization [...] so that what remains, the residuals, can be examined for further appearances." (John W Tukey, "Introduction to Styles of Data Analysis Techniques", 1982)

"A competent data analysis of an even moderately complex set of data is a thing of trials and retreats, of dead ends and branches." (John W Tukey, Computer Science and Statistics: Proceedings of the 14th Symposium on the Interface, 1983)

"If we need a short suggestion of what exploratory data analysis is, I would suggest that: (1) it is an attitude, AND (2) a flexibility, AND (3) some graph paper (or transparencies, or both)." (John W Tukey [in "The collected works of John W. Tukey: Philosophy and principles of data analysis 1949-1964" Vols. III & IV, 1986])

"The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data." (John W Tukey, "Sunset Salvo", The American Statistician Vol. 40 (1), 1986)

"Three of the main strategies of data analysis are: (1) graphical presentation, (2) provision of flexibility in viewpoint and in facilities, (3) intensive search for parsimony and simplicity."  (John W Tukey [in "The collected works of John W. Tukey: Philosophy and principles of data analysis 1949-1964" Vols. III & IV, 1986])

"The greatest possibilities of visual display lie in vividness and inescapability of the intended message. A visual display can stop your mental flow in its tracks and make you think. A visual display can force you to notice what you never expected to see. One should see the intended at once; one should not even have to wait for it to appear." (John W Tukey, "Data-based graphics: Visual display in the decades to come", Statistical Science 5, 1990)

"Empirical knowledge is always fuzzy! And theoretical knowledge, like all the laws of physics, as of today’s date, is always wrong-in detail, though possibly providing some very good approximations indeed." (John W Tukey, "The philosophy of multiple comparisons", Statistical Science 6, 1991)

"Statisticians classically asked the wrong question–and were willing to answer with a lie, one that was often a downright lie. They asked 'Are the effects of A and B different?' and they were willing to answer 'no'. All we know about the world teaches us that the effects of A and B are always different–in some decimal place–for every A and B. Thus asking 'Are the effects different?' is foolish. What we should be answering first is 'Can we tell the direction in which the effects of A differ from the effects of B?' In other words, can we be confident about the direction from A to B? Is it 'up', 'down' or 'uncertain'?" (John W Tukey, "The Philosophy of Multiple Comparisons", Statistical Science 6, 1991)

"The worst, i.e., most dangerous, feature of 'accepting the null hypothesis' is the giving up of explicit uncertainty. […] Mathematics can sometimes be put in such black-and-white terms, but our knowledge or belief about the external world never can." (John W Tukey, "The Philosophy of Multiple Comparisons", Statistical Science Vol. 6 (1), 1991)

"No one has ever shown that he or she had a free lunch. Here, of course, 'free lunch' means 'usefulness of a model that is locally easy to make inferences from'. (John Tukey, "Issues relevant to an honest account of data-based inference, partially in the light of Laurie Davies’ paper", 1993)

"The purpose of plotting is to convey phenomena to the viewer’s cortex, not to provide a place to lookup observed numbers." (Kaye Basford & John W Tukey, "Graphical Analysis of Multi-Response Data", 1998)

"I believe that there are many classes of problems where Bayesian analyses are reasonable, mainly classes with which I have little acquaintance." (John Tukey, "The life and professional contributions of John W. Tukey, The Annals of Statistics", Vol 30, 2001)

"Since the aim of exploratory data analysis is to learn what seems to be, it should be no surprise that pictures play a vital role in doing it well." (John W Tukey, "John W Tukey’s Works on Interactive Graphics", The Annals of Statistics Vol. 30 (6), 2002)

"Just which robust/resistant methods you use is not important–what is important is that you use some. It is perfectly proper to use both classical and robust/resistant methods routinely, and only worry when they differ enough to matter. But, when they differ, you should think hard." (John W Tukey)

"Statistics is the science, the art, the philosophy, and the technique of making inferences from the particular to the general." (John W Tukey)

"The greatest possibilities of visual display lie in vividness and inescapability of the intended message. A visual display can stop your mental flow in its tracks and make you think. A visual display can force you to notice what you never expected to see." (John W Tukey)

"The purpose of [data] display is comparison (recognition of phenomena), not numbers [...] The phenomena are the main actors, numbers are the supporting cast." (John W Tukey)
