22 April 2021

On Data (1980-1989)

"Facts and theories are different things, not rungs in a hierarchy of increasing certainty. Facts are the world's data. Theories are structures of ideas that explain and interpret facts. Facts do not go away while scientists debate rival theories for explaining them." (Stephen J Gould "Evolution as Fact and Theory", 1981)

"In natural science we are concerned ultimately, not with convenient arrangements of observational data which can be generalized into universal explanatory form, but with movements of thought, at once theoretical and empirical, which penetrate into the intrinsic structure of the universe in such a way that there becomes disclosed to us its basic design and we find ourselves at grips with reality. [...] We cannot pursue natural science scientifically without engaging at the same time in meta-scientific operations." (Thomas F Torrance, "Divine and Contingent Order", 1981)

"People often feel inept when faced with numerical data. Many of us think that we lack numeracy, the ability to cope with numbers. […] The fault is not in ourselves, but in our data. Most data are badly presented and so the cure lies with the producers of the data. To draw an analogy with literacy, we do not need to learn to read better, but writers need to be taught to write better." (Andrew Ehrenberg, "The problem of numeracy", American Statistician 35(2), 1981)

"The fact must be expressed as data, but there is a problem in that the correct data is difficult to catch. So that I always say 'When you see the data, doubt it!' 'When you see the measurement instrument, doubt it!' [...] For example, if the methods such as sampling, measurement, testing and chemical analysis methods were incorrect, data. […] to measure true characteristics and in an unavoidable case, using statistical sensory test and express them as data." (Kaoru Ishikawa, Annual Quality Congress Transactions, 1981)

"There is a tendency to mistake data for wisdom, just as there has always been a tendency to confuse logic with values, intelligence with insight. Unobstructed access to facts can produce unlimited good only if it is matched by the desire and ability to find out what they mean and where they lead." (Norman Cousins, "Human Options: An Autobiographical Notebook", 1981)

"A scientist should not cheat or falsify data or quote out of context or do any other thing that is intellectually dishonest. Of course, as always, some individuals fail; but science as a whole disapproves of such action. Indeed, when transgressors are detected, they are usually expelled from the community." (Michael Ruse, "Response to the Commentary: Pro Judice", Science, Technology and Human Values Vol. 7 (41), 1982)

"In all human activities, it is not ideas or machines that dominate; it is people. I have heard people speak of 'the effect of personality on science'. But this is a backward thought. Rather, we should talk about the effect of science on personalities. Science is not the dispassionate analysis of impartial data. It is the human, and thus passionate, exercise of skill and sense on such data." (Philip Hilts, "Scientific Temperaments: Three Lives in Contemporary Science", 1982)

"Data in isolation are meaningless, a collection of numbers. Only in context of a theory do they assume significance […]" (George Greenstein, "Frozen Star", 1983)

"Excellence in statistical graphics consists of complex ideas communicated with clarity, precision, and efficiency. Graphical displays should show the data; induce the viewer to think about the substance rather than about the methodology, graphic design, the technology of graphic production, or something else; avoid distorting what the data have to say; present many numbers in a small space; make large data sets coherent; encourage the eye to compare different pieces of data; reveal the data at several levels of detail, from a broad overview to the fine structure; serve a reasonably clear purpose: description, exploration, tabulation, or decoration; [should] be closely integrated with the statistical and verbal descriptions of a data set." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"In all scientific fields, theory is frequently more important than experimental data. Scientists are generally reluctant to accept the existence of a phenomenon when they do not know how to explain it. On the other hand, they will often accept a theory that is especially plausible before there exists any data to support it." (Richard Morris, 1983)

"Inept graphics also flourish because many graphic artists believe that statistics are boring and tedious. It then follows that decorated graphics must pep up, animate, and all too often exaggerate what evidence there is in the data. […] If the statistics are boring, then you've got the wrong numbers." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

“The purpose of models is not to fit the data but to sharpen the questions.” (Samuel Karlin, 1983)

“There are those who try to generalize, synthesize, and build models, and there are those who believe nothing and constantly call for more data. The tension between these two groups is a healthy one; science develops mainly because of the model builders, yet they need the second group to keep them honest.” (Andrew Miall, “Principles of Sedimentary Basin Analysis”, 1984)

"Data is raw. It simply exists and has no significance beyond its existence (in and of itself). It can exist in any form, usable or not. It does not have meaning of itself. In computer parlance, a spreadsheet generally starts out by holding data." (Russell L Ackoff, "Towards a Systems Theory of Organization", 1985)

"Information is data that has been given meaning by way of relational connection. This 'meaning' can be useful, but does not have to be. In computer parlance, a relational database makes information from the data stored within it." (Russell L Ackoff, "Towards a Systems Theory of Organization", 1985)

"Intuition becomes increasingly valuable in the new information society precisely because there is so much data." (John Naisbitt, "Re-Inventing the Corporation", 1985)

"Probability is the mathematics of uncertainty. Not only do we constantly face situations in which there is neither adequate data nor an adequate theory, but many modern theories have uncertainty built into their foundations. Thus learning to think in terms of probability is essential. Statistics is the reverse of probability (glibly speaking). In probability you go from the model of the situation to what you expect to see; in statistics you have the observations and you wish to estimate features of the underlying model." (Richard W Hamming, "Methods of Mathematics Applied to Calculus, Probability, and Statistics", 1985)

"Thus statistics should generally be taught more as a practical subject with analyses of real data. Of course some theory and an appropriate range of statistical tools need to be learnt, but students should be taught that Statistics is much more than a collection of standard prescriptions." (Christopher Chatfield, "The Initial Examination of Data", Journal of the Royal Statistical Society A Vol. 148, 1985)

"Models are often used to decide issues in situations marked by uncertainty. However, statistical inferences from data depend on assumptions about the process which generated these data. If the assumptions do not hold, the inferences may not be reliable either. This limitation is often ignored by applied workers who fail to identify crucial assumptions or subject them to any kind of empirical testing. In such circumstances, using statistical procedures may only compound the uncertainty." (David A Freedman & William C Navidi, "Regression Models for Adjusting the 1980 Census", Statistical Science Vol. 1 (1), 1986)

"The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data." (John Tukey, "Sunset Salvo", The American Statistician Vol. 40 (1), 1986)

"Beware of the problem of testing too many hypotheses; the more you torture the data, the more likely they are to confess, but confessions obtained under duress may not be admissible in the court of scientific opinion." (Stephen M Stigler, "Neutral Models in Biology", 1987)

"[…] no good model ever accounted for all the facts, since some data was bound to be misleading if not plain wrong. A theory that did fit all the data would have been ‘carpentered’ to do this and would thus be open to suspicion." (Francis H C Crick, "What Mad Pursuit: A Personal View of Scientific Discovery", 1988)

"Physicists are all too apt to look for the wrong sorts of generalizations, to concoct theoretical models that are too neat, too powerful, and too clean. Not surprisingly, these seldom fit well with data. To produce a really good biological theory, one must try to see through the clutter produced by evolution to the basic mechanisms. What seems to physicists to be a hopelessly complicated process may have been what nature found simplest, because nature could build on what was already there." (Francis H C Crick, "What Mad Pursuit: A Personal View of Scientific Discovery", 1988)

"[...] to acknowledge the subjectivity inherent in the interpretation of data is to recognize the central role of statistical analysis as a formal mechanism by which new evidence can be integrated with existing knowledge. Such a view of statistics as a dynamic discipline is far from the common perception of a rather dry, automatic technology for processing data." (Donald A Berry, "Statistical Analysis and the Illusion of Objectivity", American Scientist Vol. 76, 1988)

"Randomness is a difficult notion for people to accept. When events come in clusters and streaks, people look for explanations and patterns. They refuse to believe that such patterns - which frequently occur in random data - could equally well be derived from tossing a coin. So it is in the stock market as well." (Burton G Malkiel, "A Random Walk Down Wall Street", 1989)

"Some methods, such as those governing the design of experiments or the statistical treatment of data, can be written down and studied. But many methods are learned only through personal experience and interactions with other scientists. Some are even harder to describe or teach. Many of the intangible influences on scientific discovery - curiosity, intuition, creativity - largely defy rational analysis, yet they are often the tools that scientists bring to their work." (Committee on the Conduct of Science, "On Being a Scientist", 1989)

"When evaluating a model, at least two broad standards are relevant. One is whether the model is consistent with the data. The other is whether the model is consistent with the ‘real world’." (Kenneth A Bollen, "Structural Equations with Latent Variables", 1989)
