09 April 2023

Alan Graham - Collected Quotes

"An essential feature of mathematics and statistics, particularly at a higher level, is the use of shorthand notation for a variety of concepts and measures. While this can be a strength in terms of providing conciseness and precision, statistical notation often proves to be an obstacle for learners in the early stages of learning." (Alan Graham, "Developing Thinking in Statistics", 2006)

"A feature shared by both the range and the interquartile range is that they are each calculated on the basis of just two values - the range uses the maximum and the minimum values, while the IQR uses the two quartiles. The standard deviation, on the other hand, has the distinction of using, directly, every value in the set as part of its calculation. In terms of representativeness, this is a great strength. But the chief drawback of the standard deviation is that, conceptually, it is harder to grasp than other more intuitive measures of spread." (Alan Graham, "Developing Thinking in Statistics", 2006)

 "A useful feature of a stem plot is that the values maintain their natural order, while at the same time they are laid out in a way that emphasises the overall distribution of where the values are concentrated (that is, where the longer branches are). This enables you easily to pick out key values such as the median and quartiles." (Alan Graham, "Developing Thinking in Statistics", 2006)

"[…] an outlier is an observation that lies an 'abnormal' distance from other values in a batch of data. There are two possible explanations for the occurrence of an outlier. One is that this happens to be a rare but valid data item that is either extremely large or extremely small. The other is that it isa mistake – maybe due to a measuring or recording error." (Alan Graham, "Developing Thinking in Statistics", 2006)

"Cleverly drawn pictures can sometimes disguise or render invisible what is there. At other times, they can make you see things that are not really there. It is helpful to be aware of how these illusions are achieved, as some of the illusionist’s 'tricks of the trade' can also be found in distortions used in graphs and diagrams." (Alan Graham, "Developing Thinking in Statistics", 2006)

"Exploratory Data Analysis is more than just a collection of data-analysis techniques; it provides a philosophy of how to dissect a data set. It stresses the power of visualisation and aspects such as what to look for, how to look for it and how to interpret the information it contains. Most EDA techniques are graphical in nature, because the main aim of EDA is to explore data in an open-minded way. Using graphics, rather than calculations, keeps open possibilities of spotting interesting patterns or anomalies that would not be apparent with a calculation (where assumptions and decisions about the nature of the data tend to be made in advance)." (Alan Graham, "Developing Thinking in Statistics", 2006) 

"One of the apparent paradoxes in probability is that, while the outcome of the next roll of a die or toss of a coin may be unpredictable, there are nevertheless underlying patterns in the outcomes overall. Specifically, when a fair die is rolled many times, there is a 'settling down' effect as the proportion of each outcome (1, 2, 3, …, 6) gradually approaches 1/6. In the limiting case, as the number of rolls reaches infinity, the shape of the probability distribution becomes uniform." (Alan Graham, "Developing Thinking in Statistics", 2006)

"One feature of probability is that the likelihood of a particular event can sometimes change as a result of some earlier event having taken place. For example, if you are drawing one ball at a time, without replacement, from a bag containing, say, three white balls and two red balls, then the probabilities of the various outcomes at each stage will vary, depending on which balls have already been removed. Contrast this with sampling with replacement, where the probabilities remain fixed." (Alan Graham, "Developing Thinking in Statistics", 2006)

"People sometimes appeal to the ‘law of averages’ to justify their faith in the gambler’s fallacy. They may reason that, since all outcomes are equally likely, in the long run they will come out roughly equal in frequency. However, the next throw is very much in the short run and the coin, die or roulette wheel has no memory of what went before." (Alan Graham, "Developing Thinking in Statistics", 2006)

"People tend to give greater weight to the data that they have just been exposed to than other relevant data. […] This phenomenon, where people give greater attention to recent or easily available data, is often referred to as an availability error." (Alan Graham, "Developing Thinking in Statistics", 2006)

"Probability is about making decisions under uncertainty - indeed, where there is no uncertainty, no decision is required, as you would simply choose the outcome that you know will occur. A 'good' or 'rational' decision favours the Cartesian principle that ‘when it is not in our power to follow what is true, we ought to follow what is most probable’. Of course, rational decisions sometimes turn out to be wrong. That does not mean that the decisions were bad - they may have been the best choices, given the information available at the time. […] In the long run, the vagaries of chance tend to even out, but in particular cases it can happen that the long shot comes in first. This is the corollary of a 'good' decision that has bad consequences - a 'bad' or 'irrational' decision that turns out to be right." (Alan Graham, "Developing Thinking in Statistics", 2006) 

"Random number generators do not always need to be symmetrical. This misconception of assuming equal likelihood for each outcome is fostered in a restricted learning environment, where learners see only such situations (that is, dice, coins and spinners). It is therefore very important for learners to be aware of situations where the different outcomes are not equally likely (as with the drawing-pins example)." (Alan Graham, "Developing Thinking in Statistics", 2006)

"'Regression to the mean' describes a natural phenomenon whereby, after a short period of success, things tend to return to normal immediately afterwards. This notion applies particularly to random events." (Alan Graham, "Developing Thinking in Statistics", 2006)

"The notion of outcomes covering a space is a very useful mental image, as it ties in strongly with the use of Venn diagrams and tables for clarifying the nature of possible events resulting from a trial. There are two important aspects to this. First, when enumerating the various outcomes that comprise an event, the number of (equally. likely) outcomes should correspond, visually, with the area of that part of the diagram represented by the event in question – the greater the probability, the larger the area. Secondly, where events overlap (for example, when rolling a die, consider the two events 'getting an even score' and 'getting a score greater than 2' ), the various regions in the Venn diagram help to clarify the various combinations of events that might occur." (Alan Graham, "Developing Thinking in Statistics", 2006)

"Unlike in mathematics, where relationships tend to be clearly defined and unambiguous, statistical relationships tend to reflect the general messiness of the real world from which the data were drawn." (Alan Graham, "Developing Thinking in Statistics", 2006)

"Use of a histogram should be strictly reserved for continuous numerical data or for data that can be effectively modelled as continuous […]. Unlike bar charts, therefore, the bars of a histogram corresponding to adjacent intervals should not have gaps between them, for obvious reasons." (Alan Graham, "Developing Thinking in Statistics", 2006)

"What sets statistics apart from the rest of mathematics is that in statistics events occur under conditions of uncertainty. Whereas in pure mathematics all even numbers possess the property of evenness, a statistical variable may take a range of different values that are usually unpredictable in advance." (Alan Graham, "Developing Thinking in Statistics", 2006)

"When it comes to drawing a picture of continuous data, you need to think through carefully where one interval ends and the next one begins. Failing to do this can result in overlaps or gaps between adjacent intervals, which can cause confusion." (Alan Graham, "Developing Thinking in Statistics", 2006)

"Where correlation exists, it is tempting to assume that one of the factors has caused the changes in the other (that is, that there is a cause-and-effect relationship between them). Although this may be true, often it is not. When an unwarranted or incorrect assumption is made about cause and effect, this is referred to as spurious correlation […]" (Alan Graham, "Developing Thinking in Statistics", 2006)

"Whereas regression is about attempting to specify the underlying relationship that summarises a set of paired data, correlation is about assessing the strength of that relationship. Where there is a very close match between the scatter of points and the regression line, correlation is said to be 'strong' or 'high' . Where the points are widely scattered, the correlation is said to be 'weak' or 'low'." (Alan Graham, "Developing Thinking in Statistics", 2006)

05 April 2023

On Noise IV

"Experiments usually are looking for 'signals' of truth, and the search is always ham pered by 'noise' of one kind or another. In judging someone else's experimental results it's important to find out whether they represent a true signal or whether they are just so much noise." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"In a real experiment the noise present in a signal is usually considered to be the result of the interplay of a large number of degrees of freedom over which one has no control. This type of noise can be reduced by improving the experimental apparatus. But we have seen that another type of noise, which is not removable by any refinement of technique, can be present. This is what we have called the deterministic noise. Despite its intractability it provides us with a way to describe noisy signals by simple mathematical models, making possible a dynamical system approach to the problem of turbulence." (David Ruelle, "Chaotic Evolution and Strange Attractors: The statistical analysis of time series for deterministic nonlinear systems", 1989)

"Fitting is essential to visualizing hypervariate data. The structure of data in many dimensions can be exceedingly complex. The visualization of a fit to hypervariate data, by reducing the amount of noise, can often lead to more insight. The fit is a hypervariate surface, a function of three or more variables. As with bivariate and trivariate data, our fitting tools are loess and parametric fitting by least-squares. And each tool can employ bisquare iterations to produce robust estimates when outliers or other forms of leptokurtosis are present." (William S Cleveland, "Visualizing Data", 1993)

"Noise is a problem in most signals. [...] It's easy to see that noise is random; it fluctuates erratically with no pattern." (Barry R Parker, "Chaos in the Cosmos: The stunning complexity of the universe", 1996)

"Although the shape of chaos is nightmarish, its voice is oddly soothing. When played through a loudspeaker, chaos sounds like white noise, like the soft static that helps insomniacs fall asleep." (Steven Strogatz, "Sync: The Emerging Science of Spontaneous Order", 2003)

"Before you can even consider creating a data story, you must have a meaningful insight to share. One of the essential attributes of a data story is a central or main insight. Without a main point, your data story will lack purpose, direction, and cohesion. A central insight is the unifying theme (telos appeal) that ties your various findings together and guides your audience to a focal point or climax for your data story. However, when you have an increasing amount of data at your disposal, insights can be elusive. The noise from irrelevant and peripheral data can interfere with your ability to pinpoint the important signals hidden within its core." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"In addition to managing how the data is visualized to reduce noise, you can also decrease the visual interference by minimizing the extraneous cognitive load. In these cases, the nonrelevant information and design elements surrounding the data can cause extraneous noise. Poor design or display decisions by the data storyteller can inadvertently interfere with the communication of the intended signal. This form of noise can occur at both a macro and micro level." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"A defining feature of system noise is that it is unwanted, and we should stress right here that variability in judgments is not always unwanted." (Daniel Kahneman, "Noise: A Flaw in Human Judgment", 2021)

"A general property of noise is that you can recognize and measure it while knowing nothing about the target or bias." (Daniel Kahneman, "Noise: A Flaw in Human Judgment", 2021) 

"Bias and noise - systematic deviation and random scatter - are different components of error. […] To understand error in judgment, we must understand both bias and noise. Sometimes, as we will see, noise is the more important problem. But in public conversations about human error and in organizations all over the world, noise is rarely recognized. Bias is the star of the show. Noise is a bit player, usually offstage. […] Wherever you look at human judgments, you are likely to find noise. To improve the quality of our judgments, we need to overcome noise as well as bias." (Daniel Kahneman, "Noise: A Flaw in Human Judgment", 2021)

Robert Hooke - Collected Quotes

"All of us learn by experience. Except for pure deductive processes, everything we learn is from someone's experience. All experience is a sample from an immense range of possible experience that no one individual can ever take in. It behooves us to know what parts of the information we get from samples can be trusted and what cannot." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

 "Being experimental, however, doesn't necessarily make a scientific study entirely credible. One weakness of experimental work is that it can be out of touch with reality when its controls are so rigid that conclusions are valid only in the experimental situation and don't carryover into the real world." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"Correlation analysis is a useful tool for uncovering a tenuous relationship, but it doesn't necessarily provide any real understanding of the relationship, and it certainly doesn't provide any evidence that the relationship is one of cause and effect. People who don't understand correlation tend to credit it with being a more fundamental approach than it is." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"Experiments usually are looking for 'signals' of truth, and the search is always ham pered by 'noise' of one kind or another. In judging someone else's experimental results it's important to find out whether they represent a true signal or whether they are just so much noise." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

 "First and foremost an experiment should have a goal, and the goal should be something worth achieving, especially if the experimenter is working on someone else's (for example, the taxpayers') money. 'Worth achieving' implies more than just beneficial; it also should mean that the experiment is the most beneficial thing we can think of doing. Obviously we can't predict accurately the value of an experiment (this may not even be possible after we see how it turns out), but we should feel obliged to make as intelligent a choice as we can. Such a choice is sometimes labeled a 'value judgment'." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"In general a small-scale test or experiment will not detect a small effect, or small differences among various products." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"Mistakes arising from retrospective data analysis led to the idea of experimentation, and experience with experimentation led to the idea of controlled experiments and then to the proper design of experiments for efficiency and credibility. When someone is pushing a conclusion at you, it's a good idea to ask where it came from - was there an experiment, and if so, was it controlled and was it relevant?" (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"One important way of developing our powers of discrimination between good and bad statistical studies is to learn about the differences between backward-looking (retrospective or historical) data and data obtained through carefully planned and controlled (forward-looking) experiments." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"Only a 0 correlation is uninteresting, and in practice 0 correlations do not occur. When you stuff a bunch of numbers into the correlation formula, the chance of getting exactly 0, even if no correlation is truly present, is about the same as the chance of a tossed coin ending up on edge instead of heads or tails."

"Randomization is usually a cheap and harmless way of improving the effectiveness of experimentation with very little extra effort." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"Science usually amounts to a lot more than blind trial and error. Good statistics consists of much more than just significance tests; there are more sophisticated tools available for the analysis of results, such as confidence statements, multiple comparisons, and Bayesian analysis, to drop a few names. However, not all scientists are good statisticians, or want to be, and not all people who are called scientists by the media deserve to be so described." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"Statistical reasoning is such a fundamental part of experimental science that the study of principles of data analysis has become a vital part of the scientist's education. Furthermore, […] the existence of a lot of data does not necessarily mean that any useful information is there ready to be extracted." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"The idea of statistical significance is valuable because it often keeps us from announcing results that later turn out to be nonresults. A significant result tells us that enough cases were observed to provide reasonable assurance of a real effect. It does not necessarily mean, though, that the effect is big enough to be important." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"Today's scientific investigations are so complicated that even experts in related fields may not understand them well. But there is a logic in the planning of experiments and in the analysis of their results that all intelligent people can grasp, and this logic is a great help in determining when to believe what we hear and read and when to be skeptical. This logic has a great deal to do with statistics, which is why statisticians have a unique interest in the scientific method, and why some knowledge of statistics can so often be brought to bear in distinguishing good arguments from bad ones." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

"When a real situation involves chance we have to use probability mathematics to understand it quantitatively. Direct mathematical solutions sometimes exist […] but most real systems are too complicated for direct solutions. In these cases the computer, once taught to generate random numbers, can use simulation to get useful answers to otherwise impossible problems." (Robert Hooke, "How to Tell the Liars from the Statisticians", 1983)

02 April 2023

On Diagrams: Venn Diagrams

"[...] for merely theoretical purposes the rule of formation would be very simple. It would merely be to begin by drawing any closed figure, and then proceed [sic] to draw others, subject to the one condition that each is to intersect once and once only all the existing subdivisions produced by those which had gone before." (John Venn, "On the Diagrammatic and Mechanical Representation of Propositions and Reasonings", 1880)

"[…] it must be noticed that these diagrams do not naturally harmonize with the propositions of ordinary life or ordinary logic. […] The great bulk of the propositions which we commonly meet with are founded, and rightly founded, on an imperfect knowledge of the actual mutual relations of the implied classes to one another. […] one very marked characteristic about these circular diagrams is that they forbid the natural expression of such uncertainty, and are therefore only directly applicable to a very small number of such propositions as we commonly meet with." (John Venn, "On the Diagrammatic and Mechanical Representation of Propositions and Reasonings", 1880)

"[...] we can not readily break up a complicated problem into successive steps which can be taken independently. We have, in fact, to solve the problem first, by determining what are the actual mutual relations of the classes involved, and then to draw the circles to represent this final result; we cannot work step-by-step towards the conclusion by aid of our figures." (John Venn, "On the Diagrammatic and Mechanical Representation of Propositions and Reasonings", 1880)

"Whereas the Eulerian plan endeavoured at once and directly to represent propositions, or relations of class terms to one another, we shall find it best to begin by representing only classes, and then proceed to modify these in some way so as to make them indicate what our propositions have to say. How, then, shall we represent all the subclasses which two or more class terms can produce? Bear in mind that what we have to indicate is the successive duplication of the number of subdivisions produced by the introduction of each successive term. and we shall see our way to a very important departure from the Eulerian conception. All that we have to do is to draw our figures, say circles, so that each successive one which we introduce shall intersect once, and once only, all the subdivisions already existing, and we then have what may be called a general framework indicating every possible combination producible by the given class terms." (John Venn, "On the Diagrammatic and Mechanical Representation of Propositions and Reasonings", 1880)

"It will be found that there is a tendency for the resultant outlines thus successively drawn to assume a comb-like shape after the first four or five [...]The fifth-term figure will have two teeth, the sixth four, and so on. [...] There is no trouble in drawing such a diagram for any number of terms which our paper will find room for. But, as has already been repeatedly remarked, the visual aid for which mainly such diagrams exist is soon lost on such a path."  (John Venn, "Symbolic Logic", [footnote], 1881)

"There is no need here to exhibit such figures, as they would probably be distasteful to any but the mathematician, and he would see his way to drawing them readily enough for himself [...]" (John Venn, "Symbolic Logic", 1881)

"We endeavour to employ only symmetrical figures, such as should not only be an aid to reasoning, through the sense of sight, but should also be to some extent elegant in themselves." (John Venn, "Symbolic Logic", 1881)

"At the basis of our Symbolic Logic, however represented, whether by words by letters or by diagrams, we shall always find the same state of things. What we ultimately have to do is to break up the entire field before us into a definite number of classes or compartments which are mutually exclusive and collectively exhaustive." (John Venn, "Symbolic Logic" 2nd Ed., 1894)

"The best way of introducing this question will be to enquire a little more strictly whether it is really classes that we thus represent, or merely compartments into which classes may be put? […] The most accurate answer is that our diagrammatic subdivisions, or for that matter our symbols generally, stand for compartments and not for classes. We may doubtless regard them as representing the latter, but if we do so we should never fail to keep in mind the proviso, 'if there be such things in existence'. And when this condition is insisted upon, it seems as if we expressed our meaning best by saying that what our symbols stand for are compartments which may or may not happen to be occupied." (John Venn, "Symbolic Logic" 2nd Ed., 1894)

 "My Method of Diagrams resembles Mr. Venn's, in having separate Compartments assigned to the various Classes, and in marking these Compartments as occupied or as empty; but it differs from his Method, in assigning a closed area to the Universe of Discourse, so that the Class which, under Mr. Venn's liberal sway, has been ranging at will through Infinite Space, is suddenly dismayed to find itself "cabin'd, cribb'd, confined" in a limited Cell like any other Class! Also I use rectilinear, instead of curvilinear Figures" (Charles Dogson [Lews Carroll], 1896)

"This is why a 'web' of notes with links (like references) between them is far more useful than a fixed hierarchical system. When describing a complex system, many people resort to diagrams with circles and arrows. Circles and arrows leave one free to describe the interrelationships between things in a way that tables, for example, do not. The system we need is like a diagram of circles and arrows, where circles and arrows can stand for anything." (Tim Berners-Lee, "Information Management: A Proposal", 1989)

"Venn diagrams are widely used to solve problems in set theory and to test the validity of syllogisms in logic. […] However, it is a fact that Venn diagrams are not considered valid proofs, but heuristic tools for finding valid formal proofs." (Sun-Joo Shin, "Situation-Theoretic Account of Valid Reasoning with Venn Diagrams", [in "Logical Reasoning with Diagrams"], 1996)

"Venn diagrams provide us with a formalism that consists of a standardized system of representations, together with rules for manipulating them. In this regard, they could be considered a primitive visual analog of the formal systems of deduction developed in logic." (Jon Barwise & John Etchemendy, "Visual Information and Valid Reasoning", [in "Logical Reasoning with Diagrams"], 1996)

"A Venn diagram is a simple representation of the sample space, that is often helpful in seeing 'what is going on'. Usually the sample space is represented by a rectangle, with individual regions within the rectangle representing events. It is Often helpful to imagine that the actual areas Of the various regions in a Venn diagram are in proportion to the corresponding probabilities. However, there is no need to spend a long time drawing these diagrams - their use is simply as a reminder of what is happening." (Graham Upton & Ian Cook, "Introducing Statistics", 2001)

"Two types of graphic organizers are commonly used for comparison: the Venn diagram and the comparison matrix [...] the Venn diagram provides students with a visual display of the similarities and differences between two items. The similarities between elements are listed in the intersection between the two circles. The differences are listed in the parts of each circle that do not intersect. Ideally, a new Venn diagram should be completed for each characteristic so that students can easily see how similar and different the elements are for each characteristic used in the comparison." (Robert J. Marzano et al, "Classroom Instruction that Works: Research-based strategies for increasing student achievement, 2001)

"It is a curious fact that if you draw an endless line on a piece of paper so that it cuts itself any number of times (but never cuts itself more than once at the same point), then you can color the resulting regions using only two colors without any adjoining regions being the same color. [...] Venn diagrams also possess this property, but for a separate reason, which at first sight seems to be nicely demonstrated by induction." (Anthony W F Edwards, "Cogwheels of the mind: The story of Venn diagrams", 2004)

"The notion of outcomes covering a space is a very useful mental image, as it ties in strongly with the use of Venn diagrams and tables for clarifying the nature of possible events resulting from a trial. There are two important aspects to this. First, when enumerating the various outcomes that comprise an event, the number of (equally. likely) outcomes should correspond, visually, with the area of that part of the diagram represented by the event in question - the greater the probability, the larger the area. Secondly, where events overlap (for example, when rolling a die, consider the two events 'getting an even score' and 'getting a score greater than 2' ), the various regions in the Venn diagram help to clarify the various combinations of events that might occur." (Alan Graham, "Developing Thinking in Statistics", 2006)

"Venn diagrams visually ground symbolic logic and abstract set operations. They do not ground probability. Their common overuse in introducing probability, especially in teaching, can have undesirable consequences." (R W Oldford & W H Cherry, "Picturing Probability: the poverty of Venn diagrams, the richness of Eikosograms", 2006)

"Venn diagramming, it turns out, is a very effective technique for performing syllogistic reasoning. Its chief advantage (over the Euler graph in particular as we noted earlier) is the ability to incrementally add knowledge to the diagram. While an Euler graph has visual power in terms of representing the relations between sets very intuitively, it is impossible to combine more than one piece of information onto a Euler graph. A Venn diagram, on the other hand, easily lends itself to the representation of partial knowledge and can be manipulated to add successively more knowledge to the diagram. This means that when our knowledge of the relations between sets increases, we simply put in more symbols and shadings into the appropriate compartments of the Venn diagram. Thus we are able to accumulate knowledge in a Venn diagram. This capability turns out to be a powerful feature, one that endows Venn diagrams with a more dynamic quality that is sorely lacking in the Euler system." (Robbie T Nakatsu, "Diagrammatic Reasoning in AI", 2010)

Florence Nightingale - Collected Quotes

"Diagrams are of great utility for illustrating certain questions of vital statistics by conveying ideas on the subject through the eye, which cannot be so readily grasped when contained in figures." (Florence Nightingale, "Mortality of the British Army", 1857)

"Whenever I am infuriated, I revenge myself with a new Diagram." (Florence Nightingale, [letter to Sidney Herbert] 1857)

"But law is no explanation of anything; law is simply a generalization, a category of facts. Law is neither a cause, nor a reason, nor a power, nor a coercive force. It is nothing but a general formula, a statistical table." (Florence Nightingale, "Suggestions for Thought", 1860)

"Newton's law is nothing but the statistics of gravitation, it has no power whatever. Let us get rid of the idea of power from law altogether. Call law tabulation of facts, expression of facts, or what you will; anything rather than suppose that it either explains or compels."(Florence Nightingale, "Suggestions for Thought", 1860)

"Again I must repeat my objections to intermingling causation with statistics. It might be to a certain extent admissible if you had no sanitary head. But you have one, & his report should be quite separate. The statistician has nothing to do with causation: he is almost certain in the present state of knowledge to err." (Florence Nightingale, [letter] 1861)

"All do statistics, some on paper, some by memory. Those who fail take care to give no statistics. Among those who succeed or think they have succeeded are some of small or accidental experience." (Florence Nightingale) 

"All sciences of observation depend upon statistical methods; without these [they] are blind empiricism. Make your facts comparable before deducing causes." (Florence Nightingale) 

"Only by consulting the past can the statesman judge for the future, recognize the elements necessary to realize plans, appreciate what needs reform." (Florence Nightingale

"Statistics are necessary to appreciate the effects of law." (Florence Nightingale) 

"To understand God's thoughts we must study statistics, for these are the measure of His purpose." (Florence Nightingale)  [attributed]

01 April 2023

Charles Livingston - Collected Quotes

"Cautions about combining groups: apples and oranges. In computing an average, be careful about combining groups in which the average for each group is of more interest than the overall average. […] Avoid combining distinct quantities in a single average." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"Central tendency is the formal expression for the notion of where data is centered, best understood by most readers as 'average'. There is no one way of measuring where data are centered, and different measures provide different insights." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"Concluding that the population is becoming more centralized by observing behavior at the extremes is called the 'Regression to the Mean' Fallacy. […] When looking for a change in a population, do not look only at the extremes; there you will always find a motion to the mean. Look at the entire population." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"Data often arrive in raw form, as long lists of numbers. In this case your job is to summarize the data in a way that captures its essence and conveys its meaning. This can be done numerically, with measures such as the average and standard deviation, or graphically. At other times you find data already in summarized form; in this case you must understand what the summary is telling, and what it is not telling, and then interpret the information for your readers or viewers." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"If a hypothesis test points to rejection of the alternative hypothesis, it might not indicate that the null hypothesis is correct or that the alternative hypothesis is false." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"Limit a sentence to no more than three numerical values. If you've got more important quantities to report, break those up into other sentences. More importantly, however, make sure that each number is an important piece of information. Which are the important numbers that truly advance the story?" (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"Numbers are often useful in stories because they record a recent change in some amount, or because they are being compared with other numbers. Percentages, ratios and proportions are often better than raw numbers in establishing a context." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"Probability is sometimes called the language of statistics. […] The probability of an event occurring might be described as the likelihood of it happening. […] In a formal sense the word "probability" is used only when an event or experiment is repeatable and the long term likelihood of a certain outcome can be determined." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"Roughly stated, the standard deviation gives the average of the differences between the numbers on the list and the mean of that list. If data are very spread out, the standard deviation will be large. If the data are concentrated near the mean, the standard deviation will be small." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"The basic idea of going from an estimate to an inference is simple. Drawing the conclusion with confidence, and measuring the level of confidence, is where the hard work of professional statistics comes in." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"The central limit theorem […] states that regardless of the shape of the curve of the original population, if you repeatedly randomly sample a large segment of your group of interest and take the average result, the set of averages will follow a normal curve." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"The dual meaning of the word significant brings into focus the distinction between drawing a mathematical inference and practical inference from statistical results." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"The percentage is one of the best (mathematical) friends a journalist can have, because it quickly puts numbers into context. And it's a context that the vast majority of readers and viewers can comprehend immediately." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)
