24 October 2024

Clay Helberg - Collected Quotes

"Another key element in making informative graphs is to avoid confounding design variation with data variation. This means that changes in the scale of the graphic should always correspond to changes in the data being represented." (Clay Helberg, "Pitfalls of Data Analysis (or How to Avoid Lies and Damned Lies)", 1995) 

"Another trouble spot with graphs is multidimensional variation. This occurs where two-dimensional figures are used to represent one-dimensional values. What often happens is that the size of the graphic is scaled both horizontally and vertically according to the value being graphed. However, this results in the area of the graphic varying with the square of the underlying data, causing the eye to read an exaggerated effect in the graph." (Clay Helberg, "Pitfalls of Data Analysis (or How to Avoid Lies and Damned Lies)", 1995) 

"It may be helpful to consider some aspects of statistical thought which might lead many people to be distrustful of it. First of all, statistics requires the ability to consider things from a probabilistic perspective, employing quantitative technical concepts such as 'confidence', 'reliability', 'significance'. This is in contrast to the way non-mathematicians often cast problems: logical, concrete, often dichotomous conceptualizations are the norm: right or wrong, large or small, this or that." (Clay Helberg, "Pitfalls of Data Analysis (or How to Avoid Lies and Damned Lies)", 1995) 

"[...] many non-mathematicians hold quantitative data in a sort of awe. They have been lead to believe that numbers are, or at least should be, unquestionably correct." (Clay Helberg, "Pitfalls of Data Analysis (or How to Avoid Lies and Damned Lies)", 1995) 

"Most statistical models assume error free measurement, at least of independent (predictor) variables. However, as we all know, measurements are seldom if ever perfect. Particularly when dealing with noisy data such as questionnaire responses or processes which are difficult to measure precisely, we need to pay close attention to the effects of measurement errors. Two characteristics of measurement which are particularly important in psychological measurement are reliability and validity." (Clay Helberg, "Pitfalls of Data Analysis (or How to Avoid Lies and Damned Lies)", 1995) 

"Remember that a p-value merely indicates the probability of a particular set of data being generated by the null model - it has little to say about the size of a deviation from that model (especially in the tails of the distribution, where large changes in effect size cause only small changes in p-values)." (Clay Helberg, "Pitfalls of Data Analysis (or How to Avoid Lies and Damned Lies)", 1995)

"There are a number of ways that statistical techniques can be misapplied to problems in the real world. Three of the most common hazards are designing experiments with insufficient power, ignoring measurement error, and performing multiple comparisons." (Clay Helberg, "Pitfalls of Data Analysis (or How to Avoid Lies and Damned Lies)", 1995)

"We can consider three broad classes of statistical pitfalls. The first involves sources of bias. These are conditions or circumstances which affect the external validity of statistical results. The second category is errors in methodology, which can lead to inaccurate or invalid results. The third class of problems concerns interpretation of results, or how statistical results are applied (or misapplied) to real world issues." (Clay Helberg, "Pitfalls of Data Analysis (or How to Avoid Lies and Damned Lies)", 1995) 

References:
[1] Clay Helberg, "Pitfalls of Data Analysis (or How to Avoid Lies and Damned Lies)", 1995 [link]

No comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...

On Data: Longitudinal Data

  "Longitudinal data sets are comprised of repeated observations of an outcome and a set of covariates for each of many subjects. One o...