20 December 2020

On Noise III

"Economists should study financial markets as they actually operate, not as they assume them to operate - observing the way in which information is actually processed, observing the serial correlations, bonanzas, and sudden stops, not assuming these away as noise around the edges of efficient and rational markets." (Adair Turner, "Economics after the Crisis: Objectives and means", 2012)

"Finding patterns is easy in any kind of data-rich environment; that's what mediocre gamblers do. The key is in determining whether the patterns represent signal or noise." (Nate Silver, "The Signal and the Noise: Why So Many Predictions Fail-but Some Don't", 2012)

"The signal is the truth. The noise is what distracts us from the truth." (Nate Silver, "The Signal and the Noise: Why So Many Predictions Fail-but Some Don't", 2012)

"When some systems are stuck in a dangerous impasse, randomness and only randomness can unlock them and set them free. You can see here that absence of randomness equals guaranteed death. The idea of injecting random noise into a system to improve its functioning has been applied across fields. By a mechanism called stochastic resonance, adding random noise to the background makes you hear the sounds (say, music) with more accuracy." (Nassim N Taleb, "Antifragile: Things that gain from disorder", 2012)

"A signal is a useful message that resides in data. Data that isn’t useful is noise. […] When data is expressed visually, noise can exist not only as data that doesn’t inform but also as meaningless non-data elements of the display (e.g. irrelevant attributes, such as a third dimension of depth in bars, color variation that has no significance, and artificial light and shadow effects)." (Stephen Few, "Signal: Understanding What Matters in a World of Noise", 2015)

"Data contain descriptions. Some are true, some are not. Some are useful, most are not. Skillful use of data requires that we learn to pick out the pieces that are true and useful. [...] To find signals in data, we must learn to reduce the noise - not just the noise that resides in the data, but also the noise that resides in us. It is nearly impossible for noisy minds to perceive anything but noise in data." (Stephen Few, "Signal: Understanding What Matters in a World of Noise", 2015)

"When we find data quality issues due to valid data during data exploration, we should note these issues in a data quality plan for potential handling later in the project. The most common issues in this regard are missing values and outliers, which are both examples of noise in the data." (John D Kelleher et al, "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, worked examples, and case studies", 2015)

"Information theory leads to the quantification of the information content of the source, as denoted by entropy, the characterization of the information-bearing capacity of the communication channel, as related to its noise characteristics, and consequently the establishment of the relationship between the information content of the source and the capacity of the channel. In short, information theory provides a quantitative measure of the information contained in message signals and help determine the capacity of a communication system to transfer this information from source to sink over a noisy channel in a reliable fashion." (Ali Grami, "Information Theory", 2016)

"Repeated observations of the same phenomenon do not always produce the same results, due to random noise or error. Sampling errors result when our observations capture unrepresentative circumstances, like measuring rush hour traffic on weekends as well as during the work week. Measurement errors reflect the limits of precision inherent in any sensing device. The notion of signal to noise ratio captures the degree to which a series of observations reflects a quantity of interest as opposed to data variance. As data scientists, we care about changes in the signal instead of the noise, and such variance often makes this problem surprisingly difficult." (Steven S Skiena, "The Data Science Design Manual", 2017)

"Using noise (the uncorrelated variables) to fit noise (the residual left from a simple model on the genuinely correlated variables) is asking for trouble." (Steven S Skiena, "The Data Science Design Manual", 2017)

No comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...

On Data: Longitudinal Data

  "Longitudinal data sets are comprised of repeated observations of an outcome and a set of covariates for each of many subjects. One o...