10 September 2023

On Distributions III

"Linear regression assumes that in the population a normal distribution of error values around the predicted Y is associated with each X value, and that the dispersion of the error values for each X value is the same. The assumptions imply normal and similarly dispersed error distributions." (Fred C Pampel, "Linear Regression: A primer", 2000)

"The principle of maximum entropy is employed for estimating unknown probabilities (which cannot be derived deductively) on the basis of the available information. According to this principle, the estimated probability distribution should be such that its entropy reaches maximum within the constraints of the situation, i.e., constraints that represent the available information. This principle thus guarantees that no more information is used in estimating the probabilities than available." (George J Klir & Doug Elias, "Architecture of Systems Problem Solving" 2nd Ed, 2003) 

"The principle of minimum entropy is employed in the formulation of resolution forms and related problems. According to this principle, the entropy of the estimated probability distribution, conditioned by a particular classification of the given events (e.g., states of the variable involved), is minimum subject to the constraints of the situation. This principle thus guarantees that all available information is used, as much as possible within the given constraints (e.g., required number of states), in the estimation of the unknown probabilities." (George J Klir & Doug Elias, "Architecture of Systems Problem Solving" 2nd Ed, 2003)

"In the laws of probability theory, likelihood distributions are fixed properties of a hypothesis. In the art of rationality, to explain is to anticipate. To anticipate is to explain." (Eliezer S. Yudkowsky, "A Technical Explanation of Technical Explanation", 2005)

"The central limit theorem says that, under conditions almost always satisfied in the real world of experimentation, the distribution of such a linear function of errors will tend to normality as the number of its components becomes large. The tendency to normality occurs almost regardless of the individual distributions of the component errors. An important proviso is that several sources of error must make important contributions to the overall error and that no particular source of error dominate the rest." (George E P Box et al, "Statistics for Experimenters: Design, discovery, and innovation" 2nd Ed., 2005)

"Two things explain the importance of the normal distribution: (1) The central limit effect that produces a tendency for real error distributions to be 'normal like'. (2) The robustness to nonnormality of some common statistical procedures, where 'robustness' means insensitivity to deviations from theoretical normality." (George E P Box et al, "Statistics for Experimenters: Design, discovery, and innovation" 2nd Ed., 2005)

"Traditional statistics is strong in devising ways of describing data and inferring distributional parameters from sample. Causal inference requires two additional ingredients: a science-friendly language for articulating causal knowledge, and a mathematical machinery for processing that knowledge, combining it with data and drawing new causal conclusions about a phenomenon." (Judea Pearl, "Causal inference in statistics: An overview", Statistics Surveys 3, 2009)

"The elements of this cloud of uncertainty (the set of all possible errors) can be described in terms of probability. The center of the cloud is the number zero, and elements of the cloud that are close to zero are more probable than elements that are far away from that center. We can be more precise in this definition by defining the cloud of uncertainty in terms of a mathematical function, called the probability distribution." (David S Salsburg, "Errors, Blunders, and Lies: How to Tell the Difference", 2017)

No comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...

On Data: Longitudinal Data

  "Longitudinal data sets are comprised of repeated observations of an outcome and a set of covariates for each of many subjects. One o...