10 September 2023

On Distributions I

"The state of a system at a given moment depends on two things - its initial state, and the law according to which that state varies. If we know both this law and this initial state, we have a simple mathematical problem to solve, and we fall back upon our first degree of ignorance. Then it often happens that we know the law and do not know the initial state. It may be asked, for instance, what is the present distribution of the minor planets? We know that from all time they have obeyed the laws of Kepler, but we do not know what was their initial distribution. In the kinetic theory of gases we assume that the gaseous molecules follow recti-linear paths and obey the laws of impact and elastic bodies; yet as we know nothing of their initial velocities, we know nothing of their present velocities. The calculus of probabilities alone enables us to predict the mean phenomena which will result from a combination of these velocities. This is the second degree of ignorance. Finally it is possible, that not only the initial conditions but the laws themselves are unknown. We then reach the third degree of ignorance, and in general we can no longer affirm anything at all as to the probability of a phenomenon. It often happens that instead of trying to discover an event by means of a more or less imperfect knowledge of the law, the events may be known, and we want to find the law; or that, instead of deducing effects from causes, we wish to deduce the causes."  (Henri Poincaré, "Science and Hypothesis", 1902)

"If the number of experiments be very large, we may have precise information as to the value of the mean, but if our sample be small, we have two sources of uncertainty: (I) owing to the 'error of random sampling' the mean of our series of experiments deviates more or less widely from the mean of the population, and (2) the sample is not sufficiently large to determine what is the law of distribution of individuals." (William S Gosset, "The Probable Error of a Mean", Biometrika, 1908)

"We know not to what are due the accidental errors, and precisely because we do not know, we are aware they obey the law of Gauss. Such is the paradox." (Henri Poincaré, "The Foundations of Science", 1913)

"The problems which arise in the reduction of data may thus conveniently be divided into three types: (i) Problems of Specification, which arise in the choice of the mathematical form of the population. (ii) When a specification has been obtained, problems of Estimation arise. These involve the choice among the methods of calculating, from our sample, statistics fit to estimate the unknow n parameters of the population. (iii) Problems of Distribution include the mathematical deduction of the exact nature of the distributions in random samples of our estimates of the parameters, and of other statistics designed to test the validity of our specification (tests of Goodness of Fit)." (Sir Ronald A Fisher, "Statistical Methods for Research Workers", 1925)

"An inference, if it is to have scientific value, must constitute a prediction concerning future data. If the inference is to be made purely with the help of the distribution theory of statistics, the experiments that constitute evidence for the inference must arise from a state of statistical control; until that state is reached, there is no universe, normal or otherwise, and the statistician’s calculations by themselves are an illusion if not a delusion. The fact is that when distribution theory is not applicable for lack of control, any inference, statistical or otherwise, is little better than a conjecture. The state of statistical control is therefore the goal of all experimentation. (William E Deming, "Statistical Method from the Viewpoint of Quality Control", 1939)

"Just as in applied statistics the crux of a problem is often the devising of some method of sampling that avoids bias, our problem is that of finding a probability assignment which avoids bias, while agreeing with whatever information is given. The great advance provided by information theory lies in the discovery that there is a unique, unambiguous criterion for the 'amount of uncertainty' represented by a discrete probability distribution, which agrees with our intuitive notions that a broad distribution represents more uncertainty than does a sharply peaked one, and satisfies all other conditions which make it reasonable." (Edwin T Jaynes, "Information Theory and Statistical Mechanics" I, 1956)

"Normality is a myth; there never has, and never will be, a normal distribution." (Roy C Geary, "Testing for Normality", Biometrika Vol. 34, 1947)

"[A] sequence is random if it has every property that is shared by all infinite sequences of independent samples of random variables from the uniform distribution." (Joel N Franklin, 1962)

"Mathematical statistics provides an exceptionally clear example of the relationship between mathematics and the external world. The external world provides the experimentally measured distribution curve; mathematics provides the equation (the mathematical model) that corresponds to the empirical curve. The statistician may be guided by a thought experiment in finding the corresponding equation." (Marshall J Walker, "The Nature of Scientific Thought", 1963)

No comments:

Post a Comment

Related Posts Plugin for WordPress, Blogger...

Occam's Razor = The Law of Parsimony (1500 - 1899)

"We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances. Therefore, to...