12 December 2024

On Data: Longitudinal Data

 "Longitudinal data sets are comprised of repeated observations of an outcome and a set of covariates for each of many subjects. One objective of statistical analysis is to describe the marginal expectation of the outcome variable as a function of the covariates while accounting for the correlation among the repeated observations for a given subject." (Scott L Zeger & Kung-Yee Liang, "Longitudinal Data Analysis for Discrete and Continuous Outcomes", Biometrics Vol. 42(1), 1986)

"Longitudinal data sets in which the outcome variable cannot be transformed to be Gaussian are more difficult to analyze for two reasons. First, simple models for the conditional expectation of the outcome do not imply equally simple models for the marginal expectation, as is the case for Gaussian data. Hence, the analyst must choose to model either the marginal or conditional expectation. Second, likelihood analyses often lead to estimators of the regression coefficients which are consistent only when the time dependence is correctly specified." (Scott L Zeger & Kung-Yee Liang, "Longitudinal Data Analysis for Discrete and Continuous Outcomes", Biometrics Vol. 42(1), 1986)

"Longitudinal data comprise repeated observations over time on each of many individuals. Longitudinal data are in contrast to cross-sectional data where only a single response is available for each person. The statistical analysis of longitudinal data presents special opportunities and challenges because the repeated outcomes for one individual tend to be correlated with one another." (Scott L Zeger & Kung‐Yee Liang, "An overview of methods for the analysis of longitudinal data", Statistics in medicine vol. 11, 1992)

"We have two objectives for statistical models of longitudinal data: (1) to adopt the conventional regression tools, which relate the response variables to the explanatory variables; and (2) to account for the within subject correlation." (Scott L Zeger & Kung‐Yee Liang, "An overview of methods for the analysis of longitudinal data", Statistics in medicine vol. 11, 1992)

"Analysis of longitudinal data tends to be simpler because subjects can usually be assumed independent. Valid inferences can be made by borrowing strength across people. That is, the consistency of a pattern across subjects is the basis for substantive conclusions. For this reason, inferences from longitudinal studies can be made more robust to model assumptions than those from time series data, particularly to assumptions about the nature of the correlation." (Peter J Diggle et al, "Analysis of Longitudinal Data", 2002)

"The defining feature of a longitudinal data set is repeated observations on individuals enabling direct study of change. Longitudinal data require special statistical methods because the set of observations on one subject tends to be intercorrelated. This correlation must be taken into account to draw valid scientific inferences." (Peter J Diggle et al, "Analysis of Longitudinal Data", 2002)

Scott L Zeger - Collected Quotes

"Longitudinal data sets are comprised of repeated observations of an outcome and a set of covariates for each of many subjects. One objective of statistical analysis is to describe the marginal expectation of the outcome variable as a function of the covariates while accounting for the correlation among the repeated observations for a given subject." (Scott L Zeger & Kung-Yee Liang, "Longitudinal Data Analysis for Discrete and Continuous Outcomes", Biometrics Vol. 42(1), 1986)

"Longitudinal data sets in which the outcome variable cannot be transformed to be Gaussian are more difficult to analyze for two reasons. First, simple models for the conditional expectation of the outcome do not imply equally simple models for the marginal expectation, as is the case for Gaussian data. Hence, the analyst must choose to model either the marginal or conditional expectation. Second, likelihood analyses often lead to estimators of the regression coefficients which are consistent only when the time dependence is correctly specified." (Scott L Zeger & Kung-Yee Liang, "Longitudinal Data Analysis for Discrete and Continuous Outcomes", Biometrics Vol. 42(1), 1986)

"Statistical models are sometimes misunderstood in epidemiology. Statistical models for data are never true. The question whether a model is true is irrelevant. A more appropriate question is whether we obtain the correct scientific conclusion if we pretend that the process under study behaves according to a particular statistical model." (Scott Zeger, "Statistical reasoning in epidemiology", American Journal of Epidemiology, 1991)

"Statistical models for data are never true. The question whether a model is true is irrelevant. A more appropriate question is whether we obtain the correct scientific conclusion if we pretend that the process under study behaves according to a particular statistical model." (Scott Zeger, "Statistical reasoning in epidemiology", American Journal of Epidemiology, 1991)

"Statistical reasoning is based upon two simple precepts: (1) that natural processes can usefully be described by stochastic models and (2) that by studying apparently haphazard collections of autonomous individuals, one can discover, at a higher level, systematic patterns of potential scientific import." (Scott Zeger, "Statistical reasoning in epidemiology", American Journal of Epidemiology, 1991)

"The rise of statistical reasoning was a key step in the birth of many empirical sciences, especially epidemiology. The ability to focus on the aggregate behavior amidst apparently chaotic variation across autonomous individuals has dramatically increased our understanding of disease processes that affect the health of the public. Simple statistical models based upon the laws of probability provide the language for this population perspective." (Scott Zeger, "Statistical reasoning in epidemiology", American Journal of Epidemiology, 1991)

"Longitudinal data comprise repeated observations over time on each of many individuals. Longitudinal data are in contrast to cross-sectional data where only a single response is available for each person. The statistical analysis of longitudinal data presents special opportunities and challenges because the repeated outcomes for one individual tend to be correlated with one another." (Scott L Zeger & Kung‐Yee Liang, "An overview of methods for the analysis of longitudinal data", Statistics in medicine vol. 11, 1992)

"We have two objectives for statistical models of longitudinal data: (1) to adopt the conventional regression tools, which relate the response variables to the explanatory variables; and (2) to account for the within subject correlation." (Scott L Zeger & Kung‐Yee Liang, "An overview of methods for the analysis of longitudinal data", Statistics in medicine vol. 11, 1992)

09 December 2024

On Manifolds: Definitions

"A manifold, roughly, is a topological space in which some neighborhood of each point admits a coordinate system, consisting of real coordinate functions on the points of the neighborhood, which determine the position of points and the topology of that neighborhood; that is, the space is locally cartesian. Moreover, the passage from one coordinate system to another is smooth in the overlapping region, so that the meaning of 'differentiable' curve, function, or map is consistent when referred to either system." (Richard L Bishop & Samuel I Goldberg, "Tensor Analysis on Manifolds", 1968)

"A manifold M of dimension n, or n-manifold, is a topological space with the following properties: (i) M is Hausdorff, (ii) M is locally Euclidean of dimension n, and (iii) M has a countable basis of open sets." (William M Boothby, "An introduction to differentiable manifolds and Riemannian geometry" 2nd Ed., 1986)

"[...] a manifold is a set M on which 'nearness' is introduced (a topological space), and this nearness can be described at each point in M by using coordinates. It also requires that in an overlapping region, where two coordinate systems intersect, the coordinate transformation is given by differentiable transition functions." (Kenji Ueno & Toshikazu Sunada, "A Mathematical Gift, III: The Interplay Between Topology, Functions, Geometry, and Algebra", Mathematical World Vol. 23, 1996)

"A manifold Mn of dimension n is a Hausdorff topological space such that each point P of Mn has a neighborhood Ω homeomorphic to Rn (or equivalently to an open set of Rn." (Thierry Aubin, "A Course in Differential Geometry", 2000)

"Manifolds are a type of topological spaces we are interested in. They correspond well to the spaces we are most familiar with, the Euclidean spaces. Intuitively, a manifold is a topological space that locally looks like Rn. In other words, each point admits a coordinate system, consisting of coordinate functions on the points of the neighborhood, determining the topology of the neighborhood." (Afra J Zomorodian, "Topology for Computing", 2005)

"Roughly speaking, a manifold is essentially a space that is locally similar to the Euclidean space. This resemblance permits differentiation to be defined. On a manifold, we do not distinguish between two different local coordinate systems. Thus, the concepts considered are just those independent of the coordinates chosen. This makes more sense if we consider the situation from the physics point of view. In this interpretation, the systems of coordinates are systems of reference." (Ovidiu Calin & Der-Chen Chang,  "Geometric Mechanics on Riemannian Manifolds : Applications to partial differential equations", 2005)

"A manifold is an abstract mathematical space, which locally (i.e., in a close–up view) resembles the spaces described by Euclidean geometry, but which globally (i.e., when viewed as a whole) may have a more complicated structure." (Vladimir G Ivancevic & Tijana T Ivancevic, "Applied Differential Geometry: A Modern Introduction", 2007)

"A topological manifold of dimension k is a Hausdorff topological space M with a countable base such that for all x ∈ M, there exists an open neighborhood of x that is homeomorphic to an open set of Rk." (Stephen Lovett, "Differential Geometry of Manifolds", 2010)

"Roughly speaking, a manifold is a set whose points can be labeled by coordinates." (Gerardo F. Torres del Castillo, "Differentiable Manifolds: A Theoretical Physics Approach", 2010)

"You can very generally think of a manifold as a space which is locally Euclidian - that means that if you look closely enough at one small part of a manifold then it basically looks like Rn for some n." (Jon P Fortney, "A Visual Introduction to Differential Forms and Calculus on Manifolds", 2018)

02 December 2024

Occam's Razor = The Law of Parsimony (1500 - 1899)

"We are to admit no more causes of natural things than such as are both true and sufficient to explain their appearances. Therefore, to the same natural effects we must, as far as possible, assign the same causes." (Isaac Newton, "Philosophiæ Naturalis Principia Mathematica" ["Mathematical Principles of Natural Philosophy"], 1687) 

"Entia non sunt multiplicanda praeter necessitatem."
"Entities are not to be multiplied beyond what is necessary." (John Ponce, cca. 17th century)

"Parsimony is enough to make the master of the golden mines as poor as he that has nothing; for a man may be brought to a morsel of bread by parsimony as well as profusion." (Henry Home [Lord Kames] ," Introduction to the Art of Thinking", 1761)

"Mere parsimony is not economy. Expense, and great expense, may be an essential part in true economy." (Edmund Burke, "A Letter to a Noble Lord", 1796)

"It is, after all, a principle of logic not to multiply entities unnecessarily." (Antoine-Laurent Lavoisier, "Réflexions sur le phlogistique", 1862)

"The first obligation of Simplicity is that of using the simplest means to secure the fullest effect." (George H Lewes, "The Principles of Success in Literature", 1865)

"In no case may we interpret an action [of an animal] as the outcome of the exercise of a higher psychical faculty, if it can be interpreted as the outcome of the exercise of one which stands lower in the psychological scale." (Conwy Lloyd Morgan, "An Introduction to Comparative Psychology", 1894) [Morgan's canon, the principle of parsimony in animal research]

"The question is therefore to demonstrate all geometrical truths with the smallest possible number of assumptions." (Augustus de Morgan, "On the Study and Difficulties of Mathematics", 1898)

"Scientists must use the simplest means of arriving at their results and exclude everything not perceived by the senses." (Ernst Mach)

Occam's Razor = The Law of Parsimony (1950 - 1999)

"Nonbeing must in some sense be, otherwise what is it that there is not? This tangled doctrine might be nicknamed Plato's beard; historically it has proved tough, frequently dulling the edge of Occam's razor." (Willard van Orman Quine, "On What There Is" From a Logical Point of View: Nine Logico-Philosophical Essays", 1953)

"[…] the grand aim of all science […] is to cover the greatest possible number of empirical facts by logical deductions from the smallest possible number of hypotheses or axioms.” (Albert Einstein, 1954)

"The principle of parsimony is valid esthetically in that the artist must not go beyond what is needed for his purpose. (Rudolf Arnheim," Art and Visual Perception: A Psychology of the Creative Eye", 1954)

"Our craving for generality has [as one] source […] our preoccupation with the method of science. I mean the method the method of reducing the explanation of natural phenomena to the smallest possible number of primitive natural laws; and, in mathematics, of unifying the treatment of different topics by using a generalization." (Ludwig Wittgenstein, "The Blue and Brown Books", 1958)

"[…] entities must not be reduced to the point of inadequacy and, more generally, that it is in vain to try to do with fewer what requires more." (Karl Menger, "A Counterpart of Occam's Razor in Pure and Applied Mathematics Ontological Uses", Synthese Vol. 12 (4), 1960)

"Let us consider, for a moment, the world as described by the physicist. It consists of a number of fundamental particles which, if shot through their own space, appear as waves, and are thus [...] of the same laminated structure as pearls or onions, and other wave forms called electromagnetic which it is convenient, by Occam’s razor, to consider as travelling through space with a standard velocity. All these appear bound by certain natural laws which indicate the form of their relationship." (G Spencer-Brown, "Laws of Form", 1969)

"For if as scientists we seek simplicity, then obviously we try the simplest surviving theory first, and retreat from it only when it proves false. Not this course, but any other, requires explanation. If you want to go somewhere quickly, and several alternate routes are equally likely to be open, no one asks why you take the shortest. The simplest theory is to be chosen not because it is the most likely to be true but because it is scientifically the most rewarding among equally likely alternatives. We aim at simplicity and hope for truth." (Nelson Goodman, "Problems and Projects", 1972)

"As glimpsed by physicists, Nature's rules are simple, but also intricate: Different rules are subtly related to each other. The intricate relations between the rules produce interesting effects in many physical situations. [...] Nature's design is not only simple, but minimally so, in the sense that were the design any simpler, the universe would be a much duller place." (Anthony Zee, "Fearful Symmetry: The Search for Beauty in Modern Physics", 1986)

"A mechanistic model has the following advantages: 1. It contributes to our scientific understanding of the phenomenon under study. 2. It usually provides a better basis for extrapolation (at least to conditions worthy of further experimental investigation if not through the entire range of all input variables). 3. It tends to be parsimonious (i. e, frugal) in the use of parameters and to provide better estimates of the response." (George E P Box, "Empirical Model-Building and Response Surfaces", 1987)

"I seek […] to show that - other things being equal - the simplest hypothesis proposed as an explanation of phenomena is more likely to be the true one than is any other available hypothesis, that its predictions are more likely to be true than those of any other available hypothesis, and that it is an ultimate a priori epistemic principle that simplicity is evidence for truth." (Richard Swinburne, "Simplicity as Evidence for Truth", 1997)

"Were it not for Occam's Razor, which always demands simplicity, I'd be tempted to believe that human beings are more influenced by distant causes than immediate ones. This would especially be true of overeducated people, who are capable of thinking past the immediate, of becoming obsessed by the remote. It's the old stuff, the conflicts we've never come to terms with, that sneaks up on us, half forgotten, insisting upon action."(Richard Russo,"Straight Man", 1997)

"It is part of the lore of science that the most parsimonious explanation of observed facts is to be preferred over convoluted and long-winded theories. Ptolemaic epicycles gave way to the Copernican system largely on this premise, and in general, scientific inquiry is governed by the oft-quoted dictum of the medieval cleric William of Occam that 'nunquam ponenda est pluralitas sine necesitate' , which may be paraphrased as 'choose the simplest explanation for the observed facts' ." (Edward Beltrami, "What is Random?: Chaos and Order in Mathematics and Life", 1999)

Occam's Razor = The Law of Parsimony (2000-)

"A smaller model with fewer covariates has two advantages: it might give better predictions than a big model and it is more parsimonious (simpler). Generally, as you add more variables to a regression, the bias of the predictions decreases and the variance increases. Too few covariates yields high bias; this called underfitting. Too many covariates yields high variance; this called overfitting. Good predictions result from achieving a good balance between bias and variance. […] fiding a good model involves trading of fit and complexity." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004)

"Mathematics is not about abstract entities alone but is about relation of abstract entities with real entities. […] Adequacy relations between abstract and real entities provide space or opportunity where mathematical and logical thought operates parsimoniously." (Navjyoti Singh, "Classical Indian Mathematical Thought", 2005)

"The model theory postulates that mental models are parsimonious. They represent what is possible, but not what is impossible, according to assertions. This principle of parsimony minimizes the load on working memory, and so it applies unless something exceptional occurs to overrule it." (Philip N Johnson-Laird, Mental Models, Sentential Reasoning, and Illusory Inferences, [in "Mental Models and the Mind"], 2006)

"Two systems concepts lie at the disposal of the architect to reflect the beauty of harmony: parsimony and variety. The law of parsimony states that given several explanations of a specific phenomenon, the simplest is probably the best. […] On the other hand, the law of requisite variety states that for a system to survive in its environment the variety of choice that the system is able to make must equal or exceed the variety of influences that the environment can impose on the system." (John Boardman & Brian Sauser, "Systems Thinking: Coping with 21st Century Problems", 2008)

"What advantages do diagrams have over verbal descriptions in promoting system understanding? First, by providing a diagram, massive amounts of information can be presented more efficiently. A diagram can strip down informational complexity to its core - in this sense, it can result in a parsimonious, minimalist description of a system. Second, a diagram can help us see patterns in information and data that may appear disordered otherwise. For example, a diagram can help us see mechanisms of cause and effect or can illustrate sequence and flow in a complex system. Third, a diagram can result in a less ambiguous description than a verbal description because it forces one to come up with a more structured description." (Robbie T Nakatsu, "Diagrammatic Reasoning in AI", 2010)

"In my view, the argument from parsimony is really no argument at all - it typically functions only to shut down more interesting discussion. If history is any guide, it's never a good idea to assume that a scientific problem is cornered." (David Eagleman, "Incognito: The Secret Lives of the Brain", 2011)

"Scientists often talk of parsimony (as in 'the simplest explanation is probably correct', also known as Occam’s razor), but we should not get seduced by the apparent elegance of argument from parsimony; this line of reasoning has failed in the past at least as many times as it has succeeded. For example, it is more parsimonious to assume that the sun goes around the Earth, that atoms at the smallest scale operate in accordance with the same rules that objects at larger scales follow, and that we perceive what is really out there. All of these positions were long defended by argument from parsimony, and they were all wrong. In my view, the argument from parsimony is really no argument at all - it typically functions only to shut down more interesting discussion. If history is any guide, it’s never a good idea to assume that a scientific problem is cornered." (David Eagleman, "Incognito: The Secret Lives of the Brain", 2011)

"What can be done with fewer [assumptions] is done in vain with more." (Alan Baker, "Simplicity", The Stanford Encyclopedia of Philosophy, 2012)