01 August 2021

On Variables (2000-2009)

"The greatest plus of data modeling is that it produces a simple and understandable picture of the relationship between the input variables and responses [...] different models, all of them equally good, may give different pictures of the relation between the predictor and response variables [...] One reason for this multiplicity is that goodness-of-fit tests and other methods for checking fit give a yes–no answer. With the lack of power of these tests with data having more than a small number of dimensions, there will be a large number of models whose fit is acceptable. There is no way, among the yes–no methods for gauging fit, of determining which is the better model." (Leo Breiman, "Statistical modeling: The two cultures", Statistical Science 16(3), 2001)

"Trimming potentially theoretically meaningful variables is not advisable unless one is quite certain that the coefficient for the variable is near zero, that the variable is inconsequential, and that trimming will not introduce misspecification error." (James Jaccard, "Interaction Effects in Logistic Regression", 2001)

"Probably the first clear insight into the deep nature of control […] was that it is not about pulling levers to produce intended and inexorable results. This notion of control applies only to trivial machines. It never applies to a total system that includes any kind of probabilistic element - from the weather, to people; from markets, to the political economy. No: the characteristic of a non-trivial system that is under control, is that despite dealing with variables too many to count, too uncertain to express, and too difficult even to understand, something can be done to generate a predictable goal. Wiener found just the word he wanted in the operation of the long ships of ancient Greece. At sea, the long ships battled with rain, wind and tides - matters in no way predictable. However, if the man operating the rudder kept his eye on a distant lighthouse, he could manipulate the tiller, adjusting continuously in real-time towards the light. This is the function of steersmanship. As far back as Homer, the Greek word for steersman was kubernetes, which transliterates into English as cybernetes." (Stafford Beer, "What is cybernetics?", Kybernetes, 2002)

"A conceptual model is simply a framework or schematic to understand the interaction of workforce education and development systems with other variables in a society." (Jay W Rojewski, "International Perspectives on Workforce Education and Development", 2004)

"A smaller model with fewer covariates has two advantages: it might give better predictions than a big model and it is more parsimonious (simpler). Generally, as you add more variables to a regression, the bias of the predictions decreases and the variance increases. Too few covariates yields high bias; this is called underfitting. Too many covariates yields high variance; this is called overfitting. Good predictions result from achieving a good balance between bias and variance. […] finding a good model involves trading off fit and complexity." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004)
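The trade-off Wasserman describes can be sketched numerically. The example below is only an illustration under stated assumptions, not anything taken from the book: the "true" relationship (a quadratic with Gaussian noise), the sample sizes, and the helper names `fit_polynomial`, `mse`, and `make_data` are all hypothetical, and the powers of x stand in for "more covariates". It fits polynomials of increasing degree by least squares and reports the error on the training sample and on a separate test sample.

```python
import random


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial coefficients via the normal equations."""
    n = degree + 1
    # Augmented matrix [A^T A | A^T y], where column k of A holds x**k.
    m = [
        [sum(x ** (r + c) for x in xs) for c in range(n)]
        + [sum(y * x ** r for x, y in zip(xs, ys))]
        for r in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[r][n] / m[r][r] for r in range(n)]


def mse(coeffs, xs, ys):
    """Mean squared error of the fitted polynomial on (xs, ys)."""
    preds = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)


def make_data(n, rng):
    """Hypothetical data: y = x^2 plus Gaussian noise (an assumption)."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [x ** 2 + rng.gauss(0, 0.2) for x in xs]
    return xs, ys


rng = random.Random(0)
train_x, train_y = make_data(20, rng)
test_x, test_y = make_data(200, rng)

results = {}
for degree in (0, 1, 2, 5):
    coeffs = fit_polynomial(train_x, train_y, degree)
    results[degree] = (mse(coeffs, train_x, train_y), mse(coeffs, test_x, test_y))
    print(f"degree={degree}  train MSE={results[degree][0]:.4f}  "
          f"test MSE={results[degree][1]:.4f}")
```

Because the models are nested, the training error can only fall as the degree grows; the test error is the quantity that reflects the balance between bias and variance, and in runs like this it typically stops improving once the model is more flexible than the data warrant.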

"Nonetheless, the basic principles regarding correlations between variables are not that difficult to understand. We must look for patterns that reveal potential relationships and for evidence that variables are actually related. But when we do spot those relationships, we should not jump to conclusions about causality. Instead, we need to weigh the strength of the relationship and the plausibility of our theory, and we must always try to discount the possibility of spuriousness." (Joel Best, "More Damned Lies and Statistics: How numbers confuse public issues", 2004)

"Humans have difficulty perceiving variables accurately […]. However, in general, they tend to have inaccurate perceptions of system states, including past, current, and future states. This is due, in part, to limited ‘mental models’ of the phenomena of interest in terms of both how things work and how to influence things. Consequently, people have difficulty determining the full implications of what is known, as well as considering future contingencies for potential systems states and the long-term value of addressing these contingencies." (William B Rouse, "People and Organizations: Explorations of Human-Centered Design", 2007)

"Swarm intelligence can be effective when applied to highly complicated problems with many nonlinear factors, although it is often less effective than the genetic algorithm approach discussed later in this chapter. Swarm intelligence is related to swarm optimization […]. As with swarm intelligence, there is some evidence that at least some of the time swarm optimization can produce solutions that are more robust than genetic algorithms. Robustness here is defined as a solution’s resistance to performance degradation when the underlying variables are changed." (Michael J North & Charles M Macal, "Managing Business Complexity: Discovering Strategic Solutions with Agent-Based Modeling and Simulation", 2007)

"We have to be aware that probabilities are relative to a level of observation, and that what is most probable at one level is not necessarily so at another. Moreover, a state is defined by an observer, being the conjunction of the values for all the variables or attributes that the observer considers relevant for the phenomenon being modeled. Therefore, we can have different degrees of order or ‘entropies’ for different models or levels of observation of the same entity." (Carlos Gershenson, "Design and Control of Self-organizing Systems", 2007)

"Graphical displays are often constructed to place principal focus on the individual observations in a dataset, and this is particularly helpful in identifying both the typical positions of data points and unusual or influential cases. However, in many investigations, principal interest lies in identifying the nature of underlying trends and relationships between variables, and so it is often helpful to enhance graphical displays in ways which give deeper insight into these features. This can be very beneficial both for small datasets, where variation can obscure underlying patterns, and large datasets, where the volume of data is so large that effective representation inevitably involves suitable summaries." (Adrian W Bowman, "Smoothing Techniques for Visualisation" [in "Handbook of Data Visualization"], 2008)

"System dynamics is a top-down approach for modelling system changes over time. Key state variables that define the behaviour of the system have to be identified and these are then related to each other through coupled, differential equations." (Peer-Olaf Siebers & Uwe Aickelin, "Introduction to Multi-Agent Simulation", 2008)
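As a rough illustration of what "state variables related through coupled differential equations" can look like in practice, here is a minimal sketch. It is not drawn from Siebers and Aickelin: the three-stock epidemic (SIR) model, the parameter values, and the function name `simulate_sir` are assumptions chosen for brevity, and the equations are integrated with a plain forward Euler step.

```python
def simulate_sir(beta=0.3, gamma=0.1, s0=0.99, i0=0.01, r0=0.0,
                 dt=0.1, steps=1000):
    """Forward-Euler integration of a hypothetical SIR stock-and-flow model.

    State variables (population fractions): s susceptible, i infected,
    r recovered. The coupled equations are
        ds/dt = -beta * s * i
        di/dt =  beta * s * i - gamma * i
        dr/dt =  gamma * i
    """
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    for _ in range(steps):
        infection = beta * s * i   # flow from susceptible to infected
        recovery = gamma * i       # flow from infected to recovered
        s += -infection * dt
        i += (infection - recovery) * dt
        r += recovery * dt
        history.append((s, i, r))
    return history


history = simulate_sir()
peak_infected = max(i for _, i, _ in history)
print(f"peak infected fraction: {peak_infected:.3f}")
print("final state (s, i, r):", tuple(round(v, 3) for v in history[-1]))
```

Each flow leaves one stock and enters another, so the three state variables always sum to the initial total; that conservation property is a convenient sanity check on any model built this way.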
