"Data almost always contain uncertainty. This uncertainty may
arise from selection of the items to be measured, or it may arise from
variability of the measurement process. Drawing general conclusions from data
is the basis for increasing knowledge about the world, and is the basis for all
rational scientific inquiry. Statistical inference gives us methods and tools
for doing this despite the uncertainty in the data. The methods used for
analysis depend on the way the data were gathered. It is vitally important that
there is a probability model explaining how the uncertainty gets into the data."
"Independence of two events is not a property of the events
themselves, rather it is a property that comes from the probabilities of the
events and their intersection. This is in contrast to mutually exclusive
events, which have the property that they contain no elements in common. Two
mutually exclusive events each with non-zero probability cannot be independent.
Their intersection is the empty set, so it must have probability zero, which
cannot equal the product of the probabilities of the two events!"
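The distinction in the quote above can be checked on a small example. The sketch below assumes one roll of a fair six-sided die; the events and the helper names (`prob`, `is_independent`, `is_mutually_exclusive`) are illustrative choices, not taken from the quoted text.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
OUTCOMES = frozenset(range(1, 7))


def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))


def is_mutually_exclusive(a, b):
    """True when the two events share no outcomes."""
    return not (a & b)


def is_independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)


low = frozenset({1, 2})      # P = 1/3
mid = frozenset({3, 4})      # P = 1/3
even = frozenset({2, 4, 6})  # P = 1/2

# Mutually exclusive with non-zero probabilities: P(A and B) = 0, but P(A) * P(B) = 1/9.
print(is_mutually_exclusive(low, mid), is_independent(low, mid))

# Overlapping events: P(A and B) = 1/6 = (1/3) * (1/2), so they are independent.
print(is_mutually_exclusive(low, even), is_independent(low, even))
```

Exact fractions are used so that the product comparison is not affected by floating-point rounding.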
"Since we cannot completely eliminate uncertainty, we need to
model it. In real life when we are faced with uncertainty, we use plausible
reasoning. We adjust our belief about something, based on the occurrence or
nonoccurrence of something else."
"Statistics is the science that relates data to specific
questions of interest. This includes devising methods to gather data relevant
to the question, methods to summarize and display the data to shed light on the
question, and methods that enable us to draw answers to the question that are
supported by the data."
"The lack of direct control means the outside factors will be
affecting the data. There is a danger that the wrong conclusions could be drawn
from the experiment due to these uncontrolled outside factors. The important
statistical idea of randomization has been developed to deal with this
possibility. The unidentified outside factors can be 'averaged out' by randomly
assigning each unit to either treatment or control group. This contributes
variability to the data. Statistical conclusions always have some uncertainty
or error due to variability in the data. We can develop a probability model
of the data variability based on the randomization used. Randomization not only
reduces this uncertainty due to outside factors, it also allows us to measure
the amount of uncertainty that remains using the probability model.
Randomization lets us control the outside factors statistically, by averaging
out their effects."
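A small simulation can illustrate the "averaging out" described above. This is only a sketch under assumed numbers: the hidden factor, its weight of 3.0, the true effect of 2.0, and the function name `estimated_effect` are all hypothetical and do not come from the quoted text.

```python
import random
import statistics


def estimated_effect(assign_randomly, n_units=5000, true_effect=2.0, seed=0):
    """Difference in mean response between treated and control units.

    Each unit carries a hidden outside factor that also drives the response.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n_units):
        hidden = rng.gauss(0.0, 1.0)  # unidentified outside factor (assumed)
        if assign_randomly:
            # Randomization: assignment ignores the hidden factor.
            gets_treatment = rng.random() < 0.5
        else:
            # No randomization: units with a high hidden factor end up treated.
            gets_treatment = hidden > 0
        response = 3.0 * hidden + rng.gauss(0.0, 1.0)
        if gets_treatment:
            response += true_effect
            treated.append(response)
        else:
            control.append(response)
    return statistics.mean(treated) - statistics.mean(control)


print("randomized assignment:    ", round(estimated_effect(True), 2))
print("non-randomized assignment:", round(estimated_effect(False), 2))
```

With random assignment the hidden factor is spread roughly evenly over both groups, so the estimate should land near the assumed true effect. Without it, the hidden factor is confounded with the treatment and inflates the estimate.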
"The goal of scientific inquiry is to gain new knowledge
about the cause-and-effect relationship between a factor and a response
variable. We gather data to help us determine these relationships and to
develop mathematical models to explain them."
"The scientific method searches for cause-and-effect
relationships between an experimental variable and an outcome variable. In
other words, how changing the experimental variable results in a change to the
outcome variable. Scientific modeling develops mathematical models of these
relationships. Both of them need to isolate the experiment from outside factors
that could affect the experimental results. All outside factors that can be identified
as possibly affecting the results must be controlled."
"Variability in data solely due to chance can be averaged out
by increasing the sample size. Variability due to other causes cannot be."
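The following sketch contrasts the two kinds of error in the quote above. It assumes a measurement with random noise plus an optional fixed offset standing in for "other causes"; the true value of 10.0, the offset of 0.5, and the function name `sample_mean_error` are illustrative assumptions.

```python
import random
import statistics


def sample_mean_error(n, bias, true_value=10.0, noise_sd=1.0, seed=1):
    """Error of the sample mean of n readings relative to the true value.

    Each reading is true_value + bias + random noise, where bias is a
    systematic offset that is the same for every reading.
    """
    rng = random.Random(seed)
    readings = [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]
    return statistics.mean(readings) - true_value


for n in (10, 1000, 100000):
    chance_only = sample_mean_error(n, bias=0.0)
    with_offset = sample_mean_error(n, bias=0.5)
    print(f"n={n:>6}  chance only: {chance_only:+.3f}  with offset: {with_offset:+.3f}")
```

As the sample size grows, the chance-only error should shrink toward zero, while the error with the offset settles near the offset itself rather than disappearing.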