Sometimes it's
hard to get a real grasp of statistical concepts from what a textbook
says or from the material presented in a lecture. These applets provide
a step-by-step guide to some important statistical concepts and let you
actively explore and experiment on your own. To develop
a better feel for these concepts, try playing around with each
applet: change some parameters and see how the results are affected.
Some key terms used in the applets are:
mu = the population mean
var = the population variance
X-bar = the sample mean
s = the sample standard deviation
N = sample size
To exit any applet and return to this page simply click
on your browser's "Back" button.
Applet 1: Standardizing a normally distributed random variable
To calculate probabilities associated with a
normally distributed random variable (one that follows a symmetric bell-shaped
curve) from a printed table, you first convert it to the standard normal variable. Standardizing
subtracts the mean and divides by the standard deviation, Z = (X - mu) / sqrt(var).
This converts any normally distributed random variable into the
standard normal variable (Z), which has a mean of 0 and a variance
of 1. Your statistics book contains a standard normal table, from which
you can then determine any probability of interest. For an interactive
exercise on standardizing a normally distributed random variable, click
here.
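As a rough illustration of the standardizing step, here is a minimal Python sketch. The numbers (a mean of 100, a variance of 225, and the value 130) are made up for the example, and `standardize` is a hypothetical helper name, not something taken from the applet. The standard library's `NormalDist` plays the role of the standard normal table.

```python
from statistics import NormalDist

def standardize(x, mu, sigma):
    """Convert a value x from a Normal(mu, sigma^2) variable to its Z-score."""
    return (x - mu) / sigma

# Hypothetical example: X is normal with mu = 100 and var = 225 (sigma = 15).
mu, var = 100.0, 225.0
sigma = var ** 0.5

z = standardize(130.0, mu, sigma)   # (130 - 100) / 15
p_below = NormalDist().cdf(z)       # P(Z <= z), the value a Z table reports

print(f"z = {z:.2f}, P(X <= 130) = {p_below:.4f}")
```

The probability found for Z is the same one you would get by working with X directly, which is why one table is enough for every normal distribution.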
Applet 2: The Central Limit Theorem
The Central Limit Theorem is one of the most important
theorems in statistical theory. It states that as the sample size increases,
the distribution of the sample mean approaches a normal distribution,
regardless of the shape of the population distribution (provided the population variance is finite). This means that we can use the
normal distribution to describe the sample mean from any such population, even
a non-normal one, if we have a large enough sample. The general rule of
thumb is that you need a sample of at least 30 observations for the Central
Limit Theorem to apply (i.e., for the distribution of the sample mean to
be reasonably approximated by the normal distribution), although strongly
skewed populations can require more. For an
interactive exercise on the Central Limit Theorem, click
here.
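One way to see the theorem at work outside the applet is a small simulation. The sketch below is only an illustration under assumed settings: the population is Exponential with mean 1 (strongly right-skewed), 5000 samples are drawn for each sample size, and `sample_means` and `skewness` are hypothetical helper names.

```python
import random
import statistics

def sample_means(n, reps, rng):
    """Draw `reps` samples of size n from an Exponential(1) population
    and return the mean of each sample."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

def skewness(values):
    """Sample skewness; close to 0 for a symmetric, bell-shaped distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for n in (1, 5, 30):
    means = sample_means(n, 5000, rng)
    results[n] = (statistics.fmean(means), statistics.pstdev(means), skewness(means))
    print(f"N={n:2d}  mean={results[n][0]:.3f}  "
          f"std={results[n][1]:.3f}  skewness={results[n][2]:.3f}")
```

As N grows, the skewness of the sample means should shrink toward 0 and their standard deviation should fall roughly like 1 / sqrt(N), even though the population itself stays skewed.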
Applet 3: Confidence Intervals
A confidence interval, or interval estimate, is a range
of values constructed to contain the population mean with a level of confidence
that the researcher chooses. The most common levels of confidence are 90%,
95%, and 99%. For example, a 95% confidence interval comes from a procedure
that, over repeated random samples, produces a range containing the population mean 95% of the time.
For an interactive exercise on confidence intervals (interval estimates),
click
here.
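For readers who want to compute an interval by hand, here is a minimal Python sketch. It assumes a large sample, so the standard normal distribution is used for the critical value (a small sample would normally call for the t distribution instead). The sample figures (X-bar = 65, s = 10, N = 36) and the function name `mean_confidence_interval` are invented for the illustration.

```python
from statistics import NormalDist

def mean_confidence_interval(x_bar, s, n, level=0.95):
    """Large-sample interval for the population mean: x_bar +/- z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # e.g. about 1.96 for 95%
    margin = z * s / n ** 0.5
    return x_bar - margin, x_bar + margin

# Hypothetical sample: X-bar = 65, s = 10, N = 36.
low, high = mean_confidence_interval(x_bar=65.0, s=10.0, n=36)
print(f"95% interval: ({low:.2f}, {high:.2f})")

# A higher confidence level gives a wider interval from the same data.
low99, high99 = mean_confidence_interval(x_bar=65.0, s=10.0, n=36, level=0.99)
print(f"99% interval: ({low99:.2f}, {high99:.2f})")
```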
Applet 4: Hypothesis Tests of the Population Mean
Once we have used a sample to produce an estimate of the
population mean, we often need to use that estimate to make a decision.
For example, suppose you're told the average grade in a particular class
is 75. You collect a random sample of grades and observe a mean of 65.
Is that sufficient evidence to decide that the average grade is not
75? There are two approaches to formally making such a decision:
the critical regions approach and the p-value approach. For an interactive
exercise on both approaches to hypothesis testing, click
here.
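The two approaches can be sketched in a few lines of Python using the grade example. The hypothesized mean of 75 and the sample mean of 65 come from the text above; the sample standard deviation (s = 20), the sample size (N = 36), and the significance level (0.05) are assumptions added only to make the example computable. A large-sample Z test is used here as one reasonable choice, not necessarily the exact test in the applet.

```python
from statistics import NormalDist

mu_0 = 75.0               # hypothesized population mean (from the example above)
x_bar = 65.0              # observed sample mean (from the example above)
s, n = 20.0, 36           # assumed sample standard deviation and sample size
alpha = 0.05              # assumed level of significance

std_normal = NormalDist()
z = (x_bar - mu_0) / (s / n ** 0.5)              # test statistic

# Critical regions approach: reject if |z| exceeds the two-sided critical value.
z_crit = std_normal.inv_cdf(1 - alpha / 2)
reject_by_region = abs(z) > z_crit

# p-value approach: reject if the two-sided p-value is below alpha.
p_value = 2 * (1 - std_normal.cdf(abs(z)))
reject_by_p = p_value < alpha

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.4f}")
print("reject H0:", reject_by_region, reject_by_p)
```

Both approaches always reach the same decision at a given level of significance; they simply compare different quantities.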
Applet 5: Simple Linear Regression
Simple linear regression is a popular tool for describing the relationship between two random variables. Regression analysis presumes that one variable (Y) depends linearly on another variable (X). Regression involves finding the line that best represents the relationship between Y and X based on sample points (X,Y). To determine how well the estimated line fits the data, an analysis of variance is conducted. This involves figuring out how much of the variation in Y is explained by variation in X and how much is unexplained, or random.
Some key terms used in this applet are:
Uy = the population mean of Y
Yest1, Yest2, ... = the values of Y1, Y2, ... predicted, or estimated, by the regression line (Y-hat)
alpha = the level of significance
SST = Sum of Squares Total, a measure of all the variation in Y about its mean
SSE = Sum of Squared Errors, the sum
of the squared distances between actual Y and predicted
Y. This measures how much variation in Y is not explained by the regression
line.
SSR = Sum of Squares Regression, a
measure of how much variation in Y is explained by Y's
linear relationship with X (i.e. variation in Y due to variation in X).
For an interactive exercise on linear regression and
analysis of variance, click
here. Caution: If you draw in a sample line that fits the data very poorly
just to see what happens, you will get some nonsensical results.
So, try to draw in a line that fits the data.
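To make the sums of squares concrete, here is a small Python sketch with five made-up data points. It fits the least-squares line and then computes SST, SSR, and SSE for that line and for a deliberately poor hand-picked line. The data and the helper name `sums_of_squares` are illustrative assumptions. The breakdown SST = SSR + SSE holds for the least-squares line; the poor line shows one way the numbers can stop making sense, though this is not necessarily what happens inside the applet.

```python
import statistics

# Hypothetical sample points (X, Y).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)

# Least-squares slope and intercept.
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
b0 = y_bar - b1 * x_bar

def sums_of_squares(intercept, slope):
    """Return (SST, SSR, SSE) for the line Y-hat = intercept + slope * X."""
    y_hat = [intercept + slope * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)                 # total variation in Y
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # explained by the line
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))    # left unexplained
    return sst, ssr, sse

sst, ssr, sse = sums_of_squares(b0, b1)
print(f"least-squares line: Y-hat = {b0:.2f} + {b1:.2f} X")
print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}, SSR/SST = {ssr / sst:.2f}")

# A badly chosen line (slope has the wrong sign): the pieces no longer add up.
sst_bad, ssr_bad, sse_bad = sums_of_squares(7.0, -1.0)
print(f"poor line: SST = {sst_bad:.2f}, SSR = {ssr_bad:.2f}, SSE = {sse_bad:.2f}")
```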
Applet 6: Quality Control using a Control Chart for the Sample Mean
Firms often use statistical analysis to monitor and maintain
the quality of their products. One tool used in quality control analysis
is the control chart for the sample mean. This device helps a firm
determine whether some aspect of its production process has a serious problem
that needs to be investigated and repaired. The firm's problem is
to distinguish between normal (random) variation in its product and systematic
(non-random) variation due to a problem with inputs or the production process.
Click here for an interactive exercise on using a control
chart for the sample mean.
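As a rough sketch of how such a chart flags a problem, the Python below computes control limits and checks a few sample means against them. It assumes the conventional three-standard-error limits, mu +/- 3 * sqrt(var / N); the applet may use a different width. The process values (mu = 50, var = 4, N = 4), the observed sample means, and the function name `xbar_control_limits` are all invented for the illustration.

```python
def xbar_control_limits(mu, var, n, width=3.0):
    """Lower and upper control limits for the sample mean of n observations."""
    standard_error = (var / n) ** 0.5
    return mu - width * standard_error, mu + width * standard_error

# Hypothetical in-control process: mu = 50, var = 4, samples of size N = 4.
lcl, ucl = xbar_control_limits(mu=50.0, var=4.0, n=4)

# Hypothetical sample means observed over five periods.
observed_means = [50.2, 49.1, 51.8, 53.6, 48.9]

# Periods (numbered from 1) whose sample mean falls outside the limits.
out_of_control = [period for period, x_bar in enumerate(observed_means, start=1)
                  if not lcl <= x_bar <= ucl]

print(f"control limits: ({lcl:.2f}, {ucl:.2f})")
print("periods to investigate:", out_of_control)
```

Sample means inside the limits are treated as normal random variation; a mean outside them is a signal to look for a systematic cause.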
Applet 7: Quality Control using a Control Chart for the Sample Proportion
In addition to control charts that track the sample mean,
quality control analysis sometimes also uses control charts for the sample
proportion. The sample proportion is the fraction of sample observations
that have some characteristic of interest. This is another device
that helps firms determine whether some aspect of their production process has
a serious problem that needs to be investigated and repaired. Click
here for an interactive exercise on using a control
chart for the sample proportion.
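The same idea can be sketched for proportions. The Python below assumes the conventional three-standard-error limits, p +/- 3 * sqrt(p * (1 - p) / N), with the lower limit cut off at 0 and the upper limit at 1; the applet may be set up differently. The in-control defect rate (p = 0.10), the sample size (N = 100), the observed proportions, and the function name `proportion_control_limits` are assumptions made for the example.

```python
def proportion_control_limits(p, n, width=3.0):
    """Lower and upper control limits for the sample proportion of n items."""
    standard_error = (p * (1 - p) / n) ** 0.5
    lower = max(0.0, p - width * standard_error)   # a proportion cannot go below 0
    upper = min(1.0, p + width * standard_error)   # or above 1
    return lower, upper

# Hypothetical in-control process: 10% defective, samples of N = 100 items.
p_lcl, p_ucl = proportion_control_limits(p=0.10, n=100)

# Hypothetical fractions defective observed in four samples.
observed_proportions = [0.08, 0.12, 0.21, 0.05]

flagged = [sample for sample, p_hat in enumerate(observed_proportions, start=1)
           if not p_lcl <= p_hat <= p_ucl]

print(f"control limits: ({p_lcl:.3f}, {p_ucl:.3f})")
print("samples to investigate:", flagged)
```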