A "Begin" button will appear at the bottom of this page when the applet has finished loading. This may take a minute or two, depending on the speed of your internet connection and computer. Please be patient.
This Java applet lets you explore various aspects
of sampling distributions. When the applet begins, a histogram of a normal
distribution is displayed at the top of the screen.
The distribution portrayed at the top of the screen is the population from
which samples are taken. The mean of the distribution is indicated by a small
blue line and the median is indicated by a small purple line. Since the mean
and median are the same, the two lines overlap. The red line extends from the
mean one standard deviation in each direction. Note the correspondence between
the colors used on the histogram and the statistics displayed to the left of
the histogram.
The second histogram displays the sample data. This histogram is initially
blank. The third and fourth histograms show the distribution of statistics
computed from the sample data. The number of samples (replications) that the
third and fourth histograms are based on is indicated by the label
"Reps=."
Basic operations
The simulation is set to initially sample five numbers from the population,
compute the mean of the five numbers, and plot the mean. Click the
"Animated sample" button and you will see the five numbers appear in
the histogram. The mean of the five numbers will be computed and the mean will
be plotted in the third histogram. Do this several times to see the
distribution of means begin to be formed. Once you see how this works, you can
speed things up by taking 5, 1,000, or 10,000 samples at a time.
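The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the applet's own code: the population parameters (`mu=0`, `sigma=1`), the fixed seed, and the helper name `draw_sample_mean` are assumptions chosen for the example.

```python
import random
import statistics

def draw_sample_mean(rng, n=5, mu=0.0, sigma=1.0):
    """Draw n values from a normal population; return the sample and its mean."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    return sample, statistics.fmean(sample)

rng = random.Random(0)   # fixed seed so reruns give the same numbers
means = []               # plays the role of the third histogram
for _ in range(1000):    # 1,000 replications
    sample, mean = draw_sample_mean(rng)
    means.append(mean)

print("Reps =", len(means))
print("mean of the sample means:", round(statistics.fmean(means), 3))
print("sd of the sample means:  ", round(statistics.pstdev(means), 3))
```

Each pass through the loop corresponds to one click of "Animated sample"; raising the loop count corresponds to taking many samples at a time.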
Choosing a statistic
The following statistics can be computed from the samples by choosing from the
pop-up menu:
Mean
Standard deviation of the sample (N is used in the denominator)
Variance of the sample (N is used in the denominator)
Unbiased estimate of variance (N-1 is used in denominator)
Mean absolute value of the deviation from the mean
Range
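To make the definitions in the list concrete, the sketch below computes each statistic for a single sample. The function name `sample_statistics` and the example data are hypothetical; the denominators follow the list above.

```python
def sample_statistics(sample):
    """Compute the statistics listed above for one sample."""
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((x - mean) ** 2 for x in sample)   # sum of squared deviations
    return {
        "mean": mean,
        "sd (N)": (ss / n) ** 0.5,              # N in the denominator
        "variance (N)": ss / n,                 # N in the denominator
        "variance (N-1)": ss / (n - 1),         # unbiased estimate
        "mean absolute deviation": sum(abs(x - mean) for x in sample) / n,
        "range": max(sample) - min(sample),
    }

stats = sample_statistics([2, 4, 4, 5, 10])
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

For the sample 2, 4, 4, 5, 10 the mean is 5 and the sum of squared deviations is 36, so the two variance figures are 36/5 = 7.2 and 36/4 = 9.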
Selecting a sample size
The size of each sample can be set to 2, 5, 10, 16, 20 or 25 from the pop-up
menu. Be sure not to confuse sample size with number of samples.
Comparison to a normal distribution
By clicking the "Fit normal" button you can see a normal distribution
superimposed over the simulated sampling distribution.
Changing the population distribution
You can change the population by clicking on the top histogram with the mouse
and dragging.
Understanding sampling distributions
1. Click the "Animated
sample" button. Five scores from a normal distribution will be sampled and
plotted in a histogram. The mean of the sample will be computed and plotted in
a second histogram. Repeat this 3 or 4 times or until you understand how
the "Distribution of Means" is created. The red line extends from the
mean one standard deviation in each direction. The colored vertical bars on the
X-axis correspond to the statistic of the same color.
2. Click the "5 samples" button to draw 5 samples of 5 scores each. The five means will be plotted. Click the "500 samples" and/or "2000 samples" buttons until the distribution of means has stabilized. The sampling distribution of the mean is the distribution that is approached as the number of samples approaches infinity. With 5,000 to 10,000 samples you get a pretty good approximation.
3. The distribution plotted in (2) above is the sampling distribution of the mean of a sample size of 5. Approximate the sampling distribution of the mean for other sample sizes.
4. Any statistic you can compute from a sample has a sampling distribution. Approximate the sampling distribution of other statistics. The statistics available to compute are:
Mean
Median
Standard deviation (sd) (Using N in the denominator)
Variance (Using N in the denominator)
Mean absolute deviation from the mean (MAD)
Range
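The idea in step 4, that any statistic has a sampling distribution, can be sketched by passing the statistic in as a function. This is an illustrative sketch only: the standard normal population, the seed, and the names `sampling_distribution` and `sample_range` are assumptions, and 5,000 replications is used as a stand-in for "many samples".

```python
import random
import statistics

def sampling_distribution(statistic, n, reps, rng, mu=0.0, sigma=1.0):
    """Compute `statistic` on `reps` independent normal samples of size n."""
    return [statistic([rng.gauss(mu, sigma) for _ in range(n)])
            for _ in range(reps)]

def sample_range(xs):
    return max(xs) - min(xs)

rng = random.Random(1)
statistics_to_try = {
    "mean": statistics.fmean,
    "median": statistics.median,
    "range": sample_range,
}
summary = {}
for name, stat in statistics_to_try.items():
    dist = sampling_distribution(stat, n=5, reps=5000, rng=rng)
    # Mean and standard deviation of the approximated sampling distribution
    summary[name] = (statistics.fmean(dist), statistics.pstdev(dist))
    print(f"{name:>6}: mean = {summary[name][0]:.3f}, sd = {summary[name][1]:.3f}")
```

Changing `n` reproduces step 3; swapping in another function reproduces step 4 for that statistic.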
Understanding the Standard Error
1. The standard error is the standard deviation of the sampling distribution.
Approximate the sampling distribution of the mean for N=5. The standard
deviation of the distribution is the standard error of the mean. Find the
standard error of the mean and the standard error of the range for N=10 using
the normal distribution.
2. Determine how the standard error is affected by sample size. Plot the standard error of the mean as a function of sample size for different standard deviations. Can you discover a formula relating the standard error of the mean to the sample size and the standard deviation? If so, see if it holds for distributions other than the normal distribution.
3. Redo #2 above for the median.
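One way to gather the numbers for these exercises outside the applet is to estimate the standard error by simulation and tabulate it against sample size. The sketch below does this for the mean, using the applet's sample sizes; the two population standard deviations (1 and 2), the seed, the replication count, and the name `standard_error` are assumptions made for illustration. It prints the table and leaves the formula for you to find.

```python
import random
import statistics

def standard_error(statistic, n, reps, rng, sigma=1.0):
    """Estimate the standard error: the sd of the statistic across many samples."""
    values = [statistic([rng.gauss(0.0, sigma) for _ in range(n)])
              for _ in range(reps)]
    return statistics.pstdev(values)

rng = random.Random(2)
sizes = [2, 5, 10, 16, 20, 25]   # sample sizes offered by the applet
se_table = {}
for sigma in (1.0, 2.0):         # assumed population standard deviations
    for n in sizes:
        se_table[(sigma, n)] = standard_error(statistics.fmean, n, 4000, rng, sigma)
        print(f"sigma = {sigma}, N = {n:2d}: SE of mean = {se_table[(sigma, n)]:.3f}")
```

Passing `statistics.median` instead of `statistics.fmean` gives the corresponding table for exercise 3.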
Understanding Bias
1. A statistic is unbiased if the mean of the sampling distribution of the
statistic is the parameter. Test to see if the sample mean is an unbiased
estimate of the population mean. Try out different sample sizes and
distributions.
2. Find a distribution/sample size combination for which the sample median is a biased estimate of the population median.
3. Is the sample variance an unbiased estimate of the population variance? If not, see if you can find a correction based on sample size. Does the correction hold for distributions other than the normal distribution?
4. For what statistic is the mean of the sampling distribution dependent on sample size?
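A bias check of the kind described in these exercises compares the mean of the sampling distribution with the population parameter. The sketch below does this for the two variance statistics from the pop-up menu. The standard normal population (variance 1), the sample size of 5, the seed, and the replication count are assumptions for the example.

```python
import random
import statistics

rng = random.Random(3)
sigma = 1.0            # assumed population sd, so the population variance is 1.0
n, reps = 5, 20000
var_n, var_n_minus_1 = [], []
for _ in range(reps):
    sample = [rng.gauss(0.0, sigma) for _ in range(n)]
    m = statistics.fmean(sample)
    ss = sum((x - m) ** 2 for x in sample)   # sum of squared deviations
    var_n.append(ss / n)                     # N in the denominator
    var_n_minus_1.append(ss / (n - 1))       # N-1 in the denominator

mean_var_n = statistics.fmean(var_n)
mean_var_n_minus_1 = statistics.fmean(var_n_minus_1)
print("population variance:        ", sigma ** 2)
print("mean of variance with N:    ", round(mean_var_n, 3))
print("mean of variance with N-1:  ", round(mean_var_n_minus_1, 3))
```

A statistic whose mean across replications settles away from the population value is biased; rerunning with other values of `n` shows how the gap depends on sample size.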
Understanding Efficiency
1. For a normal distribution, compare the
size of the standard error of the median and the standard error of the mean.
Find a relationship that holds (approximately) across sample sizes.
2. Does this relationship hold for a uniform distribution?
3. Find a distribution for which the standard error of the median is smaller than the standard error of the mean. (You may find this difficult, but don't give up.)
4. Compare the standard error of the standard deviation and the standard error of the mean absolute deviation from the mean (MAD). Does the relationship depend on the distribution?
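The comparisons in exercises 1 and 2 can be approximated by estimating both standard errors from the same population and taking their ratio. In this sketch the populations are supplied as zero-argument functions so that they can be swapped freely; the particular normal and uniform populations, the seed, the sample sizes, and the name `standard_error` are assumptions for illustration.

```python
import random
import statistics

def standard_error(statistic, draw, n, reps):
    """Estimate the standard error of `statistic` for samples built from draw()."""
    values = [statistic([draw() for _ in range(n)]) for _ in range(reps)]
    return statistics.pstdev(values)

rng = random.Random(4)
populations = {
    "normal": lambda: rng.gauss(0.0, 1.0),
    "uniform": lambda: rng.uniform(0.0, 1.0),
}
ratios = {}
for name, draw in populations.items():
    for n in (5, 10, 25):
        se_mean = standard_error(statistics.fmean, draw, n, 4000)
        se_median = standard_error(statistics.median, draw, n, 4000)
        ratios[(name, n)] = se_median / se_mean
        print(f"{name:>7}, N = {n:2d}: SE(median) / SE(mean) = {ratios[(name, n)]:.2f}")
```

A ratio above 1 means the mean is the more efficient statistic for that population; for exercise 3, add further entries to `populations` and look for a ratio below 1.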
Understanding the Central Limit Theorem
1. The central limit theorem states that the sampling distribution of the mean
approaches a normal distribution as the sample size increases. Sample from the
uniform distribution and determine how large a sample size is needed for the
distribution to be a very close approximation of the normal distribution.
2. Do the same thing sampling from the skewed distribution.
3. Determine whether the sampling distribution of the median approaches a normal distribution as sample size increases.
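One rough, numerical way to judge closeness to a normal distribution in these exercises is to track skewness and excess kurtosis, both of which are 0 for a normal distribution. The sketch below does this for the sampling distribution of the mean. The exponential distribution is used only as an assumed stand-in for "the skewed distribution"; the seed, sample sizes, and function names are likewise hypothetical.

```python
import random
import statistics

def shape(values):
    """Return (skewness, excess kurtosis); both are 0 for a normal distribution."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values, mu=m)
    z = [(v - m) / sd for v in values]
    skew = statistics.fmean([t ** 3 for t in z])
    kurt = statistics.fmean([t ** 4 for t in z]) - 3.0
    return skew, kurt

def sampling_distribution_of_mean(draw, n, reps):
    return [statistics.fmean([draw() for _ in range(n)]) for _ in range(reps)]

rng = random.Random(5)
populations = {
    "uniform": rng.random,
    "skewed": lambda: rng.expovariate(1.0),   # assumed skewed population
}
shapes = {}
for name, draw in populations.items():
    for n in (2, 5, 25):
        means = sampling_distribution_of_mean(draw, n, 5000)
        shapes[(name, n)] = shape(means)
        skew, kurt = shapes[(name, n)]
        print(f"{name:>7}, N = {n:2d}: skewness = {skew:+.2f}, excess kurtosis = {kurt:+.2f}")
```

Replacing `statistics.fmean` with `statistics.median` inside `sampling_distribution_of_mean` gives a starting point for exercise 3.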
Source: http://www.ruf.rice.edu/~lane/rvls.html
It is a wonderful source for statistical material of this kind.