Discussion 6: Estimating from a sample#
STATS 60 / STATS 160 / PSYCH 10
Small sample size#
The discussion assignment for the week was to find an example of a study or survey with a small sample size.
Why is the sample size too small?
Roughly how large is the standard deviation?
Does the normal approximation apply? How confident are you in the estimated value?

Sample size#
Random sampling can be used to estimate the population mean \(\mu\)
If \(x_1,\ldots,x_n\) are a random sample, then the sample mean \(\hat\mu_n = \frac{1}{n}\sum_{i=1}^n x_i\) is a good guess for \(\mu\)
The sample mean is a random quantity (because the sample is random)
The sample mean gets more accurate as the sample size increases (bigger is better)
The standard deviation of \(\hat\mu_n\) is \(\frac{1}{\sqrt{n}}\) times \(\sigma_x\), the population standard deviation of \(x_i\)
Most of the time, the error \(|\hat\mu_n - \mu| \le C \frac{\sigma_x}{\sqrt{n}}\) for \(C\) not too large, say \(C \le 3\)
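The \(\frac{\sigma_x}{\sqrt{n}}\) scaling can be checked with a small simulation. The sketch below is illustrative only: it assumes the population is a fair six-sided die, for which \(\sigma_x = \sqrt{35/12} \approx 1.71\), and the helper name `sample_mean_sd`, the seed, and the number of trials are arbitrary choices that are not part of the discussion materials.

```python
import math
import random
import statistics

# Population SD of one roll of a fair six-sided die: sqrt(35/12), about 1.71.
SIGMA_X = math.sqrt(35 / 12)


def sample_mean_sd(n, trials=2000, seed=0):
    """Empirical SD of the sample mean of n fair die rolls.

    Draws `trials` independent samples of size n, computes each sample mean,
    and returns the standard deviation of those means.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        rolls = [rng.randint(1, 6) for _ in range(n)]
        means.append(statistics.fmean(rolls))
    return statistics.pstdev(means)


# Quadrupling n should roughly halve the SD of the sample mean.
for n in (10, 40, 160):
    simulated = sample_mean_sd(n)
    predicted = SIGMA_X / math.sqrt(n)
    print(f"n={n:4d}  simulated SD={simulated:.3f}  sigma/sqrt(n)={predicted:.3f}")
```

The simulated and predicted columns should agree closely, with small differences that come from using a finite number of trials.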
The normal approximation#
For large \(n\), the distribution of \(\hat\mu_n\) becomes close to the normal distribution
The 68-95-99.7 rule lets us construct confidence intervals for \(\mu\) centered at \(\hat\mu_n\)
We can be \(68\%\) confident that \(|\hat\mu_n - \mu| \le \frac{\sigma_x}{\sqrt{n}}\).
We can be \(95\%\) confident that \(|\hat\mu_n - \mu| \le 2\frac{\sigma_x}{\sqrt{n}}\).
We can be \(99.7\%\) confident that \(|\hat\mu_n - \mu| \le 3\frac{\sigma_x}{\sqrt{n}}\).
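If you want to target a certain error level \(\epsilon\), meaning \(|\hat\mu_n - \mu| \le \epsilon\), at confidence level \(1-\alpha\), you can use the normal approximation to decide how large a sample to take. For example, at \(95\%\) confidence, solving \(2\frac{\sigma_x}{\sqrt{n}} \le \epsilon\) gives \(n \ge \frac{4\sigma_x^2}{\epsilon^2}\).
For a fixed confidence level, larger sample sizes give narrower confidence intervals; for a fixed interval width, they give a higher level of confidence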
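The sample-size calculation can be sketched in a few lines of Python. This is a minimal illustration under the normal approximation, not a prescribed method: the function names `z_value`, `margin_of_error`, and `required_sample_size` are hypothetical, \(\sigma_x\) is assumed known, and the code uses the exact normal critical value (about 1.96 at \(95\%\)) rather than the rounded multiplier 2 from the 68-95-99.7 rule.

```python
import math
from statistics import NormalDist


def z_value(confidence):
    """Two-sided standard normal critical value for a given confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def margin_of_error(sigma, n, confidence=0.95):
    """Half-width z * sigma / sqrt(n) of the interval around the sample mean."""
    return z_value(confidence) * sigma / math.sqrt(n)


def required_sample_size(sigma, epsilon, confidence=0.95):
    """Smallest n for which z * sigma / sqrt(n) <= epsilon."""
    return math.ceil((z_value(confidence) * sigma / epsilon) ** 2)


# Example (assumed numbers): a fair six-sided die has sigma = sqrt(35/12).
sigma_die = math.sqrt(35 / 12)
n_needed = required_sample_size(sigma_die, epsilon=0.1, confidence=0.95)
print("rolls needed:", n_needed)
print("margin at that n:", round(margin_of_error(sigma_die, n_needed), 4))
```

Because the required \(n\) grows like \(1/\epsilon^2\), halving the target error roughly quadruples the number of rolls needed.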
Estimating the mean of a 6-sided die#
Roll your die 30 times and write down your results
Enter the sample mean for the first 10, 20 and 30 rolls here
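For anyone without a die to hand, the activity can be imitated in code. The sketch below assumes a fair die simulated with Python's `random` module; the seed is an arbitrary choice that only makes the run repeatable, and the variable names are illustrative.

```python
import random
from statistics import fmean

rng = random.Random(60)  # arbitrary fixed seed so the run is repeatable

# Thirty simulated rolls of a fair six-sided die.
rolls = [rng.randint(1, 6) for _ in range(30)]

# Sample mean of the first 10, 20 and 30 rolls.
running_means = {n: fmean(rolls[:n]) for n in (10, 20, 30)}

for n, mean in running_means.items():
    print(f"first {n} rolls: sample mean = {mean:.2f} (population mean is 3.5)")
```

Any single run can still land further from 3.5 at 30 rolls than at 10; the improvement with sample size holds on average, not for every individual sequence.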
Estimating the mean from a biased sample#
Now we will see what happens if we use a biased measurement
Roll two dice and record the smaller value
Repeat this 20 times and calculate the sample mean
Enter the sample mean here
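The size of the bias can be worked out exactly by listing all 36 equally likely outcomes of two fair dice. The sketch below does that and then compares the result with simulated sample means; the function name `biased_sample_mean` and the seed are illustrative assumptions.

```python
import random
from fractions import Fraction
from statistics import fmean

# Exact mean of min(a, b) over the 36 equally likely outcomes of two fair dice.
exact_mean_min = Fraction(
    sum(min(a, b) for a in range(1, 7) for b in range(1, 7)), 36
)


def biased_sample_mean(n, rng):
    """Sample mean of n draws, each the smaller of two fair dice."""
    draws = [min(rng.randint(1, 6), rng.randint(1, 6)) for _ in range(n)]
    return fmean(draws)


rng = random.Random(0)  # arbitrary seed for repeatability
print("exact mean of the smaller die:", exact_mean_min, "=", float(exact_mean_min))
for n in (20, 2000, 200000):
    print(f"n={n:6d}  sample mean = {biased_sample_mean(n, rng):.3f}")
```

The exact value is \(91/36 \approx 2.53\), well below the single-die mean of 3.5. Increasing \(n\) pulls the sample mean toward \(91/36\), not toward 3.5, so a larger sample does not remove the bias.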
Recap#
The sample mean of the die rolls got more accurate as the sample size increased
The number of people in the class is too small to see the normal approximation in the histogram
When using a biased measurement, the sample mean does not approximate the population mean, no matter how large the sample
This website can roll lots and lots of dice for us