Discussion 6: Estimating from a sample#
STATS 60 / STATS 160 / PSYCH 10
Small sample size#
The discussion assignment for the week was to find an example of a study or survey with a small sample size.
- Why is the sample size too small? 
- Roughly how large is the standard deviation? 
- Does the normal approximation apply? How confident are you in the estimated value? 

Sample size#
- Random sampling can be used to estimate the population mean \(\mu\) 
- If \(x_1,\ldots,x_n\) are a random sample, then the sample mean \(\hat\mu_n = \frac{1}{n}\sum_{i=1}^n x_i\) is a good guess for \(\mu\) 
- The sample mean is a random quantity (because the sample is random) 
- The sample mean gets more accurate as the sample size increases (bigger is better) 
- The standard deviation of \(\hat\mu_n\) is \(\frac{1}{\sqrt{n}}\) times \(\sigma_x\), the population standard deviation of \(x_i\) 
- Most of the time, the error \(|\hat\mu_n - \mu| \le C \frac{\sigma_x}{\sqrt{n}}\) for \(C\) not too large, say \(C \le 3\) 
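
The \(\frac{\sigma_x}{\sqrt{n}}\) scaling above can be checked with a short simulation. This is only a sketch under stated assumptions: the population is a fair 6-sided die, and the helper name `sample_mean_sd`, the number of trials, and the seed are all illustrative choices, not part of the course materials.

```python
import random
import statistics

def sample_mean_sd(n, trials=2000, seed=0):
    """Empirical SD of the sample mean of n fair-die rolls (hypothetical helper)."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.randint(1, 6) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.pstdev(means)

# Population SD of a single fair die roll (assumed population), about 1.71.
sigma_x = statistics.pstdev(range(1, 7))

for n in (10, 40, 160):
    predicted = sigma_x / n ** 0.5   # sigma_x / sqrt(n)
    simulated = sample_mean_sd(n)
    print(f"n={n:4d}  predicted SD={predicted:.3f}  simulated SD={simulated:.3f}")
```

Each fourfold increase in \(n\) should roughly halve the spread of the sample mean.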
The normal approximation#
- For large \(n\), the distribution of \(\hat\mu_n\) becomes close to the normal distribution 
- The 68-95-99.7 rule lets us construct confidence intervals for \(\mu\) centered at \(\hat\mu_n\):
  - We can be \(68\%\) confident that \(|\hat\mu_n - \mu| \le \frac{\sigma_x}{\sqrt{n}}\). 
  - We can be \(95\%\) confident that \(|\hat\mu_n - \mu| \le 2\frac{\sigma_x}{\sqrt{n}}\). 
  - We can be \(99.7\%\) confident that \(|\hat\mu_n - \mu| \le 3\frac{\sigma_x}{\sqrt{n}}\). 
 
- If you want to guarantee an error \(|\hat\mu_n - \mu| \le \epsilon\) at confidence level \(1-\alpha\), you can use the normal approximation to decide how large a sample to take. 
- For a fixed confidence level, larger sample sizes give narrower confidence intervals; for a fixed interval width, they give higher confidence 
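
As a rough illustration of choosing a sample size from the normal approximation: requiring \(z\frac{\sigma_x}{\sqrt{n}} \le \epsilon\) gives \(n \ge \left(\frac{z\,\sigma_x}{\epsilon}\right)^2\), where \(z\) is the normal quantile for confidence \(1-\alpha\). The sketch below assumes \(\sigma_x\) is known (or guessed in advance); the function name `required_sample_size` and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma_x, epsilon, alpha):
    """Smallest n with z * sigma_x / sqrt(n) <= epsilon (hypothetical helper).

    Relies on the normal approximation, so it is only trustworthy
    when the resulting n is reasonably large.
    """
    # Two-sided quantile: about 1.96 for alpha = 0.05 (the "2" in the 68-95-99.7 rule).
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z * sigma_x / epsilon) ** 2)

# Example with assumed inputs: a fair die has sigma_x = sqrt(35/12).
sigma_die = math.sqrt(35 / 12)
n = required_sample_size(sigma_die, epsilon=0.25, alpha=0.05)
print(f"rolls needed for error <= 0.25 at 95% confidence: {n}")
```

Because \(n\) grows like \(1/\epsilon^2\), halving the target error requires about four times as many samples.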
Estimating the mean of a 6-sided die#
- Roll your die 30 times and write down your results 
- Enter the sample mean for the first 10, 20 and 30 rolls here 
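
If no physical die is at hand, the activity can be mimicked in code. This is a minimal sketch assuming a fair die simulated with Python's pseudo-random generator; the helper `running_means` and the seed are illustrative, and simulated rolls are not a substitute for the class data.

```python
import random
import statistics

def running_means(rolls, checkpoints=(10, 20, 30)):
    """Sample mean of the first k rolls for each checkpoint k (hypothetical helper)."""
    return {k: statistics.fmean(rolls[:k]) for k in checkpoints}

rng = random.Random(60)  # arbitrary seed so the run is repeatable
rolls = [rng.randint(1, 6) for _ in range(30)]  # 30 simulated fair-die rolls

for k, mean in running_means(rolls).items():
    print(f"mean of first {k} rolls: {mean:.2f}")

# For comparison: the population mean of a fair die is (1+2+...+6)/6 = 3.5.
```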

Estimating the mean from a biased sample#
Now we will see what happens if we use a biased measurement:
- Roll two dice and record the smaller value 
- Repeat this 20 times and calculate the sample mean 
- Enter the sample mean here 
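
To see how far off the biased measurement is, the exact mean of "the smaller of two dice" can be found by listing all 36 equally likely outcomes and compared with a simulation. This sketch assumes two fair, independent dice; the helper `simulate_min_mean` and the seed are hypothetical.

```python
import random
from fractions import Fraction
from itertools import product

# Enumerate all 36 ordered pairs and record the smaller value of each.
outcomes = [min(a, b) for a, b in product(range(1, 7), repeat=2)]
exact_mean_min = Fraction(sum(outcomes), len(outcomes))  # 91/36, about 2.53

def simulate_min_mean(reps, seed=0):
    """Sample mean of min(die1, die2) over `reps` repetitions (hypothetical helper)."""
    rng = random.Random(seed)
    draws = [min(rng.randint(1, 6), rng.randint(1, 6)) for _ in range(reps)]
    return sum(draws) / reps

print(f"exact mean of the smaller die: {float(exact_mean_min):.3f}")
print(f"sample mean from 20 repetitions: {simulate_min_mean(20):.3f}")
print("mean of a single fair die: 3.500")
```

More repetitions bring the sample mean closer to \(91/36\), not to \(3.5\): a larger sample reduces the random error but leaves the bias untouched.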

Recap#
- The sample mean of the die rolls got more accurate as the sample size increased 
- The number of people in the class is too small to see the normal approximation in the histogram 
- When using a biased measurement, the sample mean does not approximate the population mean, no matter how large the sample 
- This website can roll lots and lots of dice for us 
