Worksheet 15: Correlation
Your name:
Your student ID number:
Brainstorm ideas for quantitative measurements of positive association.
In this exercise you’ll think through why differences in variability can impose constraints on the slope of the best-fit line in a scatterplot.
a. What is the slope of the line between the points \((0,0)\) and \((100,1)\)?
b. Suppose the variable \(X\) varies in the range \(0\) to \(100\), and \(Y\) varies in the range \(0\) to \(1\). Explain why the slope of the best-fit line will probably not be much larger than \(1/100\), even if \(X\) and \(Y\) are strongly positively associated.
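To explore part b numerically, the following is a minimal sketch that fits a least-squares line to a small made-up dataset with \(X\) in \([0, 100]\) and \(Y\) in \([0, 1]\). The data values and the helper name `slope_and_r` are hypothetical, chosen only for illustration. The sketch also checks the standard identity that the least-squares slope equals the correlation coefficient times the ratio of standard deviations, \(r \cdot s_Y / s_X\).

```python
from statistics import mean, pstdev

def slope_and_r(xs, ys):
    """Return the least-squares slope and the correlation coefficient r."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

# Hypothetical data: X spans 0..100, Y spans roughly 0..1,
# and the two are strongly positively associated.
xs = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
ys = [0.02, 0.08, 0.22, 0.31, 0.38, 0.52, 0.61, 0.68, 0.81, 0.88, 1.0]

slope, r = slope_and_r(xs, ys)
sd_x, sd_y = pstdev(xs), pstdev(ys)

print("slope:", slope)
print("r * sd_y / sd_x:", r * sd_y / sd_x)
```

Because \(r\) can never exceed 1, the ratio \(s_Y / s_X\) caps how steep the fitted line can be. Try changing the \(Y\) values and see how the slope responds.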
Why is the slope sensitive to units?
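One way to experiment with this question is to fit the same data twice, once with \(X\) in meters and once in centimeters, and compare the slopes. The sketch below assumes a small made-up height and weight dataset; the numbers and the helper name `ls_slope` are hypothetical.

```python
from statistics import mean

def ls_slope(xs, ys):
    """Least-squares slope of y on x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical measurements of the same five people.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weights_kg = [50.0, 58.0, 63.0, 72.0, 80.0]

# Same heights, expressed in centimeters instead of meters.
heights_cm = [100 * h for h in heights_m]

slope_per_m = ls_slope(heights_m, weights_kg)    # kg per meter
slope_per_cm = ls_slope(heights_cm, weights_kg)  # kg per centimeter

print("slope (kg per m):", slope_per_m)
print("slope (kg per cm):", slope_per_cm)
```

The relationship between the two variables has not changed, only the unit of \(X\), yet the two slopes differ by a factor of 100.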
What is the mean of a standardized dataset? What is the standard deviation of a standardized dataset?
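After working out your answer, you can check it numerically. The sketch below assumes the usual definition of standardizing (subtract the mean, then divide by the standard deviation) and uses the population standard deviation; the dataset and the helper name `standardize` are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Convert values to standard units: (value - mean) / standard deviation."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation (an assumption)
    return [(v - m) / s for v in values]

# Hypothetical dataset.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

z = standardize(data)
z_mean = mean(z)
z_sd = pstdev(z)

print("mean of standardized data:", z_mean)
print("SD of standardized data:", z_sd)
```

Try a different dataset and see whether the two printed values change.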
If \(X\) and \(Y\) are correlated, can we infer that \(X\) causes \(Y\) or that \(Y\) causes \(X\)?