Week 10 Homework
1. Imagine a dataset with a continuous outcome variable, one continuous predictor
variable, and one categorical predictor variable with two levels. The data are plotted
on a scatterplot, with the outcome variable on the y-axis, the continuous predictor
variable on the x-axis, and the categorical predictor represented by point color. The
scatterplot shows a negative correlation between the two continuous variables and a
main effect of the categorical predictor, but no interaction between the two
predictor variables. You add two best-fit regression lines to the plot, one for each
level of the categorical predictor. What do the lines look like? You can draw or
describe them.
2. Take a look at the boxplots above. Imagine you ran a regression model with this formula:
value ~ age * dialect
a. How many coefficients would be in the model output?
b. Which coefficients would have positive vs negative estimates?
c. Recall that each line of the coefficients output is testing a null hypothesis. What is
the null hypothesis each line is testing?
d. Which coefficients do you think would show significant effects?
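As a reminder of how a formula like value ~ age * dialect is coded, here is a minimal Python sketch of treatment (dummy) coding. It assumes both predictors are two-level factors. The level names ("older", "younger", "A", "B"), the choice of reference levels, and the helper design_row are hypothetical and are not taken from the boxplots.

```python
from itertools import product

def design_row(age, dialect, age_ref="older", dialect_ref="A"):
    """Treatment-coded predictors for one observation.

    Assumes two-level factors; the reference levels here are hypothetical.
    The a * b formula shorthand expands to a + b + a:b, so the row has an
    intercept, one dummy per factor, and their product.
    """
    a = 0 if age == age_ref else 1          # 1 only at the non-reference age level
    d = 0 if dialect == dialect_ref else 1  # 1 only at the non-reference dialect level
    return {"Intercept": 1, "age": a, "dialect": d, "age:dialect": a * d}

# Print the coded row for every combination of levels.
for age, dialect in product(["older", "younger"], ["A", "B"]):
    print(age, dialect, design_row(age, dialect))
```

The interaction column is nonzero only when both dummies are 1, which is why the interaction coefficient applies to a single cell of the design.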
3. Take a look at the linear regression model output above. This is made-up data looking
at the effects of Age and Formality on LengthOfUtterance (in number of words).
a. What formula was used to create this model?
b. What is the reference level of the following variables?
i. age
ii. formality
c. What is the mean length of utterance for each set of data listed below?
i. older informal
ii. younger informal
iii. older formal
iv. younger formal
d. How would you interpret the results (including main effects and interaction
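The arithmetic behind cell means in a treatment-coded model with an interaction can be sketched as follows. The coefficient values below are invented for illustration and are not the ones in the model output; the helper cell_mean and the dictionary keys are likewise hypothetical.

```python
def cell_mean(coefs, age_nonref, formality_nonref):
    """Predicted mean for one cell of a 2 x 2 treatment-coded design.

    Each flag is True when that factor is at its non-reference level.
    """
    a = int(age_nonref)
    f = int(formality_nonref)
    return (coefs["intercept"]
            + coefs["age"] * a
            + coefs["formality"] * f
            + coefs["age:formality"] * a * f)

# Hypothetical coefficients, NOT the values from the handout's model output.
coefs = {"intercept": 10.0, "age": -2.0, "formality": 3.0, "age:formality": 1.5}

print(cell_mean(coefs, False, False))  # both at reference: intercept only
print(cell_mean(coefs, True, False))   # intercept + age
print(cell_mean(coefs, False, True))   # intercept + formality
print(cell_mean(coefs, True, True))    # intercept + age + formality + interaction
```

With these made-up numbers the four cells come out to 10.0, 8.0, 13.0, and 12.5; the same sums apply to any 2 x 2 treatment-coded model once the reference levels are identified.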
