STATS 112/203 Midterm Practice Solutions

1. This question deals with clustered data. A study is conducted where 5 people from each of the
50 states in the USA are randomly sampled (California, Nevada, New York, Florida, etc). The
response variable is the blood pressure of each subject.
As a result, the states represent 50 clusters with 5 observations each.
a. Say Yi is the outcome vector for a given state (5x1). What would the compound symmetric
correlation matrix look like for a given Yi? (that is to say write out the matrix for corr(Yi))
Solution: This is a 5 by 5 matrix, with 1's on the diagonal and ρ everywhere else.
1 ρ ρ ρ ρ
ρ 1 ρ ρ ρ
ρ ρ 1 ρ ρ
ρ ρ ρ 1 ρ
ρ ρ ρ ρ 1
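
As a small illustration, the matrix above can be built programmatically. This is only a sketch: the helper name `compound_symmetric` and the value ρ = 0.3 are assumptions made for the example, not part of the problem.

```python
def compound_symmetric(n, rho):
    """Return an n x n correlation matrix with 1 on the diagonal and rho elsewhere."""
    return [[1.0 if j == k else rho for k in range(n)] for j in range(n)]

rho = 0.3  # hypothetical within-state correlation, chosen only for illustration
corr_Yi = compound_symmetric(5, rho)

# Print the matrix row by row, one row per subject in the state.
for row in corr_Yi:
    print(row)
```

Every off-diagonal entry is the same ρ, which is exactly the exchangeability assumption discussed in part b.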

b. Why is the compound symmetric correlation matrix a reasonable choice for the correlation
matrix in context of this problem?
Solution: The off-diagonal entries of the correlation matrix are the correlations between observations
within a cluster, where the different observations are different people. The compound symmetric
structure assumes that the correlation between any two observations in a cluster is the same.
This is a reasonable assumption because the subjects are randomly sampled within each state: the
correlation between subjects 1 and 2 should be the same as that between subjects 1 and 3, or between
subjects 4 and 5, and so on.
This makes sense in this context since there is no time component - the subjects within a state are
not ordered in any way.
c. Now say the 5 subjects in any given state are ordered by age in the dataset. Take i=1: Y11 is
the blood pressure of the youngest person in sampling unit i=1 and Y15 is that of the oldest. Assume
the ages are evenly spaced: 20, 30, 40, 50, and 60 years old.
Why could the autoregressive order 1 (AR 1) correlation structure be a reasonable choice?
Solution: Now the subjects are ordered in a meaningful way. Subjects 1 and 2 can be argued to have
higher correlation than subjects 1 and 5, since 1 and 2 are closer in age. The AR 1 model specifies
corr(Yij, Yik) = ρ^|j−k|. Thus ages 20 and 40, for example, have the same correlation as ages 40 and
60, and the bigger the difference in ages, the weaker the correlation (since |ρ| < 1). Both properties
seem reasonable given the scenario.
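
The two properties used in this argument can be checked numerically. The following is a minimal sketch; the helper name `ar1` and the value ρ = 0.6 are assumptions for illustration only.

```python
def ar1(n, rho):
    """Return the n x n AR(1) correlation matrix with entries rho ** |j - k|."""
    return [[rho ** abs(j - k) for k in range(n)] for j in range(n)]

ages = [20, 30, 40, 50, 60]  # evenly spaced ages from the problem statement
rho = 0.6                    # hypothetical correlation between adjacent ages
corr = ar1(len(ages), rho)

# Ages 20 and 40 sit two positions apart, as do ages 40 and 60.
same_lag = (corr[0][2], corr[2][4])
# Correlation of the youngest subject with each of the others, in age order.
decay = corr[0][1:]
print(same_lag, decay)
```

Because the ages are evenly spaced, the position gap |j − k| is proportional to the age gap, which is what makes the AR 1 structure a natural fit here.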
2. Consider the following model:
Yij = β0 + b0i + β1 tij + b1i tij + εij
Yij is the i-th unit's j-th response. Times recorded are tij = 0, 5, 10, 15. Also, b0i and b1i are
independent of the errors εij and follow a bivariate Normal distribution with mean vector 0 and
covariance matrix
G = [ σ0²  σ01 ]
    [ σ01  σ1² ]
Also assume that the εij are independent and identically distributed as Normal with mean 0 and
variance σ².
a. What is the conditional variance var(Yij |b0i, b1i)?
Solution: Conditional on b0i and b1i, every term except εij is a constant, so
var(Yij | b0i, b1i) = var(β0 + b0i + β1 tij + b1i tij + εij) = var(εij) = σ².
b. What is the marginal mean at day 5 (for someone from the population) ?
Solution: E(Yij) = β0 + 5β1. The random effects have mean 0, so they drop out of the marginal mean.
c. What is the conditional mean at day 5 for i=2 (expected response for subject i=2)? What
about the conditional mean at day 5 for i=8?
Solution: E(Yij | b0i, b1i) = β0 + b0i + 5β1 + 5b1i.
For i=2 it is β0 + b02 + 5β1 + 5b12 and for i=8 it is β0 + b08 + 5β1 + 5b18.
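
To make the difference between the marginal and conditional means concrete, here is a small sketch. All numeric values (the fixed effects and the random effects for subjects 2 and 8) are hypothetical, since the problem gives none.

```python
beta0, beta1 = 10.0, 2.0  # hypothetical fixed effects

# Hypothetical random effects (b0i, b1i) for two subjects.
random_effects = {2: (1.5, -0.4), 8: (-0.8, 0.3)}

def marginal_mean(t):
    """Population-average mean: the random effects have mean 0 and drop out."""
    return beta0 + beta1 * t

def conditional_mean(i, t):
    """Subject-specific mean given that subject's random intercept and slope."""
    b0, b1 = random_effects[i]
    return beta0 + b0 + (beta1 + b1) * t

print(marginal_mean(5), conditional_mean(2, 5), conditional_mean(8, 5))
```

The marginal mean is the same for everyone at day 5, while each subject's conditional mean is shifted by b0i + 5·b1i.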
d. Say a researcher has a new subject (not in the study). They want to know the effect of time
on the response. Can this be answered with this model? Interpret the effect of a unit increase in
time on the response for this subject.
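Solution: Yes. The new subject's random effects are unknown but have mean 0, so we treat the subject
as a typical member of the population and use the fixed effect of time. For a new subject (one from
the population), as time increases by 1 unit, the expected response changes by β1.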
e. Say a researcher has a current subject from the dataset (say i=2). They want to know the
effect of time on the response. Interpret the effect of a unit increase in time on the response for this
subject.
Solution: For this specific subject, a 1 unit increase in time changes the expected response by
β1 + b12 (in general, β1 + b1i for subject i).
f. For subject i=2, what is the expected response value at time=0?
Solution: E(Y2j |b02) = β0 + b02.
g. Using the covariance matrix, what is the formal null hypothesis for testing if all the random
effects are equal to 0?
Solution: H0 : σ0² = σ1² = σ01 = 0. This would imply that the random effects equal 0 with
probability 1.
h. Now you want to obtain inference on whether or not the random effects should be in the model.
Describe a way to conduct this.
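Solution: Fit both models (one with the random effects and one without) using REML and compare
their AIC values; this comparison is valid because the two models share the same fixed effects.
Alternatively, conduct a likelihood ratio test, where the null distribution of the test statistic is
a mixture of chi-squared distributions because the variances lie on the boundary of the parameter
space under H0, and obtain a p-value from that mixture.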
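
As a hedged sketch of how a mixture p-value is computed, the code below uses the 50:50 mixture of chi-squared distributions with 1 and 2 degrees of freedom. That mixture is the one commonly used when testing whether a random slope is needed in a model that already has a random intercept; testing both random effects at once, as in part g, generally leads to a different mixture. The test statistic value is hypothetical.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def chi2_sf_df2(x):
    """Upper-tail probability of a chi-squared variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

def mixture_p_value(lrt_stat):
    """p-value under an assumed 50:50 mixture of chi-squared(1) and chi-squared(2)."""
    return 0.5 * chi2_sf_df1(lrt_stat) + 0.5 * chi2_sf_df2(lrt_stat)

lrt_stat = 5.2  # hypothetical likelihood ratio statistic, not from the problem
p_mixture = mixture_p_value(lrt_stat)
p_naive = chi2_sf_df2(lrt_stat)  # what a plain chi-squared(2) reference would give
print(p_mixture, p_naive)
```

The mixture p-value is smaller than the plain chi-squared(2) p-value, which is why ignoring the boundary issue makes the test conservative.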
3. For this question, the dataset contains 878 subjects with 5 repeated measures each. Yij is
the i-th unit's j-th response, the subject's cholesterol level (mg/dL). There are 2
covariates, X1 and X2, where X1 is the subject's height (in meters) and X2 is the subject's weight
(in stones; 1 stone is about 6.35 kilograms). A linear mixed effects model is fit with a random
intercept and a random slope on X1.
Linear mixed-effects model fit by maximum likelihood
Random effects:
 Formula: ~1 + X1 | ID
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev  Corr
(Intercept)  25.94  (Intr)
X1            2.85  -0.78
Residual      4.26
Fixed effects: Y ~ 1 + X1 + X2
              Value  Std.Error    DF  t-value  p-value
(Intercept)  194.57       9.25  3511    21.13     0
X1           -24.25      12.42  3511    -1.95     0.06
X2             4.56       1.13  3511     4.02     0
 Correlation:
    (Intr)
X1  -0.90
> coef(model)
   (Intercept)      X1    X2
1       200.12  -22.32  4.56
2       194.01  -25.12  4.56
3       197.56  -23.31  4.56
a. Write out the fitted equation for the subject-specific (conditional) mean response. (Your answer
will have numbers and b̂'s.)
Solution: E(Yij | bi) = 194.57 − 24.25 X1ij + 4.56 X2ij + b̂0i + b̂1i X1ij.
b. Give an interpretation of the estimated coefficient on X1.
Solution: The fixed effect of X1 is −24.25. Holding weight (X2) fixed, a 1 meter increase in height
changes the marginal mean cholesterol level by −24.25 mg/dL; that is, for a general subject from the
population, the expected response decreases by 24.25 mg/dL.
c. Does the intercept for this model have any meaningful interpretation? Explain.
Solution: No. The intercept is the expected response for someone with height = weight = 0, which is
not a possible subject.
d. Write out the marginal mean response equation.
Solution: E(Yij) = 194.57 − 24.25 X1ij + 4.56 X2ij.
e. Use the R output of coef(model) to write out the conditional model for subject i=2.
Solution: E(Yij | i = 2) = 194.01 − 25.12 X1ij + 4.56 X2ij.
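
The link between the two sets of output can be checked by hand: for a model fit with nlme, coef(model) reports each subject's coefficients as the fixed effects plus that subject's estimated random effects, so subtracting the fixed effects recovers b̂0i and b̂1i. The sketch below does this arithmetic with the numbers from the output; the height and weight passed to `conditional_mean` are hypothetical.

```python
# Fixed effects and per-subject coefficients copied from the R output.
fixed = {"(Intercept)": 194.57, "X1": -24.25, "X2": 4.56}
coef = {
    1: {"(Intercept)": 200.12, "X1": -22.32, "X2": 4.56},
    2: {"(Intercept)": 194.01, "X1": -25.12, "X2": 4.56},
    3: {"(Intercept)": 197.56, "X1": -23.31, "X2": 4.56},
}

def estimated_random_effects(subject):
    """Return (b0_hat, b1_hat) as subject coefficient minus fixed effect."""
    c = coef[subject]
    b0_hat = round(c["(Intercept)"] - fixed["(Intercept)"], 2)
    b1_hat = round(c["X1"] - fixed["X1"], 2)
    return b0_hat, b1_hat

def conditional_mean(subject, x1, x2):
    """Subject-specific fitted mean at height x1 (meters) and weight x2 (stones)."""
    c = coef[subject]
    return c["(Intercept)"] + c["X1"] * x1 + c["X2"] * x2

b0_hat, b1_hat = estimated_random_effects(2)
# Hypothetical covariate values: 1.75 m tall, 11 stones.
fitted_2 = conditional_mean(2, 1.75, 11.0)
print(b0_hat, b1_hat, fitted_2)
```

X2 has no random slope, so its coefficient is 4.56 for every subject.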
4. Say a generalized linear mixed model is fit with the response Yij being a categorical yes/no
response and a single covariate X1. Specify a random intercept and slope for this model.
a. Write out the model that is to be estimated using proper link and notation.
Solution: log( pij / (1 − pij) ) = β0 + β1 X1ij + b0i + b1i X1ij.
b. Interpret the effect of X1 on the response. What are you assuming to make this interpretation?
Solution: We assume this subject is the average subject (with respect to the random effects), so
b0i = b1i = 0. For the average subject, a 1 unit increase in X1 multiplies the odds of a yes
response by e^β1.
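
The odds interpretation can be verified numerically. This is a minimal sketch with hypothetical coefficients (β0 = −1, β1 = 0.5) and hypothetical covariate values; it sets the random effects to 0 to represent the average subject.

```python
import math

beta0, beta1 = -1.0, 0.5  # hypothetical fixed effects on the logit scale

def prob_yes(x1, b0=0.0, b1=0.0):
    """P(Y = yes) given X1 and the subject's random intercept b0 and slope b1."""
    eta = beta0 + b0 + (beta1 + b1) * x1
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

# Average subject (b0 = b1 = 0): compare the odds at X1 = 3 with the odds at X1 = 2.
odds_ratio = odds(prob_yes(3.0)) / odds(prob_yes(2.0))
print(odds_ratio, math.exp(beta1))
```

For a subject with a nonzero random slope, the same calculation gives e^(β1 + b1i) instead, which is why the interpretation has to fix the random effects.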
c. Say you are only interested in making marginal effect interpretations. Write out the model you
are to estimate and state what method can be used to estimate the coefficients.
Solution: log( pij / (1 − pij) ) = β0 + β1 X1ij.
This is now a marginal model, which can be used to make inference on marginal means. We can
use generalized estimating equations (GEE) to obtain parameter estimates.