EMET8002 Case Studies in Applied Economic Analysis and Econometrics
Computer Lab in Week 3
Question 1: Simple Linear Regression
Download the “states” data from Wattle and open it in Stata. In this question we explore the relationship between SAT
(Scholastic Assessment Test) scores and per-pupil expenditure in primary and secondary school at the state level in the U.S.
(a) Describe the variables of interest (the SAT score, coded as “csat”, and education expense, coded as “expense”) individually,
then report their correlation and produce a scatterplot. Are there any outliers?
(b) Run a simple linear regression model where “csat” is the dependent (outcome)
variable and “expense” is the independent (explanatory) variable. Do this with and without accounting for outliers. What
changes? Which model do you prefer?
(c) Test whether the residuals from your regressions in part (b) follow a normal distribution. Does the normality
assumption hold?
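The lab itself is carried out in Stata, so the following is only a language-neutral sketch of the mechanics behind parts (b) and (c), written in plain Python. It fits a simple regression with and without one suspected outlier and then applies a Jarque-Bera test to the residuals. Several things here are assumptions: the numbers are made up and are not the “states” data, dropping the observation is just one possible way of accounting for an outlier, and Jarque-Bera is used only because it fits in a few lines (it is not necessarily the test your Stata output will report). The function names are hypothetical.

```python
import math

def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def residuals(x, y, intercept, slope):
    """Observed minus fitted values."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def jarque_bera(res):
    """Jarque-Bera statistic and its asymptotic p-value (chi-square, 2 df).

    For a chi-square with 2 degrees of freedom the upper tail is exp(-x/2).
    In small samples this p-value is only a rough approximation.
    """
    n = len(res)
    m = sum(res) / n
    m2 = sum((r - m) ** 2 for r in res) / n
    m3 = sum((r - m) ** 3 for r in res) / n
    m4 = sum((r - m) ** 4 for r in res) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    return jb, math.exp(-jb / 2)

# Hypothetical data (NOT the "states" file); the last point plays the outlier.
expense = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 9.0]
csat = [1010.0, 1000.0, 985.0, 975.0, 960.0, 955.0, 940.0, 1050.0]

a_all, b_all = ols_fit(expense, csat)               # all observations
a_trim, b_trim = ols_fit(expense[:-1], csat[:-1])   # outlier dropped

for label, xs, ys, a, b in [
    ("all observations", expense, csat, a_all, b_all),
    ("outlier dropped", expense[:-1], csat[:-1], a_trim, b_trim),
]:
    jb, p = jarque_bera(residuals(xs, ys, a, b))
    print(f"{label}: intercept={a:.1f}, slope={b:.2f}, JB={jb:.2f}, p={p:.3f}")
```

In this toy example a single extreme observation is enough to change the sign of the slope, which is the kind of change part (b) asks you to look for in the real data.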
Question 2: Multiple Linear Regression and Quantile Regression
We continue working with the “states” dataset. In this question we explore the relationship between SAT (Scholastic
Assessment Test) scores and the following four variables: (1) per-pupil expenditure in primary and secondary school (“expense”),
(2) % of high school graduates taking the SAT (“percent”), (3) median household income in $1,000 (“income”) and (4) % of adults
with a college degree (“college”). The data are provided at the state level for the U.S.
(a) Describe the five variables of interest individually as well as their correlations.
(b) Run a multiple linear regression model where “csat” is the dependent (outcome)
variable and the other four variables are the independent (explanatory) variables.
(c) Test whether the residuals from your regression in part (b) follow a normal distribution. Does the normality
assumption hold?
(d) Instead of running a multiple linear regression, which models the conditional mean of test scores as in part (b), run quantile
regressions to estimate the conditional median, the 10th percentile (0.1 quantile) and the 90th percentile (0.9 quantile) of test
scores. Use the same dependent and independent variables as in your model from part (b).
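To make the idea behind part (d) concrete, here is a minimal Python sketch of quantile regression with a single regressor. It minimises the check (pinball) loss by brute force, using the fact that with an intercept and one regressor some optimal line passes through two of the data points. This is an illustration under stated assumptions, not what Stata does internally: the function names are hypothetical, the data are made up, and the model in this question has four regressors rather than one.

```python
from itertools import combinations

def check_loss(u, tau):
    """Check (pinball) loss of one residual u at quantile tau."""
    return u * tau if u >= 0 else u * (tau - 1)

def quantile_line(x, y, tau):
    """Brute-force quantile regression of y on a constant and one regressor x.

    Tries every line through two data points with distinct x values and
    keeps the one with the smallest total check loss. Feasible only for
    small samples, since the number of candidate lines grows with n^2.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # vertical line, not a valid regression line
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum(
            check_loss(yk - (intercept + slope * xk), tau)
            for xk, yk in zip(x, y)
        )
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two observations with distinct x")
    return best[1], best[2]

# Hypothetical data (NOT the "states" file).
expense = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 9.0]
csat = [1010.0, 1000.0, 985.0, 975.0, 960.0, 955.0, 940.0, 1050.0]

for tau in (0.1, 0.5, 0.9):
    a, b = quantile_line(expense, csat, tau)
    print(f"tau={tau}: intercept={a:.1f}, slope={b:.2f}")
```

Comparing the slopes across the three values of tau shows whether the regressor appears to matter differently in the lower and upper parts of the score distribution, which is the comparison part (d) invites for the real data.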
Question 3: Preparation for the Research Report [not required for problem set]
Last week we discussed some aspects of the research report (worth 45% of your final mark) and we now continue the preparation
for the report as well as the research proposal. We strongly recommend starting your work on the project as soon as possible.
(a) Have a look at the section with the research report on Wattle and discuss the structure of the final research report.
(b) What data are required for replicating the papers? What are the data sources? If you
need to apply for the data through the Australian Data Archive, we recommend starting the process now.
(c) As part of the project you are required to replicate and extend one of the papers. First of all, explain in your own words what is
meant by replicating the main findings of a paper.
(d) Now explain in your own words what is meant by extending the results of a paper.
(e) As an example, consider the possible extension of updating the data (e.g., using new waves of data). Discuss some ideas for how
this could be turned into a research question and backed up with economic theory and/or academic literature.