Question 1
1,200-word limit
(a) [25%]
Discuss an empirical example (different from the examples in lectures and seminars) of a
linear regression model with endogeneity. Write the regression equation and provide
details on the dependent and explanatory variables and on the interpretation of the
coefficients. Explain the cause of endogeneity and why the ordinary least squares
estimation would be inconsistent. More points will be given to realistic empirical
examples with more than one explanatory variable and with an appropriate choice of
explanatory variables.
(b) [25%]
Explain how you would estimate your model in (1.a) using a two-stage least squares
estimation and define the instrument(s) you would use. Provide details on how the
two-stage least squares estimation is computed. Write the formula for the two-stage
least squares estimation considering the model defined in point (1.a).
(c) [15%]
Explain what assumptions your instrumental variable(s) must satisfy to produce a
consistent estimation of the model defined in point (1.a). Show that the instrumental
variable estimation defined in (1.b) is consistent under these assumptions.
(d) [15%]
Explain how you would test for the validity of your instrumental variable(s). Explain
also how you would test for whether there is an endogeneity issue in your model.
Provide details on how you would perform these tests using the model defined in
point (1.a).
(e) [20%]
Discuss potential drawbacks of the instrumental variable(s) you proposed in point
(1.b). Discuss also the consequences of using an instrument whose effect on
the endogenous variable, conditional on the remaining control variables, is not
highly statistically significant.
Question 2
1,200-word limit
(a) [30%]
Discuss an empirical example (different from the examples in lectures and seminars) of a
panel data model where you would use a fixed effect estimation rather than a random
effect estimation. Write the regression equation and provide details on the dependent
and explanatory variables, on the error term and on the interpretation of the
coefficients. Explain how you would compute the fixed effect estimation using your
defined model.
(b) [35%]
Explain what the unobserved individual effects in the model defined in (2.a) capture.
Explain the differences in the assumptions needed for the consistency of the fixed
effect estimation and of the random effect estimation for the model you discussed in
(2.a). Why is the fixed effect estimation more appropriate than the random effect
estimation in the empirical example you discussed in point (2.a)? Explain how you
would perform a test to decide whether to adopt a random effect or a fixed effect
estimation.
(c) [35%]
Discuss an empirical example of a panel data model where one of the explanatory
variables is endogenous because it is correlated with unobserved variables that are
relevant to explain both the dependent variable and the endogenous variable. Explain
under which conditions the fixed effect estimation can solve such an issue of
endogeneity. Explain which type of estimation you would adopt to solve the
endogeneity issue if the conditions for the consistency of the fixed effect estimation
were not satisfied.
Question 3
1,200-word limit
(a) [35%]
Suppose that a researcher has information on the type of health insurance for a
random sample of individuals. The researcher observes a categorical variable, insure,
taking value 1 for individuals who choose an indemnity plan (a fee-for-service
insurance), 2 for individuals who choose a prepaid plan (a fixed up-front payment with
unlimited use) and 3 for individuals who are uninsured (uninsure). The researcher
wants to analyse the demographic factors linked with the three choices of insurance
and observes for each individual the following variables: age in years; male, which is a
dummy variable taking value 1 for men and 0 otherwise; and nonwhite, which is a
dummy variable taking value 1 for people who are not of white ethnicity and 0
otherwise. Provide an interpretation of the results that are reported in Table 3.1
below. Write down the model that the researcher has estimated. Explain how the
estimation has been computed.