POLS 6482 ADVANCED MULTIVARIATE STATISTICS
Eleventh Assignment
Due 19 November 2001
In this problem we are going to apply probit, logit, and
linear probability to some data from the 1968, 1996, and 2000 NES Presidential
election surveys. These estimation methods are designed for binary
dependent variables. The 1968 and 1996 data are similar to the data we used in part (1)
of the 2nd assignment. The variables are:
Party Identification: 0=strong democrat
1=weak democrat
2=lean democrat
3=independent
4=lean republican
5=weak republican
6=strong republican
Family Income: Raw Data (we will not use this variable)
Family Income Quintile: 1 is the lowest quintile, 5 is the highest
Race: 0 = White
1 = Black
Sex: 0 = Male
1 = Female
South: 0 = North
1 = South
Education: 1 = High School or less
2 = Some College
3 = College degree
Age: In Years
Presidential Vote: 0 = Did Not Vote
1 = Voted for Democratic Candidate For President
2 = Voted for Republican Candidate For President
3 = Voted for 3rd Party Candidate for President
The data are in three text files:
1968 Data
1996 Data
2000 Data
Download these three files and load them into EVIEWS and Stata. Turn in the results of the
d and summ commands for all three datasets.
In Stata, a binary dependent variable is
always defined as 0 being the "negative" outcome with all other nonmissing values
being the "positive" outcome. Use Presidential Vote
as a dependent variable with the remaining variables as independent variables; that is,
run the following model on all three election years:
probit voted party income race sex south education age
You can interpret the probit coefficients roughly the same way that you interpret the regular
multiple regression coefficients. A positive b_j means that the independent variable increases
the probability of a "positive" outcome and decreases the probability of a "negative" outcome.
Compare the results for all three elections. What is your interpretation of the coefficients
(what do they tell you about American politics)? Be specific.
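The Stata coding rule described above (0 is the "negative" outcome, every other nonmissing value is "positive") can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this assignment; the function name stata_binary and the use of None for missing data are assumptions made for the example, not part of Stata. Applied to Presidential Vote, the rule turns the probit into a model of voting versus not voting.

```python
from typing import Optional


def stata_binary(value: Optional[int]) -> Optional[int]:
    """Collapse a coded value to 0/1 following the rule stated in the text.

    0 stays 0 ("negative"); any other nonmissing value becomes 1 ("positive").
    None stands in for missing data here (an assumption of this sketch).
    """
    if value is None:
        return None
    return 0 if value == 0 else 1


# Presidential Vote codes from the variable list: 0 = did not vote,
# 1 = Democratic, 2 = Republican, 3 = 3rd party; None = missing.
vote_codes = [0, 1, 2, 3, None]
turnout = [stata_binary(v) for v in vote_codes]
print(turnout)  # [0, 1, 1, 1, None]
```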
In EVIEWS, a binary dependent variable is
always defined as 0 being the "negative" outcome and 1
being the "positive" outcome. Create a dependent variable from
Presidential Vote
where 0 = Voted for the Democratic Party Candidate and 1 = Voted for Republican Party
Candidate (note that non-voters and 3rd party voters are missing data!).
Run the following logit model on all three election years:
logit y c party income race sex south education age
You can interpret the logit coefficients roughly the same way that you interpret the regular
multiple regression coefficients. A positive b_j means that the independent variable increases
the probability of a "positive" outcome and decreases the probability of a "negative" outcome.
Compare the results for all three elections. What is your interpretation of the coefficients
(what do they tell you about American politics)? Be specific.
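The recoding of Presidential Vote asked for in this part can be sketched as follows. This is a Python illustration of the mapping only, not EVIEWS syntax; the function name two_party_vote and the use of None for missing data are assumptions made for the example.

```python
def two_party_vote(vote):
    """Recode Presidential Vote into the two-party dependent variable.

    1 (Democratic) -> 0, 2 (Republican) -> 1.
    Non-voters (0), 3rd party voters (3), and already-missing values
    become None, which stands in for missing data in this sketch.
    """
    if vote == 1:
        return 0
    if vote == 2:
        return 1
    return None


raw_votes = [0, 1, 2, 3, None, 2, 1]
y = [two_party_vote(v) for v in raw_votes]
print(y)  # [None, 0, 1, None, None, 1, 0]
```

Observations recoded to missing drop out of the estimation, so the logit sample is smaller than the sample used when the dependent variable is voted versus did not vote.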
Linear probability is simply regular regression applied to a binary dependent variable, with
the White standard error correction. Replicate the estimations of part (c) using linear
probability in EVIEWS. To compare the logit and linear probability coefficients, normalize the
b_j's (except for b_0) so that their sum of squares is equal to 1. That is, square each of the
k b_j's, add the squares up, take the square root of this sum, and divide each of the b_j's by
this number. Make a table showing these normalized b_j's (except for the intercept term) and
their p-values for the two models.
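The normalization just described can be checked with a short Python sketch. It assumes the coefficients are supplied as a list with the intercept b_0 first; the function name normalize_slopes and the example numbers are made up for illustration and do not come from any of the three datasets.

```python
import math


def normalize_slopes(coefs):
    """Scale the slope coefficients so that their squares sum to 1.

    coefs: [b_0, b_1, ..., b_k] with the intercept first (an assumption
    of this sketch). The intercept is excluded from the calculation.
    """
    slopes = coefs[1:]
    length = math.sqrt(sum(b * b for b in slopes))
    return [b / length for b in slopes]


# Made-up coefficients: intercept 0.5, slopes 3.0 and -4.0.
example = [0.5, 3.0, -4.0]
normalized = normalize_slopes(example)
print(normalized)                        # [0.6, -0.8]
print(sum(b * b for b in normalized))    # approximately 1
```

Because every slope is divided by the same positive number, the signs and relative sizes of the coefficients are unchanged, which is what makes the logit and linear probability columns comparable.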
In this problem we are going to apply ordered probit to the three
Presidential election datasets. An ordered probit
estimation is designed for a dependent variable with multiple categories where it is reasonable
to assume that the categories can be rank ordered. For example, the party variable ranges
from 0 to 6 where 0 = strong democrat, 1 = weak democrat, 2 = lean democrat,
3 = independent, 4 = lean republican, 5 = weak republican, 6 = strong republican.
Run the standard regression of party on income, race, sex, south, education, and age in
both Stata and EVIEWS.
In EVIEWS, to run an ordered probit issue the command:
ordered party income race sex south education age
In EVIEWS, to run an ordered logit issue the command:
ordered(d=L) party income race sex south education age
In Stata, to run an ordered probit issue the command:
oprobit party income race sex south education age
In Stata, to run an ordered logit issue the command:
ologit party income race sex south education age
Note the absence of C in the EVIEWS ordered probit and logit commands. As I
will explain in class, the intercept term is picked up by the estimation of the cutting points
on the latent dimension of the dependent variable. In EVIEWS if you issue the command:
ordered party c income race sex south education age
you will get exactly the same answer.
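One way to see why the intercept is picked up by the cutting points is a small numerical check. The Python sketch below assumes the usual ordered probit form, in which the probability of category j is the standard normal CDF evaluated at (upper cutting point minus the linear index) minus the CDF at (lower cutting point minus the linear index). The function name category_probs and all of the numbers are hypothetical; whether this matches the in-class presentation exactly is an assumption.

```python
from statistics import NormalDist


def category_probs(xb, cuts):
    """Category probabilities for an ordered probit.

    xb:   value of the linear index for one observation
    cuts: cutting points in increasing order
    Returns one probability per category (len(cuts) + 1 of them).
    """
    cdf = NormalDist().cdf
    bounds = [0.0] + [cdf(c - xb) for c in cuts] + [1.0]
    return [bounds[j + 1] - bounds[j] for j in range(len(bounds) - 1)]


cuts = [-1.0, 0.0, 1.5]   # hypothetical cutting points (4 categories)
xb = 0.4                  # hypothetical linear index without an intercept
shift = 2.0               # hypothetical intercept

p_without = category_probs(xb, cuts)
# Adding an intercept and moving every cutting point by the same amount
# leaves each (cutting point - index) difference unchanged.
p_with = category_probs(xb + shift, [c + shift for c in cuts])

largest_gap = max(abs(a - b) for a, b in zip(p_without, p_with))
print(largest_gap < 1e-12)  # True
```

Only the differences between the cutting points and the linear index enter the probabilities, so an intercept and a common shift of the cutting points cannot be told apart.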
Make a table for each election showing the normalized coefficients
and their P-Values for all three models -- regular regression, ordered probit, and
ordered logit.
EVIEWS has two nice tables that you can produce from the ordered probit results. Under
the View button on the ordered probit results window you will find two options --
Dependent Variable Frequencies and Expectation-Prediction Table. The former is simply a
table of the frequencies and is self-explanatory. The latter contains the predicted
categories (3rd column in the table) for the dependent variable. Interpret the results shown
in the Expectation-Prediction Tables corresponding to the three ordered probit estimations.
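To help with reading the predicted categories, here is a minimal Python sketch of how a predicted category can be formed from an ordered probit. It assumes that the predicted category for an observation is the one with the highest fitted probability; check the EVIEWS documentation for the rule it actually uses. The function names and the cutting points are hypothetical and are not estimates from the NES data.

```python
from statistics import NormalDist


def category_probs(xb, cuts):
    """Ordered probit category probabilities for linear index xb."""
    cdf = NormalDist().cdf
    bounds = [0.0] + [cdf(c - xb) for c in cuts] + [1.0]
    return [bounds[j + 1] - bounds[j] for j in range(len(bounds) - 1)]


def predicted_category(xb, cuts):
    """Index of the most probable category (assumed prediction rule)."""
    probs = category_probs(xb, cuts)
    return max(range(len(probs)), key=probs.__getitem__)


# Six hypothetical cutting points give seven categories, like party (0 to 6).
cuts = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0]

for xb in (-3.0, 0.25, 3.0):
    probs = category_probs(xb, cuts)
    print(xb, predicted_category(xb, cuts), [round(p, 3) for p in probs])
```

With closely spaced cutting points the two open-ended end categories tend to win the comparison, so under this rule the interior categories may be predicted rarely or never. That is one thing to look for when interpreting the tables.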