Moving Towards Multivariate Analysis
In this assignment, you are looking at three variables at a time – what is called multivariate analysis.
Multivariate analysis helps us explore why a relationship exists.
For example, let’s say I take a random selection of 1100 people. If I measure each person’s height in inches and I give each person the same vocabulary test (number of words out of 100 correctly defined), I would find a strong correlation between the two. Taller people, on average, know more words than shorter people. Why should this be so?
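Answering that question invokes theory. Theory is an interpretation of data. In this case, it is an interpretation of why taller people know more words than shorter people. Our theory might be:
1. Height directly causes vocabulary knowledge.
2. Height indirectly causes vocabulary knowledge by way of some other variable.
3. Height does not cause vocabulary knowledge at all; some other variable causes both.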
An example of 1 would probably have to be physiological. You might theorize that the same biological processes that account for growth account for brain development, thus leading taller people to be capable of learning language qualitatively better than shorter people. If this theory is correct, then we should see the same relationship whether we look at men or women, baby boomers or millennials, Democrats or Republicans, and so on.
An example of 2 might suggest that height opens up opportunities for education that then create imbalances in vocabulary in the population. Maybe we, as a society, tend to give taller people more chances at a lot of different educational opportunities – scholarships, honors classes, encouragement to pursue a graduate degree, etc. Since it is reasonable to assume that increased formal education increases one’s vocabulary, we would be theorizing that education has a direct effect on vocabulary and that height has an indirect effect on vocabulary through education. If this is correct, then when we look just at people who got graduate degrees, we would expect height and vocabulary to be unrelated. When we look just at people who didn’t get any formal education, we would likewise expect height and vocabulary to be unrelated. In other words, controlling for education, I would expect height and vocabulary to be unrelated. I am theorizing that height causes vocabulary knowledge only indirectly through education.
An example of 3 would be thinking that the relationship between height and vocabulary is misleading and purely (or mostly) the product of another cause of both of these variables. For example, age is definitely related to both height and vocabulary. People tend to grow taller from birth to age 20 or so, and they tend to acquire more and more vocabulary during this time as well. Maybe shorter people know fewer words because shorter people are kids and kids know relatively fewer words. If we divide our sample by age, do short 40-year-olds know fewer words than tall 40-year-olds? Taking age into account probably eliminates the relationship between height and vocabulary knowledge. The bivariate relationship is misleading, what many people call “spurious”, because there is no way in which height causes vocabulary. It’s an illusion created by the omission of an important factor related to both height (x) and vocabulary score (y).
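If it helps to see the age example with numbers, here is a minimal sketch in Python. Everything in it is invented for illustration: the growth and vocabulary formulas, the age range of 2 to 20, the noise levels, and the function names `pearson` and `simulate_spurious` are my assumptions, not real data. The simulation is built so that age drives both height and vocabulary while height has no effect on vocabulary at all. It then compares the overall correlation with the correlation inside each single-year age group, which is what controlling for age means here.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_spurious(n=1100, seed=0):
    """Invented data: age causes height and vocabulary; height never enters the vocabulary formula."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        age = rng.randint(2, 20)                      # years (assumed range)
        height = 30 + 2.0 * age + rng.gauss(0, 3)     # inches, made-up growth rule
        vocab = 5 + 4.0 * age + rng.gauss(0, 8)       # words out of 100, made-up rule
        vocab = min(100, max(0, vocab))               # keep the score on the 0-100 scale
        people.append((age, height, vocab))
    return people


people = simulate_spurious()

# Bivariate relationship: height vs. vocabulary, ignoring age
overall_r = pearson([p[1] for p in people], [p[2] for p in people])

# Controlling for age: the same correlation inside each age group
within_r = {}
for age in range(2, 21):
    group = [p for p in people if p[0] == age]
    within_r[age] = pearson([p[1] for p in group], [p[2] for p in group])
mean_within_r = sum(within_r.values()) / len(within_r)

print(f"overall r(height, vocab)       = {overall_r:.2f}")
print(f"mean r within single-year ages = {mean_within_r:.2f}")
```

With these made-up settings, the overall correlation should come out strongly positive while the within-age correlations hover around zero. That is the signature of a spurious relationship: it shows up when the third variable is ignored and fades once that variable is held constant.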
You all found a relationship between opinions about race in America and evaluations of how Trump has influenced race relations. But why does this relationship exist?
Did pre-existing opinions about race in America determine how people were going to react to Trump?
Or are both sets of opinions a product of general political ideology?
Or maybe both are a product of political party loyalty?
These are the sorts of questions I’d like you to explore in this paper.
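For survey items like these, controlling for a third variable usually means running the same two-way comparison separately inside each category of the control variable. Here is a rough sketch of that logic in Python. The counts, the category labels ("problem", "no_problem", "worse", "not_worse"), and the helper name `pct_worse` are all hypothetical. I made them up purely to show the mechanics, and they say nothing about what the real dataset contains or what you will find.

```python
from collections import Counter

# Invented counts: (party, race_view, trump_eval) -> number of respondents
counts = Counter({
    ("Democrat", "problem", "worse"): 160,
    ("Democrat", "problem", "not_worse"): 40,
    ("Democrat", "no_problem", "worse"): 40,
    ("Democrat", "no_problem", "not_worse"): 10,
    ("Republican", "problem", "worse"): 10,
    ("Republican", "problem", "not_worse"): 40,
    ("Republican", "no_problem", "worse"): 40,
    ("Republican", "no_problem", "not_worse"): 160,
})


def pct_worse(counts, race_view, party=None):
    """Percent answering 'worse' among people with this race_view, optionally within one party."""
    rows = {
        key: n for key, n in counts.items()
        if key[1] == race_view and (party is None or key[0] == party)
    }
    total = sum(rows.values())
    worse = sum(n for key, n in rows.items() if key[2] == "worse")
    return 100 * worse / total


# Bivariate relationship: gap in percent "worse" between the two race_view groups
overall_gap = pct_worse(counts, "problem") - pct_worse(counts, "no_problem")

# Same gap, computed separately within each party (controlling for party)
party_gaps = {
    party: pct_worse(counts, "problem", party) - pct_worse(counts, "no_problem", party)
    for party in ("Democrat", "Republican")
}

print(f"overall gap: {overall_gap:.0f} points")
for party, gap in party_gaps.items():
    print(f"gap among {party}s: {gap:.0f} points")
```

In this toy table the overall gap is 36 percentage points, but it drops to 0 within each party, which is the pattern you would expect if party loyalty produced both sets of opinions. If the gap stayed about the same within each party, party would not be the explanation.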
The dataset contains the demographic variables listed on the following pages.
The list is long. However, depending on what you find, your final write-up may not need to discuss more than one of these demographic variables.