Management students entering my thesis prep course without prior research
experience or a probability course reliably make these mistakes. Many of them
go on to do empirical quantitative theses, so their misconceptions about
sampling and analysis will come back to bite them.
For the
benefit of future classes, and of alumni who need reminders:
1. Random does not mean
haphazard or accidental. Asked what ‘random’
means, students reply ‘no pattern,’ or something of that nature. Your sample
ain’t random just because you say it’s random! Show how you’ve taken pains to
ensure the sample meets the true definition of randomness, namely that each
member of the studied population has the same probability of being selected for
the sample.
o If the sample departs from randomness, show why the departure is small,
and not material to answering the research question.
o Never speak of a ‘random sample of experts’: You choose experts for their
special expertise, not for their typicality. Expert panels must be chosen
expertly, not randomly. Their responses may be subject to descriptive
statistics, but never to inferential statistics.
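To make #1 concrete, here is a minimal Python sketch (the population of
customer IDs is hypothetical) contrasting a true simple random sample – every
member has the same 300/10,000 chance of selection – with the convenience
sample students often pass off as random:

    import random

    # Hypothetical sampling frame: a list of 10,000 customer IDs.
    population = [f"customer_{i}" for i in range(10_000)]

    # Simple random sample: random.sample draws without replacement,
    # uniformly, so every member has the same 300/10,000 probability
    # of being selected.
    random.seed(42)  # record the seed so the draw is reproducible
    srs = random.sample(population, k=300)

    # A convenience sample -- the first 300 names on the list -- is NOT
    # random, even if the list order 'looks' patternless to you.
    convenience = population[:300]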
2. Specify the
population. Oh, how students forget that statistical
inference is reasoning from a sample to a specified population. Called in as a
late replacement for a candidate’s dissertation examiner, I listened to his
oral defense. He showed a sophisticated analysis and a big n, and claimed
support for his hypothesis at a high level of significance. I asked, ‘What is
the population?’ He replied, ‘Huh?’
o A sample must be drawn from a population. If you can’t say what the
population is, your sample is crap. Then the dissertation is crap.
3. And specify the
sampling frame and the sampling plan. If the
population is ‘shoppers at Horton Plaza Mall in San Diego,’ the frame might be
‘people entering Horton Plaza through the south doors between 11 a.m. and 6 p.m.
next Tuesday and next Saturday.’ The plan is ‘Interview every 25th
person to enter, with at least 300 interviews to be completed.’
o You give your undergrad assistant your clipboarded questionnaire, along
with the sampling plan. Face it: If the 25th person is a smelly guy
with a beer gut, your assistant will give him a pass and instead interview #26,
who is an attractive member of the opposite sex. Thus violating randomness.
o Solution: Send pairs of assistants. One will do the interviewing while
the other enforces the sampling plan.
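A plan like ‘interview every 25th person’ is systematic sampling, and it is
easy to state precisely in code – which is exactly what you want your
assistants held to. A minimal sketch, with a hypothetical stream of entrants
standing in for the mall doors:

    import random

    def systematic_sample(stream, interval=25):
        """Yield every interval-th arrival, starting at a random offset.

        The random offset matters: always starting at person 1 lets any
        periodic pattern in arrivals (tour groups, shift changes) bias
        the sample.
        """
        start = random.randrange(interval)
        for i, person in enumerate(stream):
            if i % interval == start:
                yield person

    # Hypothetical stream of mall entrants, for illustration.
    entrants = (f"person_{i}" for i in range(10_000))
    interviewees = list(systematic_sample(entrants))
    print(len(interviewees))  # 10,000 / 25 = 400 interview targets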
4. Don’t confuse a
census with a sample. If you measure every
member of the population, you’ve done a census, not a sample. No statistical
inference is needed, or appropriate. You’re not making inference from sample to
population, because you’ve ‘sampled’ the entire population!
o One student measured every country in his well-specified population of
interest. He then performed statistical tests. ‘Why?’ I asked. ‘Well,’ he said,
‘it wouldn’t look like much of a dissertation if I didn’t have statistical
tests.’ BRRRP! Buzzer! Wrong answer!
5. Don’t forget
nonresponse bias. You invited 1000 people
(randomly selected from a specified population, natch) to your surveymonkey.com
site. 700 of them completed the questionnaire. Congratulations, this is
actually a very good response rate for management research. However… Why do you
believe the responses of the 300 non-responders – if they had responded – would
have been similar to the responses of the 700? A passable dissertation must
include an answer to this. There are techniques for estimating nonresponse
bias; one common one, wave analysis, is sketched below.
o One student got a response rate like this, questioning businesspeople in
his country about their interest in interacting with foreign businesspeople.
Doesn’t it stand to reason, his examiners asked, that people not interested in
foreigners would not be interested in a questionnaire about their interest in
foreigners? Is this not a red flag for nonresponse bias?
o The student got a ‘conditional pass’ and was required to come back later
with written estimates of nonresponse bias and its impact on his results.
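The wave analysis mentioned above treats late, hard-to-reach responders as a
stand-in for the non-responders: if their answers drift away from the early
responders’, extrapolation to the silent invitees is suspect. A minimal
sketch, with simulated scores in place of real survey data:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    # Hypothetical 1-7 interest scores: 500 early responders vs. 200 who
    # answered only after repeated reminders (the hardest to reach).
    early = rng.normal(5.2, 1.0, size=500)
    late = rng.normal(4.6, 1.0, size=200)

    # If late responders differ from early ones, extrapolating the
    # results to the 300 silent invitees is risky.
    t, p = stats.ttest_ind(early, late, equal_var=False)
    print(f"early mean={early.mean():.2f}, "
          f"late mean={late.mean():.2f}, p={p:.4f}")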
6. Too many hypotheses? Testing many hypotheses on one too-small sample will almost guarantee at
least one false positive or false negative. As a rule of thumb, you need a
perfectly random sample of n=30 – with perfect controls on non-treatment
effects – to get a good estimate of just one parameter.
o If you must test multiple hypotheses on multiple quantities, scale up
your n accordingly.
o There’s no hard and fast rule. But you will need a bigger sample than you
think you need.
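A thirty-second simulation shows why. In the sketch below every null
hypothesis is true by construction, yet running twenty tests at α = 0.05 on
samples of n = 30 produces at least one spurious ‘significant’ result in
roughly two-thirds of experiments:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    alpha, n_tests, n_experiments = 0.05, 20, 2_000

    runs_with_false_positive = 0
    for _ in range(n_experiments):
        # 20 one-sample t-tests where the true mean really is 0,
        # so every rejection is a false positive.
        pvals = [stats.ttest_1samp(rng.normal(0, 1, size=30), 0).pvalue
                 for _ in range(n_tests)]
        if min(pvals) < alpha:
            runs_with_false_positive += 1

    # Expected rate: 1 - 0.95**20, about 0.64.
    print(runs_with_false_positive / n_experiments)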
7. Assuming normality?
Take care. There are two issues here: Normality of
the quantity you’re measuring, and normality of measurement errors. There are
formal tests for normality (Shapiro–Wilk, for example); use them.
o The 2008 world financial crash occurred (among other reasons) because
experienced financial managers believed returns on assets would follow a
Gaussian distribution. In fact, a longer-tailed distribution (e.g., a power
law) was needed to capture the true probability of an extreme event of the
kind that did, in fact, happen. For more on this, see The Black Swan.
o Your research model is y = e^x + ε. You’re more comfortable with linear
regression, so you transform it to log_e y = x + ε’. Classical regression
inference depends on Gaussian-distributed error terms. Is it ε or ε’ that’s
normally distributed? Or neither? (It almost surely won’t be both.) Find out
before you regress!
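A quick simulation makes the point. In this sketch (simulated data, assuming
the true model really is y = e^x + ε with Gaussian ε), the transformed error
ε’ typically fails a Shapiro–Wilk normality test even though ε passes:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    x = rng.uniform(1, 3, size=200)
    eps = rng.normal(0, 0.25, size=200)

    # True model: y = e^x + eps, with Gaussian eps.
    y = np.exp(x) + eps

    # After transforming to log(y) = x + eps', the implied error is:
    eps_prime = np.log(y) - x

    # Shapiro-Wilk: a small p-value rejects normality.
    print("eps :", stats.shapiro(eps).pvalue)        # large: Gaussian
    print("eps':", stats.shapiro(eps_prime).pvalue)  # typically tiny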
8. Justify your use of a
statistical test by understanding its mathematical
foundation and its match to your research question. Students, and some
published scholars, tend to wave away this step with a magic gesture, saying
‘So-and-so (1998) used this test in a similar situation.’ Citing a prior use of
the test is not a categorical no-no. But it’s better to show you know why it was used.
o I guarantee you, Professor So-and-so’s research question was quite
different from yours. Just because his referees saw the logic of using the test
doesn’t mean yours will.
9. The matter of
replicability. Is your 90% confidence level ‘in
principle’ only, or possibly relevant in practice? Compare your management
research to that of a medical investigator in a mouse lab: That investigator
knows mouse physiology will not suddenly change tomorrow. Tomorrow’s repetition
of the mouse experiment will show a result much the same as today’s. The management
researcher knows the business environment will
change tomorrow. The prospect of replicating your management study on many
independent samples under the same conditions is nil.
o If an answer to your research question is likely to be very ephemeral,
don’t do the study.
§ A possible exception: If you’re studying an important one-of-a-kind
event, for example a nation’s adoption of the Euro, and its effect on consumer
prices or attitudes.
o If you think you’ve revealed a management principle of lasting value, say
why you think so. Many extra points if you can convince the reader of the
study’s practical replicability.
10. The dangers of SEM,
multi-level models, etc. The worst reason in
the world to do something is just because
you can. These highly complex statistical procedures are possible only
because of the power of today’s computers. That alone is no reason to use them;
you’ll need a much better reason. SEM, factor analysis and the like require
judgment on the researcher’s part. (They are not just plug-in formulas.) Can a
novice researcher exercise that judgment?
o You write a thesis to show you can do a supervised research project – not
to display your virtuosity.
o Using a technique without showing complete understanding of it (see #8
above) won’t persuade your examiners to award the degree you seek.
11. Non-sampling errors will almost always be much bigger than sampling errors.
Professors teach statistical math because it’s easier than teaching research
logic. Sampling error – the uncertainty that comes from observing a sample
rather than the whole population – is what the significance level or p-value
of your test quantifies. Errors that you make in formulating an unambiguous
research question and measurable hypotheses, framing your study, interpreting
results, and so on, as well as the errors described in #1 through #4 above,
are non-sampling errors.
o Avoiding them requires at least as much attention as doing the tests
properly – often more.
12. The p-value is not the probability of H0 being true. Or false. The
hypothesis statement is a matter of fact, not of probability. Either it is
true or it isn’t, out there in the real world. It makes no sense to speak of the
probability of it being true. Though they shouldn’t have, the statistical
forefathers used ‘p’ sometimes to denote a probability, and sometimes to denote
a quantile of an error distribution. They thought you could keep the two
straight. Don’t prove them wrong!
o Ditto for the significance level α: it is not the probability that the
hypothesis is true or false. (Strictly, α is the Type I error rate – the
chance of wrongly rejecting a true H0 – not a statement about the truth of the
hypothesis itself.)
Modern
statistical inference is one of the top intellectual achievements of the 20th
century, and one of history’s greatest advances in applied epistemology.
However, deciding a hypothesis at the 95% level (which is almost impossible in
management studies anyway) only means, roughly, that if you repeated the
experiment on 100 independent samples, the procedure would lead you to the
correct decision in approximately 95 of them.
In other words, you
still don’t know whether the hypothesis is true or not. All you have done
is quantify your confidence in its truth or falsity. Interpret your data with
the appropriate modesty and do not use the word ‘proved.’
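To see that frequentist reading in action, a closing sketch with simulated
data: a 95% confidence procedure for a mean, repeated on 10,000 independent
samples, brackets the true value about 95% of the time – yet on any single
sample you still cannot tell whether yours is one of the unlucky 5%.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    true_mean, n, reps = 10.0, 30, 10_000

    covered = 0
    for _ in range(reps):
        sample = rng.normal(true_mean, 2.0, size=n)
        # 95% t-interval for the mean, from this one sample.
        lo, hi = stats.t.interval(0.95, df=n - 1,
                                  loc=sample.mean(),
                                  scale=stats.sem(sample))
        covered += lo <= true_mean <= hi

    print(covered / reps)  # approximately 0.95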