The Problem of Base Rate

The Story

You are a scientist looking for a gene associated with a deadly disease. Discovering this gene will guide further research on gene therapies and early interventions for carriers of that gene. You've narrowed your search down to 100 possible genes. This means that you have 100 hypotheses to check, and the base rate of true hypotheses is 1/100 = 0.01.

Your methods are not perfect, but they are "pretty good" by scientific standards. Your method has a power of 0.8, which means that if a gene is actually associated with the disease, your results will be positive 80% of the time. There is also the possibility of a false positive, which is when a false hypothesis nevertheless gives you a positive result. Following standard practice, your test is calibrated to have a false positive rate of 0.05, which means if a gene has no association with the disease, you will nevertheless get a positive result 5% of the time.

Get to Work

You can test each gene one at a time, or you can test them in batches. Your results will either come back positive or negative. Mark any genes you think are actually TRUE and associated with the disease. Then you can ask to check at the bottom, and we will let you know which of your selections are actually true.
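If you would rather see the exercise as code, here is a minimal sketch of one round of testing, using the numbers from the story (100 genes, one true association, power 0.8, false positive rate 0.05). The function name `run_screen`, its parameters, and the fixed random seed are assumptions made for illustration; this is not the code behind the game itself.

```python
import random


def run_screen(n_genes=100, n_true=1, power=0.8,
               false_positive_rate=0.05, seed=0):
    """Simulate testing every candidate gene once.

    Hypothetical helper: returns the set of truly associated genes
    and a list of booleans (True = positive result), one per gene.
    """
    rng = random.Random(seed)
    # Pick which genes are truly associated with the disease.
    true_genes = set(rng.sample(range(n_genes), n_true))
    results = []
    for gene in range(n_genes):
        # A true gene tests positive with probability `power`;
        # a false one tests positive with probability `false_positive_rate`.
        p_positive = power if gene in true_genes else false_positive_rate
        results.append(rng.random() < p_positive)
    return true_genes, results


true_genes, results = run_screen()
positives = [gene for gene, is_positive in enumerate(results) if is_positive]
print("Genes with a positive result:", positives)
print("Positives that are actually true:",
      [gene for gene in positives if gene in true_genes])
```

Try a few different values of `seed`. Each run is one possible version of your experiment, and the list of positives will usually include genes that have no real association with the disease.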

Up Next...

This exercise shows that separating true hypotheses from false ones is not as easy as simply conducting an experiment. Up next, we will discuss the very serious problem of the base rate in greater detail. And after that, we'll play another game to see that things might not be as dire as they seem.
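To see why, you can work out what to expect on average from the numbers in the story. The sketch below is only an illustration: the function name `expected_outcomes` and its parameter names are made up here, and it assumes every one of the 100 genes is tested exactly once.

```python
def expected_outcomes(n_hypotheses=100, base_rate=0.01,
                      power=0.8, false_positive_rate=0.05):
    """Expected true positives, false positives, and the share of
    positive results that are real (illustrative helper)."""
    n_true = n_hypotheses * base_rate       # hypotheses that are actually true
    n_false = n_hypotheses - n_true         # hypotheses that are actually false
    true_positives = n_true * power
    false_positives = n_false * false_positive_rate
    share_real = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, share_real


tp, fp, share = expected_outcomes()
print(f"Expected true positives:  {tp:.2f}")
print(f"Expected false positives: {fp:.2f}")
print(f"Share of positives that are real: {share:.3f}")
```

With the story's numbers, you expect about 0.8 true positives against about 4.95 false positives, so only around 14% of your positive results point to a real association.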


Test each hypothetical gene

Base rate: 0.01   Power: 0.8   False positive rate: 0.05
Novel hypotheses: 100   Expected to be TRUE: 1
