The New York Times reported that the Internal Revenue Service conducted one of the strictest types of audits of James B. Comey, the former FBI director, and Andrew G. McCabe, his former deputy.
This raised a lot of perfectly reasonable questions, most of them variants of: What are the odds? As the article notes, the chances of two high-profile political foes of President Donald J. Trump being audited by pure coincidence are minimal.
But minimal is not zero.
If we were to believe this was a coincidence, how improbable would we say it is? Here we try to evaluate this probability as seriously as possible.
First, the facts: Both men were selected for National Research Program (NRP) audits, a small portion of all the audits the IRS conducts each year. These audits scrutinize a sample of returns to gather data on tax compliance.
According to the IRS, there were about 5,000 such audits in 2017, 4,000 in 2018 and 8,000 in 2019, selected from about 154 million individual tax returns each year. Mr. Comey’s audit was of his 2017 tax return; Mr. McCabe’s was of his 2019 return.
Many aspects of the NRP complicate our calculations, including the sampling methodology of IRS auditors and the different years of the audits themselves. We will return to these questions later. For now, we will assume that all taxpayers have an equal chance of being audited and that both men were audited in 2017.
If this problem appeared in a probability textbook, it might read like this:
If there are 154 million marbles (the approximate number of tax returns filed each year) in a giant urn, and a small number of them are red (those representing Mr. Comey and Mr. McCabe among them), what are the chances that you will draw two or more red marbles if you randomly draw a few thousand from the urn (the number of audits in that year)?
It might sound complicated, but it’s a relatively well-studied problem, something that many math or statistics majors would encounter in their college coursework. Statisticians have already derived formulas for these probabilities, with names like the hypergeometric distribution, which has applications such as election auditing and card counting.
We can simply plug in our estimates of the total number of marbles, the number of red marbles and the number of draws, and we will get a probability. If we believe that there are only two red marbles (that is, if we limit the exercise to just Mr. McCabe and Mr. Comey), this equation gives a probability of roughly one in 950 million.
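The calculation can be sketched in a few lines of Python. This is a minimal illustration under the simplifying assumptions stated above (154 million returns, about 5,000 audits, every return equally likely to be drawn); the function and parameter names are ours, and this is not necessarily the exact method behind the published figure.

```python
from fractions import Fraction

def prob_at_least_two(total: int, red: int, draws: int) -> Fraction:
    """Exact P(at least two red marbles drawn), drawing without replacement.

    total: marbles in the urn; red: red marbles; draws: marbles drawn.
    All names here are illustrative, not taken from any published code.
    """
    # P(no red marble drawn): each red marble must fall outside the drawn set.
    p0 = Fraction(1)
    for i in range(red):
        p0 *= Fraction(total - draws - i, total - i)
    # P(exactly one red marble drawn), using the hypergeometric ratio
    # P(1) / P(0) = draws * red / (total - draws - red + 1).
    p1 = p0 * Fraction(draws * red, total - draws - red + 1)
    return 1 - p0 - p1

# Assumed inputs: 154 million returns, two "red" taxpayers, 5,000 audits.
p = prob_at_least_two(total=154_000_000, red=2, draws=5_000)
print(f"about 1 in {round(1 / p):,}")
```

With these inputs the result lands close to the one-in-950-million figure. Exact fractions are used instead of floating-point numbers because the answer is so small that subtracting two values near 1 would otherwise lose precision.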
These are significantly worse odds than your odds of winning the Powerball. This is also an almost meaningless result. At best, it’s the right answer to the wrong question.
To understand why, it is necessary to acknowledge the absurdity inherent in our exercise: To best estimate the probability of an improbable event, we must set aside the fact that we know it has already happened. (The probability of something that has already happened is 100 percent.)
Jordan Ellenberg, a professor at the University of Wisconsin who has written books on mathematics and reasoning, describes it this way: “In some comparable universe, what is the probability that this thing that has already happened in our universe would happen?”
It may seem strange, but the same problems arise even in probability exercises as elementary as flipping a coin.
If you flipped a coin 20 times in a row, your particular sequence of heads and tails is extremely rare, about one in a million, but it has happened. And some sequence of flips will always occur. It is a surprising coincidence only if this is the sequence you set out to get before flipping.
Likewise, it is wrong to narrow our search to just Mr. Comey and Mr. McCabe, because we would likely be exploring these probabilities just the same had we learned that two other prominent political enemies of the administration were audited instead of those two men.
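The "one in a million" figure is simple arithmetic, sketched here in Python for a fair coin (the variable names are ours):

```python
flips = 20
sequences = 2 ** flips        # distinct heads/tails sequences of 20 flips
p_specific = 1 / sequences    # chance of one particular sequence named in advance

print(f"1 in {sequences:,}")  # prints: 1 in 1,048,576
```

Every one of those sequences is equally unlikely, which is why seeing one of them is unremarkable unless it was predicted beforehand.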
A better question is: What is the probability that two or more people like Mr. Comey and Mr. McCabe will be audited during that period?
Should this group of people include two senior FBI officials? Any two senior DOJ officials? This framing—a subjective rather than a factual decision—is what determines any probability estimate more than any choice of statistical distribution or sampling weights.
Here is a chart of the probability that our equation gives under different choices for the number of red marbles, ranging from two (Mr. Comey and Mr. McCabe and no one else) to 400 (a conservative estimate of the number of Americans Mr. Trump has insulted by name on Twitter since the start of his run for president).
The probability increases dramatically with the choice of who is considered a red marble along with Mr. Comey and Mr. McCabe.
The point is not to decide on a number, but to recognize that our choice of group size is what drives our response. While some assumptions are certainly better than others, many choices are defensible.
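To see how strongly the group size drives the answer, here is a rough Python sketch that repeats the single-year urn calculation for several group sizes. The sizes in the loop are illustrative picks of ours rather than the values behind the chart, and the inputs are the same simplifying assumptions as before (154 million returns, 5,000 audits, equal chances for every return).

```python
from fractions import Fraction

TOTAL_RETURNS = 154_000_000  # approximate returns filed per year
AUDITS = 5_000               # approximate NRP audits in 2017

def prob_at_least_two(red: int, total: int = TOTAL_RETURNS,
                      draws: int = AUDITS) -> float:
    """P(at least two of `red` marked returns are among `draws` audits)."""
    # P(none drawn), computed exactly to avoid round-off in the subtraction.
    p0 = Fraction(1)
    for i in range(red):
        p0 *= Fraction(total - draws - i, total - i)
    # P(exactly one drawn), from the hypergeometric ratio P(1) / P(0).
    p1 = p0 * Fraction(draws * red, total - draws - red + 1)
    return float(1 - p0 - p1)

# Hypothetical group sizes, chosen only to show the trend.
for group_size in (2, 10, 50, 100, 400):
    p = prob_at_least_two(group_size)
    print(f"{group_size:>4} people: about 1 in {round(1 / p):,}")
```

The probability grows roughly with the square of the group size, so a tenfold larger group makes the coincidence about a hundred times more likely.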
Turning to the details
Now let’s try to narrow it down to something a little more realistic and go back to some of the things we overlooked in our simple interpretation of this problem.
First, the two men were not audited for the same year. Expanding our scope to cover the three-year period from 2017 to 2019 increases the resulting probabilities significantly. This makes intuitive sense: if a person has a certain chance of being audited in a given year, more years means more opportunities to be audited.
Second, we are only interested in the probability that at least two people are selected. We will not consider the probability of the same person being selected twice; that seems unlikely, given that the audits could last more than a year, according to Mr. Comey’s account. Note that we are looking at the probability that at least two people are selected, not exactly two, since it would also be significant if three or more people were selected from a group.
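One way to fold these two adjustments into the calculation is sketched below in Python. It is an approximation of ours, not necessarily the method used for the published estimates: it treats each person as independently having a chance of being audited at least once across the three years, which also sidesteps counting the same person twice. The audit counts are the approximate IRS figures cited earlier.

```python
from fractions import Fraction

TOTAL_RETURNS = 154_000_000
AUDITS_BY_YEAR = {2017: 5_000, 2018: 4_000, 2019: 8_000}  # approximate counts

def prob_audited_at_least_once() -> Fraction:
    """Chance that one given taxpayer is picked in at least one of the years."""
    p_never = Fraction(1)
    for audits in AUDITS_BY_YEAR.values():
        p_never *= 1 - Fraction(audits, TOTAL_RETURNS)
    return 1 - p_never

def prob_at_least_two_people(group_size: int) -> float:
    """P(at least two distinct group members are audited), assuming independence."""
    q = prob_audited_at_least_once()
    p_none = (1 - q) ** group_size
    p_one = group_size * q * (1 - q) ** (group_size - 1)
    return float(1 - p_none - p_one)

p = prob_at_least_two_people(2)
print(f"group of 2: about 1 in {round(1 / p):,}")
```

For a group of two, this reduces to the square of the per-person probability, which is noticeably larger than the single-year figure because each person now has three chances to be selected.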
Finally, the IRS does not select people in a truly random fashion. Instead, the agency tends to single out certain types of taxpayers, including high-income earners, more often than others. For the 2001 tax year, the NRP sample included returns from people around the 90th percentile of income at about 1.7 times the rate one would expect if the returns were selected regardless of income. That rate climbs sharply at the top of the income distribution, so that people with incomes in the top 0.5 percent are more than 10 times as likely to be in the sample as people closer to the median income.
We can probably assume that any group of enemies of Mr. Trump would earn more than a random sample of Americans. But we cannot realistically estimate the full income of everyone in our group for each year. We also know that the IRS considered other factors in its sampling, such as the type of returns taxpayers file, and that sampling methods can change from year to year. This leaves us with little guidance on how to replicate the IRS’s methods. As such, we will leave our estimates unweighted by income. As a back-of-the-envelope exercise, if you are concerned about how income affects these results, you can double the resulting probability if you think the members of a group have very high incomes, and multiply it by 10 if you think they are unusually wealthy.
Putting it all together
Incorporating these choices, the table below provides some approximate probabilities depending on the size of the group being considered.
Alternatively, if our choices aren’t satisfactory, we’ve created a simple calculator to compute your own probabilities:
So which estimate is “correct”?
Most realistic outcomes of this equation could accurately be described as “very rare” or even “extremely rare,” but neither is evidence of wrongdoing.
“It’s a bit like the irresistible force and the immovable object,” said Andrew Gelman, a professor of statistics and political science at Columbia University, when told in the abstract about this exercise. “On the one hand, you say it’s completely random. On the other hand, you suspect it isn’t.”
Mr. Gelman, like every other statistician who spoke to The Times about the issue, said the biggest hurdle is not the details but the definition of the problem itself.
When we try to calculate the probability of an event because we suspect it might not be random, we find ourselves in the difficult position of trying to imagine how we would predict the probability of the event before it happens, said David Spiegelhalter. He heads the Winton Center for Risk and Evidence Communication at the University of Cambridge, an organization dedicated to improving the way quantitative evidence is used in society.
The math is easy, he said, but phrasing the question is difficult, bordering on “nonsense,” in large part because of how difficult it is to define the group of interest.
“‘What’s the chance of that happening?’ is easy to say,” he said. “This is a familiar statement. But it’s actually a very difficult question to answer.”
Mathematics has its limits. The point of trying to estimate a probability like this, Mr. Gelman said, is not to put too much stock in the numbers, but to let the result deepen your understanding.
In this case, the best question is not one with an answer you can look up in a statistics textbook.
Instead, Mr. Gelman said, the question to ask is, “What’s going on?”
Matthew Cullen contributed reporting.