Unreliable Data Can Threaten Democracy

Keep an eye on President Trump's management of the 2020 Census.

Data analysis is playing an increasing role in the U.S. electoral system, raising an important question as the Trump administration prepares to oversee the 2020 Census: What if the data aren't reliable?

Across the country, political operatives have been hard at work finding new ways to use data to gain an advantage. In North Carolina, as the Washington Post reported, staffers of the Republican-controlled state government employed breakdowns by race of early voting behavior and ID preferences to craft laws that seemed guaranteed to reduce black representation. Luckily, a federal judge looked at the same data and prevented the state from enforcing the laws.

Data analysis has also become an integral part of gerrymandering -- a practice named for then-Massachusetts Governor Elbridge Gerry, who in 1812 signed off on a redrawn state voting map that concentrated his opponents' supporters in a way that allowed his party to win most districts (one district famously took on the shape of a salamander). Nowadays, practitioners often use census data on race as a proxy for party affiliation, since minorities tend to vote Democratic.

Computers aren’t yet drawing the district lines. Rather, as gerrymandering expert and Tufts math professor Moon Duchin puts it, political operatives “use the computer to help them rig it. Call it computer-aided gerrymandering.”

Data analysis has been useful in combating such practices, by demonstrating that a given map is a “statistical outlier” in some way. If, for example, a proposed map has many fewer districts dominated by minorities -- or many more dominated by Republicans -- than a computer-made “best possible” map, a judge can reject the redistricting, as happened recently in Wisconsin.
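To make the “statistical outlier” idea concrete, here is a minimal sketch of one way such a comparison could work: count the minority-dominated districts in many computer-drawn maps, then ask how often those maps have as few as the proposed one. The function name and all the numbers are hypothetical, chosen only for illustration, and real court analyses rely on far larger sets of maps and more careful statistics.

```python
def share_at_or_below(proposed_count, simulated_counts):
    """Fraction of simulated maps with a district count at or below the proposed map's."""
    at_or_below = sum(1 for count in simulated_counts if count <= proposed_count)
    return at_or_below / len(simulated_counts)


# Hypothetical data: minority-dominated districts in ten computer-drawn maps.
simulated_counts = [4, 5, 5, 6, 4, 5, 6, 5, 4, 5]

# Hypothetical proposed map with only two such districts.
proposed_count = 2

share = share_at_or_below(proposed_count, simulated_counts)
print(f"{share:.0%} of simulated maps have as few minority-dominated districts")
```

A share near zero means almost no computer-drawn map looks like the proposed one, which is the kind of evidence a judge might weigh.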

But what happens if certain groups, such as minorities, are systematically miscounted? That's a legitimate concern as the country approaches its next decennial census, which will be used to decide such important issues as how many representatives individual states get in Congress.

The census is done in three stages. First, the bureau sends forms to every household in America. Second, people go to randomly chosen households to see how closely they can verify the information collected (the answer is always “somewhat,” with the larger discrepancies tending to appear in minority and poor neighborhoods). Third, statisticians use the discrepancies to make corrections to overall estimates of populations and other variables.
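The third stage can be illustrated with a deliberately simplified sketch: compare what the forms reported with what the house-to-house check found in the sampled households, and scale the form-based total by that ratio. The function name and every figure below are hypothetical, and the bureau's actual statistical methods are considerably more involved than a single ratio.

```python
def corrected_total(form_total, sample_form_count, sample_followup_count):
    """Scale a form-based total by the ratio found in the follow-up sample."""
    correction_factor = sample_followup_count / sample_form_count
    return form_total * correction_factor


# Hypothetical neighborhoods: (people counted on forms overall,
# people on forms in sampled households, people found there in person).
neighborhoods = {
    "neighborhood_a": (100_000, 950, 1_000),    # forms missed some residents
    "neighborhood_b": (200_000, 1_000, 1_000),  # forms matched the follow-up
}

for name, (form_total, on_forms, in_person) in neighborhoods.items():
    estimate = corrected_total(form_total, on_forms, in_person)
    print(f"{name}: forms say {form_total}, corrected estimate {round(estimate)}")
```

The smaller the follow-up sample, the noisier the ratio becomes, which is why the amount of house-to-house checking matters for the final numbers.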

Problem is, the census appears to be going through a crisis. The bureau's director recently announced his resignation amid concerns about inadequate funding. This is troubling, because statisticians won’t have good data to make their corrections unless there's sufficient money to do enough house-to-house sampling. Moreover, the quality of the census depends crucially on following norms and having respect for expertise, neither of which seems guaranteed under President Trump.

It’s possible that Republican self-interest will prevail. Although the party might benefit from undercounting minorities on a local level, it also has a big incentive to ensure that certain states with rapidly growing minority populations -- such as Texas -- are fully represented in Congress. Still, given what's at stake, it's a process that needs watching.

This column does not necessarily reflect the opinion of the editorial board or Bloomberg LP and its owners.

    To contact the author of this story:
    Cathy O'Neil at

    To contact the editor responsible for this story:
    Mark Whitehouse at