How Big Data Could Help Identify the Next Felon -- Or Blame the Wrong Guy
Think of it as big data meets "Minority Report."
While working as the chief privacy officer at Intelius, an online provider of background checks, Jim Adler created software that demonstrates how just a few details about a person could be used to estimate the chances of someone committing a felony. Accurately, he says.
That may sound like science fiction, like the Tom Cruise movie in which people are arrested before their crimes happen. But early forms of this type of predictive policing can be attempted today because of the enormous amounts of digital data on individuals being collected and analyzed.
To test his computer program, Adler began with tens of thousands of criminal records owned by Intelius and focused only on a few details about each person, including gender, eye and skin color, the number of traffic tickets and minor offenses, and whether the individual has tattoos. Based on that data, and excluding any information about a felony conviction, he said his algorithm determined with reasonable accuracy whether a person had committed a serious crime.
While Adler acknowledges there was "sample bias" in the data and that his program is "not ready for prime time," he said a bigger sample with more historical information about individuals could be used to create a felon predictor -- software that gives the statistical likelihood of someone committing a serious crime in the future. Scores could even be assigned to individuals.
Adler, who has testified before Congress and the Federal Trade Commission on big data and privacy issues, created the program to show both the potential benefits of using big data to stop trouble before it happens and the possible dangers of going too far with predictive technologies.
"It's important that geeks and suits and wonks get together and talk about these things," said Adler, who recently left Intelius on good terms and is now a vice president at Metanautix, a data analytics startup. "Because geeks like me can do stuff like this, we can make stuff work - it's not our job to figure out if it's right or not. We often don't know."
This type of predictive policing is already causing alarm for some civil libertarians.
"When we start using data to make decisions that imprison people and execute people and impact their freedom, that is a reason to be enormously careful," said Jules Polonetsky, executive director of the Future of Privacy Forum, a Washington-based group. "Every individual has the free will to not be a criminal, despite what the statistics said yesterday."
Government use of big data is a growing concern as law enforcement and intelligence agencies amass more personal information from e-mail providers, social networks and financial institutions. Efforts such as the National Security Agency's vast electronic spying operation are fueling fears that expanding dossiers of personal data are being used to profile innocent people. Separately, accusations of profiling were in the news this week when a federal judge ruled that the New York Police Department's stop-and-frisk program unlawfully targets minorities, a ruling the city plans to appeal.
Still, some law enforcement agencies say they're finding success with predictive software. Los Angeles has recorded declines in property crimes, while Memphis, Tennessee, has seen a drop in robberies, burglaries and rapes with the help of these programs, which analyze broad patterns to help police identify hot spots for trouble.
Adler's program goes deeper, he says, by pinpointing specific people. He does this by drawing upon a source of data most researchers don't have access to -- Intelius's trove of records on 630 million criminal cases and 40 million defendants in the U.S.
To create his software, Adler examined court records dating back to the early 1980s of everyone who had brushes with the law in Kentucky and whose information was in Intelius's database. The dataset, which he chose because it was small and easier to work with, included people who didn't have felonies but did have traffic tickets and misdemeanors.
Certain features correlated highly with having a felony record: being male, having hazel eyes, having minor offenses beyond traffic tickets, and having tattoos. Another was being light-skinned, a result that may partly reflect the state's demographics: 89 percent of Kentucky's residents and 74 percent of its inmates are white. Adler wanted to see how many felons he could identify if the fact of the felony record was hidden.
The accuracy of the software depends on the number of false positives one is willing to tolerate, a range that Adler calls the "anarchy to tyranny" spectrum. At its most aggressive, his program can correctly identify all 51,246 felons while misidentifying 2,220 non-felons, numbers an iron-fisted ruler could live with. At a more lenient setting, it can correctly identify 37,842 felons while misidentifying 152 non-felons -- a smaller number of false positives, but still far from perfect.
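To make that trade-off concrete, the short Python sketch below works through the arithmetic using only the counts quoted above. It is an illustration, not Adler's software: the function name `summarize` and the labels "aggressive" and "lenient" are made up for this example, and it assumes that 51,246 is the total number of felons in the Kentucky dataset, as the "all 51,246 felons" figure suggests. The article does not say how many non-felons were in the data, so a false-positive rate cannot be computed here.

```python
# Counts quoted in the article; assumed to cover the whole Kentucky dataset.
TOTAL_FELONS = 51_246


def summarize(felons_flagged, non_felons_flagged, total_felons=TOTAL_FELONS):
    """Summarize one setting on the "anarchy to tyranny" spectrum.

    share_of_felons_found: fraction of all felons the setting flags (recall).
    share_of_flags_correct: fraction of flagged people who are felons (precision).
    """
    flagged = felons_flagged + non_felons_flagged
    return {
        "share_of_felons_found": felons_flagged / total_felons,
        "share_of_flags_correct": felons_flagged / flagged,
        "felons_missed": total_felons - felons_flagged,
        "non_felons_flagged": non_felons_flagged,
    }


# Hypothetical labels for the two settings described in the article.
aggressive = summarize(felons_flagged=51_246, non_felons_flagged=2_220)
lenient = summarize(felons_flagged=37_842, non_felons_flagged=152)

for name, result in (("aggressive", aggressive), ("lenient", lenient)):
    print(
        f"{name}: found {result['share_of_felons_found']:.1%} of felons, "
        f"{result['share_of_flags_correct']:.1%} of flags correct, "
        f"missed {result['felons_missed']:,} felons, "
        f"wrongly flagged {result['non_felons_flagged']:,} non-felons"
    )
```

By this arithmetic, the aggressive setting finds every felon, but about 4 percent of the people it flags are not felons. The lenient setting wrongly flags far fewer people, with about 99.6 percent of its flags correct, but misses 13,404 felons, roughly a quarter of the total.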
"If we can do this smarter and faster and the right way, then certainly, especially when it revolves around gun violence and gang violence, it could be a useful tool," said Bruce Ferrell, a retired homicide and gang investigator with the Omaha Police Department and now president of the National Alliance of Gang Investigators' Associations.
The real test of Adler's algorithm will be when it's used with other states' data and modified to include their defendants' information. He doesn't plan to do that since he no longer has access to Intelius's data.
The use of physical characteristics such as hair, eye and skin color to predict future crimes would raise "giant red privacy flags" since they are a proxy for race and could reinforce discriminatory practices in hiring, lending or law enforcement, said Chi Chi Wu, staff attorney at the National Consumer Law Center.
And even though Adler's software is described as a research project, Wu says it isn't farfetched for the background-check industry, which has more than 3,600 companies, to begin offering this type of service.
"These companies are always looking for new markets and they're very competitive," she said. "It wouldn't be surprising if one of them tried to sell this as an add-on product."
First Advantage, which performs more than 23 million background checks annually and is owned by Palo Alto, California-based private equity firm Symphony Technology Group, declined to comment. Altegrity, the Falls Church, Virginia-based company that provides more than 12 million checks per year through its HireRight and Kroll divisions, also did not comment.
"This is where this kind of technology is going," said Adler. "This is dangerous territory, where false positives lead to tyrannical government abuse of the innocent. But the allure is that such technology might tip off the authorities to the likes of Snowden, the Boston bombers or the mass murderers that hit Newtown and Aurora."
Driven by security fears and an improving economy, the background-check industry is expected to grow to $4.5 billion by 2018, according to researcher IBISWorld. Background checks in the U.S. are covered under the Fair Credit Reporting Act, so employers would have to notify job applicants if they were rejected because of a predictive model.
Adler’s work grew out of a new service that Intelius’s parent company, Inome Inc., has developed to match its public records with private databases maintained by banks, retailers, political candidates, police departments and other entities for risk assessment and other predictive uses. There are no plans to use felony prediction, but elements of the concept are being incorporated as part of the service, said Niraj Shah, the company’s co-founder.
Adler’s research convinced executives of the untapped power of using predictive techniques on demographic data, Shah said.
"If you can start to predict intent - for crime, for retail, for politics, for business, for commerce - it’s powerful," Shah said in an interview. "There’s so much opportunity when you know one’s intent."