If You're Cheating On Your Taxes...

States and the IRS are mining personal and business data to sniff out scofflaws

The Texas Comptroller's Office suspected for years that well-heeled Lone Star citizens were buying big-ticket private planes out of state to dodge sales taxes. But the tax collector couldn't prove it. Then the agency installed new computer technology that matched federal airplane registrations with state tax records. In just the past six months, Texas has collected $5 million in unpaid taxes from 43 scofflaws.

As tax season nears its Apr. 15 peak, revenue agencies are reaching for a software tool kit that has long been popular in Silicon Valley and in the back offices of big retailers. A combination of advanced data mining programs and vast repositories called data warehouses is allowing the taxman in about a dozen states to gather and analyze unprecedented heaps of information about individuals and businesses, especially small companies.


These states, and to a lesser extent the Internal Revenue Service, increasingly rely on such software to help capture a chunk of the more than $350 billion in annual taxes that are owed but never paid. California alone has used such systems to identify 600,000 non-filers and collect an extra $184 million annually. "Business has been using this for years," says Massachusetts Revenue Commissioner Alan LeBovidge. "It allows us to sort data that is beyond human ability to sort manually."

Until recently, the only data revenue agencies gathered on most taxpayers came from the returns they filed. And while other information was theoretically available, few agencies ever found it because they were saddled with creaky computer systems and a shortage of staffers with the necessary technical skills. "We would try running programs on a mainframe," says Lisa McCormack, area manager for the Texas Comptroller's Audit Div., "but it took forever."

For now, most state data mining programs simply gather information from other government agencies, such as U.S. Customs and Border Protection and state motor vehicle and employment offices. Some states also tap commercial sources, such as infoUSA Inc. They then screen the data to see whether a taxpayer's income and spending patterns match what is reported on returns.

But a few states, such as Texas, are building more sophisticated data mining programs that will predict taxpayer behavior, much as credit-card companies try to estimate how much consumers will spend over the course of a year. "The capability is there to figure out which taxpayers have the highest probability of becoming noncompliant," says Steven E. Taylor, director of the revenue and compliance team of the data warehousing firm Teradata, a unit of NCR Corp. (NCR).

Iowa, Massachusetts, and Virginia are also in the data mining vanguard. Typically, each state revenue agency will work with one data management company and its subcontractors, drawing from a list that includes Teradata, Revenue Solutions Inc. of Pembroke, Mass., and CGI Group Inc. (GIB) of Montreal. The data miners can construct powerful programs that assign each taxpayer the equivalent of a credit score, flagging those who should be targeted for an audit. They can project who is likely to file on time, who won't pay until they get a visit from a collection agent, and even who is likely to declare bankruptcy before paying their taxes.

For years data mining was too expensive and complicated for states to undertake. But costs have come down, and the processing and storage capacity of the hardware is much greater. At the same time, the latest programs allow users to search multiple databases without having to move massive amounts of information from one computer to another.

The programs work like this: A tax agency may decide to search state employment records to learn how many workers a pizza restaurant has hired. It then compares the restaurant's tax return information with that of other, similar-size pizza parlors in the same Zip Code. From that comparison, the software can figure out that the shop ought to be reporting, say, $500,000 in sales. If it is not, the business may be an audit candidate.
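The peer comparison described above can be sketched in a few lines of Python. This is only an illustration, not any state's actual system: the record fields (`employees`, `sales`), the function names, the use of median sales per employee as the benchmark, and the 50% tolerance are all assumptions made for the example.

```python
from statistics import median


def expected_sales(shop, peers):
    """Estimate a shop's sales from its peers' median sales per employee.

    `peers` stands in for similar-size pizza parlors in the same Zip Code
    (a hypothetical data set, not a real agency feed).
    """
    per_employee = median(
        p["sales"] / p["employees"] for p in peers if p["employees"] > 0
    )
    return per_employee * shop["employees"]


def audit_candidate(shop, peers, tolerance=0.5):
    """Flag a shop whose reported sales fall far below the peer estimate.

    `tolerance` is an assumed cutoff: a shop reporting less than that
    fraction of its expected sales is flagged for review.
    """
    expected = expected_sales(shop, peers)
    return shop["sales"] < tolerance * expected, expected


# Hypothetical peers: employee counts from employment records,
# sales from filed returns.
peers = [
    {"employees": 8, "sales": 400_000},
    {"employees": 10, "sales": 520_000},
    {"employees": 12, "sales": 576_000},
]

# A 10-employee shop that reports only $200,000 in sales.
shop = {"employees": 10, "sales": 200_000}

flagged, expected = audit_candidate(shop, peers)
print(flagged, expected)
```

Here the peers' median works out to $50,000 in sales per employee, so a 10-employee shop would be expected to report about $500,000. A flag like this only marks a return for a closer look; it does not establish that anyone underreported.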

The analysis can go even deeper. It can match sales-tax payments from the restaurant with the personal tax return of the owner. It can also check state motor vehicle registrations to see what cars the pizza guy owns. If the pieces don't add up, the auditors may pay him a visit.

So far, states have avoided routinely searching bank and credit-card databases, fearing a backlash by taxpayers angry at government rooting through their financial records. But eventually, tax agencies may begin to comb through widely available commercial information.

Business purchasing records may be their first stop. "For the next generation, we'll be able to see how many pizza boxes you order," says LeBovidge. "If someone orders 50,000 boxes and says he only sold 3,000 pizzas, they better be able to show me where the other 47,000 boxes went."


The IRS has fallen behind the state agencies, although it has used some data mining for specific projects. For instance, in 2003 it hired an outside vendor to scrutinize information on 4,000 credit-card accounts to determine whether people were using the plastic to hide income they were stashing offshore. But IRS officials say the agency is not routinely matching tax information with data from other government sources. The IRS, in fact, is just beginning to tap state tax information.

The blossoming of data mining in tax offices has many privacy experts on edge. "This can be more like Big Brother than legitimate tax collection," says Marc Rotenberg, executive director of the Washington (D.C.)-based Electronic Privacy Information Center. "There has to be oversight."

To calm those concerns, states insist they have built numerous safeguards to protect the detailed personal information they mine. State employees and private vendors are barred from disclosing the data, contractors cannot resell or reuse the info in any way, and taxpayer information is electronically tagged, so anyone who taps into it leaves a record. Even so, politicians and voters must eventually decide how much intrusion they're willing to live with so that individuals and businesses pay what they owe.

By Howard Gleckman
