On the Internet, Nobody Knows You’re a Robot

CSIdentity’s artificial intelligence program extracts data from hackers

Hackers have proven they can crack just about any computer network, from Sony’s to Citigroup’s. Afterward, they face another challenge: unloading the virtual booty. They often take stolen credit-card numbers, online banking credentials, e-mail logins, and Social Security numbers to a sprawling network of underground chat rooms and invitation-only forums, where such data are bought and sold. Law enforcement investigators hoping to catch the crooks lurk there as well, but with hacking incidents on the rise, the problem is far too big to police by traditional means.

Enter the robot informant. CSIdentity, a security firm in Austin, Tex., has created artificial-intelligence software capable of posing as a hacker and engaging ne’er-do-wells in the underground forums. Its goal is to solicit stolen data (a hacker hoping to fence 1,000 credit-card numbers will offer dozens for free to prove they’re real) and send the samples back to flesh-and-blood investigators. CSIdentity sells the data it collects to banks, cybersecurity companies, and others who have a stake in quickly discovering which businesses, accounts, and credit cards have been compromised. “Very often we are able to notify our customers that something is wrong before their bank” does, says Scott Mitic, chief executive officer of TrustedID, an identity-theft protection company that purchases CSIdentity’s data.

To design the chatbots, CSIdentity’s 10-person analyst team studied the dialog in hacker chat rooms, looking for patterns in the interactions, says Joe C. Ross, the company’s president. The hacker argot is filled with slang: A newly stolen credit card is “fresh,” and a “fullz” is a credit-card record that includes the victim’s personal data and the card’s three-digit security code along with the number. To keep up with the fast-moving conversation, the virtual informants use a technique from computer science known as fuzzy pattern recognition, which helps the bots make sense of terms and phrases that can be expressed in different ways. (When a hacker threatens to “doss that server,” the machine knows he means a distributed denial of service attack, a common way to shut down websites.)
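The article does not say how CSIdentity implements its fuzzy pattern recognition, so the following is only a rough sketch of the general idea: mapping a misspelled or variant slang term to a known one by string similarity. It swaps in Python's standard `difflib` module for whatever the company actually uses; the glossary contents, the function names `normalize` and `interpret`, and the 0.7 similarity cutoff are all assumptions made for illustration.

```python
import difflib

# Hypothetical glossary (illustrative only): canonical slang -> meaning.
GLOSSARY = {
    "ddos": "distributed denial-of-service attack",
    "fullz": "card record with personal data and security code",
    "fresh": "newly stolen credit card",
}


def normalize(token, cutoff=0.7):
    """Return the closest glossary term for a token, or None if nothing is close."""
    matches = difflib.get_close_matches(
        token.lower(), list(GLOSSARY), n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


def interpret(message):
    """Map each recognizable word in a chat message to its glossary meaning."""
    found = {}
    for token in message.split():
        word = token.strip(".,!?\"'")
        term = normalize(word)
        if term is not None:
            found[word] = GLOSSARY[term]
    return found


if __name__ == "__main__":
    print(interpret("gonna doss that server"))
```

In this sketch, "doss" is close enough to "ddos" to be recognized, while ordinary words such as "server" fall below the cutoff and are ignored. A real system would presumably need a far larger vocabulary and some handling of multi-word phrases.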

Ross says the hunt has become a cat-and-mouse game. Hackers know to look for dialog that seems more algorithm than human, and they employ tricks to ferret out the bots. One common tactic is to order the chat room members to log out and reconvene in a different room. If the chatbots have trouble understanding the commands, they will end up in an empty room.

The bots are helped by the fact that many hackers are non-native English speakers and more forgiving of an odd-sounding statement here and there, says Ross. And, when confused, the bots can always fall back on a swear word in these profanity-riddled forums. “Someone will make a comment and the bot will respond with an expletive,” Ross says. As they work, the bots send catalogs of stolen data and snippets of conversations back to the CSIdentity team, which often works late into the night, when the chat rooms are most active. The company fridge is stocked with Red Bull to help the humans keep up.

The bots aren’t of much use when it comes to the most sensitive undercover stings, such as those that attempt to penetrate members-only hacker forums run by organized crime rings in Eastern Europe, says Ross. Yet they can help make the problem of data loss a little more manageable, especially as its scale grows. In a single week in August, CSIdentity’s bots uncovered 419,000 new records up for sale. The data consisted mostly of e-mail account logins and passwords but also included 15,000 credit-card numbers and 168 Social Security numbers.

Among the compromised companies was ShoWorks, an events manager based in Spokane, Wash. CSIdentity doesn’t act on the data, only collects and sells it, so Cathy Doerr, ShoWorks’s president, says she found out about the network break-in only when federal investigators called in late August. “It’s created just a mess, and we’ve spent the last two weeks trying to clean it up,” she says. CSIdentity’s bots wouldn’t have prevented the theft, but they might have helped Doerr discover it sooner. “This happens every single day,” says Ross. “The scary thing is this is just the tip of the iceberg.”


The bottom line: As hackers attack more companies, some investigators are relying on automated chat programs to learn what data has been stolen.
