Bloomberg and “the magic” of machine learning

Gary Kazantsev is a researcher and software engineer who leads machine learning at Bloomberg. Machine learning is an increasingly important area at Bloomberg, a company that manages massive amounts of data in a real-time environment. While machine learning is generally about giving computers the ability to learn by using algorithms to analyze data, find patterns or predict outcomes, much of Bloomberg’s effort in this area today is focused on helping the company’s customers pluck intelligence and insight from the financial information and data coursing through the network that feeds the Bloomberg Terminal. Fresh off his presentations at two key industry events, Gary explains what he and his team are doing and how that is helping investors and Bloomberg customers make better, more informed decisions.

The conversation with Gary has been edited for length.

What is fueling heavy investment in machine learning and how does it fit into the workplace?
A lot of customers’ workflows are being automated, entirely or partially. What they’re doing today is more on the cognitive side: strategy and portfolio selection, formulating the missing pieces. Let’s say you want to know which Chinese companies made investments in the U.S. last year. You could answer that question [yourself], but it would take a whole lot of legwork. With the solutions we have implemented, you could just type that question into the search bar and get the answer right there [on your Terminal screen].

Why don’t traditional technologies suffice?
To implement these kinds of automation systems for trading or portfolio selection or whatever it is that they do, you need machinery to pull data. In many cases the machinery converges the data that’s held internally, or analyzes data on top of the data in the drive.

How does it index that data in a useful way?
When you ask [the system] to show news stories about Bloomberg from Bloomberg, it knows that the first “Bloomberg” is a company or a person, and that the second “Bloomberg” is a news source. That’s called semantic parsing, which is a big, open research area.
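To make the "Bloomberg from Bloomberg" example concrete, here is a toy sketch of the output such a parser produces: the same surface word ends up in two different roles. It is only an illustration. A production semantic parser is learned from data, whereas this uses one hand-written pattern, and the names `NewsQuery`, `PATTERN` and `parse_news_query` are hypothetical, not Bloomberg's.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class NewsQuery:
    topic: str   # the entity the stories are about
    source: str  # the publisher the stories come from


# Hypothetical pattern covering a single query shape:
# "[show] news stories about <topic> from <source>"
PATTERN = re.compile(
    r"^(?:show\s+)?news stories about (?P<topic>.+?) from (?P<source>.+)$",
    re.IGNORECASE,
)


def parse_news_query(text: str) -> Optional[NewsQuery]:
    """Return the roles found in the query, or None if it does not match."""
    match = PATTERN.match(text.strip())
    if match is None:
        return None
    return NewsQuery(topic=match.group("topic"), source=match.group("source"))


query = parse_news_query("show news stories about Bloomberg from Bloomberg")
# query.topic and query.source are both the string "Bloomberg",
# but they fill different slots: one is an entity, the other a news source.
```

The point is the structured result rather than the pattern: once the two mentions are separated into slots, each can be resolved against a different index (entities versus sources).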

How does machine learning make that feature possible?
One big [way] is text analysis: anomaly detection, text topic clustering, classification, and novelty detection. Another big body of work is recommendation systems: recommending news stories that [the user] might want to read on the basis of their behavior or their workflow completion.

Where does all that unstructured text come from?
We ingest on the order of 1.5 million documents per day, in roughly 30 different languages, from 125,000 different sources. Social media is a big chunk of it, but there is also our content, and there is third-party premium content. There are 100,000 different websites that we scrape: third-party providers, governments, data from legal entities, you name it.

How does the system “learn” to answer these questions?
To implement these kinds of automation systems, you need machinery to pull data. We actually have one system where it is basically just user behavioral data, where we train 450,000 logistic regression models every day. If [the user] suddenly started doing something wildly different, their suggestions would change. The classification models also change daily. For example, if two companies undergo a merger, you don’t want to be tagging them in the same way.
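As a rough illustration of why daily retraining changes a user's suggestions, the sketch below fits one small logistic regression per day of behavioral data for a single user. Everything specific here is an assumption made for the example: the two features, the click labels, and the plain gradient-descent trainer. The interview does not describe Bloomberg's features or training procedure.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(rows, labels, epochs=500, lr=0.5):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # derivative of the log loss w.r.t. the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(model, x):
    """Probability that the user clicks a story with features x."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical log for one user. Each row is
# [story is about equities, story is about commodities]; label 1 = clicked.
stories = [[1, 0], [1, 0], [0, 1], [0, 1]]
clicks_yesterday = [1, 1, 0, 0]  # user read equities stories
clicks_today = [0, 0, 1, 1]      # user switched to commodities

model_yesterday = train_logistic_regression(stories, clicks_yesterday)
model_today = train_logistic_regression(stories, clicks_today)
```

With these made-up logs, `predict(model_yesterday, [1, 0])` is above 0.5 while `predict(model_today, [1, 0])` is below it, so an equities story would be ranked high one day and low the next. Small, cheap models of this kind are what make retraining hundreds of thousands of them per day plausible.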

How many engineers work on machine learning at Bloomberg?
Approximately 100 people. The answer would have been quite different if you had asked me that question five years ago. Since then, the number and diversity of applications of this stuff here at Bloomberg has increased dramatically. [The team] has basically been doubling every year.

Why are these systems challenging to develop?
Sentiment analysis is actually a very complicated topic by itself, because a lot depends on the way that you formulate the problem. To give you an example, suppose we ask an editor whether a story is positive or negative, and the story happens to be about a company laying off staff. If that person is a social issues editor, they’re going to say that story is negative. Whereas for a trader, it’s going to be positive for your trade.

Do different data sources require a different machine learning approach?
A lot of information that is actually relevant to the financial markets is purely factual; there’s no statement of opinion or indicative words in the story at all. The fact that [a company] missed quarterly expectations is just a fact. On the other hand, what makes it much easier is that you generally don’t need to deal with things like sarcasm and metaphor. The Wall Street Journal is not known for writing in aphorisms. Basically, [human] annotators had to be trained and in-house data had to be produced. All of it was done in-house. It’s something special.

How is this different from a consumer-oriented search algorithm like Google’s?
There’s a fairly deep and involved pipeline that hides behind it. There’s named entity recognition for companies, and disambiguation. Furthermore, there’s person-name recognition using roughly the same methods. There is topic classification for a very large and involved set of categories. Then there is a bunch of analytical stuff [like] indexing, which needs to be real-time. Google has been slowly adding these kinds of structured workflows to answer specific types of questions, but they don’t necessarily correspond to our customers’ needs.

What is the impact for the end user?
The Terminal, albeit a powerful system, is not necessarily the most novice-friendly. It’s exactly the opposite: it’s an expert-friendly system. In many cases [machine learning] is invisible. You run a search for news, and you suddenly get magically better results.
