This article was written by Lei Huang and Daniel Caporaletti for Bloomberg Markets Magazine. It appeared first on the Bloomberg Terminal.
Systematic trading is an ever-evolving competition. One key arena in that contest: the inputs used in quantitative trading models. Over the years, the list of data sets that quants use has expanded from those based on historical prices, to statistical cross-asset correlations, to exposures to unique risk factors and beyond. To get an edge, many quant researchers and portfolio managers are continually looking for unconventional, independent factors that make sense.
News represents a rich source of data sets for developing trading strategies.
Many features of news-based information make it especially appealing. Real-time news, though less structured than conventional fundamental valuation measures, usually encodes the first clues of major changes affecting a company. You could, for example, use news fragments to build statistical forecasting models that dynamically adjust price targets.
Another promising source of information is social media, where feedback on content can be measured directly. Sharing items, for example, tends to separate noteworthy content from ambient noise. Replies and discussions provide on-the-spot feedback about opinions and emotional responses.
Aggregated data on news supply and demand can be used to detect abnormal spikes. Often, such spikes happen alongside major company events. When you track such analytics for a large portfolio of stocks, you can identify market focus and hot-spot stocks in real time.
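One simple way to flag an abnormal spike is to compare each day's story count with its own recent history. The sketch below is only an illustration: the function name detect_spikes, the 20-day window and the three-standard-deviation threshold are assumptions chosen for the example, not details taken from any Bloomberg analytic.

```python
from statistics import mean, stdev


def detect_spikes(counts, window=20, threshold=3.0):
    """Return the indices of days whose count is abnormally high.

    counts: daily news-story counts for one company (hypothetical input).
    A day is flagged when its count exceeds the mean of the previous
    `window` days by more than `threshold` standard deviations.
    """
    spikes = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history gives no scale for "abnormal"; skip it.
            continue
        z_score = (counts[i] - mu) / sigma
        if z_score > threshold:
            spikes.append(i)
    return spikes


# Made-up counts: day 10 has a burst of stories.
daily_counts = [10, 12, 11, 9, 10, 11, 10, 12, 9, 11, 60, 10]
spike_days = detect_spikes(daily_counts, window=10, threshold=3.0)
```

Running the same check across every stock in a portfolio gives a rough real-time list of the names drawing unusual attention.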
The advantages of news-related data are obvious—but so are the technical challenges. Deriving quantitative indicators from text is daunting—it could, for example, require experts to prepare and maintain a set of classification codes and taxonomies. Although there’s a large amount of academic research in the field, significant advances began only recently with the availability of efficient machine-learning techniques and high-performance computing hardware.
Researching signals and backtesting strategies require large amounts of sample data. You need both broad news coverage and deep historical securities data to evaluate performance.
For Bloomberg-generated news and third-party content, Event-Driven Feeds (EDF) deliver two levels of machine-readable news-derived data. Story-level analytics calculate quantifiable metrics for individual news stories, such as sentiment and impact. Company-level analytics aggregate information for individual companies to track continuing developments.
So how can you use EDF? Let’s take a look at one key indicator: sentiment derived from social media. Sentiment, essentially a gauge of herd mentality, can be considered a driver of price momentum. To test the viability of using social sentiment as a trading signal, we worked with Bloomberg researchers and ran a series of backtests. Using stocks from major U.S. indexes, we constructed three types of long-short equity portfolios.

The first was a proportional portfolio. It weighted each stock in the index by the deviation of its sentiment score from the mean score. If above the mean, the portfolio would buy the stock; if below, it would short the shares.
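The weighting rule can be sketched in a few lines. The article does not say how the deviations were scaled, so the normalization here (dividing by the sum of absolute deviations so that gross exposure equals 1) is an assumption, as are the function name and the sample scores.

```python
def proportional_weights(sentiment):
    """Weight each stock by the deviation of its score from the mean.

    sentiment: dict mapping ticker -> sentiment score (hypothetical input).
    Positive weights are long positions, negative weights are shorts.
    Assumption: weights are scaled so their absolute values sum to 1.
    """
    average = sum(sentiment.values()) / len(sentiment)
    deviations = {ticker: score - average for ticker, score in sentiment.items()}
    gross = sum(abs(d) for d in deviations.values())
    if gross == 0:
        # Every stock has the same score, so there is nothing to trade.
        return {ticker: 0.0 for ticker in sentiment}
    return {ticker: d / gross for ticker, d in deviations.items()}


# Made-up scores for three tickers.
weights = proportional_weights({"AAA": 0.6, "BBB": 0.1, "CCC": -0.4})
```

Because the deviations from the mean always sum to zero, the long and short sides offset each other and the portfolio is dollar-neutral by construction.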
The second type, a so-called high-minus-low, one-third portfolio, was more concentrated. It would go long the top one-third of the index stocks ranked by sentiment and short the bottom one-third. All the selected stocks were equally weighted.
The third was even more focused. It went long the top 5 percent and short the bottom 5 percent, equally weighting all the selected stocks.
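The second and third portfolios follow the same recipe with a different cutoff, so one function can illustrate both. This is a sketch under assumptions: the name high_minus_low_weights is hypothetical, and the article does not say how ties or fractional stock counts were handled, so the count is rounded down with a minimum of one stock per side.

```python
def high_minus_low_weights(sentiment, fraction):
    """Go long the top `fraction` of stocks by sentiment, short the bottom.

    sentiment: dict mapping ticker -> sentiment score (hypothetical input).
    fraction: share of the index on each side, e.g. 1/3 or 0.05.
    Selected stocks are equally weighted; all others get zero weight.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    if len(sentiment) < 2:
        raise ValueError("need at least two stocks")
    ranked = sorted(sentiment, key=sentiment.get, reverse=True)
    # Round down; the tiny epsilon guards against floating-point error.
    n = max(1, int(len(ranked) * fraction + 1e-9))
    weights = {ticker: 0.0 for ticker in ranked}
    for ticker in ranked[:n]:
        weights[ticker] = 1.0 / n
    for ticker in ranked[-n:]:
        weights[ticker] = -1.0 / n
    return weights


# Made-up scores for six tickers.
scores = {"A": 0.9, "B": 0.5, "C": 0.2, "D": 0.0, "E": -0.3, "F": -0.8}
one_third = high_minus_low_weights(scores, 1 / 3)

# A 40-stock toy index for the 5 percent version.
toy_index = {f"S{i}": float(i) for i in range(40)}
top_five_pct = high_minus_low_weights(toy_index, 0.05)
```

The smaller the fraction, the fewer names the portfolio holds and the more each position depends on the accuracy of the extreme sentiment scores.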
In the test, a basket of stocks was created each day when the market opened. The shares were held throughout the day and liquidated at the close. The test ignored transaction costs and risk management.
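Under those rules, one day's portfolio return is the weighted sum of each stock's open-to-close return. The sketch below assumes signed weights (negative for shorts) and, like the test, ignores transaction costs; the function name and the prices are made up for illustration.

```python
def open_to_close_return(weights, opens, closes):
    """Return one day's portfolio return for positions held open to close.

    weights: dict ticker -> signed weight (negative means short).
    opens, closes: dicts ticker -> opening and closing price for that day.
    Transaction costs are ignored, as in the backtest described above.
    """
    return sum(
        weight * (closes[ticker] / opens[ticker] - 1.0)
        for ticker, weight in weights.items()
    )


# Made-up example: long AAA, short BBB, equal size.
day_return = open_to_close_return(
    weights={"AAA": 0.5, "BBB": -0.5},
    opens={"AAA": 100.0, "BBB": 50.0},
    closes={"AAA": 102.0, "BBB": 49.0},
)
```

In this example the long gains 2 percent and the short falls 2 percent, so both sides contribute and the day's return is 2 percent. Repeating the calculation with fresh weights each morning produces the daily return series for the backtest.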
To analyze differences among market-cap groups, we used three indexes: the S&P 500 as a proxy for large-company stocks, the Russell 3000 for the broad market, and the Russell 2000 for small-company shares. Three portfolio types across three indexes gave nine portfolios in all. Over the 15-month period of the backtest, eight of the nine sentiment-driven portfolios performed better than the benchmarks. In addition, the portfolio returns displayed low volatility, low beta, and high alpha. Sharpe ratios, a measure of risk-adjusted returns, were higher than those of the benchmarks for eight of the nine portfolios, ranging from 0.66 to 5.77.
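For readers who want to reproduce this kind of comparison, a Sharpe ratio can be computed from a series of daily returns. The article does not state the conventions used in the backtest, so the choices below (a zero risk-free rate by default and annualizing with 252 trading days) are common assumptions rather than the authors' method, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def annualized_sharpe(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Return the annualized Sharpe ratio of a series of periodic returns.

    Assumptions: the risk-free rate is constant per period, and the ratio
    is annualized by multiplying by the square root of periods_per_year.
    """
    excess = [r - risk_free_daily for r in daily_returns]
    volatility = stdev(excess)
    if volatility == 0:
        raise ValueError("returns have zero volatility; Sharpe ratio is undefined")
    return mean(excess) / volatility * sqrt(periods_per_year)


# Made-up daily returns for one week.
sample_sharpe = annualized_sharpe([0.01, -0.005, 0.007, 0.002, -0.001])
```

A short sample like this one gives a very noisy estimate; the ratio becomes meaningful only over a long run of daily returns, such as the 15 months used in the test.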
A key finding: Performance was consistently better for the small-cap portfolios. One explanation is that small-company stocks get less analyst coverage and market attention, so the market takes longer to price in fundamental changes. Such market inefficiencies could make news-based analytics stronger predictors.
Huang and Caporaletti are product managers for Enterprise Solutions at Bloomberg in New York.
Bloomberg’s Event-Driven Feed is an enterprise product that provides machine-readable news-derived data. Existing Bloomberg Terminal subscribers can also go to EDF <GO>.