Informational Advantage & Alternative Data

  • Gediminas Rickevičius, VP of Global Partnerships at Oxylabs.io

  • 14.06.2022 11:45 am
  • #data

Alternative data has been making waves in the financial sector since at least 2017. While the concept had existed for nearly a decade before that, its value for investment strategies took some time to be discovered.

An important factor in determining the value of alternative data is the informational advantage it provides. Investing, whether it’s high-frequency trading or a more passive strategy, is a competitive endeavour. That competition is often decided by the ability to extract signals from information.

The alternative data industry is expected to continue growing at a rapid rate, with over $1B being spent on the buy-side from 2020. In the financial sector, we expect that growth to be driven by the informational advantage alternative data delivers.

A little on alternative data

One of the unique features of alternative data is that it’s negatively defined in contrast to traditional sources. In other words, the former is everything that the latter is not. We do have, however, a fairly good understanding of traditional sources.

Traditional data includes financial reports, censuses, administrative and governmental records, etc. These have been used for decades by financial institutions to make investment decisions. Such data does have one issue - it’s easily convertible to signals.

Easy convertibility to investment signals means that nearly anyone can make use of such data. As the adage goes, if the data is public, the markets have already adjusted to it. Because nearly every financial institution uses these sources, it can be a struggle to build profitable strategies on them alone.

Alternative data, on the other hand, includes various, mostly unstructured sources, such as social media, satellite imagery, payment data, etc. Some, such as satellite imagery, have to be bought from third-party providers, while others can be acquired in-house or by outsourcing relatively cheap processes such as web scraping.

Unstructured data is an advantage

The one seemingly painful hurdle is that most alternative data is unstructured. It comes, at best, in semi-structured file formats such as JSON. At worst, it can come in the form of images converted to base64 or the like.

Getting valuable insights or signals from such data is a challenging task. It has to be parsed, prepared, cleaned and evaluated. All of these steps take resources, expertise, and time. Working with alternative data is fundamentally different from working with traditional sources, as more preparation is required.
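To make the parsing and cleaning steps a little more concrete, here is a minimal sketch in Python. The record layout (the "ts", "ticker" and "score" fields) and the sample values are purely hypothetical and are only there for illustration. Real feeds will look different and usually need far more handling.

```python
import json
from datetime import datetime

# Hypothetical semi-structured sentiment records; field names are assumptions.
RAW = """[
  {"ts": "2022-06-01T09:30:00", "ticker": "abc ", "score": "0.8"},
  {"ts": "2022-06-01T09:31:00", "ticker": "ABC", "score": null},
  {"ts": "not a date", "ticker": "ABC", "score": "0.1"},
  {"ts": "2022-06-01T09:35:00", "ticker": "XYZ", "score": "-0.4"}
]"""


def clean_records(raw_json):
    """Parse raw JSON, normalise fields and drop malformed records."""
    cleaned = []
    for rec in json.loads(raw_json):
        try:
            ts = datetime.fromisoformat(rec["ts"])
            score = float(rec["score"])
        except (KeyError, TypeError, ValueError):
            # Missing field, null score or unparseable timestamp: skip it.
            continue
        ticker = str(rec.get("ticker", "")).strip().upper()
        if not ticker:
            continue
        cleaned.append({"ts": ts, "ticker": ticker, "score": score})
    return cleaned


records = clean_records(RAW)
for rec in records:
    print(rec["ts"].isoformat(), rec["ticker"], rec["score"])
```

Even in this toy case, half of the records are discarded before any evaluation can start, which hints at where the resources and time go.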

Additionally, the signals that can be derived from alternative data aren’t as clearly visible as those from traditional sources. Company financial reports are closely related to performance. Twitter sentiment about the company or its stock is also related, but the strength of the correlation is less obvious.

All of these “issues” would seem to suggest that alternative data isn’t worth it. It’s expensive to procure, hard to process, and the signals seem weak. In finance, however, that may be seen as an advantage.

Because signals are significantly harder to extract from such datasets, informational asymmetry arises. In other words, only a select few can extract valuable insights from a particular dataset. In turn, there’s no way for the market to adapt easily to such signals, as no other participants (or relatively few) have access to the information.

Therefore, alternative data, while producing potentially weaker signals, creates unique ones. In other words, it delivers an informational advantage over other market participants - something many investors dream of.

One caveat we should note, however, is that signal decay does exist. Since alternative data should be collected from publicly accessible sources (due to the surrounding legal landscape), potentially anyone could uncover the same or correlated signals. It’s unlikely that a strategy’s Sharpe ratio will remain meaningful for long periods of time, especially as competition increases.
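As a rough illustration of how signal decay might show up in practice, the sketch below computes an annualised Sharpe ratio for two windows of strategy returns. The return figures are made up, the risk-free rate is assumed to be quoted per period, and 252 trading days per year is a common convention rather than a rule.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualised Sharpe ratio: mean excess return over its standard deviation."""
    excess = [r - risk_free for r in returns]
    spread = statistics.stdev(excess)  # sample standard deviation
    if spread == 0:
        return float("nan")
    return statistics.mean(excess) / spread * math.sqrt(periods_per_year)


# Hypothetical daily returns of the same strategy in two periods.
early_window = [0.004, -0.001, 0.003, 0.002, -0.002, 0.005]
late_window = [0.001, -0.002, 0.000, 0.002, -0.003, 0.001]

print("early:", round(sharpe_ratio(early_window), 2))
print("late: ", round(sharpe_ratio(late_window), 2))
```

A ratio that keeps shrinking from one window to the next is one possible sign that other participants have found the same or a correlated signal.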

Evaluating alternative data

One issue, which I see as truly concerning, is determining the value of alternative data. Some sets might be a mish-mash of sentiment and various posts by individuals that eventually lead nowhere. Other sets might deliver signals so strong that they could serve as an independent investment strategy.

There are two things to consider when evaluating alternative data - backtesting and signal accessibility. The former lets us know if the dataset would have created alpha-generating investment strategies. Most industry professionals should be familiar with such a process.
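For readers less familiar with backtesting, a deliberately simplified sketch follows. It assumes one signal value and one asset return per day, trades on the previous day's signal to avoid look-ahead bias, and ignores transaction costs and slippage. The function name, threshold and sample numbers are hypothetical.

```python
def backtest(signals, returns, threshold=0.0):
    """Return daily strategy P&L from a long/short/flat rule.

    The position held on day t is decided by the signal from day t - 1,
    so the strategy never uses information it could not have had.
    """
    if len(signals) != len(returns):
        raise ValueError("signals and returns must have the same length")
    pnl = []
    for t in range(1, len(returns)):
        signal = signals[t - 1]
        if signal > threshold:
            position = 1       # long
        elif signal < -threshold:
            position = -1      # short
        else:
            position = 0       # flat
        pnl.append(position * returns[t])
    return pnl


# Hypothetical daily sentiment signal and asset returns.
signals = [0.5, -0.2, 0.0, 0.3]
returns = [0.01, 0.02, -0.01, 0.03]

pnl = backtest(signals, returns)
print(pnl, "total:", sum(pnl))
```

A real backtest would also need out-of-sample periods and realistic cost assumptions before any conclusion about alpha could be drawn.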

Signal accessibility is a collection of well-known concepts reapplied to alternative data. There are a few important factors to consider when evaluating accessibility:

  1. Cost of acquisition. Sentiment-related data can cost up to several thousand dollars, while detailed satellite imagery can reach millions of dollars per year.

  2. Cost of analysis. Depending on how the data is structured, the cost of signal extraction can vary, especially when working with previously unseen data.

  3. Exclusivity. Signal value depreciates as the data becomes less exclusive, because the informational advantage shrinks.

  4. Risk premiums. Some sets of alternative data (e.g., social media sentiment, credit card payments, etc.) can produce signals with a low correlation to traditional risk premiums (e.g., momentum, quality, volatility, etc.). Such alternative data can diversify risks in portfolios, raising the overall value of the set.

  5. Timeliness and latency. Some data takes longer to acquire. Additionally, predictive power can differ across time horizons.
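The risk premium factor in the list above can be checked with a simple correlation. The sketch below computes the Pearson correlation between the returns of a hypothetical alternative data signal and a hypothetical momentum factor. Both series are invented for illustration, and a real analysis would use far longer histories.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily returns: alternative data signal vs. momentum factor.
alt_signal = [0.004, -0.002, 0.001, 0.003, -0.001, 0.002]
momentum = [0.001, 0.002, -0.003, 0.001, 0.002, -0.001]

print("correlation:", round(pearson(alt_signal, momentum), 2))
```

A value close to zero would suggest the signal adds diversification beyond the traditional premium, while a value close to one would suggest it is mostly repackaging it.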

In the end, alternative data is a different beast from traditional sources. Its main advantage stems from the fact that its insights aren’t as easy to access. As such, alternative data provides an informational advantage over other market participants, creating new ways to generate alpha with investment strategies.
