While data managers repeat the mantra, "It's all about the data," the more I look around, the more I think the slogan should be: "It's only about the data." Data is becoming all-encompassing. While trading data always has been critical, symbology, reference data, valuations, risk metrics and unstructured data (from the depths of the web) are becoming increasingly important.
From financial, risk and compliance perspectives, firms are investing in data infrastructures. Beyond the need to better understand internal positions, risk and profitability, banks increasingly will need to provide data to industry utilities and to regulators. The Office of Financial Research eventually will need daily positions and exposure data; the SEC wants a real-time audit trail; and all OTC derivatives transactions will need to be reported to, and reconciled with, clearinghouses and Swaps Data Repositories.
Data on the trading side is no less important. In fact, the amount of data that firms are inhaling to find an edge is one of the hottest discussion topics these days.
There are three ways to play the data game: Faster, Smarter and Dirtier. While "faster" always will be important (until markets move away from time priority), it is an increasingly difficult game to win. Data rates measured in milliseconds five years ago are now being measured in microseconds, and at least one vendor is promoting data delivery in hundreds of nanoseconds. As Wikipedia puts it, "One nanosecond is to one second as one second is to 31.7 years."
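The Wikipedia analogy is easy to verify with a bit of arithmetic: a nanosecond is one billionth of a second, and a billion seconds works out to roughly 31.7 years. A quick sketch (the year-length constant is the standard Julian year, an assumption for the rounding):

```python
# Sanity-check the analogy: 1 ns is to 1 s as 1 s is to ~31.7 years.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # Julian year, ~31.56 million seconds

# One nanosecond is 1e-9 of a second, so one second is 1e9 nanoseconds.
one_billion_seconds_in_years = 1e9 / SECONDS_PER_YEAR

print(round(one_billion_seconds_in_years, 1))  # ~31.7
```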
But unless we can change the laws of physics, the high-frequency/low-latency game has become played out. The cost to compete at the micro/nanosecond level has become prohibitive. In the U.S., the lowest-latency game likely will be capped at between 10 and 20 players.
As a result, firms are trying to be smarter about their trading. Increasingly, firms are not only mining market data but also investing in analytics to look for arbitrage, cross-asset and even obscure correlations to find the proverbial needle in a haystack. This, however, also is becoming increasingly difficult as the amount of data generated by the financial markets is exploding. One year of U.S. options data now requires more than 52 terabytes of storage, while in 2000 it comprised about 172 gigabytes.
Down and Dirty
Meanwhile, an increasing cadre of firms is going deeper into news and scouring the web and social networks for glimmers of early insight on market-moving information. Let's call this analysis of unstructured data "dirty." Individuals always have made decisions with dirty and unstructured data, as research, news stories, press releases and good old-fashioned gumshoe work always have impacted asset values. Today, however, there is an increasing body of information at our fingertips via the web.
But no matter how challenging it is to negotiate 50 terabytes of structured options data, finding, analyzing and interpreting petabytes of unstructured data coursing through the Internet is far harder. To parse this data you need to understand what is being said, who is saying it, score its credibility, determine the sentiment, create an implementation strategy and then trade. None of these steps is easy.
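To make the steps concrete, here is a toy sketch of the first three: parse a message, score its sentiment, and weight that score by the source's credibility. Everything in it is hypothetical for illustration (the word lists, the source types and their credibility weights); real systems use far richer natural-language models.

```python
# Illustrative sketch only: score a message's sentiment, weighted by
# a hypothetical credibility rating for the source type.

POSITIVE = {"beat", "upgrade", "surge", "record"}   # hypothetical word lists
NEGATIVE = {"miss", "downgrade", "plunge", "default"}

# Hypothetical credibility weights by source type.
CREDIBILITY = {"newswire": 1.0, "blog": 0.5, "social": 0.2}

def score_message(text: str, source_type: str) -> float:
    """Return a credibility-weighted sentiment score in [-1, 1]."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return 0.0  # no sentiment-bearing words found
    raw = (pos - neg) / (pos + neg)
    return raw * CREDIBILITY.get(source_type, 0.1)

# The same headline scores higher from a newswire than from social media.
print(score_message("Acme shares surge after earnings beat", "newswire"))  # 1.0
print(score_message("Acme shares surge after earnings beat", "social"))    # 0.2
```

The point of the weighting is the "score its credibility" step above: an identical sentence carries different information value depending on who said it.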
While the web provides an avenue to gather vast information not only from around the block but from around the globe, making sense of it all is hard. But with the proper analytics tools, space may not be the final frontier; the final frontier may just be the alchemy of turning an "lol" into ROI.

Larry Tabb is the founder and CEO of TABB Group, the financial markets' research and strategic advisory firm focused exclusively on capital markets.