Big Data is a trending topic, and the recent introduction of voice and SMS (short message service) recording on mobile phones used for trading activities in the UK has drawn further attention to the issue. As the additional voice recording requirements add to the growing repository of data that firms hold to meet regulatory requirements, technologists are starting to prepare for the increasing data flows.
A snowballing need to hold more data is a big challenge, not only in terms of volume but also of diversity. With so many forms of trading communication now in play, including voice, IM, SMS and social networks, technologists must manage these increasing data volumes effectively. The challenge is exacerbated by the possibility of having to hold additional data sources, such as video from video conferencing, along with information that may have influenced or could contextualize the communication being recorded (e.g. news feeds or streaming). The expectation is that regulators will push for this broader range of data types to be held for far longer than before, with current averages of 3-6 months moving closer to 3-5 years (as seen in the Dodd-Frank and MiFID II proposals). Forecasting future storage requirements is tricky, as the volume and diversity of data are often affected by increased market activity, which is hard to foresee, and by the seasonality of specific market activities. The main concern is that existing data technologies and infrastructures could soon reach their limits of scalability and performance.
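To make the retention point concrete, here is a rough back-of-the-envelope sketch in Python. The daily volumes per channel, the `peak_factor` allowance for busy or seasonal periods and the function name are all hypothetical figures and names chosen for illustration, not numbers from any firm or regulation; the only assumption is that each day's recordings are kept for the full retention period.

```python
def retained_volume_gb(daily_gb_by_channel, retention_days, peak_factor=1.0):
    """Steady-state volume held when every day's data is kept for retention_days.

    daily_gb_by_channel: hypothetical average GB recorded per day, per channel.
    peak_factor: hypothetical uplift for busy or seasonal market periods.
    """
    if retention_days < 0 or peak_factor <= 0:
        raise ValueError("retention_days must be >= 0 and peak_factor > 0")
    daily_total = sum(daily_gb_by_channel.values())
    return daily_total * peak_factor * retention_days


# Hypothetical daily volumes for a single firm, in GB (illustrative only).
daily = {"voice": 40.0, "im": 2.0, "sms": 0.5, "video": 120.0}

six_months = retained_volume_gb(daily, retention_days=182)
five_years = retained_volume_gb(daily, retention_days=5 * 365)

print(f"6 months retained: {six_months:,.0f} GB")
print(f"5 years retained:  {five_years:,.0f} GB")
print(f"growth factor:     {five_years / six_months:.1f}x")
```

Even with flat daily volumes, moving from months to years of retention multiplies the data held roughly tenfold in this sketch, before any growth in market activity or new channels is taken into account.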
Analysing and interrogating all this data is also an issue. Speed of processing is key, as is transparency with respect to the geographic locations of the data and of the requesting user. In this area, we are now starting to see voice, along with a wider set of communication data, included in the deployment of data analytics engines. The most advanced of these engines are required to ‘slice and dice’ this diverse and generally unstructured data by specific search criteria (e.g. all activities of a trader or desk within a time window) or key values (e.g. a specific counterparty, a particular word or a trigger event).
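As a minimal illustration of what ‘slicing and dicing’ by search criteria or key values might look like, the Python sketch below filters a handful of in-memory records. The `CommRecord` fields, the `search` function and the sample data are hypothetical and assume that voice has already been transcribed to text; a real analytics engine would work against indexed storage rather than a list.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CommRecord:
    """One recorded communication (hypothetical schema)."""
    trader: str
    desk: str
    channel: str        # e.g. "voice", "im", "sms"
    timestamp: datetime
    counterparty: str
    text: str           # message body or voice transcript


def search(records: Iterable[CommRecord], *,
           trader: Optional[str] = None,
           desk: Optional[str] = None,
           start: Optional[datetime] = None,
           end: Optional[datetime] = None,
           counterparty: Optional[str] = None,
           keyword: Optional[str] = None) -> List[CommRecord]:
    """Return records matching every criterion supplied (None means 'any')."""
    hits = []
    for r in records:
        if trader is not None and r.trader != trader:
            continue
        if desk is not None and r.desk != desk:
            continue
        if start is not None and r.timestamp < start:
            continue
        if end is not None and r.timestamp >= end:  # window is [start, end)
            continue
        if counterparty is not None and r.counterparty != counterparty:
            continue
        if keyword is not None and keyword.lower() not in r.text.lower():
            continue
        hits.append(r)
    return hits


# Hypothetical sample data, for illustration only.
RECORDS = [
    CommRecord("alice", "rates", "voice", datetime(2024, 3, 1, 9, 15),
               "Bank A", "Can you confirm the price before the fix?"),
    CommRecord("alice", "rates", "sms", datetime(2024, 3, 1, 16, 40),
               "Bank B", "Call me later"),
    CommRecord("bob", "fx", "im", datetime(2024, 3, 1, 10, 5),
               "Bank A", "Sending the confirmation now"),
    CommRecord("alice", "rates", "im", datetime(2024, 3, 2, 8, 30),
               "Bank A", "Morning, any update?"),
]

# All activity of one trader within a one-day window.
one_day = search(RECORDS, trader="alice",
                 start=datetime(2024, 3, 1), end=datetime(2024, 3, 2))

# Key values: a specific counterparty plus a particular word.
flagged = search(RECORDS, counterparty="Bank A", keyword="confirm")
```

The same query covers voice, IM and SMS because every channel is reduced to one record shape, which is the kind of common format that on-demand analysis across diverse data would depend on.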
In today’s cloud-based, “as-a-service” technology world, institutions are seriously considering specialist service suppliers to provide flexible, fully scalable data storage. Although this strategy was not immediately accepted for critical applications, due to uncertainty about security and reliability risks, the general concept of outsourcing key infrastructure (data storage) along with highly proprietary and confidential data is now gaining acceptance. With a growing base of customers building a track record in the market, this acceptance can only gain momentum. Additionally, at least for the larger firms, the option exists to deploy their own private clouds as an alternative.
Most of the larger firms already have programs and proofs of concept in place for advanced data analytics tools. However, many of these will need to be expanded to a wider selection of compliance data (e.g. voice, SMS), which in turn requires further investment. For most of the remaining small and medium-sized firms, the obstacles are the cost of entry and the need to understand the tangible benefits well enough to justify such a move. This is particularly difficult when these firms do not need to conduct such analysis regularly and could probably handle it manually on a case-by-case basis. With a rapidly growing data repository, however, it is unclear how long a manual approach would remain feasible with a quick enough turnaround. Once again, analytics “as a service” could be a way forward, and some of the tools are available on this basis today.
Although we have not reached a point where a firm can easily buy analytics over its data for specific investigations or analysis on demand (i.e. “pay as you go”), this concept is reasonably easy to understand and appears accessible to a wider range of budgets. An infrequent need to perform such analysis, however, means that the format and availability of data would need to follow some general standards, which may not be straightforward, especially with legacy data.
The topic of Big Data is a Big Deal! Certainly for larger firms, Big Data is likely to become a reality extremely quickly. Outsourcing and “as a service” options have in the past been viewed warily. However, they are now gathering pace to help with the Big Data challenge. Technologists must prepare now as the data deluge will only continue to increase.
Sébastien Jaouen is head of global sales - Trading Community Services, at Orange Business Services - Trading Solutions