12:10 PM
Welcome to the New Quants: Data Scientists
Buy-side firms are constantly looking for new ways to analyze data that can yield alpha, so don’t be surprised if the role of data scientist emerges at hedge funds and traditional asset managers.
“The area of data science is only three or four years old and it’s wrapped up with the buzz around Big Data,” said John 'Fawce' Fawcett, CEO of Quantopian, a community for algorithmic development.
On Thursday, BNY Mellon released a white paper on the transformational impact that data science will have on all phases of financial services.
Big data will lead to new approaches for analysis in all phases of financial markets, including asset management, research, analytics, asset allocation, trading, and risk management, according to the report.
[For more on Big Data Can Transform Global Financial Markets, see Greg MacSweeney's related story.]
“For the buy side, analytical and statistical analysis and then automating that analytic with software via machine learning is a burgeoning field across all different fields of analysis,” said
While portfolio analysis and risk analysis have always been quantitative, the difference is that in the typical approach, quants develop models that explain behavior and then use those models to predict the future. Data science is more about mining data and hunting for interesting patterns. “The danger is overfitting. If you mine it enough you think you have something predictive,” cautioned Fawcett.
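The overfitting risk Fawcett describes can be illustrated with a small sketch. It is not taken from Quantopian or the BNY Mellon paper; the function names, the number of candidate signals, and the random data are all assumptions made for illustration. The idea is that if you test enough random "signals" against random returns, the best one tends to look predictive in the sample you mined, while nothing in the data gives it any reason to hold up on fresh data.

```python
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length lists (plain Python, no libraries)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def mine_best_signal(n_signals=500, n_days=60, seed=0):
    """Hypothetical experiment: search many random signals for the best in-sample fit.

    Returns (in_sample_corr, out_of_sample_corr) for the signal that looked
    best on the first n_days. Every series here is pure noise, so any
    in-sample "edge" is an artifact of the search itself.
    """
    rng = random.Random(seed)
    returns = [rng.gauss(0, 1) for _ in range(2 * n_days)]
    train, test = returns[:n_days], returns[n_days:]

    best = None
    for _ in range(n_signals):
        signal = [rng.gauss(0, 1) for _ in range(2 * n_days)]
        in_corr = correlation(signal[:n_days], train)
        if best is None or abs(in_corr) > abs(best[0]):
            out_corr = correlation(signal[n_days:], test)
            best = (in_corr, out_corr)
    return best


if __name__ == "__main__":
    in_corr, out_corr = mine_best_signal()
    print(f"best in-sample correlation:   {in_corr:+.3f}")
    print(f"same signal, out-of-sample:   {out_corr:+.3f}")
```

Holding back data that the search never touched, as the second half of each series is held back here, is one common way to check whether a mined pattern is more than noise.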
Data science matters because so much of the trading in the market is automated: the trade flow and information that the exchanges yield have become saturated with machine learning and automated analysis, said Fawcett. Algorithmic trading is based on similar techniques, he noted.
What is a data scientist?
From Wikipedia (the quoted definition appears in the final paragraph below):
However, there may be a shortage of people with the right experience, since most people with experience in data science are either financial quants or people working at Internet companies like Google and Facebook.
As for the future talent pool, aspiring traders will be those with a background in engineering, math or any of the hard sciences, as well as statistics and biology. All of these disciplines train people in the core areas of data science, and operations research is another popular academic route into the field, said Quantopian’s founder.
Sites like Quantopian are also helping quant researchers design their own algorithms.
The firm is developing new tools for quants to build algorithms on its site, along with new backtesting tools, while the site gears up for live trading.
Last month, Quantopian went live with Fetcher, a backtesting tool. This lets someone take an algorithm and run it through historical paper trading and then live trading, explains Fawcett. Traders can pull in data from Quandl, a web site that provides four million data sets and integrates with Quantopian through Fetcher. Quandl offers data in the CSV (comma-separated values) format, which is the most common data format for time series data.
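For readers unfamiliar with CSV time series, the following is a minimal sketch of reading one in Python using only the standard library. It does not use Fetcher or Quandl's API; the column names (`Date`, `Value`), the sample rows, and the helper names are assumptions invented for illustration, not real market data.

```python
import csv
import io
from datetime import date

# Made-up sample in a typical time series layout: one dated observation per row.
SAMPLE_CSV = """Date,Value
2013-06-03,101.5
2013-06-04,102.25
2013-06-05,100.75
"""


def load_time_series(text):
    """Parse CSV text into a list of (date, value) pairs sorted by date."""
    reader = csv.DictReader(io.StringIO(text))
    series = [
        (date.fromisoformat(row["Date"]), float(row["Value"]))
        for row in reader
    ]
    series.sort(key=lambda point: point[0])
    return series


def daily_changes(series):
    """Return (date, change from the previous observation) for each later row."""
    return [
        (day, value - prev_value)
        for (_, prev_value), (day, value) in zip(series, series[1:])
    ]


if __name__ == "__main__":
    series = load_time_series(SAMPLE_CSV)
    for day, change in daily_changes(series):
        print(day.isoformat(), f"{change:+.2f}")
```

Because each row is just a date and a number separated by commas, the same few lines work for most time series published this way, which is part of why the format is so widely used.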
One area Quantopian is trying to streamline is the gap between the quant researcher and the quant developer. Usually a quant researcher will hand over their research to a developer, who writes the code for placing the trades in another language, such as C, Java or Python. This handoff, which Fawcett calls "the rewrite gap," can be dangerous because bugs are introduced between the original idea and the implementation. "What we did is eliminate that. What you write for your backtesting and simulation is what we will run for your live trading," said Fawcett.
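One way to picture closing the rewrite gap is a design in which the strategy is written once and only the data feed and the broker are swapped between simulation and live trading. The sketch below is an assumption about how such a design might look, not Quantopian's actual API; the class names, the moving-average rule, and the prices are hypothetical.

```python
from collections import deque


class MovingAverageStrategy:
    """Hypothetical rule: buy above the recent average price, sell below it."""

    def __init__(self, window=3):
        self.prices = deque(maxlen=window)

    def on_price(self, price):
        """Return "BUY", "SELL", or None for the latest price."""
        self.prices.append(price)
        if len(self.prices) < self.prices.maxlen:
            return None  # not enough history yet
        average = sum(self.prices) / len(self.prices)
        if price > average:
            return "BUY"
        if price < average:
            return "SELL"
        return None


class RecordingBroker:
    """Stand-in broker that only records orders; a live one would route them."""

    def __init__(self, name):
        self.name = name
        self.orders = []

    def submit(self, side, price):
        self.orders.append((side, price))


def run(strategy, price_feed, broker):
    """Drive the same strategy code from any price feed into any broker."""
    for price in price_feed:
        side = strategy.on_price(price)
        if side is not None:
            broker.submit(side, price)
    return broker.orders


if __name__ == "__main__":
    history = [10, 11, 12, 11, 10, 12]
    simulated = run(MovingAverageStrategy(), history, RecordingBroker("backtest"))
    # A live run would pass a streaming feed instead of a list; the strategy
    # class itself is untouched, so there is nothing to rewrite.
    streamed = run(MovingAverageStrategy(), iter(history), RecordingBroker("live"))
    print(simulated == streamed)
```

Since the trading logic lives in a single class, any bug found in backtesting is fixed in the same code that would run live, rather than in a separate reimplementation.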
If data science becomes as big as people think, it could lead to new approaches for analyzing torrents of data and to new trading strategies, and data scientists could replace the old breed of rocket scientists on Wall Street.
Data scientists solve complex data problems through employing deep expertise in some scientific discipline. It is generally expected that data scientists are able to work with various elements of mathematics, statistics and computer science, although expertise in these subjects are not required. However, a data scientist is most likely to be an expert in only one or two of these disciplines and proficient in another two or three. There is probably no living person who is an expert in all of these disciplines — if so they would be extremely rare. This means that data science must be practiced as a team, where across the membership of the team there is expertise and proficiency across all the disciplines.