Big Data's Big Performance Headache
Is there a dark side to big data? The mounting cost of storing data has long been deemed a top issue, but the bigger cost is really manifesting in the performance of the systems tasked with retrieving that data.
"It's a real problem," says David Fetter of Quadron Data Solutions, a provider of technology solutions for bank and insurance broker-dealers. "Historically it's worked better for firms to add more storage when their incoming data exceeds capacity. But eventually it's safe to say that the volume of data and the functionality that's being performed on that data is growing even faster than some of the systems. So really, more than storage itself being expensive, you start to run into performance considerations."
Risks
On the data warehousing side it's a huge issue: clients accumulate data and buy more disk, but eventually the volume of data outgrows what the systems can handle with acceptable performance.
"We've heard a lot of stories," he says. "There's one about a commission databases where it takes them 12 hours to save processed commissions." The firm of a couple hundred advisors had a data warehousing commission vendor. Someone would have to grind away 12 hours or more to pull up commission reports. "That's a case where there's room to optimize, but it's a good example of the data volume problem manifesting itself as a performance issue more than as a cost of storage issue."
What about scaling down and eliminating unnecessary data from the storage systems? According to Fetter, given the way applications use data, it's almost impossible to pull data out, even junk. All historical data has to be there; removing it can hurt the integrity of the entire data warehouse, and it still plays a role in accounts moving forward and in audit trails. "You can't just yank data out of that."
If you look at the way technology has evolved, the answer has always been to build bigger and faster computers to support more complex software. This is a case where the data is so big that the issue is not the cost of storage but the performance impact. Even applications customized for this complexity end up in situations where reports that should run in minutes take far longer.
Value Your Strategy
The best defense is a strategy or software solution laid out before the data starts accumulating. As firms begin to build out a data warehouse, the systems must be architected to allow for archiving data; retrofitting that capability later is difficult.
Regardless of the data type, there's strategy in how you store it, explains Fetter. You can have ultra-high-performance disk at one price point, less expensive slower disk at a different price point, and removable archive storage. Either way, "you have got to tier storage requirements any time data gets large. Smaller firms may not look at this, but for server providers, data warehouses, larger firms, that's something that happens early as data grows."
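To make the tiering idea concrete, here is a minimal sketch of age-based tiering, assuming a simple SQLite "hot" store with a hypothetical commissions table and a compressed-file archive tier; the table name, columns, and roughly 24-month cutoff are illustrative assumptions, not a description of Quadron's systems.

```python
# Minimal sketch of age-based storage tiering (illustrative only).
# Rows older than a cutoff are copied to a cheap, compressed archive
# tier and then removed from the fast "hot" store, so history is
# preserved for audits rather than deleted outright.
import csv
import gzip
import sqlite3
from datetime import datetime, timedelta
from pathlib import Path

HOT_DB = "warehouse.db"        # fast, expensive tier (hypothetical)
COLD_DIR = Path("archive")     # slow, cheap tier (hypothetical)
CUTOFF = (datetime.now() - timedelta(days=730)).strftime("%Y-%m-%d")  # ~24 months

def archive_old_rows(conn: sqlite3.Connection) -> int:
    """Copy rows older than CUTOFF to a gzip'd CSV, then delete them
    from the hot table. Returns the number of rows archived."""
    COLD_DIR.mkdir(exist_ok=True)
    rows = conn.execute(
        "SELECT id, advisor_id, amount, trade_date FROM commissions "
        "WHERE trade_date < ?", (CUTOFF,)
    ).fetchall()
    if not rows:
        return 0
    out_path = COLD_DIR / f"commissions_before_{CUTOFF}.csv.gz"
    with gzip.open(out_path, "wt", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["id", "advisor_id", "amount", "trade_date"])
        writer.writerows(rows)
    conn.execute("DELETE FROM commissions WHERE trade_date < ?", (CUTOFF,))
    conn.commit()
    return len(rows)

if __name__ == "__main__":
    with sqlite3.connect(HOT_DB) as conn:
        print(f"Archived {archive_old_rows(conn)} rows to {COLD_DIR}/")
```

In a production warehouse the same idea is typically expressed through table partitioning and storage tiers on the database or SAN side rather than a standalone script, but the principle is the same: keep hot data on fast disk and move cold data somewhere cheaper without losing it.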
The key point of the big data dilemma is that the problems don't manifest themselves as cost of storage but as performance issues, and there are a bunch of dynamics in the financial services industry that make it a big problem. For one, there are a lot of niche players that have developed software on the fly, software that hasn't gotten the level of architecture and performance tuning it should have. As data volumes grow, performance degrades to completely unacceptable levels. From Quadron's perspective, industry spend and focus is not on making data smaller, but on overcoming performance issues with the data already there.
"Focus has got to come back to how systems are architected from the beginning. It's happening and it's hard to dig out of once you're there," concludes Fetter. "It's a case of pay me now or pay me later." Becca Lipman is Senior Editor for Wall Street & Technology. She writes in-depth news articles with a focus on big data and compliance in the capital markets. She regularly meets with information technology leaders and innovators and writes about cloud computing, datacenters, ... View Full Bio