Microsoft, Sybase and Vertica Raise Data Warehouse Ante

This week has seen not one, not two, but three fairly significant data-warehouse-related product announcements at the TDWI event in Las Vegas. That's a testament to the pace of innovation in data warehousing and to the insatiable demand for better, faster, cheaper ways of crunching more numbers.

Doug Henschen, Executive Editor, Enterprise Apps

February 26, 2009



The first of this week's announcements came from Microsoft, which released its Fast Track Data Warehouse reference architectures. These preconfigured, SQL Server-ready 4-terabyte to 32-terabyte server-and-storage bundles are akin to Oracle's Optimized Warehouses and IBM's Balanced Configuration Units. But in Microsoft's case they're also billed as a stepping stone to Microsoft's Project Madison release, which will take SQL Server into the hundreds of terabytes with massively parallel processing (MPP) and scale-out architecture.

How can a non-MPP (symmetric multiprocessor) appliance sold today be a stepping stone to an MPP-based system to be offered by next year? "If you want to move to a multi-node architecture, you can do that through what we call a hub-and-spoke architecture," says Herain Oberoi, Group Product Manager, SQL Server. "The hub would be an MPP-based Madison deployment, and that would sync with the individual spokes with high-speed data transfer capabilities."

In other words, the Fast Track Warehouse(s) you build today could later become spokes on a Madison/MPP-based hub that would create the enterprise data warehouse. That doesn't mean, however, that the spokes you build today have to be data marts, says Oberoi: "Some customers will use these as full data warehouses, it's just that they'll tend to have more of a departmental focus. When the time is right, they can add an MPP hub for scalability and extreme processing power."

The second of this week's announcements was the release of Sybase IQ 15, an upgrade that brings several performance enhancements to what is undisputedly the leading column-oriented database, with more than 1,500 active customers. Key upgrades include improved scalability in grid environments, streamlined query algorithms for faster query execution and multi-node loading that speeds time to query.

Earlier this week I talked to Asif Rahman at Loan Performance. This Sybase IQ customer is a division of insurance giant FirstAmerican that tracks the performance of mortgage loans. (My first question was, "So you guys were in a position to prevent the big subprime mortgage mess we're in, eh?" Rahman responded that most of the loans the firm tracks are held by the originators rather than securitized and sold off to the likes of Fannie Mae and Freddie Mac, and it is the securitized loans that are said to account for the bulk of the troubled loans.)

Interestingly enough, Loan Performance initially launched the warehouse behind its True Standings analysis product on Microsoft SQL Server, but the data volumes and query complexity soon proved to be too much. "When we rolled out in 2004, end users were thrilled because they could build reports from scratch and drag and drop any field they wanted," says Rahman, director of application development. "Unfortunately, people soon started to complain about the query performance and we were also having a difficult time updating the database."

After considering Oracle, Netezza and a higher-horsepower deployment of SQL Server, Loan Performance switched to Sybase IQ in late 2005 because it concluded that "a general-purpose database would not work for us," Rahman explains. "With Sybase IQ, we can add fields to an analysis without worrying that it will slow down the queries."

It should be noted that SQL Server 2008 has since introduced a "resource governor" feature and improved compression capabilities aimed at enhanced scalability. Project Madison's MPP architecture will take scalability and performance to even further extremes, but even then I doubt it will match Sybase IQ, Vertica or any other column-oriented database in terms of analytic query performance. When the task is querying selected attributes stored in columns, row-oriented databases like Oracle, Microsoft SQL Server and IBM DB2 just can't keep up, even with the aid of parallel processing.
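To make the row-versus-column point concrete, here is a minimal Python sketch. It is a toy model, not how Sybase IQ, Vertica or any row-oriented database is actually implemented. The loan records, field names and the "fields touched" counter are all invented for illustration; the counter simply stands in for the amount of data a query has to read.

```python
from typing import Dict, List, Tuple

# Hypothetical loan records; names and values are invented for illustration.
ROWS: List[Dict[str, object]] = [
    {"loan_id": 1, "state": "NV", "balance": 250_000.0, "rate": 6.5, "days_late": 0},
    {"loan_id": 2, "state": "CA", "balance": 180_000.0, "rate": 7.1, "days_late": 30},
    {"loan_id": 3, "state": "FL", "balance": 320_000.0, "rate": 6.9, "days_late": 90},
    {"loan_id": 4, "state": "AZ", "balance": 410_000.0, "rate": 5.8, "days_late": 0},
]


def to_columns(rows: List[Dict[str, object]]) -> Dict[str, List[object]]:
    """Pivot row-oriented records into one list per field (a column layout)."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def avg_balance_row_store(rows: List[Dict[str, object]]) -> Tuple[float, int]:
    """Row layout: each whole record is read to get at a single attribute."""
    fields_touched = 0
    total = 0.0
    for row in rows:
        fields_touched += len(row)  # every field of the record comes along
        total += float(row["balance"])
    return total / len(rows), fields_touched


def avg_balance_column_store(columns: Dict[str, List[object]]) -> Tuple[float, int]:
    """Column layout: only the one column the query names is read."""
    values = [float(v) for v in columns["balance"]]
    return sum(values) / len(values), len(values)


if __name__ == "__main__":
    print(avg_balance_row_store(ROWS))                   # (290000.0, 20)
    print(avg_balance_column_store(to_columns(ROWS)))    # (290000.0, 4)
```

Both layouts return the same average, but the row layout touches all 20 field values while the column layout touches only the 4 balances. With wide tables and billions of records, that gap is the intuition behind the performance difference described above.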

Rahman says Loan Performance is beta testing Sybase IQ 15, and he's particularly interested in the multi-node writing capability and extended support for parallel processing. "Right now we have only one writer, but we have two nodes in production and [the multi-node] feature would cut down our update times," he explains. "As for the parallelism, we see some support for that in older versions of IQ, but they've refined it in IQ 15, and without making changes in our hardware, we've seen 15% to 20% improvements in performance."

Loan Performance customers query the True Standings database directly, and response times currently range from sub-second to as long as five minutes, depending on how many millions or billions of records are being explored. A 15% to 20% improvement in query performance would mean that much higher customer satisfaction, says Rahman.
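As a rough worked example of what that could mean for the slowest queries, the sketch below applies the 15% and 20% figures to a five-minute response time. Rahman does not say how the improvement was measured, so both readings here are assumptions: one treats it as a cut in response time, the other as a gain in throughput. The function names are hypothetical.

```python
def after_time_reduction(seconds: float, pct: float) -> float:
    """Assumed reading 1: response time itself shrinks by pct percent."""
    return seconds * (100 - pct) / 100


def after_speedup(seconds: float, pct: float) -> float:
    """Assumed reading 2: throughput rises by pct percent, so time is divided by (1 + pct/100)."""
    return seconds * 100 / (100 + pct)


if __name__ == "__main__":
    slowest = 5 * 60  # the five-minute upper bound quoted above, in seconds
    for pct in (15, 20):
        print(pct, after_time_reduction(slowest, pct), round(after_speedup(slowest, pct), 1))
    # 15 255.0 260.9
    # 20 240.0 250.0
```

Under either reading, a five-minute query would finish roughly 40 to 60 seconds sooner.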

The third and final announcement this week was from Sybase IQ rival Vertica, which introduced a Vertica Virtualized Analytic Database that runs in a VMware virtual machine. This gives data warehouse pros a way to quickly add processing horsepower when spiky applications, seasonal demand, one-time projects or pilot tests would swamp fixed deployments. Costs start at $100,000 for a 1-terabyte deployment.

My takeaway on this week's news is that the options for data warehousing just keep getting better and more numerous while competition and Moore's Law keep increasing the performance and capacity per dollar. Unlike some categories I cover, the market seems dynamic, fast-moving and anything but commoditized, despite the move to commodity hardware.


About the Author

Doug Henschen

Executive Editor, Enterprise Apps

Doug Henschen is Executive Editor of information, where he covers the intersection of enterprise applications with information management, business intelligence, big data and analytics. He previously served as editor in chief of Intelligent Enterprise, editor in chief of Transform Magazine, and Executive Editor at DM News. He has covered IT and data-driven marketing for more than 15 years.
