Q&A With Gartner's Don Feinberg on Database as a Service and Cloud DBs
Microsoft, IBM, Oracle and Sun are now fueling the growing fire around the database-as-a-service and cloud database markets, but what's the difference between these offerings and what's the appeal? Database guru Don Feinberg defines terms and raises important questions about reliability and security.
What's your take on the emerging terms "Database as a Service" (DBaaS) and "Cloud Databases" (Cloud DB)?
Gartner is using "Database as a Service" [for the broad category] because we do not want to associate this only with "the cloud." To draw the distinction, companies like Kognitio and 1010data will sell you a database running on their systems at their sites. They host your database for you on their DBMS. You send them the data and they put it in a database and set everything up. You run your queries against their remote service. That is DBaaS, as opposed to managed services, because they're not having you pay for the hardware and then managing it for you. They charge you by the month for usage by the terabyte.
Now, why would I want to do that? There are several possible reasons. One is that my IT department can't do it for me, so I just work around them and go out and buy the service. A second reason might be that the IT department doesn't have the bandwidth and they encouraged me to take this route. Or maybe I have a short-term project that will last only a couple of months.
How does this compare with cloud databases?
I'll get to that in a moment, but first let's talk about DBaaS offerings that run in the cloud. The difference here is that instead of the vendor running the database service for me on their site, they are going to run it in the cloud. That's what Vertica, EnterpriseDB and Sun/MySQL are doing. You call them up and say, "I want an instance of your database in the cloud." They contract with Amazon EC2, set up the instance and give me a simple link. Now I have my database in the cloud. Oracle will also let you host your license on EC2. They're going to call it Oracle in a Cloud, but you'll have to put it on the EC2 virtual machine yourself. That's a little different from what Vertica and EnterpriseDB are doing because they will handle everything and you pay one vendor rather than dealing with Amazon on your own.
What's the difference between cloud compute power and what you get from a vendor with a data center in a particular location?
The key difference is that I don't know where it is. With the cloud, it could be in Bangalore, it could be in Russia or it could be in São Paulo, Brazil. Amazon won't tell you where their machines are for security reasons. You have no control over what machine your database is running on. You're buying a virtual machine (that's what the cloud is), and I don't know or care where it is.
This presents problems that I don't have if I'm using a DBaaS that's at a vendor's site. Number one, at a vendor site I can specify whether I'm using shared hardware and I have more control over who is using the same machines. Security-wise, if it's at a vendor site, I'm a little bit more comfortable; people use Salesforce.com in part because they are comfortable that the data is at their site. From a scalability standpoint, with DBaaS, I'm using the vendor's hardware and infrastructure, and they can tell me exactly what they do to ensure availability, redundancy and recovery. When you're in the cloud, you may not have those assurances. EC2 went down the other day and everybody went down with it.
Widely trusted companies like IBM and Microsoft are developing cloud capacity, but it sounds like you're inherently uncomfortable with that model.
Today, yes, I'm uncomfortable with the cloud. When Salesforce.com got started with software as a service (SaaS), very few companies used them. People didn't understand it. They knew that the customer data was going to be stored someplace else. They didn't know if it was secure. They didn't know if it would scale. And they also had no idea whether it would be reliable. Over time, Salesforce.com proved their offering to be secure, scalable and reliable, and today, anybody will put their information out there, but it took them seven or eight years to build up that level of confidence.
Database Services Classified
[Table: services classified as DBaaS or Cloud DB, each tagged ENT or SMB. The service names and cell values are missing from this copy.]
Let's get back to cloud databases. How do they differ from DBaaS or DBaaS running in the cloud?
In a third variation, the cloud vendor is also supplying the database, and that's Amazon SimpleDB, Google Base and Google BigTable. The problem with these, in my view, is that they are not real database management systems (DBMS) at this point. They have no transaction consistency and they have no persistence of data; that has to be built into your application.
Why is that a problem?
If two people write to the database at the same time, nobody knows who wins. The application has to lock one of us out, which means the application has to do a lot more work unless it's a single-user application. Basically, they are spreadsheets in the cloud. They are not really databases at this point.
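To make the "application has to lock one of us out" point concrete, here is a minimal sketch in Python. It is an illustration only: `NaiveStore` is a hypothetical in-memory stand-in for a store with no transaction support, not the API of SimpleDB or any other real service, and `LockingClient` and `run_writers` are names invented for this example. The application wraps every read-modify-write in a lock so that concurrent writers cannot overwrite each other's updates.

```python
import threading


class NaiveStore:
    """Hypothetical stand-in for a key-value store with no transactions."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key, 0)

    def put(self, key, value):
        self._data[key] = value


class LockingClient:
    """Application-side wrapper that serializes each read-modify-write."""

    def __init__(self, store):
        self.store = store
        self._lock = threading.Lock()

    def increment(self, key, amount=1):
        # Only one writer at a time may read, modify and write back.
        with self._lock:
            current = self.store.get(key)
            updated = current + amount
            self.store.put(key, updated)
            return updated


def run_writers(client, key, writers=4, updates=500):
    """Start several concurrent writers and return the final stored value."""

    def work():
        for _ in range(updates):
            client.increment(key)

    threads = [threading.Thread(target=work) for _ in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return client.store.get(key)


if __name__ == "__main__":
    total = run_writers(LockingClient(NaiveStore()), "balance")
    print(total)  # 4 writers x 500 updates each
```

The lock here only protects writers inside a single process. An application with many clients on different machines would need some shared coordination mechanism instead, which is exactly the extra work the application takes on when the store does not provide it.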
How do you classify Microsoft SQL Server Data Services (SSDS), which is currently in beta?
That's a full-blown version of Microsoft SQL Server that will be a DBaaS running in Microsoft's cloud. Microsoft is setting up a cloud just like Amazon did.
And IBM is partnering with Google on the cloud so we know that they'll be in this as well, but when do you start getting comfortable with the approach?
All the major vendors are going to set up cloud computing, and that will help to mature the market. But what people don't understand is that maturity is not a product feature that you can simply add. Maturity takes time.
Setting maturity aside for a moment, let's talk about the architectural differences between hosted databases and cloud databases. In the SaaS realm, it took multi-tenancy to make these apps scalable because the vendors didn't have to host a separate instance for each and every customer. Will DBaaS success similarly hinge on multi-tenancy?
Absolutely it will require multi-tenancy, but hosted DBaaS and cloud DB vendors both offer that approach today. If you look at 1010data, for example, you can choose multi-tenancy or not; having dedicated hardware is more expensive. It's the same thing for Kognitio. Multi-tenancy is where you get your savings. In the case of the cloud, it's one big computer and you buy virtual machines, so clearly it's multi-tenancy.
What are you more concerned about, having your data on a shared server (with virtual partitions) or having your data in the cloud on a shared server?
If it's Salesforce.com or 1010data or Kognitio and they are serving you with multi-tenancy, you still know they are physically managing and securing that computer. In the cloud, you don't know where it is. If I buy two virtual machines, one might be in India and one might be in China. What happens if there's a civil uprising in China and the server that I'm running on happens to be there? Then I have a problem.
Today, I trust me first, I trust a hosted service second and I trust the cloud the least. I'd say it will be two to four years before the cloud gets to maturity. As they prove that they have the reliability and all the required capabilities for disaster recovery, then people will start to accept and trust the cloud.
What's the appeal until that time?
You're going to see it used for specific purposes. For a small company, it might be used as the infrastructure for everything. I don't foresee larger companies using it that way, but they will use it for development purposes and short-term projects. Instead of having to set up projects in a data center and having special equipment that may sit idle if nobody is developing, I can buy those services from the cloud, develop on whatever I want to use and then move it into my data center when it's ready for production use. I could also use it for short-term projects. I could set up a data mart for a one-time campaign. Or let's say I need to get all my customer names in sync, but then new data quality features in my applications will keep those records clean. I can do a one-time data-cleansing project on a cloud platform.
Why can't you see big companies turning to the cloud as their primary platform?
Large companies with big data centers can probably do it as cheaply or cheaper than a cloud offering. If I had to sum it up, I'd say there is a future for the cloud, but initially it's going to be mostly vendors using it for developing third-party software and doing proof-of-concept projects. It will then move into development projects for corporations. Finally, it will move into short-term projects and platform use by smaller companies, but it will be a long time before we see it go much beyond that.