Altiscale Preps Large-Scale Hadoop As A Service

Startup, led by former Yahoo CTO Raymie Stata, aims to turn 10- and 20-node Hadoop users into 100- and 500-node users.

Charles Babcock, Editor at Large, Cloud

July 1, 2013


Another form of Hadoop expertise has emerged with the announcement of Altiscale, a small startup that is working on offering large-scale Hadoop operation as a service.

Raymie Stata, CEO of Altiscale, was CTO at Yahoo when Doug Cutting and other Yahoo developers put Hadoop to work indexing the Web. But there's already so much former Yahoo expertise in the Hadoop marketplace that Stata felt compelled to ask in an Altiscale blog post on June 12: "Does the world need yet another Hadoop startup from a bunch of former Yahoos?"

The answer, obviously, was yes. Altiscale's plan is to offer large-scale Hadoop as a service, and that approach became public June 19 at the GigaOm Structure show in San Francisco. Altiscale's product is still in private beta, but Stata said in an interview that it will be ready when the production version of Hadoop 2.0 comes out of the Apache Software Foundation. That's expected to happen in August.

Many companies have found Hadoop highly useful and are doing important work on 10- or 20-node clusters. Hadoop is inherently cluster software: it spreads unstructured data across a cluster, then hands each piece of data to a CPU close to it when it comes time to sort and process it. "We want to target the folks who are using Hadoop on 10-20 node clusters," Stata explained. "They're using them successfully, and their usage is growing."
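The idea of processing data on a CPU close to where it is stored can be illustrated with a small sketch. This is not Hadoop's scheduler or its API; the function name `assign_tasks`, the node names and the slot counts are all hypothetical, and the greedy strategy is an assumption chosen for brevity.

```python
def assign_tasks(block_locations, free_slots):
    """Greedy, illustrative locality-aware assignment (not Hadoop's real scheduler).

    block_locations: dict mapping block id -> list of nodes holding a replica
    free_slots: dict mapping node -> number of tasks the node can still accept
    Returns a dict mapping block id -> (chosen node, True if the node holds the block).
    """
    slots = dict(free_slots)  # work on a copy so the caller's dict is untouched
    assignment = {}
    for block, replicas in block_locations.items():
        # First choice: a node that already stores a replica and has a free slot.
        local = [node for node in replicas if slots.get(node, 0) > 0]
        if local:
            node = max(local, key=lambda n: slots[n])
            is_local = True
        else:
            # Fallback: any node with spare capacity; the data must cross the network.
            candidates = [n for n, free in slots.items() if free > 0]
            if not candidates:
                raise RuntimeError("no free slots left in the cluster")
            node = max(candidates, key=lambda n: slots[n])
            is_local = False
        slots[node] -= 1
        assignment[block] = (node, is_local)
    return assignment


if __name__ == "__main__":
    blocks = {"b1": ["n1", "n2"], "b2": ["n1"], "b3": ["n3"]}
    print(assign_tasks(blocks, {"n1": 1, "n2": 1, "n3": 1}))
```

On a small cluster almost every task finds a local slot. As the number of nodes and jobs grows, keeping that true is part of the operational work that becomes harder.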

But clusters are unlikely to grow at the same pace as users' appetite for more Hadoop analytics. As Hadoop clusters get bigger, managing them becomes more complicated and time-consuming until Hadoop users face diminishing returns on their effort, Stata said.

His firm wants to use large-scale Hadoop expertise to turn those 10- and 20-node users into 100- and 500-node users. "We want to build big Hadoop clusters [and make them available as an online service]," Stata said.

The future Altiscale service will make use of the YARN resource manager, which is expected to be part of Hadoop 2.0. YARN moves Hadoop beyond its one-job-at-a-time batch processing style of operation and allows it to run multiple applications simultaneously. With YARN, Hadoop is no longer only a MapReduce system: other processing models, such as message passing, can share the same cluster, and the cluster's CPU power can be matched to the data more dynamically.
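A toy model shows what it means for several applications to share one cluster through a resource manager. It is a sketch only: the class `ToyResourceManager`, its methods and the application names are hypothetical and do not correspond to YARN's actual API.

```python
class ToyResourceManager:
    """Illustrative stand-in for a cluster resource manager (not YARN's API)."""

    def __init__(self, total_containers):
        self.free = total_containers  # containers not yet handed out
        self.granted = {}             # application name -> containers currently held

    def request(self, app, wanted):
        """Grant up to `wanted` containers to `app`; return how many were granted."""
        grant = min(wanted, self.free)
        self.free -= grant
        self.granted[app] = self.granted.get(app, 0) + grant
        return grant

    def release(self, app, count):
        """Return up to `count` of the containers held by `app` to the shared pool."""
        held = self.granted.get(app, 0)
        returned = min(count, held)
        self.granted[app] = held - returned
        self.free += returned
        return returned


if __name__ == "__main__":
    rm = ToyResourceManager(total_containers=10)
    rm.request("mapreduce-job", 6)   # a batch job takes part of the cluster
    rm.request("streaming-app", 6)   # a second application runs alongside it
    rm.release("mapreduce-job", 3)   # freed capacity becomes available again
    rm.request("streaming-app", 2)
    print(rm.granted, rm.free)
```

The point of the sketch is that capacity moves between applications while both are running, rather than one job holding the whole cluster until it finishes.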

By catching the release of Hadoop 2.0, Altiscale hopes to bring enough new features to market to capture business from the many Hadoop companies already firmly established in the marketplace, such as Hortonworks and Cloudera.

Previously known as Verticloud, Altiscale changed its name to avoid disputes over trademark and copyright.

About the Author

Charles Babcock

Editor at Large, Cloud

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive Week. He is a graduate of Syracuse University, where he obtained a bachelor's degree in journalism. He joined the publication in 2003.
