“Big Data” is hot these days. This year’s Hadoop Summit 2011 attracted nearly 1,600 people, double the size of last year’s conference. Topics discussed at the Summit ranged from questions and concerns about Hortonworks, a fresh spinoff from Yahoo!, to various technical and use case discussions around Hadoop. While center stage was dominated by full distribution players such as Cloudera, Hortonworks and MapR, newcomers that focus on alternative, best-of-breed component solutions in the stack are also emerging and gaining traction in the market. This is not a surprise: the market for “Big Data” is still young and fragmented, and people at various phases of the technology adoption lifecycle are looking for the solutions best suited to their needs.
So the question is: full distribution or best-of-breed?
At Platform Computing, we believe there is a need for both. For someone who is new to Hadoop and would like to experiment with this new programming model, a full distribution offers an easy way to get up to speed, because it contains all the elements in the stack needed for running MapReduce applications. But for someone who is already Hadoop-savvy and would like to bring their MapReduce applications into production, a whole new set of requirements must be met. Customers who need a production-ready solution are seeking enterprise-class capabilities, such as 1) superior predictability of the infrastructure and distributed runtime engine for MapReduce jobs, so that it meets the organization’s SLA requirements; 2) high resource utilization, to eliminate siloed environments while allowing organizations to “do more with less”; 3) a rich set of management capabilities for operational efficiency; 4) high availability, to ensure that hardware and service failures do not require jobs to be manually recovered or restarted from scratch; and 5) of course, faster performance.
The full distribution solutions currently on the market do not deliver the capabilities the mature market is looking for. That is why Platform Computing is delivering Platform MapReduce, a best-of-breed, distributed runtime engine for MapReduce workloads, to fill the gap.
Launched on June 28, the eve of Hadoop Summit, Platform MapReduce gained strong traction at the event. As expected, users who have had a few years of experience with either open source Hadoop or commercial solutions are well aware of the shortcomings in the existing options, and they were excited to hear about Platform MapReduce and the enterprise-class capabilities it provides.
Built on a decade’s worth of Platform Computing experience in managing and scheduling workloads in distributed environments, Platform MapReduce is designed using the same core technology that has powered the most demanding, mission-critical workloads of many Fortune 1000 customers. Bringing that capability to MapReduce environments is a natural market expansion for the company. Platform MapReduce addresses the major issues that are holding back the current market, and it is designed to help organizations overcome the barriers to moving MapReduce applications into production. The positive responses we’ve already received from the market are a solid validation of our solution, and we look forward to bringing a new set of capabilities to the Hadoop world.