The Hadoop Big Data Management Challenge
IT infrastructure for enterprise applications such as mission-critical Big Data clusters suffers from high costs and poor agility/scalability.
Virtualization has had limited penetration in this segment because of its performance overhead and inability to guarantee SLAs. Consequently, Big Data applications like Hadoop are mostly deployed on dedicated physical clusters, leading to cluster sprawl, low hardware utilization, over-provisioning for peak demand, long lead times to scale or deploy clusters, and the need to copy petabytes of data for each cluster.
There are multiple challenges that need to be resolved.
Import once, deploy in minutes, guarantee performance and manage Hadoop lifecycle
The Robin Application Virtualization Platform provides a complete out-of-the-box solution for hosting Hadoop in your big data pipeline on a shared platform built from your existing hardware, whether proprietary, commodity, or cloud components.
Hadoop deployment on Robin
Cloudera Cluster deployment on Robin
Robin’s application-aware fabric controller simplifies cluster deployment and lifecycle management using container-based “virtual clusters.” Each cluster node is deployed within a container, and the collection of containers running across servers forms the “virtual cluster.” This allows Robin to automate all tasks pertaining to the creation, scheduling, and operation of these virtual application clusters, to the extent that an entire Hadoop cluster can be provisioned or cloned with a single click and minimal upfront planning or configuration.
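The virtual cluster idea described above can be pictured with a small Python sketch. It is purely illustrative and is not Robin's API: the names Container, VirtualCluster, provision and clone are hypothetical, and round-robin placement is an assumed policy chosen only to keep the example short.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass(frozen=True)
class Container:
    role: str    # hypothetical role label, e.g. "namenode" or "datanode"
    server: str  # physical host the container is placed on


@dataclass
class VirtualCluster:
    name: str
    containers: list = field(default_factory=list)


def provision(name, roles, servers):
    """Place one container per cluster node, cycling through the servers."""
    if not servers:
        raise ValueError("no servers available")
    hosts = cycle(servers)  # assumed round-robin placement policy
    return VirtualCluster(name, [Container(role, next(hosts)) for role in roles])


def clone(cluster, new_name):
    """Copy the container layout under a new name (data handling is out of scope)."""
    return VirtualCluster(new_name, list(cluster.containers))
```

Calling provision with four roles and three servers places the fourth container back on the first server, and clone returns an independent cluster object with the same layout.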
Cluster Consolidation and QoS
Controlling IOPS in a shared environment
Robin eliminates cluster sprawl by deploying virtual clusters on shared hardware. The key to successful multi-tenancy is the ability to provide performance isolation and dynamic performance controls. The Robin application-aware fabric controller equips each virtual cluster with dynamic QoS controls for every resource that it depends on: CPU, memory, network and storage. Robin lets you allocate minimum and maximum IOPS to each application as required. This creates a truly elastic infrastructure that delivers CPU, memory, network and storage resources, in both capacity and performance, to an application at the moment it is needed.
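To make the minimum and maximum IOPS idea concrete, here is a minimal Python sketch of one way such limits could be honored on a shared device. It is an assumption for illustration, not Robin's actual scheduler: IopsPolicy and allocate_iops are hypothetical names, and the policy shown (grant every minimum first, then hand out spare capacity in equal shares up to each maximum) is just one common approach.

```python
from dataclasses import dataclass


@dataclass
class IopsPolicy:
    name: str
    min_iops: int  # guaranteed floor
    max_iops: int  # hard ceiling


def allocate_iops(policies, capacity):
    """Return {name: iops}: grant every floor, then share the spare capacity."""
    floors = sum(p.min_iops for p in policies)
    if floors > capacity:
        raise ValueError("guaranteed minimums exceed device capacity")
    grant = {p.name: p.min_iops for p in policies}
    spare = capacity - floors
    # Hand out spare IOPS in equal shares to clusters still below their
    # ceiling; whatever a capped cluster cannot use goes to the next round.
    active = [p for p in policies if grant[p.name] < p.max_iops]
    while spare > 0 and active:
        share = max(spare // len(active), 1)
        for p in active:
            extra = min(share, p.max_iops - grant[p.name], spare)
            grant[p.name] += extra
            spare -= extra
        active = [p for p in active if grant[p.name] < p.max_iops]
    return grant
```

In this sketch no cluster ever drops below its floor or exceeds its ceiling, and the function refuses a configuration whose combined minimums exceed what the device can deliver.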
Robin provides out-of-the-box 2-way or 3-way replication, which removes the need to configure application-specific replication. Robin also allows two Hadoop clusters to share data on the same HDFS. This gives applications a unified view across data silos and lets them share data.