The Hadoop Big Data Management Challenge

IT infrastructure for enterprise applications such as mission-critical Big Data clusters suffers from high costs and poor agility/scalability.

Virtualization has had limited penetration in this segment because of its performance overhead and inability to guarantee SLAs. Consequently, Big Data applications like Hadoop are mostly deployed on dedicated physical clusters, leading to cluster sprawl, low hardware utilization, over-provisioning for peak demand, long lead times to scale or deploy clusters, and the need to copy petabytes of data for each cluster.

Several challenges need to be resolved:

  • Manage multiple technologies and the hand-coded integration among a diverse collection of open source projects
  • Improve hardware utilization and reduce CAPEX
  • Reduce OPEX
  • Handle seasonal spikes and growth
  • Share data with development


Hadoop Challenges Resolved by Robin Application Virtualization Platform

Robin Solution

Import once, deploy in minutes, guarantee performance, and manage the Hadoop lifecycle

  • Create a pre-baked image of all Hadoop components with the latest version of each component
  • Define application details in a single YAML file
  • Deploy Hadoop within minutes
  • Get Application-to-Spindle QoS
  • Manage the application lifecycle
  • Handle seasonal spikes & growth with elastic scaling
  • Share data between dev and prod environments
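
As a rough illustration of what a single-file application definition might capture, the sketch below models one in Python. The field names (`roles`, `cpu_cores`, `min_iops`, and so on), the image name, and the validation rules are hypothetical and are not Robin's actual schema; the point is only that roles, resources, and QoS limits can live in one declarative document.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical model of an application definition. The field names are
# illustrative assumptions, not Robin's actual YAML schema.
@dataclass
class Role:
    name: str           # e.g. "namenode" or "datanode"
    count: int          # number of containers for this role
    cpu_cores: int
    memory_gb: int
    min_iops: int = 0
    max_iops: int = 0   # 0 means "no cap" in this sketch


@dataclass
class AppDefinition:
    name: str
    image: str          # pre-baked image holding the Hadoop components
    roles: List[Role] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the definition is usable."""
        problems = []
        if not self.roles:
            problems.append("at least one role is required")
        for role in self.roles:
            if role.count < 1:
                problems.append(f"{role.name}: count must be >= 1")
            if role.max_iops and role.min_iops > role.max_iops:
                problems.append(f"{role.name}: min_iops exceeds max_iops")
        return problems

    def total_containers(self) -> int:
        return sum(role.count for role in self.roles)


# Example definition with made-up names and sizes.
hadoop = AppDefinition(
    name="hadoop-dev",
    image="hadoop-prebaked:example",
    roles=[
        Role("namenode", count=1, cpu_cores=4, memory_gb=16),
        Role("datanode", count=3, cpu_cores=8, memory_gb=32,
             min_iops=500, max_iops=2000),
    ],
)
```

Keeping the whole application in one declarative object like this is what makes cloning or redeploying a cluster a repeatable operation rather than a manual procedure.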

Robin Application Virtualization Platform provides a complete out-of-the-box solution for hosting Hadoop in your big data pipeline on a shared platform built from your existing hardware, whether proprietary, commodity, or cloud components.

Hadoop deployment on Robin

Robin Benefits for Hadoop

Agile Provisioning

  • Simplify cluster deployment with an application-aware fabric controller: provision an entire Hadoop cluster with a single click and be operational within minutes
  • Deploy container-based “virtual clusters” – collections of containers running across commodity servers
  • Automate tasks – create, schedule and operate virtual application clusters
  • Scale up or scale out instantaneously to meet application performance demands

Robin’s application-aware fabric controller simplifies cluster deployment and lifecycle management using container-based “virtual clusters.” Each cluster node is deployed within a container, and the collection of containers running across servers forms the “virtual cluster.” This allows Robin to automate all tasks pertaining to the creation, scheduling, and operation of these virtual application clusters, to the extent that an entire Hadoop cluster can be provisioned or cloned with a single click and minimal upfront planning or configuration.
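
To make the “virtual cluster” idea concrete, here is a minimal sketch of placing cluster-node containers across servers. It is not Robin's scheduler; the function name, the server names, and the "most free slots first" rule are assumptions chosen only to show how a set of containers spread over shared hosts can be treated as one cluster.

```python
def place_containers(node_names, server_slots):
    """Assign each cluster node (one container each) to a server.

    server_slots maps a server name to the number of containers it can
    still host. Each node goes to the server with the most free slots,
    which spreads the containers across servers.
    """
    free = dict(server_slots)  # copy so the caller's dict is not modified
    placement = {}
    for node in node_names:
        server = max(free, key=lambda s: free[s])
        if free[server] == 0:
            raise RuntimeError("not enough capacity for " + node)
        placement[node] = server
        free[server] -= 1
    return placement


# Hypothetical four-node Hadoop cluster on three shared servers.
virtual_cluster = place_containers(
    ["namenode", "datanode-1", "datanode-2", "datanode-3"],
    {"server-a": 2, "server-b": 2, "server-c": 1},
)
```

The returned mapping is the virtual cluster: a record of which container runs on which server. A real controller would also weigh CPU, memory, storage locality, and failure domains.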

Cluster Consolidation and QoS

  • Eliminate cluster sprawl with virtual application clusters on shared hardware
  • Enable multi-tenancy with performance isolation and dynamic performance controls
  • Leverage dynamic QoS controls for every resource – CPU, memory, network and storage

Robin eliminates cluster sprawl by deploying virtual clusters on shared hardware. The key to successful multi-tenancy is the ability to provide performance isolation and dynamic performance controls. The Robin application-aware fabric controller equips each virtual cluster with dynamic QoS controls for every resource that it depends on – CPU, memory, network and storage. Robin lets you allocate minimum and maximum IOPS to each application as required. This creates a truly elastic infrastructure that delivers CPU, memory, network and storage resources – both capacity and performance – to an application at the instant they are needed.
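
The effect of minimum and maximum IOPS limits on a shared device can be sketched as follows. This is an illustrative allocation policy, not Robin's implementation: the function, the application names, and the "guarantee minimums first, then split the remainder evenly up to each cap" rule are all assumptions.

```python
def allocate_iops(total_iops, apps):
    """Split a device's IOPS among applications.

    apps maps an application name to (min_iops, max_iops, demand).
    Returns a dict of name -> granted IOPS.
    """
    # Step 1: grant every application its guaranteed minimum
    # (or its demand, if it wants less than the minimum).
    grant = {name: min(lo, demand) for name, (lo, hi, demand) in apps.items()}
    if sum(grant.values()) > total_iops:
        raise ValueError("minimum guarantees exceed device capability")
    remaining = total_iops - sum(grant.values())

    def target(name):
        lo, hi, demand = apps[name]
        return min(hi, demand)  # never grant more than the cap or the demand

    # Step 2: share what is left evenly among applications that still
    # want more, without exceeding any application's cap.
    wanting = {n for n in apps if grant[n] < target(n)}
    while remaining > 0 and wanting:
        share = remaining // len(wanting)
        if share == 0:
            break
        for name in sorted(wanting):
            extra = min(share, target(name) - grant[name])
            grant[name] += extra
            remaining -= extra
        wanting = {n for n in wanting if grant[n] < target(n)}
    return grant


# Hypothetical tenants sharing a device rated at 10,000 IOPS.
grants = allocate_iops(10_000, {
    "hadoop-prod": (4000, 8000, 9000),
    "hadoop-dev": (1000, 3000, 2000),
    "analytics": (500, 2000, 5000),
})
```

With these made-up numbers, `hadoop-prod` ends up with 6000 IOPS, and `hadoop-dev` and `analytics` with 2000 each: every tenant receives at least its minimum, and none exceeds its cap or its demand.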

Data Sharing

  • Enable data sharing by pointing one cluster at the HDFS of another
  • Out-of-the-box 2-way or 3-way replication

Robin provides out-of-the-box 2-way or 3-way replication, which removes the need to configure replication separately for each application. Robin also allows two Hadoop clusters to share data on the same HDFS. This gives applications a unified view across data silos, so they can share data.
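
As a quick back-of-the-envelope aid for choosing between 2-way and 3-way replication, the sketch below computes usable capacity from raw capacity. It assumes every block is stored as full copies and ignores metadata and reserved space; the function name and the 120 TB figure are illustrative only.

```python
def usable_capacity_tb(raw_tb, replication_factor):
    """Usable capacity when every block is stored replication_factor times."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return raw_tb / replication_factor


raw = 120  # hypothetical raw capacity in TB
two_way = usable_capacity_tb(raw, 2)    # 60.0 TB usable, one copy may be lost
three_way = usable_capacity_tb(raw, 3)  # 40.0 TB usable, two copies may be lost
```

The trade-off is the usual one: each extra copy costs usable capacity and lets the data survive one more lost copy.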