The Big Data Infrastructure Challenge

IT infrastructure for enterprise applications such as databases and mission-critical Big Data clusters suffers from high costs and poor agility/scalability.

Virtualization has had limited penetration in this segment because of its performance overhead and inability to guarantee SLAs. Consequently, these applications are mostly deployed on dedicated physical clusters, leading to cluster sprawl, low hardware utilization, over-provisioning for peak demand, long lead times to deploy or scale clusters, and the need to copy petabytes of data for each cluster.

This problem applies to cloud deployments as well, where dedicated resources and high-performance storage translate to higher cost.


Robin Solution for Big Data

Robin can help create an elastic, agile, high-performance foundation for data-driven applications.

The platform provides a complete out-of-the-box solution for hosting all the data and data-driven distributed applications in an enterprise on a shared platform built from commodity or cloud components. The solution can be deployed on bare metal or on virtual machines, allowing organizations to rapidly deploy multiple instances of their data-driven applications, on premises or in the cloud, without creating additional copies of data. It decouples the compute and storage layers and combines a highly distributed, cost-optimized persistent storage layer with an integrated host-side distributed caching layer to provide high-performance storage access to applications.

The efficiencies unlocked by Robin result in potential CAPEX and OPEX savings of up to 40%.

Benefits for Big Data 

Storage Efficiency and Sharing

  • Pool low-cost commodity storage (or cheap cloud storage such as S3) into a single logical namespace that is accessible to all applications running on compute nodes
  • Get maximum data security with built-in data volume level encryption
  • Enable data sharing with thin clones of the entire Big Data cluster
  • Out-of-the-box 2-way or 3-way replication

The Robin scale-out storage layer creates a single logical entity out of a pool of commodity storage media and presents it as a block storage pool to the containers running on the compute nodes. This allows an organization to pool its low-cost commodity storage (or cheap cloud storage such as S3) into a single logical namespace accessible to all applications running on the compute nodes. Application data is sliced into segments and distributed across disks in multiple storage nodes. Robin provides out-of-the-box 2-way or 3-way replication, which removes the need to configure application-specific replication.
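To make the segment distribution and replication described above concrete, here is a minimal sketch. It assumes fixed-size segments and a simple rotating placement; the function names, the node names, and the placement policy are illustrative assumptions, not Robin's actual algorithm or API.

```python
from typing import Dict, List


def slice_into_segments(data: bytes, segment_size: int) -> List[bytes]:
    """Cut a volume's bytes into fixed-size segments (the last may be shorter)."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [data[i:i + segment_size] for i in range(0, len(data), segment_size)]


def place_segments(num_segments: int, nodes: List[str], replicas: int) -> Dict[int, List[str]]:
    """Assign each segment to `replicas` consecutive nodes, rotating the start node.

    Hypothetical policy: segment k starts at node k mod N, so copies of one
    segment land on different nodes as long as node names are unique.
    """
    if not 1 <= replicas <= len(nodes):
        raise ValueError("replicas must be between 1 and the number of nodes")
    return {
        seg: [nodes[(seg + r) % len(nodes)] for r in range(replicas)]
        for seg in range(num_segments)
    }


if __name__ == "__main__":
    # Tiny segment size chosen only so the output is easy to read.
    segments = slice_into_segments(b"abcdefghij", segment_size=4)
    layout = place_segments(len(segments), ["node-a", "node-b", "node-c", "node-d"], replicas=3)
    for seg, owners in layout.items():
        print(seg, segments[seg], owners)
```

With 3-way replication in this sketch, any single storage node can be lost and every segment still has two surviving copies, without the application configuring replication itself.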

Agile Provisioning

  • Simplify cluster deployment using the application-aware fabric controller – provision an entire Hadoop cluster with one click and be operational within minutes
  • Deploy container-based “virtual clusters” – collections of containers running across commodity servers
  • Automate tasks – create, schedule and operate virtual application clusters
  • Scale up or scale out instantaneously to meet application performance demands

Robin’s application-aware fabric controller simplifies cluster deployment and lifecycle management using container-based “virtual clusters.” Each cluster node is deployed within a container, and the collection of containers running across servers forms the “virtual cluster.” This allows Robin to automate all tasks pertaining to the creation, scheduling, and operation of these virtual application clusters, to the extent that an entire Hadoop cluster can be provisioned or cloned with a single click and minimal upfront planning or configuration.
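As a rough illustration of how a virtual cluster can be scheduled as a set of containers across shared servers, the sketch below places each container on the server that currently has the most free CPUs and enough memory. The `Server` and `Container` types, the `schedule` function, and the placement rule are all hypothetical; the source does not describe how the fabric controller actually makes placement decisions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Server:
    name: str
    free_cpus: int
    free_mem_gb: int


@dataclass
class Container:
    name: str            # one cluster node, e.g. a Hadoop role
    cpus: int
    mem_gb: int
    server: Optional[str] = None


def schedule(containers: List[Container], servers: List[Server]) -> Dict[str, str]:
    """Place every container of a virtual cluster; return container -> server."""
    for c in containers:
        fits = [s for s in servers if s.free_cpus >= c.cpus and s.free_mem_gb >= c.mem_gb]
        if not fits:
            raise RuntimeError(f"no server can host {c.name}")
        # Assumed policy: prefer the least-loaded server (most free CPUs).
        target = max(fits, key=lambda s: s.free_cpus)
        target.free_cpus -= c.cpus
        target.free_mem_gb -= c.mem_gb
        c.server = target.name
    return {c.name: c.server for c in containers}


if __name__ == "__main__":
    servers = [Server("server-1", 8, 32), Server("server-2", 8, 32)]
    cluster = [
        Container("namenode", 4, 16),
        Container("datanode-1", 4, 16),
        Container("datanode-2", 4, 16),
    ]
    print(schedule(cluster, servers))
```

Because the whole cluster is described as data (a list of containers with resource needs), creating or cloning one becomes a single call rather than a manual, per-node procedure.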

Cluster Consolidation and QoS

  • Eliminate cluster sprawl with virtual clusters on shared hardware
  • Enable multi-tenancy with performance isolation and dynamic performance controls
  • Leverage dynamic QoS controls for every resource – CPU, memory, network and storage
  • Build elastic infrastructure that provides all resources to each application as needed

Robin eliminates cluster sprawl by deploying virtual clusters on shared hardware. The key to successful multi-tenancy is the ability to provide performance isolation and dynamic performance controls. The Robin application-aware fabric controller equips each virtual cluster with dynamic QoS controls for every resource that it depends on – CPU, memory, network and storage. This creates a truly elastic infrastructure that delivers CPU, memory, network and storage resources – both capacity and performance – to an application at the moment they are needed.
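One common way to implement a dynamic performance control of this kind is a token bucket whose refill rate can be changed while the workload runs. The sketch below applies it to storage I/O operations; it is a generic technique shown for illustration, and the `TokenBucket` class and its parameters are assumptions rather than a description of Robin's QoS mechanism.

```python
class TokenBucket:
    """Rate limiter: each operation spends tokens, which refill over time."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate        # tokens added per second (e.g. an IOPS limit)
        self.burst = burst      # maximum tokens that can accumulate
        self.tokens = burst
        self.last = 0.0         # timestamp of the previous refill

    def set_rate(self, rate: float) -> None:
        """Adjust the limit at runtime; takes effect on the next refill."""
        self.rate = rate

    def allow(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then admit the operation if affordable."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=2.0, burst=2.0)
    print([bucket.allow(0.0) for _ in range(3)])  # burst is spent, then throttled
    bucket.set_rate(8.0)                          # tenant is granted more IOPS
    print(bucket.allow(0.25))
```

The clock is passed in explicitly (`now`) so the behavior is deterministic and easy to test; a real controller would read a monotonic clock and keep one bucket per tenant and per resource.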

Time Travel

  • Take unlimited cluster snapshots
  • Restore or refresh a cluster to any point in time using snapshots

Robin provides out-of-the-box support for application time travel. Cluster-level distributed snapshots taken at predefined intervals make it possible to restore the entire application if anything goes wrong. Robin recommends that admins take a snapshot before making any major change to the cluster, whether upgrading the software version or changing the configuration. If anything goes wrong, the entire cluster can be restored to the last known snapshot in a matter of minutes.
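The recommended workflow, snapshot first, then change, then restore on failure, can be sketched as follows. This is a toy in-memory model: `SnapshotStore` and `apply_with_snapshot` are hypothetical names, the cluster state is a plain dictionary, and snapshots are full deep copies, whereas a real storage layer would typically use space-efficient copy-on-write snapshots.

```python
import copy
from typing import Any, Callable, Dict


class SnapshotStore:
    """Holds the live cluster state plus labeled point-in-time copies."""

    def __init__(self, state: Dict[str, Any]) -> None:
        self.state = state
        self._snapshots: Dict[str, Dict[str, Any]] = {}

    def take(self, label: str) -> None:
        self._snapshots[label] = copy.deepcopy(self.state)

    def restore(self, label: str) -> None:
        # Copy again so the stored snapshot stays untouched by later changes.
        self.state = copy.deepcopy(self._snapshots[label])


def apply_with_snapshot(
    store: SnapshotStore, label: str, change: Callable[[Dict[str, Any]], None]
) -> bool:
    """Snapshot, apply `change` in place, and roll back if it raises."""
    store.take(label)
    try:
        change(store.state)
        return True
    except Exception:
        store.restore(label)
        return False


if __name__ == "__main__":
    store = SnapshotStore({"version": "v1", "nodes": 3})

    def failing_upgrade(state: Dict[str, Any]) -> None:
        state["version"] = "v2"              # partial change is applied...
        raise RuntimeError("upgrade failed")  # ...before the failure

    print(apply_with_snapshot(store, "before-upgrade", failing_upgrade))
    print(store.state)
```

The partial change made by the failed upgrade is discarded because the whole state is rolled back to the labeled snapshot, not patched field by field.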