Rubrik Addresses Age-Old Backup Problems

The backup and recovery process is complex and brittle, and virtualization has only made the situation worse. To meet the ever-increasing demands of their users and application owners, IT professionals have been forced to try a never-ending parade of point data protection solutions. These point solutions, combined with existing legacy methods, have created data protection sprawl that makes the environment even more complex. Rubrik has introduced a converged data management solution that combines hardware and software into a single platform so that the process can become easier and more flexible.

Among the components found in most mid-sized to very large backup environments are backup servers, media servers, search servers, disk-based backup repositories, tape libraries, data protection networks, and tape archives. There are also catalog databases to track all backups and media, as well as licensing, deduplication, and replication metadata indexes, all of which need to be tracked and managed. To compound matters further, businesses today are demanding instantaneous recovery, faster discovery of archived files, and rapid access to test and development resources.

Some vendors have tried to simplify the backup process by providing backup appliances with pre-integrated backup software. This reduces the initial installation and deployment time, but it does not address the long-term issues.

Rubrik Offers New Approach to Old Problem

Rubrik is a new company whose solution, which it calls a Converged Data Management Platform, is designed to meet these challenges. It borrows the hyper-converged approach from server virtualization and has the potential to eliminate many of the new point data protection products and, eventually, legacy backup systems. It accomplishes this consolidation by replacing all of these parts with a single 2U appliance that contains converged backup with globally deduplicated storage in a scale-out, distributed computing architecture similar to that used by Amazon and Google. In addition to having the potential to significantly reduce the number of hardware and software products needed in a backup architecture, it can also greatly simplify data protection, recovery and archive operations.

While Rubrik is a new company, it has an experienced management and technical team that includes key engineers from Google, VMware, Facebook, Data Domain and Oracle Exadata. It has been in-market with a GA product since June.

Rubrik’s new data protection software was designed from the ground up to complement the web-scale hardware architecture that scales backup server compute and hybrid storage capacity across multiple nodes. It is a vendor-agnostic platform with the ability to support any third-party ecosystem technology by building additional modules on the integration layer, which exposes the API set for building custom integration points into applications, hypervisors, containers and protocols.

A Closer Look at the Rubrik Appliance

Rubrik is initially focusing on data protection for VMware virtual environments, which makes sense considering the increasing amount of virtualization in most data centers today as well as the complexity of managing and protecting the data in these environments. The company also provides protection for physical Oracle servers. In the near future, Rubrik will add the ability to back up physical MS-SQL servers as well as support for the KVM hypervisor.

Rubrik’s new appliance is a 2U device built from industry-standard commodity hardware. Each appliance contains four nodes (servers) in a cluster setup, which can be scaled out non-disruptively by simply adding more nodes to handle increasing volumes of data as well as increased data protection compute demands. Each cluster can be scaled out to thousands of nodes, but regardless of the number of nodes deployed, they are still managed as a single system through a GUI or an API-driven interface.

Rubrik is one of the first data protection vendors to utilize a flash tier to provide high-speed data ingestion, which solves a common cause of failed backups. Moreover, as primary storage becomes all-flash, secondary storage needs to provide similar performance so that applications can be recovered quickly. Additionally, the appliance applies content-aware global deduplication and compression to maximize data efficiency.

At the heart of the Rubrik appliance is the Rubrik Converged Data Management (RCDM) system, which consists of a file system, metadata service, cluster management and a task framework, all of which are distributed across the cluster. This structure provides scalability while eliminating performance bottlenecks. RCDM also delivers an agentless, rack-and-go experience through integration with existing partner APIs such as VMware VADP.

Some of its more notable features are:

  • Agentless Discovery – Uses VMware APIs to scan vCenter and automatically builds a list of all VMs and applications.
  • Google-like Global Search – Uses a global index across all on-premises and cloud storage. File searches can be entered into the UI (User Interface) for any VM, and Rubrik searches all file versions across all VMs to produce immediate suggestion lists, along with various options once the desired file is located.
  • Easy to set up “Rack and Go” system – can be up and running in 15 minutes.
  • Secure Cloud Out – can replicate deduplicated data for off-site protection via an integrated, secure connection with Amazon S3 services.
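
The global search feature above can be pictured as an index keyed by file name that records every VM and snapshot in which a file appears. The following is a minimal sketch under that assumption only; the class, method names and sample paths are hypothetical and do not describe Rubrik's actual index or API.

```python
from collections import defaultdict


class FileIndex:
    """Toy global index: file name -> every (vm, snapshot_date, path) seen."""

    def __init__(self):
        self._versions = defaultdict(list)

    def add(self, vm, snapshot_date, path):
        # Index by the lower-cased base name so lookups ignore case and directory.
        name = path.rsplit("/", 1)[-1].lower()
        self._versions[name].append((vm, snapshot_date, path))

    def suggest(self, prefix, limit=5):
        # Suggestion list: indexed names that start with what the user typed.
        prefix = prefix.lower()
        return sorted(n for n in self._versions if n.startswith(prefix))[:limit]

    def versions(self, name):
        # All versions of one file across all VMs, newest snapshot first.
        # Dates are ISO strings (YYYY-MM-DD), so string order is date order.
        found = self._versions.get(name.lower(), [])
        return sorted(found, key=lambda v: v[1], reverse=True)


index = FileIndex()
index.add("vm1", "2015-06-01", "/home/a/report.doc")
index.add("vm2", "2015-07-01", "/srv/report.doc")
index.add("vm1", "2015-06-01", "/home/a/readme.txt")
```

Typing "re" would suggest both indexed names, and choosing report.doc would list its copies on vm2 and vm1, from which a restore point could be picked.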

Examining the Core of the Converged Data Management System

The core of Rubrik’s Converged Data Management system consists of four main components.

  • The Rubrik Cloud-Scale File System is a distributed file system that stores and manages versioned data. It is Fault Tolerant (resilient to multiple node and disk failures), Flash-Optimized (built for a hybrid flash/disk architecture to maximize I/O throughput) and Storage Efficient (utilizing zero-copy clones to make multiple copies of data from one “golden image”). It also acts as a Scale-out NFS server (the system exposes itself as a scale-out NFS server to any host when a snapshot is mounted).
  • The Rubrik Distributed Metadata System operates alongside the Cloud-Scale File System and provides an index that can be accessed at high speeds. It also provides continuous availability and linear scalability with no single point of failure. It accomplishes this by distributing copies of the data throughout the cluster, so access to the metadata is maintained even in the case of a node failure.
  • Rubrik Cluster Management manages the Rubrik system setup and ongoing system health using a zero-configuration multicast DNS protocol to automate appliance discovery.
  • Rubrik Distributed Task Framework is the engine that globally assigns and executes tasks across the cluster in a fault tolerant, efficient manner. Tasks are automatically load balanced across the entire cluster, and tasks are distributed to the nodes that house the impacted data.
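
The task placement described in the last bullet, where work goes to the nodes that house the data while load stays balanced, can be illustrated with a small scheduler. This is a sketch under stated assumptions, not Rubrik's implementation: the function name, the replica map and the least-loaded tie-breaking rule are all hypothetical.

```python
from collections import defaultdict


def assign_tasks(tasks, replicas, nodes):
    """Assign each task to the least-loaded healthy node that holds its data.

    tasks:    list of (task_id, data_id) pairs
    replicas: dict mapping data_id -> set of node names holding a copy
    nodes:    list of healthy node names
    """
    load = defaultdict(int)
    assignment = {}
    for task_id, data_id in tasks:
        # Prefer healthy nodes that already hold the data; if none survive,
        # fall back to any healthy node (assumed behavior for this sketch).
        local = sorted(replicas.get(data_id, set()) & set(nodes))
        candidates = local or sorted(nodes)
        chosen = min(candidates, key=lambda n: load[n])
        assignment[task_id] = chosen
        load[chosen] += 1
    return assignment


tasks = [("t1", "a"), ("t2", "a"), ("t3", "b")]
replicas = {"a": {"n1", "n2"}, "b": {"n2", "n3"}}
healthy = assign_tasks(tasks, replicas, ["n1", "n2", "n3"])
degraded = assign_tasks(tasks, replicas, ["n2", "n3"])  # n1 has failed
```

With all three nodes healthy the tasks spread across n1, n2 and n3. With n1 removed, the tasks for data set "a" move to the surviving replica on n2, which is the fault-tolerant behavior the bullet describes.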

Rubrik’s Converged Data Management is the “brains” of the system, which enables data lifecycle management from initial data ingest to archive and retirement. It stores versions of data using a full snapshot combined with forward and reverse incremental copies.
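
To show how a full snapshot combined with forward and reverse incrementals can yield any stored version, here is a minimal sketch. It assumes, purely for illustration, that each image is a map of block number to contents and that each incremental holds only the blocks that differ between adjacent versions; the source does not describe Rubrik's on-disk layout, and the function names are hypothetical.

```python
def apply_delta(image, delta):
    """Return a new image with the changed blocks from delta applied."""
    result = dict(image)
    result.update(delta)
    return result


def restore(full_version, full_image, forward, reverse, target):
    """Rebuild version `target` starting from the full snapshot.

    forward[v] holds the blocks that turn version v-1 into version v.
    reverse[v] holds the blocks that turn version v+1 back into version v.
    """
    image = dict(full_image)
    if target >= full_version:
        # Walk forward in time from the full snapshot.
        for v in range(full_version + 1, target + 1):
            image = apply_delta(image, forward[v])
    else:
        # Walk backward in time from the full snapshot.
        for v in range(full_version - 1, target - 1, -1):
            image = apply_delta(image, reverse[v])
    return image


full = {0: "A2", 1: "B1"}   # full snapshot taken at version 2
forward = {3: {1: "B3"}}    # block 1 changed between versions 2 and 3
reverse = {1: {0: "A1"}}    # block 0 differed back at version 1
newer = restore(2, full, forward, reverse, 3)
older = restore(2, full, forward, reverse, 1)
```

The appeal of such a scheme is that only one full copy is kept, while both older and newer versions stay reachable through small deltas.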

It also ensures data integrity by building multiple checks within the file system and data management layers.

The UI (User Interface) is built on a REST API-driven framework with an HTML5 web user interface, which is centered on policy-driven SLAs. It is designed for ease of use while reducing information overload by displaying only the items that actually require user attention or intervention.

This simplified interface avoids the usual problems encountered when moving to a new backup system, such as a steep learning curve, misunderstood options and backup misconfigurations that put data at risk, to name a few.

A Look at Basic System Operations

Setup of the Rubrik appliance is straightforward and simple. Once the appliance is racked, connected and configured with the necessary IP addresses for each node in the cluster and given the necessary login credentials for the primary virtualized environment, you connect it to your vCenter servers.

The Rubrik software then scans the vCenter via VMware APIs and automatically discovers and compiles a list of all VMs and applications. Rubrik manages data protection via SLAs (Service Level Agreements) instead of “backup jobs”. An example SLA might define the specific VMs to protect, how frequently snapshots are to be taken, S3 storage as the target for the snapshots and a retention policy specifying that the snapshots will be retained for three months.

The administrator then selects the VMs to back up to the appliance along with an appropriate SLA policy. In addition to the default SLAs included in the system, the Dynamic Policy Engine allows the backup admin to create as many custom SLAs as needed to meet the organization’s backup and retention windows.
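
An SLA of the kind described above can be thought of as a small declarative record rather than a scheduled job. The sketch below is illustrative only: the field names, the four-hour interval, the bucket label and the choice of 90 days to stand for "three months" are assumptions, not Rubrik's policy schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SlaPolicy:
    name: str
    snapshot_interval: timedelta  # how often snapshots are taken
    retention: timedelta          # how long snapshots are kept
    archive_target: str           # hypothetical label for the S3 target


def snapshot_due(policy, last_snapshot, now):
    """True when enough time has passed to take the next snapshot."""
    return now - last_snapshot >= policy.snapshot_interval


def expired(policy, taken_at, now):
    """True when a snapshot has outlived the retention period."""
    return now - taken_at > policy.retention


gold = SlaPolicy(
    name="gold",
    snapshot_interval=timedelta(hours=4),
    retention=timedelta(days=90),  # "three months", approximated as 90 days
    archive_target="s3-archive",
)
# Assigning VMs to a policy is just a mapping; the VM names are made up.
assignments = {"web-01": gold, "db-01": gold}
now = datetime(2015, 9, 1, 12, 0)
```

Because the policy states outcomes (frequency, target, retention) rather than schedules, the same record can be attached to any number of VMs, which is the practical difference from per-job configuration.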

When a backup begins, the VM is snapshotted and the bits are sent to the Rubrik appliance, where they are deduped inline and temporarily stored on flash. Rubrik uses VMware’s CBT (Changed Block Tracking) to copy only the blocks that have changed since the previous operation. The SLA determines whether the data will ultimately be stored on disk as well as being sent to S3-type endpoint storage in the cloud. If data is sent to the cloud, the system leverages deduplication for optimal WAN transfer efficiency. Data in flight and at rest in the cloud is protected with military grade AES 256-bit encryption.
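
The combination of changed-block transfer and inline deduplication can be sketched as a content-addressed store: only the blocks reported as changed are ingested, and a block whose fingerprint is already known is not stored a second time. This is a generic illustration, assuming SHA-256 fingerprints and a plain dictionary standing in for the changed-block list; it does not call any VMware API and is not Rubrik's actual algorithm.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store keyed by SHA-256 digest."""

    def __init__(self):
        self.chunks = {}  # digest -> block bytes, each stored once

    def ingest(self, changed_blocks):
        """Store changed blocks and return (recipe, number of new chunks).

        changed_blocks: dict of offset -> bytes, standing in for the blocks
        a changed-block query would report since the previous backup.
        """
        recipe = {}
        new_chunks = 0
        for offset, data in changed_blocks.items():
            digest = hashlib.sha256(data).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = data
                new_chunks += 1
            # The recipe records which chunk lives at which offset.
            recipe[offset] = digest
        return recipe, new_chunks


store = DedupStore()
first, new_first = store.ingest({0: b"aaaa", 4096: b"bbbb"})
second, new_second = store.ingest({0: b"aaaa", 8192: b"cccc"})
```

The second ingest adds only one new chunk because the block at offset 0 is already present. The same fingerprints could in principle decide which chunks need to cross the WAN to cloud storage.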

Recovery is also simple and straightforward. Simply pick the VM and a restore date or, if restoring a single file, pick the VM, the file and the restore date. All backup metadata is stored on the flash drives for fast recalls.

Once the needed files are located, you can perform a recovery or an instant mount. For very fast recoveries, the backup can be mounted directly on the Rubrik flash tier and then mounted to vCenter over an NFS mount point. Unlike some other appliances, the Rubrik appliance can handle multiple VM mounts without any noticeable performance degradation.

New Enhancements in 2.0 Release

The new 2.0 product release adds various new features and a new appliance model. A few of the more salient new features are:

  • Unlimited, nondisruptive native replication that is asynchronous, deduplicated, WAN-efficient and master-master, with zero impact on production
  • Integrated policy engine: Complete data protection, including off-site replication and cloud archiving by selecting desired RPOs and retention with a single integrated policy engine
  • Disaster Recovery: With near-zero RTOs and elastic RPOs
  • Failover and failback with complete data management among multiple sites: Admins can mount data directly on Rubrik for instant off-site recovery
  • Active Directory integration

Rubrik has also added a new model, the r348, to the previously released r344.

The existing r344 model provides four 8-Core 2.4GHz CPUs, 256GB of RAM, 48TB of HDD, and 1600GB of SSD storage.

The new r348 is also a 2U unit with double the capacity of the r344 and increased ingest rates.

Additionally, all units have 6/8 GigE, and 6/8 10 GigE interfaces.

StorageSwiss Take

Overall, Rubrik appears to have a robust hyper-converged backup solution that addresses many of the problems and limitations inherent in current legacy backup environments. It greatly simplifies the enterprise backup infrastructure by collapsing many of the discrete backup devices, along with their attendant software, into a single appliance. It is simple to deploy and provides advanced data protection and recovery features that also leverage cloud storage through an easy-to-use interface.

Organizations that want to streamline their backup infrastructure, and potentially lower its cost, should take a close look at Rubrik’s new backup solution.

Sponsored by Rubrik

Joseph is a Lead Analyst with DSMCS, Inc. and an IT veteran with over 35 years of experience in the high tech industries. He has held senior technical positions with several major OEMs, VARs, and System Integrators, providing them with technical pre- and post-sales support for a wide variety of data protection solutions. He also provided numerous technical analyst articles for Storage Switzerland and acted as its chief editor for all technical content up to the time Storage Switzerland closed upon its acquisition by StorONE. In the past, he also designed, implemented and supported backup, recovery and encryption solutions, in addition to providing Disaster Recovery planning, testing and data loss risk assessments in distributed computing environments on UNIX and Windows platforms for various OEMs, VARs and System Integrators.

Posted in Product Analysis
