Backup and storage administrators have a two-pronged mission: Don't lose any data, and don't run out of room. Accomplishing that mission is simple, but it is not easy. As admins work out the best way to retain an ever-increasing volume of data, they face the secondary storage squeeze.
In the first part of the squeeze, described in the e-book The Secondary Storage Squeeze: How Can I See It Coming?, business users generate the data and want to keep it close at hand, but primary storage is expensive. The way out of that part of the squeeze is for business owners and IT admins to agree on which data and apps are most important, then apply policies and service level agreements (SLAs) on data retention and secondary storage.
In the second part of the squeeze, administrators must reliably back up a fast-growing balloon of primary data within available time limits. Adding hardware and capacity to the primary storage environment seems to relieve the pressure, but in fact it leaves even less time to complete incremental and full backups for increasing volumes of data.
The first e-book examined ways to resolve the business side of the squeeze: establishing retention policies, putting together a storage strategy and conducting a business impact analysis. This e-book covers the technology side of the squeeze: implementing the deduplication technology and protocol accelerators in the Quest DR Series Disk Backup and Deduplication Appliances. Storage and backup administrators will discover a valuable way to overcome outmoded backup software, limited network bandwidth and the secondary storage squeeze.
The term "secondary storage" applies in this context to the external devices not connected directly to production servers. Secondary storage is generally used for backup data sets for three reasons:
Figure 1 shows a representative layout of primary site storage, with tiered disk, backup disks and virtual standby. Corresponding to the tiered and backup disks at the primary site are less expensive, slower media at the secondary site.
Figure 1: Storage resources
The secondary site is not intended for production, applications or web-facing assets like Microsoft Exchange, SQL databases, Salesforce.com or pricing information; however, its data access suffices for business functions like recovery.
Secondary storage is excellent business continuity insurance against data loss. Consider some alternatives:
Backing data up from primary to secondary storage is ideal. Getting the data there efficiently is the tricky part.
Backup across any of the connections mentioned above takes precious time, no matter how fast the network is. Besides the amount of data being generated, there is the amount retained for compliance or business continuity, resulting in average data growth of up to 40 percent per year into the next decade.1
As the volume of data in an organization continues to grow, the risk increases that administrators will be unable to complete their backup jobs in the allotted time because of network congestion, system/process interruptions and an insufficiently long backup window. Admins have turned to different forms of secondary storage — local, remote or cloud — but connecting secondary storage to primary storage adds the complexity of media/backup servers, backup software and policies for implementing backups.
It is not a simple way to get data to secondary storage, but many companies implement backup software, set up policies, construct backup storage repositories and live with the complexity until something breaks or the business requires a more responsive process that can meet service-level expectations. Even then, they do not remove all the risk and storage headaches of backup; in 2014, 73 percent of users were less than very confident they could restore critical data when needed. Incomplete backup jeopardizes business continuity and intensifies the secondary storage squeeze.2
Companies can greatly reduce risk and headaches by implementing purpose-built backup appliances: disk backup devices designed as storage repositories. With backup appliances, companies can maintain backup data on disk longer, for faster, more accurate restores when needed. To deal with the ever-shrinking backup window, backup appliances include deduplication technology.
Deduplication reduces the amount of space required for backups by identifying and referring to repeated blocks of data instead of backing up the same data multiple times. For example, if a company performs a full backup of its customer relationship management (CRM) data once a week and incremental backups each night, deduplication algorithms may find that only 10 percent of the data has changed from one day to the next.
The backup software sends only the changes, reducing the amount of data to be stored, the amount of time and space required to store it and the network bandwidth to send it.3
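The idea of storing each repeated block once and referring to it afterward can be illustrated with a minimal sketch. It assumes fixed-size 4 KB blocks and SHA-256 fingerprints; the block size, the function names and the in-memory dictionary are illustrative assumptions, not a description of how any particular appliance implements deduplication.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4096  # assumed fixed block size; real products may chunk differently


def dedup_store(data: bytes, store: Dict[str, bytes]) -> List[str]:
    """Split data into blocks, keep each unique block once, return block references."""
    refs = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            store[digest] = block  # first time this block is seen: store it
        refs.append(digest)        # every block, new or repeated, becomes a reference
    return refs


def restore(refs: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the original data by following the references."""
    return b"".join(store[digest] for digest in refs)


# Hypothetical data set: ten blocks on day one, one block (10 percent) changed on day two.
store: Dict[str, bytes] = {}
day_one = b"".join(bytes([i]) * BLOCK_SIZE for i in range(10))
day_two = day_one[:9 * BLOCK_SIZE] + bytes([99]) * BLOCK_SIZE

refs_one = dedup_store(day_one, store)
blocks_after_day_one = len(store)
refs_two = dedup_store(day_two, store)
blocks_after_day_two = len(store)
```

In this sketch the second backup adds one block to the store rather than ten, yet both backups can still be restored in full from their reference lists.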
Backup appliances simplify secondary storage. Deduplication techniques can save on the order of 70 percent of storage space. Despite those advantages, however, the combination is not necessarily a panacea for the secondary storage squeeze.
Deduplication is subject to some technical limitations:
To address those limitations, backup software providers have developed source-based deduplication, in which a software plugin or agent installed at the data source runs the deduplication algorithm. Because the agent removes redundant data before sending it over the network, source-based deduplication speeds the flow of data into the backup funnel, reduces network congestion and improves backup speeds by a factor of three to four.
But source-based deduplication requires adequate hardware resources (CPU, memory, disk) to function, so admins face a trade-off between increasing resources on the source machine and increasing network bandwidth.
Backup appliances and deduplication are also subject to business limitations.
Most medium and large organizations connect their enterprise applications to secondary storage devices with different protocols like Network File System (NFS) for NAS storage, OpenStorage Technology (OST) for Veritas backup software products and Common Internet File System (CIFS) for other backup applications.
As their backup window shrinks and they seek relief from the secondary storage squeeze, these organizations turn to backup appliances featuring deduplication technology that supports their existing investments in backup software applications and processes. Some backup hardware manufacturers offer protocol accelerators to speed the ingestion of data during backup, but the accelerators work only with their proprietary storage appliances and they cost more.
The business problem arises when it becomes necessary to manage backup within the available budget and replication within the available network bandwidth. If the organization introduces additional backup applications, the appliance capacity must scale to accommodate the increased workload. Furthermore, the costs for protocol accelerators, replication, encryption and maintenance can add up.
The DR Series of Disk Backup and Deduplication Appliances addresses technology and business problems so storage administrators can extend their backup windows and protect their data reliably.
Appliances in the DR Series use DR Rapid to support the protocols of all leading backup applications — NFS, CIFS, Rapid NFS, Rapid CIFS, OST and Rapid Data Access (RDA) — to reduce the time and storage space needed for deduplication and backup. Additionally, Quest makes the protocol accelerators available at no additional charge, so administrators can perform source-based deduplication on data from a wide variety of backup applications.
In the following scenarios, assume a variety of servers hosting Microsoft Exchange and other business applications on a 1GbE network.
Figure 2 depicts backup to an ordinary secondary storage device. At the best-case throughput of 1GbE (0.4093 TB per hour), the device takes 4.9 hours to back up 2 TB of data from the Exchange server alone. If four other servers are backed up in parallel (a normal expectation in most companies), the network will limit the ingest rate of the backup target device to 0.8186 TB per hour. Throughput is divided among all five machines, so it can take up to 12.2 hours to back up 2 TB of data from each one.
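The backup times in this scenario follow from dividing the data volume by the available throughput. The short calculation below reproduces them from the rates quoted in the text; the variable and function names are illustrative. The per-hour figure for 1GbE appears to correspond to 1 gigabit per second expressed in binary terabytes, which is an inference from the numbers rather than something the text states.

```python
LINK_TB_PER_HOUR = 0.4093           # best-case 1GbE throughput quoted in the text
TARGET_INGEST_TB_PER_HOUR = 0.8186  # network-limited ingest rate quoted for five servers
DATA_PER_SERVER_TB = 2
SERVERS = 5


def backup_hours(total_tb: float, rate_tb_per_hour: float) -> float:
    """Time to move total_tb of data at a sustained rate."""
    return total_tb / rate_tb_per_hour


# One Exchange server backing up alone.
single_server_hours = backup_hours(DATA_PER_SERVER_TB, LINK_TB_PER_HOUR)

# Five servers in parallel share the target's ingest rate, so each finishes
# only when the combined 10 TB has been ingested.
parallel_hours = backup_hours(SERVERS * DATA_PER_SERVER_TB, TARGET_INGEST_TB_PER_HOUR)

# Inferred origin of the 1GbE figure: 125,000,000 bytes/s for one hour, in units of 2**40 bytes.
derived_link_rate = 125_000_000 * 3600 / 2**40

print(f"{single_server_hours:.1f} h, {parallel_hours:.1f} h")  # 4.9 h, 12.2 h
```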
Figure 2: Typical backup
Replacing the backup target device with a DR Series backup appliance does not change the ingest rate or network throughput, but it does change the amount of data moving across the network.
Figure 3 represents a backup path as a road and data as vehicles moving along the road. Data is backed up from the Exchange server to the DR Series appliance (or to any appliance) for the first time, and deduplication takes place downstream. Redundant data (dark blue car), 20 percent of the total in this example, is discarded only after traversing the network. Needlessly sending that 20 percent is an inefficient use of network bandwidth, but on a first backup it may be acceptable.
Figure 3: First backup to DR Series appliance
Figure 4 shows a subsequent backup, with new and changed data represented by the grey car. The resulting backup data is complete, but 80 percent of the data sent from the Exchange server is redundant and was discarded after traversing the network. Sending so much useless data aggravates the secondary storage squeeze.
Figure 4: Subsequent backup to DR Series appliance
A protocol accelerator residing as a plugin on the Exchange server performs deduplication before the data traverses the network. DR Rapid identifies redundant data at the source, places references to the repeated blocks in the backup stream, eliminates the duplicate blocks and sends only the changed data (grey car) over the network to the DR Series appliance.
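The effect of moving deduplication to the source can be sketched by counting how many bytes would cross the network. The example below is a simplified illustration only: it assumes fixed-size blocks and SHA-256 fingerprints, and it models the appliance as a dictionary that the source can query directly. It does not represent the actual DR Rapid protocol, and all names in it are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def source_side_backup(data: bytes, target: Dict[str, bytes]) -> Tuple[List[str], int]:
    """Fingerprint blocks at the source and send only blocks the target lacks.

    Returns the backup stream of block references and the payload bytes sent.
    """
    refs = []
    bytes_sent = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in target:    # stands in for asking the appliance about this block
            target[digest] = block  # only a new block crosses the network
            bytes_sent += len(block)
        refs.append(digest)         # repeated blocks travel as references only
    return refs, bytes_sent


def blocks(values: List[int]) -> bytes:
    """Build test data: one block filled with each byte value."""
    return b"".join(bytes([v]) * BLOCK_SIZE for v in values)


appliance: Dict[str, bytes] = {}

# First backup: ten blocks, two of which repeat earlier blocks (20 percent redundant).
first = blocks([0, 1, 2, 3, 4, 5, 6, 7, 0, 1])
_, first_sent = source_side_backup(first, appliance)

# Subsequent backup: eight blocks unchanged (80 percent redundant), two new.
second = blocks([0, 1, 2, 3, 4, 5, 6, 7, 100, 101])
_, second_sent = source_side_backup(second, appliance)

print(first_sent // BLOCK_SIZE, second_sent // BLOCK_SIZE)  # 8 2
```

Under these assumptions the first backup sends eight of ten blocks and the subsequent backup sends two of ten, which mirrors the 20 percent and 80 percent redundancy figures used in the scenarios above.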
The combination of DR Series Disk Backup Appliances and DR Rapid deduplication has relieved the secondary storage squeeze for organizations worldwide:
Figure 5: Backup to DR Series appliance with source-based deduplication
With the growth in the volume of data that companies want to retain, most organizations eventually include secondary storage in their storage strategy as a means of reducing the risk of data loss. With a combination of purpose-built backup appliances, deduplication technology and protocol accelerators, they can address the secondary storage squeeze and shortened backup window to protect their data reliably.
Quest DR Series Disk Backup and Deduplication Appliances and DR Rapid deduplication help them overcome the technical and business problems of secondary storage. Applications can back up freely over all major connection protocols, send their data over the network, consume far less network bandwidth and occupy much less space.
Quest helps its customers reduce tedious administration tasks so they can focus on the innovation necessary for their businesses to grow. Quest® solutions are scalable, affordable and simple to use, and they deliver unmatched efficiency and productivity. Combined with Quest's invitation to the global community to be a part of its innovation, as well as its firm commitment to ensuring customer satisfaction, Quest will continue to accelerate the delivery of the most comprehensive solutions for Azure cloud management, SaaS, security, workforce mobility and data-driven insight.