Five Tips for Improving Remote Office Backup and Recovery

Darrin Chapman
Data Protection Subject Matter Expert and Product Engineering Manager, Network Appliance

Darrin Chapman is the person you turn to for just about any question involving disaster recovery or backup and recovery. He's been involved with every NetApp best practices guide involving data protection since 2002 and in his spare time designs training courses for customers and NetApp technical staff. Originally schooled as an electrical engineer, Darrin's background includes several years in network engineering for AT&T and Nortel.


The article below was originally published in the March 2006 edition of the Tech OnTap newsletter. To receive the newsletter monthly and enjoy other great benefits, sign up today.

Recent articles have suggested that up to 60% of remote backup procedures fail and less than 50% of data stored outside data centers can be recovered in a timely manner. Ouch!

At NetApp, we've helped thousands of customers efficiently manage and protect their remote office data. One of these customers is Norsk Hydro (abbreviated Hydro), which has operations in 60 countries and is one of the world's leading energy companies.

Over the past five years, Hydro has replaced a mesh of storage networks and direct-attached storage islands with a centrally managed storage infrastructure. The team has also implemented standard backup and recovery processes at more than 200 remote sites, many of which are mirrored to a central disaster recovery facility. Over 3.5TB of remote data is currently protected at the company's central business continuance center. Worldwide, Hydro has nearly 400TB of NetApp storage.

In mid-2005, I flew to Oslo, Norway, and spent two days meeting with the Hydro IS Partner team. In addition to sharing corporate best practices, my goal was to document how Hydro had successfully designed and implemented its large-scale infrastructure transformation.

Hydro is an example of remote office backup and recovery done right. Below are five things any company interested in protecting remote office data can learn from Hydro's implementation:

#1: Consolidate on Networked Storage Where It Makes Sense

When Hydro initiated this project, all of its remote offices relied on various forms of local storage. Some of Hydro's remote offices are tiny, but others have 200 to 300 users and dozens of small file servers that need to be backed up. With data volumes increasing as much as 80% a year, the situation was becoming unmanageable. One of the first things Hydro did was identify the locations where it made economic sense to consolidate multiple file servers with a single networked storage system.

#2: Replace Remote Tape with Centrally Managed Backup

When Hydro started this project, nearly all of its remote offices were dependent on local tape backups. Roaming IT staff serviced the needs of many locations, meaning it might take days or weeks for something as simple as a file restore. One of Hydro's top priorities was to centralize backup information at major regional data centers.

Unfortunately, low-bandwidth connections between these sites and remote offices (which include distant oil rigs) made significant data transfer over wide-area networks (WANs) impractical. Working with NetApp, Hydro rolled out a solution that enabled it to back up remote office data over connections as slow as 2Mb per second to nearline secondary storage systems located in regional data centers.
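
To see why moving only changed data matters on a 2Mb-per-second link, the rough arithmetic can be sketched as below. The data sizes (a 50GB data set with 500MB of daily changes) are hypothetical figures chosen for illustration, not numbers from Hydro, and the calculation ignores protocol overhead and latency.

```python
def transfer_seconds(payload_bytes: int, link_megabits_per_s: float,
                     efficiency: float = 1.0) -> float:
    """Return the seconds needed to move payload_bytes over a WAN link.

    efficiency is the assumed fraction of the nominal link speed that is
    actually usable (1.0 means the full nominal rate).
    """
    if link_megabits_per_s <= 0:
        raise ValueError("link speed must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    payload_bits = payload_bytes * 8
    usable_bits_per_s = link_megabits_per_s * 1_000_000 * efficiency
    return payload_bits / usable_bits_per_s


GB = 1000 ** 3
MB = 1000 ** 2

# Hypothetical remote office: 50GB of data, 500MB of changed blocks per day.
full_copy_seconds = transfer_seconds(50 * GB, 2.0)
changed_only_seconds = transfer_seconds(500 * MB, 2.0)

print(f"Full copy:    {full_copy_seconds / 3600:.1f} hours")
print(f"Changes only: {changed_only_seconds / 60:.1f} minutes")
```

Under these assumptions a full copy would occupy the link for more than two days, while the changed data alone would fit in roughly half an hour.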

To centralize backups for remote offices with very low-bandwidth connections, Hydro chose to:
  • Use block-level backups that copy only incremental changes. Hydro uses NetApp SnapVault® software to back up only changed data from NetApp storage, and Open Systems SnapVault (OSSV) to back up Windows® and UNIX® servers in smaller offices. This is an area where NetApp has found a unique niche because we have a product that intelligently moves only data blocks that have changed (rather than whole files). This makes extremely efficient use of network bandwidth, allowing backups over low-bandwidth, high-latency WAN connections. (See sidebar for more details.)
  • Assign every OSSV client (primary system) its own storage volume. An OSSV file system or backup is written inside a quota management tree called a qtree (similar to a directory in a file system). Qtrees live within volumes on NetApp systems. Although NetApp advises against it, it is possible to create a single large volume supporting, for example, 50 or more OSSV primaries. An OSSV schedule applies to an entire volume, so in this scenario, when backups are initiated, the NearStore® secondary storage system starts pulling data from the 50 primary systems and sending it to the 50 qtrees as 50 concurrent backup streams. This can strain secondary system resources and may approach concurrent session limits. To avoid this, Hydro assigned each OSSV primary its own volume, where only the qtree belonging to that primary lives. A second issue is that one link may be slower than the others. For example, suppose one client has a small change rate and a fast link while another has a very large change rate and a slower link. Even if the speedy client finishes in five minutes, the backup of the shared volume cannot be completed until the slowest system is done. By backing up each OSSV primary to its own individual volume, Hydro avoided this issue as well.
  • Stagger schedules. By placing each primary system in its own volume, Hydro ensures it only has one backup stream at a time. The IT team staggers its backup streams by distributing the OSSV backup schedule throughout the day to make the most efficient use of available network bandwidth and resources.
  • Move all data that is backed up off to its own drive. Hydro implemented a policy in which all data that is backed up regularly on each Windows server is stored in one logical drive (E:), which the IT team refers to as the data drive. That way the team doesn't have to look for the data that needs to be backed up; it just points SnapVault to the E: drive on each server. This was something I hadn't seen before, and it is a very clever approach to simplifying backups.
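
The staggering idea in the list above can be sketched as a small planning helper. This is an illustration only: it does not use any SnapVault or OSSV interface, the site names are hypothetical, and spacing the start times evenly is an assumption rather than Hydro's documented policy.

```python
from datetime import time


def stagger_start_times(primaries, window_start_hour=0, window_hours=24):
    """Give each primary (one per volume) an evenly spaced start time.

    The window begins at window_start_hour and lasts window_hours; start
    times that run past midnight wrap around to the next day.
    """
    if not 0 <= window_start_hour <= 23:
        raise ValueError("window_start_hour must be between 0 and 23")
    if not 1 <= window_hours <= 24:
        raise ValueError("window_hours must be between 1 and 24")
    if not primaries:
        return {}
    slot_minutes = window_hours * 60 / len(primaries)
    schedule = {}
    for index, name in enumerate(primaries):
        offset = int(index * slot_minutes)
        minute_of_day = (window_start_hour * 60 + offset) % (24 * 60)
        schedule[name] = time(minute_of_day // 60, minute_of_day % 60)
    return schedule


# Hypothetical remote offices, each backed up to its own volume.
sites = ["site-a", "site-b", "site-c", "site-d"]
for site, start in stagger_start_times(sites).items():
    print(f"{site}: backup starts at {start.strftime('%H:%M')}")
```

Because each primary has its own volume and schedule, a slow link delays only its own backup rather than holding up a shared volume-wide job.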

#3: Use Disk Volumes Instead of CDs For Bare Metal Recovery

A less frequent recovery scenario involves restoring an entire server to operational status after a hardware failure or disaster takes it down—commonly known as bare metal recovery (BMR). Given the amount of critical data being stored in some remote offices, Hydro had identified this as a priority for key offices.

Typical bare metal applications create a data volume that contains all the information necessary to restore a server. This data is usually written to a tape or bootable CD. If a server fails, you roll in replacement hardware and boot from CD. The server can be quickly rebuilt and returned to operation. One of the challenges with this approach involves updating and managing the resulting CDs and images.

Hydro uses Symantec® LiveState Recovery (LSR) to create images, but instead of writing the resulting image to CD, Hydro writes it to a separate disk volume (on the same server or a separate server). The IT team then uses NetApp SnapVault software to move this image file up to a NearStore system at the data center so the images can be centrally managed.

If a server fails and the system disk must be restored, Hydro has four options:

  • Restore the image locally from the local data drive if available.
  • Restore the image from a NearStore at a regional data center directly to the failed server over a LAN connection.
  • Create a new system disk with the image files from the NearStore and send it to the site where the failed server is located.
  • Create a CD from image files on the NearStore and send it to the site where the failed server is located. Then simply restore from the CD using LSR.

#4: Empower End Users

After finding that about 80% of restores were single files, Hydro identified end-user restores as one of its remote office data management goals. Using the .snapshot directory from Windows Explorer, Hydro end users are now able to browse through backups using fully GUI-based standard tools to locate and restore single files without IT intervention.
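
Hydro's users do this through Windows Explorer, but the same idea can be expressed as a script. The sketch below assumes only that the snapshot directory holds one subdirectory per backup, each mirroring the live directory tree; the function names and the default directory name are illustrative choices, not a NetApp API.

```python
import shutil
from pathlib import Path


def find_snapshot_versions(share_root: Path, relative_path: str,
                           snapshot_dir_name: str = ".snapshot"):
    """List (snapshot_name, file_path) pairs for every saved copy of a file."""
    snapshot_root = share_root / snapshot_dir_name
    if not snapshot_root.is_dir():
        return []
    versions = []
    # Each subdirectory is assumed to be one point-in-time copy of the share.
    for snapshot in sorted(snapshot_root.iterdir()):
        candidate = snapshot / relative_path
        if candidate.is_file():
            versions.append((snapshot.name, candidate))
    return versions


def restore_version(snapshot_file: Path, destination: Path) -> Path:
    """Copy one saved version back into the live file system."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(snapshot_file, destination)
    return destination
```

A user (or a help-desk script) could call `find_snapshot_versions(Path("/mnt/share"), "docs/report.txt")`, pick a version, and pass it to `restore_version`; the share path here is a placeholder.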

#5: Don't Lose Sight of the Big Picture

Implementing centralized backup was only one element of a larger, more ambitious master plan. After data backups and BMR images are copied from remote offices to NetApp storage at regional data centers, these systems are then mirrored to a recently established disaster recovery facility where Hydro has implemented a lights-out type operation. Data for critical regional and corporate applications is mirrored to the same facility.

The result is that all critical data is available at the disaster recovery facility. If any remote office goes down, its critical operations can be restarted at the DR facility using current data. The same goes for regional and corporate operations. Large tape libraries are available at the disaster recovery site for any data that needs to be archived to tape.

Hydro has successfully designed a global and highly scalable environment that enables it to protect data at many different levels.


Related Information
Norsk Hydro Implements Company-Wide NetApp Solution (PDF)
SnapVault Best Practices Guide (PDF)
Enabling Rapid Recovery with SnapVault (PDF)
SnapMirror Best Practices Guide (PDF)

Implementing Efficient Backup over Low-Bandwidth Connections
NetApp SnapVault software streamlines backup and restore operations, making network backup and restore possible, even over low-bandwidth, high-latency connections.

Most backup applications work at the file level. For example, if even one byte changes in a 10MB file, the whole file is backed up. By comparison, SnapVault only sends the block containing the change (4KB versus 10MB in this example).
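
The 4KB-versus-10MB comparison can be reproduced with a small sketch. It compares two in-memory copies of a file block by block, which is a deliberately naive way to find changed blocks and is not a description of how SnapVault detects changes internally.

```python
BLOCK_SIZE = 4 * 1024  # 4KB blocks, as in the example above


def changed_block_indexes(old: bytes, new: bytes,
                          block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the indexes of the blocks that differ between old and new."""
    total = max(len(old), len(new))
    changed = []
    for start in range(0, total, block_size):
        end = start + block_size
        if old[start:end] != new[start:end]:
            changed.append(start // block_size)
    return changed


# A 10MB file in which a single byte changes.
original = bytes(10 * 1024 * 1024)
edited = bytearray(original)
edited[5_000_000] = 0xFF
modified = bytes(edited)

blocks = changed_block_indexes(original, modified)
block_level_bytes = len(blocks) * BLOCK_SIZE  # what a block-level backup sends
file_level_bytes = len(modified)              # what a file-level backup sends

print(f"Changed blocks: {len(blocks)}")
print(f"Block-level transfer: {block_level_bytes} bytes")
print(f"File-level transfer:  {file_level_bytes} bytes")
```

In this example the block-level transfer is 2,560 times smaller than the file-level one.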

SnapVault provides this level of efficiency even when backing up third-party systems, including Linux®, Windows, and UNIX. In addition to great bandwidth efficiency, SnapVault also achieves great storage efficiency. Since only changed blocks are stored, incremental SnapVault backups use very little disk space and can be maintained on disk longer than other disk-based backup solutions and at a fraction of the cost.

Not All Snapshots Are Created Equal
After interviewing storage managers and database storage administrators at 15 to 20 companies, Mercer Management Consulting found that NetApp environments require less disk space than EMC for the same size database.

The cause: different approaches to Snapshot™ equivalent functionality.

Get the details. Read the full report. (PDF)