Reducing Time-to-Deployment:
Improving Agility for Database Development, Test, and QA
By Bruce Moxon
Last month I introduced the notion of whole application virtualization, contending that a number of core storage services are necessary to deliver on this vision and the promise of grid computing. This month, I explore two of those services, mirroring and cloning, in the context of a rapid application development model for database-centric applications that increases an organization's agility.
Every database administrator (DBA) has experienced it: a request from the application development organization for a copy of a production database to support application development and testing. For larger databases, those in excess of a few hundred gigabytes, such a request sets off a series of potentially time-consuming planning, provisioning, and implementation activities. (Oh, well, there goes the weekend...)
The Plan...
With traditional tools and approaches, a strategy for replicating relevant portions of the production database while minimizing impact on production operations must first be developed.
Depending on an organization's operations, this plan might include one of two fundamental approaches: building a dev/test copy of the database from backup tapes or doing a (partial or full) extraction of data from the production database—using a variety of vendor-supplied and homegrown tools. Backup-based replication is more straightforward and nonintrusive on the production system, but can be a lengthy process—especially for tape-based backups. Extraction-based replication can be faster—especially if partial replication is sufficient for the scenarios under development or test—but may require further planning to ensure that the production system is minimally impacted during the period of data extraction.
Provisioning...
Once the plan is in place for the replication of the database, adequate resources, including both servers and storage, must be provisioned. The servers must then be loaded with the appropriate software "stack" (operating system, drivers, database, application servers, and so on), and the database must be restored or created and populated.
Replication...
For many organizations, the effort to replicate a large (multi-100GB) database can take hours or even days. Replications requiring restores from tape are particularly lengthy and manually intensive.
In many large-scale environments, multiple copies of production databases may be required to support a range of routine activities, including application and database development, test, QA, and perhaps user training. This clearly has a multiplicative factor on the physical and human resources required to support such an endeavor.
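The multiplicative effect described above can be made concrete with a little arithmetic. The sketch below is illustrative only: the class and function names are hypothetical, and the database size, copy count, and throughput are assumed inputs rather than measurements from any real system.

```python
from dataclasses import dataclass


@dataclass
class ReplicaPlan:
    """Hypothetical sizing inputs for full physical copies of a database."""
    db_size_gb: float        # size of the production database
    copies: int              # number of dev, test, QA, or training replicas
    throughput_mb_s: float   # assumed sustained copy or restore rate


def total_storage_gb(plan: ReplicaPlan) -> float:
    """Capacity needed when every replica is a complete copy."""
    return plan.db_size_gb * plan.copies


def total_copy_hours(plan: ReplicaPlan) -> float:
    """Hours to build all replicas one after another at the assumed rate."""
    total_mb = plan.db_size_gb * 1024 * plan.copies
    return total_mb / plan.throughput_mb_s / 3600


if __name__ == "__main__":
    # Assumed example values: a 300GB database, four replicas, 50 MB/s.
    plan = ReplicaPlan(db_size_gb=300, copies=4, throughput_mb_s=50)
    print(f"storage: {total_storage_gb(plan):.0f} GB")
    print(f"copy time: {total_copy_hours(plan):.1f} hours")
```

Both figures scale linearly with the number of copies, which is why each additional dev, test, or QA environment adds a full increment of capacity and elapsed time under the traditional approach.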
Reducing Time-to-Deployment for Database Applications
In the storage industry at large, a few key technologies have matured over the past few years to help organizations address this growing need for physical database replication and rapid deployment.
Disk-to-disk backup technology has become the norm in many organizations, greatly enhancing the speed at which database replicas can be constructed from backups. Snapshots provide space-efficient, point-in-time copies of data containers (file systems, volumes, LUNs) that can be used for rapid recovery and, increasingly, as the basis for creating full writable data copies in a technique known as cloning.
While these technologies greatly simplify the task of replicating the database data, organizations employing these approaches are still faced with two significant operational challenges. First, the act of cloning the production data can adversely impact production application performance. As a result, cloning may need to be restricted to "off" or "nonpeak" hours. Second, this approach still requires a complete copy of the underlying data to be made. This can pose nontrivial provisioning challenges, especially when copying large databases multiple times.
Due to its unique "DNA," which includes Data ONTAP® and WAFL®, Network Appliance has developed innovative implementations of these general industry approaches in ways that deliver unmatched speed, efficiency, and simplicity to the task of database replication and deployment.
With Data ONTAP 7G, Network Appliance introduced FlexClone™ volumes. These are writable flexible volumes that are created from NetApp Snapshot™ copies. As such, they are effectively near-instantaneous, space-efficient, writable "clones," sharing the same physical blocks with the baseline file system or LUN. As the baseline and clone data containers diverge (for example, due to continuing updates in the production database or development changes to a cloned database), only the divergent blocks need to be written to disk.
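The block-sharing idea can be illustrated with a toy model. The sketch below is not the WAFL or FlexClone implementation; it is a minimal Python illustration, with hypothetical names (`Volume`, `shared_blocks`), of a clone that copies only a block map and writes new blocks when either side changes.

```python
class Volume:
    """Toy volume: maps logical block numbers to ids in a shared block store."""

    def __init__(self, store, block_map=None):
        self.store = store                      # shared dict: block id -> data
        self.block_map = dict(block_map or {})  # logical block number -> block id

    def write(self, lbn, data):
        # Every write lands in a new physical block. Existing blocks are never
        # overwritten, so a clone still pointing at the old block is unaffected.
        block_id = len(self.store)
        self.store[block_id] = data
        self.block_map[lbn] = block_id

    def read(self, lbn):
        return self.store[self.block_map[lbn]]

    def clone(self):
        # Only the block map (metadata) is copied; no data blocks are duplicated.
        return Volume(self.store, self.block_map)


def shared_blocks(a, b):
    """Count the physical blocks referenced by both volumes."""
    return len(set(a.block_map.values()) & set(b.block_map.values()))


if __name__ == "__main__":
    store = {}
    prod = Volume(store)
    for i in range(1000):
        prod.write(i, b"prod-%d" % i)

    dev = prod.clone()                 # no new data blocks are written here
    dev.write(7, b"dev change")        # development change to the clone
    prod.write(8, b"prod update")      # continuing update in production

    print("physical blocks:", len(store))
    print("blocks still shared:", shared_blocks(prod, dev))
```

In this model the clone costs nothing in data blocks at creation time, and storage grows only by the blocks that diverge on either side, which mirrors the behavior described above.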
The result is that working copies of very large databases can be created within a few minutes and with minimal incremental storage. A typical 300GB database can be cloned and brought up in just a couple of minutes. (NetApp customer Ben Rockwood provides specific examples of FlexClone volume creation in his article "The Versatile Storage Platform.")
FlexClone volumes can only be created on the same storage system on which the Snapshot copy of the original data resides. Hence cloned data shares the same storage resources as the source of the clone. In cases where the production storage system is not fully utilized or where NetApp control of service capabilities can be employed, this may be sufficient to shield the production database from additional load that could be imposed by development activities. More commonly, however, operational requirements will dictate that the development databases be created on a separate storage system. Enter NetApp SnapMirror® software.
SnapMirror can be used to continually replicate production data (synchronously, if desired) to a secondary storage system with minimal impact on the production environment. The cloning and subsequent development activity can then take place on the secondary system, completely "out-of-band" of the production system. This secondary system might also serve as the organization's disaster recovery (DR) target or might be constructed as a low-cost development system utilizing lower cost controllers, iSCSI, and SATA drives.
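The mirror-then-clone flow can be sketched in a few lines. This is a simplified illustration, not SnapMirror's mechanism or command set: all names (`StorageSystem`, `mirror_update`, `clone_on`, the `oradata` volume) are hypothetical, change detection is done by naive comparison, and the clone here is a plain copy rather than a space-efficient one.

```python
class StorageSystem:
    """Toy storage system holding named volumes as {block number: data} dicts."""

    def __init__(self, name):
        self.name = name
        self.volumes = {}
        self.reads = 0  # block reads served to hosts, as a crude load measure

    def read(self, volume, lbn):
        self.reads += 1
        return self.volumes[volume][lbn]


def mirror_update(src, dst, volume):
    """Bring dst's copy of `volume` up to date; return blocks transferred."""
    source = src.volumes[volume]
    target = dst.volumes.setdefault(volume, {})
    changed = {lbn: data for lbn, data in source.items()
               if target.get(lbn) != data}
    target.update(changed)
    for lbn in list(target):          # drop blocks that no longer exist at source
        if lbn not in source:
            del target[lbn]
    return len(changed)


def clone_on(system, volume, clone_name):
    """Create a writable working copy on the system that holds the mirror."""
    system.volumes[clone_name] = dict(system.volumes[volume])


if __name__ == "__main__":
    prod = StorageSystem("production")
    devtest = StorageSystem("dev-test")
    prod.volumes["oradata"] = {i: f"row-{i}" for i in range(100)}

    print("initial transfer:", mirror_update(prod, devtest, "oradata"))
    prod.volumes["oradata"][5] = "row-5-updated"
    print("incremental transfer:", mirror_update(prod, devtest, "oradata"))

    clone_on(devtest, "oradata", "oradata_qa")
    devtest.volumes["oradata_qa"][5] = "qa edit"
    devtest.read("oradata_qa", 5)
    print("host reads on production:", prod.reads)
    print("host reads on dev-test:", devtest.reads)
```

The point of the sketch is the division of labor: after the initial transfer only changed blocks cross to the secondary system, and all clone creation and development I/O lands on that secondary system rather than on production.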
A blade server–based deployment depicting this scenario is shown below. In this scenario, new database servers would be dynamically provisioned from the blade server pool, using the database containers cloned on the secondary (Dev/Test) system.
In environments where third-party storage is deployed in the production environment, the same approach can be employed with native database mirroring capabilities, with a variety of emerging heterogeneous replication technologies, including NetApp SnapDrive® and V-Series technologies, and potentially with third-party replication tools such as those in Topio's Data Protection Suite™ or IBM's SAN Volume Controller (SVC).
Finally, this same approach can be applied in additional operational scenarios to improve utilization and agility without impacting production system performance. Examples include:
- Offloading end-of-month reporting to cloned databases
- Offloading intensive ad hoc queries and data warehousing extract-transform-load (ETL) operations in support of business analytics
- Distributed development, using SnapMirror to replicate data across geographically distributed sites and employing FlexClone at the remote site to create copies for development, QA, or training
Through the NetApp SnapManager® suite of products, these capabilities are being integrated into standard database management frameworks, such as Oracle® Grid Control and Microsoft® SQL Server Enterprise Manager. This approach provides DBAs with more rapid and space-efficient approaches to database backup, recovery, and cloning, all from the same management suite they're using today.
The implications for many database development organizations are far-reaching. NetApp's unique FlexClone implementation, coupled with appropriate replication technology, can significantly reduce the time required to provision database replicas for a variety of development and operational scenarios. Because this approach also saves space, many organizations are finding that they can not only do what they were doing before more quickly, but also do a lot more of it (many simultaneous dev/test cycles, for example).
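A rough way to see why space savings translate into more simultaneous dev/test cycles is to ask how many working copies fit in a fixed amount of spare capacity. The sketch below uses hypothetical function names and assumed numbers; in particular, the percentage of blocks that diverge per clone is an assumption that varies widely by workload.

```python
def full_copies_that_fit(spare_gb: float, db_size_gb: float) -> int:
    """Number of complete copies that fit in the spare capacity."""
    return int(spare_gb // db_size_gb)


def clones_that_fit(spare_gb: float, db_size_gb: float,
                    change_percent: float) -> int:
    """Number of clones that fit when each stores only its diverged blocks.

    change_percent is the assumed share of the database (0-100) that each
    clone ends up rewriting over its lifetime.
    """
    if change_percent <= 0:
        raise ValueError("change_percent must be positive")
    per_clone_gb = db_size_gb * change_percent / 100
    return int(spare_gb // per_clone_gb)


if __name__ == "__main__":
    # Assumed values: 600GB spare, a 300GB database, 5% divergence per clone.
    print("full copies:", full_copies_that_fit(600, 300))
    print("clones:", clones_that_fit(600, 300, 5))
```

The comparison ignores metadata overhead and changes to the baseline itself, so it should be read as an order-of-magnitude illustration rather than a sizing tool.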
Finally, increased agility and reduction in physical storage requirements are accompanied by a simplification of operational procedures—in short, helping to deliver on the promise of grid computing.