NetApp A-SIS Deduplication:
Top 10 Customer Questions Answered
In May, NetApp announced a new deduplication technology that can significantly increase the amount of data stored in a set amount of disk space: Advanced Single Instance Storage (A-SIS) deduplication. This technology is available (at no charge!) for NetApp NearStore® R200 and NearStore on FAS systems.
Figure 1. Before and after A-SIS Deduplication.
A recent NetApp TechTalk Webcast and chat session generated a substantial number of questions about the new technology, and we saw a very high response to last month's Tech OnTap engineering perspective.
The full 23-page chat transcript (pdf) is now available, while the top 10 customer questions about A-SIS deduplication are answered below:
- What exactly is A-SIS deduplication?
- How do I add the A-SIS deduplication capability to a system?
- Why is an R200 or NearStore license required?
- Are there any plans to remove the NearStore license requirement?
- What do you mean by "light-duty" primary storage?
- Can the system perform other operations while A-SIS deduplication is running?
- Can I estimate my space savings before installing A-SIS deduplication?
- Can previously written volume data be deduplicated?
- Can I schedule the time when deduplication is run?
- What kind of space savings can I expect?
1. What exactly is A-SIS deduplication?
A-SIS deduplication is a general-purpose space reduction feature available on NearStore R200 systems and NearStore on FAS systems. When A-SIS deduplication is enabled, all data in the specified flexible volume can be scanned at intervals and duplicate blocks removed, resulting in reclaimed disk space.
Note: A-SIS deduplication is not supported on R100, R150, FAS250, FAS270, or any products in the 800 or 900 families.
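To make the idea of removing duplicate blocks concrete, here is a minimal Python sketch of block-level deduplication in general. It is not NetApp's implementation: the 4096-byte block size, the use of SHA-256 fingerprints, and the names `deduplicate` and `rebuild` are all assumptions chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for this sketch only


def deduplicate(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and keep one copy of each unique block."""
    store = {}   # fingerprint -> the single stored copy of that block
    layout = []  # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        store.setdefault(fingerprint, block)  # store a block only the first time it is seen
        layout.append(fingerprint)
    return store, layout


def rebuild(store, layout) -> bytes:
    """Reassemble the original data from the unique blocks and the layout."""
    return b"".join(store[fp] for fp in layout)


# Three identical blocks plus one different block.
data = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
store, layout = deduplicate(data)
print(len(layout), "logical blocks,", len(store), "stored blocks")
```

In this toy case, four logical blocks are backed by two stored blocks, and `rebuild(store, layout)` returns the original bytes unchanged.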
2. How do I add the A-SIS deduplication capability to a system?
The A-SIS license enables the deduplication capability. This license is available for R200 systems and for any FAS system that has a NearStore license installed. There is no charge for the A-SIS deduplication license. A-SIS deduplication requires a minimum Data ONTAP® version of 7.2.2 for FAS3000/6000 systems.
3. Why is an R200 or NearStore license required to enable A-SIS deduplication?
Although A-SIS deduplication is application-transparent, it has not been tested with mission-critical primary applications. Requiring a NearStore R200 system (or a NearStore on FAS license) to enable A-SIS deduplication helps ensure that customers do not attempt to use it in performance-intensive application environments that have not been fully tested.
4. Are there any plans to remove the NearStore license requirement for A-SIS deduplication?
We are continuously monitoring customer adoption of A-SIS deduplication in backup, archival, and light-duty primary storage environments. As we gain more experience with A-SIS deduplication in these environments, we fully expect its use to broaden. Until we gain that experience, however, A-SIS deduplication will require the NearStore license.
5. What do you mean by "light-duty" primary storage? Is this appropriate for A-SIS deduplication?
By light-duty primary storage, we mean volumes that contain primary (first-copy) data but are not performance driven. Examples include user home directories, document directories, and application volumes that experience heavy I/O loads during the day but are quiescent at night and on weekends. These volumes might very well benefit from A-SIS deduplication if the system has the performance headroom to support the additional overhead that deduplication imposes.
6. Can the system perform other operations while A-SIS deduplication is running?
Yes. A-SIS deduplication runs as a background process, and the system can perform any other operation while it runs.
7. Can I estimate my space savings before installing A-SIS deduplication?
Yes. A space estimation tool is available to NetApp SEs. This standalone application runs on a Linux® client and "crawls" through any NFS volume of up to 2TB, whether NetApp or non-NetApp, to estimate the space savings you would get with A-SIS deduplication.
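The general idea behind such a crawl can be sketched in a few lines of Python. This is not the NetApp estimation tool, and its results would not necessarily match that tool's output: the function name `estimate_savings`, the 4096-byte block size, and the SHA-256 fingerprinting are assumptions made for illustration.

```python
import hashlib
import os


def estimate_savings(root: str, block_size: int = 4096) -> float:
    """Return the fraction of blocks under root that duplicate an earlier block."""
    seen = set()       # fingerprints of every distinct block found so far
    total_blocks = 0   # every block read, duplicate or not
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    while True:
                        block = handle.read(block_size)
                        if not block:
                            break
                        total_blocks += 1
                        seen.add(hashlib.sha256(block).digest())
            except OSError:
                continue  # skip files that cannot be read
    if total_blocks == 0:
        return 0.0
    return 1.0 - len(seen) / total_blocks
```

Calling `estimate_savings` on a mounted directory (for example a hypothetical mount point such as `/mnt/nfs_volume`) returns a value between 0.0 and 1.0, where 0.5 means that half of the blocks read were duplicates.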
8. Can previously written volume data be deduplicated?
Yes. A CLI command signals A-SIS deduplication to scan and deduplicate existing data on a volume. This command can be run at any time on a volume that contains previously written data, and we recommend it be run whenever A-SIS deduplication is first enabled on a volume.
9. Since A-SIS deduplication is run after the data is written to the volume, can I schedule the time when this deduplication is run?
Yes. A CLI command allows you to set individual A-SIS deduplication schedules for each volume.
10. What kind of space savings can I expect with A-SIS deduplication?
The space savings of any deduplication product depend on the number of duplicate objects that can be found and removed. A-SIS deduplication is no different. Based on internal testing and customer feedback, the chart below illustrates some sample space savings achieved by A-SIS deduplication in typical environments:
| Dataset | Sample Space Savings Observed |
|---|---|
| Full System Backups* | 20:1 (95%) over time |
| VMware Images | 85% |
| Tech Pubs | 50% |
| Software Archives | 50% |
| Database | 30 – 50% |
| Home Dirs | 30 – 50% |
| Web & MS Office | 30 – 45% |
| Oil & Gas Seismic Data | 30% |
| E-mail Archive | 20% |
*In data backup environments, space savings grow over time as repetitive full backups are retained. For example, tests with CommVault Galaxy provided a 20:1 space reduction over time, assuming daily full backups with a 2% daily file modification rate.
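The relationship between a reduction ratio and a percentage, and the way savings grow as full backups accumulate, can be worked through with a short Python sketch. The model is deliberately simplified and is an assumption, not a description of how the CommVault Galaxy tests were run: it treats every full backup as the same size and assumes each new backup adds exactly 2% new blocks. The function names are hypothetical.

```python
def ratio_to_percent(ratio: float) -> float:
    """Convert an N:1 space reduction ratio to percent space saved."""
    return (1.0 - 1.0 / ratio) * 100.0


def backup_ratio(retained_fulls: int, daily_change: float = 0.02) -> float:
    """Reduction ratio after retaining a number of daily full backups.

    Sizes are expressed in units of one full backup. Assumes the first
    backup is stored whole and each later one adds only its changed blocks.
    """
    logical = float(retained_fulls)
    physical = 1.0 + daily_change * (retained_fulls - 1)
    return logical / physical


print(f"20:1 is about {ratio_to_percent(20):.0f}% space saved")
for days in (1, 7, 14, 33):
    print(f"{days:>2} retained fulls -> about {backup_ratio(days):.1f}:1")
```

Under these assumptions the ratio starts at 1:1 for a single backup, passes 20:1 at about 33 retained full backups, and can never exceed 50:1 (the reciprocal of the 2% change rate), which is why the savings are described as growing over time.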
More Information
Didn't find the answers you were looking for? Several hundred questions were submitted during a July 19 TechTalk chat session. For more details, check out the full 23-page transcript (pdf).