Oracle on NetApp Part III:
Why Protocol Should Be Irrelevant
By Steve Hight
The final installment of this three–part series on Oracle® storage architectures looks at a real–world deployment of Oracle on NetApp. Previous articles focused specifically on how NetApp engineering has optimized the use of Oracle on NFS and Oracle on FC SAN.
When Tech OnTap contacted me about writing an Oracle on NetApp article, they asked me to talk specifically about our experiences running Oracle on SAN and NAS. I told them, "Shame on you for still thinking in terms of SAN or NAS. You should be talking about unified storage and using whatever protocol best meets each client's needs or the needs of the application." The terms SAN and NAS are dead along with "totally tubular" and "awesome." (Those are dead, aren't they?)
At Catholic Healthcare West (CHW) we have more than 117 Oracle Database instances running approximately 150 applications supported by 55 NetApp storage systems with NFS, Fibre Channel, and iSCSI protocols. In total, we have over 750TB of storage. Since the reality is that many IT organizations, application vendors, and DBAs still think in terms of protocol, this article highlights:
- How we choose which storage protocol to use for a given application
- The advantages of unified storage for Oracle
- The relative merits of the various protocols
Frankly, it's the second section that we think is the most interesting. The combination of unified storage and unique NetApp data management tools gives us tremendous power to manage our Oracle environment.
Choosing a Storage Protocol
Our first experience with Oracle on NetApp was with NFS a number of years ago. When NetApp said we could run Oracle over NFS, our first reaction was to laugh because at that time it was a very unconventional way to do things (some think it still is); back then our DBAs still wanted raw disk. When we converted from SAP to Lawson Financials we switched from traditional storage to NFS. Lawson sent a team of engineers out to certify the configuration. They planned to stay for five days. After one day they were completely satisfied; they certified the solution and left.
Since that time, we've added many additional Oracle applications, some on NFS, some on Fibre Channel, and some on iSCSI.
At this point you're probably wondering why we use more than one storage protocol with Oracle and how we choose which protocol to use.
Today, our standard for Oracle is Linux® with NetApp storage, and we don't differentiate between Fibre Channel, iSCSI, or NFS. We're running it all with confidence. When we deploy a new application, the decision really comes down to which protocol the vendor supplying the application can support the best. We always try to find out what protocol the vendor is most comfortable with or has the most deployments on. Some vendors are Fibre Channel only, while others are willing, or even prefer, to support other options. For example, our Emageon Picture Archiving and Communication System (PACS) specifies the use of NFS.
In the past, we preferred iSCSI or NFS because of their protocol resiliency and because of Fibre Channel's high per–port and management costs. Today the per–port cost on a director–class switch is no longer significantly different between FC and IP, so vendor support has become our overriding criterion.
Other Factors to Consider
Along with vendor support, there are a few other factors you may want to consider if you're trying to decide which protocol to choose for your Oracle environment:
- Relationship with networking team: I've talked with people who don't want to do anything but Fibre Channel because they don't have control over IP networking or a good relationship with their networking team. If that applies to you, you might find it beneficial in the long run to use FC SAN. On the flip side, there's obviously an issue of management complexity with FC SAN. If your organization doesn't have the skill set to deal with that, the protocols that leverage standard Ethernet gear may be preferable.
- What does your DBA want? Let's face it, good Oracle DBAs are hard to find, and many of them come with strong opinions about what's good and what isn't. Some will be more comfortable with NFS and iSCSI than others. I don't force a good DBA to use a protocol he or she isn't comfortable with if I don't have to.
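The selection logic described in this section can be sketched in a few lines of Python. This is only an illustration, not a tool we run: the function name `choose_protocol`, its parameters, and the fallback order in `DEFAULT_PREFERENCE` are all hypothetical. The one rule taken from the discussion above is that vendor support comes first, with DBA comfort considered only among the protocols the vendor supports.

```python
from typing import Optional, Sequence

# Hypothetical fallback order, used only when nothing else decides the question.
DEFAULT_PREFERENCE = ("NFS", "iSCSI", "FC")


def choose_protocol(
    vendor_supported: Sequence[str],
    dba_preference: Optional[str] = None,
    site_preference: Sequence[str] = DEFAULT_PREFERENCE,
) -> str:
    """Pick a protocol: vendor support first, then DBA comfort, then site order."""
    supported = set(vendor_supported)
    if not supported:
        raise ValueError("the vendor must support at least one protocol")
    # A vendor that supports a single protocol settles the question.
    if len(supported) == 1:
        return next(iter(supported))
    # Otherwise honor the DBA's preference when the vendor supports it.
    if dba_preference in supported:
        return dba_preference
    # Fall back to the (assumed) site ordering.
    for protocol in site_preference:
        if protocol in supported:
            return protocol
    # The vendor supports only protocols outside the site list; pick deterministically.
    return sorted(supported)[0]


# A vendor that specifies NFS, as our PACS vendor does.
print(choose_protocol(["NFS"]))
# A vendor that supports everything, with a DBA who is most comfortable on FC.
print(choose_protocol(["FC", "iSCSI", "NFS"], dba_preference="FC"))
```

Performance does not appear as an input here, which matches our experience that it has not been a deciding factor.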
Many people continue to believe that performance is a factor, but frankly I haven't found performance to be an issue with any of the protocols. Redundancy is excellent whichever you choose. We found that iSCSI has 2% to 4% more CPU overhead than FC, probably because we were using a software initiator rather than an iSCSI HBA. In any case, that difference isn't enough to outweigh the other factors I've already mentioned.
For us, the advantage of being able to use any storage protocol we need outweighs the relative advantages or disadvantages of any particular protocol. Read on to learn more about how we use unified storage in our Oracle environment.
Unified Storage in Our Oracle Environment
Over the years, CHW has grown by acquisition. As a result, we've got an extremely diverse environment with every flavor of UNIX® and countless applications. As a nonprofit, we have to be very standards based to mitigate cost and complexity; we focus on adopting technologies and deploying solutions that are open source, based on open standards, or at a minimum highly modular. It's just not a good business practice for us to adopt technologies that will leave us locked into a proprietary, single–vendor solution. As an organization, we are always looking for innovative and cutting–edge technologies to help improve patient care while reducing expenditures.
That's why unified storage fits so well in our environment. We can deploy a storage system and be certain that we won't need something different to meet our future protocol requirements. NetApp unified storage solutions meet our Oracle needs and give us a feature set that works across all protocols to simplify our Oracle data management.
Oracle Requirements
Our Oracle environment consists of literally hundreds of applications running on everything from Oracle 8.0.5 up to Oracle11g™ and Oracle RAC. A lot of people talk about their Oracle applications as "mission critical," but when you're in healthcare some applications truly are a matter of life and death. Our number one requirement from our Oracle infrastructure is reliability, followed closely by supportability. For us, it's very important to be able to support our storage in house without having to depend on outside parties. We don't like the typical glass house, big SAN thing here because we don't want to be dependent on others to help us recover from an incident.
A few of the applications in use at CHW.
| Application | Protocol | Purpose |
| --- | --- | --- |
| Lawson Financials | FC SAN | Hospital's financials |
| Emageon | NFS | PACS application (radiology images) |
Multiple Applications per Storage System
Right now, CHW has more than 50 Oracle servers (including three RAC clusters) connected to NetApp storage. Our 55 NetApp storage systems have a total of 750TB of data covering Oracle and other critical storage needs for our facilities. These storage systems are simple to configure and manage; we use NetApp Operations Manager to keep tabs on everything since we have a significant number of systems.
The majority of our storage systems are serving multiple applications. With the exception of our Emageon PACS application (radiology imaging) and Exchange, we don't have any NetApp systems dedicated to a single application. With NetApp flexible volumes (FlexVol® volumes), we don't have to worry about hot disks or disk contention. Workload is distributed very efficiently.
We'll be utilizing FlexShare™ in the future to prioritize workloads on the storage systems that carry multiple loads. We are not using it yet simply because DBAs and others are not generally aware that this capability exists. Once word gets out, we should have a lot of good applications for the technology.
Storage System Reliability and Rapid Recovery
With Oracle on NetApp, we know the storage is reliable, and if database corruption occurs it's very easy for us to recover without any outside assistance. We can always use SnapRestore® to revert to a point in time before the corruption and get back up and running in a matter of minutes. In addition, we like the ease with which we can clone a database using FlexClone® for test/dev or other purposes without consuming twice the storage resources.
Thin Provisioning
We're also using NetApp FlexVol technology to enable thin provisioning. Almost every time a new Oracle application (or other project) comes online, we see extremely bloated requirements for how much storage a database is going to need. Using flexible volumes enables us to mix volumes with differing performance and capacity needs and grow and shrink them as requirements change. We can tell a customer, "Sure, I'll be happy to give you 1TB for your new project," and know that space isn't going to get consumed unless it's actually needed. Thin provisioning allows us to overprovision our storage and allow volumes to grow as necessary rather than allocating large volumes up front and having them sit mostly empty for the first year.
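To make the overprovisioning arithmetic concrete, here is a minimal Python sketch. The volume names, sizes, and the `overcommit_report` helper are invented for illustration and are not CHW figures or a NetApp API. It simply compares what has been promised to requesters with what is actually consumed on a given amount of physical capacity.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    provisioned_tb: float  # size promised to the requester
    used_tb: float         # space actually consumed so far


def overcommit_report(volumes, physical_tb):
    """Summarize promised versus consumed space against physical capacity."""
    provisioned = sum(v.provisioned_tb for v in volumes)
    used = sum(v.used_tb for v in volumes)
    return {
        "provisioned_tb": provisioned,
        "used_tb": used,
        # Greater than 1.0 means more space is promised than physically exists.
        "overcommit_ratio": provisioned / physical_tb,
        # Fraction of the physical capacity that is really in use.
        "physical_utilization": used / physical_tb,
    }


# Hypothetical example: three projects asked for 4TB in total
# but have written only 1TB so far, on 2TB of physical capacity.
volumes = [
    Volume("project_a", provisioned_tb=1.0, used_tb=0.25),
    Volume("project_b", provisioned_tb=2.0, used_tb=0.5),
    Volume("project_c", provisioned_tb=1.0, used_tb=0.25),
]
report = overcommit_report(volumes, physical_tb=2.0)
print(report)
```

In this made-up case the storage is overcommitted two to one while only half of the physical capacity is in use. The number to watch is physical utilization, since that is what determines when more disk has to be added.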
Data Protection
In my experience, most data protection software is hard to configure, carries a steep learning curve, and often just doesn't work as advertised; as a result it ends up as shelfware. That's why we rely so much on Snapshot™ and SnapRestore. It's very easy, people use it, and people trust it.
Our data protection environment utilizes a combination of NetApp Snapshot copies for immediate recovery needs and regular tape backups. Now that SnapManager® for Oracle is available, we are considering using it for new deployments. As we all know, it takes time to move from a scripted solution to a product solution, but we see the benefits of having a product to simplify this process, along with support from a vendor. We will get there eventually!
Observed Protocol Differences
Although we are agnostic on protocol, obviously there are some differences. As I said previously, these differences aren't enough to make us pick one protocol over another, but they may be worth considering if you are choosing a storage protocol for your environment.
We like NFS for its ease of management and statelessness. We can recover a 200GB database in a minute on NFS, which is impressive. On the other hand, fewer vendors support it, and NFS is less consistent across platforms. Different client platforms require different patch sets and different tuning for optimal performance. Vendor patches have been known to break NFS. It appears that direct NFS will address many of these issues. It's a technology we're actively investigating for future deployments.
Fibre Channel, on the other hand, is the mainstream protocol. It's more consistent across platforms, predictable, and robust. It does, however, add management complexity, and provisioning is somewhat slower, although with NetApp it doesn't make a huge difference.
The iSCSI protocol is very easy to manage and as resilient as NFS. We manually fail over our NetApp clusters on iSCSI to perform maintenance on occasion, and, just like NFS, we don't incur any downtime. The failover happens fast enough that the applications don't go down. I am continually surprised to meet people who are not familiar with iSCSI, so it would seem that iSCSI is not yet as mainstream. We have investigated using hardware–based iSCSI initiators, but decided against their use simply because the minimal CPU overhead does not justify the cost of the cards and the additional management overhead.
Some observed advantages and disadvantages by protocol.
| Protocol | Advantages | Disadvantages |
| --- | --- | --- |
| FC SAN | Robust and predictable; broad industry support | Complex management; provisioning is slower |
| NFS | Easy to manage; stateless; exceptional recovery | Patching and tuning*; different on each platform; fewer application vendors support it |
| iSCSI | Free; stateless; low per–port cost | Support not as broad as FC; 2% to 4% CPU overhead |
*Patching and tuning are no longer necessary with the direct NFS client in Oracle11g.
Getting the Most Out of Storage
At CHW, we prefer to buy fewer, larger storage systems and use each system for multiple workloads. Unless you have extreme performance requirements, there is no longer any reason to dedicate a storage system to one application. Today's storage systems are more than fast enough to serve multiple applications in most cases. By choosing unified storage that supports NFS, CIFS, Fibre Channel, and iSCSI we can use whatever protocol makes the most sense for the application and change protocols whenever necessary. Unified storage gives us the flexibility to meet all our storage needs for Oracle and other critical applications.