Saturday, August 2, 2014

NetApp Adds Migration Feature to Automation Software

NetApp this week quietly added a new LUN migration tool to OnCommand Workflow Automation (WFA), its software for provisioning, migrating, and decommissioning storage.

Intended to aid in the transition of LUNs from 7-Mode to clustered Data ONTAP (cDOT), this tool consists of a set of WFA workflows and associated applications that convert the 7-Mode LUNs to files, then mirror those files to a cDOT cluster, and finally convert the files back into LUNs on the cDOT destination.

Interestingly, the conversion is actually performed via Windows PowerShell cmdlets. The tool allows per-volume or per-LUN migration, leaving the source data intact. Although Snapshot history is not copied, it preserves any storage efficiencies during offline migrations.

NetApp is expected to position this tool as a replacement for the DTA2800 appliance or host-based migrations, as it enables both online and offline LUN migrations:

Online migrations perform a LUN-to-file conversion via NFS hardlinks, which requires a temporary “staging volume” to hold a copy of the source LUN. The application and host are shut down before cutover. Cutover downtime is minimal, usually the time it takes to complete the host remediation activities -- regardless of LUN size.

Offline migrations create a FlexClone volume of the source, convert the cloned LUNs into files, and use SnapMirror to replicate the cloned volume to the clustered Data ONTAP target. This technique does not require a temporary staging volume, but the host (where the LUN is mounted) must be shut down.
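Both paths follow the same three-stage pattern: convert LUNs to files, replicate, and convert the files back into LUNs. Here is a minimal Python sketch of the two sequences; the step names are hypothetical stand-ins for the actual WFA workflow commands, not NetApp's real tooling:

```python
# Illustrative model of the two LUN migration paths. Step names are
# hypothetical; the real work is done by WFA workflows and PowerShell cmdlets.

def migrate_lun(mode):
    """Return the ordered steps for an online or offline LUN migration."""
    if mode == "online":
        steps = [
            "create staging volume",                 # temporary copy of the source LUN
            "convert LUN to file (NFS hardlink)",    # host stays up during baseline copy
            "snapmirror staging volume to cDOT",     # TDP SnapMirror to the cluster
            "shut down host and application",        # brief cutover window
            "final snapmirror update",
            "convert file back to LUN on cDOT",
            "remediate host and bring up application",
        ]
    elif mode == "offline":
        steps = [
            "shut down host and application",        # host is down for the whole copy
            "flexclone source volume",               # no staging volume needed
            "convert cloned LUNs to files",
            "snapmirror clone to cDOT",
            "convert files back to LUNs on cDOT",
            "remediate host and bring up application",
        ]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return steps
```

Note how the online path defers the host shutdown until cutover, while the offline path starts with it; that is the essential trade-off between the two modes.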

The LUN Migration Tool requires several components, including:

  • OnCommand Workflow Automation (WFA) 2.2 RC1 or later
  • OnCommand Unified Manager Core Package 5.2 for 7-Mode
  • OnCommand Unified Manager 6.1 RC1 for clustered Data ONTAP

The source storage system must be running Data ONTAP 7.3.x or 8.x 7-Mode. The staging storage system must be qualified to be the source of the clustered Data ONTAP Transition Data Protection (TDP) SnapMirror, and the destination storage system must be running clustered Data ONTAP 8.2.x.

For more details, see NetApp Technical Report 4314 entitled Workflow Automation for SAN LUN Migration: 7-Mode LUN Transition to Clustered Data ONTAP 8.2.

Tuesday, July 15, 2014

Introducing NetApp Private Storage for Microsoft Azure

NetApp, Microsoft, and Equinix today introduced “NetApp Private Storage for Microsoft Azure,” a hybrid cloud infrastructure that links NetApp Storage with Azure Compute via Azure ExpressRoute.

The solution consists of several components.

First, FAS Storage Systems, running either 7-Mode or clustered Data ONTAP, must reside in a colocation facility that is an Azure ExpressRoute Exchange provider (such as Equinix). Although both operating modes work, NetApp highly recommends clustered Data ONTAP.

NetApp is also testing E-Series Storage Arrays with iSCSI as part of this solution.

Next, the solution requires Azure ExpressRoute, a private connection that bypasses the public Internet. ExpressRoute connections offer faster speeds, greater reliability, and higher security than typical connections. In NetApp's tests, ExpressRoute delivered 36% better performance than a VPN over a public Internet connection.

According to the three vendors, the solution is currently available in two Azure regions:

Azure US West (San Jose, California):
  • 200Mbps, 500Mbps, 1Gbps, 10Gbps virtual circuits
  • 1ms - 2ms latency observed

Azure US East (Ashburn, Virginia):
  • 200Mbps, 500Mbps, 1Gbps, 10Gbps virtual circuits
  • <1ms to 1ms latency observed

As ExpressRoute is rolled out globally, NetApp will be testing latency in additional locations.

There are also several required features for the customer's network equipment within the Equinix colocation facility. NetApp does not certify specific network equipment to be used in the solution; however, the network equipment must support the following features:

Border Gateway Protocol (BGP)
BGP is used to route network traffic between the local network in the Equinix colocation facility and the Azure virtual network. 
Minimum of two 9/125 Single Mode Fiber (SMF) Ethernet ports
Azure ExpressRoute requires two physical connections (9/125 SMF) from the customer network equipment to the Equinix Cloud Exchange. Redundant physical connections protect against potential loss of ExpressRoute service caused by a failure in the physical link. The bandwidth of these physical connections can be 1Gbps or 10Gbps. 
1000Base-T Ethernet ports
1000BASE-T network ports on the switch provide network connectivity from the NetApp storage cluster. Although these ports can be used for data, NetApp recommends using 1GbE ports for node management and out-of-band management. 
Support for 802.1Q VLAN tags
802.1Q VLAN tags are used by the Equinix Cloud Exchange and Azure ExpressRoute to segregate network traffic on the same physical network connection.

Other optional features for the solution include:

Open Shortest Path First (OSPF) protocol
OSPF is used when there are additional network connections back to on-premises data centers or to other NetApp Private Storage for Microsoft Azure solution locations; it helps prevent routing loops. 
QinQ (stacked) VLAN tags
QinQ VLAN tags (IEEE 802.1ad) can be used by the Equinix Cloud Exchange to support the routing of the network traffic from the network to Azure. The outer service tag (S-tag) is used to route traffic to Azure from the Cloud Exchange. The inner customer tag (C-tag) is passed on to Azure for routing to the Azure virtual network through ExpressRoute. 
Virtual Routing and Forwarding (VRF)
Virtual Routing and Forwarding is used to isolate routing of different Azure Virtual Networks and the customer VLANs in the Equinix colocation facility. Each VRF will have its own BGP configuration. 
Redundant network switches
Redundant network switches protect from a loss of ExpressRoute service caused by switch failure. It is not a requirement, but it is highly recommended that redundant switches are used. 
10Gbps Ethernet ports
Connecting 10Gbps Ethernet ports on the NetApp storage to the switch provides the highest amount of bandwidth capability between the switch and the storage to support data access.
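The tag handling described above (802.1Q tags to segregate traffic, and optionally QinQ stacked tags with an outer S-tag and inner C-tag) can be illustrated with a short parser. This is a sketch of the on-the-wire format only, not of Equinix or Azure equipment behavior; real switches handle additional cases such as legacy S-tag TPIDs:

```python
import struct

# Parse 802.1Q / 802.1ad (QinQ) VLAN tags from a raw Ethernet frame.
TPID_8021Q = 0x8100   # customer tag (C-tag)
TPID_8021AD = 0x88A8  # service tag (S-tag) used for QinQ

def parse_vlan_tags(frame: bytes):
    """Return the VLAN IDs found after the MAC header, outermost first."""
    vids = []
    offset = 12  # skip destination (6) + source (6) MAC addresses
    while True:
        (ethertype,) = struct.unpack_from("!H", frame, offset)
        if ethertype not in (TPID_8021Q, TPID_8021AD):
            break  # reached the payload EtherType (e.g. 0x0800 for IPv4)
        (tci,) = struct.unpack_from("!H", frame, offset + 2)
        vids.append(tci & 0x0FFF)  # low 12 bits of the TCI are the VLAN ID
        offset += 4
    return vids

# Example QinQ frame: a hypothetical S-tag VLAN 100 routes traffic to Azure
# at the Cloud Exchange; the inner C-tag VLAN 200 selects the virtual network.
frame = (bytes(12)
         + struct.pack("!HH", TPID_8021AD, 100)
         + struct.pack("!HH", TPID_8021Q, 200)
         + struct.pack("!H", 0x0800))
print(parse_vlan_tags(frame))  # -> [100, 200]
```

The outer tag is popped first (routing at the Cloud Exchange), leaving the inner tag for Azure to route to the correct virtual network, which mirrors how the S-tag/C-tag split works in the solution.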

NetApp also indicates that connectivity of FAS Storage to Azure Compute only supports IP storage protocols (SMB, NFS, and iSCSI) at this time.

There are several scenarios envisioned for this solution:

  • Cloudburst for peak workloads
  • Disaster Recovery
  • Dev/Test and Production Workloads
  • Multi-Cloud Application Continuity
  • Data Center Migration/Consolidation

One of the more interesting scenarios is multi-cloud application continuity. For example, consider two geographically dispersed Microsoft SQL Server 2012 Availability Group (AG) nodes in an Active/Passive configuration.

The primary SQL AG node is a Hyper-V virtual machine located in a Microsoft Private Cloud on the East Coast of the United States. The SQL AG node located in the Microsoft private cloud is connected to NetApp storage via iSCSI.

The secondary SQL AG node is an Azure virtual machine located in a Virtual Network in the West US Region. The secondary SQL AG node is connected to NetApp Private Storage in the colocation facility via iSCSI over a secure, low-latency, high-bandwidth Azure ExpressRoute network connection. Additionally, a third SQL AG node could be located on an Amazon Web Services (AWS) compute node, providing further multi-cloud failover capability.

SQL AG replication occurs via a network connection between the on-premises private cloud and the Azure virtual network.

If a SQL node, the SQL storage in the primary location, or the entire primary datacenter is lost, the surviving SQL Availability Group node's database replicas are activated automatically.

This application continuity model can be extended by adding NPS for Azure deployments in additional Azure regions.

NetApp Private Storage for Microsoft Azure is immediately available through reseller partners and directly from NetApp, Microsoft, and Equinix in North America. The solution will be available in Europe and Asia in the near future.

Tuesday, June 17, 2014

NetApp Rebrands SSD-Only FAS Systems as All-Flash FAS

NetApp today rebranded its Fabric-Attached Storage (FAS) systems with only solid-state drives (SSD) as All-Flash FAS (or AFF) systems.

AFF systems can run any version of Data ONTAP that supports SSDs. However, NetApp plans to also offer five pre-configured bundles, starting June 23, that leverage the FAS8080 EX and FAS8060 with 200GB, 400GB, and 800GB SSDs.

But why is Data ONTAP good for flash?

I've asked Nick Triantos, one of our consulting systems engineers, to comment on why AFF is different. This is what he said:

"The biggest challenge for us is not how WAFL writes; in fact, that’s a real advantage. The biggest challenges for us have been:

Multi-core Optimizations – For a long time, Data ONTAP didn’t leverage multiple cores effectively. In fact, the project for multi-core optimizations started back with version 7.3 and has continued through the Data ONTAP 8 releases. I’m sure you’ve seen where one CPU was at 90% and the other at 20%! If the workload was hitting an ONTAP domain that would run on a single core, then your performance bottleneck was that particular core (90%). It didn’t matter if the other cores were underutilized. This has been addressed.

Metadata Management – When you leverage small block size like 4K, inherently, you create a ton of metadata you need to manage. In order to get to the data fast, you need even faster access to metadata. How do you access metadata faster? In memory. That’s why there’s a ton of memory in the FAS2500 and FAS8000 Series; so we can manage as much metadata as possible in DRAM.

Data Protection – This is actually related to the above. The AFF has more data protection features than any flash (or non-flash) array in the market. While this is a good thing, there’s a trade-off. The trade-off is longer I/O paths because metadata has to be located and validated against the actual data blocks.

How do you protect against lost writes for example? What happens if I’m a trading firm and the SSD acknowledges that an SSD page has been written – when it was either not written at all or it was written to the wrong location? You just lost millions of dollars. Data ONTAP not only detects, but also protects and recovers from lost writes (which are a very insidious type of failure).”
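To put rough numbers on the metadata point above: with a 4K block size, every tebibyte of capacity is described by hundreds of millions of blocks. Assuming, purely for illustration, about 64 bytes of metadata per block (not a published Data ONTAP figure), the DRAM footprint scales like this:

```python
# Back-of-the-envelope estimate of metadata volume for 4K blocks.
# META_PER_BLOCK is an assumed figure for illustration only.

BLOCK_SIZE = 4096        # 4K block size, in bytes
META_PER_BLOCK = 64      # assumed metadata bytes per block (hypothetical)

def metadata_bytes(capacity_bytes):
    """Estimate the metadata needed to describe a given usable capacity."""
    return (capacity_bytes // BLOCK_SIZE) * META_PER_BLOCK

tib = 1024 ** 4
print(metadata_bytes(10 * tib) / 1024 ** 3)  # -> 160.0 (GiB for 10 TiB of data)
```

Even under this modest assumption, caching all the metadata for tens of tebibytes of flash quickly consumes serious DRAM, which is the argument for the large memory configurations in the FAS2500 and FAS8000 series.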

I said, “Let’s talk more lost writes”. Here’s his response:

“Lost writes are a rare but very stealthy failure, and the worst thing is you won’t know it happened until days or even months later. But once it happens, you just corrupted your data! Good luck trying to find out which backup or snapshot or replication point is not corrupted. Of course, all this additional data protection stuff comes with a trade-off.

On the other hand, claiming blazing speeds and just protecting against two drive losses is not sufficient to claim superior protection of data – especially when flash arrays are typically deployed for business critical, revenue generating applications. You have to have worked through all the failure modes and make sure you can protect against those failures. We’ve hardened Data ONTAP over nearly 20 years of existence to provide a very high level of resiliency against all modes of failure in various combinations.”
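One simplified way to see how a lost write can be caught at all: keep a checksum plus write context (block address and a write generation) for every block, and validate all of it on read. The toy block store below is illustrative only, not the actual WAFL checksum scheme:

```python
import hashlib

# Toy block store that detects lost or misplaced writes by tracking a
# checksum and write context (address + generation) per block.
# Illustrative sketch; not how Data ONTAP is implemented.

class BlockStore:
    def __init__(self):
        self.media = {}     # what the "drive" actually holds
        self.expected = {}  # what the filesystem believes it wrote

    def write(self, addr, data, lost=False, wrong_addr=None):
        gen = self.expected.get(addr, (None, 0))[1] + 1
        record = (data, hashlib.sha256(data).hexdigest(), addr, gen)
        self.expected[addr] = (record[1], gen)  # filesystem-side bookkeeping
        if lost:
            return  # drive ACKed the write but never persisted it
        self.media[wrong_addr if wrong_addr is not None else addr] = record

    def read(self, addr):
        rec = self.media.get(addr)
        checksum, gen = self.expected[addr]
        if rec is None or rec[1] != checksum or rec[2] != addr or rec[3] != gen:
            raise IOError(f"lost or misplaced write detected at block {addr}")
        return rec[0]

store = BlockStore()
store.write(7, b"old")
store.write(7, b"new", lost=True)  # drive silently drops the second write
try:
    store.read(7)
except IOError as e:
    print(e)  # stale generation and checksum -> the lost write is detected
```

A plain checksum alone would pass here, because the stale block is internally consistent; it is the context (generation and address) that exposes the lost or misplaced write, which is the essence of the protection being described.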

To recap, NetApp AFF system bundles have:

  1. Larger memory
    • Larger read/write cache in FAS8000 = more in-memory metadata
  2. Faster NVRAM
    • Faster ACKs = lower response times
  3. Significant Multi-core optimizations
    • From Data ONTAP 7.3 through version 8.2+
  4. Continuous Segment Size Cleaning (CSS)
    • Data ONTAP variable segment size (4K-256K)
  5. Intelligent Algorithms
    • Pattern-detection-based read-ahead
    • Sequential reads with the same block size (e.g., 32K) and with mixed block sizes (4K, 64K, 4K, 64K)
    • Strided reads: start at block N, read blocks 10 and 12, skipping block 11 in between
    • Backward reads: start at block N and read 10 blocks backward
    • Multiple threads simultaneously reading from multiple locations
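The pattern-detection idea in item 5 can be sketched with a toy predictor: look at the deltas between recent block offsets and, if they are consistent, prefetch along that stride. A negative stride covers backward reads. This is an illustration of the concept, not NetApp's actual read-ahead engine:

```python
# Toy read-ahead predictor: detect a constant stride (forward, strided,
# or backward) in recent block offsets and prefetch along it.

def predict_prefetch(history, count=4):
    """Given recent block numbers, return blocks worth prefetching, or []."""
    if len(history) < 3:
        return []  # not enough history to call it a pattern
    deltas = [b - a for a, b in zip(history, history[1:])]
    stride = deltas[-1]
    if stride != 0 and all(d == stride for d in deltas):
        last = history[-1]
        return [last + stride * i for i in range(1, count + 1)]
    return []  # no consistent pattern detected

print(predict_prefetch([100, 101, 102, 103]))  # -> [104, 105, 106, 107]
print(predict_prefetch([100, 102, 104]))       # -> [106, 108, 110, 112] strided
print(predict_prefetch([50, 49, 48]))          # -> [47, 46, 45, 44] backward
```

A production engine would additionally track multiple concurrent streams (item 5's last bullet) and mixed block sizes, but the core of stride detection is this delta check.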

AFF system bundles are available to quote and order starting June 23, 2014.