Enterprise Executive - 2017: Issue 2

Enterprise Networks

Steve Guendert 2017-04-12 02:34:05

As a mainframe end user, you very likely have a disaster recovery/business continuity strategy that requires you to replicate data to a remote site. That data may be stored on a variety of media: spinning disk, flash/SSD, virtual tape or physical tape cartridges. While some of you may still be using PTAM (Pickup Truck Access Method) and having tape cartridges physically transported to an offsite location, the overwhelming majority of you are doing some form of electronic data replication. For longer distances the replication is usually asynchronous, while for shorter distances (less than 50 miles) it is often synchronous. The replication can run over a wide variety of platforms and protocols, including DWDM, FCIP or IP.

These types of business continuity architectures are expensive, but compared with the financial and/or reputational costs of a major outage, they are well worth it. I would like to pose a question to you. What do you believe to be the most expensive component of these business continuity architectures: The mainframe? The DASD arrays? The network hardware? If you guessed any of those, I am sorry, but you are likely wrong. I cannot comment on your specific costs, but based on my experience in meeting with customers and clients worldwide, I can safely say that, as a rule, the most expensive component of a business continuity architecture is the network bandwidth between the data centers.

In the remainder of this column, I will give you some ideas on how you can use that bandwidth more efficiently. The more efficiently the bandwidth is utilized, the less of it you need, so these ideas should help you reduce your bandwidth costs. A subsequent feature-length article will expand on these ideas in more detail.

First, you should always do a bandwidth needs analysis/assessment.
This can be done in-house using commercially available software, or you can engage one of your vendors. For example, if the predominant bandwidth requirement is driven by z/OS Global Mirror, you may want to engage your DASD vendor. A good bandwidth needs assessment will include not only current requirements but also a forecast for growth because, let’s face it, your data is growing, and with it your data replication requirements and, hence, your bandwidth requirements.

Second, take advantage of protocol efficiencies with newer technologies. For example, if your architecture includes cascaded FICON directors with Interswitch Links (ISLs) between sites, note that the latest FICON switching runs at 32 Gbps. Today, with no 32 Gbps host or storage device connectivity yet on the market, the primary use case for 32 Gbps is ISLs. Both 32 Gbps and 16 Gbps FICON are more efficient for ISLs than the older speeds. The reason: They use 64b/66b encoding, while 8 Gbps and 4 Gbps use 8b/10b encoding. Measured against the payload bits, the overhead of 64b/66b encoding is 3.125 percent (2 extra bits for every 64), considerably less than the 25 percent overhead of 8b/10b encoding (2 extra bits for every 8). Furthermore, 16 Gbps and 32 Gbps FICON ISLs can use Forward Error Correction (FEC) to further improve ISL reliability and efficiency. Why spend expensive bandwidth on overhead when a much more efficient option is available?

Third, keep data streaming over the distance. Try to avoid bursty data traffic patterns. Going back to the cascaded FICON example, make certain you have plenty of buffer credits configured on the director ISL ports. It is also a good idea to use your SAN vendor’s buffer credit recovery mechanism. Buffer credit recovery is a standards-based technique that detects and remedies buffer credit loss caused by things such as faulty SFPs.

Fourth, if you have a z13 as well as FICON directors and storage devices that support it, you should consider using FICON Dynamic Routing (FIDR) for your ISLs.
FIDR enables ISL routes to be dynamically changed based on the Fibre Channel exchange ID, which is unique for each I/O operation. With FIDR, an ISL is assigned at I/O request time, so different I/Os from the same source port going to the same destination port may be assigned different ISLs. z13 servers using FIDR have performance and management advantages in configurations with ISLs and cascaded FICON directors:

• ISLs can be shared between FICON and FCP (Metro Mirror/PPRC or distributed) traffic.
• I/O traffic is better balanced across all available ISLs.
• Management is easier, with predictable and repeatable I/O performance.

Finally, give serious consideration to using some form of trunking, as well as Quality of Service (QoS). Trunking aggregates individual cross-site links into one or more “big pipes.” QoS is a feature/management technique that allows the end user to prioritize specific workloads. This prioritization is extremely useful if one or more cross-site network links go down and the remaining links do not have enough bandwidth to maintain performance for every workload. QoS features let you make certain that, in this event, the most important workloads get the bandwidth.

Those are some basic ideas for gaining more efficiency in your long-distance cascaded FICON network. More efficiency means lower bandwidth requirements, which can significantly lower costs in your business continuity architecture.

Steve Guendert, Ph.D., is z Systems technology director of Product Management for Brocade Communications, where he leads the mainframe-related business efforts. He is a senior member of the Institute of Electrical and Electronics Engineers (IEEE) and the Association for Computing Machinery (ACM), and a member of the Computer Measurement Group (CMG). He is a former member of both the SHARE and CMG boards of directors. Email: stephen.guendert@brocade.com
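The encoding overhead figures quoted in the column can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not anything taken from a vendor tool: the function name `encoding_overhead` and its dictionary keys are hypothetical, chosen only for illustration. It treats each line code as sending a fixed number of bits on the wire for a fixed number of payload bits and reports the overhead two ways, since both conventions appear in practice.

```python
from fractions import Fraction


def encoding_overhead(data_bits: int, line_bits: int) -> dict:
    """Overhead of a line code that sends `line_bits` on the wire
    for every `data_bits` of payload (for 8b/10b: 8 and 10).

    Hypothetical helper for illustration only.
    """
    if not 0 < data_bits < line_bits:
        raise ValueError("expected 0 < data_bits < line_bits")
    extra = line_bits - data_bits
    return {
        # Extra bits relative to the payload bits.
        "vs_payload": Fraction(extra, data_bits),
        # Extra bits as a share of all bits sent on the wire.
        "vs_line": Fraction(extra, line_bits),
        # Share of transmitted bits that carry payload.
        "efficiency": Fraction(data_bits, line_bits),
    }


if __name__ == "__main__":
    codes = {"8b/10b": (8, 10), "64b/66b": (64, 66)}
    for name, (data_bits, line_bits) in codes.items():
        o = encoding_overhead(data_bits, line_bits)
        print(
            f"{name}: {float(o['vs_payload']):.3%} overhead vs. payload, "
            f"{float(o['vs_line']):.3%} of line bits, "
            f"{float(o['efficiency']):.2%} efficient"
        )
```

Measured against the payload, 8b/10b comes out at 25 percent and 64b/66b at 3.125 percent, matching the figures in the column. Expressed as a share of the bits on the wire, the same codes cost 20 percent and about 3.03 percent, which is why other sources sometimes quote those numbers instead.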

Published by Enterprise Systems Media.

This page can be found at http://ourdigitalmags.com/article/Enterprise+Networks/2761249/399971/article.html.
