IBM zEnterprise System: a new dimension in computing
www.mainframezone.com • August/September 2010
The resource for users of IBM mainframe systems



can we change the way the mainframe is managed forever?

we can

Why CA Mainframe 2.0?

Simplification of mainframe management by automating, streamlining and standardizing the acquisition, installation and updates of your CA Technologies mainframe software.

Innovation with expertise gained over 33 years of engineering mainframe software to create new and enhanced solutions that help meet key business challenges.

Focus on continually delivering technology that helps simplify and enhance the user experience to address workload and staffing needs.

The culmination of these elements, CA Mainframe 2.0, is an innovative new approach to managing mainframe applications.

CA Mainframe 2.0 continuously extends the value of CA Technologies’ mainframe solutions through an enhanced customer experience. The first CA Mainframe 2.0 delivery is CA Mainframe Software Manager (MSM)—a key no-cost deliverable which standardizes, automates and simplifies CA Technologies product installations and updates.

CA MSM includes a new, streamlined Electronic Software Delivery (ESD) method that allows you to install CA Technologies mainframe products using standard utilities—without requiring you to reconstitute a tape cartridge. Test results showed an amazing 74%–98% improvement in install and update times for both the expert and novice user. The simplified management in MSM empowers the next generation of less-experienced IT staff to perform core mainframe ownership tasks, and gives more experienced staff valuable time to add even greater value to the business.

Visit ca.com/mainframe/benchmark-webcast to view the webcast about the CA MSM testing and hear firsthand from the participating novice and expert testers. Visit ca.com/mainframe2 to learn more about all things CA Mainframe 2.0.

Copyright © 2010 CA. All rights reserved.

Learn more about CA Mainframe 2.0 at ca.com/mainframe2

The 7 Virtues of Mainframe

Organizations around the world are recommitting to the mainframe as they discover the unique energy, performance and management advantages it offers in today’s challenging business environment. The “7 Virtues of Mainframe” make it the clear choice for high-security, always-on transactional workloads and make it an attractive, viable option for almost all net-new workloads. IT managers and their finance counterparts now see that a single mainframe can often replace the exploding server > racks of servers > server farm > server plantation scenario, along with its enormous and ever-growing total cost of ownership, and the resulting complexity fatigue.

More Innovation

CA Technologies’ recent introductions of CA Compliance Manager for z/OS, CA Encryption Key Manager and CA Cross Enterprise Application Performance Management are clear indicators of the investments CA Technologies is making to advance mainframe management. 2010 will bring even more innovations as part of CA Technologies’ commitment to constantly providing you further opportunity to leverage your mainframe environment and release the latent value of the investments you’ve already made.

If you license CA Technologies Software for use on your mainframe, you can get CA Mainframe Software Manager at NO ADDITIONAL CHARGE.

Contact your CA Technologies account representative today to order CA Mainframe Software Manager and request your free copies of Releasing Latent Value: The Book and Mainframe 2.0: The Book.

cost effective • innovative • resilient • secure • govern • virtualized • reliable

Efficiency through standardized & centralized business functions




Contents • August/September 2010 • Volume 8 / Number 4 • www.mainframezone.com

Articles
8  IBM’s New zEnterprise Mainframe and Its New Hybrid z/Blade Environment, by Joe Clabby
12  CICS Sysplex Optimized Workload Routing, by Dave Williams
22  Storage Performance Management: More Balance to Improve Throughput, by Gilbert Houtekamer, Ph.D., and Els Das
28  The New System z: The Business Perspective, by Alan Radding
30  Proactive IT Systems Management: The Time Is Now, by G. Jay Lipovich
34  How Database Design Affects DB2 for z/OS System Performance, by Lockwood Lyon
42  Linux on System z Kernel Dumps, by Michael Holzheu
48  Is Your z/OS System Secure?, by Ray Overby
52  Building Better Performance for Your DB2/CICS Programs With Threadsafe, by Russ Evans and Nate Murphy
57  Performance Management Essentials to IT Success, by Mano Mathai
62  Performance Strategies for Batch Applications on z/OS, by Srinivas Potharaju and Arka Nandi
68  DB2 for z/OS Parallelism, by Willie Favero
72  Managing the WebSphere MQ Dead Letter Queue, by Ronald Weinger
75  Using Hardware Cryptographic Support With OpenSSH in Linux on System z, by Manfred Gnirss, Ph.D.

Columns
4  Publisher’s Page, by Bob Thomas
6  Linux on System z: Of Shoes, and Ships, and Sealing Wax …, by David Boyes, Ph.D.
21  z/Vendor Watch: IBM Prepares the Market for Its New Mainframe Processor, by Mark Lillycrop
40  Pete Clark on z/VSE: Hints and Tips for z/VSE 4.2, by Pete Clark
46  Big Iron: The Mainframe Story (So Far), by Steven A. Menges
56  z/Data Perspectives: More DBA Proverbs, by Craig S. Mullins
61  Cross-Platform Management: The Door Is Open to Cross-Platform Systems Management, by G. Jay Lipovich
67  Compliance Options: Constraints, Controls, and Capriciousness, by Gwen Thomas
71  Mainframe Security: Organizational Questions Affecting Mainframe Security, by Stu Henderson
74  Storage & Data Management: Three-Way Disaster Recovery Solutions for System z, by Ray Lucchesi
80  IT Sense: Teachable Moments From the Gulf Oil Spill, by Jon William Toigo

Responsive Systems
281 Hwy 79, Morganville, NJ 07751 • Tel: 732.972.1261 • Fax: 732.972.9416 • Web: www.responsivesystems.com

Optimize Both Performance & DB2 Memory Utilization With Buffer Pool Tool® for DB2. Reduce Your Processing Costs, Save $$ Now!

Publisher’s Page

By Bob Thomas

Publisher: Bob Thomas, [email protected]
Associate Publisher: Denny [email protected]
Editorial Director: Amy B. Novotny, [email protected]
Columnists: David Boyes, Ph.D.; Pete Clark; Stu Henderson; Mark Lillycrop; G. Jay Lipovich; Ray Lucchesi; Steven A. Menges; Craig S. Mullins; Gwen Thomas; Jon William Toigo
Online Services Manager: Blair Thomas, [email protected]
Copy Editors: Dean Lampman, Pat Warner
Art Director: Martin W. [email protected]
Production Manager: Kyle [email protected]
Advertising Sales Manager: Denise T. [email protected]
Advertising Sales Representatives: Karin [email protected]; Leslie [email protected]

The editorial material in this magazine is accurate to the best of our knowledge. No formal testing has been performed by z/Journal or MainframeZone, Inc. The opinions of the authors and sponsors do not necessarily represent those of z/Journal, its publisher, editors, or staff.

Subscription Rates: Free subscriptions are available to qualified applicants worldwide.

Inquiries: All inquiries concerning subscriptions, remittances, requests, and changes of address should be sent to: z/Journal, 9330 LBJ Freeway, Suite 800, Dallas, Texas 75243; Voice: 214.340.2147; Email: [email protected].

For article reprints, contact [email protected].

Publications Agreement No. 40048088, Station A, PO Box 54, Windsor ON N9A 6J5, Canada

All products and visual representations are the trademarks/registered trademarks of their respective owners.

MainframeZone, Inc. © 2010. All rights reserved. Reproductions in whole or in part are prohibited except with permission in writing. (z/Journal ISSN 1551-8191)

z/Journal Editorial Review Board: David Boyes, Pete Clark, Tom Conley, Phyllis Donofrio, Willie Favero, Steve Guendert, Mark S. Hahn, Norman Hollander, Eugene S. Hudders, Gene Linefsky, Chris Miksanek, Jim Moore, Craig S. Mullins, Mark Nelson, Mark Post, Greg Schulz, Al Sherkow, Phil Smith III, Rich Smrcina, Adam Thornton

z/Journal Article Submission: z/Journal accepts submission of articles on subjects related to IBM mainframe systems. z/Journal Writer’s Guidelines are available by visiting www.mainframezone.com. Articles and article abstracts may be sent via email to Amy Novotny at [email protected].

IBM’s Black Swan Mainframe Event

A Black Swan Event is generally characterized as an outlier event of enormous magnitude and consequence that ultimately has an extreme impact. According to Wikipedia, in his New York Times best-selling book, The Black Swan, author Nassim Taleb says that almost all major scientific discoveries, historical events, and artistic accomplishments were unexpected and unpredicted and qualify as Black Swan Events. High-tech examples of Black Swan Events are the Internet, the PC, and the laser.

To me, IBM’s announcement on July 22 of its new zEnterprise System is definitely a Black Swan Event. This isn’t just a new, bigger, faster mainframe—far from it. According to acclaimed IT analyst Joe Clabby’s article in this issue of z/Journal, this new System z announcement is markedly different than in previous years because:

• IBM’s mainframe has several new personalities. It now performs competitively in floating point and single-thread processing and offers outstanding business analytics processing capability.

• The new zEnterprise can now be tightly coupled with IBM’s zBX (IBM zEnterprise BladeCenter Extension) blade server environment to create a common management environment across mainframes and blades.

• New names and extensions have been introduced.

When you also consider that this new mainframe can offer up to 40 percent better performance than the z10, it’s easy to say this IBM announcement of the zEnterprise is indeed a Black Swan, game-changing event. Z

Linux on System z: Of Shoes, and Ships, and Sealing Wax …

By David Boyes, Ph.D.

Somehow, I think Lewis Carroll would have loved the Linux world—it’s often seen but usually absurd. No matter what I plan to talk about, there are always other interesting things that come to mind when this column is due. Here we’ll discuss an upcoming show and some interesting technology coming to the Linux environment by way of the High-Performance Computing (HPC) community. We’ll also examine an exciting way programming skills in the Linux on System z environment are being taught using the Alice project at Carnegie Mellon University (CMU).

First, the Ohio LinuxFest 2010 dates have been announced; it will be held Sept. 10 - 12 in Columbus, OH. It’s a large, volunteer-run show focused on open source, Linux on all platforms, and other open source projects. Registration rates haven’t yet been set, but they’ve historically been very low. The interesting connection with Linux on System z is that the Columbus area is littered with substantial mainframe and Linux on System z sites (for example, Nationwide Insurance), and there seems to be a grassroots movement to cover open source efforts outside the Intel world, including some sessions on z/VM open source tools and Linux on System z management. If you’re thinking of attending a Linux-oriented conference this year, it’s an inexpensive way to make your conference dollars count. To learn more, visit www.ohiolinux.org.

Second, Linus Torvalds approved a series of kernel commits that introduced a fascinating new high-performance distributed file system called Ceph to the mainstream Linux kernel sources. Ceph (more details at http://ceph.newdream.net) is a file system that provides petabyte-scale parallel, network-based data storage across multiple servers and storage technologies. It’s designed to work in environments with different kinds of physical storage; it has automatic data location balancing (i.e., it will automatically reshuffle data to optimize performance) and fault management (with petabyte-scale data sets, something is bound to be broken almost all the time). Ceph adds a policy component that tells the file system that “any file stored in this directory needs to be replicated in at least three physical locations that can’t share a cabinet, power supply, or disk shelf.” Originally developed at the University of California Santa Cruz, a document available at http://ceph.newdream.net/weil-thesis.pdf describes the function of Ceph in detail and provides sample performance numbers—near-wire-speed performance across all kinds of different loads and failure scenarios. Here at Sine Nomine, we have it operational with both Intel and Linux on System z and are continuing to investigate how to get HSM function working.

Last, the Alice project at CMU. Brainchild of Caitlin Kelleher of Washington University and championed by the late Randy Pausch, Alice provides an exploratory environment for application programming targeted at 3D objects and simulation software such as the game “The Sims” (in fact, Electronic Arts donated several motion and object libraries). Recognizing the growing availability of the System z platform, and the use of virtual machines to deploy collaborative and innovative environments, Alice software packages (in Red Hat Package Manager [RPM] format) are now available for Linux on System z. Alice has been used to introduce the ideas of what is programmable and the logic of programming to non-programmers; it may be just the thing to show your manager what is doable in a short time frame. Information about Alice is available at www.cmu.edu/homepage/computing/2009/winter/alice-3-software.shtml.

In the next issue, we’ll take another look at RHEL 6.1 for System z and some improvements in the kernel API for scheduling processes as groups, plus a few other goodies tossed in from our friends in Boeblingen, Germany. Also, as promised, we’ll provide more details on XCAT in an upcoming column. Z

Dr. David Boyes is CTO and president of Sine Nomine Associates. He has participated in operating systems and networking research for more than 20 years, working on design and deployment of systems and voice/data networks worldwide. He has designed scenarios and economic models for systems deployment on many platforms, and is currently involved in design and worldwide deployment of scalable system infrastructure for an extensive set of global customers. Email: [email protected]


“Sure, I’ve done some dumb things ... but in spite of the hype, buying the wrong performance monitor for our IBM System z wasn’t one of them!”

I knew when it was time to get performance and capacity planning tools for z/VM and Linux that we would need a solution which installed in minutes, not days or weeks. I also knew we needed something capable of digging deep enough to provide complete and accurate data for my capacity planning and chargeback accounting folks. Finally, I needed performance tools that would help identify and resolve performance problems instead of contributing to them. That meant there was only one viable solution to meet our performance monitoring needs.

zVPS from Velocity Software. It’s the smart thing to do.

z/VM performance tools from z/VM performance people
www.velocitysoftware.com • USA: 877.964.8867 • International: 1.650.964.8867 or +49 (0) 621.373.844


IBM’s New zEnterprise Mainframe and Its New Hybrid z/Blade Environment

By Joe Clabby


Typically, when IBM introduces a new mainframe (System z), the announcement is all about more capacity, faster processing, better energy utilization, and other speeds and feeds. Reporters publish the new specifications. And customers buy the new System z to add more MIPS computing capacity to meet their needs for more processing power as their businesses continue to grow. But not this year … This year’s System z announcement is markedly different than in previous years because:

• IBM’s mainframe has several new personalities. It now performs competitively in floating point and single-thread processing—and offers outstanding business analytics processing capability.

• The new zEnterprise can now be tightly coupled with IBM’s zBX (IBM zEnterprise BladeCenter Extension) blade server environment to create a common management environment across mainframes and blades (this new, cross-platform management and governance environment has the potential to lower operational costs by up to 62 percent!).

• New names and extensions have been introduced.

What’s In a Name?

In short, IBM’s new z announcement consists of a new System z (the IBM zEnterprise), a hybrid hardware blade cabinet/extension (the zBX), and a new management environment known as the Unified Resource Manager (also known as zManager). Figure 1 explains the family, system, and model number conventions now in use to describe the new mainframe and associated extensions/management software.

The New, Improved Mainframe: The zEnterprise

As could be expected, IBM announced that its new System z, the zEnterprise, can offer up to 40 percent better performance than its predecessor (the z10), using the same amount of energy as the z10. Further, it offers super-fast, single-thread processing, thanks to the new 5.2 GHz quad z core processor. Other new features include:

• Up to 96 cores (one to 80 configurable for client use and the others used for system activities)

• Up to 3TB RAIM memory (This new Redundant Array of Independent Memory acts like RAID [Redundant Array of Independent Disks], ensuring that if a memory error should occur, it can be rapidly corrected.)

• More than 100 new instructions (Instruction sets allow developers to write commands directly to the processor, enabling programs to exploit processors for greater performance. Contrast this with about eight new instructions for Intel’s new Itanium chip set.)

• 1.5MB L2 cache per core, 24MB L3 cache per processor chip (significantly more cache than previous generations, allowing more data to be processed in close proximity to the processor, thus improving processing speed)

• Cryptographic enhancements (adding to IBM’s already established lead in commercial system security [IBM is the only systems vendor to have achieved EAL level 5 security certification.])

• Optional water cooling (Note that water is about 4,000 times more efficient at conducting heat away from servers than air.)

These numbers, in and of themselves, are compelling enough to warrant a capacity upgrade by the current mainframe installed base. But the new zEnterprise can also be used to run new applications because IBM:

• Significantly improved single-thread processing so a mainframe can now compete head-on with tuned distributed servers from a performance perspective

• Improved floating-point processing (often used for scientific and financial processing) so a mainframe can now competitively host these workloads

• Greatly expanded its memory on the chip—and added more main memory, making it possible to process more data in memory (The more data that can be placed in memory, the faster data can be processed. IBM’s memory expansion now enables a System z to be positioned as a huge business analytics server.)

IBM’s increase in z memory will also assist customers who are looking to use a mainframe as a Linux consolidation server. It’s no secret that mainframes offer the most advanced virtualization, provisioning, and workload balancing in the industry. But now, with the increased capacity of the new zEnterprise, coupled with the availability of more memory, the new z has become the most highly scalable Linux consolidation server on the market (some estimates show that up to 100,000 virtual machines [Linux instances] could be managed by an IBM System z). (This 100,000 number is in reference to zEnterprise overall, including Linux on System z images and Blade Virtual images. More than 300 Linux virtual servers can be hosted on a single zEnterprise server.)

And thanks to these and other improvements (especially in software tuning), IBM is now reporting major increases in processing performance for z/OS workloads; huge increases in handling CPU-intensive workloads (these may gain up to an additional 30 percent, thanks to compiler enhancements); and significantly increased database processing performance (dedicated workload optimizers yield five to 10 times improvement in complex query performance, making the mainframe an impressive business analytics server).

Figure 1: The new z family
• Family name: IBM System z
• System name: IBM zEnterprise System (zEnterprise System)
• Name on the server: zEnterprise
• CEC name: IBM zEnterprise 196 (z196)
• Model numbers: M15, M32, M49, M66, M80
• Hybrid hardware name: IBM zEnterprise BladeCenter Extension (zBX)
• Management firmware: IBM zEnterprise Unified Resource Manager (Unified Resource Manager)
Source: IBM Corp., July 2010

And Now for Something Completely Different: A Tightly Coupled z/Blade Environment and a Firmware Manager

But there’s a lot more to the z story than improvements manifest in the new zEnterprise hardware. IBM has also announced a new hybrid zEnterprise/blade environment that tightly couples blades with mainframes. Using this hybrid environment, blades can be connected to a zEnterprise at high speed and managed to the service levels associated with a mainframe (high performance, advanced security, and the ability to control and manage large numbers of virtual machines within blade environments). “In short, IBM’s new hybrid z/blade environment is really,” as Jeffrey Frey, an IBM Fellow, describes it, “a new governance arrangement between the z world and the distributed systems world.” To paraphrase Frey’s description of why IBM embarked on building this hybrid environment, the logic behind this arrangement goes like this:

• System z offers the most advanced management, virtualization, security, performance, scalability, reliability, memory management, and power management facilities in the industry.

• Other servers (particularly x86-based servers) are comparatively immature when it comes to these advanced management capabilities.

• If mainframe management can be extended down to bladed servers, then the advanced mainframe management facilities could manage those servers, providing IT managers and administrators with more advanced tools (and with higher service levels) than they could hope to achieve using less mature management environments that currently run on their hardware environments. (Further, IBM can integrate and package this new governance environment for IT managers/administrators, making deployment simple.)

What IBM is attempting to do with this extended governance environment is to free up IT managers/administrators from having to manage separate infrastructure stacks across various hardware platforms. Instead, IBM wants these managers/administrators to take advantage of the advanced/automated management facilities available on zEnterprise and across its other platforms to reduce the number of manual management tasks they need to perform—and, instead, focus on workload management.

A Closer Look: Unified Resource Manager and zBX

To better understand this new mainframe governance environment, an overview of z management may be necessary. Essentially, IBM offers three software management environments across its product portfolio:

• Tivoli—this line of management software houses the software needed for provisioning, workload balancing, orchestration, and business process management and control (as well as for numerous other high-level management activities—especially the management of services [also known as service management]);

• Systems Director is a management environment that’s largely concerned with the management of physical and virtual (logical) resources. (Note: Systems Director VMControl is a management environment designed to provide a common interface for ultimately managing virtual machines on mainframes, Power Systems, and x86 servers); and the new

• Unified Resource Manager is firmware (code delivered on zEnterprise) largely concerned with the management of the resources in zEnterprise and associated hypervisors (the code that manages virtual machines that use underlying processor resources).

What’s important to understand about these environments is that they can work together across IBM’s zEnterprise server, Power Systems, and x86 servers (System x) to create an environment where all aspects of a systems environment (firmware, physical and virtual servers, and high-level activities) can be managed in an integrated, automated fashion using a common interface. Taking this approach, some IT executives may be able to reduce human labor costs related to systems management by more than 62 percent.

The new zBX environment is a chassis designed to be connected to a mainframe—and optimized for high-speed communications as well as improved manageability under mainframe governance (its current connectivity is based on 10Gb Ethernet). It will be rolled out later this year with support for selected IBM Power-based blades, to be followed in 2011 by IBM System x (x86) blades. But, from a design perspective, this chassis supports a 4U form factor—so the inclusion of “specialty blades” over time may prove possible (in other words, it may be possible to include other processors on blade form factors that can be managed by a mainframe). For example, a Mainframe Executive article in September/October 2008 on Hoplon Infotainment described how Hoplon uses tightly coupled Cell processor front-ends to provide advanced graphical interfaces to back-end mainframe services. Such a configuration running inside a tightly coupled blade environment may be possible in the future.

Summary Observations

IBM’s zEnterprise should have strong appeal to its existing mainframe customer base, given that it allows for a 40 percent increase in performance and a 60 percent increase in capacity (allowing mainframe customers more processing headroom). But IBM’s new positioning of the mainframe as a management/governance engine is the real heart of this story. If IBM can succeed in convincing centralized and distributed systems managers to stop their infighting over which architecture is better—and, instead, focus those managers on managing higher-value workflow management and business process flow that’s more aligned with a business’ strategy—then the enterprise as a whole will be better served. IBM’s new hybrid z environment makes this possible. Z

Noted for his research/analysis and public speaking abilities, Joe Clabby has written dozens of specialized analytical reports on computer technology vendors and has spoken around the world on evolving computing trends. He has been in the computing industry for almost 30 years in positions in sales, product marketing, and research and analysis. He is now president of Clabby Analytics, and was formerly vice president of Systems and Storage at Summit Strategies, as well as group vice president of Platforms and Services at Boston-based Aberdeen Group. Email: [email protected]


CICS Sysplex Optimized Workload Routing

By Dave Williams

The Workload Manager (WLM) feature of CICSPlex System Manager is a useful tool for optimizing system capacity in highly complex environments. This tool analyzes the load capacity and health state of CICS regions intended to be targets of dynamic transaction routing requests and selects the region it considers the most appropriate target. CICS Transaction Server for z/OS Version 4.1 introduces a new feature of CICSPlex SM named Sysplex Optimized Workload Routing. This subfunction of the existing WLM feature was implemented in response to concerns voiced by many large enterprise customers regarding the observed behavior of WLM in CICSplexes that span multiple Logical Partitions (LPARs).

Existing WLM Decision Behavior

Let’s consider the current WLM decision behavior. WLM employs data spaces owned by a CICS Managing Address Space (CMAS) to share cross-region load and status data. Every CMAS owns a single WLM data space it shares with all user CICS regions it directly manages. A user region managed by a CMAS is known to CICSPlex SM as a Local Managed Address Space, or LMAS. During CMAS initialization, that area is verified and formatted with the structures necessary to describe all workload activity related to the CMAS. When the user CICS regions begin routing dynamic traffic, the state of those CICS regions is recorded in this data space. In a CICSplex where the same CMAS manages all dynamic routing CICS regions, all those regions use the same WLM data space to determine workload information required for WLM operation. That means dynamic routing decisions are made based on the most current load data for a potential routing target region. A routing decision is based on an amalgamation of factors:

• How busy is the region?
• How healthy is the region?
• How fast is the link between the router and target?
• Are there outstanding CICSPlex SM Realtime Analysis (RTA) events associated with the workload?
• Are there transaction affinities outstanding to override the dynamic routing decision?

This processing rationale provides equitable dynamic routing decisions when working in a single CMAS environment. However, with workloads being spread across multiple z/OS images, users must configure additional CMASs to manage the user CICS regions on the disparate LPARs. Each WLM data space must maintain a complete set of structures to describe every CICS region in the workload—not just the CICS regions that each CMAS is responsible for, but also those regions in other LPARs managed by other CMASs. This means the WLM data space each CMAS owns must be synchronized periodically with the WLM data spaces owned by other CMASs participating in the same workload. This synchronization occurs every 15 seconds (the heartbeat) from the LMASs to their CMASs, then out to all other CMASs in the workload. CICS provides two dynamic routing exits—named in the System Initialization Table (SIT)—with different behavior characteristics (a brief SIT sketch follows the list below):

• Dynamic Transaction Routing requests may be redirected using the DTRPGM System Initialization parameter. For DTRPGM requests, the call from CICS to the routing region to decide the target region is synchronized with execution of the request at the selected target, and it is followed by a call from CICS upon completion of the dynamic request. This allows the router to increment the task load count before informing CICS of the target region system ID, and also to decrement the count on completion of the request.

• Distributed Routing requests may be redirected using the DSRTPGM System Initialization parameter. For DSRTPGM requests, the call from CICS to the routing region to decide the target is not synchronized with execution at the selected target. Typically, these dynamic requests are asynchronous CICS STARTs, so the router has no notification of when the routed transaction begins or ends. CICSPlex SM has accommodated this anomaly by stipulating that DSRTPGM target regions must have workload specifications associated with them; this transforms the targets into logical routing regions and lets the CPSM routing processes determine they’re being called at the DSRTPGM target level. This allows the task load count to be adjusted at transaction commencement and completion.
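For illustration only (this fragment isn’t from the article), both exits are typically pointed at the CICSPlex SM-supplied routing program when a region participates in CICSPlex SM workload management. A minimal SIT override sketch, assuming the standard program name EYU9XLOP, might look like this:

* SIT overrides (sketch): hand both dynamic transaction routing
* and distributed routing to the CICSPlex SM routing program.
* Replace EYU9XLOP with your own program name if you use a
* custom dynamic routing exit.
DTRPGM=EYU9XLOP
DSRTPGM=EYU9XLOP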

Given that CICSPlex SM routing regions count dynamic transaction throughput in a CICSplex, transactions started locally on the target regions remain unaccounted for by the routing regions until a heartbeat (synchronization) occurs. Actually, the router transaction counts won’t be accurately synchronized until two heartbeats have occurred—the first to increment the count and the second to decrement it again. However, this discrepancy isn’t considered as severe as when different CMASs manage a router and target.

For a multiple CMAS situation, the routing regions will be evaluating status data for a target region as described in its local WLM data space. If that target region is managed by a different CMAS from the one that owns the router, then status data describing that target region may be up to 15 seconds old. For DTRPGM requests, this latency doesn’t have a severe impact. However, for DSRTPGM requests, the effect can be quite dramatic, particularly for high levels of workload throughput. The effect is known as workload batching.

Workload Batching

Workload batching is the term applied to the effect seen in heavy workloads in multiple CMAS environments, where dynamic distributed (DSRTPGM) routing requests are being processed. A target region may be managed by a different CMAS to the routing region, typically because they reside in different LPARs. In that circumstance, the router is using a copy of the descriptor structure to evaluate the target status from the actual structure employed by the target itself. The copied target descriptor being reviewed is synchronized with the actual descriptor in 15-second intervals. Between these 15-second heartbeats, the router will have a less accurate status compared to other potential target regions in the workload and will continue to base its routing decisions on the last known valid data. Eventually, a heartbeat will occur and the data is refreshed. Compared to other regions, the target could now be either extremely busy or completely unexploited. The router reacts to this by appearing to be more aggressive in routing work toward or away from the target. This can cycle the region from a high throughput to a low throughput on this heartbeat boundary. This workload batching state will continue until there’s a genuine lull in the workload throughput, which will settle the batching down until the throughput picks up again. A user watching the task loading across the CICSplex will see some regions running at their MAXTASK limits and being continually fed with dynamically routed traffic while others remain unused. A snapshot 15 seconds later will probably see a reversal of utilization—the busy regions will be idle, and the idle regions will be at their MAXTASK limit. The users most susceptible to these events are those who use MQ triggers to feed transactional data into their CICSplexes, where the trigger regions tend to be managed by different CMASs. Those users would see the greatest benefit of Sysplex optimized workload routing.

Sysplex Optimized Workloads

When CICSPlex SM was originally conceived, a single data space was considered to be a wide enough scope to provide a common data reference point for all regions in the CICSplex. Today, that’s no longer true. The mechanism chosen to broaden the scope of these common points of reference is the z/OS coupling facility. However, the content of the WLM data space hasn’t simply been migrated into the coupling facility; some internal re-engineering was also undertaken. Routing regions are currently responsible for adjusting the target region load counts WLM uses to determine task loads. On every heartbeat, the CICSPlex SM agent in the user CICS region reports its task count to its owning CMAS. The CMAS will then update the load count in the target region descriptor of its WLM data space and broadcast that value to other CMASs participating in workloads associated with the user CICS region. For Sysplex optimized workloads, this is turned around. When a target region runs in optimized mode, the target region is responsible for maintaining the reported task count. CICS does this counting in the transaction manager; the count includes instances of all tasks in the CICS region, not just those that are dynamically routed. This load value for the CICS region, along with its basic health status, is periodically broadcast to the coupling facility where other CICS components can interrogate it. At the CICSPlex SM level, a router will know whether this region status data will be available or not, and will factor this data into its dynamic routing decision, in preference to its original data space references. This means routing regions are reviewing the same status data for a potential target region, regardless of which CMAS manages it. Therefore, the routing region is always using current status data to evaluate a target region rather than status data that could be up to 15 seconds old. In an environment where all routing targets are in a similar health and connectivity state, this means the spread of work across the workload target scope is far more even than in non-optimized mode. However, all the original data space processing remains intact. This is necessary to maintain a seamless fallback mechanism should the coupling facility become unavailable.

Switching Workload to Optimized State

For a workload to operate in a fully optimized state, all regions in the workload must be at the CICS TS V4.1 level or higher, and a CICS region status server must be running in the same z/OS image as each region in the workload in the CICSplex. This is a batch address space running a specialized, properly configured CICS Coupling Facility Data Table (CFDT) server. This server must be managing the same CFDT pool name as that identified in the CICSplex definition (CPLEXDEF) for the CICSplex that will encompass your workload. The default pool name is DFHRSTAT. You may choose a different pool name or even a pool name that already exists in your z/OS configuration. However, a discrete pool name for dedicated region status exploitation is highly recommended. Otherwise, access to user tables in the pool may be degraded by WLM operation, and vice versa. The decision on the region status pool name should be made before any CICS regions in the workload are started. You may change the pool name while a workload is in flight, but it’s not recommended because:

• The change won’t be effective until all regions in the workload are restarted.

• The pool name switch while the workload runs will cause the optimization function to be de-activated for all CICS regions connected to the region status server.

If the pool name is changed in error while the workload runs, then reversing the name to its original value will allow optimization to be re-activated. CICS regions required to run in optimized mode must be enabled for optimization. Setting a number of regions to be optimized is most easily achieved using the CICSPlex SM Web User Interface (WUI) CICS System Definition (CSYSDEF) tabular view and summarizing the list to a single row. Then use the update button to change WLM optimization enablement to enabled. That will enable optimization for all regions. Because you won’t want optimization set for your WUI server regions (and possibly others), you should then run through the updated system definition list and re-disable optimization for your WUI server regions on an individual basis. If some dynamic routing regions are already running, you may activate optimization for in-flight CICS regions using the “MASs known to CICSplex” tabular view in a similar manner to the “CICS system definitions” view. Users don’t need additional configuration actions to optimize their workloads. If you don’t run a region status server, workloads are forced to remain in a non-optimized state.
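The JCL below is a rough illustration, not taken from the article: it assumes the standard CFDT server program DFHCFMN, uses the default region status pool name DFHRSTAT, and leaves the job name, STEPLIB data set, region size and MAXTABLES value as installation-specific placeholders. The pool also needs a matching coupling facility list structure (conventionally DFHCFLS_poolname) defined in your CFRM policy.

//RSSERVER JOB ...
//*
//* Region status (CFDT) server for pool DFHRSTAT - sketch only.
//* STEPLIB must be an APF-authorized CICS TS V4.1 library;
//* POOLNAME must match the pool named in the CPLEXDEF.
//*
//CFSERVER EXEC PGM=DFHCFMN,REGION=40M,TIME=NOLIMIT
//STEPLIB  DD DSN=CICSTS41.CICS.SDFHAUTH,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
POOLNAME=DFHRSTAT
MAXTABLES=100
/*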

Coupling Facility Impact

The coupling facility is impacted in two ways. CICS region status data is broadcast to it by target regions, and that data is subsequently read back in the routing regions when a route decision is made. If CICS were to rebroadcast status data at every change instance, and read it back on every occasion a route decision is made, then the coupling facility impact could be unsustainable. So, caching mechanisms were built in to reduce the number of I/Os to the coupling facility. Two tuning parameters are provided at the CICSplex and CICS system definition levels to adjust coupling facility exploitation. One controls how often the coupling facility is updated with task throughput data; the other controls how long region status data should be cached by a routing region before requesting a refresh:

• Region status server update frequency: UPDATERS

• Region status server read interval: READRS.

A detailed description of these attributes is available in the field help for the CICSplex definition and CICS system definition WUI views. In addition to tuning the general read and update impact to the coupling facility, two other specialized parameters allow further fine-tuning of the workload for heavy and light workload throughput:


• Region status server top tier: TOPRSUPD

• Region status server bottom tier: BOTRSUPD.

If you think you need to deviate from the default settings for these attributes, monitor the performance of your coupling facility and that of WLM throughput capabilities for at least several days after modification. A region status record is 40 bytes. There’s one record for each region in your CICSplex, which is stored in the physical data table named from that CICSplex. This data table will be generated within the named CFDT pool from the CICSplex definition resource table. CICS writes region status data to a file named DFHRSTAT. The definition of DFHRSTAT is automatically generated, and will locate a physical data table named from the parent CICSplex. Therefore, if PLEX1 comprised 100 regions, then the required space in the coupling facility would be 4,000 bytes for a table named PLEX1.
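Restating that sizing rule as a formula (an illustration based only on the numbers above, not additional guidance from the article):

\[
\text{table space} \approx 40\ \text{bytes} \times N_{\text{regions}}, \qquad \text{e.g. } 40 \times 100 = 4{,}000\ \text{bytes for PLEX1}
\]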

Optimized Workload Benefits

If the topology of a CICSplex is such that regions in a workload can be managed by the same CMAS, then the perceived benefit won’t be so great. If most of the dynamic routing traffic flows through the DTRPGM exit, the benefit won’t be particularly high. If target regions in a workload execute a high proportion of non-dynamic throughput, the benefit of implementing an optimized workload is stronger. Benefits of running workloads in optimized state should become clear fairly quickly for a workload comprised of routers and targets managed by different CMASs where the bulk of the dynamic traffic flows through the DSRTPGM exit—especially for transactional input that’s generated by dynamic CICS STARTs. No workload batching should occur. An effect of this will be that the overall workload should run through faster because fewer (if any) routed transactions would be waiting in the queue of a CICS region already at its MAXTASK limit. When your CICSplex extends beyond the scope of your Sysplex, there’s little benefit to optimized workload routing. Typically, this would occur when routers and targets are physically remote from each other. In those situations, the isolated coupling facilities can’t be linked or shared, which effectively nullifies the optimized routing functions.

Determining Workload Optimization State

The easiest way to check the state of workload optimization is to use the active workloads view in the CICSPlex SM WUI. The list view contains a row for each workload active in the CICSplex. A new column added to this view indicates the workload optimization status. Expected values are:

• ACTIVE: All targets and routers are executing in optimized workload state.

• PARTIAL: At least one target and one router are executing in optimized workload mode.

• INACTIVE: The workload isn’t running in optimized state, because either no routing regions in the workload are running in optimized state, no target regions in the workload are running in optimized state, or the workload was designated as being non-optimized.

The easiest way to check the optimization state for a CICS region is to use the routing region or target region views located in the active workloads menu. The optimization status for the region is shown in the list views for both region types. Expected values are:

• ACTIVE: The region is executing in optimized workload state.

• INACTIVE: The region can execute in optimized state, but it’s currently non-optimized. Reasons for this are detailed in the help data for the routing and target region views in the WUI.

• N_A: The region isn’t optimized workload-capable—probably because the region is running a CICS TS version prior to V4.1.

If you have regions that require no optimization capabilities, then set the region status server update frequency value for those regions to 0 to prevent the CICS transaction manager from broadcasting irrelevant region status data to the coupling facility. This would typically include all WUI server regions and any regions assigned a purely routing role. CICS will record the status of a CICS region to the DFHRSTAT CFDT file. The definition for this file is automatically generated when the CICS region status function is initialized. The CICS file definition will be related to a physical CFDT that’s named after the CICSplex name the region belongs to. When defining this file, RS domain will also generate a poolname gathered from the CICSplex Definition (CPLEXDEF) the starting region belongs to. The default poolname is also DFHRSTAT. In any given z/OS image, there must be one region status server per poolname running in that image.


Figure 1: The lifetime of an unoptimized workload of 10,000 started tasks initiated from a single routing region (chart: sum of current tasks per 10-second interval, by region)

For example, if a z/OS image executes CICS regions associated with PLEX1, PLEX2 and PLEX3, which all specify the default poolname, then only a single region status server must be running in that image for the CFDT pool named DFHRSTAT. Any routers needing to examine the status of a remote target will also require a region status server to run in the local z/OS image for the same poolname as that servicing the target regions. If you use the default poolname in the CPLEXDEF of all your CICSplex definitions, you’ll require one region status server per z/OS image.

Case Studies

Figure 1 shows the lifetime of an unoptimized workload of 10,000 started tasks initiated from a single routing region. The workload is dynamically routed across a target scope of 30 target regions. Ten of these regions are managed by the same CMAS as the router, but the other 20 regions are managed by two other CMASs, one on a different LPAR in the Sysplex. Each line in the chart represents the task load in a target region at 10-second intervals. The lines clustered along the bottom of the chart are all local to the router. None of them exceed 10 percent exploitation. All the other regions are remote from the router and are continuously surging to 100 percent utilization and then dropping back to idle. This is workload batching.

Consider Figure 2. This is the same workload, but with Sysplex optimization activated. No workload batching is occurring. None of the target regions are idle or at the MAXTASK limit. The workload is being spread equitably. The locality of the target regions to the router is appropriately reflected; the upper band of target regions is local to the router, and the lower band is remote from it. WLM correctly favors the local target regions over the remote ones until the task load difference for the region locality exceeds approximately 30 percent. However, the most important difference between the optimized and unoptimized workloads is represented by the number of 10-second time intervals across the bottom of each graph. The duration of the unoptimized workload was 16 10-second periods. When the same workload runs in optimized state, the workload completes in 12 periods. In this test case, that was a 25 percent savings in workload throughput time. These figures were measured in ideal circumstances; you’ll need to run your own tests to determine your precise benefits. During testing of other intensive distributed workloads, time savings of more than 50 percent were recorded. The higher the task load throughput, the greater the savings in throughput time. Sysplex optimization appears to be most effective at times of high throughput demand for distributed workloads. These are workloads fed to CICS through asynchronous START commands. Typically, these are from MQSeries trigger transactions or WebSphere Sysplex Distributor. Workloads that originate from synchronous dynamic routing requests—such as those from transaction routes, function ships, etc.—won’t show such an exceptional improvement unless those target regions share transaction traffic with locally initiated tasks. In those circumstances, Sysplex optimization means the router will become aware of the non-dynamic throughput to a target region long before a heartbeat occurs; again, this lets routers make more intelligent routing decisions. If your CICSplex is running at least CICS TS V4.1 and your dynamic workload throughput comprises a high percentage of asynchronous routing requests, you should consider implementing Sysplex optimization.

Summary

The key points to remember are:

• To enable workload optimization, first define and execute a region status server in each MVS image that will execute CICS regions intending to exploit it. When all regions are migrated to CICS TS V4.1, those requiring optimization must be enabled in their CICS system definitions (CSYSDEFs).

• Users may mix and execute CICS TS V4.1 and pre-V4.1 regions in a workload, but full optimization benefits won’t occur until all systems are running CICS TS V4.1.

• One region status server is required per pool name per z/OS image. Don’t start any servers if you don’t want to exploit optimized workloads.

• Don’t adjust the WLM RS domain tuning parameters until you’re certain an adjustment is required. When changes are deemed necessary, make them in gradual increments.

• Look at the new active workload views to monitor status and progress of workloads in target regions. Z

Dave Williams is part of the CICSPlex SM development team working in IBM’s development laboratory in Hursley, U.K., and has been part of the team since 1997. His career began in 1974, when he was one of the computer operators responsible for starting the CICS region at a major international bank. In 1978, his career shifted into applications development, writing CICS applications in Assembler. Since then, he’s been writing CICS code in Assembler and has covered most systems and development roles involving CICS. Email: [email protected]


Figure 2: The Same Workload as Shown in Figure 1, but with Sysplex Optimization Activated (x-axis: 10-second intervals 1 through 12; y-axis: sum of current tasks per target region)

z/Vendor Watch: IBM Prepares the Market for Its New Mainframe Processor
By Mark Lillycrop

By the time you read this, you will have heard the announcement of IBM's next-generation processor. IBM is working hard to differentiate the functionality and capacity of the new range from the current z10 technology in an attempt to increase demand in a recessionary market. Traditionally, new mainframe processors are launched when large users are getting desperate to unleash more MIPS at the top-end of the range. This time, the proverbial fruit might not be hanging quite so low for IBM, and instead users will be urged to take advantage of extra capacity to consolidate non-z/OS workloads onto the mainframe as a way of lowering overall processing costs or reducing power consumption per unit of work. In the meantime, users that are currently negotiating last-minute z10 contracts need to make sure the deal includes the option to upgrade cost-effectively to the new platform. Undoubtedly, the "zNext" will include some attractive features, and users need to ensure they don't deny themselves access to the new models in the future because of over-restrictive contracts agreed to today.

What a Difference 20 Years Makes
Leafing through some archive copies of Insight IBM recently, I came across the October 1990 issue, which covered the launch of the System/390. Now that was a big event! You may recall that IBM introduced approximately 150 new products, including ESA versions of VM and VSE (to bring much-needed MVS functionality to smaller users), peer-to-peer support for NetView, and an innovative connectivity architecture called ESCON. IBM also unveiled a major processor refresh in the shape of the ES/9000. It's hard to imagine now the level of excitement that a major hardware announcement could generate 20 years ago, but I remember well the scores of IT journalists and analysts who were briefed on the new system's feeds and speeds. The zNext is unlikely to draw such large crowds, I suspect, but it will be equally significant in its own way and will set the agenda for mainframe computing into the next decade.

Role-Based Management From CA Technologies
Arguably the biggest announcement at this year's CA World (apart from the name change to CA Technologies to reflect CA's extensive heritage of managing technical solutions on different platforms; it's good to see that techies are back in fashion!) was the company's role-based workspace offering, Mainframe Chorus. The Chorus announcement focuses on the usual problem areas of cost reduction and the shortage of mainframe skills, but the role-based approach (for example, providing modules of information that are specifically relevant to the storage manager or database administrator) offers a somewhat different perspective of system management, and one that reflects the growing business interest in skills frameworks and definitions. The first role for the workspace, CA Mainframe Chorus for DB2 Database Management, is now in beta, but we're likely to see further roles emerging fast to support CA Technologies' new vision.

Syncsort Announces Data Integration Solution
Focused primarily on high-speed data integration, Syncsort recently announced additional professional services to enhance its DMExpress 6 tool. This will help organizations accelerate their mainframe application modernization initiatives by translating and transforming huge amounts of data between mainframe and open systems formats, a frequent obstacle for those tackling legacy modernization. In a further announcement, Syncsort and BMC Software announced an extension to their OEM agreement, through which BMC will deploy the Syncsort technology to assure that customer data needed for business-critical applications has integrity and is available to service the business.

NEON zPrime Gathers Momentum
As the legal battles between IBM and NEON Enterprise Software continue unresolved, the Texas-based software company launched a new release of its tool for offloading traditional workloads to specialty processors. With zPrime 2.1, up to 90 percent of CICS workloads can reportedly be offloaded to specialty processors, and virtually all DB2 workloads can run on System z Integrated Information Processors (zIIPs) and z Application Assist Processors (zAAPs). The new release includes features that allow system administrators to fine-tune the utilization levels of the specialty processors. NEON says it has more than 50 zPrime customers, with eight set to take version 2.1 into production. With the potential cost savings involved, this level of interest isn't too surprising, though it seems likely that IBM will use its next generation of processors to close some of the loopholes that currently allow users to exploit zPrime. Z

Mark Lillycrop is CEO of Arcati Research, a U.K.-based analyst company focusing on enterprise data center management and security issues. He was formerly director of research for mainframe information provider Xephon, and editor of the Insight IS journal for 15 years. Email: [email protected]; Website: www.arcati.com/ml.html


Storage Performance Management
By Gilbert Houtekamer, Ph.D., and Els Das

It's important to pay attention to your storage configuration. Careful planning in evenly distributing data and workloads will yield better response times, more resilient operations, and higher throughput from your storage hardware. This article reveals the hidden influence of balance, which resources it impacts, and what balancing techniques you can use to improve performance and throughput without upgrading your hardware. We'll discuss how storage tuning is uniquely different from processor tuning, and we'll show that significant throughput and response time improvements may be possible with just a few well-chosen optimizations.

Having an unbalanced storage system can affect performance and cost. When hardware resources aren't evenly loaded during peak periods, delays will occur even though the resources are more than sufficient to handle the workloads. The consequence could be that hardware is being replaced or upgraded unnecessarily, which is obviously a tremendous waste of financial and other resources. Unfortunately, this often happens because of the low visibility of the most important metrics for the internal storage system components. If you only look at the z/OS side of I/O, these imbalances can be hard to find, resolve, and prevent.

The mainframe performance perspective has always been that Workload Manager (WLM) optimizes the throughput in the z/OS environment by prioritizing work and assigning resources. This load balancing works well for identical processors in a complex. However, for storage, it's a different story. The kind of optimization WLM performs simply isn't possible for I/O since the location of the data is fixed. WLM can only manage the components that are shared, such as the channels and Parallel Access Volume (PAV) aliases. The internal disk storage system resources are mostly out of WLM's control, and utilization levels of the internal components of the storage system hardware are unknown to z/OS and WLM, so work can't be directed to optimize balance. Let's review how the level of balance on the major internal components of a disk storage controller influences performance and throughput, and how to create the necessary visibility to detect imbalances.

Front-End
In a z/OS environment, front-end balance relates to the FICON channels and adapter cards. Most installations maintain a good balance between the FICON channels. z/OS will nicely balance the load between the channels in one path group and, with multiple path groups, most installations have ways to ensure each path group does about the same amount of work.

The less visible components here are the host adapter boards. Multiple FICON ports are attached to one host adapter board, and the host adapter boards share logic, processor, and bandwidth resources between ports. So, it's important to carefully design the layout of the port-to-host adapter board configuration. Link statistics provide a good way to track imbalance. In the configuration shown in Figure 1, the load on each of the FICON channels is the same, but the links aren't evenly distributed over the host adapter cards. The resulting differences in load on the host adapter cards negatively influence the response times for the links on the busiest cards (see Figure 1).
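As a rough illustration of why the port-to-host adapter layout matters, the sketch below (Python, with invented link names, rates, and card assignments) sums per-link I/O rates by host adapter card; even with perfectly balanced channels, a skewed link-to-card mapping leaves one card carrying most of the load.

# A minimal sketch, not vendor tooling: given per-link I/O rates and an assumed
# mapping of FICON links to host adapter cards, aggregate the load per card to
# spot the imbalance described above. All names and numbers are made up.
from collections import defaultdict

link_io_rate = {"CHP40": 5000, "CHP41": 5000, "CHP42": 5000, "CHP43": 5000}
link_to_adapter = {"CHP40": "card1", "CHP41": "card1", "CHP42": "card1", "CHP43": "card2"}

adapter_load = defaultdict(int)
for link, rate in link_io_rate.items():
    adapter_load[link_to_adapter[link]] += rate

for card, load in sorted(adapter_load.items()):
    print(f"{card}: {load} IO/s")   # card1 carries three times the load of card2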

RAID Parity Groups
Redundant Array of Inexpensive Disks (RAID) parity groups contain the actual data the applications want to access. The throughput of a storage system largely depends on the throughput of the RAID parity groups. A common misconception is that a disk storage system with a large amount of cache hardly uses its disks because it does most of its I/O operations from cache or to cache. Although it's true that under normal circumstances virtually all operations occur via cache, many of those operations do cause disk activity in the background.



The only operations that don't cause a disk access are the random read hits; all others do access the disks at some point. For instance, sequential reads, even though they're mostly cache hits, must always be read from disk. As for writes, all writes are done to cache, but they need to be written to disk sooner or later, too. Moreover, for many of the current RAID schemes, a single write on the front-end causes more than one disk I/O on the back-end. For RAID 1 or RAID 10, a write takes two disk operations since all data is mirrored. For RAID 5, a random write takes four operations; for RAID 6, it even takes six operations because of the more complicated way parity updates work for these RAID schemes. Sequential writes are much more efficient on RAID 5 and RAID 6 than random writes, but they will still generate more than one back-end I/O per front-end I/O.

The key observation is that the back-end I/O rate is important and isn't easily visible from the front-end I/O rate. Back-end peaks will likely occur at a totally different time from the front-end peak, but possibly not even that much lower in terms of number of I/Os. Actual workloads differ significantly between installations; Figures 2 and 3 show some examples of back-end I/Os vs. front-end I/Os.

How does this all relate to balance and performance potential? The back-end operations are done to a particular RAID parity group. If active volumes are placed together on a single RAID parity group, whereas other RAID parity groups contain only inactive volumes, this busiest RAID group may run out of steam before any of the others do. As soon as this busiest RAID parity group reaches its maximum throughput limit, it will start responding slowly, and all work to the other volumes on that RAID group will suffer, too. Likewise, an application that accesses one volume on an overloaded RAID group can encounter major performance issues even though most of the volumes it accesses are still fine. Therefore, having even one highly busy RAID group may cause degraded application response times or longer batch periods. Ultimately, that may affect only a few batch jobs or, for example, it could cause a bank's Automated Teller Machines (ATMs) to time out.

Therefore, the overall throughput potential of a disk storage system greatly depends on the balance you can achieve between the parity groups (see Figure 4). Both charts represent the same workload on the same hardware, but the balanced layout on the right shows a peak of 540 back-end I/Os instead of the 900 I/Os for the busiest RAID array on the left-hand side. This means the box could achieve a 66 percent higher throughput if everything was balanced evenly. The difference between the left and right chart is that the left-hand chart shows the current situation and the right-hand chart shows the situation that would be achieved if the volumes had been placed for the best balance possible.

A heat map is a useful tool for viewing the workload at the parity group level (see Figure 5). You will need a software package to determine the activity for each parity group for a prolonged period, and, with this, you can plot the activity over time. In a heat map, a hotter color (orange to red) indicates an overloaded parity group for a particular time.
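The write-penalty arithmetic above can be sketched in a few lines. The penalty factors follow the article (two back-end operations per random write for RAID 1/10, four for RAID 5, six for RAID 6); the workload numbers are invented for illustration, and sequential writes, which are cheaper, are ignored in this rough estimate.

# A minimal sketch of the back-end I/O arithmetic described above.
RANDOM_WRITE_PENALTY = {"RAID1": 2, "RAID10": 2, "RAID5": 4, "RAID6": 6}

def backend_io_rate(read_rate, read_hit_ratio, write_rate, raid="RAID5"):
    # Rough back-end disk I/O rate (I/Os per second) for one parity group.
    # Only read hits avoid the disks; read misses go to disk, and every
    # front-end random write eventually costs several back-end operations.
    read_misses = read_rate * (1.0 - read_hit_ratio)
    return read_misses + write_rate * RANDOM_WRITE_PENALTY[raid]

# Example: 3,000 reads/sec at a 90 percent hit ratio plus 1,000 random
# writes/sec on RAID 5 already produce more back-end than front-end I/O.
front_end = 3000 + 1000
back_end = backend_io_rate(3000, 0.90, 1000, "RAID5")
print(f"front-end: {front_end} IO/s, estimated back-end: {back_end:.0f} IO/s")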

Cache Resources
It may not be intuitively clear how imbalance can impact cache usage. Let's consider how. Storage systems are provided with large amounts of cache memory to achieve a high number of read hits.

However, cache isn't just used for reads. Writes are also done to cache, and they even take priority over reads. Writes will tend to fill up the cache if they can't be de-staged, causing a lower ratio of read hits than would be expected with the configured cache memory. So, despite large sizes, cache memory available for reads can be significantly reduced when there are bottlenecks in the storage configuration that delay the de-staging of writes from cache to disk. Ultimately, a "FW bypass" condition may occur, where a write operation is forced to wait until de-staging occurs before it's acknowledged as completed to the host. Since a FICON channel can send random write data much more quickly than a spinning disk can accept it, it's quite possible to create a workload that will cause the writes to fill up the cache. In practice, those problems are most likely to occur in combination with FlashCopy or shadow image technologies that require additional back-end operations for each new write.


Figure 1: Response Time per Link for One Storage System. Note: The response times per link aren't equal because the links aren't evenly distributed over the host cards in this configuration.

Figure 2: Front-End I/Os vs. Back-End I/Os. Note: The peak occurs around 10 a.m. for the front-end rate and at midnight for the back-end rate.


While you may view decreasing read hit ratios and increasing FW bypass rates as a sign that there's no longer enough cache, the real reason is that one or more of the back-end arrays can't handle the de-staging load. Usually, it's only a small number of arrays that are in trouble, so the easiest, cheapest, and most effective solution is to simply make sure random write activity is well-spread across arrays.

For replicated environments, you must take the back-end of the secondary storage system into account, too. Any write done on the primary system must also be done on the secondary. The secondary system therefore needs to be able to de-stage the requests in time to prevent the secondary cache from filling up with writes. If the secondary can't keep up, new writes from the primary will be delayed and they will start to fill the cache on the primary side. This is why you must be particularly careful when deciding whether to select a more economical disk type on the secondary system compared to the primary.

Techniques to Optimize Throughput
There are several techniques to achieve a better balanced system with more throughput. Let's review the major ones:

• Configuration of the storage system hardware: Spread logical volumes across more physical disks in one or more RAID parity groups. The larger the group, the more likely it is that the work is evenly spread. That's why a RAID 10 configuration with eight disks in a parity group will perform better than a RAID 1 configuration, why 28D+4P provides a better balance than 7D+P, and why storage pool striping works well.

• Design of the SMS configuration: Use a storage configuration with "horizontal storage pools" across both parity groups and Library Control Units (LCUs). This way, z/OS and Data Facility System Managed Storage (DFSMS) load balancing tends to spread work across all parity groups.

• DFSMS features: Use software striping for highly active data sets so the work is spread over multiple logical volumes in a storage group, and most likely over multiple physical disks. With just four stripes, you already have four times as many physical drives working on the I/Os, and the peaks are going to be much lower. Note that striping can be just as effective for a random access data set as for sequential access.

• Tuning: Actively tune the configuration by moving volumes away from "hot" RAID parity groups. Most installations do this with a manual review process, but this is a difficult task because of the many factors that must be considered. Existing software can recommend which volume moves are the best ones if you want to achieve and maintain a balanced configuration (a minimal sketch of the idea follows this list).

• Smart layout: When moving to new hardware, it's important to make the layout as balanced as possible. For instance, distribute all FICON links and remote copy links as evenly as possible over all the host adapter cards, and spread the volumes over the RAID parity groups in a way that optimizes the workload balance. Again, it's a tedious, difficult task to do this manually, but software can be used to find the optimal mapping of volumes over RAID parity groups.
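The sketch referenced in the tuning bullet is below: a simple greedy placement (Python; the volume names and rates are invented) that assigns the busiest volumes first to the currently least-loaded parity group. Real balancing software weighs many more factors, interval by interval, but the basic idea of flattening the per-group peak is the same.

# A minimal sketch with assumed data, not a product algorithm: place volumes,
# busiest first, on the least-loaded RAID parity group to flatten the peaks.

def balanced_layout(volume_load, num_groups):
    """volume_load: dict of volume name -> back-end I/O rate."""
    groups = {g: [] for g in range(num_groups)}
    load = {g: 0 for g in range(num_groups)}
    for vol, rate in sorted(volume_load.items(), key=lambda kv: kv[1], reverse=True):
        target = min(load, key=load.get)       # least-loaded parity group so far
        groups[target].append(vol)
        load[target] += rate
    return groups, load

volumes = {"PROD01": 900, "PROD02": 850, "TEST01": 120, "TEST02": 100,
           "HIST01": 60, "HIST02": 40}
layout, load = balanced_layout(volumes, num_groups=3)
for g in layout:
    print(f"parity group {g}: {layout[g]} -> {load[g]} IO/s")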


Figure 3: Front-End I/Os vs. Back-End I/Os for a Different Configuration. Note: The back-end peak is a little over 50,000 I/Os, for an interval where there are only 35,000 front-end I/Os.

Figure 4: Front-End I/Os vs. Back-End I/Os. Note: The peak occurs around 10 a.m. for the front-end.



Using a combination of these techniques, you will be able to create a well-balanced system and get more throughput and performance from your system without much effort. You may even be able to use higher-density disks or move from RAID 10 to RAID 5 without a performance penalty.

Summary
The way a storage configuration is balanced greatly influences its throughput and responsiveness. If there's an imbalance between the components, delays can occur even though the hardware itself would be capable of handling the workloads. Using smart storage performance management techniques to achieve a well-balanced system can yield impressive results in both throughput and response times. With the right balancing efforts and software tools, storage hardware purchases may be postponed, saving a lot of money. If you manage storage performance wisely, it will directly translate into increased user satisfaction and lower hardware costs. Z

Dr. Gilbert Houtekamer and Els Das are co-owners and managing directors of IntelliMagic. Gilbert holds a Ph.D. from the Delft University of Technology. He has more than 20 years of experience in I/O performance analysis and has written numerous publications on this topic, including the book MVS I/O Subsystems, which he co-authored with Pat Artis. Email: [email protected]

Els holds a master's degree in Mathematics from the University of Amsterdam. She has worked in the IT business for more than 10 years, specializing in storage performance since 1999. Prior to founding IntelliMagic, she was the development manager of the performance software at Consul. Email: [email protected]


Figure 5: Heat Map Showing Various Singular Red Spots. Note: The red spots indicate that a better balance might be possible.



The New System z: The Business Perspective
By Alan Radding

The new System z, called zEnterprise, will thrill the mainframe techies with substantially more power and speed. It also will delight mainframe data center managers as it propels the mainframe to the center of the enterprise computing universe by enabling them to manage not only the mainframe environment but also other IBM platforms: AIX Power blades, x86 Linux blades, and even, potentially, Windows running on x86 blades.

But what's in it for the business? Three advantages come to mind: 1) cost savings, 2) better service delivery performance/risk management, and 3) greater flexibility. The CFO, COO, and other C-level execs may not appreciate or even understand arguments about multiple processor cores or super-scalar system design, but they will understand saving money, better service delivery performance/risk management, and greater flexibility. In business terms, that translates into information systems that are cost-efficient, always available and able to deliver top performance, can accommodate the wide breadth of enterprise workloads, and can scale up quickly and efficiently as the business grows. That's what the new zEnterprise delivers.


To underscore the point that the new machine isn't merely an upgrade of the existing System z10, IBM deliberately started a fresh naming sequence with the zEnterprise; the first model is the System z196. To that end, IBM calculates the zEnterprise can deliver a 40 percent performance improvement over the z10 for z/OS and a 60 percent improvement for Linux on System z. As far as typical transactions go, IBM claims the new mainframe costs 67 percent less per ATM transaction and 44 percent less per credit card transaction compared to a distributed environment. Although the z10 will continue for some time, IBM clearly is indicating the zEnterprise represents the mainframe future.

Cost Advantage
Cost savings represent the first of the business advantages of the zEnterprise. IBM claims the new machine will lower the cost of acquisition, reduce storage costs, and reduce labor costs. IBM estimates Linux on zEnterprise will cost less than $1,000 per virtual server per year and reduce energy consumption by up to 80 percent. Of these, reductions in labor are the most important to the business by far.

Top managers long ago should have figured out that the cost of acquiring IT systems, whether buying boatloads of distributed x86 servers or a centralized IBM mainframe, pales in comparison to the cost of the people required to deploy and support that technology over its productive life. The cheapest PC server, of the type a large enterprise can acquire in volume for a few hundred dollars apiece, still costs many thousands of dollars individually to support over its lifetime. And most of that cost reflects the cost of the myriad system and network administrators, help desk personnel, and others required to attend to the needs of those systems, their applications, and their users. By widely deploying automated management tools, the organization can reduce the number of people it needs. Still, most distributed, disparate systems invariably are more labor-intensive and less amenable to automated management than centralized systems such as the mainframe.

Mainframe systems come with a comparatively high cost of acquisition, despite programs by IBM such as the System z Solution Editions or IBM's Open Infrastructure Offering (OIO) that do somewhat effectively lower the cost of acquisition. However, fewer people, usually far fewer, can more efficiently administer a mainframe environment supporting an equal or greater amount of work than a distributed environment. And it's the fully loaded costs of all those people, with their salaries and benefits increasing year after year, that represent the biggest recurring hit to the balance sheet, not the initial technology acquisition.

With the zEnterprise, this management efficiency can be extended to other platforms, specifically IBM's Power and x86 blade systems. The zEnterprise becomes what IBM refers to as the system of systems, the result of a new Unified Resource Manager included with the new machine. An administrator working through the Unified Resource Manager, more aptly considered an integrated virtual hardware platform manager, can manage the server, storage, and network resources running on the System z, Power, and x86 components as System z virtual platform resources, effectively increasing each administrator's span of control while reducing the number of administrators required.

Add to these the zEnterprise labor savings resulting from increased developer productivity, if using the newest Rational toolset, and the cost picture looks that much better still. Finally, add in the value of the energy savings associated with the zEnterprise, as much as 80 percent for Linux on System z, according to IBM, and you can construct a compelling business case for the zEnterprise compared to distributed systems designed to support the same workloads with equal levels of reliability, availability, scalability, manageability, and security.

Service Delivery/Risk Management Advantage
Better service delivery performance is the second business advantage the zEnterprise offers. Better service delivery has several dimensions. One is risk management; the zEnterprise provides better service delivery reliability and availability, which reduces risk. This is about avoiding the dreaded system-is-down complaint. IBM has long boasted of the mainframe's 99.999 percent uptime. Even the best distributed systems have to be specially engineered to deliver near this level of reliability, and the results are still questionable. With the zEnterprise, businesses will demand the same level of mainframe reliability or more.

Maximum uptime allows the organization's people to remain productive, instead of waiting for systems to come back up. More important, customers wanting to place orders or requiring services will find the organization up and running. In an Internet-connected world where competitors are only a click away, top-notch service delivery performance can loom large indeed.

Flexibility Advantage
Flexibility results from giving the organization choices. The zEnterprise enables more choices, and surprising ones. To begin, IBM has been optimizing its core application suites for the new mainframe as well as for its recently upgraded POWER7 systems. Combined with the Unified Resource Manager, organizations can choose where they want to run key applications and still manage the platforms efficiently through the zEnterprise. And with x86 blades in the mix, it's possible Windows applications can be included; IBM didn't rule it out.

Today, Linux and Cognos run on the z10 through an Integrated Facility for Linux (IFL). With the zEnterprise, that can continue. However, the organization can now choose to run Cognos on the zEnterprise right alongside its DB2 data, which will give it a significant performance boost. Or, it can decide to leave its Linux applications on an IFL or x86 blades managed through the Unified Resource Manager, giving the business more options. Whether it's enterprise data serving, business intelligence and business analytics, or Lotus-based collaboration, organizations can choose where in the now expanded and optimized mainframe ecosystem they want these applications to run and still gain the benefits of the zEnterprise. The same goes for WebSphere and Java, which also have been optimized for the zEnterprise. In the end, zEnterprise enables choices: consolidate more or distribute some, co-locate applications and data, or distribute, while always remaining under zEnterprise management.

The new zEnterprise isn't perfect. IBM hasn't added new workloads that can't already run on the z10 today. Its cross-platform capabilities are limited to select IBM platforms and blades. In some ways, it even complicates the decision of where to run particular applications and workloads, which now will entail an assessment of the various platform trade-offs. But when it comes to business concerns such as cost, assured service delivery, risk management, and flexibility, the new machine still comes in ahead. Z

Alan Radding is research director at Independent Assessment, an independent technology and business research and analysis group based in Newton, MA. Email: [email protected]; Website: www.independentassessment.com

Proactive IT Systems Management: The Time Is Now
By G. Jay Lipovich

For the last two decades, we've considered and discussed proactive systems management as a desired state for IT. Despite the rhetoric, it appears not many shops have completed the journey. Today, critical business services rely on mainframe systems and applications being available and performing non-stop to Service Level Agreements (SLAs). The failure of one of these crucial business services can have catastrophic effects on the enterprise, from decreased profit to outright cessation of business. Given these stakes, the requirement for proactive systems management transcends IT operating methodologies to become a business imperative. This article explores proactive systems management and its six stages of maturity; it also describes the processes and tooling required to implement a proactive operating rhythm. It concludes with some best practice "first steps" and the potential benefits of implementing proactive systems management.

Proactive IT Systems Management
Proactive systems management can be described as managing the service delivery performance of business applications so performance problems are identified and remediated either before they have an impact, or before the impact has an adverse business effect. It involves having in place a specific set of IT processes, tooling, and skills. In most IT organizations, getting to proactive systems management will require a change in one or more of these three components. Getting to proactive systems management isn't typically a binary operation. Rather, it's a phased effort, with the amount of change required depending on the stage, or level, of proactive systems management maturity at which the organization currently operates. Proactive systems management might be said to have six levels:

• Level 1: Wait for user complaints of service problems and react.
• Level 2: React to rule-of-thumb alerts, diagnose via "war-room" approaches, and instigate silo-based investigation.
• Level 3: Automatically discover violations and enable more rapid remediation, eliminating most war-room convocations.
• Level 4: Automatically discover violations and advise stakeholders of potential impact before remediation efforts.
• Level 5: Automatically discover violations and automatically mitigate impact.
• Level 6: Automatically determine future (impending) violations and automatically remediate before impact.

This synopsis of a more extensive topic enables an understanding of some of the processes and tooling that would be required to move from the lower maturity levels to the higher ones.

Processes Required

• Cross-silo, cross-platform systems management discipline. In addition to a process requirement, this may extend into organizational considerations, too. Being proactive in one silo won't result in proactive systems management of the business services if a problem occurring in another technology the business service uses is managed in a reactive manner. Ideally, organizations that want to reap the benefits of proactive management should be committed to proactive efforts across all appropriate technologies. It's possible to start raising the level of proactive maturity in one platform, such as the mainframe, and then extend the processes, motions, and lessons learned to other platforms.



• Business service constructs for use in implementing proactive management. Proactively managing service commitments for business services isn't possible if there's no understanding of the business service. Focusing proactive efforts on a CICS region, a DB2 database, or a specific Logical Partition (LPAR) won't necessarily translate into proactive management of the business service. Without a business service construct for systems management, you may waste time in discovery and remediation efforts in one silo and even make the performance situation worse.

• Encapsulate "tribal knowledge" housed in individual technicians. Although many organizations are building run books of prescribed actions for a variety of operational situations, these may not focus on some of the deep technical knowledge technical experts possess. There are several benefits of doing this: 1) the "really smart people" know how to identify and resolve many problems in their sphere of knowledge and they typically take the shortest path to service restoration; 2) their expertise, reasoning, and systems management processes may be applicable in other areas outside their sphere of responsibility; 3) when their knowledge is captured via automation, it reduces the time spent on repetitive triage tasks and frees them up for higher-value activities; and 4) capturing their knowledge mitigates risk to the organization and enhances IT governance processes.

Tooling Required
One of the biggest barriers to increasing proactive maturity lies in the tooling technicians have at their disposal for systems management. IT must recognize that yesterday's status quo systems management may be insufficient for proactive systems management. The tools and processes implemented and used by the same technicians for 10 to 20 years may have sufficed so far, but that doesn't mean the organization doesn't need proactive management. IT should evaluate its current level of proactive maturity and the desired level of maturity, and then assess what's needed to reach the desired maturity level and the associated business benefits to be gained.

Getting to proactive management requires a degree of innovation: in the processes, in the organization, and especially in the tooling used to support systems management. The following is a list of some of the key requirements for proactivity in systems management tooling:

• Solution breadth and depth. With business services executing across platforms and technology silos, systems management solutions must provide complementary breadth. Cross-platform solutions provide a common framework for management; common terminology; and common alerting, drill-down analysis, and resolution paradigms, such as single-pane-of-glass displays.

• Constructs for business services that flow across mainframe silos (CICS, DB2, IMS, WebSphere Application Server, WebSphere MQ, etc.) and across platforms. If IT defines a process for managing cross-platform business services, there's a need for tools that can synthesize the business constructs automatically and display triage and root cause analysis data within silos from a starting point of the business service transaction.

• Thresholds appropriate for proactive operations. Most threshold alerting is based on either rules of thumb or experiential thresholds. Both have resulted in either so many alerts that they're ignored (and therefore have no value) or so few alerts that performance problems with real business impact aren't raised as alerts until the business owner calls. Thresholds should be automatically determined and actively maintained and updated as business cycles and processing patterns change (a minimal sketch of one such approach follows this list). Without intelligence in the threshold process, it's difficult to raise systems management to the higher levels of proactive maturity.

• Effective, intelligent alarm management. Complex business applications must be able to alarm on a wide variety of measures and conditions from multiple technologies. Alarm management should be capable of dealing with the business service complexity without being complex. It also must differentiate single occurrences of exceptions from more serious multiple occurrences within specific time cycles.

• Triage and support for root cause analysis. When tooling for the first four items in this list is in place, the next step is to enable drill-down on the alert to find solutions as rapidly as possible. To accomplish this, solution-led, drill-down analysis and problem solving across multiple technologies and platforms must be provided.

• Intelligence and advice. The complexity and scope of business services and the underlying IT infrastructure interfere with the application of technicians' experiential knowledge. Intelligence and advice in systems management solutions can deal with the complexity, while also addressing some organizations' concerns about bridging the generation gap to maintain deep technical knowledge. Highly experienced technicians will find it difficult to achieve proactive maturity without the assistance of advisor technology; less experienced ones will find it impossible. Intelligent advisor technology can quickly analyze large numbers of variables, identify problem sources, and recommend remedial or corrective actions.

• Predictive intelligence. To achieve the highest level of maturity, the systems management tooling must anticipate problems before they arise. This requires observing Key Performance Indicators (KPIs), charting their directions, correlating changes with other KPIs, and predicting when one of the KPIs will reach a critical point that could impact services.

• Automation. While all of these capabilities can enable more effective systems management, without automation there can be little truly proactive activity. Automation provides alerts that lead to resolutions and reduces manual errors; it should be sophisticated enough to act on minute technical indicators while wrapping sophisticated and complex actions in simple, uncomplicated implementations. Automation must be able to take actions across a broad range of objects and conditions and reflect a linkage to service impact models, along with an automated feedback mechanism to those models and to other processes, such as service desks. Cross-platform business services will also demand cross-platform automation, as the operation, start-up, and shutdown of various platforms will be required as part of proactively managing the business service.
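As a small illustration of the self-maintaining thresholds called for in the list above, the following sketch derives an alert level from a rolling baseline of a KPI rather than a fixed rule of thumb. The window size, the three-sigma band, and the sample values are assumptions, not a recommendation from any particular product.

# A minimal sketch of an adaptive threshold: alert when a KPI sample exceeds a
# rolling baseline plus a multiple of its recent variability, so the threshold
# keeps adjusting as processing patterns change.
from collections import deque
from statistics import mean, stdev

class AdaptiveThreshold:
    def __init__(self, window=60, sigmas=3.0):
        self.history = deque(maxlen=window)   # most recent KPI samples
        self.sigmas = sigmas

    def update(self, value):
        """Return True if the new sample breaches the current baseline."""
        breach = False
        if len(self.history) >= 10:           # need some history before alerting
            baseline = mean(self.history)
            spread = stdev(self.history)
            breach = value > baseline + self.sigmas * max(spread, 1e-9)
        self.history.append(value)            # the threshold keeps adapting
        return breach

monitor = AdaptiveThreshold()
samples = [100, 104, 98, 101, 103, 99, 102, 97, 100, 105, 101, 180]
for t, kpi in enumerate(samples):
    if monitor.update(kpi):
        print(f"interval {t}: KPI {kpi} breached the adaptive threshold")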

Some First Steps
Arriving at proactive systems management will take organizational commitment for process change, organizational change, and possibly tooling changes, none of which happen quickly. As a starting point, consider implementing some of the following steps to increase the level of proactivity in your operations:

• Identify key business applications and the infrastructure pieces that support them.

• Know what conditions look like (resources used, work volume, etc.) when these key applications are meeting their SLAs.

• Set KPI thresholds based on compliant conditions that align with the applications and infrastructure.

• Use alarms to focus on thresholds that are being tripped.

• Set up pre-arranged automation for handling the conditions identified as problem situations.

• Integrate the alarming and event notification into Business Service Management (BSM) processes to access service catalogs, Configuration Management Database (CMDB)/Configuration Management System (CMS), event management, and service desks so the mainframe fully participates in any BSM processes.

• Evaluate the increased effectiveness of systems management efforts.

The ease with which your organization can take these steps will depend greatly on the nature of your systems management processes and tooling.

Conclusion
When IT can meld the processes and tooling to increase proactive management maturity, payoffs to the enterprise can be significant. These are benefits that have real value to the business and to the management of IT itself. They include:

• Reducing the business impact of infrastructure and application issues
• Lowering the cost of the mainframe
• Reducing manual errors
• Minimizing firefighting and war-room time for IT staff, giving them more time to increase IT value to the business
• Creating a transition to a new generation of technicians
• Mitigating risk to both the business and IT. Z

G. Jay Lipovich has more than 35 years of experience in the design and development of strategies and solutions for more effective infrastructure and data management, including design strategy and performance testing for a mainframe hardware vendor; and design and development of strategies and solutions for infrastructure, data management, and capacity planning. He has published numerous articles in trade journals and is a frequent presenter at Computer Measurement Group (CMG) conferences and seminars around the world. He has been a guest lecturer at the U.S. Department of Defense Computer Institute and is ITIL Foundation certified. Email: [email protected]


How Database Design Affects DB2 for z/OS System Performance
By Lockwood Lyon

A previous article ("DB2 UDB for z/OS Version 8: Performance Strategies," z/Journal, April/May 2006) described one common way DB2 DBAs and systems professionals tune DB2 for z/OS on a subsystem level. We called this resource constraint analysis because it's based on an analysis of what resources are currently constrained and how that can be mitigated. This led to a DB2 systems tuning strategy to assist the DBA and systems programmer in developing the basic instincts necessary for supporting the IT enterprise.

This article will examine the other side of the equation. How can we design databases so they may be accessed in an efficient manner? Are there designs that permit the DBA more flexibility when tuning is being done? We will pay specific attention to database design schemes and how they affect the way the DBA or systems professional tunes DB2 for z/OS subsystems. A follow-up article will take a look at another area: how application design affects DB2 system performance.

Resource Constraint Tuning
Consider your DB2 system (or data-sharing group) along with your applications as a single entity that consumes resources. Applications (including the DB2 subsystem code) use various combinations of resources during execution. These resources may include:

• CPU cycles
• Central storage (memory)
• Disk storage (I/O)
• Network (message travel time)
• Object access (locking, thread wait, enqueues)
• Application throughput (elapsed times, transactions per unit time).

A resource-based approach to DB2 system tuning concentrates on two aspects of the system:

• Resources as bottlenecks or constraints
• The way resource usage affects key system characteristics, including recoverability, performance, availability, security, and features and functions.

A resource-based approach uses the following procedure:

1. Identify resource constraints
2. Identify coincident resource availability
3. Note critical system characteristics and issues
4. Analyze 1, 2, and 3, looking for trade-offs.

By trade-offs we mean you'll be looking to reduce or eliminate a resource constraint by re-directing the application to use a different resource. For example, you determine that you have a CPU constraint during your nightly batch window. Upon further analysis you note that many applications access a large partitioned table space using CPU parallelism. You can reduce the CPU constraint by inhibiting parallelism for those applications, although the applications may run longer.

Resource constraint analysis can occur at the object, application, or system levels. For example:

• Analyze critical tables and indexes. Consider them to be constraints. Develop ways of using excess CPU, DASD, or other resources to reduce contention.
• Analyze critical applications. Consider them to be constraints. Develop methods of using excess resources to increase application throughput.
• Analyze the DB2 subsystem. Determine the resource bottlenecks. Develop techniques for using excess resources to relieve the constraints.

The Impact of Database Design
Database design will usually affect I/O, object availability, and application throughput. Figure 1 lists these resource constraints, with common root issues and the typical fixes or workarounds implemented. In Figure 1, notice the presence of partitioning in the typical fixes column. Partitioning schemes greatly influence the options a DBA has for tuning. Some schemes may prevent certain tuning tactics, while others are more accommodating.

The way a table and its indexes are partitioned affects more than just resource constraints. Partitioning usually determines backup, reorg, and runstats frequency and the data purge strategy; it can also be used as a performance tool to distribute activity across volatile pagesets to avoid hot spots.

What specific measures can we take during database design or application tuning to minimize I/O, availability, and throughput constraints? The biggest issue will be data distribution and how activity is distributed across the pagesets. There are many different models and examples of this; we'll consider only a few, but the principles are valid across most models.

Insert Hot Spots
Sometimes activity concentrated at a single point is a good thing. Consider a table with a sequentially ascending clustering key that experiences frequent row inserts. All new rows can be added at or near the physical end of the table space without interfering with each other, since DB2 prefers to ensure that a row insert doesn't wait (or timeout) rather than insist on placing the row on the "best" page. If most table activity is these inserts, then concentrating activity at one point may be acceptable.

When we have a high insert rate of rows that are evenly distributed across a table, we may encounter resource problems. If every row is inserted on a page different from the last, then each row requires a getpage and a lock (usually a page lock). As the number of getpages and locks per transaction increases, the number of physical I/Os tends to increase. In addition, more pages are locked. This can lead to excessive synchronous I/Os, excessive page locking, long transaction execution times, and an increased chance of deadlock or timeout. This can be somewhat annoying in an online environment, as the victim of the timeout or deadlock is rolled back by DB2, requiring either failing the transaction or re-executing it.

Several database designs address various aspects of this situation. The most common is a "rotating" partitioning scheme that focuses new row inserts in the last logical partition, coupled with purge logic to remove or archive old data from earlier partitions. At some point, you can implement a partition rotation process using SQL similar to the following:

ALTER TABLE <table-name> ROTATE PARTITION

This assumes implementation of table-based partitioning that provides for reuse of purged partitions. Such a scheme must be organized and coordinated with backup and recovery processes. Rotation of partitions will result in the physical partition numbers (corresponding to the physical data sets underlying the design) no longer matching the logical partition numbers. This is detailed in the IBM manuals.

Another database design relates to insert hot spots in transactional tables that are frequently referenced. Here, you want to avoid insert hot spots by spreading new rows across the table space or partition rather than clustering them at the end of the pageset, while still maintaining a partitioning scheme that allows for data purging based on age. You can accomplish this using table-based partitioning, where the partitioning key is a surrogate key whose value is randomly determined at the time of insert. Another option is using an insert trigger. The effect is to spread row inserts evenly across partitions.

Try to ensure that data access paths aren't affected by this scheme. One possible danger involves Data Partitioned Secondary Indexes (DPSIs), which SQL uses to access the table. Without including a predicate specifying the partition key, SQL may cause DB2 to scan each partition of the DPSI to find qualifying rows. Because of this issue, the DBA usually implements the aforementioned scheme without DPSIs.
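A minimal sketch of the randomized surrogate key idea follows; the table layout, column names, and partition count are assumptions for illustration, and a real implementation might instead derive the value in an insert trigger, as noted above.

# Illustration only: spread inserts across N partitions with a random surrogate
# partitioning key, while a date column remains available for age-based purging.
# Names and the partition count are assumptions.
import random
from datetime import date

NUM_PARTITIONS = 16  # assumed to match the partitions defined for the table

def new_row(account_id, amount):
    # The surrogate PART_KEY decides the partition, so concurrent inserts land
    # on different partitions instead of piling up at the end of one pageset.
    return {
        "PART_KEY": random.randint(1, NUM_PARTITIONS),
        "TXN_DATE": date.today(),     # drives application-level purge-by-age logic
        "ACCOUNT_ID": account_id,
        "AMOUNT": amount,
    }

rows = [new_row(i, 10.0 * i) for i in range(5)]
for r in rows:
    print(r["PART_KEY"], r["ACCOUNT_ID"])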

Input Hot Spots
Schemes that distribute rows to avoid hot spots tend to increase synchronous I/Os while decreasing the ability of DB2 to perform prefetch operations. Data availability may be a problem if transactions (commit-to-commit) are long, since many pages will be locked. You should design transactions to have relatively short units of recovery. Such short transactions will have a higher proportion of commits over time than longer transactions, so the amount of work done at commit (such as externalizing log records) may affect system performance. One additional advantage of relatively short transactions is that they hold claims on pagesets for shorter periods, allowing utilities to drain these claims and to process without failing.

Partitioning where rows are inserted at end-of-partition will be friendlier to SQL statements accessing blocks of rows with similar key values.

Figure 1: Resource Constraints, Possible Root Causes, and Typical Fixes



Data availability improves because such rows, physically inserted on the same page, limit the total pages locked. Throughput is also relatively fast because inserts won't compete with each other.

Data Purge Issues
You can implement a rotating partition scheme to purge or archive old data. Assuming you have no need to archive data, a partitioning scheme based on the purge criteria (usually date) works well. For example, if we partition a table by month, we can easily empty the oldest partition using load utility statements, such as the following, which has no input data:

LOAD REPLACE ... PART <partition number>

Some care must be taken, since the load utility must know the physical partition number. Because the physical and logical partitions are no longer synchronized after partition rotation, the DBA must implement some process to generate the load statement with the correct partition number. This can be done with programming or by using a language such as REXX. Note that these schemes also require coordination with other DBA processes such as reorg and image copy, since the physical partitions requiring backup (and potential recovery) change over time. Again, you can use a programmatic process to generate correct copy and reorg statements, or Real-Time Statistics (RTS) to automate these jobs.

For schemes where the purge criteria can't be conveniently matched with the partitioning scheme, purging must occur using an application. Here, to avoid resource constraints, you must design the purge application to peacefully coexist with other concurrent table activity. For volatile tables that may be accessed at any time of day, the purge application must be coded as restartable, since it may encounter a deadlock or timeout. In addition, you should be able to tune the commit frequency easily so the DBA can adjust it to minimize any timeouts and deadlocks other applications experience. This is commonly done by having commit frequency information placed in a common area (such as a DB2 table) that can be modified when needed, avoiding program logic changes.
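As a small illustration of generating the load statement programmatically, the sketch below maps a logical partition to its current physical partition and builds the control card. The article suggests REXX; Python is used here only for readability. The example mapping is invented (it would normally be retrieved from the DB2 catalog through your usual access method), and the elided portion of the LOAD statement is left as in the text above.

# Illustration only; the logical-to-physical mapping is assumed to come from the
# DB2 catalog (for example, SYSIBM.SYSTABLEPART) and the values here are made up.

def load_card_for_oldest(logical_to_physical):
    # Build the LOAD REPLACE control card that empties the oldest logical
    # partition; "..." stands for the rest of the statement, as in the article.
    oldest_logical = min(logical_to_physical)        # logical partition 1 = oldest data
    physical = logical_to_physical[oldest_logical]   # may differ after ROTATE PARTITION
    return f"LOAD REPLACE ... PART {physical}"

# Example mapping after several rotations: logical partition 1 now resides in
# physical partition 3.
mapping = {1: 3, 2: 4, 3: 5, 4: 6, 5: 1, 6: 2}
print(load_card_for_oldest(mapping))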


What happens if the purge process executes more slowly than rows are added to the table? The usual answer in this case is to partition the table based on other criteria and then execute multiple, simultaneous instances of the purge process. To avoid deadlocks, each instance is designed to purge from a different partition. Again, if a partition rotation scheme is in place, the partition choice must be programmed in some fashion.

Partitioning schemes that favor quick purge (via emptying partitions with old data) tend to lessen the I/O load by limiting transaction activity to partitions with current data. However, by concentrating data access on a small portion of the table, you must beware of causing availability constraints, since row access activity is now confined to a relatively small part of the table. Application throughput is excellent for inserting applications, while queries may deadlock or time out due to page contention. This can be alleviated in several ways, each with advantages and disadvantages. Options include querying with an isolation level of uncommitted read, implementing row-level locking, or shortening transaction length via more frequent commits.
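To illustrate the restartable, tunable purge described above, here is a minimal sketch written against a generic Python DB-API connection. The control table, table and column names, SQL, and parameter-marker style are assumptions; a production purge job would more likely be a COBOL or REXX program with proper restart logic.

# A minimal sketch: a purge loop that works on one partition, reads its commit
# frequency from a control table the DBA can update, and commits often so units
# of recovery stay short. All names and SQL are illustrative assumptions.

def purge_partition(conn, partition, cutoff_date):
    cur = conn.cursor()

    # Commit frequency lives in a common control table, so the DBA can tune it
    # without a program change.
    cur.execute("SELECT COMMIT_FREQ FROM PURGE_CONTROL WHERE JOB_NAME = ?", ("PURGE",))
    commit_freq = cur.fetchone()[0]

    deleted_since_commit = 0
    while True:
        # Select a small batch of purge candidates from one partition only, so
        # parallel instances working on other partitions don't deadlock here.
        cur.execute(
            "SELECT TXN_ID FROM TXN_HISTORY "
            "WHERE PART_KEY = ? AND TXN_DATE < ? "
            "FETCH FIRST 500 ROWS ONLY",
            (partition, cutoff_date),
        )
        keys = [row[0] for row in cur.fetchall()]
        if not keys:
            break
        for txn_id in keys:
            cur.execute("DELETE FROM TXN_HISTORY WHERE TXN_ID = ?", (txn_id,))
            deleted_since_commit += 1
            if deleted_since_commit >= commit_freq:
                conn.commit()   # frequent commits keep units of recovery short
                deleted_since_commit = 0
    conn.commit()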

Clustering

There are three common reasons for table clustering:

• To support high-performance, multi-row access (such as I/O parallelism)

• To support distribution of volatile pages to avoid hot spots

• To allow some joins to work faster.

In each case, rows with "nearby" keys are expected to be accessed in the same transaction or in the same SQL statement. Increasing their physical proximity tends to decrease table getpages and avoid random table I/Os. However, this may not always be the case. Another possible benefit of clustering is when many (hundreds or more) rows with nearby keys are accessed. Clustering may favor either sequential prefetch or dynamic prefetch of pages, reducing I/O wait times and improving application throughput.

Some partitioning schemes will clash with clustering needs. For example, partitioning schemes that distribute row inserts to avoid hot spots may prevent clustering in the preferred manner. Sometimes the intent of the DBA or database designer is to partition tables similarly to encourage the use of I/O parallelism. This isn't as common as you might think; most table joins tend to be parent-to-child, and each table may have different volumes, hot spots, and clustering needs.

Similar cases are encountered in a data warehouse environment that supports so-called Kimball designs: a single fact table with time-dependent data, usually partitioned by date, that's joined to several dimension tables. In these cases, the dimension tables are usually partitioned by subject area or by some key in the subject area. This lets DB2 use a star-join access path to join the dimension tables before joining to the fact table. Implementing I/O parallelism for the dimension tables may be possible by partitioning them all in a similar fashion.

Clustering methods that take advantage of high-performance data access methods (joins, I/O parallelism, prefetch) are usually chosen because they tend to avoid I/O and throughput constraints. However, as more data is accessed and locked, data availability issues may arise.

Recovery Issues

Sometimes, partitioning schemes create additional issues for the DBA. A case in point involves the previous example where data is inserted into partitions based on a random surrogate key. Here, the DBA has little choice but to schedule regular image copies of all partitions. Contrast this with a month-based partitioning scheme, where it may be possible to schedule frequent image copies of current data only (perhaps the most recent partition), while other partitions are copied less frequently. Of course, this assumes that old data is changed far less frequently (or not at all) compared to recent data.

This also applies to a disaster recovery environment. With a month-based partitioning scheme, the DBA can recover recent partitions first, making critical data available as fast as possible. This must be accompanied by the proper indexing scheme, perhaps using DPSIs and including index image copies.

Best Practices

Some final notes on partitioning:

• Partitioning schemes such as partition rotation sometimes require coordinating application and database administration efforts to ensure that image copy, reorg, and purge processes access the correct physical and logical partitions.

• Different partitioning methods support data load, old data purging, and infrastructure processes (backup, reorg, etc.) to different degrees.

• Application data access patterns, proposed data distribution, and data volatility may determine a particular partitioning scheme.

Making a change to a partitioning scheme may not be a viable option, or at least may be deemed extremely costly. Such a change usually means a complete data unload/load, rebuilding indexes, changing infrastructure support processes, and more. So, once chosen, the DBA would prefer that the partitioning scheme remain in place. A corollary of this is that if multiple partitioning methods are viable for a particular table, you should choose the one deemed most flexible. This requires the DBA to be aware of current system performance constraints to determine what manner of partitioning will yield the best performance.

Finally, there are some disadvantages to partitioning. Partitioning increases the number of physical data sets that must be created, including any indexes that are also partitioned. This increase means adjusting corresponding DB2 configuration parameters to account for managing, opening, and closing a larger number of data sets. There are other factors to take into account, and you should refer to the appropriate IBM DB2 manual for these.

The following is a list of best practices for database design when considering overall system performance:

• Many of the benefits of partitioning aren't available for segmented table spaces in DB2 V8. Universal table spaces, available in DB2 9 for z/OS, combine some of the features of both segmented and partitioned table spaces. For those on DB2 V8, consider partitioning for intermediate and large-size tables. Implementing partitioning isn't required, but considering it is a good idea if only to ensure that data purge processes are discussed at design time. If you have migrated to DB2 9, consider universal table spaces.

• Partition based on major I/O activity (purge, mass insert, avoiding hot spots), but also consider current system resource constraints and how they may be affected. For example, in a DB2 subsystem where many batch jobs execute in a limited batch window, batch application elapsed time (i.e., throughput) is a constraint. So, you might consider partitioning to relieve potential throughput issues.

• Partitioning to minimize I/Os requires a detailed knowledge of data access patterns. In particular, purge processes are highly I/O-intensive; they usually involve lots of data access during a short period, deletion of rows and index entries, and logging of changes. Partitioning to take the purge process into account may be the most important factor in database design.

• Maximizing data availability is tightly linked to locking. While row-level locking is a potential fix for this, it has several disadvantages, including increased CPU usage and a greatly increased number of locked objects within the scope of a transaction. Reducing data availability constraints is usually accomplished by improving transaction design: shortening transaction lengths, decreasing the amount of data locked per transaction, avoiding locks (i.e., using uncommitted read), and so forth. It may be useful to consider partitioning that increases availability (avoiding hot spots, minimizing locking) by spreading data throughout multiple partitions. However, this has the disadvantage of increasing I/Os and possibly affecting throughput.

• Throughput constraints can be relieved with good table design once the data access patterns causing the issue are understood. Delays due to mass insert processing can be alleviated with partitioning by ascending key. Delays due to purge processing may be handled with rotating partitions. Of course, each of these methods (and others) may lead to constraints on other resources.

There are partitioning methods we haven't covered, as well as many additional considerations that are beyond the scope of this article. Nevertheless, we hope this article has given you some ideas to consider, especially in the area of good database design. Considering potential resource constraints and designing databases to mitigate or avoid them is a best practice. Z

Lockwood Lyon is the principal database administrator for a major U.S. financial institution. During the past 20 years, he has specialized in mainframe DB2 systems and application performance tuning, DB2 subsystem installation and configuration, and IT performance appraisal. His articles on IT best practices have appeared in several publications, including z/Journal.
Email: [email protected]

Pete Clark on z/VSE: Hints and Tips for z/VSE 4.2

By Pete Clark

Earlier this year, IBM posted the latest and greatest version of Hints and Tips for z/VSE 4.2 on the z/VSE Web page. If you haven't downloaded and read these latest hints and tips, you will be pleasantly surprised. In the last few versions, significant detail and new sections have been added. To view the most current Hints and Tips, go to ftp://public.dhe.ibm.com/eserver/zseries/zos/vse/pdf3/zvse42/hintamm2.pdf. For a list of Hints and Tips for earlier versions, scroll to the bottom of this page: http://www-03.ibm.com/systems/z/os/zvse/documentation/#hints. Here you can download Hints and Tips for VSE/ESA Versions 2.6 and 2.7 and z/VSE Versions 3.1, 4.1 and, of course, 4.2. Note that if you aren't up on the latest z/VSE release, some of the new release-specific information may not apply. However, many of the changes, additions, updates, and even new topics do apply and will be meaningful and helpful. One example in the Hints and Tips for z/VSE 4.2 is Chapter 19, which covers "VSE Health Checker"; this chapter is applicable to earlier releases that support the health checker.

If you're planning a z/VSE upgrade, Hints and Tips is required reference material. In the past, recommended reading and reference material for planning a z/VSE upgrade included the Release Guide, Upgrade Buckets, Planning Guide, Installation Guide, and other user information. Now you should add the current Hints and Tips to the list. For example, before you start an upgrade, be sure to read Chapter 17 on "Hints and Tips for Fast Service Upgrade (FSU)."

While it may not be apparent from the title of the document, there are numerous bits of information on how to obtain information to help resolve or identify a problem. This includes information on how to use the documented (previously undocumented) commands. An earlier version of Hints and Tips examined how to use a Lock Trace to identify the owner of a VSE/VSAM lock. Naturally, that type of information is carried forward in every version. In Hints and Tips for z/VSE 4.2, Chapter 1 examines the various TRACE options. And yes, this is where the "undocumented" system commands are documented. In fact, many folks are first introduced to this document when they need to understand and use the undocumented commands.

Chapter 5 provides an excellent discussion on how Job Control works and what the $JOBCTLx phases do. If you need to know which JCL command is processed by which phase, a table provides that information. Chapter 3 provides a current discussion of the Turbo Dispatcher. Please pay close attention to your current release level and to the release-level notes and subtopics in this chapter. The Dispatcher and the PRTY command have been enhanced in ways that affect how you will specify and use them, depending on your actual release level.

By reviewing the "Summary of Changes" under "About This Book" in any earlier version of Hints and Tips, it becomes obvious where a review is warranted, where new information has been added, and where information has been updated. So, if you've read an earlier version, you can quickly and readily become current without reading the whole volume, although reading the newer version in its entirety is often a helpful reminder of things forgotten.

Hints and Tips started as a simple Web vehicle to quickly provide information to the z/VSE user from z/VSE development. The intent was that it would always be a Web-based distribution, updated as needed, when needed, with no committed timeframe or publication schedule. It's a milestone document that's indicative of the new relationship that has developed between the z/VSE user and the z/VSE development team. This commitment and openness was an unstated promise that lived through the turmoil of the '80s and was initially delivered with the first Hints and Tips in the '90s. The latest version shows that IBM is paying attention, listening, and being proactive by providing information users need. The z/VSE user community conveys our thanks for a job well done.

News From z/VSE Development

IPv6/VSE is now available from IBM (a licensed product of Barnard Software, Inc.). IPv6 supports 128-bit addresses, which will limit the bi-annual press reports on "we're running out of addresses" that have occurred since the early '90s. Current z/VSE news is available on Twitter at http://twitter.com/IBMzVSE or visit http://www.ibm.com/vse. Thanks for reading the column; see you all in the next issue. Z

Pete Clark works for CPR Systems and has spent more than 40 years working with VSE/ESA in education, operations, programming, and technical support positions. It is his privilege to be associated with the best operating system, best users, and best support people in the computer industry.
Email: [email protected]


Linux on System z Kernel Dumps

By Michael Holzheu

The Linux kernel code is stable, but even the best kernel hackers are only human and make mistakes. So, while kernel crashes are rare, they can occur and are unpleasant events; all services the machine provides are interrupted and the system must be rebooted. To find the cause of such crashes, kernel dumps containing the crashed system state are often the only approach.

When a user-space crash occurs, a core dump is written containing memory and register contents at the time of the crash. Writing such core dumps is possible because the Linux kernel is still fully operational. This is clearly more difficult when the kernel itself crashes. Either the dying kernel must dump itself or some other program independent of the kernel must perform that task. This article reviews Linux kernel dump methods, describes the current Kdump process, compares System z dump tools, and offers an introduction to Linux dump analysis tools.

History

The Linux Kernel Crash Dumps (LKCD) project implemented one of the first Linux kernel dump methods. However, Linus Torvalds never accepted those patches into the Linux kernel because the currently active kernel was responsible for creating the dump. This meant the code creating the dump relied on kernel infrastructure that could have been affected by the original kernel problem. For example, if the kernel crashed because of a disk driver failure, a successful LKCD dump was unlikely because that code was also needed to write the dump. LKCD is no longer active; the last LKCD kernel patch was released for Linux 2.6.10. Diskdump and Netdump were other Linux dump mechanisms; both had problems similar to LKCD's and were never accepted into the mainline kernel.

For Linux on System z, IBM developers used another approach: standalone dump tools. When a kernel crash occurs, the standalone dump tool is started and loads into the first 64KB of memory, which Linux doesn't use. Available since 2001, this functionality writes the memory and register information to a separate DASD partition or to a channel-attached tape device. z/VM also supports VMDUMP, a hypervisor dump method.

Kdump Operation

Kdump, developed after the failure of LKCD and its peers, uses a completely separate kernel to write the dump. With Kdump, the first (production) kernel reserves some memory for a second (Kdump) kernel; depending on the architecture, currently 128MB to 256MB are reserved. The second kernel is loaded into the reserved memory region; if the first kernel crashes, kexec boots the second kernel from memory and writes the dump file. Kdump was accepted upstream for Linux 2.6.13 in 2005; Red Hat Enterprise Linux 5 (RHEL5) and SUSE Linux Enterprise Server 10 (SLES 10) were the first distributions to include it. Kdump is supported on the i686, x86_64, ia64, and ppc64 architectures. Depending on the architecture, the first and second kernels may or may not be the same.

When the second kernel gets control, it runs in the reserved memory and doesn't alter the rest of memory. It then exports all memory from the first kernel to user space with two virtual files: /dev/oldmem and /proc/vmcore. The /proc/vmcore file is in Executable and Linkable Format (ELF) core dump format and contains memory and CPU register information. An init script (see Figure 1) tests whether /proc/vmcore exists and copies the contents into a local file system or sends it to a remote host using scp. After the dump is saved, the first kernel is started again using the reboot command.
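For readers who haven't set up Kdump before, here is a minimal sketch of the two pieces the paragraph above describes: reserving memory for the second kernel at boot, and loading that kernel with kexec so it can take over on a crash. The kernel image and initrd paths and the append string are placeholders; distributions normally wrap these steps in their own kdump service scripts.

# In the boot loader configuration, reserve memory for the Kdump kernel by
# appending to the production kernel's command line, for example:
#   crashkernel=128M
#
# After boot, load the dump (capture) kernel into the reserved region:
kexec -p /boot/vmlinuz-kdump \
      --initrd=/boot/initrd-kdump \
      --append="root=/dev/sda1 maxcpus=1"
# If the production kernel later panics, control passes to this preloaded
# kernel, which then exposes /proc/vmcore as described above.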

System z Dumps

Unlike Kdump, the Linux on System z standalone dump tools don't require reserved memory; they're installed on a storage device, and IPLing from that device triggers the dump process. Note that some of the features described in the following might be available only on the latest Red Hat Enterprise Linux and SUSE Linux Enterprise Server distributions.

DASD and tape standalone dump tools: Standalone dump tools for DASD and channel-attached tape devices are available. The tools are written in Assembler and are loaded into the first 64KB of memory that isn't used by the Linux kernel.

if [ -s /proc/vmcore ]; then
    cp /proc/vmcore /mydumps
    reboot
fi

Figure 1: A Simple Kdump Init Script With Local Copy


System z standalone dumps use two tools from the s390-tools package: zipl prepares dump devices, and zgetdump copies kernel dumps from DASD or tape into a file system (see Figure 2). These steps prepare partition /dev/dasdd1 on DASD 1000 for a standalone dump:

1. Format the DASD: dasdfmt /dev/dasdd
2. Create a partition: fdasd -a /dev/dasdd
3. Install the dump tool: zipl -d /dev/dasdd1

After a system crash, an IPL from the DASD device creates the dump. Before the IPL, all CPUs must be stopped and the register state of the boot CPU saved by issuing the commands in Figure 3 on the VM console of the crashed guest. After rebooting the guest, the dump can be copied into a file system using zgetdump:

# zgetdump /dev/dasdd1 > /mydumps/dump.s390

It’s also possible to copy the dump to a remote system using Secure Shell (ssh):

# zgetdump /dev/dasdd1 | ssh user@host "cat > dump.s390"
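Since raw dumps can be large, a simple variation (our addition, not shown in the article) is to compress the dump on the fly during the transfer:

# zgetdump /dev/dasdd1 | gzip | ssh user@host "cat > dump.s390.gz"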

The zipl and zgetdump tools for channel-attached devices currently support single- and multi-volume ECKD DASD, single-volume Fixed Block Architecture (FBA) DASD, and 3480, 3490, 3590, and 3592 tape.

SCSI dump: Support was also added for Linux on System z guests and LPARs that have only Small Computer System Interface (SCSI) Fibre Channel Protocol (FCP) disk. Accessing these disks in a Storage Area Network (SAN) using zSeries FCP (ZFCP) is complex, and the support couldn't be fitted into the first 64KB of memory as it was for the DASD and tape dump tools. Instead, a second Linux kernel is used that's conceptually similar to the Kdump approach. But unlike Kdump, this kernel isn't loaded into guest memory in advance.

The ZFCP dump kernel is IPLed from SCSI disk using a new dump operand on IPL. With this operand, the first few megabytes of memory are saved in a hidden area the Processor Resource/Systems Manager (PR/SM) or z/VM hypervisor owns. The ZFCP dump kernel is then loaded into that saved memory region. Using a z-specific hardware interface, the ZFCP dump kernel can access the hidden memory. As with Kdump, the ZFCP dump kernel then exports all memory using a virtual file. A ZFCP dump user-space application running in a ramdisk then copies that file into a local file system on the SCSI disk where the dump tool was installed (see Figure 4). Preparing a SCSI disk for ZFCP dumps requires these steps:

1. Prepare a partition on the SCSI disk: fdisk /dev/sdb
2. Create an ext3 file system: mke2fs -j /dev/sdb1
3. Mount the file system: mount /dev/sdb1 /mnt
4. Install the SCSI dump tool: zipl -D /dev/sdb1 -t /mnt

When running in a Logical Partition (LPAR), IPLing that SCSI disk using the SCSI dump load type on the HMC triggers a dump. When running under z/VM, the dump device must be defined using a CP command (the example shown uses WWPN=500507630300C562, LUN=401040B400000000):

# set dumpdev portname 50050763 0300C562 lun 401040B4 00000000

To trigger the dump under z/VM, a ZFCP adapter (device number 1700 in this example) must be specified for IPL with the dump (see Figure 5). The ZFCP dump tool writes the dump as a file into the specified file system. This file can be used directly for dump analysis; no zgetdump tool is needed.

VMDUMP: Under the z/VM hypervisor, the VMDUMP command can be used to create dumps for VM guests. To use VMDUMP, no preparation of any dump device is required; the dump file is written into virtual SPOOL. This dump mechanism should be used only for small guests because it's quite slow. A Linux tool called vmur copies dump SPOOL files into Linux, and the vmconvert tool converts VMDUMPs into Linux-readable dump format (the --convert option on vmur can also convert the dump on the fly while receiving it from SPOOL). VMDUMP is the only non-disruptive dump method for Linux on System z and is also the only method to dump Named Saved Systems (NSSs).

# ipl 1700 dump
Linux for zSeries System Dumper starting
DUMP PARAMETERS:
================
devno : 1700
wwpn  : 500507630300c562
lun   : 401040b400000000
...
DUMP PROCESS STARTED:
=====================
dump file: dump.6
   0 MB of 1000 MB (  0.0% )
 990 MB of 1000 MB ( 99.1% )
1000 MB of 1000 MB (100.0% )
DUMP 'dump.6' COMPLETE

Figure 5: IPL SCSI Dump Tool via ZFCP Adapter

ON_PANIC=dump_reipl
DUMP_TYPE=ccw
DEVICE=0.0.4000

Figure 6: The Configuration of a DASD Dump Device

# cpu all stop
# store status
# ipl 1000

This produces messages similar to the following:

Dumping 64 bit OS
00000032 / 00000256 MB
00000064 / 00000256 MB
...
00000256 / 00000256 MB
Dump successful

Figure 3: Commands to Stop All CPUs, Save the Register State of the Boot CPU, and IPL the Dump Tool

Figure 2: Standalone Dump Tool for Channel-Attached Devices

Figure 4: ZFCP Dump Tool for SCSI Disks

Example of VMDUMP use:

1. Trigger VMDUMP on the VM console: #cp vmdump
2. Boot Linux.
3. Receive the dump in Linux format from the reader: vmur rec -c <dump spool id> dumpfile

Automatic dump: When the Linux kernel crashes because of a non-recoverable error, a kernel function named panic is normally called. By default, panic stops Linux. With Linux on System z, panic can be configured to automatically take a dump and re-IPL. The dump device is specified in the file /etc/sysconfig/dumpconf. Figure 6 shows the configuration for a DASD dump device. The service script dumpconf enables the configuration:

# service dumpconf start

and chkconfig can make the behavior persistent across reboot:

# chkconfig --add dumpconf

With this configuration, DASD device 0.0.4000 will be used for the dump in case of a kernel crash. After the dump process finishes, the current system is rebooted. To instead stop the system after dumping, specify "ON_PANIC=dump" without the "reipl."
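The same dumpconf mechanism can also point at a ZFCP SCSI dump device. The following is a hedged sketch of what such an entry might look like on recent distributions, reusing the adapter, WWPN, and LUN from the earlier example; the DUMP_TYPE=fcp keywords are an assumption on our part, so check your distribution's dumpconf documentation for the exact keys it supports.

# /etc/sysconfig/dumpconf -- hypothetical ZFCP (SCSI) dump device entry
ON_PANIC=dump_reipl
DUMP_TYPE=fcp
DEVICE=0.0.1700
WWPN=0x500507630300c562
LUN=0x401040b400000000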

How System z Dumps Compare to Kdump

When Kdump was released, the IBM Linux on System z team considered adopting that dump method, but rejected it due to reliability concerns. The IPL mechanism on System z performs a hardware reset on all attached devices, so the dump tools can work with fully initialized devices. An IPL to start the dump process will always work, even if CPUs are looping with disabled interrupts. Like Kdump, the System z dump tools are independent of the state of the first kernel; however, the System z dump tools don't share memory with the first kernel, so there's no way to overwrite the code of the tools, as can happen for Kdump. Another advantage of the System z dump tools is that they don't require reserved memory. This is especially important under z/VM with many guests.

The main disadvantage of the System z tools is that they're different from Kdump, which is used on most other platforms; this makes them unfamiliar to many. Installer dump support under Red Hat and SUSE Linux is limited for System z. Kdump also has filtering mechanisms for dumping only kernel pages that are important for dump analysis, reducing dump size and dump time.

Dump Analysis Tools

After a kernel dump has been created, it must be read by an analysis tool for problem determination. Two dump analysis tools are available for Linux: lcrash and crash. The lcrash tool is part of the LKCD project and isn't being actively developed; crash, developed by a company called Mission Critical Linux and now maintained by Red Hat, will probably be the Linux dump analysis tool of the future. The kernel dump analysis tools support many commands:

• Show memory contents
• Print kernel variables
• Show the kernel log
• List Linux processes
• Show kernel function backtraces for processes
• Show disassembly of kernel code.
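For readers who haven't used crash before, here is a hedged sketch of how those capabilities map onto typical crash subcommands; the address, PID, and symbol operands are placeholders.

crash> rd <address>     (show memory contents)
crash> p jiffies        (print a kernel variable)
crash> log              (show the kernel log)
crash> ps               (list Linux processes)
crash> bt <pid>         (kernel function backtrace for a process)
crash> dis <symbol>     (disassemble kernel code)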

A Simple Dump Analysis Scenario

Let's consider how crash is used. The sleep program is started (this is an example only); then a dump is created, Linux is rebooted, and the dump is opened with crash. Apart from the dump file, crash normally needs two additional files: vmlinux and vmlinux.debug. These contain kernel symbol addresses and the datatype description, respectively. In some distributions, these two files are merged. For our example, the following steps have been performed:

1. Start the sleep program: /bin/sleep 1000
2. Create a DASD dump (/dev/dasdd1).
3. Reboot the Linux system.
4. Copy the dump: zgetdump /dev/dasdd1 > dump.s390
5. Start the crash tool: crash /boot/vmlinux /usr/lib/debug/boot/vmlinux.debug dump.s390

Figure 7 shows all processes in the dump as well as the sleep process. The Process Identifier (PID) of the sleep process is 26735. The parent of the sleep process is the bash shell process with PID 26617 (see Figure 8). The sleep process has executed the system call “nanosleep”; the top-most function on the stack is “schedule” (the Linux function where all processes normally sleep until the scheduler wakes them up again).

Summary

This article described the history of Linux dump methods. After Linus Torvalds rejected dump methods such as LKCD, the Kdump method was finally accepted in the mainline kernel. On System z, architecture-specific dump tools existed several years before Kdump, and remain in use. These include standalone dump tools for DASD and channel-attached tapes, a dump tool for ZFCP SCSI disks, and the hypervisor dump method, VMDUMP. The main advantage of these System z dump tools is reliability. Z

References
• Dump tool use: www.ibm.com/developerworks/linux/linux390/development_documentation.html
• Crash home page: http://people.redhat.com/anderson
• LKCD: http://lkcd.sourceforge.net

Michael Holzheu is a Linux kernel developer at the IBM lab in Boeblingen, Germany. He studied computer science at the University of Erlangen and has worked for IBM since 1998. Starting in the z/OS and UNIX Systems Services environment, he joined the Linux on System z team in 2000. His main focus is kernel dump, dump analysis, and device driver development.
Email: [email protected]


crash> ps
  ...
  26617  26607  1  acad140  IN  0.2  3460  1900  bash
  26735  26617  1  ac45640  IN  0.1  2232   608  sleep
  ...

Figure 7: Processes in the Dump

crash> bt 26735
PID: 26735  TASK: ac45640  CPU: 1  COMMAND: "sleep"
 #1 [0ac53d70] schedule at 5461ce
 #2 [0ac53d88] do_nanosleep at 547c7a
 #3 [0ac53db0] hrtimer_nanosleep at 16ccd6
 #4 [0ac53e70] sys_nanosleep at 16cdb4
 #5 [0ac53eb8] sysc_noemu at 117c8e

Figure 8: Kernel Function Backtrace (Call Chain) of the Sleep Process

Big Iron: The Mainframe Story (So Far)

By Steven A. Menges

If you were going to create a documentary about the mainframe, where would you start? You could reasonably start with the birth of Alan Turing in 1912, or with the "birth" of ENIAC in Pennsylvania in 1946. How about when Turing's Pilot ACE ran its first program in London in 1950? Other schools of thought might point to Konrad Zuse's Z1 in 1936, a mechanical calculator considered the first binary computer; others could point to the earliest general-purpose, stored-program electronic digital computer known as the Manchester "Baby," which performed its first calculation in 1948. IBM's 701 in 1952 might even get a few votes as well. My colleague David insists you should start with Stonehenge (yes, he's English), and I must admit he actually makes a good case that it's a precursor to the first electronic calculator and deserves consideration. Non-IT folks who love the movies might first think of the 1957 movie, Desk Set, which introduced the concept to many, many Americans long before the HAL 9000 hit Hollywood.

I think most people would probably point to IBM in the '60s as the real start of the "mainframe era." They could point to the year 1961, when the Compatible Time-Sharing System (CTSS) was first demonstrated on IBM hardware, or to 1963, when the IBM 7000 series replaced vacuum tubes with transistors. But really, most people I've asked agree that 1964 is the year, because that is when the IBM System/360 family of mainframe computers was launched. Just five years later, in 1969, a mainframe guided the Apollo 11 moon landing. So much history to consider, and we haven't even mentioned Burroughs Corp., DEC, NCR, General Electric, Honeywell, RCA, UNIVAC, Control Data Corp., or independent software vendors such as Computer Associates International (now just CA Technologies).

All this mainframe history, what came next, and what's in store for the future got me to thinking there should be a documentary about this groundbreaking platform; a story that captures the thoughts, ideas, and images of the men, women, and machines that revolutionized businesses and organizations around the world and created what is now just referred to as "IT." There should be one, and now there is one.

The new documentary film, Big Iron: The Mainframe Story, can be viewed at ca.com/bigiron and at ibmsystemsmag.com/bigiron. The documentary has a 33-minute running time and is divided into five chapters. The project was developed and produced by MSP Communications, Inc., CA Technologies, and IBM Systems Magazine, but it involved the assistance of many people from a variety of organizations, including SHARE, MainframeZone, Inc., NASA, and IBM. Check it out and let me know what you think. Z

Steven A. Menges is the vice president, Mainframe Business Unit, at CA Technologies.
Email: [email protected]


Important Big Iron Dates

My initial research took me to many different times and places; here are just a few:

~2500 BC: Stonehenge, the world's first (and largest) analog computer, is created to store and calculate crop planting and other critical data.
1946: ENIAC, the first general-purpose programmable computer, is unveiled in Pennsylvania.
1952: The IBM 701 Electronic Data Processing Machine is introduced.
1964: The IBM System/360 family of mainframe computers is launched.
1972: VM virtualization is announced.
1976: CA Technologies (then Computer Associates International, Inc.) is founded on Long Island and begins selling CA-Sort.
1979: The mainframe-powered UPC code and scanner are widely introduced and revolutionize retail.
1983: DB2 for MVS Version 1 is born.
1998: The IBM System/390 Generation 5 debuts and breaks the 1,000 MIPS barrier.
1999: Linux on System z debuts.
2001: The zSeries 900 securely processes a record 3,850 transactions per second.
2005: IBM System z9 debuts and processes more than 1 billion transactions per day.
2007: Six hundred new mainframe applications are introduced, with momentum in both traditional apps and Linux, Java, and business intelligence apps.
2008: More than 500 universities worldwide agree to teach mainframe and other large systems skills; more than 50,000 students receive mainframe education.
2009: More than 1,000 new or updated applications for the IBM mainframe are introduced in 2008-2009.
2010: An IBM System z10 EC has the equivalent capacity of nearly 1,500 x86 servers, with an 85 percent smaller footprint and up to 85 percent lower energy costs.


Is Your z/OS System Secure?

By Ray Overby

The percentage of internal breaches increased from 15 percent in 2003 to 44 percent in 2008; that dramatic change was noted in a 2008 survey commissioned by CA Technologies. The same survey showed that most IT leaders consider internal security threats a bigger risk to business than external attacks. In the mainframe world, we've focused our efforts on securing our systems from outside threats and controlling the abuse of privileges by our massive user communities. Unfortunately, we're missing exposures that, as noted in the survey, originate on the inside.

Efforts to secure the mainframe date back to the early '70s. The SHARE Security Project was formed in 1971, and it immediately began deliberations on how to secure IBM systems. The committee went around in circles, dealing with the issues of system integrity and system security, often confusing the two concepts. None of the members were satisfied with the non-existent level of security provided by the then-current IBM operating system, OS/MVT. Then, in 1973, IBM announced a new operating system, OS/VS2 Release 2, which included this system integrity statement as part of the IBM VS2 Release 2 Planning Guide:

System integrity is defined as the ability of the system to protect itself against unauthorized user access to the extent that the security controls can't be compromised. That is, there's no way for an unauthorized problem program using any system interface to:

• Bypass store or fetch protection; i.e., read or write from or to another user's areas

• Bypass password checking; i.e., access password-protected data for which a password hasn’t been supplied

• Obtain control in an authorized state.

In VS2 Release 2, all known integrity exposures have been removed. IBM will accept as valid, any APAR that describes an unauthorized program's use of any system interface (defined or undefined) to bypass store or fetch protection, to bypass password checking, or to obtain control in an authorized state.

The system integrity statement preceded the advent of any of the three mainframe security systems:

• Resource Access Control Facility (RACF), which IBM introduced in 1976

• ACF2, developed by SKK Inc. in 1978 and now owned by CA Technologies

• Top Secret, developed by CGA Allen in 1981 and now owned by CA Technologies.

The method used to protect data before the advent of these security systems was rudimentary passwords, but it's easy to replace the concept of simple password protection with the controls of a security system. Clearly, even as far back as 1973, mainframe security systems were built on, and dependent on, the foundation of operating system integrity. IBM has done a commendable job of maintaining system integrity over the years and has been responsive to any reported system integrity exposure. But what about:

• Independent Software Vendor (ISV) products

• Locally developed exits and authorized programs

• Software obtained from other installations via shareware such as the CBT Tape (see www.cbttape.org)?

System integrity exposures are particularly exploitable by insiders, who accounted for almost half of the documented breaches in 2008.

How do you secure your z/OS system? First, ensure that the proper security system controls and operating system parameters are set for maximum security. All the APF-authorized libraries should be properly protected, and only a limited set of users should have authority to update them. Programs in APF-authorized libraries, if linked with the AC(1) parameter, can, at will, obtain an authorized state and be able to gain access to data sets and resources. However, assuring this is a complex process. Installations may have more than 100 APF-authorized libraries and, for each one, the list of users who can update them must be obtained from the security system. Each user must be validated as appropriate. Similarly, the data sets that collect the Systems Management Facility (SMF) log files must be properly protected; these contain the security system violation and logging records. The system parameter libraries must be properly protected, and all parameters relating to the security they contain must be validated as appropriate. Any system exits, which can adjust the operating flow and also give certain programs or users additional privileges, must be reviewed. There are many parameters and settings and controls to review.

Before the introduction of automated methods, performing a security audit was a long, involved process that required extensive expertise, so it was difficult, expensive, and not totally complete. SKK developed and introduced the Examine/MVS product in 1984; it automated this analysis and allowed both auditors and internal security personnel to perform the process. Examine/MVS, now owned by CA Technologies and known as CA-Auditor, was followed by Vanguard Integrity Professionals' Analyzer and then Consul's Audit, now part of the IBM Tivoli zSecure suite.

The Defense Information Services Agency (DISA) has developed an extensive set of security guidelines. There are specific guides called the Security Technical Implementation Guides (STIGs) for ACF2, RACF, and Top Secret. These STIGs, available on the DISA Website at http://iase.disa.mil/stigs/stig/index.html, are cookbooks on how to properly configure the z/OS system parameters and ACF2, RACF, and Top Secret. They've been developed for all platforms the military uses, not just mainframes. There are separate DISA STIG guidelines for z/OS for each of the three mainframe security products. While these guidelines may not apply completely to non-military z/OS installations, they're a full list of items you should review. If the result of the review is that they don't apply to your installation, or must be modified to fit your site's operating standards, that's acceptable, but each STIG should be used as a guide.

DISA has also developed a REXX implementation to validate that the installation is adhering to the STIGs; it uses the CA-Examine product as a base to collect the information. Vanguard Integrity Professionals recently introduced its Vanguard Configuration Manager (VCM) product, which reviews a z/OS system with RACF. This automated methodology eliminates almost all the labor-intensive aspects of reviewing your installation for compliance with the STIGs and does the analysis and reporting.
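As a small, hedged illustration of the kind of manual checking these automated tools replace, the commands below show one way to list the active APF-authorized libraries and then ask RACF who can update one of them. The data set name is a placeholder, and sites running ACF2 or Top Secret would use those products' equivalent reporting commands.

1. From an MVS console (or SDSF), display the current APF list: D PROG,APF
2. For each library in that list, display who holds UPDATE or higher access (placeholder data set name; repeat per library): LISTDSD DATASET('SYS1.USER.APFLIB') AUTHUSER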


The STIGs assume there's no way to bypass the integrity of z/OS, but if a user program can successfully do this, it can reset certain flags and settings to make the security product believe the program is authorized to access any resource it wants, and even bypass production of the security system log records. This applies to all three security products, although the internal settings would be different for each one. Two different exposures can affect z/OS:

• Supervisor call routines or authorized programs whose sole purpose is to place the caller in an authorized state or set the caller up so he or she can be in an authorized state. These may be designed and intended to be a part of a legitimate or useful product or service but can be used illegitimately for other purposes.

• Errors in the implementation of products, homegrown programs, operating system extensions, or even z/OS that would allow the caller to obtain control in an authorized state. These may be in authorized programs, Supervisor Calls (SVCs), Program Call (PC) routines, or system exits.

Often, systems programmers aren’t aware of the integrity issues the first exposure introduces. For example, on one of the mainframe Internet lists, a systems programmer posted a code fragment asking for help. What the pro-grammer was asking for isn’t relevant, but the code fragment (see Figure 1) is

startling. Utilization of this SVC gives anyone in that installation who has access to Time Sharing Option (TSO) or batch processing the “keys to the kingdom.” A similar SVC was found as part of an enhancement package, submitted by a reputable organization, on the CBT tape Website. The enhancement is something many installations would desire, so it’s possible many installations have unknowingly introduced this exposure into their environment. The second vulnerability, the inad-vertent one, has been found in products produced by ISVs, locally developed code, code obtained from other installa-tions and, yes, even in IBM-supplied code. Using analysis tools and techniques that discover these inadvertent vulnera-bilities, along with the deliberate ones previously described, we’ve defined the following vulnerability categories for them based upon exposures it has found:

• Store into an address provided by an unauthorized caller while in system key or state. This could be used to update the status of the user program into an authorized state by, for example, storing into the Job Step Control Block (JSCB), which contains the authorized privilege flag.

• Load into a register, or several registers, from a fetch-protected storage location provided by an unauthorized caller. If a dump is then induced, the content of the fetch-protected storage is available. If this is a password, it could lead to a security exposure.

• Pass execution control to an address provided directly or indirectly by an unauthorized caller. This could be used to just obtain control in an authorized state.

• Dynamically elevate authority by setting the JSCBAUTH flag or ACF2, RACF, or Top Secret privileges.

Discovering these vulnerabilities takes a high level of expertise, and a similar level of expertise would be required to develop a method to exploit them. However, executing the exploitation method is easy. In one case, we developed an 11-line REXX program that gave the user RACF privileged authority. A slightly different version would have done a similar thing for ACF2 or Top Secret. The 11-line REXX program could easily be entered by any TSO or batch user. Note that this applied to a vulnerability that the vendor has since closed.

Figure 1: Code Fragment

        SVC   255                          GET INTO SUP. STATE WITH KEY 0
        STORAGE OBTAIN,LENGTH=20480,SP=241,KEY=9
        ST    R1,MVSCSADR                  STORE AREA ADDR. IN CSAEXT
        IC    R11,=X'80'
        SPKA  0(R11)                       CHANGE TO KEY 8
        MODESET MODE=PROB                  SWITCH TO PROBLEM STATE


Given the rising percentage of insider attacks, it's imperative that installations take appropriate action before an attack succeeds, causing loss of critical or controlled information or wreaking financial havoc on the organization. z/OS installations must:

• Apply all IBM integrity fixes as soon as they receive them.

• Require that all ISVs provide a system integrity commitment that covers the products installed at the installation and commits to responding immediately to repair any system integrity exposure introduced by their products. Some of these issues may require extensive changes and testing, so while the vendor may start on remediation immediately, it may be some time before the vulnerability is repaired in the field.

• Review all installation-developed authorized code for integrity exposures. If the installation doesn't have the required expertise, it should engage the services of an outside expert consultant.

• Review all code obtained from outside the company for integrity exposures (again using an outside expert if necessary).

• Periodically review the state of the z/OS system to assure that older integrity exposures have been repaired and no new ones were introduced. Remember that the z/OS system is continuously being updated with Program Temporary Fixes (PTFs) and new releases and, similarly, ISV products and internally developed code are being introduced and updated. We recommend periodic reviews for system integrity.

Keeping z/OS secure requires continual review of what's added to the operating system and a solid understanding of how the operating system environment and security system controls are implemented. Z


Ray Overby began his career in data security in 1981 as an ACF2 developer at SKK, Inc. He started his own consulting company, Key Resources, Inc., in 1988. Since then, he has been active in the security area of z/OS and has done numerous reviews of mainframe system integrity. The Vulnerability Analysis Tool is the result of those reviews.
Email: [email protected]

Building Better Performance for Your DB2/CICS Programs With Threadsafe

By Russ Evans and Nate Murphy

When IBM released CICS TS 2.2 in December 2002, which introduced Task-Related User Exits (TRUEs) in the Open Transaction Environment (OTE) architecture, a primary selling point was potentially significant CPU savings for CICS/DB2 applications defined as threadsafe. To be threadsafe, a program must be Language Environment- (LE-) conforming, and knowledgeable CICS programmers must ensure the application logic adheres to threadsafe coding standards. (For more information, see "DB2 and CICS Are Moving On: Avoiding Potholes on the Yellow Brick Road to an LE Migration," z/Journal, April/May 2007.) This may require knowledge of Assembler code to follow the many tentacles of application logic that must be examined to verify the application and its related programs are threadsafe. If you define a program to be threadsafe but the application logic isn't threadsafe, unpredictable results could occur that could compromise your data integrity. This article provides some background on what threadsafe means at the program level, how to identify and correct non-threadsafe coding, and how to ensure your programs are maximizing their potential CPU savings.

Background

CICS was initially designed to process using a single Task Control Block (TCB). Once the CICS dispatcher had given control to a user program, that program had complete control of the entire region until it requested a CICS service. If the program issued a command that included an operating system wait, the entire region would wait with it. As a result, CICS programming guides included a list of operating system and COBOL commands that CICS programs couldn't use. The flipside of these limitations was the advantage that CICS programs didn't have to be re-entrant between CICS commands.

As all activity in the CICS region was single-threaded, it was also restricted to the capacity of one CPU. The introduction of multi-processor mainframes raised new issues for the CICS systems staff, when the purchase of a faster (and more expensive) mainframe would slow down CICS if the individual processors on the new machine were slower than the single processor it replaced. IBM responded by attempting to offload some of the CICS workload to additional CICS-controlled MVS TCBs that could run concurrently on a multi-processing machine. For convenience, IBM labeled the main CICS TCB as the Quasi-Reentrant, or QR, TCB.

The most significant implementation of this type of offloading came with the introduction of the DB2 Database Management System (DBMS). Rather than establishing one TCB for all DB2 activity, CICS would create a separate TCB for each concurrent DB2 request and switch the task to that TCB while DB2 system code ran. While all of the application programs for each task in the region still ran single-threaded, each task's DB2 workload could run simultaneously, limited only by the total capacity of a multi-processor. On a practical level, the DB2 workload seldom approached the CICS workload, meaning CICS users were still constrained by the processing speed of a single processor. Also, while the overhead of an individual TCB swap (roughly 2,000 instructions) is slight, the two TCB swaps for each DB2 request can account for as much as 30 percent of total application CPU.

Open Transaction Environment

In a classic "ah ha!" moment, someone at IBM realized this TCB swapping overhead could be eliminated by simply not swapping the transaction back from the DB2 TCB and allowing application code to run there. To provide support for running CICS application code outside of the QR TCB, the concept of the OTE was developed. Put simply, OTE allows an individual CICS transaction to run under its own MVS TCB instead of sharing the QR TCB. Many transactions, each under their own TCB, can run simultaneously in the same CICS region. If a transaction running in the OTE issues an operating system wait, none of the other transactions in the CICS region are affected.

The drawback of OTE is that more than one occurrence of the same program can run simultaneously, requiring CICS programs to be re-entrant between CICS calls. A simple example of the type of problem created is the common practice of maintaining a record counter in the Common Work Area (CWA) that's used to create a unique key. Under "classic" CICS, as long as the record counter was updated before the next CICS command was issued, the integrity of the counter was assured. With OTE, it's possible for two or more transactions to use the counter simultaneously, resulting in duplicate keys.

Fully re-entrant programs, which don't assume that access to data in shared storage areas will automatically be serialized, are defined as "threadsafe." It's crucial to remember that threadsafe isn't a determination CICS makes, but a promise the programmer makes. By marking a program as threadsafe, the programmer is stating that the program won't cause any damage if it's allowed to run in the OTE.

Preparing CICS Regions for Threadsafe Activity

There are two ways to control the use of threadsafe in a CICS region. On the program definition, a new parameter has been added: concurrency. CONCURRENCY=QUASIRENT indicates the program must run on the QR TCB; CONCURRENCY=THREADSAFE marks a program as threadsafe, allowing it to run on an open TCB. Be aware that marking a program as threadsafe doesn't make it threadsafe; the programmer uses this parameter to define programs that have proved to be threadsafe. The second control is at the region level. Specifying FORCEQR=YES in the SIT will override the CONCURRENCY parameter on the program definitions and force all programs to run on the QR TCB.

Before marking any program as threadsafe, all Task-Related and Global User Exits (TRUEs and GLUEs) that are active in the region must be reviewed to ensure they're threadsafe-compliant and defined as threadsafe. Activating threadsafe programs in a region with non-threadsafe exits can result in a significant increase in CPU utilization.

Ensuring Threadsafe Compliance

All threadsafe programs must be re-entrant. LE programs can be guaranteed re-entrant by compiling with the RENT option; Assembler programs can be easily tested for re-entrancy by linking with the RENT option and then running in a CICS region with RENTPGM=PROTECT. Non-re-entrant programs will abend with an S0C4 when they attempt to modify themselves. It's strongly recommended that all CICS regions running threadsafe programs use RENTPGM=PROTECT.

Unfortunately, there's no automated way to identify non-threadsafe program code. IBM does supply a utility, DFHEISUP, that can be useful in identifying potential non-threadsafe programs. It works by scanning application load modules, looking for occurrences of commands found in member DFHEIDT. (Details appear in the CICS Operations and Utilities Guide.) DFHEISUP will report, for example, that a program issues an ADDRESS CWA command. Since the CWA is often used to maintain counters or address chains, a program addressing the CWA could be using it in a non-threadsafe manner. On the other hand, the program could also be using the CWA to check for operational flags, file Data Definition (DD) names, or other uses that don't raise threadsafe issues. More worrisome, DFHEISUP could report no hits on an application program, leading you to believe the program was threadsafe, when the program was in fact maintaining counters in a shared storage location whose address is passed in the incoming commarea. While DFHEISUP is helpful in the process of identifying threadsafe applications, the only way to ensure an application is threadsafe is to have a competent programmer review it in its entirety.

Making Programs Threadsafe

It's possible for a program to access shared storage areas such as the CWA while remaining threadsafe-compliant. Each shared storage access must be reviewed independently to determine its status. Accesses that require update integrity or repeatable read integrity (counters, in-core tables, pointers, etc.) aren't threadsafe-compliant and must be serialized before running in a threadsafe region. Various serialization options are available to CICS programmers:

• Retain CONCURRENCY=QUASIRENT on the program definition. Programs defined as QUASIRENT will always be dispatched on the QR TCB. The QR TCB runs only one task at a time, so any storage access the program makes is automatically serialized. The advantage of this method of serialization is that it doesn't require any program or CICS System Definition (CSD) modification. The disadvantages are that programs forced to run on the QR TCB don't produce any CPU savings when using DB2; all programs that access the shared storage in question must remain QUASIRENT; and there's the future risk that a programmer will add access to these areas in a program defined as CONCURRENCY=THREADSAFE.

• Move the data to a facility serialized by CICS or DB2. A DB2 table, or a CICS resource such as temporary storage or transient data, provides serialization. The advantages include use of an IBM-supported resource, use of a facility that programmers are familiar with, and compliance with internal coding standards. The disadvantages include the additional overhead associated with the facility and the potential use of non-threadsafe commands.

• Use serializing CICS commands. EXEC CICS ENQ and DEQ provide the capability to serialize access to application resources. The advantages include use of the standard CICS Application Program Interface (API) and the ability to serialize only the specific lines of code required. The disadvantages include the additional overhead (which is minor) and the potential for deadly embraces. Note that prior to CICS TS 4.1, the EXEC CICS ENQ operates in a separate pool from the XPI ENQ, so the CICS ENQ facility can't be used to serialize access between an XPI user and an API user. The likelihood of requiring serialization between exit points and application programs is small, but attempting to use CICS resources in such a case can cause unpredictable results.

• Use serializing Assembler commands. Assembler commands such as compare and swap can be used to provide serialized storage access. The advantage is that serialization can be achieved with minimal overhead. The disadvantages are that these commands are limited in capability, are complex to code, and can cause serious problems if coded incorrectly.

As an example, a program that used a CWA field as a record counter could:

• Leave the program accessing the CWA as CONCURRENCY=QUASIRENT
• Move the counter to a DB2 table
• "Wrap" the counter access in an EXEC CICS ENQ/DEQ
• Call a special-purpose Assembler subroutine to handle the counter with a compare and swap.

Regardless of which method or methods are used to serialize access, it's critical that all programs that access the storage be modified before any of them are marked as threadsafe. Adding an ENQ in PROGA to serialize access to the CWA won't prevent PROGB from updating it simultaneously.
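For illustration, here's a minimal COBOL sketch of the ENQ/DEQ option described above. The resource name, field names, and counter layout are hypothetical, and error handling is omitted; the same ENQ/DEQ wrap, with the same resource name, would have to be coded in every program that updates the counter.

       WORKING-STORAGE SECTION.
      * Hypothetical 16-byte ENQ resource name, agreed on by every
      * program that touches the shared counter.
       01  WS-ENQ-NAME          PIC X(16) VALUE 'CWACOUNTERLOCK  '.
       01  WS-NEW-KEY           PIC S9(8) COMP.

       LINKAGE SECTION.
       01  CWA-AREA.
           05  CWA-REC-COUNTER  PIC S9(8) COMP.

       PROCEDURE DIVISION.
       UPDATE-COUNTER.
           EXEC CICS ADDRESS CWA(ADDRESS OF CWA-AREA) END-EXEC

      *    Serialize the read-and-update of the shared counter so two
      *    tasks running on open TCBs can't generate duplicate keys.
           EXEC CICS ENQ RESOURCE(WS-ENQ-NAME) LENGTH(16) END-EXEC
           ADD 1 TO CWA-REC-COUNTER
           MOVE CWA-REC-COUNTER TO WS-NEW-KEY
           EXEC CICS DEQ RESOURCE(WS-ENQ-NAME) LENGTH(16) END-EXEC
           .

Because the ENQ only protects against other programs that issue the same ENQ, this does nothing for a program that updates the counter without it, which is exactly the point made above about converting every program that touches the storage before any of them is marked threadsafe.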

Maximizing CPU Savings and Performance

Because the CPU savings achieved in threadsafe CICS/DB2 programs are the result of not issuing TCB swaps, your CPU savings are maximized if your program remains on its L8 TCB from the time it issues its first DB2 command to the time it terminates. This isn't always possible due to the issue of non-threadsafe commands. (Not all EXEC CICS commands are threadsafe. Consult the Application or Systems Programming Guide for a list of commands that are threadsafe in your release.) If your program issues a non-threadsafe command while running on the L8 TCB, CICS will automatically swap your task to the QR TCB, where it will remain until the next DB2 command, reducing the potential CPU savings. The reduction in TCB swaps may also result in improved performance.

CPU Savings

In the article "CICS Open Transaction Environment And Other TCB Performance Considerations" (www.cmg.org/proceedings/2006/6130.pdf), Steven R. Hackenberg, an IBM Certified IT Specialist, gave this example of the potential CPU savings: "To put this in perspective, consider a CICS region that processes 1,000 transactions per second with each doing one DB2 request. That amounts to 4 MIPS just for task TCB switches." (At two TCB switches per DB2 request and roughly 2,000 instructions per switch, that's about 4 million instructions per second.) Another example can be found in the article "Running OMEGAMON XE for CICS as Threadsafe to Reduce Overhead While Monitoring CICS Transaction Server V2" from CCR2 (http://www-01.ibm.com/software/tivoli/features/ccr2/ccr2-2004-06/features-cics.html), written by Richard Burford, an IBM R&D developer. The CPU savings can also have a direct impact on reducing your mainframe software cost.

Improved Performance

If you're experiencing poor response times, can threadsafe help resolve your problem? Here are the questions you must answer:

• What percentage of CPU is my QR TCB running?

• Does my QR TCB have to wait to be dispatched by the operating system?

• Are a high number of tasks running under the QR TCB being delayed?

• Are application programs waiting excessively for the QR TCB?

If you answered yes to most of the questions, then defining programs as threadsafe and processing as many tasks as possible on an open TCB will remove this constraint on the QR TCB and reduce the response times of both threadsafe and non-threadsafe transactions. Steven R. Hackenberg also gave this recommendation: "As the QR TCB exceeded 50 percent of a single CP, or the CPU consumed by QR and higher priority workloads exceeded 50 percent of the LPAR [Logical Partition], experience has shown that ready transactions would begin to queue while trying to gain access to the QR TCB. This would be evidenced by the rapid increase in wait-on-first-dispatch and wait-on-redispatch times found in the CMF 110 performance records, and rapidly eroding total response times."

Selecting the Pilot Threadsafe Application

Because the CPU reduction in threadsafe programs is the result of eliminating the TCB switch overhead from every DB2 call, you'll receive the greatest benefit by converting heavily used programs that issue large numbers of DB2 requests. Review your CMF statistics to identify how many DB2 calls each application issues and then multiply that number by the number of application transactions per second in your environment. You can use the result to sort your applications by DB2 calls (i.e., TCB switches) per second; the higher the number, the greater the potential CPU savings.

For a pilot project, you must also consider the scope and complexity of the conversion. The ideal candidate will have a combination of:

• A relatively small number of programs to limit the scope of the review

• Few (or no) Assembler programs. Unless your shop has strong Assembler skills, reviewing Assembler programs for non-threadsafe activity can be difficult.

• All COBOL or PL/1 programs LE-compliant and compiled and linked as RENT.

Additionally, applications running in a QR-constrained CICS environment, where transactions have a large wait for QR dispatch, will show additional reduction in response time, as application code processing is diverted to the L8 TCBs.

Summary

Running DB2 programs as threadsafe in the CICS Open Transaction Environment is a true win-win scenario. It provides us the opportunity to significantly reduce CPU requirements in our production regions while simultaneously increasing throughput by exploiting the z/OS multi-processor environment. Z

Russ Evans is an independent technical consultant based in the Northeast. He has more than 25 years of experience with CICS, both as a systems programmer and as an Assembler language developer on programmer productivity tools, and is an acknowledged expert in the area of CICS threadsafe considerations. He's a member of the CICS project for SHARE and has presented at regional user groups, SHARE, and Guide/Share Europe. His presentation "Threadsafe Conversion Techniques for CICS Applications" received a SHARE best session award at SHARE Orlando in 2008. Email: [email protected]

Nate Murphy is president of Nate Murphy & Associates, and also heads the Tridex IMS, DB2 for LUW, and z/OS database users groups in New York. He has spent 45 years in mainframe information technology. His company specializes in database technology and mainframe operational efficiency. He was selected as a 2010 IBM Information Champion. Email: [email protected]
Website: www.natemurphy.net


z/Data Perspectives
More DBA Proverbs
By Craig S. Mullins

My short collection of DBA proverbs has been one of my more popular "z/Data Perspectives" columns. With that in mind, I decided it was time to share some additional

quotes, sayings, and proverbs that apply to database administration.

One of my favorites comes from famous behavioral psychologist B.F. Skinner, who said, "It isn't that they can't see the solution; it is that they can't see the problem." How many times have you had to deal with people rushing off to find a solution before they even understood the problem? It happens all the time in IT. If you can't see the problem, then you'll never formulate a workable solution to that problem. This one applies to vendors, too. How many times has a salesperson tried to sell you a "solution" when all he really has to sell is his product? You can't sell a solution if you don't know what the problem is, folks!

Disagree with that? Then I offer you another couple of quotes; the first comes from French author Pierre-Augustin Caron de Beaumarchais, who said, "It is not necessary to understand things in order to argue about them." I see evidence of the truth of this quote every day. And then there's Thomas Edison, who said, "There is no expedient to which a man will not go to avoid the labor of thinking." Both quotes speak of our inherent laziness. Quite often, we start to argue before knowing what it is we're arguing about. Or we get so caught up in our own position, we don't stop to listen and hear what others are saying. I admire people who change their minds when they're confronted with different facts or a changing ideology. If you believe the same things today that you did when you were in college, then it's likely you aren't very bright.

The Lewis Carroll Alice in Wonderland books offer sage advice for our particular industry. For example, we can all learn from the Cheshire Cat. Recall the passage where Alice comes to a fork in the road and meets up with the Cheshire Cat for the first time. She asks him, "Would you tell me, please, which way I ought to go from here?" And the cat responds, "That depends a good deal on where you want to go." Alice, in typical end-user fashion, replies, "It doesn't much matter where," causing the cat to utter words we should all take to heart: "Then it doesn't matter which way you go!"

Of course, you could follow Yogi Berra's advice instead. He said, "When you come to a fork in the road, take it." But then where would that leave you? Unfortunately, that seems about as intelligent as some IT strategic planning sessions I've sat in on. The bottom line is that planning and understanding are both required and go hand in hand. Those of us who practice the discipline of data management and administration understand the rigors of planning; but we also understand the benefits that can accrue. If you have no plan for where you want to go, then at best you will just be going around in circles; at worst, you'll be going backward. Planning and keeping abreast of the latest technology is imperative in the rapidly changing world of information technology. As Alice might put it, IT just keeps getting "curiouser and curiouser."

Perhaps one of the most applicable quotes for software vendors comes from American psychologist Abraham Maslow, the man who invented the hierarchy of needs that we all learned about in school. Maslow said, "If the only tool you have is a hammer, you tend to see every problem as a nail." To that, I would add this one from historian Thomas Fuller: "A bad workman never gets a good tool." Matching tools to problems can create solutions, but this can happen only if you have the right tools.

And finally, one of the world's brightest sages was W.C. Fields, who said, "The world is made up of only three things: oxygen, nitrogen, and baloney!" Remember that one the next time you're knee-deep in a data modeling session and you might be able to reduce the number of data elements you're dealing with. Z

Craig S. Mullins is president and principal consultant with Mullins Consulting, Inc. He's an IBM Information Champion and has worked with DB2 since Version 1. He's also the author of two best-selling books, DB2 Developer's Guide and Database Administration: The Complete Guide to Practices & Procedures. Website: www.craigsmullins.com




Performance Management Essentials to IT Success
By Mano Mathai

The Airline Tariff Publishing Company (ATPCO), the Washington, DC-based central clearinghouse and publisher of airline fares and fare-related data, has been focusing on controlling costs to ensure customer satisfaction. The technology company's IT division has focused on improving system performance, which reduces costs and frees up resources for new products and services.

ATPCO processes fare data from more than 500 airlines and sends that data to global distribution systems, computer reservations systems, and other travel-related companies around the world. ATPCO processes three times as many transactions as it did 10 years ago. In recent years, the company also has expanded the depth of its data, grown its customer base, and broadened its distribution. However, many of ATPCO's product applications are more than 10 years old and no longer meet all customer requirements. ATPCO is leveraging its existing z/OS technology and infrastructure while rolling out new Web-based products to meet customer needs. Essential to these

efforts is avoiding unnecessary hardware acquisition costs and reducing software maintenance costs. The company determined that implementing effective performance management was one of its best options for improving system performance.

At ATPCO, performance management encompasses more than software; it involves multiple organizational units, including the:

• Production support team comprised of experts on the company’s products and procedures

• DBA and systems support teams responsible for maintaining databases and the system environment

• Business team, which interacts with the worldwide user community

• Performance group, which tracks and communicates performance issues, and identifies and responds to situations that could jeopardize performance objectives.

The two-member performance team includes a system performance specialist and capacity planner, who

tracks performance and does forecasting, and an application specialist, who proactively develops solutions. The system performance specialist monitors and manages system resources to meet business needs and maintain Service-Level Agreements (SLAs). Workload Manager (WLM) policies, which distribute resources to high-priority jobs so they can perform at the expected level, support this process. Since WLM doesn't understand or resolve the reason why some jobs require more resources, the application specialist is responsible for finding ways to reduce resource consumption.

Augmenting the performance of existing architecture is challenging, perhaps even more so than designing new applications, yet challenges create opportunities that can lead to innovative ideas. Here's how we handled those challenges from an application perspective.

Steps Taken

We identified critical processes based on business value and visibility and targeted those that would secure the most cost savings. The nature of our business



makes the load on our system highly volatile, especially since we can't control the daily activity. Our goal is good performance with any reasonable load factor. To achieve this, we identified bottlenecks in the critical paths of strategic and business processes and located specific areas for code optimization. We looked at scalability, focusing on applications that perform well at low load factors but crawl at high load factors, and vice versa. We optimized code for different load factors, created unique paths, and implemented self-tuning applications with intelligence built into control sections to choose the most optimal path.

We profiled the performance of processes to understand the flow of logic and behavior of functions. Tools help somewhat in gathering information, but getting actionable intelligence with manual processes (such as traversing through application logic to construct signal flow diagrams) is hard manual work. Understanding business functionality is extremely important, and it's best not to disturb the business function even if it isn't understood initially. We looked for duplication of business functionality and extraneous function calls. We found ways to simplify the logic flow. Throughout the process, we had to remember that the compiler or the environment may have inserted "invisible" functions or hidden code into the source; this meant we had to look beyond the source code. For example, while evaluating the high Task Control Block (TCB) switching rate of a CICS task, we learned that a third-party instrumentation facility pulled in at execution time was the culprit causing the high rate.

We took advantage of Parallel Sysplex for performance and load balancing, and to lower third-party software costs by routing workload to the machine with the required software resource. We made minor changes to our applications to accomplish this, but we achieved a significant financial return.

Additionally, some of our databases don't participate in database federation, forcing us to conduct multi-phase commits manually. This resulted in data integrity being questioned when we had to retrace many steps to do manual fixes. Even though data integrity wasn't compromised, we considered this a performance issue because of the loss of productivity it caused. We carefully designed a homegrown agent that would

oversee the multiple phases of commit and would trigger an undo process in case of failure.

Our back-end database is DB2 for z/OS, which manages about 10TB of data. Poorly performing SQL statements are the easiest to identify and correct. DB2 is well-equipped with accounting information at the correlation-ID level, package level, or even more granular levels. Only when an SQL solution is insufficient do we resort to other solutions.

Not all performance solutions are software solutions; some are procedural. Examples include stacking up non-critical or time-insensitive updates for non-peak hours or running various audit reports together by sweeping the database just once. It may sound like a clear-cut solution, but it's difficult to get a consensus when working within the layers of communication and time zone differences typical of a global community.

We also faced some interesting business functionality issues. We collect data from different sources (converging on one key) and distribute the same data to a different set of clients (diverging on another key). There's an authentication process validating who has clearance to input the data and a filtering process at the distribution end. Both are resource-intensive procedures because of the granularity of authentication, but we proved we could reduce cost significantly by rewriting the algorithms. It was risky to change the decision-making modules that are the backbone of the business, but there was a greater risk in not trying. After careful evaluation and thorough testing, we rolled out the new algorithms and they were successful.

We solved some design issues with historical databases that mirrored operational database design. By changing the design, we could curtail exponential data growth. We keep dated information in our database, and old attributes become outdated when new ones are made effective. The old design involved explicitly applying a discontinue date to the old attribute; the new design assumes an implied discontinue date based on the presence of a new attribute. This yielded a more than 50 percent savings in the load operation.

Tools Used

• IBM's Tivoli Decision Support for z/OS for general trend analysis
• PLAN_TABLE as a repository of Explains of static SQL
• DSN Dynamic Statement Cache for dynamic SQL
• Visual Explain option of IBM's Data Studio for a graphic view of the SQL
• Compuware Corp.'s STROBE for in-depth analysis after we target a process for tuning.

We extracted accounting information from the Tivoli database using native SQL in a format that's comparable to a DB2PM report. We used thread-level and package-level details to pinpoint likely candidates, then we used STROBE and Data Studio for in-depth analysis.

Techniques

The keys to good performance are to issue as few I/Os as possible, reduce internal data movement, and minimize table processing. One methodology doesn't suit all circumstances, so ATPCO used numerous methods:

• Since thread creation is expensive, we took steps to avoid job initiation. When events that warrant a job were infrequent, we made job initiation event-triggered rather than polling on a time interval. When these job-triggering events were too frequent and numerous, we stacked them up and released them at regular intervals.

• Usually, external I/O to and from data files and networks is the slowest component. Our first priority was to tune outside the application by optimizing buffers or cache controllers or accelerating I/O with DFSORT.

• We reduced slack and wait for resources by eliminating resource contention, and optimized our batch window by increasing parallelism.

• Avoiding date format conversion reduced internal data movement. Date is a big part of our data attributes; it's received in different formats from numerous sources. We try to ensure that it's converted only once to the internal format, and, in most cases, we've standardized the date element for minimal conversion.

• We reduced table processing and searches. We learned that in-memory processing is faster than DB2 temp tables, and binary searches are faster than serial searches. We conducted binary search on the one-to-one relationships and serial search only on no-match of the one-to-many relationships. We split tables larger than the maximum allowable sizes and used pointers to map the tables rather than moving data around. We developed a way of pre-calculating the size of the searches so we can use these techniques only when necessary.

• We avoided wide loops by placing unnecessary functions outside the loop and positioning the busiest loop inside while nesting.

• We learned the hard way that run units can be timed out, buffers can be paged out, and inactive files can be quiesced if we don't optimize the life of threads. We streamlined and minimized external data access by storing data in memory for the duration of the process. We also reduced the size of DBRMs and programs to avoid EDM pool paging. We identified and converted edits that different modules performed repeatedly.

DB2 Solutions

DB2 tuning efforts that gave us the most performance gains involved:

• Targeting packages that showed high getpage counts and taking steps to reduce them

• Tracking deadlocks and lock escalations and re-evaluating commit frequency

• Deferring updates until the commit point to minimize the duration of locks

• Evaluating actual usage of indexes and making adjustments accordingly

• Reviewing nested loop joins to confirm they were looping on the smaller table, and achieving up to a 90 percent savings by forcing the SQL to loop on the smaller table

• Identifying single object constraints and increasing parallelism by partitioning into smaller objects

• Using the multi-row insert capability introduced in DB2 Version 8 for a 20 percent savings. More savings could have been achieved with the atomic option, but the non-atomic option gave us a unique time stamp (40 to 50 microseconds apart) on a column defined as timestamp with default. Using the multi-row fetch operation yielded an almost 50 percent improvement, but the overall gain was only 25 percent because the multi-row result set had to be adapted to a framework designed to handle one row at a time.
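As a rough illustration of the multi-row fetch technique mentioned in the last bullet, here's a minimal COBOL sketch using a DB2 V8 rowset cursor. The table, column, and host-variable names are hypothetical, the rowset size of 50 is arbitrary, and error handling beyond the SQLCODE check is omitted.

       WORKING-STORAGE SECTION.
           EXEC SQL INCLUDE SQLCA END-EXEC.
      * Host-variable arrays sized to match the rowset.
       01  WS-FETCH-ARRAYS.
           05  WS-ACCT-ID-ARR   PIC X(10)           OCCURS 50 TIMES.
           05  WS-ACCT-BAL-ARR  PIC S9(9)V99 COMP-3 OCCURS 50 TIMES.
       01  WS-ROWS-RETURNED     PIC S9(9) COMP.
       01  WS-IX                PIC S9(4) COMP.
       01  WS-TOTAL-BAL         PIC S9(11)V99 COMP-3 VALUE ZERO.

           EXEC SQL
               DECLARE C1 CURSOR WITH ROWSET POSITIONING FOR
               SELECT ACCT_ID, ACCT_BAL FROM ACCT_TABLE
           END-EXEC.

       PROCEDURE DIVISION.
       FETCH-ROWSETS.
           EXEC SQL OPEN C1 END-EXEC
           PERFORM WITH TEST AFTER UNTIL SQLCODE NOT = 0
      *        One call to DB2 returns up to 50 rows instead of one.
               EXEC SQL
                   FETCH NEXT ROWSET FROM C1 FOR 50 ROWS
                   INTO :WS-ACCT-ID-ARR, :WS-ACCT-BAL-ARR
               END-EXEC
      *        SQLERRD(3) holds the number of rows in this rowset;
      *        the final, partial rowset returns SQLCODE +100.
               IF SQLCODE = 0 OR SQLCODE = +100
                   MOVE SQLERRD(3) TO WS-ROWS-RETURNED
                   PERFORM VARYING WS-IX FROM 1 BY 1
                           UNTIL WS-IX > WS-ROWS-RETURNED
                       ADD WS-ACCT-BAL-ARR (WS-IX) TO WS-TOTAL-BAL
                   END-PERFORM
               END-IF
           END-PERFORM
           EXEC SQL CLOSE C1 END-EXEC
           .

As the bullet notes, the gain shrinks if downstream logic still has to be fed one row at a time, so the row-at-a-time framework (here reduced to a simple running total) is usually where the real conversion work lies.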

More Improvements

Here are some additional improvements we made and lessons learned in the process:

• A DB2 table indexed on an ever-ascending timestamp column was defined with no free space and no free page. Everything was fine as long as there was only one thread inserting data. When multiple concurrent threads started, the first one put an exclusive lock on the last page for insert. The second thread tried unsuccessfully to get a lock on the target page. Since the target page was the last page, it started browsing from the beginning. We observed the high getpage count and corrected the problem by adding some free pages.

• A singleton fetch-only thread bound with cursor stability was providing great performance, but five concurrent threads going after keys that are far apart (no two SQLs targeting the same page) resulted in poor performance. Concurrent threads showed a tremendous increase in synchronous reads. We tested on an isolated DB2 system to further research the scenario. At execution of concurrent threads, the buffer pool and other resources were in contention and the optimizer may have decided to do synchronous reads. There must be a maximum limit on the percentage of the buffer pool that can be monopolized by a single object. We corrected the situation by partitioning the table and rendering it into multiple objects.

• We noticed an increased amount of Systems Management Facility (SMF) log offloading when a particular CICS task was running. The CICS-DB2 thread with the RELEASE(COMMIT) attribute would cut a thread termination record, SMF-101, at commit time. This particular task was looking for an unused five-digit number. Once the largest five-digit number is used, it looks for unused holes, with every single lookup resulting in a commit to release locks. The application was modified to correct the situation.

• One application that keeps a large set of locks between commits was affected when a monitoring facility capturing IFCIDs, which uses Extended Common Service Area (ECSA) heavily, was turned on. The coincidence was identified and the problem was corrected.

• We observed some SQLs doing table space scans instead of list prefetch. We identified the root cause as a RID LIST exceeded condition. Doubling the RID pool size corrected most of the issues, and the remaining problems were handled on a case-by-case basis by tuning the SQL and avoiding a large RID LIST.

• A column with possible Y/N value defaulted to cardinality 25 because of insufficient statistics, resulting in poor performance. We corrected this with enhanced distribution statistics.

• An SQL used match columns that weren’t in the index, causing data pages to be searched in the join. Index columns alone would have sufficed.

• A table with a single identity column and no index was used as a next-number generator. An application was created to insert a row, get the next number, delete the row, and proceed. The unit of work couldn't be committed until much later. We noticed that the delete action was taking a table lock rather than a page lock. A table lock for deleting a row with an index is justified because the index page can't be released until commit, but we didn't understand the reason for the table lock when there's no index. We modified the application so it didn't delete the row, and added an offline process to delete rows behind the scenes.

Conclusion

With an IT agenda dominated by cost cutting, ATPCO management had the foresight to invest in efficiencies. A CPU upgrade was deferred in 2007, major applications were installed with no adverse performance impact in 2008, and the momentum continued in 2009. In January 2010, CTO Steve Daniels explained that adjustments to ATPCO's mainframe performance improved the entire IT services division and "literally saved the company millions of dollars." Figure 1 shows how the 2009 average work units per workday increased by 20 percent, yet mainframe usage declined by nearly 5 percent. Not all solutions have been installed yet because they're being completed in stages based on resource availability and business needs. Z

Mano Mathai is an application performance specialist at ATPCO. She has been with the company for 28 years, in different capacities, including application developer, software architect, DB2 specialist, and performance specialist. Email: [email protected]


[Figure 1: Mainframe Usage. Chart of average Plex CPU (MIPS), average work units per weekday, and maximum work units per day, by year from 2004 through 2009. Callouts: 2009 average work units/workday increased by 20%; 2009 mainframe usage declined by nearly 5%.]

Cross-Platform Management
The Door Is Open to Cross-Platform Systems Management
By G. Jay Lipovich

With critical business services for almost any enterprise flowing across multiple platforms, including Windows, UNIX, Linux, IBM mainframe, and Linux on System z, it would seem natural that best practices for systems management of these services would take a cross-platform approach. The business advantages of managing such services with cross-platform systems management include reduced costs, increased availability, reduced service outages, shorter time to diagnose and repair, more efficient systems management (fewer people handling a larger scope of work), and less risk to the business. Nonetheless, there has been significant resistance to implementing cross-platform efforts in most systems management disciplines.

Although we might blame territoriality for the resistance, most IT professionals recognize that putting aside territoriality and silo mentality is the right thing to do. There's no need to prove their platform is the "best" one or the "right" one. What is important is to ensure that the business services executed on their platform are executing correctly, and that proper intra-platform communication is happening as and when it should. One barrier to achieving this may simply be language. Each platform has its own unique language and acronyms. The acronyms seldom translate easily when moving across silo boundaries. This is particularly pronounced between distributed systems and IBM mainframe technologists. Instead of learning how to communicate in a "foreign" language, IT continues for the most part to use platform-specific systems management processes and tools that perpetuate the inefficiencies and business impacts that result from such a siloed approach.

When a problem occurs with a critical business service, the business owner asks, "When will my service be restored?" IT responds by inviting all the IT specialists for the participating platforms and technologies to a "war room." Here, supporting evidence is provided with the goal of proving that the problem isn't theirs. However, the cause of the problem often lies in the interaction between different technologies. Terminology, data, and tooling differences lead to confusion, wasted time and effort, and delays in restoring the business service to promised service levels.

The business value of cross-platform management is

so compelling that IT should be searching for a solution. In fact, it may be available today in the form of middleware management. Most IT sites use IBM WebSphere MQ for their middleware. It includes common concepts and terminology across disparate platforms. Unfortunately, many technicians still manage MQ with individual platform tools. Since MQ provides a crucial connection between platform services, problems need to be addressed quickly and service restored with minimal disruption. The MQ technology lends itself to true proactive management, in which problems can be identified before they impact business services and are automatically resolved, eliminating outages and the manual time and effort to diagnose and triage.

A single pane of glass can provide visibility into MQ health and performance across AIX, Linux, Windows, and z/OS. It delivers a view of business services as well as the technology pieces that support them. The common picture and common metrics foster easy communication between MQ teams. It can also provide the basis for extending the scope of platforms that one technician can manage. An automated response to a developing problem enables the proactive management of this environment and the ability to rapidly restore service for the customer service application without the need to convene a war room.

Business processes today are cross-platform, and they need cross-platform management structures. IT will need to institute processes and solutions for cross-platform management. Middleware management is a logical and appropriate starting point, and instituting cross-platform middleware management will ensure IT meets its goals of delivering the service levels the business requires while reducing the cost of delivering those services. Z

G. Jay Lipovich has more than 35 years of experience in the design and development of strategies and solutions for more effective infrastructure and data management, including design strategy and performance testing for a mainframe hardware vendor, and design and development of strategies and solutions for infrastructure, data management, and capacity planning. He has published numerous articles in trade journals and is a frequent presenter at Computer Measurement Group (CMG) conferences and seminars around the world. He has been a guest lecturer at the U.S. Department of Defense Computer Institute and is ITIL Foundation certified. Email: [email protected]




By Srinivas Potharaju & Arka Nandi

Batch applications are the most-often encountered workload in the IBM mainframe environment. Over the past few decades, custom applications have been written, modified, and patched even as new versions of operating system and vendor software were released, sometimes with major enhancements to make the applications more efficient. However, the older, custom applications haven't always been able to efficiently leverage the latest enhancements. Large organizations have realized they could significantly reduce the Total Cost of Ownership (TCO) of their batch workload by implementing various enhancements made possible with new releases of IBM mainframe software.

This article shares insights about tuning batch applications as learned at a top financial organization and an approach for analyzing the batch workload. We'll look at the impact and benefits of tools and programming techniques the financial organization used, some of which can also be used to tune CICS applications.

CPU Optimization

CPU optimization efforts should ideally start with identification and analysis of the most expensive workloads. This can be done using the most suitable performance management tool available at the installation. Analysis of the data collected can facilitate decisions regarding the approach to follow for optimization. This article primarily focuses on optimization efforts for application programs where source code is available.

Identify High CPU Consumers

To identify high CPU consumers, start by looking at mainframe chargeback (or billing) data, which is usually collated from System Management Facility (SMF) records generated during execution of a workload on the mainframe. If chargeback data isn't available, then you'll need to run custom reports from SMF. The next step is to create a list of the most expensive workloads for prioritizing the optimization effort.

Use a Performance Management Tool

Next, use an application performance management tool to identify the most CPU-intensive sections in the program. This step involves working with the mainframe capacity planning or performance group to set up the application programs targeted for optimization. The request would then be queued by the tool, or executed immediately if the program is running. After the tool completes its measurement, a performance profile can be generated. You can then analyze it and examine the source code (and compiler listing) to decide on a code remediation strategy for optimizing performance.

Address Code Remediation

Analysis of the performance profile of a program (in conjunction with the source code and compiler listing) is the most important step in CPU optimization efforts. Performance profiles detail where and how CPU time is spent during application program execution. Analyze the information provided and the program logic to come up with optimization recommendations. Figure 1 lists the steps to follow in a CPU optimization project.

CPU Optimization Guidelines

Let's examine some commonly used programming techniques and utility jobs that contribute to high CPU use by batch workloads and determine what can be done to remediate them.

Utility and Batch Jobs

Some types of batch jobs can be optimized by using the most efficient utility for the function. Restarts for long-running batch jobs also have a significant impact and should be examined.

DSNTIAUL to DB2 UNLOAD utility: DSNTIAUL can extract data from multiple DB2 tables using SQL code containing joins. This is something that UNLOAD won't be able to handle. However, when it comes to extracting data from tables that don't have to be joined, the UNLOAD utility scores significantly higher from a resource consumption perspective.

DSNTIAUL utility upgrade: Using old versions of applications or databases, including DB2, can hamper performance. Newer versions of DB2 for z/OS have enhanced utilities that are significantly superior to their older avatars. The DB2 Version 8 DSNTIAUL utility program is up to 50 percent faster (and consumes fewer resources) than its predecessor in Version 7, due to multi-row fetch support for the utility program. For sites running V7 or older


versions, much can be gained by making the version upgrade.

Review batch job space allocations: Often, data sets used in batch jobs don't have enough space allocated to them to hold the volume of data they require. While over-estimation of space should be avoided, under-estimation is a bigger problem; it wastes precious CPU resources. Increasingly, we encounter batch jobs that have failed with a space abend after doing more than 50 percent of the required work. If such a job completes after a restart, it means it has used more than 150 percent of the CPU resources it should have normally required. Giving greater attention to estimation and coding of space parameters could lead to significant CPU savings. Space allocations for data sets can be reviewed for jobs that consistently abend due to SB37, SE37, or any of the other abend codes related to space errors. You can identify such jobs by looking at prior abend history for a batch job in the high CPU consumers list.

Best Practices for COBOL

The program usage section in the performance profile identifies the CPU usage for COBOL statements during application program execution. Consider these best practices:

COBOL sort to external sort utility: COBOL programs using SORT with input/output procedures are less efficient than a COBOL program using SORT without the input procedure and output procedure sections (i.e., the SORT filename USING...GIVING... format). The use of an input procedure and output procedure for pre-processing and post-processing of records from the sort utility causes the COBOL compiler to use the NOFASTSRT option, which is less efficient than the FASTSRT compiler option. There's a 30 percent improvement in CPU time for programs compiled with the FASTSRT compiler option that use the COBOL SORT statement in the USING...GIVING... format, or that use an external sort utility in Job Control Language (JCL) such as IBM's DFSORT or SyncSort from Syncsort.

Search all instead of search: Use COBOL binary search (SEARCH ALL) instead of sequential search in programs to improve performance by 10 to 30 percent. You may need to sort the file being used to load the array prior to program execution, or build an ORDER BY clause into the SQL statement if the records are being loaded into the array from a DB2 table.

Use correct data types for arithmetic: Don't use DISPLAY data types for arithmetic. The recommendation is to use binary, packed decimal, or floating point for arithmetic operations. DISPLAY variables are commonly used as subscripts, counters, etc.; it's advisable to use binary data types for such arithmetic. Using correct data types for arithmetic should improve performance by 10 percent.

Use indexed tables/arrays instead of arrays with subscripts: Use indexes (the INDEXED BY phrase of OCCURS, or USAGE IS INDEX) to reference COBOL arrays in programs that have a lot of table processing to improve performance. Indexing for arrays performs better than subscripts defined as either external decimal (or usage display) or binary data items.

Initialize very large arrays: Using the INITIALIZE identifier statement, where identifier refers to a large array or a large group variable in the COBOL working-storage section, can be a major CPU consumer. This is especially true when the INITIALIZE statement is coded inside a loop that's performed repeatedly for every record read from an input file or row fetched from a DB2 table. The INITIALIZE statements can be replaced with MOVE statements using a group-level COBOL variable, which can yield an improvement in CPU time of 10 to 15 percent.
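To make the last two practices concrete, here's a minimal, hypothetical sketch (the table, field names, and sizes are invented) showing a binary SEARCH ALL on an indexed table and a group-level MOVE used in place of INITIALIZE. The reference table must already be loaded in ascending key order before SEARCH ALL is used.

       WORKING-STORAGE SECTION.
      * Hypothetical reference table, loaded in WS-PROD-CODE order.
       01  WS-PRODUCT-TABLE.
           05  WS-PROD-ENTRY OCCURS 500 TIMES
               ASCENDING KEY IS WS-PROD-CODE
               INDEXED BY PROD-IX.
               10  WS-PROD-CODE   PIC X(06).
               10  WS-PROD-DESC   PIC X(30).

      * Key being looked up; normally taken from the input record.
       01  IN-PROD-CODE           PIC X(06).

      * Group-level template used to reset a work area with a single
      * MOVE instead of a costly INITIALIZE inside the loop.
       01  WS-DETAIL-INIT.
           05  FILLER             PIC X(80)      VALUE SPACES.
           05  FILLER             PIC S9(9) COMP VALUE ZERO.
       01  WS-DETAIL-AREA.
           05  WS-DETAIL-TEXT     PIC X(80).
           05  WS-DETAIL-COUNT    PIC S9(9) COMP.

       PROCEDURE DIVISION.
       MAIN-LOGIC.
      *    Reset the work area: one group MOVE, not INITIALIZE.
           MOVE WS-DETAIL-INIT TO WS-DETAIL-AREA

      *    Binary search on the indexed, sorted table.
           SEARCH ALL WS-PROD-ENTRY
               AT END
                   MOVE 'UNKNOWN PRODUCT' TO WS-DETAIL-TEXT
               WHEN WS-PROD-CODE (PROD-IX) = IN-PROD-CODE
                   MOVE WS-PROD-DESC (PROD-IX) TO WS-DETAIL-TEXT
           END-SEARCH
           .

The group-MOVE reset only works because WS-DETAIL-INIT and WS-DETAIL-AREA have identical layouts; if the layout changes, both definitions must change together.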

COBOL Compiler Options

You can use the following compiler options to further improve program performance. Use these options after you've taken care of design and other performance issues in the code.

Use TRUNC settings to further improve performance. The three possible settings are OPT, BIN, and STD. TRUNC=OPT is the fastest; this setting assumes the data conforms to the PICTURE and USAGE specifications for binary data items. TRUNC=BIN is the slowest; the compiler generates extra code to handle halfword, fullword, and doubleword binary data fields. This setting is usually used when COBOL programs interface with DB2 or CICS. Performance of TRUNC=STD is slower than TRUNC=OPT and much faster than TRUNC=BIN, as it uses base-10 truncation for binary receiving fields.

For a detailed discussion of the compiler options (AWO or NOAWO, SSRANGE or NOSSRANGE, etc.) and run-time options (ALL31, CHECK, etc.) that affect run-time performance of COBOL programs, please refer to the IBM Enterprise COBOL Version 3 Release 1 Performance Tuning guide.

SQL Efficiency

The efficiency of SQL statements is vital when we talk about applications that access relational databases. Some significant gains can be achieved by examining Explain reports for poorly performing programs (that access DB2) and then modifying the SQL statements in them. Most benefits come from removing table scans and promoting index scans. Poorly performing SQL statements can be identified from the


[Figure 1: Methodology for Code Remediation]

CPU usage by SQL statement and wait time by SQL statement in a performance profile. High numbers in the statement execution cost and % CPU time fields (when compared to the number of times a statement is executed) indicate the SQL may be a candidate for tuning. Look at Explain reports from the PLAN_TABLE to find out what can be done to tune the query. Use EXPLAIN(YES) when programs are migrated to production regions, provided the PLAN_TABLE is created in the production database. If the PLAN_TABLE isn't in production, then immediately lobby the DBA group to create one so you can take advantage of valuable information from Explain.

Reduce Calls to DB2

Programs that use DB2 can be further optimized by eliminating redundant calls to DB2. You'll need to look at the performance profile and analyze source code thoroughly to determine the usefulness of these techniques:

Use of FETCH FIRST ROW ONLY: Application programs that use a sequence of cursor OPEN/FETCH/CLOSE operations to fetch only one row from a table or result set can use the FETCH FIRST ROW ONLY clause

in the SELECT INTO statement to successfully eliminate the calls made to DB2 for the open and close of the cursor. This technique can improve CPU time by as much as 50 percent. DB2 V8 allows the ORDER BY clause to be coded for SQL using the SELECT INTO... FETCH FIRST ROW ONLY statement, which previously wasn't supported. The new feature can be used to fetch the top row based on a user-specified ordering, thereby eliminating the need for cursor operations in certain cases.

Save the previous value in the COBOL working-storage section for sequential processing of records: COBOL programs that process records sequentially from a large file or table can be further optimized by saving the previous value fetched from DB2 in the COBOL working-storage section (also referred to as the lookup method), provided the value in the host variable used for executing the SQL remains the same. This coding technique can eliminate excessive and redundant calls to DB2 and improve the CPU time by as much as 70 to 80 percent. To determine whether your program can be tuned using this method, refer to the CPU usage by SQL statement section in the performance profile and look at the statement execution counts

and % CPU time fields. High numbers in the statement count field for an SQL statement and in % CPU time identify the SQL queries for further analysis. You'll then have to analyze the source code to determine whether the queries identified execute repeatedly using the same input parameters. In certain cases, you may have to sort the input file or build an ORDER BY clause into the table access before you can use this programming technique. If the query identified using the aforementioned method is used in a cursor, then a COBOL array can be defined to store the result set from the FETCH operation. You'll have to code your program to save the input parameters used in the OPEN cursor statement and compare the saved parameters with the next set of parameters before your next OPEN statement. If the saved value and current value match, you can use the value stored in the array. If the values don't match, refresh the array using the new values for the input parameters.

Leverage features of newer versions of DB2: DB2 multi-row FETCH, new in V8, can help you optimize programs where the CPU cost for the FETCH cursor operation is high. You can improve CPU time by as much as 40 to 50 percent by implementing DB2




multi-row fetch. Use of this feature reduces the number of calls made to DB2 to fetch rows from the result set. The percentage of savings is lower if the SQL used in the cursor is complicated, with longer access paths. You can also consider using multi-row insert if your application program is inserting a large number of records into a table. Performance profiles and a thorough analysis of the application program are required to determine if multi-row FETCH, INSERT, cursor UPDATE, or DELETE can be used to reduce CPU time.

Use COBOL arrays/lookup tables for reference data: Data present in small to medium tables that are referenced multiple times in an application program can be loaded into a COBOL array in the working-storage section to cut down the number of calls to DB2. Queries that use the DISTINCT clause on small tables to fetch unique rows for

reference data repeatedly in a loop can be optimized by using COBOL arrays to store the result set.

Use the capabilities of SQL: Poorly designed application programs (and databases) can cause severe performance issues and are a leading cause of high CPU usage in batch and online workloads. Often, the capabilities of SQL aren't used; this can also cause performance issues. It's always advisable to use the capabilities of SQL in application programs rather than try to work with data fetched from DB2 for joins and other operations. Programmatically coded joins perform poorly; they increase the number of calls to DB2. Row-by-row processing of records from a table can also lead to performance issues. Consider an example of calculating asset values for a set of accounts from the account table. Say the asset calculation routine involves fetching the price of all the securities for an account

from a price table for the previous 30 days to compute asset values. If the calculation routine processes records account by account, this would involve multiple calls to the price table to repeatedly fetch security prices. This would increase the number of calls to the price table and can lead to performance issues. Such design issues can be identified by looking at the statement count and % CPU time fields in a performance profile. However, fixing this issue would involve a complete redesign of the application program, which can sometimes be expensive and risky.
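Putting the first two techniques together, here's a minimal sketch, with hypothetical table, column, and field names, of a SELECT INTO ... FETCH FIRST 1 ROW ONLY combined with a saved-previous-value check. SQLCA inclusion and error handling are omitted for brevity.

       WORKING-STORAGE SECTION.
      * Hypothetical names; the WS-PREV-* fields cache the last lookup
      * so a repeated key doesn't generate another call to DB2.
       01  WS-PREV-ACCT-ID        PIC X(10) VALUE SPACES.
       01  WS-PREV-ACCT-RATE      PIC S9(3)V99 COMP-3.
       01  WS-ACCT-ID             PIC X(10).
       01  WS-ACCT-RATE           PIC S9(3)V99 COMP-3.

       PROCEDURE DIVISION.
       GET-ACCOUNT-RATE.
           IF WS-ACCT-ID = WS-PREV-ACCT-ID
      *        Same key as the last call: reuse the cached value.
               MOVE WS-PREV-ACCT-RATE TO WS-ACCT-RATE
           ELSE
      *        One SELECT INTO replaces OPEN/FETCH/CLOSE and returns
      *        the single qualifying row with the latest EFF_DATE.
               EXEC SQL
                   SELECT RATE
                   INTO  :WS-ACCT-RATE
                   FROM   ACCT_RATE_TABLE
                   WHERE  ACCT_ID = :WS-ACCT-ID
                   ORDER BY EFF_DATE DESC
                   FETCH FIRST 1 ROW ONLY
               END-EXEC
               MOVE WS-ACCT-ID   TO WS-PREV-ACCT-ID
               MOVE WS-ACCT-RATE TO WS-PREV-ACCT-RATE
           END-IF
           .

If the keys arrive in sorted order, as suggested above, the cache hit rate, and therefore the reduction in DB2 calls, is maximized.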

Conclusion

Figure 2 shows the various issues and risks involved in a mainframe CPU optimization project and the steps you can take to address them. Poorly written code or SQL can increase CPU costs, which can lead to a high cost of ownership. This cost can be reduced by implementing the strategies outlined here. However, such an exercise isn't without its risks and challenges. Although mainframe CPU optimization exercises are effort-intensive and time-consuming, if they're properly done, they can pay for themselves in terms of CPU savings and reduced chargeback fees. Z

Srinivas Potharaju is a technology architect with Infosys Technologies Ltd. He has extensive experience in the design, development, maintenance, performance tuning, testing, and implementation of applications in batch and OLTP environments. His areas of interest are database design, data modeling, and performance engineering. Email: [email protected]

Arka Nandi is a technology architect with Infosys Technologies Ltd. His areas of interest are performance engineering and database design and administration. He specializes in DB2 on z/OS and SQL Server. Email: [email protected]

References

• IBM DB2 Utilities Suite for OS/390 and z/OS, Version 7: http://www-01.ibm.com/software/data/db2imstools/pdf/sws1054f.pdf
• IBM DB2 UDB for z/OS Version 8 Performance Topics (SG24-6465-00): www.redbooks.ibm.com/abstracts/sg246465.html?Open
• DFSORT Tuning Guide: http://publib.boulder.ibm.com/infocenter/zos/v1r9/index.jsp?topic=/com.ibm.zos.r9.icet100/ice1ct00135.htm
• LE Performance Tips and Techniques - COBOL and PL/I Issues: http://www-01.ibm.com/support/docview.wss?uid=swg27001515&aid=1
• IBM Enterprise COBOL Version 3, Release 1 Performance Tuning guide: http://www-01.ibm.com/support/docview.wss?uid=swg27001475&aid=1
• Multi-Row Fetch Proven Benefits and Usage Considerations: www.db2expert.com/downloads/db2os390/YLAIDUGMultiRowFetch.pdf

[Figure 2: Issues or Risks Identified]

Compliance Options
By Gwen Thomas

Constraints, Controls, and Capriciousness

Pity the child who asked my fourth grade teacher, Mrs. Hunt, "Can I go to the bathroom?" "Yes, you can," she would say, "but no, you may not." The first time I complained to my mom that we'd have to repeat the request in its proper form before Mrs. Hunt would say, "Yes, you may!" I expected Mom to get as angry as I was. Instead, she laughed and said that was a good way to teach grammar and also to highlight the difference between constraints, controls, policy, and capriciousness.

Of course, I wanted to know what she meant. My older brother Reese, who already knew this speech, grinned at me knowingly as Mom explained that Mrs. Hunt was assuming we were all capable of going to the bathroom; that there were no physical constraints keeping us from going. The school had a policy, however; kids couldn't leave the classroom without a teacher's permission. Mrs. Hunt was old, but she was probably fast enough to grab us if we tried to circumvent policy and make a break for it. And if she couldn't prevent a child from leaving, she'd certainly react to it. It was an adequate control system, Mom said.

Then Mom got serious. My teacher wasn't being capricious, she explained. She wasn't making up a rule on the spot, based on how she felt at the time. Being capricious about important things, such as not allowing someone to empty their bladder with dignity, is an abuse of power or trust, she said. No one likes or admires that. And she wouldn't put up with it. If that were the case, she'd march down to the school and have a chat with the principal. But Mrs. Hunt was teaching policy, discipline, and grammar. Mom was all for it.

Of course, Reese knew how to flip this into a game. "Hey, Gwen, may I hold my breath for 20 minutes, please?" he'd ask. "Yes, you may, but no, you cannot!" I'd reply. "Hey, Gwen, may I fly to the moon?" "Yes, you may, but no, you cannot!" "Hey, Gwen, may I reach right through the oven door and grab the hot baking dish?" "Come on! You know you can't!"

Fast forward several decades. I started working with software teams charged with integrating large sets of information and moving data between mainframes and client/server applications. I kept hearing that certain functionality wasn't possible, when I knew for a fact it was. "It can't happen the way you want!" developers would tell me, and I'd have to

play 20 questions with them to determine whether it wasn't:

• Physically possible under any circumstances
• Possible because of the way the system was architected or configured
• Possible due to preventive controls
• Feasible because of related conditions or constraints, such as resources or timelines
• Advisable
• Permitted because of a law, regulation, policy, standard, agreement, contractual requirement, or compliance requirement.

Do you work with people who take similar verbal shortcuts? Do they insist things can't happen—things you're being told to make happen as part of your compliance strategy? If so, the next time you're told, "No, you can't!", consider going after a bit more detail. Determine whether the thing you need truly can't happen, or whether the truth is that it may not. If you can't do something, it's because of a constraint or a control. Some constraints aren't changing anytime soon; people still can't thrust hands through metal oven doors, and mainframes still can't fit in a shoebox. Other constraints and controls are situational, so you need to evaluate them in context and decide if they can be negotiated. If the truth is that, "Yes, you can, but no, you may not!", then you probably need to know the reason so you can pass it on to your stakeholders when they ask why you can't deliver what they want. It's important to get good answers to your questions. After all, if you're simply trying to satisfy a compliance requirement, the last thing you need is to be accused of being capricious. Z

Gwen Thomas is president of The Data Governance Institute and publisher of its Website at www.DataGovernance.com and its sister site, SOX-online (www.sox-online.com), the vendor-neutral Sarbanes-Oxley site. She has designed and implemented many data governance and compliance programs for publicly traded and private companies across the U.S. and is a frequent presenter at industry events. Author of the book Alpha Males and Data Disasters: The Case for Data Governance, she hosts the Data Governance & Stewardship Community of Practice at www.DataStewardship.com. Email: [email protected]; Website: www.datagovernance.com


If you want to get something done fast, get a friend to help. Usually, two can get a lot more done than one, especially when you're trying to reduce how long something will take to complete. The same applies to DB2's queries.

If you want a query to complete faster, have a bunch of DB2's friends (processors) help out. The more processor friends you get involved, the faster that query will complete. Of course, other factors can influence the impact of letting a bunch of processors attack the same query. DB2's query parallelism targets I/O-intensive queries (i.e., table space scans and large index scans) and Central Processor- (CP-) intensive queries (i.e., joins, sorts, and complex expressions). Its objective is to reduce the overall elapsed time of a query by taking advantage of available I/O bandwidth and processor power. This seemingly easy fix for long query elapsed times is one reason parallelism is considered such a perfect companion to data warehousing. Splitting a query across multiple processors doing multiple concurrent I/Os is among the most straightforward ways of reducing the elapsed time of a long-running query. You can have a query run across two or more general purpose engines and redirect some portion of the


DB2 for z/OS Parallelism
By Willie Favero

work to an IBM System z Integrated Information Processor (zIIP) specialty engine, if available, with no application code changes.

Without parallelism, a query processes data sequentially. If the data is in a partitioned table space, that means processing occurs one partition at a time even though the data is split across partitions, or multiple independent physical data sets. The beauty of partitioning is that it gives the application the ability to process each partition independently. With parallelism turned on, DB2 can process each partition simultaneously, in parallel, using multiple CPs to reduce the overall elapsed time of the query while minimizing the additional CPU overhead for parallelism. Parallelism comes in three different types:

• I/O query parallelism was the first variety of parallelism. It was delivered in DB2 Version 3 and allowed a single CP to process multiple I/Os—fetching multiple pages into the buffer pool in parallel. Today, I/O parallelism is infrequently observed. I/O parallelism is also not zIIP-eligible.

• CPU query parallelism became available in DB2 V4 and is by far the most common form of parallelism DB2 uses. This method breaks down a query into multiple parts; each part runs on a different general purpose processor and zIIP specialty engine if one is enabled. Each processor running its portion of a query can also perform I/O processing in parallel.

• Sysplex query parallelism, introduced with DB2 V5, spreads a query across multiple processors and can take advantage of processors available to other DB2 members of a data sharing group. There are additional DSNZPARMs and buffer pool thresholds that must be set before you can take advantage of Sysplex query parallelism.

DB2 parallelism isn't a given; it isn't "just available" in DB2. You must perform several actions before the optimizer decides to consider parallelism as an access path. First, all three forms of parallelism require that DB2 knows parallelism should be considered for a package or SQL statement. In a dynamic SQL environment (the SQL type most likely used in a data warehousing environment), the special register CURRENT DEGREE is used to enable or disable parallelism. If CURRENT DEGREE is set to '1', parallelism is disabled; it's not available as an access path choice. If CURRENT DEGREE is set to 'ANY', parallelism is enabled. The default value for CURRENT DEGREE is set on the installation panel DSNTIP8 or by the DSNZPARM keyword CDSSRDEF on the DSN6SPRM macro. Whatever value is set as the DB2 subsystem default, the CURRENT DEGREE special register can be modified (overridden) via the SET CURRENT DEGREE SQL statement. For static SQL, a BIND or REBIND of a package specifying the DEGREE keyword sets the degree of parallelism for that package instance. The default for CURRENT DEGREE normally should be set to '1', disabling parallelism; parallelism should be enabled on a per-task basis to ensure valuable CPU resources aren't wasted.

The next parallelism control to consider is the number of CPs DB2 will be allowed to use. MAX DEGREE on the installation panel DSNTIP8, or the DSNZPARM keyword PARAMDEG on the DSN6SPRM macro, can be used to set the maximum number of CPs DB2 can use for CPU query parallelism. The default for this value is zero, which allows DB2 to choose the degree of parallelism. Although using 0 can simplify things, you should take time to determine the best value for MAX DEGREE. A guideline for choosing a starting point is to pick a value somewhere midway between the maximum number of CPs available and the maximum number of partitions that will be processed. If you believe the queries will tend to be more CPU-intensive, make this value closer to the number of CPs you have available. If the queries will be more I/O-intensive, make this number consistent with the number of partitions. Monitor and adjust accordingly. If the default 0 is accepted, be aware that it's possible to get a degree of parallelism up to the 254 maximum. If multiple concurrent queries all get degrees that high, it could open up a whole different set of problems, not least of which could be available storage.

Once parallelism is enabled, VPPSEQT (the parallel sequential threshold for buffer pools) must also be adjusted to some value greater than zero to actually get the optimizer to consider a query for a parallel access path. The number used for the VPPSEQT threshold is a percentage and is 50 percent by default.
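As a concrete illustration of these controls, here is a minimal, hedged sketch; the collection, package, and buffer pool names are hypothetical, and the values are starting points to be tuned rather than recommendations:

   Dynamic SQL (e.g., from SPUFI or an application):
      SET CURRENT DEGREE = 'ANY';

   Static SQL, at bind time:
      REBIND PACKAGE(DWHCOLL.RPTPKG) DEGREE(ANY)

   DB2 command, opening part of a buffer pool to parallel work:
      -ALTER BUFFERPOOL(BP8) VPSEQT(80) VPPSEQT(50)

Remember that the subsystem-wide defaults (CDSSRDEF and PARAMDEG) still apply; the statements above affect only the session, package, and buffer pool they name.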

What Are Ambiguous Cursors?

What exactly are ambiguous cursors? The simplest definition is a cursor whose rows DB2 doesn't know whether it will be asked to update. The bind parameters ISOLATION CS and CURRENTDATA have a lot to do with whether a cursor is considered ambiguous. If CURRENTDATA is set to NO, DB2 doesn't know or care whether the data being referenced remains current or unchanged; even though a row could be read by the cursor, that doesn't mean a DELETE can't sneak in.

There are steps you can take if you don't want a cursor to be ambiguous. The easiest is to use the FOR READ ONLY clause on the cursor, which declares that the cursor is for read use only, so updates can't occur. Setting CURRENTDATA to YES is another way; DB2 will then take the locks necessary to keep the retrieved data current. Cursors with ORDER BY, cursors that access certain types of views, and cursors with joins are also read-only and therefore aren't ambiguous.

—WF
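To illustrate the first technique in the sidebar, here is a minimal embedded-SQL sketch; the table and column names are hypothetical:

   EXEC SQL
     DECLARE ACCT_CSR CURSOR FOR
       SELECT ACCT_ID, ACCT_BALANCE
       FROM   ACCOUNT
       ORDER BY ACCT_ID
       FOR READ ONLY
   END-EXEC.

With FOR READ ONLY (or its synonym FOR FETCH ONLY), the cursor is unambiguously read-only, so DB2 doesn't have to guard against a positioned UPDATE or DELETE through it.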


When specified, VPPSEQT allows a percentage of the sequential steal threshold (VPSEQT) to be used for parallel processing. VPSEQT is a percentage of the VPSIZE (virtual pool size) value; the default for VPSEQT is 80 percent. If VPSIZE is set to a hypothetical value of 100 pages and VPSEQT is set to 80 percent, then 80 pages will be available from this particular pool for sequential processing. If you then set VPPSEQT to 50 (percent), 40 of those VPSEQT pages are available for parallel processing.

Prior to DB2 9, the optimizer picked the lowest-cost sequential plan and then determined if anything in that plan could be run in parallel. As of DB2 9, that lowest cost figure is determined after parallelism has been considered; this is a significant change from previous versions of DB2. The biggest contributor to the degree of parallelism that DB2 will pick is the number of partitions. Nothing influences DB2's parallelism like partitions.

Some actions are required to turn on parallelism. Even if everything is set correctly to make parallelism available in a DB2 subsystem, it's not a foregone conclusion that parallelism will be used. Several factors could still prevent DB2 from selecting parallelism even after it's enabled:

• CPU parallelism isn't considered if only one engine exists. If there's any chance that parallelism could be part of your query execution, configure with at least two engines, including zIIPs. One zIIP and one CP should satisfy the multiple-engine requirement of CPU parallelism.

• CPU parallelism can be disabled using the Resource Limit Facility (RLF). Setting RLFFUNC equal to '4' in the RLF table for a plan, package, or authid prevents CPU parallelism for that object. Sysplex query parallelism can be disabled with RLFFUNC set to '5' and I/O parallelism with '3'. If all types of parallelism are to be disabled, a row must be entered for each of these three values (a hedged sketch follows this list).

• Setting the buffer pool threshold VPPSEQT to 0 at run-time will selectively disable parallelism; VPPSEQT must be set to a number greater than 0 for DB2 to take advantage of CPU parallelism.

• Using a cursor declared WITH HOLD will prevent a query from taking advantage of parallelism.

• An ambiguous cursor could also prevent parallelism.

• Sysplex query parallelism will degrade to CPU parallelism if you use a star join, sparse index, RID access, or IN-list parallelism.
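As a hedged sketch of the RLF control mentioned in the second bullet above, the following assumes a resource limit specification table named SYSADM.DSNRLST01 and a hypothetical collection and package; check the DSNRLSTxx column layout for your DB2 level before using anything like this:

   INSERT INTO SYSADM.DSNRLST01
          (RLFFUNC, AUTHID, RLFCOLLN, RLFPKG)
   VALUES ('4', ' ', 'DWHCOLL', 'RPTPKG');

   -START RLIMIT(01)

The row disables CPU parallelism for the named package; the -START RLIMIT command activates the RLST table whose suffix is given.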

There are a few other DSNZPARMs that might be of interest. These ZPARMs are referred to as opaque or hidden, which means they can't be set or modified via the DB2 installation panels. The first is PTASKROL on the DSN6SYSP macro. This ZPARM rolls up the accounting trace records for parallelism into a single record. Its possible values are YES and NO, with YES being the default. If set to YES, all the parallel task records are rolled up into a single record; less System Management Facility (SMF) data is collected and processing costs are reduced, although some detail is lost. Generally, you should use the default for performance reasons. If you're attempting to diagnose a balancing issue, consider NO to obtain the more detailed individual records; with NO, each parallel child task produces its own accounting trace record.

Next is the hidden DSNZPARM SPRMPTH on the DSN6SPRC macro. This ZPARM can be extremely helpful and is well-documented in numerous DB2 presentations. There's a threshold, by default 120 milliseconds (ms), that a query must reach before actually using parallelism. Because of the initial setup cost of parallelism, it shouldn't be used for short (and quick) running queries; this threshold prevents that from happening. Nothing that runs in less than 120 ms will use parallelism. In some instances, 120 ms is still too low to keep what might be considered "fast running" queries from considering parallelism; for those situations, consider increasing SPRMPTH. While there are situations where you might make this threshold higher than the default, there's no reason to make it any smaller.

PARAPAR1 and OPTOPSE are ZPARMs that no longer exist in DB2 9; OPTOPSE was removed in DB2 V8. The enhancements they delivered are no longer selectable options: the fix OPTOPSE delivered is now always on, and the more aggressive parallel IN-list processing delivered by PARAPAR1 is now part of DB2 9.

There are many sources, including information available in the accounting records and through almost all monitors, to help you determine what parallelism is doing. Additional details are in IFCIDs:

• 221, covering the degree of parallel processing for a parallel group
• 222, covering the elapsed time for a parallel group
• 223, recording when a parallel group completes.

There are also columns in EXPLAIN's PLAN_TABLE and DSN_STATEMENT_CACHE tables with parallelism details, plus two EXPLAIN tables parallelism specifically uses: DSN_PGROUP_TABLE and DSN_PTASK_TABLE. Another reason parallelism can be especially significant is its potential to reduce the cost of doing business on the System z platform. Parallelism breaks a query into multiple parts, each part running under its own Service Request Block (SRB) and performing its own I/O. Although there's additional CPU cost for setup when DB2 first decides to take advantage of query parallelism, there's still a close correlation between the degree of parallelism achieved and the query's elapsed time reduction. The use of SRBs is significant: when DB2 takes advantage of parallelism, parallel child tasks can be redirected to a zIIP, and software charges are unaffected by the additional CPU capacity made available when zIIP processors are added to System z. One of the easiest ways to improve the amount of zIIP redirect is to enable parallelism.

There's also batch work. Taking advantage of DB2's parallelism in your batch jobs could increase the amount of redirect to a zIIP while also using a resource during a time when the usual Distributed Relational Database Architecture (DRDA) redirect is low. Parallelism can be quite valuable, as it can significantly reduce elapsed times for some batch jobs. It can be a game-saver for warehousing, an area that often sees long-running queries that could benefit from the elapsed time improvements sometimes available with parallelism. Parallelism gets a little boost in performance, reporting, and stability with every new version of DB2; DB2 10 will continue that tradition. Z
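A hedged way to see what the optimizer planned is to EXPLAIN the statement and query the parallelism columns of PLAN_TABLE; the query number and the table being explained are hypothetical, and column availability can vary slightly by DB2 version:

   EXPLAIN PLAN SET QUERYNO = 101 FOR
     SELECT ACCOUNT_ID, SUM(CLOSING_PRICE)
     FROM   SECURITY_PRICE
     GROUP BY ACCOUNT_ID;

   SELECT QUERYNO, QBLOCKNO, PLANNO,
          ACCESS_DEGREE, ACCESS_PGROUP_ID, PARALLELISM_MODE
   FROM   PLAN_TABLE
   WHERE  QUERYNO = 101
   ORDER BY QBLOCKNO, PLANNO;

A non-null ACCESS_DEGREE and a PARALLELISM_MODE of 'I', 'C', or 'X' indicate I/O, CPU, or Sysplex query parallelism for that step of the access path.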

Willie Favero is an IBM senior certified IT software specialist and the DB2 SME with IBM's Silicon Valley Lab Data Warehouse on System z Swat Team. He has more than 30 years of experience working with databases and more than 25 years working with DB2. He speaks at major conferences and user groups, publishes articles, and has one of the top technical blogs on the Internet. Email: [email protected]


Mainframe Security

Here, we address two organizational questions affecting mainframe security: “Whose job is it?” and “Where are decisions made?”

Auditors occasionally identify some issue that's important for effective security but hasn't been addressed because no one has been assigned responsibility for it. This can include issues such as how mainframe TCP/IP is secured, which disk data sets are to be encrypted, how we protect sensitive residual data, how system data sets are protected, which powerful programs are secured, whether opening VTAM Application Control Blocks (and the risk of applid spoofing) is to be controlled, how SNA network-to-network connections are secured, etc. (Auditors could help more by identifying the organizational issue behind the security issue.) If it isn't someone's job, it won't get done. Auditors will occasionally criticize the security administrator for various security settings when the administrator doesn't have the authority, much less the technical knowledge, to make the decision. Managers can address this by ensuring that standards and policy identify who is responsible for these three functions:

• Decide (how each security option should be set)
• Execute (implement these decisions)
• Review (ensure each decision is carried out effectively).

In the February/March "Mainframe Security" column titled "Laying the Security Groundwork," I examined six categories of questions: access to the system, access to data sets and resources, access to the network (both TCP/IP and SNA), operating system protection, organizational issues, and dealing with auditors. Note where each function falls in the organization. The Execute function often falls to the security administrator. Consider his approach to protecting, for example, the payroll application: Everyone knows the Decide function should be performed by someone, such as the head of the payroll department, who understands the associated business risks. (The degree to which this is performed by someone who best understands the related risks is an important component of the quality of information security.) The security administrator may consider whether to use encryption to protect the data (on tape, disk, or in the network), but may not have the means to make this happen. He may also be considering whether to use features (such as Erase-On-Scratch in RACF or AUTO-ERASE in ACF2 or Top Secret) to protect sensitive payroll data. But this is only possible if he knows which, if any, payroll data sets are considered sensitive, and that may depend on which laws and regulations apply. Who knows the laws and regulations? The legal and regulatory

compliance departments elsewhere in the organization. Managers can improve information security by ensuring each application has a formal risk assessment specifying which data sets are considered sensitive or confidential, which laws and regulations apply, and what security measures the administrator should be taking to appropriately protect the data sets. (This would be a good place to document records retention requirements, too, since legal and regulatory compliance have the knowledge to specify these.) So, to ensure your organization provides effective support for information security, ask yourself how true each of the following is, and make improvements as you see fit:

• Is someone clearly responsible for the Decide, Execute, and Review functions for each of these areas: granting and revoking a userid, resetting a password, securing each path into the system, securing each TCP/IP port and IP address, granting access to data sets (on tape, disk, and the print queue, as well as on other platforms), protecting residual data on tape and disk, evaluating whether and how to use each resource class in RACF and Top Secret/type in ACF2, protecting USS files, and securing each TCP/IP daemon? Often, the security administrator shouldn’t be the one to decide.

• As a test of the previous bullet, is there clear documentation available of the decisions whether and how to use these resource classes: JESSPOOL, OPERCMDS, VTAMAPPL, SERVAUTH? Is there someone clearly responsible for these decisions?

• Does your security administrator believe after reading this column he has all the information and authority he needs?

• Is Kerberos installed on the Windows networks used to log onto the mainframe (to prevent sniffer programs from learning everyone’s mainframe userid and password)?

• Does each application have a formal risk assessment reviewed and updated at least every other year by the business owner and by legal and regulatory compliance?

• Does the Decide function result in a written approval that an auditor could use as a standard to evaluate the security software rules? (This is the Review function. If auditors don’t have a written approval or other standard to compare the rules to, they’re likely to use their checklists or personal opinion, resulting in subjective audit findings.) Z

Stu Henderson provides IS consulting and training. His Website provides articles, newsletters, and useful links for management, security staff, and auditors, including the "RACF User News" and the "Mainframe Audit News." He teaches security and audit seminars nationwide and in-house. Email: [email protected]; Website: www.stuhenderson.com

Organizational Questions Affecting Mainframe Security
By Stu Henderson


Among the stated objectives of WebSphere MQ (WMQ) is assured delivery, once and only once, but there are instances when messages can't be delivered to intended recipients. Causes for this include:

• Non-existing recipient (no queue defined to the queue manager)

• Incorrect recipient (queue name misspelled)

• Recipient’s mailbox full (max queue depth reached)

• Recipient unavailable—Open Transaction Manager Access (OTMA) is unable to deliver a message to an IMS queue.

WMQ has provided for these occurrences through the Dead Letter Queue (DLQ), the repository of almost last resort that's similar to a post office's dead letter office. Messages landing in that queue require special, usually manual, attention.

Murky Policies
There are tools that can be used to delete, copy, and move messages around, but the policies surrounding their use are often murky. Are the contents of the messages private, personal, secret, or otherwise restricted from general viewing? Could the contents be subject to Sarbanes-Oxley (SOX) or other legal regulations? With only one DLQ per subsystem, how do you manage it without breaking any of the unique rules that may exist around the multitude of different messages? Just rerouting the message automatically to its intended destination is insufficient. If the message was a request-reply across platforms, the sender may no longer be waiting for the reply, which may then land in the DLQ of the other queue manager. Obviously, a more comprehensive solution is required.

Message data is "owned" by a business project, just like file records are "owned" by a business project. During creation of that project, the ownership of the message data must be established at a level that can make decisions on disposition of the messages if they land in the DLQ. That level also must be established so it will be unaffected by personnel changes or departmental reorganizations. If the data is sent by one project, processed by a second, and delivered to a third, management becomes more complex, especially if one of those projects is in a different company. Often, decisions made during development to get the project started are carried forward to production. How many developers really have the authority to decide if it's legal to allow some stranger access to DLQ messages containing a customer's unencrypted financial information? Without responsible message ownership, it's impossible to establish a DLQ management process that will satisfy the rules to which the message content may be subject.

There's an implied assumption that the message data owner can be linked to a specific message or type of message. This can be by "any message destined for a particular queue," "any message that will execute a specific transaction," "any message with a specific origin," "any message with a particular data string in a certain location," or some other identification method. It may seem obvious, but, for example, with publish-subscribe (pub-sub) messages, the same data may have several owners who have different requirements. Even though WMQ is a time-independent process, the message sender may require timely notification of a break or delay in the delivery path. Stopping the sender to prevent additional messages could help recovery and cleanup when the problem is resolved.

Root Cause Determination
The options are to discard the message, retry delivery, or redirect the message. If the message can be discarded, the simplest method is to set an expiry period and let a scavenger program remove the messages. But even then, you should try to determine the reason the messages landed in the DLQ. If delivery is to be retried, further analysis is required; retrying delivery is only useful after the receiving process is restored. Redirecting a message is useful as an interim step preceding redelivery, as a means to clear the DLQ while recovery is under way, or as a means to give the owner access to the message to assist in problem determination and ultimate restoration of service and data recovery. However, available resources may not permit alternate queues for each application, and security and change procedures may not permit "on the fly" creation of production objects for message redirection. What's acceptable at one shop may not fly at another.

The most important piece of information is the reason the message was placed in the DLQ. Anyone who has administered a WMQ system for any significant time knows that Murphy can be at his most creative here. We once carefully calculated the requirements for a new batch application's initial feed to a CICS processing application and added a 10 percent safety margin. When the job ran, the pageset behind the queue quickly filled up, and half the messages landed in the DLQ. The investigation uncovered that the developer used the wrong copy member for the message—one that was three times longer than anyone was told—and, of course, used the COBOL "LENGTH OF" special register specification. It escaped notice in three levels of testing.

There's also no reason why all the messages in the DLQ would be from the same source or have the same target. Consider messages designated for stopped IMS transactions using the OTMA-IMS Bridge. Unable to deliver the messages to the IMS input queues, OTMA will send all of them to the same DLQ with the notoriously generic code 00000146, 'OTMA X'1A' IMS detected error.' The immediate cause of the DLQ messages may not be the root cause of the problem. You should explore the following questions:


By Ronald Weinger

Managing the WebSphere MQ Dead Letter Queue

• Was the “objectname unknown” because someone typed it incorrectly? Who is checking the administrators? Or was it unknown because someone deleted it? Where is the security and oversight?

• Was the “not authorized” error because the security request was still in the pipeline or did someone try to hack into the system?

• Did the target queue fill up because it was sized incorrectly or did the application removing the messages fail?

• Were the submitters aware there’s an offline period for IMS transactions and databases and they shouldn’t be sending to them during that period?

The amount of time available to make these determinations depends on the business impact. No one really wants to do an impact analysis while the messages are sitting in the DLQ; that should have been done during the application development phase. When the pageset full problem occurred, we knew the receiving CICS application was able to handle the larger messages without any ill effects. This enabled us to move the messages back to the original target queue in stages, avoiding filling the pageset again and permitting normal business. The program was later corrected with the proper copy member. If the larger messages couldn't have been safely accommodated, resending them to the original target wouldn't have been possible. A side issue this uncovered was the danger of backing the DLQ with a pageset used by other queues; if possible, the DLQ should also have its own bufferpool.

Many companies don't have dedicated 24x7 production support teams for every application. Even though messages belong to a business group, an operations team with only general knowledge is often the only off-hours, first-level support available. For this reason, a failure and impact analysis, which should be part of the development stage, must include managing dead letter messages, with the results included in the production support documentation. This isn't such a tremendous undertaking.

Document and Automate
You need to document the following:

• The source and the destination of messages: If a problem exists or is imminent, can the sender of the messages be stopped?

• The timeliness of the messages: Can the messages be expired without intervention?

• The resources needed to process the messages: When can a retry be initiated? Restoration of Service (ROS) activity for a failed application, platform, or service (a different topic) must be completed first.

• The escalation process: At what point do data specialists get involved, and how are they contacted?

With that information, handling DLQ messages could become an automated process. IBM provides a dead letter handler utility, CSQUDLQH, described in the System Administration Guide; a home-grown equivalent could be created for platforms that don't have one. This utility performs an action, such as retrying or forwarding messages in the DLQ, based on a rules-matching process against fields in the Dead Letter Header (DLH) and Message Descriptor (MQMD); a hedged sketch of such a rules table appears at the end of this section. Since putting messages in the DLQ could be considered a processing failure, a supplemental, home-grown utility could be written to provide reports for audit and additional data gathering. We found such a utility useful in identifying the source of messages that expired before someone could manually review them. It provides the information necessary to identify the messages while maintaining data confidentiality. Figure 1 shows one such sample report.

When the messages are identified, the appropriate documentation can be referenced on what action to take. If the messages will expire, no action is needed. Other actions can be programmed into the DLQ handler as necessary. Such software would be useful, and may even be necessary, for identifying the "needle in the haystack" scenario where a single critical message is buried in the queue among thousands of less-critical messages. The software, automatically initiated by a trigger process, could rapidly browse the messages, categorizing them based on previously defined rules and identifying those it doesn't recognize. First-level alerts should go to a continuously monitored console at the previously agreed-upon criticality. Shops that have many batch processes usually have skilled support teams to handle failures. A similar, or even the same, team can handle DLQ alerts and initiate escalation procedures for time-sensitive issues.
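To make the rules-matching idea concrete, here is a minimal, hedged sketch of a CSQUDLQH rules table; the queue manager and queue names are hypothetical, and the exact keywords and reason codes should be checked against your WMQ level before use:

   INPUTQM(QM01) INPUTQ(QM01.DEAD.QUEUE) RETRYINT(60) WAIT(YES)
   REASON(MQRC_Q_FULL) ACTION(RETRY) RETRY(5)
   REASON(MQRC_UNKNOWN_OBJECT_NAME) ACTION(FWD) FWDQ(ADMIN.DLQ.REVIEW) HEADER(YES)
   ACTION(IGNORE)

The first line sets the handler's input queue and retry interval; the next two retry queue-full conditions and forward unknown-queue messages to an administrative review queue, keeping the dead letter header for diagnosis; the final catch-all rule leaves everything else on the DLQ for manual review.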

Conclusion
Some administrators panic when messages are directed to the DLQ. The DLQ is just another system object that needs to be managed, like any other object. By setting up a process and requiring adherence to the input parameters, managing messages in the DLQ can become a simple, even automated, task. Z

Ronald Weinger is a senior systems engineer for a major New York City metropolitan area company and has more than 10 years of application development experience.


Figure 1: DLQ Message Sample Report

After the events of 9/11, governments everywhere began to reconsider their Disaster Recovery (DR) requirements for "critical" organizations. Prior to 9/11, most companies employed dual-site DR planning, where IT operations could continue when a single data center went down by transferring activity to another site located nearby. After 9/11, critical organizations were asked to guard against region-wide disasters spanning as much as 350 miles.

Regionwide DR Solutions
To sustain operations in the event of such vast disasters with minimal data loss, a company would need three data centers—two located close to one another, with the third outside the region defined by the other two. In one scenario, the primary and secondary data centers within the region synchronously replicate data between themselves, while the primary asynchronously replicates data to the remote, or tertiary, out-of-region data center. If disaster struck the primary site, activity could continue at the secondary data center, which would also take over asynchronous replication to the tertiary data center.

Another alternative is cascaded replication, where the secondary site asynchronously replicates data to the tertiary site from the start. Since more often than not primary storage is down while processing elements remain operational, with cascaded replication a company may only need storage at the secondary location; storage access swaps to the secondary site with minimal downtime, and if the primary site fails entirely, work moves to the tertiary site. A fully redundant, three-way DR solution carries significant costs, but having only storage at the secondary site can be more economical.

The IBM System z presents multiple asynchronous replication alternatives, such as z/OS XRC, as well as proprietary storage subsystem-based solutions such as those from EMC and HDS. Unlike proprietary vendor storage solutions, XRC can replicate to IBM, EMC, or HDS storage, but it consumes CPU processing resources, only supports CKD disk, and doesn't support cascaded replication. Similarly, IBM, EMC, and HDS also support their own proprietary synchronous replication solutions. In addition, EMC and HDS license IBM's synchronous replication facility, IBM Metro Mirror, and as such can supply IBM-compatible services from EMC-to-EMC or HDS-to-HDS storage.

IBM's GDPS Three-Way DR Solutions
To facilitate the newly expanded government DR mandates, IBM released its three-way DR solutions with GDPS version 3.3 in 2006. IBM supports Metro Mirror combined with either z/OS XRC or its subsystem-based Global Mirror asynchronous replication capability in two three-site configurations, called GDPS MzGM and GDPS MGM, respectively. With more than 500 two- and three-site installations (based on stats from IBM), GDPS also provides many features to ease business continuity and DR, specifically:

• HyperSwap Manager swaps primary site processing to use secondary site storage.

• Consistency groups join volumes and/or Logical Unit Numbers (LUNs) to conserve update sequences across sites and to initiate recovery for any volume or LUN failure.

• Run-book automation provides a script repository and automates script execution to restart operations.

EMC Three-Way DR Solutions
EMC created GDDR, its replacement for GDPS, to take advantage of the company's proprietary SRDF replication services. EMC can also operate in a fully GDPS MzGM-compatible mode with Metro Mirror synchronous and XRC asynchronous replication. GDDR provides AutoSwap and ConGroup features similar to IBM's GDPS HyperSwap and consistency groups, and it supports both cascaded and non-cascaded asynchronous replication to the tertiary data center. Aside from SRDF, the other major benefit of EMC's GDDR is its run-book expert system, which makes defining and maintaining run-book scripts considerably easier than using GDPS.

HDS Three-Way DR Solutions
Like EMC, HDS supports both GDPS MzGM-compatible operations and its own proprietary replication solutions. However, HDS' proprietary solution only operates with GDPS, using the HDS Universal Replicator (HUR) and Business Continuity Manager (BCM). Using these facilities, HDS supports both cascaded and non-cascaded asynchronous replication to the tertiary site.

Summary
It sometimes takes a disaster to show the flaws in one's recovery plans. The events of 9/11, although tragic, revealed that some catastrophes can impact multiple localities. As a result, IBM, EMC, and HDS have all responded with three-way solutions that provide automated recovery for region-spanning disasters. Such capabilities can help organizations sustain operations whenever the next wide-ranging calamity occurs. Z

Ray Lucchesi is president of Silverton Consulting and has worked in data storage for more than 30 years. He helps both Fortune 500 and start-up storage vendors develop and market their products, and data centers better manage and optimize their data storage systems. Ray blogs on storage, strategy, and system topics at RayonStorage.com and can be followed on Twitter @Raylucchesi. Email: [email protected]; Website: www.silvertonconsulting.com

Three-Way Disaster Recovery Solutions for System z


Storage & Data Management
By Ray Lucchesi

Setting up secure, protected servers and network environments has become vital as companies seek to comply with regulations from governments and other organizations to implement specific levels of security compliance, including protection of network data traffic.


By Manfred Gnirss, Ph.D.


Common access methods for Linux servers such as Telnet, or other protocols for file transfer such as File Transfer Protocol (FTP), aren't adequate in Internet environments and internal company networks because sensitive information such as passwords is transferred over the network in the clear. Using Secure Shell (SSH), Secure Copy (SCP), and Secure File Transfer Protocol (SFTP) increases security because encryption protects both passwords and data. Although data encryption is expensive and can severely impact system performance, the IBM System z provides hardware encryption facilities that can reduce these adverse effects.

OpenSSH is a free version of the SSH connectivity tools included in the Linux on System z distributions. Starting with OpenSSH version 4.4, OpenSSL dynamic engine loading is supported; it lets OpenSSH benefit from System z cryptographic hardware support if a specific option is used during the OpenSSH package build. This article describes experiences with such an OpenSSH package and demonstrates the value of hardware encryption support for SSH sessions and SCP file transfers. In our testing, we used a System z10 Enterprise Class (EC) with Novell SUSE Linux Enterprise Server (SLES) 11 for IBM System z Service Pack 1 (SP1), as this version of Linux contains the updated OpenSSH package.

Hardware Cryptographic Support of System z
System z provides two different types of hardware support for cryptographic operations: Central Processor Assist for Cryptographic Function (CPACF) and Crypto Express features. The first type, CPACF, is incorporated into the central processors shipped with the System z and was introduced with the System z990 and System z890. CPACF supports several algorithms:

• Data Encryption Standard (DES) and Triple DES (TDES)

• Hashing algorithms SHA-1 and SHA-256

• Advanced Encryption Standard (AES) with a key length of 128; with the z10, AES-192 and AES-256

• Pseudo Random Number Generator (PRNG).

CPACF algorithms execute synchronously and are for clear key operation (i.e., the calling application provides cryptographic keys in unencrypted format).

The second type uses installable Crypto Express features, either Crypto Express2 (z9 and z10 up to GA2) or Crypto Express3 (z10 GA3). A Crypto Express feature can be configured either as a cryptographic accelerator to perform clear key RSA operations at high speed or as a cryptographic coprocessor to perform symmetric and asymmetric

Figure 3: Linux on System z Environment for Hardware Cryptographic Support for OpenSSH

Figure 2: Encryption Algorithms Supported in CPACF of System z10

gnirss@tmcc123-180:/usr/lib64> icainfo
The following CP Assist for Cryptographic Function (CPACF) operations are supported by libica on this system:
SHA-1: yes
SHA-256: yes
SHA-512: yes
DES: yes
TDES-128: yes
TDES-192: yes
AES-128: yes
AES-192: yes
AES-256: yes
PRNG: yes

Figure 1: Feature Code 3863 (CPACF) Installed

operations (RSA) in clear key and secure key mode. Crypto Express operations are performed asynchronously outside the central processor; work is partially off-loaded, and CPU cycles for cryptographic operations are reduced compared with the equivalent execution of a pure software implementation.

To benefit from CPACF, you must install the free-of-charge LIC internal feature 3863 (Crypto Enablement); this isn't installed on the machine by default at delivery, but it can be installed later without disrupting system operation. There's no downside to installing CPACF, so it's recommended for all systems. To verify that CPACF is enabled, use the Support Element (SE) or Hardware Management Console (HMC) and look for "CP Assist for Crypto functions: Installed" in the CPC details panel (see Figure 1). Or, if you're a Linux on System z user, you can easily check whether the feature is installed and which algorithms are supported: the icainfo command of the libica library displays which CPACF functions are supported (see Figure 2).

Configuring the Crypto Express Feature for Linux on System z
The Crypto Express2 and Crypto Express3 features add hardware support for SSH session "handshake" acceleration. To benefit from this, you must enable Linux guests under z/VM for access to Crypto Express. The Logical Partition (LPAR) activation profile must contain at least one processor of a Crypto Express feature in the cryptographic candidate list and at least one usage domain index. Starting with System z10, LPAR activation profiles can be modified without deactivating the LPAR.
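Once the guest is enabled, a quick sanity check from inside Linux is the lszcrypt command from the s390-tools package; this is only an illustrative check (output format differs between tools levels), not a configuration step described in the text:

   lszcrypt

If no Crypto Express card (for example, a CEX2A or CEX2C device) is listed, revisit the LPAR cryptographic candidate list and the z/VM directory entry for the guest before looking elsewhere.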

Linux Cryptographic Architecture Used by OpenSSH
OpenSSH uses OpenSSL to perform cryptographic operations. If the OpenSSH package is built using the "--with-ssl-engine" option, the OpenSSL library will use the ibmca cryptographic engine, if installed, to perform encryption operations. The ibmca engine uses the libica library to handle the requests; this library is aware of which algorithms the available hardware supports and passes requests to the hardware as appropriate, instead of performing the operation in software. Running under z/VM has no impact on the cryptographic architecture inside the Linux server other than requiring access to installed Crypto Express hardware via the z/VM directory for the Linux guest. Figure 3 shows the architecture.

Enabling OpenSSH hardware cryptographic support from Linux on System z requires installing the following software and driver packages. These are all part of the Linux on System z distribution and may be installed by default:

• openssh
• openssl
• openssl-ibmca
• libica
• z90crypt

Preparing OpenSSL for ibmca Engine Use
The OpenSSH package shipped with SLES 11 SP1 automatically uses hardware cryptographic support if OpenSSL is configured for dynamic loading of the ibmca engine. SLES 11 SP1 ships with dynamic engine support disabled. To enable it, you must modify the OpenSSL configuration file after installing all required software packages. A sample openssl.cnf.sample file is provided with the openssl-ibmca package and contains the settings needed to enable ibmca. To customize OpenSSL to enable ibmca:

1. Make a backup copy of openssl.cnf.
2. Append the content of the sample file /usr/share/doc/packages/openssl-ibmca/openssl.cnf.sample to the existing openssl.cnf.
3. Move the line "openssl_conf = openssl_def" from the appended part to the top of the configuration file. The new configuration file should resemble Figure 4.
4. Check the value of the "dynamic_path" variable and change it as necessary to the correct path for libibmca.so (this path varies, depending on the Linux distribution in use).
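A hedged shell rendering of steps 1 and 2 (paths as shipped with SLES 11 SP1; they may differ on other distributions):

   cp /etc/ssl/openssl.cnf /etc/ssl/openssl.cnf.orig
   cat /usr/share/doc/packages/openssl-ibmca/openssl.cnf.sample >> /etc/ssl/openssl.cnf

Step 3 is then a manual edit that moves the "openssl_conf = openssl_def" line from the appended section to the top of /etc/ssl/openssl.cnf.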

To verify whether dynamic engine support for ibmca is enabled, use the "openssl engine -c" command as shown in Figure 5, which shows the ibmca status and a list of supported algorithms. To disable dynamic engine loading of ibmca, comment out the "openssl_conf = openssl_def" line at the top of openssl.cnf. Once dynamic


Figure 5: All Supported Algorithms of the ibmca Engine

gnirss@tmcc-123-180:/etc/ssl> openssl engine -c
(dynamic) Dynamic engine loading support
(ibmca) Ibmca hardware engine support
 [RSA, DSA, DH, RAND, DES-ECB, DES-CBC, DES-EDE3, DES-EDE3-CBC, AES-128-ECB, AES-128-CBC, AES-192-ECB, AES-192-CBC, AES-256-ECB, AES-256-CBC, SHA1, SHA256]

Figure 4: Configuration File /etc/ssl/openssl.cnf With Dynamic Engine Loading for ibmca

# OpenSSL example configuration file.
# This is mostly being used for generation of certificate requests.
# This definition stops the following lines choking if HOME isn't defined.
HOME = .
RANDFILE = $ENV::HOME/.rnd
# --- next line: enable dynamic engine ibmca ---
openssl_conf = openssl_def
# Extra OBJECT IDENTIFIER info:
#oid_file = $ENV::HOME/.oid
oid_section = new_oids
--- some lines not displayed ---
# This really needs to be in place for it to be a proxy certificate.
proxyCertInfo=critical,language:id-ppl-anyLanguage,pathlen:3,policy:foo
# --- here starts the appended information to enable dynamic engine ibmca ---
# OpenSSL example configuration file. This file will load the IBMCA engine
# for all operations that the IBMCA engine implements for all apps that
# have OpenSSL config support compiled into them.
# Adding OpenSSL config support is as simple as adding the following line to the app:
# #define OPENSSL_LOAD_CONF 1
# --- next line kept here only as comment, while moved to top of file ---
#openssl_conf = openssl_def

[openssl_def]
engines = engine_section

[engine_section]
foo = ibmca_section

[ibmca_section]
dynamic_path = /usr/lib64/engines/libibmca.so
engine_id = ibmca
default_algorithms = ALL
#default_algorithms = RAND,RSA
init = 1


engine loading of the ibmca engine is enabled, any OpenSSH activity will automatically use any available hardware cryptographic support.

Verify CPACF Usage
To verify whether CPACF is used for a Linux server on System z, use the icastats tool, which shows the number of encryption requests handled by the libica library. It distinguishes between requests CPACF executes and those executed by software fallback. Figure 6 shows an example of icastats output. Here, CPACF executed AES and SHA requests; since no Crypto Express was available, RSA handshakes were executed in software.

Using OpenSSH With Hardware Crypto Support
Transferring large files using SCP shows the influence of OpenSSL's dynamic engine loading support on OpenSSH, as SCP uses OpenSSH under the covers. Transferring data from a PC to the host over the network—which was complex, with many hops between the PC and the host—showed no throughput increase from enabling ibmca, but host CPU usage during the transfer dropped dramatically, from about 70 percent to 15 percent. This suggested a network bottleneck; repeating the test using SCP within the Linux host, using "localhost" as the target address and /dev/null as the output, increased throughput when ibmca was enabled.

Since CPACF doesn't support all ciphers and Message Authentication Codes (MACs), it's important to select appropriate ciphers and MACs; SCP allows specification of both. We performed more tests using various ciphers: Triple DES, AES-128, AES-192, and AES-256, all of which benefit from CPACF. The MD5 MAC is always executed in software on System z, since CPACF doesn't support it, whereas SHA-1 can also benefit from CPACF. Figure 7 shows the results of operations with and without dynamic engine loading support active, using TDES and AES ciphers and MD5 and SHA-1 MACs. The user time of the scp command execution is an indication of the CPU cycles consumed. It's clear that CPACF dramatically decreases CPU consumption for TDES and AES. MD5 is the faster MAC in software, but with CPACF, SHA-1 contributes another 30 percent gain. Using CPACF clearly frees up cycles that

can be used for other workloads.
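For reference, a hedged version of the in-host test just described could look like this; the file name is hypothetical, and the cipher and MAC identifiers should be checked against those your OpenSSH level accepts (see the ssh_config man page):

   scp -c aes128-cbc -o MACs=hmac-sha1 testfile.bin localhost:/dev/null
   icastats

Comparing icastats counters before and after the transfer shows whether the AES and SHA-1 work was actually handled by CPACF rather than by software fallback.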

Determine Ciphers/MACs Via Profile
Most SSH or SCP users won't explicitly specify the cipher or MAC, so it's worth modifying the configuration files for the Linux SSH client (ssh_config) and SSHD server (sshd_config) to default to algorithms that benefit from hardware crypto support. The Ciphers and MACs keywords let you place the algorithms that benefit from hardware at the top of the search order: AES or TDES as the top cipher, and SHA as the top MAC. If the hardware is a System z9, choose TDES or AES-128 rather than AES-192 or AES-256, because the latter two aren't supported by CPACF until the z10.
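A hedged sketch of the corresponding ssh_config (and, analogously, sshd_config) entries follows; adapt the algorithm lists to what your OpenSSH build and CPACF level actually support:

   Ciphers aes128-cbc,aes192-cbc,aes256-cbc,3des-cbc
   MACs hmac-sha1,hmac-md5

Because client and server negotiate the first mutually supported entry, listing the CPACF-eligible algorithms first is usually enough to steer most sessions onto the hardware path.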

Influence of Crypto Express
Having a Crypto Express feature helps during session initialization for asymmetric RSA requests. The SCP test represents a relatively long-running session with only one RSA handshake, so the effect of an active Crypto Express feature was minimal (a difference of 0.01 seconds of user time). The benefit of a Crypto Express feature for OpenSSH is greater if a high number of short-running sessions are established simultaneously, as is common with some Web applications.

Summary
Using hardware encryption support in combination with OpenSSH can save significant CPU cycles, leading to better performance. At a minimum, CPU load will decrease for encryption workloads. Tests indicated a user time reduction for TDES by a factor of 15 and for AES by a factor of 11. Selecting SHA as the MAC was also beneficial. Use of CPACF is free of charge, and its enablement is strongly encouraged. If a Crypto Express feature is available, make it available to the Linux systems that can benefit from it. Z

Acknowledgement
Special thanks to Ulrich Buch, Thomas Hanicke, Ulrich Mayer, Winfried Münch, Peter Spera, Arwed Tschoeke, Klaus Werner, and Arthur Winterling for their input and discussions.

Manfred Gnirss is a senior IT specialist at the IBM Technical Marketing Competence Center (TMCC) Europe. He holds a Ph.D. in theoretical physics from the University of Tuebingen, Germany. Before joining the TMCC in 2000, he worked in z/VM and z/OS development for more than 12 years. At the TMCC, he's involved in several Linux on System z proof-of-concept and customer projects. Email: [email protected]

Figure 7: Results of Operation With and Without Dynamic Engine Loading Support Active

Figure 6: Verification of CPACF Usage With icastats While SCP Is Doing Data Transfer

gnirss@tmcc-123-180:/usr/lib64> icastats
 function | # hardware | # software
----------+------------+------------
     SHA1 |        467 |          0
   SHA224 |          0 |          0
   SHA256 |         40 |          0
   SHA384 |          0 |          0
   SHA512 |          0 |          0
   RANDOM |          3 |          0
 MOD EXPO |          0 |          5
  RSA CRT |          0 |          1
  DES ENC |          0 |          0
  DES DEC |          0 |          0
 3DES ENC |          0 |          0
 3DES DEC |          0 |          0
  AES ENC |      14248 |          0
  AES DEC |      28496 |          0


Ad Index

Company                               Website                                        Page
BMC Software                          www.bmc.com                                    5, 35, 37, 39
Bus-Tech                              www.bustech.com                                31
CA Technologies                       www.ca.com                                     IFC, 1
Chicago-Soft Ltd.                     www.quickref.com                               53, 63
Compuware                             www.compuware.com                              15
Dino-Software                         www.dino-software.com                          27, 59, IBC
Dovetailed Technologies               www.dovetail.com                               51
IBM System z Technical University     www.ibm.com/systems/services/labservices      43
Illustro Systems                      www.illustro.com                               41
Innovation Data Processing            www.innovationdp.fdr.com                       25, BC
Jolly Giant Software                  www.jollygiant.com                             49
Luminex Software                      www.luminex.com                                23, 33
Mackinney Systems                     www.mackinney.com                              11, 17
Responsive Systems                    www.responsivesystems.com                      3
SHARE                                 www.share.org                                  47
Software Diversified Services         www.sdsusa.com                                 13
Trusted Computer Solutions            www.trustedcs.com                              19
Velocity Software                     www.velocitysoftware.com                       7

IT Sense
Jon William Toigo

Teachable Moments From the Gulf Oil Spill

While I'm hopeful that, by the time this column goes to press, some resolution will have been found for the broken well head that's currently spewing tens of millions of gallons of crude oil into the Gulf of Mexico each day, I can't help but see some compelling parallels between the evolution of this disaster and the state of today's data center. Perhaps we can learn from this tragedy and improve our own thinking about disaster prevention in the digital world.

First, there's the decision-making that led to the catastrophe aboard the Deepwater Horizon rig. Consensus is growing that some poor choices were made regarding the safety of the capping processes that led to the disaster. These are characterized as shortcuts made by BP managers in the quest for profit, a view reinforced by evidence that the engineers' concerns about the sealing of the well were overridden by business managers who were focused on schedules and production efficiencies. This interpretation aligns with our general understanding of market dynamics that compel businesses to do whatever they can to reduce costs and improve profits. "Cost containment" and "top-line growth" are two of the three components of Harvard Business Review's triangular metaphor for business value; the third component is "risk reduction." In this case, managers, perhaps without even thinking about broader consequences, appear to have preferred improved operational efficiency (top-line growth) to safety (risk reduction). They may well have viewed the likelihood of a calamity as so small that they didn't adequately weigh the consequences of a low-probability blowout.

The parallel to corporate IT today: In many organizations, those responsible for business continuity and disaster recovery planning have been shown the door in an effort to trim labor costs. Their perceived value to the organization is diminished by the statistic, advanced by Gartner and others, that less than 5 percent of data center outages come in the form of catastrophic disasters. In other words, companies that are doing without any sort of continuity planning capability are, like their counterparts at BP, adopting a risk posture based on a preference for cost containment and improved profit, ignoring the big consequences of a low-probability event: a smoke-and-rubble disaster or a severe weather event. This is as understandable as it is shortsighted. If and when a big disaster happens, companies lacking current business continuity plans and logistics stand to lose everything. In addition to the impact on shareholder value, the lack of a continuity capability will harm a company's supply chain partnerships, damage its customers, and disrupt the lives of employees and their families.

A second teaching point from the oil rig disaster: vendor hubris. Undoubtedly, BP was told its blowout preventer was "fail-safe." This is similar to the claims made by just about every technology vendor regarding its system or storage array, its hypervisor, or its software process for de-duplicating, compressing, or otherwise manipulating a company's most irreplaceable data asset. IBM used to claim that Big Iron was bulletproof. Even if there was a fire, power cutoff followed by sprinklers would resolve the immediate crisis. To my knowledge, such claims are no longer made by Big Blue, but I hear echoes of the same hubris when VMware talks about its site recovery manager, or when Microsoft or Oracle talk about high-availability failover clusters. It gets my back up when I hear software folks saying that nothing could possibly go wrong in their wares, despite the fact that most software ships to market while it is only 80 percent complete. As one developer recently said, "If you don't get a ton of complaints back over your version 1.0 release, you shipped it too late."

Finally, we have the issue of poor regulation helping to set the stage for the oil spill disaster. Summarized, toothless regulation reduced the emphasis on risk mitigation. Concerns about ongoing oversight and the consequences of non-compliance were insufficient to encourage good safety practices. In too many business organizations today, we see the same lackadaisical attitude toward disjointed and confusing regulatory requirements for data protection, preservation, and stewardship. This varies from firm to firm and by industry segment, but that makes the central point: Absent a coherent set of best practices that non-technical auditors can grasp, data today is at high risk. Tape is treated as a dead or dying technology. Companies are seriously considering cloud woo. Plans for migrating workloads off mainframes and onto x86 virtual servers, while on hold in some firms, are still in the offing. All of these "strategies" ignore the core issues of how to platform data safely and securely, yet they raise no red flags in the corporate governance/risk/compliance office. This is scary, given what we're seeing in the Gulf of Mexico today. Z

Jon William Toigo is a 25-year veteran of IT and the author of 13 books. He also is CEO and managing principal partner of Toigo Partners International, an analysis and consulting firm serving the technology consumer in seven countries. Email: [email protected]; Website: www.it-sense.org


BACKUP DATA TRAVELS ON FICON CHANNELS…NOT YOUR TCP/IP LINKS

VISIT US AT: SHARE TECHNOLOGY EXCHANGE • BOOTH #202 • AUGUST 2 - 4, 2010 • BOSTON, MA

CORPORATE HEADQUARTERS: 275 Paterson Ave., Little Falls, NJ 07424 • (973) 890-7300 • Fax: (973) 890-7147 • E-mail: [email protected][email protected] • http://www.innovationdp.fdr.com

EUROPEAN OFFICES: FRANCE 01-49-69-94-02 • GERMANY 089-489-0210 • NETHERLANDS 036-534-1660 • UNITED KINGDOM 0208-905-1266 • NORDIC COUNTRIES +31-36-534-1660

FDRSOS…is an easy-to-use, high-speed, cross-platform, Open Systems disk backup and rapid recovery…Data Travels on FICON Channels, not your TCP/IP Network.

No longer will backups…take too long…congest networks…or leave you worrying about the prospects of a reliable recovery!

❂ Take distributed data backup off communication networks.

❂ Eliminate the need for distributed backup servers.

❂ Ensure backup no longer constrains production.

❂ Empower Open Systems backup with System z RAS.

❂ Employ your existing mainframe tape management & security.

FDRSOS…and FlashCopy or TimeFinder let you keep critical and revenue-generating applications online without compromising information security.

FDRSOS…no other solution has the capability of using high-performance System z FICON channels to directly read and write the same disk volumes used by the Open Systems business applications on DS8700 and V-Max storage.

FDRSOS…provides the ultimate in z/OS distributed data protection and disaster recovery for Linux on System z, AIX, Linux x86-64, NetWare, OES2, UNIX and Windows.

FDRSOS…can help you consolidate hundreds of distributed Open Systems backup servers onto one z/OS backup server running on System z.

For More Information and a No-Obligation FREE Trial contact: 973-890-7300, [email protected] or visit: www.innovationdp.fdr.com/sos

[Diagram: Distributed Data Disaster Recovery Protection. Backup data flows over FICON and fibre channels between clients, application disk on IBM DS8700 and EMC V-Max storage, an IBM z10, and FICON tape/VTS.]