Date post: | 13-Jan-2016 |
Upload: | asher-francis |
SWG Competitive Project Office
Introduction to IBM’s z/OS
The Operating System for System z
zCPO zClass Introduction to z/OS
Defining characteristics of z/OS
Uses address spaces to ensure isolation of private areas
Ensures data integrity, regardless of how large the user population might be.
Can process a large number of concurrent batch jobs, with automatic workload balancing
Allows security to be incorporated into applications, resources, and user profiles.
Allows multiple communications subsystems at the same time
Provides extensive recovery, making unplanned system restarts very rare.
Can manage mixed workloads
Can manage large I/O configurations of 1000s of disk drives, automated tape libraries, large printers, networks of terminals, etc.
Can be controlled from one or more operator terminals, or from application programming interfaces (APIs) that allow automation of routine operator functions.
64 BIT Virtual Address Space
What’s an Address Space?
An execution environment in z/OS.
Remember that z/OS runs in an LPAR, which behaves like a distributed server (a box on the floor).
How many address spaces can there be? Thousands.
User address spaces are unique and run single applications.
− Multiple units of work can be active within the address space (parallel execution).
− These units of work are called tasks.
− User address spaces do not communicate with each other.
− If one address space fails, the other user address spaces continue to run.
System address spaces execute system components (elements), e.g.
− DB2, CICS, SMF, RMF, DFSMS … (more coming)
− These components are called subsystems (like a system within a system).
System components communicate with each other.
− Cloned or duplicate address spaces running as a subsystem communicate with each other.
− Multiple address spaces of a subsystem act as one component.
− If one address space fails, the component (e.g. a running DB2) continues to execute.
This enables continuous platform availability.
What’s in an address space?
z/OS provides each user with a unique address space and maintains the distinction between the programs and data belonging to each address space.
Because it maps all of the available addresses, however, an address space includes system code and data as well as user code and data. Thus, not all of the mapped addresses are available for user code and data.
The ‘size’ of an address space depends on the addressing range of the hardware architecture of the server. In this example:
− 16 MB address space
− 2 GB address space
z/OS can run in different addressing modes:
− 24-bit mode (16 MB)
− 31-bit mode (2 GB)
− 64-bit mode (16 exabytes)
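The sizes quoted for the three addressing modes follow directly from the number of address bits. A minimal sketch of that arithmetic in Python (the function names are illustrative, not part of z/OS):

```python
# Addressable range for each addressing mode named on the slide.
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB"]

def addressable_bytes(bits: int) -> int:
    """Number of distinct byte addresses reachable with the given address width."""
    return 2 ** bits

def human_readable(n: int) -> str:
    """Express a power-of-two byte count in the largest binary unit that fits."""
    unit = 0
    while n >= 1024 and unit < len(UNITS) - 1:
        n //= 1024
        unit += 1
    return f"{n} {UNITS[unit]}"

for bits in (24, 31, 64):
    print(f"{bits}-bit mode: {human_readable(addressable_bytes(bits))}")
```

The units here are binary (1 KB = 1024 bytes), which is the convention the slide's figures assume.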
Examples of z/OS address spaces
z/OS and its related subsystems require address spaces of their own to provide a functioning operating system:
− System address spaces are started after initialization of the master scheduler. These address spaces perform functions for all the other types of address spaces that start in z/OS.
− Subsystem address spaces exist for major system functions and middleware products such as DB2, CICS, and IMS.
− TSO/E address spaces are created for every user who logs on to z/OS.
− An address space is created for every batch job that runs on z/OS.
The z/OS system structure
[Figure: address spaces running on the base operating system: TSO users, system tasks, batch jobs, TCP/IP, VTAM, JES, IMS (CR, MPP, BMP), CICS (AOR, TOR, DOR), DB2, WebSphere, and Lotus Notes, all above the LIC (LPAR, etc.) and the zSeries hardware.]
Address space addressability: 64-bit in z/OS; 24-bit in MVS/370, 31-bit in MVS/XA through OS/390.
Do you speak zSeries?
[Figure: zSeries terminology. Processor units: CP (Central Processor), IFL (Integrated Facility for Linux), ICF, zAAP, and SAP (System Assist Processor). PR/SM (hypervisor firmware) hosts logical partitions running z/OS Test, z/OS Development, z/OS Production, and Linux; z/VM hosts Linux and z/OS virtual machines. The channel subsystem connects through channels and control units (CU) to DASD.]
What’s Virtual Memory?
Virtual memory is the hardware addressing capability of the architecture. Most likely the main storage (central storage) will be smaller than the virtual storage.
− E.g. 512 GB main storage vs 16 exabytes virtual storage (2^39 vs 2^64 bytes)
Where does it all go?
− Page = 4K virtual address range
− Frame = 4K real address range
− Slot = 4K disk storage space
A page can exist in a frame or in a slot. It must be in a frame for data and instructions to be accessed.
The location of the page is kept in tables created and maintained by the operating system. Each JOB has its own distinct tables. The pointer to the tables is part of the state data.
The operating system has storage managers to manage the pages, frames and slots (VSM, RSM and ASM – Virtual, Real and Auxiliary Storage Managers).
z10 Has 1 MB Segments
The address space concept
[Figure: the 64-bit address space. 24-bit addressing (MVS) reaches 16 MB, “the Line”; 31-bit addressing (MVS/XA) reaches 2 GB, “the Bar”; 64-bit addressing (z/OS) reaches 16 EB.]
Mapping of z/OS addressability
How virtual storage works
− Virtual storage is divided into 4-kilobyte pages.
− Transfer of pages between auxiliary storage and real storage is called paging.
− When a requested address is not in real storage, an interruption is signaled and the system brings the required page into real storage.
− z/OS uses tables to keep track of pages.
− Dynamic address translation (DAT)
− Frames, pages, and slots are all repositories for a page of information.
[Figure: a sample COBOL program (Identification Division; Data Division with a Working-Storage Section item 77 abc-sw pic xx; Procedure Division with Open File-A, File-B; Move FIELD-A to FIELD-B; Close File-A, File-B; STOP RUN) shown alongside main memory.]
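The paging behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class name ToyPager is hypothetical, the least-recently-used replacement policy is an assumption chosen for simplicity (the slide does not say how z/OS picks a frame to steal), and real dynamic address translation uses multi-level tables rather than a single dictionary.

```python
from collections import OrderedDict

PAGE_SIZE = 4096  # 4 KB pages, frames, and slots, as on the slide

class ToyPager:
    """Toy demand pager: a page lives either in a real-storage frame or in an auxiliary slot."""

    def __init__(self, frame_count: int):
        self.frame_count = frame_count
        self.frames = OrderedDict()  # page number -> contents, least recently used first
        self.slots = {}              # page number -> contents paged out to auxiliary storage
        self.page_faults = 0

    def access(self, virtual_address: int) -> bytearray:
        page = virtual_address // PAGE_SIZE
        if page not in self.frames:      # not in real storage: page fault
            self.page_faults += 1
            self._page_in(page)
        self.frames.move_to_end(page)    # mark as most recently used
        return self.frames[page]

    def _page_in(self, page: int) -> None:
        if len(self.frames) >= self.frame_count:
            # Assumption: steal the least recently used frame and page it out to a slot.
            victim, contents = self.frames.popitem(last=False)
            self.slots[victim] = contents
        # Bring the page back from its slot, or start with an empty page on first touch.
        self.frames[page] = self.slots.pop(page, bytearray(PAGE_SIZE))

pager = ToyPager(frame_count=2)
for address in (0, 4096, 100, 8192, 4100):
    pager.access(address)
print(pager.page_faults, sorted(pager.frames), sorted(pager.slots))
```

With only two frames, touching a third page forces one resident page out to a slot, and touching it again later brings it back in at the cost of another page fault.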
Elements
z/OS consists of a collection of functions that are called base elements and optional elements.
Some of the base elements can be dynamically enabled and disabled.
− A customer may choose to use a vendor product instead of IBM products.
Optional elements are called features.
Customers can select the features they want shipped with the operating system.
The optional elements (features) are either integrated or nonintegrated.
Features, both integrated and nonintegrated, are also tested as part of the integration of the entire system.
Elements of z/OS - Base and Optional
Some Base Elements
− Base Control Program (BCP)
− Bulk Data Transfer base (BDT)
− BookManager Read
− Communications Server
− Cryptographic Services
− DFSMSdfp
− Distributed File Service
− EREP
− ESCON Director Support
− FFST
− HCD
− High Level Assembler (HLASM)
− IBM HTTP Server
− IBM Tivoli Directory Server for z/OS
− ICKDSF
− Integrated Security Services
− ISPF
− JES2
− Language Environment
− Library Server
− MICR/OCR
− Network File System (NFS)
− OSA/SF
− Run-Time Library Extensions
− SMP/E
− TIOC
− TSO/E
− z/OS UNIX
− 3270 PC File Transfer Program
Optional Elements
− BDT File-to-File
− BDT SNA NJE
− BookManager BUILD
− C/C++ without Debug Tool
− Communications Server Security Level 3
− DFSMSdss
− DFSMShsm
− DFSMSrmm
− DFSMStvs
− DFSORT
− GDDM-PGF
− GDDM-REXX
− HCM
− HLASM Toolkit
− Infoprint Server
− JES3
− RMF
− SDSF
− Security Server
− z/OS Security Level 3
The BCP – Base Control Program
Essential operating system services: base control program and job entry subsystem (JES)
BCP requires the following:
− A security product (RACF is the IBM offering)
− DFSMSdfp
− Communications Server
− SMP/E
− TSO/E
− z/OS UNIX System Services (z/OS UNIX) kernel
Important BCP components
− System management facilities (SMF)
− Resource Measurement Facility (RMF)
− Workload manager (WLM)
Interesting optional features
− Infoprint Server
− I/O configuration program (IOCP)
− Program management binder
− Support for the Unicode Standard
− z/OS XML System Services (z/O
Running in the Address Spaces
User Applications
Batch Jobs
Middleware: DB2, CICS
ISV Applications
Application Servers (WebSphere)
Web Servers
TSO Users
Unix Users: Unix System Services (USS)
System Applications
TCP/IP Stack …
RACF (Resource Access Control Facility): z/OS Security Manager
Who Makes an Address Space?
When z/OS is “booted” (really IPLed, for Initial Program Load), a component called the Master Scheduler is built as the first address space. The Master Scheduler creates other address spaces as needed:
− When a TSO user logs on
− When a USS user logs on
− When a system task is started
− When JES is started
− When JES initiators are started (they pull jobs off the JES queues)
More examples follow:
− SMF – System Management Facilities
− RMF – Resource Measurement Facility
− DFSMS – Data Facility Storage Management Subsystem
What Type of System Applications?
GRS – Global Resource Serialization
− Controls access to resources
RACF – Resource Access Control Facility
− Provides security services
WLM – Workload Manager
− Dynamically sends work to resources & resources to white space
JES – Job Entry Subsystem
− Queues up work for entry into z/OS
− Queues up output for sending to printers
SMF – System Management Facilities
− Gathers messages from system applications and writes them to disk
− Performance data, events …
RMF – Resource Measurement Facility
− Provides reports on system and application activity
− Graphical real-time operating system data
These components and subsystems communicate with each other … across address spaces.
SMF – Part of BCP
SMF – System Message Recording
− Components write messages to SMF; SMF writes the messages to a dataset
− Every message has a specific record ID associated with it
− Record formats are different
− The data is post-processed
− System programmers configure which messages are and are not written
− There are two SMF datasets. One is hot, and when that dataset is full, automation or the operator switches to the standby dataset and then dumps the data of the full dataset
The data is used for various purposes
− Performance analysis
− Workload management behavior
− Resource consumption (I/O, memory, CPU)
− Error analysis
z/OS architects and developers use a system service to write SMF records
− Developers determine if, where, and when in the code a record is written
− An SMF developer/designer assigns the record ID and reviews the record format, structure, and content
Data and record collection is event driven, e.g.
− Start and stop of a job (with reason codes indicating why the job was stopped)
− Open and close of a dataset
− Count of I/O records read/written
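The hot/standby dataset switch described on this slide can be sketched as follows. This is an illustration only, not the SMF implementation: the class names, dataset names, record IDs, and capacity are all hypothetical, and the real switch is driven by automation or the operator rather than by the writer itself.

```python
from dataclasses import dataclass, field

@dataclass
class ToyDataset:
    name: str
    capacity: int
    records: list = field(default_factory=list)

    def is_full(self) -> bool:
        return len(self.records) >= self.capacity

class ToyRecorder:
    """Writes records to a hot dataset and switches to the standby when the hot one fills."""

    def __init__(self, capacity: int):
        self.active = ToyDataset("DATASET.A", capacity)   # hypothetical names
        self.standby = ToyDataset("DATASET.B", capacity)
        self.dumped = []  # records offloaded from full datasets for post-processing

    def write(self, record_id: int, payload: str) -> None:
        if self.active.is_full():
            self._switch()
        self.active.records.append((record_id, payload))

    def _switch(self) -> None:
        full, self.active = self.active, self.standby
        self.dumped.extend(full.records)  # dump the data of the full dataset...
        full.records.clear()              # ...after which it becomes the empty standby
        self.standby = full

recorder = ToyRecorder(capacity=2)
events = ["job start", "dataset open", "dataset close", "job stop", "job start"]
for i, event in enumerate(events):
    recorder.write(record_id=100 + i, payload=event)  # record IDs are made up
print(recorder.active.name, len(recorder.active.records), len(recorder.dumped))
```

Because recording continues on the standby while the full dataset is dumped, no records are lost during the switch; that is the point of keeping two datasets.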
RMF – Resource Measurement Facility
RMF is an optional priced feature of z/OS. It is a product that supports performance analysis, capacity planning, and problem determination. For these disciplines, different kinds of data collectors are available:
− Monitor I is the long-term data collector for all types of resources and workloads. The SMF data collected by Monitor I is mostly used for capacity planning but also for performance analysis.
− Monitor II is the snapshot data collector for address space states and resource usage. Some of the gathered data is also displayed in SDSF.
− Monitor III is the short-term data collector for problem determination, workflow delay monitoring, and goal attainment supervision. The Monitor III data is also used by the RMF PM Java Client, the RMF Web Browser interface, and Tivoli TBSM.
The data collected by all three gatherers can be saved persistently for later reporting. Monitor II and Monitor III are online reporters. Monitor I and Monitor III can store the collected data to datasets.
RMF Architecture Overview
[Figure: the RMF data gatherers (Monitor I, Monitor II background, Monitor III) write to SMF and VSAM datasets. The RMF Postprocessor provides historical reporting, analysis, and planning (long-term analysis); RMF Monitor II provides snapshot reporting; RMF Monitor III provides real-time reporting, problem determination, and data reduction (online monitoring). The data is exposed through the RMF Sysplex Data Server and APIs and through open standards (WBEM/CIM).]
Postprocessor: Standard Reporting

//RMFPP   EXEC PGM=ERBRMFPP
//SYSIN   DD *
  DATE(10142002,10142002)
  RTOD(1100,1300)
  DINTV(0030)
  REPORTS(CPU)
  SYSRPTS(WLMGL(SCPER))
  SYSOUT(H)

W O R K L O A D   A C T I V I T Y                                                  PAGE 22
z/OS V1R2   SYSPLEX SYSPLEX   START 10/14/2002-12.30.00   INTERVAL 000.30.00   MODE = GOAL
RPT VERSION V1R2 RMF          END   10/14/2002-13.00.00
REPORT BY: POLICY=STANDARD WORKLOAD=SYSTEM SERVICE CLASS=SYSTEM RESOURCE GROUP=*NONE PERIOD=1 IMPORTANCE=SYSTEM

TRANSACTIONS      TRANS.-TIME HHH.MM.SS.TTT  --DASD I/O--    ---SERVICE----  --SERVICE RATES--  PAGE-IN RATES   ----STORAGE----
AVG      107.81   ACTUAL          2.58.714   SSCHRT  108.4   IOC      1816K  ABSRPTN   331989   SINGLE    0.0   AVG     8600.09
MPL      107.81   EXECUTION       2.58.714   RESP      7.4   CPU    132456K  TRX SERV  331989   BLOCK     0.0   TOTAL    927137
ENDED        32   QUEUED                 0   CONN      2.8   MSO     64298M  TCB       1361.0   SHARED    0.0   CENTRAL  927137
END/S      0.02   R/S AFFINITY           0   DISC      0.1   SRB     12872K  SRB        132.3   HSP       0.0   EXPAND     0.00
#SWAPS     1086   INELIGIBLE             0   Q+PEND    4.2   TOT     64445M  RCT          0.3   HSP MISS  0.0
EXCTD         0   CONVERSION             0   IOSQ      0.3   /SEC    35797K  IIT          4.9   EXP SNGL  0.0   SHARED    35.98
AVG ENC    0.00   STD DEV         3.37.619                                   HST          0.0   EXP BLK   0.0
REM ENC    0.00                                                              APPL %      83.2   EXP SHR   0.0
MS ENC     0.00

C P U   A C T I V I T Y
z/OS V1R2   SYSTEM ID SYS1   START 10/14/2002-12.30.00
RPT VERSION V1R2 RMF         END   10/14/2002-13.00.00
CPU 2064 MODEL 107

CPU     ONLINE TIME  LPAR BUSY  MVS BUSY   CPU SERIAL  I/O TOTAL       % I/O INTERRUPTS
NUMBER  PERCENTAGE   TIME PERC  TIME PERC  NUMBER      INTERRUPT RATE  HANDLED VIA TPI
0       100.00       11.81      20.61      031528       9.01           0.64
1       100.00       11.00      18.18      131528      10.80           0.83
2       100.00        7.49      12.16      231528      14.64           0.93
3       100.00        6.92      10.34      331528      18.22           0.82
4       100.00        6.60      10.30      431528      18.26           0.76
TOTAL/AVERAGE         8.76      14.32                  70.93           0.81
Middleware
z/OS runs middleware applications and packages.
Middleware is usually a product, i.e. it costs the customer $
− It may not be an IBM product
− Common middleware: DB2, CICS, IMS, WebSphere products
z/OS provides some interfaces for vendors to use
− Called the Subsystem Interface (SSI)
− Also used by z/OS components
CICS – Customer Information Control System
DB2 – IBM’s Relational Database
IMS – Information Management System
Transaction Monitor
IBM’s J2EE WebSphere Application Server
z/OS Software Stack
[Figure: the software stack from top to bottom, with example products at each layer.]
Application: SAP, Siebel, JDEdwards, and customer applications
Middleware (CICS, IMS, WebSphere): CICS 3.2, WebSphere 6.1, IMS 10
Database (DB2, IMS): DB2 9, IMS 10
z/OS: z/OS 1.8
z9 Processor: z9 109
Systems Management and Security (RACF 1.8) run alongside all layers.
Transactions and Data – the zSeries Application “Sweet Spot”
Transaction monitor – manages a transaction
− A program or subsystem that manages or oversees the sequence of events that are part of a transaction
− Makes sure the ACID properties of a transaction are maintained
− Includes functions such as interfacing to databases and networks and transaction commit/rollback coordination
− Provides an API so applications can exploit the services of the transaction monitor
IBM’s z/OS-based transaction monitors:
− IMS - Information Management System
− CICS - Customer Information Control System
− WebSphere Application Server for z/OS
A key strength of the z/OS platform is support for high-volume, high-performance transaction management using transaction monitors
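The commit/rollback coordination mentioned above can be demonstrated on a small scale. The sketch below uses Python's built-in sqlite3 module purely as a stand-in; it is not how CICS, IMS, or WebSphere are programmed, and the table, account names, and transfer function are hypothetical. It shows the atomicity part of ACID: either both updates of a funds transfer take effect, or neither does.

```python
import sqlite3

def transfer(conn, source, target, amount):
    """Move funds atomically: both updates commit together or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, source))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, target))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (source,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")  # triggers the rollback
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

first = transfer(conn, "alice", "bob", 30)    # succeeds and commits
second = transfer(conn, "alice", "bob", 500)  # fails and is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(first, second, balances)
```

The failed transfer leaves both balances exactly as the successful one left them, even though its first two updates had already been executed when the check failed.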
IMS – Information Management System
“IMS Runs the World” since 1968
Most Corporate Data is Managed by IMS
− Over 95% of Fortune 1000 Companies use IMS
− IMS Manages over 15 Billion GBs of Production Data
− $2 Trillion/day transferred thru IMS by one customer
Over 50 Billion Transactions a Day run through IMS
− IMS serves close to 200 Million users per day
− Over 79 million IMS trans/day handled by one customer on a single production Sysplex, 30 million trans/day on a single CEC
− 120M IMS trans/day, 7M per hour handled by one customer
− 4000 trans/sec (250 million/day) across TCP/IP to a single IMS
− Over 3000 days without an outage at one large customer
− 21,000 transactions per second on a single z990, with 4 IMS servers
CICS – Customer Information Control System
− 30+ years of applications
− >30B transactions per day
− 5000 packages/2000 ISVs
− 30M CICS users
− 50K CICS/390 licenses, 16K customers
− 950,000 CICS application programmers
− “it’s the programming model!”
− 490 of IBM’s top 500 customers
What is it?
− CICS provides an execution environment for concurrent program execution for multiple end users, who have access to multiple data types.
− CICS will manage the operating environment to provide performance, scalability, security, and integrity.
WebSphere Application Server for z/OS, the Java Transaction Manager
Architected on SOA infrastructure & principles
− Fully J2EE 1.4 platform certified
− Leading Web Services support
− WebSphere Rapid Development & Deployment
zAAP enabled (z9-109, z990, z890)
− Run Java applications next to mission critical data
− Lower the cost of computing for WebSphere Application Server (and all z/OS based Java applications)
Common code infrastructure
− Administration skills shared between platforms
− Develop anywhere, run on WebSphere Application Server for z/OS
Native OS support – leverages the z/OS platform
− Optimization features designed to provide security and data interaction, including CICS, IMS, DB2
[Figure: a client (browser or Web service requestor) calls WAS z/OS, where a Web container (JSPs, servlets) and an EJB container (EJBs) run on a zAAP and access DB2.]
A Mainframe Runs Mixed Workloads
Typical large customer daily activity
DFSMS – The Premier Storage Management Suite
GOALS:
− Improve the use of the storage media; for example, by reducing out-of-space abends and providing a way to set a free-space requirement.
− Reduce the labor involved in storage management by centralizing control, automating tasks, and providing interactive or batch controls for storage administrators.
− Reduce the user's need to be concerned with the physical details of performance, space, and device management. Users can focus on using information instead of managing data.
http://www.ibm.com/systems/storage/software/sms/whatis_sms/
DFSMSdfp - System Managed Storage (SMS)
[Figure: an allocation request passes through System Managed Storage, which assigns a Data Class, Storage Class, and Management Class and then selects a Storage Group (GRP_1, GRP_2, GRP_3).]
Mitigating Management Costs…
DFSMSdfp constructs are key to data placement and assigning goals, requirements, etc.
− The operating system and subsystems understand the three user-specifiable constructs.
DFSMShsm and ABARS are key to implementing the management policy.
− Coherent backups
− Data retirement
DFSMSdss is key to moving and copying datasets.
DFSMSrmm is key to tape management.
WLM DOES THE FOLLOWING
− Monitors the use of resources by various address spaces
− Monitors the system-wide use of resources to determine whether they are fully utilized
− Determines which address space to “swap” out (and when)
− Inhibits the creation of new address spaces or steals pages when certain shortages of real storage exist
− Changes the dispatching priority of address spaces to adjust the consumption of system resources
− Selects the devices to be allocated, if a choice of devices exists, to balance the use of I/O devices
WLM Classification Rules
Transaction Flow
Mapping Unix to z/OS Terms and Concepts
z/OS Support Summary

Release  z9-109  z990  z890  z900  z800  G5/G6, Multiprise® 3000  End of Service  Coexists with  Ship Date
1.4      x       x     x     x     x     x                        3/07            1.7            9/02
1.5      x       x     x     x     x     x                        3/07*           1.8            3/04
1.6      x       x     x     x     x                              9/07*           1.8            9/04
1.7      x       x     x     x     x                              9/08*           1.9            9/05
1.8*     x       x     x     x     x                              9/09*           1.10           9/06*

*Planned
z/OS.e – Available for z890 and z800 only
Summary of z/OS facilities
− Address spaces and virtual storage for users and programs.
− Physical storage types available: real and auxiliary.
− Movement of programs and data between real storage and auxiliary storage through paging.
− Dispatching work for execution, based on priority and ability to execute.
− An extensive set of facilities for managing files stored on disk or tape.
− Operators use consoles to start and stop z/OS, enter commands, and manage the operating system.
SWG Competitive Project Office
Introduction to
IBM’s System z Clustering Technologies
Parallel Sysplex and LPAR Cluster
Objectives
In this session you will learn about:
Parallel Sysplex (z/OS and zSeries clustering technology)
− Software and hardware executing as one server
− Multiple LPARs running as one server
− z/OS running in each LPAR
− Up to 32 system images (z/OS) running as a Parallel Sysplex
Intelligent Resource Director (IRD)
− LPAR clusters
− Exist within Parallel Sysplex clustering
− Associated with Workload Manager (WLM) managing virtual hardware resources
Explain how Parallel Sysplex can achieve continuous availability
Explain dynamic workload balancing
Explain the single system image
Five Nines is the Gold Standard of Availability
99.999% availability is sometimes referred to as “continuous operation”
− 5 minutes of downtime per year out of 24x365
Survey of 28 companies with mixed environments
− Average mainframe system availability = 99.993%, or 36 minutes of downtime per year
− Average distributed server availability = 99.909%, or 8 hours of downtime per year per server
Small improvements in the “nines” become more and more difficult to achieve
− Distributed system hardware and software design, test, and service strategy are required
Downtime        Mainframe                    Distributed                   Cost impact
Hours per year  0.6 (99.993% availability)   7.98 (99.909% availability)   13 times downtime costs

Source: IDC survey of 28 customers with mixed environments, March 12, 2007
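The downtime figures on this slide follow from the availability percentages. A short sketch of the conversion, assuming a 24x365 year as the slide does (the function name is illustrative):

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, matching the slide's 24x365 basis

def downtime_hours_per_year(availability_percent: float) -> float:
    """Expected yearly downtime for a system with the given availability."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR

for label, availability in [("five nines", 99.999), ("mainframe", 99.993), ("distributed", 99.909)]:
    hours = downtime_hours_per_year(availability)
    print(f"{label}: {hours:.2f} hours/year ({hours * 60:.1f} minutes/year)")

ratio = downtime_hours_per_year(99.909) / downtime_hours_per_year(99.993)
print(f"distributed vs mainframe downtime: {ratio:.1f}x")
```

The ratio of the two surveyed downtime figures comes out at roughly 13, which is where the "13 times" in the cost impact column comes from.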
Gartner Ranks System z Tops in Availability (Parallel Sysplex)
Gartner criteria:
− Single system availability
− Planned downtime
− Disaster tolerance & recovery
− Failover clustering
− High availability services
[Figure: Availability Rankings - Selected Platforms. An availability ranking axis running from best to worst, with platforms grouped as MAINFRAME, UNIX, and WINTEL. Platforms shown: IBM System z, Unisys ES7000, IBM Power5, HP Integrity, Dell PowerEdge, Sun Fire / Sparc IV, HP 9000.]
Source: Gartner, Server Scorecard Evaluation Model version 2, May 2006
What is a Parallel Sysplex? Continuous Availability
− Builds on the strength of zSeries servers by linking up to 32 images to create the industry’s most powerful commercial processing clustered system
− Innovative multi-system data-sharing technology
− Direct concurrent read/write access to shared data from all processing nodes
− No loss of data integrity, no performance hit
− Transactions and queries can be distributed for parallel execution based on available capacity and not restricted to a single node
− Every “cloned” application can run on every image
− Hardware and software can be maintained non-disruptively
− Within a Parallel Sysplex cluster, it is possible to construct an environment with no single point of failure
− Peer instances of a failing subsystem can take over recovery of resources held by the failing instance, or the failing subsystem can be automatically restarted on still-healthy systems
− In a Parallel Sysplex, the loss of a server may be transparent to the application, and the server workload is redistributed automatically with little performance degradation
− Software upgrades can be rolled through one system at a time on a sensible timescale for the business
Consider the Power
Parallel Sysplex – up to 32 system images
z10 server – up to 64 processors per image
MIPS – up to 920 per processor
Up to 1,884,160 MIPS in a Parallel Sysplex
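The headline figure is the product of the three limits quoted on the slide. As a quick check (the variable names are illustrative):

```python
# Peak capacity as multiplied out on the slide: images x processors x MIPS.
max_images = 32             # system images in a Parallel Sysplex
processors_per_image = 64   # processors per z10 image
mips_per_processor = 920    # peak MIPS per processor

total_mips = max_images * processors_per_image * mips_per_processor
print(f"{total_mips:,} MIPS")
```

This is a straight multiplication of peak values, so it is an upper bound rather than a measured throughput.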
zSeries Continuous Availability
[Figure: three levels of availability, from a single system to a Parallel Sysplex to GDPS across two sites.]
Single System
− Built-in redundancy
− Capacity Upgrade on Demand
− Capacity Backup
− Hot pluggable I/O
Parallel Sysplex (1 to 32 systems)
− Addresses planned/unplanned HW/SW outages
− Flexible, nondisruptive growth: capacity beyond the largest CEC; scales better than SMPs
− Dynamic workload/resource management
GDPS (Site 1, Site 2)
− Addresses site failure/maintenance
− Sync/async data mirroring: eliminates tape/disk SPOF; no/some data loss
− Application independent
Parallel Sysplex
[Figure: 9672, zSeries, and System z9 servers running applications, connected to a Coupling Facility, Sysplex Timers, and shared data over ESCON/FICON*. Captions: Rolling Maintenance of System and Application Code; Horizontal Scaling and High Availability.]
Loosely coupled multiprocessing
Hardware/software combination
Requires:
− Data sharing
− Locking
− Cross-system workload dispatching
− Synchronization of time for logging, etc.
− High-speed system coupling
Hardware:
− Coupling Facility
• Integrated Cluster Bus and ISC to provide high-speed links to CF
− Sysplex Timer – Time Of Day clock synchronization
Implemented in z/OS* and subsystems
− Workload Manager in z/OS
− Compatibility and exploitation in software subsystems, including IMS*, VSAM*, RACF*, VTAM*, JES2*, etc.
Coupling Facility – Glue for Communication
Within the Coupling Facility, storage is dynamically partitioned into structures. z/OS services manipulate data within the structures. Each of the following structures has a unique function:
Cache structure: Supplies a mechanism called buffer invalidation to ensure consistency of cached data. The cache structure can also be used as a high-speed buffer for storing shared data with common read/write access.
List structure: Enables authorized applications to share data that is organized in a set of lists, for implementing functions such as shared work queues and shared status information.
Lock structure: Supplies shared and exclusive locking capability for serialization of shared resources down to a very small unit of data.
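The shared and exclusive locking that a lock structure supplies can be illustrated with a toy lock table. This is a conceptual sketch only: the class, method names, system names, and resource name are hypothetical, and the real Coupling Facility is reached through z/OS services, not through an object like this.

```python
from collections import defaultdict

class ToyLockStructure:
    """Toy lock table granting shared or exclusive interest in named resources."""

    def __init__(self):
        self.shared = defaultdict(set)  # resource -> systems holding it shared
        self.exclusive = {}             # resource -> system holding it exclusively

    def request(self, system: str, resource: str, mode: str) -> bool:
        owner = self.exclusive.get(resource)
        if owner is not None and owner != system:
            return False                 # another system holds it exclusively
        if mode == "shared":
            self.shared[resource].add(system)
            return True
        if mode == "exclusive":
            if self.shared[resource] - {system}:
                return False             # other systems still hold it shared
            self.exclusive[resource] = system
            return True
        raise ValueError(f"unknown mode: {mode}")

    def release(self, system: str, resource: str) -> None:
        self.shared[resource].discard(system)
        if self.exclusive.get(resource) == system:
            del self.exclusive[resource]

locks = ToyLockStructure()
results = [
    locks.request("SYS1", "ACCOUNT.ROW.42", "shared"),     # readers can coexist
    locks.request("SYS2", "ACCOUNT.ROW.42", "shared"),
    locks.request("SYS2", "ACCOUNT.ROW.42", "exclusive"),  # refused: SYS1 still reading
]
locks.release("SYS1", "ACCOUNT.ROW.42")
results.append(locks.request("SYS2", "ACCOUNT.ROW.42", "exclusive"))  # now granted
results.append(locks.request("SYS1", "ACCOUNT.ROW.42", "shared"))     # refused: SYS2 is writing
print(results)
```

Keeping the lock table in one place that every system can reach is what lets several z/OS images update the same data without corrupting it.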
Internal Coupling Facility (ICF)
[Figure: a server under PR/SM with z/OS images running on CPs and Coupling Facility partitions running on dedicated ICFs, connected by links.]
− Spare CPs can be used as CF CPs
− ICFs can only run CFCC
− MSUs in ICFs "don't count"
− Accessed via external links
Parallel Sysplex Availability Technologies
Unplanned Outages
− Configure for no single point of HW/SW failure
− Fault tolerant HW, recoverable SW
− Failure isolation
− System detected (e.g. heartbeats, event triggers, soft fail thresholds)
− Policy managed (e.g. SFM, WLM, ARM, etc.)
− Dynamic workload routing
Planned Outages
− n, n+1 support
− Non-disruptive rolling change management
− Redundancy to address risk tolerance (e.g. 2 vs. 3 elements)
− Dynamic workload balancing
Processes to support PS availability (e.g. change, problem, systems management)
Thorough testing/training
Remember this Benchmark? Bank of China Benchmark
[Figure: benchmark configuration. Two 54-way z9 servers connected by IC and ICB4 links, running four BANCS/CICS/DB2 images with 19 CPs each, three Coupling Facilities with 5 CPs each, and a workload driver with 5 CPs.]
Database
• 380 million accounts
• 52 TB storage
• 4 DS8300
Requirement: 4,100 transactions per second
Bank of China Parallel Sysplex Benchmark
Near-Linear Scalability on a Parallel Sysplex running CICS and DB2 in a single system image with No Partitioning Required
[Figure: bar chart of read and write data transfer (MB/sec, axis from 0 to 600) at 1,589, 3,120, 4,665, 5,723, and 8,024 transactions per second, against a goal of 4,100 TPS. Bar labels as extracted: 4287, 135, 174253, 61, 114, 170168, 287.]
Huge scale-up requires huge I/O bandwidth capacity.
Parallel Sysplex Software Cluster Technology

Software Component                    Function
XCF                                   Sysplex Communication/Status Monitoring/Group Services
ARM                                   Subsystem restart (within CEC or cluster)
CFRM                                  CF Resource Management Policy
System Logger                         High performance logging, Merged logs
WLM                                   Goal oriented unit of work management
WLM Enclaves                          Multi-system unit of work
VTAM Generic Resource                 Network Single System Image
VTAM MNPS                             High Availability Network Connection
TCP/IP VIPA                           Network Single System Image
TCP/IP VIPA take over/take back       High Availability Network Connection
CICSPlex/SM, IMS and MQ SMQ           Transaction routing/balancing
DB2 Sysplex Query Parallelism         SQL Query de/re-composition
Batch PipePlex                        Cluster I/O Piping
ESCON Manager                         ESCON I/O Systems Management
DB2, VSAM TVS, IMS/DB                 Full read/write data sharing
IRLM                                  Sysplex database locking
Base Operating System Exploitation    Resource Sharing
Additional Subsystem Exploitation     Resource/Data Sharing
Failure Recovery enabled by Sysplex & ARM
z/OS Workload Manager
− Sysplex-wide workload management to one policy
Sysplex Failure Manager
− Specify failure detection and recovery actions
Automatic Restart Manager
− Fast recovery of critical subsystems
Cloning and symbolics
− Used to replicate applications across the nodes
zSeries Parallel Sysplex Resource Sharing
This is not to be confused with application data sharing
This is sharing of physical system resources such as tape drives, catalogs, consoles
This exploitation is built into z/OS
Benefits
− System Management
− Performance
− Reduced hardware requirements $$$
Resource Sharing
High value and easy transition
− No stand-alone CF requirement
− Installation wizards available
RACF - Security Server
− Multisystem shared security profiles
− Improved manageability
− Systems Management simplification
GRS Star
− Multisystem resource serialization
− Highly scalable
− Rapid recovery
− Improved performance
Tape Switching
− Multisystem tape sharing
− Eliminate duplication
− Reduced cost
XCF Star
− Multisystem signaling
− Simplified systems definition
− Improved performance
− Reduced cost
− Channel constraint relief
JES2 Checkpoint
− Multisystem checkpoint
− Systems Management simplification
− Reduced cost
Operlog / Logrec
− Multisystem merged log
− Improved Systems Management
− Single Systems Image
Shared Catalog
− Shared Master and User catalogs
− Systems Management simplification
− Improved performance
IRD
− LPAR CPU Mgmt, Dyn CHPID Mgmt, IO Prty
− Systems Management across LPARs
− Improved performance, Availability
DFSMShsm
− Workload Balancing
HFS / zFS
− "Shared" data
− Application flexibility
Enables PSLC Licensing Charges
[Figure: several systems, each with its own master console, Operlog/Logrec, catalogs, and tape, sharing these resources through the CF.]
Dynamic Workload Manager (WLM)
Intelligent Resource Director (IRD)
Manages resources within a server
− Processors and I/O
− Policy based
Integration of
− z/OS Workload Manager
− Parallel Sysplex
− PR/SM™
Directs physical resources to workload
− Handles unpredictable workloads
− Increases resource efficiency
[Figure: zSeries IRD scope, showing an LPAR cluster of z/OS images alongside an ICF.]
72zCPO zClass Introduction to z/OS
IRD, WLM and LPAR Clusters
IRD is code executing within the hardware
WLM manages performance of:
  Tasks within an address space
  Address spaces within a z/OS image
  Subsystems within a z/OS image
  Subsystems across multiple images within a Sysplex
  LPAR clusters within a Sysplex on a single server
  … and more, like TCP/IP routing, creating address spaces to handle workload peaks …
LPAR clusters managed as a ‘group’ provide
  LPAR CPU Management
    − WLM requests reassignment of virtual CPs based on LPAR weights (goals) defined by the IT shop
  Dynamic channel path management
    − WLM requests reassignment of virtual channel paths to improve I/O bandwidth to an LPAR based on weights (goals) defined by the IT shop
  Channel subsystem priority queuing
    − WLM requests reassignment of I/O priority for an LPAR to reduce I/O wait time for an LPAR’s I/O based on weights (goals) defined by the IT shop
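The LPAR CPU management idea on this slide can be illustrated with a small sketch. It is not IRD or WLM code: the `Lpar` class, the `meeting_goals` flag, the step size, and the rule that the cluster's total weight stays constant are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Lpar:
    name: str
    weight: int          # current LPAR weight within the cluster
    meeting_goals: bool  # hypothetical flag: is this LPAR's work meeting its goals?


def rebalance(cluster, step=5, floor=10):
    """Move `step` weight units from one donor LPAR to one receiver LPAR.

    Assumption: weight is only shifted inside the cluster, so the total
    weight is unchanged. Returns True if a shift was made.
    """
    receivers = [l for l in cluster if not l.meeting_goals]
    # A donor must be meeting its goals and must stay at or above the floor.
    donors = [l for l in cluster if l.meeting_goals and l.weight - step >= floor]
    if not receivers or not donors:
        return False
    donor = max(donors, key=lambda l: l.weight)        # take from the largest donor
    receiver = min(receivers, key=lambda l: l.weight)  # help the smallest receiver
    donor.weight -= step
    receiver.weight += step
    return True
```

Calling `rebalance` repeatedly models a periodic adjustment loop; the floor keeps a donor from being starved.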
Batch Workload Balancing
Performance
Rebalances batch initiators
  “Move” initiators to images with capacity
  More aggressively reducing them on constrained systems
  Starting new ones on less constrained systems
  Checking for potential rebalancing every 10 sec.
[Diagram: initiators on SYS1 and SYS2 select jobs from a shared Batch Queue according to free capacity]
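The initiator rebalancing described on this slide can be sketched as one planning pass that would run on each check. This is a simplified model, not WLM logic: the dictionary layout, the `free_capacity` fractions, and both thresholds are hypothetical; only the 10-second interval comes from the slide.

```python
CHECK_INTERVAL_SECONDS = 10  # rebalancing check interval stated on the slide


def plan_rebalance(systems, constrained_below=0.1, spare_above=0.3):
    """Decide where to stop and start batch initiators in one pass.

    systems: name -> {"initiators": int, "free_capacity": float between 0 and 1}
    Thresholds are illustrative assumptions.
    Returns a list of ("stop" | "start", system_name) actions and updates counts.
    """
    actions = []
    for name in sorted(systems):
        state = systems[name]
        if state["free_capacity"] < constrained_below and state["initiators"] > 0:
            state["initiators"] -= 1          # reduce initiators on a constrained system
            actions.append(("stop", name))
        elif state["free_capacity"] > spare_above:
            state["initiators"] += 1          # start one where capacity is free
            actions.append(("start", name))
    return actions
```

Stopping one initiator on a constrained image and starting one on an image with spare capacity has the net effect of "moving" an initiator.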
TCP/IP Workload Balancing
Spraying
  “Dumb” round robin
DNS/WLM
  Request routed to best host to balance workload
Network Distributor
  External box. Requires connectivity to each host
  Routes based upon WLM, user, application, QoS, etc.
  Similar to Cisco Multi-Node Load Balancer
Sysplex Distributor
  No external box required. Connects to a node within Sysplex
  Routes to host based upon WLM, user, application, QoS, etc.
    − Better WLM coordination
  Removes complexities of multiple LPARs in a CEC w/ OSA
Load Balancing Advisor
  The load balancer resides in the network (typically router-type node)
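The difference between “dumb” round-robin spraying and routing to the best host can be shown in a few lines. This is a minimal sketch: the host names and the numeric capacity recommendations are hypothetical, and real distributors weigh more inputs (user, application, QoS) than this model does.

```python
import itertools


def round_robin(hosts):
    """Spraying: hand out hosts in a fixed rotation, ignoring their load."""
    return itertools.cycle(hosts)


def pick_by_recommendation(weights):
    """Route to the host with the highest capacity recommendation.

    weights: host -> recommendation (higher means more spare capacity).
    Ties are broken by host name so the choice is deterministic.
    """
    return max(sorted(weights), key=lambda host: weights[host])
```

Round robin would keep sending every third request to an overloaded host, while the recommendation-based choice steers new requests away from it.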
Dynamic VIPA / VIPA Takeover
Single System Image to IP Network
Dynamic VIPA backup
  If a host suffers an outage, the stack may be moved to another host manually
  No configuration changes to routers
VIPA Takeover
  This process is automated
  Coordinated with application dependencies
VIPA Takeback
  Prior to planned outage
  “Takeback” after host brought back online
[Diagram: hosts with VIPAs 192.168.253.1 to 192.168.253.6, ESCON links and a CF; the Network holds the cached IP address 192.168.253.4]
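VIPA takeover and takeback can be pictured as a table that tracks which host currently owns each virtual address. The sketch below is an illustration only: the `VipaTable` class and its method names are hypothetical, and it ignores the coordination with application dependencies that the slide mentions.

```python
class VipaTable:
    """Tracks the current owner of each VIPA (illustrative model only)."""

    def __init__(self, home, backup):
        self.home = dict(home)      # vipa -> host that normally owns it
        self.backup = dict(backup)  # vipa -> host that takes over on an outage
        self.owner = dict(home)     # vipa -> host that owns it right now

    def host_down(self, host):
        """Takeover: move every VIPA owned by the failed host to its backup."""
        moved = []
        for vipa, owner in list(self.owner.items()):
            if owner == host:
                self.owner[vipa] = self.backup[vipa]
                moved.append(vipa)
        return moved

    def host_up(self, host):
        """Takeback: return VIPAs to their home host once it is back online."""
        returned = []
        for vipa, home in self.home.items():
            if home == host and self.owner[vipa] != host:
                self.owner[vipa] = host
                returned.append(vipa)
        return returned
```

Clients keep using the same address throughout, which is why no router configuration changes are needed in this model.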
zSeries Sysplex Distributor
Provides a single Sysplex-wide IP address built on dynamic VIPA
Distributes network attachment based on application placement and recovery requirements
Dynamic workload balancing
Reduces planned outages
  − Rolling upgrades
  − Hardware changes
Reduces unplanned outages
  − Software & hardware failures
  − Network failures
Simplifies client view of zSeries
[Diagram: z/OS-1, z/OS-2 and z/OS-3, each running TCP/IP and DB2, reached through the single address 192.168.253.4]
zFS – z File System
zFS is “Sysplex aware” for file systems
  Write requests forwarded to USS owner
  Reads can be managed in cache
  If owner fails, USS moves owner to another LPAR
Improved Byte Range Lock Manager (BRLM) availability
  Locks replicated on a backup system
[Diagram: three systems in a Sysplex, each with an application (Appl), USS and zFS, connected through XCF]
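The owner-forwarding behavior on this slide (writes go to the owning system, and ownership moves if the owner fails) can be modeled in a few lines. This is a toy model under stated assumptions: the `SharedFileSystem` class, the forwarding counter, and the "first surviving member becomes owner" rule are hypothetical, and read caching is left out.

```python
class SharedFileSystem:
    """Toy model of a file system with a single owning system."""

    def __init__(self, owner, members):
        self.owner = owner
        self.members = list(members)
        self.data = {}
        self.forwarded = 0  # number of writes shipped from a non-owner to the owner

    def write(self, system, path, content):
        if system != self.owner:
            self.forwarded += 1  # the request is forwarded to the owner
        self.data[path] = content

    def read(self, system, path):
        return self.data.get(path)

    def fail(self, system):
        """Remove a failed system; if it was the owner, pick a new owner."""
        self.members.remove(system)
        if system == self.owner:
            if not self.members:
                raise RuntimeError("no surviving system can own the file system")
            self.owner = self.members[0]  # assumption: first survivor takes over
```

After an owner failure, applications on the surviving systems continue to read and write through the new owner.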
Aspects of Availability
High Availability
  Fault-tolerant, failure-resistant infrastructure supporting continuous application processing
Continuous Operations
  Non-disruptive backups and system maintenance coupled with continuous availability of applications
Disaster Recovery
  Protection against unplanned outages such as disasters through reliable, predictable recovery
Protection of critical business data
Recovery is predictable and reliable
Operations continue after a disaster
Costs are predictable and manageable
Mainframe Disaster Recovery is Based on Parallel Sysplex
Mainframe
  Same systematic design for all applications and data
  Recovery is automatic and fast
  Integrity preserved
  Additional cost is minimal
[Diagram: mainframe Primary Site and Backup Site, labeled “Disk Mirroring” and “Takeover and Restart”]
Distributed
  You must design site failover scheme for each application and database
  Recovery is manual and slow
  Easy to lose synchronization and integrity
  You must pay for duplicate hardware and software
[Diagram: Distributed Production, Distributed Development & Test, and Distributed Batch at the Primary Site and the Backup Site, labeled “PTAM (Pick-up truck access method)”]
Bank Austria Creditanstalt
GDPS/PPRC Experience
Recovery window reduced from 48 hours to less than two hours
Planned site switch completed within the two-hour target
Significant reduction of on-site manpower and skill level required to manage planned and unplanned reconfigurations
Dynamic switchover of disk subsystems takes between 32 and 95 seconds
No loss of committed data
PPRC and XRC Overview
PPRC (Metro Mirror)
  Synchronous remote data mirroring
  Application receives “I/O complete” when both primary and secondary disks are updated
  Typically supports metropolitan distance
  Performance impact must be considered
  Latency of 10 µs/km
[Diagram: numbered I/O steps 1 to 4 for PPRC and for XRC with the SDM; hosts shown: S/390, z/OS, UNIX, NT]
XRC (z/OS Global Mirror)
  Asynchronous remote data mirroring
  Application receives “I/O complete” as soon as primary disk is updated
  Unlimited distance support
  Performance impact negligible
  System Data Mover (SDM) provides
    Data consistency of secondary data
    Central point of control
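The performance trade-off between synchronous and asynchronous mirroring comes down to simple arithmetic on the 10 µs/km figure from the slide. The sketch below is a rough model: the local write time is a made-up input, and it treats the asynchronous penalty as zero where the slide says "negligible".

```python
LATENCY_US_PER_KM = 10  # per-kilometre latency figure quoted on the slide


def pprc_added_latency_us(distance_km):
    """Extra time a synchronous (PPRC) write waits because of distance."""
    return LATENCY_US_PER_KM * distance_km


def write_time_us(local_write_us, distance_km, synchronous):
    """Time until the application sees "I/O complete".

    Synchronous: local write plus the distance penalty.
    Asynchronous (XRC): local write only, a simplification of "negligible".
    """
    if synchronous:
        return local_write_us + pprc_added_latency_us(distance_km)
    return local_write_us
```

For example, with a hypothetical 500 µs local write and 40 km between sites, the synchronous write completes in 900 µs, while the asynchronous write time does not grow with distance.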
GDPS – Geographically Dispersed Parallel Sysplex
Near Continuous Availability & Disaster Recovery
GDPS/PPRC (Peer to Peer Remote Copy (PPRC) - Synchronous)
  Multisite Sysplex (fiber distance between sites up to 40 km - max)
  No or limited data loss in unplanned failover - user policy
  Planned and unplanned reconfiguration support
Disaster Recovery solution
GDPS/XRC (eXtended Remote Copy (XRC) - Asynchronous)
  Supports unlimited distance
  Production systems in Site 1
  Limited data loss to be expected in unplanned failover
  GDPS initiates restart of production systems in Site 2
Common functions (GDPS/PPRC and GDPS/XRC)
  GDPS solution manages tape resident data
  Point-in-time copy created (FlashCopy) intended to:
    − Maintain D/R readiness during resynchronization
    − Perform D/R testing while maintaining D/R readiness
  Management of zSeries Operating Systems
[Diagram: Primary Site and Secondary Site linked by PPRC, XRC (with SDM) and PtP VTS over ESCON®; each site has Virtual Tape Controllers plus TCDB, TMC and Catalog; hosts shown: S/390®, z/OS, UNIX, NT]
[Diagram: two applications, each with a UCB, attached to PPRC devices labeled P and S]
Brings different technologies together to provide a comprehensive application and data availability solution
HyperSwap – the Technology
  Substitutes PPRC secondary for primary device
  Automatic – No operator interaction
  Fast – Can swap large number of devices
  Non-disruptive – applications keep running
  Includes volumes with Sysres, page DS, catalogs
Hardware Triggers
  I/O Errors
  Boxed Devices
  Control Unit Failures
IOS Timing Trigger Availability
  Autonomic detection of “soft” failures
  Customer defined timing thresholds to trigger HyperSwap
Dual Site and Single Site Environments
  GDPS/PPRC
  GDPS/PPRC HyperSwap Manager
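The swap itself can be pictured as each UCB switching which device it points at, driven either by a hardware trigger or by a timing threshold. This is a conceptual sketch, not z/OS code: the `Ucb` class, the trigger names, and the millisecond parameters are hypothetical stand-ins for the triggers listed on the slide.

```python
HARDWARE_TRIGGERS = {"io_error", "boxed_device", "control_unit_failure"}


class Ucb:
    """Hypothetical stand-in for a unit control block pointing at one device."""

    def __init__(self, primary, secondary):
        self.active = primary     # device that application I/O currently uses
        self.standby = secondary  # PPRC partner device

    def swap(self):
        self.active, self.standby = self.standby, self.active


def should_swap(event=None, response_ms=None, threshold_ms=None):
    """Swap on a hardware trigger, or when I/O response exceeds the threshold."""
    if event in HARDWARE_TRIGGERS:
        return True
    if response_ms is not None and threshold_ms is not None:
        return response_ms > threshold_ms
    return False


def hyperswap(ucbs):
    """Switch every UCB to its PPRC partner; applications keep their UCBs."""
    for ucb in ucbs:
        ucb.swap()
```

Because the application keeps using the same UCB object, it does not need to know that the underlying device changed, which is the sense in which the swap is non-disruptive in this model.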
What a Sysplex can do for YOU…
It will address any of the following types of work
  Large business problems that involve hundreds of end users, or deal with volumes of work that can be counted in millions of transactions per day.
  Work that consists of small work units, such as online transactions, or large work units that can be subdivided into smaller work units, such as queries.
  Concurrent applications on different systems that need to directly access and update a single database without jeopardizing data integrity and security.
Provides reduced cost through
  Cost effective processor technology
  IBM software licensing charges in Parallel Sysplex
  Continued use of large-system data processing skills without re-education
  Protection of z/OS application investments
  The ability to manage a large number of systems more easily than other comparably performing multisystem environments
TD Bank Best Practices
Client Environment
  System z, z/OS, DB2, IMS, WMQ, GDPS
  Parallel Sysplex deployment consists of five System z servers across two sites running 42 M business transactions a day
Background
  TD Bank has been running Parallel Sysplex
    − Sysplex wide availability 99.998% over 10 years
    − Only 1.5 hours planned outage
  System z is used for Customer Account Data for applications supporting Tellers, Internet Banking and ATMs
TD Bank Recommendations
  Keep sysplex up – do not bring it down
  Practice Rolling IPLs
  Exploit concurrent hardware upgrades
  Use automation
  Configure your sysplex for availability
    − IMS/DB2 Data-sharing
    − Transaction routing
    − Sysplex Distributor for TCP/IP
    − Online database reorganizations
    − Clone each image
    − Ensure applications exploit parallel sysplex
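To put the 99.998% availability figure in perspective, it helps to convert it into hours. The calculation below is plain arithmetic under one stated assumption: a year is taken as 365 days, so leap days are ignored.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: leap days ignored


def downtime_hours(availability_percent, years):
    """Total unavailable hours implied by an availability percentage."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * years * HOURS_PER_YEAR
```

With these assumptions, `downtime_hours(99.998, 10)` comes to roughly 1.75 hours across the whole ten-year period.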
Summary
Reduce cost compared to previous offerings of comparable function and performance
Continuous availability even during change
Dynamic addition and change
Parallel Sysplex builds on the strengths of the z/OS platform to bring even greater availability, serviceability and reliability
Scales out at low overhead, near linear scaling
Additional Information
GDPS
  The Ultimate e-business Availability Solution – GF22-5114
  www.ibm.com/systems/z/gdps
Parallel Sysplex
  www.ibm.com/systems/z/pso