Post on 19-Jan-2016
transcript
1
Event Profiling in z/OS
Dave DayDeveloper
Cole Softwaredday@colesoft.com
Contact: Bob Shimizu (800) 932-5150 bshimizu@colesoft.com
Presents
2
Event profiling: An entirely different methodology
An event-based profiler builds the application profile by using records that the operating system creates while the application is executing.
These records indicate: When events are occurring Where events are occurring How often events are occurring And exactly what events are occurring
Event-based profiling yields far more accurate profiles with NO direct interference of the application.
3
Unique features of event profiling Counts and elapsed times for SVCs Contention statistics, including Lock Suspend Wait time data Memory management event counts Program Call identification with event count totals Tracking of events across address space boundary I/O response times No hooks to the operating system No internal interference with profiled application
4
Event Profiling (cont’d) z/OS automatically records all useful events into
internal trace buffers. This overhead is already built into z/OS.
Trace data is perfect for event profiling. All an Event Profiler needs to do is capture it and organize it into useful reports.
All work is done in the Profiler’s own address space: No hooks, No intercepts, No interrupts forced upon the
application being profiled. No TCBs, No SRBs, No Interference in the target address
spaces whatsoever. Vastly more data points (events) with which to build reports.
5
A CPU Utilization Problem This study illustrates the advantage that event profiling has over
existing profiling technology. It is taken from a recent beta test site.
The user had attempted unsuccessfully to use two of the current profilers on the market today to identity the reason for increased CPU consumption by an application.
In an attempt to reduce I/O, the user had moved some buffers above the bar into 64 bit storage. However, this change increased CPU utilization considerably.
We’ll start by examining the TOP CPU Consumers report for two runs. The 1st run shows the results before moving the buffers above the
bar. The 2nd run shows the results after moving the buffers.
6
Top CPU Consumers(before buffer change)
WORK LOAD CSECT BEGIN END TOTAL % UNIT MODULE OFFSET OFFSET EVENTS TOTAL THIS CPU GROUP EVENTS
BATCHTCB LOADMOD1 WAITLIST 000130 000137 400,679 40.31BATCHTCB LOADMOD1 TIMEUSED 000060 000067 276,920 27.86BATCHTCB LOADMOD1 DKEXCP 000060 000067 138,131 13.89BATCHTCB LOADMOD1 WAIT 000030 000037 4,492 00.45BATCHTCB LOADMOD1 GETMAIN 0000D8 0000DF 1,113 00.11
7
Top CPU Consumers(after buffer change)
WORK LOAD CSECT BEGIN END TOTAL % UNIT MODULE OFFSET OFFSET EVENTS TOTAL THIS CPU GROUP EVENTS
BATCHTCB LOADMOD1 DKEXCP 000060 000067 380,707 25.99BATCHTCB LOADMOD1 WAITLIST 000130 000137 375,069 25.60BATCHTCB LOADMOD1 TIMEUSED 000060 000067 259,604 17.72BATCHTCB LOADMOD1 WAIT 00030 000037 4,324 00.29BATCHTCB LOADMOD1 GCORE64 000250 000257 1,363 00.09
Why is there a difference between these two runs?
8
TOP CPU Consumers This is a “break-out” report before the buffer change.
WORK UNIT .................................................. BATCHTCBLOAD MODULE ................................................. LOADMOD1CSECT ....................................................... DKEXCPBEGIN OFFSET LOCATION ....................................... 00000060END OFFSET LOCATION ......................................... 00000067TOTAL EVENTS THIS OFFSET .................................. 138,131PERCENT OF TOTAL CPU EVENTS ................................. 13.89NBR OF MODULES REFERENCED FROM THIS OFFSET ................ 56LOAD MODULE PROGRAM OFFSET BREAKOUT VALUE ................. 5
LOAD MODULE ................................................. IOSVSSCQTOTAL EVENTS THIS LOAD MODULE ............................. 1,132PERCENT OF LOCATION EVENTS .................................. 00.81PERCENT OF TOTAL CPU EVENTS ................................. 00.11
LOAD MODULE ................................................. IAXPQTOTAL EVENTS THIS LOAD MODULE ............................. 836PERCENT OF LOCATION EVENTS .................................. 00.60PERCENT OF TOTAL CPU EVENTS ................................. 00.08
9
TOP CPU ConsumersThis is a continuation of the break-out report: LOAD MODULE ................................................. IGC0023ITOTAL EVENTS THIS LOAD MODULE ............................. 449PERCENT OF LOCATION EVENTS .................................. 00.32PERCENT OF TOTAL CPU EVENTS ................................. 00.04
LOAD MODULE ................................................. IRASRCHPTOTAL EVENTS THIS LOAD MODULE ............................. 123PERCENT OF LOCATION EVENTS .................................. 00.08PERCENT OF TOTAL CPU EVENTS ................................. 00.01
LOAD MODULE ................................................. BPXMIDIETOTAL EVENTS THIS LOAD MODULE ............................. 37PERCENT OF LOCATION EVENTS .................................. 00.02PERCENT OF TOTAL CPU EVENTS ................................. 00.00
10
TOP CPU ConsumersHere is the break-out report after the buffer change:
WORK UNIT ................................................... BATCHTCBLOAD MODULE ................................................. LOADMOD1CSECT ....................................................... DKEXCPBEGIN OFFSET LOCATION ....................................... 00000060END OFFSET LOCATION ......................................... 00000067TOTAL EVENTS THIS OFFSET .................................. 380,707PERCENT OF TOTAL CPU EVENTS ................................. 25.99NBR OF MODULES REFERENCED FROM THIS OFFSET ................ 72LOAD MODULE PROGRAM OFFSET BREAKOUT VALUE ................. 5
LOAD MODULE ................................................. IGC0023ITOTAL EVENTS THIS LOAD MODULE ............................. 249,805PERCENT OF LOCATION EVENTS .................................. 65.61PERCENT OF TOTAL CPU EVENTS ................................. 17.05
LOAD MODULE ................................................. IAXV6TOTAL EVENTS THIS LOAD MODULE ............................. 2,560PERCENT OF LOCATION EVENTS .................................. 00.67PERCENT OF TOTAL CPU EVENTS ................................. 00.17
11
Here’s what happened Csect DKEXCP had jumped to the top of the CPU
Consumers report. Within DKEXCP, it was obvious that IGC0023I had
gone from less than 500 events to ¼ million. The next slide contains data from the LOAD
MODULE SUMMARY Report. Lets see what activity is taking place inside
IGC0023I.
12
Load Module ActivityLOAD MODULE ............................................... IGC0023IEXECUTING UNDER WORK UNIT ................................. BATCHTCBDATA CAPTURE RECORD NUMBER, FIRST EVENT ................... 4,355DATA CAPTURE TIME, FIRST EVENT ..................... 14.17.06:34.7535DATA CAPTURE RECORD NUMBER, LAST EVENT .................... 2,260,369DATA CAPTURE TIME, LAST EVENT ...................... 14.21.45:83.1730
TOTAL OBSERVED EVENTS, THIS LOAD MODULE ................... 252,040PERCENT OF WORK UNIT TOTAL ................................ 19.37PERCENT OF SEGMENT TOTAL .................................. 15.83RATE PER SECOND ........................................... 900.14
TOTAL NUMBER OF MEMORY MANAGEMENT EVENTS .................. 125,397SVC 120--GETMAIN/FREEMAIN ................................. 3BR ENTER GETMAIN/FREEMAIN ................................. 6PROGRAM CALL IARV64 ....................................... 125,388
13
Why did event profiling succeed?Why didn’t the other profilers identify the problem?
Answer: The STATUS SVC to STOP the TCBs acquires the LocalLock. Storage functions, whether SVC or PC entered, acquire theLocal Lock as well.
Any profiler that needs the Local Lock to perform its logic during the active part of the interval will never observe any activity in any application when that application owns the Lock.
14
How Event Profiling works
z/OS is an interrupt-driven operating system.
Both hardware and software generate millions of interrupts per hour!
As part of interrupt processing, trace records are written to the trace table.
Trace records tell an Event Profiler a lot about work in the system.
15
Events: PR/SM Interrupts The PR/SM Dispatcher (in firmware) monitors
all LPARs to decide if/when to switch processing from one LPAR to another.
By default, this monitoring occurs 40 times per second.
The PR/SM Dispatcher receives control via an interrupt.
Each interrupt generates a trace record. That creates a lot of data points for an Event Profiler.
16
Events: System Processing Events
Significant system processing events also are recorded in the trace table: SVC interrupts and returns Program Call (PC) and returns (PR or PT) Program checks (0C4, page-in needed, linkage stack
creation-needed, etc.) Locks: Suspend time and re-dispatch time. System dispatcher
SVC counts and timings can be determined, per SVC individual service.
PC routine counts can be determined, per individual PC routine.
17
Profiler-generated Interrupts
When the profiler is active, it creates its own periodic timer interrupts.
So even in CPU intensive non-LPAR systems, (low interrupt rates), meaningful data points can still be collected.
18
Comparison of event profiling to another profiler
At another of the early beta sites, a user used another profiling product against a batch job: The job ran for 22 minutes with that profiler active, and had
22,000 samples taken. They then ran the same job using our event profiler. For the
same batch job, our profiler captured 1,185,000 events with the PSW pointing into the targeted job.
Although the profiles were identical for the 1st position, or 1st reported csect and offset, event profiling identified two new program locations at the #2 and #3 slot in the list.
19
Interrupts, and other events, are recorded in a trace table.
The trace table consists of a set of buffers, linked in a circular chain.
Each processor active on the image has a unique trace table.
Starting in z/OS 1.10, the trace table was enlarged and moved above the bar(64 bit storage). The default table size was changed from 64 4k buffers to 256 4k buffers.
Trace Table Structure
20
Trace Table Records There are two categories of trace table records:
Implicit records are created automatically by the micro code when certain events occur. These records are documented in POPs (Chapter 4).
Explicit records are created by software issued TRACE instructions. These records are described in SYS1.MACLIB(IHATTE).
Explicit trace entries all have a common record header. The header contains a TOD stamp, a record identification, and usually a TCB address (where applicable).
Most trace records contain a PASN ID value to identify the address space where the event occurred. When applicable, they contain PASN, SASN, and HASN values.
Record types found most often are: External Interrupts Start Sub-channel I/O Interrupts SVC
21
Implicit Trace Table Records
Implicit trace records of most importance to an Event Profiler are: Program Call (PC) Program Return (PR) and Program
Transfer (PTThe ability to track Program Call and
Program Return/Program Transfer is unique to event profiling
22
What makes Event profiling unique?
The data is just there, in the trace table.Current event profiling technology
captures 99.9% of events. No hooks are required to get these
numbers.
23
Contention
Event profiling is capable of reporting contention-induced waits, including: Lock Suspends; Enqueues generated by both SVC and ISGENQ Latch Contention (available at the TCB level of
reporting only); CPU The trace record that signals the interrupt contains
the PSW. If z/OS does not take the processor away from the executing program at the end of interrupt processing, no dispatch record is created when the program resumes
execution.
24
CPU Contention
When z/OS gives control of the processor to another program, there is a dispatch record for the “other” program, and when the original program again regains the processor, there is a dispatch record.
By keeping track of the interrupt locations and the dispatch location, the number of times the processor was taken away from a program can be computed.
By subtracting the interrupt time from the dispatch time the time spent waiting for a processor can be computed. CPU contention can be measured!
25
Contention Statistics: TCB or task level reporting.
TOTAL CONTENTION RELATED WAIT/PAUSE TIME. INCLUDES SVC ENQ,
ISGENQ, LOCK SUSPEND, LATCH, AND WAIT FOR CPU ...... 00.04.03:51.6056 TOTAL NUMBER OF CONTENTION EVENTS OBSERVED ................ 70,186 TOTAL ELAPSED WAIT TIME FOR LATCH OBTAIN ........... 00.00.50:13.6001 NUMBER OF TIMES OBSERVED WAIT FOR LATCH OBTAIN ............ 5 AVERAGE ELAPSED WAIT FOR LATCH ..................... 00.10:02.7200 TOTAL ELAPSED WAIT TIME FOR ENQ SVC ................ 00.01.41:21.5120 NUMBER OF TIMES OBSERVED WAIT DURING ENQ SVC .............. 504 AVERAGE ELAPSED WAIT DURING ENQ SVC ................ 00.00:20.0823 TOTAL ELAPSED WAIT TIME FOR ISGENQ ................. 00.01.20:22.2260 NUMBER OF TIMES OBSERVED WAIT DURING ISGENQ ............... 8 AVERAGE ELAPSED WAIT, ISGENQ PROCESSING ............ 00.10:02.7782
26
Contention Statistics: TCB or task level reporting.
TOTAL ELAPSED TIME SUSPENDED WAITING FOR A LOCK..... 0.00.00:00.0439
NUMBER OF TIMES OBSERVED LOCK SUSPEND ..................... 13
AVERAGE ELAPSED TIME SUSPENDED FOR LOCK ............ 00.00:00.0033
TOTAL ELAPSED TIME WAITING FOR A CPU ............... 00.00.11:94.2234
NUMBER OF TIMES OBSERVED CPU LOSS DUE TO CONTENTION .... 69,656
AVERAGE ELAPSED TIME WAITING FOR A CPU ............. 00.00:00.0171
27
Self-induced Waits
Includes: SVC entry to wait Branch entry to wait Entry to PAUSE (pause/release function)
28
Some statistics on self-induced Waits
TOTAL NON-CONTENTION WAIT/PAUSE TIME. SVC ENTERED WAIT, BR ENTERED WAIT, STIMER SVC, AND BR ENTRY TO PAUSE ............ 00.09.57:52.9055 NUMBER OF TIMES OBSERVED SVC ENTRY TO WAIT ................ 21,360 NUMBER OF TIMES OBSERVED ENTRY TO WAIT, EITHER SVC OR BR ENTRY, WHERE ENTRY TO WAIT RESULTED IN AN ACTUAL WAIT ............................................ 21,303 TOTAL ELAPSED TIME, SVC OR BR ENTRY TO WAIT ........ 00.04.06:46.2399 AVERAGE ELAPSED WAIT TIME .......................... 00.00:01.1569
29
More Wait statistics TOTAL ELAPSED WAIT/PAUSE TIME LOAD LOCATION MODULE MODULE TOTAL ELAPSED %WK UNIT MODULE BEGIN END WAIT WAIT TOTAL EVENTS TIME WAIT TIME APIVP000 PRIVATE 00007000 00010120 355 09.12:67 92.53 IGG019AR LPA 00D05AF8 00D05C88 20,478 00.34:35 05.02 APIVNRNT EXT-PRIV 19044000 190444B8 398 00.02:09 00.00 APIVNRNT EXT-PRIV 190444B8 19044970 200 00.02:08 00.00 IGG019AQ LPA 00D0C718 00D0CA10 213 00.00:77 00.00
30
I/O Response
Count and elapsed timings for DASD and Tape devices. SSCH-to-I/O interrupt time.
Shows the actual response time the device is presenting to the I/O sub-system.
31
SLIP Per events
Is a useful aid when debugging. Individual trace table entries can be formatted that
show not only the location of the PSW for the event, but the PSW state, key, mode,etc.
Combine this with SLIP/PER trace records, and it is possible to follow the execution logic, or program flow.
32
Other Statistics Available with event profiling
The next set of statistics is a portion of the SVC count and elapsed time part of a Summary Report. This is something that z/XPF refers to as the EVENT BREAKOUT Report. These statistics are available for the entire address space, at the TCB level, at the Load Module level, at the Csect level, and at a user defined group breakout level within a Csect.
The data is taken from a driver application specifically coded to verify z/XPF’s accuracy. In other words, it has no relationship to real world data processing. The app was coded to execute functions to generate trace records.
33
SVC Report SVC'S BY TYPE, IN ELAPSED TIME DESCENDING ORDER
NBR-DESCRIPTION EXECUTION ELAPSED TIME % TOTAL % TOTAL
COUNT THIS SVC SEGMENT SEGMENT
MM.SS:TH.MICS SVC TIME ELAPSED
47 STIMER-STIMERM 753 05.40:66.4662 98.04 84.61
26 CATALOG-LOCATE 34 00.01:56.1720 00.00 00.00
56 ENQ 532 00.00:87.9034 00.00 00.00
19 OPEN 10 00.00:77.3815 00.00 00.00
114 EXCPVR 20,381 00.00:76.3307 00.00 00.00
99 DYNALLOC 14 00.00:76.2501 00.00 00.00
20 CLOSE 16 00.00:68.7829 00.00 00.00
55 EOV 12 00.00:50.2786 00.00 00.00
48 DEQ 523 00.00:28.1959 00.00 00.00
132 RACLIST 6 00.00:18.8834 00.00 00.00
12 SYNCH 265 00.00:11.1521 00.00 00.00
08 LOAD 27 00.00:10.3999 00.00 00.00
130 RACHECK 12 00.00:08.0463 00.00 00.00
120 GETMAIN-FREEMAIN 1,502 00.00:00.5438 00.00 00.00
34
Detail Reporting
Here are some examples of Detail Reporting. All profilers on the market have summary reporting capability. They can produce a profile by summarizing individual records recorded during their interval.
With event profiling, the individual event that contributes to the profile can be examined as well. The 1st report contains SSCH and I/O interrupt events. The 2nd report contains Lock Suspend and Dispatch events.
35
Detail ReportingCPU=00, RECORDED AT 14.18.40:16.4544, RECNUM=83/0027,EVENT=START SUBCH, TIME=14.18.40:14.4966, DEVICE=0A91, BASE=0A91IOSASID=0017,TCB=008DBAB0,DRVR ID=03,VSAMTASK=APIVP000, TOKEN=0000005C0000000200000027008DBAB0
CPU=00, RECORDED AT 14.18.40:16.4554, RECNUM=84/0027,CMPT'D,EVENT=I/O INTRRUPT, TIME=14.18.40:15.0146, DEVICE=0A91, BASE DEVICE=0A91PASN=0001,HASN=0001, PSW=070E000000000000, LOC=KEY=0, STATE=SUP, ASC=PRIM, MODE=24,INSTR ADDR=LM=PNEH-PRV,OFFSET=FFFFA000TASK=UNKNOWN, ADDRESS= 00000000DEV STATUS =,,DEVICE END,CHANNEL END,,,,,LOCKS HELD(PSACLHS=00000080,(IOS UCB LOCK))
CPU=00, RECORDED AT 14.18.40:16.4604, RECNUM=86/0027,EVENT=START SUBCH, TIME=14.18.40:15.1294, DEVICE=0A91, BASE=0A91IOSASID=0017,TCB=008DBAB0,DRVR ID=02,EXCP PROCESSORTASK=APIVP000, TOKEN=0000005C0000000200000027008DBAB0
CPU=00, RECORDED AT 14.18.40:16.4636, RECNUM=88/0027,CMPT'D,EVENT=I/O INTRRUPT, TIME=14.18.40:16.1664, DEVICE=0A91, BASE DEVICE=0A91PASN=003E,HASN=003E, PSW=070C0000813E1192, LOC=NUCLEUSKEY=0, STATE=SUP, ASC=PRIM, MODE=31,INSTR ADDR=013E1192LM=IGC003,OFFSET=0000036ATASK=UNKNOWN, ADDRESS= 008E2B58DEV STATUS =,,DEVICE END,CHANNEL END,,,,,LOCKS HELD(PSACLHS=00000081,(IOS UCB LOCK,LOCAL))
36
Detail ReportingCPU=00, RECORDED AT 14.18.40:11.0026, RECNUM=77/0025,PSW ID'DEVENT=LOCK SUSPEND, TIME=14.18.40:09.3303, TYPE =LOCLHASN=0017, RTN ADDR=00C3532E, LOC=CSALM=HASCHAM,OFFSET=00005B46,CSECT=HASCHAM,OFFSET=00005B46TASK=APIVP000, TOKEN=0000005C0000000200000027008DBAB0SUSPENDED PRB =008FD660
CPU=00, RECORDED AT 14.18.40:11.0065, RECNUM=79/0025,CMPT'D, PSW ID'DEVENT=TASK DISPTCH, TIME=14.18.40:09.3860, TASKPASN=0017,HASN=0017, PSW=070C000080C3532E, LOC=CSAKEY=0, STATE=SUP, ASC=PRIM, MODE=31,INSTR ADDR=00C3532ELM=HASCHAM,OFFSET=00005B46,CSECT=HASCHAM,OFFSET=00005B46TASK=APIVP000, TOKEN=0000005C0000000200000027008DBAB0LOCKS HELD(PSACLHS=00000001,(LOCAL))
CPU=00, RECORDED AT 14.18.40:44.5261, RECNUM=118/002E,PSW ID'DEVENT=LOCK SUSPEND, TIME=14.18.40:42.4249, TYPE =LOCLHASN=0017, RTN ADDR=00C3532E, LOC=CSALM=HASCHAM,OFFSET=00005B46,CSECT=HASCHAM,OFFSET=00005B46TASK=APIVP000, TOKEN=0000005C0000000200000027008DBAB0SUSPENDED PRB =008FD660
CPU=00, RECORDED AT 14.18.40:44.5308, RECNUM=120/002E,CMPT'D, PSW ID'DEVENT=TASK DISPTCH, TIME=14.18.40:42.4800, TASKPASN=0017,HASN=0017, PSW=070C000080C3532E, LOC=CSAKEY=0, STATE=SUP, ASC=PRIM, MODE=31,INSTR ADDR=00C3532ELM=HASCHAM,OFFSET=00005B46,CSECT=HASCHAM,OFFSET=00005B46TASK=APIVP000, TOKEN=0000005C0000000200000027008DBAB0LOCKS HELD(PSACLHS=00000001,(LOCAL))
37
Desktop GUI Interface
The following slides are examples of what can be done by an event profiler with system trace events, after the data has been massaged and downloaded to a desktop program for analysis.
These screen-shots fulfill the adage, “A picture is worth a thousand words”
38
39
40
41
42
43
44
45
46
47
48
Future directions/development
Add logic to recognize and report on all major sub-systems. CICS, IMS, ADABAS, etc., etc..
JAVA reporting. CPMF data. CPMF is the new hardware facility IBM
has announced that creates processor records indicating the instruction address of the currently executing instruction on the processor. Combining this data with z/XPF’s current reporting will give z/XPF the ability to report accurately on CPU utilization with a greatly enhanced degree of accuracy within target profile application load modules.
49
Future directions, cont’d
Global profiling. Determine which pieces of code in the entire system were most active.
The contents of the trace table would be examined to identify the instruction address associated with the event. Would not exclude by ASID value.
What value is this? A piece of code that is called by many/all active units of work on a system may not be a significant resource consumer to an individual caller but may be so to the entire system.
50
Copies of Reports
The data beyond this page is a set of Summary Reports. The target used to generate the events for the profile bears no resemblance to any real world application. It was coded strictly to verify that z/XPF is picking up the events from the trace table correctly. The lowest level of detail in these reports is the Csect. It is possible to break out a Csect by offset groups, giving statistics within a Csect, in 64 byte groupings.
51
Thanks for your time
Event profiling offers an exciting new way to capture MORE measurement information about the internal operations of applications than ever before possible.
For more information, please contact us: www.colesoft.com Bob Shimizu (800) 932-5150
bshimizu@colesoft.com