IBM Software Group
IDz/ADFz Workbench –
Using Fault Analyzer to Analyze
and Solve z/OS ABENDs
Jon Sayles, IBM zDevOps Enablement
© Copyright IBM – April 2021
IBM Trademarks and Copyrights © Copyright IBM Corporation 2008 through 2021.
All rights reserved – including the right to use these materials for IDz instruction.
The information contained in these materials is provided for informational purposes only, and is provided AS IS without warranty of any kind, express or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, these materials. Nothing contained in these materials is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software. References in these materials to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates.
This information is based on current IBM product plans and strategy, which are subject to change by IBM without notice. Product release dates and/or capabilities referenced in these materials may change at any time at IBM’s sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way.
IBM, the IBM logo, the on-demand business logo, Rational, the Rational logo, and other IBM Rational products and services are trademarks or registered trademarks of the International Business Machines Corporation, in the United States, other countries or both. Other company, product, or service names may be trademarks or service marks of others.
The IDz Workbench Curriculum
▪ Module 1 – IDz Terms, Concepts and Navigation
▪ Module 2 – Editing Your COBOL Programs
▪ Module 3 – Analyzing COBOL Programs
▪ Module 4 – Remote Systems – Connect, Navigate and Search
▪ Module 5 – Remote Systems – Dataset Access and Organization
▪ Module 6 – Remote Systems – ISPF 3.x, Batch Jobs and Batch Job Management
▪ Module 7 – MVS Subprojects – Organizing PDS Members and SCM Checkout
▪ Module 8 – The Data Tools – SQL Code/Test and DB2 Table Access
▪ Module 9 – Debugging z/OS COBOL Applications
Optional Modules
▪ IDz/Endevor Integration Through CARMA
▪ zUnit – Unit Test
▪ Code Coverage – Test quality feature
▪ Code Review – Application quality feature
▪ Menu Manager – Integrate ISPF REXX Execs and CLISTs
▪ Web Services – SOA development
▪ Fault Analyzer
▪ File Manager
Course Assumptions
1. You know ISPF and have used it for at least two years, doing production work on z/OS with COBOL, PL/I or Assembler
Note that all of the workshops in this course are in COBOL – although files exist that are Assembler and other languages for you to experiment with – as time permits
2. You have:
No experience with Eclipse or IDz
Some experience with PC tools
▪ You have used MS-Windows applications for at least one year
IDz installed and running on your workstation at version 8.0 or later
▪ Note that all ISPF discussion/examples and screen captures assume IBM-installed ISPF product defaults – not any 3rd party or custom Dialog Manager applications you may have installed on your mainframe
Course Contributing Authors
▪ Thanks to the following individuals for assisting with this course:
Russ Courtney/IBM
James Rice/IBM
Walter (Zack) Zakorchemny
David Bean/IBM-Rational
Ed Steele/IBM-Rational
Olivier Gauneau/IBM
Course Overview
▪ Audience
This course is designed for application developers who have learned or programmed in COBOL, and who need to do traditional z/OS development and maintenance as well as build leading-edge applications using COBOL and Rational Developer for System z.
▪ Prerequisites
This course assumes that the student has a basic understanding of software computing technologies and general data processing terms, concepts and vocabulary, as well as a working knowledge of COBOL and z/OS.
Knowledge of SQL (Structured Query Language) for database access is assumed as well.
Basic PC and mouse-driven development skills, terms and concepts are also assumed.
UNIT
Topics:
The IDz Workbench
▪ Analyzing Mainframe Abends
▪ ABEND Codes and Reasons
▪ Fault Analyzer
▪ Appendices
This course is written for z/OS developers, not Systems Programmers. While Fault Analyzer has deep z/OS analytical tools that can be used by the Systems Programming staff, this material is aimed at COBOL and PL/I programmers responsible for discovering, analyzing and solving application program ABENDs.
Note: an ABEND is roughly equivalent to a thrown exception in Java, C++ and other modern languages.
Objectives
After completing the first two sections on Production
Support/Application Testing/Software Defects and
IBM Mainframe COBOL ABEND Research, you
should be able to:
Define the steps in a generalized methodology of ABEND
resolution
List the various sources of ABEND inputs, including:
▪ PD Tools documents
▪ Other SYSOUT
▪ Dynamic trace facilities
▪ Static code analytics
List the common types of COBOL program ABENDS
Program ABENDs – Overview
▪ When an application ABEND (ABnormal END-of-job) occurs, z/OS stops executing your program, closes files and buffers, and generates a high-level message in the form of a System Completion Code (Sxxx) or a USER code (typically 4038).
▪ The System Completion Code is typically written to an output listing file through your //SYSOUT DD SYSOUT=* JCL entry.
▪ The completion code indicates the z/OS system's reason for stopping execution of your program.
▪ The completion code is related, but often only loosely, to what is invalid in your code.
▪ Because of this, the System Completion Code represents the starting point for your analysis of the problem.
She won't be laughing when she gets back to her desk
and finds out that last night's production batch stream never finished…
Analyzing and Solving a z/OS ABEND
There are as many ways to analyze and research ABENDs as there are individual approaches to solving a business problem with procedural logic.
However, if you've never done software production support work, consider starting with the following structured problem-solving approach:
1. Preparation
2. Research
3. Hypothesis
4. Solution
5. Resolution
As a final note before we begin, understand that there are usually two distinct phases of z/OS application Production Support:
1. Data Center "on-call" ABEND resolution – where a technician receives notification that a job or transaction has ABENDed and often must be "fixed" within an extremely short timeframe (measured in hours). In this case, the technician's concern is to "patch" the problem: get the system back online, or get the batch job-stream back into production.
2. Root Cause Analysis – where the programmers responsible for the application track down and solve the problem that caused the ABEND, typically by establishing "why" the ABEND happened.
The steps that follow represent a common approach to "Fix-It": they include ABEND resolution and proceed to the Root Cause Analysis workflow.
1. Preparation and Information/Input Gathering – 1 of 4
Collect background information on: What happened, When and WHERE the ABEND occurred.
1. Start with Fault Analyzer reports, which contain a deep set of formatted analytic information
2. Collect additional supporting ABEND output
▪ SYSOUT from the job
▪ DISPLAY statements…
3. Obtain copies of the run-time:
▪ JCL/PROCs
▪ Program source for all modules
▪ Compile listings & Link Maps
4. Grab "data dumps"
▪ Data File records
▪ DB2 Table values
▪ Log files
Especially if the problem is data-specific
1. Preparation and Information/Input Gathering – 2 of 4
From the Run-stream JCL and/or from the JES spool file(s) (JESJCL) retrieve the DSNs of input and output files accessed by the program.
If the ABEND occurred in an online app you will need to gather the same kinds of background information from the:
- IMS SYSGEN Tables, and individual IMS GEN control blocks
- CICS – RDO Table entries
1. Preparation and Information/Input Gathering – 3 of 4
Application Discovery (A.D.)
IBM's A.D. (static analysis) tool can simplify the research and discovery phases of ABEND resolution, both for batch and online applications
[Figures: diagram showing datasets and DB2 table access points; Screen/Transaction/Program Call Graph]
1. Preparation and Information/Input Gathering – 4 of 4
You can often benefit by spending some time with a Subject Matter Expert, as you often need to know "What" the application or programs do for the business.
The “What” of an application is the province of either: Business (End) Users, Architects or Systems Analysts
However it may not be possible to find someone – as most of the first (and even second) generation of Business Application developers have retired
Research – Analyze the ABEND data
▪ Using the information and inputs from the Preparation step, construct a mental map of the program's execution: HOW and WHY the ABEND occurred.
▪ Determining WHY an ABEND occurred usually requires a combination of "Static" and "Dynamic" analysis tools and techniques.
▪ These steps need not be followed in this order. In time you will develop an "intuition" as to which kind(s) of analysis will be most likely to provide the information needed to solve the problem.
▪ To assist, use application research and analysis tools such as:
▪ IBM's Application Discovery (AD) and IDz's Static Analysis tooling:
– Program Control Flow & PERFORM Hierarchy
– Data Flow
▪ IBM Debug Tool – for Dynamic (run-time) Analysis
– You will most likely need an image copy of the data that caused the ABEND
▪ So, if the reason "why" the ABEND occurred is not apparent at this point, perform Static and/or Dynamic Analysis on the specific areas of the application relating to the ABEND.
Overview of Techniques: Static Analysis
1. Structural Visualization: the generation of an accurate mental map of the program's control structure, or logic-architecture. Using the starting point represented by the ABEND condition (the statement which caused z/OS to halt execution) and using electronic-assisted tools (such as IBM's Rational Asset Analyzer or Rational Developer for System z), build an accurate understanding of the code invocation at: the module/file level (System View), the Paragraph/Section level (Hierarchy chart) and the Statement level (Flow chart).
Structural Visualization can be done "top-down", by asking open-ended questions, such as learning how a particular routine "hangs together logically", or "bottom-up", by asking specific close-ended questions about a program, such as "How does this particular paragraph get executed?" or "How did this module get invoked?"
2. Data Flow Analysis: A combination of control structure analysis and data item analysis, which seeks to determine the usage of fields throughout a program. Data flow analysis is used to determine (from a given instance of a data item) where the next occurrences of that item exist in your program, and how the data item is used: as a receiving field in a MOVE or mathematical operation, as the sending field in a MOVE statement, or as part of a logic-branch (IF, PERFORM UNTIL/VARYING, etc.).
3. Data Impact Analysis: An expansion of Data Flow Analysis which traces the movement of data from field to field throughout a program, or throughout an entire application, including I/O (screens and files). Using Data Impact Analysis, you can identify all fields that might have had an impact on the contents of a field (before the ABEND occurred). Just as importantly, you can learn the effect that changing this field will have on the behavior of the application.
4. Textual or Data Item Usage: Utilized more for application maintenance and enhancement requests, this type of Static Analysis involves searching for "categories" of program items, such as "List all fields that contain *JUL*, *GREG*, *YR*, *YEAR*" (suspect date candidates for Year 2000 conversion), or "List all such fields with two-digit (numeric) or two-byte (alphanumeric) definitions".
5. Code Partitioning: Again utilized more for application maintenance, enhancements and application reengineering, Code Partitioning involves mentally organizing and analyzing code by function or process, so that you understand and can distinguish the usage of code by business process. For example: find all code that relates to the calculation of premium renewal payments, or isolate the code that edits a particular file, with an eye towards creating a shared subroutine from the code.
IBM's A.D. is the fastest & most accurate approach to Static Code Analysis
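To make Data Flow Analysis (technique 2 above) concrete, here is a minimal Python sketch that scans COBOL source lines and reports where a field is the sending or the receiving operand of a simple MOVE. It is only an illustration of the idea, not how A.D. or IDz work internally: the regular expression handles one-line MOVE statements only, and the field names in the sample program are made up.

```python
import re

# Matches simple one-line "MOVE sender TO receiver [receiver ...]" statements.
MOVE_RE = re.compile(r"\bMOVE\s+(\S+)\s+TO\s+(.+?)\.?\s*$")

def field_usage(source_lines, field):
    """Return (line_number, role) pairs for each simple MOVE touching `field`."""
    field = field.upper()
    usage = []
    for number, line in enumerate(source_lines, start=1):
        match = MOVE_RE.search(line.upper())
        if not match:
            continue
        sender = match.group(1)
        receivers = match.group(2).split()
        if sender == field:
            usage.append((number, "sending"))
        if field in receivers:
            usage.append((number, "receiving"))
    return usage

# Hypothetical program fragment, for illustration only.
program = [
    "    MOVE IN-AMOUNT TO WS-AMOUNT.",
    "    ADD 1 TO WS-COUNT.",
    "    MOVE WS-AMOUNT TO OUT-AMOUNT RPT-AMOUNT.",
]

print(field_usage(program, "WS-AMOUNT"))
```

A real data flow tool also follows arithmetic verbs, group moves and REDEFINES, which this sketch deliberately ignores.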
Overview of Techniques: Dynamic Analysis
1. Follow Program Logic: Source-level interactive debugging. Watch the program execute statement-by-statement, and line-by-line. This is very useful for detailed-debugging, particularly of dense or complex instructions. Some software (for example, the Rational Developer for System z) allows you to trace the program logic, attempting to re-create the sequence of events (COBOL statements) that transpired up to and including the ABEND condition. Given the size and scope of production applications, it is generally more practical to trace specific problem areas of a program.
2. Interactive Execution: Execute (run) a program, stopping at selective breakpoints (pause execution each time a certain field value changes, or when a value exceeds some threshold), and examining the contents (value) of specific fields. Interactive Execution must be done by (or with) an application analyst who understands how the system is supposed to operate. Interactive Execution is useful for observing control flow, and is often combined with line-by-line tracing by setting selective breakpoints, monitoring values, "running" the application to the breakpoints, and then tracing the code line-by-line.
3. Selective Data State Collection: Execute code and establish a functional summary of specific data value states. Use these states in subsequent test runs to compare results of current values to expected values. Debug Tool's auto-log feature is beneficial here.
4. Code Coverage: Analyze the number of times each COBOL statement is executed for a given run. Note that PD Tools/Debug Tool can run a report that shows code coverage. This technique is extremely useful for analyzing test data coverage of a given application. It can also be used effectively for debugging if it makes apparent problems such as infinite loops (S222, S322 and B37 ABENDs) or over-loading tables (loading tables beyond the maximum OCCURS clause and overlaying storage), which can cause S0C1, S0C4 and even S0C7 ABENDs.
IBM's Test/Debug Tools are the best-in-class approach to Dynamic Analysis
Hypothesis - Determine WHY the ABEND occurred – 1 of 4
▪ Your preparation and research is probably all you need to be able to describe WHAT, WHERE and HOW the ABEND occurred
In other words; at what point in the program the logic failed, and during what sequence of COBOL statements…
▪ However, before modifying any business logic you must determine WHY these statements (sequence of steps) caused the failure:
"Why did this production input file contain spaces in a numeric field?"
▪ The data was supposed to have been edited at this point in the batch stream
"Why did the program's logic perform the Initialization routine twice?"
"Why did the Read routine execute past end-of-file?"
"Why did this alpha data end up in a packed LINKAGE field?"
▪ Only through a determination of WHY will you be able to make a change to production business logic safely, and with confidence that:
Your change will resolve the ABEND
Your change will not introduce new (additional) ABENDs
Hypothesis - Determine WHY the ABEND occurred – 2 of 4
▪ Sometimes it is relatively easy to come to an understanding of WHY certain ABEND conditions occurred. For example, perhaps a period was left off the appropriate termination point for an IF statement - which caused execution to perform an operation out of sequence. Or perhaps an IF .. NUMERIC test (which should have been coded for all numeric fields in a file) was forgotten. Or a paragraph was performed through the wrong paragraph-exit, or a production job was released before certain files were available (causing I/O errors). These types of ABEND situations can be understood (and usually resolved) fairly quickly. But, not always.
What if, in the case of the IF statement with the incorrect termination point, the logic as coded correctly processed the first 100,000 records in the file?
▪ Making a change to a critical IF condition could very well affect other down-stream processing within the program, wreaking havoc with subsequent routines.
Or what if, in the case of the file containing blanks in the numeric fields, the input file was supposed to be "clean" (validated) by this point in the job-stream, having gone through allegedly "exhaustive" edits in prior modules?
▪ By simply adding an IF test you may solve your program's specific ABEND, but you will not have resolved the actual problem, which exists somewhere else in the system.
▪ In other words, localized/piecemeal approaches to resolving production ABENDs are not recommended, as they usually change the problem instead of solving it. And sometimes they just spawn new problems.
▪ It should be noted that a clear understanding of the business functionality automated by this process is almost always required to resolve WHY something has gone wrong.
Calling on business experts or "application/business" experts who understand "the big picture", and the context in which the job executes, is the rule rather than the exception in this process.
Hypothesis - Determine WHY the ABEND occurred – 3 of 4
Developing an accurate determination of WHY a problem that led to an ABEND condition exists may take a considerable amount of time, depending on the:
Size, complexity and structure of the code
▪ Number of copybooks, Calls, Files/IO, etc.
Your familiarity with the program's business purpose - coupled with your ability to grasp the point of each statement
Type of ABEND and reason for the problem (some are more diabolical than others)
Size of the input/output files – and complexity of the data
▪ Multi-level OCCURS tables, Multiple 01-records on a file, etc.
In addition to an understanding of the reason for the ABEND, the results of your investigation should produce an understanding of the solution to the problem (the fix itself).
Hypothesis - Determine WHY the ABEND occurred – 4 of 4
There are typically two categories of ABEND “WHY” issues:
1. Data problems
Incorrect schema mapping
Invalid numeric data
Uninitialized data
Invalid values
2. Procedural logic problems
Missing modules
Modules executed out-of-sequence
Paragraphs/Sections executed out-of-sequence
▪ Fault Analyzer provides a research starting point with its “Event” listings - for both of the above categories
Determine WHY the ABEND Occurred – Data Problems
Typical reasons for data problems include:
▪ Mapping issues:
Incorrect Copybook version(s)
▪ Verify with SCM & Release Management tool
Mismatched LINKAGE/Entry Using
▪ Use IDz’s Scan for compatibility tool
▪ Invalid numeric data
Typically causes S0C7 or Data Integrity exposures – because of:
Data editing procedures bypassed
▪ Trace runtime module & paragraph flow
▪ Including prior jobs in the batch stream
Misunderstanding of COBOL syntax
Values entering application from un-edited sources
Basic logic/coding errors
▪ View COBOL logic by using IDz’s tools:
– Data Flow diagram
– Occurrences in Compilation Unit
Data Problems – analyzed by IDz tools: Program Control Flow, Data Flow Diagram, Fault Analyzer, Application Discovery
[Screen captures: A.D., Fault Analyzer, Data Flow]
Determine WHY the ABEND Occurred – Procedural Problems
Typical reasons for procedural problems include:
Missing modules
▪ Check
– Compile/Link JES output
– Link Maps – note that this info is provided by Fault Analyzer
Modules executed out-of-sequence
▪ Fault Analyzer contains a CALL sequence table
▪ AD (Application Discovery)
▪ Application Architect or Systems Analyst input
Paragraphs/Sections executed out-of-sequence
▪ Perform Hierarchy
▪ Program Control Flow
▪ AD (Application Discovery)
▪ Fall thru – caused by:
– Poor program design/Obsession with GO TO statements
– Coding errors: PERFORM 1000-UPDATE-RTN THRU 10000-EXIT.
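The PERFORM ... THRU coding error listed above can sometimes be spotted mechanically. The Python sketch below flags PERFORM THRU statements whose start and exit paragraph names carry different numeric prefixes. It assumes a shop naming convention (not stated in this course) in which a paragraph and its exit share the same leading number, so treat any finding as a hint to investigate, not as proof of a defect.

```python
import re

# Matches "PERFORM <start> THRU|THROUGH <end>" and captures both paragraph names.
PERFORM_THRU_RE = re.compile(
    r"\bPERFORM\s+([A-Z0-9-]+)\s+(?:THRU|THROUGH)\s+([A-Z0-9-]+)", re.IGNORECASE
)

def numeric_prefix(paragraph_name):
    """Return the leading digits of a paragraph name such as 1000-UPDATE-RTN."""
    match = re.match(r"\d+", paragraph_name)
    return match.group(0) if match else ""

def suspicious_perform_thru(source_lines):
    """List (line_number, start, end) where the two numeric prefixes differ."""
    findings = []
    for number, line in enumerate(source_lines, start=1):
        for start, end in PERFORM_THRU_RE.findall(line):
            if numeric_prefix(start) != numeric_prefix(end):
                findings.append((number, start, end))
    return findings

# The first line is the coding error quoted on the slide; the second is a
# made-up, correctly paired statement for comparison.
sample = [
    "    PERFORM 1000-UPDATE-RTN THRU 10000-EXIT.",
    "    PERFORM 2000-READ-RTN THRU 2000-EXIT.",
]

print(suspicious_perform_thru(sample))
```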
Procedural Problems – Program Control Flow, Perform Hierarchy & A.D.
Procedural Problems – Fault Analyzer Runtime Event Capture
Solution - Fix the Problem and Test Your Solution
Take the appropriate action to resolve any business- or system-wide issues.
▪ Depending on how extensive the damage caused by the problem is, or on how long any problems have persisted undetected:
Files may have to be restored from backups from a previous point-in-time
Jobs may have to be re-run from a previous point-in-time (synchronized with file generations)
Files may have to be modified with "one-shot" programs, written to resolve issues that require "surgery" on the data
▪ Take the appropriate action to fix the technical (coding) problem:
Edit program source, modifying the existing production logic …and/or…
Modify the JCL (if the error included JCL issues)
You may have to edit files using File Manager
▪ Test your solution:
Compile and Link the new version of the application
Create an "image copy" of the production file system, in order to test your fix
Re-run the batch job and analyze results
Run "Regression Tests" against the new code and analyze for unexpected results
Resolution
▪ Build (Compile/Link) the program(s) within your test environment
▪ Test/Validate your hypothesis/solution
▪ Migrate source modifications using your Version Control or SCM
▪ Promote your changes to production
▪ Schedule/Re-run the cycle
▪ Document the problem and its resolution – and optionally build safeguards into your development practices:
▪ Maintenance procedures/Best practices
▪ Testing tools and platforms
▪ Coding standards
Section Summary
Having completed this section on Production Support/Application
Testing/Software Defects and IBM Mainframe COBOL ABEND
Research, you should now be able to:
Define the steps in a generalized methodology of ABEND resolution
List the various sources of ABEND inputs, including:
▪ PD Tools documents
▪ Other SYSOUT
▪ Dynamic trace facilities
▪ Static code analytics
List the common types of COBOL program ABENDS
UNIT
Topics:
The IDz Workbench
▪ Analyzing Mainframe Abends
▪ ABEND Codes and Reasons
▪ Fault Analyzer
▪ Appendices
Notes:
• In order to understand why the z/OS runtime ABENDs in your code, you will need to understand the z/OS software operations (Op Codes).
• The traditional point of study is the Principles of Operation (POP) manual (not light reading).
• Alternatively, your COBOL or PL/I instruction might have provided instructional guidelines.
• Specific z/OS releases and COBOL Compiler options can modify the ABEND (MVS) Code information in this section. You may need to discuss specific discrepancies found in this material with your Systems staff.
ABEND Completion Codes – And some typical causes
▪ There are as many reasons for ABEND conditions ("WHYs") as there are production systems. But it is useful to categorize HOW certain ABEND completion codes are caused by specific programming patterns. This can expedite your approach to ABEND analysis.
▪ The following information on a few common z/OS ABEND completion codes, and the conditions which generate them, is included to help you make effective use of PD Tools/Fault Analyzer listings and the debugging, research and analysis process described above.
Notes: This information is available to some degree within the ADFz product in the Lookup View. There are other sources of MVS Completion Codes that you can find on the web:
▪ http://ibmmainframes.com/references/a29.html
▪ http://ibmmainframes.com/topic-42-0-250.html
▪ http://www.jaymoseley.com/hercules/sabends.htm
S001: Record Length/Block Size Discrepancy
Reason(s)
S001-0: Conflict between record length specifications (program vs. JCL vs. dataset label)
S001-2: Damaged storage media or hardware error
S001-3: Fatal QSAM error
S001-4: Conflict between Block specifications (program vs. JCL)
S001-5: Attempt to read past end-of-file
Instructions: OPEN, CLOSE, READ, WRITE
Frequent Coding Causes:
S001-0: Typos in FD or JCL
S001-2: Corrupt disk or tape dataset
S001-3: Internal z/OS problem
S001-4: Forgot to code BLOCK CONTAINS 0 RECORDS in FD (default Block is 1)
S001-5: Logic error (either forgot to close file, or end-of-file-switch not set, overwritten or ignored)
Tools to debug/IDz equivalent return codes:
S001-0: Cannot occur on IDz with Local ASCII/Windows (Line Sequential) files
S001-2: Norton Utilities – if on Workstation/COBOL application
S001-4: Cannot occur on Workstation/COBOL (no blocking for Line Sequential files)
S001-5: Logic error: Use IDz's Perform Hierarchy or AD's Program Flow Diagram to detect
Dynamic:
S001-0: During Debug – set a Watch Monitor on the 01 record
S001-2: Need to have PC/IT technician investigate (may need to reformat disk)
S001-4: Always code BLOCK CONTAINS 0
S013: Conflicting DCB Parameters
Reason(s)
S013-10: Dummy data set needs buffer space; specify BLKSIZE in JCL
S013-14: DD statement must specify a PDS
S013-18: PDS member not found
S013-1C: I/O error searching PDS directory
S013-20: Block size is not a multiple of the LRECL
S013-34: LRECL is incorrect
S013-50: Tried to open a printer for Input or I/O
S013-60: Block size not equal to LRECL for unblocked file
S013-64: Attempted to Dummy out indexed or relative file
S013-68: Block size > 32K
S013-A4: SYSIN or SYSOUT not QSAM file
S013-A8: Invalid RECFM for SYSIN/SYSOUT
S013-D0: Attempted to define PDS with RECFM FBS or FS
S013-E4: Attempted to concatenate > 16 PDSs
Instructions: OPEN, CLOSE, READ, WRITE
Frequent Coding Causes:
Most of these ABENDs occur running under z/OS (some may not even occur under z/OS, although older modules running OS/VS or VS COBOL II code that have not been recompiled can produce them).
Most are due to JCL/COBOL FD inconsistencies.
Tools to debug – Static Analysis:
S013-18: Open multiple windows on AD Batch Job Diagram and program Environment Division -SELECT ASSIGN clause
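Several of the S013 reason codes above are simple arithmetic relationships between LRECL and BLKSIZE, so they can be checked before a job is submitted. The following Python sketch covers only three of them (S013-20, S013-60 and S013-68) for fixed-length records. The function name and the RECFM handling are illustrative assumptions, and the 32K limit is taken from the slide wording; the exact limit on your system may differ.

```python
MAX_BLKSIZE = 32 * 1024  # "32K" as stated on the slide; exact limit is an assumption

def check_dcb(recfm, lrecl, blksize):
    """Return the S013 reason codes that these fixed-length DCB values would trigger."""
    problems = []
    if blksize > MAX_BLKSIZE:
        problems.append("S013-68")   # block size larger than 32K
    if recfm == "FB" and blksize % lrecl != 0:
        problems.append("S013-20")   # BLKSIZE is not a multiple of LRECL
    if recfm == "F" and blksize != lrecl:
        problems.append("S013-60")   # unblocked file: BLKSIZE must equal LRECL
    return problems

print(check_dcb("FB", 80, 27920))    # 27920 is 349 * 80
print(check_dcb("FB", 80, 27900))
print(check_dcb("F", 80, 160))
```

An empty list means none of the three conditions was detected; it does not guarantee that the OPEN will succeed.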
S0C1: Invalid Instruction
Reason(s)
- SYSOUT DD statement missing
- The value in an AFTER ADVANCING clause is < 0 or > 99
- An Index or Subscript is out of range
- An I/O verb was issued against an unopened dataset
Instructions:
OPEN, CLOSE, READ, WRITE, Table handling routines
Note also that during Debug SYSOUT-DISPLAYs are written to the "console"
Frequent Coding Causes:
- Incorrect logic in setting AFTER ADVANCING variable (or failure to understand 0-99 limits)
- Incorrect logic in table handling code, or number of table entries has overflowed the PIC of variable
e.g. PIC 99 (two digits, max) - but there are 100 entries in the table
Tools to debug:
Static
SYSOUT problem: Open multiple windows on AD Batch Job Diagram and program Environment Division - SELECT ASSIGN.
In AD: Double-click on GO TO verb, or PERFORM chain, or paragraph name.
In IDz: Select Paragraph name/Perform chain and select: Open Declaration
Dynamic:
Set Watch Breakpoint and Monitor on table index or AFTER ADVANCING variable.
Set conditional advanced break point on subscript (i.e. SUB<100).
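The "PIC 99 but 100 entries" cause above is easy to underestimate, so here is a small Python model of it. It assumes that adding to an unsigned PIC 9(n) field without ON SIZE ERROR simply drops the high-order digits that do not fit; that is a simplification of compiler behavior, and the function names are invented for this illustration.

```python
def add_to_pic(value, increment, digits):
    """Model ADD on an unsigned PIC 9(digits) field without ON SIZE ERROR:
    high-order digits that do not fit are dropped (assumed behavior)."""
    return (value + increment) % (10 ** digits)

def load_table(entry_count, sub_digits=2):
    """Return the subscript values seen while loading `entry_count` entries."""
    sub = 0
    seen = []
    for _ in range(entry_count):
        sub = add_to_pic(sub, 1, sub_digits)
        seen.append(sub)
    return seen

subscripts = load_table(100)
print(subscripts[98], subscripts[99])   # subscripts used for the 99th and 100th entries
```

Under this model the 100th entry is addressed with a subscript of zero, which lies outside the table and is the kind of reference that leads to S0C1 or S0C4.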
S0C4: Protection Exception
Reason(s)
The program is attempting to access a memory address that is not within the application's z/OS Address Space
Frequent Coding Causes:
- JCL DD statement is missing or incorrectly coded
- Incorrect logic in table handling code (referencing a table subscript < 1 or > max-table-size),
- Number of table entries has outgrown PIC of variable (i.e. PIC 99, but 100 entries).
- In IMS/TM systems, an MFS LL (length) field value is smaller than the actual input MSG length.
Tools to debug:
Static
- DD statement problem: Open multiple windows on AD Batch Job Diagram and program Environment Division - SELECT ASSIGN
- IMS LL problem: Analyze through multiple Edit Windows (same solution as DD).
- Incorrect linkage problem:
- Open multiple windows on CALLing and CALLed programs - verify linkage declarations.
Dynamic
Incorrect linkage problem:
- Set Breakpoint and Monitor on linkage declarations.
- Set conditional advanced break point on subscript (i.e. IDX < 100).
Incorrect logic.
- In IDz/Debug - set a conditional break point on subscript (i.e. IDX < 100).
S0C7: Data Exception
Reason:
Machine instruction expecting numeric data found invalid data
Instructions:
Arithmetic, IF-THEN-ELSE, MOVE (if receiving field is numeric)
Note: IDz will also S0C7 if sending field is numeric and contains non-numeric (MOVE pic9field TO picXfield)
Frequent Coding Causes:
- Incorrectly initialized, or uninitialized variable
- Missing or incorrect data edit
- 01 to 01 level MOVE if sending field is shorter than receiving field
- Move of Zeros to Group-level numeric fields
- MOVE CORRESPONDING incorrect
- or -
- MOVE field1 to field2 incorrect assignment
Tools to debug:
Static
AD report with options data selector on MOD or ALL
Dynamic
Set Watch points and Monitor on field.
Run through to S0C7.
Locate the field definition, or use CSI report.
Solutions:
Add edit checks for all numeric fields and MOVE statements.
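To see why spaces in a packed field produce a data exception, it helps to look at the bytes. In packed-decimal (COMP-3) format every half-byte must be a digit 0-9, except the last half-byte, which must be a sign code A-F. The Python sketch below applies that rule to raw bytes; the function name is illustrative, and the check is a simplified stand-in for what the hardware does when an arithmetic instruction inspects its operands.

```python
def is_valid_packed(data):
    """Check packed-decimal (COMP-3) bytes: every nibble must be a digit 0-9,
    except the last nibble, which must be a sign code A-F."""
    if not data:
        return False
    nibbles = []
    for byte in data:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    *digits, sign = nibbles
    return all(d <= 9 for d in digits) and sign >= 0x0A

print(is_valid_packed(bytes.fromhex("12345C")))   # packed +12345
print(is_valid_packed(bytes.fromhex("404040")))   # three EBCDIC spaces
```

Three EBCDIC spaces (X'404040') fail the check because the final half-byte is 0, which is not a valid sign.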
S0CB: Divide by Zero
Reason:
CPU attempted to divide a number by 0.
Instructions:
DIVIDE, COMPUTE
Frequent Coding Causes:
- Incorrectly initialized, or un-initialized variable
- Missing or incorrect data edits (i.e. failed to check divisor for zero value)
Tools to debug:
Static
AD report on all DIVIDE and COMPUTE instructions – or using IDz double-click on these verbs and select Filter from the Context Menu
Dynamic
Run through to the S0CB
Locate to field definitions of the offending fields
Solution:
Add edit to check for zero divide:
    IF divisor NOT = ZERO
    THEN
        COMPUTE ...
    ELSE
        PERFORM error-processing-routine
    END-IF
Add ON SIZE ERROR to all arithmetic verbs.
S222/S322: Timeout … Endless Loop
Reason:
Timeout due to program logic caught in "loop" through instruction set with no exit.
Frequent Coding Causes:
- Invalid logic or fall-through logic
- Invalid end-of-file logic
- End-of-file switch overlaid
- Subscript not large enough
- Perform Thru wrong Exit
- PERFORM UNTIL "End-Of-File", but not performing "READ" routine to reach EOF condition
Tools to debug:
Static
Perform Hierarchy on logic in PERFORM chain
Program Control Flow
Dynamic
PD Tools (mainframe) Debug to S222
Analyze counts (color)
Query and Monitor on subscript
Set an Advanced Break Point - Conditional on count
Solution:
From within Debug, use Program Control Flow to identify logic which could cause looping.
Select and click on PERFORM THRU, PERFORM UNTIL, GO TO.
Place break points on potential error lines.
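The "PERFORM UNTIL End-Of-File without performing the READ" cause can be modeled in a few lines. The Python sketch below is an analogy, not COBOL: the loop only reaches end-of-file if the body advances the file, and the `max_iterations` argument plays the part of a conditional breakpoint on an iteration count. All names are made up for the illustration.

```python
def process_file(records, read_inside_loop, max_iterations=1000):
    """Model PERFORM UNTIL end-of-file over `records`.

    When `read_inside_loop` is False the loop body never performs the READ,
    so the end-of-file switch is never set. `max_iterations` acts like a
    conditional breakpoint on an iteration count."""
    position = 0
    end_of_file = len(records) == 0
    iterations = 0
    while not end_of_file:
        iterations += 1
        if iterations > max_iterations:
            raise RuntimeError("loop guard tripped: possible S222/S322")
        if read_inside_loop:
            position += 1                          # the READ advances the file
            end_of_file = position >= len(records)
    return iterations

print(process_file(["rec1", "rec2", "rec3"], read_inside_loop=True))
```

Calling the same function with `read_inside_loop=False` raises the guard error instead of running forever, which is the behavior a count-based breakpoint gives you in the debugger.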
S806: Module Not Found
Reason:
CALL made to program which could not be located along normal search path (STEPLIB top-to-bottom, JOBLIB top-to-bottom, LINKPACK)
Instructions:
Program CALL keyword or JCL EXEC PGM=XXXX
Frequent Coding Causes:
- Module deleted from library, or never compiled to library
- Module name spelled incorrectly
- STEPLIB does not contain load library with module
- I/O error occurred while z/OS searched the directory of the library
Tools to debug:
Static
Build (Link) Map
Do Remote Systems search on module name – in the Load Libraries
Dynamic
Set Program Advanced Break Point (Entry) to set program break before entry to system.
Solution:
Spell name correctly
Check for 0 or 4 return code from Link Edit (Build step)
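The top-to-bottom search described under Reason can be pictured as a lookup over an ordered list of libraries. The Python sketch below is a simplified model of that idea only: the library names and members are hypothetical, and the real z/OS search involves more than a list of load libraries.

```python
def find_module(name, search_path):
    """Return the first library in `search_path` whose member list holds `name`.

    `search_path` is an ordered list of (library_name, members) pairs,
    searched top to bottom."""
    for library, members in search_path:
        if name in members:
            return library
    raise LookupError(f"S806: module {name} not found")

# Hypothetical library names and members, for illustration only.
search_path = [
    ("TEST.LOADLIB", {"PAYROLL1", "DATEUTIL"}),
    ("PROD.LOADLIB", {"PAYROLL1", "TAXCALC"}),
]

print(find_module("PAYROLL1", search_path))
```

Two points follow from the model: a misspelled name fails even though a similar member exists, and when the same member appears in two libraries the one higher in the concatenation wins.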
B37/D37/E37 – Dataset or PDS Index Space Exceeded
ABENDS - B37/D37/E37 (RTS-028)
B37: Disk volume out of space.
D37: Primary space exceeded, no secondary extents defined.
E37: Primary and secondary extents full. In TSO, PDS directory needs compress.
E37-04: Disk volume table of contents (VTOC) is full.
Reason:
MVS could not find space for output WRITE to disk
Instructions:
WRITE
Frequent Coding Causes:
- Not enough space initially allocated to output file(s).
- (more likely) Logic error - program in (infinite) loop writing output file(s) - see S222/S322 reasons.
Tools to debug:
Static – Fault Analyzer will show the DSN of the out-of-space dataset, as will the JES output messages
On the host, the JCL will show the DDNAME and z/OS filespec of the dataset in question
Dynamic
Set an advanced conditional break point to break on a certain number of iterations
See S222/S322 reasons and solutions
Also, set break point on file WRITE statements
41
Database “ABENDS” – Unrecoverable Events from I/O Operations
Typically, database-access routines are coded to test for specific return code values from the DBMS after each I/O operation. The program shuts itself down if an error value occurs or an expected value does not.
DB2:
SQLCODE
A unique integer which describes DB2's reaction to your request.
SQLCA
Variable group which contains fields pertinent to debugging, particularly the SQLWARNs.
▪ IMS (DL/I database), VSAM and QSAM file management systems also pass values back to the application program that describe the outcome of each I/O (insert/update/delete/read) call.
Consult your shop standards for coding best practices to determine how to use these values
Better – create reusable code structures using IDz Snippets & Templates to simplify file access coding and make it consistent
Debugging approach:
Set Line Breakpoint and/or Variable Monitor on SQLCODE and other key feedback areas
- or -
Set Line Breakpoint and Watch Monitor for "On-Change Break"
Double-click on field, Ctrl/F3
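The return-code test described on this slide might look like the sketch below. It is an illustration under stated assumptions: WS-SQLCODE is a stand-in for the SQLCODE field of the SQLCA, the paragraph names are invented, and no real EXEC SQL statement is issued so that the example compiles without a DB2 precompiler. The three-way split (zero, +100, negative) reflects the usual meaning of SQLCODE: success, row not found, and error.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SQLCHECK.
      * Hypothetical sketch. WS-SQLCODE stands in for the SQLCODE
      * field that EXEC SQL INCLUDE SQLCA END-EXEC would supply.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-SQLCODE              PIC S9(9) COMP VALUE +100.
       01  WS-SQLCODE-DISP         PIC -9(9).
       PROCEDURE DIVISION.
       000-MAIN.
      * In a real program this PERFORM follows each EXEC SQL call.
           PERFORM 800-CHECK-SQLCODE
           GOBACK.
       800-CHECK-SQLCODE.
           EVALUATE TRUE
               WHEN WS-SQLCODE = ZERO
                   CONTINUE
               WHEN WS-SQLCODE = +100
                   DISPLAY 'ROW NOT FOUND OR END OF CURSOR'
               WHEN WS-SQLCODE > ZERO
                   MOVE WS-SQLCODE TO WS-SQLCODE-DISP
                   DISPLAY 'SQL WARNING, SQLCODE ' WS-SQLCODE-DISP
               WHEN OTHER
                   MOVE WS-SQLCODE TO WS-SQLCODE-DISP
                   DISPLAY 'SQL ERROR, SQLCODE ' WS-SQLCODE-DISP
                   MOVE 16 TO RETURN-CODE
           END-EVALUATE.
```

Keeping the check in one paragraph gives a single place to set the Line Breakpoint or Variable Monitor that the debugging approach above suggests.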
42
UNIT
Topics:
The IDz Workbench
▪ Analyzing Mainframe Abends
▪ ABEND Codes and Reasons
▪ Fault Analyzer
▪ Appendices
43
Unit objectives
After completing this unit, you should be able to:
Work with ABEND analysis reports created by Fault Analyzer
Browse Report and Mini-Dump pages
Retrieve various Fault Analyzer view information
Browse and search ABEND codes
Use the various productivity features in the Fault Analyzer perspective
Reminder… This course is written for z/OS developers, not Systems Programmers. While Fault Analyzer has deep z/OS analytical tools that can be used by your Systems Programming staff, this material is created for COBOL and PL/I programmers responsible for discovering and analyzing application program ABENDs and their root causes.
44
Shooting Dumps – Trad. ABEND Analysis
Face facts:
“Shooting a dump” (traditional ABEND research) is not a quick or easy task
▪ You need Assembler experience as well as z/OS systems knowledge to understand Address Space, Control Blocks, Registers, Base/Offsets, Hex addressing, etc.
I’d rather use Fault Analyzer – which:
Identifies the line where execution halted
Shows the salient points-of-interest surrounding the ABEND:
▪ Variables and variable values
▪ Statements
▪ Data and buffers
Gives you a head start on the What/Where and How of ABEND analysis Work-Flow
45
What is Fault Analyzer?
▪ Fault Analyzer is a tool that helps you determine the cause of an application ABEND. It determines:
What happened, How it happened
In which program(s) – On which lines – Using which variables
Accessing which Files …or… which Databases
▪ Fault Analyzer provides the necessary information to perform root cause analysis on an application ABEND.
You do not have to interpret low-level system dumps and wade through HEX data & addresses. Information is presented in report format
▪ Fault Analyzer gathers information about an application and the surrounding environment at the time of an abnormal end (ABEND), providing you with the information you need to work through the problem
▪ After analyzing information about your application and its environment, Fault Analyzer generates an analysis report (IDIREPORT) that describes the problem in terms of application/program statements and variables
46
What does Fault Analyzer Provide?
Fault Analyzer answers
the questions:
What happened
Where it happened
How it happened
▪ In which program(s)
▪ On which lines
▪ Using which variables
▪ Accessing which Files
Etc.
47
Fault Analyzer for z/OS – Language and Environment Support
▪ Fault Analyzer supports:
▪ IMS and CICS® online application and system failures - with
debugging facilities for all of the online file systems and databases
– IMS-DL/I, DB2, VSAM…
▪ WebSphere® Application Server for z/OS system failures
▪ WebSphere MQ application failures
▪ Batch application failures that access:
– IMS-DL/I, QSAM/VSAM, DB2…
▪ Language support
▪ COBOL
▪ PL/I
▪ Assembler
▪ C/C++
▪ Language Environment
▪ UNIX System Services
▪ Java
48
Fault Analyzer – Operational Flow – 1 of 3
1. An application ABENDs. The system intercepts the ABEND and calls a Fault Analyzer exit. The exit invokes Fault Analyzer (FA)
2. FA reads Options that control whether it will analyze the ABEND, how to process the ABEND, and which Fault History file to use
▪ Installation options are specified for the system
▪ Options can be overridden for a job step or online region
[Diagram labels: z/OS, Application (batch or online), Abend, FA Invocation Exit, Fault Analyzer, Real-Time Analysis, Options]
WHEN and WHERE the ABEND occurred
49
Fault Analyzer – Operational Flow – 2 of 3
3. Fault Analyzer examines programs and the environment in the application Address Space
4. Files for source mapping are read
▪ It searches for matching SYSDEBUG files, side files, and compiler listings
▪ Multiple libraries can be searched
[Diagram labels: z/OS, Application, Abend, FA Invocation Exit, Fault Analyzer, Real-Time Analysis, Options, SYSDEBUG files, compiler listings, and side files]
Fault Analyzer does the ABEND Preparation for you
Fault Analyzer does much of the ABEND Research for you
50
Fault Analyzer – Operational Flow – 3 of 3
5. A new Fault Entry is written to a Fault History File. The entry contains:
▪ Information about the application
▪ The Analysis Report
▪ A “mini-dump” of the application (this enables reanalysis)
6. The Analysis Report is written to SYSOUT (batch jobs only)
[Diagram labels: z/OS, Application, Abend, FA Invocation Exit, Fault Analyzer, Real-Time Analysis, Options, SYSDEBUG files, Compiler Listings, or Side Files, Fault History File, Fault Entry, SYSOUT Analysis Report]
HOW the ABEND occurred – Analysis of the Fault Event
51
ABEND Resolution: Preparation, Research & Hypothesis
WHAT, WHEN, WHERE and
HOW the ABEND occurred
52
Reviewing ABENDs in the Fault Analyzer Perspective
Besides FA’s main Analysis Report (IDIREPORT) you may also wish to use the Fault Analyzer perspective
To do that:
1. Switch to (Open) the Fault Analyzer perspective in IDz
2. Specify the history file to connect with; it populates a Default ABEND view with failed online and batch job IDIREPORTs and other outputs
– You may be able to use the default file
3. Learn how to navigate the Fault Analyzer perspective, to make use of the information it contains
The next slides contain step details...
53
Fault Analyzer Perspective – 1 of 2
Steps:
Open the Fault Analyzer Perspective ➔
54
Fault Analyzer Perspective – 2 of 2
Enter: FAULTANL.<version>.HIST
Ex: FAULTANL.V14R1.HIST
55
Fault Analyzer Perspective – Overview
Fault History files
Report Outline
List of ABENDS in the current Fault History file
Additional Reports
IDIREPORT
56
Fault Analyzer Perspective – The Outline View
Sections in the IDIREPORT sync with entries in the Outline view. Double-click an entry to open the associated section
57
Fault Analyzer – Report Tabs
Click the tabs to navigate to the report sections
58
Default List of History Files
From the Default tab
Scroll up and down – to find a particular ABEND
Double-click an ABEND history file, to bring up its IDIREPORT and other stats
Sort the list by any of the column headings
▪ Can also work with options of the Context Menu – with each ABEND entry
59
Filtering the list of ABENDs – 1 of 2
Right-click anywhere in the list > Filters > then select a column to filter
Clear Filters using this entry ➔
60
Specify a column-filter value for the list – 2 of 2
Wildcard characters can be used
61
Main Report Example – S0CB
The IDIREPORT presents a formatted, high-level summary of the points of interest necessary to debug ABEND conditions in your application. Specifically, to answer the questions:
• What happened?
What z/OS ABEND condition
• Where did it happen?
What line or statement was executing when it happened
• How did it happen?
What additional information is available for debugging purposes
Program line where the S0CB occurred
Click S0CB for an explanation of this ABEND
62
IDIREPORT Example – S0C4
Here's an example of an IDIREPORT which shows that RPT-REC is "Not addressable" …a euphemism for: "There's something wrong with the FD, JCL DD, or Data Set connection"
63
Fault Analyzer – Main Report Example – S0C7
The IDIREPORT and supporting text vary from ABEND to ABEND depending on:
• Type of ABEND
• Information available at the time of the ABEND
• Run-time platform
Note: The CUST-ACCT-BALANCE value is shown in hex because, even though the field is declared as numeric, invalid numeric data exists at runtime
64
Fault Analyzer – Main Report Example – S0C9
An S0C9 is like an S0CB (divide by zero), except that an S0C9 is a fixed-point divide exception: it occurs when a binary (fixed-point) division has a zero divisor or produces a quotient that is too large for the result, whereas an S0CB is raised by a decimal division
65
Fault Analyzer – Main Report Example – S0C1
The IDIREPORT on an IMS (TM) S0C1 ABEND
66
Fault Analyzer – Main Report Example – S806
IDIREPORT information on a module-not-found (S806) ABEND
Most likely SAM2 is either a typo on the CALL statement, or the program did not successfully compile/link into the Load Module
67
Open Source File to ABEND Instruction
▪ Fault Analyzer can open the program source and position your cursor on the exact COBOL statement that failed.
▪ Steps:
From the IDIREPORT – click the source line #
From the FA Invocation Options Page specify the PDS that contains the source module
68
Lookup View - For MVS, DB2, IMS, MQ and File Return Codes
The Lookup view shows a great deal of background information on:
• ABEND codes
• DB2 SQLCODE
• IMS PCB Feedback
• VSAM File Status
etc.
You can use the view, or double-click on the ABEND code shown in the IDIREPORT
An alternative to the Lookup View: MVS Return Codes for Application Programmers
• http://ibmmainframes.com/references/a29.html
69
What does ASRA stand for?
ASRA means ABEND SYSTEM RECOVERY MESSAGE/REASON A. Various parts of CICS raise the error
▪ The first letter 'A' stands for ABEND.
▪ The second and third letters come from the name of the routine that raised the ABEND. For ASRA the routine is DFHSRP: the 4th and 5th letters of the program name (SR) become the 2nd and 3rd letters of the ABEND code, giving ASR.
▪ The 4th letter identifies which error was raised, since each program may be able to raise more than one error. Hence the last letter may be A, B, C, 1, 2, 0, etc. In the case of ASRA, message/error type A was raised.
▪ ATSB - Abend Temporary Storage. This is raised by DFHTSP.
▪ Terminal Control Abends are of the format ATC_ and are raised by DFHTCP.
▪ Task Control Abends are of the format AKC_ and are raised by DFHKCP. KC in this instance as TC has already been used for Terminal Control.
▪ AICA - Abend Interval Control. This is message A from program DFHICP.
▪ The AEI_ Abend codes are from the Exec Interface and are produced by DFHEIP.
70
Event Details – Event Summary
The event summary shows the call chain
Each line is an event or a program in the call chain
SAM2 was the active program when the ABEND occurred
Hyperlinks to the source file
71
Event Details – Program Detail section
Paragraph trace
The detail report for the 1st program begins here
72
Event Details – Current Statement + Variables
Current statement
Variables referenced by the current statement and their values
73
Event Details – Load Module Details
Link-Edit Date/Timestamp
74
Event Details – Instructions and General Purpose Registers
Current machine instruction
Register values
75
Event Details – Associated Files
Information is displayed about each file that was open at the time of the ABEND
76
Event Details – File Buffers (Blocks)
Associated storage areas display program variables, when source information is available
77
Event Details – Working Storage
Working-Storage Section
Data Values + Declarations
78
Abend Information – General information about the job and modules
Job information
79
Abend Information – Module Summary
Module summary
80
Fault Analyzer – System Wide Information
This section contains console messages that are not identified as belonging to any specific event, or CICS system-related information, such as trace data and 3270 screen buffer contents. It is preceded by the heading: S Y S T E M - W I D E I N F O R M A T I O N
Information about open files that could not be associated with any specific event might also be included here. If there is no information in this section, then it does not appear in the report.
Information on Data Set not associated with the ABEND
81
Fault Analyzer – Miscellaneous
This section contains information about the Fault Analyzer options and files.
82
Fault Analyzer – Mini-Dump Reading 1 of 2
Fault Analyzer also provides for the reading/browsing of System Dump data – in Hex/Character format.
Select an ABEND
Scroll through the dump
Issue navigation commands: Show nnn, +nn, etc.
83
Fault Analyzer Integration – Mini-Dump Reading 2 of 2
You can assign analysis notes to the
dump.
1. Right-click over the storage address
2. Add your note (click OK) ➔
3. Your note becomes
highlighted text inside the
dump
84
Checkpoint
1. What IDz Perspective is used to view Fault Analyzer reports?
2. How does IDz obtain Fault Analyzer Information? Where does the information originate?
3. IDz Fault Analyzer interface has a Lookup View. What is it used for?
4. How can you jump to the program statement where the ABEND occurred with the IDz Fault Analyzer interface?
85
Opening Fault Analyzer from Remote Systems/JES – 1 of 2
You can open a Fault Analyzer report directly from Remote Systems/JES.
• First ensure that the name of your LPAR/Connection is the same as the Systems Information Host name
86
Opening Fault Analyzer from Remote Systems/JES – 2 of 2
• Then – you’ll be able to right-click on an ABENDed job and hyperlink directly to the Fault Analyzer report
87
Summary
Having completed this unit, you should now be able to:
Work with ABEND analysis reports created by Fault Analyzer
Browse Report and Mini-Dump pages
Retrieve various Fault Analyzer view information
Browse and search ABEND codes
Use the various productivity features in the Fault Analyzer perspective
88
UNIT
Topics:
The IDz Workbench
▪ Analyzing Mainframe Abends
▪ ABEND Codes and Reasons
▪ Fault Analyzer
▪ Appendices
▪ Miscellaneous Fault Analyzer slides
▪ COBOL and ABENDs
89
Downloading and Installing the Fault Analyzer Client
Click the Fault Analyzer link
https://developer.ibm.com/mainframe/products/downloads/
90
Fault Analyzer – Operational Process (Terms & Concepts)
▪ Fault Analyzer has the ability to isolate the exact instruction that caused an ABEND:
The analysis engine provides automatic analysis when the application fails.
When an ABEND occurs, Fault Analyzer activates automatically, and then records details in
a fault history file (see screen capture below)
Fault History files contain information about the faults analyzed by Fault Analyzer for z/OS.
Using Fault History files, re-analysis is available when real-time ABEND analysis isn’t
enough (you can extract additional information in batch or interactive mode)
ABEND happens
Fault Analyzer exits are invoked ➔
Salient details (points of interest)
written and stored ➔
91
Workshop – Big Picture
Steps/Stages:
1. Copy several datasets from your instructor's zServerOS TSO ID to your ID
▪ Details on the next slide
2. Modify JCL dataset names (and high-level qualifiers) to match your Sandbox ID
3. Compile a program named HOSPCALC, which contains different types of COBOL ABENDs generated from invalid COBOL logic in different parts of the program
4. Run your program. If it ABENDs, then from the Fault Analyzer IDIREPORT:
▪ Find the error in the COBOL source, and use the IDIREPORT ABEND analysis data to fix the error
▪ After you've solved the problem, save your edits and re-compile HOSPCALC. Then run the program until you either get the next ABEND … or get a zero return code ☺
[Diagram labels: Program, Compile, Link Edit, Load Module, Analyze IDIREPORT, Fix HOSPCALC COBOL error]
92
Fault Analyzer – IDIREPORT
The IDIREPORT provides ABEND analysis information
WHAT, WHEN and WHERE the ABEND occurred
HOW the ABEND occurred
93
Inputs to the Debugging and ABEND-Resolution Process
▪ Along with the System Completion Code, Fault Analyzer provides reports about your program run which describe What/Where/When/How the ABEND occurred.
▪ Valuable information contained in the Fault Analyzer report-files includes:
The System Completion Code (and often a short text description of what it designates)
A short explanation of the cause of the ABEND
The COBOL instruction (statement) or line number, which contained the invalid operation causing z/OS to halt execution
Variables of interest – and code surrounding the instruction that halted execution
A "core-dump" (a hexadecimal printout) of the internal machine storage and registers relevant to the areas of your program surrounding the COBOL instruction which caused z/OS to halt execution.
▪ This information is critical to begin understanding and researching the problem, but it is sometimes insufficient to solve the underlying application problem, which could be any combination of:
Incomplete, incorrect or invalid COBOL procedural logic
A typo such as a misplaced period, or incorrectly specified field
Incorrect or invalid input data
Batch jobs run out of sequence
Input files missing or corrupted (hardware errors)
Errors which relate to JCL problems
etc.
94
Fault Analyzer for z/OS – Overview and Use
▪ Fault Analyzer runs in both test and production with very little
overhead.
▪ Fault Analyzer:
Helps you analyze failures when they occur or reanalyze them after the fact
Expands error messages and codes that apply to your failure with interactive
reanalysis and includes a feature for using application-specific messages and
codes to supplement those supplied by IBM
Creates a fault history file with an interactive display that helps you track and
manage application failures
Starts automatically when an application fails, eliminating the need to recompile
programs or change the job control language (JCL)
Integrates with IBM Developer for z Systems – and enables developers to
diagnose application problems without changing user interface
Note that you do not have to make any changes to existing programs in order to allow Fault Analyzer to produce an analysis of an ABEND. Nor do you have to recompile programs in order to use Fault Analyzer
95
UNIT
ABEND Resolution
• Terms and Concepts
• Types of ABENDs
• Defensive Programming
• Specific ABENDs
• ABEND on purpose
ABENDs and COBOL
"It's almost a given that there is some
amount of invalid data floating around
in the files and data bases."
z/OS Architect, 2020
Appendix 1
96
z/OS ABEND (ABnormal END of Task)
▪ Production business application software errors are costly:
While they are nowhere near as expensive as mistakes on an operating table, they’re more expensive than mixing up the 1% vs. 2% milk in the dairy cabinets … or hitting Reply All when you actually meant to hit Reply
▪ There are roughly a dozen categories of common COBOL errors which produce ABENDS. These include but are not limited to:
Incorrect data typing of field definitions
Incorrect subprogram parameter passing order
Invalid data within files
▪ Values out of range
▪ Specific bad values … missing values
Incorrect record-layout offset definitions
Programmer/Analyst/Developer errors
▪ Misunderstanding of the specs – Typically the biggest & most expensive single issue
▪ Incomplete testing – Second biggest issue
▪ Misunderstanding of the COBOL language - Third biggest issue
An ABEND is a mainframe business application "Blue Screen"
97
ABENDING in Production vs. Test and Development
▪ ABENDs during Development & Test are expected
Not welcome - but expected
▪ ABENDs in Production are expensive - and unacceptable
They negatively impact corporation financials, market reputation, etc.
ABENDs during
Development/Test
Inconvenient and
Expected
Production ABENDs
Unacceptable
Potential loss of business revenue
98
ABEND or Invalid Data - which is worse?
▪ It is widely held that invalid production data is far worse than MVS ABEND situations:
When something ABENDS it ABENDS
▪ Execution stops
▪ z/OS tells you precisely what failed - when & where it failed (the why & how are up to
you to discover)
▪ Backout routines can be called automatically
▪ CHECKPOINT routines can be used to provide point-in-time recovery
When applications "go EOJ":
▪ Results may (or may not) be correct
– Often only business users can verify this
▪ If results are not correct:
– What's wrong - was it the data or the code?
– If it's the code, where in heck do you start?
– Backtrack - or start from the beginning
▪ If this was production, invalid values will negatively impact the corporation - not just you
or your team
Sometimes programs contain their own "self-balancing" defensive-programming:
▪ Record in/Record out counters
▪ Amounts in/Amounts out, as well as "trial balances"
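A record-in/record-out balance check of the kind listed above could be sketched as follows. The counter names and the hard-coded starting values are assumptions made so the example runs on its own; in a real program the counters would be incremented by the READ, WRITE and reject paragraphs.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BALCHECK.
      * Hypothetical example: counters would be incremented by the
      * program's own READ, WRITE and reject paragraphs.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-RECORDS-IN           PIC 9(9) COMP VALUE 1000.
       01  WS-RECORDS-OUT          PIC 9(9) COMP VALUE 990.
       01  WS-RECORDS-REJECTED     PIC 9(9) COMP VALUE 9.
       01  WS-RECORDS-ACCOUNTED    PIC 9(9) COMP VALUE ZERO.
       PROCEDURE DIVISION.
       000-MAIN.
           PERFORM 900-BALANCE-CONTROLS
           GOBACK.
       900-BALANCE-CONTROLS.
      * Every input record must be written or rejected.
           COMPUTE WS-RECORDS-ACCOUNTED =
                   WS-RECORDS-OUT + WS-RECORDS-REJECTED
           IF WS-RECORDS-ACCOUNTED NOT = WS-RECORDS-IN
               DISPLAY 'OUT OF BALANCE - IN: ' WS-RECORDS-IN
                       ' ACCOUNTED FOR: ' WS-RECORDS-ACCOUNTED
               MOVE 16 TO RETURN-CODE
           ELSE
               DISPLAY 'CONTROL TOTALS BALANCE'
           END-IF.
```

With the sample values, 990 written plus 9 rejected does not equal 1000 read, so the out-of-balance branch runs and a non-zero return code flags a job that would otherwise "go EOJ" quietly.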
99
ABENDS and COBOL Coding Errors
Typical COBOL ABEND causes for sequential batch applications:
▪ Alphanumeric Data:
Truncation
Incorrect PIC clause alignment in the record layout
▪ Numeric data:
Reference to numeric field that contains non-numeric data
Decimal place precision and rounding - esp. with internal variables
▪ File Problems:
Read past end of file
Reference to file before OPEN or after CLOSE
Write loop fills up an output file
▪ IF Conditions:
Incorrect specification of True/False logic
References to numeric fields that contain non-numeric data
• Programmatic "fall-thru":
• COBOL statements execute downwards sequentially - irrespective of paragraph boundaries
• Unchecked PERFORM UNTIL (Iteration):
• Infinite Loops
• Index issues:
• Typically "index out of range"
• File Handling:
• Invalid ASSIGN clause
• JCL:
• Incorrect module name
• Invalid DD Name
• Invalid DSN
• DISP= not correct with READ/WRITE
• Application Version Control Issues
100
Avoiding ABENDS
▪ Data:
Truncation: Understand the COBOL MOVE instruction
Incorrect PIC clause alignment in the record layout: Align the actual data file to the record layout
▪ Numeric data:
Reference to numeric field that contains non-numeric data: Liberal use of IF … NOT NUMERIC tests
Decimal place precision and rounding - esp. with internal variables: Understand the underlying accounting - and use ROUNDED
▪ File Problems:
Read past end of file: Debugging, Desk-Checking and Peer Reviews
Reference to file before OPEN or after CLOSE: Ditto
Write loop fills up the output file: Understand the record capacity and file Space Allocation. Debug for Infinite Loop
▪ IF Conditions:
Incorrect specification of True/False logic: Debug with "Jump To" function, Flow Charting, Clear understanding of the COBOL semantics and business spec.
• Program "fall-thru":
• Paragraph Fall Thru: Debug with "Conditional Watch Monitors" and/or code a DISPLAY statement at the top of the paragraph - which names the paragraph.
• IF/Conditional Fall Thru: Ditto
• Iteration:
• Infinite Loops: Check for numeric truncation in loop counters
• File Handling:
• Invalid ASSIGN clause: Vertical split screen, view JCL & Program ENVIRONMENT DIVISION
• JCL:
• Incorrect module name: Typically easy (JCL Error)
• Invalid DD Name: View ENVIRONMENT DIVISION and batch JCL side-by-side
• Invalid DSN: JCL Error
• File not the correct DCB: Debug with "Conditional Watch Monitors"
• DISP= not correct with READ/WRITE: ABEND upon OPEN <file>. In general: OPEN INPUT assumes that the file contains data (DISP=SHR) and OPEN OUTPUT assumes that the file is empty (DISP=NEW). OPEN OUTPUT will over-write the content of a file.
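The suggestion above to code a DISPLAY at the top of each paragraph can be sketched like this. The program and paragraph names are hypothetical. The trace lines show the order in which paragraphs are entered, which makes paragraph fall-through visible in SYSOUT.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FALLTHRU.
      * Hypothetical example: paragraph names are illustrative.
       PROCEDURE DIVISION.
       000-MAIN.
           DISPLAY 'ENTER 000-MAIN'
           PERFORM 100-VALIDATE
      * Without this GOBACK, control falls into 100-VALIDATE again
      * and then on into 200-UPDATE.
           GOBACK.
       100-VALIDATE.
           DISPLAY 'ENTER 100-VALIDATE'.
       200-UPDATE.
           DISPLAY 'ENTER 200-UPDATE'.
```

If the GOBACK were removed, the trace would show 100-VALIDATE entered a second time and 200-UPDATE entered without ever being PERFORMed, which is the fall-through signature to look for.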
101
Common COBOL Business Application ABEND Types
There are more ABEND types and situations than these that you'll see as a COBOL coder, but understanding the nine common ABENDS in this list will get you started
Also - z/OS will mask or return different system ABENDS than those listed below depending on whether the ABENDS occur in a layer of System Software (CICS, Language Environment)
▪ S001 - Record Length/Block Size Discrepancy
▪ S013 - Empty File/Record Length/Block Size Discrepancy
▪ S0C1 - Invalid Instruction
▪ S0C4 - Storage Protection Exception
▪ S0C7 - Data Exception
▪ S0CB - Divide by Zero
▪ S222/S322 - Time out/Job Cancelled - Infinite Loop
▪ S806 - Module Not Found
▪ B37/E37 - Out of space (output file)
102
S001: Record Length/Block Size - Discrepancy
Reason(s)
S001-0: Conflict between record length (program vs. JCL vs. dataset label)
S001-2: Damaged storage media or hardware error
S001-3: Fatal QSAM error
S001-4: Conflict between Block specifications (program vs. JCL)
S001-5: Attempt to read past end-of-file
Instructions: OPEN, CLOSE, READ, WRITE
Frequent Coding Causes:
S001-0: Typos in FD or JCL
S001-2: Corrupt disk or tape dataset
S001-3: Internal z/OS problem
S001-4: Forgot to code BLOCK CONTAINS 0 RECORDS in FD (default Block is 1)
S001-5: Logic error (forgot to close file, or end-of-file-switch not set, overwritten, etc.)
Defensive Programming:
1. Split-Screen COBOL ➔ JCL
2. From JCL: Right-Click on DSN … Open Declaration
3. Select File and verify LRECL from the Properties View
103
S001: Record Length/Block Size - Discrepancy
Defensive Programming:
1.Split-Screen COBOL ➔ JCL ➔ File Properties
2.From JCL: Right-Click on DSN … Open Declaration
3.Select File and verify LRECL from the Properties View
104
S013: Conflicting DCB Parameters
Reason(s)
S013-10: Dummy data set needs buffer space; specify BLKSIZE in JCL
S013-14: DD statement must specify a PDS
S013-18: PDS member not found
S013-1C: I/O error searching PDS directory
S013-20: Block size is not a multiple of the LRECL
S013-34: LRECL is incorrect
S013-50: Tried to open a printer for Input or I/O
S013-60: Block size not equal to LRECL for unblocked file
S013-64: Attempted to Dummy out indexed or relative file
S013-68: Block size > 32K
S013-A4: SYSIN or SYSOUT not QSAM file
S013-A8: Invalid RECFM for SYSIN/SYSOUT
S013-D0: Attempted to define PDS with RECFM FBS or FS
S013-E4: Attempted to concatenate > 16 PDSs
COBOL Instructions: OPEN, CLOSE, READ, WRITE
Frequent Coding Causes:
Most of these ABENDs occur when running under z/OS. Some may not even occur under z/OS, although older modules built for older operating systems (OSVS or VS COBOL II code) that have not been recompiled can still produce them. Most are due to JCL ➔ COBOL FD inconsistencies.
Tools to debug – Static Analysis: S013-18: Same technique as S001
105
S013: Block Size - Discrepancy
Defensive Programming:
1.Delete the BLOCK CONTAINS 0 RECORDS line as shown below
2.Save and COBUCLG
3.Open the IDIREPORT
Note that in COBOL v6 – with certain directives BLOCK issues are solved by the Operating System.
106
S0C1: Invalid Instruction
Reason(s)
- SYSOUT DD statement missing
- The value in an AFTER ADVANCING clause is < 0 or > 99
- An Index or Subscript is out of range
- An I/O verb was issued against an unopened dataset
- Can also happen if the CALL/ENTRY subroutine LINKAGE does not match the calling program's record definition
- File READ attempted before File OPEN
Instructions:
OPEN, CLOSE, READ, WRITE, Table handling routines
Note also that during Debug SYSOUT-DISPLAYs are written to the "console"
Frequent Coding Causes:
- Incorrect logic in setting AFTER ADVANCING variable (or failure to understand 0-99 limits)
- Incorrect logic in table handling code, or number of table entries has overflowed the PIC of variable e.g. PIC 99 (two digits, max) - but there are 100 entries in the table
Tools to debug:
Static
SYSOUT problem: Open multiple windows on AD Batch Job Diagram and program Environment Division - SELECT ASSIGN.
Logic problem: Select File. Use Occurrences in Compilation to isolate statements
Dynamic:
Set Watch Breakpoint and Monitor on table index or AFTER ADVANCING variable.
Set conditional advanced break point on subscript (i.e. SUB<100).
107
S0C4: Protection Exception
Reason(s):
The program is attempting to access a memory address that is not within the application's z/OS "Address Space"
Frequent Coding Causes:
- JCL DD statement is missing or incorrectly coded:
File Status: 47 upon READ Instruction
- Incorrect logic in table handling code (referencing a table subscript < 1 or > max-table-size)
- INITIALIZE used against a Buffer (file FD) that hasn't been opened.
- Number of table entries has outgrown PIC of variable (e.g. PIC 99, but 100 entries).
Tools to debug:
Static
- DD statement problem: Open multiple windows on AD Batch Job Diagram and program Environment Division - SELECT ASSIGN
- Incorrect linkage problem:
- Open multiple windows on CALLing and CALLed programs - verify linkage declarations.
Dynamic
The problem with S0C4 ABENDS, is that once they happen - there's nothing left to capture and assist with Debugging.
An "Address Space" is a block of virtual memory your Load
Module is assigned and runs in, when executing on z/OS. If your
program attempts to reference memory beyond the Address Space
assigned, z/OS ABENDS your program with an S0C4
Important Note: Compile parameters influence what statements will and won't S0C4
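The table-handling causes above (a subscript below 1 or above the table size, or a subscript field too small for the number of entries) can be guarded against with a range check before each table reference. The sketch below uses invented names and a deliberately bad subscript value of 101 against a 100-entry table.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TBLGUARD.
      * Hypothetical example: table and field names are illustrative.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-MAX-ENTRIES          PIC 9(4) COMP VALUE 100.
       01  WS-RATE-TABLE.
           05  WS-RATE OCCURS 100 TIMES
                                   PIC S9(3)V99 COMP-3.
      * PIC 9(4) can hold 100; a PIC 99 subscript cannot.
       01  WS-SUB                  PIC 9(4) COMP VALUE ZERO.
       01  WS-REQUESTED-ENTRY      PIC 9(4) COMP VALUE 101.
       PROCEDURE DIVISION.
       000-MAIN.
           INITIALIZE WS-RATE-TABLE
           MOVE WS-REQUESTED-ENTRY TO WS-SUB
      * Range-check the subscript before every table reference.
           IF WS-SUB >= 1 AND WS-SUB <= WS-MAX-ENTRIES
               DISPLAY 'RATE: ' WS-RATE (WS-SUB)
           ELSE
               DISPLAY 'SUBSCRIPT OUT OF RANGE: ' WS-SUB
               MOVE 12 TO RETURN-CODE
           END-IF
           GOBACK.
```

An explicit check like this is one option; the Enterprise COBOL SSRANGE compiler option is another way to have subscript ranges checked at run time, which fits the note above that compile parameters influence what will and won't S0C4.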
108
S0C7: Data Exception
Reason:
Machine instruction expecting numeric data found invalid data
Instructions:
Arithmetic, IF, MOVE (if receiving field is numeric) and PERFORM VARYING statements
Your application can S0C7 if the sending field is numeric and contains non-numeric data (MOVE pic9field TO picXfield).
Frequent Coding Causes:
- Incorrectly initialized, or uninitialized variable
- Missing or incorrect data edit
- 01 to 01 level MOVE if sending field is shorter than receiving field
- Move of Zeros to Group-level numeric fields
- MOVE CORRESPONDING incorrect
- Incorrect MOVE field1 TO field2 assignment statements.
Tools to debug:
Static
Occurrences in Compilation Unit on numeric fields
Isolate all PIC 9 Fields
Dynamic
Set Watch points and Monitor on field.
Record the Debug session - Run through to S0C7 and Playback from the ABEND
Locate the field definition - and use client data analysis tools
Solutions:
Add edit checks for valid data in all numeric fields
Define all numeric data that does not participate in arithmetic as PIC X
Important Note: Compile parameters influence what statements will and won't S0C7
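The first solution above, an edit check on numeric fields, might be coded with the COBOL NUMERIC class test as in this sketch. The record layout and field names are assumptions, and the sample balance deliberately contains a letter so that the reject branch is taken.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NUMEDIT.
      * Hypothetical example: the record layout is illustrative.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-INPUT-REC.
           05  IN-ACCT-ID          PIC X(5)    VALUE 'A1001'.
           05  IN-BALANCE-X        PIC X(7)    VALUE '0012A45'.
           05  IN-BALANCE REDEFINES IN-BALANCE-X
                                   PIC 9(5)V99.
       01  WS-TOTAL-BALANCE        PIC S9(9)V99 COMP-3 VALUE ZERO.
       01  WS-REJECT-COUNT         PIC 9(5)    VALUE ZERO.
       PROCEDURE DIVISION.
       000-MAIN.
      * Test the field before any arithmetic touches it.
           IF IN-BALANCE IS NUMERIC
               ADD IN-BALANCE TO WS-TOTAL-BALANCE
           ELSE
               ADD 1 TO WS-REJECT-COUNT
               DISPLAY 'NON-NUMERIC BALANCE FOR ' IN-ACCT-ID
                       ': ' IN-BALANCE-X
           END-IF
           DISPLAY 'REJECTED RECORDS: ' WS-REJECT-COUNT
           GOBACK.
```

Because the ADD only runs when the class test passes, the bad record is counted and reported instead of reaching an arithmetic instruction that could raise the data exception.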
109
S0CB: Divide by Zero
Reason:
CPU attempted to divide a number by 0.
Instructions:
DIVIDE, COMPUTE with / operation
Frequent Coding Causes:
- Incorrectly initialized, or un-initialized variable
- Missing or incorrect data edits (i.e. failed to check divisor for zero value)
Tools to debug:
Static
Search for all DIVIDE and COMPUTE
instructions – or using IDz double-click on these
verbs and select Filter from the Context Menu
Dynamic
Run through to the S0CB
Locate the field definitions of the offending fields
Solution:
Add edit to check for zero divide:
IF divisor > ZERO
THEN
COMPUTE ...
ELSE
PERFORM error-processing routine
Add ON SIZE ERROR to all arithmetic verbs.
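Both parts of the solution above (an edit on the divisor and ON SIZE ERROR on the arithmetic verb) are combined in the sketch below. The field names and values are hypothetical; the divisor is left at zero so the error branch runs.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DIVGUARD.
      * Hypothetical example: all data names are illustrative.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-TOTAL-CHARGES    PIC S9(7)V99 COMP-3 VALUE 1500.00.
       01  WS-PATIENT-COUNT    PIC S9(4)    COMP-3 VALUE ZERO.
       01  WS-AVG-CHARGE       PIC S9(7)V99 COMP-3 VALUE ZERO.
       PROCEDURE DIVISION.
       000-MAIN.
      * Edit the divisor before dividing.
           IF WS-PATIENT-COUNT > ZERO
               COMPUTE WS-AVG-CHARGE ROUNDED =
                       WS-TOTAL-CHARGES / WS-PATIENT-COUNT
                   ON SIZE ERROR
                       PERFORM 900-ARITHMETIC-ERROR
               END-COMPUTE
           ELSE
               PERFORM 900-ARITHMETIC-ERROR
           END-IF
           DISPLAY 'AVERAGE CHARGE: ' WS-AVG-CHARGE
           GOBACK.
       900-ARITHMETIC-ERROR.
           DISPLAY 'INVALID DIVISOR OR RESULT TOO LARGE'
           MOVE ZERO TO WS-AVG-CHARGE.
```

The explicit IF handles the zero divisor, while ON SIZE ERROR covers a result that does not fit the receiving field, so the two checks are complementary in this sketch.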
110
S222/S322: Timeout … Endless Loop
Reason:
Timeout due to program logic caught in "loop" through instruction set with no exit.
S322 = Timeout
S222 = Job Cancelled
Frequent Coding Causes:
- Invalid logic or fall-through logic
- Invalid end-of-file logic
- End-of-file switch overlaid
- Subscript not large enough
- Perform Thru wrong Exit
- PERFORM UNTIL "End-Of-File", but not performing "READ" routine to reach EOF condition
Tools to debug:
Static
Perform Hierarchy/Program Control Flow on logic in PERFORM chain
Desk-Checking for other loop possibilities
Dynamic tools.
Debug to Loop
Query and Monitor on subscript
Set an Advanced Break Point - Conditional on count
Solution:
For S322 - you may need to increase the TIME=(,n) value in the JCL Job Card
For S222 - you will need to read the code carefully to find one of the Frequent Coding Causes
Note: You will need to Cancel the job to stop the Endless Loop
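One of the coding causes listed above, a PERFORM UNTIL end-of-file that never performs the READ routine, can be avoided with a priming read plus a read at the bottom of the loop. The sketch below assumes a sequential input file assigned to a hypothetical name INFILE with 80-byte records; the program name READLOOP and all data names are illustrative.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. READLOOP.
      * Hypothetical sketch: read loop that always reaches EOF.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IN-FILE ASSIGN TO INFILE
               FILE STATUS IS WS-IN-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC               PIC X(80).
       WORKING-STORAGE SECTION.
       01  WS-IN-STATUS         PIC XX    VALUE SPACES.
       01  WS-EOF-SW            PIC X     VALUE 'N'.
           88  END-OF-FILE                VALUE 'Y'.
       01  WS-REC-COUNT         PIC 9(7)  VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT IN-FILE
           IF WS-IN-STATUS NOT = '00'
               DISPLAY 'OPEN FAILED, STATUS ' WS-IN-STATUS
               GOBACK
           END-IF
      * Priming read, then one read per iteration at the bottom.
           PERFORM READ-PARA
           PERFORM UNTIL END-OF-FILE
               ADD 1 TO WS-REC-COUNT
               PERFORM READ-PARA
           END-PERFORM
           CLOSE IN-FILE
           DISPLAY 'RECORDS READ: ' WS-REC-COUNT
           GOBACK.
       READ-PARA.
           READ IN-FILE
               AT END SET END-OF-FILE TO TRUE
           END-READ
      * Any status other than success or end-of-file also ends
      * the loop, so an I/O error cannot cause endless iteration.
           IF WS-IN-STATUS NOT = '00' AND NOT = '10'
               DISPLAY 'READ ERROR, STATUS ' WS-IN-STATUS
               SET END-OF-FILE TO TRUE
           END-IF.
```

Keeping the end-of-file switch as an 88-level set in exactly one paragraph also makes it easier to spot in the debugger if the switch is being overlaid elsewhere.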
S806: Module Not Found
Reason: CALL made to a program that could not be located along the normal search path, which is:
//STEPLIB
//JOBLIB
LINKPACK
Instructions: Fix the program CALL keyword or the JCL EXEC PGM=XXXX
Frequent Coding Causes:
- Module deleted from library, or never compiled to library
- Module name spelled incorrectly
- STEPLIB does not contain load library with module
- I/O error occurred while z/OS searched the directory of the library
Tools to debug:
Static
Build (Link) Map
Do a Remote Systems search on the module name in the Load Libraries
Dynamic
Set a Program Advanced Break Point (Entry) to break before entry to system.
Solution:
Spell name correctly
Check return code from Link Edit (Build step)
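For dynamic calls, one defensive option (an addition to the solutions above, not something the slide prescribes) is the ON EXCEPTION phrase of the CALL statement, which receives control when the called module cannot be made available. In the sketch below, SUBPGM1 is a hypothetical module name, and the program name CALLCHK and the return code value 16 are illustrative choices.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CALLCHK.
      * Hypothetical sketch: trap a missing module on a dynamic CALL.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * SUBPGM1 is a hypothetical module name.
       01  WS-PGM-NAME          PIC X(8)  VALUE 'SUBPGM1'.
       01  WS-PARM              PIC X(10) VALUE SPACES.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * CALL identifier is a dynamic call, resolved at run time.
           CALL WS-PGM-NAME USING WS-PARM
               ON EXCEPTION
                   DISPLAY 'MODULE NOT FOUND: ' WS-PGM-NAME
                   MOVE 16 TO RETURN-CODE
               NOT ON EXCEPTION
                   DISPLAY 'CALL COMPLETED: ' WS-PGM-NAME
           END-CALL
           GOBACK.
```

This only helps with calls made from COBOL code; a misspelled name on the JCL EXEC PGM= statement still has to be corrected in the JCL.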
B37/D37/E37: Dataset or PDS Index Space Exceeded
ABENDS - B37/D37/E37 (RTS-028)
B37: Disk volume out of space.
D37: Primary space exceeded, no secondary extents defined.
E37: Primary and secondary extents full. In TSO, PDS directory needs compress.
E37-04: Disk volume table of contents (VTOC) is full.
Reason: MVS could not find space for output file WRITEs to disk
COBOL Instructions: WRITE
Frequent Coding Causes:
- Not enough space initially allocated to output file(s).
- (more likely) Logic error - program in (infinite) loop writing output file(s) - see S222/S322 reasons.
Tools to debug:
Static – Fault Analyzer will show the DSN of the out-of-space dataset, as will the JES output messages
On the host, the JCL will show the DDNAME and z/OS filespec of the dataset in question
Dynamic
Set an advanced conditional break point to break on a certain number of iterations
See S222/S322 reasons and solutions
Also, set break point on file WRITE statements