Using IPCS to look at z/OS dumps

I’ve been trying to track down a deep problem on z/OS, and I’ve been getting plenty of dumps.

I thought I’d give a short list of commands for people new to IPCS. It is not detailed, but it should at least give you signposts as to what you can do.

First get your dump.

On my system dumps are in data sets like SYS1.S0W1.Z31B.DMP00001.

You can issue the operator command D D,E to give information about the dumps on your system. It provides information on where the problem occurred, and registers etc. This may be enough to help you solve your problem.

You can also print the EREP report. Again, this gives a summary of the problem, and may be enough to solve it. See Logrec and example output.

Using ISPF editor on the output

On many of the displays you can use the command

REPORT VIEW

to capture what is currently displayed, and display it in an ISPF view session, so you can use ISPF edit commands on it.

If you display the first 24 lines of output, and use REPORT VIEW, you will get ISPF edit with the 24 lines of output. I tend to display the report, scroll down to the bottom of the report (so it is all displayed), and then use REPORT VIEW on the complete report.

Into IPCS

From the main ISPF IPCS panel, select option 0 defaults.

Tab down to

Source  ==> DSNAME('SYS1.S0W1.Z31B.DMP00001')

If you want IPCS to remember this next time you come in, change

Scope   ==> LOCAL   (LOCAL, GLOBAL, or BOTH)

Change local to both, and press enter.

Type =6

to issue an IPCS command.

If this is the first time you have used this dump, type

DROPD

This says drop all information you know about the current dump. This is because last week you may have had a different dump called SYS1.S0W1.Z31B.DMP00001. Any information stored about that dump does not apply to the current dump.

PF3 back to the command window.

status

This gives you a summary of the contents of the dump.

Dump Title: COMPON=BPX,COMPID=SCPX1,ISSUER=BPXMIPCE,MODULE=BPXVOTHD+00000244, ABEND=S00C4,REASON=00000004       

Use

find  'DIAGNOSTIC DATA REPORT'

This gives you information on

  • The symptom string. If you search on the internet with this string you may find other people have hit the same problem
  • Time of Error Information
    • The PSW pointing to (past) the failing instruction
    • Failing instruction text: 12 bytes of data around the failure
    • Translation exception address: If there was a problem with accessing the data
    • The registers and access registers
    • The Home, Primary and Secondary address space IDs. (For cases where your program jumps to a different address space)

Commands

Find which module an address is in: IP WHERE 20E1F344

For the given address, IPCS will display information about which module it is in.

ASID(X'0010') 20E1F344. BPXINPVT+618344 IN EXTENDED PRIVATE       
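The offset in the WHERE output can be turned back into the address where the module starts. Here is a minimal Python sketch of that arithmetic, using the values shown above. The function name module_base is made up for illustration, and the sketch assumes the offset after the + sign is hexadecimal.

```python
def module_base(address_hex: str, offset_hex: str) -> str:
    """Return the start address of a module, given an address inside it
    and the offset reported by WHERE (both as hex strings)."""
    base = int(address_hex, 16) - int(offset_hex, 16)
    return format(base, "08X")


# Values taken from the WHERE output above: 20E1F344 is BPXINPVT+618344
print(module_base("20E1F344", "618344"))  # prints 20807000
```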

Find which module an address is in, for a different ASID: ip where 20E1F344 asid(x'59')

If your program is using multiple address spaces, you can specify the address space id.

Display storage at a 64-bit address: ip l 050_814F7F30

This displays the storage at the address in standard dump format.

Display storage key of the page: ip l 050_814F7F30 display(machine)

As well as the storage, this displays

 ASID(X'0010') ADDRESS(50_814F7F30.) KEY(00) ABSOLUTE(02_03C9AF30.)     

so you can see the key of the page.

Display the offset of the storage within the display: ip l 050_814F7F30 str

The str says display it as a structure – it gives the hex offsets of each line at the start of the line

+002B0 _14F81E0. 00000000 00000000 ...
+002D0 _14F8200. 00000000 00000000 ...

Display the instructions from the page: ip l 20E1F344 I

LIST 20E1F344. ASID(X'0010') LENGTH(X'1000') INSTRUCTION       
ASID(X'0010') ADDRESS(20E1F344.) KEY(00) ABSOLUTE(F9ECA344.)

20E1F354 | D28F 700C 2008 | MVC X'C'(X'90',R7),X'8'(R2)
20E1F35A | 9140 208C | TM X'8C'(R2),X'40'
20E1F35E | A774 0065 | BRC X'7',*+X'CA'
20E1F362 | 9180 208C | TM X'8C'(R2),X'80'
20E1F366 | A7E4 0008 | BRC X'E',*+X'10'
20E1F36A | A798 0004 | LHI R9,X'4'
20E1F36E | 5090 D2D0 | ST R9,X'2D0'(,R13)
20E1F372 | A7F4 0006 | BRC X'F',*+X'C'
20E1F376 | A728 0003 | LHI R2,X'3'
20E1F37A | 5020 D2D0 | ST R2,X'2D0'(,R13)
20E1F37E | 9602 D2D4 | OI X'2D4'(R13),X'02'
20E1F382 | 5898 0000 | L R9,X'0'(R8)
20E1F386 | 41E0 D2D0 | LA R14,X'2D0'(,R13)

Display storage for a given length: ip l 20E1F344 length(20)

I use

IP SETDEF LENGTH(4096) 

to display 4KB of data each time.

Set defaults: IP SETDEF

The IP SETDEF command (or ISPF option 0) gives

....
/*---------------- Local Default Values for IPCS Subcommands ---------------*/
SETDEF LOCAL NOPRINT TERMINAL NOPDS /* Routing of displays */
SETDEF LOCAL FLAG(WARNING) /* Optional diagnostic messages */
SETDEF LOCAL NOCONFIRM /* Double-checking major acts */
SETDEF LOCAL NOTEST /* IPCS application testing */
SETDEF LOCAL DSNAME('SYS1.S0W1.Z31B.DMP00001')
SETDEF LOCAL LENGTH(4096) /* Default data length */
SETDEF LOCAL VERIFY /* Optional dumping of data */
SETDEF LOCAL DISPLAY( MACHINE) /* Include storage keys, .... */
SETDEF LOCAL DISPLAY( REMARK) /* Include remark text */
SETDEF LOCAL DISPLAY( REQUEST) /* Include model LIST subcommand */
SETDEF LOCAL DISPLAY(NOSTORAGE) /* Include contents of storage */
SETDEF LOCAL DISPLAY( SYMBOL) /* Include associated symbol */
SETDEF LOCAL DISPLAY( ALIGN) /* Align output to byte */
SETDEF LOCAL ASID(X'0010') /* Default address space */

If you are using multiple address spaces, you can set the default address space so you do not have to specify ASID(x'..') every time.

Browsing interesting storage

You can use option 1 from the main IPCS panel, to browse data, and keep track of your interesting data.

You enter the address, (and address space if required) and can browse the data.

This pointer data is stored in IPCS. If you give it a remark, it will persist across sessions, and so help you later.

Once you have defined at least one pointer, you can get a list of all of the pointers you have defined. Use the S prefix command to select it.

System trace

This is where it gets harder.

The command

SYSTRACE

displays information from the system trace. Some of the key fields are

0002-0010 008DC518  SVC      2 00000000_20BC9420 
07041001 80000000
0002-0010 008DC518 SVCR 2 00000000_20BC9420
07041001 80000000
  • 0002 is the CPU number (which I very rarely use)
  • 0010 is the asid
  • 008DC518 is the TCB address
  • SVC/SVCR 2 is the supervisor call/supervisor call return
  • 00000000_20BC9420 is the PSW where the SVC occurred.

On the right of the display is a string like

E29F45B336755380 

This is a store clock (STCK) value. If you use the option SYSTRACE TIME(GMT) it is displayed as

06:39:12.803669218 
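The conversion can be sketched in Python. This is only an illustration, and rests on some assumptions. It uses the standard TOD clock layout (bit 51 ticks once per microsecond, and the epoch is 1900-01-01 00:00 UTC), and it ignores leap seconds. The function name stck_to_utc is made up.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the TOD clock counts from 1900-01-01 00:00 UTC
STCK_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def stck_to_utc(stck_hex: str) -> datetime:
    """Convert a 16 hex digit store clock value to a UTC datetime.

    Bit 51 is one microsecond, so dropping the low 12 bits
    leaves a count of microseconds since the epoch.
    """
    micros = int(stck_hex, 16) >> 12
    return STCK_EPOCH + timedelta(microseconds=micros)


stamp = stck_to_utc("E29F45B336755380")
print(stamp.strftime("%H:%M:%S.%f"))  # prints 06:39:12.803669
```

The low 12 bits that the sketch discards hold the fraction of a microsecond, which is where the trailing 218 nanoseconds in the IPCS display come from.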

What is interesting?

I use REPORT VIEW to get an ISPF edit session of the trace, and then use

  • X ALL
  • F '*' 20 20 ALL

This gives information such as I/Os completing, but also

  • *RCVY recovery code was entered
  • *SVC D a dump request was made
  • *PGM… an exception occurred – such as invalid storage access.
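If you have saved the trace report to a file, the same filtering can be done outside ISPF. Below is a minimal Python sketch, assuming the flag is in column 20 as in the FIND command above. The function name flagged_lines is made up, and the sample lines are padded placeholders rather than real trace output.

```python
def flagged_lines(lines, column=20):
    """Yield (line_number, line) for each line with '*' in the given 1-based column."""
    for number, line in enumerate(lines, start=1):
        if len(line) >= column and line[column - 1] == "*":
            yield number, line


# Placeholder lines, padded so that the flag lands in column 20
sample = [
    "." * 19 + " SVC",
    "." * 19 + "*RCVY",
    "." * 19 + " PGM",
    "." * 19 + "*SVC  D",
]
for number, line in flagged_lines(sample):
    print(number, line[19:])
```

This prints only the second and fourth lines, the ones with the asterisk.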

I then label each line of interest, for example by typing .aa in the line command field on the left-hand side. After that, I can use

  • reset
  • loc .aa

to locate the line of interest. Note that lines of interest will normally be at the bottom of the output, just before the dump is taken.

The system carries on processing while the dump is being taken, so you will continue to get records in the system trace after the point of failure/request for a dump.

You can use standard ISPF edit commands, such as sort, exclude, delete all X, delete all NX; and use ISPF edit macros (written in Rexx) for special processing.

Note: You cannot issue IPCS commands from an ISPF edit session to display information from the dump. You have to come out of ISPF edit first.

Note: Some PGM exceptions are valid – such as page not in storage – and z/OS will read it into memory. These do not have the *; they are just PGM, and are usually not interesting.

Program calls (PC) and SVCs

A program call has output like

 PC     ...   0            00_01A5184A     0030A           LocAscb 
PR ... 0 00_01A5184A 0142923C
  • For every PC there is a matching return (PR) with the same address (00_01A5184A)
  • The PC number is 0030A. Because this is a system-defined value, IPCS knows it is for the LOCASCB request.

Other subsystems, such as CICS, MQ and DB2, also use PC routines, but their PC numbers can be different between IPLs, and you can have more than one CICS, MQ or DB2 subsystem on an LPAR.

Other requests in the SYSTRACE

  • DSP this TCB was dispatched
  • SSRV 11E this TCB was paused (suspended because there was no work for it to do)
  • EXT CLKC the system timer interrupted this thread

What is a PC doing?

In my trace I have

 PC     ...   0            00_20DDF46C     0C101 

What is 0c101 doing?

Issue the commands

  • SUMMARY FORMAT
    • This formats most of the significant control blocks.
  • Go to the bottom and use REPORT VIEW
  • FIND '0c101' 1 10

This gives output like

                              **PC INFORMATION** 

         AUTH
PC       KEY  EXEC ENTRY    EXEC  LATENT
NUMBER   MASK ASID ADDRESS  STATE PARMS
-------- ---- ---- -------  ----- ------------------ ...
0000C100 8000 0059 20811470 S     00000000 00000000 ...

0000C101 8000 0059 20811D68 S     00000000 00000000 ...

PC number C101 goes to the address space with ASID 59, at address 20811D68.

You can then use the commands

ip   where  20811D68  asid(x'59')
ip l 20811D68 asid(x'59') i

to display which load module the code is in, and the instructions which will be executed.
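The lookup above can also be sketched in Python, for the case where you have saved the SUMMARY FORMAT report and want to generate the two commands. It is only a sketch. The function name find_pc_entry is made up, and it assumes each row of the PC table is blank-separated, in the order PC number, key mask, ASID, entry address, as in the output shown above.

```python
def find_pc_entry(report_lines, pc_number):
    """Return (asid, entry_address) for a PC number, or None if it is not found."""
    wanted = pc_number.upper().zfill(8)  # the table shows 8 hex digits
    for line in report_lines:
        fields = line.split()
        if len(fields) >= 4 and fields[0].upper() == wanted:
            return fields[2], fields[3]
    return None


# The two rows from the PC information output above
rows = [
    "0000C100 8000 0059 20811470 S 00000000 00000000 ...",
    "0000C101 8000 0059 20811D68 S 00000000 00000000 ...",
]
entry = find_pc_entry(rows, "0c101")
if entry is not None:
    asid, address = entry
    short_asid = asid.lstrip("0") or "0"
    print(f"ip where {address} asid(x'{short_asid}')")
    print(f"ip l {address} asid(x'{short_asid}') i")
```

For PC number 0c101 this prints the same two commands as shown above.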