
DataGuard Support Issues 10072007 (1)


    Oracle DataGuard Support Issues

    Brian Hitchcock

    OCP 10g DBA

    Sun

    www.brianhitchcock.net

    October 23, 2007

    DataGuard

    Must be SYS to make changes

      sqlplus / as sysdba

    Changes to DataGuard standby database

      Some can't be made while apply process is running

      Change Guard status

    Support Issues

      Create physical standby

      Convert to logical standby

      After logical standby is running

      Refresh process


    DataGuard Errors

    DataGuard reports a lot of errors

    Standby database alert log

    Many are for normal operation

    Why reported as errors?

    Monitoring of db alert log

    Will report these errors

    Hard to filter out normal errors
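    One way to approach the filtering problem is to keep an explicit list of ORA codes that are expected during normal DataGuard operation and have the monitoring script set those lines aside. The sketch below is only an illustration in Python and is not part of the original presentation. The names `EXPECTED_CODES` and `split_alert_lines` are hypothetical, and the two codes listed are simply the ones that appear during normal operation in the alert log excerpts in this deck (ORA-16037 when managed recovery is cancelled on request, ORA-16111 while SQL apply is starting). Which codes are safe to set aside is a site-specific decision.

```python
import re

# Assumption: codes treated as "expected". These two are taken from the
# alert log excerpts in this presentation; adjust for your environment.
EXPECTED_CODES = {"ORA-16037", "ORA-16111"}

ORA_PATTERN = re.compile(r"ORA-\d{5}")


def split_alert_lines(lines, expected=EXPECTED_CODES):
    """Return (expected_hits, needs_review) for lines containing ORA codes."""
    expected_hits, needs_review = [], []
    for line in lines:
        codes = set(ORA_PATTERN.findall(line))
        if not codes:
            continue  # no ORA code on this line
        if codes <= expected:
            expected_hits.append(line)   # every code on the line is expected
        else:
            needs_review.append(line)    # at least one unexpected code
    return expected_hits, needs_review


sample = [
    "ORA-16037: user requested cancel of managed recovery operation",
    "LOGSTDBY status: ORA-16111: log mining and apply setting up",
    "ORA-01034: ORACLE not available",
    "Completed: ALTER DATABASE MOUNT",
]
normal, review = split_alert_lines(sample)
print(len(normal), "expected;", len(review), "to review")
```

    Keeping the list explicit means a new, unknown ORA code is always reported until someone decides it is normal.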


    Create Physical Standby

    On Primary database

    Enable Forced Logging

    Create password file

    Setup init.ora/spfile parameters

    Can't connect to standby

    SYS password

    Verify archiving enabled

    Backup db (hot or cold)

    Create standby control file


    Create Physical Standby

    On Standby database

    Copy db backup files from primary

    Copy standby control file from primary

    Setup init.ora/spfile parameters

    Start physical standby db

    Trace file

    Verify physical standby working

    May not see redo logs, register them

    Redo logs not deleted, use RMAN


    Convert to Logical Standby

    On Primary database

      Build LogMiner dictionary

    On Standby database

      Stop redo apply

        Errors, no impact

      Convert database to logical standby

        Two trace files

      Restart db

        Open resetlogs

      Verify logical standby working


    Logical Standby is Running

    Business requirements

      Standby frozen most of the day

      Standby catches up once per day

        Alert log messages while catching up

        Disk space for archived redo logs

    Other issues

      Apply process is slow

        How to detect, resolve

      Primary versus Standby backups

        Impact, resolution


    Logical Standby is Running

    Other issues

      Constraint violations

        Errors, resolution

      No data found

        Errors, resolution

      ORA-16211

        Errors, Oracle Support

      Primary db XDB schema issues

        Fixed on primary, errors on standby


    Logical Standby is Running

    Other issues

      ORA-07445

        Refresh cures all

    Refresh process

      After refresh

        ORA-16211: unsupported record found in the archived redo log

        Compile invalid objects

        Import into standby database


    Primary Can't Connect

    Standby not available

    Reported on primary production database

    ORACLE not available

    Looks like production primary is down

    Your monitoring may need to be adjusted

    Thu Oct 18 16:59:20 2007

    Error 1034 received logging on to the standby

    Thu Oct 18 16:59:20 2007

    Errors in file /shared/orahome01/admin/BRHPROD/bdump/brhprod_arc1_2635.trc:

    ORA-01034: ORACLE not available

    PING[ARC1]: Heartbeat failed to connect to standby 'BRHPRSB'. Error is 1034.
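    The monitoring adjustment could look something like the following. This is a rough Python sketch under stated assumptions, not the author's monitoring code: it assumes the monitor reads the primary's alert log as a list of text lines, and it labels an ORA-01034 as standby-related when one of the standby messages from the excerpt above appears within a few lines of it. The function name `classify_ora_1034` and the `window` size are hypothetical choices.

```python
# Marker strings copied from the alert log excerpt in this slide.
STANDBY_MARKERS = (
    "received logging on to the standby",
    "Heartbeat failed to connect to standby",
)


def classify_ora_1034(lines, window=4):
    """Label each ORA-01034 line as 'standby' or 'local'.

    A hit is 'standby' when a standby marker appears within `window`
    lines before or after it. The window size is an assumption.
    """
    results = []
    for i, line in enumerate(lines):
        if "ORA-01034" not in line:
            continue
        nearby = lines[max(0, i - window): i + window + 1]
        is_standby = any(m in n for n in nearby for m in STANDBY_MARKERS)
        results.append((i, "standby" if is_standby else "local"))
    return results


alert_log = [
    "Thu Oct 18 16:59:20 2007",
    "Error 1034 received logging on to the standby",
    "Thu Oct 18 16:59:20 2007",
    "Errors in file /shared/orahome01/admin/BRHPROD/bdump/brhprod_arc1_2635.trc:",
    "ORA-01034: ORACLE not available",
    "PING[ARC1]: Heartbeat failed to connect to standby 'BRHPRSB'. Error is 1034.",
]
print(classify_ora_1034(alert_log))  # [(4, 'standby')]
```

    A hit labelled 'standby' can then be routed as a standby outage rather than paging for a production primary that is actually up.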


    SYS Password Issue

    Mon Oct 8 15:31:36 2007

    Error 1017 received logging on to the standby

    ------------------------------------------------------------

    Check that the primary and standby are using a password file

    and remote_login_passwordfile is set to SHARED or EXCLUSIVE,

    and that the SYS password is same in the password files.

    returning error ORA-16191

    ------------------------------------------------------------

    Mon Oct 8 15:31:36 2007

    Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_arc0_2309.trc:

    ORA-16191: Primary log shipping client not logged on standby

    PING[ARC0]: Heartbeat failed to connect to standby 'BRHBRSB'. Error is 16191.

    Primary tries to connect to standby


    SYS Password Issue

    Verify SYS password is the same

      On primary and standby

        sqlplus sys/

    Verify password file has same password

      On primary and standby

        cat $ORACLE_HOME/dbs/orapw

    Refresh password file

      alter user SYS identified by

        Update password file


    DataGuard Trace File

    Physical Standby

    Start log apply process

    Trace file created

    Stops when log apply process stops

    See file contents later


    Can't See Redo Logs

    Physical Standby

      Creating or refreshing standby

        Primary configured, sending redo logs

        Standby not yet created/running

      Standby may not register redo logs

        Our scripts maintain primary archived redo logs

          Compress to save disk space, delete after 2 days

      Manually register

        alter database register logfile ;

        DataGuard applies redo log


    Can't See Redo Logs

    BRHBETA> select * from v$archive_gap;

       THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
    ---------- ------------- --------------
             1          1959           1976

    BRHBETA> select sequence#, applied from v$archived_log order by sequence#;

     SEQUENCE# APP
    ---------- ---
          1956 YES
          1957 YES
          1958 YES
          1977 NO
          1978 NO
          1979 NO
          1980 NO
          1981 NO
          1982 NO
          1983 NO
          1984 NO

    11 rows selected.

    1959 thru 1976 on standby
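    The relationship between the two query results can be illustrated with a few lines of Python. This is only a sketch of the arithmetic, not how Oracle computes v$archive_gap internally; the function name `find_gaps` is hypothetical. Given the sequence numbers the standby has registered, the gap is the run of numbers missing between them.

```python
def find_gaps(sequences):
    """Return (low, high) ranges of sequence numbers missing from the list."""
    ordered = sorted(set(sequences))
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps


# The 11 sequence numbers returned by the v$archived_log query above.
registered = [1956, 1957, 1958] + list(range(1977, 1985))
print(find_gaps(registered))  # [(1959, 1976)]
```

    The result matches the LOW_SEQUENCE# and HIGH_SEQUENCE# values reported by v$archive_gap, which are the logs that need to be registered manually.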


    Redo Logs Not Deleted

    Physical Standby

    After applied to standby

    Unlike logical standby

    SQL apply process does delete them

    Use RMAN

    Possible disk space issues on standby

    How long will you need to store redo logs?

    Not an issue if converting to logical soon


    Stop Physical Standby

    Log Apply Process start

    Starts trace file

    When physical standby first created

    Ends when log apply stops

    Normal processing

    Trace file looks like a problem


    Standby Alert Log

    Tue Oct 9 16:34:34 2007

    Physical Standby Database mounted.

    Completed: ALTER DATABASE MOUNT

    Tue Oct 9 16:34:36 2007

    alter database recover managed standby database disconnect from session

    Tue Oct 9 16:34:36 2007

    Attempt to start background Managed Standby Recovery process (BRHBETA)

    MRP0 started with pid=11, OS id=13474

    Tue Oct 9 16:34:36 2007

    MRP0: Background Managed Standby Recovery process started (BRHBETA)

    Managed Standby Recovery not using Real Time Apply

    parallel recovery started with 7 processes

    Log apply process started when physical standby created


    Standby Alert Log

    Wed Oct 10 10:15:15 2007

    alter database recover managed standby database cancel

    Wed Oct 10 10:15:19 2007

    MRP0: Background Media Recovery cancelled with status 16037

    Wed Oct 10 10:15:19 2007

    Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_mrp0_13474.trc:

    ORA-16037: user requested cancel of managed recovery operation

    Recovery interrupted!

    Wed Oct 10 10:15:20 2007

    Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_mrp0_13474.trc:

    ORA-16037: user requested cancel of managed recovery operation

    Wed Oct 10 10:15:20 2007

    MRP0: Background Media Recovery process shutdown (BRHBETA)

    Wed Oct 10 10:15:21 2007

    Managed Standby Recovery Canceled (BRHBETA)

    Wed Oct 10 10:15:21 2007

    Completed: alter database recover managed standby database cancel

    Log apply process stopped

    Preparing to convert to logical standby


    Trace File

    $ more /orahome01/admin/BRHBETA/bdump/brhbeta_mrp0_13474.trc

    /orahome01/admin/BRHBETA/bdump/brhbeta_mrp0_13474.trc

    Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bit Production

    With the Partitioning, OLAP and Data Mining options

    ORACLE_HOME = /orahome01/product/10.2.0

    System name: SunOS

    Node name: brh-beta1-zone04

    Release: 5.10

    Version: Generic_118833-36

    Machine: sun4u

    Instance name: BRHBETA

    Redo thread mounted by this instance: 1

    Oracle process number: 11

    Unix process pid: 13474, image: oracle@beta1-zone04 (MRP0)

    *** SERVICE NAME:() 2007-10-09 16:34:36.298

    *** SESSION ID:(394.1) 2007-10-09 16:34:36.298

    ARCH: Connecting to console port...

    *** 2007-10-09 16:34:36.299 60639 kcrr.c

    MRP0: Background Managed Standby Recovery process started

    Start applying redo logs to physical standby


    Trace File

    *** 2007-10-09 16:34:41.302 1018 krsm.c

    Managed Recovery: Initialization posted.

    *** 2007-10-09 16:34:41.303 60639 kcrr.c

    Managed Standby Recovery not using Real Time Apply

    Recovery target incarnation = 2, activation ID = 0

    Influx buffer limit = 27762 (50% x 55524)

    Successfully allocated 7 recovery slaves

    Using 158 overflow buffers per recovery slave

    Start recovery at thread 1 ckpt scn 8257757517457 logseq 1956 block 5

    *** 2007-10-09 16:34:42.124

    Media Recovery add redo thread 1

    *** 2007-10-09 16:34:42.124 1018 krsm.c

    Managed Recovery: Active posted.

    ORA-00367: checksum error in log file header

    ORA-00305: log 1 of thread 1 inconsistent; belongs to another database

    ORA-00312: online log 1 thread 1: '/shared/oralogs01/BRHBETA/redo01a.log'

    *** 2007-10-09 16:34:42.147 60639 kcrr.c

    Clearing online redo logfile 1 /shared/oralogs01/BRHBETA/redo01a.log

    *** 2007-10-09 16:36:15.066

    *** 2007-10-09 16:36:15.066 60639 kcrr.c

    Clearing online redo logfile 1 complete

    ORA-00367: checksum error in log file header

    ORA-00305: log 2 of thread 1 inconsistent; belongs to another database

    ORA-00312: online log 2 thread 1: '/shared/oralogs01/BRHBETA/redo02a.log'

    Recreating redo logs


    Trace File

    *** 2007-10-09 16:36:15.100 60639 kcrr.c

    Clearing online redo logfile 2 /shared/oralogs01/BRHBETA/redo02a.log

    *** 2007-10-09 16:37:51.473

    *** 2007-10-09 16:37:51.473 60639 kcrr.c

    Clearing online redo logfile 2 complete

    ORA-00367: checksum error in log file header

    ORA-00305: log 3 of thread 1 inconsistent; belongs to another database

    ORA-00312: online log 3 thread 1: '/shared/oradata02/BRHBETA/redo03b.log'

    *** 2007-10-09 16:37:51.479 60639 kcrr.c

    Clearing online redo logfile 3 /shared/oradata02/BRHBETA/redo03b.log

    *** 2007-10-09 16:39:26.048

    *** 2007-10-09 16:39:26.048 60639 kcrr.c

    Clearing online redo logfile 3 complete

    ORA-00367: checksum error in log file header

    ORA-00305: log 4 of thread 1 inconsistent; belongs to another database

    ORA-00312: online log 4 thread 1: '/shared/oradata02/BRHBETA/redo04b.log'

    *** 2007-10-09 16:39:26.488 60639 kcrr.c

    Clearing online redo logfile 4 /shared/oradata02/BRHBETA/redo04b.log

    *** 2007-10-09 16:41:00.447

    *** 2007-10-09 16:41:00.447 60639 kcrr.c

    Clearing online redo logfile 4 complete

    *** 2007-10-09 16:41:00.469 60639 kcrr.c

    Media Recovery Waiting for thread 1 sequence 1956


    Trace File

    *** 2007-10-09 16:41:00.469 60639 kcrr.c

    Fetching gap sequence in thread 1, gap sequence 1956-1976

    *** 2007-10-09 16:41:30.782

    -----------------------------------------------------------

    Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization

    parameter is defined to a value that is sufficiently large

    enough to maintain adequate log switch information to resolve

    archivelog gaps.

    -----------------------------------------------------------

    *** 2007-10-09 16:54:31.045

    *** 2007-10-09 16:54:31.045 60639 kcrr.c

    Fetching gap sequence in thread 1, gap sequence 1956-1956

    *** 2007-10-09 16:55:01.154

    -----------------------------------------------------------

    Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization

    parameter is defined to a value that is sufficiently large

    enough to maintain adequate log switch information to resolve

    archivelog gaps.

    -----------------------------------------------------------

    *** 2007-10-09 16:56:31.179

    Media Recovery Log /oraarch01/BRHBETA/LOG_1956_1_629245032.arc

    *** 2007-10-09 16:56:33.431

    Media Recovery Log /oraarch01/BRHBETA/LOG_1957_1_629245032.arc

    *** 2007-10-09 16:56:44.495

    Applying redo logs to physical standby


    Trace File

    *** 2007-10-09 16:56:44.495 60639 kcrr.c

    Media Recovery Waiting for thread 1 sequence 1958

    *** 2007-10-09 16:56:44.495 60639 kcrr.c

    Fetching gap sequence in thread 1, gap sequence 1958-1976

    *** 2007-10-09 16:57:14.647

    -----------------------------------------------------------

    Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization

    parameter is defined to a value that is sufficiently large

    enough to maintain adequate log switch information to resolve

    archivelog gaps.

    -----------------------------------------------------------

    *** 2007-10-09 17:05:14.785

    Media Recovery Log /oraarch01/BRHBETA/LOG_1958_1_629245032.arc

    *** 2007-10-09 17:05:18.043 60639 kcrr.c

    Media Recovery Waiting for thread 1 sequence 1959

    *** 2007-10-09 17:05:18.043 60639 kcrr.c

    Fetching gap sequence in thread 1, gap sequence 1959-1976

    *** 2007-10-09 17:05:48.284

    -----------------------------------------------------------

    Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization

    parameter is defined to a value that is sufficiently large

    enough to maintain adequate log switch information to resolve

    archivelog gaps.

    -----------------------------------------------------------


    Trace File

    *** 2007-10-09 17:07:18.309

    Media Recovery Log /oraarch01/BRHBETA/LOG_1959_1_629245032.arc

    *** 2007-10-09 17:07:21.114

    Media Recovery Log /oraarch01/BRHBETA/LOG_1960_1_629245032.arc

    *** 2007-10-09 17:07:22.945

    Media Recovery Log /oraarch01/BRHBETA/LOG_1961_1_629245032.arc

    *** 2007-10-09 17:07:27.300

    Media Recovery Log /oraarch01/BRHBETA/LOG_1962_1_629245032.arc

    *** 2007-10-09 17:07:29.637

    Media Recovery Log /oraarch01/BRHBETA/LOG_1963_1_629245032.arc

    *** 2007-10-09 17:07:29.709 60639 kcrr.c

    Media Recovery Waiting for thread 1 sequence 1964

    *** 2007-10-09 17:07:29.709 60639 kcrr.c

    Fetching gap sequence in thread 1, gap sequence 1964-1976

    *** 2007-10-09 17:07:59.858

    -----------------------------------------------------------

    Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization

    parameter is defined to a value that is sufficiently large

    enough to maintain adequate log switch information to resolve

    archivelog gaps.

    -----------------------------------------------------------

    *** 2007-10-09 17:08:29.866

    Media Recovery Log /oraarch01/BRHBETA/LOG_1964_1_629245032.arc


    Trace File

    *** 2007-10-09 17:08:31.924

    Media Recovery Log /oraarch01/BRHBETA/LOG_1965_1_629245032.arc

    *** 2007-10-09 17:09:12.510

    Media Recovery Log /oraarch01/BRHBETA/LOG_1966_1_629245032.arc

    *** 2007-10-09 17:09:21.050

    Media Recovery Log /oraarch01/BRHBETA/LOG_1967_1_629245032.arc

    *** 2007-10-09 17:09:40.234

    Media Recovery Log /oraarch01/BRHBETA/LOG_1968_1_629245032.arc

    *** 2007-10-09 17:09:45.055

    Media Recovery Log /oraarch01/BRHBETA/LOG_1969_1_629245032.arc

    *** 2007-10-09 17:09:50.572

    Media Recovery Log /oraarch01/BRHBETA/LOG_1970_1_629245032.arc

    *** 2007-10-09 17:09:58.968

    Media Recovery Log /oraarch01/BRHBETA/LOG_1971_1_629245032.arc

    *** 2007-10-09 17:10:03.922

    Media Recovery Log /oraarch01/BRHBETA/LOG_1972_1_629245032.arc

    *** 2007-10-09 17:10:13.196

    Media Recovery Log /oraarch01/BRHBETA/LOG_1973_1_629245032.arc

    *** 2007-10-09 17:10:21.927

    Media Recovery Log /oraarch01/BRHBETA/LOG_1974_1_629245032.arc

    *** 2007-10-09 17:10:34.064

    Media Recovery Log /oraarch01/BRHBETA/LOG_1975_1_629245032.arc

    *** 2007-10-09 17:10:42.420 60639 kcrr.c

    Media Recovery Waiting for thread 1 sequence 1976


    Trace File

    *** 2007-10-09 17:10:42.421 60639 kcrr.c

    Fetching gap sequence in thread 1, gap sequence 1976-1976

    *** 2007-10-09 17:11:12.538

    -----------------------------------------------------------

    Check that the CONTROL_FILE_RECORD_KEEP_TIME initialization

    parameter is defined to a value that is sufficiently large

    enough to maintain adequate log switch information to resolve

    archivelog gaps.

    -----------------------------------------------------------

    *** 2007-10-09 17:12:42.563

    Media Recovery Log /oraarch01/BRHBETA/LOG_1976_1_629245032.arc

    *** 2007-10-09 17:12:45.563

    Media Recovery Log /oraarch01/BRHBETA/LOG_1977_1_629245032.arc

    *** 2007-10-09 17:12:48.534

    Media Recovery Log /oraarch01/BRHBETA/LOG_1978_1_629245032.arc

    *** 2007-10-09 17:13:00.505

    Media Recovery Log /oraarch01/BRHBETA/LOG_1979_1_629245032.arc

    *** 2007-10-09 17:13:02.054

    Media Recovery Log /oraarch01/BRHBETA/LOG_1980_1_629245032.arc

    *** 2007-10-09 17:13:03.231

    Media Recovery Log /oraarch01/BRHBETA/LOG_1981_1_629245032.arc

    *** 2007-10-09 17:13:03.902

    Media Recovery Log /oraarch01/BRHBETA/LOG_1982_1_629245032.arc


    Trace File

    *** 2007-10-09 17:13:04.492

    Media Recovery Log /oraarch01/BRHBETA/LOG_1983_1_629245032.arc

    *** 2007-10-09 17:13:08.171

    Media Recovery Log /oraarch01/BRHBETA/LOG_1984_1_629245032.arc

    *** 2007-10-09 17:13:26.860

    *** 2007-10-09 17:13:26.860 60639 kcrr.c

    Media Recovery Waiting for thread 1 sequence 1985

    *** 2007-10-09 17:16:07.172

    Media Recovery Log /oraarch01/BRHBETA/LOG_1985_1_629245032.arc

    *** 2007-10-09 17:16:08.067

    Media Recovery Log /oraarch01/BRHBETA/LOG_1986_1_629245032.arc

    *** 2007-10-09 17:16:08.131

    Media Recovery Log /oraarch01/BRHBETA/LOG_1987_1_629245032.arc

    *** 2007-10-09 17:16:08.195 60639 kcrr.c

    Media Recovery Waiting for thread 1 sequence 1988

    *** 2007-10-09 17:16:13.202

    Media Recovery Log /oraarch01/BRHBETA/LOG_1988_1_629245032.arc

    *** 2007-10-09 17:16:13.268 60639 kcrr.c

    Media Recovery Waiting for thread 1 sequence 1989

    *** 2007-10-09 21:14:01.119

    Media Recovery Log /oraarch01/BRHBETA/LOG_1989_1_629245032.arc

    *** 2007-10-09 21:14:16.922

    *** 2007-10-09 21:14:16.922 60639 kcrr.c

    Media Recovery Waiting for thread 1 sequence 1990


    Trace File

    *** 2007-10-10 09:32:33.399

    *** 2007-10-10 09:32:33.399 60639 kcrr.c

    Fetching gap sequence in thread 1, gap sequence 1990-1990

    *** 2007-10-10 09:33:05.187

    Media Recovery Log /oraarch01/BRHBETA/LOG_1990_1_629245032.arc

    *** 2007-10-10 09:33:22.505

    Media Recovery Log /oraarch01/BRHBETA/LOG_1991_1_629245032.arc

    *** 2007-10-10 09:33:22.570

    Media Recovery Log /oraarch01/BRHBETA/LOG_1992_1_629245032.arc

    *** 2007-10-10 09:33:22.631

    Media Recovery Log /oraarch01/BRHBETA/LOG_1993_1_629245032.arc

    *** 2007-10-10 09:33:22.693

    Media Recovery Log /oraarch01/BRHBETA/LOG_1994_1_629245032.arc

    *** 2007-10-10 09:33:22.761

    Media Recovery Log /oraarch01/BRHBETA/LOG_1995_1_629245032.arc

    *** 2007-10-10 09:33:22.807

    Media Recovery Log /oraarch01/BRHBETA/LOG_1996_1_629245032.arc

    *** 2007-10-10 09:33:22.864

    Media Recovery Log /oraarch01/BRHBETA/LOG_1997_1_629245032.arc

    *** 2007-10-10 09:33:22.918

    Media Recovery Log /oraarch01/BRHBETA/LOG_1998_1_629245032.arc

    *** 2007-10-10 09:33:23.199

    Media Recovery Log /oraarch01/BRHBETA/LOG_1999_1_629245032.arc


    Trace File

    *** 2007-10-10 09:33:23.255 60639 kcrr.c

    Media Recovery Waiting for thread 1 sequence 2000

    *** 2007-10-10 10:11:07.685

    Media Recovery Log /oraarch01/BRHBETA/LOG_2000_1_629245032.arc

    *** 2007-10-10 10:11:08.422 60639 kcrr.c

    Media Recovery Waiting for thread 1 sequence 2001

    *** 2007-10-10 10:14:48.843

    Media Recovery Log /oraarch01/BRHBETA/LOG_2001_1_629245032.arc

    *** 2007-10-10 10:14:49.013 60639 kcrr.c

    Media Recovery Waiting for thread 1 sequence 2002

    *** 2007-10-10 10:15:19.072

    *** 2007-10-10 10:15:19.072 60639 kcrr.c

    MRP0: Background Media Recovery cancelled with status 16037

    ORA-16037: user requested cancel of managed recovery operation

    ----- Redo read statistics for thread 1 -----

    Read rate (ASYNC): 619732Kb in 63640.12s => 0.01 Mb/sec

    Total physical reads: 619732Kb

    Longest record: 28Kb, moves: 0/2001133 (0%)

    Change moves: 779641/4101685 (19%), moved: 141Mb

    Longest LWN: 1023Kb, moves: 117/175493 (0%), moved: 23Mb

    Last redo scn: 0x0782.a8f27f37 (8257761607479)

    ----------------------------------------------

    *** 2007-10-10 10:15:19.088

    Media Recovery drop redo thread 1

    Stop Log Apply Process

    Ready to convert to Logical Standby
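    The 0.01 Mb/sec read rate in the redo read statistics can look like a performance problem. The short Python check below reproduces the figure from the numbers in the trace file and compares the reported elapsed time with the wall-clock time between the MRP0 start and the cancel. It assumes 1024 Kb per Mb, which the trace file does not state. The two durations are within a few seconds of each other, which suggests the rate is averaged over the whole life of the process, including the hours spent waiting for the next log, rather than over time spent actually reading redo.

```python
from datetime import datetime

kb_read = 619732        # "Total physical reads" from the trace file
elapsed_s = 63640.12    # elapsed seconds from the "Read rate" line

# Assumption: 1024 Kb per Mb (not stated in the trace file).
rate_mb_per_s = kb_read / 1024 / elapsed_s
print(round(rate_mb_per_s, 2))  # 0.01

# Wall-clock time from MRP0 start to the cancel, per the trace timestamps.
started = datetime(2007, 10, 9, 16, 34, 36)
cancelled = datetime(2007, 10, 10, 10, 15, 19)
wall_clock_s = (cancelled - started).total_seconds()
print(wall_clock_s)  # 63643.0
```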


    Trace File

    *** 2007-10-10 10:15:20.864 1018 krsm.c

    Managed Recovery: Not Active posted.

    ORA-16037: user requested cancel of managed recovery operation

    ARCH: Connecting to console port...

    *** 2007-10-10 10:15:20.871 60639 kcrr.c

    MRP0: Background Media Recovery process shutdown

    *** 2007-10-10 10:15:20.871 1018 krsm.c

    oraarch01/BRHBETA $


    Convert to Logical Standby

    SQL Apply Process

      When applying redo logs

        Generates 2 trace files

        What are they?

    Trace files

      One shows start of kcrrwkx

      Second shows end of kcrrwkx

      What are these for?

      Neither shows up in alert log

      Both continue as long as SQL apply process runs


    First Trace File

    /orahome01/admin/BRHBETA/bdump $ more brhbeta_arc0_13168.trc

    /orahome01/admin/BRHBETA/bdump/brhbeta_arc0_13168.trc

    Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bit Production

    With the Partitioning, OLAP and Data Mining options

    ORACLE_HOME = /orahome01/product/10.2.0

    System name: SunOS

    Node name: brh-beta1-zone04

    Release: 5.10

    Version: Generic_118833-36

    Machine: sun4u

    Instance name: BRHBETA

    Redo thread mounted by this instance: 1

    Oracle process number: 24

    Unix process pid: 13168, image: oracle@beta1-zone04 (ARC0)

    *** SERVICE NAME:() 2007-10-10 10:40:26.358

    *** SESSION ID:(188.2) 2007-10-10 10:40:26.358

    kcrrwkx: nothing to do (start)

    *** 2007-10-10 10:45:26.240

    kcrrwkx: nothing to do (start)

    *** 2007-10-10 10:46:35.388

    kcrrwkx: nothing to do (end)

    ...

    ...


    Second Trace File

    /orahome01/admin/BRHBETA/bdump $ more brhbeta_arc1_13170.trc

    /orahome01/admin/BRHBETA/bdump/brhbeta_arc1_13170.trc

    Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bit Production

    With the Partitioning, OLAP and Data Mining options

    ORACLE_HOME = /orahome01/product/10.2.0

    System name: SunOS

    Node name: brh-beta1-zone04

    Release: 5.10

    Version: Generic_118833-36

    Machine: sun4u

    Instance name: BRHBETA

    Redo thread mounted by this instance: 1

    Oracle process number: 9

    Unix process pid: 13170, image: oracle@beta1-zone04 (ARC1)

    *** SERVICE NAME:() 2007-10-10 10:40:26.358

    *** SESSION ID:(396.1) 2007-10-10 10:40:26.358

    kcrrwkx: nothing to do (start)

    *** 2007-10-10 10:41:26.315

    kcrrwkx: nothing to do (end)

    *** 2007-10-10 10:42:26.322

    kcrrwkx: nothing to do (end)


    DataGuard Likes to Chat

    Physical Standby

      While applying archived redo logs

      Trace file documents everything standby does

    Logical Standby

      Once converted to logical standby

        Two trace files generated

        Contain messages for start/stop of each log apply

    Why are these generated?

    Why not have DataGuard alert logs?

    Trace files tell me that something is wrong


    Normal Operation

    Logical Standby catching up to Primary

    Apply process turned off during the day

    Catches up at night

    Apply process failed

    Catch up after fix (skip table in the example)

    Typical alert log messages

    Redo log from primary registered with DG

    Redo logs applied to standby

    Redo logs deleted from standby


    Standby Catching Up

    Tue Oct 16 15:13:22 2007

    Completed: ALTER DATABASE STOP LOGICAL STANDBY APPLY

    Tue Oct 16 15:14:16 2007

    Incremental checkpoint up to RBA [0x7.a0aa2.0], current log tail at RBA [0x7.b8e2c.0]

    Tue Oct 16 15:14:45 2007

    ALTER DATABASE START LOGICAL STANDBY APPLY

    Tue Oct 16 15:14:45 2007

    ALTER DATABASE START LOGICAL STANDBY APPLY (BRHBETA)

    Tue Oct 16 15:14:45 2007

    No optional part

    Attempt to start background Logical Standby process

    LSP0 started with pid=21, OS id=5041

    LOGSTDBY status: ORA-16111: log mining and apply setting up

    Tue Oct 16 15:14:46 2007

    LOGMINER: Parameters summary for session# = 1

    LOGMINER: Number of processes = 3, Transaction Chunk Size = 201

    LOGMINER: Memory Size = 30M, Checkpoint interval = 150M

    Tue Oct 16 15:14:46 2007

    Completed: ALTER DATABASE START LOGICAL STANDBY APPLY

    LOGMINER: session# = 1, builder process P001 started with pid=7 OS id=10018

    LOGMINER: session# = 1, reader process P000 started with pid=34 OS id=10014

    LOGMINER: session# = 1, preparer process P002 started with pid=36 OS id=10020

    LSP2 started with pid=23, OS id=5043

    Stop SQL Apply process

    Start SQL Apply process after skipping table


    Standby Catching Up

    Tue Oct 16 15:14:48 2007

    LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2048_1_629245032.arc

    LOGSTDBY Analyzer process P003 started with pid=13 OS id=10051

    Tue Oct 16 15:14:48 2007

    LOGMINER: Turning ON Log Auto Delete

    LOGSTDBY Apply process P004 started with pid=40 OS id=10054

    LOGSTDBY Apply process P006 started with pid=42 OS id=10062

    LOGSTDBY Apply process P007 started with pid=17 OS id=10064

    LOGSTDBY Apply process P005 started with pid=15 OS id=10060

    Tue Oct 16 15:22:02 2007

    Beginning log switch checkpoint up to RBA [0x8.2.10], SCN: 8295181217591

    Thread 1 advanced to log sequence 8

    Current log# 4 seq# 8 mem# 0: /shared/oradata02/BRHBETA/redo04b.log

    Current log# 4 seq# 8 mem# 1: /shared/oralogs01/BRHBETA/redo04a.log

    Tue Oct 16 15:25:28 2007

    Completed checkpoint up to RBA [0x8.2.10], SCN: 8295181217591

    Tue Oct 16 15:34:32 2007

    Incremental checkpoint up to RBA [0x8.4cbae.0], current log tail at RBA [0x8.65553.0]

    Tue Oct 16 15:42:40 2007

    LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2048_1_629245032.arc

    Tue Oct 16 15:42:40 2007

    LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2049_1_629245032.arc

    ...

    ...

    ...

    Processing redo logs
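    One way to detect a slow apply process from alert log entries like these is to measure how long LogMiner spends on each archived redo log. The Python sketch below is an illustration only and assumes the alert log has been read as clean, one-message-per-line text in which each timestamp line precedes the messages it applies to. The function name `mining_durations` is hypothetical, and the timestamp format string relies on English day and month names.

```python
from datetime import datetime

TS_FORMAT = "%a %b %d %H:%M:%S %Y"   # e.g. "Tue Oct 16 15:14:48 2007"
BEGIN = "LOGMINER: Begin mining logfile:"
END = "LOGMINER: End mining logfile:"


def mining_durations(lines):
    """Map each mined log file to seconds between its Begin and End lines."""
    current_ts = None
    begun = {}
    durations = {}
    for line in lines:
        text = line.strip()
        try:
            current_ts = datetime.strptime(text, TS_FORMAT)
            continue
        except ValueError:
            pass  # not a timestamp line
        if current_ts is None:
            continue  # no timestamp seen yet, cannot time this message
        if text.startswith(BEGIN):
            begun[text.split(": ", 2)[2]] = current_ts
        elif text.startswith(END):
            name = text.split(": ", 2)[2]
            if name in begun:
                durations[name] = (current_ts - begun[name]).total_seconds()
    return durations


sample = [
    "Tue Oct 16 15:14:48 2007",
    "LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2048_1_629245032.arc",
    "Tue Oct 16 15:42:40 2007",
    "LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2048_1_629245032.arc",
]
print(mining_durations(sample))
```

    For the excerpt above, log 2048 takes 1672 seconds (just under 28 minutes) to mine. Tracking this figure per log gives a simple baseline for spotting when apply falls behind.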


    Standby Catching Up

    Tue Oct 16 17:20:48 2007

    LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2049_1_629245032.arc

    Tue Oct 16 17:20:48 2007

    LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2050_1_629245032.arc

    Tue Oct 16 17:20:54 2007

    LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2050_1_629245032.arc

    ...

    ...

    ...

    Tue Oct 16 18:39:13 2007

    LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2048_1_629245032.arc

    Deleted file /oraarch01/BRHBETA/LOG_2048_1_629245032.arc

    ...

    ...

    ...

    Tue Oct 16 18:43:40 2007

    LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2082_1_629245032.arc

    Tue Oct 16 18:43:59 2007

    LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2082_1_629245032.arc

    Tue Oct 16 18:43:59 2007

    LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2083_1_629245032.arc

    Processing redo logs

    Deleting redo logs

    Processing redo logs


    Standby Catching Up

    Tue Oct 16 18:44:01 2007

    LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2056_1_629245032.arc

    Deleted file /oraarch01/BRHBETA/LOG_2056_1_629245032.arc

    Tue Oct 16 18:44:01 2007

    LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2057_1_629245032.arc

    Deleted file /oraarch01/BRHBETA/LOG_2057_1_629245032.arc

    Tue Oct 16 18:44:01 2007

    LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2058_1_629245032.arc

    Deleted file /oraarch01/BRHBETA/LOG_2058_1_629245032.arc

    ...

    Deleting redo logs


    Standby Catching Up

    Tue Oct 16 18:44:15 2007

    LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2087_1_629245032.arc

    Tue Oct 16 18:48:37 2007

    Completed checkpoint up to RBA [0xa.2.10], SCN: 8295181577382

    Tue Oct 16 18:55:18 2007

    Incremental checkpoint up to RBA [0xa.12dad.0], current log tail at RBA [0xa.1314b.0]

    Tue Oct 16 19:01:31 2007

    RFS[1]: No standby redo logfiles created

    RFS[1]: Archived Log: '/oraarch01/BRHBETA/LOG_2153_1_629245032.arc'

    Tue Oct 16 19:01:32 2007

    RFS LogMiner: Registered logfile [/oraarch01/BRHBETA/LOG_2153_1_629245032.arc] to LogMiner session id [1]

    Tue Oct 16 19:15:22 2007

    Incremental checkpoint up to RBA [0xa.142b2.0], current log tail at RBA [0xa.143fe.0]

    Tue Oct 16 19:29:01 2007

    LSP0: warning -- apply server 2, sid 384 waiting on user sid 196 for event (since 0 seconds):

    Tue Oct 16 19:29:01 2007

    LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2087_1_629245032.arc

    ...

    Primary is at 2153

    Standby is at 2087
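The gap between the last log received and the log currently being mined can be read off the alert log mechanically. The following is a minimal Python sketch, not a tool from the presentation. It assumes the archive file names follow the LOG_<sequence>_<thread>_<resetlogs id>.arc layout seen in this log, and the function name apply_lag is made up for illustration.

```python
import re

# Patterns for the two alert-log messages shown above. The file-name layout
# LOG_<sequence>_<thread>_<resetlogs_id>.arc is taken from this log; other
# sites may use a different LOG_ARCHIVE_FORMAT.
RECEIVED = re.compile(r"RFS\[\d+\]: Archived Log: '.*?LOG_(\d+)_\d+_\d+\.arc'")
MINING = re.compile(r"LOGMINER: Begin mining logfile: .*?LOG_(\d+)_\d+_\d+\.arc")


def apply_lag(alert_log_lines):
    """Return (received, mining, lag) as log sequence numbers, or None."""
    received = mining = None
    for line in alert_log_lines:
        m = RECEIVED.search(line)
        if m:
            # Highest sequence shipped from the primary so far.
            received = max(received or 0, int(m.group(1)))
        m = MINING.search(line)
        if m:
            # Most recent log that SQL Apply started mining.
            mining = int(m.group(1))
    if received is None or mining is None:
        return None
    return received, mining, received - mining


sample = [
    "LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2087_1_629245032.arc",
    "RFS[1]: Archived Log: '/oraarch01/BRHBETA/LOG_2153_1_629245032.arc'",
]
print(apply_lag(sample))  # (2153, 2087, 66)
```

With the two lines from this slide the standby is 66 logs behind the primary.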


    Standby Catching Up

    Tue Oct 16 19:30:58 2007

    LOGSTDBY stmt: CREATE PFILE = '/tmp/datatools/BRHBETA.PFILE.19144.1192413665' FROM SPFILE =

    LOGSTDBY status: ORA-16226: DDL skipped due to lack of support

    LOGSTDBY id: XID 0x0003.02d.00013e70, hSCN 0x0782.a9c2fdb8, lSCN 0x0782.a9c2fdb8, Thread 1, RBA

    LOGSTDBY stmt: create pfile='/orahome01/oradba/tmp/ora_adm_sqlbt_bkp.tmp1.17449.BRHBETA' from spfile

    LOGSTDBY status: ORA-16226: DDL skipped due to lack of support

    LOGSTDBY id: XID 0x000b.001.000126cf, hSCN 0x0782.a9c2fe15, lSCN 0x0782.a9c2fe15, Thread 1, RBA

    LOGSTDBY stmt: CREATE PFILE = '/tmp/datatools/BRHBETA.PFILE.19695.1192413687' FROM SPFILE =

    LOGSTDBY status: ORA-16226: DDL skipped due to lack of support

    LOGSTDBY id: XID 0x0003.00c.00013e62, hSCN 0x0782.a9c2fe4a, lSCN 0x0782.a9c2fe4a, Thread 1, RBA

    LOGSTDBY stmt: ALTER DATABASE BACKUP CONTROLFILE TO '/tmp/datatools/dtodump_

    LOGSTDBY status: ORA-16226: DDL skipped due to lack of support

    LOGSTDBY id: XID 0x0009.007.00011453, hSCN 0x0782.a9c2feb4, lSCN 0x0782.a9c2feb4, Thread 1, RBA

    Tue Oct 16 19:30:58 2007

    ALTER TABLESPACE "SYSTEM" BEGIN BACKUP

    Completed: ALTER TABLESPACE "SYSTEM" BEGIN BACKUP

    Tue Oct 16 19:30:58 2007

    ALTER TABLESPACE "SYSTEM" END BACKUP

    Completed: ALTER TABLESPACE "SYSTEM" END BACKUP

    ...

    Unsupported DDL

    Standby doesn't execute
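Because messages like ORA-16226 are logged as errors even though apply carries on, a monitoring script has to separate them from real failures. A minimal Python sketch of that filtering follows. The EXPECTED set is an assumption built from the codes this deck shows during normal operation, not an Oracle-supplied list, and summarize_status is a made-up name.

```python
import re
from collections import Counter

STATUS = re.compile(r"LOGSTDBY status: (ORA-\d{5})")

# Codes this deck shows while SQL Apply keeps running. The set is an
# assumption for illustration; review it against your own alert log.
EXPECTED = {"ORA-16226", "ORA-16222", "ORA-16111"}


def summarize_status(alert_log_lines):
    """Count LOGSTDBY status codes, split into expected and unexpected."""
    expected, unexpected = Counter(), Counter()
    for line in alert_log_lines:
        for code in STATUS.findall(line):
            bucket = expected if code in EXPECTED else unexpected
            bucket[code] += 1
    return expected, unexpected


sample = [
    "LOGSTDBY status: ORA-16226: DDL skipped due to lack of support",
    "LOGSTDBY status: ORA-16226: DDL skipped due to lack of support",
    "LOGSTDBY status: ORA-01403: no data found",
]
expected, unexpected = summarize_status(sample)
print(dict(expected), dict(unexpected))  # {'ORA-16226': 2} {'ORA-01403': 1}
```

Only the codes in the second counter would need to page a DBA.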


    Standby Catching Up

    Tue Oct 16 21:29:19 2007

    LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2157_1_629245032.arc

    Tue Oct 16 21:30:03 2007

    LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2157_1_629245032.arc

    Tue Oct 16 21:35:52 2007

    Incremental checkpoint up to RBA [0xa.f41b7.0], current log tail at RBA [0xa.f41cc.0]

    Tue Oct 16 21:55:56 2007

    Incremental checkpoint up to RBA [0xa.f43b5.0], current log tail at RBA [0xa.f43b5.0]

    Tue Oct 16 22:11:16 2007

    RFS[1]: No standby redo logfiles created

    RFS[1]: Archived Log: '/oraarch01/BRHBETA/LOG_2158_1_629245032.arc'

    Tue Oct 16 22:11:16 2007

    RFS LogMiner: Registered logfile [/oraarch01/BRHBETA/LOG_2158_1_629245032.arc] to LogMiner session id [1]

    Tue Oct 16 22:11:16 2007

    LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2158_1_629245032.arc

    Tue Oct 16 22:11:20 2007

    LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2158_1_629245032.arc

    Standby catches up at 2158


    Archived Redo Logs

    Logical Standby

    After applied to standby

    SQL apply process does delete them

    Unlike physical standby

    Possible disk space issues on standby

    How long will you need to store redo logs?

    If standby frozen all day

    Weekends? Holidays?

    If standby fails

    How many days to fix failures?
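The retention question above comes down to simple arithmetic once the log rate, log size, and longest tolerated outage are known. A rough Python sketch follows. Every input value is hypothetical, since the presentation gives no log sizes, and archive_space_gb is a made-up helper name.

```python
def archive_space_gb(logs_per_day, log_size_mb, outage_days):
    """Disk space (GB) needed to hold every archived log for an outage."""
    return logs_per_day * log_size_mb * outage_days / 1024


# Hypothetical inputs: one log per hour, 500 MB each, standby frozen over
# a three-day holiday weekend plus one day to repair a failure.
print(archive_space_gb(logs_per_day=24, log_size_mb=500, outage_days=4))  # 46.875
```

The same figure applies on the primary if the logs have to stay on disk there until the standby can receive them.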


    Archived Redo Logs

    How long are redo logs available on primary?

    If not on disk when needed for standby

    Recover from backup

    Dataguard may not see these redo logs

    Register redo logs

    Logical standby

    Also generates its own archived redo logs

    Needed to recover standby db

    Unique standby db objects?


    SQL Apply Process Slow

    Detect long-running transaction

    Compute estimate of time to complete

    Identify and skip problem table


    Long Running Transaction

    Standby Alert Log

    SQL apply process applying redo log 2049

    Doesn't move on within a few minutes

    Current time is Tue Oct 16 08:09:55 2007

    Shows start time for this redo log

    Has been processing for over 24 hours

    Mon Oct 15 05:52:29 2007

    LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2049_1_629245032.arc
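The check described on this slide (a "Begin mining" message with no matching "End mining" after a long time) can be automated. The Python sketch below is one possible way, assuming each alert-log timestamp sits on its own line in the format shown here and that the host uses an English locale for day and month names. The function name open_mining is hypothetical.

```python
import re
from datetime import datetime

TIMESTAMP_FORMAT = "%a %b %d %H:%M:%S %Y"  # e.g. "Mon Oct 15 05:52:29 2007"
BEGIN = re.compile(r"LOGMINER: Begin mining logfile: (\S+)")
END = re.compile(r"LOGMINER: End mining logfile: (\S+)")


def open_mining(alert_log_lines, now):
    """Return {logfile: elapsed timedelta} for logs begun but not ended."""
    started = {}
    stamp = None
    for raw in alert_log_lines:
        line = raw.strip()
        try:
            # A timestamp line applies to the messages that follow it.
            stamp = datetime.strptime(line, TIMESTAMP_FORMAT)
            continue
        except ValueError:
            pass
        m = BEGIN.search(line)
        if m and stamp is not None:
            started[m.group(1)] = stamp
        m = END.search(line)
        if m:
            started.pop(m.group(1), None)
    return {name: now - begun for name, begun in started.items()}


lines = [
    "Mon Oct 15 05:52:29 2007",
    "LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2049_1_629245032.arc",
]
now = datetime(2007, 10, 16, 8, 9, 55)
stalled = open_mining(lines, now)
for name, elapsed in stalled.items():
    print(name, round(elapsed.total_seconds() / 3600, 1), "hours")
```

For the log on this slide the elapsed time comes out at a little over 26 hours, consistent with "over 24 hours" above.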


    Long Running Transaction

    What is apply process doing?

    Check redo logs waiting to be applied

    Where is processing in current redo log?

    How long to complete current redo log?


    Long Running Transaction

    alter session set nls_date_format = 'DD-Mon-YYYY hh24:mi:ss';

    column first_change# format 99999999999999999999

    column next_change# format 99999999999999999999

    column resetlogs_change# format 99999999999999999999

    select * from dba_logstdby_log;

    BRHBETA> SELECT TYPE, STATUS, HIGH_SCN FROM V$LOGSTDBY;

    TYPE
    ------------------------------
    STATUS
    --------------------------------------------------------------------------------------------------------------
    HIGH_SCN
    ---------------------
    COORDINATOR
    ORA-16116: no work available
    8257767540953

    READER
    ORA-16127: stalled waiting for additional transactions to be applied
    8257767541085

    BUILDER
    ORA-16127: stalled waiting for additional transactions to be applied
    8257767540965


    Long Running Transaction

    PREPARER
    ORA-16127: stalled waiting for additional transactions to be applied
    8257767540965

    ANALYZER
    ORA-16117: processing
    8257767540953

    APPLIER
    ORA-16116: no work available
    8257767539467

    APPLIER
    ORA-16116: no work available
    8257767512259

    APPLIER
    ORA-16113: applying change to table or sequence "BRH"."XXSUN_BRH_COMPS_INT"
    8257767539247

    APPLIER
    ORA-16116: no work available
    8257767512262

    9 rows selected.

    BRHBETA>


    Long Running Transaction

    BRHBETA> select * from dba_logstdby_log;

    THREAD# RESETLOGS_CHANGE# RESETLOGS_ID SEQUENCE# FIRST_CHANGE# NEXT_CHANGE#

    ---------- --------------------- ------------ ---------- --------------------- --------------------- -------------------------- ----------

    FILE_NAME

    TIMESTAMP DIC DIC APPLIED

    1 8257200902826 629245032 2048 8257767447753 8257767534297 12-Oct-2007 22:41:05 12-Oct-2007 23:18:23

    /oraarch01/BRHBETA/LOG_2048_1_629245032.arc

    12-Oct-2007 22:19:08 NO NO CURRENT

    1 8257200902826 629245032 2049 8257767534297 8257767754044 12-Oct-2007 23:18:23 13-Oct-2007 00:11:05

    /oraarch01/BRHBETA/LOG_2049_1_629245032.arc

    12-Oct-2007 23:12:18 NO NO CURRENT

    1 8257200902826 629245032 2050 8257767754044 8257767922751 13-Oct-2007 00:11:05 13-Oct-2007 01:11:05

    /oraarch01/BRHBETA/LOG_2050_1_629245032.arc

    13-Oct-2007 00:11:06 NO NO NO

    ...

    ...

    ...

    1 8257200902826 629245032 2140 8257781397314 8257781562968 16-Oct-2007 07:41:15 16-Oct-2007 08:41:15

    /oraarch01/BRHBETA/LOG_2140_1_629245032.arc

    16-Oct-2007 07:41:16 NO NO NO

    93 rows selected.


    Long Running Transaction

    redo log 2049 goes from SCN 8257767534297 to SCN 8257767754044

    Check again -- Tue Oct 16 15:04:29 MST 2007

    Compute Estimate Tue Oct 16 08:09:55 -- Tue Oct 16 11:17:39

    APPLIER has moved from 39247 to 39991

    3 hours --> roughly 750 SCNs, 250 per hour

    it still needs to go from 539991 to 754044

    over 200,000 SCNs -- at 250 per hour,

    this would take 800 hours --> 33 days

    APPLIER

    ORA-16113: applying change to table or sequence "BRH"."XXSUN_BRH_COMPS_INT"

    8257767540857
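The estimate on this slide is a linear extrapolation, which can be written out as a short Python sketch. It uses the APPLIER HIGH_SCN readings quoted above, expanded to full SCNs on the assumption that "39247" and "39991" are the trailing digits of 8257767539247 and 8257767539991. The function name estimate_hours is made up, and the equal-cost-per-SCN assumption is the slide's own simplification.

```python
from datetime import datetime


def estimate_hours(scn_then, scn_now, scn_target, time_then, time_now):
    """Linear extrapolation: hours left at the rate observed so far."""
    elapsed_hours = (time_now - time_then).total_seconds() / 3600
    rate = (scn_now - scn_then) / elapsed_hours  # SCNs applied per hour
    return rate, (scn_target - scn_now) / rate


rate, hours = estimate_hours(
    scn_then=8257767539247,    # APPLIER HIGH_SCN at 08:09:55
    scn_now=8257767539991,     # APPLIER HIGH_SCN at 11:17:39
    scn_target=8257767754044,  # NEXT_CHANGE# of redo log 2049
    time_then=datetime(2007, 10, 16, 8, 9, 55),
    time_now=datetime(2007, 10, 16, 11, 17, 39),
)
print(f"{rate:.0f} SCNs/hour, {hours:.0f} hours, {hours / 24:.0f} days")
```

Unrounded, the rate is a little under 250 SCNs per hour and the estimate a little over the 33 days quoted on the slide. Either way the conclusion is the same: the log will not finish in any useful time.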


    Long Running Transaction

    Check again -- Tue Oct 16 15:04:29 MST 2007

    APPLIER has moved

    39991 to 40857

    in the last 4 hours, 866 SCNs,

    roughly in line with 250/hr we computed earlier

    APPLIER

    ORA-16113: applying change to table or sequence "BRH"."XXSUN_BRH_COMPS_INT"

    8257767540857


    Long Running Transaction

    This is truly awful science!

    Assumes all SCNs take same amount of time

    If processing takes more than a few minutes

    Compute estimate

    Confirm that it will take a long time

    Compare with business requirements for standby

    Must be in synch once per day

    Decide to skip table

    If table required, must wait or full refresh


    Long Running Transaction

    Skip table

    SQL Apply Process restarts with redo 2048

    Standby catches up quickly

    ALTER DATABASE STOP LOGICAL STANDBY APPLY;

    EXECUTE DBMS_LOGSTDBY.SKIP -
    (stmt => 'DML' , -
    schema_name => 'BRH' , -
    object_name => 'XXSUN_BRH_COMPS_INT', -
    proc_name => null);

    ALTER DATABASE START LOGICAL STANDBY APPLY;


    Primary/Standby Interactions

    Logical standby backup starts

    Tablespaces put into backup mode

    Apply process applies redo logs from primary

    Contain transactions for primary backup

    Tries to put tablespaces into backup mode

    Apply process fails

    Wait for standby backup to finish

    Restart apply process

    Disable standby backups when catching up

    Apply process runs longer than normal


    Unique Constraint Violation

    Oracle calls this

    Oscillating updates

    Oracle docs explain this (I can't)

    Or primary update really did fail

    And was rolled back on primary db

    Fails and rolls back in standby db

    SQL apply process restarts

    Automatically

    No need to do anything


    Unique Constraint Violation

    Tue Oct 16 21:23:42 2007

    LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2147_1_629245032.arc

    ...

    ...

    ...

    Tue Oct 16 21:24:31 2007

    LOGSTDBY stmt: insert into "APPLSYS"."WF_LOCAL_ROLES"

    values

    "COL1" = 'Value1',

    "COL2" = 'Value2',

    "COL3" IS NULL,

    LOGSTDBY status: ORA-00001: unique constraint (APPLSYS.WF_LOCAL_ROLES_U1) violated

    LOGSTDBY id: XID 0x0009.016.00011548, hSCN 0x0782.aa32b533, lSCN 0x0782.aa32b533, Thread 1, RBA

    Tue Oct 16 21:25:20 2007

    LOGMINER: End mining logfile: /oraarch01/BRHBETA/LOG_2147_1_629245032.arc

    SQL Apply Process continues processing


    No Data Found

    What does it mean?

    When DataGuard updates standby

    Brings update from primary

    Brings pre-update data from primary

    On standby, DataGuard compares

    Pre-update data from primary

    Current data on standby

    If they don't agree

    DataGuard won't apply the update on standby

    SQL apply process fails


    No Data Found

    Wed Sep 19 12:09:23 2007

    LOGSTDBY stmt: update "PO"."PO_LINE_LOCATIONS_ALL"

    ......SQL, values...

    LOGSTDBY status: ORA-01403: no data found

    LOGSTDBY id: XID 0x0008.01e.0000c437, hSCN 0x0789.eacde6c1, lSCN 0x0789.eacde6c1

    LOGSTDBY Apply process P007 pid=29 OS id=3447 stopped

    Wed Sep 19 12:09:23 2007

    Errors in file /shared/orahome01/admin/BRHPRSB/bdump/brhprsb_lsp0_12386.trc:

    ORA-12801: error signaled in parallel query server P004

    ORA-01403: no data found

    LOGSTDBY Analyzer process P003 pid=24 OS id=3439 stopped

    LOGSTDBY Apply process P006 pid=27 OS id=3445 stopped

    LOGSTDBY Apply process P005 pid=26 OS id=3443 stopped


    No Data Found

    Logical Standby

    No way to find out what happened

    No utility to verify primary, standby in synch

    Differences can exist for a long time

    Won't cause error until table updated on primary

    Logical Standby for reporting?

    Can you depend on this for your reports?

    How do you know what is in the standby?

    What has been skipped?


    Primary Schema Issues

    Primary db

    XDB schema reinstalled

    Create java class (loads java class from filesystem)

    Standby db

    Transactions came through to standby

    Standby doesn't have java class files

    Apply process fails

    Identify and skip transaction(s)
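The transaction to skip is identified by the XID on the "LOGSTDBY id:" line of the failure. One way to skip a single transaction (not necessarily the one used in this case) is DBMS_LOGSTDBY.SKIP_TRANSACTION, which takes the undo segment number, slot, and sequence as decimal numbers, while the alert log prints them in hex. A minimal Python sketch of the conversion follows. The function name xid_parts is hypothetical, and the sample line is the one from the "No Data Found" slide.

```python
import re

# XID as printed in the alert log: 0x<undo segment>.<slot>.<sequence>, in hex.
XID = re.compile(r"XID 0x([0-9a-fA-F]+)\.([0-9a-fA-F]+)\.([0-9a-fA-F]+)")


def xid_parts(logstdby_id_line):
    """Return (undo segment, slot, sequence) as integers, or None."""
    m = XID.search(logstdby_id_line)
    if m is None:
        return None
    return tuple(int(part, 16) for part in m.groups())


line = ("LOGSTDBY id: XID 0x0008.01e.0000c437, "
        "hSCN 0x0789.eacde6c1, lSCN 0x0789.eacde6c1")
print(xid_parts(line))  # (8, 30, 50231)
```

The three numbers would be passed, in that order, as the transaction identifier after stopping SQL Apply and before restarting it.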


    ORA-07445 Errors

    SR opened

    Results

    Known bug fixed in 11g

    Apply patch on standby

    Impact

    None, no effect on standby

    Apply patch?

    No, refresh would wipe out patch

    Don't want to patch primary db

    - Primary doesn't have this error


    ORA-07445 Errors

    Tue Oct 16 21:27:50 2007

    Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_p004_6577.trc:

    ORA-07445: exception encountered: core dump [krvsmso()+1212] [SIGSEGV] [Address not mapped to object]

    Tue Oct 16 21:29:06 2007

    Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_lsp0_5041.trc:

    ORA-12805: parallel query server died unexpectedly

    Tue Oct 16 21:29:06 2007

    TLCR process death detected. Shutting down TLCR

    logminer process death detected, exiting logical standby

    LOGSTDBY Analyzer process P003 pid=13 OS id=10051 stopped

    LOGSTDBY Apply process P005 pid=15 OS id=10060 stopped

    LOGSTDBY Apply process P006 pid=42 OS id=10062 stopped

    LOGSTDBY Apply process P007 pid=17 OS id=10064 stopped

    Tue Oct 16 21:29:06 2007

    LOGSTDBY status: ORA-16222: automatic Logical Standby retry of last action

    LOGSTDBY status: ORA-16111: log mining and apply setting up

    SQL Apply Process stops

    SQL Apply Process automatically restarts

    Logical Standby is not for the faint of heart!


    ORA-07445 Errors

    Tue Oct 16 21:29:07 2007

    LOGMINER: Parameters summary for session# = 1

    LOGMINER: Number of processes = 3, Transaction Chunk Size = 201

    LOGMINER: Memory Size = 30M, Checkpoint interval = 150M

    LOGMINER: session# = 1, builder process P001 started with pid=7 OS id=10018

    LOGMINER: session# = 1, reader process P000 started with pid=34 OS id=10014

    LOGMINER: session# = 1, preparer process P002 started with pid=36 OS id=10020

    Tue Oct 16 21:29:10 2007

    LOGMINER: Begin mining logfile: /oraarch01/BRHBETA/LOG_2147_1_629245032.arc

    Tue Oct 16 21:29:10 2007

    LOGMINER: Turning ON Log Auto Delete

    LOGSTDBY Analyzer process P003 started with pid=13 OS id=10051

    LOGSTDBY Apply process P006 started with pid=42 OS id=10062

    LOGSTDBY Apply process P004 started with pid=30 OS id=10219

    LOGSTDBY Apply process P005 started with pid=15 OS id=10060

    LOGSTDBY Apply process P007 started with pid=17 OS id=10064

    SQL Apply Process continues processing


    Refresh Process

    Export unique standby db objects

    Scripts to recreate

    Backup primary db

    Create standby control file

    Recover primary db backup on standby

    Use standby control file

    Create physical standby

    Convert to logical standby

    Import unique standby db objects

    Recreate with scripts


    Unsupported Record

    ORA-16211

    SQL apply process fails

    Must skip table or refresh standby

    Oracle SR tells me to

    Add all column supplemental log group to table

    Rebuild standby

    Or reinstantiate the table

    Needed for each table

    Not an easy process


    Unsupported Record

    Thu Oct 11 10:11:58 2007

    LOGMINER: Log Auto Delete - deleting: /oraarch01/BRHBETA/LOG_2005_1_629245032.arc

    Deleted file /oraarch01/BRHBETA/LOG_2005_1_629245032.arc

    Thu Oct 11 10:15:55 2007

    ** LOGMINER WARNING - Invalidated 4 LCRs **

    Thu Oct 11 10:20:29 2007

    LOGSTDBY stmt: "BRH"."XXSUN_INV_ITEMS_INT": unsupported

    LOGSTDBY status: ORA-16211: unsupported record found in the archived redo log

    ORA-06512: at "SYS.DBMS_INTERNAL_LOGSTDBY", line 4717

    ORA-06512: at line 1

    LOGSTDBY id: XID 0x0009.02e.0001127d, hSCN 0x0782.a9016545, lSCN 0x0782.a9016545, Thread 1

    LOGSTDBY Apply process P007 pid=23 OS id=16578 stopped

    Thu Oct 11 10:20:29 2007

    Errors in file /orahome01/admin/BRHBETA/bdump/brhbeta_lsp0_13625.trc:

    ORA-12801: error signaled in parallel query server P007

    ORA-16211: unsupported record found in the archived redo log

    LOGSTDBY Analyzer process P003 pid=19 OS id=16570 stopped

    LOGSTDBY Apply process P005 pid=21 OS id=16574 stopped

    LOGSTDBY Apply process P006 pid=36 OS id=16576 stopped

    LOGSTDBY Apply process P004 pid=34 OS id=16572 stopped


    Unsupported Record

    What causes this?

    Metalink 304061.1

    Possible causes

    Direct path insert on partitioned table

    Table has 500 columns

    Is this a standby?

    At any time this error may happen

    How to predict/prevent?


    Compile Invalid Objects In Logical Standby

    Execute utlrp.sql

    2 hours go by

    Not much changed

    Disable Guard for session

    Alter session disable guard;

    alter database guard standby;

    Recompile runs in 2 minutes

    Alter session enable guard;

    When normal things don't work

    Perhaps guard enabled is the problem

    Guard level is the problem (all vs standby)


    Import Into Logical Standby

    For recompile we used

    Alter session disable guard;

    Refresh Logical Standby

    Unique db objects exported before refresh

    Must be imported after refresh

    Import doesn't use SQL*Plus session

    Alter database guard standby;


    Conclusion

    Logical standby

    Lots of errors

    Many require refreshing standby

    Lots of DBA support needed

    For all of this support

    What do you have?

    Do you know what is in the standby

    - Reporting?


    Conclusion

    Physical standby

    Is solid, dependable

    No issues

    Logical standby

    Is it really a standby?

    Is it ready for failover?

    Is it providing complete data for reports?

    Lots of issues

    Is it worth the effort/risk?

