
Answer

Date post: 06-Sep-2015
Upload: gbenga
Description:
Sun Systems Fault Analysis Workshop: Online Assessment
Transcript

1. The service configuration repository provides a per-service snapshot at the time each service is successfully started so that fallback is possible. The SMF service always executes with the running snapshot. This snapshot is automatically created if it does not exist.

You find that the console-login service configuration on a server is wrong, and now need to take steps to fix the problem by reverting to the last snapshot that started successfully. Once you have logged in as superuser or equivalent role you run the following commands.

# svccfg
svc:> select system/console-login:default
svc:/system/console-login:default> listsnap
initial
running
start
svc:/system/console-login:default> revert start
svc:/system/console-login:default> quit

You have two more steps to complete in this process, which are necessary to update the information in the service configuration and to restart the service instance.

What two commands would you run to update the repository with the configuration information from the start snapshot and then restart this service instance? (4 Points)

(Choose all correct answers)

svcadm restart system/console-login (*)

svcadm refresh system/console-login (*)

svccfg export system/console-login

svcadm update system/console-login

Correct!
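
The whole fallback sequence from this question can be sketched as a small shell function. This is only an illustration: the names run, DRY_RUN and revert_to_start_snapshot are hypothetical helpers, and the sketch assumes that svccfg accepts the revert subcommand non-interactively when the instance is selected with -s. By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: revert an SMF instance to its "start" snapshot, then refresh
# and restart it. DRY_RUN, run and revert_to_start_snapshot are
# illustrative names, not part of SMF.
DRY_RUN=${DRY_RUN:-1}   # default: print the commands rather than run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"       # show what would be executed
    else
        "$@"            # execute the command as given
    fi
}

revert_to_start_snapshot() {
    fmri=$1
    run svccfg -s "$fmri" revert start   # roll properties back to the start snapshot
    run svcadm refresh "$fmri"           # load the reverted configuration into the running snapshot
    run svcadm restart "$fmri"           # restart the instance with that configuration
}
```

Calling revert_to_start_snapshot system/console-login:default with DRY_RUN=1 prints the three commands in order, which makes it easy to review them before running them for real on a Solaris host.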

2. You have used the prtdiag command on a server to get some information about the system configuration, diagnostics, and failed FRUs. When the prtdiag command was executed, an exit value of 1 was returned.

Which option describes the meaning of this exit value? (2 Points)

Indicates that failures or errors were detected in the system. (*)

Indicates an out of memory internal error.

Indicates that an internal prtdiag error occurred on the system.

Indicates that no failures or errors were detected on the system.

Correct!
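
As a rough aid, the exit values listed in the answer options can be mapped to text in a script. The function name explain_prtdiag_status is a hypothetical helper; the meanings follow the prtdiag manual page (0, 1 and 2).

```shell
#!/bin/sh
# Sketch: translate a prtdiag exit status into a short message.
# explain_prtdiag_status is an illustrative name, not a Solaris command.
explain_prtdiag_status() {
    case "$1" in
        0) echo "no failures or errors detected" ;;
        1) echo "failures or errors detected" ;;
        2) echo "internal prtdiag error (for example, out of memory)" ;;
        *) echo "unknown exit status: $1" ;;
    esac
}
```

On a live system this could be used right after the command, for example: prtdiag -v; explain_prtdiag_status $?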

3. The response time within a newly configured zone is very poor, and many services are not running. The person that configured this new zone booted it and logged in successfully.

Within the new zone the following command is run, which explains the state of services.

# svcs -xv
No output for more than 5 minutes.

From this generated message, you surmise what the probable cause of the slow zone is and run the next set of commands:

# zoneadm -z newzone halt
# zonecfg -z newzone
zonecfg:newzone> remove capped-memory
zonecfg:newzone> commit
zonecfg:newzone> info
zonecfg:newzone> exit
# zoneadm -z newzone boot
# zlogin newzone
# svcs -xv

The zone should now run faster than before.

Which option would be the cause of this poor response time reported on this new zone? (2 Points)

Loopback file system not enabled.

Zone misconfigured; resource caps too low. (*)

Zone initiation failed.

Physical memory capping changed.

Correct!

4. System performance, especially for compute-bound processing, is not very good. You run the mpstat command for a short time and see that the CPU system time (sys) is fairly high, even on a system that is not doing much.

# mpstat 2
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
  0    0   0   13  228    5  22    0    1    1   0     0   0  11  0  89
  1    0   0    8   26    1  17    0    0    0   0    51   0   3  0  96
  2    0   0    3    9    0   5    0    0    0   0     0   0   0  0 100
  3    0   0   10   34    2  23    0    0    0   0     2   0   5  0  95
  4    0   0    5   70   28  64    0    0    0   0    34   0   4  0  96
  5    0   0   32   27    0  18    0    0    0   0     0   0   4  0  96
  6    0   0    4   39   13  33    0    1    1   0    12   0   0  0 100
  7    0   0    8   26    0  16    0    0    0   0     0   0   0  0 100
  8    0   0   12   36    0  26    0    0    0   0     0   0   0  0 100
  9    0   0    4   16    0  12    0    0    0   0     2   0   6  0  94
 10    0   0   14   42    1  26    0    0    0   0     1   0  10  0  90
. . . . . .
^C

Possible causes of the slowness include a kernel bug, improper configuration, or interrupt processing.

Which command would you use to correct this system performance problem? (2 Points)

Use the reboot command to see if the problem goes away.

Use the modinfo command to find any unwanted or suspicious module and unload it.

Use the intrstat 2 command to determine the source of the problem.

All of the above (*)

Correct!
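
To spot the busiest CPUs in mpstat output like the sample in this question, the sys column can be filtered with awk. This is a minimal sketch: high_sys_cpus and its threshold argument are hypothetical, and it assumes the header row names the column sys, as in the sample.

```shell
#!/bin/sh
# Sketch: read mpstat output on stdin and print "CPU sys%" for every CPU
# whose sys value is at or above a threshold (default 10).
# high_sys_cpus is an illustrative name, not a Solaris command.
high_sys_cpus() {
    threshold=${1:-10}
    awk -v t="$threshold" '
        $1 == "CPU" {                       # header row: find the sys column by name
            for (i = 1; i <= NF; i++) if ($i == "sys") col = i
            next
        }
        # data rows start with a numeric CPU id; skip everything else
        col && $1 ~ /^[0-9]+$/ && $col + 0 >= t + 0 { print $1, $col }
    '
}
```

On a live system this could be fed from a single sample, for example: mpstat 2 2 | high_sys_cpus 10. The CPUs it reports are the ones to compare against intrstat output.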

5. The Automatic System Recovery (ASR) feature enables the server to automatically configure failed components out of operation until they can be replaced. In the server, the ASR feature manages nonfatal hardware failures associated with memory modules and PCI cards. To display system components and their current state you run the following command.

sc> showcomponent
Keys:
/SYS/MB/PCI_MEZZ
/SYS/MB/PCI_MEZZ/PCIE4
/SYS/MB/PCI_MEZZ/XAUI4
/SYS/MB/PCI_MEZZ/PCIE5
/SYS/MB/PCI_MEZZ/XAUI5
/SYS/MB/PCI_MEZZ/PCIE6
/SYS/MB/PCI_MEZZ/PCIE7
/SYS/MB/PCI_MEZZ/PCIE8
/SYS/MB/PCI_MEZZ/PCIE9
..
/SYS/TTYA
Disabled Devices
/SYS/MB/CMP0/L2_BANK0

Once a faulty component has been disabled and after the cause of the fault has been repaired (for example FRU replacement, loose connector reseated), you must remove the component from the ASR blacklist database.

What two options describe the command to remove a disabled component and the name of the database containing the list of all disabled components on the system? (3 Points)

(Choose all correct answers)

enable component asrkey ; reset (*)

clearasrdb

asr-db (*)

asrdb

Sorry, that is not correct. Please review the course content and try again.

6. The following error message has been displayed on a client:

svc:/application/pkg/server:default (image packaging repository)
State: maintenance since June 13, 2013 11:33:59 AM MDT
Reason: Start method failed repeatedly, last exited with status 1.
See: http://support.oracle.com/msg/SMF-8000-KS
See: /var/svc/log/application-pkg-server:default.log
Impact: This service is not running.

This error indicates that the application package server service is in a maintenance state and users can't install a package. You look first for information in this log file:

# tail /var/svc/log/application-pkg-server:default.log

You then run the following commands to make the necessary changes to correct the problem, and clear and refresh the service:

# svccfg -s pkg/server
svc:/application/pkg/server> listprop pkg
# svcadm clear pkg/server
# svcadm refresh pkg/server

Which option describes the probable cause of this error? (2 Points)

Invalid or incorrect property in service.

Problem with IPS server configuration.

Problem with IPS client configuration.

All of the above (*)

Sorry, that is not correct. Please review the course content and try again.

7. A service on the server is disabled and not starting. To debug it you first request information about the failed service by using the following command:

# svcs -xv
svc:/application/pkg/server:default (image packaging repository)
State: maintenance since Mon 30 Jun 2014 08:16:40 AM PDT
Reason: Start method failed repeatedly, exit with status 1.
See: http://support.oracle.com/msg/SMF-8000-KS
See: /var/svc/log/application-pkg-server:default.log
Impact: This service is not running.

In the output, you see that the IPS service has failed to start and has been placed in maintenance state due to repeated startup failures.

Which two options describe the remaining steps to be performed to debug this service that has failed to start? (2 Points)

(Choose all correct answers)

Check the manifest files that completely define a service or an instance located in /lib/svc/manifest or /var/svc/manifest

Read the log associated with the failing service to identify the cause of the failure using cd /var/svc/log and the more command. (*)

Verify the failure by disabling and enabling the failed service using svcadm disable serviceinstance ; svcadm enable serviceinstance (*)

Use /usr/sbin/svcadm -v restart serviceinstance to restart a service that is in degraded state.

Correct!
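
Reading the service log is the first debugging step named above, and the log path is already present in the svcs -xv report. The following sketch pulls it out of that report; smf_log_paths is a hypothetical helper and it assumes the "See:" line layout shown in this question.

```shell
#!/bin/sh
# Sketch: read "svcs -xv" output on stdin and print only the log file
# paths from its "See:" lines (URLs on other "See:" lines are skipped).
# smf_log_paths is an illustrative name, not an SMF command.
smf_log_paths() {
    sed -n 's|^[[:space:]]*See:[[:space:]]*\(/var/svc/log/[^[:space:]]*\).*$|\1|p'
}
```

On a live system the result could be passed straight to tail, for example: tail "$(svcs -xv pkg/server | smf_log_paths)"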

8. A server has been crashing intermittently for unknown reasons. You have asked the customer to start saving the information from the crash in the /var/crash directory so that you can analyze the problem. The crash dump configuration file has the following entries:

Dump content: kernel
Dump device: /dev/dsk/c0t1d0s1 (dedicated)
Savecore directory: /var/crash
Savecore enabled: no

After the most recent crash, the administrator went into the /var/crash directory to look for the dump file but the directory was empty.

Which command would you use to enable the server to store crash dumps in /var/crash on reboot? (2 Points)

# dumpadm -y (*)

# coreadm -d

# dumpadm -n

# dumpadm -u

Correct!
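
The condition that left /var/crash empty can be checked from a script by inspecting the "Savecore enabled" line of the dump configuration. This is a sketch under the assumption that the configuration is printed in the "name: value" layout shown in this question; savecore_enabled is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: read dump configuration text on stdin and succeed (exit 0)
# only if the "Savecore enabled" line says yes.
# savecore_enabled is an illustrative name, not a Solaris command.
savecore_enabled() {
    awk -F': *' '
        /^ *Savecore enabled/ { found = ($2 ~ /^yes/) }   # remember the setting
        END { exit(found ? 0 : 1) }                       # missing line counts as "no"
    '
}
```

On a live system this could guard the fix, for example: dumpadm | savecore_enabled || dumpadm -y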

9. You are notified that a system has panicked because it tried to execute an illtrap instruction at ksyms_open+0x14, as shown in the following output:

> <pc::dis
ksyms_open+0x14: illtrap 0x0

Knowing that the kernel will not overwrite its own code due to permissions on the pages of memory containing kernel code, you deduce two possible reasons for the cause of this panic.

Which two options could have caused this system panic? (2 Points)

(Choose all correct answers)

BAD TRAP occurred due to a NULL pointer.

A serious hardware problem. (*)

Data cannot be used to reconstruct events that lead to the panic.

The kernel branched to a location that contained the instruction NULL. (*)

Correct!

10. While diagnosing peripheral devices using the probe-scsi and probe-scsi-all commands, the SCSI devices on two systems are not detected. These devices are in fact physically attached to the on-board SCSI controllers.

What step would you take to correct this reported problem with the SCSI devices? (2 Points)

Run reset on both systems.

Test the hardware devices attached to the systems with the test-all command.

Use POST to perform diagnostic tests for the hardware components.

Power on all the SCSI devices. (*)

Sorry, that is not correct. Please review the course content and try again.

11. After installing software, the ps command no longer functions. The error message generated includes:

ld.so.1: ps: fatal: libc.so.1: open failed: No such file or directory

Which two options could cause the ps command to no longer function? (2 Points)

(Choose all correct answers)

Corrupted procfs (*)

Wrong permissions set on /bin/passwd

Privileges are set to disallow PRIV_PROC_INFO

Corrupted /usr/bin/ps (*)

Sorry, that is not correct. Please review the course content and try again.

12. After a system reboot, users can't telnet to other systems or do other network-related tasks.

# telnet host68
Trying 192.181.164.61...
telnet: Unable to connect to remote host: Network is unreachable

To check for reasons why the users can't communicate over the network, you use the ipadm and ifconfig commands to make sure the network interface is configured correctly and is plumbed and up.

# ipadm
# ifconfig net0 up

You also check the rc directories to see whether any undesired scripts are running, since legacy rc scripts can still run in addition to SMF.

Which option describes additional steps you could take to resolve the reported problem with the network? (2 Points)

Troubleshoot using svcs -xv to make sure all the network services are enabled; try enabling them by hand.

Create a backup of the faulty system before fixing anything.

Check for any hardware NIC errors using the fmadm faulty command.

Both a. and c. above. (*)

Correct!

13. You want to save a crash dump of the live running Oracle Solaris system without actually rebooting or altering the system in any way. A dedicated dump device was recently configured on the system using the dumpadm command.

Which command would you use to save a live system crash dump? (2 Points)

# savecore -vf

# dumpadm -y -d

# savecore -L (*)

None of the above

Correct!

14. SMF has a notification feature that notifies you through email messages of service state transitions and fault management events. You want to set up a notification to occur if any service state changes from the online state to any other state.

As a first step you have installed the smtp-notify package:

# pkg install service/fault-management/smtp-notify

and now need to enable and then configure the service notifications.

Which option describes the command you would not use when enabling and configuring the service state transition notifications for all services? (2 Points)

# svccfg -s svc:/network/http:apache22 setnotify from-online mailto:root@localhost (*)

# svccfg -s svc:/system/svc/global:default setnotify -g service_transition_state mailto:root@localhost

# svcadm enable svc:/system/fault-management/smtp-notify

# svcs | grep smtp

Correct!

15. A user is logged in as root but still cannot install a package in a non-global zone.

# zlogin web
root@web# pkg install apptrace
pkg install: Could not complete the operation on /var/pkg/lock: read-only filesystem.

You have the user check the settings of the zone, using the following command to look for a specific setting that may cause a read-only file system.

# zonecfg -z web info
zonename: web
zonepath: /zones/web
brand: solaris
autoboot: true
bootargs:
file-mac-profile: strict

The user locates a file-mac-profile property in the output of the command, which has been set to a value of strict. By default, a zonecfg file-mac-profile property is not set in a non-global zone. The default policy for a non-global zone is to have a writable root file system. Knowing this information, you tell the user that this is the desired setting placed on the non-global zone and should not be changed.

Which statement is true when describing the profile strict? (4 Points)

Logging and auditing configuration files can be local.

Permits updates to /var/* directories, and modification of files in /etc/* directories.

Read-only file system, no exceptions. (*)

Permits updates to /var/* directories, with the exception of directories that contain system configuration components.

Correct!
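
The property that explains the read-only file system can be read out of the zone configuration in a script. This is a minimal sketch, assuming the "name: value" layout shown in this question; zone_mac_profile is a hypothetical helper, and it prints nothing when the property is not set.

```shell
#!/bin/sh
# Sketch: read "zonecfg -z <zone> info" output on stdin and print the
# value of the file-mac-profile property, if one is present.
# zone_mac_profile is an illustrative name, not a Solaris command.
zone_mac_profile() {
    awk -F': *' '$1 == "file-mac-profile" { print $2 }'
}
```

On a live system this could be used as: zonecfg -z web info | zone_mac_profile. An empty result corresponds to the default writable root file system described above.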

16. Before actually installing a software package on a Solaris 11 system, you want to check exactly what is going to be installed. In this example, you run the following command to view the installation action of an apptrace package without installing it.

# pkg install -nv apptrace
Packages to install: 1
Estimated space available: 46.27 GB
Estimated space to be consumed: 13.55 MB
Create boot environment: No
Create backup boot environment: No
Rebuild boot archive: No

You determine that there's no issue with installing this package and run the pkg install command to complete the package installation. To verify or validate the installation of the package you run the following command:

# pkg verify -v apptrace
PACKAGE STATUS
pkg://solaris/developer/apptrace OK
#

You decide to go ahead and install the dtrace package on this system too. When the installation completes you verify the installation of this package.

# pkg verify -v dtrace
PACKAGE STATUS
pkg://solaris/system/dtrace ERROR

Which command would you use to correct the dtrace package installation error reported? (3 Points)

pkg fix dtrace (*)

pkg uninstall dtrace

pkg revert dtrace

pkg update --reject dtrace

Correct!
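
When many packages are verified at once, the ones that need attention can be picked out of the report automatically. The sketch below assumes the two-column PACKAGE / STATUS layout shown in this question; packages_with_errors is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: read "pkg verify" output on stdin and print the name of every
# package whose status column is ERROR.
# packages_with_errors is an illustrative name, not a pkg subcommand.
packages_with_errors() {
    awk '$NF == "ERROR" { print $1 }'   # last field is the status, first is the package
}
```

Each name it prints is a candidate for pkg fix, for example: pkg verify | packages_with_errors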

17. While booting a server, the following error message is generated on the console.

Boot device: /pci@9/pci@0/pci@0/pci@1,2/LSTLogic,sad@2/disk0,0:a File and args:
ERROR: boot-read fail
Evaluating:
Can't locate boot device
{0} ok

You know that there are two probable causes for the boot sequence to return to the ok prompt without booting.

Which option describes how this problem could have occurred? (2 Points)

Boot device does not exist on the machine.

Incorrect NVRAM or boot settings.

Boot device is corrupt.

All of the above (*)

Correct!

18. The following error message is displayed on one AI client.

Rebooting with command: boot net:dhcp - install
Boot device: /pci@7c0/pci@0/network@4:dhcp File and args:
1000 Mbps FDX Link up
wanboot info: WAN boot messages->console
wanboot info: Starting DHCP configuration
wanboot info: DHCP configuration succeeded
wanboot progress: wanbootfs: Read 366 of 366 kB (100%)
wanboot info: wanbootfs: Download complete
Tue Aug 5 20:46:43 wanboot alert: miniinfo: Request returned code 500
Tue Aug 5 20:46:44 wanboot alert: Internal Server Error \
(root filesystem image missing)

You know this error occurred because the client cannot find the boot_archive.

Which option describes how you would correct this problem? (3 Points)

Check your DHCP configuration or the contents of the target directory you specified when you ran installadm create-service.

Check the path name and permissions of the boot_archive at $IMAGE/boot/boot_archive (*)

Check your WAN boot configuration.

None of the above

Correct!

19. The Oracle Integrated Lights Out Manager (ILOM) firmware runs on the service processor and is the central software resource for identifying and managing server problems. To actively manage and monitor a server independently of the operating system state, you enter ILOM by logging in and then running an ILOM command to view components that may be faulty on this server.

$ ssh username@SP_ipaddress
Password:
-> enter command here

In this example the ILOM command entered has identified a failed hardware component. In particular, you are shown a memory module fault that has been detected by POST.

Target           | Property          | Value
-----------------+-------------------+-------------------------
/SP/faultmgmt/0  | fru               | /SYS/MB/CMP0/BR1/CH0/D0
/SP/faultmgmt/0  | timestamp         | Jun 2 23:01:32
/SP/faultmgmt/0/ | timestamp         | Jun 2 23:01:32
faults/0         |                   |
/SP/faultmgmt/0/ | sp_detected_fault | /SYS/MB/CMP0/BR1/CH0/D0
faults/0         |                   | Forced fail (POST)

Which command would have been entered to view faulty components? (2 Points)

-> show faulty (*)

-> show /Host/list

-> show /SP/faults_mgmt

-> show faults

Correct!

20. The svc.configd repository daemon for SMF is invoked automatically during system startup, and restarted if any failures occur. When the svc.configd daemon is started, it does an integrity check of the SMF configuration repository. In this example the integrity check failed and svc.configd wrote the following message to the console.

svc.configd: smf(5) database integrity check of:

/etc/svc/repository.db

failed. The database might be damaged or a media error might have prevented it from being verified. Additional information useful to your service provider is in:

/etc/svc/volatile/db_errors

The system will not be able to boot until you have restored a working database. svc.startd(1M) will provide a sulogin(1M) prompt for recovery purposes. The command:

/lib/svc/bin/restore_repository

can be run to restore a backup version of your repository. See http://sun.com/msg/SMF-8000-MY for more information.

You enter maintenance mode and run the restore_repository command, which takes you through the necessary steps to restore a non-corrupt backup.

Which option describes how an SMF repository can become corrupted? (3 Points)

Disk failure

Hardware or Software bug

Accidental overwrite of the file.

All of the above (*)

Correct!

21. Oracle Solaris 11 installations are configured to have a default publisher, solaris, which supplies software packages from the release repository: http://pkg.oracle.com/solaris/release

As the administrator, you can see what configuration a Solaris 11 system has by using the following command:

# pkg publisher
PUBLISHER TYPE   STATUS URI
solaris   origin online http://pkg.oracle.com/solaris/release/

You can also quickly query some basic information about a repository to view the package publishers known by the repository; number of packages for each publisher; when the publisher's package data was last updated; and the status of the publisher's package data, as shown here:

PUBLISHER PACKAGES STATUS UPDATED
solaris   4044     online 2014-06-28T12:17:33.570603Z

Which two options describe the methods that you could use to quickly query some basic information about the release repository to view just the publisher's name, number of packages, status, and last updated timestamp? (2 Points)

(Choose all correct answers)

Use the command: pkgrepo info -s http://pkg.oracle.com/solaris/release/ (*)

Use the command: pkgrepo get -s http://pkg.oracle.com/solaris/release/ -p all

Load the repository URL into your Web browser. (*)

Use the command: pkgrepo list -p http://pkg.oracle.com/solaris/release

Correct!

22. During an Automated Install a SPARC client successfully downloads the boot_archive and boots the Oracle Solaris kernel, but fails to get one of the image archives. The following error message indicates that the solaris.zlib file is causing this problem.

wanboot info: Starting DHCP configuration
wanboot info: DHCP configuration succeeded
wanboot progress: wanbootfs: Read 368 of 368 kB (100%)
wanboot info: wanbootfs: Download complete
Fri Aug 26 16:26:52 wanboot progress: miniroot: Read 221327 of 221327 kB (100%)
Fri Aug 26 16:26:53 wanboot info: miniroot: Download complete

WARNING: i2c_0 failed to add interrupt.
WARNING: i2c_0 operating in POLL MODE only

Hardware watchdog enabled
Remounting root read/write
Probing for device nodes ...
Preparing network image for use
Downloading solaris.zlib
--2011-08-26 23:19:57-- http://10.134.125.136:5555/export/auto_install/175s//solaris.zlib
Connecting to 10.134.125.136:5555... connected.
HTTP request sent, awaiting response... 404 Not Found
2011-08-26 23:19:57 ERROR 404: Not Found.

Could not obtain http://10.134.125.136:5555/export/auto_install/175s//solaris.zlib from install server
Please verify that the install server is correctly configured and reachable from the client
Requesting System Maintenance Mode

Which option describes the conditions that could have caused this fault? (4 Points)

The image path configured in WAN boot is not correct.

The image path does not exist or is incomplete.

Access is denied due to permission issues.

All of the above (*)

Correct!

23. While trying to install a package on a system, the following error message appeared:

# pkg install -nv group/feature/amp
pkg install: The following pattern(s) did not match any allowable packages. Try using a different matching pattern, or refreshing publisher information:
group/feature/amp

You run the following command, which returns nothing:

# pkg search entire

You decide to check and make sure the publisher is refreshed with the most current data, then try to install the package again.

# pkg refresh solaris
# pkg search entire
INDEX ACTION VALUE PACKAGE
pkg.description set Provides for power management support
pkg.fmri set solaris/entire pkg:/[email protected] set entire incorporation including Support Repository Update (Oracle Solaris 11.1.7.2.0). pkg:/[email protected]
# pkg install -nv group/feature/amp
Creating Plan (Evaluating mediators): /
Packages to install: 19
Mediators to change: 1
Estimated space available: 30.54 GB
Estimated space to be consumed: 401.84 MB
Create boot environment: No
Create backup boot environment: No
Services to change: 2
Rebuild boot archive: No

Which two additional steps could also have been taken to quickly troubleshoot the cause of this problem? (2 Points)

(Choose all correct answers)

Check to make sure there is not a typo in the package name. (*)

Use the command pkg variant to display the values of variants that are set with the package.

Check to make sure the publisher is online with the command pkg publisher (*)

Check the package group info with the pkg info -r *group* command.

Correct!

24. You know that the following configuration will cause two core files to be generated and saved when a process in the local zone terminates abnormally.

# coreadm
global core file pattern: /var/core/core.%f.%p
global core file content: all
init core file pattern: core.%f.%p
init core file content: default
global core dumps: enabled
per-process core dumps: enabled
global setid core dumps: disabled
per-process setid core dumps: disabled
global core dump logging: enabled

Which two options describe where these core files would be saved? (2 Points)

(Choose all correct answers)

In the process's current working directory. (*)

In the global zone in /var/core (*)

In the local zone in /var/core

In $HOME/corefiles

Sorry, that is not correct. Please review the course content and try again.
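
To see what file names the two patterns in this configuration produce, the %f (executable name) and %p (process ID) tokens can be expanded by hand. This is only a sketch: expand_core_pattern is a hypothetical helper, it handles just these two tokens, and it does not settle which zone or directory the files end up in.

```shell
#!/bin/sh
# Sketch: expand the %f and %p tokens of a coreadm pattern.
# Arguments: pattern, executable name, process ID.
# expand_core_pattern is an illustrative name, not a Solaris command.
# The executable name must not contain "|" or "&" (sed metacharacters here).
expand_core_pattern() {
    pattern=$1
    exe=$2
    pid=$3
    printf '%s\n' "$pattern" | sed -e "s|%f|$exe|g" -e "s|%p|$pid|g"
}
```

For a process named sleep with PID 1234, the global pattern /var/core/core.%f.%p expands to /var/core/core.sleep.1234, and the per-process pattern core.%f.%p expands to core.sleep.1234, which is a relative name.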

25. In general, after the kernel panics a system, the system reboots. When the kernel panics it drops into the debugger and prints some interesting information. You know that the mdb utility can examine this information to determine the cause of the panic.

After a system crash, you locate the appropriate saved image and then invoke mdb.

# cd /var/crash/`uname -n`
# ls
bounds unix.1 unix.3 vmcore.1 vmcore.3
unix.0 unix.2 vmcore.0 vmcore.2

# mdb -k unix.2 vmcore.3
Loading modules: [ unix genunix specfs dtrace zfs scsi_vhci sd mpt mac px lcd iphook neti arp usba kssl fctl sockfs random mdesc idm cpc crypto fcip fcp ufs logindmux nsmb ptm sppp nfs lofs ipc ]

As a next step, you retrieve a stack backtrace, which shows, in reverse order, all the functions that were active at the time of the panic.

Which option would you use in the mdb debugger to generate a stack backtrace? (4 Points)

