
COMP2221 Networks in Organisations


Richard Henson

May 2013

Week 11 – Troubleshooting & Optimisation

Learning Objectives:
– Explain the principles of troubleshooting as a means of mitigating failure
– Use the various tools available on a named operating system to identify potential faults and problems
– Take appropriate action to stop a fault becoming a failure

“A stitch in time saves nine”

Business - Worst Possible Scenario (1)

There is an interruption in the power supply
– UPS is invoked
– the interruption continues…
– servers all have to be shut down

Power supply restored…
– but main domain controller doesn’t reboot
– no other domain controllers therefore connect to it
– the domain tree fails

Business - Worst Possible Scenario (2)

Organisation cannot do business with the network down…
– server can’t be persuaded to boot
– new main domain controller has to be commissioned
– whole directory tree has to be rebuilt!!!
– word spreads very rapidly…

Business loses so much custom, trust, and credibility that even when it starts doing business again customers choose to go elsewhere
– without a flourishing customer base… the business folds

Analysis: This scenario shouldn’t have occurred…

Unlikely that the server would fail to boot without prior warning…
– warnings would have been presented…
– but were clearly not acted upon!

Disaster recovery plan!?!
– not formulated?
– not tested?
– not effective (in the event of a domain tree controller failure…)?

But it does…

Actual example (15th Feb 2010):
– root domain controller [on the network] had not been backed up for 10 months when it crashed (well… at least it had been backed up at some time…)
– http://searchwindowsserver.techtarget.com/generic/0,295582,sid68_gci1381567,00.html

The consultant called in to fix it reported that:
– “I had never seen a case where the forest root domain had to be recovered -- and I couldn't find anyone who had.”

Analysis: Who is to blame? (1)

In this example, the organisation said they were following Microsoft guidelines
– they set up an empty root domain
– the root domain controller had a RAID-5 (best) disk configuration

Was true, to some extent…
– Microsoft did espouse this as best practice… (in the year 2000!)
– guidelines had changed since then…

Analysis: Who is to blame? (2)

The disaster that struck was:
– two RAID drives failed on the same day!
– unlucky? possible to prepare for this?

The recovery process took about three weeks
– most of the time was spent studying logs, doing the restore, etc.

In this case, the tree was still able to function without a root domain
– business was able to continue
– customer base wasn’t compromised…
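Why two failed drives are fatal to a RAID-5 set can be shown with a small parity sketch. This is a simplified illustration only: it assumes three data blocks and one XOR parity block, with made-up block contents, and ignores the way a real controller rotates parity across the drives.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks striped across three drives (illustrative values).
data = [b"ACCT", b"MAIL", b"USER"]
parity = xor_blocks(data)  # held on a fourth drive in this simplified layout

# One drive lost: XOR of the surviving blocks and the parity rebuilds it.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])

# Two drives lost: the survivors only yield the XOR of the two missing
# blocks, which is not enough to separate them again.
combined = xor_blocks([data[0], parity])
print(combined == xor_blocks([data[1], data[2]]))
```

With a single parity block there is one equation per stripe, so only one unknown block can be solved for. That is why a second failure before the rebuild completes means going back to the backup.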

Fault Tolerance and Risk Assessment

General “common sense” principle:
– always have a backup
– ESPECIALLY for the most important computer on the network…

Q:
– How can you tell what needs backing up?

A:
– Risk Assessment and Risk Management
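The "always have a backup" principle is easier to enforce if something checks backup age routinely. The following is a minimal sketch, assuming a hand-maintained inventory of last successful backup dates. The machine names, dates and the seven-day limit are all hypothetical.

```python
from datetime import date, timedelta

def stale_backups(last_backup, today, max_age_days=7):
    """Return (machine, age in days) for backups older than max_age_days, oldest first."""
    limit = timedelta(days=max_age_days)
    overdue = [(name, (today - when).days)
               for name, when in last_backup.items()
               if today - when > limit]
    return sorted(overdue, key=lambda item: item[1], reverse=True)

# Hypothetical inventory: machine name -> date of last successful backup.
inventory = {
    "root-dc": date(2009, 4, 15),
    "file-server": date(2010, 2, 14),
    "mail-server": date(2010, 1, 20),
}
report = stale_backups(inventory, today=date(2010, 2, 15))
for name, age in report:
    print(f"{name}: last backed up {age} days ago")
```

A report like this would have flagged a domain controller with a months-old backup long before the drives failed.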

Why not Risk Management?

Time consuming!

However, without proper risk management…
– how does the organisation know what processes are most important to its functioning?
– how can an organisation provide resources to protect aspects of its network?

Risk Management and Risk Assessment

Risk Assessment is an essential first step
– requires putting a “value” on assets
– more valuable… greater protection

Do information assets have value?
– organisations still failing to acknowledge that they do…
– categorisation of information assets therefore potentially problematic
– need to look at the consequence to the organisation of losing that asset…
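One common way to turn "consequence of losing the asset" into a priority list is to score each asset and sort. The sketch below assumes a simple consequence × likelihood score on 1-5 scales. The scales, the asset names and the figures are illustrative assumptions, not part of the original slides.

```python
def rank_assets(assets):
    """Sort assets by risk score (consequence x likelihood), highest first."""
    scored = [(name, consequence * likelihood)
              for name, consequence, likelihood in assets]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical register: (asset, consequence of loss 1-5, likelihood of loss 1-5).
register = [
    ("Root domain controller", 5, 2),
    ("Customer database", 5, 3),
    ("Staff intranet pages", 2, 3),
    ("Print server", 1, 4),
]
for name, score in rank_assets(register):
    print(f"{score:>2}  {name}")
```

The assets at the top of the list are the ones that justify the most protection, starting with reliable backups.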

How do you back up a Domain Controller?

The Windows “Backup” program works, and can easily be scheduled
– but heavily criticised…
– even the 2008 server version…

Third Party products give more flexibility and protection e.g.:
– Recovery Manager
» http://www.quest.com/recovery-manager-for-active-directory
– Backup Exec
» http://www.symantec.com/business/products/family.jsp?familyid=backupexec

Prevention is Better than Cure

A server shouldn’t crash unexpectedly!
– should be kept cool (environmental unit mustn’t break down!)
– monitoring should show that unexpected things are happening
– action can then (usually) be taken to take care of the unexpected

Many tools available to:
– Check/monitor the system on a regular basis
– Provide stats to administrators
» could also be used for security purposes
– Generate alerts if something is starting to go wrong…
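The check/report/alert cycle above can be sketched in a few lines. This is an illustration only, not any particular monitoring product: the thresholds are assumed values, the CPU and memory readings are made-up samples, and only the disk figure is actually measured (using the standard library's `shutil.disk_usage`).

```python
import shutil

def check_thresholds(readings, limits):
    """Return an alert string for every reading that exceeds its limit."""
    return [f"ALERT: {name} at {value:.0f}% (limit {limits[name]:.0f}%)"
            for name, value in readings.items()
            if name in limits and value > limits[name]]

def disk_used_percent(path="."):
    """Percentage of the volume holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

# Assumed limits; a real site would set these from its own baseline.
limits = {"disk": 90.0, "cpu": 85.0, "memory": 80.0}
# The disk figure is measured; the CPU and memory figures are made-up samples.
readings = {"disk": disk_used_percent(), "cpu": 97.0, "memory": 41.0}
for alert in check_thresholds(readings, limits):
    print(alert)
```

Run on a schedule, a check like this turns a slowly filling disk into a warning rather than a crash.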

Troubleshooting Tools for a Windows Server: Task Manager

Applications tab:
– shows which applications are running
– enables changing of process priority
» use view/update speed
– used to
» open new applications
» shut rogue applications down

Task Manager (continued)

Processes tab:
– all system processes
– memory usage of each
– % CPU time for each
– total CPU time since boot up
– also used to close a process down
» careful! (but you get a warning…)

Task Manager (continued)

Performance tab:
– total no. of threads, processes, handles running
– Graph: % CPU usage
» User mode
» Kernel mode (optional: view menu)
» graph per CPU (optional: view menu)
– physical (Page File) memory available/usage
– virtual memory available/usage

Event Viewer

Events recorded into “event log” files
– System log
– Auditing log (customisable)
– Application log
– customisable - additional files

New files recorded daily; old ones archived
– time before archiving also customisable

Event Viewer

Three types of events recorded in log:
– Information
– Warning
– Error

More information on each event obtained by double-clicking
– make note of event code
– heed and take action if necessary

Using Event Viewer

Wise to check all event logs regularly
– take time/trouble to find out what those messages really mean…

Take the action that is needed
– sort out potential problems now
– make sure they don’t become real ones later…
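Regular checking is more manageable if Warning and Error events are grouped by source and event code, so that a repeated warning stands out. A minimal sketch follows, assuming the events have already been exported into simple records. The record layout and the sample events are hypothetical.

```python
from collections import Counter

def triage(events):
    """Count Warning and Error events per (source, code, type), most frequent first."""
    counts = Counter((e["source"], e["code"], e["type"])
                     for e in events
                     if e["type"] in ("Warning", "Error"))
    return counts.most_common()

# Hypothetical records, as might be exported from the System log.
events = [
    {"type": "Information", "source": "Service Control Manager", "code": 7036},
    {"type": "Warning", "source": "disk", "code": 51},
    {"type": "Warning", "source": "disk", "code": 51},
    {"type": "Error", "source": "disk", "code": 7},
    {"type": "Warning", "source": "disk", "code": 51},
]
for (source, code, kind), n in triage(events):
    print(f"{n} x {kind} {code} from {source}")
```

A disk warning that keeps recurring is exactly the kind of fault that can be dealt with before it becomes a failure.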

Auditing Further Events

Any “object” can be audited

Objects to audit, and processes audited can be set through audit (group) policy
– Using MMC & relevant snap-in

Types of process audited:
– access
– attempt to access

Security auditing

Same principles as general auditing

Refers to “restricted” objects

Events appear in separate security log

Event Management software (SIEM)

Who’s going to look at all these log files?
– in practice, often no-one…

Solution – SIEM software to analyse and present information from:
– network and security devices
– identity & access management applications
– vulnerability management/policy compliance tools
– OS, database & application logs
– external threat data

http://www.focus.com/briefs/how-select-security-information-and-event-management-siem
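At its simplest, the "analyse and present" step means pulling records from several sources into one timeline. The sketch below shows only that merging step, assuming each source has already been normalised to time-sorted (timestamp, source, message) tuples. The records are invented, and a real SIEM product does far more (correlation rules, alerting, retention).

```python
import heapq

def merge_sources(*sources):
    """Merge time-sorted (timestamp, source, message) streams into one timeline."""
    return list(heapq.merge(*sources))

# Hypothetical, already-normalised records from three sources.
firewall = [("2013-05-01T09:00:05", "firewall", "blocked inbound 3389"),
            ("2013-05-01T09:02:10", "firewall", "blocked inbound 3389")]
os_log = [("2013-05-01T09:01:00", "os", "failed logon: admin"),
          ("2013-05-01T09:01:30", "os", "failed logon: admin")]
database = [("2013-05-01T09:03:00", "database", "login succeeded: admin")]

timeline = merge_sources(firewall, os_log, database)
for stamp, source, message in timeline:
    print(stamp, source, message)
```

Read as one sequence, the blocked connections, failed logons and eventual successful login tell a story that none of the three logs shows on its own.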

Performance Monitor

Not available on disk

To obtain and download Performance Monitor Wizard (PerfWiz), visit the following Web site:
– http://www.microsoft.com/downloads/details.aspx?FamilyID=31fccd98-c3a1-4644-9622-faa046d69214&displaylang=en

What if the machine doesn’t boot…

Tools available:
– The boot error itself
» blue screen? driver software
» constant reboot? motherboard
– Last Known Good…
» Gives machine a chance to go back to the previous (usually last but one) configuration

What if the machine doesn’t boot… (continued)

Safe Mode
– includes VGA Mode or boot logging
– Debugging mode also available
» output difficult to decipher for non-experts

Recovery Console
– “DOS-type prompt” for performing minor repairs

What if the machine doesn’t boot… (continued)

System Configuration Utility (Msconfig.exe)
– automates the routine troubleshooting steps relating to Windows configuration issues
– can be used to modify the system configuration and troubleshoot the problem using a process-of-elimination method
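The process-of-elimination method can be carried out by halving: enable only half of the suspect startup items, reboot, and keep whichever half still shows the fault. The sketch below models that procedure. It assumes exactly one item is at fault, and the item names and the `boots_ok` check are hypothetical stand-ins for a real reboot-and-observe step.

```python
def find_culprit(items, boots_ok):
    """Halve the set of enabled items until one faulty item remains.

    Assumes exactly one item in `items` causes the fault, and that
    boots_ok(enabled) reports whether the machine starts cleanly with
    only `enabled` switched on.
    """
    suspects = list(items)
    while len(suspects) > 1:
        half = suspects[: len(suspects) // 2]
        # If the first half alone boots cleanly, the fault is in the rest.
        suspects = suspects[len(half):] if boots_ok(half) else half
    return suspects[0]

# Hypothetical startup items; the lambda stands in for a real reboot test.
startup = ["antivirus", "backup agent", "print spooler", "old driver helper", "updater"]
faulty = "old driver helper"
print(find_culprit(startup, lambda enabled: faulty not in enabled))
```

Halving needs roughly log2(n) reboots rather than n, which matters when every test means restarting a server.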

What if the machine doesn’t boot… (continued)

Emergency Repair Disk (ERD)
– reboot machine using different media
» e.g. floppy disk (yes… still possible)
– media should be generated BEFORE it needs to be used!
– option to create the ERD during the set up process…

What if the machine doesn’t boot… (continued)

Full restore
– assumes a full backup has already been made
– still have to:
» reformat hard disk from scratch…
» and then restore the backup files using backup/restore option…
– but better than losing all your data!

Optimisation…

All about improving the performance of system resources…

A network manager should never have “nothing to do…”

