Debugging XenApp & XenDesktop

Post on 14-Jan-2016


description

Debugging XenApp & XenDesktop. Lalit Kaushal, Escalation Engineer EMEA. Agenda: Overview of Common Components, Troubleshooting Utilities, Common Issues, Troubleshooting Tips.

transcript

Debugging XenApp & XenDesktop

Lalit Kaushal

Escalation Engineer EMEA

• Overview of Common Components

• Troubleshooting Utilities

• Common Issues

• Troubleshooting Tips

Agenda

Overview of Common Components

Putting It All Together

SAN

XenServer

XenApp

PVS

Active Directory with roaming profiles

Desktop Delivery Controller

Virtual Machines

Authenticate

Find “best” virtual desktop

Start VM

PXE-boot VM and stream OS

Register

Connect using ICA

Acquire license and determine settings

Log in

Apply profile

Deliver apps

Full range of authentication methods supported through web interface technology

Full support for SmartAccess and ICA session policies

Common Components

• ICA Client

• Web Interface

• Active Directory

• XML

• IMA

• DDC/ZDC (Although roles are a bit different)

Troubleshooting Utilities

• Understand the problem

• Where is the problem?
  • Network
  • Server (all servers / one server)
  • Client (one client machine / one client version / client type)
  • Data Store problem (corruption / inconsistency / configuration)

Before you begin

Where to start?

• Collect Information
  • Frequency? Can I reproduce?
  • Determine Possible Causes/Effects
  • Get dumps, logs

• Tools
  • Determine necessary tools
  • Create a Setup

• Debug
  • Tools and Information to solve problem

Solving the problem

• Determine accurate reproduction steps

• Find appropriate starting point to debug

• Crashes – Determine state (using global, stack, etc.)

• Debug against working model

• Use appropriate tools

What tools are available?

• WINDBG – Windows Debugger

• CDFControl – CDF Tracing

• FILEMON – File Monitoring

• REGMON – Registry Monitoring

• PROCEXP – Process Explorer

• SYSTEMDUMP

CDFControl

• Combines Filemon and Regmon

Process Monitor

• Process Explorer shows the handles and DLLs that processes have opened or loaded

• Helpful to troubleshoot:
  • Memory Optimization issues
  • Application Streaming
  • Access issues

• Process Explorer is available from Microsoft

Process Explorer

SYN packet (synchronize): start of a TCP session. After the three-way handshake (SYN, SYN-ACK, ACK), the ICA session initialisation packets are transmitted.

RST packet (reset): something has gone wrong; the TCP session failed or was closed in an unhandled manner.

FIN packet (finish): the session is being closed in a handled manner.

PSH packet (push): data is being sent directly to the receiving process.

ACK packet (acknowledge): the packet was received successfully by the remote device.

Network Trace - Packets
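The flag meanings above can be applied mechanically when reading a network trace. The following is a minimal sketch, assuming you have already extracted the TCP flag byte of each packet in a session (for example from a Wireshark export); the function names are illustrative and not part of any Citrix or Microsoft tool.

```python
# Standard bit values of the TCP header flags.
TCP_FLAGS = [(0x02, "SYN"), (0x10, "ACK"), (0x08, "PSH"), (0x01, "FIN"), (0x04, "RST")]


def decode_flags(flag_byte):
    """Return the names of the flags set in one TCP flag byte."""
    return [name for bit, name in TCP_FLAGS if flag_byte & bit]


def classify_session_end(flag_bytes):
    """Classify how a captured session ended.

    "reset"  - an RST was seen (unhandled closure)
    "closed" - a FIN was seen and no RST (handled closure)
    "open"   - neither was seen in the capture
    """
    saw_fin = False
    for flags in flag_bytes:
        if flags & 0x04:  # RST
            return "reset"
        if flags & 0x01:  # FIN
            saw_fin = True
    return "closed" if saw_fin else "open"


handshake = [0x02, 0x12, 0x10]  # SYN, SYN-ACK, ACK
print([decode_flags(f) for f in handshake])
print(classify_session_end(handshake + [0x18, 0x11, 0x10]))  # data, then FIN-ACK
print(classify_session_end(handshake + [0x18, 0x14]))        # data, then RST-ACK
```

A session that ends with "reset" is the case worth chasing: the next question is which side sent the RST.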

Session (Network trace)

Start of a session.

• • • End of session

Dumps

• User Mode versus Kernel Mode

• The Windows operating system can be conceptually divided into 2 parts:
  • User Space (User Mode)
  • Kernel Space (Kernel Mode)

• Applications run in User Mode

• System drivers run in Kernel Mode (Privileged Mode)

Debugging

[Diagram: USER SPACE holds the user applications; KERNEL SPACE holds drivers such as keyboard.sys, win32k.sys, tcpip.sys and rusb2w2k.sys]

[…]

• Microsoft definition: BSOD is a Fatal Exception Error or System failure

• Fatal exception errors:
  • Access to an illegal instruction has been encountered
  • Invalid data or code has been accessed
  • The privilege level of an operation is invalid

• In most cases the exception is non-recoverable

• Dumps system memory to a file for debugging
  • Memory.dmp is placed on the System Drive
  • Requires free space equivalent to physical RAM + approx. 12 MB

Dump and Logs - BSOD
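The free-space rule for Memory.dmp (physical RAM plus roughly 12 MB) is easy to check before a crash happens. This is a small sketch of that arithmetic only; the 12 MB figure is the approximation quoted above, and the helper names and the example drive path are assumptions for illustration.

```python
import shutil

DUMP_OVERHEAD_MB = 12  # approximate overhead quoted in the slide


def required_dump_space_mb(physical_ram_mb):
    """Free space needed for a complete memory dump, in MB."""
    return physical_ram_mb + DUMP_OVERHEAD_MB


def has_room_for_dump(physical_ram_mb, free_mb):
    """True if the given free space can hold the dump."""
    return free_mb >= required_dump_space_mb(physical_ram_mb)


def free_space_mb(path):
    """Free space on the volume containing `path`, in whole MB."""
    return shutil.disk_usage(path).free // (1024 * 1024)


print(required_dump_space_mb(4096))  # 4 GB of RAM
```

On a server you might call `has_room_for_dump(4096, free_space_mb("C:\\"))`, substituting the real RAM size and system drive.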

• User dump – process memory
  • Live dump (snapshot)
  • Post-mortem dump (after crash)

• Kernel dump – OS kernel memory
  • Manual dump
  • Post-mortem dump (after BSOD)

• Complete dump – physical memory (kernel memory + processes)
  • Manual dump
  • Post-mortem dump (after BSOD)

Dumps & Logs - Types of Dumps

• Dr Watson
  • Debugger generates a log file (Drwtsn32.log) and a user dump (user.dmp) when an application exception or program error occurs
  • The log file is cumulative; user.dmp is overwritten
  • Set as the default debugger: drwtsn32.exe -i

• User Dump
  • Generates a memory dump of a specific process
  • Microsoft Knowledge Base Article 241215

User Dump

WINDOWS TASK MANAGER CAN CAPTURE USER DUMPS IN VISTA & 2008!!!

TestDefaultDebugger - CTX111901

• Can generate a dump from a session

• No keyboard required

• Command line option available

• 32 / 64 bit

SystemDump - CTX111072

Description saved in dump

• Citrix DumpCheck (Explorer Extension)

DumpCheck - CTX108825

Common Issues

Common Problems

• Server\Application Crash

• Server\Application Hang

• CPU Spikes

• Web Interface Debugging

Server\Application Crash

• Some method of capturing the fault is needed
  • Ntsd - http://support.citrix.com/article/ctx108173
  • Windbg - http://support.citrix.com/article/ctx107528
  • Userdump - http://support.microsoft.com/kb/241215
  • Dr Watson - http://support.citrix.com/article/ctx103209
  • WER - http://www.microsoft.com/whdc/maintain/StartWER.mspx

• Verify your chosen method works
  • TestDefaultDebugger - http://support.citrix.com/article/ctx111901

• Have one of these methods enabled

Capturing Application Crash Dumps

• Use a tool to analyze crash dumps
  • http://www.microsoft.com/whdc/devtools/debugging/debugstart.mspx
  • Latest version is part of the WDK (620 MB download)
  • Earlier versions are available as standalone downloads

Debugging tools for Windows

WinDbg

Symbols – Huh?

• .PDB – Program Database
  • Generated during compilation of the application by the vendor
  • Necessary to translate memory into something human readable
  • 11010101001010101 = helloworld()

• Microsoft symbol server (essential)
  • http://msdl.microsoft.com/download/symbols

• Citrix symbols
  • ftp://ftp.citrix.com/
  • http://ctxsym.citrix.com/symbols
  • SRV*c:\symcache*http://msdl.microsoft.com/download/symbols;SRV*c:\symcache*http://ctxsym.citrix.com/symbols

Symbols

Tell Windbg where to find the symbols
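The symbol path shown earlier follows a regular pattern: one `SRV*<local cache>*<server URL>` entry per symbol server, joined with semicolons. A short sketch that assembles it, assuming the `c:\symcache` cache folder and the two server URLs from the slide (the function name is illustrative):

```python
def build_symbol_path(cache_dir, servers):
    """Join one SRV*cache*url entry per symbol server with semicolons."""
    return ";".join("SRV*{}*{}".format(cache_dir, url) for url in servers)


servers = [
    "http://msdl.microsoft.com/download/symbols",  # Microsoft symbol server
    "http://ctxsym.citrix.com/symbols",            # Citrix symbol server
]
print(build_symbol_path(r"c:\symcache", servers))
```

The resulting string can be pasted into the WinDbg symbol path dialog or set in the `_NT_SYMBOL_PATH` environment variable.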

• Can use a similar method for kernel or user dump analysis
  • !analyze -v
  • lmv m suspiciousmodule
  • Update suspiciousmodule to the latest version
  • Search for the stack trace to see whether it is a known issue

• Look at stack functions
  • Understand what the code was trying to do when it crashed

Analyzing crashes

Systemdump_400000 makes a call into USER32

• Component names
  • DLL
  • EXE
  • SYSTEM Driver

Systemdump_400000 makes a call into ntdll

Read upwards

• The top of the stack is the last function executed
  • What caused the crash

• Look for non-core OS components
  • Core OS modules are usually not at fault
  • Closest to the top of the stack
  • Treat them as suspicious

• Find out via the lmv command
  • Version
  • Owner
  • Timestamp

Review of the stack
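The stack review rule (walk from the top of the stack and stop at the first module that is not a core OS component) can be sketched in a few lines. This assumes the frames have been copied out of the debugger as `module!function+offset` strings, top of stack first; the set of core modules and the sample frames are illustrative assumptions, not an authoritative list.

```python
# Assumed, non-exhaustive set of core OS modules (compared case-insensitively).
CORE_OS_MODULES = {"ntdll", "kernel32", "user32", "gdi32", "nt", "hal", "win32k"}


def module_of(frame):
    """Extract the module name from 'module!function+0x10' or 'module+0x10'."""
    head = frame.split("!", 1)[0]
    return head.split("+", 1)[0].strip()


def first_suspect(frames, core=CORE_OS_MODULES):
    """Return the non-core module closest to the top of the stack, or None."""
    for frame in frames:  # frames[0] is the last function executed
        module = module_of(frame)
        if module.lower() not in core:
            return module
    return None


stack = [
    "ntdll!KiFastSystemCallRet",
    "USER32!NtUserMessageCall+0xc",
    "Systemdump_400000+0x1a2b",
    "kernel32!BaseProcessStart+0x23",
]
print(first_suspect(stack))
```

The module it returns is the one to inspect with `lmv m <module>` for version, owner and timestamp.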

Case Study: Using WinDbg to analyze IMA Crash

Case Study

• Issue Reported
  • IMA frequently stopped unexpectedly on several servers in the farm

• Data Collected
  • Collected User Dump

• Analysis Done
  • !analyze -v
  • lmv m <modulename>

• Resolution
  • Uninstall Oracle Client 9.2 and update to 10.2

Server\Application Hangs

Server Hangs

• Dumps are not created automatically
  • Full memory dumps are most useful

• Need to force a dump
  • Systemdump - http://support.citrix.com/article/CTX111072

• If server is not fully hung
  • Keyboard - http://support.microsoft.com/kb/244139/en-us
  • Hardware NMI Switch
  • Configure for full memory dump instead of kernel

Analyzing Server Hangs

• Automatic analysis
  • !analyze -v -hang
  • Not 100% reliable for full memory dumps
  • lmv m suspectmodulename

• Check for locks
  • The 3 step programme with two new commands
  • !locks
  • Look for exclusive waiters
  • Notice contention count
  • Look at the owner thread code
  • !thread <threadID>

• Force a crash of the process
  • userdump.exe - http://support.citrix.com/article/ctx466627
  • Vista/2008 – Available from Task Manager

• Same WinDbg commands again

• Automatic analysis usually good
  • !analyze -v -hang
  • Try to understand what the code is doing from function names
  • Might have to chase the hang from one process to another

Analyzing application hangs

Case Study: Using WinDbg to analyze Server hang

Case Study

• Issue Reported
  • XenApp server is hanging during logon

• Data Collected
  • Collected Kernel Dump

• Analysis Done
  • !analyze -v -hang
  • !locks

• Resolution
  • Involved Microsoft and recommended relevant Microsoft Hotfix

CPU Spikes

CPU Spikes

• Try to define a pattern (leverage perfmon)

• Determine offending Thread ID causing the spike (Process Explorer, QSlice)

• Obtain UserDump of offending process immediately after (Userdump.exe, WinDbg.exe)

• !runaway
  • WinDbg command to view thread times
  • Topmost thread is the one to investigate

• Use application spy to look at what the application is doing (TracePlus, Logger)

ProcDump – New Microsoft Tool!!!

• Microsoft command-line utility• To monitor an application for CPU Spikes• Generate a dump during spikes

• usage: procdump [-64] [[-c CPU usage] [-u] [-s seconds]] [-n exceeds] [-e] [-h] [-m commit usage] [-ma] [-o] [-r] [-t] < <process name or PID> [dump file]] | [-x <image file> <dump file> [arguments]>

• C:\>procdump -c 20 -n 3 -o pnamain c:\dump\pnamain

ProcDump
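To make the idea of dumping during a CPU spike concrete, here is a conceptual sketch of a sustained-threshold trigger. It is not ProcDump's implementation: it simply models "write a dump once CPU stays at or above a threshold for a number of consecutive seconds, up to a maximum number of dumps", loosely mirroring the roles of the `-c`, `-s` and `-n` options in the usage line. The function name and sample data are hypothetical.

```python
def spike_triggers(samples, threshold, seconds, max_dumps):
    """Return the sample indexes at which a dump would be written.

    samples   - one CPU percentage per second
    threshold - CPU percentage that counts as a spike
    seconds   - consecutive seconds at/above threshold needed to trigger
    max_dumps - stop after this many dumps
    """
    triggers = []
    run = 0  # length of the current run of samples at/above threshold
    for index, cpu in enumerate(samples):
        run = run + 1 if cpu >= threshold else 0
        if run >= seconds:
            triggers.append(index)
            run = 0  # start counting again after each dump
            if len(triggers) == max_dumps:
                break
    return triggers


cpu_per_second = [5, 25, 30, 10, 40, 45, 50, 60, 5]
print(spike_triggers(cpu_per_second, threshold=20, seconds=2, max_dumps=3))
```

Short bursts that drop below the threshold before the window elapses never trigger, which is why a consecutive-seconds setting helps avoid dumps from momentary blips.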

ProcDump Demo

Debugging WI

Problem categories - UI

- Verify HTML code
- Firebug
- IE inspector
- IE developer tools

- “This looks wrong”
- Accessibility
- Section 508
- Browsers’ incompatibilities

Problem categories - Logic

- Weird or counter-intuitive behaviour
- Spec says different thing
- Configuration issues

- WI trace file
- Event log
- Live HTTP headers

Problem categories - Communication

- When it’s not WI’s fault
- New features
- Performance issues

- Capturing traffic (Wireshark, Fiddler)
- Capturing ICA file

Troubleshooting Tips

• Isolate the problem!
  • Does the issue affect Farm / Server / User?
    • Farm – Try new farm / Data Store
    • Server – Try different server / Clean build
    • User – Try new user or Administrator
  • Does the issue affect ALL users?
  • Is it the same in Fixed Window as Seamless?
  • Does the problem happen via RDP?

• 5 Why’s?

Troubleshooting Tips

Authentication Issues

• What type of Authentication is configured?

• Does Application Enumeration work?

• Does Explicit\Prompt authentication work?

• Is Kerberos enabled?

• Capture Network Traffic

Licensing Issues

• What’s the SA Date and is it valid for current Product?

• Licenses are not product-specific (2007.0131)

• What do the LMC and the ‘lmstat -a’ command show?

• Can you Telnet to the LS from the XenApp box, and vice versa?

• Is the customer using a Citrix options (Citrix.opt) file?

• Is the product going into the Grace period? If not, what’s the error?

• License Acquisition Error 500?
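The Telnet check in the list above only tests whether a TCP connection to the license server can be opened, so it can be scripted when Telnet is not installed. A minimal sketch follows; the host name is hypothetical, and port 27000 is assumed here as the commonly used license manager port, so confirm the port configured in your environment.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, name not resolved
        return False


# Hypothetical usage from the XenApp server:
#   can_connect("licsrv.example.local", 27000)
```

Run it in both directions (XenApp server to license server and back) to match the "vice versa" part of the check.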

Session Disconnection - Define the issue

Is the issue server-side or client-side?

Where is the network reset packet originating from (client/server)?

Use network and/or CDF tracing to identify the type of disconnection that is occurring.

If the issue is intermittent, enable connection auditing events to define the rate of the issue, and follow up with users.

If client-side, check for network outages and process or device failures.

Narrow down the steps required to reproduce the issue.

Rule out keep-alives, other components, and timeouts.

Map out patterns (users always disconnected when shadowing, etc.).

Identify whether the issue is related to users or groups, a subset of servers, a network segment, a build, or reported outages.

Session Disconnection

WsxBrokenConnection: Reason=2, Source=2

Source: which side of the connection caused the disconnection:

1 = Client, 2 = Server

Reason: why the disconnection occurred.

wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2242 CDF_INFO WsxBrokenConnection: From WD: RequestedBPP: 0, SessionBPP: 0, Reason: 0 (overwriting context SessionBPP: 0)
wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2260 CDF_INFO WsxBrokenConnection: Reason=2, Source=2
wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2292 CDF_INFO WSXICA: BrokenConnection not terminate
wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2299 CDF_INFO WsxBrokenConnection: open event Global\WFSHELL_DISCONNECT_1
wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2313 CDF_INFO WSXICA: SetEvent disconnect
wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2323 CDF_INFO WSXICA: release disconnect semaphore: Global\CPSVC_DISCONNECT_1
wsxica 17896 04/17/2007 08:02:18.636 license.c 335 CDF_INFO ReleaseLicense: saved LogonId=1, fLimitChecksDone=1

User’s session was disconnected from CMC

CDF Tracing (wsxBrokenConnection)
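When a CDF trace has been exported to text, the `WsxBrokenConnection: Reason=..., Source=...` lines can be pulled out with a short script rather than read by eye. The sketch below assumes the line layout shown in the trace excerpt above and uses only the Source mapping given in the slide (1 = Client, 2 = Server); Reason codes are reported as raw numbers because their meanings are not listed here.

```python
import re

SOURCE_NAMES = {1: "Client", 2: "Server"}  # mapping from the slide
BROKEN_RE = re.compile(r"WsxBrokenConnection: Reason=(\d+), Source=(\d+)")


def parse_broken_connections(lines):
    """Return one dict per WsxBrokenConnection Reason/Source line."""
    results = []
    for line in lines:
        match = BROKEN_RE.search(line)
        if match:
            reason, source = int(match.group(1)), int(match.group(2))
            results.append({
                "reason": reason,
                "source": source,
                "side": SOURCE_NAMES.get(source, "Unknown"),
            })
    return results


trace = [
    "wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2260 CDF_INFO "
    "WsxBrokenConnection: Reason=2, Source=2",
    "wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2313 CDF_INFO "
    "WSXICA: SetEvent disconnect",
]
print(parse_broken_connections(trace))
```

Counting the results by side over a longer trace gives a quick view of whether disconnections are mostly server-initiated or client-initiated.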

Disconnection matrix

• BTG to the rescue!

• Topics include all current Citrix Products

• Ensures basic information is collected

• Helps to narrow down technical issues

• Faster resolution times

• Your feedback counts!

Brief Troubleshooting Guide

Additional Information

• Citrix XenApp 5.0 for Windows Server 2008 Administrator's Guide: KB Article CTX115519

• Getting Started with Citrix XenApp 5.0: KB Article CTX116418

• Brief Troubleshooting Guide: KB Article CTX106727

• Troubleshooting Tools For Your Citrix Environment: KB Article CTX107572

• Citrix XenApp 5.0 Installation Guide: KB Article CTX116573

• http://support.citrix.com/proddocs/

• Session surveys are available online at www.citrixsummit.com starting Thursday, 7 October
  • Provide your feedback and pick up a complimentary gift card at the registration desk

• Download presentations starting Friday, 15 October, from your My Organiser Tool located in your My Synergy Microsite event account

Before you leave…