Debugging XenApp & XenDesktop
Lalit Kaushal
Escalation Engineer EMEA
• Overview of Common Components
• Troubleshooting Utilities
• Common Issues
• Troubleshooting Tips
Agenda
Overview of Common Components
Putting It All Together
SAN
XenServer
XenApp
PVS
Active Directory with roaming profiles
DesktopDelivery Controller
Virtual Machines
Authenticate
Find “best” virtual desktop
Start VM
PXE-boot VM and stream OS
Register
Connect using ICA
Acquire license and determine settings
Log in
Apply profile Deliver apps
Full range of authentication methods supported through web interface technology
Full support for SmartAccess and ICA session policies
Common Components
• ICA Client
• Web Interface
• Active Directory
• XML
• IMA
• DDC/ZDC (Although roles are a bit different)
Troubleshooting Utilities
• Understand the problem
• Where is the problem?
  • Network
  • Server (all servers / one server)
  • Client (one client machine / one client version / client type)
  • Data Store problem (corruption / inconsistency / configuration)
Before you begin
Where to start?
• Collect Information
  • Frequency? Can I reproduce?
  • Determine Possible Causes/Effects
  • Get dumps, logs
• Tools
  • Determine necessary tools
  • Create a Setup
• Debug
  • Tools and Information to solve problem
Solving the problem
• Determine accurate reproduction steps
• Find appropriate starting point to debug
• Crashes – Determine state (using global, stack, etc.)
• Debug against working model
• Use appropriate tools
What tools are available?
• WINDBG – Windows Debugger
• CDFControl – CDF Tracing
• FILEMON – File Monitoring
• REGMON – Registry Monitoring
• PROCEXP – Process Explorer
• SYSTEMDUMP
CDFControl
• Combines Filemon and Regmon
Process Monitor
• Process Explorer shows the handles and DLLs that processes have opened or loaded
• Helpful to troubleshoot: • Memory Optimization issues• Application Streaming• Access issues
• Process Explorer is available from Microsoft
Process Explorer
Synchronize Packet (SYN): Start of a TCP session. Three-way handshake (SYN, SYN-ACK, ACK). ICA session initialisation packets are transmitted next
Reset Packet (RST): Something has gone wrong, TCP session failed, unhandled closure of session
Finish Packet (FIN): Session is being closed in a handled manner
Push Packet (PSH): Data is being sent to the receiving process directly
Ack Packet (ACK): Packet was received successfully by the remote device
Network Trace - Packets
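When reading a raw trace, the packet types above correspond to bits in the TCP header flag byte. The following is a minimal sketch of decoding that byte; the function names `decode_tcp_flags` and `describe_packet` are hypothetical helpers for illustration, not part of any Citrix or capture tool.

```python
# Bit values of the TCP flags named on this slide (standard TCP header layout).
TCP_FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
}


def decode_tcp_flags(flag_byte):
    """Return the names of the flags set in a TCP header flag byte."""
    return [name for name, bit in TCP_FLAGS.items() if flag_byte & bit]


def describe_packet(flag_byte):
    """Give a rough, slide-level description of a packet from its flags."""
    flags = set(decode_tcp_flags(flag_byte))
    if "RST" in flags:
        return "reset: session failed or was closed in an unhandled way"
    if "SYN" in flags and "ACK" in flags:
        return "handshake step 2 (SYN-ACK)"
    if "SYN" in flags:
        return "handshake step 1 (SYN)"
    if "FIN" in flags:
        return "handled close (FIN)"
    if "PSH" in flags:
        return "data pushed to the receiving process"
    if "ACK" in flags:
        return "acknowledgement"
    return "no flags of interest"


if __name__ == "__main__":
    for value in (0x02, 0x12, 0x10, 0x18, 0x11, 0x04):
        print(hex(value), decode_tcp_flags(value), describe_packet(value))
```

In a disconnection case, the packet carrying RST and its sender's address are usually the first things worth locating in the trace.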
Session (Network trace)
Start of a session.
• • • End of session
Dumps
• User Mode versus Kernel Mode
• The Windows operating system can be conceptually divided into 2 parts:• User Space (User Mode)• Kernel Space (Kernel Mode)
• Applications run in User Mode
• System drivers run in Kernel Mode (Privileged Mode)
Debugging
[Diagram: user mode versus kernel mode. User space holds the user applications; kernel space holds system drivers such as keyboard.sys, win32k.sys, tcpip.sys, rusb2w2k.sys, […]]
• Microsoft definition: BSOD is a Fatal Exception Error or System failure
• Fatal exception errors:
  • Access to an illegal instruction has been encountered
  • Invalid data or code has been accessed
  • The privilege level of an operation is invalid
• In most cases the exception is non-recoverable
• Dumps system memory to a file for debugging
  • Memory.dmp is placed on the System Drive
  • Requires free space equivalent to physical RAM + approx 12MB
Dump and Logs - BSOD
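The free-space rule above (physical RAM plus roughly 12 MB) is simple arithmetic, sketched below. The function names are hypothetical, and the 12 MB overhead is the approximate figure from the slide rather than an exact value.

```python
def required_dump_space_mb(physical_ram_mb, overhead_mb=12):
    """Free space needed on the system drive for a complete Memory.dmp.

    overhead_mb defaults to the approximate 12 MB quoted on the slide.
    """
    return physical_ram_mb + overhead_mb


def has_room_for_dump(free_space_mb, physical_ram_mb):
    """True if the system drive can hold a complete memory dump."""
    return free_space_mb >= required_dump_space_mb(physical_ram_mb)


if __name__ == "__main__":
    ram_mb = 4096  # example: a server with 4 GB of physical RAM
    print("Space needed (MB):", required_dump_space_mb(ram_mb))
    print("Enough room with 4000 MB free?", has_room_for_dump(4000, ram_mb))
```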
• User dump – process memory
  • Live dump (snapshot)
  • Post-mortem dump (after crash)
• Kernel dump – OS kernel memory
  • Manual dump
  • Post-mortem dump (after BSOD)
• Complete dump – physical memory (kernel memory + processes)
  • Manual dump
  • Post-mortem dump (after BSOD)
Dumps & Logs - Types of Dumps
• Dr Watson
  • Debugger generates a log file (Drwtsn32.log) and a user dump (user.dmp) when an application exception or program error occurs
  • Log file is cumulative; user.dmp is overwritten
  • Set as the default debugger: drwtsn32.exe -i
• User Dump
  • Generates a memory dump of a specific process
  • Microsoft Knowledge Base Article 241215
User Dump
WINDOWS TASK MANAGER CAN CAPTURE USER DUMPS IN VISTA & 2008!!!
TestDefaultDebugger - CTX111901
• Can generate a dump from a session
• No keyboard required
• Command line option available
• 32 / 64 bit
SystemDump - CTX111072
Description saved in dump
• Citrix DumpCheck (Explorer Extension)
DumpCheck - CTX108825
Common Issues
Common Problems
• Server\Application Crash
• Server\Application Hang
• CPU Spikes
• Web Interface Debugging
Server\Application Crash
• Some method of capturing the fault is needed
  • Ntsd - http://support.citrix.com/article/ctx108173
  • Windbg - http://support.citrix.com/article/ctx107528
  • Userdump - http://support.microsoft.com/kb/241215
  • Dr Watson - http://support.citrix.com/article/ctx103209
  • WER - http://www.microsoft.com/whdc/maintain/StartWER.mspx
• Verify your chosen method works
  • TestDefaultDebugger - http://support.citrix.com/article/ctx111901
• Have one of these methods enabled
Capturing Application Crash Dumps
• Use the tool to analyze crash dumps
  • http://www.microsoft.com/whdc/devtools/debugging/debugstart.mspx
  • Latest version is part of the WDK (620 MB download)
  • Earlier versions are available as a standalone download
Debugging tools for Windows
WinDbg
Symbols – Huh?
• .PDB – Program Database
  • Generated during compilation of the application by the vendor
  • Necessary to translate memory into something human readable
  • 11010101001010101 = helloworld()
• Microsoft symbol server - Essential
  • http://msdl.microsoft.com/download/symbols
• Citrix symbols
  • ftp://ftp.citrix.com/
  • http://ctxsym.citrix.com/symbols
  • SRV*c:\symcache*http://msdl.microsoft.com/download/symbols;SRV*c:\symcache*http://ctxsym.citrix.com/symbols
Symbols
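The symbol path on this slide follows the pattern `SRV*<local cache>*<server URL>`, with entries joined by semicolons. The sketch below assembles that string; `build_symbol_path` is a hypothetical helper, and the cache directory and URLs are the ones from the slide.

```python
def build_symbol_path(cache_dir, servers):
    """Join symbol servers into a single SRV*cache*url;SRV*cache*url string."""
    return ";".join(f"SRV*{cache_dir}*{url}" for url in servers)


SYMBOL_SERVERS = [
    "http://msdl.microsoft.com/download/symbols",  # Microsoft public symbols
    "http://ctxsym.citrix.com/symbols",            # Citrix symbols
]

if __name__ == "__main__":
    print(build_symbol_path(r"c:\symcache", SYMBOL_SERVERS))
```

The resulting string can be supplied to WinDbg with the `.sympath` command or through the `_NT_SYMBOL_PATH` environment variable.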
Tell Windbg where to find the symbols
• Can use a similar method for Kernel or User Dump analysis
  • !analyze -v
  • lmv m suspiciousmodule
  • Update suspiciousmodule to the latest version
  • Search for whether the stack trace is already known
• Look at stack functions
  • Understand what the code was trying to do when it crashed
Analyzing crashes
Systemdump_400000 makes a call into USER32
• Component names
  • DLL
  • EXE
  • SYSTEM Driver
Systemdump_400000 makes a call into ntdll
Read upwards
• The top of the stack is the last function executed
  • What caused the crash
• Look for non core OS components
  • Core OS modules are usually not at fault
  • Closest to the top of the stack
  • Treat them as suspicious
• Find out via lmv command
  • Version
  • Owner
  • Timestamp
Review of the stack
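The review rule above (read from the top of the stack and treat the first non core OS module as suspicious) can be sketched as a small filter. The list of core OS modules and the module names in the example are assumptions made for illustration; a real list depends on the Windows version being debugged.

```python
# Assumption: a short, illustrative set of core OS modules. Extend as needed.
CORE_OS_MODULES = {"nt", "hal", "ntdll", "kernel32", "user32", "win32k"}


def first_suspect(stack_modules):
    """Return the first non core OS module, reading from the top of the stack.

    stack_modules lists the module of each frame, top of stack first.
    Returns None if every frame belongs to a core OS module.
    """
    for module in stack_modules:
        if module.lower() not in CORE_OS_MODULES:
            return module
    return None


if __name__ == "__main__":
    # Hypothetical stack: "vendordrv" and "someapp" are made-up module names.
    stack = ["ntdll", "USER32", "vendordrv", "kernel32", "someapp"]
    print("Check with lmv m:", first_suspect(stack))
```

The returned name is only a starting point; the `lmv m <module>` output (version, owner, timestamp) decides whether the module is actually out of date.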
Case Study: Using WinDbg to analyze IMA Crash
Case Study
• Issue Reported
  • IMA frequently stopped unexpectedly on several servers in the farm
• Data Collected
  • Collected User Dump
• Analysis Done
  • !analyze -v
  • lmv m <modulename>
• Resolution
  • Uninstall Oracle Client 9.2 and update to 10.2
Server\Application Hangs
Server Hangs
• Dumps are not created automatically
  • Full memory dumps are most useful
• Need to force a dump
  • Systemdump - http://support.citrix.com/article/CTX111072
• If server is not fully hung
  • Keyboard - http://support.microsoft.com/kb/244139/en-us
  • Hardware NMI Switch
  • Configure for full memory dump instead of kernel
Analyzing Server Hangs
• Automatic analysis
  • !analyze -v -hang
  • Not 100% reliable for full memory dumps
  • lmv m suspectmodulename
• Check for locks
  • The 3 step programme with two new commands
  • !locks
  • Look for exclusive waiters
  • Notice contention count
  • Look at the owner thread code
  • !thread <threadID>
• Force a crash of the process
  • userdump.exe - http://support.citrix.com/article/ctx466627
  • Vista/2008 – Available from Task Manager
• Same windbg commands again
• Automatic analysis usually good
  • !analyze -v -hang
  • Try to understand what the code is doing from the function names
  • Might have to chase the hang from one process to another
Analyzing application hangs
Case Study: Using WinDbg to analyze Server hang
Case Study
• Issue Reported
  • XenApp server is hanging during logon
• Data Collected
  • Collected Kernel Dump
• Analysis Done
  • !analyze -v -hang
  • !locks
• Resolution
  • Involved Microsoft and recommended the relevant Microsoft Hotfix
CPU Spikes
• Try to define a pattern (leverage perfmon)
• Determine offending Thread ID causing the spike (Process Explorer, QSlice)
• Obtain UserDump of offending process immediately after (Userdump.exe, WinDbg.exe)
• !runaway
  • WinDbg command to view thread times
  • Topmost thread is the one to investigate
• Use application spy to look at what the application is doing (TracePlus, Logger)
ProcDump – New Microsoft Tool!!!
• Microsoft command-line utility
  • To monitor an application for CPU spikes
  • Generate a dump during spikes
• usage: procdump [-64] [[-c CPU usage] [-u] [-s seconds]] [-n exceeds] [-e] [-h] [-m commit usage] [-ma] [-o] [-r] [-t] < <process name or PID> [dump file]] | [-x <image file> <dump file> [arguments]>
• C:\>procdump -c 20 -n 3 -o pnamain c:\dump\pnamain
ProcDump
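The example command above can be assembled programmatically when the same capture has to be repeated on several servers. This is a minimal sketch: `procdump_cpu_command` is a hypothetical helper that uses only the `-c`, `-n` and `-o` switches shown in the usage text, and it assumes procdump is on the PATH when the command is eventually run.

```python
def procdump_cpu_command(process, dump_path, cpu_percent, dump_count, overwrite=True):
    """Build the argument list for a CPU-triggered ProcDump capture.

    Mirrors the slide's example: procdump -c 20 -n 3 -o pnamain c:\\dump\\pnamain
    """
    args = ["procdump", "-c", str(cpu_percent), "-n", str(dump_count)]
    if overwrite:
        args.append("-o")  # overwrite an existing dump file
    args += [process, dump_path]
    return args


if __name__ == "__main__":
    command = procdump_cpu_command("pnamain", r"c:\dump\pnamain", 20, 3)
    print(" ".join(command))
    # To run it on a Windows server (assumption: procdump is installed):
    #   import subprocess
    #   subprocess.run(command, check=True)
```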
ProcDump Demo
Debugging WI
Problem categories - UI
[Tabs: UI | Logic | Communication]
- Verify html code
- Firebug
- IE inspector
- IE developer tools
- “This looks wrong”
- Accessibility
- Section 508
- Browsers’ incompatibilities
Problem categories - Logic
[Tabs: UI | Logic | Communication]
- Weird or counter-intuitive behaviour
- Spec says a different thing
- Configuration issues
- WI trace file
- Event log
- Live http headers
Problem categories - Communication
[Tabs: UI | Logic | Communication]
- When it’s not WI’s fault
- New features
- Performance issues
- Capturing traffic (Wireshark, Fiddler)
- Capturing ICA file
Troubleshooting Tips
• Isolate the problem!
• Does the issue affect Farm / Server / User?
  • Farm – Try new farm / Data Store
  • Server – Try different server / Clean build
  • User – Try new user or Administrator
• Does the issue affect ALL users?
• Is it the same in Fixed Window as Seamless?
• Does the problem happen via RDP?
• 5 Whys?
Troubleshooting Tips
Authentication Issues
• What type of Authentication is configured?
• Does Application Enumeration work?
• Does Explicit\Prompt authentication work?
• Is Kerberos enabled?
• Capture Network Traffic
Licensing Issues
• What’s the SA Date, and is it valid for the current Product?
• Licenses are not Product specific (2007.0131)
• What are the LMC and the ‘LMSTAT -a’ command showing?
• Are you able to Telnet to the LS from the XenApp box, and vice versa?
• Is the customer using a Citrix Options (Citrix.opt) file?
• Is the product going into the Grace period? If not, what’s the error?
• License Acquisition Error 500?
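The Telnet check above only proves that a TCP connection to the license server can be opened. A minimal sketch of the same test follows; `can_connect` is a hypothetical helper, the host name in the usage comment is made up, and port 27000 is assumed to be the license server's listening port, so substitute whatever port the environment actually uses.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example usage (hypothetical host name; port 27000 is an assumption):
#   can_connect("licsrv01.example.local", 27000)
```

Running the check in both directions, as the slide suggests, helps separate a firewall rule from a service that is simply not listening.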
Session Disconnection - Define the issue
• Is the issue server-side or client-side?
• Where is the network reset packet originating from (client/server)?
• Use Network and/or CDF tracing to identify the type of disconnection that is occurring
• If the issue is intermittent, enable Connection auditing events to define the rate of the issue and follow up with users
• If client-side, check for network outages and process or device failures
• Narrow down the steps required to reproduce the issue
• Rule out Keep-alives, other components, and timeouts
• Map out patterns (users always disconnected when shadowing, etc.)
• Identify whether the issue is related to a user or group, a subset of servers, a network segment, a build, or reported outages
Session Disconnection
WsxBrokenConnection: Reason=2, Source=2
Source: which side of the connection the disconnection came from (1 = Client, 2 = Server)
Reason: why the disconnection occurred
wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2242 CDF_INFO WsxBrokenConnection: From WD: RequestedBPP: 0, SessionBPP: 0, Reason: 0 (overwriting context SessionBPP: 0)
wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2260 CDF_INFO WsxBrokenConnection: Reason=2, Source=2
wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2292 CDF_INFO WSXICA: BrokenConnection not terminate
wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2299 CDF_INFO WsxBrokenConnection: open event Global\WFSHELL_DISCONNECT_1
wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2313 CDF_INFO WSXICA: SetEvent disconnect
wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2323 CDF_INFO WSXICA: release disconnect semaphore: Global\CPSVC_DISCONNECT_1
wsxica 17896 04/17/2007 08:02:18.636 license.c 335 CDF_INFO ReleaseLicense: saved LogonId=1, fLimitChecksDone=1
User’s session was disconnected from CMC
CDF Tracing (wsxBrokenConnection)
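When a CDF trace has many sessions, pulling the Reason and Source values out of each WsxBrokenConnection line makes patterns easier to see. The sketch below is a hypothetical parser that assumes the line layout shown on this slide and decodes only the Source values the slide documents (1 = Client, 2 = Server); Reason codes are returned as raw numbers because their meanings come from the disconnection matrix.

```python
import re

# Source values as documented on the slide.
SOURCE_NAMES = {1: "Client", 2: "Server"}

# Matches the "Reason=<n>, Source=<n>" form, not the "From WD: ... Reason: 0" line.
BROKEN_CONNECTION = re.compile(r"WsxBrokenConnection: Reason=(\d+), Source=(\d+)")


def parse_broken_connection(line):
    """Extract reason and source from a CDF trace line, or return None."""
    match = BROKEN_CONNECTION.search(line)
    if match is None:
        return None
    reason = int(match.group(1))
    source = int(match.group(2))
    return {
        "reason": reason,
        "source": source,
        "source_name": SOURCE_NAMES.get(source, "Unknown"),
    }


if __name__ == "__main__":
    sample = (
        "wsxica 17896 04/17/2007 08:02:18.636 wsxica.c 2260 CDF_INFO "
        "WsxBrokenConnection: Reason=2, Source=2"
    )
    print(parse_broken_connection(sample))
```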
Disconnection matrix
• BTG to the rescue!
• Topics include all current Citrix Products
• Ensures basic information is collected
• Helps to narrow down technical issues
• Faster resolution times
• Your feedback counts!
Brief Troubleshooting Guide
Additional Information
• Citrix XenApp 5.0 for Windows Server 2008 Administrator's Guide - KB Article CTX115519
• Getting Started with Citrix XenApp 5.0 - KB Article CTX116418
• Brief Troubleshooting Guide - KB Article CTX106727
• Troubleshooting Tools For Your Citrix Environment - KB Article CTX107572
• Citrix XenApp 5.0 Installation Guide - KB Article CTX116573
• http://support.citrix.com/proddocs/
• Session surveys are available online at www.citrixsummit.com starting Thursday, 7 October
• Provide your feedback and pick up a complimentary gift card at the registration desk
• Download presentations starting Friday, 15 October, from your My Organiser Tool located in your My Synergy Microsite event account
Before you leave…