7/30/2019 windows performance
Windows Performance Troubleshooting and Analysis
Daniel Pearson
David Solomon Expert Seminars
Daniel Pearson
Started working with Windows NT 3.51
Three years at Digital Equipment Corporation
Supporting Intel and Alpha systems running Windows NT
Seven years at Microsoft
Senior Escalation Lead in Windows base team
Worked in the Mobile Internet sustained engineering team
Instructor for David Solomon, co-author of the Windows Internals book series
Agenda
Components of performance analysis
Understanding the tools for troubleshooting and analyzing performance issues
Troubleshooting CPU and memory issues using various Windows tools
* Portions of this session are based on material developed by
Mark Russinovich and David Solomon
Components of Performance Analysis
Event Tracing for Windows
Core component of the operating system
Kernel mode data structures
Used to store information about the system and system objects that
can be read by various tools
e.g. dt nt!_KTHREAD KernelTime
CPU performance monitoring events
Refer to the Intel 64 and IA-32 Architectures Software Developer's Manual
http://developer.intel.com/products/processor/manuals
Event Tracing for Windows
Built in to the system
High performance, low overhead and scalable
2.5% CPU usage for a sustained rate of 10,000 events/sec on a 2 GHz CPU [1]
Operations throughout the system that are of interest to performance
are fully instrumented
e.g. process and thread activity, registry I/O, disk I/O
1. Milirud, Michael. 2008. Windows Performance Analysis: Using Windows
Performance Tools. Presented at Microsoft's WinHEC conference, November
5-7, Los Angeles, CA.
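
As a rough back-of-the-envelope reading of the overhead figure quoted on this slide, the percentage can be converted into an average per-event cost. This is only a sketch: it assumes the 2.5% refers to a single 2 GHz processor, and the function name is illustrative.

```python
def cycles_per_event(cpu_hz: float, cpu_fraction: float, events_per_sec: float) -> float:
    """Average CPU cycles spent per logged event."""
    return cpu_hz * cpu_fraction / events_per_sec


# Figures quoted on the slide: 2.5% of a 2 GHz CPU at 10,000 events/sec.
cost = cycles_per_event(2e9, 0.025, 10_000)
print(f"{cost:.0f} cycles per event")                     # 5000 cycles per event
print(f"{cost / 2e9 * 1e6:.1f} microseconds per event")   # 2.5 microseconds per event
```

Under that assumption, each event costs on the order of a few thousand cycles, which is consistent with the "low overhead" claim.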
Event Tracing for Windows
Uses a buffering and logging mechanism implemented in the kernel
Per-processor buffers that are written to disk by an asynchronous writer thread
Ability to enable and disable tracing dynamically
Supports a managed code provider
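
The buffering scheme described on this slide can be pictured with a small user-mode toy model. It is not how the kernel implements ETW: the class name, the buffer capacity and the in-memory list standing in for the trace file are all assumptions made for illustration.

```python
import queue
import threading

BUFFER_CAPACITY = 4  # events per buffer; an arbitrary value for this sketch


class TraceSession:
    """Toy model: one buffer per processor, drained by a writer thread."""

    def __init__(self, processor_count: int):
        self.buffers = [[] for _ in range(processor_count)]
        self.full_buffers = queue.Queue()  # buffers handed off for writing
        self.log = []                      # stands in for the trace file on disk
        self.enabled = True                # tracing can be toggled at run time
        self.writer = threading.Thread(target=self._write_loop)
        self.writer.start()

    def log_event(self, processor: int, event: str) -> None:
        if not self.enabled:
            return
        buf = self.buffers[processor]
        buf.append(event)
        if len(buf) >= BUFFER_CAPACITY:
            # Hand the full buffer to the writer and start a fresh one.
            self.full_buffers.put(buf)
            self.buffers[processor] = []

    def _write_loop(self) -> None:
        while True:
            buf = self.full_buffers.get()
            if buf is None:  # sentinel: the session is stopping
                return
            self.log.extend(buf)

    def stop(self) -> None:
        # Flush partially filled buffers, then stop the writer thread.
        for i, buf in enumerate(self.buffers):
            if buf:
                self.full_buffers.put(buf)
                self.buffers[i] = []
        self.full_buffers.put(None)
        self.writer.join()


session = TraceSession(processor_count=2)
for n in range(10):
    session.log_event(processor=n % 2, event=f"event-{n}")
session.enabled = False           # disable tracing dynamically
session.log_event(0, "dropped")   # ignored while tracing is disabled
session.stop()
print(len(session.log))           # 10
```

The point of the design is that the code logging an event only appends to a local buffer; the slower write happens on a separate thread.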
Sysinternals Utilities
Process Explorer
Useful for displaying which files, registry keys and other objects processes have open and which DLLs they have loaded
Process Monitor
Useful for showing real-time file system, registry and process &
thread activity
Available for download from the TechNet site
http://technet.microsoft.com/sysinternals
Resource Monitor
Included with Windows Vista and greatly enhanced in Windows 7 and Windows Server 2008 R2
Allows the viewing of CPU, memory, disk and network resources as well
as handles and modules in real time
Ability to end, suspend and resume processes as well as to start, stop and restart Windows services
Useful for identifying the highest resource consumers by individual
resource type, e.g. CPU
Able to list the wait chain tree of a process to determine if a process is
waiting on another
Using Resource Monitor
Performance Monitor
Queries performance counters that measure system state or activity
Current values are read at specific intervals
Performance counters are included in the operating system and can be
included as part of applications
Able to collect event trace data from trace providers that report actions
or events
Can combine multiple trace providers into a single session
Configuration information can be collected from registry keys at a
specific time or interval
Using Performance Monitor
Windows Performance Analyzer
Part of the Windows Performance Toolkit
Support for x86, x64, and IA64 architectures
Consists of three primary programs
xperf.exe
Used for controlling tracing and processing trace data
xbootmgr.exe
Automates on and off state transitions and captures traces during
those transitions
xperfview.exe
A graphical trace visualization tool to represent data in the form of
interactive graphs and summary tables
Windows Performance Analyzer
Primarily uses the Event Tracing for Windows infrastructure built in to the system
Can be enabled or disabled at any time without requiring a system or
process restart
Supports symbol decoding, sample profiling, and recording of call
stacks on kernel events
Designed to be used during automation
All the functions of the tools are available via the command line tool
xperf.exe
Support for Earlier Systems
The Windows Performance Toolkit will fail to install on Windows XP and on Windows Server 2003, although data collection is supported
Copy xperf.exe and perfctrl.dll from an existing installation to collect traces on those systems
Trace analysis is only supported on Windows Vista and later systems
Capturing a Performance Trace
Kernel options divided into two parts
Kernel Flags
Identified by the use of uppercase characters
e.g. PROC_THREAD, LOADER, PROFILE
Kernel Groups
Identified by the use of title case characters
e.g. Base, Diag, Latency, FileIO
Kernel Groups are made up of a collection of Kernel Flags
e.g. SysProf = PROC_THREAD+LOADER+PROFILE
Flags and groups are separated by the + token
e.g. xperf.exe -on FileIO+DISK_IO_INIT
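
The relationship between groups and flags can be sketched as a small expansion routine. This is only an illustration of the naming convention, not xperf's own parser: the table below contains just the one group definition given on the slide (SysProf), and the function name is hypothetical.

```python
from typing import List

# Only SysProf's definition comes from the slide; a real system defines more groups.
KERNEL_GROUPS = {
    "SysProf": ["PROC_THREAD", "LOADER", "PROFILE"],
}


def expand_kernel_spec(spec: str) -> List[str]:
    """Expand a '+'-separated list of kernel flags and groups into flags."""
    flags: List[str] = []
    for token in spec.split("+"):
        if token in KERNEL_GROUPS:
            expanded = KERNEL_GROUPS[token]
        elif token.isupper():
            expanded = [token]  # uppercase tokens are treated as kernel flags
        else:
            raise ValueError(f"unknown kernel group: {token}")
        for flag in expanded:
            if flag not in flags:  # keep order, drop duplicates
                flags.append(flag)
    return flags


print(expand_kernel_spec("SysProf+DISK_IO_INIT"))
# ['PROC_THREAD', 'LOADER', 'PROFILE', 'DISK_IO_INIT']
```

A title case token that is missing from the table, such as FileIO here, raises an error rather than being guessed at.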
Merging of Performance Trace Data
Traces can be copied to another system for analysis
The trace file should be merged on the collection system before analysis to include additional system information
xperf -d trace.etl
[Figure: the trace and kernel trace, together with system and symbol information, pass through XPerf to produce the merged trace]
Using the Windows Performance Toolkit
Understanding CPU Activity
Windows uses 32 priority levels
The system implements a preemptive, priority driven scheduler
Priority adjustments can be applied to threads in the dynamic range
At least one of the highest-priority runnable threads will always be running
[Figure: priority scale from 0 to 31; Dynamic range up to 15, Real time range from 16 to 31]
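
The priority ranges and the "highest priority runs" rule can be expressed as a short sketch. The function names are illustrative, and the treatment of level 0 as reserved for the zero page thread is background knowledge rather than something stated on the slide.

```python
def priority_range(level: int) -> str:
    """Name the range a Windows thread priority level falls into."""
    if not 0 <= level <= 31:
        raise ValueError("Windows priority levels run from 0 to 31")
    if level >= 16:
        return "real time"
    if level >= 1:
        return "dynamic"
    return "system"  # assumption: level 0 is reserved for the zero page thread


def pick_next(runnable: dict) -> str:
    """Return the name of the highest-priority runnable thread."""
    return max(runnable, key=runnable.get)


print(priority_range(8))    # dynamic
print(priority_range(24))   # real time
print(pick_next({"worker": 8, "audio": 24, "ui": 10}))  # audio
```

A real scheduler also handles ties, multiple processors and priority boosts; none of that is modeled here.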
Context Switching
A switch from one thread to another is known as a context switch
Switching involves saving the hardware state of a thread and restoring the state of another
When a thread is scheduled, that thread's context switch count is also incremented
The context switch count represents how often a thread begins running, not how long it ran
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by an interrupt timer which is set by
the Hardware Abstraction Layer
Usually at either 10 or 15 msec intervals
Thread execution and context switches that happen between clock intervals are not accounted for
e.g. a thread runs and enters a wait before the clock fires
Thus threads may run but never get charged
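
The quirk can be reproduced with a small simulation of interval-based accounting. This is a simplified model, not Windows code: it assumes a 15 msec clock interval, and the thread names and schedule are invented for illustration.

```python
TICK_MS = 15  # clock interval; the slide cites 10 or 15 msec


def charge_by_tick(schedule, tick_ms=TICK_MS):
    """Charge one full interval to whichever thread is running when the clock fires.

    schedule is a list of (thread_name, run_ms) segments laid end to end.
    Returns (actual_ms, charged_ms) dictionaries keyed by thread name.
    """
    actual, charged = {}, {}
    now = 0
    next_tick = tick_ms
    for name, run_ms in schedule:
        actual[name] = actual.get(name, 0) + run_ms
        charged.setdefault(name, 0)
        end = now + run_ms
        # Every tick that fires during this segment is billed to this thread.
        while next_tick <= end:
            charged[name] += tick_ms
            next_tick += tick_ms
        now = end
    return actual, charged


# T1 runs for 1 msec at the start of each interval, then waits.
schedule = [("T1", 1), ("Idle", 14)] * 4
actual, charged = charge_by_tick(schedule)
print(actual)   # {'T1': 4, 'Idle': 56}
print(charged)  # {'T1': 0, 'Idle': 60}
```

T1 ran for 4 msec in total but is never running when the clock fires, so it is charged nothing.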
Time Accounting Prior to Windows Vista
Windows accounted for CPU time based on the interval clock timer
Thread quantum expiration was not always fair
A thread might get almost no turn
Threads were also charged for interrupts that occurred while they
were running
[Figure: timeline with segments labeled Idle, Idle, T1, T2, T2]
Time Accounting Since Windows Vista
Windows Vista and later read the Time Stamp Counter during every context switch
The actual CPU cycles consumed are charged to a thread
Any interrupt time is not charged to the interrupted thread
Allows for more accurate quantum accounting
A thread gets at least one turn and at most will be given one turn
plus an additional tick
[Figure: timeline with segments labeled Idle, Idle, T1, T1, T2]
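
Cycle-based charging can be sketched as bookkeeping performed at each context switch. This is a simplified model under stated assumptions: the class, its method names and the cycle values are hypothetical, and the counter readings are passed in rather than read from hardware.

```python
class CycleAccounting:
    """Charge threads the cycles between switch-in and switch-out, minus interrupts."""

    def __init__(self):
        self.cycles = {}           # thread name -> cycles charged
        self.interrupt_cycles = 0  # tracked separately, never billed to a thread
        self.current = None
        self.switched_in_at = 0
        self.interrupts_since_switch = 0

    def context_switch(self, new_thread: str, tsc: int) -> None:
        """Record a switch to new_thread at time stamp counter value tsc."""
        if self.current is not None:
            elapsed = tsc - self.switched_in_at - self.interrupts_since_switch
            self.cycles[self.current] = self.cycles.get(self.current, 0) + elapsed
        self.current = new_thread
        self.switched_in_at = tsc
        self.interrupts_since_switch = 0

    def interrupt(self, cycles: int) -> None:
        """Record an interrupt that consumed the given number of cycles."""
        self.interrupt_cycles += cycles
        self.interrupts_since_switch += cycles


acct = CycleAccounting()
acct.context_switch("T1", tsc=0)
acct.interrupt(500_000)                      # interrupt arrives while T1 is running
acct.context_switch("T2", tsc=2_500_000)     # T1 waits after a short run
acct.context_switch("Idle", tsc=30_500_000)
print(acct.cycles)            # {'T1': 2000000, 'T2': 28000000}
print(acct.interrupt_cycles)  # 500000
```

Even a thread that runs briefly and then waits is charged for exactly the cycles it used, and the interrupt time is excluded from its total.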
Troubleshooting High CPU Utilization
Understanding Memory Management
Windows provides two system memory pools
Nonpaged Pool and Paged Pool
Used for system wide persistent data
Prior to Windows Vista, pool sizes were a function of memory size and
whether or not the system was configured as a server or a workstation
Windows Vista introduced the concept of a dynamic system address space
Dynamic System Address Space
In 32-bit Windows Vista and later, virtual memory is assigned as needed
Permits larger paged, nonpaged, and session pools
Components still cannot exceed 2 GB on 32-bit systems
On 64-bit systems, address space regions are configured to their
current maximum limits for all memory sizes
Troubleshooting Memory Leaks
Additional Information
Windows Internals, 5th edition
Windows Performance Analysis Developer Center
http://msdn.microsoft.com/performance
Windows Server Performance Team Blog
http://blogs.technet.com/winserverperformance
Ask the Performance Team Blog
http://blogs.technet.com/askperf
Additional Information
David Solomon Expert Seminars offers training on Windows Internals, both as public and private workshops and as public webinars via the Internet
Currently scheduled upcoming classes
Public workshop in London, April 12th to April 16th
Public webinar, April 26th & April 28th
Public workshop in New York, May 3rd to May 7th
Public workshop in San Francisco, November 8th to November 12th
Visit http://www.solsem.com for further course descriptions and up to
date information