
Tips and Tricks for Creating Good VMs: It's Not as Easy as You'd Think!

Greg Shields, Partner and Principal Technologist, Concentrated Technology (www.ConcentratedTech.com)

This slide deck was used in one of our many conference presentations. We hope you enjoy it, and invite you to use it within your own organization however you like.

For more information on our company, including information on private classes and upcoming conference appearances, please visit our Web site, www.ConcentratedTech.com.

For links to newly-posted decks, follow us on Twitter: @concentrateddon or @concentratdgreg

This work is copyright © Concentrated Technology, LLC

30 Awesome Tiplets for Getting the Most Out of Your Virtual Machines

Greg Shields, Partner and Principal Technologist, Concentrated Technology (www.ConcentratedTech.com)

#1: Purchase Compatible Hardware

…and not just "compatible with ESX." Purchase hardware that is compatible with each other.
– Particularly considering vMotion needs.

#2: Buy Nehalem/Opteron

Intel Nehalem and AMD Opteron include support for the Intel EPT / AMD RVI processor extensions.
– Together, referred to as Second Level Address Translation, or SLAT.
– Includes hardware-assisted memory management unit (MMU) virtualization.
– Significantly faster for certain workloads, such as those with heavy context switching.
– Finally, full support for Remote Desktop Services / XenApp.

Note that these support Large Memory Pages, which will disable ESX's page table sharing.

#3: Mind NIC Oversubscription

One of the greatest benefits of iSCSI is its linear scalability.
– Need more throughput? Just add another NIC!

However, VLANs and link aggregation introduce the notion of NIC oversubscription.
– Ceteris paribus, storage traffic >>> regular traffic.

Even with VLANs, always segregate storage NICs from production networking NICs.
– If possible/affordable, use segregated network paths.
– Monitor! Oversubscription will kill your performance faster than anything!

#4: Consider Further Segregating Heavy Workloads

Some VMs run workloads that make heavy use of their attached disks.
– Consider segregating these workloads onto their own independent NICs and paths.
– Keep an eye on your IOPS.

#5: vSphere 4.0 VMs Don't Back Up Applications Correctly!

vSphere 4.1 added full support for Microsoft VSS on Server 2008 guests.
– This support is only automatic if the guest was initially created on a vSphere 4.1 host.
– Hosts upgraded from vSphere 4.0 aren't properly backing up their applications.

Fix this by setting disk.EnableUUID to True (one scripted way is sketched below):
– Power off the machine.
– Edit Settings | Options | General | Configuration Parameters | Add Row
– Power on the machine.
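
A minimal sketch of that change, assuming the pyVmomi library and a reachable vCenter; the host name, credentials, and VM name below are placeholders, not anything from the deck:

```python
# Hedged example: set disk.EnableUUID = TRUE on a powered-off VM via pyVmomi.
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim
import ssl

ctx = ssl._create_unverified_context()            # lab shortcut; verify certs in production
si = SmartConnect(host='vcenter.example.com', user='administrator',
                  pwd='password', sslContext=ctx)
content = si.RetrieveContent()

# Simplified lookup; assumes the VM name 'MyGuest' is unique in the inventory.
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == 'MyGuest')

spec = vim.vm.ConfigSpec(extraConfig=[
    vim.option.OptionValue(key='disk.EnableUUID', value='TRUE')])
vm.ReconfigVM_Task(spec)                          # power the VM off first, then back on after
Disconnect(si)
```

The later sketches in this deck assume the same connection and lookup scaffolding ('si' and 'vm').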

#6: HBA Max Queue Depth

One solution for poor Fibre Channel storage performance can be adjusting your HBA's maximum queue depth.
– A deeper queue can mean more performance, but fewer cross-device and cross-VM optimizations.
– 32 by default.
– This is not a task to be taken lightly.
– Kind of like adjusting the air/fuel mix on a carburetor.

It's a multi-step process. See http://kb.vmware.com/kb/1267 for details.

#7: Consider Hardware iSCSI

…but, perhaps, don’t buy them…

ESX's software iSCSI initiator works well.
– However, using it incurs a small processing overhead.
– Hardware iSCSI NICs offload this overhead to the card.
– NFS/NAS storage also experiences this behavior.

Newer NICs reduce this effect, specifically those with:
– Checksum offload
– TCP segmentation offload (TSO)
– 64-bit DMA addressing
– Multiple scatter-gather elements per Tx frame
– Jumbo frames

#8: Set NICs to Autonegotiate

VMware's recommendation is to set all NICs to autonegotiate, full duplex.
– This is sort of "duh" these days.
– But it's worth mentioning, because…
– Some old-school network admins still prefer to manually set speed/duplex due to a crazy old race-condition bug that happened a long, long time ago.
– Just smack around those old coots.
– "You can take your Token Ring and your IPX and go home now!"

#9: Do Not Team Storage NICs

What? Don't team them?
– Well, I guess I mean "team" in the classic sense of network teaming.

Remember that storage NICs leverage MPIO for link aggregation.
– MPIO is a superior technology to link aggregation for storage anyway.
– 'Tis also easier to use, and better for routing!

vCenter's GUI wizards make this hard not to do, but be aware that extra steps are required…

#10: Enable Hyperthreading

Early in ESX's days we debated whether hyperthreading improved or decreased overall performance.
– That debate is over. The winner is "improved."

Today, hyperthreading adds a non-linear additional quantity of processing capacity.
– Think 20-30% (???), not a full extra proc.

But you know this.
– Enable it in your servers' BIOS.
– Just turn it on, OK?

#11: Allocate Only the CPUs You Need

Allocate only as many vCPUs as a VM requires (a right-sizing sketch follows this list).
– Start with only one as your baseline. Rarely deviate. Circle this bullet point. No, really.
– Don't use dual vCPUs for a single-threaded application.
– Don't assign more vRAM than necessary.

More vCPUs equals more problems.
– More vCPUs equals more interrupts.
– Extra overhead in maintaining a consistent memory view between vCPUs. This is tough, especially with today's descheduled processing.
– Some OSs migrate single-threaded workloads between multiple CPUs, adding a performance tax.

More CPUs are good for handling CPU spikes.
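
A minimal right-sizing sketch, assuming pyVmomi and the same 'vm' object as the earlier disk.EnableUUID example; the numbers are placeholders, not recommendations from the deck:

```python
from pyVmomi import vim

# Hedged example: start the VM small and grow only when measurements justify it.
spec = vim.vm.ConfigSpec(
    numCPUs=1,            # baseline of one vCPU, as the slide suggests
    memoryMB=2048)        # size RAM to the workload, not to habit
vm.ReconfigVM_Task(spec)  # power the VM off first unless hot add is enabled
```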

#12: Disconnect Unused Physical Hardware Devices

COM, LPT, USB, floppy, CD/DVD, NICs, etc. all consume interrupt resources.
– High-priority resources.
– It is a big deal to insert a CD/DVD/USB.

Connected Windows guests will poll CD/DVD drives very frequently, significantly affecting performance.
– Disconnect these in VM properties when not in use (or script it, as sketched below).
– There's a reason why the "Connected" checkbox exists!
– Note! Connected devices can prevent a vMotion!
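
A hedged sketch, again assuming pyVmomi and the 'vm' object from earlier, that unchecks "Connected" on any attached CD/DVD device:

```python
from pyVmomi import vim

# Disconnect connected CD/DVD devices and keep them from reconnecting at power-on.
changes = []
for dev in vm.config.hardware.device:
    if isinstance(dev, vim.vm.device.VirtualCdrom) and dev.connectable.connected:
        dev.connectable.connected = False
        dev.connectable.startConnected = False
        changes.append(vim.vm.device.VirtualDeviceSpec(
            operation=vim.vm.device.VirtualDeviceSpec.Operation.edit, device=dev))

if changes:
    vm.ReconfigVM_Task(vim.vm.ConfigSpec(deviceChange=changes))
```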

#13: Upgrade to VM Version 7

Virtual hardware version 7 offers some very significant performance improvements.
– VMXNET3 paravirtualized NIC driver
– PVSCSI paravirtualized SCSI driver
– Upgrade VMware Tools. Reboot. Then upgrade the hardware (sketched below).
– (More on these in a minute.)

Note that VMv7 hardware cannot be vMotioned to ESX servers prior to 4.0.
– Be careful of this.
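
One possible way to do the hardware upgrade itself, assuming pyVmomi and the 'vm' object from earlier:

```python
# Hedged example: upgrade the virtual hardware to version 7. The upgrade is
# one-way, so take a backup and update VMware Tools before running this.
print('current hardware version:', vm.config.version)   # e.g. 'vmx-04'
vm.UpgradeVM_Task(version='vmx-07')                      # VM must be powered off
```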

#14: Don't Fear Scaling Out

Creating VMs is easy, so we create them.
– You'll eventually run out of CPU resources.
– You'll probably run out of RAM first.
– Don't run more VMs than you have processing/memory capacity for.

When running very close to capacity, use CPU reservations to guarantee 100% CPU availability for the console.
– Host | Configuration | System Resource Allocation | Edit
– Particularly important if you have software installed there.
– This is unnecessary on ESXi.

#15: 80% is Nice

VMware recommends maintaining an administrative ceiling on utilization at 80%.
– This reserves enough capacity for failover and the service console.
– VMware suggests that 90% should be a warning for overconsumption.

Less dynamic workloads can shift this ceiling up.
– …but, seriously, who can really state that?

#16: With Older OSs, Use UP HAL When Possible

Newer OSs (Vista, Windows 7, Server 2008) use the same HAL for all UP/SMP conditions.

Older OSs leverage two HALs:
– A uniprocessor (UP) HAL
– A multiprocessor (SMP) HAL

An SMP HAL that is only given a single vCPU will run slightly slower.
– Slightly more synchronization code.

Note that this will impact hot add.

#17: Mind Scheduling Affinity

It is possible to pin a VM to a particular pProc.
– Good for ensuring that VM has processing resources during contention.
– Setting Core Sharing to None prevents any other vProc from using a pProc on the same core. Like disabling HT.
– Setting Core Sharing to Internal prevents vProcs on other VMs from using a pProc on the same core. Only the same VM may.
– Just set this to Any. (If you must pin, one scripted form is sketched below.)
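
For completeness, a hedged pyVmomi sketch of CPU affinity, using the 'vm' object from earlier; the slide's advice still stands: leave this alone unless you have a very specific reason.

```python
from pyVmomi import vim

# Pin the VM's vCPUs to physical CPUs 2 and 3 (placeholder values).
spec = vim.vm.ConfigSpec(cpuAffinity=vim.vm.AffinityInfo(affinitySet=[2, 3]))
vm.ReconfigVM_Task(spec)
```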

#18: Don't Touch This Setting.

Exceptionally rare are the cases when this setting should be adjusted.
– So, no touchy.
– I will tell you when.
– I have very reasonable consulting rates.

#19: Don't Just Keep Up the Old (and Dumb) Habit of Assigning 4GB of RAM to Every Stinking Virtual Machine, No Matter What Workload it Runs. Really.

Consciously consider the amount of RAM that a VM needs, and assign it that RAM.
– Yes, VMware has memory ballooning.
– But overallocating unnecessarily increases VM overhead.
– Ballooning isn't automatic. Ballooning is slow. Ballooning is reactive.

Note to Self: Talk about Hyper-V’s Dynamic Memory here. Very cool.

#20: Stop with the Snapshots

Snapshots are (were) a significant selling point in the early days of virtualization.
– About to do something risky? Snapshot! It's like a career protection device!

However, snapshots aren't (and never were) meant for long-term storage.
– And I mean "no more than just a few minutes" long.
– They're not meant for backups.
– Reverting to an aged snapshot can break the computer's trust relationship with the Windows domain.
– Managing snapshots, particularly linked ones, significantly reduces overall VM performance. (An age-audit sketch follows this list.)
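
A hedged audit sketch, assuming pyVmomi and the 'vm' object from earlier, that flags snapshots old enough to be a problem; the one-day threshold is an illustrative choice, not from the deck:

```python
from datetime import datetime, timedelta, timezone

def all_snapshots(snap_list, found=None):
    """Flatten the snapshot tree into a simple list."""
    found = [] if found is None else found
    for snap in snap_list:
        found.append(snap)
        all_snapshots(snap.childSnapshotList, found)
    return found

if vm.snapshot:
    for snap in all_snapshots(vm.snapshot.rootSnapshotList):
        age = datetime.now(timezone.utc) - snap.createTime
        if age > timedelta(days=1):
            print(f"{vm.name}: snapshot '{snap.name}' is {age.days} day(s) old")
```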

#21: Perform vSphere Tasks in the Off Hours

Some vSphere tasks are actually quite impactful on VM operations:
– Provisioning virtual disks
– Cloning virtual machines
– svMotion
– Manipulating file permissions
– Backups
– Anti-virus (bleh)

Do these tasks during off hours, or you may impact performance for other running VMs.

#22: Mind Affinities

Some VMs need to regularly communicate with each other with high throughput.
– "Keep Virtual Machines Together"
– Make sure these machines share the same vSwitch.
– Collocation forces inter-VM traffic through the system bus rather than pNICs, significantly increasing speed.

Losing several VMs at once could be bad if they're collocated on the same host.
– "Separate Virtual Machines"
– (A DRS rule sketch follows.)
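
One hedged way to create the "Keep Virtual Machines Together" rule programmatically, assuming pyVmomi, a 'cluster' object of type vim.ClusterComputeResource, and two VM objects vm_a and vm_b (all placeholder names):

```python
from pyVmomi import vim

# Add a VM-VM affinity rule; use vim.cluster.AntiAffinityRuleSpec instead for
# "Separate Virtual Machines".
rule = vim.cluster.AffinityRuleSpec(name='keep-web-and-db-together',
                                    enabled=True, vm=[vm_a, vm_b])
spec = vim.cluster.ConfigSpecEx(
    rulesSpec=[vim.cluster.RuleSpec(operation='add', info=rule)])
cluster.ReconfigureComputeResource_Task(spec, modify=True)
```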

#23: Disable Screen Savers

And window animations.

Screen savers represent a machine interrupt, particularly those with heavy graphics.
– "Pipes," I'm looking right at you!
– This interrupt is particularly impactful on collocated VMs.
– …and, plus, screen savers on servers are sooooo 2002.

#24: Use NTP, not VMware Tools, for Time Sync

…and here's one out of the odd files…

VMware suggests configuring VMs to sync time from an external NTP server.
– They prefer this even over their own internal timekeeper.
– Their timekeeper uses a much lower resolution than NTP.
– NTP = milliseconds
– NT5DS = 1 second
– VMware Tools = ?
– (A sketch for turning off Tools time sync follows.)
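
A minimal sketch for disabling VMware Tools host time sync so the guest's NTP (or Windows Time) client wins, assuming pyVmomi and the 'vm' object from earlier; NTP itself is configured inside the guest and is not shown here:

```python
from pyVmomi import vim

# Stop VMware Tools from syncing the guest clock to the host.
spec = vim.vm.ConfigSpec(tools=vim.vm.ToolsConfigInfo(syncTimeWithHost=False))
vm.ReconfigVM_Task(spec)
```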

#25: Never Use PerfMon Inside the VM, Except…

Not that you'd ever actually use PerfMon, but…
– Measuring performance from within a virtual machine fails to take unscheduled time into account.
– Essentially, when the ESX server isn't servicing the VM, no time passes within that VM.
– Also, in-VM PerfMon doesn't recognize virtualization overhead.
– Most important, in-VM PerfMon can't see down into layers below the VM: storage, processing, etc.

VMware Tools adds PerfMon counters to VMs.
– These are OK to use, as they're synched from ESX.
– (For the host-side view, see the sketch below.)
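
A hedged sketch of reading the host-side numbers through the vSphere API instead of in-guest PerfMon, again assuming pyVmomi and the 'vm' object from earlier:

```python
# quickStats reflects what ESX actually scheduled and allocated for the VM.
qs = vm.summary.quickStats
print('CPU usage (MHz):          ', qs.overallCpuUsage)
print('Guest active memory (MB): ', qs.guestMemoryUsage)
print('Host consumed memory (MB):', qs.hostMemoryUsage)
print('Ballooned memory (MB):    ', qs.balloonedMemory)
```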

#26: Paravirtualization is Your Friend

VM Hardware Version 7 adds two new paravirtualized drivers.
– VMXNET3 replaces E1000
– PVSCSI replaces BusLogic/LSILogic

Paravirtualized drivers are superior to emulation.
– They are "aware" they've been virtualized and can work directly with the host without needing emulation.
– Mexican menus versus French menus.

VMXNET3 supports TSO & Jumbo Frames, in the VM!
– Even if the physical hardware doesn't support TSO!
– (A device-add sketch follows.)
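
A hedged sketch of adding a VMXNET3 adapter with pyVmomi, using the 'vm' object from earlier; the port group name 'VM Network' is a placeholder:

```python
from pyVmomi import vim

# Build a VMXNET3 NIC attached to a standard-vSwitch port group.
nic = vim.vm.device.VirtualVmxnet3()
nic.backing = vim.vm.device.VirtualEthernetCard.NetworkBackingInfo(
    deviceName='VM Network')
nic.connectable = vim.vm.device.VirtualDevice.ConnectInfo(
    startConnected=True, connected=True)

spec = vim.vm.ConfigSpec(deviceChange=[
    vim.vm.device.VirtualDeviceSpec(
        operation=vim.vm.device.VirtualDeviceSpec.Operation.add, device=nic)])
vm.ReconfigVM_Task(spec)   # the guest also needs a current VMware Tools for the driver
```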

#27: Turn on Jumbo Frames, but Do It Everywhere

If you plan to use Jumbo Frames…
– MTU size is usually set to 9000.
– Make sure you enable it everywhere.
– This brings a particular assist with large file transfers (think WDS, virtual disk provisioning, etc.) and storage connections.

Not all network equipment supports Jumbo Frames.
– Test, test, test. (One way to set the vSwitch side is sketched below.)
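
A hedged sketch of the vSwitch piece only, assuming pyVmomi, a 'host' object of type vim.HostSystem, and a standard vSwitch named 'vSwitch1' (a placeholder); the physical switches, the storage array, and any VMkernel ports must be raised to MTU 9000 as well:

```python
# Raise the MTU on one standard vSwitch to 9000.
ns = host.configManager.networkSystem
for vsw in ns.networkInfo.vswitch:
    if vsw.name == 'vSwitch1':
        spec = vsw.spec
        spec.mtu = 9000
        ns.UpdateVirtualSwitch(vswitchName='vSwitch1', spec=spec)
```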

#28: DRS Will Prioritize Faster Hosts over Slower Ones

A neat fact (that I didn't know):
– When potential hosts for a DRS relocation have compatible CPUs but different CPU frequencies and/or memory capacity…
– …DRS will prioritize relocating VMs to the system with the highest CPU frequency and the most memory.
– This won't be the case if that CPU is already at capacity.

#29: Disable FT, Unless You're Using It

…and most of you aren’t.

You can "turn on" but not "enable" FT.
– Problem: Turning on Fault Tolerance automatically disables some features that enhance VM performance.
– Hardware virtual MMU is one.

Or, just don't use that horrible feature. Har!
– (Is there anyone from VMware in the audience…?)

#30: Match Configured OS with Actual OS

Big oops here, usually during OS migrations. This setting also sets a few important low-level kernel optimizations.

Make sure yours are correct! (A quick check-and-fix sketch follows.)
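
A hedged check-and-fix sketch, assuming pyVmomi and the 'vm' object from earlier; 'windows7Server64Guest' (the identifier VMware uses for Windows Server 2008 R2) is just an illustrative value:

```python
from pyVmomi import vim

# Compare what the VM is configured as with what VMware Tools reports is running.
print('configured guest OS:', vm.config.guestId)
print('running guest OS:   ', vm.guest.guestId)

# If they disagree, correct the configured value (with the VM powered off).
vm.ReconfigVM_Task(vim.vm.ConfigSpec(guestId='windows7Server64Guest'))
```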

BONUS TIP #31: Follow the Numbers

Private Clouds are all about quantifying performance in terms of supply and demand.

vSphere gives you those numbers. Just sum ‘em up.

Final Thoughts

See! Creating good VMs isn't all that easy.
– Our jobs aren't going away any time soon!
– These little optimizations add up.

Be smart with your virtual environment and always remember…

…you cannot change the laws of Physics!

Tips and Tricks for Creating Good VMs: It's Not as Easy as You'd Think!

Greg Shields, Partner and Principal Technologist, Concentrated Technology (www.ConcentratedTech.com)

Please fill out evaluations, or no free beer for you!

!!!
