Date post: | 07-Jan-2017 |
Category: | Technology |
Upload: | cormac-hogan |
Virtual SAN - Day 2 Operations
Cormac Hogan, VMware, Inc
Paudie O'Riordan, VMware, Inc
STO7534
#STO7534
CONFIDENTIAL
Disclaimer
This presentation may contain product features that are currently under development.
This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
Technical feasibility and market demand will affect final delivery.
Pricing and packaging for any new technologies or features discussed or presented have not been determined.
This Session
Virtual SAN has been available since March 2014, almost 2.5 years.
To date, we have over 5,000 VSAN customers.
VMware recognises that dealing with Virtual SAN operations on a day-to-day basis requires more than 2 clicks.
Since the launch of Virtual SAN, additional tools for managing, monitoring and troubleshooting Virtual SAN have become available.
In this session, we will discuss approaches to common problems that actual Virtual SAN administrators face.
We will discuss how these tools and approaches can help you manage your data now that the VMware consultant has left the building.
Agenda
1 Introduction to Session
2 Monitor - Getting The Basics Right
3 Alerting - What Are My Options?
4 Virtual SAN Upgrade
5 Bring It All Together - Handling a Failure (Demo)
Monitoring - Get the Basics Right
vSphere Logging
Virtual SAN Trace Files
ESXi Core Files
Persistent Logging - Challenges with ESXi Boot Devices
vSphere hosts can be deployed on several types of boot media, each with drawbacks and advantages
  SCSI, SSD, USB, SATADOM
If you are already in production, consider how logging gets laid out
  SCSI / SAS / SATA / SSD: VMFS automatically added, scratch located on VMFS
  SATADOM: VMFS automatically added, scratch located on VMFS
  USB / SD (any capacity): no VMFS, no persistent scratch area, 512 MB RAMDISK instead
[Diagram: boot device layout, labels: VMFS, /scratch (RAMDISK), /bootbank system, vmkDiagnostic, /altbootbank/store]
VMware strongly recommends setting up syslog in all cases
With an SD/USB boot device of 4GB, 2.2GB of the device is set aside for the core dump. Before vSphere 5.5, the VMkcore partition was only 100MB in size.
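The boot-media rules on this slide can be captured in a small lookup, which is handy when auditing a mixed fleet. This is a minimal sketch: the function names and media labels are hypothetical and are not part of any VMware API.

```python
# Sketch only: encodes the layout rules from the slide.
# Names and labels here are hypothetical, not a VMware API.

PERSISTENT_MEDIA = {"SCSI", "SAS", "SATA", "SSD", "SATADOM"}
RAMDISK_MEDIA = {"USB", "SD"}


def scratch_layout(boot_media: str) -> dict:
    """Describe where scratch lands for a given boot media type."""
    media = boot_media.strip().upper()
    if media in PERSISTENT_MEDIA:
        return {"vmfs": True, "scratch": "VMFS", "persistent": True}
    if media in RAMDISK_MEDIA:
        # USB/SD of any capacity: no VMFS, scratch is a 512 MB RAMDISK.
        return {"vmfs": False, "scratch": "RAMDISK (512 MB)", "persistent": False}
    raise ValueError(f"unknown boot media: {boot_media!r}")


def logs_lost_on_reboot(boot_media: str) -> bool:
    """True when local logs live on a RAMDISK and vanish on reboot."""
    return not scratch_layout(boot_media)["persistent"]
```

Hosts for which `logs_lost_on_reboot` returns True are the ones where a remote syslog target matters most, although the slide recommends syslog in all cases.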
Virtual SAN Trace Files
Provide extremely low-level logging for VSAN
VSAN traces require ~500MB of disk space
Majority of traces are in binary format
Persisted to VMFS or NFS if available
VSAN datastore does not support log redirection at this time
Stored on RAMDISK if no persistent storage is available
In case of reboot, the most recent/important VSAN traces are persisted to the store partition
In case of crash, VSAN traces are persisted to the diagnostic partition
Since Virtual SAN 6.2, urgent trace files can be redirected to a syslog target
[Diagram: trace file persistence, labels: /bootbank system, vmkDiagnostic, /altbootbank/store, VMFS, /scratch (RAMDISK), /store, vmkDiagnostic]
Since these traces are of extreme importance to VMware support, extra efforts are made to preserve them when /scratch is not on persistent storage. In these cases, when the ESXi host is booted from SD/USB and the VSAN traces are on a RAMdisk, they also get copied to /locker for persistence via /etc/init.d/vsantraced when the host reboots. Since /locker is relatively small, typically not all of the VSAN trace files will fit. To accommodate this, they are saved in value order so that the most recent/significant information is captured first.
When VSAN trace files are being written to a RAMdisk, they should also be persisted on a PSOD. This can be verified by the command esxcli system visorfs ramdisk list.
A common question is why we do not just persist the VSAN traces to the SD/USB rather than doing this step. Again, it is due to the bandwidth of the VSAN trace files. The concern is that the number of writes generated by VSAN traces, and there are a lot of them, can burn out a USB/SD card.
DOM and CMMDS use vmkernel.log only for very important messages; usually they don't publish to the vmkernel logs.
VSAN traces come in two types: urgent and normal. Urgent traces are supposed to be 1/10 as chatty as normal traces. vsanUrgent.log is that "urgent trace channel". It was introduced in 6.2 to give Log Insight and other aggregators access to more events from DOM/CMMDS.
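The idea of saving traces "in value order" into a small persistent area can be illustrated with a simple greedy selection. This is a sketch under stated assumptions: the `TraceFile` type, the `urgent` flag as the measure of significance, and the ranking rule are all hypothetical and do not describe how vsantraced actually chooses files.

```python
from typing import List, NamedTuple


class TraceFile(NamedTuple):
    # Hypothetical record; not the real on-disk trace metadata.
    name: str
    size_mb: int
    mtime: float          # seconds since epoch, larger means more recent
    urgent: bool = False  # assumed stand-in for "significant"


def select_traces(files: List[TraceFile], budget_mb: int) -> List[TraceFile]:
    """Keep the most significant, then most recent, files that fit the budget."""
    ranked = sorted(files, key=lambda f: (f.urgent, f.mtime), reverse=True)
    kept, used = [], 0
    for f in ranked:
        if used + f.size_mb <= budget_mb:
            kept.append(f)
            used += f.size_mb
    return kept
```

With a 100 MB budget and four 40 MB files, only the two highest-ranked files are kept, which mirrors the point in the notes that typically not every trace file fits.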
ESXi Core Dump Partition
Special partition in case of a diagnostic crash
2.2GB space set aside for memory dump
Ensures full memory dump gets written to persistent media
  ESXi hosts with less than 512GB physical memory
Use SAS/SATA, SATADOM, or the vSphere ESXi Network Dump Collector if no suitable persistent media is available
[Diagram: boot device layout, labels: vmkDiagnostic, /scratch (RAMDISK), /bootbank system, /altbootbank/store]
With an SD/USB boot device of 4GB, 2.2GB of the device is set aside for the core dump. Before vSphere 5.5, the VMkcore partition was only 100MB in size. Size is irrelevant to SSD.
Alerting - What Are My Options?
vSphere Built-In
vRealize Operations
vRealize Log Insight
vSphere Built-in
vSphere Native Alerting
  70+ Virtual SAN Health Alarms
  Many more vSphere alarms
  Alert via SNMP / SMTP
Create custom alarms
  Use VMware ESXi VOBs or Observation IDs for VSAN
Virtual SAN Management API 6.2 interface for bespoke solutions
VMware ESXi Observation IDs for Virtual SAN
Each VOB event is associated with an identifier (ID). Before you create a Virtual SAN alarm in vCenter Server, you must identify an appropriate VOB ID for the Virtual SAN event for which you want to create an alert. You can create alerts based on the VMware ESXi Observation Log file (vobd.log).
To review the list of VOB IDs for Virtual SAN, open the vobd.log file located on your ESXi host in the /var/log directory. The log file contains the following VOB IDs that you can use for creating Virtual SAN alarms.
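To find candidate VOB IDs without reading the log by eye, a short script can pull bracketed IDs out of a copy of vobd.log and count the Virtual SAN ones. This is a minimal sketch: the regular expression and the sample lines are assumptions about the log layout, so check them against a real vobd.log before relying on the output.

```python
import re
from collections import Counter
from typing import Iterable

# Matches a bracketed VOB ID such as [esx.problem.vob.vsan.pdl.offline].
VOB_ID = re.compile(r"\[(esx\.[A-Za-z0-9_.]+)\]")


def vsan_vob_ids(lines: Iterable[str]) -> Counter:
    """Count VOB IDs that contain a '.vsan.' component."""
    counts = Counter()
    for line in lines:
        for vob_id in VOB_ID.findall(line):
            if ".vsan." in vob_id:
                counts[vob_id] += 1
    return counts


# Illustrative lines only; the layout is assumed, not copied from a real host.
SAMPLE = [
    "2016-08-30T10:15:02.101Z: [VsanCorrelator] 1234us: "
    "[esx.problem.vob.vsan.pdl.offline] device offline",
    "2016-08-30T10:15:09.440Z: [VsanCorrelator] 5678us: "
    "[esx.clear.vob.vsan.pdl.online] device online",
    "2016-08-30T10:16:00.000Z: [netCorrelator] 9012us: "
    "[esx.problem.net.vmnic.linkstate.down] link down",
]
```

On a copy of the real file, the same function can be fed an open file handle, for example `vsan_vob_ids(open("vobd.log"))`, and the resulting IDs used when defining custom alarms.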
vRealize Operations + Log Insight
Virtual SAN awareness with Storage Management Pack
  Virtual SAN Dashboards and Heat Maps
  Host and Device Statistics
  Health Alerts
Log Insight also has Virtual SAN awareness
  Virtual SAN content pack
  Log aggregation from Virtual SAN nodes
  Integration with vROps alerting
Virtual SAN Upgrade
Prerequisites
Workflow
Monitoring
Gotchas
Upgrade Overview
Virtual SAN 6.2 has a new on-disk format for disk groups and exposes new data services
Upgrades are performed in multiple phases
  Phase 1: Upgrade to vSphere 6.0 U2
  Phase 2: Object and disk format conversion (DFC)
[Diagram: Phase 1 (vSphere 6.0 U2, Virtual SAN 6.2), then Phase 2 (rvc > vsan.v2_ondisk_upgrade, Cluster: Manual Mode)]
But before you begin
  Phase 0: Validate your current environment
Phase 1: Fresh deployment or upgrade to vSphere 6.0 U2
  vCenter Server
  ESXi Hypervisor
  Apply critical patches*
Phase 2: Disk format conversion (DFC)
  Prechecks
  Object Conversion
  Reformat disk groups
Phase 0 - Please Read Before You Start
Virtual SAN 6.2 Release Notes
VMware Product Interoperability
VMware Virtual SAN Hardware
  Server, Controller, SSD, Disk on HCL
  Controller Firmware, Disk Firmware, Controller Driver, Enclosure Firmware
Disk Format Conversion (DFC) is the phase where the VMFS-L disk format is replaced by VirstoFS on all participating magnetic devices.
What happens during the disk reformat phase?
All nodes should have completed their software upgrade --> ESXi 6.0 U2 (VSAN 2.0 cluster).
It operates on one node and one disk group at a time, and must be orchestrated at cluster level as objects get a 1 MB address space and get aligned to 4K.
Node --> DiskGroup --> Data Evacuation --> reformat disks --> DiskGroup comes online
The above flow repeats for the remaining disk groups in the node, and then the process moves to the next node.
No VSAN node with ESXi 5.5.x software is allowed to join the VSAN 2.0 cluster.
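The rolling order described in these notes, one node and one disk group at a time, can be written out as a generator of steps. This is a sketch for illustration only: the step names and the cluster dictionary are hypothetical, and the real conversion is driven by the upgrade tooling, not by code like this.

```python
from typing import Dict, Iterator, List, Tuple

# Per-disk-group steps, in the order given in the notes.
STEPS = ("evacuate data", "reformat disks", "bring disk group online")


def dfc_plan(cluster: Dict[str, List[str]]) -> Iterator[Tuple[str, str, str]]:
    """Yield (node, disk_group, step), finishing each disk group and node in turn."""
    for node, disk_groups in cluster.items():
        for disk_group in disk_groups:
            for step in STEPS:
                yield node, disk_group, step
```

For a hypothetical cluster `{"esx01": ["dg1", "dg2"], "esx02": ["dg1"]}` the plan has nine steps, and no step for esx02 appears until both disk groups on esx01 are back online.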
Phase 1 - Upgrading from Virtual SAN 5.5
You can upgrade from VSAN 5.5 to VSAN 6.X; however, patching is critical
During upgrade, some older releases of vSphere 5.5 may cause VMware Virtual SAN data unavailability and instability
Make sure all critical patches are installed prior to upgrade
Not an issue between VSAN 6.0 and VSAN 6.X
For more details, please read VMware KB 2113024 and VMware KB 2139969
5.5 EP06 or 5.5 P04 to vSphere 6.0 GA can cause VMware Virtual SAN data unavailability (2113024)
  Resolved with VMware ESXi 5.5, Patch Release ESXi550-201504001 (2112672) and VMware ESXi 5.5, Patch ESXi550-201504201-BG: Updates esx-base (2112675).
Upgrading from ESXi 5.5 to ESXi 6.x in a Virtual SAN cluster can cause permanent loss of data (kb.vmware.com/kb/2139969)
  The cluster is mixed between ESXi host versions 5.5 and 6.0, such as during the upgrade of a cluster.
  A VSAN object is reconfigured while the cluster is in a mixed state.
  Resolved with VMware ESXi 5.5, Patch Release ESXi550-201601501 (2141164).
Phase 1 - VSAN Disk Format Conversion Table
Virtual SAN Starting Version | Virtual SAN Target Version | Post-upgrade on-disk format upgrade required? | Version
Virtual SAN 5.5 U1 | Virtual SAN 5.5 Update X | No | -
Virtual SAN 5.5 Update X | Virtual SAN 6.X | Yes | 1.0 to 2.0 / 3.0
Virtual SAN 6.0 | Virtual SAN 6.1 | No | -
Virtual SAN 6.0 or 6.1 | Virtual SAN 6.2 | Yes | 2.0 to 3.0
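The conversion table can be expressed as a small lookup that answers "do I need an on-disk format upgrade after this software upgrade?". This is a sketch of one reading of the table: the function name is hypothetical, versions must be given as plain "major.minor" strings, and the split of "1.0 to 2.0 / 3.0" into 2.0 for 6.0/6.1 targets and 3.0 for 6.2 is an interpretation rather than something the slide spells out.

```python
from typing import Optional, Tuple


def _major_minor(version: str) -> Tuple[int, int]:
    """Parse a 'major.minor' string such as '5.5' or '6.2'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def format_upgrade(start: str, target: str) -> Optional[str]:
    """Return the on-disk format change needed after upgrading, or None."""
    s, t = _major_minor(start), _major_minor(target)
    if t < s:
        raise ValueError("target version is older than starting version")
    if s == (5, 5):
        if t == (5, 5):
            return None  # 5.5 U1 to 5.5 Update X: no format change
        return "1.0 to 3.0" if t >= (6, 2) else "1.0 to 2.0"
    if s in ((6, 0), (6, 1)):
        return "2.0 to 3.0" if t >= (6, 2) else None
    raise ValueError("starting version is not covered by the table")
```

For example, `format_upgrade("6.1", "6.2")` returns "2.0 to 3.0", while `format_upgrade("6.0", "6.1")` returns None.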
From which starting version can I go to 6.2?
Phase 1 - vSphere Software Upgrade
Step 1: Upgrade vCenter Server to 6.0 U2
Step 2: Upgrade ESXi hosts to 6.0 U2
Maintenance Mode?
  Ensure accessibility: fast, but with risk
  Full data migration: slower, but no risk
Disk Format Conversion (DFC) is the phase where the VMFS-L disk format is replaced by VirstoFS on all participating magnetic devices.
What happens during the disk reformat phase?
All nodes should have completed their software upgrade --> ESXi 6.0 U2 (VSAN 2.0 cluster).
It operates on one node and one disk group at a time, and must be orchestrated at cluster level as objects get aligned to 4K.
Node --> DiskGroup --> Data Evacuation --> reformat MDs with VirstoFS --> DiskGroup comes online
The above flow repeats for the remaining disk groups in the node, and then the process moves to the next node.
No VSAN node with ESXi 5.5.x software is allowed to join the VSAN 2.0 cluster after starting DFC.
Phase 1 - vSphere Software Health Check GOTCHA
vCenter 6.0 Update 2 installed
Health check will not work when ESXi version is < 6.0 U2
Phase 1 - vSphere Software Health Check
Software upgraded?
  Check your Virtual SAN Health
  Update your HCL database files
  Make sure it's all green
Address any failed tests BEFORE proceeding to the On Disk Format Upgrade!
Phase 2 - Disk Upgrade Prechecks
All hosts in cluster are connected to vCenter Server
All hosts upgraded to ESXi 6.0 U2
No network partitions in the VSAN cluster
No hosts with auto-claim storage
No hosts in Maintenance Mode
Once all the pre-checks are done, CMMDS will not allow 5.5.x hosts to join the cluster.
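The prechecks listed on this slide amount to a go/no-go gate, which can be sketched as a function that returns every failed condition. This is a minimal illustration: the `Host` fields are hypothetical stand-ins for state you would gather from vCenter Server, not a real API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    # Hypothetical fields; populate them from your own inventory data.
    name: str
    connected: bool = True
    upgraded: bool = True
    auto_claim: bool = False
    in_maintenance_mode: bool = False


def precheck_failures(hosts: List[Host], network_partitioned: bool) -> List[str]:
    """Return one message per failed precheck; an empty list means proceed."""
    failures = []
    if network_partitioned:
        failures.append("cluster has a network partition")
    for h in hosts:
        if not h.connected:
            failures.append(f"{h.name}: not connected to vCenter Server")
        if not h.upgraded:
            failures.append(f"{h.name}: ESXi not yet upgraded")
        if h.auto_claim:
            failures.append(f"{h.name}: auto-claim storage is enabled")
        if h.in_maintenance_mode:
            failures.append(f"{h.name}: in Maintenance Mode")
    return failures
```

Returning the full list rather than stopping at the first failure lets an administrator fix everything in one pass before starting the disk format conversion.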
Phase 2 - Are You Sure?
Phase 2 - Virtual SAN Object and Disk Format Conversion
Two conversion steps
Objects On Disk Format
Version