Introduction
Background Reading
Configuration
Supported Hardware
Supported software
Storage configuration
Logical Domain components
Domains
Networks
CPU
Memory
OpenBoot PROM
Physical and virtual console access
Onboard cryptographic acceleration
Resource Isolation
CPU Isolation
Memory Isolation
Network Isolation
Physical vs Virtual performance
Disk I/O Isolation
I/O Scaling
Logical Domain Administration
Building Logical Domains
Converting a physical host into a control domain
Configuring virtual storage for guest domains
Creating guest domains
Operational requirements
Startup and shutdown
Restarting domains
Restarting daemons
Adding storage to a guest
Patching implications
Console access
Commissioning
Decommissioning
Gotchas
Logical Domain Naming Conventions
Hostnames
Virtual disk servers
Virtual disk devices
Virtual disk
Virtual switches
Virtual networks
Supporting Technologies
Build automation - jumpstart
Monitoring
Capacity Management
Backup and recovery
Capacity Planning
Inventory
Outstanding work
Background and further reading
Appendix
cpuspin.master
cpuspin.child
Introduction
This page details how a virtualised Solaris environment shall be implemented within the organisation using Logical_Domains and covers the
following areas:
High level configuration and supported hardware and software components
A breakdown of the various components in Logical Domains
Resource isolation - how resources are isolated or shared between guest domains and the performance implications of
guests under load
LDOM administration - how to use the Logical Domains toolset along with support implications of managing a virtualised
Solaris environment
LDOM component naming conventions - keeping things consistent across the estate.
Supporting technologies - other technology components required to operate a virtualised Solaris environment
Background Reading
For those new to Logical Domains, the following is a suggested reading list to get up to speed with the concepts:
Beginners Guide to LDoms: Understanding and Deploying Logical Domains - good walkthrough on the basics
LDOMS 1.0.3 Admin Guide - fills in the technical gaps to give a complete understanding of the technology.
Octave Orgeron produced a series of articles for USENIX: Part 1, Part 2, Part 3 and Part 4
For keeping up to date with latest LDOM information, the following blogs are worth subscribing to:
Ariel Hendel,
Sun virtualisation,
Liam Merwick,
Sun Omicron,
The Hyper Trap,
The Navel of Narcissus,
Virtual Steve,
Virtuality,
C0t0d0s0,
Sparks and white noise
Configuration
This section covers the configuration components of Logical Domains and standards to be adhered to when deploying domains.
Supported Hardware
LDOMs are supported on the Niagara range of servers (T1000 through T5240). The organisation is standardising on the T5120 for delivering
LDOMs for the following reasons:
T5x20 servers have eight Floating Point Units or FPUs - one per core. In comparison, the Tx000 systems only have a single
FPU for the server. Having multiple FPUs provides a more general purpose server to cope with differing workloads
T5x20 servers have effectively replaced the Tx000 systems in terms of increased performance for a similar cost.
The T5x40 systems are dual CPU servers providing twice the compute resource of the T5x20. With Logical Domains being a
relatively new technology within the organisation, this increased density of consolidation is considered too high for initial adoption.
T5120 and T5220 servers offer the same CPU and memory capacity. The servers are 1U and 2U respectively with the
T5220 providing 8 internal disk slots rather than 4 and 6 PCI-E slots rather than 3. The additional storage and I/O capability
of the T5220 provides a more flexible server type to be used by both virtualised and non-virtualised systems.
Supported software
The supported software stack for Logical Domains is as follows:
Control Domain
Logical Domain manager 1.0.3
LDOM Toolset 1.0
Veritas Volume Manager 5.0 MP3
Control and guest domains
Solaris 10 update 5
Storage configuration
There are two elements to LDOM storage configuration: first, the storage used to boot the control domain; second, storage for the
guest domains.
As per existing Solaris standards, the control domain will boot from internal 146GB SAS disks which will be mirrored using Solaris
Volume Manager or SVM. Disk layouts will follow existing jumpstart standards and there are no LDOM-specific requirements with
regards to boot environment storage.
Storage for the guest domains will be SAN presented via a pair of Emulex LPe11000-S Host Bus Adaptors. This SAN storage will
be managed by Veritas Volume Manager and virtual disks for the guest OS will be constructed from Veritas volumes. The use of
SAN based storage for the guest domains, along with the use of Veritas Volume Manager, offers the following benefits:
SAN storage increases availability levels as all storage is accessed via two independent fabrics and is automatically
configured in a highly available manner with the use of RAID 5.
Using SAN based storage moves towards decoupling the control and guests domains and more easily allows for guest
portability and migration in the future.
As storage requirements increase, it is quicker and more efficient to meet this storage requirement from SAN storage rather
than increasing internal storage or possibly needing to buy DAS storage.
SAN storage provides a richer set of functionality such as off-host snapshots and replication. Although not used at present,
this technology could be leveraged in the future.
In comparison to SVM, Veritas Volume Manager offers a powerful and more flexible interface to cope with future storage
demands.
ZFS is a relatively new product within Sun. ZFS offers considerable cost savings and will likely be investigated in the future
but at present, Veritas Volume Manager provides a more proven, trustworthy storage management platform.
14 * 20GB SAN LUNs are presented to the I/O domain providing 280GB of usable storage for the guest domains. This storage is
subsequently configured into 30GB Veritas volumes, each of which are presented to the guest domains via a virtual disk server
running on the control domain.
The following diagram illustrates how virtualised IO works within Logical Domains.
Using the current Solaris 10 build, a 40GB boot disk leaves 26GB of usable storage for applications. The next Solaris build will be a
slimmer install and provide more flexible partitioning for application installs.
Further details around configuring storage and allocating/scaling guest storage can be found in the Logical Domain
Administration section.
Logical Domain components
This section details the various resources used in an LDOM environment and how the resources are virtualised and made available.
Domains
LDOMs allow four different types of domain to be created:
Control domain - creates and manages other domains. The control domain runs the Logical Domain Manager software provided
by the SUNWldm package. The first step in using Logical Domains is to convert a physical server build into the control domain. There
is a single control domain per server.
Service domain - provides virtual services to other domains. Such services include virtual switches, virtual disks and the virtual
console service. Within the organisation, the control and service domain are the same.
I/O domain - has direct ownership and access to I/O devices such as HBAs and network cards. I/O domains provide virtual disk
and network services to other domains. There can be a maximum of two I/O domains per server, one of which must be the control
domain. Within the organisation, the I/O and control domain are the same.
Guest domain - provides a virtual machine using services from the Service and I/O domain and managed by the Control domain.
For simplicity, the roles of the control, service and I/O domain are collapsed into a single domain. Throughout the document, the
terms Control, Service and I/O domain are used interchangeably.
The reason for collapsing control, service and I/O domains into a single domain is to reduce complexity and management
overhead. Additionally, the T5x20 server range has a single PCI-E root complex off which all PCI-E devices reside. This
configuration means it is not physically possible to configure a T5x20 in a split PCI configuration to enable two I/O domains.
Networks
The control domain is the only domain to have direct network connectivity. It "owns" the network adaptors and is responsible for
routing all data to and from the public network.
Conversely, guest domains have no direct network connectivity themselves. To allow guest domains to connect with other systems,
vswitches are created on the control domain to virtualise network access for guest domains.
The following diagram shows how the virtualised network layer will be configured on a typical LDOM server:
The control domain will be presented with three standard network connections: two FMA connections will be provided
via separate switches for primary network connectivity and a single network connection will be presented for the backup
LAN. All three connections will be gigabit.
Three vswitches will be created on the Control domain. Each vswitch is connected to a physical interface and will provide
external network connectivity to the guest domains.
As part of initialising the Control domain, the Solaris network configuration is updated to use vswitch devices rather than
physical interfaces: the e1000g0 device is swapped with vsw0, e1000g1 with vsw1 and e1000g2 with vsw2 (see the sketch
after this list). This switching from physical to vswitch devices is required to enable the control domain to communicate
with the guest domains directly over TCP/IP. This communication is not an LDOM requirement, more an operational requirement
that all systems on a common subnet should be able to communicate with each other directly.
After converting network connectivity to vswitches, IP Multipathing or IPMP is implemented across the two FMA network
connections on the control domain. IPMP is configured in an active/passive configuration and this provides highly available
network connectivity to the Control domain.
When adding guest domains, each domain is given three virtual network devices with each vnet device connecting into the
existing vswitches. As on the control domain, active/passive IPMP is configured across the two FMA networks for availability
reasons.
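The interface swap described above follows the standard procedure from the LDoms administration guide; a minimal sketch for the primary interface, using the control domain address shown later in this document (the other two interfaces are handled the same way):

# move the control domain IP from the physical NIC to the vswitch
ifconfig e1000g0 down unplumb
ifconfig vsw0 plumb
ifconfig vsw0 10.61.83.80 netmask 255.255.255.0 broadcast + up
# persist the change across reboots
mv /etc/hostname.e1000g0 /etc/hostname.vsw0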
See the Gotchas section for why network connectivity is not provided via two I/O domains.
CPU
The T5120 has an 8 core CPU with each core being able to run 8 separate threads. Additionally, each core has its own Floating
Point Unit or FPU. To the native operating system, the hardware appears as a 64 cpu system.
CPU allocation is managed at a thread level making it technically possible to have up to 64 domains on a single T5220. In practice,
supporting this number of small domains on a single server is inadvisable for the following reasons:
It is likely that the amount of resource allocated to each domain would be too small to be usable.
Such a high density of applications on a small server poses too high a risk in terms of the impact of a single hardware failure.
Where multiple operating systems share a common cpu core, workload in one domain could impact the performance of
another.
When allocating CPU resources to a domain, it is not possible to specify an individual thread or cpu core to allocate from - the
lowest available thread will be allocated. To provide repeatable guest domain performance, domains will be configured in multiples
of 8 threads which ensures one or more complete cores are allocated to a single domain. By default, guest domains will be given a
single core of 8 threads although this could be scaled in multiples of 8 threads if required.
CPU resources are hard-allocated to a domain so, as long as each domain is allocated threads from a single CPU core, it should
not be possible for high CPU activity in one domain to impact another domain on the same server (the benchmarking in the
Resource Isolation section supports this statement).
Although CPU resources are hard-allocated, it is possible to administratively move CPU resources between domains via the
ldm command on the control domain. This dynamic reconfiguration can be performed live without the need for a domain reboot.
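For example, moving one core's worth of threads from one guest to another could be done live as follows (domain names illustrative; the guests must be running a Solaris release that supports CPU dynamic reconfiguration):

# move 8 threads (one complete core) from stella04 to stella03
ldm rm-vcpu 8 stella04
ldm add-vcpu 8 stella03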
Memory
As for CPU resources, memory is also hard-allocated to a guest domain. Sun recommends a minimum of 512MB of memory per
Solaris 10 instance and LDOMs allow memory to be allocated in units as small as 4MB.
Although LDOMs allow memory to be allocated in arbitrarily small chunks, it is recommended that the memory allocated to a given
guest domain is directly proportional to the cpu resource allocated. On the T5220, which allows for a recommended
maximum of 8 domains, 1/8th of the total system memory should be allocated along with each CPU core.
On each physical server, 128MB of physical memory is reserved for system use and cannot be allocated to guest domains. To
ensure consistent guest domain configurations, the control domain will be allocated 3968MB allowing all guests to be allocated a
full 4GB.
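On a 32GB T5220 the memory budget therefore works out exactly:

   32768 MB   total physical memory
  -  128 MB   reserved for system use
  - 3968 MB   control domain
  = 28672 MB  remaining, i.e. 7 guest domains at 4096 MB each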
This pairing of CPU and memory allocation allows for easier estate management as the use of LDOMs grows. Systems will never
be left with unused cpu or memory resources and migrating guest domains between physical systems will be easier as all virtual
systems will be a fraction of all available resources rather than a fully customisable amount.
OpenBoot PROM
On traditional SPARC systems, the hardware provides a single OpenBoot PROM or OBP environment on which the operating
system will run. With the introduction of the hypervisor layer and the ability to run concurrent operating systems on a single server,
each guest domain must have its own OBP environment.
A T5220 server will have a single OBP image held on firmware and each domain (even for non-virtualised single-system T5220s)
will load a copy of the OBP into RAM and execute from here. Once the operating system has booted, the OBP memory will be
released to be used by the OS.
Although each domain runs a virtual OBP, each maintains its own OBP variables and these variables, along with the basic
domain configurations, are stored on the system controller and will persist across domain reboots. The OBP variables will not,
however, automatically persist across a hardware powercycle. Operational workarounds to this issue are
detailed in the Administration section.
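The workaround relies on saving the active configuration back to the system controller; the toolset automates this via the cron job mentioned in the initialisation output, but done by hand it would be along these lines (the configuration name is illustrative):

# save the current domain configurations to the system controller
ldm add-config config-20080901
# list the configurations held on the system controller
ldm ls-config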
Physical and virtual console access
The T5220 console is accessible via both the serial port and network management port. Both of these routes provide console
access to the primary or Control domain in the same manner as for a physical box.
To provide console access for guest domains, a virtual network console service is configured as part of the control domain
initialisation. The vntsd daemon can bind to ports in the 5000-5100 range and as guest domains are created, they are automatically
allocated a port number to provide virtual console connectivity. By default, these TCP ports are only available internally to the
control domain but the svc:/ldoms/vntsd:default SMF service is adjusted at control domain initialisation time to ensure the ports
are accessible on the control domain public network. This is in preparation for all Sun console connections to be externally secured
and managed with Conserver.
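The adjustment made by the initialisation script is along these lines (the vntsd/listen_addr property is documented in vntsd(1M); a value of "any" listens on all interfaces):

# allow virtual console connections from beyond the control domain
svccfg -s svc:/ldoms/vntsd:default setprop vntsd/listen_addr = \"any\"
svcadm refresh svc:/ldoms/vntsd:default
svcadm restart svc:/ldoms/vntsd:default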
Onboard cryptographic acceleration
The Niagara CPU range provides cryptographic acceleration on the CPU with each of the 8 CPU cores having its own Modular
Arithmetic Unit or MAU. Performing cryptographic operations in hardware rather than software offers significant throughput
improvements. As an example, dsa1024 operations in a single-threaded application show a 245-fold improvement for
verify operations and a 15-fold improvement for sign operations.
As part of the standard configuration, guest domains will be configured with a single MAU per guest domain, in line with the
configuration guideline of a complete CPU core being allocated to each guest.
Note, for applications to make use of the onboard cryptographic acceleration, they must be configured to do so. Using the
Cryptographic Accelerator of the UltraSPARC T1 Processor is a good overview on how this can be achieved.
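As an illustration only (following the referenced paper rather than anything mandated here), the Apache 2 bundled with Solaris 10 can offload SSL operations to the hardware provider via its PKCS#11 engine:

# /etc/apache2/ssl.conf: route RSA/DSA operations through the
# Solaris cryptographic framework, which uses the MAU where present
SSLCryptoDevice pkcs11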
Resource Isolation
The LDOM architecture provides a mixture of dedicated and shared resources between domains. In the case of CPU and memory,
resources are provided via the hypervisor directly to the guest domain whereas storage and network I/O is virtualised via the control
domain.
The summary position of how resource isolation works in LDOMs is as follows:
For CPU bound applications, it is not possible for CPU loading in one domain to impact another.
Memory bound applications cannot impact memory access in another domain.
Network loading scales across parallel LDOMs only marginally less than it would on a physical system.
I/O scaling suffers a 6-10% throughput penalty per domain as guest domains perform concurrent I/O.
These results suggest LDOMs are suitable for the consolidation of all general purpose applications with the exception of
data-bound applications such as databases and backup servers. Backup and database servers are well defined applications for
which other technologies are available to drive up utilisation levels.
The following sections go into the detail of how each of the resource areas has been tested to illustrate domain isolation
under specific loading.
CPU Isolation
When creating guest domains, CPU resources will be allocated in groups of 8 to ensure complete cores are allocated to a given
domain. To test the impact of high CPU loading between logical domains, a T5220 has been split into 8 domains as follows:
Hostname   Description              CPUs    Memory
stella01   Control and I/O domain   0-7     4GB
stella03   Guest domain             8-15    4GB
stella04   Guest domain             16-23   4GB
stella05   Guest domain             24-31   4GB
stella06   Guest domain             32-39   4GB
stella07   Guest domain             40-47   4GB
stella08   Guest domain             48-55   4GB
stella09   Guest domain             56-63   4GB
SciMark is a composite CPU benchmark which executes five computational kernels against small problem sets. For benchmarking,
the C version is used and executed three times and the average of the composite score is recorded.
SciMark was run on the guest domain stella09 under three load scenarios as follows:
Scenario 1: SciMark running on stella09. All remaining domains are idle.
Scenario 2: SciMark running on stella09. The remaining 6 guest domains are under heavy CPU load from the loading scripts
detailed in the appendix. Control domain idle.
Scenario 3: SciMark running on stella09. All guest domains and the control domain are under heavy CPU load from the loading
scripts detailed in the appendix.
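The cpuspin scripts themselves are reproduced in the appendix; a minimal equivalent loader, which starts one busy shell loop per virtual CPU in the domain, would look something like this:

#!/bin/ksh
# saturate every virtual CPU in this domain with a busy loop
NCPU=`psrinfo | wc -l`
i=0
while [ $i -lt $NCPU ]; do
        while :; do :; done &
        i=`expr $i + 1`
done
wait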
The composite SciMark2 values are shown below:

Scenario   Description                                            Composite SciMark2 score
1          All domains idle                                       34.42
2          All guest domains under cpu load                       34.37
3          All guest domains and control domain under cpu load    34.32
These results show that Logical Domains deliver highly effective CPU isolation between domains and high CPU loading does not
significantly impact the control domain.
Memory Isolation
LDOMs hard-allocate memory to each of the configured domains so it should not be possible for a memory-bound domain to
impact the performance of another domain. If applications have a higher memory footprint than provided by the domain
configuration, paging and ultimately swapping will occur which will use virtual I/O to access storage provided via the control domain.
As the virtual I/O layer and ultimately the underlying storage is shared between all guest domains, it is a possibility that significantly
oversubscribed domains could impact the performance of another guest domain. It is expected that application and standard OS
performance monitoring tools would detect and alert on such a scenario before it could impact other guest domains.
To prove memory-bound applications do not significantly impact other running domains, the memrand benchmark has been used
from the libMicro portable microbenchmark suite. The choice of memrand follows a blog posting from Phil Harman illustrating
memory latency for increasingly parallel workloads.
As with the CPU testing, a T5220 was split into 8 domains with 4GB and 8 threads (1 complete CPU core) allocated to each domain
and memory tests were conducted under various levels of parallelism. Following Phil's example, memrand was run to perform
negative stride pointer chasing to show memory latency as below:
# *./bin/memrand -s 128m -B 1000000 -T 8 -C 10 -L*
# ./bin/../bin-sun4v/memrand -s 128m -B 1000000 -T 8 -C 10 -L
prc thr usecs/call samples errors cnt/samp size
memrand 1 8 0.16282 12 0 1000000 134217728
The following table shows the memory latency under increasing levels of parallelism.
Scenario   Description                                                            Memory latency
1          memrand as a single thread on one guest domain, all others idle        160.2 ns
2          memrand as a single thread on two guest domains                        160.2 ns
3          memrand as a single thread on four guest domains                       161.0 ns
4          memrand as a single thread on 7 guest domains                          161.7 ns
5          memrand as a single thread on 7 guest domains and the control domain   162.2 ns
These results show that memory access times are not impacted when other domains are performing memory-bound operations.
Network Isolation
The standard LDOM configuration will be to have 8 domains on each T5220 server. Using gigabit network connections negates the
need for individual domains to have their own dedicated network adaptors: 8 domains sharing gigabit networks should provide
suitable bandwidth for standard applications. Three physical networks are presented to the control domain: two FMA connections
and a single backup connection.
The control domain network adaptors are configured as follows to provide virtual switches to the guest Logical Domains:
Underlying interface   Network connection   Vswitch name
e1000g0                FMA network          primary-vsw
e1000g1                Backup network       backup-vsw
e1000g2                FMA network          alternate-vsw
Vswitches are internal to the server and provide a virtual network segment to other guest domains. Each guest domain will connect
to the vswitch via a virtual network adaptor as shown in the diagram below:
Physical vs Virtual performance
To compare network performance between the physical hardware and Logical Domains, a pair of T2000 servers were used to
generate network load using iperf. All servers are connected via a 1Gb/s switched network and all hosts are running Solaris 10 as
shown below.
For the first set of physical tests, a T5220 with 32GB of memory and 8 cores was used as a load target and the T2000s were used
to generate an increasing number of network streams for 45 second intervals.
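A representative invocation is sketched below (hostnames illustrative; -P sets the number of parallel streams and -t the duration in seconds):

# on the load target (the T5220 itself, or a guest domain)
iperf -s
# on each T2000 load generator: a 45-second run of 4 parallel streams
iperf -c stella02 -t 45 -P 4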
For the second set of tests, the same T5220 was configured into 8 domains with each domain having a single core and 4GB of
memory. domain1 is a joint control and I/O domain and domains 2 through 8 are guest domains.
The following tables and associated graph show the aggregate throughput achieved when running parallel iperf streams against a
single T5220, followed by the same parallel testing against individual LDOMs on the same hardware.
Load generation to a physical T5220:

Client load generator   Load targets             Total streams   Total throughput (Mbit/s)
T2000 a                 T5220                    1               390
T2000 a and b           T5220 (1 stream each)    2               909
T2000 a and b           T5220 (2 streams each)   4               963
T2000 a and b           T5220 (3 streams each)   6               932.9
T2000 a and b           T5220 (4 streams each)   8               1009.5

Load generation to parallel guest LDOMs:

Client load generator   Load targets                        Total streams   Total throughput (Mbit/s)
T2000 a                 domain 8                            1               399
T2000 a and b           domain 8 / domain 7                 2               879
T2000 a and b           domains 6, 8 / domains 5, 7         4               920
T2000 a and b           domains 4, 6, 8 / domains 3, 5, 7   6               926
T2000 a                 domains 2, 4, 6, 8                  8               902.7
These results show that the introduction of a virtualised network layer has an observable but not hugely significant impact on
network throughput. It should be noted that increased CPU utilisation will be seen on the control domain during periods of high
network activity although not enough to significantly impact other guest domains. The following graph shows average CPU
utilisation on the I/O domain when running network tests to parallel guest domains.
With only a single core to perform all network I/O, cpu utilisation runs at 45% in the Logical Domain testing rather than 7% when
running the same tests against a single OS with all 8 cores available. This discrepancy is reasonable considering the reduction
in CPU available to service the network.
Disk I/O Isolation
LDOMs provide a virtualised I/O layer across guest domains. Storage for guest domains will be SAN based and presented via a
pair of HBAs to the I/O domain. This configuration provides no single point of failure external to the server and provides for future
guest mobility.
On the I/O domain, Veritas Volume Manager is used to manage the SAN storage and volumes will be created to present to the
guests as virtual disks. The following diagram illustrates the I/O configuration of a typical LDOM host:
I/O Scaling
Due to the number of variables involved and differing workloads, it is challenging to provide meaningful I/O benchmarks illustrating
scalability and separation. Each application may have different I/O patterns and block sizes and both host and array caching
influence the outcome.
To provide a simple but reasonably representative test of I/O scaling and throughput, a process is run on the guest domains to copy
a fixed size directory to local storage. The following command copies 12,271 files totalling 973MB:
cd / && tar cf - usr/lib > /opt/dump
The test is run on an increasing number of guest domains in parallel to measure the impact of concurrent I/O.
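The measurement harness is not reproduced here; a minimal sketch that times each run so throughput can be derived (973MB divided by the elapsed seconds) would be:

# time the fixed-size copy on a guest; writing the archive directly
# to /opt/dump is equivalent to the redirection form above
cd / && timex tar cf /opt/dump usr/lib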
The LDOM configuration for this test is as follows:
I/O domain boots from internal storage.
10 * 20GB LUNs presented from IBM DS6000 storage to the I/O domain
SAN storage accessed via a pair of Emulex LP11000 HBAs
SAN storage managed by Veritas Volume Manager 4.1 including the IBM DS6000 Array Support Library
7 * 30GB volumes configured, each striped across all 10 LUNs.
Each volume is configured as a virtual disk to be presented to a single guest LDOM.
Guest domains were rebooted between each test to remove the effect of host based filesystem caching.
The following graph shows the total measured throughput as tests were run against an increasing number of parallel guest
domains. The dotted line shows expected throughput if scaling were to be linear:
The following graph shows aggregated I/O throughput for all active SAN devices along with an average disk busy percentage while
running the throughput tests:
This data shows that Logical Domains do not scale in a linear fashion with regards to filesystem based disk I/O but this is not due to
limitations in the underlying SAN storage.
There are a number of elements to note with regards to I/O scaling:
Aside from databases and storage-centric products such as backups, most applications are not I/O bound.
The throughput figures above are based on filesystem access when reading lots of small files and are not indicative of
maximum achievable throughput.
It is recommended that Logical Domains are not used to consolidate database applications. Databases are typically well
understood standardised products for which other consolidation mechanisms would be more suitable.
Logical Domain Administration
Building Logical Domains
The following prerequisites must be met to use Logical Domains within the organisation:
Hardware must be a T5220
T5220 firmware version must be 7.1.1 (provided by patch 136932-01)
Solaris build must be FSOS 1.0
System must be built with volume manager 5.0 MP3
XXX GB of SAN storage must be presented.
Jumpstart will need updating to meet the following requirements for LDOMS:
Logical Domains toolset must be installed
Logical Domain Manager 1.0.3 software must be installed
Required patches 125891-01 127755-01 118833-36 124921-02 125043-01 and 127127-11 must be installed
Until jumpstart is updated, add the Logical Domains Toolset package as follows and the control domain initialisation script will add
any required patches and packages:
pkgadd -d /shared/package/location SYSldom all
There are three steps to configuring logical domains:
1. Converting the physical host into a control domain
2. Configuring SAN storage to be used as virtual storage for the guest
3. Configuring guest domains.
The Logical Domains toolset (SYSldom) contains a suite of scripts which have been produced to automate the creation and
management of logical domains. The following sections walk through these stages.
Converting a physical host into a control domain
The first step is to convert the physical host into the primary LDOM. The primary LDOM runs the Logical Domain Manager software,
provides a shared storage and network layer and ultimately manages the guest domains. Creating the primary domain will
configure the basic daemons and virtual services required for guest domains. It will also release most of the CPUs and memory
resources back into an available pool to be used when creating guest domains.
If the server has previously been used as an LDOM, the system controller will need to be reset to a default configuration and then
power cycled. If the system isn't showing the full 32GB of memory in the banner output, it's likely to have been configured as an
LDOM in a previous life. To reset a system to factory defaults, connect to the console (ssh admin@-ilo), and run the following
commands:
sc> *bootmode config="factory-default"*
sc> *poweroff*
Are you sure you want to power off the system [y/n]? y
SC Alert: SC Request to Power Off Host.
SC Alert: Host system has shut down.
sc> *poweron*
SC Alert: Host System has Reset
To convert the physical host to a control domain, run the initialise_ldom script. The script is safe to re-run and will check whether
each stage needs to be done before making any changes. If anything fails, fix according to the alerts given and re-run the script. It
will be necessary to reboot the system on completion:
root@stella02# */opt/SYSldom/bin/initialise_ldom*
Checking prerequisites
Checking hardware is sun4v
Checking for non-virtualised OS
Checking system firmware version
Checking OS version is at least Solaris 10 11/6
Checking required packages
Installing missing packages
Adding SUNWldm.v
Adding SUNWldmib.v
Adding SUNWldlibvirt.v
Adding SUNWldvirtinst.v
Adding SUNWjass
Checking for required patches
Installing missing patches
adding 125891-01
adding 127755-01
adding 127127-11
OK - all prerequisites have been met
Executing JASS
Adjusting ldm_control-config.driver to standards
Adjusting ldm_control-hardening.driver to standards
Running JASS with ldm_control-secure.driver driver
Checking and configuring primary domain
svc:/ldoms/ldmd:default is currently in disabled state. Enabling
Creating virtual diskserver
Creating virtual console concentrator service
svc:/ldoms/vntsd:default is currently in disabled state. Enabling
Switching virtual console to listen on any address
Creating vswitches
Creating primary-vsw on e1000g0
Creating backup-vsw on e1000g1
Creating alternate-vsw on e1000g2
All hostnames are present in the system hosts files
Switching from e1000g0 to vswitch interface with probe-based IPMP
Switching from e1000g1 to vswitch interface
Switching from e1000g2 to vswitch interface with probe-based IPMP
Configuring primary domain
Removing crypto from primary domain
Restricting primary domain to 8 virtual cpus (1 core)
Restricting primary domain memory to ~4GB
Creating initial ldom config on system controller
installing /etc/rc2.d/S99ldombringup script
Adding root cron job to save LDOM configs
Configuration changes have been made which require a reboot
Please reboot with init 6
When the host has rebooted, the available memory should have been reduced to around 4GB, only 8 cpus should be available and
when listing domains, a single "primary" domain will be shown:
root@stella02# *prtdiag | egrep Memory\ size*
Memory size: 3968 Megabytes
root@stella02# psrinfo -vp
The physical processor has 8 virtual processors (0-7)
UltraSPARC-T2 (cpuid 0 clock 1165 MHz)
root@stella02# */opt/SUNWldm/bin/ldm ls*
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv SP 8 3968M 1.2% 36m
To list the unallocated resources on the host, run ldm ls-devices
All LDOM services and daemons are now ready to support guest domains but no storage has yet been configured to provide virtual
storage for the guests. The storage will be configured in the following section.
Configuring virtual storage for guest domains
All guest domain storage will be SAN presented and managed by Veritas Volume Manager. The configure_ldom_storage script
will configure Veritas Volume Manager, create the appropriate Veritas disk group and prompt the user as to which disks should be
added to be used by guest LDOMs.
As with the initialise_ldom script, this script is safe to re-run and will check whether each stage needs to be done before making
any changes. If anything fails, fix according to the alerts given and re-run the script. The only input required by the user is to confirm
which SAN disks should be initialised to be used for guest LDOM storage.
The following prerequisites must be met before running the configure_ldom_storage script:
Veritas Volume Manager 5.0 must be installed
SAN storage should be presented to the host and visible in format
SAN storage should not contain valid veritas data
root@stella02# /opt/SYSldom/bin/configure_ldom_storage
Checking veritas
Volume Manager licences already enabled
Checking veritas daemons
Enabling vxvm configuration daemon
Initialising vxvm
Do you want to initialise the SAN disks for use now? For new LDOMs, answer yes.
For taking over storage from another LDOM, answer no: (y/n): y
Adding storage to vxvm
The following disks are visibile to vxvm
IBM_DS8x000_0 auto:none - - online invalid
IBM_DS8x000_1 auto:none - - online invalid
IBM_DS60000_0 auto:none - - online invalid
IBM_DS60000_1 auto:none - - online invalid
c1t2d0s2 auto:none - - online invalid
c1t3d0s2 auto:none - - online invalid
c1t4d0s2 auto:none - - online invalid
c1t5d0s2 auto:none - - online invalid
c1t6d0s2 auto:none - - online invalid
c1t7d0s2 auto:none - - online invalid
Please enter the list of disks to use for LDOM storage
separated by spaces i.e. emcpower0s2 emcpower1s2 .....
Disk names: IBM_DS8x000_0 IBM_DS8x000_1 IBM_DS60000_0 IBM_DS60000_1
Validating disk names...
About to create stella02_ldom disk group using the following disks:
IBM_DS8x000_0 IBM_DS8x000_1 IBM_DS60000_0
IBM_DS60000_1
Are you sure (y/n): y
initialising IBM_DS8x000_0 in vxvm
initialising IBM_DS8x000_1 in vxvm
initialising IBM_DS60000_0 in vxvm
initialising IBM_DS60000_1 in vxvm
Initialising stella02_ldom disk group with IBM_DS8x000_0
Adding IBM_DS8x000_1 to stella02_ldom
Adding IBM_DS60000_0 to stella02_ldom
Adding IBM_DS60000_1 to stella02_ldom
All disks added to stella02_ldom
There is now 81906Mb available for guest LDOM storage
Once the configure_ldom_storage script has run, the control domain is fully setup and ready to create guest domains.
Creating guest domains
There are two parts required to create a guest domain. The first stage is to create a guest LDOM on the control domain and
allocate cpu, memory, network and I/O resources. The second stage is to jumpstart this virtual machine (the same as if it were a
physical host).
The standard guest configuration will be as follows:
8 virtual CPUs (one complete core)
4GB of memory
Three virtual network connections (2 FMA and 1 backup)
1 virtual console
1 * 30GB virtual disk for the OS.
To create a guest OS, the following are required:
The initialise_ldom and configure_ldom_storage scripts must have already been run
The primary hostname, IPMP interfaces and backup interface must all be resolvable in DNS but not pingable.
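A quick pre-flight check for the DNS requirement might look as follows (the -a, -b and -bak hostname suffixes are illustrative only, not the documented convention):

# each address must resolve but must not already be live
for h in stella17 stella17-a stella17-b stella17-bak; do
        getent hosts $h >/dev/null || echo "$h does not resolve"
        ping $h 2 >/dev/null 2>&1 && echo "$h is already pingable"
done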
To create a new guest domain, run the /opt/SYSldom/bin/add_guest menu.
<verbatim> Guest LDOM creation menu
=================================
</verbatim>
<verbatim> h. Hostname of guest LDOM..............................undefined
c. Number of virtual cpus......................................8
m. Memory allocated in GB......................................4
r. Reset to defaults............................................
q. Quit and abandon changes.....................................
a. Add LDOM.....................................................
</verbatim>
LDOMs already configured: (primary)
56 cpu threads unallocated (8-63)
28.0 GB unallocated memory
stella02_ldom disk group is 80.0 GB with 80.0 GB free
Please make a choice:
Press the h key to enter the guest domain hostname. The menu will validate that the primary hostname, IPMP addresses and
backup interface are resolvable and not pingable. If this is true, the guest domain can be added with the a key.
CPU resources should be allocated in multiples of 8 to ensure each guest is allocated a complete core. If the number of cpus is
adjusted, the memory allocation will be updated to be half the number of cpu threads to ensure even allocation of cpu/memory
amongst guest domains.
The following shows the creation of the stella17 guest domain:
<verbatim> Guest LDOM creation menu
=================================
</verbatim>
<verbatim> h. Hostname of guest LDOM...............................stella17
c. Number of virtual cpus......................................8
m. Memory allocated in GB......................................4
r. Reset to defaults............................................
q. Quit and abandon changes.....................................
a. Add LDOM.....................................................
</verbatim>
LDOMs already configured: (primary)
56 cpu threads unallocated (8-63)
28.0 GB unallocated memory
stella02_ldom disk group is 80.0 GB with 80.0 GB free
Please make a choice: a
Validating LDOM
Stage 1/16: Executing ldm add-domain stella17
Stage 2/16: Executing ldm add-vcpu 8 stella17
Stage 3/16: Executing ldm set-mau 1 stella17
Stage 4/16: Executing ldm add-memory 4G stella17
Stage 5/16: Executing ldm add-vnet vnet_pri primary-vsw stella17
Stage 6/16: Executing ldm add-vnet vnet_bak backup-vsw stella17
Stage 7/16: Executing ldm add-vnet vnet_alt alternate-vsw stella17
Stage 8/16: Executing vxassist -g stella02_ldom make stella17.os 32g
Stage 9/16: Executing ldm add-vdsdev /dev/vx/dsk/stella02_ldom/stella17.os dev-stella17.os@primary-vds0
Stage 10/16: Executing ldm add-vdisk stella17.os dev-stella17.os@primary-vds0 stella17
Stage 11/16: Executing ldm set-variable "nvramrc=`cat /opt/SYSldom/etc/defaultnvramrc`" stella17
Stage 12/16: Executing ldm set-variable boot-device=disk stella17
Stage 13/16: Executing ldm set-variable use-nvramrc?=true stella17
Stage 14/16: Executing ldm set-variable auto-boot?=false stella17
Stage 15/16: Executing ldm bind-domain stella17
Stage 16/16: Executing ldm start-domain stella17
LDom stella17 started
LDOM successfully created and is ready to add to be jumpstarted
Terminal server connection: stella02:5000
Press enter to continue
In the above example, a virtual terminal server connection has been created on stella02, port 5000. Connecting to this port with
telnet will bring up the familiar OBP prompt and the host can be jumpstarted in the normal manner:
root@stella02# *telnet stella02 5000*
Trying 10.60.45.104...
Connected to stella02.
Escape character is '^]'.
Connecting to console "stella17" in group "stella17" ....
Press ~? for control options ..
{0} ok *banner*
SPARC Enterprise T5220, No Keyboard
Copyright 2008 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.28.0, 4096 MB memory available, Serial #66671006.
Ethernet address 0:14:4f:f9:51:9e, Host ID: 83f9519e.
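From here the guest can be jumpstarted like any physical host, assuming the jumpstart server has been configured with the guest's virtual MAC address:

{0} ok boot net - install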
Operational requirements
Startup and shutdown
From a cold-start, the T5220 hardware will require a poweron command from the console. Once the hardware is powered on and
has completed POST, the control domain will auto-boot. By default, guest LDOMs do not auto-start. To resolve this, an
/etc/rc2.d/S99ldombringup script is deployed when configuring the primary domain. This script will auto-start any bound domains
that aren't already running.
For shutdowns, the guest domains must be shutdown before the control domain to prevent data loss. This ordering should be
factored into any BCP or powerdown tests.
Restarting domains
When the control domain is rebooted, guest domains will freeze until the control domain comes back online. This frozen outage
period is typically around 2 minutes for a fully configured T5220 and existing network connections will not be dropped.
It is considered good practice to manually shut down all guest domains before restarting a control domain. Rebooting a running
control domain should mainly be reserved for coping with a hardware fault or panic on the control domain, so that guest domains
are not subjected to unnecessary outages.
Restarting guest domains will have no impact on other domains.
Restarting daemons
The ldmd service svc:/ldoms/ldmd:default can be restarted on the primary domain at any time without impacting the guest
domains. It is designed to be a stateless service.
The Virtual console service svc:/ldoms/vntsd:default can be restarted at any time without impacting the guest domains. During
the service restart, any connected console sessions will be dropped and the user will need to reconnect.
Adding storage to a guest
If a guest domain requires more storage than the OS disk, a new virtual disk should be created and attached to the guest. Storage
should be taken from the hostname_ldom disk group and added following the Virtual disk naming conventions.
The following shows a 100M virtual disk being created and attached to the stella04 guest:
root@stella01# *vxassist -g stella01_ldom maxsize*
Maximum volume size: 65755136 (32107Mb)
root@stella01# *vxassist -g stella01_ldom make stella04.dat02 100M*
root@stella01# *dd if=/dev/zero of=/dev/vx/dsk/stella01_ldom/stella04.dat02 count=1024 bs=1024*
root@stella01# *ldm add-vdsdev /dev/vx/dsk/stella01_ldom/stella04.dat02 dev-stella04.dat02@primary-vds0*
root@stella01# *ldm add-vdisk stella04.dat02 dev-stella04.dat02@primary-vds0 stella04*
Initiating delayed reconfigure operation on LDom stella04. All configuration
changes for other LDoms are disabled until the LDom reboots, at which time
the new configuration for LDom stella04 will also take effect.
Overwriting the beginning of the Veritas volume is required to remove any old VTOC in case the storage is being re-used.
The guest should now be rebooted. When it restarts, login and run devfsadm and the guest will be able to see the new 100M virtual
device:
root@stella04# *echo | format*
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c0d0 <SUN-DiskImage-29GB cyl 38623 alt 2 hd 1 sec 1618>
/virtual-devices@100/channel-devices@200/disk@0
1. c0d1 <SUN-DiskImage-100MB cyl 339 alt 2 hd 1 sec 600>
/virtual-devices@100/channel-devices@200/disk@1
Specify disk (enter its number): Specify disk (enter its number):
Patching implications
Each of the guest domains runs an independent OS image so there is no need for guest domains to be kept at the same patch level.
The only LDOM-specific patch requirement is a few additional steps to take before patching a control domain:
All guests should be shut down and the associated change control process followed
It may be advisable to disable the /etc/rc2.d/S99ldombringup script until patching is complete to prevent guest domains
from auto-booting when the primary domain reboots (one way to do this is sketched after this list).
Normal patching process on the control domain should be followed including any follow-up checks.
When patch validation is complete, re-enable the /etc/rc2.d/S99ldombringup script and run the script to restart all guest
domains.
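One way to disable the bringup script temporarily is simply to rename it so rc2 ignores it (a convention rather than part of the toolset; rc only executes scripts whose names begin with S or K, and the start argument is assumed here):

# disable guest auto-start for the duration of the patching work
mv /etc/rc2.d/S99ldombringup /etc/rc2.d/xS99ldombringup
# ... patch, reboot and validate the control domain ...
# re-enable auto-start and bring the guests back up
mv /etc/rc2.d/xS99ldombringup /etc/rc2.d/S99ldombringup
/etc/rc2.d/S99ldombringup start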
Console access
See the Physical and virtual console access section.
Commissioning
See the Building Logical Domains and Creating guest domains sections for the physical process of creating the domains.
With regards to guest domains, the following information should be added to the CMDB/inventory system:
For each guest, the control domain that hosts it
Virtual console server details
Decommissioning
The standard Solaris decommissioning process should be followed. Once the host has been decommissioned, running the
following commands on the control domain will replace the 'dispose hardware' stage:
ldm stop-domain -f guestname
ldm unbind-domain guestname
ldm rm-vnet vnet_pri guestname
ldm rm-vnet vnet_bak guestname
ldm rm-vnet vnet_alt guestname
ldm rm-vdisk guestname.os guestname
vxassist -g `hostname`_ldom -rf rm guestname.os
ldm remove-domain guestname
Gotchas
The following is a list of potential issues to be aware of when running a Solaris virtualised environment:
Virtualised operating systems have a virtualised OBP environment which is initially held in memory. Once the guest OS is
loaded, the in-memory OBP is released so it isn't possible to return to the ok> prompt. To get back to the ok> prompt on
guest domains, reboot the domain and send a break before the OS starts to boot.
Guest domains must be shutdown in the normal manner before shutting down the control domain.
Although the LDOM software allows for multiple I/O domains to be created, there are hardware limitations which mean this
typically cannot be adopted:
Split PCI bus is only available on the T5x40 range of servers (and the older T2000). The T5x20 range only has a
single PCI-E root complex so I/O cannot be split.
The internal disks are only available via a single controller. Where servers allow a Split PCI bus configuration, there is
no internal storage to provide a boot environment for the domain so technologies such as SAN or iSCSI boot would
be required.
Logical Domain Naming Conventions
To enable toolset automation for LDOM provisioning and management and to provide a uniform virtualised environment, a number
of conventions have been adopted. This section details the various conventions and their usage.
Hostnames
There is no requirement to couple a logical hostname to its physical control domain. Virtual hostnames will follow the same
convention used for existing physical hosts and be allocated in the same manner. This convention is documented on the
Hostname_Convention page.
Using the LDOM software, each logical domain will be named to match the hostname of the guest OS. The only exception to this is
the control domain, which will be named primary. Using a generic name for the control domain is a requirement for guest mobility as
guest domain configurations reference services in the format service-name@domain-name (for example dev-stella17.os@primary-vds0).
With a collapsed control, I/O and service domain, all resources will reference the primary domain. By using this common reference
for the control domain, guest domains can be moved to a new host by copying the underlying virtual disk devices and importing the
guest domain configuration onto the new host.
The following output is from the primary domain stella01 which provides 7 guest domains:
root@stella01# *ldm ls*
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv SP 8 3968M 0.7% 2h 16m
stella03 active -n--- 5000 8 4G 0.2% 2h 7m
stella04 active -n--- 5001 8 4G 0.2% 2h 7m
stella05 active -n--- 5002 8 4G 0.2% 2h 7m
stella06 active -n--- 5003 8 4G 0.2% 2h 7m
stella07 active -n--- 5004 8 4G 0.2% 2h 7m
stella08 active -n--- 5005 8 4G 0.3% 1h 48m
stella09 active -n--- 5006 8 4G 0.2% 1h 52m
Virtual disk servers
A virtual disk server runs on the control domain to present virtual disk devices to a guest domain. Virtual disk servers are named
domain-vdsN, where domain is the domain providing the disk service and N is a unique instance number for the disk server within
the given domain (for example, primary-vds0).
The current configuration provides a single virtual disk server and, given the hardware currently available and the availability
requirements for virtualisation, this configuration is unlikely to change.
The list of virtual disk servers, along with the virtual disks they provide, can be retrieved by querying the services:
root@stella01# *ldm list-services primary*
_<... output removed ...>_
VDS
NAME VOLUME OPTIONS DEVICE
primary-vds0 dev-stella03.os /dev/vx/dsk/stella01_ldom/stella03.os
dev-stella04.os /dev/vx/dsk/stella01_ldom/stella04.os
dev-stella05.os /dev/vx/dsk/stella01_ldom/stella05.os
dev-stella06.os /dev/vx/dsk/stella01_ldom/stella06.os
dev-stella07.os /dev/vx/dsk/stella01_ldom/stella07.os
dev-stella08.os /dev/vx/dsk/stella01_ldom/stella08.os
dev-stella09.os /dev/vx/dsk/stella01_ldom/stella09.os
Virtual disk devices
Virtual disk devices are managed by the control domain and are a way of abstracting underlying physical storage into a form that
can be presented to one or more guest domains. At the backend, virtual disk devices can be constructed in a number of ways and
the organisation has standardised on using Veritas volumes.
As part of the configure_ldom_storage tool, a hostname_ldom diskgroup is created and all virtual disk devices will be constructed
from storage in this diskgroup.
Virtual disk devices will be named dev-guest.tag, where guest is the name of the guest to which the storage is presented and tag
represents the use of the storage within the guest. This naming convention follows through to the underlying Veritas volume, which
will be guest.tag in the hostname_ldom diskgroup.
As part of a standard build, each guest LDOM will only have a single virtual disk device which will have a tag of os. As additional
storage is added to a guest domain the tag datNN should be used where NN is a two digit number starting at 01.
Each virtual disk device should be a standard veritas volume created with vxassist. It is not necessary to stripe volumes across
available storage, leaving vxassist to automatically place the volume will suffice.
The following output shows the virtual disk devices listed for the guest stella04, which has an additional two data devices. Note, this
output shows which virtual disk devices are created on the control domain rather than which devices are mapped to a given guest
domain.
root@stella01# *ldm list-services primary*
_<... output removed ...>_
VDS
NAME VOLUME OPTIONS DEVICE
primary-vds0 dev-stella03.os /dev/vx/dsk/stella01_ldom/stella03.os
dev-stella04.os /dev/vx/dsk/stella01_ldom/stella04.os
dev-stella04.dat01 /dev/vx/dsk/stella01_ldom/stella04.dat01
dev-stella04.dat02 /dev/vx/dsk/stella01_ldom/stella04.dat02
dev-stella05.os /dev/vx/dsk/stella01_ldom/stella05.os
_<... output removed ...>_
Virtual disk
Virtual disks are how a virtual disk device is mapped to a given guest domain. Each virtual disk device may be mapped to multiple
guest domains, each domain having its own virtual disk entry. Although possible, shared virtual disk devices will not be
implemented, as storage sharing can be achieved at a more generic layer with technologies such as NFS.
The naming of a virtual disk will follow the naming of the underlying virtual disk device. For virtual disk devices following the dev-guest.tag@domain-vdsN
convention, the virtual disk on the guest domain will be guest.tag.
The following output shows the naming of two virtual disks on a guest domain stella04 and their underlying virtual disk devices:
root@stella01# *ldm ls -l stella04*
_<... output removed ...>_
DISK
NAME VOLUME TOUT DEVICE SERVER
stella04.os dev-stella04.os@primary-vds0 disk@0 primary
stella04.dat01 dev-stella04.dat01@primary-vds0 disk@1 primary
Unlike standard SCSI disks, virtual disks do not have the concept of a target number and as such, are presented to the guest
operating system in the format c?d?s? rather than c?t?d?s?. All virtual disks from a given virtual disk server will have the same
controller number with disk numbers being allocated in the order of virtual disk creation.
At present, it is not envisaged that shared virtual disk devices will be used as other technologies such as NAS provide a generic
shared storage offering which is more flexible and cross platform.
Virtual switches
Virtual switches are labelled name-vsw, where name describes the network connection. By convention, the control domain will
present three virtual switches to match the three required network connections of a unix host: two FMA switches will be created
and a single backup switch.
The following shows the three vswitches and their underlying network adaptors. The switches will be automatically configured as
part of the initialise_ldom script:
root@stella01# *ldm list-services primary*
_<... output removed ...>_
VSW
NAME MAC NET-DEV DEVICE MODE
primary-vsw 00:14:4f:d3:db:32 e1000g0 switch@0
backup-vsw 00:14:4f:d3:db:33 e1000g1 switch@1
alternate-vsw 00:14:4f:d3:db:34 e1000g2 switch@2
_<... output removed ...>_
On the control domain, the vswitches are presented as vswN network adaptors where N matches the instance number in the device
column. For example, the backup network would be presented as vsw1. The following shows a typical network configuration on the
control domain with IPMP configured across the primary and alternate vswitches:
root@stella01# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
vsw0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 10.61.83.80 netmask ffffff00 broadcast 10.61.83.255
groupname MAIN
ether 0:14:4f:d3:db:32
vsw0:1: flags=9040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER> mtu 1500 index 2
inet 10.61.83.81 netmask ffffff00 broadcast 10.61.83.255
vsw1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 10.61.17.72 netmask ffffe000 broadcast 10.61.31.255
ether 0:14:4f:d3:db:33
vsw2: flags=69040843<UP,BROADCAST,RUNNING,MULTICAST,DEPRECATED,IPv4,NOFAILOVER,STANDBY,INACTIVE> mtu 1500 index 4
inet 10.61.83.82 netmask ffffff00 broadcast 10.61.83.255
groupname MAIN
ether 0:14:4f:d3:db:34
Virtual networks
Virtual network adaptors are created on a guest domain and connected to a vswitch. Virtual networks will be labelled vnet_xxx,
where xxx is a three-letter code matching the virtual switch name (pri, bak and alt).
The following shows the three virtual network devices for a given guest and the vswitches they connect into:
root@stella01# *ldm ls -l stella04*
_<... output removed ...>_
NETWORK
NAME SERVICE DEVICE MAC
vnet_pri primary-vsw@primary network@0 00:14:4f:f8:ad:46
vnet_bak backup-vsw@primary network@1 00:14:4f:f8:4d:64
vnet_alt alternate-vsw@primary network@2 00:14:4f:fb:f5:67
_<... output removed ...>_
At a guest operating system level, the virtual networks will be presented as vnet devices with the device instance number matching
the DEVICE column in the ldm ls output:
root@stella04# grep network@ /etc/path_to_inst
"/virtual-devices@100/channel-devices@200/network@0" 0 "vnet"
"/virtual-devices@100/channel-devices@200/network@1" 1 "vnet"
"/virtual-devices@100/channel-devices@200/network@2" 2 "vnet"
Tying all the network layers together produces the following stack:
The control domain provides a backup-vsw@primary vswitch connected to the e1000g1 network adaptor.
On the control domain, the backup-vsw@primary vswitch is presented as vsw1 and configured as a network adaptor.
A vnet_bak virtual network device is created for the stella04 guest. This virtual network is connected to the backup-vsw@primary vswitch.
On the guest domain, a vnet1 device is configured as a network adaptor.
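As a sketch, that final step on the guest looks like any physical adaptor configuration (the address here is hypothetical, on
the backup subnet shown earlier):
root@stella04# ifconfig vnet1 plumb
root@stella04# ifconfig vnet1 10.61.17.80 netmask 255.255.224.0 up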
Supporting Technologies
Build automation - jumpstart
Solaris system provisioning is currently achieved via a combination of Jumpstart, post-build install scripts and manual configuration
tasks. This process is time consuming and error prone and should be reviewed as part of the Solaris Virtualisation programme.
The driver for reviewing jumpstart is to provide a timely, repeatable and robust Solaris provisioning environment. One of the key
benefits of a virtualised operating environment is reduced provisioning time, and a move to a virtualised Solaris offering will be
hampered without a correspondingly fast and automated build process.
The following are high level Solaris build requirements that need to be met by an automated Solaris build environment. The list is
not exhaustive, and the first point is expected to expand into a large collection of requirements in its own right:
All automation should be managed within jumpstart: boot net - install should produce an application-ready system.
Standard hardware layout for each server type.
For security reasons, a minimal Solaris install should be performed rather than installing the full package set.
The use of Logical Domains introduces the following requirements of the jumpstart infrastructure (a sample guest profile sketch follows the list):
Automatic system patching should include Solaris and OBP patch requirements for LDOM 1.0.3
For physical T5220 installs, the LDOM 1.0.3 software packages should be automatically installed along with the LDOM tools
package.
The jumpstart configuration should support smaller disk sizes as presented by guest domains.
For guest domains, no root disk mirroring should be performed - virtual devices are already highly available at a control
domain/SAN layer.
For guest domains, IPMP should be configured across vnet devices rather than e1000g devices.
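As an illustration, a guest domain jumpstart profile might look something like this - a sketch only, with hypothetical slice
sizes, using the minimal cluster, the c?d? virtual disk naming and no mirroring configuration:
install_type    initial_install
system_type     standalone
cluster         SUNWCreq
partitioning    explicit
filesys         c0d0s0  10240  /
filesys         c0d0s1  4096   swap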
Monitoring
Solaris monitoring is currently managed with the HP OpenView toolset. There are concerns within the operational team that the
current monitoring configuration does not fully cover the Solaris operating environment.
The monitoring configuration should be reviewed to ensure that the physical device, operating system, network and storage layers
are all sufficiently monitored.
On top of the base Solaris monitoring requirements, Logical Domains introduce a number of monitoring requirements covering
both guests and control domains. It is expected that Logical Domain monitoring will be automatically included in the Solaris
monitoring template.
The following are the monitoring requirements for logical domains (a detection sketch follows the list):
A control domain can automatically be detected by the presence of the /opt/SUNWldm/bin/ldm binary
A guest domain can automatically be detected by the presence of the '/virtual-devices.*"vnet"' pattern in
/etc/path_to_inst
For control domains, three SMF services must be in an online state:
svc:/platform/sun4v/drd:default
svc:/ldoms/ldmd:default
svc:/ldoms/vntsd:default
For guest domains, any standard alerting on unmirrored disks should be disabled, as guest root disks are deliberately left unmirrored.
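A sketch of how these detection and service checks could be automated - illustrative only; the binary path, path_to_inst
pattern and service names are those listed above:
#!/bin/sh
# Classify an LDOM host and, on control domains, check the required services
if [ -x /opt/SUNWldm/bin/ldm ]; then
        # Control domain: the three LDOM SMF services must be online
        for SVC in svc:/platform/sun4v/drd:default \
                   svc:/ldoms/ldmd:default \
                   svc:/ldoms/vntsd:default
        do
                STATE=`svcs -H -o state ${SVC} 2>/dev/null`
                if [ "${STATE}" != "online" ]; then
                        echo "WARNING: ${SVC} is not online on `hostname`"
                fi
        done
elif grep '/virtual-devices.*"vnet"' /etc/path_to_inst > /dev/null 2>&1; then
        echo "`hostname` is a guest domain"
fi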
Capacity Management
At a basic level, the capacity requirements for Logical Domains are the same as for physical servers. Core resources such as CPU,
memory, network and I/O must be recorded to identify over- and under-utilisation, along with trending of when resource
thresholds may be breached in the future.
With regard to CPU and memory, both are hard-allocated resources for a guest domain - there is no sharing between
guests. As such, standard capacity management tools can be loaded onto the guest OS and will accurately report capacity data.
Capacity management for disk resources splits into two areas: utilisation and performance.
Storage utilisation should be captured and trended to predict when storage allocations will fill up. From a guest operating system
layer, storage is presented and utilised in the same way as for physical systems so standard capacity management tools will be
able to record and trend this data.
Gathering storage performance information at a guest domain level is not currently possible as the virtual disk driver does not
measure I/O activity or save kstats which could subsequently be read by the iostat command. This issue is being tracked as Sun
Bug ID 6503157. Storage performance monitoring will need to be implemented on the control domain where the underlying physical
storage resides. The use of veritas volumes to provide virtual storage for guest domains will assist the identification of heavy disk
activity as per-guest I/O counters will be available via the vxstat command on the control domain.
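For example, per-guest volume I/O could be sampled on the control domain along these lines (the disk group name here is
hypothetical; the volume names match the virtual disk naming convention):
root@stella01# vxstat -g ldomdg -i 5 -c 2 dev-stella04.os dev-stella04.dat01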
As for storage, capacity management for network resources splits into utilisation and performance. The virtual network drivers on
the guest domain expose performance counters meaning that standard capacity management tools loaded on the guest OS will
accurately report capacity data. As the network layer is shared between all guest domains, network utilisation should also be
measured on the control domain.
Measuring a guest's performance data on a different host (its control domain) highlights an area where an accurate and automated
inventory system is required to group guest and control domains together for the system administrator.
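In the interim, the guest-to-control mapping can be recovered manually by running ldm ls on each control domain -
illustrative output only, with hypothetical values:
root@stella01# ldm ls
NAME      STATE   FLAGS  CONS  VCPU  MEMORY  UTIL  UPTIME
primary   active  -n-cv  SP     8    4G      0.2%  10d
stella04  active  -n---  5000  16    8G      1.5%   3d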
Backup and recovery
Veritas NetBackup 6.0 is the standard backup and recovery platform within Fidelity. LDOMs are supported by Symantec as detailed in
the NetBackup 6 OS Compatibility matrix.
Both control and guest domains should be added to NetBackup in the standard manner. Recovery scenarios and extra steps
required for logical domains are detailed in the Logical Domain Administration section of this document.
Capacity Planning
Inventory
Outstanding work
The following is a list of related work streams that need to be completed to allow the successful deployment of logical domains
within Fidelity:
Jumpstart
Monitoring
Inventory
Capacity planning
Operational readiness, training and handover
Tech questions
How much memory do we need per guest? Java can only use 2GB so we may only need to buy 16GB systems.
What size SAN LUNs do we use? 50GB is probably portable enough...
What size OS disks do we need? Talk to BAU to get indicative sizes.
Any additional storage?
SYSldom package to be created and automatically delivered via jumpstart
Background and further reading
Jeff Savit's LDOMs Concepts and Examples is a good primer explaining the concepts of Logical Domains.
After reading the presentation, the Beginners Guide to LDoms is a walkthrough on how to configure LDOMs. That document refers to
LDOMs 1.0 rather than 1.0.3, so some of the command output differs slightly, and 1.0.3 introduces some new capabilities (such as
being able to use veritas volumes as virtual storage).
An alternative set of walkthrough documents is Octave Orgeron's series of articles: An Introduction to Logical Domains Part 1, Part
2, Part 3 and Part 4.
Finally, the Logical Domains 1.0.3 Administration Guide and associated Release Notes are the definitive and up-to-date reference
guides.
Appendix
cpuspin.master
#!/bin/sh
# cpuspin.master - spawn cpuspin.child processes until MAXCHILD are running.
# If /tmp/stop exists, kill the running children instead and remove the flag.
MAXCHILD=64
if [ -f /tmp/stop ]; then
        echo "`date '+%H:%M:%S'` Killing children on `hostname`..."
        # the [s] stops egrep from matching its own process entry
        KIDS=`ps -ef | egrep '[s]pin.child' | awk '{print $2}'`
        echo $KIDS | xargs -n 1 kill -9
        rm /tmp/stop
        exit
else
        echo "`date '+%H:%M:%S'` Starting to spawn cpu exercising scripts on `hostname`"
fi
while :
do
        CUR=`ps -ef | egrep '[c]puspin.child' | wc -l`
        if [ ${CUR} -lt ${MAXCHILD} ]; then
                nohup ./cpuspin.child > /dev/null 2>&1 &
        else
                echo "`date '+%H:%M:%S'` There are now ${MAXCHILD} \c"
                echo "cpu exercising scripts running on `hostname`"
                break
        fi
done
cpuspin.child
#!/bin/sh
# cpuspin.child - burn one cpu with a tight no-op loop until killed
while :
do
        :
done
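Usage, as implied by the stop-file check in cpuspin.master:
root@stella04# ./cpuspin.master        # spawn 64 cpu spinners
root@stella04# touch /tmp/stop
root@stella04# ./cpuspin.master        # kill the spinners and remove /tmp/stop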