Sun SPARC Enterprise M3000 Server Architecture
Introduction
Sun SPARC Enterprise M3000 Server Overview
  Meeting the Needs of Commercial and Scientific Computing
System Architecture
  System Component Overview
  System Outline
System Bus Architecture—Jupiter Interconnect
  Jupiter Interconnect Architecture
  Performance of Sun SPARC Enterprise M3000 Server
SPARC64 VII Processor
  SPARC64 VII Overview
  SPARC64 VII Microarchitecture
  Details of the Microarchitecture
  Cache System
  Reliability, Availability, and Serviceability Functions
I/O Subsystem
  I/O Subsystem Architecture
Reliability, Availability, and Serviceability
  Redundant and Hot-Swap Components
  Advanced Reliability Features
  Error Detection, Diagnosis, and Recovery
System Management
  eXtended System Control Facility
  Oracle Enterprise Manager Ops Center Software
Oracle Solaris 10
  Observability and Performance
  Availability
  Security
  Virtualization and Resource Management
Conclusion
Introduction
Organizations now rely on technology more than ever before. Today, computer systems play a
critical role in every function from product design to customer order fulfillment. In many cases,
business success depends on the continuous availability of IT services. Once required only in
pockets of the data center, mainframe-class reliability and serviceability are now essential for
systems throughout an enterprise. In addition, powering data center servers and keeping services
running through a power outage are significant concerns. Environmental considerations also
play a key role, in areas such as power conservation and miniaturization, amid demand to
reduce the load on the environment. New computer systems that consume less power and
emit fewer greenhouse gases can play an essential role in protecting the environment.
Although availability is a top priority, costs must also remain within budget and operational
familiarity maintained. To deliver networked services as efficiently and economically as
possible, organizations look to maximize use of every IT asset through consolidation and
virtualization strategies. As a result, modern IT system requirements reach far beyond simple
measures of compute capacity. Organizations need highly flexible servers with built-in
virtualization capabilities and associated tools, technologies, and processes that work to optimize
server use. With budgets still in mind, new computing infrastructures must also help protect
current investments in technology and training.
A High-Performance, High-Reliability, Ecologically Sustainable
Server: Introducing the Sun SPARC Enterprise M3000 Server
Oracle’s Sun SPARC Enterprise servers are highly reliable, easy-to-manage, vertically
scalable systems with all the benefits of traditional mainframes without the associated cost,
complexity, or vendor lock-in (Figure 1). In fact, Sun SPARC Enterprise servers deliver
mainframe-class system architecture at open system prices.
The Sun SPARC Enterprise M3000 server is the entry-class model. It has many of the
characteristics of the other Sun SPARC Enterprise servers and shares benefits, such as
operability and manageability, that are common to the family. With symmetric multiprocessing
(SMP), a memory subsystem of up to 32 GB, and a high-throughput I/O architecture, the server
can support core business operations. Further, the server runs the powerful Oracle Solaris 10 operating system
(OS) and includes leading virtualization technologies. Through the innovative Oracle Solaris
Containers virtualization technology, the server brings sophisticated resource control to an
open systems platform.
The server combines high performance, high quality, and ecological sustainability with a
resilient system architecture, the advanced functions of Oracle Solaris 10, a compact form
factor (two rack units [2U] in a rack cabinet), and the top CPU power in the entry class of
servers. Moreover, Sun SPARC Enterprise servers offer improved performance over the previous
generations of Sun servers, with a clear upgrade path that protects existing investments in
software, training, and data center practices. By taking advantage of the Sun SPARC Enterprise
M3000 server, IT organizations can create a more powerful infrastructure, optimize hardware
use, and increase application availability, resulting in lower operational costs and risks.
Figure 1. Sun SPARC Enterprise M3000, M4000, M5000, M8000, and M9000 servers include many features to help improve uptime, application
performance, and data center efficiency.
Sun SPARC Enterprise M3000 Server Overview
The Sun SPARC Enterprise M3000 server offers numerous power, reliability, and energy-saving
characteristics useful to enterprises. It features an SMP design that uses the latest
generation of SPARC64 processors connected to memory and I/O by a new high-speed,
low-latency system interconnect, which delivers exceptional throughput to software
applications. Characteristics of the Sun SPARC Enterprise M3000 server are found in Table 1.
Also architected to reduce unplanned downtime, this server includes stellar reliability,
availability, and serviceability (RAS) capabilities to avoid outages and reduce recovery
times. Design features such as high-performance CPU and data path integrity, Memory Extended
ECC, end-to-end data protection, hot-swappable components, fault-resilient power options, and
hardware redundancy boost the reliability of this server. The environment-conscious design of
the Sun SPARC Enterprise M3000 server targets the reduction in energy consumption that
enterprises and data centers require. With the adoption of the SPARC64 VII processor, which
achieves low power consumption while delivering high performance, and a structural design
with improved cooling efficiency and cooling control, the server achieves power savings, space
savings, and quiet operation, all of which reduce the environmental load.
TABLE 1. CHARACTERISTICS OF SUN SPARC ENTERPRISE M3000 SERVER
Enclosure                  Two rack units
SPARC64 VII processors     2.52 GHz
                           5 MB Level 2 cache
                           Four cores
Memory                     Up to 32 GB
                           Eight DIMM slots
Internal I/O slots         Four PCIe
External I/O chassis       None
Internal storage           Serial Attached SCSI
                           Up to four drives
Dynamic system domains     Maximum of one
External I/O connections   One SAS port
Meeting the Needs of Commercial and Scientific Computing
Suiting a wide range of computing environments, the Sun SPARC Enterprise M3000 server
provides the availability features needed to support commercial computing workloads along
with the raw performance demanded by the high-performance computing community (Table 2).
TABLE 2. SAMPLE WORKLOADS FOR THE SUN SPARC ENTERPRISE M3000 SERVER
Adaptive services
Business processing (ERP, CRM, OLTP, batch)
Database
Decision support
Data mart
Web services
System and network management
Application development
Application services
Scientific engineering
System Architecture
Continually challenged to do more with less, IT organizations realize that meeting processing
requirements with fewer, more powerful systems holds economic advantages. In the Sun
SPARC Enterprise M3000 server, the system interconnect, processors, memory subsystem,
and I/O subsystem work together to create a reasonably priced, high-performance platform.
System Component Overview
The design of the Sun SPARC Enterprise M3000 server specifically focuses on delivering
high reliability, outstanding performance, and true SMP throughput. The characteristics and
capabilities of every subsystem work toward this goal. The high-bandwidth system bus,
powerful SPARC64 VII processor chips, high-density memory option, and high-speed PCI
Express (PCIe) technology provide not only reliable performance for enterprise applications,
but also high levels of uptime and throughput.
System Interconnect
Based on mainframe technology, the Jupiter system interconnect enables high performance
and reliability for the Sun SPARC Enterprise M3000 server. The system controller provides
point-to-point connections among the CPU, memory, and I/O subsystems. The system
interconnect delivers as much as 17 GB/sec of peak bandwidth, offering true SMP
throughput. Additional technical details about the system interconnect are found in the
section titled “System Bus Architecture—Jupiter Interconnect.”
SPARC64 VII Processor
The Sun SPARC Enterprise M3000 server uses the SPARC64 VII processor developed by
Fujitsu. The SPARC64 VII processor, which has a multicore and multithreading architecture,
has been designed based on experience in the mainframe computer field accumulated over
several
decades in the pursuit of excellence in reliability and speed. It is built on an advanced
65 nm process, which holds maximum power consumption to 135 watts.
Additional technical details about the SPARC64 VII processor are found in the section
titled “SPARC64 VII Processor.”
Memory
The memory subsystem of the Sun SPARC Enterprise M3000 server accommodates up to 32
GB of memory. The server uses DDR2 DIMMs with two-way memory interleave to
enhance system performance. Available DIMM sizes include 1 GB, 2 GB, and 4 GB.
Further details about the memory subsystem of the Sun SPARC Enterprise M3000 server
are listed in Table 3.
TABLE 3. SUN SPARC ENTERPRISE M3000 SERVER MEMORY SUBSYSTEM SPECIFICATIONS
Maximum memory capacity    32 GB
DIMM slots                 8
Bank size                  4 DIMMs
Number of banks            2
Beyond performance, the memory subsystem of the Sun SPARC Enterprise M3000 server is
built with reliability in mind. ECC protection is implemented for all data stored in main
memory, and the following advanced features foster early diagnosis and fault isolation that
preserve system integrity and raise application availability:
• Memory patrol. Memory patrol periodically scans memory for errors. This proactive
  function prevents the use of faulty areas of memory before they can cause system or
  application errors, improving system reliability.
• Memory Extended ECC. The Memory Extended ECC function provides single-bit error
  correction, supporting continuous processing despite events such as burst read errors,
  which are sometimes caused by memory device failures.
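The principle behind single-bit error correction can be illustrated with a small sketch. This document does not specify the code that Memory Extended ECC actually uses, so the example below substitutes a textbook Hamming(7,4) code purely as a stand-in. The function names are hypothetical and do not correspond to any server interface.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as a 7-bit Hamming codeword (bit positions 1..7)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # parity over positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # parity over positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (data bits after correction, position of the flipped bit or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]   # checks positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]   # checks positions 2, 3, 6, 7
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]   # checks positions 4, 5, 6, 7
    syndrome = s1 + 2 * s2 + 4 * s4  # equals the 1-based position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1         # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], syndrome


word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1                       # simulate a single-bit memory fault
recovered, position = hamming74_correct(stored)
print(recovered == word, position)
```

A production memory ECC works on much wider words and typically also detects double-bit errors. The sketch only shows why one flipped bit can be located and repaired without interrupting processing.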
PCI Express Technology
The Sun SPARC Enterprise M3000 server uses a PCI bus to provide high-speed data
transfer within the I/O subsystem. To support PCIe expansion cards, the server uses a PCIe
physical layer (PCIe PHY) ASIC to manage the implementation of the PCIe protocol. PCIe
technology doubles the peak data transfer rates of the original PCI technology and reaches
a maximum throughput of 20 Gb/sec. In fact, PCIe was developed to accommodate
high-speed interconnects such as Fibre Channel, InfiniBand, and Gigabit Ethernet. Additional
technical details about the I/O subsystem are found in the section titled “I/O Subsystem.”
Service Processor eXtended System Control Facility
Simplifying management of computer systems leads to higher availability levels for hosted
applications. With this in mind, the Sun SPARC Enterprise M3000 server includes the
eXtended System Control Facility (XSCF). The XSCF consists of a dedicated processor that
is independent of the server and runs the XSCF Control Package (XCP) to provide remote
monitoring and management capabilities. This service processor regularly monitors
environmental sensors, provides advanced warning of potential error conditions, and executes
proactive system maintenance procedures as necessary. As long as power is supplied to the
server, the XSCF constantly monitors the platform, even when the system is inactive.
The XCP enables audit administration; hardware control; hardware status monitoring,
reporting, and handling; automatic diagnosis; and domain recovery. Additional
technical details about the XSCF and XCP are found in the section titled “System
Management.”
Power and Cooling
The Sun SPARC Enterprise M3000 server uses separate modules for power and cooling.
Sensors placed throughout the system measure the temperatures at processors and key ASICs
as well as the exhaust temperature. Hardware redundancy in the power and cooling
subsystems, combined with environmental monitoring, keeps the server operating even under
power or fan fault conditions.
Fan Unit
The Sun SPARC Enterprise M3000 server uses fully redundant, hot-swap fans as the
primary cooling system (Table 4). If a single fan fails, the XSCF detects the failure and
switches the remaining fans to high-speed operation to compensate for the reduced airflow.
The server can operate normally under these conditions, allowing ample time to service the
failed unit. Replacement of fan units can occur without interrupting application processing.
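The compensation behavior described above can be sketched as a simple policy. This is an illustration only, not XSCF firmware. The class, the function, and the speed labels ("normal", "high", "stopped") are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Fan:
    name: str
    healthy: bool = True
    speed: str = "normal"   # hypothetical speed levels: "normal", "high", "stopped"


def apply_fan_policy(fans):
    """If any fan has failed, run the remaining healthy fans at high speed."""
    any_failed = any(not fan.healthy for fan in fans)
    for fan in fans:
        if not fan.healthy:
            fan.speed = "stopped"
        else:
            fan.speed = "high" if any_failed else "normal"
    return fans


fans = [Fan("FAN_A"), Fan("FAN_B", healthy=False)]
for fan in apply_fan_policy(fans):
    print(fan.name, fan.speed)
```

Once the failed unit is hot-swapped and reports healthy again, applying the same policy returns every fan to normal speed.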
TABLE 4. POWER AND COOLING SPECIFICATIONS OF SUN SPARC ENTERPRISE M3000 SERVER
Fan units         Two fan units
                  Two 80 mm fans
                  1+1 redundant
Power supplies    470 watts of rated power
                  Two units
                  1+1 redundant
                  Single-phase
Power cables      Two power cables
                  1+1 redundant
Power Supply
The use of redundant power supplies and power cables adds to the fault resilience of the Sun
SPARC Enterprise M3000 server (Table 4). Power is supplied to the server by redundant
hot-swap power supplies, enabling continuous server operation even if a power supply fails.
Because the power units are hot-swappable, they can be replaced during system operation.
Optional Dual Power Feed
Although organizations can control most factors within the data center, utility outages are
often unexpected. The consequences of a loss of electrical power can be devastating to IT
operations. To enable organizations to reduce the impact of such incidents, the Sun SPARC
Enterprise M3000 server is dual power feed capable. The AC power subsystem in this server
is completely duplicated, so the server can optionally receive power from two external,
independent AC power sources. With a dual power feed, the server continues to operate
even if a single power grid fails. Although the dual power feed configuration and the
redundant power supply configuration cannot be used together, the redundancy of either
one increases system availability.
Operator Panel
The Sun SPARC Enterprise M3000 server features an operator panel, which has the
following functions:
• Displaying server status
• Storing server identification and user setting information
• Changing between operational and maintenance modes
• Turning on power supplies for a domain
During server startup, the front panel LED status indicators show the status of the XSCF and
server operation (Figure 2).
Figure 2. The server operator panel of the Sun SPARC Enterprise M3000.
System Outline
The Sun SPARC Enterprise M3000 server is an economical, high-power compute platform
with enterprise-class features. This server is designed to reliably carry data center workloads
that undertake core business operations. The Sun SPARC Enterprise M3000 server enclosure
measures 2U and supports one four-core SPARC64 VII processor chip and 32 GB of memory.
In addition, the server features four short internal PCIe slots, four internal disk drives, one
internal DVD drive, and an external SAS port for attaching additional storage or a tape
device. Two power supplies and two fan units power and cool the server. Front and rear views
and a component diagram of the Sun SPARC Enterprise M3000 server are found in Figure 3
and Figure 4.
Figure 3. Sun SPARC Enterprise M3000 server components.
Figure 4. Front and rear views of the Sun SPARC Enterprise M3000 server.
System Bus Architecture—Jupiter Interconnect
The ability to deliver fast, predictable performance for a broad set of CPU applications rests
largely on the capabilities of the system bus. The Sun SPARC Enterprise M3000 server
uses a system interconnect designed to deliver consistent low latency. The Jupiter system
bus benefits IT operations by delivering balanced and predictable performance for
application workloads.
Jupiter Interconnect Architecture
The Jupiter interconnect design maximizes the overall performance of the Sun SPARC
Enterprise M3000 server. Implemented as point-to-point connections that use packet-
switched technology, this system bus provides fast response times by transmitting multiple
data streams. Packet switching allows the interconnect to operate at a much higher
systemwide throughput by eliminating “dead” cycles on the bus. All routes are
unidirectional, contention-free paths with multiplexed addresses, data, and control plus ECC
in each direction.
System controllers within the Jupiter interconnect architecture direct traffic among
CPUs, memory, and I/O subsystems.
System Interconnect Reliability Features
The built-in redundancy and reliability features of the Sun SPARC Enterprise M3000 server
system interconnect enhance the stability of this server. The Jupiter interconnect protects
against loss or corruption of data with full ECC protection on all system buses and in
memory. When a single-bit data error is detected in a CPU, memory, or an I/O controller,
hardware corrects the data and performs the transfer.
Sun SPARC Enterprise M3000 System Interconnect Architecture
The Sun SPARC Enterprise M3000 system is implemented within a single motherboard. This
server design features one logical system board with one system controller. The system
controller is connected to CPUs, memory, and the I/O controller (PCIe bridge), as shown in
Figure 5.
Figure 5. Sun SPARC Enterprise M3000 server system interconnect.
Performance of Sun SPARC Enterprise M3000 Server
The high bandwidth and overall design of the Jupiter system interconnect maximize the
performance of the Sun SPARC Enterprise M3000 server. Theoretical peak system
throughput,
I/O bandwidth numbers, and stream benchmark results for the Sun SPARC Enterprise
M3000 server are found in Table 5.
TABLE 5. PERFORMANCE OF SUN SPARC ENTERPRISE M3000 SERVER
Theoretical system bandwidth at peak time (a) (GB/sec)    17
Theoretical I/O bandwidth at peak time (b) (GB/sec)       4
Triad results of stream benchmark (GB/sec)                4.5
Copy results of stream benchmark (GB/sec)                 5.6
(a) The theoretical system bandwidth at peak time is calculated by multiplying the bus width by the bus frequency between the system
controller and memory.
(b) The theoretical I/O bandwidth at peak time is calculated by multiplying the bus width by the bus frequency between the system controller
and PCI bridge.
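The footnote formula and the relationship between measured and theoretical figures can be worked through in a few lines. The bus width and bus frequency of the server are not stated in this document, so the 32-byte, 500 MHz bus below is a hypothetical example chosen only to exercise the formula. Treating 1 GB as 10^9 bytes is also an assumption. The 17, 4.5, and 5.6 GB/sec values come from Table 5.

```python
def peak_bandwidth_gb_per_sec(bus_width_bytes, bus_frequency_hz):
    """Peak bandwidth = bus width x bus frequency, in GB/sec (assuming 10^9 bytes per GB)."""
    return bus_width_bytes * bus_frequency_hz / 1e9


def fraction_of_peak(measured_gb_per_sec, peak_gb_per_sec):
    """Share of the theoretical peak that a measured result achieves."""
    return measured_gb_per_sec / peak_gb_per_sec


# Hypothetical bus, for illustration only: 32 bytes wide, clocked at 500 MHz.
example_peak = peak_bandwidth_gb_per_sec(32, 500e6)

# Values published in Table 5.
PEAK_SYSTEM_GB_PER_SEC = 17.0
STREAM_TRIAD_GB_PER_SEC = 4.5
STREAM_COPY_GB_PER_SEC = 5.6

triad_fraction = fraction_of_peak(STREAM_TRIAD_GB_PER_SEC, PEAK_SYSTEM_GB_PER_SEC)
copy_fraction = fraction_of_peak(STREAM_COPY_GB_PER_SEC, PEAK_SYSTEM_GB_PER_SEC)

print(f"Example bus peak: {example_peak} GB/sec")
print(f"Triad reaches {triad_fraction:.1%} of theoretical peak")
print(f"Copy reaches {copy_fraction:.1%} of theoretical peak")
```

A gap between sustained benchmark results and a theoretical peak is expected, because the peak figure ignores protocol overhead and memory access patterns.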
SPARC64 VII Processor
The SPARC64 Series consists of SPARC processors developed by Fujitsu for UNIX servers.
The SPARC64 V delivered high-reliability technology consistent with the mainframe class,
along with a clock frequency exceeding 1 GHz. The SPARC64 VI achieved high throughput by
using the SPARC64 V as a base and incorporating a two-core, two-threads-per-core
architecture. The latest SPARC64 VII improves throughput further by incorporating a
four-core architecture and by modifying the multithreading mechanism. The Sun SPARC
Enterprise M3000 server uses this SPARC64 VII processor.
SPARC64 VII Overview
The SPARC64 VII is the latest processor developed by Fujitsu for the SPARC64 Series. It uses
65 nm technology and has an operating frequency of 2.5 GHz. The chip measures 21.3 mm by
20.9 mm and has four built-in cores with a shared 5 MB Level 2 (L2) cache configuration.
The operating power consumption is 135 watts.
Fujitsu designed the SPARC64 VII for increased throughput while maintaining the high
performance and high reliability that have been realized with the existing SPARC64 VI. For
increased throughput, the number of built-in cores has been increased from two to four, and
the multithreading mechanism to be used has been changed from vertical multithreading
(VMT) to simultaneous multithreading (SMT). The L2 cache is configured to be shared by
the four cores, and the throughput has been doubled so that data can be supplied to the four
cores. Also, especially with the field of high-performance computing in mind, an intercore
high-speed synchronization mechanism called hardware barrier has been implemented.
SPARC64 VII Microarchitecture
This section provides an overview of the microarchitecture of the SPARC64 VII. Although
the basic structure of the core pipeline of the SPARC64 VII is the same as that of the
SPARC64 VI, it uses SMT technology instead of VMT technology to implement
multithreading. As shown in Figure 6, the SPARC64 VI processor uses VMT technology to
run two threads on each core, although only one thread is active at any given time. Within
the VMT model, a latency event or specific trigger must occur for processing to switch over
to the alternate thread. With SMT technology, both threads within each core of the SPARC64
VII processor can execute simultaneously. As a result, the SPARC64 VII offers the potential to
achieve greater throughput and performance. As shown in Figure 7, two threads can be
executed simultaneously on each of the four cores.
Figure 6. SPARC64 VI VMT processing mode.
In the SMT design, Fujitsu focused on eliminating interference between threads as much as
possible. As a rule, the chip is configured so that the hardware resources for one thread are
isolated from those of the other when both threads are running. In contrast, when either
thread is idle, the other thread can use the resources of both threads, with a few
exceptions. Thus, the chip has been designed to provide higher performance than in
single-thread operation. Both threads share the core pipeline. However, the pipeline is
controlled so that, even if one thread stalls, processing in the other thread is not blocked.
In the instruction fetch, instruction decode, and commit stages, one of the two threads is
selected in each cycle.
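The difference between the two multithreading models can be made concrete with a toy cycle count. The sketch below is a deliberately idealized model, not a description of the actual pipelines. Each thread is a string of tokens in which "C" is one cycle of computation and "S" is one cycle spent waiting on a latency event. The SMT model assumes the two threads never contend for resources. The function names and the token encoding are hypothetical.

```python
def run_vmt(threads):
    """One thread owns the pipeline; control switches only when it stalls or finishes."""
    pos = [0] * len(threads)
    active = 0
    cycles = 0
    while any(p < len(t) for p, t in zip(pos, threads)):
        cycles += 1
        state = [t[p] if p < len(t) else None for p, t in zip(pos, threads)]
        if state[active] != "C":
            # Latency event (or end of thread): hand the pipeline to a ready thread.
            ready = [i for i, s in enumerate(state) if s == "C"]
            if ready:
                active = ready[0]
        for i, s in enumerate(state):
            # Waits elapse in the background; only the active thread computes.
            if s == "S" or (s == "C" and i == active):
                pos[i] += 1
    return cycles


def run_smt(threads):
    """Every thread makes progress in every cycle (idealized: no contention)."""
    pos = [0] * len(threads)
    cycles = 0
    while any(p < len(t) for p, t in zip(pos, threads)):
        cycles += 1
        for i, t in enumerate(threads):
            if pos[i] < len(t):
                pos[i] += 1
    return cycles


workload = ["CCSSCC", "CCCC"]   # thread 0 waits for two cycles mid-stream
print("VMT cycles:", run_vmt(workload))
print("SMT cycles:", run_smt(workload))
```

In this model VMT hides the stall of one thread behind the work of the other, while SMT also overlaps the computation itself, which is the source of its throughput advantage.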
Figure 7. SPARC64 VII SMT processing mode.
Details of the Microarchitecture
As shown in Figure 8, a core of the SPARC64 VII is divided into the instruction fetch block
and instruction execution block. The instruction fetch block contains the primary cache
dedicated for instructions (L1I cache), and the instruction execution block contains the
primary cache for operands (L1D cache).
Figure 8. Functional diagram of the SPARC64 VII core.
Instruction Fetch Block
The instruction fetch block, which operates independently of the instruction execution block,
uses branch prediction to fetch the series of instructions that are expected to be executed
into the instruction buffer (IBUF). The IBUF has a capacity of 256 bytes and can store
up to 64 instructions. When both threads are running, the IBUF is divided evenly between
the threads. If instruction execution stalls, the instruction fetch continues until the IBUF
becomes full. Conversely, if the instruction fetch pauses for a reason such as a cache miss,
execution can continue as long as the IBUF contains instructions. An instruction fetch can
start in every cycle, and each fetch brings in 32 bytes, or eight instructions. Because the
throughput of instruction execution is up to four instructions per cycle, the instruction fetch
provides twice the throughput that execution can consume. By separating (decoupling)
instruction fetch from instruction execution, the IBUF conceals the latency of the
large-capacity primary instruction cache.
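The decoupling effect can be sketched with the figures given above: a 256-byte buffer, 32 bytes fetched per cycle, and up to four instructions executed per cycle. The four-byte instruction size follows from 256 bytes holding 64 instructions. The simulation is a simplified illustration. It assumes a single thread and lets fetched instructions be consumed in the same cycle, and the function name is hypothetical.

```python
IBUF_BYTES = 256
INSTRUCTION_BYTES = 4                                # 256 bytes / 64 instructions
IBUF_CAPACITY = IBUF_BYTES // INSTRUCTION_BYTES      # 64 instructions
FETCH_PER_CYCLE = 32 // INSTRUCTION_BYTES            # 8 instructions
EXECUTE_PER_CYCLE = 4


def simulate_ibuf(cycles, fetch_stalled_cycles=()):
    """Return how many instructions reach execution in each cycle."""
    buffered = 0
    executed = []
    for cycle in range(cycles):
        if cycle not in fetch_stalled_cycles:
            # Fetch runs ahead of execution until the buffer is full.
            buffered = min(IBUF_CAPACITY, buffered + FETCH_PER_CYCLE)
        issued = min(EXECUTE_PER_CYCLE, buffered)
        buffered -= issued
        executed.append(issued)
    return executed


# Fetch pauses in cycles 4 and 5, but the buffer keeps execution supplied.
print(simulate_ibuf(8, fetch_stalled_cycles={4, 5}))
```

Because fetch delivers twice what execution consumes, the buffer accumulates a surplus during normal operation and drains it during a fetch pause.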
Instruction Execution Block
A core of the SPARC64 VII is divided into the instruction fetch block and instruction
execution block. The instruction execution block operates independently of the instruction
fetch block.
Instruction Decode and Issue
In the instruction decode and instruction issue stages, the four instructions in the Instruction
Word Register (IWR) are decoded simultaneously, and the resources required for execution
(various reservation stations, fetch port and store port, and register update buffer) are
determined. The hardware then checks whether those resources are free. If they are, the
resources are allocated, the instructions are given instruction identifications (IIDs) ranging
from 0 to 63, and the instructions are issued. In other words, the maximum number of
in-flight instructions is 64. When both threads are running, the maximum number of in-flight
instructions for each thread is 32. In each cycle, instructions from one of the two threads
are decoded, and the threads are switched alternately. When an instruction is issued, the IWR
is released. For the instruction in any slot of the IWR, there are no restrictions on the
allocation of resources such as reservation stations, and there are no restrictions on
instruction-type combinations. Therefore, as long as there are free resources, instructions can
be issued. Even if there is insufficient space for four instructions, as many instructions as
possible are issued in program order. By eliminating issue stall conditions as much as
possible in this way, the processor sustains a high degree of parallelism for any binary code.
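The issue rules above can be sketched as follows. This is an interpretation under stated assumptions rather than the actual issue logic. It assumes that issue stops at the first instruction whose resource is unavailable, which is one reading of "as many instructions as possible are issued in program order". Each instruction is reduced to the name of the single reservation station it needs. The function name and parameters are hypothetical.

```python
from collections import deque

ISSUE_WIDTH = 4          # instructions decoded together from the IWR
TOTAL_IIDS = 64          # IIDs range from 0 to 63


def issue_cycle(iwr, free_iids, thread_in_flight, both_threads_running, free_entries):
    """Issue instructions from the IWR in program order until one is blocked.

    iwr: resource name needed by each decoded instruction, oldest first
    free_iids: deque of unallocated instruction identifications
    thread_in_flight: instructions this thread already has in flight
    free_entries: dict mapping resource name -> number of free entries
    """
    thread_limit = TOTAL_IIDS // 2 if both_threads_running else TOTAL_IIDS
    issued = []
    for resource in iwr[:ISSUE_WIDTH]:
        if not free_iids or thread_in_flight + len(issued) >= thread_limit:
            break                                  # no IID available for this thread
        if free_entries.get(resource, 0) == 0:
            break                                  # keep program order: stop here
        free_entries[resource] -= 1
        issued.append((free_iids.popleft(), resource))
    return issued


iids = deque(range(TOTAL_IIDS))
entries = {"RSE": 1, "RSF": 2, "RSA": 1}
print(issue_cycle(["RSE", "RSF", "RSE", "RSA"], iids, 0, False, entries))
```

In the example, the third instruction finds no free RSE entry, so only the first two instructions are issued in that cycle and the rest wait.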
Instruction Execution
A decoded instruction is registered in a reservation station. The SPARC64 VII has
reservation stations for integer operations (reservation station for execution [RSE]) and
reservation stations for floating-point operations (RSF). The RSEs and RSFs are each divided
into two queues for the execution units, so four reservation stations are provided for
operations: RSEA, RSEB, RSFA, and RSFB. Each instruction stored in a reservation
station is dispatched to the execution unit that corresponds to that reservation station as its
source operands become ready. Therefore, four operations can be dispatched simultaneously.
Basically, the oldest instruction that can be dispatched (oldest ready) is selected from the
instructions in a reservation station. However, when a register to be updated by a load
instruction is used as a source operand for an operation, the operation is dispatched
speculatively, before the result of the load instruction is obtained. Whether the speculation
succeeded is then determined in the execution stage. This technique, called speculative
dispatch, conceals the latency of the pipeline for cache access and increases the utilization
of the execution units. In addition to the RSEs and RSFs, there are reservation stations for
branch instructions (RSBR) and reservation stations for calculating the addresses of
load/store instructions (reservation station for address generation [RSA]).
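The oldest-ready selection rule can be illustrated with a short sketch. It models only the basic rule, with one pick per reservation station per cycle, and omits speculative dispatch. The `Entry` class, its `age` field (smaller means older), and the register names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    age: int            # smaller value = older instruction (hypothetical ordering key)
    sources: tuple      # names of the source registers this instruction needs


def select_oldest_ready(entries, ready_registers):
    """Pick the oldest entry whose source operands are all available."""
    candidates = [e for e in entries
                  if all(src in ready_registers for src in e.sources)]
    return min(candidates, key=lambda e: e.age, default=None)


def dispatch_cycle(stations, ready_registers):
    """Dispatch at most one instruction from each reservation station."""
    picks = {}
    for name, entries in stations.items():
        chosen = select_oldest_ready(entries, ready_registers)
        if chosen is not None:
            entries.remove(chosen)
            picks[name] = chosen
    return picks


stations = {
    "RSEA": [Entry(0, ("r5",)), Entry(1, ("r1",)), Entry(2, ())],
    "RSEB": [Entry(3, ("r1", "r2"))],
    "RSFA": [],
    "RSFB": [Entry(4, ("f0",))],
}
print(dispatch_cycle(stations, ready_registers={"r1", "r2", "f0"}))
```

In RSEA the oldest entry is still waiting for `r5`, so the next-oldest ready entry is dispatched instead, which is the out-of-order behavior the text describes.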
Instruction Commit
All results of instructions that are executed out of order are stored once in the GPR Update
Buffer (GUB) and FPR Update Buffer (FUB) work registers, which are not visible to software.
To preserve the instruction order of the program, registers such as the general-purpose
registers (GPR) and floating-point registers (FPR), as well as memory, are updated in
program order in the commit stage. Control registers such as the program counter (PC) are
updated at the same time. Precise interrupts are guaranteed, and instructions that are still
executing can always be canceled. This approach, called the synchronous update method, not
only makes it easier to re-execute instructions after a branch misprediction, but also
contributes to increased RAS, as explained later in this document. A maximum of four
instructions can be committed at one time. The instruction commit stage is shared by the two
threads, and one of the threads is selected in each cycle to execute commit processing.
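In-order commit from an update buffer can be sketched as follows. The sketch is a simplified, single-thread illustration of the principle. The data structures and names are hypothetical, and a plain dictionary stands in for the GUB and FUB work registers.

```python
COMMIT_WIDTH = 4   # at most four instructions commit at one time


def commit_cycle(window, update_buffer, registers):
    """Commit finished instructions in program order, stopping at the first unfinished one.

    window: list of (iid, destination register), oldest instruction first
    update_buffer: dict iid -> result, filled out of order as instructions finish
    registers: dict of software-visible register values, updated only here
    """
    committed = []
    while window and len(committed) < COMMIT_WIDTH:
        iid, destination = window[0]
        if iid not in update_buffer:
            break                         # an older instruction is still executing
        registers[destination] = update_buffer.pop(iid)
        committed.append(iid)
        window.pop(0)
    return committed


window = [(0, "g1"), (1, "g2"), (2, "g3")]
update_buffer = {0: 10, 2: 30}            # instruction 2 finished before instruction 1
registers = {}

print(commit_cycle(window, update_buffer, registers), registers)
update_buffer[1] = 20                     # instruction 1 finishes later
print(commit_cycle(window, update_buffer, registers), registers)
```

Because the result of instruction 2 stays in the buffer until instruction 1 commits, discarding the buffer contents is enough to cancel uncommitted work, which is what makes precise interrupts possible.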
Cache System
As shown in Figure 9, the cache memory of the SPARC64 VII has a two-level structure,
consisting of a medium-capacity primary cache (L1 cache) and a high-capacity secondary
cache (L2 cache).
Figure 9. SPARC64 VII processor core and cache design.
The L1 cache consists of a cache dedicated for instructions (L1I cache) and a cache dedicated
for operands (L1D cache). Each of these caches has a capacity of 64 KB, uses the two-way
set associative method, and has a block size of 64 B. The L1D cache is divided into eight
banks on the four-byte address boundaries, and two operands can be accessed at one time.
The L1 cache uses virtual addresses for cache indexes and physical addresses for cache tags.
In the virtually indexed, physically tagged (VIPT) method, consistency can be lost if the same
area of memory is accessed using different virtual addresses, because different indexes are
used for registration
(synonym problem). Through coordination with the L2 cache, the SPARC64 VII resolves
the synonym problem with hardware.
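The origin of the synonym problem can be shown by working out which address bits select a cache set. The L1 parameters (64 KB, two-way, 64 B blocks) come from the text above. The 8 KB page size is an assumption made for this example (it is the base page size commonly used on SPARC systems and is not stated in this document). The function names are hypothetical.

```python
def cache_sets(capacity_bytes, ways, line_bytes):
    """Number of sets in a set-associative cache."""
    return capacity_bytes // (ways * line_bytes)


def index_bit_range(capacity_bytes, ways, line_bytes):
    """Return (lowest, highest) address bit used to select the set."""
    offset_bits = line_bytes.bit_length() - 1
    index_bits = cache_sets(capacity_bytes, ways, line_bytes).bit_length() - 1
    return offset_bits, offset_bits + index_bits - 1


def synonym_bits(capacity_bytes, ways, line_bytes, page_bytes):
    """Index bits above the page offset, which come from the virtual page number."""
    low, high = index_bit_range(capacity_bytes, ways, line_bytes)
    page_offset_bits = page_bytes.bit_length() - 1
    return [bit for bit in range(low, high + 1) if bit >= page_offset_bits]


L1 = dict(capacity_bytes=64 * 1024, ways=2, line_bytes=64)
PAGE_BYTES = 8 * 1024   # assumed page size, not taken from this document

print("Sets:", cache_sets(**L1))
print("Index bits:", index_bit_range(**L1))
print("Bits that can differ between synonyms:", synonym_bits(**L1, page_bytes=PAGE_BYTES))
```

Under these assumptions two index bits lie above the page offset, so two virtual addresses that map to the same physical location can select different sets. That is the inconsistency the hardware must resolve.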
The L2 cache has a maximum capacity of 5 MB, uses a 10-way set associative method, has
a block size of 256 B, and is shared by the four cores. It adopts a two-bank interleaved
structure, so 64 B of data can be read in each cycle. The bus for sending data that is read
from the L2 cache to the L1 cache has a width of 32 B per two cores, and the bus for sending
data from the L1 cache to L2 cache has a width of 16 B per core.
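The L2 geometry and bus widths above imply a few derived figures, worked out below. The calculation assumes that 5 MB means 5 x 2^20 bytes and that each bus moves one full-width transfer per cycle. The cycle counts are therefore illustrative lower bounds, not measured latencies. The helper name is hypothetical.

```python
L2_CAPACITY_BYTES = 5 * 1024 * 1024   # assumes binary megabytes
L2_WAYS = 10
L2_BLOCK_BYTES = 256
L1_BLOCK_BYTES = 64

L2_TO_L1_BUS_BYTES = 32               # per two cores
L1_TO_L2_BUS_BYTES = 16               # per core


def transfer_cycles(num_bytes, bus_bytes_per_cycle):
    """Minimum bus cycles to move num_bytes, assuming one transfer per cycle."""
    return -(-num_bytes // bus_bytes_per_cycle)   # ceiling division


l2_sets = L2_CAPACITY_BYTES // (L2_WAYS * L2_BLOCK_BYTES)
fill_cycles = transfer_cycles(L1_BLOCK_BYTES, L2_TO_L1_BUS_BYTES)
writeback_cycles = transfer_cycles(L1_BLOCK_BYTES, L1_TO_L2_BUS_BYTES)

print("L2 sets:", l2_sets)
print("Cycles to fill one L1 block from L2:", fill_cycles)
print("Cycles to write one L1 block back to L2:", writeback_cycles)
```

The set count comes out as an exact power of two, which is consistent with the binary interpretation of the 5 MB capacity.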
The cache update policies of the L1 cache and L2 cache are both write-back. That is, stored
data is written into only one cache hierarchy. In the write-back method, cache-missed lines
are always loaded onto the cache memory, so that the store operations can be completed by
updating one cache hierarchy. When a store misses the cache, the old data must first be
brought from memory into the cache, even though it is about to be overwritten; when a store
hits the cache, the operation is completed in the cache alone. Because store operations are
generally quite frequent, the write-back method has the advantage of reducing both
intercache traffic and memory access traffic.
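The traffic argument can be illustrated with a toy model. The sketch below is not a model of the SPARC64 VII caches: it assumes a single cache level with unlimited capacity and write-allocate behavior in both policies, and it only counts transfers to the next level. All class and function names are hypothetical.

```python
class TinyCache:
    """Toy single-level cache that counts traffic to the next level."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.lines = {}            # line address -> dirty flag
        self.reads_from_next = 0
        self.writes_to_next = 0

    def store(self, line_addr):
        if line_addr not in self.lines:
            # Store miss: the old line is loaded before it is updated.
            self.reads_from_next += 1
            self.lines[line_addr] = False
        if self.write_back:
            self.lines[line_addr] = True   # update this level only
        else:
            self.writes_to_next += 1       # write-through: forward every store

    def flush(self):
        """Write every dirty line to the next level once."""
        for addr, dirty in self.lines.items():
            if dirty:
                self.writes_to_next += 1
                self.lines[addr] = False

def traffic(write_back, stores):
    """Total transfers to the next level for a sequence of store addresses."""
    cache = TinyCache(write_back)
    for addr in stores:
        cache.store(addr)
    cache.flush()
    return cache.reads_from_next + cache.writes_to_next

stores = [0, 0, 0, 0, 1, 1, 1, 1]          # repeated stores to two lines
write_back_traffic = traffic(True, stores)
write_through_traffic = traffic(False, stores)
```

With repeated stores to the same lines, the write-back variant transfers each line once in and once out, while the write-through variant forwards every store.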
Meanwhile, because the write-back method keeps the latest data in the cache, if an error occurs
in the relevant processor, there is a risk that the error could affect not only the internal
operation of the processor but also the entire system. The SPARC64 VII has powerful RAS
functions to cope with this problem.
Also, a new hardware barrier mechanism has been implemented in the SPARC64 VII. The
hardware barrier mechanism synchronizes the cores in the CPU chip with each other, and
faster synchronization processing can be implemented compared with a conventional
synchronization process realized by software. This mechanism is especially useful in the
high-performance computing area.
Reliability, Availability, and Serviceability Functions
In the SPARC64 VII, RAS functions comparable to those of mainframe computers have
been implemented. With these RAS functions, errors are reliably detected, their effect is
kept within a limited range, recovery processing is tried, error logs are recorded, and
software is notified.
In other words, the basics of RAS functions are thoroughly implemented. Through the
implementation of the RAS functions, the SPARC64 VII provides high reliability, high
availability, high serviceability, and high data integrity as a processor for mission-critical
UNIX servers.
Reliability, Availability, and Serviceability of Internal RAMs
Among the parts of a processor, RAM has the highest error occurrence frequency. Error
detection and correction methods for the SPARC64 VII processor are highlighted in Table 6.
In the SPARC64 VII, because any one-bit error in RAM can automatically be corrected by
hardware without intervention by software, it does not affect software.
TABLE 6. ERROR DETECTION AND CORRECTION METHOD FOR INTERNAL RAMS
TYPE                          ERROR DETECTION AND       ERROR CORRECTION METHOD
                              PROTECTION METHOD
L1 instruction cache   Data   Parity                    Invalidation and reread
                       Tag    Parity + duplication      Rewrite of duplicated data
L1 data cache          Data   SECDED ECC                One-bit error correction using ECC
                       Tag    Parity + duplication      Rewrite of duplicated data
L2 cache               Data   SECDED ECC (a)            One-bit error correction using ECC
                       Tag    SECDED ECC                One-bit error correction using ECC
Instruction TLB               Parity                    Invalidation
Data TLB                      Parity                    Invalidation
Branch history                Parity                    Recovery from branch prediction failure
(a) SECDED: single error correction and double error detection.
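To show what SECDED means in practice, the sketch below implements a textbook extended Hamming (8,4) code: any single flipped bit is corrected, and any two flipped bits are detected but not corrected. This is an illustration only. The actual code word width and layout used in the SPARC64 VII are not described in this document, and the function names are hypothetical.

```python
def encode(data_bits):
    """Encode 4 data bits as an 8-bit extended Hamming code word (list of 0/1)."""
    d1, d2, d3, d4 = data_bits
    word = [0] * 8                  # index 0 = overall parity, 1..7 = Hamming(7,4)
    word[3], word[5], word[6], word[7] = d1, d2, d3, d4
    word[1] = d1 ^ d2 ^ d4          # parity over positions 3, 5, 7
    word[2] = d1 ^ d3 ^ d4          # parity over positions 3, 6, 7
    word[4] = d2 ^ d3 ^ d4          # parity over positions 5, 6, 7
    word[0] = sum(word[1:]) % 2     # overall parity makes the total even
    return word

def decode(word):
    """Return (status, data); status is 'ok', 'corrected', or 'uncorrectable'."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos         # XOR of the positions of all set bits
    overall_ok = sum(word) % 2 == 0
    if syndrome == 0 and overall_ok:
        status = "ok"
    elif not overall_ok:
        # Odd overall parity: exactly one bit flipped; the syndrome names it
        # (syndrome 0 means the overall parity bit itself).
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Nonzero syndrome with even overall parity: two bits flipped.
        return "uncorrectable", None
    return status, [word[3], word[5], word[6], word[7]]
```

A short usage example: encode `[1, 0, 1, 1]`, flip one bit of the result, and `decode` returns the original data with status `corrected`; flip two bits and it returns `uncorrectable`.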
For the L1 cache, L2 cache, and TLB, degradation can be performed separately in way units.
Error occurrence counts are made for each function unit. If the error occurrence count per
unit time exceeds the upper limit, degradation is performed and the relevant way is not
subsequently used. Hardware performs degradation automatically; at the same time, it also
performs the required operation to ensure the continuity of coherency automatically. More
specifically, hardware automatically performs the following: (1) operation that writes back to
the L2 cache all the dirty lines in the way of the L1D cache to be degraded, and (2) operation
that writes back to memory the dirty lines in the way of the L2 cache to be degraded. The
degradation of a way is performed without adversely affecting software, and software
operation is free from any effect except for a slowdown of the processing speed.
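The degradation policy described above can be sketched as a per-way error counter with a time window. The limit, the window length, and all names below are hypothetical; the document does not give the actual thresholds, and the real mechanism is implemented in hardware.

```python
class WayDegrader:
    """Counts errors per cache way in a time window and retires noisy ways."""

    def __init__(self, ways, max_errors, window):
        self.max_errors = max_errors     # hypothetical upper limit per window
        self.window = window             # hypothetical length of the "unit time"
        self.error_times = {way: [] for way in range(ways)}
        self.active = set(range(ways))
        self.written_back = []           # (way, line) pairs flushed before retirement

    def record_error(self, way, now, dirty_lines=()):
        """Record one error; return True if the way is degraded as a result."""
        if way not in self.active:
            return False
        recent = [t for t in self.error_times[way] if now - t < self.window]
        recent.append(now)
        self.error_times[way] = recent
        if len(recent) > self.max_errors:
            # Write dirty lines back to the next level first, so that
            # coherency is preserved when the way stops being used.
            self.written_back.extend((way, line) for line in dirty_lines)
            self.active.discard(way)
            return True
        return False
```

Errors that are spread out over time never exceed the limit, so only a way that fails repeatedly within the window is taken out of use.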
Reliability, Availability, and Serviceability of Internal Registers and Execution Units
The SPARC64 VII also provides error protection for registers and execution units, making
doubly sure that data integrity is guaranteed (Table 7). For integer architecture registers,
ECC is used beginning with the SPARC64 VII to increase reliability. If an error occurs, the ECC
circuit corrects the error. Parity bits have been added to the floating-point architecture
registers and other registers. Also, the parity prediction circuit, residue check circuit, and
other circuits have been added to the execution unit to propagate parity information to
output results. In the unlikely event that a parity error occurs, it is detected, and hardware
automatically re-executes the instruction to attempt recovery as described below. This
function is called instruction retry.
TABLE 7. ERROR DETECTION AND PROTECTION METHOD FOR INTERNAL REGISTERS AND EXECUTION UNITS
TYPE                                                     ERROR DETECTION AND
                                                         PROTECTION METHOD
Register
  Integer register                                       SECDED ECC
  Floating-point register                                Parity
  PC, PSTATE, etc.                                       Parity
  Computation input-output register                      Parity
Execution unit
  Addition and subtraction, division, shift, and         Parity prediction
  graphic operations
  Multiplication                                         Parity prediction + residue check
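The residue check listed for multiplication relies on the identity (a × b) mod m = ((a mod m) × (b mod m)) mod m: the residue of the result is predicted from the operands and compared with the residue of the actual product. The sketch below uses modulus 3, which is an assumption for illustration; the modulus used in the SPARC64 VII is not given in this document.

```python
MODULUS = 3  # assumed check modulus, chosen for illustration only

def residue(value):
    """Residue of a non-negative integer under the check modulus."""
    return value % MODULUS

def checked_multiply(a, b, inject_fault=False):
    """Multiply and compare the product's residue with a predicted residue.

    Returns (product, error_detected).
    """
    product = a * b
    if inject_fault:
        product ^= 1 << 4           # simulate one flipped bit in the result
    predicted = (residue(a) * residue(b)) % MODULUS
    error_detected = residue(product) != predicted
    return product, error_detected
```

A single flipped bit changes the product by a power of two, and no power of two is divisible by 3, so in this sketch every single-bit fault in the product changes the residue and is detected.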
Synchronous Update Method and Instruction Retry
As shown in the explanation of the instruction execution block, the SPARC64 VII uses
the synchronous update method. When an error is detected, all the instructions being
executed at this time are canceled. Interim results before commitment can be discarded,
and only results updated by instructions that have been completed without encountering
any errors remain in programmable resources. Therefore, not only can errors be prevented
from destroying programmable resources, but hardware can also perform an instruction
retry after error
detection. Even in the case of a hang, because stalled instructions can be discarded once and then
retried from the beginning, there is a possibility of recovery.
Instruction retry is triggered by an error and is automatically started. A retry is performed
instruction by instruction to increase the chance of normal execution. When the execution is
completed normally, the state automatically returns to the normal execution state. During
this period, no software intervention is required, and if the instruction retry succeeds, the
error does not affect software. Instruction retry is repeated until the retry count reaches a
threshold; when the threshold is exceeded, the error is reported to software by an interrupt.
The operational flow is shown in Figure 10.
Figure 10. Instruction retry by hardware after error detection.
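The retry flow can be sketched in software terms as follows. This is an analogy only: the real mechanism is implemented in hardware, and the exception class, function names, and threshold value here are hypothetical.

```python
class HardwareError(Exception):
    """Stands in for an error detected during instruction execution."""

def run_with_retry(instruction, retry_threshold):
    """Execute instruction(); on a detected error, discard and retry.

    Returns ("ok", result, retries) when execution completes normally, or
    ("interrupt", None, retries) when the retry threshold is exhausted.
    """
    retries = 0
    while True:
        try:
            # Results become visible only when execution completes without error.
            result = instruction()
            return "ok", result, retries
        except HardwareError:
            if retries >= retry_threshold:
                return "interrupt", None, retries   # report the error to software
            retries += 1

def make_flaky(failures, value):
    """Build an instruction that fails a given number of times, then succeeds."""
    remaining = [failures]
    def instruction():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise HardwareError("simulated transient error")
        return value
    return instruction
```

A transient error that clears within the threshold is invisible to the caller apart from the retry count; a persistent one ends in the reported outcome.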
Increased Serviceability
The SPARC64 VII has error-checking mechanisms in a variety of locations. If an error
occurs, the system is notified of the error through a dedicated interface. On receipt of this
notification,
the XSCF firmware collects error logs through the dedicated interface and analyzes them.
This series of operations does not affect software and is performed in the background.
With the mechanism described above, a system in which the SPARC64 VII is mounted
can identify the location and type of a failure quickly and accurately while continuing
operation. Thus, the system can obtain information useful for preventive maintenance to
increase serviceability.
I/O Subsystem
A growing reliance on computer systems for every aspect of business operations brings along
a need to store and process ever-increasing amounts of information. Powerful I/O subsystems
are crucial to effectively moving and manipulating these large data sets. The Sun SPARC
Enterprise M3000 server delivers exceptional I/O expansion and performance, enabling
organizations to scale systems and accommodate evolving data storage needs.
I/O Subsystem Architecture
The use of PCI technology is crucial to the performance of the I/O subsystem within the Sun
SPARC Enterprise M3000 server. A PCIe bridge supplies the connection between the main
system and all I/O components, such as PCIe slots and internal drives (Figure 11). The PCIe
bus also enables the connection of external I/O devices by using internal PCIe slots.
Figure 11. Sun SPARC Enterprise M3000 server I/O subsystem architecture.
Sun SPARC Enterprise M3000 Server I/O Subsystem
In the Sun SPARC Enterprise M3000 server, a single PCIe bridge mounted on the
motherboard connects all I/O components to the system controllers. The Sun SPARC
Enterprise M3000 server has four PCIe slots.
I/O Devices
Along with a disk device directly integrated into it, the Sun SPARC Enterprise M3000
server supports one internal DVD drive and four internal SAS 2.5-inch hard disk drives.
The Sun SPARC Enterprise M3000 server also supports one external SAS port, which can
be connected to any SAS storage or tape device.
Reliability, Availability, and Serviceability
Reducing downtime, both planned and unplanned, is critical for IT services. System
designs must include mechanisms that foster fault resilience, quick repair, and even rapid
expansion, without impacting the availability of key services. Specifically designed to support
complex, network computing solutions and stringent high-availability requirements, the
system in the Sun SPARC Enterprise M3000 server includes redundant, hot-swap system
components; diagnostic and error recovery features throughout the design; and built-in remote
management features. The advanced architecture of this reliable server enables high levels of
application availability and
rapid recovery from many types of hardware faults, simplifying system operation and lowering
costs for enterprises.
Redundant and Hot-Swap Components
Today's IT organizations are challenged by the pace of nonstop business operations. In a
networked global economy, revenue opportunities remain available around the clock, forcing
planned downtime windows to shrink and, in some cases, disappear entirely. To meet these
demands, the Sun SPARC Enterprise M3000 server employs built-in redundant and hot-swap
hardware to help mitigate the disruptions caused by individual component failures or
changes to system configurations. In fact, these systems are able to recover from hardware
failures, often with no impact to users or system functionality.
The Sun SPARC Enterprise M3000 server features redundant, hot-swap power supplies and
fan units. Also, redundant internal storage can be created by combining hot-swap disk drives
with disk mirroring software. If a fault occurs, these duplicated components can enable
continued operation. Depending upon the component and type of error, the system could
continue to operate in a degraded mode or could reboot, with the failure automatically
diagnosed and the relevant component automatically configured out of the system. In
addition, hot-swap hardware
within the Sun SPARC Enterprise M3000 server speeds service and allows for the
replacement or addition of components, without stopping the system.
Advanced Reliability Features
Advanced reliability features included within the components of the Sun SPARC
Enterprise M3000 server increase the overall stability of this platform. In addition,
advanced CPU integration and guaranteed data path integrity provide for autonomous
error recovery by the SPARC64 VII processor, reducing the time to initiate corrective
action and subsequently increasing uptime.
XSCF and the predictive self-healing feature in Oracle Solaris further enhance the reliability
of Sun SPARC Enterprise servers. The implementation of XSCF and predictive self-healing
for Sun SPARC Enterprise servers enables the constant monitoring of all CPUs and memory.
Depending upon the nature of the error, persistent CPU soft errors can be resolved by
automatically
offlining a thread, core, or entire CPU. In addition, a memory page retirement capability enables
memory pages to be taken offline proactively, in response to multiple corrections to data
access for a specific memory DIMM.
Error Detection, Diagnosis, and Recovery
The Sun SPARC Enterprise M3000 server features important technologies that correct
failures early and keep marginal components from causing repeated downtime. Architectural
advances that inherently increase reliability are augmented by the error detection and
recovery capabilities within the server hardware subsystems. Ultimately, the following
features work together to raise application availability:
• End-to-end data protection detects and corrects errors throughout the system,
ensuring complete data integrity.
• State-of-the-art fault isolation enables the server to isolate errors within component
boundaries and offline only the relevant resources instead of whole components. This
feature applies to CPUs (cores), memory, and I/O devices.
• Constant environment monitoring provides a historical log of all pertinent environmental
and error conditions.
• The host watchdog feature periodically checks the operation of software, including the
domain operating system. This feature also uses the XSCF firmware to trigger error
notification and recovery functions.
• Periodic component status checks are performed to determine the status of many system
devices to detect signs of an impending fault. Recovery mechanisms are triggered to
prevent system and application failures.
• Error logging, multistage alerts, electronic field-replaceable unit identification information,
and system fault LED indicators all contribute to rapid problem resolution.
System Management
Providing hands-on, local system administration for server systems is no longer realistic for
many organizations. Around-the-clock system operation, disaster recovery hot sites, and
geographically dispersed organizations lead to requirements for remote management of
systems. One of the many benefits of Oracle servers is the support for lights-out data centers,
enabling expensive support staff to work at any location with network access. The Sun
SPARC Enterprise M3000 system design, combined with the powerful XSCF, XSCF Control
Package (XCP), and system management software, enables administrators to remotely
execute and control nearly any task. These management tools and remote functions lower
administrative loads, saving organizations time and reducing operational expenses.
eXtended System Control Facility
The XSCF is the core technology of remote monitoring and management capabilities in the
Sun SPARC Enterprise M3000 server. The XSCF consists of a dedicated processor that is
independent of the server system and runs the XCP. The Domain to Service Processor
Communication Protocol (DSCP) is used for communication between the XSCF and the
server. The DSCP runs on a private TCP/IP-based or PPP-based communication link
between the service processor and each domain. As long as input power is supplied to the
server, the XSCF constantly monitors the system even when the domain is inactive.
The XSCF regularly monitors the environmental sensors, provides advance warnings of
potential error conditions, and executes proactive system maintenance procedures as
necessary. For example, the XSCF can initiate a server shutdown in response to temperature
conditions that might lead to physical system damage. The XCP running on the service
processor enables administrators to remotely control and monitor a domain as well as the
platform itself. Using a network or serial connection to the XSCF, operators can effectively
administer the server from anywhere on the network. Remote connections to the service
processor run separately from the operating system and provide the full control and authority
of a system console.
DSCP Network
The DSCP service provides a secure TCP/IP and PPP-based communications link between
the service processor and each domain. Without this link, the XSCF cannot communicate
with the domain. The service processor requires one IP address dedicated to the DSCP
service on the XSCF side of the link and one IP address on the domain side.
eXtended System Control Facility Control Package
The XCP enables users to control and monitor the server system quickly and effectively.
The XCP provides a command-line interface (CLI) and a Web browser user interface that
give administrators and operators access to all system controller functions.
Password-protected accounts with specific administration capabilities also provide system
security for domain
consoles. Communication between the XSCF and individual domains uses an encrypted
connection based on secure shell (SSH) and Secure Sockets Layer (SSL), enabling secure,
remote execution of commands provided by the XCP.
The XCP provides the interface for the following key server functions:
• Audit administration including the logging of interactions between the XSCF and the domains
• Monitoring and control of power to the components inside the Sun SPARC Enterprise
M3000 server
• Interpretation of hardware information presented, and notification of impending
problems such as high temperatures or power supply problems, as well as access to the
system administration interface
• Integration with the fault management architecture of Oracle Solaris 10 to improve
availability through accurate fault diagnosis and predictive fault analysis
• Execution and monitoring of diagnostic programs, such as the OpenBoot PROM (OBP)
and power-on self-test (POST)
Role-Based System Management
The XCP supports role-based system access control through the organization of users into
groups. Different privileges are assigned to each group. Privileges allow a user to perform a
specific set of actions on a specific set of hardware, including physical components,
domains, or physical components within a domain. In addition, a user can possess multiple,
different privileges on any number of domains.
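The group-and-privilege model described above can be sketched as a small data structure. This is an illustration of the concept only and does not represent the XCP implementation; every group, user, action, and target name below is hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Group:
    name: str
    # Maps a target (a domain or component name) to the set of allowed actions.
    privileges: dict = field(default_factory=dict)

@dataclass
class User:
    name: str
    groups: list = field(default_factory=list)

    def can(self, action, target):
        """True if any of the user's groups grants the action on the target."""
        return any(action in group.privileges.get(target, set())
                   for group in self.groups)

# Hypothetical example: one user holding different privileges on different targets.
platform_admins = Group("platform-admins", {"server": {"power_on", "power_off"}})
domain_operators = Group("domain-operators", {"domain0": {"view_status"}})
example_user = User("example-user", [platform_admins, domain_operators])
```

Here `example_user.can("power_off", "server")` is true, while `example_user.can("power_off", "domain0")` is false, because a privilege is always a pairing of an action set with a specific target.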
Platform Management
Oracle Enterprise Manager Ops Center software, as well as other third-party tools, offer
advanced management functions that complement the capabilities of the XCP. To simplify
integration, the XSCF can communicate to system management tools by enabling an SNMP
agent on the service processor. The network interface on the service processor facilitates data
transfer to SNMP managers within third-party management applications. SNMP V1, V2, and
V3 and concurrent access from multiple SNMP managers are supported.
The service processor SNMP agent can export the following types of information to an SNMP
manager:
• System information such as chassis ID, platform type, total number of CPUs, and
total memory
• Hardware configuration
• Domain status
• Power status
• Environmental status
The service processor SNMP agent can supply system and fault event information using
public management information bases (MIBs). The XSCF supports the configuration of the
following two MIBs (configuration commands can be found in Table 8):
• XSCF extension MIB (SP-MIB). Provides information on the status and configuration
of the platform. For fault events, the SP-MIB sends a trap with basic fault information.
• Fault Management MIB (FM-MIB). Records fault event data. The FM-MIB provides
the same detailed information as the FMA MIB in an Oracle Solaris domain. This data
can help service technicians diagnose failures.
TABLE 8. SERVICE PROCESSOR SNMP AGENT CONFIGURABLE FOR ONE OR BOTH MIBS
MIB                  CONFIGURATION COMMAND
SP traps only        setsnmp enable SP_MIB
FMA traps only       setsnmp enable FM_MIB
SP and FMA traps     setsnmp enable
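For scripted administration, the choices in Table 8 can be captured in a small lookup. In the sketch below, only the three command strings come from the table; the helper name and the "SP"/"FMA" labels are hypothetical, and the helper only builds the command text without running anything on the XSCF.

```python
# Command strings are taken from Table 8; the keys are illustrative labels.
SNMP_ENABLE_COMMANDS = {
    frozenset({"SP"}): "setsnmp enable SP_MIB",
    frozenset({"FMA"}): "setsnmp enable FM_MIB",
    frozenset({"SP", "FMA"}): "setsnmp enable",
}

def snmp_enable_command(trap_sources):
    """Return the Table 8 command for the requested trap sources."""
    key = frozenset(trap_sources)
    if key not in SNMP_ENABLE_COMMANDS:
        raise ValueError("trap_sources must be 'SP', 'FMA', or both")
    return SNMP_ENABLE_COMMANDS[key]
```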
Oracle Enterprise Manager Ops Center Software
Controlling a rapidly changing IT infrastructure requires intelligent management tools and an
ability to provision servers efficiently. Oracle Enterprise Manager Ops Center is a highly
scalable data center management platform that provides organizations with systems lifecycle
management and automation processes to help manage data center requirements such as
server consolidation, compliance reporting, and rapid provisioning. This management
platform helps enterprises to provision and administer both physical and virtual data center
assets. Oracle Enterprise Manager Ops Center provides a single console to help discover,
provision, update, and manage globally dispersed heterogeneous IT environments, which may
include Oracle and non-Oracle hardware running Windows, Linux, and Oracle Solaris
operating systems. When used in conjunction with the Sun SPARC Enterprise M3000 server,
this enterprise platform can automate the knowledge necessary for patch lifecycle
management and maintenance. Oracle Enterprise Manager Ops Center can help system
administrators automate software installations, simulation, rollback, compliance checking,
reporting, and many other related activities. In addition, Oracle Enterprise Manager Ops
Center can be used to discover the embedded service tag technology in the service processor
and the domain running on the Sun SPARC Enterprise M3000 server.
Oracle Solaris JumpStart can be implemented in this management solution to provision
Oracle Solaris onto the Sun SPARC Enterprise M3000 server. Oracle Enterprise Manager
Ops Center helps facilitate and control administrative actions from a central location to
ensure accountability and auditing. These automation capabilities can be used for
knowledge-based change management in conjunction with existing configuration management
investments. Taking advantage of Oracle Enterprise Manager Ops Center can help
organizations create a more reliable environment that offers considerable cost savings
through reduced maintenance and faster recovery from downtime.
Oracle Solaris 10
With mission-critical business objectives on the line, enterprises need a robust operating
environment with the ability to optimize the performance, availability, security, and use of
hardware assets. In a class by itself, Oracle Solaris 10 offers many innovative technologies to
help IT organizations improve operations and realize the full potential of Sun SPARC
Enterprise servers.
Observability and Performance
IT organizations need to make effective use of the power of hardware platforms. Oracle Solaris
10 supports near-linear scalability proportional to the number of CPUs (cores) and memory
addressability that reaches well beyond the physical memory limits of even Oracle's largest
server. The following advanced features of Oracle Solaris 10 provide IT organizations with
the ability to identify potential software tuning opportunities and maximize raw system
throughput:
• Oracle Solaris DTrace is a powerful tool that provides a true system-level view of
application and kernel activities, even those running in a Java Virtual Machine. DTrace
software safely instruments the running operating system kernel and active applications
without rebooting the kernel or recompiling (or even restarting) software. By using
this feature, administrators can view accurate and concise information in real time and
highlight patterns and trends in application execution. The dynamic instrumentation that
DTrace provides enables organizations to reduce the time to diagnose problems from days
and weeks to minutes and hours, resulting in faster data-driven fixes.
• The highly scalable, optimized TCP/IP stack in Oracle Solaris 10 lowers overhead by
reducing the number of instructions required to process packets. This technology also
provides support for large numbers of connections and enables server network throughput
to grow linearly with the number of CPUs and network interface cards. By taking
advantage of the Oracle Solaris 10 network stack, organizations can significantly improve
application efficiency and performance.
• The memory handling system of Oracle Solaris 10 provides multiple page size support
to enable applications to access virtual memory more efficiently, improving
performance for applications that use large memory intensively.
• The Oracle Solaris 10 multithreaded execution model plays an important role in enabling
Sun SPARC Enterprise servers to deliver scalable performance. Improvements to the
threading capabilities in Oracle Solaris 10 occur with every release, resulting in
performance and stability improvements for existing applications without recompilation.
Availability
The ability to rapidly diagnose, isolate, and recover from hardware and application faults is
paramount for meeting the needs of nonstop business operations. Longstanding features of
Oracle Solaris provide for system self-healing. For example, the kernel memory
scrubber constantly scans physical memory, correcting any single-bit errors to reduce the
likelihood of those problems turning into uncorrectable double-bit errors. Oracle Solaris 10
takes a big leap forward in self-healing with the introduction of the fault manager and
service manager features. With these features, business-critical applications and essential
system services can continue uninterrupted in the event of software failures, major hardware
component breakdowns, and software misconfiguration problems.
• Fault manager reduces complexity by automatically diagnosing faults in the system and
initiating self-healing actions to help prevent service interruptions. The fault manager
diagnosis engine produces a fault diagnosis once discernible patterns are observed from a
stream of incoming errors. Following error identification, fault manager provides
information to agents that know how to respond to specific faults. Problem components can
be configured out of a system before a failure occurs, and in the event of a failure, this
feature performs automatic recovery and application restart. For example, an agent designed
to respond to a memory error might determine the memory addresses affected by a specific
failure and remove the affected locations from the available memory pool.
• Service manager software converts the core set of services packaged with the operating
system into first-class objects that administrators can manipulate with a consistent set of
administration commands. Using service manager, administrators can take actions on
services including start, stop, restart, enable, disable, view status, and snapshot. Service
snapshots save the complete configuration of a service, giving administrators a way to roll
back any erroneous changes. Snapshots are taken automatically whenever a service starts to
help reduce risk by guarding against erroneous changes. Because service manager is
integrated with fault manager, when a low-level fault is found to impact a higher-level
component of a running service, fault manager can direct service manager to take
appropriate action.
In addition to handling error conditions, efficiently managing planned downtime greatly
enhances availability levels. Tools included with Oracle Solaris 10, such as Oracle Solaris
Flash and Oracle Solaris Live Upgrade, can help enterprises achieve more-rapid and
consistent installation of software, upgrades, and patches, leading to improved uptime.
• Oracle Solaris Flash enables IT organizations to quickly install and update systems with an
Oracle Solaris 10 configuration tailored to enterprise needs. This technology provides tools
for system administrators to build custom rapid-install images (including applications,
patches, and parameters), which can be installed at a data rate close to the full speed of
the hardware.
• Oracle Solaris Live Upgrade provides mechanisms to upgrade and manage multiple
on-disk instances of Oracle Solaris 10. This technology enables system administrators to
install a new operating system on a running production system without taking it offline,
with the only downtime for the application being the time necessary to reboot the new
configuration.
Security
Today's increasingly connected systems create benefits and challenges. While the global
network offers opportunities to increase revenue, enterprises must pay close attention to
security concerns. The most secure operating system on the planet, Oracle Solaris 10
provides features previously found only in the military-grade Trusted Solaris operating
system. These capabilities enable the strong controls required by governments and financial
institutions, but also benefit all enterprises focused on security concerns and requirements
for auditing capabilities.
• The user rights management and process rights management capabilities in Oracle Solaris
work in conjunction with Oracle Solaris Containers to enable multiple applications to
securely share the same domain. Security risks are reduced by granting users and
applications only the minimum capabilities needed to perform assigned duties. Better yet,
unlike other solutions on the market, no application changes are required to take
advantage of these security enhancements.
• The security policy in Oracle Solaris 10 can be extended with labeling features
previously available only in highly specialized operating systems or appliances. These
extensions deliver true multilevel security within a commercial grade operating system,
beneficial to civilian organizations with specific regulatory or information protection
requirements.
• Oracle Solaris 10 provides features that fortify platforms against compromise. Firewall
protection technology included within the Oracle Solaris 10 distribution protects individual
systems against attack. In addition, file integrity checking and digitally signed binaries
within Oracle Solaris 10 enable administrators to verify that platforms remain untouched
by hackers. Secure remote access capabilities also increase security by centralizing the
administration of system access across multiple operating systems.
Virtualization and Resource Management
The economic need to maximize the use of every IT asset often necessitates consolidating
multiple applications onto single server platforms. Virtualization techniques enhance
consolidation strategies one step further by helping organizations create administrative and
resource boundaries between applications within each domain on a server. By taking
advantage of Oracle Solaris Containers and Oracle Solaris Resource Manager software,
organizations can improve resource use and reduce downtime, without additional
software licensing expenses.
Oracle Solaris Containers
Containers provide a breakthrough approach to virtualization and software partitioning,
supporting the creation of many private execution environments within a single instance
of
Oracle Solaris (Figure 13). Within the Container model, each environment holds a unique identity
and maintains resource and namespace isolation. Administrators can configure separate LAN
or virtual LAN connections with exclusive IP stacks for individual Containers, creating
secure separation of network traffic. By supporting fine-grained control over the assignment
of system rights and resources, Containers can ease consolidation efforts.
Applications within containers are isolated, preventing processes in one container from
monitoring or affecting processes running in another container. Even a superuser process
cannot view or affect activity in other containers. Software fault and security isolation
features in Oracle Solaris Containers prohibit poorly behaved applications from impacting
other containers. This isolation supports better administrative control, helping organizations
eliminate error
propagation, unauthorized access, and unintentional intrusions.
Figure 13. Containers isolate applications using flexible software mechanisms.
Hosting multiple applications on one system helps organizations use expensive
resources to greater effect. Using Containers can lead to lower costs by helping IT
organizations harness and provision otherwise idle compute power into a secure, isolated
runtime environment for new deployments. For example, a database, Web server, and batch
application, each running on its own system, can be consolidated onto a single server
configured to give each access to one-third of the available system resources. That same
server can be automatically reconfigured
so that the Web server receives 75 percent of the network bandwidth during peak load
conditions. When applied to test and development environments, Containers can minimize
the need for dedicated test systems and facilitate the implementation of multiple
deployment scenarios with ease. At the end of a testing cycle, administrators can also
rapidly duplicate validated configurations for production deployment. With the ability to
dynamically allocate resources, Containers help improve resource use without increasing
the number of operating system instances to manage.
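The consolidation example above comes down to simple share arithmetic. The sketch below assumes a proportional-share model in which each workload's entitlement to a resource is its share count divided by the total of all shares. The workload names and share values are illustrative and are not taken from any product configuration.

```python
def entitlements(shares):
    """Return each workload's fraction of a resource under proportional sharing."""
    total = sum(shares.values())
    return {name: count / total for name, count in shares.items()}


# Normal operation: equal shares give each workload one-third.
normal = entitlements({"database": 1, "web": 1, "batch": 1})

# Peak load: raising the Web server to 6 of 8 shares gives it 75 percent.
peak = entitlements({"database": 1, "web": 6, "batch": 1})

for name in normal:
    print(f"{name}: normal {normal[name]:.1%}, peak {peak[name]:.1%}")
```

Because entitlements are relative, changing one workload's share count rebalances all workloads without editing each one individually.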
Oracle Solaris Resource Manager
Resource management tools address the needs of consolidation efforts, which require soft
resource boundaries between applications. With no privileges to access underlying
hardware, resource management software leverages operating system controls to govern the
use of CPU, memory, and I/O. Oracle Solaris Resource Manager software enables system
administrators to set and enforce policies that guarantee a share of CPU cycles and virtual
memory space to individual applications. Administrators can also set upper limits on process
count, number of logins, and connect time for each system user ID. In addition, Oracle
Solaris Resource Manager can be used along with other virtualization technologies to
further define resource rights for
each virtualized boundary. In fact, Oracle Solaris Resource Manager enables the dynamic
allocation of processors and individual processor cores to a Container. The power to define
and readily adjust compute resource levels within virtualized environments helps enterprises
improve hardware use and better guarantee the quality of service for individual applications.
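The per-user upper limits mentioned above can be pictured as a policy record compared against current usage. The following is a conceptual sketch only: the class names, field names, and limit values are hypothetical and do not represent the Oracle Solaris Resource Manager interface.

```python
from dataclasses import dataclass


@dataclass
class UserLimits:
    """Hypothetical upper limits for one system user ID."""
    max_processes: int
    max_logins: int
    max_connect_minutes: int


@dataclass
class UserUsage:
    """Hypothetical snapshot of what the user currently consumes."""
    processes: int
    logins: int
    connect_minutes: int


def exceeded(limits, usage):
    """Return the names of the limits that the usage snapshot exceeds."""
    checks = [
        ("processes", usage.processes, limits.max_processes),
        ("logins", usage.logins, limits.max_logins),
        ("connect_minutes", usage.connect_minutes, limits.max_connect_minutes),
    ]
    return [name for name, used, cap in checks if used > cap]


policy = UserLimits(max_processes=50, max_logins=4, max_connect_minutes=480)
snapshot = UserUsage(processes=51, logins=2, connect_minutes=480)
print(exceeded(policy, snapshot))
```

A limit counts as exceeded only when usage is strictly greater than the cap, so a user exactly at the limit is still within policy in this sketch.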
Conclusion
To support the high demand for reliability, manageability, and reduced environmental loads
in data centers, infrastructures need to provide ever-increasing performance and capacity
along with power conservation and miniaturization. Outfitted with the SPARC64 VII
processor developed to provide high performance and low power consumption, a large
memory capacity, a reliable architecture, and a system monitoring feature, Oracle’s Sun
SPARC Enterprise M3000 server delivers new levels of power, availability, and ease-of-use
to enterprises. Organizations using this server can open the door to a new environment,
fostering greater business opportunities and gaining a strategic asset in the quest to get ahead
and stay ahead of the competition.