Intel® Cluster Checker 1.8 User's Guide
Contents

ABOUT THIS DOCUMENT
GETTING STARTED WITH INTEL® CLUSTER CHECKER
1. CONFIGURING INTEL® CLUSTER CHECKER
   1.1. DEFINING THE NODES TO CHECK
   1.2. DEFINING INTEL® CLUSTER CHECKER CONFIGURATION
        1.2.1. List of Nodes
        1.2.2. Altering the Runtime Behavior of the Tool
        1.2.3. Selecting Test Modules
        1.2.4. Configuring Test Modules
        1.2.5. Using Multiple Configuration Files
   1.3. LICENSE FILE PATH CONFIGURATION
   1.4. UPDATING OLD CONFIGURATION FILES
2. RUNNING INTEL® CLUSTER CHECKER
   2.1. VERIFYING CLUSTER CORRECTNESS
        2.1.1. Console Output
        2.1.2. Log Files
        2.1.3. Additional Output
        2.1.4. Command Line Options
        2.1.5. Environment Variables
   2.2. GATHERING CLUSTER INFORMATION
        2.2.1. Command Line Options
3. USER-DEFINED CHECKING
   3.1. CORRECTNESS CHECKING
   3.2. UNIFORMITY CHECKING
4. COPY EXACTLY
5. INTEL® CLUSTER CHECKER TEST MODULES
   5.1. COMPLIANCE TEST MODULES
   5.2. SDK COMPLIANCE TEST MODULES
   5.3. DEFAULT TEST MODULES
   5.4. OPTIONAL TEST MODULES
6. PERFORMANCE TEST MODULES
   6.1. SINGLE-NODE BENCHMARKS
   6.2. PAIR-WISE BENCHMARKS
   6.3. CLUSTER-WIDE BENCHMARKS
7. HETEROGENEOUS CLUSTERS
   7.1. NOMINAL HARDWARE VARIATION
   7.2. SUB-CLUSTERS
   7.3. FAT NODES
8. AUTOMATIC CONFIGURATION
   8.1. OVERVIEW
   8.2. COMMAND LINE OPTIONS
        8.2.1. Automatic Configuration Options
   8.3. CONSOLE OUTPUT AND LOGS
   8.4. CLUSTER NODES AUTOMATIC DISCOVERY
        8.4.1. Configuration Options
   8.5. PERFORMANCE THRESHOLDS AUTOMATIC CONFIGURATION
        8.5.1. Hardware Scanning
        8.5.2. Additional Output
        8.5.3. Benchmarking and Performance Disclaimers
   8.6. AUTOMATIC CONFIGURATION ADVANCED USAGE
        8.6.1. Group Configuration Alternatives
        8.6.2. Heterogeneous Hardware Support
        8.6.3. Single Node Performance
9. THIRD PARTY COPYRIGHT NOTICES
Disclaimer and Legal Information
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's Web Site.
The Intel® Cluster Checker tool is intended to be used by registered Intel® Cluster Ready partners only. The Intel® Cluster Checker tool executes only on systems with Intel® processors.
Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Intel recommends that you evaluate other related products to determine which best meets your requirements.
* Other names and brands may be claimed as the property of others.
Copyright © 2006-2011, Intel Corporation. All rights reserved.
About This Document
Intel® Cluster Checker verifies the configuration and performance of Linux-based clusters and checks compliance with the Intel® Cluster Ready Specification. This User's Guide provides step-by-step instructions for using Intel® Cluster Checker.
This guide is organized into the following sections:
A Getting Started With Intel® Cluster Checker introduction with the basic steps to run the tool.
A complete guide with the details on how to configure, execute and tune Intel® Cluster Checker for specific clusters, divided into the following chapters:
Chapter 1 describes how to configure Intel® Cluster Checker for a specific cluster.
Chapter 2 describes how to execute Intel® Cluster Checker.
Chapter 3 describes how to run custom checks without creating new test modules.
Chapter 4 describes the Copy Exactly feature.
Chapter 5 lists all the test modules included with Intel® Cluster Checker.
Chapter 6 describes the performance benchmarks included in Intel® Cluster Checker.
Chapter 7 describes how to configure Intel® Cluster Checker to recognize and verify heterogeneous clusters.
Chapter 8 describes the automatic configuration feature.
Chapter 9 contains copyright notices for the third party tools that are distributed with Intel® Cluster Checker.
Other documents included with the Intel® Cluster Checker distribution:
• Intel® Cluster Checker Developer's Guide: information on how to create new test modules using the Intel® Cluster Checker plug-in architecture.
• Intel® Cluster Checker Test Module Reference Guide: detailed information about each test module.
Further information and support can be found online at http://www.intel.com/go/cluster.
Getting Started With Intel® Cluster Checker
Overview

Intel® Cluster Checker runs as a sequence of individual tests called test modules. Each test module conducts a specific test on a single cluster node or on the whole cluster, depending on its type. You can find information about what each test module checks and how to configure it in the separate Test Modules Reference Guide.
Simply put, using Intel® Cluster Checker involves the following steps¹:
0. Setting up the run-time environment.
1. Creating a nodes file, which contains the list of cluster nodes.
2. Creating a configuration file.
3. Executing Intel® Cluster Checker, passing your configuration file as a parameter.
4. Analyzing Intel® Cluster Checker's output.
5. Starting over, until all test modules produce the desired results.
This tutorial guides you through the process of running Intel® Cluster Checker to get you started.
Step 0 Set up the run-time environment
To get started, set up the run-time environment for Intel® Cluster Checker by sourcing the initialization script clckvars.sh, located in Intel® Cluster Checker's install path.
$ source <installpath>/clckvars.sh
The initialization script:
• Adds clustercheck to the execution path.
• Enables automatic completion of command line options (when pressing TAB).
• Enables the tool's man pages with the man utility. To see the general man page, just execute:
$ man clustercheck
1 For installation help and information on run-time prerequisites, please read the Release Notes for your version of Intel® Cluster Checker.
To open a test module's man page, use the syntax clck-<test_module_name>. For example, for the hardware_uniformity man page:

$ man clck-hardware_uniformity
Step 1 Create a nodes file
Intel® Cluster Checker must know which nodes to process when running test modules. A nodes file is a list of your cluster nodes' hostnames or IP addresses, one name per line. As a working example, consider a computer cluster called Cluster, made up of four nodes named node1, node2, node3 and node4. In our example, the nodes file looks like this:

# /home/icr/nodesfile: Cluster nodes to process
node1
node2
node3
node4

Listing 1. A simple node definition file. Note the use of '#' to include comments in nodes files. See Defining the Nodes to Check for more information.

Step 2 Create a configuration file

Intel® Cluster Checker's behavior can be customized with configuration files. A configuration file is a validated XML file that holds values that model Intel® Cluster Checker's run-time behavior. The simplest configuration file consists only of the <nodefile> element, which points to an Intel® Cluster Checker nodes file. Following our example:

<!-- My configuration file: /home/icr/myconfig.xml -->
<cluster>
  <nodefile>/home/icr/nodesfile</nodefile>
</cluster>

Listing 2. A simple configuration file.
The configuration file may also contain test module configurations, including a list of which test modules to run and their custom parameters. If no test module configuration is provided (as in this example) then Intel® Cluster Checker runs a predefined subset of test modules. However, bear in mind that test modules must be properly configured to provide meaningful results.
Step 3 Run Intel® Cluster Checker
You can now run Intel® Cluster Checker by calling the clustercheck binary, passing your configuration file as a parameter, for instance:
$ clustercheck myconfig.xml
When executed, Intel® Cluster Checker runs the pertinent test modules in a sequence, providing feedback on the outcome of each test.
NOTE: If no configuration file is passed on the command line, clustercheck searches for a configuration file in the following locations:
1. <install_path>/etc/config.xml
2. /etc/intel/clck/config.xml
Note that this works as a fallback mechanism: the first file found is used.
Step 4 Analyze Intel® Cluster Checker output
During and after execution, Intel® Cluster Checker provides reports on the results of each of the test modules executed. From these reports you can detect flaws or recognize opportunities to improve your cluster's operation.
Intel® Cluster Checker generates two log files with information about its execution. The log file names are derived from the name of the configuration file that was used, plus a time-stamp and a specific suffix. Following our example, these are the log files created by the execution of Intel® Cluster Checker:
File name                    Description
myconfig20110304.085149.out  The .out log file contains the console output that Intel® Cluster Checker prints during execution.
myconfig20110304.085149.xml  The .xml log file is the output of Intel® Cluster Checker's highest verbosity. Being an XML file, it is suitable for parsing with other tools.
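The naming scheme above can be sketched as follows. This is an illustration of the pattern inferred from the example file names (the configuration file's base name plus a YYYYMMDD.HHMMSS time-stamp and a suffix), not code from the tool itself:

```python
import os
from datetime import datetime

def log_file_names(config_path):
    # Base name of the configuration file, without directory or extension
    base = os.path.splitext(os.path.basename(config_path))[0]
    # Time-stamp format inferred from the example names (YYYYMMDD.HHMMSS)
    stamp = datetime.now().strftime('%Y%m%d.%H%M%S')
    # One '.out' (console) log and one '.xml' (full-verbosity) log
    return ['%s%s.%s' % (base, stamp, ext) for ext in ('out', 'xml')]
```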
Step 5 Start over
After addressing the issues raised by the test module results, start over from step 2. Make changes to the nodes file, the configuration file or the cluster itself until the test modules produce no more warnings.
Further reading

Try the following:
$ man clustercheck
$ clustercheck --help
And see also:
• Configuring Intel® Cluster Checker for information on configuration options.
• Running Intel® Cluster Checker for information on customizing Intel® Cluster Checker's run-time behavior.
1. Configuring Intel® Cluster Checker
Intel® Cluster Checker is highly configurable. The default settings are appropriate in most cases, but may not be appropriate for all clusters. Consequently, the most valuable check is one that is optimized for your cluster.
1.1. Defining the Nodes to Check
Typically clusters are composed of many individual nodes where some nodes are used as computational resources, some are used to control the cluster, and some may be used for other purposes, such as storage servers. Intel® Cluster Checker recognizes 3 functional types of nodes: compute, head, and other. Some test modules are only executed to check nodes of a certain type, while others may behave differently depending on the node type.
The cluster nodes are defined in a text file. The most basic file only lists the node names, one per line. For example, the following file defines 4 compute nodes:
# list of nodes to check
node1
node2
node3 # fails intermittently
node4
The ‘#’ symbol has different uses in the nodes file. It may be used to introduce comments by placing it at the beginning of a line or after the name of a node (as in the above example). However, the ‘#’ may also be used for configuration options if it is followed by one of the keywords ‘type:’ or ‘group:’.
If the keyword sequence ‘# type: head’ appears in the comment text on the same line as a node, Intel® Cluster Checker considers the node to be a head node. Similarly, nodes of functional types ‘compute’ and ‘other’ may also be defined. By default, a node without an explicitly defined type is considered a compute node. For example, the following file defines 4 nodes: node1 is a head node, node2 and node3 are compute nodes, and node4 belongs to type other:
# list of nodes to check
node1 # type: head
node2
node3 # type: compute fails intermittently
node4 # type: other
One node may have multiple functions. For example, on small clusters, it is not uncommon for one node to serve as both the head node and a computational resource. The node type definition may be repeated to assign more than one type to a particular node. A comma-separated list may also be used to assign a node to more than one type. For example, the following nodelist file defines 4 nodes: node1 is a combined head and compute node, node2 and node3 are compute/other nodes, and node4 belongs to types head, compute, and other:
# list of nodes to check
node1 # type: head type: compute
node2 # type: compute, other
node3 # type: compute type: other fails intermittently
node4 # type: other, head type: compute
For backwards compatibility, a head node may also be designated using the bare word 'head' immediately following the '#' character²:
node1 # head
Clusters are typically homogeneous, but nodes may differ in some known aspects. For example, some nodes may have more memory than others. Intel® Cluster Checker can be configured to recognize this kind of heterogeneity using the 'group' property. Group assignments are similar to types, except the label is ‘# group:’ and the group name is an arbitrary string. For example, the following file defines 4 nodes: node1 is a head/compute node with extra memory (it belongs to the bigmem group), node2 is a compute node with extra memory and higher frequency processors (it belongs to bigmem and fastcpu), and node3 and node4 are compute nodes with the standard hardware:
# list of nodes to check
node1 # type: compute, head group: bigmem
node2 # group: bigmem group: fastcpu
node3 # fails intermittently
node4
Assigning a node to a group is necessary but not sufficient for Intel® Cluster Checker to recognize heterogeneous nodes. The group name must also be used in the XML configuration file; see 1.2.4 for instructions.
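The ‘type:’ and ‘group:’ annotation rules described in this section can be sketched with a small parser. This is an illustrative sketch of the rules as described, not Intel® Cluster Checker's actual parser (the deprecated bare 'head' form is not handled):

```python
def parse_annotations(comment):
    """Return (types, groups) parsed from a node's comment text.
    After a 'type:'/'group:' keyword, one value is consumed; further
    values are consumed only while the previous one ends with a comma,
    so free comment text like 'fails intermittently' is ignored."""
    types, groups = [], []
    tokens = comment.split()
    i = 0
    while i < len(tokens):
        if tokens[i] in ('type:', 'group:'):
            target = types if tokens[i] == 'type:' else groups
            i += 1
            expecting = True
            while expecting and i < len(tokens) and tokens[i] not in ('type:', 'group:'):
                expecting = tokens[i].endswith(',')   # comma continues the list
                target.append(tokens[i].rstrip(','))
                i += 1
        else:
            i += 1  # free-form comment text
    # a node without an explicit type defaults to 'compute'
    return (types or ['compute']), groups
```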
² This feature should be considered deprecated and may be removed in the future.
1.2. Defining Intel® Cluster Checker Configuration
The Intel® Cluster Checker configuration parameters are specified in an XML file. These parameters control the execution of each test module and define the list of nodes to be checked.
Several example XML configuration files are included with the tool and are located in the <installationpath>/examples/ directory.
Before execution, the configuration file is verified against the Intel® Cluster Checker XML validation schema. This verifies that the correct parameters are used in the correct locations, ensuring the tool behaves as the user expects. Although not recommended, this verification can be disabled using the --force switch (see 2.1.4 for details of this option).
Tip to manually validate the XML configuration file: a W3C XML Schema (clck.xsd) and an XSLT transformer style-sheet (clck.xsl) are included with Intel® Cluster Checker. The schema and style-sheet may be used with third-party XML editors to generate and/or validate a configuration file.
The xmllint tool included in the libxml2 package can be used to check whether a given configuration file follows the required schema. Because the schema validation requires the XML file to be ordered, it is advisable to first order it with the xsltproc command to avoid ordering issues. The command below shows how to order the file examples/example.xml:
xsltproc --output examples/sorted.xml clck.xsl examples/example.xml
Then use xmllint to verify if the file examples/sorted.xml is compliant with the required schema:
xmllint --schema clck.xsd examples/sorted.xml --noout
If references to external XML files are included in the configuration file (see 1.2.5 for instructions on how to include other files), the option --xinclude must also be used in the validation.
1.2.1. List of Nodes
The nodes are typically read from the path specified in the XML configuration file. The nodefile command line option may also be used to specify the file containing the list of nodes (see 2.1.4 for the complete list of command line options).
<nodefile> file </nodefile>
Read the cluster nodes from file. If file begins with '$', it is interpreted as an environment variable, e.g. <nodefile>$PBS_NODEFILE</nodefile>.

The following parameters are supported only for backwards compatibility; their usage is discouraged:

<head> value </head>
This is an alternative method for defining a head node. The value will be added to the list of nodes to be checked as if it appeared in the node list file. This option may be specified more than once to define more than one head node.

<mixedhead/>
Binary flag denoting that all head nodes are also compute nodes.

<node_suffix> value </node_suffix>
The value will be appended to the cluster node names, e.g. mycluster1.mydomain. If the suffix is a domain name, the '.' must be explicitly included with the suffix. The default is no suffix.
1.2.2. Altering the Runtime Behavior of the Tool
The following parameters control how Intel® Cluster Checker operates and are optional.
<alltoallthreshold> value </alltoallthreshold>
Override the default method used to control the combinatorial explosion in all-to-all checks as the number of nodes increases. The runtime behavior is to check all node pairs as long as the number of nodes is less than or equal to the provided value (64 by default). Above that, each node is checked only against its nearest neighbors in the node list, or more if <alltoallthrottle> is specified.
<alltoallthrottle> value </alltoallthrottle>
The value is the number of nearest neighbors used to generate pairs when the number of nodes is greater than the value specified by <alltoallthreshold>. Node neighbors are determined by position in the nodelist file, not the physical arrangement of the nodes. The value is 1 if not specified.
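The pair-selection policy described by these two parameters can be sketched as follows. This is an illustrative sketch of the behavior as described, not the tool's actual implementation:

```python
from itertools import combinations

def node_pairs(nodes, threshold=64, throttle=1):
    # Full all-to-all while the node count is within the threshold
    if len(nodes) <= threshold:
        return list(combinations(nodes, 2))
    # Otherwise pair each node with its next 'throttle' neighbors,
    # by position in the node list (not physical arrangement)
    pairs = []
    for i, node in enumerate(nodes):
        for d in range(1, throttle + 1):
            if i + d < len(nodes):
                pairs.append((node, nodes[i + d]))
    return pairs
```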
<head_tempdir> value </head_tempdir>
Change the default location where temporary directories/files are created locally on the head node to prepare the checks for execution. The default location is /tmp. The value should be the absolute path to an existing directory on the head node with read, write and execute permissions. This value can also be specified with the environment variable CLCK_HEAD_TEMPDIR (see 2.1.5 for details of environment variables). Note that the environment variable takes precedence over the configuration file.

<node_tempdir> value </node_tempdir>
Change the default location where temporary directories/files are created on the nodes for testing purposes. Test modules that include the head node in their checks will use this path when verifying it. The default location is /tmp. The value should be the absolute path to an existing directory on all nodes with read, write and execute permissions. This value can also be specified with the environment variable CLCK_NODE_TEMPDIR (see 2.1.5 for details of environment variables). Note that the environment variable takes precedence over the configuration file.
<processlimit> value </processlimit>
Override the default number of nodes that can be checked simultaneously (checks are parallelized by forking a process for each node). No more than value processes will ever be running concurrently. The default value is 64.
<retry> value </retry>
The number of times a test module should be re-executed if it fails.
<user> value </user>
The system user name to use when running test modules. This setting only affects test modules which are intended for regular users when the tool is run as a privileged user.
<env> export NAME=VALUE </env>
Set user-defined environment variables before executing the test module commands on the cluster nodes. The tag may be repeated to set more than one environment variable. This tag may be used globally for all tests or inside each test module configuration. Note that the resulting environment for a specific test module is the merge of the globally configured environment variables and the ones configured in the test module. If a globally configured environment variable is redefined inside a test module configuration, the global value is overridden for that test module. Example:
<cluster>
  <env> export HOME=/home/$USER </env>
  <nodefile>/etc/intel/clcknodelist</nodefile>
  <test>
    <intel_mpi_rt>
      <env> export I_MPI_PERHOST=1 </env>
      <env> export I_MPI_DEBUG=5 </env>
      <device>rdssm</device>
      <mpipath>/opt/intel/mpirt/3.1</mpipath>
      <processnumber>2</processnumber>
    </intel_mpi_rt>
  </test>
</cluster>
Also note that the environment variables are set in the order they are entered (top to bottom). This is especially important for environment variables whose values depend on others. For C shell, change the keyword 'export' to 'set'.
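The merge rule described above can be sketched as follows; an illustrative sketch only, not the tool's implementation:

```python
def merged_env(global_env, module_env):
    # Variables apply in the order written (top to bottom); a variable
    # redefined inside a test module overrides the global value there.
    env = dict(global_env)   # globally configured variables first
    env.update(module_env)   # per-module values win on conflict
    return env
```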
1.2.3. Selecting Test Modules
The default list of test modules to be executed may be altered at run time by including the following set of tags in the configuration file (see 5.3 for the list of default test modules). Equivalent command-line options are also available (see 2.1.4 for details of each of these options). These configuration file tags and their respective command-line options may be freely mixed. However, command-line options have precedence over the configuration tags.
<exclude_module> test_module </exclude_module>
Individually exclude the test module named test_module. If other test modules depend on the excluded one, they will also be excluded. This tag may be used multiple times to exclude several test modules and is also available as a command line option.

<include_module> test_module </include_module>
Include the test module named test_module. The included test module will run in addition to the standard set of test modules. If the included test module depends on other test modules that are not explicitly included, the required dependencies will also be executed. This option may be used multiple times to include more than one test module and is also available as a command line option.

<include_only_module> test_module </include_only_module>
Ignore the default set of test modules and include only the individual one named test_module. Test modules required to satisfy the dependencies of the included one will also be included. This option may be used multiple times and is also available as a command line option. This option also has precedence over <include_module>, so any modules added with that option will be ignored.
Each default test module has an associated execution level (see Table 6.3). Changing the default execution level with the level command line option will also change the list of test modules to be executed (see 2.1.4 for details of this option).
In addition to the default set of test modules, Intel® Cluster Checker has different predefined sets that can be executed using command line options. These options are: compliance, sdkcompliance, certification and deployment (see 2.1.4 for details of these options). Note that the three configuration tags described in this section also alter the list of test modules included by these options.
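The dependency behavior described in this section can be sketched as follows. The module names and dependency mapping below are hypothetical, for illustration only; real dependencies are defined by the test modules themselves:

```python
def resolve_modules(selected, deps):
    # Walk the dependency graph so every explicitly selected module
    # pulls in the modules it requires, transitively.
    result, stack = set(), list(selected)
    while stack:
        mod = stack.pop()
        if mod not in result:
            result.add(mod)
            stack.extend(deps.get(mod, ()))
    return result
```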
1.2.4. Configuring Test Modules
The individual test modules may be configured for your cluster. The configuration parameters are contained inside the <test> ... </test> XML container. The <test> block should only be specified once in a configuration file. Parameters for a specific test module are further enclosed in a tag matching the name of the test module. For example, to configure the clock_sync test module:
<cluster>
  ...
  <test>
    ...
    <clock_sync>
      <deviation>30</deviation>
    </clock_sync>
    ...
  </test>
</cluster>
Additional documentation about the specific configuration options for each test module can be found in the Intel® Cluster Checker Test Module Reference Guide.
1.2.4.1. Creating Configuration Groups
The group mechanism gives the flexibility of having a test module with different configuration values for different groups of nodes within the cluster. The matching between the configuration values and nodes is based on the groups each node belongs to, as defined in the nodelist (see 1.1 for details on how to create the nodelist). The <group> tag in the configuration file must be located at the next nested level below the name of the test module; otherwise, it will not pass the schema validation. Although not recommended, if the force command line option is used and the tag is located at deeper XML levels, its configuration values will be ignored.

For example, the following configuration for the system_memory test module specifies that nodes belonging to the 'bigmem' group should have 8GB of physical memory, while the default amount of physical memory for the remaining nodes is 4GB, and all nodes should have 4GB of virtual memory:
<system_memory>
  <group name="bigmem">
    <physical>8388608</physical>
  </group>
  <physical>4194304</physical>
  <swap>4194304</swap>
</system_memory>
Group configuration parameters always supersede the default parameters for the nodes belonging to that group. Multiple group containers may be defined per module; the group configuration values will be used only for the nodes that belong to the specified group. If a node does not belong to any group, the test module's general configuration values will be used.
Group names may also be combined. Using the reserved keywords 'OR' and 'AND', expressions can be created so that a configuration container applies to more than one group and/or to the intersection of two or more groups. For example, the following configuration shows four different values for an item: the first applies to nodes that belong to group g1, g2, or g3; the second applies only to nodes that belong to groups g1, g4, and g5; the third applies to nodes that belong to g5 or g6 and also belong to g7; and the last is the default value for nodes that match none of the above.
<system_memory>
  <group name="g1 OR g2 OR g3">
    <physical>8388608</physical>
  </group>
  <group name="g1 AND g4 AND g5">
    <physical>2097152</physical>
  </group>
  <group name="g5 OR g6 AND g7">
    <physical>16777216</physical>
  </group>
  <physical>4194304</physical>
</system_memory>
It is important to consider that expressions are evaluated from left to right and that both operators have the same precedence. Also note that each node can match only one configuration option per test module. If more than one configuration option applies to a node, a message indicating this will be printed and the test module will be skipped. A detailed list of the conflicting nodes is printed when Intel® Cluster Checker is executed with verbosity 3 or higher.
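Since both operators share the same precedence and evaluation is strictly left to right, a compound group name can be read mechanically. The following sketch reuses the groups from the example above and annotates how one such expression is resolved:

```xml
<system_memory>
  <!-- evaluated left to right with equal precedence:
       "g5 OR g6 AND g7" resolves as (g5 OR g6) AND g7,
       i.e. nodes in g5 or g6 that also belong to g7 -->
  <group name="g5 OR g6 AND g7">
    <physical>16777216</physical>
  </group>
  <!-- default for nodes matching no group expression -->
  <physical>4194304</physical>
</system_memory>
```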
1.2.4.2. Using Global Configuration Options
Most test modules have configuration options that are specific to each one (refer to the Test Modules Reference Guide for details). However, some test modules share configuration options. For these cases, the <global_configuration> container can be used to write the configuration option once and avoid repeating it for every test module that uses it. The container should be placed outside the <test> container.
Two kinds of global configuration options are available: single-entry options, used to set paths to utilities and libraries, and one multiple-entry option, used to configure the network fabrics.
Single-entry global configuration options have only one XML tag that provides the information needed for that option. These options are used to configure specific paths required by the test modules. If a global configuration option is defined, all test modules for which that option has meaning will use the globally defined value. However, if a test module has a specific configuration, that value will be used instead; that is, a test module's local configuration takes precedence over the global one. See Table 2.1 for the configuration options available and the test modules they apply to.
For example, the base path of the Intel® MPI Library is used by imb_collective_intel_mpi, imb_pingpong_intel_mpi and hpcc test modules. Therefore, the following configuration:
<test>
  <imb_collective_intel_mpi>
    <benchmark>barrier</benchmark>
    <fabric>
      <device>sock</device>
    </fabric>
    <mpipath>/opt/intel/mpi/3.0</mpipath>
  </imb_collective_intel_mpi>
  <imb_pingpong_intel_mpi>
    <fabric>
      <bandwidth>110</bandwidth>
      <device>sock</device>
      <latency>35</latency>
    </fabric>
    <mpipath>/opt/intel/mpi/3.0</mpipath>
  </imb_pingpong_intel_mpi>
  <hpcc>
    <mpipath>/opt/intel/impi/4.0.0</mpipath>
  </hpcc>
</test>
Can be simplified to:
<global_configuration>
  <mpipath>/opt/intel/mpi/3.0</mpipath>
</global_configuration>
<test>
  <imb_collective_intel_mpi>
    <benchmark>barrier</benchmark>
    <fabric>
      <device>sock</device>
    </fabric>
  </imb_collective_intel_mpi>
  <imb_pingpong_intel_mpi>
    <fabric>
      <bandwidth>110</bandwidth>
      <device>sock</device>
      <latency>35</latency>
    </fabric>
  </imb_pingpong_intel_mpi>
  <hpcc>
    <mpipath>/opt/intel/impi/4.0.0</mpipath>
  </hpcc>
</test>
Note that hpcc will use a different version of Intel® MPI Library because its local configuration overrides the global one.
The global configuration option with multiple entries is used to define the network fabrics to be used by test modules that exercise the Intel® MPI Library (see Table 2.1 for the list of test modules). The <network> container holds the different fabrics, each entered using a <fabric> container. Multiple fabrics can be configured, and each one must include the corresponding MPI device (<device>). Optional attributes are available to specify a custom name for the fabric (name=) and its state (enabled=). The default state is enabled.
There are three options to configure network fabrics in test modules:
1. No local network fabric configuration in the test module. The test module will use every global network fabric in the enabled state. If no global network fabric is available (or enabled), the test module will use its default.
2. In the test module's local configuration, use the <device> tag to refer to a globally configured network fabric by its name (attribute name=) instead of defining an MPI device. In this case, only the specified network fabric will be exercised by the test module.
3. In the test module's local configuration, use the <device> tag to specify an MPI device. In this case, globally configured network fabrics will be ignored.
Note that cases 1 and 2 make it possible to enable or disable network fabrics in one place and affect several test modules.
The following example shows how to create a configuration in which two network fabrics are globally defined.
<global_configuration>
  <mpipath>/opt/intel/impi/3.2</mpipath>
  <network>
    <fabric name="IB" enabled="on">
      <device>rdssm</device>
    </fabric>
    <fabric name="SMETH" enabled="on">
      <device>shm:tcp</device>
    </fabric>
  </network>
</global_configuration>
<test>
  <intel_mpi>
    <device>IB</device>
  </intel_mpi>
  <imkl_hpl>
    <fabric>
      <device>rdssm</device>
      <hpl>0.5</hpl>
      <processnumber>3</processnumber>
    </fabric>
  </imkl_hpl>
</test>
The runtime behavior of this configuration is:
• Every test module that uses the Intel® MPI Library, with the exception of intel_mpi and imkl_hpl, will exercise both network fabrics (both are enabled).
• intel_mpi will only test the network fabric named "IB" in the global configuration.
• imkl_hpl will ignore globally configured network fabrics and will test rdssm.
Note that <device> supports I_MPI_DEVICE and I_MPI_FABRICS styles to specify an MPI network fabric. An I_MPI_DEVICE definition must use one of: sock, shm, ssm, rdma, rdssm. In the case of the I_MPI_FABRICS style, the definition must match {shm,dapl,tcp,ptl,tmi,ofa}:{dapl,tcp,ptl,tmi,ofa}.
Additional Intel® MPI Library options can be provided by using an 'options' XML attribute for the <device> tag. The options will be reordered as required by MPI, placing global ones first.
For instance, the first example below increases the verbosity of the MPI library's run-time messages, the second specifies the TCP network to use, the third enables the multi-rail fabric combination feature using MPI options, and the last shows how to select the tag matching interface (TMI*) transport.
<device options="genv I_MPI_DEBUG 5">ssm</device>
<device options="genv I_MPI_TCP_NETMASK ib0">ssm</device>
<device options="genv I_MPI_OFA_NUM_ADAPTERS 2">shm:ofa</device>
<device options="genv I_MPI_TMI_LIBRARY /usr/lib/libtmi.so genv I_MPI_TMI_CONFIG /etc/tmi.conf genv I_MPI_TMI_PROVIDER mx">tmi</device>
See the Intel® MPI Library Reference Manual for more details on MPI device selection and the available configuration options.
The following table shows the global configuration options available and the test modules for which they apply.
Global configuration options       Test Modules

<cc-path>                          clomp, hpcc, intel_cc, intel_cc_rtl, intel_cce_rtl, intel_mpi_testsuite, memory_bandwidth_stream
<fc-path>                          intel_fc_rtl, intel_fce_rtl, intel_mpi_testsuite
<gcc-path>                         gcc
<ibstat-path>                      dat_conf, openib
<mkl-path>                         hpcc, mflops_intel_mkl
<mpi-path>                         hpcc, imb_collective_intel_mpi, imb_message_integrity_intel_mpi, imb_pingpong_intel_mpi, imkl_hpl, intel_mpi, intel_mpi_internode, intel_mpi_rt, intel_mpi_rt_internode, intel_mpi_testsuite
<network> <fabric> <device>        hpcc, imb_collective_intel_mpi, imb_message_integrity_intel_mpi, imb_pingpong_intel_mpi, imkl_hpl, intel_mpi, intel_mpi_internode, intel_mpi_rt, intel_mpi_rt_internode, intel_mpi_testsuite
<perl-path>                        perl
<python-path>                      python
<ssh-path>                         ssh_version

Table 2.1 Global Configuration Options
When used in the local configuration scope, these parameters are not recognized inside group containers.
1.2.4.3. Altering the Test Module Dependencies
The relation between test modules is defined by a hierarchical structure of dependencies. The hierarchy is built with simple test modules at the top and more complicated ones at the bottom. A graphic of the test module dependency hierarchy is provided inside the documentation folder in the file doc/ICR_Cluster_Checker_Dependencies_Graph.jpg. The dependencies between test modules imply that if one fails, all the test modules depending on it will be skipped. The test module dependencies may be modified using the following parameters:
<add_dependency> test_module </add_dependency>
Add test_module to the list of dependencies for the test module where this option appears. This option may appear multiple times to specify more than one additional dependency.
<remove_dependency> test_module </remove_dependency>
Remove test_module from the list of dependencies for the test module where this option appears. This option may appear multiple times to remove more than one dependency. Warning: this option should be used with extreme caution since it may result in unstable checking behavior.
Note that the genuine_intel test module cannot be removed using this option.
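As an illustrative sketch (the module names and the removed dependency are hypothetical; consult the dependency graph before removing anything), the dependency parameters are placed inside the container of the test module whose dependency list is being altered:

```xml
<test>
  <hpcc>
    <!-- run clock_sync before hpcc -->
    <add_dependency>clock_sync</add_dependency>
    <!-- removing a dependency: use with extreme caution -->
    <remove_dependency>intel_mpi</remove_dependency>
  </hpcc>
</test>
```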
1.2.4.4. Altering the Nodes Checked by a Test Module
Some test modules universally apply to all nodes, while others may only be appropriate for specific types of nodes. The type of nodes checked by a test module may be modified by using the following parameters:
<check_compute> value </check_compute>
Override the default test module behavior as to which types of nodes are checked. Value may be set to any of 'true', 'on', or '1' to configure the test module to include compute nodes in the check, or any of 'false', 'off', or '0' to configure the test module to exclude compute nodes from the check.
<check_dedicated_head> value </check_dedicated_head>
Override the default test module behavior as to which types of nodes are checked. Value may be set to 'true', 'on', or '1' to configure the test module to include dedicated head nodes in the check, or 'false', 'off', or '0' to configure the test module to exclude dedicated head nodes from the check. Dedicated head nodes are nodes that are exclusively head nodes and do not belong to any other node types.
<check_head> value </check_head>
Override the default test module behavior as to which types of nodes are checked. Value may be set to 'true', 'on', or '1' to configure the test module to include head nodes in the check, or 'false', 'off', or '0' to configure the test module to exclude head nodes from the check.
<check_other> value </check_other>
Override the default test module behavior as to which types of nodes are checked. Value may be set to 'true', 'on', or '1' to configure the test module to include other nodes in the check, or 'false', 'off', or '0' to configure the test module to exclude other nodes from the check.
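As a hedged example (the choice of test module is illustrative), the following configuration restricts the clock_sync test module to compute nodes only:

```xml
<test>
  <clock_sync>
    <check_compute>true</check_compute>
    <check_head>false</check_head>
  </clock_sync>
</test>
```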
Note: the <check_dedicated_head> option is provided for backward compatibility only. It is deprecated; the <check_head> parameter should be used instead. If the <check_head> parameter is defined, <check_dedicated_head> will be ignored.

1.2.5. Using Multiple Configuration Files

After defining the XInclude* namespace, a configuration file may use <include> tags to reference external files in order to reuse settings or provide custom alternatives.
For instance, the following configuration:
<cluster>
  <test>
    <speedstep>
      <state>on</state>
    </speedstep>
  </test>
</cluster>
Can be replaced by:
<cluster xmlns:xi="http://www.w3.org/2001/XInclude">
  <test>
    <xi:include href="speedstep.zml"/>
  </test>
</cluster>
Where the contents of speedstep.zml are:
<speedstep>
  <state>on</state>
</speedstep>
1.3. License File Path Configuration
The configuration of the license file path can be done in two ways:
1. Setting the environment variable INTEL_LICENSE_FILE to point to a folder containing a valid Intel® Cluster Checker license file. The folder and the file must have read and execute permissions. Example for Bash* shell:
export INTEL_LICENSE_FILE=/opt/intel/licenses
2. Creating a text file named .flexlmrc in the user's home directory with the following content:
INTEL_LICENSE_FILE=<path_to_folder_containing_the_license>
If option 1 is used, option 2 will be performed automatically after the first execution of the tool.
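Option 2 can also be prepared by hand. The sketch below writes the single required line into a .flexlmrc file; the license folder path is illustrative, and a scratch directory stands in for the user's home:

```shell
# Stand-in for the user's home directory (use $HOME in practice)
home="$(mktemp -d)"
# The license folder path below is an example; point it at your real one
printf 'INTEL_LICENSE_FILE=%s\n' "/opt/intel/licenses" > "$home/.flexlmrc"
cat "$home/.flexlmrc"   # prints: INTEL_LICENSE_FILE=/opt/intel/licenses
```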
1.4. Updating Old Configuration Files
Before trying to execute Intel® Cluster Checker with a configuration file used with older versions of the tool, consider the following items:
• The configuration file contents are automatically verified using an XML validation schema; any non-compliance will prevent execution until resolved. Although not recommended, execution can be forced by using the force flag (see 2.1.4 for details of this option).
• The <version_id> configuration tag is no longer required; the configuration format is now automatically verified and no explicit version definition is needed.
• Configuration options have been updated for the following test modules:
  • the intel_mpi_rt and hpcc test modules allow configuration of the Intel® MPI tuning feature
  • the openib test module offers configuration uniformity and correctness checks
  • the intel_mpi_testsuite test module can be configured to select among several different suites, and also to exclude tests if required
  • the imb_pingpong_intel_mpi test module offers extra configuration tags to customize the behavior of the Intel® MPI Benchmark
For further details on Intel® Cluster Checker test modules and the configuration options refer to the Intel® Cluster Checker Test Modules Reference Guide.
2. Running Intel® Cluster Checker
2.1. Verifying Cluster Correctness
Intel® Cluster Checker has two execution modes: running checks on the cluster and gathering cluster information (see 2.2 for instructions on gathering cluster information). The general form of executing Intel® Cluster Checker is with the following command:
clustercheck xml_config_file options
Intel® Cluster Checker will simultaneously print output to the terminal and write two timestamp-labeled output reports in the current directory (see 2.1.5 for instructions on how to save the output files in an alternative location). The text output report (.out) is identical to the output printed to the terminal. The XML-based report corresponds to the highest verbosity level output and is suitable for parsing by other programs.
The tool may be run by both privileged and unprivileged users. Some test modules may require privileged access to function properly. Such test modules are automatically flagged and skipped if run by an unprivileged user.
Tip for running Intel® Cluster Checker: Creating a special user account expressly for the purpose of running Intel® Cluster Checker is recommended. This setup provides a stable, reproducible environment for performing checks and a convenient way to store the records of past checks.
Tip for systems with limited disk space: By setting the environment variable TEMP, it is possible to have Intel® Cluster Checker decompress its temporary libraries at a user-defined path (the default is /tmp). For example:
TEMP=/home/icr clustercheck
Additionally, use the environment variables CLCK_HEAD_TEMPDIR and CLCK_NODE_TEMPDIR (see 2.1.5 for complete list of the environment variables) to change the temporary locations used during execution.
2.1.1. Console Output
During execution, several configuration and diagnostic messages are provided through console output.
The default output contains a header showing the configuration to be used during execution, the status of each test module and its sub-tests, and finally the overall result. However, the output may change depending on some configuration options and execution modes.
The header shows information such as the command line, user credentials, and start date of the execution. Some configuration details are also shown, for instance the path to the configuration file used and its contents, and the list of nodes checked.
The results report lists all the test modules included in the execution, detailing their names and descriptions. At the default verbosity level, only the failed sub-tests are shown in the console. However, if verbosity is increased (see 2.1.3), the entire list of executed sub-tests is shown, detailing each one's status.
The sub-tests are grouped and sorted by an associated severity, with the severity level shown before each set of results. The severity of the issues found by each sub-test helps determine whether troubleshooting is required.
The available severity levels are SUCCESS, NOTICE, WARNING, ERROR, and CRITICAL. A brief description of each of them can be found on the following table.
Severity Level   Description

SUCCESS          The sub-test passed successfully.
NOTICE           Informational findings or potential errors, such as inconsistencies in uniformity.
WARNING          Non-urgent failures, such as minimum performance deviations.
ERROR            Items to be corrected, such as non-functional benchmarks.
CRITICAL         Significant errors, such as non-working cluster-wide subsystems.
Table 3.1 Sub Tests Severity Levels
In the test modules included in the Intel Cluster Ready compliance set (see 5 for details) the severity of the findings is set to ERROR.
For base test modules such as ping and ssh, the severity of the findings is considered CRITICAL. In the case of performance threshold and deviation sub-tests, the severity depends on how far the results are from the expected values. For performance threshold sub-tests, the result is compared against the value set in the configuration file. For deviation sub-tests, the result is compared against the median (see 6 for more details). A difference of up to 5% is considered WARNING severity, over 20% ERROR, and over 50% CRITICAL. If a performance threshold sub-test has no value set in the configuration file, the severity will be NOTICE.
2.1.2. Log Files
Two types of files are created with the results of each execution. The first is a simple text file replicating the text shown in the console output. As with the console output, the amount of information available in the file is controlled with the verbose option (see 2.1.4 for details of command line options). The name of this file is formed from the input configuration file name plus a time-stamp (indicating the time of execution) and a .out suffix.
The second is an XML* file with the complete list of tests and sub-tests executed. This file always contains all the results for every node in the cluster; for that reason, it is significantly larger than the plain text one. The XML* format makes it suitable for other tools to use as input. The name of this file is also created from the input file name plus a time-stamp, but with an .xml suffix.

The tool uses a simple fallback mechanism to decide where the log files are created. The first valid directory with write permission is used, and the fallback stops at that step. The steps are:
1. The directory defined in the environment variable CLCK_LOG_DIRECTORY if it is defined (see 2.1.5 for details of environment variables).
2. /var/log/intel/clck/ if the tool was installed in the standard way.
3. <installation_directory>/logs/ if the tool was installed in a custom fashion.
4. The current directory from which the tool is executed.
Note that the -report option will use the same fallback mechanism when searching for log files.
For details on how to install the tool refer to section 4 of the Release Notes. Note that the creation of log files may be disabled with the nolog option (see 2.1.4 for details of command line options).
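The fallback order above can be sketched as a small shell function; the directory names mirror the four steps, and the install path argument is an assumption for illustration:

```shell
# Return the first writable directory from the documented fallback chain
pick_log_dir() {
  install_dir="$1"
  for d in "$CLCK_LOG_DIRECTORY" /var/log/intel/clck "$install_dir/logs" "$PWD"; do
    # skip unset entries, keep the first existing writable directory
    if [ -n "$d" ] && [ -d "$d" ] && [ -w "$d" ]; then
      printf '%s\n' "$d"
      return 0
    fi
  done
  return 1
}

# An explicitly set CLCK_LOG_DIRECTORY wins when it is writable
CLCK_LOG_DIRECTORY="$(mktemp -d)"
pick_log_dir /opt/intel/clck
```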
2.1.3. Additional Output
<verbose> value </verbose>
Control the output verbosity. A higher value produces more output. This can also be set via a command line option (see 2.1.4 for the complete list of command line options).
<debug/>
Setting this option creates files named testmodule.timestamp.debug that contain the command(s) executed on each node and the corresponding output. This option can be individually enabled for each test module. The creation of debug files can also be enabled with the debug command line option (see 2.1.4 for details of command line options). The location of debug files follows the same mechanism as log files (see 2.1.2 for details).
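A minimal sketch of enabling debug output for a single test module (assuming, as described above, that <debug/> is placed inside the module's own container; the choice of module is illustrative):

```xml
<test>
  <clock_sync>
    <debug/>
  </clock_sync>
</test>
```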
Tip for resolving reported issues: The reason a test module is failing may not always be readily apparent from the diagnostic messages. Looking at the debug file may reveal a more complete error message than the one printed in the console. Users can also try running the same commands themselves to confirm that the output is the same.
2.1.4. Command Line Options
Command line options may be used to alter the runtime behavior of Intel® Cluster Checker.
autoconfigure
Enable the automatic configuration capabilities. Also available as the auto short option. See 8 for more details.
compliance version
Check compliance of the cluster with the Intel® Cluster Ready Specification. Versions 1.0.x are all functionally equivalent. Multiple versions can be entered separated by commas to verify compliance with several of them in a single execution. If no values are provided, they are read from the /etc/intel/icr file. If that file is not available, all existing versions are checked. See Table 6.1 for the list of test modules executed. This list can be altered with the options described in 1.2.3. A successful run should not be interpreted as complete Intel® Cluster Ready compliance, as other requirements must also be met. This option replaces the default set of test modules with the compliance set.
certification version
Check the requirements for certification of the cluster under the Intel® Cluster Ready program, for the provided Intel® Cluster Ready specification version. The values for the version are the same as for the --compliance option. A successful run should not be interpreted as certifying Intel® Cluster Ready compliance, as other requirements must also be met. This option replaces the default set of test modules, as required by the certification procedure. This option includes different sets of test modules when executed by privileged users.
debug
Generate debug files for every test module executed. To enable debug files only for specific test modules, use the XML configuration file (see 2.1.3 for instructions). The location of debug files follows the same mechanism as log files (see 2.1.2 for details).
deployment version
Check a cluster after first deployment; this option executes both the compliance and 'wellness' test modules. The values for the version are the same as for the --compliance option. See the list option for more details on each set of test modules.
exclude test_module
Exclude the test module named test_module. If other test modules depend on the excluded one, they will be skipped. This option may be used multiple times to exclude several test modules and is also available in the XML configuration file (see 1.2.3).
Note that the genuine_intel test module cannot be removed using this option.
force
Do not apply validation of the XML configuration file before execution.
help
Print a short help menu describing the command line options.
include test_module
Include the test module named test_module. The included test module will run in addition to the standard set. If the included test module depends on others that are not included, the required dependencies will also be included. This option may be used multiple times to include more than one test module and is also available in the XML configuration file (see 1.2.3).
include_only test_module
Ignore the default set of test modules and execute only the one named test_module and all its dependencies (dependencies are executed first). This option is particularly useful when working to resolve a failure and only the failing test module needs to run. This option may be specified multiple times and is also available in the XML configuration file (see 1.2.3). The command line option has precedence over the configuration file: if a test module is configured with this option in the XML file and another is requested on the command line, only the command line one will be included. This option also has precedence over include, so any modules added with that option will be ignored. Combining this option with exclude is discouraged.
level value
Each test module has a check level assigned: fast, non-intrusive tests have a low level (i.e., 1), while slow, intrusive tests have higher levels. The level option tells the tool to run only test modules with levels less than or equal to value. However, explicitly including a test module with a higher check level overrides this option. The minimum value is 1, the maximum is 5, and the default is 3. See chapter 5 for the complete list of test modules and their check levels.
list
Print the list of test modules to the screen and exit.
nodefile file
Check the nodes listed in file. Note: this option overrides the <nodefile> parameter from the XML configuration file.
nodeps
Add no dependencies to exclusively included modules. This option only works when used together with include_only or <include_only_module>, and is meant for troubleshooting purposes only.
To guarantee accessibility to all cluster nodes to be tested, the test module ping is always executed first.
noheader
Do not print the tool version, XML file, or information on included/excluded test modules as part of the output. The configuration file contents are written to the XML output report regardless of this option.
nolog
Do not write the output report to disk.
packages
Generate the set of files required for the packages test module. The head node and one node in each defined node group are analyzed, and their installed packages are listed in a file for comparison during the execution of the test module. The reference list may contain comments in each line (after a package entry or on a new line) following the '#' character.
report value
Instruct cluster-check to look for the latest output logs in the log directory (see 2.1.2) and generate a report with descriptive information about each execution. The value entered represents the number of latest logs that will be examined to generate the output. The files are ordered by date, taking the latest files first. A maximum of five logs are examined. To specify a directory that contains the output logs to be analyzed, use the CLCK_LOG_DIRECTORY environment variable. In addition to the latest five logs, all certification logs are checked for a successful execution. Verbosity can be changed using verbose value, where higher values produce more output. These values range from 1 to 4:
1 - reports whether successful logs were found, and provides the date of the execution and the path to the log.
2 - adds a list of details containing: full path to the log, check type, overall status, date, version of Intel® Cluster Checker run, command line and, if the overall status failed, a short message with a possible reason.
3 - adds the list of failing modules.
4 - adds the list of passing modules.
If --report is executed without a value, then only certification logs are examined until a successful one is found or no more certification logs are left, and a report with a verbosity of 2 is generated.
reverse (experimental)
Enable a reverse dependency tracking mode that assumes an optimistic behavior. cluster-check runs without adding dependencies to the list of test modules to be executed. This list is sorted with the modules with the most dependencies first. If a test module fails, root causing is triggered and the list of modules to be executed is replaced by the failed module's dependencies, sorted in the same way.
To guarantee accessibility to all cluster nodes to be tested, the test module ping is always executed first.
This option can be used together with level, but not with the deployment, certification, compliance, or sdkcompliance options.
WARNING: This feature is experimental and included here for feedback. The dependency handling mechanism is altered and test module assumptions may not be properly handled. Output logs generated using this option cannot be used for certification purposes.
sdkcompliance
Check compliance of the cluster with the Development Cluster section of the Intel® Cluster Ready Specification version 1.1. If the --compliance option is not also specified, --compliance=1.1 is implied by the selection of this option. Note that this option is only available for version 1.1 of the Specification. See Table 6.2 for the list of test modules. The list of test modules may be altered with the options described in 1.2.3.
A successful run should not be interpreted as complete Intel® Cluster Ready compliance, as other requirements must also be met. This option replaces the default set of test modules with the compliance set.
verbose value
Control the amount of output. Higher values produce more output:
1 Report the overall success / failure only with no information on the status of the test modules.
2 Report the success / failure of each test module, the overall success / failure and the total elapsed time. Failing or indeterminate test modules will print additional output. This is the default verbosity level.
3 Same as 2, but also prints the name of the failing test modules that cause other ones to be skipped.
4 Report the success / failure of each test module and the overall success / failure. All test modules, regardless of status, will print additional output.
5 Report the success / failure of each test module and the overall success / failure. All test modules, regardless of status, will print additional output. In addition, the version of each one will be displayed.
Tip for running Intel® Cluster Checker: Use the level command line option to develop an automated, periodic check for your cluster. For example, consider running the relatively quick, unobtrusive level one test modules daily or as part of a resource manager job preamble script while saving the higher level test modules for weekly or monthly preventive maintenance periods.
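The tip above might look like the following crontab sketch; the flag spelling, schedule, and paths are assumptions to adapt to your installation:

```shell
# daily quick pass with only level-1 test modules (run as the dedicated
# clustercheck user; the flag spelling and paths are hypothetical)
0 6 * * *  clustercheck /opt/intel/clck/config.xml --level 1
# monthly preventive maintenance pass with all levels
0 2 1 * *  clustercheck /opt/intel/clck/config.xml --level 5
```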
2.1.5. Environment Variables
These are the environment variables recognized by the tool:
CLCK_LOG_DIRECTORY
Write the output reports to the specified directory rather than the one from which the tool is executed.
CLCK_MODULE_PATH
Set the path to search for third-party Intel® Cluster Checker test modules. The environment variable may contain multiple directories separated by colons.
CLCK_HEAD_TEMPDIR
Change the default location where temporary directories/files are created on the head node to prepare the checks to execute. The default location is /tmp. CLCK_HEAD_TEMPDIR should contain the absolute path to an existing directory on the head node with read, write, and execute permissions. This value can also be specified in the XML configuration file using the <head_tempdir> tag; note that the environment variable has precedence over the configuration file. Take into account that the MPD subsystem of the Intel® MPI Library may create its own temporary files; look for I_MPI_MPD_TMPDIR in the Intel® MPI Library user guide for details on how to configure their location.
CLCK_NODE_TEMPDIR
Change the default location where temporary directories/files are created on the compute nodes for testing purposes. Test modules that include the head node in their checks will use this path when verifying it. The default location is /tmp. CLCK_NODE_TEMPDIR should contain the absolute path to an existing directory, with read, write and execute permissions, on all nodes. This value can also be specified in the XML configuration file using the <node_tempdir> tag. Note that the environment variable takes precedence over the configuration file.
CLCK_REGULAR_USER
The user to run as when the tool is executed by a privileged user. This setting only affects test modules intended for regular users. It takes precedence over the <user> configuration tag.
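As a sketch of how these variables fit together, the following shell snippet exports them before a run. The paths and the commented-out clustercheck invocation are illustrative examples, not prescribed defaults:

```shell
#!/bin/sh
# Illustrative values only: point logs and temporary files at custom paths
# and name the unprivileged user to impersonate for regular-user checks.
export CLCK_LOG_DIRECTORY=/var/log/clck
export CLCK_HEAD_TEMPDIR=/scratch/clck
export CLCK_NODE_TEMPDIR=/scratch/clck
export CLCK_REGULAR_USER=clck

# clustercheck config.xml    # the run itself would pick the variables up

# show what the tool would see
env | grep '^CLCK_' | sort
```

Because the environment variables take precedence over the matching <head_tempdir>, <node_tempdir> and <user> configuration tags, a wrapper script like this can repoint a shared configuration file per machine.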
2.2. Gathering Cluster Information
It is possible to use Intel® Cluster Checker to generate a report with useful cluster information for statistical sales analysis. The report can be created using the --salesreport command line option to process an existing log file generated by the tool.
A text file in comma-separated values (CSV) format will be generated with the collected information. The following items will be available in the file:
1. vendor sales order number
2. analyzed log file name
3. Intel® Cluster Checker time-stamp
4. Intel® Cluster Checker serial number
5. number of nodes
6. overall Intel® Cluster Checker pass/fail
7. total amount of memory
8. number of CPUs
9. number of cores
10. types of CPUs
11. brand of interconnect if available
12. kernel version
13. OFED* version
14. Intel® MPI Library version
15. Intel® Cluster Runtime version
16. Intel® Cluster Ready Reference Implementation identifier
17. Intel® Cluster Checker version
If an item is not found in the clusterchecklog.xml file, the matching field will be completed with the text "Not available". To generate a report with all items completed, information from the core_count, cpuinfo, kernel, pci, and system_memory test modules must be available in the provided log. This is normally the case for the log of a default execution ('wellness' mode with execution level 3).
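As a sketch of how the report might be consumed downstream, the snippet below pulls two fields out of a CSV line with awk. The sample line, and the assumption that fields appear in the order of the list above, are invented for illustration; check a real report before relying on field positions:

```shell
#!/bin/sh
# Hypothetical sales-report line; field order assumed to follow the 17-item
# list above (field 5 = number of nodes, field 6 = overall pass/fail).
sample='SO-1234,clusterchecklog.xml,2013-05-01T10:00:00,XXXX-XXXX,16,PASS,64 GB,32,128,Xeon E5506,InfiniBand,2.6.32,1.5.4,4.0.3,1.8,N/A,1.8'
summary=$(printf '%s\n' "$sample" | awk -F',' '{ printf "nodes=%s overall=%s", $5, $6 }')
echo "$summary"
```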
2.2.1. Command Line Options
Sales report mode is enabled using custom command line options.
clustercheck --salesreport clusterchecklog.xml --ordernumber salesordernumber
--salesreport clusterchecklog.xml
Instructs Intel® Cluster Checker to run in XML parsing mode on the provided log file and create the sales report with the information described above.
--ordernumber salesordernumber
Sales order number of the cluster for which the sales report is being generated. If this option is not provided, the corresponding field in the output file will be completed with the text "salesordernumber".
3. User-Defined Checking
The Intel® Cluster Checker plug-in architecture allows the user to create new test modules (see the Developer's Guide for complete details). However, two special test modules permit the user to define custom checks without the need to write any code. The generic_correctness and generic_uniformity test modules execute commands specified by the user and check the correctness and uniformity of the output, respectively.
These test modules are not part of the default set and must be included using the <include_module> tag or the --include command line option.
Tip for running user-defined checking: Since the generic test modules run arbitrary commands, by default they will not be evaluated if the tool is run as a privileged user. See the entries for the generic test modules in Test Modules Reference Guide to learn how to override this behavior.
3.1. Correctness Checking
The generic_correctness test module executes an arbitrary command and compares the output to a specified value. An exact match (case and whitespace sensitive) is considered a successful result. The output of multiple commands may be checked by using multiple <item> container tags:
<generic_correctness>
  <item>
    <command>uname -r</command>
    <result>2.4.21-20.EL</result>
  </item>
  <item>
    <command>/sbin/lsmod | grep e1000</command>
    <result>e1000 171104 1</result>
  </item>
</generic_correctness>
Consult the generic_correctness entry in the Test Modules Reference Guide for more information.
3.2. Uniformity Checking
The generic_uniformity test module executes an arbitrary command and compares the output to the other cluster nodes. The same result on all nodes is considered a successful result.
<generic_uniformity>
  <command>uname -r</command>
  <command>/sbin/lsmod | grep e1000</command>
</generic_uniformity>
Consult the generic_uniformity entry in the Test Modules Reference Guide for more information.
4. Copy Exactly
copy_exactly is a test module that Intel® Cluster Checker executes to verify that nodes are an exact copy of a reference node. Using a list of reference file checksums, the copy_exactly test module confirms that the checksums match the files actually present on the nodes. The list of reference file checksums may be provided by a third party, or may be generated using one of your nodes as the reference. The node_checksum utility is provided with Intel® Cluster Checker to generate this reference file.
The node_checksum program generates a file containing the checksum of most of the files on a node. In most cases, the node_checksum utility should be run on the reference node without any command line options. It automatically excludes files that are known to vary between nodes because they contain MAC or IP addresses, hostnames, or other node-specific information, or because they are temporary. Additionally, user-specified files may be excluded from the checksum list via an exceptions file during the generation step.
The exceptions file is a list of basic regular expressions, one per line. Special characters, such as '+', may need to be escaped with '\' to be interpreted literally. Only the characters '.' and '$' are accepted as part of a regular expression. For example, to exclude the /usr/bin/gcc and /usr/bin/g++ files, all files ending in gconf.xml, and all files in the /usr/local/node/ folder, the exceptions file would contain:
/usr/bin/gcc
/usr/bin/g\+\+
gconf.xml$
/usr/local/node/
The exceptions file should not contain any blank lines. To apply the exceptions file, run node_checksum with the path to the exceptions file as the first and only command line option. For more information on regular expressions, see the grep man page.
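The steps above can be sketched as a small script. The file path is an example, and the node_checksum invocation is left commented out since the utility only exists on a system with Intel® Cluster Checker installed:

```shell
#!/bin/sh
# Build the exceptions file from the example above (the quoted heredoc
# keeps the backslash escapes literal), then pass it to node_checksum.
cat > /tmp/checksum_exceptions <<'EOF'
/usr/bin/gcc
/usr/bin/g\+\+
gconf.xml$
/usr/local/node/
EOF

# node_checksum /tmp/checksum_exceptions   # run on the reference node

wc -l < /tmp/checksum_exceptions
```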
4 See http://www.intel.com/design/quality/mq_ce.htm for more about the Intel® Copy Exactly philosophy.
Please see the copy_exactly documentation in the Test Modules Reference Guide for more information. A similar validation at the installed-packages level can be done with the packages test module.
5. Intel® Cluster Checker Test Modules
The test modules are divided into four sets:
1. The first set is used to check compliance with the base Intel® Cluster Ready Specification. It is used with the --compliance command line option (see 2.1.4 for the complete list of command line options).
2. The second set is used to check compliance with the Developer Cluster section of the Intel® Cluster Ready Specification. It is used with the --sdk-compliance command line option (see 2.1.4).
3. The third set is used to check whether the cluster is configured correctly and performs according to expectations. This set is the default (no command line option) and is named 'wellness' mode.
4. The fourth set is composed of optional test modules.
The --list switch prints the list of available test modules (see 2.1.4 for the complete list of command line options).
5.1. Compliance Test Modules
Running the tool in compliance mode using the --compliance command line option loads the following set of test modules. Except for the gige test module, which is only included in 1.0 compliance mode, and glibc_version and openssh_version, which are included in 1.0 and 1.1, all test modules are included in the 1.0, 1.1 and 1.2 compliance modes.
Test Module Name           Description
1GiB_memory                Minimum of 1 GiB of memory per node and 0.5 GiB of memory per core
65GiB_storage_head         Minimum of 65 GiB of direct access storage on the head node
base_libs                  Base and runtime libraries are provided
cluster_size               Cluster size >= 4 nodes
file_tree                  Materially identical software images / file trees
genuine_intel              GenuineIntel processors
gige                       Gigabit Ethernet port present
glibc_version              32-bit and 64-bit GNU runtime (glibc) version compliance
home                       Shared, common /home
icr_version_compliance     Intel® Cluster Ready version compliance (/etc/intel/icr)
intel_cc_rtl_version       32-bit Intel® C++ Compiler runtime libraries version compliance
intel_cce_rtl_version      64-bit Intel® C++ Compiler runtime libraries version compliance
intel_cmkl_rtl_version     Intel® Math Kernel Library, Cluster Edition runtime version compliance
intel_fc_rtl_version       32-bit Intel® Fortran Compiler runtime libraries version compliance
intel_fce_rtl_version      64-bit Intel® Fortran Compiler runtime libraries version compliance
intel_mpi_rtl_version      Intel® MPI Library runtime version compliance
intel_tbb_rtl_version      Intel® Threading Building Blocks runtime version compliance
java_version               Java Runtime Environment version compliance
kernel_version             Kernel version compliance
lib32_counterpart_lib64    32-bit libraries have 64-bit counterparts
mpi_consistency            Consistent MPI image (mpirun / mpiexec)
ip_consistency             All-to-all Network Connectivity
openssh_version            OpenSSH version compliance
perl_version               Perl version compliance
python_version             Python version compliance
single_authentication      Single authentication domain
tcl_version                Tcl version compliance
X11_clients                X11 clients are provided on the head node
X11_libs                   X11 runtime libraries are provided
Table 6.1 Compliance Test Modules
5.2. SDK Compliance Test Modules
Running the tool in development cluster compliance mode using the --sdk-compliance command line option loads the test modules from Table 6.1 and adds the following set of test modules:
Test Module Name           Description
binutils_version           GNU binutils version compliance
gcc_version                GNU C Compiler suite (gcc and g++) version compliance
gdb_version                GNU debugger (gdb) version compliance
gmake_version              GNU make version compliance
intel_devtools_version     Intel® Cluster Ready Developer Tools version compliance
jdk_version                Java Software Development Kit version compliance
Table 6.2 SDK Compliance Test Modules (in addition to Table 6.1)
5.3. Default Test Modules
By default, Intel® Cluster Checker runs in 'wellness' mode with check level 3, so the test modules from the list below with check level 3 or lower will be executed by default. Changing the check level with the --level command line option will include/exclude test modules only from this set.
Test Module Name                  Description                                                          Level
arch                              System Architecture Uniformity                                       1
available_disk                    Available Disk                                                       1
bash                              Bourne Again Shell                                                   1
clean_ipc                         System V Interprocess Communication                                  2
clock_granularity                 gettimeofday() Clock Granularity                                     1
clock_sync                        Clock Synchronization                                                1
core_count                        Core Count (Multi-core & Hyper-Threading Technology)                 1
core_frequency                    Core frequency uniformity                                            1
cpuinfo                           /proc/cpuinfo Uniformity                                             1
csh                               C Shell                                                              1
dat_conf                          Valid /etc/dat.conf entries                                          1
disk_bandwidth                    Single-node Disk Bandwidth                                           3
dmidecode                         SMBIOS/DMI Uniformity                                                1
environment                       Uniform environment variables                                        1
file_permissions                  File Existence, Ownership, and Permissions                           1
genuine_intel                     GenuineIntel processors                                              1
hdparm                            Single-node Disk Performance (hdparm)                                3
hardware_uniformity               Hardware uniformity                                                  3
hostname                          Hostname Correctness                                                 1
hpcc                              HPC Challenge Benchmark (Intel® C++ Compiler, Intel® MPI Library, Intel® Math Kernel Library)  4
imb_collective_intel_mpi          MPI Collectives (Intel® MPI Benchmarks; Intel® MPI Library)          3
imb_message_integrity_intel_mpi   MPI Message Integrity (Intel® MPI Benchmarks; Intel® MPI Library)    3
imb_pingpong_intel_mpi            Network Performance (Intel® MPI Benchmarks; Intel® MPI Library)      3
intel_cce_rtl                     Intel® C++ Compiler runtime libraries                                2
intel_fce_rtl                     Intel® Fortran Compiler runtime libraries                            2
intel_mpi_rt                      Intel® MPI Library Runtime Environment (Single-node)                 1
intel_mpi_rt_internode            Intel® MPI Library (All nodes)                                       2
kernel                            Kernel Version Uniformity                                            1
kernel_modules                    Kernel Module Correctness and Uniformity                             1
kernel_parameters                 Linux Kernel Runtime Parameters                                      1
ksh                               Korn Shell                                                           1
loopback                          Loopback Address                                                     1
memory_bandwidth_stream           Single-node Memory Bandwidth (STREAM)                                3
mflops_intel_mkl                  Single-node floating point performance (Intel® Math Kernel Library)  3
mount_proc                        procfs Filesystem                                                    1
nfs_mounts                        NFS mounts                                                           1
nsswitch                          Name Service Configuration (/etc/nsswitch.conf)                      1
packages                          Installed packages                                                   4
pci                               PCI Device Consistency                                               1
perl                              Perl Interpreter                                                     1
ping                              Basic Network Connectivity                                           1
ip_consistency                    All-to-all Network Connectivity                                      1
process_check                     Stale Process Check                                                  2
python                            Python Interpreter                                                   1
sh                                Bourne Shell                                                         1
shm_mount                         /dev/shm mount test                                                  1
single_authentication             Single authentication domain                                         1
speedstep                         Intel® SpeedStep® Technology                                         1
ssh                               Node SSH Connectivity                                                1
stray_uids                        Files ownership UID/GID check                                        1
system_memory                     System Memory Uniformity                                             1
tcsh                              Enhanced C Shell                                                     1
tmp                               Permissions on /tmp                                                  1
uid_sync                          User and Group Uniformity                                            1
Table 6.3 Default Test Modules
5.4. Optional Test Modules
These test modules are executed only when explicitly requested by the user, either through the command line options (see 2.1.4 for the complete list of command line options) or by editing the configuration file (see 1.2.3 for instructions).
Test Module Name           Description
clomp                      Intel® C++ Compiler Cluster OpenMP runtime library
copy_exactly               Copy Exactly! file trees
cron                       Cron Disabled
dmidecode                  Check the uniformity of the SMBIOS/DMI information
etc_hosts                  IP entries in the /etc/hosts file
gcc                        GNU C/C++ compiler
generic_correctness        Generic correctness test
generic_uniformity         Generic uniformity test
host_conf                  /etc/host.conf Configuration
ibadm                      Mellanox InfiniBand In-band Monitor
imkl_hpl                   Intel® Optimized HPL Benchmark
intel_cc                   Intel® C++ Compiler
intel_fc                   Intel® Fortran Compiler
intel_ethernet_driver      Intel® Ethernet Network Drivers
intel_mpi                  Intel® MPI Library (Single-node)
intel_mpi_internode        Intel® MPI Library (All nodes)
intel_mpi_testsuite        Intel® MPI Library Test Suite
ipoib                      IP over InfiniBand
iwarp                      Check uniformity of iWARP devices
lsb                        Linux Standard Base (LSB*) Compliance
nisdomain                  NIS Domain
nismaps                    NIS Password Map Consistency
numactl                    Check NUMA Hardware and Policy Uniformity
openib                     InfiniBand Adapter Status (OpenIB)
portal                     Portal name resolution
processor_cache            Processor multiple layers cache test
processor_msr              Processor Model Specific Registers (MSRs)
ssh_version                SSH version uniformity
subnet_manager             InfiniBand Subnet Manager
Table 6.4 Optional Test Modules
6. Performance Test Modules
Intel® Cluster Checker includes several test modules that exercise widely used High Performance Computing benchmarks for clusters. This allows performance comparisons against a reference system to verify the proper health and functionality of the systems.
The following table summarizes the performance-related test modules. More details of each one can be found in the Intel® Cluster Checker Test Module Reference Guide.
Test Module                Description                         Type
hdparm                     Hard disk read timings              single-node
memory_bandwidth_stream    STREAM*                             single-node
mflops_intel_mkl           Intel® Math Kernel Library DGEMM    single-node
imb_pingpong_intel_mpi     Intel® MPI Benchmarks - Ping Pong   pair-wise
imkl_hpl                   Intel® Optimized HPL*               cluster-wide
hpcc                       Intel® Optimized HPCC*              cluster-wide
Table 7.1 Performance Test Modules
By default, these test modules are tuned to balance execution time. However, the benchmarks can be configured to gather better performance numbers if required, by increasing their input problem size or changing their execution approach. Performance-related test modules can also validate the results obtained against user-provided thresholds if explicitly configured. In addition to the precompiled version of each benchmark, some of the test modules have a build option to compile it from source at execution time if required.
Intel® Cluster Checker includes three types of benchmarks: single-node, pair-wise and cluster-wide.
6.1. Single-node Benchmarks
The results of these benchmarks do not depend on the number of nodes in the cluster and can easily be used to compare the nodes of clusters of different sizes. Most of the single-node benchmarks include a performance deviation check to verify that all nodes report similar performance.
The performance deviation is measured using the median and the standard deviation of the gathered performance values. Each value should fall within a predefined number of standard deviations of the overall median. The allowed range can be summarized as (median ± factor × stddev), with a factor of 3 by default. The factor can be configured by the user in each test module.
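The allowed range can be reproduced with a few lines of shell. This is a sketch, not tool code, and the five sample values are invented:

```shell
#!/bin/sh
# Compute the allowed range (median ± factor * stddev) for a set of
# per-node performance values. The sample values below are made up.
factor=3
range=$(printf '%s\n' 10.1 10.3 9.9 10.0 10.2 | sort -n | awk -v f="$factor" '
  { v[NR] = $1; sum += $1; sumsq += $1 * $1 }
  END {
    n = NR
    median = (n % 2) ? v[(n + 1) / 2] : (v[n / 2] + v[n / 2 + 1]) / 2
    stddev = sqrt(sumsq / n - (sum / n) ^ 2)   # population stddev
    printf "[%.3f, %.3f]", median - f * stddev, median + f * stddev
  }')
echo "allowed range: $range"
```

A node whose value falls outside the printed interval would be the kind of outlier the deviation check flags.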
6.2. Pair-wise Benchmarks
Pair-wise benchmarks select combinations of two nodes in the cluster and test the communication performance between them. This helps detect possible degradation in the fabrics used in the cluster. These benchmarks also include the performance deviation check to verify that all node pairs report similar performance.
6.3. Cluster-wide Benchmarks
Cluster-wide benchmarks do depend on the total number of nodes in the cluster. Therefore, their results can be used to understand the behavior of the cluster as a whole.
The HPCC* benchmark is a set of benchmarks used to exercise different components of a system. It includes benchmarks that are also provided as independent test modules, in order to allow gathering both single-node and cluster-wide performance measurements.
7. Heterogeneous Clusters
7.1. Nominal Hardware Variation
Intel® Cluster Checker must be configured to recognize nominal hardware variations. There is no limit on the number of nodes or types of nominal variations allowed within the cluster.
Intel® Cluster Checker should be configured to recognize nominal variation using the 'group' property. This feature requires editing the nodes list file and the XML configuration file.
The nodes list file supports groups through the 'group:' label followed by the group name. For example, the following file defines four nodes, where node2 and node3 have different processor models and the other nodes have similar hardware.
# list of nodes to check
node1 # head
node2 # group: XeonE5506
node3 # group: XeonX5560
node4
In addition, the XML configuration file should be edited to use the created groups. Every test module that checks hardware uniformity should contain a <group> tag with the corresponding group name. The following example shows how to configure the hardware_uniformity test module.
<hardware_uniformity>
  <group name="XeonE5506"/>
  <group name="XeonX5560"/>
</hardware_uniformity>
7.2. Sub-clusters
Sub-clusters are internal divisions of a single cluster where some nodes may have completely different hardware capabilities from others. As defined by the resource manager, jobs may or may not span sub-clusters; the application proxy test modules in Intel® Cluster Checker (e.g., hpcc) may be configured to check sub-clusters independently if jobs will not span sub-clusters.
It is possible to configure Intel® Cluster Checker to use a different configuration for each sub-cluster. This configuration must be reflected in both the nodes file and the XML configuration file. The nodes file should be configured using the 'group:' label. The following example shows how to define five nodes in subCluster1 and four nodes in subCluster2.
# list of nodes to check
node1 # head group: subCluster1
node2 # group: subCluster1
node3 # group: subCluster1
node4 # group: subCluster1
node5 # group: subCluster1
node6 # group: subCluster2
node7 # group: subCluster2
node8 # group: subCluster2
node9 # group: subCluster2
The XML configuration file should use the <group> tag. The name of each group must be the same in both files. The following example shows how to configure the hpcc test module for the two different groups.
<hpcc>
  <group name="subCluster1">
    <ccpath>/opt/intel/cce/9.1.038/</ccpath>
    <fabric>
      <bandwidth>.015</bandwidth>
      <device>sock</device>
      <dgemm>8.5</dgemm>
      <fft>.5</fft>
      <hpl>.023</hpl>
      <latency>60</latency>
      <ptrans>.15</ptrans>
      <randomaccess>.002</randomaccess>
      <stream>0.9</stream>
    </fabric>
    <fabric>
      <bandwidth>.25</bandwidth>
      <device>rdssm</device>
      <dgemm>8.5</dgemm>
      <fft>.5</fft>
      <hpl>.026</hpl>
      <latency>20</latency>
      <ptrans>1.0</ptrans>
      <randomaccess>.012</randomaccess>
      <stream>0.9</stream>
    </fabric>
    <mklpath>/opt/intel/cmkl/9.0/</mklpath>
    <mpipath>/opt/intel/impi/3.0/</mpipath>
    <processnumber>4</processnumber>
    <threadnumber>1</threadnumber>
  </group>
  <group name="subCluster2">
    <ccpath>/opt/intel/cce/9.1.038/</ccpath>
    <fabric>
      <bandwidth>.03</bandwidth>
      <device>sock</device>
      <dgemm>8.4</dgemm>
      <fft>1.04</fft>
      <hpl>.020</hpl>
      <latency>25</latency>
      <ptrans>.2</ptrans>
      <randomaccess>.002</randomaccess>
      <stream>2.69</stream>
    </fabric>
    <fabric>
      <bandwidth>.8</bandwidth>
      <device>rdssm</device>
      <dgemm>8.4</dgemm>
      <fft>1.025</fft>
      <hpl>.022</hpl>
      <latency>8</latency>
      <ptrans>.32</ptrans>
      <randomaccess>.012</randomaccess>
      <stream>2.69</stream>
    </fabric>
    <mklpath>/opt/intel/cmkl/9.0/</mklpath>
    <mpipath>/opt/intel/impi/3.0/</mpipath>
    <threadnumber>1</threadnumber>
  </group>
</hpcc>
7.3. Fat Nodes
‘Fat nodes’ are nodes with a hardware super-set relative to other nodes used for the same purpose, e.g. extra memory or additional secondary storage compared to a 'regular' compute node.
Intel® Cluster Checker must be configured to recognize the 'fat' node variation from 'regular' nodes. For example, in order to support nodes with extra memory or secondary storage, the nodes on the node list should be configured as part of a ‘fat’ group.
# list of nodes to check
node1 # head
node2 # group: fat
node3
node4
The special characteristics of the fat node should be specified in the XML Configuration file by defining different parameters for specific test modules. The following example shows how to configure the system_memory test module.
<system_memory>
  <group name="fat">
    <physical>8388608</physical>
  </group>
  <physical>4194304</physical>
  <swap>4194304</swap>
</system_memory>
8. Automatic Configuration
This section provides the details of the Intel® Cluster Checker automatic configuration feature.
8.1. Overview
The --autoconfigure option simplifies the initial configuration required to run the tool. At the beginning of the execution, the tool scans the cluster nodes to gather information that is used to complete the configuration provided by the user in a basic configuration file. The feature is capable of detecting the cluster nodes, configuring the paths for the Intel® Cluster Runtimes tools, and configuring the thresholds of the single-node performance test modules. This section explains the details of each.
8.2. Command Line Options
--autoconfigure [OPTIONS]
Instructs the tool to automatically set some configuration parameters. The option may be shortened, for instance to --auto.
Note that automatic configuration mode requires a basic configuration file to run. Therefore, the configuration file must be passed on the command line together with the above option or must be available at one of the default locations (see 1.2).
8.2.1. Automatic Configuration Options
The optional additional parameters fall into two categories: options to control the targets for automatic configuration and options to control how the new configuration is stored. The options provided should be entered as a single comma-separated string with no spaces in between.
8.2.1.1. Automatic Configuration Targets
These options control which parts of the configuration will be subject to automatic configuration. When a specific target option is provided, only that target will be automatically configured. If no option is provided, the tool will default to all the targets available to the user running the tool.
• global (enabled by default)
Use the global configuration feature (see 1.2.4.2) to automatically set the paths to the Intel® Cluster Runtimes tools. The target configuration options are <ccpath>, <fcpath>, <mpipath> and <mklpath>. If the paths are already configured, they will be redefined.
The tools must be installed according to the Intel® Cluster Ready Specification and be uniform across all nodes. If more than one version of a tool is detected, the latest one will be used.
The following example will attempt to perform global path configuration. A backup file will be stored in the same directory as the provided configuration file before any modifications.
Example:
clustercheck --auto global config.xml
• nodes (enabled by default)
Automatically discover the nodes of the cluster and, if applicable, create a new nodelist file. If the configuration file contains the path to a nodelist file, or if the --nodefile command line option is provided, automatic node discovery is disabled. For more details on the discovery, see 8.4. If performance automatic configuration is also used, the nodelist file will also contain the information on the hardware discovered in each node for advanced usage of the grouping feature (for more details on hardware grouping, see 8.5.1).
The following example will attempt to perform compute node discovery. A new nodelist file will be automatically generated in the same location as the provided configuration file.
Example:
clustercheck --auto nodes /home/icr/clck_conf/config.xml
• performance (privileged user only)
Scan the cluster compute nodes to detect the main hardware components and, based on the information gathered, perform a basic heuristic calculation to set the thresholds for the single-node performance test modules. Although the directly targeted test modules are mflops_intel_mkl and memory_bandwidth_stream, other test modules may also benefit from this feature. See 8.6 for more details. This option is available only to the privileged user and is enabled by default.
The following example will attempt to perform performance threshold configuration. A backup configuration file will be generated before modifying the default configuration file. Note that since no configuration file is provided, the tool will look for it at the default locations (see Getting Started With Intel® Cluster Checker for more details).
Example:
clustercheck --auto performance
The value of the <user> tag is automatically detected when running as a privileged user and will be included in the generated configuration file if applicable. The feature checks whether 'clck' or 'icr' is a valid user on the system. An alternative user name can be provided using the <user> tag or the CLCK_REGULAR_USER environment variable.
8.2.1.2. Files Handling
These options control how files will be handled to save the results of auto-configuration. They are mutually exclusive, meaning that only one can be used at a time. If no file handling option is provided, the tool will set backup by default.
If no configuration file is provided, a set of default locations will be searched as detailed above.
• backup (enabled by default)
Create a backup of the files provided by the user and edit the originals. The backup will have the same base file name with a .backup suffix and a time-stamp. This targets the XML configuration file and the nodelist file, when applicable. It optionally accepts a path to define the destination of the backups; otherwise the path of the original files is used.
The following example will attempt to perform compute node discovery, global path configuration and performance threshold configuration if executed as root. A backup file will be created inside the /home/icr/clck_conf directory before modifying the provided configuration file.
Example:
clustercheck --auto backup=/home/icr/clck_conf config.xml
• newfile
Create new files containing the automatically configured parameters. The new files will have the same base file name with a .new suffix and a time-stamp. This includes the XML configuration file and the nodelist file, when applicable. It optionally accepts a path to define the destination of the new files; otherwise the path of the original files is used.
The following example will attempt to perform compute node discovery, global path configuration and performance threshold configuration if executed as root. The results will be written into a new configuration file inside the /home/icr/clck_conf directory.
Example:
clustercheck --auto newfile=/home/icr/clck_conf config.xml
• overwrite
Overwrite the files provided by the user with the automatically configured parameters. This targets the XML configuration file and the nodelist file, if required. Non-compliant XML information may be lost as a side effect.
The following example will attempt to perform compute node discovery, global path configuration and performance threshold configuration if executed as root. The results will be written into the provided configuration file.
Example:
clustercheck --auto overwrite config.xml
• nowrite
Do not save the automatic configuration; just use it in the current execution.
The following example will attempt to perform compute node discovery, global path configuration and performance threshold configuration if executed as root. The results will not affect the configuration file provided.
Example:
clustercheck --auto nowrite config.xml
In the cases where automatic node discovery is used and no nodelist file is provided, a new nodelist file will be created in the same path as the XML configuration file targeted by automatic configuration. The name of the created nodelist file will be nodelist.<timestamp>.auto.
Note that automatic configuration will maintain any configuration provided in the user configuration file. If this file references external XML files using XInclude, the automatic option parses the referenced XML files and includes them in the new file. Therefore, no XInclude directives will appear in the final configuration file.
As a more complex example, the following command will attempt global path configuration and compute node discovery. The config.xml file will be used as a starting point, but a new file will be created in the /etc/intel/clck directory.
Example:
clustercheck --auto newfile=/etc/intel/clck,global,nodes config.xml
8.3. Console output and Logs
At the beginning of the execution, the tool will print on the screen the configuration with all the automatically configured parameters and, if applicable, the name of the file that contains it. By increasing the default verbosity, the tool will also display the hardware information gathered and the thresholds calculated by the performance automatic configuration option; see 8.5.2 for more details. If the input parameters are modified by automatic configuration, the output XML log will include both configurations: the one modified by automatic configuration (which was actually used during execution) and the original one provided by the user. Additionally, a log of the commands executed during node discovery and hardware scanning can be generated with the usual debugging option (--debug or <debug/>; see 2.1.3 for more details).
8.4. Cluster Nodes Automatic Discovery
When the nodes auto-configuration option is enabled, the tool will discover the compute nodes available in the cluster using a cluster-wide command (it assumes it runs on the front-end/head node). Currently the tool has built-in support for ROCKS+* up to version 5.3, PCM* up to version 2.1, and Perceus* 1.5.3. For other provisioning systems, use the <nodelist_cmd> configuration tag (see 8.4.1).
8.4.1. Configuration Options
This configuration will be used by the tool only when running in automatic configuration mode.
<nodelist_cmd>print_cluster_nodes_command</nodelist_cmd> For provisioning systems not supported by default use the <nodelist_cmd> configuration tag in to indicate the exact command that returns the list of compute nodes in the following format:
headnode
computenode1
computenode2
computenode3
Note that only the names of the nodes should be present in the output, one per line, with no extra characters. White-space and comments are removed from the output before execution.
Example:
<cluster>
  <nodelist_cmd>/opt/rocks/bin/dbreport nodes</nodelist_cmd>
  . . .
</cluster>
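The cleanup described above (one node name per line, white-space and comments stripped) can be sketched as follows. The function name and the assumption that `#` starts a comment are illustrative only; the exact rules the tool applies are not documented:

```python
def normalize_node_list(raw_output: str) -> list[str]:
    """Reduce raw nodelist_cmd output to bare node names, one per entry.

    Assumes '#' begins a comment; this is a guess for illustration,
    not the tool's documented behavior.
    """
    nodes = []
    for line in raw_output.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            nodes.append(line)
    return nodes

raw = """headnode
computenode1   # first compute node

computenode2
computenode3
"""
print(normalize_node_list(raw))
# ['headnode', 'computenode1', 'computenode2', 'computenode3']
```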
8.5. Performance Thresholds Automatic Configuration
Because the performance of each node depends on its hardware components, auto-configuration scans each compute node to discover its bill of materials. With the information gathered, a simple heuristic calculation is performed. If the tool cannot calculate the value, a fallback mechanism is used. This mechanism also allows the user to alter the default behavior. For details see 8.6.3.
8.5.1. Hardware Scanning
During performance auto-configuration Intel® Cluster Checker queries each compute node and executes the commands listed below:
• dmidecode
• lspci
• cat /proc/cpuinfo
With the information gathered, the compute node list is annotated with the main hardware components of each compute node. The association of each compute node with its hardware component description uses the "group" feature of Intel® Cluster Checker. The configuration is then edited with the created "groups" to match the different settings to each node during the testing phase (see 1.1 and 1.2.4 for more details on the group feature).

The following table shows the hardware components that are used to create groups and differentiate compute nodes, together with some examples of group names for each hardware component.
Hardware     | Source                        | Group name examples
Processor    | Processor model string        | X5355, E5506
Sockets      | Processors quantity           | 1_PROCESSOR, 2_PROCESSOR
Base Board   | Base board identifier         | S5400SF, X38ML
Memory Speed | <type>_SPEED_<MHz>            | DIMM_SPEED_800, MM_SPEED_800
Memory Size  | MM_SIZE_<MB>                  | MM_SIZE_4096, MM_SIZE_12288
Ethernet     | Ethernet device identifier    | 82575EB, 82598EB
InfiniBand*  | InfiniBand device identifier  | MT25208, MT25204
Table 9.1 Hardware scanning components
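How groups of the kind shown in Table 9.1 might be composed from a scanned inventory can be pictured with a short sketch. The per-node data and the helper name are hypothetical; the tool's internal grouping logic is not published:

```python
# Hypothetical per-node hardware inventory, as might be assembled from
# dmidecode, lspci and /proc/cpuinfo output.
inventory = {
    "computenode1": {"board": "X38ML", "cpu": "X3230", "sockets": 1},
    "computenode2": {"board": "S5400SF", "cpu": "X5472", "sockets": 2},
}

def group_names(hw: dict) -> list[str]:
    """Build group labels following the naming patterns in Table 9.1."""
    return [hw["board"], hw["cpu"], f'{hw["sockets"]}_PROCESSOR']

for node, hw in inventory.items():
    print(node, "->", " AND ".join(group_names(hw)))
# computenode1 -> X38ML AND X3230 AND 1_PROCESSOR
# computenode2 -> S5400SF AND X5472 AND 2_PROCESSOR
```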
8.5.2. Additional Output
If required, the output to screen can be increased with verbose (or <verbose>) and more information will be shown.

Verbose levels:

4  The types and groups for each node.
5  Everything in level 4, plus the configuration values for the single-node performance test modules: node, test module, parameter, value, the step in which that value was obtained, and the groups used for that value (if applicable).
8.5.3. Benchmarking and Performance Disclaimers
Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel® products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering to purchase. For more information on performance tests and on the performance of Intel® products, visit the Intel® Performance Benchmark Limitations website.
8.6. Automatic Configuration Advanced Usage
8.6.1. Group Configuration Alternatives
Because compute nodes are automatically grouped during performance automatic configuration, it is possible to create beforehand a configuration file that can be used on different clusters. Such a configuration file has parameters that match different values according to the hardware available on the compute nodes and the groups created. It is also possible to configure default values for compute nodes that do not match a specific hardware combination; this is done by simply placing the configuration parameters outside any group. The single-node performance test modules (mflops_intel_mkl, memory_bandwidth_stream, imb_pingpong_intel_mpi and hdparm) follow specific steps to determine the configuration thresholds against which to test compute node performance (see details in 8.6.3).

The following example shows how to configure the core_count test module to use different values of logical and physical cores based on the detected base board and processor model. Also note that default values (4 logical and 4 physical cores) are set for all compute nodes that do not match any group.
<core_count>
  <group name="X38ML AND X3230">
    <logicalcores>4</logicalcores>
    <physicalcores>4</physicalcores>
  </group>
  <group name="S5000PAL AND X5355">
    <logicalcores>8</logicalcores>
    <physicalcores>8</physicalcores>
  </group>
  <group name="S5520UR AND X5355">
    <logicalcores>16</logicalcores>
    <physicalcores>8</physicalcores>
  </group>
  <logicalcores>4</logicalcores>
  <physicalcores>4</physicalcores>
</core_count>
8.6.2. Heterogeneous Hardware Support
Most of the test modules that check hardware homogeneity can be configured to take advantage of the on-the-fly node grouping of the performance automatic configuration. By editing the configuration file it is possible to have test modules compare nodes within groups. The following configuration example shows how to tell the dmidecode test module to differentiate compute nodes (by base board and processor model) and compare each one only with other nodes belonging to the same group. Note that nodes that do not match any configured group are considered part of the "default" group.
<dmidecode>
  <group name="S5400SF AND X5472"></group>
  <group name="X38ML AND X5355"></group>
</dmidecode>
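Per-group comparison can be pictured as follows. The node names, group assignments, and dmidecode "signatures" below are hypothetical; the sketch only illustrates the idea that each node is compared against its own group, with unmatched nodes collected into a "default" group:

```python
from collections import defaultdict

# Hypothetical group assignment per node; node3 matches no configured group.
node_group = {"node1": "S5400SF AND X5472", "node2": "S5400SF AND X5472",
              "node3": None}

# Hypothetical dmidecode output signatures per node.
signatures = {"node1": "sigA", "node2": "sigA", "node3": "sigB"}

# Collect members per group, sending unmatched nodes to "default".
members = defaultdict(list)
for node, group in node_group.items():
    members[group or "default"].append(node)

# Check homogeneity only within each group, never across groups.
for group, nodes in members.items():
    uniform = len({signatures[n] for n in nodes}) == 1
    print(group, "uniform" if uniform else "mismatch")
```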
It is important to have comprehensive knowledge of the hardware components of each compute node to create the correct configuration. The following is a short list of the test modules checking hardware homogeneity:
• arch
• core_count
• core_frequency
• cpuinfo
• dmidecode
• hardware_uniformity
• iwarp
• pci
• processor_cache
8.6.3. Single Node Performance
In the case of the single-node performance test modules, the automatic configuration mode offers a fallback mechanism that looks for different configurations available at execution time. The following list shows the order of precedence used to obtain the required configuration:
A. Direct group match in the XML configuration file.
B. Heuristic threshold calculation.
C. Default threshold value in the XML configuration file.
D. Fallback floor value based on historical figures.
The following table shows the steps that apply to each single-node performance test module:
Test Module             | Steps
mflops_intel_mkl        | A, B, C, D
memory_bandwidth_stream | A, B, C, D
imb_pingpong_intel_mpi  | A, C, D
hdparm                  | A, C, D
Table 9.2 Single Node Performance thresholds completion steps
This mechanism is used to obtain the threshold for each configuration parameter of the above listed test modules. Therefore, for test modules with more than one configuration parameter it is possible to obtain different parameter values at different steps.
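The order of precedence can be sketched as a chain that returns the first value available. The function and data names below are illustrative, not the tool's internals; passing no heuristic models the modules (imb_pingpong_intel_mpi, hdparm) that skip step B:

```python
def resolve_threshold(param, node_groups, config, heuristic, floor_values):
    """Return (value, step) for one configuration parameter, trying the
    steps in their order of precedence."""
    # A: direct group match in the XML configuration file
    for group in node_groups:
        if param in config.get(group, {}):
            return config[group][param], "A"
    # B: heuristic calculation (only supported by some test modules)
    if heuristic is not None:
        return heuristic(param), "B"
    # C: default value configured outside all groups
    if param in config.get(None, {}):
        return config[None][param], "C"
    # D: historical fallback floor value
    return floor_values[param], "D"

config = {
    "2_PROCESSOR AND E5506": {"mflops": 34128},
    None: {"mflops": 21328},  # default value, outside all groups
}
# A node matching the group gets the group value (step A) ...
print(resolve_threshold("mflops", ["2_PROCESSOR AND E5506"], config,
                        None, {"mflops": 1000}))  # (34128, 'A')
# ... while an unmatched node falls back to the default (step C).
print(resolve_threshold("mflops", [], config, None, {"mflops": 1000}))  # (21328, 'C')
```

Because the chain runs per parameter, a test module with several parameters can obtain each one at a different step, as the text above notes.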
8.6.3.1. Direct Group Match
If the user knows the expected performance figures for each compute node with different hardware components, a configuration file can be built with the specific values for each one. This option has the highest precedence: if a match for a node is found, it is used.
The following table shows the hardware components that affect single-node performance test modules and gives examples of how to create groups to differentiate configuration values. Note that this table is only a reference and that nodes may be grouped according to the user's criteria.
Test Module             | Hardware                                | Group name examples
imb_pingpong_intel_mpi  | Ethernet device, InfiniBand device      | 82575EB AND MT25204
memory_bandwidth_stream | Base board identifier, memory type      | X38ML AND DIMM_SPEED_667
mflops_intel_mkl        | Processor count, processor identifier   | 1_PROCESSOR AND X3230
Table 9.3 Hardware components for single-node performance
8.6.3.2. Heuristic Thresholds Calculation
A simple theoretical value is set based on the characteristics of the hardware components detected. Only the memory_bandwidth_stream and mflops_intel_mkl test modules support this approach.
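The guide does not publish the heuristic itself. A classic theoretical-peak estimate of the kind such a hardware-based heuristic can use (all numbers and the function name are hypothetical, and this is not the tool's documented formula) looks like:

```python
def theoretical_peak_mflops(sockets, cores_per_socket, freq_mhz, flops_per_cycle):
    """Classic peak estimate: total cores x frequency x FLOPs per cycle.

    Illustration only; the heuristic actually used by the tool is internal.
    """
    return sockets * cores_per_socket * freq_mhz * flops_per_cycle

# Example: a hypothetical 2-socket, 4-cores-per-socket, 2.0 GHz node
# executing 4 floating-point operations per cycle.
print(theoretical_peak_mflops(2, 4, 2000, 4))  # 64000 (MFLOPS)
```

A real threshold would typically be set to some fraction of such a peak, since measured performance never reaches the theoretical maximum.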
8.6.3.3. Default Threshold Configuration
If neither a direct group match nor a heuristic calculation is possible, the tool searches the configuration file for a default value. This value should be configured for each test module by leaving it outside all groups. The following example shows how to configure the default value in the mflops_intel_mkl test module. The value 21328 will be used to test all nodes that do not belong to the group "2_PROCESSOR AND E5506".
<mflops_intel_mkl>
  <group name="2_PROCESSOR AND E5506">
    <mflops>34128</mflops>
  </group>
  <mflops>21328</mflops>
</mflops_intel_mkl>
8.6.3.4. Fallback Floor Value
If none of the previous alternatives succeeds, a historical fallback value is used. This value is taken from the lowest-performing system known at the moment. It is intended only to prove that the device or feature being tested is working, not that it is performing optimally. In the imb_pingpong_intel_mpi test module, floor values are filled in only for the rdssm fabric.
9. Third Party Copyright Notices
The product comprises the following software, and the following information is made available in compliance with these licenses:
The Intel® MPI Library Test Suite is based in part on the MPI C++ Test Suite. This product includes software developed at the Ohio Supercomputer Center at The Ohio State University, the University of Notre Dame and the Pervasive Technology Labs at Indiana University with original ideas contributed from Cornell University. For technical information contact Andrew Lumsdaine at the Pervasive Technology Labs at Indiana University. For administrative and license questions contact the Advanced Research and Technology Institute at 1100 Waterway Blvd. Indianapolis, Indiana 46202, phone 317-274-5905, fax 317-274-5902.
Software License for LAM/MPI
Copyright (c) 2001-2003 The Trustees of Indiana University. All rights reserved.
Copyright (c) 1998-2001 University of Notre Dame. All rights reserved.
Copyright (c) 1994-1998 The Ohio State University. All rights reserved.
Indiana University has the exclusive rights to license this product under the following license. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1) All redistributions of source code must retain the above copyright notice, the list of authors in the original source code, this list of conditions and the disclaimer listed in this license;
2) All redistributions in binary form must reproduce the above copyright notice, this list of conditions and the disclaimer listed in this license in the documentation and/or other materials provided with the distribution;
3) Any documentation included with all redistributions must include the following acknowledgement: "This product includes software developed at the Ohio Supercomputer Center at The Ohio State University, the University of Notre Dame and the Pervasive Technology Labs at Indiana University with original ideas contributed from Cornell University. For technical information contact Andrew Lumsdaine at the Pervasive
Technology Labs at Indiana University. For administrative and license questions contact the Advanced Research and Technology Institute at 1100 Waterway Blvd. Indianapolis, Indiana 46202, phone 317-274-5905, fax 317-274-5902." Alternatively, this acknowledgement may appear in the software itself, and wherever such third-party acknowledgments normally appear.
4) The name "LAM" or "LAM/MPI" shall not be used to endorse or promote products derived from this software without prior written permission from Indiana University. For written permission, please contact Indiana University Advanced Research & Technology Institute.
5) Products derived from this software may not be called "LAM" or "LAM/MPI", nor may "LAM" or "LAM/MPI" appear in their name, without prior written permission of Indiana University Advanced Research & Technology Institute.
Indiana University provides no reassurances that the source code provided does not infringe the patent or any other intellectual property rights of any other entity. Indiana University disclaims any liability to any recipient for claims brought by any other entity based on infringement of intellectual property rights or otherwise.
LICENSEE UNDERSTANDS THAT SOFTWARE IS PROVIDED "AS IS" FOR WHICH NO WARRANTIES AS TO CAPABILITIES OR ACCURACY ARE MADE. INDIANA UNIVERSITY GIVES NO WARRANTIES AND MAKES NO REPRESENTATION THAT SOFTWARE IS FREE OF INFRINGEMENT OF THIRD PARTY PATENT, COPYRIGHT, OR OTHER PROPRIETARY RIGHTS. INDIANA UNIVERSITY MAKES NO WARRANTIES THAT SOFTWARE IS FREE FROM "BUGS", "VIRUSES", "TROJAN HORSES", "TRAP DOORS", "WORMS", OR OTHER HARMFUL CODE. LICENSEE ASSUMES THE ENTIRE RISK AS TO THE PERFORMANCE OF SOFTWARE AND/OR ASSOCIATED MATERIALS, AND TO THE PERFORMANCE AND VALIDITY OF INFORMATION GENERATED USING SOFTWARE.
Indiana University has the exclusive rights to license this product under this license.
Intel® MPI Benchmarks is made available under the Common Public License. The source code is made available with this product and may also be downloaded from http://www.intel.com/cd/software/products/asmo-na/eng/219848.htm.
Intel® MPI Benchmarks (Common Public License)
IMPORTANT - READ BEFORE COPYING, INSTALLING OR USING
Common Public License Version 1.0
THE ACCOMPANYING PROGRAM IS PROVIDED UNDER THE TERMS OF THIS COMMON PUBLIC LICENSE ("AGREEMENT"). ANY USE, REPRODUCTION OR DISTRIBUTION OF THE PROGRAM CONSTITUTES RECIPIENT'S ACCEPTANCE OF THIS AGREEMENT.
DEFINITIONS
"Contribution" means:
* in the case of the initial Contributor, the initial code and documentation distributed under this Agreement, and
* in the case of each subsequent Contributor:
  o changes to the Program, and
  o additions to the Program;
where such changes and/or additions to the Program originate from and are distributed by that particular Contributor. A Contribution 'originates' from a Contributor if it was added to the Program by such Contributor itself or anyone acting on such Contributor's behalf. Contributions do not include additions to the Program which: (i) are separate modules of software distributed in conjunction with the Program under their own license agreement, and (ii) are not derivative works of the Program.
"Contributor" means any person or entity that distributes the Program.
"Licensed Patents " mean patent claims licensable by a Contributor which are necessarily infringed by the use or sale of its Contribution alone or when combined with the Program.
"Program" means the Contributions distributed in accordance with this Agreement.
"Recipient" means anyone who receives the Program under this Agreement, including all Contributors.
GRANT OF RIGHTS
Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, distribute and sublicense the Contribution of such Contributor, if any, and such derivative works, in source code and object code form.
Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free patent license under Licensed Patents to make, use, sell, offer to sell, import and otherwise transfer the Contribution of such Contributor, if any, in source code and object code form. This patent license shall apply to the combination of the Contribution and the Program
if, at the time the Contribution is added by the Contributor, such addition of the Contribution causes such combination to be covered by the Licensed Patents. The patent license shall not apply to any other combinations which include the Contribution. No hardware per se is licensed hereunder.
Recipient understands that although each Contributor grants the licenses to its Contributions set forth herein, no assurances are provided by any Contributor that the Program does not infringe the patent or other intellectual property rights of any other entity. Each Contributor disclaims any liability to Recipient for claims brought by any other entity based on infringement of intellectual property rights or otherwise. As a condition to exercising the rights and licenses granted hereunder, each Recipient hereby assumes sole responsibility to secure any other intellectual property rights needed, if any. For example, if a third party patent license is required to allow Recipient to distribute the Program, it is Recipient's responsibility to acquire that license before distributing the Program.
Each Contributor represents that to its knowledge it has sufficient copyright rights in its Contribution, if any, to grant the copyright license set forth in this Agreement.
REQUIREMENTS
A Contributor may choose to distribute the Program in object code form under its own license agreement, provided that:
* it complies with the terms and conditions of this Agreement; and
* its license agreement:
  o effectively disclaims on behalf of all Contributors all warranties and conditions, express and implied, including warranties or conditions of title and non-infringement, and implied warranties or conditions of merchantability and fitness for a particular purpose;
  o effectively excludes on behalf of all Contributors all liability for damages, including direct, indirect, special, incidental and consequential damages, such as lost profits;
  o states that any provisions which differ from this Agreement are offered by that Contributor alone and not by any other party; and
  o states that source code for the Program is available from such Contributor, and informs licensees how to obtain it in a reasonable manner on or through a medium customarily used for software exchange.
When the Program is made available in source code form:
o it must be made available under this Agreement; and
o a copy of this Agreement must be included with each copy of the Program. Contributors may not remove or alter any copyright notices contained within the Program.
Each Contributor must identify itself as the originator of its Contribution, if any, in a manner that reasonably allows subsequent Recipients to identify the originator of the Contribution.
COMMERCIAL DISTRIBUTION
Commercial distributors of software may accept certain responsibilities with respect to end users, business partners and the like. While this license is intended to facilitate the commercial use of the Program, the Contributor who includes the Program in a commercial product offering should do so in a manner which does not create potential liability for other Contributors. Therefore, if a Contributor includes the Program in a commercial product offering, such Contributor ("Commercial Contributor") hereby agrees to defend and indemnify every other Contributor ("Indemnified Contributor") against any losses, damages and costs (collectively "Losses") arising from claims, lawsuits and other legal actions brought by a third party against the Indemnified Contributor to the extent caused by the acts or omissions of such Commercial Contributor in connection with its distribution of the Program in a commercial product offering. The obligations in this section do not apply to any claims or Losses relating to any actual or alleged intellectual property infringement. In order to qualify, an Indemnified Contributor must:
* promptly notify the Commercial Contributor in writing of such claim, and * allow the Commercial Contributor to control, and cooperate with the Commercial Contributor in, the defense and any related settlement negotiations. The Indemnified Contributor may participate in any such claim at its own expense.
For example, a Contributor might include the Program in a commercial product offering, Product X. That Contributor is then a Commercial Contributor. If that Commercial Contributor then makes performance claims, or offers warranties related to Product X, those performance claims and warranties are such Commercial Contributor's responsibility alone. Under this section, the Commercial Contributor would have to defend claims against the other Contributors related to those performance claims and warranties, and if a court requires any other Contributor to pay any damages as a result, the Commercial Contributor must pay those damages.
NO WARRANTY
EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, THE PROGRAM IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NONINFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Each Recipient is solely responsible for determining the appropriateness of using and distributing the Program and assumes all risks associated with its exercise of rights under this Agreement, including but not limited to the risks and costs of program errors, compliance with applicable laws, damage to or loss of data, programs or equipment, and unavailability or interruption of operations.
DISCLAIMER OF LIABILITY
EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, NEITHER RECIPIENT NOR ANY CONTRIBUTORS SHALL HAVE ANY LIABILITY FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING WITHOUT LIMITATION LOST PROFITS), HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OR DISTRIBUTION OF THE PROGRAM OR THE EXERCISE OF ANY RIGHTS GRANTED HEREUNDER, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
GENERAL
If any provision of this Agreement is invalid or unenforceable under applicable law, it shall not affect the validity or enforceability of the remainder of the terms of this Agreement, and without further action by the parties hereto, such provision shall be reformed to the minimum extent necessary to make such provision valid and enforceable.
If Recipient institutes patent litigation against a Contributor with respect to a patent applicable to software (including a cross-claim or counterclaim in a lawsuit), then any patent licenses granted by that Contributor to such Recipient under this Agreement shall terminate as of the date such litigation is filed. In addition, if Recipient institutes patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Program itself (excluding combinations of the Program with other software or hardware) infringes such Recipient's patent(s), then such Recipient's rights granted under Section 2(b) shall terminate as of the date such litigation is filed.
All Recipient's rights under this Agreement shall terminate if it fails to comply with any of the material terms or conditions of this Agreement and does not cure such failure in a reasonable period of time after becoming aware of such noncompliance. If all Recipient's rights under this Agreement terminate, Recipient agrees to cease use and distribution of the Program as soon as reasonably practicable. However, Recipient's obligations under this Agreement
and any licenses granted by Recipient relating to the Program shall continue and survive.
Everyone is permitted to copy and distribute copies of this Agreement, but in order to avoid inconsistency the Agreement is copyrighted and may only be modified in the following manner. The Agreement Steward reserves the right to publish new versions (including revisions) of this Agreement from time to time. No one other than the Agreement Steward has the right to modify this Agreement. IBM is the initial Agreement Steward. IBM may assign the responsibility to serve as the Agreement Steward to a suitable separate entity. Each new version of the Agreement will be given a distinguishing version number. The Program (including Contributions) may always be distributed subject to the version of the Agreement under which it was received. In addition, after a new version of the Agreement is published, Contributor may elect to distribute the Program (including its Contributions) under the new version. Except as expressly stated in Sections 2(a) and 2(b) above, Recipient receives no rights or licenses to the intellectual property of any Contributor under this Agreement, whether expressly, by implication, estoppel or otherwise. All rights in the Program not expressly granted under this Agreement are reserved.
This Agreement is governed by the laws of the State of New York and the intellectual property laws of the United States of America. No party to this Agreement will bring a legal action under this Agreement more than one year after the cause of action arose. Each party waives its rights to a jury trial in any resulting litigation.
License for Use of "Intel® MPI Benchmarks" Name and Trademark
In addition to the provisions of the Common Public License as included in the Intel® MPI Benchmarks distribution, Intel® grants the recipient the right to use the name and trademark "Intel® MPI Benchmarks" in relation to disclosures or publications of results, provided that bespoke results were obtained by running the benchmarks generated from the original, unchanged source code as distributed by Intel®.
Under no circumstances shall the recipient be permitted to use the name and trademark "Intel® MPI Benchmarks" in relation to results obtained by running benchmarks generated from source code that is different from the original source code as distributed by Intel®, regardless of whether the differences are caused by modifying the existing benchmark components or by adding new components.
dmidecode is made available under the GNU Public License (GPL):
Copyright (C) 2000-2002 Alan Cox <[email protected]>
Copyright (C) 2002-2007 Jean Delvare <khali@linuxfr.org>
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
A copy of the HPC Challenge Benchmark License is made available in accordance with the requirements of the license.
License
Copyright © 2011 The University of Tennessee. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
∙ Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
∙ Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer listed in this license in the documentation and/or other materials provided with the distribution.
∙ Neither the name of the copyright holders nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. in no event shall the copyright owner or contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage.
External XML parsing software is included as part of this product.
Libexpat library for XML parsing is used in this product under the MIT License:
Copyright (c) 1998, 1999, 2000 Thai Open Source Software Center Ltd and Clark Cooper
Copyright (c) 2001, 2002, 2003, 2004, 2005, 2006 Expat maintainers.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
The tool includes software developed at the University of Tennessee, Knoxville, Innovative Computing Laboratories, and neither the University nor ICL endorses or promotes this product. Although HPL 2.0 is redistributable under certain conditions, this particular package is subject to the MKL license.
-----------------------------------------------------------------------------------------------------------
HPL Copyright Notice and Licensing Terms
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions, and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. All advertising materials mentioning features or use of this software must display the following acknowledgment: This product includes software developed at the University of Tennessee, Knoxville, Innovative Computing Laboratory.
4. The name of the University, the name of the Laboratory, or the names of its contributors may not be used to endorse or promote products derived from this software without specific written permission.
Disclaimer
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS `AS IS' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE UNIVERSITY OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.