

Hadoop Cluster Administration Lab Guide

September 2015, v5.0

    For use with the following courses:

ADM 200, ADM 201, ADM 202, ADM 203


    PROPRIETARY AND CONFIDENTIAL INFORMATION

2015 MapR Technologies, Inc. All Rights Reserved.

This Guide is protected under U.S. and international copyright laws, and is the exclusive property of MapR Technologies, Inc.

2015, MapR Technologies, Inc. All rights reserved. All other trademarks cited here are the property of their respective owners.


    Get Started

    Icons Used in This Guide

This lab guide uses the following icons to draw attention to different types of information:

Note : Additional information that will clarify something, provide additional details, or help you avoid mistakes.

    CAUTION : Details you must read to avoid potentially serious problems.

    Q&A : A question posed to the learner during a lab exercise.

    Try This! Extra exercises you can complete to strengthen learning.

    Lab Requirements

    You will need the following to complete the labs for this course:

Access to a physical or virtual cluster with at least 4 nodes. Instructions on setting up virtual clusters using either Amazon Web Services (AWS) or Google Cloud Platform (GCP) are included with the course files.

Visit http://doc.mapr.com/display/MapR/Preparing+Each+Node for information on required specifications for the nodes.

SSH access to the nodes. For Mac users, this is built into the standard terminal program. Windows users may need to download and install an additional utility for this (such as PuTTY). If you are using GCP, you can SSH into the nodes from the GCP console.

    You can download PuTTY at http://www.putty.org .

    Note : Make sure that you can access the nodes via SSH before starting the labs.


Using This Guide

1. You will select one of your nodes to be the master node. Most of the work will be performed from the master node, and the other nodes will be accessed from there.

2. When command syntax is presented in this guide, any arguments that are enclosed in chevrons, <like this>, should be substituted with an appropriate value. For example, this:

# cp <source file> <destination file>

might be entered as this:

# cp /etc/ssh/sshd_config /etc/ssh/sshd_config.bak

Note : Sample commands provide guidance, but do not always reflect exactly what you will see on the screen. For example, if there is output associated with a command, it may not be shown.

    Tips for Using AWS and GCP Clusters

    Using AWS Clusters

    Use these conventions when working with an AWS cluster for the labs:

    AWS Clusters Information

IP Addresses: Each node in your cluster will have both an internal IP address and an external IP address. To view them, log into the AWS console and click on the node.

Use the external IP address to connect to a node in your cluster from a node outside the cluster (for example, from a terminal window on your laptop).

Use the internal IP address to connect from one node in the cluster to another.

Default: The default AWS user name is ec2-user. Whenever you see <user> in a command sample in the lab guide, substitute ec2-user.

Log in as root: The user ec2-user has sudo privileges. To log into a node as root, first log in as ec2-user and then sudo to root:

$ sudo -i

SSH access: To connect to a node in your cluster, use the .pem file that was provided with the course materials (or that you downloaded when you created your AWS instances). For example:

ssh -i <.pem file> ec2-user@<external IP address>


    Using GCP Clusters

    Use these conventions when working with a GCP cluster for the labs:

    GCP Clusters Information

IP Addresses: Each node in your cluster will have both an internal IP address and an external IP address. To determine the IP addresses of your nodes, run this command from your terminal window (where you installed the Google Cloud SDK):

    7,"./2 ,.>-/'% #3)'43,%) "#)'

To connect to the node from a system outside the cluster, such as your laptop, use the SSH button in the Google Developer's Console (see the information below on SSH access).

Use the internal IP address to connect from one node in the cluster to another.

Default: The default user name will be based on the Google account under which your project was created. If you are unsure of the default user name, connect to one of your GCP nodes (see SSH access, below). The login prompt displayed will include your user name. For example:

[username@node1 ~]$

SSH access: In the Google Developer's Console, navigate to Compute Engine > VM instances. Click the SSH button to the right of the node you want to connect to.

    Lab GS1: SSH Into Your Nodes

    Estimated time to complete: 10 minutes

    Overview

The purpose of this lab is to make sure you can connect to the nodes you will be using throughout the course. This is required for all of the remaining labs.


Note : Instructions in this section are specific to the nodes that are used in the classroom training. If you are in the classroom training, make sure you download the course files to your system before beginning.

If you are taking the on-demand training, you will need to provide your own nodes, and the method for connecting to those nodes may differ from the instructions presented here.

If you are using GCP nodes, you can SSH into them directly from the Google Developer's Console.

If you are using AWS nodes, you should have downloaded the .pem file when you created your instances. Windows users of AWS nodes will need to convert the .pem file to a .ppk file; instructions for doing that can be found in the AWS documentation, last seen here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/putty.html.

    Connect to your Nodes: Windows with PuTTY

This procedure assumes that you are using PuTTY as your terminal application. Other applications will work, but you will need to adjust the instructions accordingly.

    If you are using a Mac or Linux system, skip to the next section for instructions.

1. Open a PuTTY window. In the Host Name (or IP address) field, enter the external IP address for the master node in your cluster.

2. In the Category list on the left-hand side, navigate to Connection > SSH > Auth .


3. A new screen will appear when you click Auth in the menu. Browse for the .ppk file (supplied with the course files for classroom training), and click Open to open a terminal window.

4. Once the terminal window opens up, log in as the default user ( ec2-user , for AWS nodes). A password is not required since you are using the .ppk file to authenticate. Once logged in, you will see the command prompt, and be able to sudo to the root user.

5. Log out of the node, and repeat for the other nodes in your cluster. You can also open multiple PuTTY windows to have access to multiple nodes at the same time.


    Connect to your Nodes: Mac or Linux

Follow these instructions to SSH into your nodes from a Mac or Linux system.

1. Set permissions on the .pem file (supplied with the course files for classroom training) to 600, if they are not already set correctly:

$ chmod 600 <.pem file>

    2. Use a terminal window to SSH into the master node in your assigned cluster:

$ ssh -i <.pem file> <user>@<external IP address>

Make sure to use the external IP address for the node. On AWS clusters, <user> will be ec2-user.

Once logged in, you should see the command prompt. You should also be able to sudo to the root user:

$ sudo -i

    3. Log out of the node, and repeat for the other nodes in your cluster.

    Lab GS2: Set Up Passwordless SSH

In the Prepare for Installation labs, you will use clustershell to copy files from the master node to other nodes in the cluster. For this to work, you must have passwordless SSH set up between your nodes. Set up passwordless SSH only on the first 3 nodes in your cluster.

Note : Follow the instructions in Appendix A of the lab guide for details on how to set up passwordless SSH. These instructions were written for the AWS nodes that are used in the classroom training; this procedure may be different for on-demand training students.

Classroom students: Check with your instructor before proceeding, as passwordless SSH may have been set up in advance.


    Lesson 1: Prepare for Installation

Lab 1.1: Audit the Cluster

Estimated time to complete: 15 minutes

    Install the Pre-Install and Post-Install Tools

1. Log in to the master node as root.

    2. Download the zip file:

wget http://course-files.mapr.com/ADM200.zip

3. Extract the files from ADM200.zip; this will create two directories (post-install and pre-install). Verify that the directories exist, and contain files.

# unzip ADM200.zip
# ls pre-install
# ls post-install

    Install and Configure Clustershell

1. Install the package clustershell-1.6-1.el6.noarch.rpm, located in the pre-install directory.

# rpm -i pre-install/clustershell-1.6-1.el6.noarch.rpm

2. Edit the /etc/clustershell/groups file to include all the internal IP addresses for the three nodes in your cluster, separated by spaces or commas. The file should contain just this line:

all: <IP address master node> <IP address node 1> <IP address node 2>

    Note : Be sure to use the internal IP addresses for your nodes.

    For example:
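A sketch with placeholder internal addresses (substitute the internal IP addresses of your own nodes):

all: 10.0.0.11 10.0.0.12 10.0.0.13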


3. Test that clush provides passwordless access to all the nodes of the cluster:

# clush -a date

    Copy the Scripts

    1. Copy the ,!""#,)!'/1=*#522 and ,!""#,)"*#/1=*#522 directories to all of your nodes:

    < -2.*( /5 //-")E ,!""#,)!'/1=*#522 ,!""#,)"*#/1=*#5222. When that completes, confirm that all of the nodes have a copy of the directories:

# clush -Ba ls -la /root/ | grep pre-install
# clush -Ba ls -la /root/ | grep post-install

    Run a Cluster Audit

    1. From the master node, run an audit of the nodes in the cluster:

    < ,!""#,)!'/1=*#522,-2.*#'!/5.C1#3*( G #'' -2.*#'!/5.C1#32"&

Note : On some OS versions, an SSH bug causes tcgetattr: Invalid argument errors to be displayed to the screen during the audit. These can be ignored.

2. View the output file to evaluate your cluster hardware. Look for hardware or firmware levels that are mismatched, or nodes that don't meet the baseline requirements to install Hadoop.


    Lab 1.2: Run Pre-Install Tests

    Evaluate Network Bandwidth

Estimated time to complete: 5 minutes

1. As root on the master node, type this command to start the network test:

    < ,!""#,)!'/1=*#522,='#%"!I/#'*#3*( G #'' ='#%"!I/#'*#32"&

Press enter at the prompt to continue. This runs the RPC test to validate the network bandwidth. This test will take a few minutes to run.

Note : Results should be about 90% of peak bandwidth. So with a 1GbE network, expect to see results of about 115MB/sec. With a 10GbE network, look for results around 1100MB/sec. If you are not seeing results in this range, then you need to check with your network administrators to verify the connections and firmware.

    With virtual clusters, expect to see lower than optimal results.
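As a rough check on those numbers: 1GbE carries at most 1000 Mb/sec, or 125 MB/sec, and 90% of that is roughly 112 to 115 MB/sec; the same calculation for 10GbE gives roughly 1100 to 1125 MB/sec.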

    Evaluate Data Flow

    Estimated time to complete: 5 minutes

1. Type the following command to run the stream utility:

    < -2.*( /F5 ,!""#,)!'/1=*#522,4'4"!E/#'*#3*( G #'' 4'4"!E/#'*#32"&

    As with the network performance test, it will take a few minutes to complete.

    2. Review the results.

This tests the memory performance of the cluster. The exact bandwidth of memory is highly variable and is dependent on the speed of the DIMMs, the number of memory channels and, to a lesser degree, the CPU frequency.

    Evaluate Raw Disk Performance

    Estimated time to complete: 20 minutes

Caution! This test destroys any existing data on the disks it uses. Make sure the drives do not have any needed data on them, and that you do not run this test after you have installed MapR on the cluster.

The first step lets you view the disks that will be used in the test: review the output carefully to make sure the list contains only intended disks.


1. Type the command below to list the unused disks on each node. These are the disks that IOzone will run against, so be sure to examine the list carefully.

# clush -ab /root/pre-install/disk-test.sh

2. After you have verified the list of disks is correct, run the command with the --destroy argument:

# clush -ab /root/pre-install/disk-test.sh --destroy

Note : In the lab environment, the test will run for 15-20 minutes, depending on the number and sizes of the disks on your nodes. In a production environment with a larger cluster, it can take significantly longer.

    The test will generate one output log for each disk on your system. For example:

    KHCJ/1";"='32"&KHC-/1";"='32"&

    KHCC/1";"='32"&3. If there are many different drives, the output can be difficult to read. The *.44AL;"='3*( script

    creates a summary of the output. Run the script and review the output:

    < -2.*( /5 M,!""#,)!'/1=*#522,*.44AL;"='3*(M

Note : The script assumes that the log files are in the present working directory, so the script must be run from the directory that contains the log files.

    Keep the results of this and the other benchmark tests for post-installation comparison.

    Lab 1.3: Plan Service Layout

    Estimated time to complete: 10 minutes

You are a system administrator for company ABC, which is just getting started with Hadoop. You will be installing and configuring a small 3-node cluster for initial deployment, with high availability. The R&D and Marketing departments will share this cluster.

    Q: What type(s) of nodes (data, control, or control-as-data) will your cluster have?

A: Since you only have three nodes, you will need to spread the control services (such as ZooKeeper and CLDB) out over all of the nodes. To do this and still have room for data, you will need to use control-as-data nodes.


    Fill out the chart below to show where the various services will be installed on your cluster:

Services to place: ZooKeeper, CLDB, MFS, NFS, History Server, ResourceManager, NodeManager, Warden

Node 1 (in rack A): ________
Node 2 (in rack B): ________
Node 3 (in rack C): ________

    Try this! How would you configure a 10-node cluster with the same requirements?

Services to place: ZooKeeper, CLDB, MFS, NFS, History Server, ResourceManager, NodeManager, Warden

Node 1 (in rack A): ________
Node 2 (in rack A): ________
Node 3 (in rack A): ________
Node 4 (in rack B): ________
Node 5 (in rack B): ________
Node 6 (in rack B): ________
Node 7 (in rack B): ________
Node 8 (in rack C): ________
Node 9 (in rack C): ________
Node 10 (in rack C): ________


Lesson 2: Install a MapR Cluster

Lab 2.1: Install a MapR Cluster

    Estimated time to complete: 45 minutes

    Preparation

1. Log in as root on the master node, then download and run the mapr-setup.sh script:

# wget http://package.mapr.com/releases/installer/mapr-setup.sh
# bash ./mapr-setup.sh

    This script will prepare the node to use the browser-based installer.

Note : If you see the message, Error: Nothing to do, you can ignore it.

Accept the defaults for the mapr admin user, UID, GID, and password. Enter mapr as the password, then re-enter it to confirm.

    2. When the script completes, point your browser to the node at port 9443:
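For example, assuming the installer's default HTTPS endpoint (substitute your node's external IP address):

https://<external IP address>:9443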

Note : For virtual clusters, the mapr-setup.sh script may display the internal IP address here in this screen. Make sure you use the external IP address of the node.


3. Open the requested URL to continue with the installation. Ignore any warnings you receive about the connection not being secure, and log into the installer as the user mapr (the password will also be mapr , unless you changed it when running the mapr-setup.sh script).

4. From the main MapR Installer screen, click Next in the lower right corner to start moving through the installation process. The first screen is Select Version & Services .

    Select Version & Services

    The first screen is Select Version & Services , as shown below.

    1. From the MapR Version pull-down menu, set the MapR Version to 5.0.0.

    2. In the Edition field, select Enterprise Database Edition (this is the default).

    3. Under License Option , select Add License After Installation Completes .

4. In the Select Services section, the option Data Lake: Common Hadoop Services is selected by default as the Auto-Provisioning Template . Perform the following actions to review services, and to set them for the course:

    a. Click Show advanced service options .

    This displays the services that will be installed with the selected template.


b. Change the Auto-Provisioning Template selection to Data Exploration: Interactive SQL with Apache Drill to see how it changes the services template. Then select Operational Analytics: NoSQL database with MapR-DB to see those options.

    c. Change the selection to Custom Services and select the following:

HBase/MapR-DB Common
YARN + MapReduce

    5. Click Next to advance to the Database Setup screen.

    Database Setup

The entries displayed on this screen will depend on which services were selected on the previous screen (a database needs to be selected for Hue, Oozie, Metrics, or Hive). Since we did not select any services requiring a database setup, there will be no database to configure.

    Click Next to advance to the Set Up the Cluster screen.


    Set Up the Cluster

    This screen is where you define the MapR Administrator Account, and name your cluster.

1. The MapR Administrator Account section will show the values you entered when you ran the setup script. In the Password field, enter the password for the mapr user.

2. In the Cluster Name field, enter a name for your cluster. If you are in the instructor-led course, use the cluster name that was assigned to you in the .hosts file to be sure your cluster name is not the same as another student's.

    3. Click Next to advance to the Configure Nodes screen.

    Configure Nodes

This screen is where you define the nodes to include in your cluster, the disks that will be used, and the authentication method. The hostname of the install node will already be filled in for you.


1. In the Nodes section, enter the fully qualified hostnames of the three nodes that will be in your cluster, one per line. The hostname of the install node will generally be filled in for you.

Caution! Install the cluster on just your first 3 nodes. The 4th node will be used in a later lab, and should not have MapR installed on it at this time.

    For example:
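A sketch with hypothetical hostnames (substitute the fully qualified names of your own nodes):

node1.example.com
node2.example.com
node3.example.com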

2. In the Disks section, enter the names of the disks that will be used by the cluster. You can run a command such as this on a node to verify the disks on your system:

# fdisk -l | grep dev

    Enter the disks as a comma-separated list. For example:
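A sketch using hypothetical device names (AWS instances typically expose disks as /dev/xvd*; list the devices actually present on your nodes):

/dev/xvdb,/dev/xvdc,/dev/xvdd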

    3. In Configure Remote Authentication , select SSH Password as the Login Method. Then:

    a. Enter root as the SSH Username .

    b. Enter mapr as the SSH Password .

c. Leave the SSH Port set to 22 .

4. Click Next to advance to the Verify Nodes screen.


    Verify Nodes

Verify Nodes verifies that each node can be reached and meets the minimum requirements. When complete, the node icons will display as green (ready to install), yellow (warning), or red (cannot install).

1. To check the status of a yellow or red node, click on the node icon. A box on the right-hand side of the screen will appear, with details of any warnings or errors found. For example:

Note : In the screenshot above, there are warnings because the nodes do not have swap space. This will often be the case with virtual AWS or GCP clusters. You can proceed with warnings, but in a production environment it is best to correct any issues and re-verify the nodes.

    2. When ready, click Next to advance to the Configure Service Layout screen.


    Configure Service Layout

The Configure Service Layout screen displays the services that will be installed on the nodes. From here, you can rearrange and add services as well.

1. Click View Node Layout to see which services will be installed on which nodes. After reviewing the layout, click Close .

2. Click Advanced Configuration and review the screen. Services are divided into logical groupings; you can change these groupings, or add groups of your own. Make some changes to the service layout for practice:

a) Look for the DEFAULT group. Drag the NFS service icon to the row below it, to make a new group.

    b) Change the group name to NFS .

    c) In the Nodes column of the new group, click the Modify button.

    d) Select one or more nodes to be in the NFS group by clicking on the node icons.

e) Click OK . You now have those nodes in the NFS group so the NFS service will be installed on them.

f) Click the trashcan icon to the right of the NFS group to delete it. Click OK to verify the deletion.


g) Scroll up to the top of the page. The NFS service icon is now unprovisioned and must be placed in a group before the installation can proceed.

    h) Drag the NFS service icon back to the DEFAULT group.

3. At the bottom of the page, click Restore Defaults , then OK to confirm. This will restore the default service layout.

4. Configure the service layout to match what you came up with in the service layout lab for Company ABC, if it differs from the default layout.

Caution! The service layout you configured for Company ABC does not list all of the services that will go on a MapR cluster. Do not delete services from this screen that do not appear on your worksheet; just make sure that the services that DO appear on your worksheet are laid out the way you intend.

Note : Make note of which node is running the JobHistoryServer, and record its external IP address. You will need this information later when viewing job history.

    5. Drag the Webserver service into the DEFAULT group, so it will be installed on all three nodes.

    6. Click Save to save the service layout, then click Install to start the installation process.

    Installing MapR

The installation will take approximately 30 minutes to complete on a 3-node lab cluster. Time required in a production environment will vary based on what is being installed, and the number of nodes.

When the install is complete, click Next . Since you did not install a license at the start of the installation, a page will appear letting you know that a license must be entered. Click Next to advance to the final step of the installation.


Lab 2.2: Install a MapR License

1. Launch the MapR Control System (MCS) UI by pointing your browser at the external IP address of your install node, at port 8443 (or by clicking the link on the last page of the Installer). Ignore any messages that appear about the connection not being secure, and continue on.

2. Log into the MCS as the user mapr , and accept the license agreement.

3. Click on the Manage Licenses link in the upper right-hand corner of the MCS to open the

    licensing window. Then:

a. If you already have a license file, click Add licenses via upload and browse to the location of the license file. Then click Apply Licenses .

b. If you do not have a license file, click Add licenses via Web . This will prompt you to log into your mapr.com account (if you have one), or create an account. From there, you can register your cluster and download a trial license.

4. Some nodes in the cluster may have orange icons in the node heatmap, indicating degraded service. This is normal since some services were started before the license was applied. You will typically have to restart CLDB and NFS Gateway services.

    To restart any failed services:

a. Find the Services panel on the right side of the dashboard, and note any services that have failures:

b. Click on the red number in the Fail column to see a list of nodes where the specified service has failed. Select all the nodes and click the Manage Services button at the top.


    c. Find the service you want to restart. From the drop-down list next to the service name,select Restart. Then click OK .

5. Back at the Services pane, check to see that all the NFS Gateway services are running. You may not see failures: instead, you may just see no active NFS services:

    If they are not running, restart them as you did with the CLDB service.

6. Return to the dashboard. You should see all green node icons (you may have to refresh the screen).


    Lessons Learned

    Some of the key takeaways from this lesson are listed below.

The MapR installer will guide you through the installation process, and make sure that interdependencies are not violated.

    The MapR Installer verifies that nodes are ready for installation before proceeding.

    Plan your service layout prior to installing the MapR software. In particular:

o Make sure that you have identified where the key control services (CLDB, ZooKeeper, ResourceManager) will be running in the cluster.

o Ensure that you have enough instances of the control services to provide high availability for your organization if it is required.

    After installing MapR, apply a license and restart any failed services.


Lesson 3: Verify and Test the Cluster

Lab 3.1: runRWSpeedTest

    Estimated time to complete: 10 minutes

    The !"#$%&'(()*(+, script uses an HDFS API to stress test the IO subsystem.

1. Log into the master node as root.

    2. Run the test:

    . 01"+2 345 67!--,7'-+,38#+,5117!"#$%&'(()*(+,9+26 : ,(( $%&'(()*(+,91-;

    The output provides an estimate of the maximum throughput the I/O subsystem can deliver.

3. Compare the results to the results from the pre-install IOzone disk tests.

4. Create a volume for the benchmark data, mounted at /benchmarks:

# maprcli volume create -name benchmarks -path /benchmarks


5. Open the MCS to the dashboard view so you can watch node utilization while the next step (TeraSort) is running.

a. At the top of the dashboard, set the heatmap to show Disk Space Utilization so you will see the load on each node. It should be spread relatively evenly across the cluster. Hotspots suggest a problem with a hard drive or its controller.

b. On the right-hand side of the dashboard, the YARN section will display information on the job as it is running.

    6. Type the following to sort the newly created data:

yarn jar /opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0-mapr-1506.jar terasort /benchmarks/…


c. Jobs are listed with the most recent job at the top. Click the Job ID link to see job details. It will show the number of map and reduce tasks, as well as how many attempts were failed, killed, or successful:

d. To see the results of the map or reduce tasks, click on Map in the Task Type column. This will show all of the map tasks for that job, their statuses, and the elapsed time.

    You can keep drilling down to get detailed information on each task in the job.

Lessons Learned

Running benchmark tests after installation gives you a performance baseline that you can refer to later, and helps you spot any concerns early on.

    Jobs can be monitored with the MCS, or with the JobHistoryServer.


Lesson 4: Configure Cluster Storage

Lab 4.1: Configure Node Topology

    Estimated time to complete: 10 minutes

Topology can be changed through the MCS or the command line. In this lab, you will assign each of your nodes to a separate topology: /data/rack1, /data/rack2, or /data/rack3.

    1. In the MCS, navigate to Cluster > Nodes .

    2. Select the first node by checking the box next to it.

    3. Click Change Topology . The Change Topology dialog box appears.

4. Type in the topology path, /data/rack1 , to create and assign the new topology. Then click OK .

5. Repeat the steps to assign /data/rack2 to a different node.

6. Using the maprcli node move command at the command line, assign the /data/rack3 topology to the last node.

For syntax information, enter maprcli node move with no arguments.

To determine the node's server ID, run maprcli node list -columns id .
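As a sketch, with a hypothetical server ID taken from the list command (use the ID reported for your own node):

# maprcli node list -columns id
# maprcli node move -serverids 5458096248739914907 -topology /data/rack3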


    7. Create the /decommissioned topology:

a. Using either the command line or the MCS, move one of the nodes to the /decommissioned topology to create it.

b. Move the node back to its appropriate topology. The /decommissioned topology will remain, but have no nodes assigned to it.

8. Verify that the /decommissioned topology was created, and that all of the nodes are assigned to their correct topologies under /data:

    *#+%&,- /0"1 $0+0*#+%&,- /0"1 ,-4$ 5740/ 8 9%1+ $0+0

    Lab 4.2: Create Volumes

    Estimated time to complete: 30 minutes

    Overview

The marketing and R&D departments will be sharing nodes in the cluster, but still want to keep access to their data isolated. To do this, you will create directories and volumes in the cluster for the departments and their users.

    Add Users and Groups

    The table below lists users and groups in the marketing and R&D departments.

Name     Type   UID  GID
mkt      group  --   6000
miner    group  --   6001
sharon   user   600  6000
keith    user   601  6000
rnd      group  --   7000
cobra    group  --   7001
rattler  group  --   7002
jenn     user   700  7000
tucker   user   701  7000
mark     user   702  7000
marje    user   703  7000
porter   user   704  7000


Before these users and groups can be assigned volumes or permissions in the cluster, they must exist at the OS level on each node in the cluster. They must have the same name, UID, and GID on each node.

    1. Log into your master node as root.

2. Create UNIX users and groups for all of the entities in the table above. Use clush to facilitate the operations. For example:

# clush -a
clush> groupadd mkt -g 6000
clush> useradd sharon -g 6000 -u 600
clush> ...
clush> quit
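A non-interactive sketch covering the remaining entries in the table above (names, UIDs, and GIDs are taken from the table; adjust them if your assignments differ):

# clush -a groupadd -g 6001 miner
# clush -a groupadd -g 7000 rnd
# clush -a groupadd -g 7001 cobra
# clush -a groupadd -g 7002 rattler
# clush -a useradd -g 6000 -u 601 keith
# clush -a useradd -g 7000 -u 700 jenn
# clush -a useradd -g 7000 -u 701 tucker
# clush -a useradd -g 7000 -u 702 mark
# clush -a useradd -g 7000 -u 703 marje
# clush -a useradd -g 7000 -u 704 porter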

    Create Directories and Volumes

    Now build the following hierarchy to store project data. In the diagram below, the rectangles (such as projects ) represent directories in the cluster, and the triangles represent volumes.

    1. Log into the master node as root.

2. Create the directories shown in the diagram above (the rectangular entries). Use the command:

    :#"00+ A4 5*'"-% 5+ B+#$: Volumes .

    2. In the list of volumes, click on NFStest to open the volume properties.

3. Expand the Snapshot Scheduling section. Select the Critical data schedule.

    4. Click OK . This creates a snapshot schedule for the volume.

Note : The Critical data schedule takes a snapshot every hour, at the top of the hour. Depending on what time it is when you apply the schedule, it may take up to an hour for the snapshot to be created.

5. Navigate to MapR-FS > Schedules . You will see that the schedule you selected for the volume is listed as In use . Click the green checkmark to see information about the schedule, including when the snapshot will expire.

6. Click the drop-down box that shows Hourly , and change it to Every 5 min . Change the Retain for field to be 1 hour, and click Save Schedule . This changes the Critical data schedule for every volume that is using the schedule. Now a scheduled snapshot will be taken every 5 minutes, instead of at the top of the hour.

    7. Click the schedule name to see a list of all volumes that are associated with that schedule.


    Create a Manual Snapshot

1. In the MCS, navigate back to MapR-FS > Volumes .

    2. Check the box to the left of the NFStest volume.

    3. At the top of the pane, click Volume Actions .

4. Click New Snapshot . The Create New Snapshot dialog box appears. Enter Manual-1 as the name of the snapshot, and click OK .

    5. Navigate to MapR-FS > Snapshots . The snapshot you took should be listed there.

    Restore Data From a Snapshot

    1. From the command line, view the list of snapshots:

# maprcli volume snapshot list

    2. List the hidden directory that contains the snapshot:

# hadoop fs -ls /NFStest/.snapshot

    3. Remove a file from the volume, then list the volume to see that it is gone:

# hadoop fs -rm /NFStest/<file name>
# hadoop fs -ls /NFStest

4. Restore the file from the Manual-1 snapshot:

# hadoop fs -cp /NFStest/.snapshot/Manual-1/<file name> /NFStest

    5. Verify that the file has been restored:

# hadoop fs -ls /NFStest/<file name>


6. Since the cluster file system is mounted, you can also use Linux commands to see the status of the file. Use the ls command to see that the file has been restored:

# ls /mapr/<cluster name>/NFStest/<file name>

Note : Remember that "/" in the hadoop command is the root of the cluster file system, and the "/" in Linux is the root of the local file system. This is why different paths are specified for the hadoop fs -ls command, and the Linux ls command.

    Remove or Preserve a Snapshot

You can view a list of snapshots in the MCS, but you won't be able to see their contents. You can also preserve a snapshot that is scheduled to expire, or you can delete a snapshot.

    1. In the MCS, navigate to MapR-FS > Snapshots .

2. Select the manual snapshot by checking the box to its left. Since there is no expiration date on a manual snapshot, you do not have the option to preserve it. Click Remove Snapshot to delete it.

3. At the command line, verify that the snapshot is gone:

# hadoop fs -ls /NFStest/.snapshot

    4. Select one of the scheduled snapshots, and click Preserve Snapshot .

    Preserving it removes the expiration date.


    Lab 5.3: Configure a Local Mirror

    Estimated time to complete: 30 minutes

    1. In the MCS, navigate to MapR-FS > Volumes .

    2. Click New Volume . Create a new volume with the following properties:

    Volume Type Local Mirror Volume

    Mirror Name NFStest-mirror

    Source Volume Name NFStest

    Mount Path /NFStest-mirror

    Topology /data

    Replication 2

    NS Replication 2

    Mirror Schedule Normal data

    3. Click OK . This creates and mounts the local mirror volume.

    4. Navigate to MapR-FS > Mirror Volumes . Verify that your mirror volume appears.

    Q: The Last Mirrored and % Done columns do not contain information. Why not?

A: Your actions created the mirror volume, but did not start the mirroring process. If you want the volume to mirror right away, you can manually start the mirror. Otherwise, it will start at the time specified in the schedule (which you can see by navigating to MapR-FS > Schedules ).

    Follow these steps to force the mirror volume to start synchronizing prior to the scheduled time:

    1. Navigate to MapR-FS > Mirror Volumes .

    2. Check the box next to the mirror volume you just created.

    3. Click Volume Actions . From the pull-down menu, select Start Mirroring . This will start thesynchronization process.


    Create a Custom Schedule

Three schedules are created by default: Normal data, Important data, and Critical data. You can also create custom schedules.

    1. In the MCS, navigate to MapR-FS > Schedules .

    2. Click New Schedule .

    3. Name the new schedule Quarterly .

4. In the Schedule Rules section, click the arrow on the first drop-down box. This will show all of the intervals that can be selected for your schedule.

a. Select Yearly .

b. Set the next field to on the 31st .

c. Set the next field to March .

d. Set the retain time to 1 year.

This sets the action to occur on March 31st each year, and be kept for one year.

5. Click Add Rule to add another rule to your schedule. Add a total of three more rules that will:

Run yearly on June 30th, and be retained for 1 year.

Run yearly on September 30th, and be retained for 1 year.

Run yearly on December 31st, and be retained for 2 years.

6. Click Save Schedule when you have fully defined the schedule. The newly created schedule appears in the schedule list, and is available for mirrors and snapshots.

Note : Schedules are available to be used for either snapshots or mirrors. If a schedule is applied to a mirror volume, the retain time is ignored (the mirror will not expire). If the schedule is used for a snapshot, the snapshot will automatically be deleted when the retain interval is met.


    Lab 5.4: Configure a Remote Mirror

    Estimated time to complete: 30 minutes

For this lab, you will need a second cluster. For classroom training, the instructor will pair you with another student, and you will each create a remote mirror with the other student's cluster. On-demand training students will need to install a second cluster to perform this lab. You can create a single-node cluster on the 4th node that you used for the NFS exercise.

    Note : To create a remote mirror, the following conditions must be met:

    Each cluster must already be up and running.

Each cluster must have a unique name.

Every node in each cluster must be able to resolve all nodes in remote clusters, either through DNS or entries in /etc/hosts.

The MapR user for both the local (source) and remote (destination) clusters must have the same UID.

You need to have dump permission on the source volume, and restore permissions on the mirror volumes at the destination cluster.

    Edit the Source Cluster Configuration

1. Determine the cluster name and CLDB nodes on the destination cluster, by viewing this file on the destination cluster:

/opt/mapr/conf/mapr-clusters.conf

2. Log into a node on the source cluster.

3. Edit the /opt/mapr/conf/mapr-clusters.conf file. You should see a line with the source cluster name and a list of its CLDB nodes. Add a second line to describe the destination cluster, in the form:

<cluster name> <CLDB node1>:7222 <CLDB node2>:7222 <CLDB node3>:7222
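A sketch of the resulting file on the source cluster, with hypothetical cluster and host names (cluster1 is the source, cluster2 the destination):

cluster1 node1.example.com:7222 node2.example.com:7222 node3.example.com:7222
cluster2 node5.example.com:7222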

    4. Restart the warden on all of the nodes in the source cluster:

# clush -a "service mapr-warden restart"

    Edit the Destination Cluster Configuration

    Perform the same steps to make the destination cluster aware of the source cluster:

    1. Log into a node on the destination cluster.

2. Edit the /opt/mapr/conf/mapr-clusters.conf file. Add a line to describe the source cluster, in the form:

<cluster name> <CLDB node1>:7222 <CLDB node2>:7222 <CLDB node3>:7222

    3. Restart the warden on all of the nodes in the destination cluster.


    Verify Cluster Configuration

Verify that each cluster has a unique name and is aware of the other cluster.

    1. Log on to the MCS of the source cluster.

    2. Verify that the name of the source cluster is listed at the top.

3. Click the + symbol next to the name and verify that the destination cluster is listed under Available Clusters .

    4. Click on the link for the destination cluster to open the MCS for the destination cluster.

Note : For an AWS cluster, it will attempt to connect using the internal IP address, which will fail. You will need to connect using the node's external IP address.

    5. Repeat these steps to verify visibility on the destination cluster's MCS.

    Create a Remote Mirror Volume with MCS

1. Log into the MCS on the destination cluster.

2. Navigate to MapR-FS > Volumes .

    3. Use the New Volume button to set up a Remote Mirror Volume .

    Set the Volume Type to Remote Mirror Volume .

    Assign a descriptive name as the Mirror Name .


For the Source Volume Name , enter the name of the volume on the source cluster that you want to mirror.

For the Source Cluster Name , enter the name of the source cluster (once you start typing, the cluster name should appear so you can select it).

For the Mount Path , enter the path on the destination cluster where the mirror volume will be mounted.

    Under Permissions , make sure that the user mapr has restore permissions.

4. Click OK to create the volume. The volume should appear in your volume list: navigate to MapR-FS > Mirror Volumes to verify.

    Initiate Mirroring to the Destination Cluster

    1. In the MCS, select the remote mirror volume you created.

2. Click Volume Actions and select Start Mirroring . Give the volume a few minutes to finish mirroring.

3. Log into any node on the destination cluster, and list the contents of the destination mirror volume:

# hadoop fs -ls <remote mirror mount point>

    Or, if the volume is mounted via NFS, you can simply use operating system commands:

# ls /mapr/<cluster name>/<remote mirror mount point>

You should see the same contents in the mirror volume as you do in the source volume.

4. Click on the volume name to open volume properties.

5. In the Snapshot Scheduling section, set a Mirror Schedule . This will ensure that the remote mirror is updated on a regular basis.


Lessons Learned

With the cluster file system mounted, you can use standard Linux commands to copy data into the cluster.

Snapshots can be created manually, or on a schedule. Snapshots that are created manually do not have a set expiration date. Snapshots that are created on a schedule will have an expiration date, but they can be preserved before they expire if you want to keep them longer.

Mirror volumes must be synchronized after they are created. They can be synchronized manually, or with a schedule.

When a schedule is applied to a mirror volume, the retain time is ignored (data in mirror volumes does not expire; the mirror is updated with new data each time it is synchronized).

    Remote mirrors are set up between two clusters, typically for disaster recovery purposes.

With a local mirror volume, the data is pushed from the source volume to the mirror volume. With a remote mirror volume, data is pulled by the remote mirror from the source.


Lesson 6: Monitor Your Cluster

Lab 6.1: Monitor Cluster Health

    Estimated time to complete: 10 minutes

    Check Cluster Heat Map

    1. In the MCS, navigate to Cluster > Dashboard .

    The Cluster Heatmap is displayed in the center panel.

2. By default, the cluster heatmap shows the Health view. From the drop-down list at the top of the heat map, choose other options to see how they impact the node icons:

You can also use the drop-down list to filter by certain alarm types. For example, if you select Node Alarm Heartbeat Processing Slow , you will see alarms on any nodes that have a slow heartbeat:

    3. Click on a node icon to see more on the node's status. The view shows information on:

    The node's performance

    MapReduce slots (for MRv1)


    Database operations

    MapR-FS disks

    System disks

    Services running on the node

    Check for Service Failures

    1. On the dashboard of the MCS, look at the Services pane:

The pane lists services running on the cluster. In particular, look for any numbers in the Fail column.

2. Click a failed service: a screen displays that shows which node has the failed service. From here, you can start or restart the service.


    Lab 6.2: Stop, Start, and Restart Services

    Estimated time to complete: 5 minutes

    You can start, stop, and restart services through either the MCS, or with the command-line interface.

1. In the MCS, navigate to Cluster > Nodes .

2. Click the checkbox next to one or more nodes, then click Manage Services at the top of the screen.

    3. A list of services displays. From the drop-down list next to a service name, choose to start, stop,or restart the service on the selected nodes. Then click OK .

    4. You can also start and stop services from the command line:

# maprcli node services -<name> start/stop/restart -nodes <list of nodes>
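For example, to restart the NFS gateway on a single node (hypothetical hostname):

# maprcli node services -nfs restart -nodes node2.example.com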


    Lab 6.3: Perform Maintenance

    Estimated time to complete: 20 minutes

    Replace a Failed Disk

If a drive goes down, access to the entire storage pool is lost even if the other two disks are fully operational. Follow these steps to hot-swap a drive and rebuild the storage pool.

Note : You will perform this procedure on a "healthy" disk, since you do not have any failed disks in your lab cluster. The procedure is the same for a disk that has actually failed.

1. When a drive fails, a Data Under Replicated alarm is raised. Check the alarm in the MCS to determine which node had a disk failure.

2. If there was an actual disk failure, you would view the logs to determine the cause, to make sure the disk needs to be replaced. The log files are located at /opt/mapr/logs/faileddisk.log .

    3. In the MCS, navigate to Cluster > Nodes .

4. Click the name of the node with the failed drive: the node properties display (for the purposes of this lab, you can select any of the nodes).

    5. Scroll down to MapR-FS and Available Disks .

6. Scroll to the right and check the Storage Pool ID. Make note of all of the disks included in the same storage pool:

7. Check the box next to the failed disk, and click Remove Disk(s) from MapR-FS .


8. Click OK . All of the disks in the storage pool will be taken offline and removed from MapR-FS. The File System column will be empty, and the Used column will show 0%:

    9. Replace the failed disk with a new one.

10. Select all the devices that were part of the storage pool, and click Add Disks to MapR-FS . The disks will be assigned the next available Storage Pool ID, and the File System will once again show as MapR-FS .

Decommission a Node

Use the /decommissioned topology to take a node offline, either for retirement or to perform extended maintenance.

    1. In the MCS, navigate to Cluster > Nodes .

    2. Check the box next to the node you will take offline.

    3. Click Change Topology .

4. In the drop-down list, select the /decommissioned topology. Then click OK . The node is moved to the decommissioned topology. Since the containers on the node belong in a different topology (such as /data), the system will initiate the process of creating new copies of the data on available nodes.


Appendix A: Set Up Passwordless SSH

Estimated time to complete: 30 minutes

If passwordless SSH is not already set up on your cluster, follow these instructions. For the instructor-led course ADM 2000, check with your instructor prior to completing these steps.

Note : These instructions are specific to the classroom training, which uses AWS nodes for the clusters. If you are taking the on-demand course and supplying your own nodes, you will need to make adjustments.

    Allow Root Logins on All Nodes

    Log into each of your nodes as the !""# user, one at a time, and perform the following steps:

1. Change the password of the root account to mapr. Ignore any cautions about the password strength.

# passwd

2. Copy the /etc/ssh/sshd_config file to sshd_config.bak, and then edit the file:

# cp /etc/ssh/sshd_config /etc/ssh/sshd_config.bak
# vi /etc/ssh/sshd_config

3. In the /etc/ssh/sshd_config file, comment or uncomment lines to match what is shown below:

    9-!$3#:""#;"431


    Generate and Copy Key Pairs

    1. Log in as the !""# user on the master node, and run the ))/@7-