Date posted: 18-Jul-2015
Category: Technology
Uploaded by: larry-lo
Table of Contents

1. Introduction
2. Install Docker on Ubuntu 14.04
3. Set Hadoop Environment in a Docker Container
4. Set HBase Environment in a Docker Container
5. Export and Import a Docker Image between Nodes in Cluster
6. The Problem I Haven't Solved
7. Possible Problems
Distribute Cloud Environment with Docker

This is a basic tutorial to help developers or system administrators build a basic cloud environment with Docker.

In this book, I will not use a Dockerfile to create containers because I don't know how to use that yet. XDDD

At the end of this book, I will summarize some problems I haven't solved yet.

If there is any mistake, please let me know.
Install Docker on Ubuntu 14.04 LTS

Ubuntu-maintained Package Installation

$ sudo apt-get update
$ sudo apt-get install docker.io

Enable tab-completion of Docker commands in BASH:

$ source /etc/bash_completion.d/docker.io

Docker-maintained Package Installation

Note: if you want to install the most recent version of Docker, you do not need to install docker.io from Ubuntu.

First, check that your system can deal with https URLs: /usr/lib/apt/methods/https should exist. If it doesn't, you need to install the package apt-transport-https:

$ apt-get update
$ apt-get install apt-transport-https

Add the Docker repository key to your system keychain:

$ sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 36A1D7869245C8950F966E92D8576A8BA88D21E9

Add the Docker repository to your apt source list:

$ sudo sh -c "echo deb http://get.docker.io/ubuntu docker main > /etc/apt/sources.list.d/docker.list"
$ sudo apt-get update
$ sudo apt-get install lxc-docker

To verify that everything has worked as expected:

$ sudo docker run -i -t ubuntu /bin/bash

This should download the latest version of the Ubuntu image and then start bash in a new container.

Basic Docker Command-line

Show running containers:
$ sudo docker ps

Show all images in your local repository:

$ sudo docker images

Run a container from a specific image:

$ sudo docker run -i -t <image_id || repository:tag> /bin/bash

Start an existing container:

$ sudo docker start -i <container_id>

Attach to a running container:

$ sudo docker attach <container_id>

Detach without shutting down a container:

[Ctrl-p] + [Ctrl-q]
Reference

https://docs.docker.com/installation/ubuntulinux/#ubuntu-trusty-1404-lts-64-bit
Set Hadoop Environment in a Docker Container

In a new Docker container, you need to set up the basic environment before setting up Hadoop.

Update Apt List

$ sudo apt-get update

Install Java JDK

$ sudo apt-get install default-jdk

The default JDK will be installed at /usr/lib/jvm/<java-version>

Install some needed packages

$ sudo apt-get install git wget vim ssh

Create a user to manage the hadoop cluster

$ adduser hduser

Grant the user privileges:

$ visudo

Append the hduser you just created below root and give it the same privilege specification as root.

Generate an SSH authorized key to allow socket connections without a password

$ ssh-keygen -t rsa
$ cat .ssh/id_rsa.pub >> .ssh/authorized_keys

Set the port of SSH and SSHD
Because I use Docker as my distribution tool, the default ssh port 22 is already taken by the host machine, so I need another port for communication between Docker containers on different host machines. In my example, I will use port 2122 for listening and sending requests.

ssh

$ sudo vi /etc/ssh/ssh_config
-> Port 2122
:wq!

sshd

$ sudo vi /etc/ssh/sshd_config
-> Port 2122
-> UsePAM no
:wq!
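If you prefer to script these edits instead of using vi, a minimal sed sketch follows. It operates on a scratch copy of sshd_config (an assumption for safe demonstration); on a real container the targets would be /etc/ssh/ssh_config and /etc/ssh/sshd_config.

```shell
# Same edits the tutorial makes with vi, done non-interactively with sed.
# A scratch copy is used here so this runs safely outside a container.
mkdir -p /tmp/ssh-demo
printf 'Port 22\nUsePAM yes\n' > /tmp/ssh-demo/sshd_config

# Change the listening port and disable PAM, as in the vi session above.
sed -i 's/^Port 22$/Port 2122/; s/^UsePAM yes$/UsePAM no/' /tmp/ssh-demo/sshd_config

cat /tmp/ssh-demo/sshd_config
```

On a real node, remember to restart the ssh service after the edit so the new port takes effect.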
I assume my cloud environment is as follows:

VM: 5 nodes (master, master2, slave1, slave2, slave3)
OS: Ubuntu 14.04 LTS
Docker version: 1.3.1
Hadoop version: 2.6.0

Download hadoop-2.6.0

$ wget http://ftp.twaren.net/Unix/Web/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
$ tar zxvf hadoop-2.6.0.tar.gz
Set environment paths

$ sudo vi /etc/profile
-> export JAVA_HOME=/usr/lib/jvm/<java-version>
-> export HADOOP_HOME=<YOUR_HADOOP_PACKAGE_PATH>
-> export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
-> export PATH=$HADOOP_HOME/bin:$PATH
-> export CLASSPATH=$HADOOP_HOME/lib:$CLASSPATH
:wq!

$ source /etc/profile
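To confirm the profile edits took effect, you can simulate them in a shell and check the derived variables; the concrete paths below are placeholder examples, not fixed locations from this tutorial.

```shell
# Simulate the /etc/profile edits in the current shell and sanity-check
# that the derived variables resolve. Paths are example values.
export JAVA_HOME=/usr/lib/jvm/default-java
export HADOOP_HOME="$HOME/hadoop-2.6.0"
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
export PATH="$HADOOP_HOME/bin:$PATH"

# HADOOP_CONF_DIR should expand to <HADOOP_HOME>/etc/hadoop.
echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
```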
Modify Hadoop configuration

All needed configuration files are stored in <HADOOP_HOME>/etc/hadoop.
core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
    <description>The master endpoint in the cluster.</description>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/<HADOOP_HOME>/temp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>
hdfs-site.xml

<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master2:9001</value>
    <description>Set a secondary namenode in case the master node crashes.</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/<HADOOP_HOME>/dfs/name</value>
    <description>Set the storage location of the namenode.</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/<HADOOP_HOME>/dfs/data</value>
    <description>Set the storage location of the datanode.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
    <description>The number of replicas in the cluster.</description>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
</configuration>
mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>
yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
</configuration>
hadoop-env.sh

$ vi <HADOOP_CONF_DIR>/hadoop-env.sh
-> export JAVA_HOME=/usr/lib/jvm/<java-version>
:wq!

yarn-env.sh

$ vi <HADOOP_CONF_DIR>/yarn-env.sh
-> export JAVA_HOME=/usr/lib/jvm/<java-version>
:wq!

slaves

$ vi <HADOOP_CONF_DIR>/slaves
-> master2
-> slave1
-> slave2
-> slave3
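A quick way to catch a broken edit before committing the image is to check that each configuration file still has balanced tags; a rough sketch follows (it only compares tag counts rather than doing full XML validation, and it uses a scratch copy of core-site.xml for demonstration):

```shell
# Check that every <property> has a matching </property> and that the
# file still ends with </configuration>. A scratch copy stands in for
# the real <HADOOP_HOME>/etc/hadoop files.
mkdir -p /tmp/hadoop-conf-demo
cat > /tmp/hadoop-conf-demo/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
EOF

for f in /tmp/hadoop-conf-demo/*.xml; do
  opens=$(grep -c '<property>' "$f")
  closes=$(grep -c '</property>' "$f")
  if [ "$opens" -eq "$closes" ] && grep -q '</configuration>' "$f"; then
    echo "$f looks balanced"
  else
    echo "$f is malformed"
  fi
done
```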
Set HBase Environment in a Docker container

After setting up the Hadoop environment in the previous section, you can set up the hbase environment now.

In the following example, I will use a custom zookeeper to manage the resources of my cluster.

HBase version: 0.99.2
Zookeeper version: 3.3.6

Download HBase and Zookeeper

$ wget http://ftp.twaren.net/Unix/Web/apache/hbase/hbase-0.99.2/hbase-0.99.2-bin.tar.gz
$ wget http://ftp.twaren.net/Unix/Web/apache/zookeeper/zookeeper-3.3.6/zookeeper-3.3.6.tar.gz
$ tar -zxvf hbase-0.99.2-bin.tar.gz
$ tar -zxvf zookeeper-3.3.6.tar.gz

Set the configuration of HBase

hbase-site.xml

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.master</name>
    <value>master:60000</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value><ZOOKEEPER_HOME>/data</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
</configuration>
hbase-env.sh

$ vi <HBASE_HOME>/conf/hbase-env.sh
-> export HBASE_HOME=<HBASE_HOME>
-> export HADOOP_HOME=<HADOOP_HOME>
-> export HBASE_CLASSPATH=$HADOOP_CONF_DIR
-> export HBASE_MANAGES_ZK=false

regionservers

$ vi <HBASE_HOME>/conf/regionservers
-> master2
-> slave1
-> slave2
-> slave3

As in the hbase-env.sh settings above, you can see I set HBASE_MANAGES_ZK=false in order to use my custom zookeeper to manage and monitor the resources of the cluster.
Set the configuration of Zookeeper

zoo.cfg

$ vi <ZOOKEEPER_HOME>/conf/zoo.cfg
-> dataDir=<ZOOKEEPER_HOME>/data
-> clientPort=2181
-> server.1=master:2888:3888

Then add a myid file under <ZOOKEEPER_HOME>/data to tell zookeeper which node this zookeeper server is running on.

For example, since my zoo.cfg sets server.1=master:2888:3888, this zookeeper server runs on the master node, binding port 2888 and port 3888. So I need to tell zookeeper which machine that is via its id.

$ vi myid
-> 1
:wq!

Set System Environment

$ sudo vi /etc/profile
-> export HBASE_HOME=<HBASE_HOME>
-> export ZOOKEEPER_HOME=<ZOOKEEPER_HOME>
-> export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HBASE_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH
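If you later add more server.N lines for additional zookeeper nodes, the myid value can be derived from zoo.cfg instead of typed by hand. A sketch using a scratch copy (the hostname is hard-coded to master for the demo; a real node would use $(hostname)):

```shell
# Derive this node's myid from the server.N lines in zoo.cfg.
# Scratch paths stand in for <ZOOKEEPER_HOME>.
mkdir -p /tmp/zk-demo/data
cat > /tmp/zk-demo/zoo.cfg <<'EOF'
dataDir=/tmp/zk-demo/data
clientPort=2181
server.1=master:2888:3888
EOF

me=master   # on a real node: me=$(hostname)
# Extract N from the server.N line whose host matches this machine.
myid=$(sed -n "s/^server\.\([0-9][0-9]*\)=${me}:.*/\1/p" /tmp/zk-demo/zoo.cfg)
echo "$myid" > /tmp/zk-demo/data/myid
cat /tmp/zk-demo/data/myid
```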
Brief Summary

After following the previous sections to set the Hadoop and HBase configurations, we can commit this Docker image to distribute the cloud cluster.

The following figure shows how I plan to distribute my cloud cluster.

Export and Import a Docker Image between Nodes in Cluster

NOTE: My experiment environment is on Windows Azure Virtual Machines.

If you followed the previous sections, you now have a Docker container with a Hadoop environment. To use it, you need to duplicate it to the other endpoints across which you want to build the hadoop cluster.

Export a Docker container to an image.

Save the container as a compressed tar file:

$ sudo docker save <image_repository:tag> > XXXX.tar

After finishing the export, I use scp to transport the image to the other nodes in the cluster:

$ scp -P [port] XXXX.tar [account]@[domain]:<where you want to store it>

Now, switch to master2 to show how to use the image we just transported from master:

$ sudo docker load < XXXX.tar

After loading the tar file, we can check the image just imported into the local repository:

$ sudo docker images
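The save-and-scp steps can be collected into one script. Here is a dry-run sketch that only writes the commands to a file instead of executing them, so it runs without Docker; the image tag, tarball name, and target path are assumptions, and the node names and ssh port come from the earlier sections.

```shell
# Dry-run distributor: write the export/copy commands to a script
# instead of executing them.
IMAGE="myrepo/hadoop:latest"   # example image tag
TARBALL="hadoop-image.tar"     # example file name
SSH_PORT=2122                  # the custom ssh port set earlier
NODES="master2 slave1 slave2 slave3"

{
  echo "sudo docker save $IMAGE > $TARBALL"
  for node in $NODES; do
    echo "scp -P $SSH_PORT $TARBALL hduser@$node:/home/hduser/"
  done
} > /tmp/distribute.sh

cat /tmp/distribute.sh
```

Reviewing the generated /tmp/distribute.sh before running it is a cheap way to catch a wrong port or node name.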
Distribute with Docker Container

Now we can start using this image to distribute the cluster. In this example, I write a simple shell script to run the Docker image instead of using a Dockerfile, because I haven't learned how to use a Dockerfile to build a Docker container.

Because the --link option does not fit my situation, I use plain port mapping to try to connect the containers to each other over the network.

Shell script

$ vi bootstrap.sh
-> sudo docker run -i -t -p 2122:2122 -p 50020:50020 -p 50090:50090 -p 50070:50070 -p 50010:50010 -p 50075:50075 -p 8031:8031 -p 8032:8032 -p 8033:8033 -p 8040:8040 -p 8042:8042 -p 49707:49707 -p 8088:8088 -p 8030:8030 -p 9000:9000 -p 9001:9001 -p 3888:3888 -p 2888:2888 -p 2181:2181 -p 16020:16020 -p 60010:60010 -p 60020:60020 -p 60000:60000 -p 9090:9090 -h <hostname> <repository:tag>
:wq!

$ sh bootstrap.sh

In this example, I use master2 as the hostname and expose all needed ports from the container to the endpoint machine.
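Typing that -p list by hand is error-prone; it can be generated from a plain list of ports instead. A dry-run sketch that only prints the resulting docker run command (the hostname and image tag are example values, and the port list below is a subset for brevity):

```shell
# Build the docker run command from a port list (dry run: prints it only).
PORTS="2122 9000 9001 8030 8031 8032 8033 8088 2181 2888 3888 60000 60010 60020"
HOST_NAME=master2            # example hostname
IMAGE="myrepo/hadoop:latest" # example image tag

cmd="sudo docker run -i -t"
for p in $PORTS; do
  cmd="$cmd -p $p:$p"   # map each container port to the same host port
done
cmd="$cmd -h $HOST_NAME $IMAGE"

echo "$cmd"
```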
Now, we can boot the hadoop cluster. There are some steps before start-dfs.sh.

Every container in the cluster needs to run the following:

$ source /etc/profile
$ sudo vi /etc/hosts
-> put all IPs and hostnames

## for example
127.0.0.5 master
10.0.0.2 master2
10.0.0.3 slave1
10.0.0.4 slave2
10.0.0.5 slave3
:wq!

## restart ssh twice
$ sudo service ssh restart
$ sudo service ssh restart
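The hosts entries above can also be generated from a single node list, so every container gets an identical file. A sketch that writes to a scratch file rather than the real /etc/hosts (the IPs are example values):

```shell
# Build an /etc/hosts fragment from the cluster node list.
# Written to a scratch file here; on a real container you would
# append it to /etc/hosts.
NODES="10.0.0.1:master 10.0.0.2:master2 10.0.0.3:slave1 10.0.0.4:slave2 10.0.0.5:slave3"

: > /tmp/hosts.frag
for n in $NODES; do
  ip=${n%%:*}     # text before the first colon
  host=${n##*:}   # text after the last colon
  printf '%s\t%s\n' "$ip" "$host" >> /tmp/hosts.frag
done

cat /tmp/hosts.frag
```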
The master container needs to run the following:

## format the hdfs namenode
$ hdfs namenode -format
$ <HADOOP_HOME>/sbin/start-dfs.sh
$ <HADOOP_HOME>/sbin/start-yarn.sh

## make the root directory of hbase
$ hadoop fs -mkdir /hbase

## start zookeeper
$ zkServer.sh start

## start hbase
$ start-hbase.sh

## TEST
$ jps

If successful, you will see the expected processes running on master.
Now we can start using hadoop and hbase to record and analyze data by following this tutorial. Nevertheless, there are still problems I've met but not solved.

The Problem I Haven't Solved

In section 4, after I docker run each container, I have to modify /etc/hosts in every container so the containers can connect to each other. But this causes two problems.

First, it is inconvenient to modify every hosts file when there is a large number of endpoint machines.

Second, if a container reboots, its IP will be automatically reset by Docker. Suppose you use HBase as your NoSQL DB: Zookeeper will store the old IP, and none of the region servers will be able to trace back to the master node.

I've surveyed some methods to solve this but have not implemented them yet. Look at the following link:

http://jpetazzo.github.io/2013/10/16/configure-docker-bridge-network/

If I solve these problems, I will update the book.

Update

2015/01/20

I've referenced the link above and tried to solve the second problem I met. Some new problems came up, so I haven't solved it yet.

I've successfully modified the interface IP and route of a container, but I can't ssh into the container after doing that. The following code is what I use; maybe everybody can discuss this issue on Stack Overflow:

http://stackoverflow.com/questions/27937185/assign-static-ip-to-docker-container

# Find the container's network namespace via its PID.
pid=$(sudo docker inspect -f '{{.State.Pid}}' <container_name> 2>/dev/null)
sudo rm -rf /var/run/netns/*
sudo ln -s /proc/$pid/ns/net /var/run/netns/$pid

# Create a veth pair and attach one end to the docker0 bridge.
sudo ip link add A type veth peer name B
sudo brctl addif docker0 A
sudo ip link set A up

# Move the other end into the container's namespace and rename it to eth0.
sudo ip link set B netns $pid
sudo ip netns exec $pid ip link set eth0 down
sudo ip netns exec $pid ip link delete eth0
sudo ip netns exec $pid ip link set dev B name eth0
sudo ip netns exec $pid ip link set eth0 address 12:34:56:78:9a:bc
sudo ip netns exec $pid ip link set eth0 down
sudo ip netns exec $pid ip link set eth0 up

# Assign the static IP and default route inside the container.
sudo ip netns exec $pid ip addr add 172.17.0.1/16 dev eth0
sudo ip netns exec $pid ip route add default via 172.17.42.1