Linux command line basics II: downloading data and controlling files
Yanbin Yin
Page 1: Linux command line basics II: downloading data and ...cys.bios.niu.edu/yyin/teach/PBB/linux-cmd2.pdf · Linux command line basics II: downloading data and controlling files Yanbin



Things you should know about programming

Learning programming requires hands-on practice, a lot of practice.

Hearing me describe a command or a program helps, but you will not be able to do it unless you type in the code and run it to see what happens.

Reading others' code helps, but is often harder than writing it yourself from scratch.

Although painful and frustrating, troubleshooting is normal and part of the learning experience (ask experienced people or Google).

To avoid errors, you have to follow the rules; most programming errors happen because of not knowing or forgetting the rules.

Use comments in case you forget what your code means.

write -> run -> errors -> edit -> errors -> ... -> run -> success

Good news: finished scripts can be reused or edited for later use.


Homework #7

1. Create a folder under your home called hw7
2. Change directory to hw7
3. Go to the NCBI ftp site, find the genome, bacteria, E. coli MG1655 folder, and download the ptt file and the faa file in there
4. Create a copy of the ptt file: if the original file is called A.ptt, name the copied file A.ptt.bak. Do the same thing for the faa file
5. Read chapter 5 of http://edu.isb-sib.ch/pluginfile.php/2878/mod_resource/content/3/couselab-html/content.html and finish all the quizzes in there
6. Use what you learned in chapter 5 to count how many protein sequences are in the faa file of step 4

Write a report (in Word or PPT) that includes all the operations/commands and screenshots.

Due on Nov 10 (send by email)
Office hour: Tue, Thu and Fri 2-4pm, MO325A
Or email: [email protected]


What we learned last class:

file system, relative/absolute paths, working folder, home folder

ssh, pwd, ls, cd, mkdir, rmdir, rm, man, cp, mv

If things go wrong, try:

Ctrl+c (sometimes multiple times)

q to exit from a man page

http://korflab.ucdavis.edu/Unix_and_Perl/unix_and_perl_v3.1.1.pdf

View files: more, less, head, tail

How to use the Tab key to autocomplete

less /home/ then hit Tab twice: you will see all files/folders under /home/
less /home/yyin/ then hit Tab twice: you will see ...
less /home/yyin/U then hit Tab once: Unix_and_Perl_course will be autocompleted
less /home/yyin/Unix_and_Perl_course/ keep doing this until you get
less /home/yyin/Unix_and_Perl_course/Data/Arabidopsis/At_proteins.fasta

See next page for screenshots

Keys inside less:
q: quit viewing
↑ or ↓: move up or down a line
space: next page
/>: search for text '>'
b or PgUp: back a page
f or PgDn: forward a page
n: find next occurrence of the search text
G: go to the end
?abc: search backward for 'abc'

How to use the Tab key to autocomplete (screenshots)

more /home/yyin/Unix_and_Perl_course/Data/Arabidopsis/At_genes.gff

more is similar to less, but can do less than less

head /home/yyin/Unix_and_Perl_course/Data/Arabidopsis/chr1.fasta

head -20 /home/yyin/Unix_and_Perl_course/Data/Arabidopsis/chr1.fasta

head dumps the top few lines to the screen

tail /home/yyin/Unix_and_Perl_course/Data/Arabidopsis/intron_IME_data.fasta

tail -20 /home/yyin/Unix_and_Perl_course/Data/Arabidopsis/intron_IME_data.fasta

tail dumps the last few lines to the screen

more, less, head and tail do not load the whole file content into memory. You cannot edit the file content with them either; they are just viewers.
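The head and tail commands above can be tried without the course data. The sketch below is only an illustration: it builds a made-up 30-line file (sample.txt, a hypothetical name) in a temporary directory and uses the portable -n 20 spelling of -20.

```shell
# Build a small 30-line sample file in a temporary directory
# (a stand-in for the course data files, which may not exist on your machine).
workdir=$(mktemp -d)
seq 1 30 > "$workdir/sample.txt"

# Default behaviour: first 10 and last 10 lines
head "$workdir/sample.txt"
tail "$workdir/sample.txt"

# First 20 and last 20 lines; "-n 20" means the same as "-20"
first20=$(head -n 20 "$workdir/sample.txt")
last20=$(tail -n 20 "$workdir/sample.txt")

# Count the lines kept by head, and show where tail's output starts
echo "$first20" | wc -l
echo "$last20" | head -n 1
```

Piping tail into head, as in the last line, is a common way to look at a slice in the middle of a file.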


Create or edit files

Text editors: nano, pico, vi

Suppose you are at your home:

Write the top part of the At_genes.gff file to a new file
head -20 /home/yyin/Unix_and_Perl_course/Data/Arabidopsis/At_genes.gff > head

Try nano (intuitive user interface)
nano head

Try vi (command-driven interface, but much more powerful)
vi head

Create a file from scratch using vi.
1) type vi filename and hit Enter
2) once you are in vi, type i to get into edit mode, then copy & paste content into vi
3) hit Esc to exit edit mode and then type :x to save the file and exit vi


Input and output redirection: the greater-than sign

Unix has a special way to direct input and output from commands or programs.

By default, the input is from the keyboard (called standard input, stdin): you type in a command and the shell takes the command and executes it.

The standard output (stdout) goes to the terminal screen by default;

if the command or program fails, you will also have standard error (stderr) messages dumped to the terminal screen.

However, if you do not want the output dumped to the screen, you can use ">" to redirect/write the output into a file. For example, try

ls /home/yyin
ls /home/yyin > list
ls /home/yyim
ls /home/yyim 2> err

"2>" writes the error message to a file. No space between 2 and >!
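The same redirections can be reproduced on any machine. This is a minimal sketch that assumes nothing about /home/yyin: the folder data and its files are made up for the example, and no_such_dir plays the role of the mistyped /home/yyim.

```shell
# Work in a throwaway directory with a made-up folder to list
workdir=$(mktemp -d)
cd "$workdir"
mkdir data
touch data/a.txt data/b.txt

# stdout goes into the file "list"; nothing is printed on the screen
ls data > list

# The folder does not exist, so ls fails; its error message goes into "err"
ls no_such_dir 2> err || echo "ls failed; the message was saved in err"

# Look at what was captured
cat list
cat err
```

Note that ">" overwrites the target file each time; ">>" appends to it instead.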


vi basics

command mode <-> edit mode: i enters edit mode, Esc returns to command mode

The following commands operate in command mode (hit Esc before using them)
x            delete one character at cursor position
u            undo
dd           delete the current line
G            go to end of file
1G           go to beginning of file
10G          go to line 10
$            go to end of line
0            go to beginning of line
:q!          exit without saving
:w           save (but not exit)
:wq or :x    save and exit
Arrow keys: move cursor around (in both modes)

http://cbsu.tc.cornell.edu/ww/1/Default.aspx?wid=36


Search and substitution in vi

In command mode, you can do a number of fancy things. The most useful are:

- Search: hit slash ("/") to move the cursor to the bottom-left corner; type any word or letter to search for it; type n to go to the next instance

- Replace: hit Esc (at any time, hitting Esc to get back to the default status is the safest thing to do), then type ":1,$s/+/pos/g" and hit Enter to replace every "+" with "pos".

Try this in vi head

:1,$s/+/pos/g
    :      ready to type in a command
    1,$    from the first line to the last
    s      substitution
    +      the first field: to be replaced
    pos    the second field: to replace with
    g      all instances in a row

1) hit Esc to exit edit mode and then type :q! to exit vi WITHOUT saving the file.
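The same substitution can be run outside vi with sed, which uses the same s/old/new/g syntax. This swaps in a different tool than the slides use, so treat it as a side-by-side sketch; the two GFF-like lines are invented for the example.

```shell
# Two made-up, tab-separated lines with a strand column (+ or -)
workdir=$(mktemp -d)
printf 'Chr1\tgene\t3631\t5899\t+\nChr1\tgene\t6788\t9130\t-\n' > "$workdir/head"

# sed applies the substitution to every line by default, so no "1,$" range is needed.
# The result is written to a new file; the original "head" file is left untouched.
sed 's/+/pos/g' "$workdir/head" > "$workdir/head.pos"

cat "$workdir/head.pos"
```

Writing to a new file plays the same role as leaving vi with :q!, since the original is never modified.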


Wildcards and regular expressions

A regular expression (regex or regexp) is a very powerful tool for text processing. It is widely used in text editors (e.g. vi) and programming languages (e.g. shell commands such as sed, awk and grep, as well as Perl, Python and PHP) to automatically edit text (match and replace strings).

Finding and replacing exact words or characters is simple, e.g. the vi example shown above.

However, if you want to match multiple words or characters, you will need wildcards or patterns.


A list of commonly used wildcards and patterns:

*      previous item zero or more times, e.g. .* matches any run of characters (as a shell wildcard in file names, * on its own matches any string of characters)
.      any single letter, number or character, including special characters
^      (caret) start of a line
$      end of a line
^$     an empty line, i.e. nothing between ^ and $
[]     create your own pattern, e.g. [ATGC] matches one of the four letters only, [ATGC]{2} matches two such letters; [0-9]: any digit
\w     any letter (a-z and A-Z), digit or underscore
\d     any digit (0-9)
+      previous item at least one time, e.g. \w+ matches words of any size
{n}    (curly brackets) previous item n times, e.g. \w{5} matches words with exactly five letters
\s     space
\t     tab
\n     newline

http://www.bsd.org/regexintro.html
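A few of these patterns can be tried with grep. The sketch below uses grep -E (extended regular expressions) on a tiny made-up FASTA file; the file name toy.fasta and its contents are assumptions for illustration only.

```shell
# A made-up FASTA file: two header lines, two sequences, one empty line
workdir=$(mktemp -d)
cat > "$workdir/toy.fasta" <<'EOF'
>seq1 first toy protein
MKTAYIAKQR

>seq2 second toy sequence
ATGCATGC
EOF

# ^> : lines that start with ">" (one per sequence); -c counts matching lines
n_headers=$(grep -c '^>' "$workdir/toy.fasta")

# ^$ : empty lines
n_empty=$(grep -c '^$' "$workdir/toy.fasta")

# ^[ATGC]+$ : lines made only of the letters A, T, G and C
dna_only=$(grep -E '^[ATGC]+$' "$workdir/toy.fasta")

# ^[ATGC]{8}$ : lines made of exactly eight such letters
dna_eight=$(grep -E '^[ATGC]{8}$' "$workdir/toy.fasta")

echo "headers: $n_headers, empty lines: $n_empty"
echo "DNA-only lines: $dna_only"
echo "DNA lines of length 8: $dna_eight"
```

Support for shorthand classes such as \w and \d differs between tools, which is why this example sticks to anchors, brackets, + and {n}.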


Use regex inside vi

This overwrites the head file:
head -20 /home/yyin/Unix_and_Perl_course/Data/Arabidopsis/At_proteins.fasta > head

vi head

Inside vi, try :1,$s/ *//g

Hit u to undo

What about :1,$s/ .*//g

1) hit Esc to exit edit mode and then type :x to save the file and exit vi.
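To see what the two patterns do without opening vi, the same substitutions can be run with sed on a made-up FASTA header. This is a sketch with an invented header line, not the real At_proteins.fasta content.

```shell
# One invented header line and one sequence line
workdir=$(mktemp -d)
printf '>prot1 | toy description | chr1\nMEDQVGFGFR\n' > "$workdir/head"

# " *" matches zero or more spaces: with g, every space is removed
sed 's/ *//g' "$workdir/head" > "$workdir/nospace"

# " .*" matches a space followed by everything up to the end of the line:
# only the text before the first space (the sequence ID) is kept
sed 's/ .*//g' "$workdir/head" > "$workdir/ids"

cat "$workdir/nospace"
cat "$workdir/ids"
```

The second pattern is a quick way to shorten FASTA headers to bare identifiers; the sequence line has no space, so it passes through unchanged.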


Get data from remote ftp/http websites

ftp, lftp, sftp, ncftp

lftp addr     command to connect to a remote ftp server
cd dir        change to the directory
cd ..         change to the upper folder (..)
ls            list files and folders in the current directory at once
ls dir        list files and folders in dir at once
ls | less     list page by page (good if the list is too long)
get file      get a file
mirror dir    get a folder
zless file    view the file content
by or bye     exit lftp
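The interactive commands above can also be saved in a file and replayed with lftp -f. The sketch below only writes the script and prints it; it contacts the server only if lftp is installed and the variable RUN_LFTP (a hypothetical switch invented for this example) is set to 1. The genomes folder name is taken from the wget example in these slides and may no longer match the server layout.

```shell
# Write the lftp commands to a script file so the session can be replayed
workdir=$(mktemp -d)
cat > "$workdir/ncbi.lftp" <<'EOF'
open ftp.ncbi.nih.gov
cd genomes
ls
bye
EOF
cat "$workdir/ncbi.lftp"

# Run it only when lftp is available and RUN_LFTP=1 is set (needs network access)
if command -v lftp >/dev/null 2>&1 && [ "${RUN_LFTP:-0}" = "1" ]; then
    lftp -f "$workdir/ncbi.lftp"
else
    echo "dry run: would execute lftp -f $workdir/ncbi.lftp"
fi
```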


NCBI ftp site:

Connect to the NCBI ftp site:
lftp ftp.ncbi.nih.gov

The prompt will change to:
lftp ftp.ncbi.nih.gov:/>

After '>' you can type in a command and hit Enter:
lftp ftp.ncbi.nih.gov:/> ls

The ftp site can also be accessed through a web browser

ls command: list files and folders

Where are the bacterial genomes in the ftp site?

The end of the page after ls

cd ne
Then press the Tab key to auto-complete or list

How to transfer files between a Linux and a Windows machine?
Use the SSH Secure File Transfer client

Open the software
Hit Enter
Put IP address (10.157.217.4)
Put username
Hit Connect
Choose Yes
Put password
Hit OK


If transferring from local to remote: locate your file and drag it to the right
If transferring from remote to local: locate your file and drag it to the left


Transfer files between two Linux machines (or Mac and Linux)

scp: secure copy files/folders between hosts on a network

You are at a Linux or Mac machine, e.g. your laptop with Ubuntu installed, and you want to copy some file from ser

Open a terminal on your machine

scp [email protected]:/home/yyin/Unix_and_Perl_course/Data/Arabidopsis/At_genes.gff .

scp username@IP:/path .

You will be asked for your password on ser


wget is a program useful for downloading files from both FTP and HTTP sites.

wget is non-interactive: you simply enter the necessary options and arguments on the command line and the file is downloaded for you.

You must identify the links first: browse an http web page or an ftp site and locate the remote files/folders you want to download, and then go to the terminal and type

wget -q ftp.ncbi.nih.gov/blast/db/FASTA/yeast.aa.gz

wget -r -q ftp://ftp.ncbi.nih.gov/genomes/archive/old_refseq/Bacteria/Escherichia_coli_K_12_substr__MG1655_uid57779

wget -q ftp.ncbi.nih.gov:/blast/executables/LATEST/ncbi-blast-2.2.27+-x64-linux.tar.gz

wget ftp://emboss.open-bio.org/pub/EMBOSS/emboss-latest.tar.gz

wget options:
-q    quiet
-r    recursive (for folders)

It takes time to download. Put & at the end of the command line to put the job in the background.
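The effect of the trailing & can be seen without downloading anything. In this sketch a one-second sleep stands in for a slow wget; the status file name is made up for the example.

```shell
workdir=$(mktemp -d)

# Stand-in for a slow download such as "wget -q <url> &":
# the commands in parentheses run in the background because of the trailing &
( sleep 1; echo "download finished" > "$workdir/status" ) &
bg_pid=$!   # $! holds the process ID of the most recent background job

echo "started background job with PID $bg_pid; the shell is free for other commands"

# wait blocks until the background job is done (useful in scripts)
wait "$bg_pid"
cat "$workdir/status"
```

In an interactive terminal, the jobs command lists background jobs that are still running.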


Archive and compress files/folders

To save disk space, we can compress large files if we do not intend to use them for a while. A lot of files downloaded from the web are compressed and need to be uncompressed before any processing can take place.

Common compressed formats:
• gzip (gz)
    gzip my_file (compresses file my_file, producing its compressed version, my_file.gz)
    gzip -d my_file.gz (decompresses my_file.gz, producing its original version my_file)
• bzip2
    bzip2 my_file (compresses file my_file, producing its compressed version, my_file.bz2)
    bunzip2 my_file.bz2 (decompresses my_file.bz2, producing its original version my_file)

zless to view gzip-compressed files
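A full gzip round trip, as a minimal sketch on a made-up file of 1000 numbers. Because zless is interactive, the example peeks into the compressed file with gzip -dc instead, which writes the decompressed content to the screen and leaves the .gz file in place.

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 1000 > my_file

# Compress: my_file is replaced by my_file.gz
gzip my_file
ls

# Peek at the content without decompressing on disk
gzip -dc my_file.gz | head -n 3

# Decompress: my_file.gz is replaced by the original my_file
gzip -d my_file.gz
ls
```

bzip2 and bunzip2 follow the same pattern, with a .bz2 suffix.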


Common compressed formats (continued):
• zip
    zip my_files.zip my_file1 my_file2 my_file3 (create a compressed archive called my_files.zip, containing three files: my_file1, my_file2, my_file3)
    zip -r my_files.zip my_file1 my_dir (if my_dir is a directory, create an archive my_files.zip containing the file my_file1 and the directory my_dir with all its content)
    unzip -l my_files.zip (list contents of the zip archive my_files.zip)
    unzip my_files.zip (decompress the archive into the constituent files and directories)
• tar
    tar -cvf my_files.tar my_file1 my_file2 my_dir (create an archive called my_files.tar, containing files my_file1, my_file2 and the directory my_dir with all its content; plain tar bundles files without compressing them)
    tar -tvf my_files.tar (list contents of the tar archive my_files.tar)
    tar -xvf my_files.tar (unpack the archive into the constituent files and directories)

Use man tar to learn more
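The three tar operations (create, list, extract) can be tried on made-up files. In this sketch the -C option, which is not on the slide, tells tar to unpack into another folder so the originals are not overwritten.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Made-up content to archive
mkdir my_dir
echo one > my_file1
echo two > my_file2
echo three > my_dir/inner.txt

# c = create, v = verbose (print each name), f = archive file name follows
tar -cvf my_files.tar my_file1 my_file2 my_dir

# t = list the table of contents
tar -tvf my_files.tar

# x = extract; -C unpacked extracts into the folder "unpacked"
mkdir unpacked
tar -xvf my_files.tar -C unpacked
ls -R unpacked
```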


Common compressed formats (continued):
• tgz (also tar.gz; essentially a combo of "tar" and "gzip")
    tar -czvf my_files.tgz my_file1 my_file2 my_dir (create a compressed archive called my_files.tgz, containing files my_file1, my_file2 and the directory my_dir with all its content)
    tar -tzvf my_files.tgz (list contents of the archive my_files.tgz)
    tar -xzvf my_files.tgz (decompress the archive into the constituent files and directories)
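Adding z to the tar options compresses with gzip in the same step. A short sketch on made-up files, checking with cmp that the unpacked copy is identical to the original:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Made-up content: numbers compress well
mkdir my_dir
seq 1 5000 > my_file1
seq 1 5000 > my_dir/inner.txt

# Create a gzip-compressed tar archive, then list its contents
tar -czf my_files.tgz my_file1 my_dir
tar -tzf my_files.tgz

# Unpack into a separate folder and compare with the original
mkdir unpacked
tar -xzf my_files.tgz -C unpacked
cmp my_file1 unpacked/my_file1 && echo "round trip OK"

# Compare the archive size with the size of the original file
ls -l my_file1 my_files.tgz
```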


Wget the book materials of Unix and Perl Primer for Biologists
http://korflab.ucdavis.edu/Unix_and_Perl/

mkdir book
cd book
wget http://korflab.ucdavis.edu/Unix_and_Perl/current.zip
unzip current.zip

Unpack the EMBOSS package

cd
mkdir tools
cd tools
mv ../emboss-latest.tar.gz .
tar -zxf emboss-latest.tar.gz &


Check disk usage

Disk space is a limited resource, and you want to frequently monitor how much disk space you have used. To check the disk space usage for a folder, use the du (disk usage) command

yyin@ser:~$ du -hs .
318M .
yyin@ser:~$ du -hs Unix_and_Perl_course/
131M Unix_and_Perl_course/

To check how much space is left on the entire storage file system, use the df command
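Both commands can be tried on a throwaway folder. This sketch creates a made-up file so du has something to measure; the sizes printed will differ from machine to machine.

```shell
workdir=$(mktemp -d)
seq 1 100000 > "$workdir/numbers.txt"

# -h = human-readable sizes (K, M, G), -s = one summary line per argument
du -hs "$workdir"

# df reports used and free space for the file system that holds the folder
df -h "$workdir"
```

Running df -h without an argument lists every mounted file system.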


- Save history of your commands:
history > hist1
less hist1

- Send a message to other online users
write username (Ctrl+c to exit)

- Change your password
passwd

Ctrl+c to tell the shell to stop the current process
Ctrl+z to suspend
bg to send to the background
Ctrl+d to exit the terminal (logout)

