Proceedings of the 10th USENIX Security Symposium

Date post: 17-Jan-2022
USENIX Association. Proceedings of the 10th USENIX Security Symposium, Washington, D.C., USA, August 13–17, 2001. THE ADVANCED COMPUTING SYSTEMS ASSOCIATION. © 2001 by The USENIX Association. All Rights Reserved. For more information about the USENIX Association: Phone: 1 510 528 8649; FAX: 1 510 548 5738; Email: [email protected]; WWW: http://www.usenix.org. Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein.
Timing Analysis of Keystrokes and Timing Attacks on SSH*

Dawn Xiaodong Song, David Wagner, Xuqing Tian
University of California, Berkeley

Abstract

SSH is designed to provide a secure channel between two hosts. Despite the encryption and authentication mechanisms it uses, SSH has two weaknesses: First, the transmitted packets are padded only to an eight-byte boundary (if a block cipher is in use), which reveals the approximate size of the original data. Second, in interactive mode, every individual keystroke that a user types is sent to the remote machine in a separate IP packet immediately after the key is pressed, which leaks the inter-keystroke timing information of users' typing. In this paper, we show how these seemingly minor weaknesses result in serious security risks.

First we show that even very simple statistical techniques suffice to reveal sensitive information such as the length of users' passwords or even root passwords. More importantly, we further show that by using more advanced statistical techniques on timing information collected from the network, the eavesdropper can learn significant information about what users type in SSH sessions. In particular, we perform a statistical study of users' typing patterns and show that these patterns reveal information about the keys typed. By developing a Hidden Markov Model and our key sequence prediction algorithm, we can predict key sequences from the inter-keystroke timings. We further develop an attacker system, Herbivore, which tries to learn users' passwords by monitoring SSH sessions. By collecting timing information on the network, Herbivore can speed up exhaustive search for passwords by a factor of 50. We also propose some countermeasures.

In general our results apply not only to SSH, but also to a general class of protocols for encrypting interactive traffic. We show that timing leaks open a new set of security risks, and hence caution must be taken when designing this type of protocol.

*This research was supported in part by the Defense Advanced Research Projects Agency under DARPA contract N6601-99-28913 (under supervision of the Space and Naval Warfare Systems Center San Diego) and by the National Science Foundation under grants FD99-79852 and CCR-0093337.

1 Introduction

Just a few years ago, people commonly used astonishingly insecure networking applications such as telnet, rlogin, or ftp, which simply pass all confidential information, including users' passwords, in the clear over the network. This situation was aggravated through broadcast-based networks that were commonly used (e.g., Ethernet), which allowed a malicious user to eavesdrop on the network and to collect all communicated information [CB94, GS96].

Fortunately, many users and system administrators have become aware of this issue and have taken countermeasures. To curb eavesdroppers, security researchers designed the Secure Shell (SSH), which offers an encrypted channel between the two hosts and strong authentication of both the remote host and the user [Ylo96, SSL01, YKS+00b]. Today, SSH is quite popular, and it has largely replaced telnet and rlogin.

Many users believe that they are secure against eavesdroppers if they use SSH. Unfortunately, in this paper we show that despite state-of-the-art encryption techniques and advanced password authentication protocols [YKS+00a], SSH connections can still leak significant information about sensitive data such as users' passwords. This problem is particularly serious because it means users may have a false confidence of security when they use SSH.

In particular we identify two seemingly minor weaknesses of SSH that lead to serious security risks. First, the transmitted packets are padded only to an eight-byte boundary (if a block cipher is in use). Therefore an eavesdropper can easily learn the approximate length of the original data. Second, in interactive mode, every individual keystroke that a user types is sent to the remote machine in a separate IP packet immediately after the key is pressed (except for some meta keys such as Shift or Ctrl). We show in the paper that this property can enable the eavesdropper to learn the exact length of users' passwords. More importantly, as we have verified, the time it takes the operating system to send out the packet after the key press is in general negligible compared to the inter-keystroke timing. Hence an eavesdropper can learn the precise inter-keystroke timings of users' typing from the arrival times of packets.

Experience shows that users' typing follows stable patterns.¹ Many researchers have proposed to use the duration of key strokes and latencies between key strokes as a biometric for user authentication [GLPS80, UW85, LW88, LWU89, JG90, BSH90, MR97, RLCM98, MRW99]. A more challenging question which has not yet been addressed in the literature is whether we can use timing information about key strokes to infer the key sequences being typed. If we can, can we estimate quantitatively how many bits of information are revealed by the timing information? Experience seems to indicate that the timing information of keystrokes reveals some information about the key sequences being typed. For example, we may all have experienced that the elapsed time between typing the two letters “er” can be much smaller than between typing “qz”. This observation is particularly relevant to security. Since, as we show, the attacker can get precise inter-keystroke timings of users' typing in a SSH session by recording the packet arrival times, if the attacker can infer what users type from the inter-keystroke timings, then he could learn what users type in a SSH session from the packet arrival times.

In this paper we study users' keyboard dynamics and show that the timing information of keystrokes does leak information about the key sequences typed. Through more detailed analysis we show that the timing information leaks about 1 bit of information about the content per keystroke pair. Because the entropy of passwords is only 4–8 bits per character, this 1 bit per keystroke pair information can reveal significant information about the content typed. In order to use inter-keystroke timings to infer keystroke sequences, we build a Hidden Markov Model and develop an n-Viterbi algorithm for the keystroke sequence inference. To evaluate the effectiveness of the attack, we further build an attacker system, Herbivore, which monitors the network and collects timing information about keystrokes of users' passwords. Herbivore then uses our key sequence prediction algorithm for password prediction. Our experiments show that, for passwords that are chosen uniformly at random with length of 7 to 8 characters, Herbivore can reduce the cost of password cracking by a factor of 50 and hence speed up exhaustive search dramatically. We also propose some countermeasures to mitigate the problem.

We emphasize that the attacks described in this paper are a general issue for any protocol that encrypts interactive traffic. For concreteness, we study primarily SSH, but these issues affect not only SSH 1 and SSH 2, but also any other protocol for encrypting typed data.

¹ In this paper we only consider users who are familiar with keyboard typing and use touch typing.

The outline of this paper is as follows. In Section 2 we discuss in more detail the vulnerabilities of SSH and various simple techniques an attacker can use to learn sensitive information such as the length of users' passwords and the inter-keystroke timings of users' typed passwords. In Section 3 we present our statistical study on users' typing patterns and show that inter-keystroke timings reveal about 1 bit of information per keystroke pair. In Section 4 we describe how we can infer key sequences using a Hidden Markov Model and an n-Viterbi algorithm. In Section 5 we describe the design, development and evaluation of an attacker system, Herbivore, which learns users' passwords by monitoring SSH sessions. We propose countermeasures to prevent these attacks in Section 7, and conclude in Section 8.

2 Eavesdropping SSH

The Secure Shell SSH [SSL01, YKS+00b] is used to encrypt the communication link between a local host and a remote machine. Despite the use of strong cryptographic algorithms, SSH still leaks information in two ways:

• First, the transmitted packets are padded only to an eight-byte boundary (if a block cipher is in use), which leaks the approximate size of the original data.

• Second, in interactive mode, every individual keystroke that a user types is sent to the remote machine in a separate IP packet immediately after the key is pressed (except for some meta keys such as Shift or Ctrl). Because the time it takes the operating system to send out the packet after the key press is in general negligible compared to the inter-keystroke timing (as we have verified), this also enables an eavesdropper to learn the precise inter-keystroke timings of users' typing from the arrival times of packets.

[Figure 1: The traffic signature associated with running su in a SSH session. The numbers in the figure are the size (in bytes) of the corresponding packet payloads.]

The first weakness poses some obvious security risks. For example, when one logs into a remote site R in SSH, all the characters of the initial login password are batched up, padded to an eight-byte boundary if a block cipher is in use, encrypted, and transmitted to R. Due to the way padding is done, an eavesdropper can learn one bit of information on the initial login password, namely, whether it is at least 7 characters long or not. The second weakness can lead to some potential anonymity risks since, as many researchers have found previously, inter-keystroke timings can reveal the identity of the user [GLPS80, UW85, LW88, LWU89, JG90, BSH90, MR97, RLCM98, MRW99].
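To make the padding leak concrete, the following is a minimal sketch of how the size of a packet on the wire follows from the size of its payload. It assumes the SSH 1.x binary packet layout (a 4-byte length field, 1 to 8 bytes of padding, a 1-byte packet type, a 4-byte string-length prefix, the payload, and a 4-byte CRC). That layout is an assumption of this sketch and is not spelled out in the text; the function name `ssh1_packet_size` is hypothetical.

```python
def ssh1_packet_size(payload_len: int) -> int:
    """Bytes on the wire for one SSH 1.x packet carrying a single string.

    Assumed layout (not taken from this paper): 4-byte length field,
    1-8 bytes of padding, then type (1) + string length (4) + payload + CRC (4).
    """
    body = 1 + 4 + payload_len + 4      # the part that gets padded
    padding = 8 - (body % 8)            # always between 1 and 8 bytes
    return 4 + padding + body


# A single keystroke, the "Password: " prompt, and two password lengths.
for label, n in [("keystroke", 1), ("prompt", len("Password: ")),
                 ("6-char password", 6), ("7-char password", 7)]:
    print(label, ssh1_packet_size(n))
```

Under these assumptions a one-character keystroke and the ten-character prompt come out at 20 and 28 bytes, the sizes shown in Figure 1, and login passwords of 6 and 7 characters fall on opposite sides of a padding boundary.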

In this section, we show several simple and practical attacks exploiting these two weaknesses. In particular, an attacker can identify which transmitted packets correspond to keystrokes of sensitive data such as passwords in a SSH session. Using this information, the attacker can easily find out the exact length of users' passwords and even the precise inter-keystroke timings of the typed passwords. Learning the exact length of users' passwords allows eavesdroppers to target users with short passwords. Learning the inter-keystroke timing information of the typed passwords allows eavesdroppers to infer the content of the passwords, as we will show in Sections 3 and 4.

Traffic Signature Attack. We can often exploit properties of applications to identify which packets correspond to the typing of a password. Consider, for instance, the su command. Assume the user has already established a SSH connection from local host A to remote host B. When the user types the command su in the established SSH connection A → B, we obtain a peculiar traffic signature as shown in Figure 1. If the SSH session uses SSH 1.x² and a block cipher such as DES for the encryption [NBS77, NIS99], as is common, then the local host A sends three 20-byte packets: “s”, “u”, “Return”. The remote host B echoes the “s” and “u” in two 20-byte packets and sends a 28-byte packet for the “Password: ” prompt. Then A sends 20-byte packets, one for each of the password characters, without receiving any echo data packets. B then sends some final packets containing the root prompt if su succeeds, otherwise some failure messages. Thus by checking the traffic against this “su” signature, the attacker can identify when the user issues the su command and hence learn which packets correspond to the password keystrokes. Note that similar techniques can be used to identify when users type passwords to authenticate to other applications such as PGP [Zim95] in a SSH session.

² The attack also works when SSH 2.x is in use. Only the packet sizes are slightly different.
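The signature check described above can be sketched as a simple pattern match over observed packet directions and payload sizes. This is only an illustration: the `Packet` representation, the exact interleaving in `SU_PREFIX`, and the function name are assumptions of this sketch, and a real trace would also contain acknowledgements and other traffic that would have to be filtered out first.

```python
from typing import List, Optional, Tuple

# A packet is (direction, payload_size): "c" = client A to server B, "s" = B to A.
Packet = Tuple[str, int]

# Opening of the su signature for SSH 1.x with a block cipher, as read from the
# text and Figure 1: "s", echo, "u", echo, Return, then the 28-byte prompt.
# The exact interleaving is an assumption of this sketch.
SU_PREFIX: List[Packet] = [("c", 20), ("s", 20), ("c", 20),
                           ("s", 20), ("c", 20), ("s", 28)]


def find_su_password_packets(trace: List[Packet]) -> Optional[List[int]]:
    """Return indices of the un-echoed 20-byte client packets after the prompt."""
    n = len(SU_PREFIX)
    for start in range(len(trace) - n + 1):
        if trace[start:start + n] != SU_PREFIX:
            continue
        run = []
        i = start + n
        # Password keystrokes: client packets with no server echo in between.
        while i < len(trace) and trace[i] == ("c", 20):
            run.append(i)
            i += 1
        if run:
            return run
    return None


trace = SU_PREFIX + [("c", 20)] * 6 + [("s", 36)]   # hypothetical capture
hits = find_su_password_packets(trace)
print(hits, "password length:", len(hits) - 1)       # last packet is Return
```

The timestamps of the returned packets would then give the inter-keystroke timings of the password.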

Multi-User Attack. Even more powerful attacks exist when the attacker also has an account on the remote machine that the user is logging into through SSH. For example, the process status command ps can list all the processes running on a system. This allows the attacker to observe each command that any user is running. Again, if the user is running any command that requires a password input (such as su or pgp) the attacker can identify the packets corresponding to the password keystrokes.

Nested SSH Attack. Assume the user has already established a SSH session between the local host A and remote host B. Then the user wants to open another SSH session from B to another remote host C as shown in Figure 2. In this case, the user's password for C is transmitted, one keystroke at a time, across the SSH-encrypted link A → B from the user to B, even though the SSH client on machine B patiently waits for all characters of the password before it sends them all in one packet to host C for authentication (as designed in the SSH protocol [YKS+00a]). It is easy to identify such a nested SSH connection using techniques developed by Zhang and Paxson [ZP00b, ZP00a]. Hence in this case the eavesdropper can easily identify the packets corresponding to the user's password on link A → B, and from this learn the length and the inter-keystroke timings of the user's password on host C.

[Figure 2: The nested SSH attack.]

3 Statistical Analysis of Inter-keystroke Timings

As a first study towards inferring key sequences from timing information, we develop techniques for statistical analysis of the inter-keystroke timings. In this section, we first describe how we collect training data and show some simple timing characteristics of character pairs. We then show how we model the inter-keystroke timing of a given character pair as a Gaussian distribution. We then describe how to estimate quantitatively the amount of information about the character pair that one can learn using the inter-keystroke timing information. Denote the set of character pairs of interest as $Q$, and let $|Q|$ denote the cardinality of the set $Q$.

3.1 Data Collection

The two keystrokes of a pair of characters $(k_a, k_b)$ generate four events: the press of $k_a$, the release of $k_a$, the press of $k_b$, and the release of $k_b$. However, because only key presses (not key releases) trigger packet transmission, an eavesdropper can only learn timing information about the key-press events. Since the main focus of our study is the scenario where an adversary learns timing information on keystrokes by simply monitoring the network, we focus only on key-press events. The time difference between two key presses is called the latency between the two keystrokes. We also use the term inter-keystroke timing to refer to the latency between two keystrokes.

In order to characterize how much information is leaked by inter-keystroke timings, we have performed a number of empirical tests to measure the typing patterns of real users. Because passwords are probably the most sensitive data that a user will ever type, we focus only on information revealed about passwords (rather than other forms of interactive traffic).

Our focus on passwords creates many challenges. Passwords are entered very differently from other text: passwords are typed frequently enough that, for many users, the keystroke pattern is memorized and often typed almost without conscious thought. Furthermore, well-chosen passwords should be random and have little or no structure (for instance, they should not be based on dictionary words). As a consequence, naive measurements of keystroke timings will not be representative of how users type passwords unless great care is taken in the design of the experimental methodology.

Our experimental methodology is carefully designed to address these issues. Due to security and privacy considerations, we chose not to gather data on real passwords; therefore, we have chosen a data collection procedure intended to mimic how users type real passwords. A conservative method is to pick a random password for the user (where each character of the password is chosen uniformly at random from a set of 10 letter keys and 5 number keys, independently of all other characters in the password), have the user practice typing this password many times without collecting any measurements, and then measure inter-keystroke timing information on this password once the user has had a chance to practice it at length.

However, we found that, when the goal is to try to identify potentially relevant timing properties (rather than verify conjectured properties), this conservative approach is inefficient. In particular, users typically type passwords in groups of 3–4 characters, with fairly long pauses between each group. This distorts the digraph statistics for the pair of characters that spans the group boundary and artificially inflates the variance of our measurements. As a result we would need to collect a great deal of data for many random passwords before this effect would average out. In addition, it takes quite a while for users to become familiar with long random passwords. This makes the conservative approach a rather blunt tool for understanding inter-keystroke statistics.

Fortunately, there is a less costly way to gather inter-keystroke timing statistics: we gather training data on each pair of characters $(k_a, k_b)$ as typed in isolation. We pick a character pair and ask the user to type this pair 30–40 times, returning to the home row each time between repetitions. For each user, we repeat this for many possible pairs (142 pairs, in our experiments) and we gather data on inter-keystroke timings for each such pair. We collected the latency of each character pair measurement and computed the mean value and the standard deviation. In our experience, this gives better results.

[Figure 3: The distribution of inter-keystroke timings for two sample character pairs. Left panel: inter-keystroke timing for v-o (milliseconds); right panel: inter-keystroke timing for v-b (milliseconds). The y-axis of each panel is frequency.]

As an example, Figure 3 shows the latency histogram of two sample character pairs. The left model corresponds to the latency between the pair (v, o), and the right model corresponds to (v, b). We can see that the latency between (v, o) is clearly shorter than the latency between (v, b), and the latency distributions of these two sample character pairs are almost entirely non-overlapping.

The optimized data collection approach gives us a more efficient way to study fine-grained details of inter-keystroke statistics without requiring collecting an enormous amount of data. We used data collected in this way to quickly identify plausible conjectures, develop potential attacks, and to train our attack models. As far as we are aware, collecting data on keystroke pairs in isolation does not seem to bias the data in any obvious way. Nonetheless, we also validate all our results using the conservative measurement method (see Section 5).

3.2 Simple Timing Characteristics

Next, we divide the test character pairs into five categories, based on whether they are typed using the same hand, the same finger, and whether they involve a number key:

• Two letter keys typed with alternating hands, i.e., one with left hand and one with right hand;

• Two characters containing one letter key and one number key typed with alternating hands;

• Two letter keys, both typed with the same hand but with two different fingers;

• Two letter keys typed with the same finger of the same hand;

• Two characters containing one letter key and one number key, both typed with the same hand.

[Figure 4: Inter-keystroke timings for character pairs in five different categories (two letter keys, alternating hands; a letter and a number, alternating hands; two letters, same hand, different fingers; two letters, same finger; a letter and a number, same hand). The x-axis is latency in milliseconds, split into the bins < 100, 100-150, 150-200, 200-250, 250-300, and > 300; the y-axis is the ratio of character pairs. Note that some bars at some positions disappear because the corresponding height is zero.]

Figure 4 shows the histogram of latency distribution of character pairs for each category. We split the whole latency range into six bins as shown on the x-axis. Within each category, we put each character pair into the corresponding bin if its mean latency value is within the range of the bin. Each bar in the histogram of a category represents the ratio of the number of character pairs in the associated bin over the total number of character pairs in the category.³ We can see that all the character pairs that are typed using two different hands take less than 150 milliseconds, while pairs typed using the same hand and particularly the same finger take substantially longer. Character pairs that alternate between one letter key and one number key, but are typed using the same hand, take the longest time to type. This is simply because two hands offer a certain amount of parallelism, while character pairs typed with one hand require a certain degree of sequential movements and hence tend to take longer. This is especially obvious in the case of one letter and one number pairs typed using one hand. They in general require more hand movement and hence the longest time.⁴

³ Hence the sum of all bars within one category is 1.
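The binning step described above can be sketched as follows. The bin edges come from the x-axis of Figure 4; how a latency that falls exactly on an edge is assigned, the function name, and the sample latencies are assumptions made only for illustration.

```python
from bisect import bisect_right
from collections import Counter

BIN_EDGES = [100, 150, 200, 250, 300]          # milliseconds, from Figure 4
BIN_LABELS = ["< 100", "100-150", "150-200", "200-250", "250-300", "> 300"]


def latency_histogram(mean_latencies_ms):
    """Fraction of character pairs of one category falling into each bin."""
    counts = Counter(BIN_LABELS[bisect_right(BIN_EDGES, m)]
                     for m in mean_latencies_ms)
    total = len(mean_latencies_ms)
    return {label: counts[label] / total for label in BIN_LABELS}


# Hypothetical mean latencies (ms) of four character pairs in one category.
hist = latency_histogram([80, 120, 140, 310])
print(hist)
```

Because each pair lands in exactly one bin, the values for a category sum to 1, as the footnote states.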

So, if the attacker observes a character pair typed with latency more than 150 milliseconds, he can guess with high probability of success that the character pair is not typed using two different hands and hence can infer about 1 bit of information about the content of the character pair. Because the 142 character pairs are formed from randomly selected letter keys and number keys, they seem likely to form a representative sample of the whole keyboard. Hence this simple classification extends to the whole keyboard, and already indicates that the inter-keystroke timing leaks substantial information about what is typed.

The properties described above are unlikely to be exhaustive. For instance, earlier work on timing attacks on multi-user machines suggested that inter-keystroke timings may additionally reveal which characters in the password are upper-case [Tro98].

⁴ Note that here we only consider users that use touch typing.

3.3 Gaussian Modeling

From the plot of the latency distribution of a given character pair, such as the ones shown in Figure 3, we can see that the latency between the two key strokes of a given character pair forms a Gaussian-like unimodal distribution. Hence a natural assumption (which is confirmed by our empirical observations) is that the probability of the latency $y$ between two keystrokes of a character pair $q \in Q$, $\Pr[y \mid q]$, forms a univariate Gaussian distribution $\mathcal{N}(\mu_q, \sigma_q)$, meaning

$$\Pr[y \mid q] = \frac{1}{\sqrt{2\pi}\,\sigma_q}\, e^{-\frac{(y-\mu_q)^2}{2\sigma_q^2}},$$

where $\mu_q$ is the mean value of the latency for character pair $q$ and $\sigma_q$ is the standard deviation. Given a set of training data $\{(q_i, y_i)\}_{1 \le i \le N}$, where $q_i$ is the $i$-th character pair and $y_i$ is the corresponding latency in the data collection, we can derive the parameters $\{(\mu_q, \sigma_q)\}_{q \in Q}$ based on maximum likelihood estimation, i.e., we compute the mean and the standard deviation for each character pair.
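The parameter estimation above amounts to a per-pair mean and standard deviation. A minimal sketch, assuming the training data is available as `(pair, latency_ms)` tuples; the function names and the sample values are hypothetical:

```python
import math
from collections import defaultdict


def fit_gaussians(samples):
    """Maximum-likelihood (mu, sigma) per character pair.

    samples: iterable of (pair, latency_ms) tuples.
    """
    by_pair = defaultdict(list)
    for pair, y in samples:
        by_pair[pair].append(y)
    params = {}
    for pair, ys in by_pair.items():
        mu = sum(ys) / len(ys)
        # The maximum-likelihood variance divides by N, not N - 1.
        var = sum((y - mu) ** 2 for y in ys) / len(ys)
        params[pair] = (mu, math.sqrt(var))
    return params


def gaussian_pdf(y, mu, sigma):
    """Density Pr[y | q] of the univariate Gaussian defined above."""
    return math.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)


# Made-up latencies in milliseconds for two pairs.
data = [(("v", "o"), 90), (("v", "o"), 110), (("v", "b"), 200), (("v", "b"), 240)]
params = fit_gaussians(data)
print(params)
```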

Figure 5 shows the estimated Gaussian models of the latencies of the 142 character pairs. Our empirical result shows that most of the latencies of the character pairs lie between 50 and 250 milliseconds. The average of the standard deviation of the 142 character pairs is about 30 milliseconds. The graph also indicates that the latency distributions of the character pairs severely overlap, which means the inference of character pairs using just latency information is a challenging task.

[Figure 5: Estimated Gaussian distributions of all 142 character pairs collected from a user. The x-axis is latency (milliseconds); the y-axis is probability.]

[Figure 6: Entropy and information gain as a function of the inter-keystroke latency. (a) Entropy of character pairs given a latency observation; the x-axis is latency (milliseconds), the y-axis is entropy (bits). (b) Information gain induced by a latency observation; the x-axis is latency (milliseconds), the y-axis is information gain (bits).]

3.4 Information Gain Estimation

We would like to estimate quantitatively how much information the latency information reveals about the character pairs typed. This will be an upper bound of how much information an attacker can extract from the timing information using any particular method. We estimate it by computing the information gain induced by the latency information. If we select a character pair uniformly at random from the character-pair space, and if the attacker does not get any additional information, the entropy of the probability distribution of character pairs to the attacker is

$$H_0[q] = -\sum_{q \in Q} \Pr[q] \log_2 \Pr[q] = \log_2 |Q|.$$

If the attacker learns the latency $y_0$ between the two keystrokes of the character pair, the estimated entropy of the probability distribution of character pairs to the attacker is

$$H_1[q \mid y = y_0] = -\sum_{q \in Q} \Pr[q \mid y_0] \log_2 \Pr[q \mid y_0],$$

where

$$\Pr[q \mid y_0] = \frac{\Pr[y_0 \mid q]\,\Pr[q]}{\sum_{q' \in Q} \Pr[y_0 \mid q']\,\Pr[q']},$$

and $\Pr[y_0 \mid q]$ is computed using the Gaussian distribution obtained in the parameter estimation phase in the previous subsection. The information gain induced by the observation of latency $y_0$ is the difference between the two entropies, $H_0[q] - H_1[q \mid y = y_0]$. Using the parameter estimation of the 142 character pairs obtained in the previous section, we can compute $H_1[q \mid y = y_0]$ and $H_0[q] - H_1[q \mid y = y_0]$ as shown in Figure 6(a) and Figure 6(b).

The estimated information gain, also called mutual information, is

$$I[q; y] = H_0[q] - H_1[q \mid y] = H_0[q] - \int \Pr[y_0]\, H_1[q \mid y = y_0]\, dy_0,$$

where $\Pr[y_0] = \sum_{q \in Q} \Pr[y_0 \mid q]\,\Pr[q]$. From the numerical computation we obtain $I[q; y] \approx 1.2$.
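The numerical computation of the mutual information can be sketched as below, assuming a uniform prior over the pairs and fitted `(mu, sigma)` parameters per pair. The integration range, the step size, the midpoint rule, and all names are choices of this sketch rather than details given in the text, and the toy parameters are not the authors' data.

```python
import math


def gaussian_pdf(y, mu, sigma):
    return math.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)


def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(params, lo=0.0, hi=600.0, step=0.5):
    """I[q; y] = H0 - integral of Pr[y0] * H1[q | y = y0] dy0 (midpoint rule)."""
    prior = 1.0 / len(params)                  # uniform Pr[q]
    h0 = math.log2(len(params))
    h1 = 0.0
    for i in range(int((hi - lo) / step)):
        y0 = lo + (i + 0.5) * step
        joint = [gaussian_pdf(y0, mu, s) * prior for mu, s in params.values()]
        p_y0 = sum(joint)                      # Pr[y0]
        if p_y0 == 0.0:                        # densities underflowed; no mass here
            continue
        posterior = [j / p_y0 for j in joint]  # Pr[q | y0] by Bayes' rule
        h1 += p_y0 * entropy_bits(posterior) * step
    return h0 - h1


# Toy check: two well-separated pairs leak about 1 bit, two identical ones none.
print(mutual_information({"p1": (100, 10), "p2": (400, 10)}))
print(mutual_information({"p1": (100, 10), "p2": (100, 10)}))
```

With overlapping distributions such as those in Figure 5, the result falls between these two extremes.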

This means the estimated information gain available from latency information is about 1.2 bits per character pair when the character pair has uniform distribution. Hence the attacker could potentially extract 1.2 bits of information per character pair by using the latency information in this case. Because the character pairs in our experiments are selected uniformly at random from all letter and number keys, we expect that they will be representative of the whole keyboard. Intuitively, Figure 5 is a sufficiently large random sampling of a much denser graph containing the latency distributions of all possible character pairs. More detailed analysis shows that the estimated information gain computed using 142 sample character pairs is a good estimate of the information gain when the character-pair space includes all letter and number character pairs. This estimate is comparable to the back-of-the-envelope calculation in Section 3.2 based on our classification into five categories of keystroke pairs.

Because the entropy of written English is so low (about 0.6–1.3 bits per character [Sha50]), the 1.2-bit information gain per character pair leaked through the latency information seems to be significant.⁵ For example, we can expect that users' PGP passphrases will often contain only 1 bit of entropy per character. Hence the latency information may reveal significant information about PGP passphrases.

The information gain curve in Figure 6(b) shows a convex shape. Note that latencies greater than 175 milliseconds are relatively rare; however, whenever we see such a long time between keystrokes, we learn a lot of information about what was typed, because there are not many possibilities that would lead to such a large latency. The character pairs that take longer than 175 milliseconds to type are mostly pairs containing number keys or pairs typed with one finger. Hence this analysis suggests that passwords containing number keys or character pairs that are typed with one finger are particularly vulnerable to such timing attacks.

Another interesting observation is that the mean of the standard deviations of the character pairs is only about 30 milliseconds as shown in our experiments, while the standard deviation of round-trip time on the Internet in many cases is less than 10 milliseconds [Bel93]. Therefore even when the attacker is far from the SSH client host, he can still get sufficiently precise inter-keystroke timing information. This makes the timing attack even more severe.

4 Inferring Character Sequences From Inter-Keystroke Timing Information

In this section, we describe how we can infer character sequences using the latency information. In particular, we model the relationship of latencies and character sequences as a Hidden Markov Model [RN95]. We extend the standard Viterbi algorithm to an n-Viterbi algorithm that outputs the n most likely candidate character sequences. We further estimate how many bits of information about the real character sequence this algorithm extracts from the latency information and show it is nearly optimal.

⁵ Note that the 1.2-bit information gain is estimated for the case of randomly selected passwords where the sequence of characters has a uniform distribution. However, this is not the case for texts. More careful calculation is needed to estimate the information gain in the case of natural text.

4.1 Hidden Markov Model

In general, a Markov Model is a way of describing a finite-state stochastic process with the property that the probability of transitioning from the current state to another state depends only on the current state, not on any prior state of the process [RN95]. In a Hidden Markov Model (HMM), the current state of the process cannot be directly observed. Instead, some outputs from the state are observed, and the probability distribution of possible outputs given the state is dependent only on the state. Using a HMM, one can infer information about the prior path the process has taken from the sequence of observed outputs of the states, and efficient algorithms are known for working with HMMs. Because of this, HMMs have been widely used in areas such as speech recognition and text modeling.

In our setting, we consider each character pair of interest as a hidden (non-observable) state, and the latency between the two keystrokes of the character pair as the output observation from the character-pair state. Each state corresponds to a pair of characters, so that the typing of a character sequence $K_0, \ldots, K_T$ is a process that goes through $T$ states, $q_1, \ldots, q_T$, where $q_t$ ($1 \le t \le T$) represents the $t$-th character pair $(K_{t-1}, K_t)$ typed. Let $y_t$ ($1 \le t \le T$) denote the observed latency of state $q_t$. Then we model the typing of a character sequence as a HMM. This means we make two assumptions. First, the probability of transition from the current state to another state is only dependent on the current state, not on the prior path of the process. If the character sequence is a password chosen uniformly at random, this assumption obviously holds. In the case of text, this assumption does not hold strictly, but experience in speech recognition and text modeling shows that some extensions to HMM still work well [RN95]. Second, the probability distribution of the latency observation is only dependent on the current character pair and not on any previous characters in the sequence. This assumption might hold for some cases and not for other cases where the typing of previous characters changes the position of the hand and influences the typing of later character pairs. However, making this assumption makes our analysis and inference algorithm much simpler and still gives good results as shown by the experiments. Hence, we use a HMM to model the typing of character sequences as shown in Figure 7.
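A small sketch of the state space may help: since state $q_t$ is the pair $(K_{t-1}, K_t)$, consecutive states must share a key, so a state $(a, b)$ can only be followed by a state $(b, c)$. The helper names and the toy alphabet below are hypothetical, and whether $Q$ contains every ordered pair (including repeated keys) is an assumption of this sketch.

```python
from itertools import product


def build_states(alphabet):
    """All ordered character pairs over the alphabet (assumed to form Q)."""
    return [(a, b) for a, b in product(alphabet, repeat=2)]


def successors(state, alphabet):
    """States that can follow (a, b): every (b, c), since q_t and q_{t+1} share K_t."""
    _, b = state
    return [(b, c) for c in alphabet]


def to_states(keys):
    """Map a typed sequence K_0..K_T to its T hidden states q_1..q_T."""
    return [(keys[t - 1], keys[t]) for t in range(1, len(keys))]


print(to_states("abc"))                # two states for three keys
print(successors(("a", "b"), "ab"))    # only pairs starting with "b"
```

For a password whose characters are chosen uniformly at random, each of the successors of a state is equally likely, which is why the first assumption holds in that case.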

Figure 7: A representation of a trace of an HMM. Each vertical slice represents a time step. In each time slice, the top node $q_t$ is a variable representing a character pair, and the bottom node $y_t$ is the observable variable denoting the latency between the two keystrokes.

As in the previous section, we assume the set of possible character pairs is $Q$, hence the set of possible states in the HMM is $Q$. We assume that the probability of the latency $y$ of a character pair $q$, $\Pr[y \mid q]$ ($q \in Q$), is a Gaussian distribution $\mathcal{N}(\mu_q, \sigma_q)$, where the parameters $\{(\mu_q, \sigma_q)\}_{q \in Q}$ are obtained using maximum likelihood estimation.
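The maximum likelihood estimates for a Gaussian are the sample mean and the (biased, divide-by-N) sample standard deviation. The sketch below illustrates this per character pair; the function names and the input format, a list of (pair, latency in ms) tuples, are assumptions made for illustration and do not come from the paper.

```python
import math
from collections import defaultdict


def fit_latency_gaussians(samples):
    """Return {pair: (mu, sigma)} using maximum likelihood estimates.

    `samples` is an iterable of (pair, latency_ms) tuples (hypothetical format).
    """
    by_pair = defaultdict(list)
    for pair, latency in samples:
        by_pair[pair].append(latency)

    params = {}
    for pair, xs in by_pair.items():
        mu = sum(xs) / len(xs)
        # The MLE of the variance divides by N, not N - 1.
        var = sum((x - mu) ** 2 for x in xs) / len(xs)
        params[pair] = (mu, math.sqrt(var))
    return params


def log_likelihood(y, mu, sigma):
    """Log of the Gaussian density Pr[y | q] for a pair with parameters (mu, sigma)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (y - mu) ** 2 / (2 * sigma ** 2)
```

Working with log-likelihoods rather than raw densities avoids underflow when many per-pair probabilities are multiplied along a sequence.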

4.2 The n-Viterbi Algorithm for Character Sequence Inference

Given an observation $\vec{y} = (y_1, y_2, \ldots, y_T)$, a sequence of latencies of some character sequence from a user's typing, we would like to infer the real character sequence that the user has typed. For each possible character sequence $\vec{q} = (q_1, q_2, \ldots, q_T)$, we can compute how likely the character sequence is given the observation, namely $\Pr[\vec{q} \mid \vec{y}]$. The probability $\Pr[\vec{q} \mid \vec{y}]$ essentially gives a ranking for the candidate character sequence $\vec{q}$: the higher $\Pr[\vec{q} \mid \vec{y}]$ is, the more likely $\vec{q}$ is the real character sequence. We use $\vec{q}^*$ to denote the most-likely sequence, which is the sequence that corresponds to the highest value of $\Pr[\vec{q} \mid \vec{y}]$ over all possible $\vec{q}$ with regard to a given $\vec{y}$.
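Under the two HMM assumptions of Section 4.1, the ranking quantity factorizes over time steps. The identity below is the standard HMM factorization written in the notation above; the initial-state prior $\Pr[q_1]$ is not named explicitly in the text and is an assumption of this sketch.

```latex
\Pr[\vec{q} \mid \vec{y}] \;\propto\;
  \Pr[q_1]\,\Pr[y_1 \mid q_1]\,
  \prod_{t=2}^{T} \Pr[q_t \mid q_{t-1}]\,\Pr[y_t \mid q_t]
```

Because the normalizing constant $\Pr[\vec{y}]$ is the same for every candidate $\vec{q}$, it can be ignored when candidates are only being ranked.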

The Viterbi algorithm is widely used for finding the most likely sequence of states given a sequence of observations in HMM problems [RN95]. A naive way of computing $\vec{q}^*$ would compute $\Pr[\vec{q} \mid \vec{y}]$ for all possible $\vec{q}$, and hence requires $O(|Q|^T)$ running time. The Viterbi algorithm uses dynamic programming to achieve a running time complexity of $O(|Q|^2 T)$.

In our setting, because the latency distributions of different character pairs highly overlap, the probability that the most likely sequence is the right sequence will be very low. Hence, instead of just computing the most likely sequence, we need to compute the $n$ most likely sequences and hope the real sequence will be among them with high probability for $n$ greater than a certain threshold. We therefore extend the standard Viterbi algorithm to an n-Viterbi algorithm that outputs the $n$ most-likely sequences with running time complexity $O(n|Q|^2 T)$. We give a detailed description of the n-Viterbi algorithm in Appendix A.

4.3 How to Estimate the Effectiveness of the n-Viterbi Algorithm

We would like to estimate how big the threshold $n$ has to be such that the real character sequence will be among the $n$ most-likely sequences with sufficiently high probability. In an experiment, if the real character sequence appears in the $n$ most-likely sequences, we say the experiment is a success with regard to the threshold $n$; otherwise, it is a failure. The probability of success so defined is a function of $n$. It is easy to see that the function is monotonically increasing with regard to $n$. If the success probability is already high for a small $n$, the algorithm is very effective because it filters out most of the sequences, and hence one only needs to try a small set of candidates before finding the real sequence. On the other hand, if we need a high threshold $n$ to get a sufficiently high success probability, then the algorithm is less effective: one would need to try many more candidates before finding the real sequence. Note that from Section 3.4 we see that the timing information reveals about 1.2 bits of information per character pair. For the case of a random password of length $T + 1$, which forms $T$ consecutive character pairs, the latency information could reveal approximately $1.2T$ bits of information about the real password sequence. Hence this is an upper bound on the effectiveness of the algorithm to infer character sequences using latency information. We would like to estimate how close our algorithm is to this upper bound.

First, we look at the simple case when $T = 1$. Given a latency observation $y$ of a character pair $q$, we compute the probability $\Pr[q' \mid y]$ ($q' \in Q$) and select the $n$ most-likely character pairs $\Phi = \{q_{j_1}, \ldots, q_{j_n}\}$. We would like to compute the probability that the real character pair $q$ is in the set $\Phi$ over all possible values of $y$. To simplify the numerical computation, we approximate the result by assuming that all the Gaussian distributions have the same standard deviation $\sigma$. This is a good approximation of the real experiment: as we see in Figure 5, most key pairs have a standard deviation between 25 and 35 milliseconds.

Figure 8: The probability that the n-Viterbi algorithm outputs the correct password before the first $n$ guesses, graphed as a function of $n$. [Plot "Probability of Success vs. Threshold n": x-axis Threshold n (0 to 140), y-axis Probability of Success (0.2 to 1), with curves for s.d. = 25, s.d. = 30, and s.d. = 35.]

Figure 8 graphs the probability that the real character pair appears within the $n$ most-likely character pairs against the threshold $n$. The top curve is for $\sigma = 25$, the middle curve is for $\sigma = 30$, and the bottom curve is for $\sigma = 35$. Using the middle curve, we get that when $n = 70$ the probability of success is 90%, meaning that with 90% probability, the real character pair appears in the 70 most-likely sequences output by the n-Viterbi algorithm. Let us denote the threshold corresponding to the 90% success probability as $n^*$. Thus $\log_2(|Q|/n^*) \approx 1$ is the approximate number of bits of information per character pair the algorithm extracts. Note that from the previous section we see that the latency information reveals about 1.2 bits of information per character pair. Hence our n-Viterbi algorithm is near-optimal.
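The $T = 1$ estimate described above can be approximated by simulation. The following is a minimal Monte Carlo sketch, assuming a uniform prior over character pairs and a shared standard deviation, so that ranking pairs by posterior probability is the same as ranking them by distance $|y - \mu_q|$. The 140 pair means are randomly generated placeholders, not the paper's measurements, and the function names are hypothetical.

```python
import math
import random


def success_probability(means, sigma, n, trials=5000, seed=0):
    """Estimate Pr[true pair is among the n most likely pairs] for T = 1."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        true_idx = rng.randrange(len(means))
        y = rng.gauss(means[true_idx], sigma)
        # With equal sigma and a uniform prior, a pair is more likely than
        # the true pair exactly when its mean is strictly closer to y.
        d_true = abs(y - means[true_idx])
        closer = sum(1 for mu in means if abs(y - mu) < d_true)
        if closer < n:
            hits += 1
    return hits / trials


def bits_extracted(num_pairs, n_star):
    """Approximate bits per character pair, log2(|Q| / n*)."""
    return math.log2(num_pairs / n_star)


# Hypothetical mean latencies (ms) for 140 character pairs.
_rng = random.Random(1)
means = [_rng.uniform(50.0, 250.0) for _ in range(140)]

for n in (10, 35, 70, 140):
    print(n, success_probability(means, sigma=30.0, n=n))
```

The resulting curve depends entirely on how the means are spread, so it should not be expected to reproduce Figure 8; it only illustrates how such a curve is computed and why it increases monotonically with $n$.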

In the case of uniformly randomly chosen passwords of length $T + 1$, the number of bits of information the algorithm can extract is approximately $T \cdot \log_2(|Q|/n^*) \approx T$, which is close to the optimal value of $1.2T$ bits.

5 Building Herbivore and Timing Attacks on SSH

To evaluate the effectiveness of our timing attacks on SSH, we built an attacker program that we call Herbivore. In this section, we describe the experimental results of using Herbivore to learn users' passwords.

Figure 9: The Herbivore architecture. [Diagram labels: hosts A, B, and C; SSH1; SSH2 password; eavesdrop; Herbivore; HMM; n-Viterbi; Candidate Passwords.]

5.1 Herbivore Preying for Passwords

We built an attacker engine, Herbivore, as shown in Figure 9. It monitors the network and collects the arrival times of packets. Using the technique described in Section 2, Herbivore infers which packets correspond to the user's SSH passwords when the user opens an SSH session to another host within an established SSH connection. Herbivore then measures the inter-arrival times between packets containing the password characters and uses our n-Viterbi algorithm to generate a list of candidate passwords. The candidate passwords are sorted in decreasing order of the probability $\Pr[\vec{q} \mid \vec{y}]$, and in our experiments we record the position of the real password in the candidate list. We report the position of the password as a percentage: with $m$ possible passwords in total, if the real password appears at position $u$ in the ordered candidate list, we say the real password appears at the top $(100u/m)\%$. This gives a natural way to quantify the effectiveness of our approach.
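The reporting metric can be written down directly. This small sketch assumes the candidate list enumerates all $m$ possible passwords in decreasing order of probability; the function name is hypothetical.

```python
def rank_percentage(candidates, real_password):
    """Return 100 * u / m, where u is the 1-based position of the real password."""
    u = candidates.index(real_password) + 1
    m = len(candidates)
    return 100.0 * u / m
```

A smaller value means the attacker has to try a smaller fraction of the password space before reaching the real password.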

5.2 Optimization for Long Character Sequences

The complexity of the n-Viterbi algorithm is linear in the number $n$ of candidates it outputs. As the length of the password grows, the space of possible passwords grows exponentially. If the n-Viterbi algorithm can only rule out a constant fraction of the password space, $n$ would also grow exponentially as the password length grows. Hence the algorithm might be inefficient when the password is long. In particular, we observed that memory usage can grow substantially for longer passwords.

Also, and more importantly, we observed in the experiments that users tend to type long passwords in segments of 3 to 5 letters and pause between the segments. If we use the timing between the segments for the prediction, it might bias our predictions, since such pauses are typically noticeably longer than most other inter-keystroke latencies. Fortunately, this large difference means that pauses between groups of password characters can be clearly identified before we apply the n-Viterbi algorithm.

Figure 10: The percentage of the password space tried by Herbivore in 10 tests before finding the right password. [Plot: x-axis Test Number (1 to 10), y-axis Ranking Percentage of the correct answer in output list (%).]

Hence, to reduce the bias and to reduce the memory requirements of the algorithm, we break the timing information of the password into segments containing 3 or 4 latency intervals. We use each segment to form an HMM and then at the end combine the results from the different segments to form the candidate password ordering.
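One way the segmentation and recombination steps could look is sketched below. The text does not say how pauses are detected or how segment results are combined, so both choices here are assumptions: pauses are taken to be latencies at or above a hypothetical 500 ms cutoff, and segments are treated as independent so that their log-probabilities add. Both function names are illustrative.

```python
import heapq


def split_at_pauses(latencies, pause_ms=500.0):
    """Split a latency sequence into segments at long pauses.

    The pause latency itself is dropped (assumed cutoff: pause_ms).
    """
    segments, current = [], []
    for y in latencies:
        if y >= pause_ms:
            if current:
                segments.append(current)
            current = []
        else:
            current.append(y)
    if current:
        segments.append(current)
    return segments


def combine_segments(segment_candidates, n):
    """Merge per-segment candidate lists into the n best full candidates.

    `segment_candidates` holds, for each segment, a list of
    (log_prob, text) pairs. Segments are assumed independent.
    """
    combined = [(0.0, "")]
    for cands in segment_candidates:
        combined = heapq.nlargest(
            n,
            ((lp1 + lp2, s1 + s2) for lp1, s1 in combined for lp2, s2 in cands),
        )
    return combined
```

Pruning to the best $n$ after each segment is safe when every segment list already holds its own top $n$: any overall top-$n$ candidate must be built from segment candidates that are themselves within the top $n$.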

5.3 Experimental Results for Password Inference for a Single User

We measure the effectiveness of our n-Viterbi algorithm at cracking passwords through empirical measurements. In our experiment, we use training data compiled from isolated key pairs to train the HMM. Then, we pick a random password for the user. We have the user use this password to authenticate to another SSH session within an established SSH session as shown in Figure 9, and we apply our n-Viterbi algorithm to simulate an attack on this password. Note that we have the test user type the password many times before the test to ensure familiarity with the password, and we try to deduce the user's password using training data from the same user.

All passwords are selected uniformly at random from the character space as in the experiment in Section 3, so they contain no structure. Recovering such passwords is the hardest case for the attacker, so if timing analysis can recover information in such a scenario, we can expect that timing analysis will be an even greater threat in settings where passwords are chosen less carefully.

Figure 11: A comparison of two users' typing patterns. The "diamond" symbols show the mean values of the latencies of one user, with an error bar indicating one standard deviation. The "x" symbols indicate the mean values of the latencies of another user. [Plot: x-axis Key Pair (0 to 100), y-axis Elapsed Time (millisecond, 0 to 400).]

We performed tests for 10 different passwords, each of length 8. Figure 10 shows the percentage positions of the real password in the ordered candidate lists output by the n-Viterbi algorithm. For example, 0.3% means that the real password appeared at the top 0.3% position in the output candidate list. These experiments indicate that on average the real password is located within the top 2.7% of the candidate ranking list. The median position is about 1%, so about half the time the password will be in the top 1% of the list of candidates produced by our n-Viterbi algorithm. Therefore, in order to crack the password, Herbivore only needs to test 1/50 times as many passwords as brute-force search, on average.

The 50× reduction in workfactor compared to exhaustive search corresponds to a total of 5.7 bits of information learned per password using the latency information. This is close to the information gain analysis in Sections 3 and 4, which predicted a gain of about 1 bit per keystroke pair: recall that the passwords in this test are of length 8, so each password contains 7 keystroke pairs. We attribute the difference to minor variation between the distributions of inter-keystroke timings in random passwords and the distribution of timings for character pairs typed in isolation.

| Training Set | Test Set | Password 1 | Password 2 | Password 3 | Password 4 | Password 5 |
|---|---|---|---|---|---|---|
| User 1 | User 1 | 15.6% | 0.7% | 2.0% | 1.3% | 1.6% |
| User 1 | User 2 | 62.3% | 15.2% | 7.0% | 14.8% | 0.3% |
| User 1 | User 3 | 6.4% | N/A | 1.8% | 3.1% | 4.2% |
| User 1 | User 4 | 1.9% | 31.4% | 1.1% | 0.1% | 28.8% |
| User 2 | User 1 | 4.9% | 1.3% | 1.6% | 12.3% | 3.1% |
| User 2 | User 2 | 30.8% | 15.0% | 2.8% | 3.7% | 2.9% |
| User 2 | User 3 | 4.7% | N/A | 5.3% | 6.7% | 38.4% |
| User 2 | User 4 | 0.7% | 16.8% | 3.9% | 0.6% | 5.4% |

Table 1: Success rates for password inference with multiple users. The numbers are the percentage of the search space the attacker has to search before he finds the right password.

For ease of testing, our experiments were on passwords with a reduced set of possible characters. However, we can expect these results to carry over to passwords chosen from the full set of possible characters. Assuming that the information gain available from inter-keystroke timing information is about 1 bit per character pair even when we extend to the whole keyboard, we expect to see this 50 times reduction in workfactor for passwords of length 7–8 even when the passwords are chosen randomly from all letter and number keys. This 50× reduction can make password cracking more practical. For example, for a password containing randomly-selected lower-case letter keys and number keys, without timing information, the attacker would need to try $36^8/2$ candidate passwords on average before he finds the right one. Benchmarks indicate that an 840 MHz Pentium III can check about 250,000 candidate passwords per second in an off-line dictionary attack. Thus, exhaustive search would take about 65 PC-days to crack a password composed of randomly-selected lower-case letter keys and number keys. If the attacker uses the timing information, the computation can be done in 1.3 days, which makes the crack 50× more practical.
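The work-factor arithmetic above can be checked with a few lines. All inputs are taken from the text (36-character alphabet, length 8, 250,000 guesses per second, 50× reduction); the variable names are illustrative only.

```python
import math

ALPHABET_SIZE = 36        # lower-case letters plus digits
PASSWORD_LENGTH = 8
GUESSES_PER_SECOND = 250_000
SPEEDUP = 50              # work-factor reduction reported in the text
SECONDS_PER_DAY = 86_400

# On average, half the password space is searched before a hit.
expected_guesses = ALPHABET_SIZE ** PASSWORD_LENGTH / 2
brute_force_days = expected_guesses / GUESSES_PER_SECOND / SECONDS_PER_DAY
timing_days = brute_force_days / SPEEDUP

# A 50x reduction expressed as bits of information about the password.
bits_learned = math.log2(SPEEDUP)

print(f"brute force: {brute_force_days:.1f} days")
print(f"with timing: {timing_days:.1f} days")
print(f"bits learned: {bits_learned:.2f}")
```

The two durations come out at roughly 65 days and 1.3 days, matching the figures quoted, and $\log_2 50$ is a little over 5.6 bits, in line with the "5.7 bits" stated in the text.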

5.4 Experimental Results for Password Inference for Multiple Users

One potential weakness in our simulations is that real-world attackers might not be able to get as much training data from the victim for the statistical analysis as we had available in our experiments. However, we argue next that this is unlikely to pose an effective defense against timing attacks: there are other ways that attackers can obtain the training data required for the attack.

One simple observation is that the attacker can easily get his own typing statistics, or the typing statistics of a co-conspirator. Hence it is important to evaluate how well the password inference techniques perform when using one person's typing statistics to infer passwords typed by another person.

In this experiment, we collected the typing statistics of two users, User 1 and User 2. An interesting result is that 75% of the character pairs take about the same latency to type for both users: in other words, the difference between the average latencies of the two users for such character pairs is smaller than one standard deviation. Similarly, the simple timing characteristics reported in Section 3.2 (e.g., key pairs typed with alternate hands tend to have much lower inter-keystroke latency than key pairs typed with the same hand) were observed to be essentially user-independent. This suggests that typing statistics have a large component that is common across a broad user population and which thus can be exploited by attackers even in the absence of any training data from the victim.

To test this hypothesis further, we had four users (including Users 1 and 2 from our previous experiments) type the same set of five randomly-selected passwords. Passwords 1 and 2 have length 8, Passwords 3 and 4 have length 7, and Password 5 has length 6. Herbivore then runs the n-Viterbi algorithm using the typing statistics from Users 1 and 2 separately to infer passwords typed by the four test users. Table 1 shows the percentage position at which the real passwords occurred in the output candidate ranking list, which is the percentage of the password space the attacker has to search before he finds the right password. User 3 did not type Password 2, so that entry is not available.

This experiment shows several interesting results:

• Unsurprisingly, inferring a user's password can in general be done somewhat more effectively if one uses training data from the same user rather than training data from other users.

• The distance between the typing statistics of two users can vary significantly according to how one chooses the pair of users. A user $U_a$'s typing pattern might be more similar to user $U_b$'s than to user $U_c$'s. Thus it can give better results to use $U_b$'s training data than $U_c$'s training data to infer passwords typed by $U_a$. This experiment shows that, in general, using User 1's training data gives a better result for inferring passwords typed by User 3 than using User 2's training data, while User 2's training data gives a better inference for passwords typed by User 4 than User 1's training data.

• Most importantly, this experiment shows that training data from one user can be successfully applied to infer passwords typed by another user. Hence the attack can be effective even when the attacker does not have typing statistics from the victim.

5.5 Extensions

We expect that Herbivore could also be used to infer information about text or commands that users type. The entropy of written English is very low (about 0.6–1.3 bits per character [Sha50]) in comparison to the amount of information leaked by inter-keystroke timings (about 1 bit of information per key pair; see Section 3). However, mounting such an attack would appear to require better models of written text [RN95]. In any case, we have not studied such a scenario in our experiments, and we leave this for future work.

6 Related Work

Timing analysis has previously been used by Kocher to attack cryptosystems [Koc95]. Trostle exploited a similar idea, showing how a malicious user on a multi-user workstation can gain information about other users' passwords using CPU timings [Tro98]. We expect our Hidden Markov Model techniques might find applications in Trostle's threat model as well.

Most recently, other researchers have independently pointed out the possibility of timing attacks on SSH [DS01]. Some of their observations reveal additional weaknesses in SSH. For instance, they noted that the SSH 1.x protocol reveals the exact length of passwords, because ciphertexts contain a length field sent in the clear (SSH 2 does not have this problem); they discussed how to deal with the presence of backspace characters; and they initiated an investigation of the impact of timing attacks on other session data (such as shell commands typed in the SSH session).

7 Countermeasures

Although SSH provides an encrypted and authenticated link between the local host and the remote machine, an eavesdropper can still learn information about typed keystrokes due to two weaknesses of SSH. First, every individual keystroke that a user types is sent to the remote machine in an individual IP packet (except for meta keys such as Shift and Ctrl); second, as soon as command output is available on the remote machine, it is sent to the local host in one or multiple IP packets, leaking information on the approximate size of the output. We have shown in this paper how these seemingly minor weaknesses lead to severe real-world attacks.

Note that in our traffic signature attack, the attacker can tell that the user is typing passwords because there are no echo packets. One way to fix this problem is for the server, when it detects that echo mode is turned off, to return dummy packets (which the client ignores) whenever it receives keystroke packets from the client. This fix can reduce the effectiveness of the traffic signature attack but could fail against other attacks, such as our nested SSH attack, where the attacker can guess when the user is typing his password by simply monitoring the network connections. This fix does not prevent leakage of inter-keystroke timing information, though.

To prevent the attacks, we need to prevent the leakage of the timing information of the keystrokes. One naive approach might be to modify SSH so that upon receiving a keystroke with latency less than $\eta$ milliseconds from the previous keystroke, the program will delay the packet by a random amount of up to $\eta$ milliseconds. Because our experiment indicates that the spectrum of the latency between two keystrokes of continuous typing is between 0 and 500 milliseconds, we could set $\eta = 500$ for example, and such a random delay would randomize the timing information of the keystrokes. Such a random delay imposes an overhead of about 250 milliseconds on average. Unfortunately, if the attacker can monitor the same user login many times and compute the average of the latencies of the password sequences, he can reduce the effectiveness of the randomized noise. For example, if the attacker can get the timing information of a user's SSH authentication 50 times, the noise contributed by the random delay is only about 20–40 milliseconds. So we should not use this method.
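The averaging argument can be illustrated numerically. This sketch simplifies the scheme above by assuming every keystroke packet is delayed independently by a Uniform(0, η) amount, so the noise added to one inter-keystroke latency is the difference of two such delays (standard deviation η/√6). Averaging over k logins then shrinks the noise by a factor of √k. The function name and simulation parameters are hypothetical.

```python
import math
import random
import statistics


def averaged_noise_std(eta_ms, k, trials=2000, seed=0):
    """Empirical std-dev of the mean of k noise terms on one latency.

    Each noise term is the difference of two independent Uniform(0, eta)
    delays, one per keystroke packet (a simplifying assumption).
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        noise = [rng.uniform(0, eta_ms) - rng.uniform(0, eta_ms) for _ in range(k)]
        means.append(statistics.fmean(noise))
    return statistics.pstdev(means)


eta, k = 500.0, 50
# Closed form: std of the difference is eta / sqrt(6); averaging divides by sqrt(k).
analytic = eta / math.sqrt(6 * k)
print(f"analytic: {analytic:.1f} ms, simulated: {averaged_noise_std(eta, k):.1f} ms")
```

Under these assumptions the residual noise after 50 observations is close to 29 ms, which falls within the 20–40 ms range quoted above and is comparable to the per-pair standard deviations that the HMM already tolerates.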

A better way to prevent leakage of inter-keystroke timing information is to send traffic at a constant rate, one packet every $\lambda$ milliseconds, when the link is active. Choosing $\lambda$ presents a tradeoff between usability and overhead: increasing $\lambda$ reduces the dummy traffic but causes longer latency for the user. Assume, for example, that we set $\lambda = 50$ milliseconds. Since the latency between two keystrokes is usually greater than 50 milliseconds and the network delay is already at least in the tens of milliseconds, this may be a reasonable tradeoff between communication overhead and additional delay. In such a scenario, the SSH client would always send a data packet every 50 milliseconds. Assuming 64-byte packets (40 bytes for IP and TCP headers, and 24 bytes for SSH data), the communication overhead is 1280 bytes/second, which can even fit in low-bandwidth connections, such as modem connections. If no real data needs to be sent, the client will send dummy traffic which the remote machine ignores.⁶ If the user types multiple keys in a single time period, the keystrokes are buffered and sent together in the next scheduled packet. While this method prevents the eavesdropper from learning timing information about keystrokes typed at the client side, it does not prevent information leakage from the size of response packets from the remote machine. Hence the server side would also need to send response traffic at a constant packet rate similar to the client side.
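The buffering behaviour and the overhead figure can be sketched as follows. This is an offline simulation under stated assumptions, not an SSH client: keystroke times are given in milliseconds, a packet is emitted on every tick, and an empty batch stands for a dummy packet. The function names and the idle-timeout-free schedule are illustrative.

```python
def constant_rate_schedule(key_times_ms, period_ms=50, end_ms=None):
    """Return [(send_time_ms, keys_in_packet)] for a fixed-period sender.

    Keystrokes are buffered and flushed at the next tick; an empty list
    represents a dummy packet.
    """
    key_times = sorted(key_times_ms)
    if end_ms is None:
        # Run until the first tick strictly after the last keystroke's period.
        end_ms = (key_times[-1] // period_ms + 1) * period_ms if key_times else period_ms
    packets, i = [], 0
    tick = period_ms
    while tick <= end_ms:
        batch = []
        while i < len(key_times) and key_times[i] <= tick:
            batch.append(key_times[i])
            i += 1
        packets.append((tick, batch))
        tick += period_ms
    return packets


def overhead_bytes_per_second(packet_bytes=64, period_ms=50):
    """Bandwidth used by one fixed-size packet every period_ms milliseconds."""
    return packet_bytes * 1000 / period_ms
```

Because the send times depend only on the clock and never on the keystroke times, an eavesdropper observing packet timing alone sees the same pattern regardless of what is typed. With 64-byte packets every 50 ms, the overhead works out to the 1280 bytes/second stated above.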

8 Conclusion

In this paper, we identified several serious security risks in SSH due to two weaknesses of SSH. First, the transmitted packets are padded only to an eight-byte boundary (if a block cipher is in use), which reveals the approximate size of the original data. Second, in interactive mode, every individual keystroke that a user types is sent to the remote machine in a separate IP packet immediately after the key is pressed (except for some meta keys such as Shift or Ctrl), which leaks the inter-keystroke timings of users' typing. We showed that these two weaknesses reveal a surprising amount of information on passwords and other text typed over SSH sessions (about 1 bit of information per character pair in the case of randomly chosen passwords). This suggests that SSH is not as secure as commonly believed.

The lessons we learned and the techniques we developed in this paper apply to a general class of protocols that aim to provide secure channels between machines. We show that timing information opens a new set of risks, and we recommend that developers take care when designing these types of protocols.

Acknowledgement

We would like to thank Adrian Perrig for his great help through all phases of the project. We are indebted to Kris Hildrum, Doantam Phan and Robert Johnson for their help in the testing phase. We would also like to thank Eric Xing for discussions on statistical techniques. Finally we would like to thank Nikita Borisov, Monica Chew, Kris Hildrum, Robert Johnson, and Solar Designer for their helpful comments on the paper.

⁶ If after a certain timeout (e.g., $10\lambda$) there is still no real data to send, the client can consider the current link inactive and stop sending dummy traffic until it has data to send again. The timeout period provides a tradeoff between security and efficiency.

References

[Bel93] Steven M. Bellovin. Packets found on an internet. Computer Communications Review, 23(3):26–31, July 1993.

[BSH90] S. Bleha, C. Slivinsky, and B. Hussein. Computer-access security systems using keystrokes dynamics. In IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-12, volume 12, December 1990.

[CB94] William R. Cheswick and Steven M. Bellovin. Firewalls and Internet Security – Repelling the Wily Hacker. Professional Computing Series. Addison-Wesley, 1994. ISBN 0-201-63357-4.

[DS01] Solar Designer and Dug Song. Passive analysis of SSH (secure shell) traffic. Openwall advisory OW-003, March 2001.

[GLPS80] R. Gaines, W. Lisowski, S. Press, and N. Shapiro. Authentication by keystroke timing: Some preliminary results. Technical Report Rand report R-256-NSF, Rand Corporation, 1980.

[GS96] Simson Garfinkel and Gene Spafford. Practical UNIX & Internet Security. O'Reilly & Associates, 1996.

[JG90] Rick Joyce and Gopal Gupta. Identity authentication based on keystroke latencies. Communications of the ACM, 33(2):168–176, February 1990.

[Koc95] P. Kocher. Cryptanalysis of Diffie-Hellman, RSA, DSS, and other cryptosystems using timing attacks. In Advances in Cryptology, CRYPTO '95, pages 171–183. Springer-Verlag, 1995.

[LW88] G. Leggett and J. Williams. Verifying identity via keystroke characteristics. International Journal of Man-Machine Studies, 28(1):67–76, 1988.

[LWU89] G. Leggett, J. Williams, and D. Umphress. Verification of user identity via keystroke characteristics. Human Factors in Management Information Systems, 1989.

[MR97] Fabian Monrose and Avi Rubin. Authentication via keystroke dynamics. In Proceedings of the 4th ACM Conference on Computer and Communications Security, pages 48–56, April 1997.

[MRW99] F. Monrose, M. K. Reiter, and S. Wetzel. Password hardening based on keystroke dynamics. In Proceedings of the 6th ACM Conference on Computer and Communications Security, November 1999.

[NBS77] National Bureau of Standards. Specification for the Data Encryption Standard. Federal Information Processing Standards Publication 46 (FIPS PUB 46), January 1977.

[NIS99] U.S. National Institute of Standards and Technology (NIST). Data Encryption Standard (DES). Draft Federal Information Processing Standards Publication 46-3 (FIPS PUB 46-3), January 1999.

[RLCM98] J. A. Robinson, V. M. Liang, J. A. Chambers, and C. L. MacKenzie. Computer user verification using login string keystroke dynamics. IEEE Transactions on System, Man, and Cybernetics, 28(2), 1998.

[RN95] Stuart Russell and Peter Norvig. Artificial Intelligence, A modern approach. Prentice Hall, 1995.

[Sha50] Claude E. Shannon. Prediction and Entropy of Printed English. Bell Sys. Tech. J (3), 1950.

[SSL01] IETF Secure Shell Working Group (SECSH). http://www.ietf.org/html.charters/secsh-charter.html, 2001.

[Tro98] Jonathan Trostle. Timing attacks against trusted path. In IEEE Symposium on Security and Privacy, 1998.

[UW85] D. Umphress and J. Williams. Identity verification through keyboard characteristics. International Journal of Man-Machine Studies, 23(3):263–273, 1985.

[YKS+00a] T. Ylönen, T. Kivinen, M. Saarinen, T. Rinne, and S. Lehtinen. SSH authentication protocol. Internet Draft, Internet Engineering Task Force, May 2000. Work in progress.

[YKS+00b] T. Ylönen, T. Kivinen, M. Saarinen, T. Rinne, and S. Lehtinen. SSH protocol architecture. Internet Draft, Internet Engineering Task Force, May 2000. Work in progress.

[Ylö96] Tatu Ylönen. SSH – Secure Login Connections over the Internet. In Sixth USENIX Security Symposium, San Jose, California, July 1996.

[Zim95] Philip R. Zimmermann. The Official PGP User's Guide. MIT Press, Cambridge, MA, USA, 1995. ISBN 0-262-74017-6.

[ZP00a] Yin Zhang and Vern Paxson. Detecting backdoors. In Proc. of 9th USENIX Security Symposium, August 2000.

[ZP00b] Yin Zhang and Vern Paxson. Detecting stepping stones. In Proc. of 9th USENIX Security Symposium, August 2000.

A The n-Viterbi Algorithm

The Viterbi algorithm is widely used in solving HMM problems. Given an observation $(y_1, \ldots, y_T)$ of an HMM, the Viterbi algorithm inductively computes the most likely sequence $(q_1, q_2, \ldots, q_t)$ that generated the observation for each $t = 1, 2, \ldots, T$. Let $S(q_t)$ be the most likely sequence at time $t$ that ends with state $q_t$, with corresponding posterior probability $V(q_t)$. The Viterbi algorithm starts with

$$S(q_1) = q_1 \quad \text{and} \quad V(q_1) = \Pr[q_1 \mid y_1],$$

and computes

$$V(q_t) = \max_{q_{t-1}} \Pr[y_t \mid q_t]\, \Pr[q_t \mid q_{t-1}]\, V(q_{t-1}).$$

Then we let $q_{t-1}$ be the state that maximizes the above expression and define $S(q_t)$ to be $S(q_{t-1}) \,|\, q_t$, that is, $S(q_{t-1})$ extended by $q_t$. The final result of the Viterbi algorithm is the most likely sequence of states for the given sequence of observations.
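The recurrence above can be sketched in a few lines. This version works with log-probabilities to avoid underflow and scores paths by the unnormalized joint probability, which ranks sequences the same way as the posterior. The function signature and the dictionary-based inputs are assumptions made for illustration, not the authors' implementation.

```python
def viterbi(obs, states, log_prior, log_trans, log_emit):
    """Return (path, log_score) of the most likely state sequence.

    log_prior[q] and log_trans[p][q] are log-probabilities;
    log_emit(q, y) returns log Pr[y | q].
    """
    # V[q]: best log-score of any path ending in q; S[q]: that path.
    V = {q: log_prior[q] + log_emit(q, obs[0]) for q in states}
    S = {q: [q] for q in states}
    for y in obs[1:]:
        new_V, new_S = {}, {}
        for q in states:
            # Pick the predecessor that maximizes the recurrence for q.
            prev = max(states, key=lambda p: V[p] + log_trans[p][q])
            new_V[q] = V[prev] + log_trans[prev][q] + log_emit(q, y)
            new_S[q] = S[prev] + [q]
        V, S = new_V, new_S
    best = max(states, key=lambda q: V[q])
    return S[best], V[best]
```

For clarity the sketch stores whole paths, mirroring the $S(q_t)$ formulation; a production version would normally keep back-pointers instead so that the path bookkeeping does not add to the $O(|Q|^2 T)$ cost.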

We extend the Viterbi algorithm to the n-Viterbi algorithm, which returns the $n$ most likely sequences given a sequence of observations. Figure 12 shows a diagram of the n-Viterbi algorithm. At each time slice $t$, we associate a list with each possible state node that keeps track of the $n$ most likely sequences that lead to the state at that time slice.

Let $S_n(q_t)$ denote the set of the $n$ most likely sequences ending with state $q_t$ at time $t$, with corresponding posterior probabilities $V_n(q_t)$. At time $t = 1$, we initialize the n-Viterbi algorithm in the same way as the Viterbi algorithm,

$$S_n(q_1) = \{q_1\} \quad \text{and} \quad V_n(q_1) = \Pr[q_1 \mid y_1].$$

For time $t$, we let

$$V_n(q_t) = \operatorname{nmax} \{\, \Pr[y_t \mid q_t]\, \Pr[q_t \mid q_{t-1}]\, v \;:\; q_{t-1} \in Q,\ v \in V_n(q_{t-1}) \,\}$$

where nmax denotes the set of the $n$ largest values. We let $S_n(q_t)$ be the set of $n$ highest-probability sequences corresponding to the choice of $V_n(q_t)$ above.

Except for the first and the second step, at each time slice, for each possible state, we need to go through $n \cdot |Q|$ possibilities and compute the $n$ most likely sequences that lead to that state at that time slice. Hence the complexity of the n-Viterbi algorithm is $O(n|Q|^2 T)$.
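A compact sketch of the n-Viterbi recurrence follows, again in log-space and under the same assumed input format as a standard Viterbi implementation (dictionaries of log-probabilities and an emission function). It keeps, for each state, the up-to-$n$ best (score, path) pairs and uses `heapq.nlargest` as the nmax operator. States are assumed to be mutually comparable (e.g., strings) so that ties on score can be broken.

```python
import heapq


def n_viterbi(obs, states, log_prior, log_trans, log_emit, n):
    """Return the n most likely state sequences as (log_score, path) pairs,
    sorted from most to least likely."""
    # lists[q]: up to n best (log_score, path) pairs for paths ending in q.
    lists = {q: [(log_prior[q] + log_emit(q, obs[0]), (q,))] for q in states}
    for y in obs[1:]:
        new_lists = {}
        for q in states:
            e = log_emit(q, y)
            # Extend every kept path of every predecessor p by state q.
            candidates = (
                (score + log_trans[p][q] + e, path + (q,))
                for p in states
                for score, path in lists[p]
            )
            new_lists[q] = heapq.nlargest(n, candidates)
        lists = new_lists
    # Merge the per-state lists at the final time slice.
    everything = (item for q in states for item in lists[q])
    return heapq.nlargest(n, everything)
```

Each state examines at most $n \cdot |Q|$ candidates per time slice, matching the count in the text; the heap selection and tuple copying add lower-order overhead that the sketch does not try to optimize away.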

Figure 12: A pictorial representation of the n-Viterbi algorithm. Each vertical slice represents a time step, and each node represents a possible state at a particular time slice. The list associated with each node stores the $n$ most likely sequences ending with that state up to that time slice.

