Precise Interface Identification to Improve Testing and Analysis of Web Applications

Post on 23-Feb-2016



William G.J. Halfond, Saswat Anand, and Alessandro Orso

Georgia Institute of Technology

Example Web Application

[Diagram labels: End Users; Web Server; Web Application (getQuote.jsp, buyPolicy.jsp); Initial Visit; Quote Information]

http://host/getQuote.jsp?action=doquote&car=jeep

Interface Identification

public void write(File outfile, String buffer, int length)

1. Names of parameters
2. Grouping of parameters
3. Domain information

Example Web Application

Interface Domain Constraints

action = "checkeligibility" ∧ integer(age) ∧ age < 16

action = "checkeligibility" ∧ integer(age) ∧ age ≥ 16

public void service(HttpRequest req)
 1. String aValue = req.getIP("action")
 2. if (aValue.equals("checkeligibility"))
 3.   int userAge = getNumIP("age")
 4.   if (userAge < 16)
 5.     displayErrorMsg("Too young.")
 6.   else
 7.     displayQuotePage()
 8. if (aValue.equals("doquote"))
 9.   String nValue = req.getIP("name")
10.   String carType = req.getIP("type")
11.   int carYear = getNumIP("year")
12.   calculateQuote(carType, carYear)

public int getNumIP(String name)
 1. String value = getIP(name)
 2. int param = Integer.parseInt(value)
 3. return param

1. Names of parameters
2. Grouping of parameters
3. Domain information

Parameter Names: action, age, name, type, year

Groupings of Parameters:
• action
• action, age
• action, name, type, year
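The three pieces of interface information listed above can be held in a small data structure. The following is a minimal sketch, not the authors' implementation: the class names `AcceptedInterface` and `InterfaceModelDemo` are hypothetical, constraints are kept as plain text, and the groupings whose constraints are not shown on the slide are left without constraints.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model: one grouping of parameter names plus its domain constraints. */
final class AcceptedInterface {
    final Set<String> names;
    final List<String> constraints; // constraints kept as plain text in this sketch

    AcceptedInterface(List<String> names, List<String> constraints) {
        this.names = new TreeSet<>(names);
        this.constraints = new ArrayList<>(constraints);
    }
}

final class InterfaceModelDemo {
    /** The three groupings listed on the slide for the example servlet. */
    static List<AcceptedInterface> exampleInterfaces() {
        List<AcceptedInterface> result = new ArrayList<>();
        // Constraints for this grouping are not given on the slide.
        result.add(new AcceptedInterface(Arrays.asList("action"),
                new ArrayList<String>()));
        // The two constraints shown on the slide both use this grouping.
        result.add(new AcceptedInterface(Arrays.asList("action", "age"),
                Arrays.asList(
                        "action = \"checkeligibility\" AND integer(age) AND age < 16",
                        "action = \"checkeligibility\" AND integer(age) AND age >= 16")));
        // Constraints for this grouping are not given on the slide.
        result.add(new AcceptedInterface(Arrays.asList("action", "name", "type", "year"),
                new ArrayList<String>()));
        return result;
    }

    /** Union of the parameter names over all groupings, sorted alphabetically. */
    static Set<String> allParameterNames(List<AcceptedInterface> interfaces) {
        Set<String> all = new TreeSet<>();
        for (AcceptedInterface i : interfaces) {
            all.addAll(i.names);
        }
        return all;
    }

    public static void main(String[] args) {
        System.out.println(allParameterNames(exampleInterfaces()));
    }
}
```

Taking the union of the groupings recovers the flat list of parameter names, which is all that a names-only analysis would report.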

Previous Approaches: Interface Identification

Dynamic

Spider:

• Web spider crawls pages of application
• Limitation: No guarantee of completeness

Static

DFW [1]:

• Identifies parameter names via static analysis
• Limitation: Only identifies parameter names

WAMDF [2]:

• Uses iterative data-flow analysis
• Limitation: Assumes all paths feasible

1. Deng, Frankl, Wang, SEN 2004.
2. Halfond and Orso, FSE 2007.

(action, age, name, type, year)

1. String aValue = req.getIP("action")
2. if (aValue.equals("checkeligibility"))
…
8. if (aValue.equals("doquote"))

4. if (userAge < 16)
5.   displayErrorMsg("Too young.")
6. else
7.   displayQuotePage()

Our Approach

Statically identify interfaces by using symbolic execution to model input parameters and domain constraining operations.

1. Program transformation
2. Symbolic execution
3. Interface identification

1 – Program Transformation

1. Introduce symbolic values

   value ← getIP(name)

   is replaced with:

   s ← new SymbolicValue()
   s.assignName(name)
   SymbolicState.add(s, value)
   return s

2. Replace domain-constraining operations

   1. Accessing an input parameter
   2. Conversion to numeric type
   3. String comparison
   4. Arithmetic constraints
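The first transformation step can be sketched in Java as follows. This is an illustration under assumptions, not the WAMSE code: `TransformedRequest` and `TransformDemo` are hypothetical names, and the extra `targetVariable` argument exists only because plain Java cannot observe which variable the result is assigned to.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Placeholder for a parameter value that is unknown at analysis time. */
final class SymbolicValue {
    private String name;

    void assignName(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return "s_" + name;
    }
}

/** Records which program variable each symbolic value was stored in. */
final class SymbolicState {
    private final Map<SymbolicValue, String> bindings = new LinkedHashMap<>();

    void add(SymbolicValue s, String variable) {
        bindings.put(s, variable);
    }

    String describe() {
        StringBuilder sb = new StringBuilder("[");
        for (Map.Entry<SymbolicValue, String> e : bindings.entrySet()) {
            if (sb.length() > 1) {
                sb.append(", ");
            }
            sb.append(e.getKey()).append(" -> ").append(e.getValue());
        }
        return sb.append("]").toString();
    }
}

/** Hypothetical stand-in for the transformed request object. */
final class TransformedRequest {
    final SymbolicState state = new SymbolicState();

    // Replacement for getIP: returns a fresh symbolic value instead of a concrete string.
    // The targetVariable argument is an assumption of this sketch.
    SymbolicValue getIP(String name, String targetVariable) {
        SymbolicValue s = new SymbolicValue();
        s.assignName(name);
        state.add(s, targetVariable);
        return s;
    }
}

final class TransformDemo {
    static String run() {
        TransformedRequest req = new TransformedRequest();
        req.getIP("action", "aValue");
        req.getIP("year", "carYear");
        return req.state.describe();
    }

    public static void main(String[] args) {
        System.out.println(run()); // [s_action -> aValue, s_year -> carYear]
    }
}
```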


2 – Symbolic Execution

Symbolically execute the transformed web application -- track path conditions and symbolic state.

[Diagram labels: Transformed Web Application (getQuote.jsp, buyPolicy.jsp); Symbolic Execution; Path Conditions; Symbolic States]

Path Conditions: c1 ∧ c2 ∧ c3; c3 ∧ c4 ∧ c5

Symbolic States: s_action → aValue; s_year → carYear

2 – Access Input Parameters

1. String aValue = req.getIP("action")

Before: (PC, SS)

After: (PC, SS[s_action → aValue])

PC = Path Condition
SS = Symbolic State

2 – String Comparison

1. String aValue = req.getIP("action")

   (PC, SS[s_action → aValue])

2. if (aValue.equals("checkeligibility"))

   TRUE: (PC ∧ s_action = "checkeligibility", SS[s_action → aValue])

   FALSE: (PC ∧ s_action ≠ "checkeligibility", SS[s_action → aValue])

8. if (aValue.equals("doquote"))
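The way a string comparison splits one path condition into a TRUE and a FALSE successor can be sketched as below. This is a simplified illustration with constraints stored as text; `PathCondition` and `BranchForkDemo` are hypothetical names, and a real engine would hand the conjuncts to a constraint solver rather than printing them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Immutable path condition: a conjunction of constraints kept as text. */
final class PathCondition {
    final List<String> conjuncts;

    PathCondition(List<String> conjuncts) {
        this.conjuncts = Collections.unmodifiableList(new ArrayList<>(conjuncts));
    }

    /** Returns a new path condition extended with one more conjunct. */
    PathCondition and(String constraint) {
        List<String> extended = new ArrayList<>(conjuncts);
        extended.add(constraint);
        return new PathCondition(extended);
    }

    @Override
    public String toString() {
        return String.join(" AND ", conjuncts);
    }
}

final class BranchForkDemo {
    /** Forks on symbol.equals(literal): index 0 is the TRUE branch, index 1 the FALSE branch. */
    static PathCondition[] forkOnEquals(PathCondition pc, String symbol, String literal) {
        String quoted = "\"" + literal + "\"";
        return new PathCondition[] {
            pc.and(symbol + " = " + quoted),
            pc.and(symbol + " != " + quoted)
        };
    }

    public static void main(String[] args) {
        PathCondition start = new PathCondition(new ArrayList<String>());
        PathCondition[] afterLine2 = forkOnEquals(start, "s_action", "checkeligibility");
        // Line 8 follows the if-block of line 2; this sketch continues on the FALSE branch.
        PathCondition[] afterLine8 = forkOnEquals(afterLine2[1], "s_action", "doquote");
        System.out.println(afterLine2[0]);
        System.out.println(afterLine8[0]);
    }
}
```

Because each fork returns new objects, the two successors never share mutable state, which keeps the branches independent as exploration continues.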


3 – Interface Identification

PC1: s_action = "checkeligibility" ∧ integer(s_age) ∧ s_age < 16

PC2: s_action = "checkeligibility" ∧ integer(s_age) ∧ s_age ≥ 16

SS: [s_action → aValue, s_age → userAge]

1. String aValue = req.getIP("action")
2. if (aValue.equals("checkeligibility"))
3.   int userAge = getNumIP("age")
4.   if (userAge < 16)
5.     displayErrorMsg("Too young.")
6.   else
7.     displayQuotePage()
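One plausible way to turn path conditions into interfaces is to group them by the set of parameter names they mention and keep each path condition as a domain constraint for that grouping. The sketch below assumes this grouping rule; it is not taken from the slides, and `Conjunct` and `InterfaceIdentificationDemo` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** One conjunct of a path condition, tagged with the input parameter it constrains. */
final class Conjunct {
    final String parameter;
    final String text;

    Conjunct(String parameter, String text) {
        this.parameter = parameter;
        this.text = text;
    }
}

final class InterfaceIdentificationDemo {
    /** Groups path conditions by the set of parameter names they mention. */
    static Map<Set<String>, List<String>> identify(List<List<Conjunct>> pathConditions) {
        Map<Set<String>, List<String>> interfaces = new LinkedHashMap<>();
        for (List<Conjunct> pc : pathConditions) {
            Set<String> names = new TreeSet<>();
            List<String> texts = new ArrayList<>();
            for (Conjunct c : pc) {
                names.add(c.parameter);
                texts.add(c.text);
            }
            if (!interfaces.containsKey(names)) {
                interfaces.put(names, new ArrayList<String>());
            }
            interfaces.get(names).add(String.join(" AND ", texts));
        }
        return interfaces;
    }

    /** Builds PC1 or PC2 from the slide, differing only in the age constraint. */
    static List<Conjunct> eligibilityPath(String ageConstraint) {
        return Arrays.asList(
                new Conjunct("action", "action = \"checkeligibility\""),
                new Conjunct("age", "integer(age)"),
                new Conjunct("age", ageConstraint));
    }

    static Map<Set<String>, List<String>> example() {
        List<List<Conjunct>> pcs = new ArrayList<>();
        pcs.add(eligibilityPath("age < 16"));  // PC1
        pcs.add(eligibilityPath("age >= 16")); // PC2
        return identify(pcs);
    }

    public static void main(String[] args) {
        for (Map.Entry<Set<String>, List<String>> e : example().entrySet()) {
            System.out.println(e.getKey() + " : " + e.getValue());
        }
    }
}
```

Under this rule PC1 and PC2 collapse into a single grouping (action, age) that carries two alternative domain constraints.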


Empirical Evaluation

Research Questions (RQ):

1. Efficiency -- Is the new approach efficient in terms of its analysis time requirements?

2. Precision -- Is the new approach more precise than previous approaches?

3. Usefulness -- Does the new approach improve the performance of quality assurance techniques?

Implementation: WAMSE

• Written in Java for Java Enterprise Edition (JEE) based web applications

• Implementation Modules

  1. TRANSFORM
     • Customized JEE libraries
     • Stinger for analysis and automated transformation

  2. SE ENGINE
     • Symbolic execution engine built on Java PathFinder
     • Constraint solver is YICES

  3. PC ANALYSIS

Implementation: Other Approaches

Dynamic

Spider:

• Web spider crawls pages of application
• OWASP Web Scarab Project

Static

DFW [1]:

• Identify parameter names via static analysis
• Reimplementation of the author-provided code

WAMDF [2]:

• Uses iterative data-flow analysis
• Implementation from previous work

1. Deng, Frankl, Wang, SEN 2004.
2. Halfond and Orso, FSE 2007.

Subject Applications

Subject               LOC      Classes   Servlets
Bookstore             19,402   28        27
Classifieds           10,702   18        18
Employee Directory    5,529    11        9
Events                7,164    13        12

Subjects available online from GotoCode.com


RQ1: Efficiency

[Bar chart: analysis time in seconds (axis 0 to 5000) for Bookstore, Classifieds, Employee Dir., and Events; series: WAMSE, WAMDF, DFW, Spider]

1. High number of infeasible paths in subjects
2. Low number of constraints per parameter
3. Web applications highly modular

RQ2: Precision

[Bar chart: number of interfaces (axis 0 to 400) for Bookstore, Classifieds, Employee Dir., and Events; series: WAMSE, WAMDF]

On average, 80% of WAMDF interfaces were spurious

RQ3: Usefulness

Measure improvement of three quality assurance techniques:

a) Invocation Verification
b) Penetration Testing
c) Test Input Generation

RQ3a – Invocation Verification

Approach   False Positives   False Negatives
WAMDF      0%                50%
Spider     39%               0%
WAMSE      0%                0%

Verification of invocations for subject Bookstore

[Diagram: Web Application; invocation from getQuote.jsp to buyPolicy.jsp marked with an X]
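A simple form of invocation verification compares the parameter names in an invocation URL against the accepted groupings. The sketch below assumes an exact-match rule on names only and, purely for illustration, reuses the groupings from the example servlet; `InvocationCheckDemo` is a hypothetical name and the rule is not necessarily the one used in the evaluation.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class InvocationCheckDemo {
    /** Extracts the parameter names from the query string of a URL. */
    static Set<String> parameterNames(String url) {
        Set<String> names = new TreeSet<>();
        String query = URI.create(url).getQuery();
        if (query == null || query.isEmpty()) {
            return names;
        }
        for (String pair : query.split("&")) {
            names.add(pair.split("=", 2)[0]);
        }
        return names;
    }

    static Set<String> group(String... names) {
        return new TreeSet<>(Arrays.asList(names));
    }

    /** Groupings from the example servlet, assumed here to describe the target. */
    static List<Set<String>> exampleGroupings() {
        List<Set<String>> groupings = new ArrayList<>();
        groupings.add(group("action"));
        groupings.add(group("action", "age"));
        groupings.add(group("action", "name", "type", "year"));
        return groupings;
    }

    /** An invocation matches if its parameter names equal one accepted grouping exactly. */
    static boolean matches(String url, List<Set<String>> acceptedGroupings) {
        return acceptedGroupings.contains(parameterNames(url));
    }

    public static void main(String[] args) {
        List<Set<String>> accepted = exampleGroupings();
        System.out.println(matches("http://host/getQuote.jsp?action=checkeligibility&age=18", accepted));
        System.out.println(matches("http://host/getQuote.jsp?action=doquote&car=jeep", accepted));
    }
}
```

With these assumed groupings, the second URL is reported as a mismatch because it supplies a parameter named car, which appears in none of the groupings.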


RQ3b – Penetration Testing

[Bar chart: number of vulnerabilities (axis 0 to 40) for Bookstore, Classifieds, Employee Dir., and Events; series: WAMSE, WAMDF, DFW, Spider]

Number of vulnerabilities: 2X – 6X higher for WAMSE

RQ3c – Test Input Generation

[Three bar charts for Bookstore, Classifieds, Employee Dir., and Events; series: WAMSE, WAMDF, DFW, Spider. Panels: % statement coverage (axis 50 to 100), % branch coverage (axis 10 to 70), # command forms (axis 10 to 1000)]

Statement coverage increase: 3%-25%

Branch coverage increase: 3%-67%

Command form increase: 651%-1,577%

RQ3c – Test Suite Size

[Bar chart: number of test cases (axis 1000 to 1000000) for Bookstore, Classifieds, Employee Dir., and Events; series: WAMSE, WAMDF, DFW, Spider]

Test suite decrease in size: 4X – 10X

RQ3c results:
1. Higher coverage for measured metrics
2. Smaller average test suite

Summary of Results

• Developed interface identification technique for web applications based on symbolic execution.

• Empirical evaluation:
  • Similar analysis time to other techniques
  • More precise than current techniques
  • Improves quality assurance techniques