
[email protected]

DepartmentofInformaticsEngineeringUniversityofCoimbra- Portugal

BENCHMARKINGTHE SECURITY OF SOFTWARE SYSTEMS OR

TO BENCHMARK OR NOT TO BENCHMARK

QRS 2018Lisbon, PortugalJuly 19th, 2018


BENCHMARKING

Assessing and comparing computer systems and/or components according to specific quality attributes

§ Performance benchmarking
– Well established both in terms of research and application
– Supported by organizations like TPC and SPEC
– Mostly for marketing

§ Dependability benchmarking
– Well established from a research perspective
– No endorsement from the industry


BENCHMARKING

Assessing and comparing computer systems and/or components according to specific quality attributes

§ Security benchmarking
– Several works can be found
– No common approach available yet

[Timeline figure, 1972-2017: release of commercial performance benchmarks (Whetstone, Wisconsin Bench, TP1, DebitCredit, TPC & SPEC, EMBC), followed by research projects on dependability & security benchmarks (SIGDeB, Orange Book, Common Criteria, CIS); marked years include 1972, 1983, 1985, 1987, 1988, 1999, 2000, and 2017]


OUTLINE

§ The past: Performance & Dependability Benchmarking

§ The present: Security Benchmarking

§ Benchmarking the Security of Systems
– Approach: Qualification + Trustworthiness Assessment
– Example: Benchmarking Web Service Frameworks

§ Benchmarking Security Tools
– Approach: Vulnerability and Attack Injection
– Example: Benchmarking Intrusion Detection Systems

§ Challenges and Conclusions


PERFORMANCE BENCHMARKING

Assessing and comparing computer systems and/or components in terms of performance


PERFORMANCE BENCHMARKING

[Diagram: a workload is applied to the SUB (system under benchmarking), producing metrics]

§ Workload:
– Set of representative operations

§ Metrics (see the sketch below):
– Throughput
– Response time
– Latency
– …
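A minimal sketch of what such a benchmark harness might look like; the SUB operation and workload here are placeholders, not anything from the deck. The same skeleton is reused in the dependability sketch further below.

```python
import time
import statistics

def run_benchmark(sub_operation, workload, warmup=100):
    """Apply a workload to the SUB and compute basic performance metrics."""
    # Warm-up phase: results are discarded so the SUB reaches a steady state.
    for op in workload[:warmup]:
        sub_operation(op)

    latencies = []
    start = time.perf_counter()
    for op in workload[warmup:]:
        t0 = time.perf_counter()
        sub_operation(op)  # one representative operation against the SUB
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    return {
        "throughput_ops_per_s": len(latencies) / elapsed,
        "mean_response_time_s": statistics.mean(latencies),
        "p95_response_time_s": statistics.quantiles(latencies, n=20)[18],
    }
```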


TPC-C (1992)

§ Workload:
– Database transactions

§ Metrics:
– Transaction rate (tpmC)
– Price per transaction ($/tpmC)

Although some integrity tests are performed, it assumes that nothing fails

[Diagram: a workload is applied to the DBMS, producing metrics]


DEPENDABILITY BENCHMARKING

Assessing and comparing computer systems and/or components considering dependability attributes


DEPENDABILITY BENCHMARKING

[Diagram: a workload and a faultload are applied to the SUB, producing experimental metrics; these feed models, parameterized with fault rates, MTBF, etc., to produce unconditional metrics]

§ Faultload:
– Set of representative faults, injected into the system

§ Metrics (see the sketch below):
– Performance and/or dependability
• Both baseline and in the presence of faults
– Unconditional and/or direct
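Reusing run_benchmark from the performance sketch above, a dependability benchmark might compare the same experimental metrics with and without a faultload. The injection hook, fault set, and injection probability below are illustrative assumptions, not the deck's method:

```python
import random

def run_with_faultload(sub_operation, workload, faults, inject_fault, p_inject=0.01):
    """Run the workload twice: once as a baseline, once under fault injection."""
    baseline = run_benchmark(sub_operation, workload)  # metrics without faults

    def faulty_operation(op):
        # With probability p_inject, inject one representative fault
        # (e.g., an operator mistake or a HW component failure) before the call.
        if faults and random.random() < p_inject:
            inject_fault(random.choice(faults))
        sub_operation(op)

    with_faults = run_benchmark(faulty_operation, workload)
    return {"baseline": baseline, "with_faults": with_faults}
```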


DBENCH-OLTP (2005)

§ Workload:
– TPC-C transactions

§ Faultload:
– Operator faults + software faults + HW component failures

§ Metrics:
– Performance: tpmC, $/tpmC, Tf, $/Tf
– Dependability: Ne, AvtS, AvtC

[Diagram: a workload and a faultload are applied to the SUB, producing experimental metrics]


DBENCH-OLTP (2005)

Faultload: operator faults


DBENCH-OLTP (2005)

[Charts for systems A through K: baseline performance (tpmC and $/tpmC), performance with faults (Tf and $/Tf), and availability in % (AvtS for the server, AvtC for the clients)]

Does not take into account malicious behaviors (faults = vulnerability + attack)


SECURITY BENCHMARKING

Assessing and comparing computer systems and/or components considering security aspects

§ Benchmarking the Security of Systems/Components
– Systems that should implement security requirements
– OS, middleware, server software, etc.

§ Benchmarking Security Tools
– Tools used to improve the security of systems
– Penetration testers, static analyzers, IDS, etc.


BENCHMARKING SECURITY OF SYSTEMS

§ Attackload:
– Representative attacks

§ Metrics:
– Performance + dependability
– Security (e.g., number of vulnerabilities, attack detection)

[Diagram: a workload and an attackload are applied to the SUB, producing experimental metrics; these feed models, parameterized with vulnerability exposure, mean time between attacks, etc., to produce unconditional metrics]

Attacking what? Do we know the vulnerabilities? What are representative attacks?

This does not work if one wants to benchmark how secure different systems are!

E.g., does the number of vulnerabilities of a system represent anything?


A DIFFERENT APPROACH…

[Diagram: SUBs go through security qualification; unacceptable SUBs get security = 0]

§ Security Qualification:
– Apply state-of-the-art techniques and tools to detect vulnerabilities
– SUBs with vulnerabilities are:
• Disqualified!
• Or the vulnerabilities are fixed…


A DIFFERENT APPROACH…

[Diagram: SUBs go through security qualification; unacceptable SUBs get security = 0, while acceptable SUBs go through trustworthiness assessment, producing metrics]

§ Trustworthiness Assessment:
– Gather evidence on how much one can trust
– E.g., best coding practices, development process, bad smells


A DIFFERENT APPROACH…

§ Metrics:
– Portray trust from a user perspective
– Dynamic: may change over time
– Depend on the type of evidence gathered
– Different metrics for different attack vectors

[Diagram: security qualification followed by trustworthiness assessment, as in the previous slides]


EXAMPLE: WEB SERVICE FRAMEWORKS

[Diagram: WSFs go through qualification (testing); unacceptable WSFs get security = 0, while acceptable WSFs go through assessment (CPU + memory), producing a trustworthiness score]

§ Qualification:
– DoS attacks
– Coercive parsing, malformed XML, malicious attachment, etc.

§ Trustworthiness Assessment:
– Quality model to compute a score (see the sketch below)
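The deck does not spell the quality model out. As a hedged illustration, one simple form is a weighted average of normalized evidence; every evidence name, value, and weight below is hypothetical:

```python
def trustworthiness_score(evidence, weights):
    """Aggregate normalized evidence values (0..1) into a single score
    using a weighted average: one very simple form of quality model."""
    total = sum(weights.values())
    return sum(evidence[name] * w for name, w in weights.items()) / total

# Hypothetical evidence for one web service framework (higher is better):
evidence = {
    "coding_best_practices": 0.8,  # e.g., share of best-practice rules followed
    "development_process": 0.6,    # e.g., process maturity assessment
    "code_smells": 0.9,            # e.g., 1 - smell density
    "resource_usage": 0.7,         # e.g., CPU + memory behavior under load
}
weights = {
    "coding_best_practices": 3,
    "development_process": 2,
    "code_smells": 2,
    "resource_usage": 1,
}

print(f"Trustworthiness score: {trustworthiness_score(evidence, weights):.2f}")
```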


QUALITY MODEL


SYSTEMS UNDER BENCHMARKING


TRUSTWORTHINESS RESULTS


BENCHMARKING SECURITY TOOLS

§ Faultload:
– Vulnerabilities are injected
– Attacks target the injected vulnerabilities

§ Data can be collected for benchmarking security tools
– Penetration testers, static analyzers, IDS, etc.

[Diagram: a workload and a faultload (vulnerabilities + attacks) are applied to the SUB; a security tool observes the SUB, and the resulting data yields experimental metrics]


VULNERABILITY AND ATTACK INJECTION


EXAMPLE: BENCHMARKING IDS

§ Security requires a defense-in-depth approach:
– Coding best practices
– Testing
– Static analysis
– …

§ Vulnerability-free code is hard (or even impossible) to achieve...

§ Intrusion detection tools support a post-deployment approach
– For protecting against known and unknown attacks


EVALUATION APPROACH


EXAMPLES OF VULNERABILITIES INJECTED

| Original PHP code | Code with injected vulnerability | Operation performed |
|---|---|---|
| $id = intval($_GET['id']); | $id = $_GET['id']; | Removed the "intval" function, allowing non-numeric values (i.e., SQL commands) in the "$id" variable |
| $page = urlencode($page); | $page = $page; | Removed the "urlencode" function, allowing non-encoded values (i.e., SQL commands) in the "$page" variable |
| … | … | … |
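A minimal sketch of how an injector might apply the first mutation from the table to PHP source. The regex-based approach is an illustrative assumption, not necessarily how the actual injection tool works:

```python
import re

# Mutation operator: drop the intval() sanitization around a GET parameter,
# turning $id = intval($_GET['id']); into $id = $_GET['id'];
INTVAL_PATTERN = re.compile(r"intval\((\$_GET\[[^\]]+\])\)")

def inject_vulnerability(php_source):
    """Return (mutated_source, n_injected) with intval() sanitization removed."""
    return INTVAL_PATTERN.subn(r"\1", php_source)

mutated, count = inject_vulnerability("$id = intval($_GET['id']);")
print(mutated)  # $id = $_GET['id'];  (now accepts arbitrary strings)
```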


EXAMPLES OF ATTACKS

| Attack payload | Expected result |
|---|---|
| ' | Modifies the structure of the query; usually results in an error |
| or 1=1 | Modifies the structure of the query; overrides the query restrictions by adding a statement that is always true |
| ' or 'a'='a | Modifies the structure of the query; overrides the query restrictions by adding a statement that is always true |
| +connection_id()-connection_id() | Modifies the query result to 0 |
| +1-1 | Modifies the query result to 0 |
| +67-ASCII('A') | Modifies the query result to 0 |
| +51-ASCII(1) | Modifies the query result to 0 |
| … | … |
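For illustration only, a small driver could replay such payloads against one parameter of a web application. This sketch assumes the third-party requests library; the target URL, parameter name, and the crude "suspicious response" oracle are all hypothetical (a real benchmark would use the injected vulnerabilities as ground truth instead):

```python
import requests  # third-party HTTP client, assumed to be installed

PAYLOADS = ["'", " or 1=1", "' or 'a'='a", "+connection_id()-connection_id()", "+1-1"]

def probe(url, param):
    """Send each payload as part of one parameter value and flag responses
    that hint at a successful injection (server error or leaked SQL error)."""
    for payload in PAYLOADS:
        resp = requests.get(url, params={param: "1" + payload}, timeout=10)
        suspicious = resp.status_code >= 500 or "SQL" in resp.text
        print(f"{payload!r:40} -> {resp.status_code} suspicious={suspicious}")

# probe("http://target.example/page.php", "id")  # hypothetical target
```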


SYSTEMS UNDER BENCHMARKING

| Tool | Architectural level monitored | Detection approach | Data source | Known technology limitations |
|---|---|---|---|---|
| ACD | Application | Anomaly based | Apache log | Only GET method |
| Apache Scalp | Application | Signature based | Apache log | Only GET method |
| ModSecurity | Application | Signature based | HTTP traffic | - |
| Snort (v2.8 and v2.9) | Network | Signature based | Network traffic | - |
| GreenSQL | Database | Signature based | SQL proxy traffic | MySQL data |
| DB IDS | Database | Anomaly based | SQL sniffer traffic | MySQL and Oracle data |


EXPERIMENTAL SETUP


MAIN RESULTS

| Lvl | Tool | P | N | Pop | TP | TN | FN | FP | Prec. | Recall | Mark. | Infor. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| App | ACD | 1051 | 224 | 1275 | 376 | 174 | 675 | 50 | 0.883 | 0.358 | 0.088 | 0.135 |
| App | Apache Scalp | 1051 | 224 | 1275 | 206 | 224 | 845 | 0 | 1.000 | 0.196 | 0.210 | 0.196 |
| App | ModSecurity | 826 | 225 | 1051 | 236 | 225 | 590 | 0 | 1.000 | 0.286 | 0.276 | 0.286 |
| Net | Snort 2.8 | 458 | 817 | 1275 | 0 | 817 | 458 | 0 | - | 0.000 | - | 0.000 |
| Net | Snort 2.9 | 173 | 878 | 1051 | 0 | 878 | 173 | 0 | - | 0.000 | - | 0.000 |
| DB | GreenSQL | 458 | 817 | 1275 | 244 | 813 | 214 | 4 | 0.984 | 0.533 | 0.775 | 0.528 |
| DB | DB IDS | 458 | 817 | 1275 | 451 | 384 | 7 | 433 | 0.510 | 0.985 | 0.492 | 0.455 |

(P = attacks, N = legitimate operations, Pop = P + N)
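The metric columns follow the standard confusion-matrix definitions: precision = TP/(TP+FP), recall (TPR) = TP/(TP+FN), markedness = PPV + NPV - 1, and informedness = TPR + TNR - 1. A small sketch, checked against the ACD row:

```python
def ids_metrics(tp, tn, fn, fp):
    """Precision, recall, markedness, and informedness from a confusion matrix."""
    precision = tp / (tp + fp) if tp + fp else None  # undefined if nothing is reported
    recall = tp / (tp + fn)                          # true positive rate (TPR)
    tnr = tn / (tn + fp)                             # true negative rate
    npv = tn / (tn + fn) if tn + fn else None        # negative predictive value
    informedness = recall + tnr - 1
    markedness = precision + npv - 1 if precision is not None else None
    return precision, recall, markedness, informedness

# ACD row: TP=376, TN=174, FN=675, FP=50
print(ids_metrics(376, 174, 675, 50))
# approximately (0.883, 0.358, 0.088, 0.135), matching the table
```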


WHAT IS WRONG?

§ Established benchmarks are mostly for marketing!

§ Strict benchmarking conditions:
– Fixed workload & faultload + small set of metrics

§ Workload & faultload:
– May not be representative of the user scenario

§ Metrics:
– Fixed! May not satisfy the user needs
– Decision based on several metrics is difficult!

No security benchmark is endorsed by any organization or industry


FIXED!

§ Example:
– Benchmarking vulnerability detection tools
– Typical metric: F-measure
– Is this good in all scenarios? (see the sketch below)
• Business critical: recall
• Best effort: F-measure
• Minimum effort: markedness

[Diagram: an activation is applied to the SUB, producing a fixed set of metrics]
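A sketch of scenario-dependent metric selection, using the mapping from the bullets above; the scenario names come from the slide, while the function itself and its inputs are illustrative:

```python
def f_measure(precision, recall, beta=1.0):
    """F-measure: weighted harmonic combination of precision and recall."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def scenario_score(scenario, precision, recall, npv):
    """Pick the metric that matches the usage scenario."""
    if scenario == "business_critical":
        return recall                    # missing a vulnerability is what hurts
    if scenario == "best_effort":
        return f_measure(precision, recall)
    if scenario == "minimum_effort":
        return precision + npv - 1       # markedness: trust in what is reported
    raise ValueError(f"unknown scenario: {scenario}")

print(scenario_score("business_critical", precision=0.9, recall=0.4, npv=0.8))  # 0.4
```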


A POTENTIAL APPROACH…

§ Benchmarking conditions adaptable to the user needs

§ Include multiple usage scenarios:
– Metrics depend on the scenario
– Adaptable workload and faultload

§ Use quality models instead of independent metrics
– Quality models should also adapt to the scenario


SCENARIOS AND QUALITY MODELS

How to define scenarios? How to define quality models? How to adapt workloads and faultloads to the scenarios?


CHALLENGES

§ Satisfy industry requirements:
– Representativeness, portability, scalability, non-intrusiveness, low cost, …
– Prevent "gaming"

§ Satisfy user requirements:
– Representativeness, usefulness, simplicity of use, …
– Adaptable: allow "gaming"

§ Endorsement by TPC, SPEC, …
– How to?


IS THERE A FUTURE?

§ Resilience Benchmarking
– Assess and compare the behavior of components and computer systems when subjected to changes
– Which resilience metrics?
• Comparable, consistent, understandable, meaningful, …
– Changeloads:
• Representative, practical, portable, …

§ Trustworthiness Benchmarking
– What evidence to collect?
– What metrics?
– Dynamicity of perception… social trust…


CONCLUSIONS

§ The benchmarking concept is well established!

§ Acceptance by "big" industry depends on perceived utility for marketing

§ Acceptance by users requires "adaptability"

§ From a research perspective, performance and dependability benchmarking are well known

§ Security benchmarking approaches are weak

§ New types of benchmarks will bring additional challenges!


QUESTIONS?

Marco Vieira
Department of Informatics Engineering
University of Coimbra
[email protected]

http://eden.dei.uc.pt/~mvieira

