Grand Research
Challenges for Cybersecurity of
Critical Information and Infrastructures
Paulo Esteves-Veríssimo
Univ. of Luxembourg, FSTC / SnT
http://staff.uni.lu/paulo.verissimo
CritiX Lab (Critical and Extreme
Security and Dependability)
Science of Security Speaker Series,
Information Trust Institute, U. Illinois
Urbana-Champaign, April 2017
The world is becoming an immense
(interconnected) infrastructure
[Figure: ISPs, cloud, computing and communications]
Internet minute
www.intel.com/.../internet-minute-infographic.html
CritiX research
vision
People
• Dr. Vincent Rahli (FR): Type theory, programming languages, proof assistants, distributed computing
• Dr. Francisco Rocha (PT): Systems security, software security, dependability and security architectures, autonomous vehicles safety and security
• Prof. Paulo Esteves-Veríssimo (PT): Secure and dependable distributed architectures, middleware and algorithms
• Dr. Jérémie Decouchant (FR): Distributed systems' performance and fault tolerance, privacy and security, multicore systems, peer-to-peer protocols
• Natalie Kirf (DE): Admin. assistant
• Dr. Marcus Völp (DE): Microkernel and microhypervisor OS-level protection, CPS, RTES
• Dr. Jiangshan Yu (CN): Design and analysis of crypto protocols, secure authentication, post-compromise security, public-ledger applications
• Dr. David Kozhaya (LB): Reliable and real-time distributed computing, fault tolerance in distributed control systems (SCADA/DCS)
Critical and Extreme Security and Dependability Research Lab
Research enablers for the next generation of protection
• Critical Security and Dependability:
– information and infrastructures under advanced persistent threats
• Extreme Computing:
– CSE pushed to the extremes of functional and non-functional properties
• Architecting and designing for resilience:
– accidental and malicious faults; protection in an incremental way; automatic adaptation
Software Defined Networking (SDN)
the other face of the problem
• ironically, causes of concern lie in SDN's main benefits:
– network programmability and control logic centralization
– smaller diversity
– new threats that did not exist before or were harder to exploit
[Kreutz et al., HotSDN’13]
Networked Control Systems: no longer closed, proprietary, or dumb
[Figure: generator control network; labels: RVR, AVR, feedback control, firewall, protection device, generator]
[Bessani et al., Sec&Priv Mag.’08]
X-by-wire Networked Vehicles:
no longer mechanical and isolated
[Lima et al., CPS-SP@CCS’16]
The safety-security gap in vehicle ecosystems
A. Lima, F. Rocha, M. Völp, P. Verissimo, “Towards Safe and Secure Autonomous and Cooperative Vehicle Ecosystems”, in Proc. of the 2nd ACM Workshop on Cyber-Physical Systems Security and Privacy (@CCS), Vienna, Austria, Oct. 2016.
Autonomous vehicle ecosystem threat plane
Genomics and Biomedical vs. Big Data
• Next Generation Sequencing dramatically decreased sequencing price
• Amount of data handled will scale up.
• Cloud and big data analytics onto the agenda
Big Data Biomedical
[Verissimo et al., Sec&Priv Mag.’13]
Privacy- and integrity-preserving data processing The e-biobanking vision
Alysson Bessani et al., “BiobankCloud: a Platform for the Secure Storage, Sharing, and Processing of Large Biomedical Data Sets”, in Proc. of the 1st Int. Workshop on Data Mgt. and Analytics for Medicine and Healthcare (DMAH 2015), Hawaii, US, Sept. 2015.
Privacy-preserving (really) distributed DNA alignment!
DNA workflows in an e-biobanking ecosystem
• Classical DNA workflow does not fit the e-biobanking vision, where data are:
• generated and stored in multiple locations
• accessible from different locations
• processed in distributed way
• (any) DNA workflow must guarantee that DNA data are protected, with incremental levels according to need, upon:
• storage,
• computation,
• publishing, sharing
Blockchain for FinTech and Crypto-currency: challenges
• Scalability:
– Currently, the number of records/transactions per second is limited. Solving this is crucial to make blockchain useful in financial applications.
• Security:
– Blockchain is vulnerable to attacks such as double spending, selfish mining, and renting, some of which are a consequence of ambiguous model definitions. Critical financial applications need a more robust and resilient system.
• Privacy:
– Currently, anyone can read the transparent ledger. Providing user privacy while preserving transparency is a key challenge.
• Consensus:
– A client can never be guaranteed that a record/transaction is included in the blockchain. It is important to provide fast, deterministic, and verifiable guarantees.
Meeting the Challenges of Critical Dependability and Security
PEARL project IIS&D (Information Infrastructure Security and Dependability)
Value proposal:
• CPS infrastructures and control
• Internet and cloud infrastructures
• Security and dependability of embedded components
• Highly sensitive data privacy and integrity
Architecting and designing for resilience
• comprehensive approach to those threats, from first principles: “build defence in”
• simultaneously coping with accidental and malicious faults
• provide protection in an incremental way: not all threats are extreme, nor all systems critical
• automatically adapt to a dynamic range of severity of threats
• seek unattended and perpetual operation
Is resilience really necessary?
A world full of threats?
• targeted attacks and advanced persistent threats, by nation-state actors and other powerful agents
• weakening and subversion of comms and computing services
• threats to privacy: mass surveillance and data collection
• sophisticated automated cyber weapons
• organised crime
Conventional Software Vulnerabilities ever increasing
[Figure: number of new vulnerabilities per year. Sources: IBM X-Force, Symantec, Telexa]
Tailored Subversion and Intrusion
[Figure: attack sophistication vs. attacker expertise. Axes: time (1980, 1985, 1990, 1995, 2000, 20xx…) and level (low to high). Curves: available attack sophistication, required attacker expertise. Attacks labelled along the timeline: password guessing, self-replicating code, password cracking, exploiting known vulnerabilities, disabling audits, back doors, hijacking sessions, sweepers, sniffers, packet spoofing, GUI automated probes/scans, denial of service, www attacks, “stealth” / advanced scanning techniques, burglaries, network mgmt. diagnostics, DDoS attacks; for 20xx…: bot nets, embedded malicious code, targeted attacks a.k.a. advanced persistent threats]
(Source: adapted from Lipson, H. F., Tracking and Tracing Cyber-Attacks: Technical Challenges and Global Policy Issues, Special Report CMU/SEI-2002-SR-009, November 2002 (CERT).)
Designing and architecting for resilience
1. we want systems to operate through faults and attacks seamlessly and automatically
2. we want systems to endure operating conditions and environments that are ever more uncertain and/or hostile
3. we want systems to be deployed in an unattended manner
4. we want systems to attain very high levels of assurance
Preventing and Tolerating Faults and Intrusions
Handling Incremental Threat Severity
Resisting Continued Threats
Validating and Assessing Assumptions and Mechanisms
P. Verissimo, M. Correia, N. Neves, P. Sousa, “Intrusion-Resilient Middleware Design and Validation”, in Information Assurance, Security and Privacy Services, vol. 4. Emerald Group Publishing Limited, May 2009.
P. Verissimo, N. Neves, M. Correia, “Intrusion-Tolerant Architectures: Concepts and Design”, in Architecting Dependable Systems, ser. LNCS, vol. 2677. Springer-Verlag, Jun. 2003. Ext. vers. in http://hdl.handle.net/10451/14253
Designing dependable and secure systems: a brief history
Designing dependable and secure systems: the zero-defect goal
[Figure: hosts A to D]
Designing dependable and secure systems: zero-defect goal considered infeasible
NOTES:
(i) there will always be vulnerabilities in a fully-fledged system
Designing dependable and secure systems: reality in real-world systems
[Figure: hosts A to D]
NOTES:
(i) there will be quite a few vulnerabilities in a real-world system
Attack-Vulnerability-Intrusion composite fault model
NOTES:
(i) This model, introduced in the MAFTIA project, will help analyze the situation
(ii) state includes data, code, metadata, configuration variables, etc.
(iii) “failure” means failure of any security property, when and if perceived by a user
AVI fault model: attack + vulnerability → intrusion → error → failure
[Figure: Intruder/Designer/Operator → vulnerability (fault); Intruder → attack (fault); attack + vulnerability → intrusion (fault) → error → failure]
NOTES:
(i) Composite fault: v+a
(ii) state is erroneous
(iii) failure will ensue sooner or later
[Verissimo et al., Sec&Priv Mag.’06]
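The composite fault idea can be illustrated with a small sketch: an intrusion exists only where an attack meets a matching vulnerability. This is a toy illustration, not part of the MAFTIA model itself; the class names, the fields, and the string labels for weaknesses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vulnerability:
    component: str  # where the flaw lives, e.g. "Host C"
    weakness: str   # hypothetical label for the flaw


@dataclass(frozen=True)
class Attack:
    target: str     # component the intruder goes after
    exploits: str   # weakness the attack relies on


def intrusions(vulnerabilities, attacks):
    """Return the (attack, vulnerability) pairs that match: the intrusions."""
    return [
        (a, v)
        for a in attacks
        for v in vulnerabilities
        if a.target == v.component and a.exploits == v.weakness
    ]


# Two vulnerabilities and two attacks, but only one successful match.
vulns = [
    Vulnerability("Host C", "weak-password"),
    Vulnerability("Host A", "buffer-overflow"),
]
attacks = [
    Attack("Host C", "weak-password"),
    Attack("Host B", "buffer-overflow"),
]
matched = intrusions(vulns, attacks)
```

Under these assumptions, removing either the vulnerability on Host C or the attack aimed at it empties `matched`, which is the intuition behind vulnerability removal and attack prevention.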
Classical security: Intrusion prevention
[Figure: AVI chain annotated with attack prevention, vulnerability prevention, vulnerability removal, and intrusion prevention]
Intrusion prevention!
Designing dependable and secure systems: reality in real-world systems, optimistic view
[Figure: hosts A to D]
Designing dependable and secure systems: zero attack-vulnerability-match goal
[Figure: hosts A to D]
NOTES:
(i) Individual vulnerabilities and attacks exist; we try to prevent their successful matches
Classical security: Intrusion prevention
[Figure: AVI chain annotated with attack prevention, vulnerability prevention, vulnerability removal, and intrusion prevention]
Indispensable but not perfect ...
[Figure: Host C, with intrusion prevention ahead of the intrusion (fault) → error → failure path]
Ah, but that’s a residual probability! Riiiiight?...
If it MAY happen, it WILL happen!
Classical security: Intrusion detection
[Figure: AVI chain annotated with attack prevention, vulnerability prevention, vulnerability removal, and intrusion prevention]
Intrusion detection!
Which is useful but imprecise, incomplete and slow ...
NOTES:
(i) after intrusion, the system is on the path to failure, so incomplete or slow intrusion detection and/or ad hoc processing/mitigation carries a high risk of failure of a security property as perceived by a user
And what’s worse, some individual failures will still occur
Back to the AVI composite fault model
• what remains, as ‘action points’ before failure?
• why not act after errors caused by intrusions, by intrusion tolerance, as in “fault tolerance”?
[Figure: AVI chain annotated with attack prevention, vulnerability prevention, vulnerability removal, intrusion prevention, and intrusion detection]
We are here!
Automatic Dependability and Security: intrusion tolerance
[Figure: AVI chain annotated with attack prevention, vulnerability prevention, vulnerability removal, intrusion prevention, and intrusion tolerance]
What is Intrusion Tolerance?
• The tolerance paradigm in security:
– Assumes that systems remain to a certain extent vulnerable
– Assumes that attacks on components or sub-systems can happen and some will be successful
– Ensures that the overall system nevertheless remains secure and operational, preventing “failure”, as failure of a security property, when and if perceived by a user
– Paradigms and techniques can simultaneously cover accidental and malicious faults (dependability and security)
• In other words:
– Faults, malicious and other, occur
– They generate errors, i.e. component-level security compromises
– Error processing mechanisms make sure that system-level security failure is prevented
Intrusion Tolerance: error processing
• Error detection and recovery techniques:
– effective use of intrusion detection: integrated in a principled and automatic process
– system detects errors resulting from intrusions
– system either goes back to a previous state known to be correct and resumes/restarts operation, or reconfigures and proceeds forward to a state that seeks correct provision of service
[Figure: recover and resume/restart after intrusion; reconfigure after intrusion]
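The "go back to a previous state known as correct" option can be sketched as a checkpoint-and-rollback loop. This is a minimal illustration under strong assumptions: error detection is reduced to a user-supplied invariant check, the state is a plain dictionary, and the names `RecoverableService`, `commit` and `detect_and_recover` are hypothetical.

```python
import copy


class RecoverableService:
    """Toy service with backward recovery (rollback to a checkpoint)."""

    def __init__(self, state, invariant):
        self.state = state
        self.invariant = invariant  # callable: state -> bool
        self.checkpoint = copy.deepcopy(state)

    def commit(self):
        """Save the current state as known-correct, if it passes the check."""
        if not self.invariant(self.state):
            raise ValueError("refusing to checkpoint an erroneous state")
        self.checkpoint = copy.deepcopy(self.state)

    def detect_and_recover(self):
        """Detect an error and roll back; return True if a rollback happened."""
        if self.invariant(self.state):
            return False
        self.state = copy.deepcopy(self.checkpoint)
        return True


def balances_non_negative(state):
    # Stand-in for error detection: a simple state invariant.
    return all(value >= 0 for value in state.values())


svc = RecoverableService({"alice": 10, "bob": 5}, balances_non_negative)
svc.state["alice"] = 7
svc.commit()                  # legitimate update, checkpointed
svc.state["bob"] = -100       # simulated intrusion corrupts the state
rolled_back = svc.detect_and_recover()
```

The sketch only recovers errors that the invariant can see, which mirrors the point that detection-based processing is only as good as the detector.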
Intrusion Tolerance: error processing
• Error masking techniques:
– redundancy allows providing correct service without glitches
– system masks errors resulting from intrusions
– works as long as there is enough redundancy for the number of faults
Detection not needed: whatever happens, the error is masked
A methodic approach to modular and distributed resilient computing
• Fault and intrusion tolerance, or automatic security and dependability
• Handle increasing threat severity
• Resist continued threats
• Divide-and-conquer to beat extreme threats
• Hybrid models and architectures
• Ultra-reliable trusted components
• High-confidence vertical verification
• Privacy- and integrity-preserving data processing
Fault and Intrusion Tolerance (FIT) Masking: an abstract solution
Tolerating Faults and Intrusions automatically
[Figure: incoming traffic reaches three replica nodes; a consolidation stage takes f+1 out of 2f+1 (k out of n); f = max. number of faulty replicas (f=1 in this example)]
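The "f+1 out of 2f+1" consolidation can be sketched as a simple voter. This is an illustrative sketch, assuming that replies are hashable values compared for exact equality and that the consolidator itself is correct; the function name `consolidate` is hypothetical.

```python
from collections import Counter


def consolidate(replies, f):
    """Return the value reported by at least f+1 of the replies, else None."""
    if len(replies) < 2 * f + 1:
        raise ValueError("need at least 2f+1 replies")
    value, count = Counter(replies).most_common(1)[0]
    return value if count >= f + 1 else None


f = 1
# One faulty replica out of three: the error is masked.
masked = consolidate(["ok", "ok", "tampered"], f)
# Two colluding faulty replicas exceed f: the wrong value wins the vote.
subverted = consolidate(["ok", "bad", "bad"], f)
```

With at most f faulty replicas, any value backed by f+1 replies includes at least one correct replica. Once f+1 replicas are compromised, as in the second call, the voter can no longer tell.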
Automatic Dependability and Security: intrusion tolerance
[Figure: AVI chain annotated with attack prevention, vulnerability prevention, vulnerability removal, intrusion prevention, and intrusion tolerance]
Criticisms of intrusion tolerance (InTol): common-mode failures or fast exhaustion; f+1 syndrome; ...
Fault and Intrusion Tolerance (FIT) The fast exhaustion or common-mode failure problem
[Figure: incoming traffic reaches four replica nodes and a consolidator; every node has O.S.x inside; f = max. number of faulty replicas (f=1 in this example)]
Designing dependable and secure systems: reality in real-world systems, recap
[Figure: hosts A to D]
Designing dependable and secure systems: reality in real-world systems, adapting to intrusion tolerance
[Figure: hosts A to D]
Hardening q.b. (as much as needed)
Diversifying q.b. (as much as needed)
Fault and Intrusion Tolerance (FIT) Hardening and Diversification - Mitigating fast exhaustion
[Figure: incoming traffic reaches 3f+1 diverse replicas (f=1 in this example): four nodes with O.S.y, O.S.w, O.S.x and O.S.z inside, respectively]
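A rough way to reason about fast exhaustion is to check how many replicas a single operating-system exploit could take down. The sketch below assumes, strongly, that one exploit compromises every replica running the same OS and that different OSes share no vulnerabilities; the function names are hypothetical.

```python
from collections import Counter


def min_replicas(f):
    """Replica count used on the slide for f faulty replicas: 3f+1."""
    return 3 * f + 1


def tolerates_common_mode(os_by_replica, f):
    """True if no single OS exploit can compromise more than f replicas."""
    if len(os_by_replica) < min_replicas(f):
        return False
    largest_group = max(Counter(os_by_replica.values()).values())
    return largest_group <= f


monoculture = {"node1": "O.S.x", "node2": "O.S.x",
               "node3": "O.S.x", "node4": "O.S.x"}
diverse = {"node1": "O.S.y", "node2": "O.S.w",
           "node3": "O.S.x", "node4": "O.S.z"}

monoculture_ok = tolerates_common_mode(monoculture, f=1)
diverse_ok = tolerates_common_mode(diverse, f=1)
```

In the monoculture case one exploit reaches all four nodes, so f=1 is exceeded at once; in the diverse case each exploit reaches a single node.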
Designing dependable and secure systems: zero failures goal despite successful intrusions?
[Figure: hosts A to D]
Only until f+1 faults are produced ...
Fault and Intrusion Tolerance (FIT) The resource exhaustion (or f+1) problem
[Figure: incoming traffic reaches four replica nodes and a consolidator taking f+1 out of 3f+1 (k out of n); f = max. number of faulty replicas (f=1 in this example, was exceeded)]
Fault and Intrusion Tolerance (FIT) Resisting Continued Threats
Seeking (unattended) perpetual execution
[Figure: incoming traffic reaches four replica nodes and a consolidator; nodes receive “Recover Now!” and report “Recovered!”; f = max. number of faulty replicas in an interval Tr]
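The race between attacker and recovery can be explored with a toy discrete-time simulation. It is a sketch under invented assumptions, not the scheme of the cited work: the attacker compromises one healthy replica every `attack_interval` ticks, one replica is rejuvenated round-robin every `recovery_interval` ticks, and recovery is instantaneous. All names and parameters are hypothetical.

```python
def max_compromised(n, attack_interval, recovery_interval, horizon):
    """Largest number of simultaneously compromised replicas over the run."""
    compromised = set()
    next_to_recover = 0
    worst = 0
    for t in range(1, horizon + 1):
        # Attacker step: compromise the lowest-numbered healthy replica.
        if t % attack_interval == 0:
            healthy = [r for r in range(n) if r not in compromised]
            if healthy:
                compromised.add(healthy[0])
        worst = max(worst, len(compromised))
        # Recovery step: rejuvenate replicas in round-robin order.
        if t % recovery_interval == 0:
            compromised.discard(next_to_recover)
            next_to_recover = (next_to_recover + 1) % n
    return worst


f = 1
# Recovery sweeps all 4 replicas faster than the attacker strikes again.
fast_recovery = max_compromised(4, attack_interval=10,
                                recovery_interval=2, horizon=100)
# Attacker is faster than the recovery sweep: f is exceeded.
slow_recovery = max_compromised(4, attack_interval=2,
                                recovery_interval=10, horizon=100)
```

In this toy model the system stays within f only when a full recovery sweep completes before the attacker's next success.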
Designing dependable and secure systems: zero failures forever goal despite successful intrusions?
[Figure: hosts A to D]
Only if you are faster than the attacker ...
[Sousa et al., DSN 05]
A methodic approach to modular and distributed resilient computing
• Fault and intrusion tolerance, or automatic security and dependability
• Handle increasing threat severity
• Resist continued threats
• Divide-and-conquer to beat extreme threats
• Hybrid models and architectures
• Ultra-reliable trusted components
• High-confidence vertical verification
• Architecture-aware privacy-preserving data processing
Divide-and-conquer I: Hybrid models and architectures
Leveraging power at the right place, at the right time
[Figure: Host B with a hardened q.b. payload and a trusted component T; ultimately trusted hybrids]
NOTES:
(i) Trusted components must be trustworthy by construction
(ii) But having trusted-trustworthy components is not enough:
(iii) We need a computational model that lets the payload enjoy their properties
[Figure: hosts A to D, each with a trusted component T; labels: verified and tamperproof functional API, optional control network, payload network (in/out)]
Paulo Verissimo, “Travelling through Wormholes: a new look at Distributed Systems Models”, SIGACT News, vol. 37, no. 1, pages 66-81, Mar. 2006.
Divide-and-conquer II: Protocol modules decomposition
• decompose payload protocols and components into smaller units to identify split points and critical parts
• further decompose even initially trusted components
• leverage lower-level fault and intrusion tolerance techniques so that desired properties emerge from the now modular conglomerate of sub-components
Ultra-reliable trusted components Dependable Hypervisor and Manycore Architectures
• Extremely dependable computing architecture to withstand advanced and persistent threats
• State of the art:
– Microhypervisor-based security and isolation is much better than that of legacy operating systems, but we still see microhypervisor-level faults and attacks
– Verifying a microhypervisor is possible, but at extreme cost and with no protection against hardware faults
• Leverage properties of manycore systems to build dependable and secure microhypervisor-based systems
High-confidence vertical formal verification of critical components
Re-design and partial verification of BFT-SMaRt (ULisboa):
• One of the few fully-implemented and efficient BFT-SMR protocols
• We plan on building trustworthy leader change and reconfiguration components to plug into BFT-SMaRt
• We will verify these components in Coq (theorem prover from INRIA)
Verification of MinBFT’s (ULisboa) core trusted-trustworthy component (USIG):
• C implementation of USIG
• Coq specification/implementation for verification
• Verify that the C code satisfies the Coq specification, through VST (Verified Software Toolchain)
• Verify that the Coq spec satisfies the desired safety properties, through Coq
• Generate target code from C with CompCert (formally verified C compiler)
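To give a feel for what a component like USIG does, here is a toy sketch of a unique sequential identifier generator: a monotonic counter bound to each message by a MAC. The slides only name USIG; the interface below (`create_ui`, `verify_ui`), the HMAC-SHA256 construction, and the shared demo key are assumptions made for illustration, and this is not the verified C implementation.

```python
import hashlib
import hmac


class USIG:
    """Toy unique sequential identifier generator (illustrative only)."""

    def __init__(self, key: bytes):
        self._key = key      # assumed to be protected inside the component
        self._counter = 0    # never decreases, never repeats

    def _mac(self, counter: int, message: bytes) -> bytes:
        data = counter.to_bytes(8, "big") + hashlib.sha256(message).digest()
        return hmac.new(self._key, data, hashlib.sha256).digest()

    def create_ui(self, message: bytes):
        """Assign the next counter value to the message and certify it."""
        self._counter += 1
        return self._counter, self._mac(self._counter, message)

    def verify_ui(self, counter: int, tag: bytes, message: bytes) -> bool:
        """Check that (counter, tag) was issued for exactly this message."""
        return hmac.compare_digest(tag, self._mac(counter, message))


usig = USIG(b"demo-key")  # hypothetical key, for the example only
c1, t1 = usig.create_ui(b"PREPARE 1")
c2, t2 = usig.create_ui(b"PREPARE 2")
```

The safety property one would want to prove about such a component is that no two different messages ever receive the same counter value; in this sketch that follows from the counter being incremented on every call.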
Science of Security: some reflections
Frameworks for SoS example goals and motivations
• A nation wishes to process, as efficiently and effectively as possible, massive forests of data it somehow has access to:
– About legitimate use of systems for illegitimate purposes
– About illegitimate use of systems for illegitimate purposes
• A nation wishes to improve prevention of/tolerance to illegitimate use of systems:
– Direct attacks (inc. APT) onto systems and infrastructures
– Intentional weakening or subversion of security and trust mechanisms in ICT
Frameworks for SoS scope of application of results
1. A nation wishes to interpret as efficiently and effectively as possible, massive forests of data it has access to
2. A nation wishes to protect systems/infrastructures against illegitimate use
A. Intelligence
B. Information gathering
C. Espionage
D. Infrastruct. Security
E. Infrastruct. Protection
F. Infrastruct. Resilience
G. Counter-espionage
Motivation (wrap-up) There is no trustworthy data on non-trustworthy systems
ERGO:
We need models and algorithms supporting systems that operate long enough to fulfill their mission, through threats of increasing magnitude, automatically and in an unattended mode
As a pre-condition to
Tolerance
• Tolerance Goal: operate correctly as long as at most f faults of any quality occur
• This well-known formal proposition, however, says very little about an important objective:
• will f+1 faults not happen “during my watch”?
Resilience
• Resilience Goal: tolerate any quality and quantity of faults over time – as long as the power of the threat is bounded
– (i.e. at most f occur within a given interval)
• How to fulfil this formal proposition? – structure (hardening, trusted components)
– diversity, randomisation, obfuscation
– self-healing, e.g. proactive/reactive recovery (PRR)
– dynamic reconfiguration, moving target defences
Paulo Esteves-Veríssimo
University of Luxembourg
Faculty of Science, Technology and Communication and SnT, the Interdisciplinary Centre for Security, Reliability and Trust
http://staff.uni.lu/paulo.verissimo
@SnT
Critical and Extreme Security and Dependability
We’re hiring bright PhD students and research associates willing to address these challenges!
Thank you!