Master thesis in research group DistriNet
Distributed Software & Software Security
Kick-off event - September 25, 2019
Let us start with some examples ...
What are you (and we!) aiming for?
Sequence-based Intrusion Detection with
Recurrent Neural Networks
Jin Li
Vera Rimmer, Ilias Tsingenopoulos
Wouter Joosen & Davy Preuveneers
Revised work submitted to NDSS 2020 (San Diego, USA)
IDS
Attack N
Benign
Network flow
Packet 1
Packet 0
Packet T
…..
Feature vector
Manually
extract
features
Statistical info of flow
Protocol
Flow Duration
Total Fwd Packets
Total Backward Packets
Total Length of Fwd Packets
Total Length of Bwd Packets
Flow Bytes/s
Flow Packets/s
……
……
Attack 1
› Recognize known network attacks
› Identify anomalous network traffic
› Evolving network threat landscape
Network intrusion detection systems
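The flow statistics listed on this slide are computed by hand from the packets of one flow. A minimal sketch of that manual step, assuming each packet is reduced to a hypothetical (timestamp, direction, length) record; the names `Packet` and `flow_features` are illustrative and not taken from any specific tool:

```python
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical minimal packet record (not a real capture format)."""
    timestamp: float  # seconds since the start of the capture
    direction: str    # "fwd" (initiator to responder) or "bwd"
    length: int       # packet length in bytes


def flow_features(packets):
    """Compute a few hand-crafted statistics for a single network flow."""
    if not packets:
        raise ValueError("a flow needs at least one packet")
    times = [p.timestamp for p in packets]
    duration = max(times) - min(times)
    fwd = [p for p in packets if p.direction == "fwd"]
    bwd = [p for p in packets if p.direction == "bwd"]
    total_bytes = sum(p.length for p in packets)
    return {
        "Flow Duration": duration,
        "Total Fwd Packets": len(fwd),
        "Total Backward Packets": len(bwd),
        "Total Length of Fwd Packets": sum(p.length for p in fwd),
        "Total Length of Bwd Packets": sum(p.length for p in bwd),
        # Rates are undefined for a zero-duration flow; report 0.0 then.
        "Flow Bytes/s": total_bytes / duration if duration > 0 else 0.0,
        "Flow Packets/s": len(packets) / duration if duration > 0 else 0.0,
    }
```

The resulting dictionary plays the role of the feature vector that a classical IDS classifier would take as input.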
IDS
Packet 1
Packet 0
Packet t
Sequential input
…..
Attack N
Benign
……
Attack 1
Sequence-based intrusion detection
› No manually defined high-level statistical features as input!
› Automatically extract features using RNN from the original packets sequence
› Achieve comparable performance to state-of-the-art
Input layer
Packet 2
Packet 1
Packet t
…..
LSTM
Unit
LSTM layer(s)
Dense layer(s)
Output layer
Masking layer
Packet 2
……
Batch
LSTM layer 1
LSTM layer 2
LSTM layer n
……
Auto features
Auto features
Recurrent neural networks
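Flows contain different numbers of packets, so a batch has to be padded to a common length, and a masking layer marks the padded timesteps so that the LSTM can skip them. A minimal plain-Python sketch of that idea, assuming every packet is already encoded as a fixed-length numeric vector and every flow has at least one packet; `PAD_VALUE` and `pad_batch` are hypothetical names:

```python
PAD_VALUE = 0.0  # assumed padding value for the illustration


def pad_batch(sequences):
    """Pad variable-length packet sequences to a common length.

    Returns (padded, mask), where mask[i][t] is True for real packets and
    False for padding timesteps that a masking layer would skip.
    """
    if not sequences:
        return [], []
    max_len = max(len(seq) for seq in sequences)
    dim = len(sequences[0][0])  # size of one packet vector
    padded, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        padded.append([list(v) for v in seq]
                      + [[PAD_VALUE] * dim for _ in range(n_pad)])
        mask.append([True] * len(seq) + [False] * n_pad)
    return padded, mask
```

Deep learning frameworks usually offer this as a built-in building block (Keras, for example, has a `Masking` layer); the slides do not state which framework was used.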
Results
› Stratified sampling: 80% training set / 20% testing set
› Compare with the state-of-the-art results evaluated on the CICIDS2017 dataset [1]
Algorithm            Precision  Recall  F1 score
KNN                  0.96       0.96    0.96
Random Forest        0.98       0.97    0.97
Decision Tree (ID3)  0.98       0.98    0.98
Adaboost             0.77       0.84    0.77
MLP                  0.77       0.83    0.76
Naïve-Bayes          0.88       0.04    0.04
QDA                  0.97       0.88    0.92
Bi-LSTM              0.984      0.982   0.983
[1] Sharafaldin, I., Lashkari, A. H., & Ghorbani, A. A. (2018, January). Toward Generating a New Intrusion
Detection Dataset and Intrusion Traffic Characterization. In ICISSP (pp. 108-116).
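Stratified sampling keeps the class proportions the same in the training and testing sets, which matters when some attack classes are rare. A minimal sketch of an 80% / 20% stratified split using only the standard library; the function name `stratified_split` and the fixed seed are assumptions made for the illustration, not the evaluation code behind the table:

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with per-class proportions preserved."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction of every class for the testing set.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)
```

With 80 benign and 20 attack flows, the testing set receives 16 benign and 4 attack flows, so both sets keep the original 4:1 ratio.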
Dynamic and cost-efficient resource
allocation of multi-tenant workloads in
Kubernetes
Abel Rodríguez Romero
Wouter Joosen & Eddy Truyen
Leveraging k8-resource-optimizer for cost-efficient auto-scaling
Optimal resource allocation is very complex
› Minimize costs
› Comply with SLAs
› Dynamic demands
Elasticity = scalability + automation + optimization
› Optimization of fine-grained parameter spaces (e.g. milliCPUs,
Megabytes) is not feasible
› Non-linear scaling effects in homogeneous horizontal scaling
› Our proposal: address container elasticity using coarse-grained
heterogeneous horizontal scaling
Achieving optimal allocation for a discretized range
of tenant counts
Auto-scaler prototype for Kubernetes
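One way to picture coarse-grained heterogeneous horizontal scaling: instead of tuning milliCPUs and megabytes, pick replicas from a small catalogue of container sizes so that a given number of tenants is served at minimum cost. The sketch below is only an illustration under made-up assumptions (each size has a fixed tenant capacity and hourly cost, and capacities simply add up); it is not the k8-resource-optimizer algorithm, and `cheapest_allocation` is a hypothetical name:

```python
def cheapest_allocation(num_tenants, sizes):
    """Minimum-cost multiset of container sizes that serves num_tenants.

    sizes maps a size name to (tenant_capacity, cost_per_hour); both values
    are made-up inputs. Returns (total_cost, {size_name: replica_count}).
    """
    if not sizes or any(cap < 1 for cap, _ in sizes.values()):
        raise ValueError("need at least one size with capacity >= 1")
    inf = float("inf")
    # best[n] = (cost, last_size_added) for serving at least n tenants
    best = [(0.0, None)] + [(inf, None)] * num_tenants
    for n in range(1, num_tenants + 1):
        for name, (capacity, cost) in sizes.items():
            prev_cost = best[max(0, n - capacity)][0]
            if prev_cost + cost < best[n][0]:
                best[n] = (prev_cost + cost, name)
    # Walk back through the table to recover the chosen replicas.
    counts, n = {}, num_tenants
    while n > 0:
        name = best[n][1]
        counts[name] = counts.get(name, 0) + 1
        n = max(0, n - sizes[name][0])
    return best[num_tenants][0], counts
```

Running this for every tenant count in a range yields a lookup table that an auto-scaler could consult at run time. Real workloads scale non-linearly, so the additive capacity model here is a deliberate simplification.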
Expressive Feature-oriented
Multicast for the Internet of Things
Jonathan Oostvogels
Stefanos Peros, Jan Tobias Mühlberg, Koen Yskout
Danny Hughes & Sam Michiels
Feature-oriented routing by example
› mesh network
› smart energy-harvesting cameras
› track wandering animal and report to back-end
› intelligently activate cameras near animal
› how to efficiently route actuation command?
Feature-oriented multicast
› route messages based on feature constraints
› in the IoT
interest in data-centric routing
avoid explicit resource discovery [1]
natural fit with e.g. complex event detection [2]
› repeated unicast? broadcast? multicast?
multicast only replicates messages as needed
but requires storing and communicating feature information
while facing memory and radio constraints
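The trade-off above can be made concrete with a toy model: each node stores, per child, a summary of the features present in that child's subtree, and a message is replicated only towards subtrees that may contain a matching node. This is a simplified sketch on a hypothetical tree-shaped network, not the protocol presented in this thesis; `Node` and `multicast` are illustrative names:

```python
class Node:
    """A node in a hypothetical tree-shaped mesh (illustration only)."""

    def __init__(self, name, features, children=()):
        self.name = name
        self.features = frozenset(features)
        self.children = list(children)
        # Routing state: per child, the union of features in its subtree.
        self.subtree_features = {
            child.name: child.all_features() for child in self.children
        }

    def all_features(self):
        result = set(self.features)
        for child in self.children:
            result |= child.all_features()
        return frozenset(result)


def multicast(node, required, delivered, transmissions):
    """Deliver a message to every node that has all required features."""
    if required <= node.features:
        delivered.append(node.name)
    for child in node.children:
        # Replicate only towards subtrees that may contain a matching node.
        if required <= node.subtree_features[child.name]:
            transmissions.append((node.name, child.name))
            multicast(child, required, delivered, transmissions)
```

The per-child summaries are the feature information that has to be stored and communicated. Because a summary is a union, a subtree can match even though no single node in it has all required features, which costs extra transmissions but never misses a target.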
State of the art
› geographic routing (e.g. GPSR, GMR)
does not generalise
› publish-subscribe approaches
directed diffusion, Akkermans et al.
requires overlay structure before feature-oriented path is used
› featurecast
integrates feature-oriented routing in IoT network layer
flexible, no additional per-path overlay
promising performance evaluation
Conclusion
› SMRFET: expressive feature-oriented multicast
› feasible in constrained networks
compact implementation
small radio time overhead relative to regular multicast
outperforms broadcast, TM (radio time), repeated unicast (load distribution)
routing tables require small amounts of memory, graceful degradation
› caveats
packet loss
expressivity-efficiency trade-off (multidimensional addressing)
Fishy Faces: Crafting Adversarial Images to
Poison Face Authentication
Giuseppe Garofalo
Vera Rimmer, Tim Van hamme
Wouter Joosen & Davy Preuveneers
WOOT 2018, August 13-14 (Baltimore, MD, USA)
Face authentication
› wide adoption of face recognition in mobile devices
› face authentication is a highly security-sensitive application
› several attacks have been proposed (e.g. replay attacks [1], Bkav’s mask [2], etc.)
[1] Face Anti-spoofing, Face Presentation Attack Detection
[2] Bkav’s new mask beats Face ID in "twin way": Severity level raised, do not use Face ID in business transactions.
System design
› our target authenticator is composed of two parts:
feature extractor
classification model
Input image
System design
› feature extractor
OpenFace library
based on Google’s FaceNet [1] (Convolutional Neural Network)
• face detection
• pre-processing
• feature extraction
feature
extraction
input image
[1] Schroff, F., Kalenichenko, D., and Philbin, J. Facenet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE
conference on computer vision and pattern recognition (2015), pp. 815–823.
System design
› One-Class SVM for classification [1]
Trained only on images of the user
Takes a hyper-parameter that sets an upper bound on the
fraction of training errors
One-Class
SVM
feature
Extraction
[1] Inspired by: Gadaleta, M., and Rossi, M. Idnet: Smartphone-based gait recognition with convolutional neural networks.
Pattern Recognition 74 (2018), 25 – 37.
input image
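The two properties on this slide (training only on the genuine user, and a hyper-parameter that bounds the fraction of training errors) can be illustrated without an SVM. The sketch below swaps in a much simpler centroid-and-radius model, which is explicitly not a One-Class SVM; `fit_one_class`, `authenticate` and the parameter name `nu` are assumptions made for the illustration:

```python
import math


def _distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_one_class(embeddings, nu=0.1):
    """Fit a centroid-and-radius model on the genuine user's embeddings.

    Simple stand-in, NOT a One-Class SVM. The radius is chosen so that at
    most a fraction nu of the training samples falls outside it.
    """
    if not 0 <= nu < 1:
        raise ValueError("nu must be in [0, 1)")
    n, dim = len(embeddings), len(embeddings[0])
    centroid = [sum(e[i] for e in embeddings) / n for i in range(dim)]
    distances = sorted(_distance(e, centroid) for e in embeddings)
    n_rejected = math.floor(nu * n)  # training samples allowed outside
    radius = distances[n - 1 - n_rejected]
    return centroid, radius


def authenticate(embedding, model):
    """Accept the embedding if it lies within the learned radius."""
    centroid, radius = model
    return _distance(embedding, centroid) <= radius
```

In this toy model a larger `nu` gives a tighter boundary: more of the user's own training images are rejected, but impostors have less room to be accepted.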
System design
› once trained, the model is used to authenticate the user
One-Class
SVM
Feature
Extraction
Authentication
Input image
Methodology - Attack example
injected image
› after the injection, the classification error rises from 4% to 44% (by 40 percentage points!)
false positive
unauthorised user
false negative
authorised user
The Procrastinator
TED Talk by Tim Urban
https://www.ted.com/talks/tim_urban_inside_the_mind_of_a_master_procrastinator?language=nl
“The trick, is to start the day in control.
When you’re able to start doing stuff and
show yourself that you can get started,
then you’re able to empower yourself.”
Tim Urban https://www.unstuck.com/advice/how-a-lifelong-procrastinator-cracked-the/
The Roadmap
The objectives of a master thesis
› perform a major project in an independent way
› participate in research
› report both orally and in writing
› create a large dose of self-reliance, creativity and initiative
develop a work plan independently and realize it
time management is very important!
28
6 aspects
whose importance in every master's thesis may vary
› a literature study
› an analysis of a problem/formulate a research question
› the formulation of a solution
› a design
› an implementation
› an evaluation
Who does what?
YOU have the
responsibility
your MENTOR is your main contact
your SUPERVISOR regularly discusses your progress
with your mentor
your ASSESSOR evaluates your completed work
RESEARCHERS discuss the progress of your thesis during workshops
direct contact may be limited
stay in touch!
Your Milestones
› KICK OFF MEETING (September, today)
› FIRST PRESENTATION (end November)
› FIRST TEXT (end December)
› SECOND PRESENTATION (around March)
› CHOOSE A FINAL TITLE (April)
› PROTOTYPE (after Easter)
› FULL TEXT (early June)
legend: a major milestone / an intermediate station
your text needs continuous care!
Monthly evaluations
› according to plan?
› plan next 4 weeks?
› self-evaluation
› detailed workplan
› self-evaluation
› first piece of text
› progress vs. workplan?
› self-evaluation
› progress vs. workplan?
› self-evaluation
› progress vs. workplan?
› describe demo
› major milestone Easter?
› self-evaluation
› key conclusions?
› refine thesis outline
› self-evaluation
› your goal?
› main problems?
› done to date?
› which exam session?
› send ISP
› plan next 4 weeks?
› self-evaluation
Self-evaluation
› estimate and evaluate the time you spent on your master thesis
in the past 4 weeks
(much too little / too little / sufficient / more than enough / more than expected)
› estimate and evaluate the progress you made,
in relation to your targets
(far behind schedule / behind schedule / well on track / ahead of schedule / far ahead of schedule)
› send your timesheet
What time investment does
the university expect?
THE DATA
› 1 year = 1500 to 1800 hours = 60 study points
› academic year until June = 36 weeks
› Master of Engineering in Computer Science:
thesis = 24 study points
› Master in Toegepaste Informatica:
thesis = 18 study points
THE MATH
› Master of Engineering (24 study points)
between 600 and 720 hours of work for master thesis
corresponds to an average of 17 to 20 hours per week
but there are also exams in January, and vacations ...
so in practice >>> 20 hours per week
› Master Toegepaste Informatica (18 study points)
between 450 and 540 hours of work for master thesis
corresponds to an average of 12.5 to 15 hours per week
but there are also exams in January, and vacations ...
so in practice >>> 15 hours per week
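The arithmetic behind these numbers can be reproduced as a small worked example. It uses only the figures from the slides (1500 to 1800 hours per 60 study points, 36 weeks until June); the function name `thesis_workload` is an illustrative choice:

```python
HOURS_PER_YEAR = (1500, 1800)  # workload range for a full academic year
POINTS_PER_YEAR = 60           # study points in a full academic year
WEEKS_UNTIL_JUNE = 36          # weeks in the academic year until June


def thesis_workload(study_points):
    """Return ((min_hours, max_hours), (min_per_week, max_per_week))."""
    hours = tuple(h * study_points / POINTS_PER_YEAR for h in HOURS_PER_YEAR)
    per_week = tuple(h / WEEKS_UNTIL_JUNE for h in hours)
    return hours, per_week
```

For 24 study points this gives 600 to 720 hours, or about 16.7 to 20 hours per week; for 18 study points it gives 450 to 540 hours, or 12.5 to 15 hours per week. These averages spread the work over all 36 weeks, so exam periods and vacations push the weekly load in the remaining weeks higher.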
How to track your time investment
› the Faculty provides a simple timesheet
granularity is one week
you can refine this if you like
› we expect you to use the spreadsheet, and to send it every month
Modalities Master Thesis - Texts
› information on the website of the Department of Computer Science
via studenten > master … > masterproef > richtlijnen
› your text is very important, it is the basis for your grading by readers
& can lead to great awards !
start in time !
make sure you have someone who will proofread your text
(your mentor looks at the content and is not your spell checker)
› language
English master: everything in English!!
Dutch master: text in Dutch or in English
› correct citation is very important !!
plagiarism, even if not intended, is taken very seriously
automatic screening of all theses
Access to articles, journals, etc.
› limo.libis.be
› Springerlink, IEEE Xplore, CiteSeer, …
with KU Leuven license within KU Leuven networks
from home using EZProxy: https://admin.kuleuven.be/icts/english/proxy
Sessions organized by the Faculty
› three master’s thesis workshops to support you
Information Literacy: October 8, 14, 15 & 17
Academic Writing & Intellectual Integrity and Plagiarism: November 5 & 7
› more info at
https://eng.kuleuven.be/studeren/masterproef-en-papers#workshops
› attendance for DistriNet students is mandatory
› you have to register, you will receive an invitation by email
Submission deadlines
› Friday January 10, 2020
› Friday June 5, 2020
› Monday August 17, 2020
› http://eng.kuleuven.be/studenten/masterproef/masterproef/deadlines
Master thesis defense
› when: a few days before deliberation
› what: presentation, answering questions and a demo
› grading
look at objectives of a master thesis (slide 28)
and you will understand how you will be graded !
the assessment grid can be found online
criteria used: see next slide
Elements that determine grades
› independently execute an extensive project
› large dose of creativity and sense of initiative
› quality of work
› report both orally and in writing
› critical view on delivered work and conclusions
DistriNet
30+ research projects
8 professors
9 research managers
2 business office
10 postdocs
50+ PhD students
80+ people
100 + industry collaborations
Distributed Software
Secure Software
Software Engineering
DistriNet in a Nutshell
30+ years track record
1984
6 spin-off companies
OUR MISSION:
to deliver world class research
in three (partially overlapping) domains:
Software Engineering, Distributed Systems and Secure Software
and to have a constructive impact on society
by delivering applications
and by transferring know how and technology
DistriNet Faculty Members
Bart De Decker
• Anonymity
• Pseudonymity
• Privacy Protection
Bart Jacobs
• Software Verification
Danny Hughes
• IoT
• Middleware
• Low-level Networking
Eric Steegmans
• Model-Driven Software Engineering
• Early stages of the Software Life Cycle
Frank Piessens
• Security Architectures
• End-to-end Security
• ..
Tom Holvoet
• UAVs (Drones)
• AGVs
• Smart Power Grids
• Logistics
Wouter Joosen
• Cloud Computing Platforms
• Software Architecture
• Software Security
Yolande Berbers
• Mobile Cloud Applications
• Context-Aware Computing
DistriNet Research Experts & Managers
Bert Lagaisse
• cloud platforms
• authorization & audit
• big data middleware
Lieven Desmet
• web security
• security analytics
• security infrastructure
Dimitri Van Landuyt
• software engineering
• cloud computing
Eddy Truyen
• adaptation
• cloud computing
• service oriented systems
Davy Preuveneers
• mobile computing
• authentication
• identity
• authorization
• analytics
Sam Michiels
• embedded systems
• IoT, CPS
• SDN
• telecom
Jan Tobias Mühlberg
• formal software verification
• embedded systems design
Koen Yskout
• security by design
Pieter Philippaerts
• software engineering
• high quality and readable code
• architectural design
QUESTIONS?
Time for Lunch!