Master thesis in research group DistriNet

Distributed Software & Software Security

Kick-off event - September 25, 2019

Let us start with some examples ...

what are you (and we!) aiming for?

Sequence-based Intrusion Detection with Recurrent Neural Networks

Jin Li

Vera Rimmer, Ilias Tsingenopoulos

Wouter Joosen & Davy Preuveneers

Revised work submitted to NDSS 2020 (San Diego, USA)

[Figure: a network flow (Packet 0, Packet 1, …, Packet T) is manually reduced to a feature vector of statistical flow information (Protocol, Flow Duration, Total Fwd Packets, Total Backward Packets, Total Length of Fwd Packets, Total Length of Bwd Packets, Flow Bytes/s, Flow Packets/s, …), which the IDS classifies as Benign or Attack 1 … Attack N]

› Recognize known network attacks

› Identify anomalous network traffic

› Evolving network threat landscape

Network intrusion detection systems
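The manually extracted flow statistics named on this slide can be illustrated with a minimal sketch. The `Packet` record, its fields, the use of seconds as the time unit and the function name `flow_features` are assumptions made for illustration only; they are not taken from the thesis or from any particular feature-extraction tool, and only a subset of the listed features is computed.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical packet record; field names are illustrative."""
    timestamp: float  # arrival time in seconds (assumed unit)
    length: int       # packet size in bytes
    forward: bool     # True if sent from flow initiator to responder


def flow_features(packets):
    """Compute a few flow-level statistics from time-ordered packets."""
    duration = packets[-1].timestamp - packets[0].timestamp
    fwd = [p for p in packets if p.forward]
    bwd = [p for p in packets if not p.forward]
    total_bytes = sum(p.length for p in packets)
    return {
        "Flow Duration": duration,
        "Total Fwd Packets": len(fwd),
        "Total Backward Packets": len(bwd),
        "Total Length of Fwd Packets": sum(p.length for p in fwd),
        "Total Length of Bwd Packets": sum(p.length for p in bwd),
        # Rates are undefined for zero-duration flows; report 0.0 then.
        "Flow Bytes/s": total_bytes / duration if duration > 0 else 0.0,
        "Flow Packets/s": len(packets) / duration if duration > 0 else 0.0,
    }


example_flow = [
    Packet(timestamp=0.0, length=100, forward=True),
    Packet(timestamp=0.5, length=1500, forward=False),
    Packet(timestamp=2.0, length=400, forward=True),
]
features = flow_features(example_flow)
```

A classical IDS would pass such a dictionary (as a fixed-length vector) to a classifier; the sequence-based approach on the following slides skips this hand-written step.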

[Figure: the packets of a flow (Packet 0, Packet 1, …, Packet t) are fed to the IDS as sequential input and classified as Benign or Attack 1 … Attack N]

Sequence-based intrusion detection

› No manually defined high-level statistical features as input!

› Automatically extract features from the original packet sequence using an RNN

› Achieve performance comparable to the state of the art

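[Figure: network architecture with an input layer (a batch of packet sequences Packet 1, Packet 2, …, Packet t), a masking layer, LSTM layer(s) (LSTM layer 1, LSTM layer 2, …, LSTM layer n, built from LSTM units and producing automatically extracted features), dense layer(s) and an output layer]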

Recurrent neural networks
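The figure names a masking layer in front of the LSTM layers. A plausible reason, stated here as an assumption rather than as the authors' design, is that flows in one batch contain different numbers of packets, so shorter sequences are padded and the padding must be ignored. The sketch below shows that idea in plain Python; `pad_and_mask` and the two-value packet encoding are hypothetical.

```python
def pad_and_mask(sequences, pad_value=0.0):
    """Pad variable-length packet sequences to a common length.

    Returns the padded batch and a boolean mask that is True for real
    packets and False for padding, so a recurrent layer can skip the
    padded time steps.
    """
    max_len = max(len(seq) for seq in sequences)
    width = len(sequences[0][0])  # number of values per packet
    padded, mask = [], []
    for seq in sequences:
        missing = max_len - len(seq)
        padded.append([list(packet) for packet in seq]
                      + [[pad_value] * width for _ in range(missing)])
        mask.append([True] * len(seq) + [False] * missing)
    return padded, mask


# Two flows of different length; each packet is [length, direction].
flows = [
    [[60.0, 1.0], [1500.0, 0.0], [40.0, 1.0]],
    [[60.0, 1.0]],
]
padded, mask = pad_and_mask(flows)
```

Deep-learning frameworks usually provide this as a built-in layer or argument; the sketch only makes the mechanism visible.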

Results

› Stratified sampling: 80% training set / 20% testing set

› Compare with the state-of-the-art results evaluated on the CICIDS2017 dataset [1]

Algorithm            Precision  Recall  F1 score
KNN                  0.96       0.96    0.96
Random Forest        0.98       0.97    0.97
Decision Tree (ID3)  0.98       0.98    0.98
Adaboost             0.77       0.84    0.77
MLP                  0.77       0.83    0.76
Naïve-Bayes          0.88       0.04    0.04
QDA                  0.97       0.88    0.92
Bi-LSTM              0.984      0.982   0.983

[1] Sharafaldin, I., Lashkari, A. H., & Ghorbani, A. A. (2018, January). Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization. In ICISSP (pp. 108-116).
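The 80% / 20% stratified sampling mentioned on this slide can be sketched as follows. This is a minimal standard-library illustration, not the code used for the reported results; the function name, the seed and the toy label distribution are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices so every class keeps the same train/test ratio."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy, imbalanced label set (hypothetical class sizes).
labels = ["Benign"] * 80 + ["Attack 1"] * 15 + ["Attack N"] * 5
train_idx, test_idx = stratified_split(labels)
test_counts = {c: sum(labels[i] == c for i in test_idx) for c in set(labels)}
```

Splitting per class matters for intrusion data because rare attack classes could otherwise end up entirely in the training set or entirely in the test set.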

Dynamic and cost-efficient resource allocation of multi-tenant workloads in Kubernetes

Abel Rodríguez Romero

Wouter Joosen & Eddy Truyen

Leveraging k8-resource-optimizer for cost-efficient auto-scaling

Optimal resource allocation is very complex

› Minimize costs

› Comply with SLAs

› Dynamic demands

Elasticity = scalability + automation + optimization

› Optimization of fine-grained parameter spaces (e.g. milliCPUs, megabytes) is not feasible

› Non-linear scaling effects in homogeneous horizontal scaling

› Our proposal: address container elasticity using coarse-grained heterogeneous horizontal scaling

Achieving optimal allocation for a discretized spectrum of numbers of tenants

Auto-scaler prototype for Kubernetes

Expressive Feature-oriented Multicast for the Internet of Things

Jonathan Oostvogels

Stefanos Peros, Jan Tobias Mühlberg, Koen Yskout

Danny Hughes & Sam Michiels

Feature-oriented routing by example

› mesh network

› smart energy-harvesting cameras

› track a wandering animal and report to the back-end

› intelligently activate cameras near the animal

› how to efficiently route the actuation command?

Feature-oriented multicast

› route messages based on feature constraints

› in the IoT

interest in data-centric routing

avoid explicit resource discovery [1]

natural fit with e.g. complex event detection [2]

› repeated unicast? broadcast? multicast?

multicast only replicates messages as needed

but requires storing and communicating feature information

while facing memory and radio constraints
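The trade-off between repeated unicast, broadcast and multicast can be made concrete with a toy model. The sketch below is not SMRFET or featurecast: it assumes a fixed routing tree, counts one transmission per tree edge (ignoring that a single radio broadcast may reach several neighbours), and assumes every node knows which features exist in its subtree. All node names and features are hypothetical.

```python
# Routing tree rooted at the gateway (hypothetical topology).
children = {
    "gateway": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
    "a1": [], "a2": [], "b1": [],
}

# Features advertised by each node (hypothetical).
features = {
    "gateway": set(),
    "a": {"relay"},
    "b": {"relay"},
    "a1": {"camera", "zone-north"},
    "a2": {"camera", "zone-north"},
    "b1": {"camera", "zone-south"},
}


def matches(node, constraint):
    """A node is addressed if it has every feature in the constraint."""
    return constraint <= features[node]


def subtree_has_match(node, constraint):
    return matches(node, constraint) or any(
        subtree_has_match(child, constraint) for child in children[node])


def multicast_transmissions(node, constraint):
    """Forward only into subtrees that contain an addressed node."""
    count = 0
    for child in children[node]:
        if subtree_has_match(child, constraint):
            count += 1 + multicast_transmissions(child, constraint)
    return count


def unicast_transmissions(node, constraint, depth=0):
    """One separate root-to-node message per addressed node."""
    own = depth if matches(node, constraint) else 0
    return own + sum(unicast_transmissions(child, constraint, depth + 1)
                     for child in children[node])


def broadcast_transmissions(node):
    """Flood the whole tree regardless of the constraint."""
    return sum(1 + broadcast_transmissions(child) for child in children[node])


constraint = {"camera", "zone-north"}
unicast = unicast_transmissions("gateway", constraint)
broadcast = broadcast_transmissions("gateway")
multicast = multicast_transmissions("gateway", constraint)
```

In this toy tree multicast needs the fewest transmissions, but only because each node stores feature information about its subtree, which is exactly the memory and communication cost the slide points to.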

State of the art

› geographic routing (e.g. GPSR, GMR)

does not generalise

› publish-subscribe approaches

directed diffusion, Akkermans et al.

requires overlay structure before feature-oriented path is used

› featurecast

integrates feature-oriented routing in IoT network layer

flexible, no additional per-path overlay

promising performance evaluation

Conclusion

› SMRFET: expressive feature-oriented multicast

› feasible in constrained networks

compact implementation

small radio time overhead relative to regular multicast

outperforms broadcast, TM (radio time), repeated unicast (load distribution)

routing tables require small amounts of memory, graceful degradation

› caveats

packet loss

expressivity-efficiency trade-off (multidimensional addressing)

Fishy Faces: Crafting Adversarial Images to Poison Face Authentication

Giuseppe Garofalo

Vera Rimmer, Tim Van hamme

Wouter Joosen & Davy Preuveneers

WOOT 2018, August 13-14 (Baltimore, MD, USA)

Face authentication

› wide adoption of face recognition in mobile devices

› face authentication is a highly security-sensitive application

› several attacks have been proposed (e.g. replay attacks [1], Bkav’s mask [2], etc.)

[1] Face Anti-spoofing, Face Presentation Attack Detection

[2] Bkav’s new mask beats Face ID in "twin way": Severity level raised, do not use Face ID in business transactions.

System design

› our target authenticator is composed of two parts:

feature extractor

classification model

Input image

System design

› feature extractor

OpenFace library

based on Google’s FaceNet [1] (Convolutional Neural Network)

• face detection

• pre-processing

• feature extraction

[Figure: input image → feature extraction]

[1] Schroff, F., Kalenichenko, D., and Philbin, J. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE conference on computer vision and pattern recognition (2015), pp. 815–823.

System design

› One-Class SVM for classification [1]

Trained only on images of the user

Takes a hyper-parameter which defines the upper bound on the fraction of training errors

[Figure: input image → feature extraction → One-Class SVM]

[1] Inspired by: Gadaleta, M., and Rossi, M. IDNet: Smartphone-based gait recognition with convolutional neural networks. Pattern Recognition 74 (2018), 25–37.
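The role of the hyper-parameter that bounds the fraction of training errors can be illustrated without an SVM. The sketch below deliberately swaps in a much simpler one-class model (distance to the centroid of the user's embeddings, with a threshold chosen so that at most a fraction `nu` of the training samples is rejected). It is a stand-in for explanation only, not the One-Class SVM of the presented system; the 2-D embeddings, `fit_one_class` and `authenticate` are hypothetical.

```python
import math


def fit_one_class(embeddings, nu=0.1):
    """Fit a centroid-distance one-class model on the user's own samples.

    At most floor(nu * n) training samples fall outside the threshold,
    which mirrors the 'upper bound on training errors' idea.
    """
    dim = len(embeddings[0])
    centroid = [sum(v[i] for v in embeddings) / len(embeddings)
                for i in range(dim)]
    distances = sorted(math.dist(v, centroid) for v in embeddings)
    allowed_errors = math.floor(nu * len(embeddings))
    threshold = distances[len(distances) - 1 - allowed_errors]
    return centroid, threshold


def authenticate(embedding, model):
    """Accept the sample if it lies within the learned threshold."""
    centroid, threshold = model
    return math.dist(embedding, centroid) <= threshold


# Hypothetical 2-D face embeddings of the legitimate user (one outlier).
user_embeddings = [
    (0.0, 0.1), (0.1, 0.0), (-0.1, 0.0), (0.0, -0.1), (0.1, 0.1),
    (-0.1, -0.1), (0.05, 0.0), (0.0, 0.05), (-0.05, 0.0), (0.6, 0.6),
]
model = fit_one_class(user_embeddings, nu=0.1)
rejected_training = sum(not authenticate(v, model) for v in user_embeddings)
```

Because such a model is trained only on the user's own images, adding crafted images to the training set moves the centroid and the threshold, which is the kind of weakness a poisoning attack exploits.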

System design

› once trained, the model is used to authenticate the user

[Figure: input image → feature extraction → One-Class SVM → authentication]

Methodology - Attack example

› after the injection, the classification error rises from 4% to 44% (an increase of 40 percentage points!)

[Figure: injected image; false positive (unauthorised user), false negative (authorised user)]

The Procrastinator

TED Talk by Tim Urban

https://www.ted.com/talks/tim_urban_inside_the_mind_of_a_master_procrastinator?language=nl

“The trick, is to start the day in control.

When you’re able to start doing stuff and

show yourself that you can get started,

then you’re able to empower yourself.”

Tim Urban https://www.unstuck.com/advice/how-a-lifelong-procrastinator-cracked-the/

The Roadmap

The objectives of a master thesis

› perform a major project in an independent way

› participate in research

› report both orally and in writing

› create a large dose of self-reliance, creativity and initiative

develop a work plan independently and realize it

time management is very important!

28

6 aspects

whose importance in every master's thesis may vary

› a literature study

› an analysis of a problem/formulate a research question

› the formulation of a solution

› a design

› an implementation

› an evaluation

Who does what?

YOU have the responsibility

your MENTOR is your main contact

your SUPERVISOR regularly discusses your progress with your mentor

your ASSESSOR evaluates your completed work

RESEARCHERS discuss the progress of your thesis during workshops

direct contact may be limited

stay in touch!

Your Milestones

KICK OFF MEETING (September, today)

FIRST PRESENTATION (end November)

FIRST TEXT (end December)

SECOND PRESENTATION (around March)

CHOOSE A FINAL TITLE (April)

PROTOTYPE (after Easter)

FULL TEXT (early June)

[Timeline legend: a major milestone / intermediate station]

your text needs continuous care!

Monthly evaluations

› according to plan?

› plan next 4 weeks?

› self-evaluation

› detailed workplan

› self-evaluation

› first piece of text

› progress vs. workplan?

› self-evaluation

› progress vs. workplan?

› self-evaluation

› progress vs. workplan?

› describe demo

› major milestone Easter?

› self-evaluation

› key conclusions?

› refine thesis outline

› self-evaluation

› your goal?

› main problems?

› done to date?

› which exam session?

› send ISP

› plan next 4 weeks?

› self-evaluation

Self-evaluation

› estimate and evaluate the time you spent on your master thesis in the past 4 weeks

(much too little / too little / sufficient / more than enough / more than expected)

› estimate and evaluate the progress you made, in relation to your targets

(far behind schedule / behind schedule / well on track / ahead of schedule / far ahead of schedule)

› send your timesheet

What time investment does the university expect?

THE DATA

› 1 year = 1500 to 1800 hours = 60 study points

› academic year until June = 36 weeks

› Master of Engineering in Computer Science: thesis = 24 study points

› Master in Toegepaste Informatica: thesis = 18 study points

THE MATH

› Master of Engineering (24 study points)

between 600 and 720 hours of work for master thesis

corresponds to an average of 17 to 20 hours per week

but there are also exams in January, and vacations ...

so in practice >>> 20 hours per week

› Master Toegepaste Informatica (18 study points)

between 450 and 540 hours of work for master thesis

corresponds to an average of 12.5 to 15 hours per week

but there are also exams in January, and vacations ...

so in practice >>> 15 hours per week
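The arithmetic on this slide follows directly from the figures under THE DATA (1500 to 1800 hours per 60 study points, 36 weeks until June) and can be re-derived with a short sketch. The function and constant names are illustrative.

```python
HOURS_PER_YEAR = (1500, 1800)  # lower and upper bound for 60 study points
POINTS_PER_YEAR = 60
WEEKS_UNTIL_JUNE = 36


def thesis_workload(study_points):
    """Return total hours and average hours per week as (low, high) pairs."""
    low, high = (hours * study_points / POINTS_PER_YEAR
                 for hours in HOURS_PER_YEAR)
    return {
        "total_hours": (low, high),
        "hours_per_week": (low / WEEKS_UNTIL_JUNE, high / WEEKS_UNTIL_JUNE),
    }


engineering = thesis_workload(24)           # Master of Engineering
toegepaste_informatica = thesis_workload(18)  # Master Toegepaste Informatica
```

These averages spread the work evenly over all 36 weeks; since exam periods and vacations remove several of those weeks, the real weekly load during working weeks is higher.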

How to track your time investment

› the Faculty provides a simple timesheet

granularity is one week

you can refine this if you like

› we expect you to use the spreadsheet, and to send it every month

Modalities Master Thesis - Texts

› information on the website of the Department of Computer Science

via studenten > master … > masterproef > richtlijnen

› your text is very important: it is the basis for your grading by the readers & can lead to great awards!

start in time!

make sure you have someone who will proofread your text

(your mentor looks at the content and is not your spell checker)

› language

English master: everything in English!!

Dutch master: text in Dutch or in English

› correct citation is very important !!

plagiarism, even if not intended, is taken very seriously

automatic screening of all theses

Access to articles, journals, etc.

› limo.libis.be

› Springerlink, IEEE Xplore, CiteSeer, …

with KU Leuven license within KU Leuven networks

from home using EZProxy: https://admin.kuleuven.be/icts/english/proxy

Sessions organized by the Faculty

› three master’s thesis workshops to support you

Information Literacy: October 8, 14, 15 & 17

Academic Writing & Intellectual Integrity and Plagiarism: November 5 & 7

› more info at

https://eng.kuleuven.be/studeren/masterproef-en-papers#workshops

› attendance for DistriNet students is mandatory

› you have to register, you will receive an invitation by email

Submission deadlines

› Friday January 10, 2020

› Friday June 5, 2020

› Monday August 17, 2020

› http://eng.kuleuven.be/studenten/masterproef/masterproef/deadlines

Master thesis defense

› when: a few days before deliberation

› what: presentation, answering questions and a demo

› grading

look at objectives of a master thesis (slide 28)

and you will understand how you will be graded !

the assessment grid can be found online

criteria used: see next slide

Elements that determine grades

› independently execute an extensive project

› large dose of creativity and sense of initiative

› quality of work

› report both orally and in writing

› critical view on delivered work and conclusions

DistriNet in a Nutshell

› 30+ years track record (1984)

› 80+ people

› 8 professors

› 9 research managers

› 2 business office

› 10 postdocs

› 50+ PhD students

› 30+ research projects

› 100+ industry collaborations

› 6 spin-off companies

› Distributed Software, Secure Software, Software Engineering

OUR MISSION:

to deliver world-class research in three (partially overlapping) domains: Software Engineering, Distributed Systems and Secure Software, and to have a constructive impact on society by delivering applications and by transferring know-how and technology

DistriNet Faculty Members

Bart De Decker

• Anonymity

• Pseudonymity

• Privacy Protection

Bart Jacobs

• Software Verification

Danny Hughes

• IoT

• Middleware

• Low-level Networking

Eric Steegmans

• Model Driven Software Engineering

• early stages of the Software Life Cycle

Frank Piessens

• Security Architectures

• End-to-end Security

• ..

Tom Holvoet

• UAVs (Drones)

• AGVs

• Smart Power Grids

• Logistics

Wouter Joosen

• Cloud Computing Platforms

• Software Architecture

• Software Security

Yolande Berbers

• Mobile Cloud Applications

• Context-Aware Computing

DistriNet Research Experts & Managers

Bert Lagaisse

• cloud platforms

• authorization & audit

• big data middleware

Lieven Desmet

• web security

• security analytics

• security infrastructure

Dimitri Van Landuyt

• software engineering

• cloud computing

Eddy Truyen

• adaptation

• cloud computing

• service oriented systems

Davy Preuveneers

• mobile computing

• authentication

• identity

• authorization

• analytics

Sam Michiels

• embedded systems

• IoT, CPS

• SDN

• telecom

Jan Tobias Mühlberg

• formal software verification

• embedded systems design

Koen Yskout

• security by design

Pieter Philippaerts

• software engineering

• high quality and readable code

• architectural design

QUESTIONS?

Time for Lunch!