Date post: | 16-Apr-2017 |
Category: | Software |
Upload: | andrzej-wasowski |
What Is Software Engineering Research Good For?
Andrzej Wasowski
@AndrzejWasowski
PROCESS AND SYSTEM MODELS GROUP
pyrrhocoris apterus (firebug)
© Andrzej Wasowski, IT University of Copenhagen
A BETTER QUESTION
What is interesting SE research according to Andrzej?
A hammer lurking behind the question: modeling, languages, semantics, analysis
What relevant SE questions can be addressed by defining languages & analyzing models/programs?
AGENDA
Correctness of Software (bug finding)
Software Engineering is Codified Knowledge (online privacy)
Legacy Systems (Software Modernization)
What is the Linux Kernel?
Incredibly versatile operating system
GNU/Linux runs supercomputers and internet servers
Android: phones, tablets, smart TVs, etc.
Routers, storage servers, entertainment systems, robots, IoT devices, ...
Cloud infrastructure
68-98% of web servers run on Linux
$0.5M/Y platinum membership fee
The most popular OS kernel on the planet!
Sources: Gartner, https://en.wikipedia.org/wiki/Usage_share_of_operating_systems and https://techcrunch.com/2016/11/16/microsoft-joins-the-linux-foundation/
Linux Kernel is very large
The source code has 700 million characters, 21 million lines of code (quick measurements on the Raspberry Pi version of Linux)
Boeing 747 has 6 million mechanical parts, half of them simple fasteners
Are humans able to understand the entire kernel?
Linux Kernel Moves Fast
• 4000 programmers from 440 companies contributed to the kernel (approximate numbers from 2015 only)
• 10,800 lines of code added, 5,300 removed, 1,875 modified. Every. Single. Day. (on average)
• Over 8 changes per hour
• Is any human able to comprehend this evolution speed?
• Incidentally, this makes it impossible to verify with current state of the art
  • Nobody has access to the hardware on which others work
  • Each of them potentially breaks things for others
Linus has power to say no. And not much more...
• Linus Torvalds
  • Creator of a free kernel project in 1991
  • Today a benevolent dictator
  • Coordinates the kernel with a handful of lieutenants
• Can block developments
• Hardly has power to give the project a consistent direction
• Project is not managed in the usual sense
Success from the outside
Software engineering challenge from the inside
Very Large
Very Complex
Moving Very FAST
Essentially NOT MANAGED
A fascinating object for software engineering studies
Jewels bound to be found ...
Problems bound to appear ...
Let’s look closely at a bug

void ath10k_htt_rx_msdu_buff_replenish (struct ath10k_htt *htt) {
    spin_lock_bh(&htt->rx_ring.lock);                  // 6. DEADLOCK
    // ...
    spin_unlock_bh(&htt->rx_ring.lock);
}
void ath10k_htt_rx_in_ord_ind (struct ath10k *ar, struct sk_buff *skb) {
    // ...
    ath10k_htt_rx_msdu_buff_replenish(&ar->htt);       // 5. CALL
}
void ath10k_htt_txrx_compl_task (unsigned long ptr) {  // 1. ENTRY POINT
    struct ath10k *ar = (struct ath10k *)ptr;          // 2. CAST
    // ...
    while ((skb = __skb_dequeue(&rx_ind_q))) {
        spin_lock_bh(&ar->htt.rx_ring.lock);           // 3. LOCK
        ath10k_htt_rx_in_ord_ind(ar, skb);             // 4. CALL
        spin_unlock_bh(&ar->htt.rx_ring.lock);
        dev_kfree_skb_any(skb);
    }
    // ...
}

What makes this bug hard to find:
Domain knowledge (tasklets, bottom halves, locks)
Inter-procedural flow
Pointers nested in complex structs
Casts, function pointers
No specifications, no tests

[Figure: effect abstraction of the code above, a while(*) loop over lock ρ and unlock ρ effects on the same lock region ρ]

Iago Abal, Claus Brabrand, Andrzej Wasowski. Effective Bug Finding in C Programs with Shape and Effect Abstractions. VMCAI’17
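The double lock can be checked by hand once one loop iteration is flattened into a trace of lock and unlock effects on the same lock region ρ: the bug is a lock issued while ρ is already held. The following is only a minimal sketch of that idea, not the EBA implementation. The names `effect`, `find_double_lock`, `buggy_iteration` and `repaired_iteration` are hypothetical, and both traces are written by hand rather than inferred from the code.

```c
#include <assert.h>
#include <stddef.h>

/* Abstract effects on a single lock region (rho); hypothetical names. */
enum effect { EFF_LOCK, EFF_UNLOCK };

/*
 * Scan a flattened effect trace and return the index of the first lock
 * that is issued while the region is already held, or -1 if there is none.
 */
int find_double_lock(const enum effect *trace, size_t n)
{
    int held = 0;  /* 1 while the region is locked */

    assert(trace != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (trace[i] == EFF_LOCK) {
            if (held)
                return (int)i;  /* second lock with no unlock in between */
            held = 1;
        } else {
            held = 0;
        }
    }
    return -1;
}

/* One loop iteration of the entry point with the callee effects inlined by
 * hand: lock (step 3), lock (step 6), then the two unlocks. */
static const enum effect buggy_iteration[] = {
    EFF_LOCK, EFF_LOCK, EFF_UNLOCK, EFF_UNLOCK
};

/* An assumed repaired ordering, where the caller releases before the call. */
static const enum effect repaired_iteration[] = {
    EFF_LOCK, EFF_UNLOCK, EFF_LOCK, EFF_UNLOCK
};
```

On `buggy_iteration` the scan stops at index 1, the lock taken inside the callee; on `repaired_iteration` it reports no double lock.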
Inference of Shapes & Effects
Formalized and implemented for the entire C language
Including specifications of selected kernel functions
[Figure: example specification of a kernel function]
Does this work?
Software engineering research method strikes back
We have proven no theorems
Nine thousand files in drivers analyzed (you do get dirty!)
Dozen reports for 9K lines, not a lot of noise
Still a lot of work to filter out false positives (you get dirty!)
You talk to devs: they want you to fix bugs! (you may get dirty!)
Dozen new bugs confirmed and 5 fixed in the Linux kernel projects (some in the main tree already)
[recall] On 26 random historical double lock bugs, EBA finds 22; much more than competing tools (≤ 12), despite negative bias
http://eba.wikit.itu.dk/
Iago Abal, Claus Brabrand, Andrzej Wasowski. Effective Bug Finding in C Programs with Shape and Effect Abstractions. VMCAI’17
Ariane V (1996)
A floating point cast bug,
A decade of development,
$7B development budget,
$0.5B lost rocket & cargo,
but ...
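The 1996 failure is commonly attributed to an unprotected conversion of a 64-bit floating-point value to a 16-bit signed integer in the inertial reference software, which was written in Ada. As an illustration only, a guarded version of such a cast might look as follows in C. The function name `to_int16_checked` is hypothetical and nothing here is taken from the flight code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert a 64-bit float to a signed 16-bit integer only when it fits.
 * Returns false (and leaves *out untouched) for NaN and for any value
 * outside [INT16_MIN, INT16_MAX]; otherwise truncates toward zero.
 */
bool to_int16_checked(double value, int16_t *out)
{
    assert(out != NULL);

    /* Written as a negated range test so that NaN is rejected too:
     * every comparison involving NaN is false. */
    if (!(value >= (double)INT16_MIN && value <= (double)INT16_MAX))
        return false;

    *out = (int16_t)value;
    return true;
}
```

A caller has to decide what to do when the function returns false. That decision is the one the unguarded cast leaves unmade.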
Ariane V (2013)
• 89 launches since 1996
• 3 crashes since 1996
• Only the first linked to a software bug (so is HW really more reliable?)
• Last 75 launches with no incidents
• Most recent launch: Nov 17, 2016. Have you heard about it?
• They never show you this slide ...

1.27 fatalities per 100 million miles
including human failures
0.76 fatalities per 100 million miles
0.03 fatalities per 100 million miles
including human errors
If we are doing so well, why are we still SO OBSESSED with correctness?

Diversity of domains:
consumer electronics
automotive
industry automation
business software
data analytics

Let’s look into one domain
Online Privacy and Data Analytics

smartphone owner
• I cannot do much about this, as an SE researcher
• My hammer is not good enough
• Others work with education, awareness, regulation, politics, alternative business models

software developer
• Architectural principles protecting personal data
• Detect libraries used
• Detect information flow to the library vendor
• Warn the developer of bad practice (like the security scanners do for security)
• Help programmers and companies conform to GDPR

Lots of potentially interesting work
Seeking thesis students

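The developer-side ideas above (detect libraries used, warn about bad practice) can be sketched as a tiny static check. This is only an illustration: the watch list `TRACKING_LIBRARIES` and the library names in it are hypothetical, and a real tool would also have to analyze information flow to the vendor, which this sketch does not attempt.

```python
import ast

# Hypothetical names of third-party libraries assumed to send data to their vendor.
TRACKING_LIBRARIES = {"adtracker_sdk", "analytics_beacon"}

def imported_modules(source: str) -> set:
    """Return the top-level module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import, so the name is a real package.
            modules.add(node.module.split(".")[0])
    return modules

def privacy_warnings(source: str) -> list:
    """List one warning for each imported library found on the watch list."""
    flagged = sorted(imported_modules(source) & TRACKING_LIBRARIES)
    return [f"warning: '{name}' may send personal data to its vendor" for name in flagged]

EXAMPLE_SOURCE = """
import json
import adtracker_sdk.events
from analytics_beacon import send
"""

for line in privacy_warnings(EXAMPLE_SOURCE):
    print(line)
```

Working on the syntax tree rather than on raw text avoids false alarms from comments and strings, in the same spirit as the security scanners mentioned on the slide.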
caring parent
• 3000+ schools in Poland use the system (data from 2014)
• The database easily tracks over half a million data subjects
• Not only grades
• Communication with parents, conduct, illness, etc.
• This dataset is bound to grow fast
• Cannot help much. We need education, regulation, governance, etc.

• High value in the big (personal) data
• Companies want to extract the value
• But how to store and process these data respectfully?
• How to anonymize the data?
• Which anonymization method to use? How to configure it? How to test whether we use it correctly?
• Help programmers and companies to conform to GDPR

Differential Privacy
An example from a software engineering perspective

[Diagram: data sets D1 and D2 are each fed to the same mechanism K; S is a set of possible results; P1 = Pr[K(D1) ∈ S] and P2 = Pr[K(D2) ∈ S]]

$$\frac{P_1}{P_2} \le e^{\varepsilon}$$

For any such set of results S and for any neighbouring sets D1 and D2

Can we expect an average programmer to implement this?
If so, then how to test this?
Can we implement reusable differential privacy components, similar to encryption libraries?

Cynthia Dwork. Differential privacy. ICALP 2006

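One common way to satisfy the inequality above for a counting query is the Laplace mechanism, which adds noise of scale 1/ε to the true count. The sketch below is an illustration under stated assumptions, not a vetted privacy component: all function names and numbers are hypothetical. Because the Laplace distribution has a closed form, the bound can be checked exactly for interval-shaped sets S, which hints at one possible answer to the testing question.

```python
import math
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace distribution centred at 0."""
    # The difference of two independent exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Mechanism K: a noisy count. Neighbouring data sets change a count by at most 1."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1 / epsilon, rng)

def laplace_cdf(x: float, centre: float, scale: float) -> float:
    """Cumulative distribution function of Laplace(centre, scale)."""
    z = (x - centre) / scale
    return 0.5 * math.exp(z) if z < 0 else 1 - 0.5 * math.exp(-z)

def prob_in_interval(true_count: int, scale: float, low: float, high: float) -> float:
    """Exact Pr[K(D) in S] for S = [low, high], given the true count of D."""
    return laplace_cdf(high, true_count, scale) - laplace_cdf(low, true_count, scale)

def satisfies_bound(count_d1, count_d2, epsilon, scale, low, high) -> bool:
    """Check P1 <= e^epsilon * P2 and P2 <= e^epsilon * P1 for one interval S."""
    p1 = prob_in_interval(count_d1, scale, low, high)
    p2 = prob_in_interval(count_d2, scale, low, high)
    bound = math.exp(epsilon) * (1 + 1e-9)  # small tolerance for rounding
    return p1 <= bound * p2 and p2 <= bound * p1

EPSILON = 0.5
# Neighbouring data sets with true counts 3 and 4, checked on a few intervals S.
INTERVALS = [(2.5, 4.5), (-1.0, 3.0), (10.0, 20.0)]
correct_config = all(satisfies_bound(3, 4, EPSILON, 1 / EPSILON, lo, hi) for lo, hi in INTERVALS)
# A misconfigured mechanism that adds too little noise fails the same check.
wrong_config = all(satisfies_bound(3, 4, EPSILON, 0.5, lo, hi) for lo, hi in INTERVALS)
```

A check of this kind covers only the sets S it is given, so it can expose a misconfiguration but cannot prove that a mechanism is differentially private.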
Anonymization is difficult, e.g. we are able to re-identify a good number of students in the "anonymized" course evaluation data
How do we select the value of epsilon?
Does this notion of privacy at all capture what data subjects would expect?
This week running an experiment trying to understand this problem at ITU
Data: collected by WiFi access points at ITU, data subjects: students
RQ1: How to relate the noise (ε) to privacy concerns of the data subjects?
RQ2: Can data subjects inform design of data protection in a system?
Ultimately a handbook/blueprint for selection and use of anonymization technology

ε = 0 (maximum anonymity) ... ? ... ε = +∞ (maximum utility)

$$\Pr[K(D_1) \in S] \le e^{\varepsilon} \, \Pr[K(D_2) \in S]$$

Joint work with Mark Berthelsen, Gediminas Kucas, Tina Cecilie Schultz, Irina Shklovski

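To make the two ends of the ε axis concrete, assume (the slide does not name a mechanism) that Laplace noise is added to a query of sensitivity 1. The noise scale is then 1/ε, which is also the expected absolute error of the answer, so a small ε means heavy noise and a large ε means almost none. The function names below are hypothetical.

```python
import math

def laplace_scale(epsilon: float, sensitivity: float = 1.0) -> float:
    """Noise scale b = sensitivity / epsilon; b is also the expected absolute error."""
    if epsilon <= 0:
        return math.inf  # epsilon = 0: the answer is pure noise (maximum anonymity)
    return sensitivity / epsilon

def worst_case_ratio(epsilon: float) -> float:
    """Largest allowed ratio Pr[K(D1) in S] / Pr[K(D2) in S], that is e^epsilon."""
    return math.exp(epsilon)

# Walk along the axis from strong privacy to high utility.
for eps in (0.01, 0.1, 1.0, 10.0):
    print(f"epsilon={eps:<5} noise scale={laplace_scale(eps):8.2f} "
          f"ratio bound={worst_case_ratio(eps):10.2f}")
```

The table this prints only restates the trade-off in numbers. It does not say which point on the axis data subjects would find acceptable, which is the open question behind RQ1.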
Many more issues than correctness
Let’s look into: Aging Systems

(ancient religions and philosophies) (Nocturne E flat major, op. 55 no. 2)
(Gustav Klimt, Adele Bloch-Bauer) (Søren Kierkegaard)
Is LEGACY a MISNOMER for SOFTWARE?

An output of complex intellectual activity
Software is legacy like art, perhaps not a misnomer after all
https://www.technologyreview.com/s/508231/many-cars-have-a-hundred-million-lines-of-code | Juergen Dingel. Complexity is the Only Constant: Trends in Computing and Their Relevance to MDE. ICGT’16 | Michael Feathers. Working with Legacy Code

In Search of Lost Time by Marcel Proust
Proust actually died before finishing
14 years of Proust’s work
9.6 million characters
Many readers suffered long after

The Linux Kernel by Thousands of Engineers
25 years since 1991
700 million characters
An important intellectual contribution benefiting largely the entire society
Is Linux kernel an outlier?

1977 Oldsmobile Toronado by General Motors
(Likely) the first car with control software
An ECU controls the spark timing
We guess 1-3 K lines of code
By 1981 each GM car had 50 KLOC
A modern car has about 100 MLOC
Incidentally more than a Dreamliner (7 MLOC)
Still less than our brain ($10^{15}$ synapses, not $10^{6}$)
Software systems are not only key for our lifestyle, but also likely the most complex human creations ever

100K commits in Facebook. Every. Week.
Size of Facebook: 60 MLOC
Size of Google: 2 000 MLOC
Examples start to get boring

Software is important for us, a bit like art.
Imagine life without legacy systems (no banking, no credit cards, no railways, no airlines, no tax office)
But software is also different. Why does art age well, while software ages badly?
Software does not change a single bit
We change, our needs change, our engineers change

Complexity is the Only Constant (Jürgen Dingel)

Legacy Code is code we’re afraid to change
[James Shore]

Legacy Code is code without tests
[Michael Feathers]

Legacy Code is code without a caretaker
[yours truly]

Some Solutions for Software Modernization
Some methods for taming legacy code
Doing nothing
Replatforming
Virtualization
Re-architecting

An Example of a Modernization Project
Slide elements by Alexandru F. Iosif-Lazar

Can we trust a complex transformation?

Luckily the program was finite state
Verify that the input and output are functionally equivalent

[Diagram labels: legacy code; automatic and fast modernizing transformation; modernized code; symbolic execution (twice); behavior of modernized code; equivalence check using an SMT solver]

Impossible without others turning theory into engineering components
Semantics → symbolic executors
Grammar theory → transformation languages
Deductive systems → SMT solvers

100% correctness not a goal
It does not pay off
Identifying errors key

Slide elements by Alexandru F. Iosif-Lazar

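The verification idea above can be illustrated without a solver. Because the program is finite state, the sketch below simply enumerates every input instead of using symbolic execution and an SMT solver; that is a deliberate simplification, not the project's method. The functions `legacy_fee` and `modernized_fee` are invented for illustration.

```python
from itertools import product

def legacy_fee(age: int, is_member: bool) -> int:
    """Hypothetical legacy routine, written with nested conditionals."""
    if is_member:
        if age < 18:
            return 0
        else:
            return 5
    else:
        if age < 18:
            return 5
        else:
            return 10

def modernized_fee(age: int, is_member: bool) -> int:
    """Hypothetical output of a modernizing transformation."""
    return (0 if age < 18 else 5) + (0 if is_member else 5)

def faulty_fee(age: int, is_member: bool) -> int:
    """A deliberately wrong transformation result (off by one at age 18)."""
    return (0 if age <= 18 else 5) + (0 if is_member else 5)

def find_counterexample(f, g, domain):
    """Return the first input on which f and g disagree, or None if they agree everywhere."""
    for args in domain:
        if f(*args) != g(*args):
            return args
    return None

# The whole (finite) input space: ages 0..120 combined with both membership values.
FINITE_DOMAIN = list(product(range(0, 121), (False, True)))

print(find_counterexample(legacy_fee, modernized_fee, FINITE_DOMAIN))
print(find_counterexample(legacy_fee, faulty_fee, FINITE_DOMAIN))
```

The second call returns a concrete disagreeing input rather than a yes/no verdict, which matches the point on the slide that identifying errors is the key outcome.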
What is interesting in SE research according to AW?What is interesting in SE research according to AW?
Warning! You may get dirty
1.27 fatalities per 100 million miles
including human failures
0.76 fatalities per 100 million miles
0.03 fatalities per 100 million miles, including human errors
If we are doing so well, why are we still SO OBSESSED with correctness?
(ancient religions and philosophies) (Nocturne E flat major, op. 55 no. 2)
(Gustav Klimt, Adele Bloch-Bauer) (Søren Kierkegaard)
Is LEGACY a MISNOMER for SOFTWARE?
c© Andrzej Wasowski, IT University of Copenhagen 33