Reliability of Parallel Build Systems Derrick Coetzee, George Necula UC Berkeley Creative Commons Zero Waiver: To the extent possible under law, the author, Derrick Coetzee, waives all copyright and related or neighboring rights to this work.

Reliability of Parallel Build Systems

Derrick Coetzee, George Necula
UC Berkeley

Creative Commons Zero Waiver: To the extent possible under law, the author, Derrick Coetzee, waives all copyright and related or neighboring rights to this work.


Why parallelize builds?

• Developer cycle time
  – Faster builds = developers get more work done, higher morale
• Continuous integration
  – Faster builds = tests run more often
• Check-in verification systems
  – Faster builds = more throughput on check-in queue


Parallel build systems today

• Job scheduling
• Typical example: make -j <n>
  – Find n build steps that have no unbuilt dependencies and run them
  – Whenever one exits, start the next one
• Depends on the dependency graph being correct and complete
• Coarse-grained task parallelism
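
The job-scheduling loop described above can be sketched as a small simulation. This is a minimal illustration and not make's actual implementation: the function name simulate_make_j, the step names, and the durations are all made up for the example, and ties between ready steps are broken alphabetically.

```python
import heapq

def simulate_make_j(deps, durations, n):
    """Simulate make -j n style scheduling; return (start times, total time)."""
    remaining = {step: set(d) for step, d in deps.items()}
    dependents = {step: [] for step in deps}
    for step, d in deps.items():
        for dep in d:
            dependents[dep].append(step)
    ready = sorted(s for s, d in remaining.items() if not d)
    running = []  # min-heap of (finish_time, step)
    start = {}
    now = 0
    while ready or running:
        # Start ready steps while job slots are free.
        while ready and len(running) < n:
            step = ready.pop(0)
            start[step] = now
            heapq.heappush(running, (now + durations[step], step))
        # Whenever one exits, its dependents may become ready.
        now, done = heapq.heappop(running)
        for child in dependents[done]:
            remaining[child].discard(done)
            if not remaining[child]:
                ready.append(child)
        ready.sort()
    return start, now

# Hypothetical three-step build: two compiles, then a link.
deps = {"a.o": [], "b.o": [], "app": ["a.o", "b.o"]}
durations = {"a.o": 2, "b.o": 3, "app": 1}
serial_time = simulate_make_j(deps, durations, 1)[1]
parallel_start, parallel_time = simulate_make_j(deps, durations, 2)
```

The scheduler only ever consults deps, so if the edge from app to b.o were missing, the link would start before b.o exists. That is the sense in which the approach depends on the graph being correct and complete.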


What could go wrong?

• Incomplete dependency information
  – Serial builds → leads to incorrect incremental builds
  – Parallel builds → leads to nondeterministic builds, build breaks, incorrect builds
  – Developer changes can introduce or remove dependencies at any time
    • #include "yy.lex.h"


Example of missing dependencies

• gcc test.c -o test
  – What files does it read/write/test existence of?
• Actual: 5 processes, 119 files/directories

/usr/bin/gcc         /etc/ld.so.hwcap    /tmp
/usr/lib/gcc/…/cc1   /lib/libc.so.6      /tmp/ccdCCHK0.s
/usr/bin/as          /proc/meminfo       /tmp/ccKs1ykU.c
/usr/bin/ld          test.c.gch          /tmp/cc0YtTuE.o
/usr/bin/nm          /usr/lib/crt1.o     /tmp/ccGGL3Eo.ld
/usr/bin/strip       /usr/…/lib/specs    /tmp/ccG4c608.le
…                    …                   …


Parallel builds are error-prone

• Missing dependencies cause errors
• Nondeterministic builds make errors difficult to reproduce
• Unnecessary dependencies limit scalability
• An alternative:
  – Developer specifies serial build (easier!)
  – Serial build is automatically parallelized
  – Nondeterminism is eliminated


Build transactions

• Each build step’s file operations are monitored using system call interception
• A transaction manager inserts locks before accessing each file (may suspend processes)
• Ensure that parallel build behaves in same way as the serial build
  – Use concurrency control techniques from databases
  – Schedule is conflict-equivalent to the user’s serial schedule
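
To make "conflict-equivalent to the serial schedule" concrete, here is a small checker, offered as a sketch under stated assumptions. It assumes transaction ids follow the serial order (lower tid runs earlier), that operations are (tid, kind, path) tuples, and that only WRITE and CREATE mutate a file. The tuple layout and function names are illustrative, not the authors' interface.

```python
# Operation kinds that change a file; READ and TEST (existence check) do not.
MUTATING = {"WRITE", "CREATE"}

def conflicts(op_a, op_b):
    """Two operations conflict if different transactions touch the same
    path and at least one of them mutates it."""
    (tid_a, kind_a, path_a), (tid_b, kind_b, path_b) = op_a, op_b
    return (tid_a != tid_b and path_a == path_b
            and (kind_a in MUTATING or kind_b in MUTATING))

def is_conflict_equivalent_to_serial(schedule):
    """True if every conflicting pair appears in serial (tid) order."""
    for i, earlier in enumerate(schedule):
        for later in schedule[i + 1:]:
            if conflicts(earlier, later) and earlier[0] > later[0]:
                return False
    return True

# Compile (tid 1) creates test.o; link (tid 2) tests for it.
good = [(1, "READ", "/etc/ld.so.cache"), (2, "READ", "/etc/ld.so.cache"),
        (1, "CREATE", "test.o"), (2, "TEST", "test.o")]
bad = [(2, "TEST", "test.o"), (1, "CREATE", "test.o")]
good_ok = is_conflict_equivalent_to_serial(good)
bad_ok = is_conflict_equivalent_to_serial(bad)
```

The two reads of /etc/ld.so.cache may interleave freely because neither mutates the file. Only the ordering of the CREATE and the TEST on test.o matters.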


Build transactions example

• (1) Compile test.c to test.o, then (2) link:

tid  Lock/unlock  Lock type  Path              Result
1    LOCK         READ       /etc/ld.so.cache  OK
2    LOCK         READ       /etc/ld.so.cache  OK
…    …            …          …                 …
1    LOCK         CREATE     test.o            OK
…    …            …          …                 …
2    LOCK         TEST       test.o            BLOCKED
…    …            …          …                 …
1    UNLOCK       CREATE     test.o            OK
2    LOCK         TEST       test.o            OK


Build transactions example

• What if transaction 2 takes the lock first?

tid  Lock/unlock  Lock type  Path              Result
1    LOCK         READ       /etc/ld.so.cache  OK
2    LOCK         READ       /etc/ld.so.cache  OK
…    …            …          …                 …
2    LOCK         TEST       test.o            OK
…    …            …          …                 …
1    LOCK         CREATE     test.o            ROLLBACK 2
…    …            …          …                 …
2    LOCK         TEST       test.o            BLOCKED
…    …            …          …                 …
1    UNLOCK       CREATE     test.o            OK
2    LOCK         TEST       test.o            OK
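
The behaviour in the two tables above can be reproduced with a toy lock manager. The rule used here is inferred from the tables and is an assumption, not a description of the authors' tool: a request that conflicts with a lock held by an earlier transaction is blocked, and a request that conflicts with a lock held by a later transaction rolls that transaction back. The class and method names are hypothetical.

```python
# READ and TEST locks are compatible with each other; CREATE is exclusive.
SHARED = {"READ", "TEST"}

class LockManager:
    def __init__(self):
        self.held = {}  # path -> list of (tid, kind)

    def lock(self, tid, kind, path):
        holders = self.held.setdefault(path, [])
        clash = [(t, k) for t, k in holders
                 if t != tid and not (k in SHARED and kind in SHARED)]
        # An earlier transaction holds a conflicting lock: wait for it.
        if any(t < tid for t, _ in clash):
            return "BLOCKED"
        # Only later transactions conflict: roll them back, then proceed.
        victims = sorted({t for t, _ in clash})
        for victim in victims:
            self.release_all(victim)
        self.held[path].append((tid, kind))
        if victims:
            return "ROLLBACK " + ",".join(str(v) for v in victims)
        return "OK"

    def unlock(self, tid, kind, path):
        self.held[path].remove((tid, kind))
        return "OK"

    def release_all(self, tid):
        for path in self.held:
            self.held[path] = [(t, k) for t, k in self.held[path] if t != tid]

# Replay the second table: transaction 2 reaches test.o first.
lm = LockManager()
results = [
    lm.lock(1, "READ", "/etc/ld.so.cache"),
    lm.lock(2, "READ", "/etc/ld.so.cache"),
    lm.lock(2, "TEST", "test.o"),
    lm.lock(1, "CREATE", "test.o"),
    lm.lock(2, "TEST", "test.o"),
    lm.unlock(1, "CREATE", "test.o"),
    lm.lock(2, "TEST", "test.o"),
]
```

A real rollback would also have to undo the victim's file-system effects and restart the process. The sketch only drops the victim's locks.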


Avoiding cascading rollback

• To ensure conflict-equivalence to the serial schedule, transactions must commit in order
  – Strict two-phase locking is too strict
• Instead, take advantage of the fact that the dependency graph – and lock set – changes very little from build to build
• Predicted locks
  – Derived from set of possible conflicts during previous run
  – Never block
  – Give no privilege to access data
  – Block conflicting lock attempts by transactions with larger timestamps
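
The four properties of predicted locks listed above can be sketched as follows. This is a minimal illustration with hypothetical names. It assumes that the transaction id doubles as the timestamp and that a prediction is retired when its owner releases the matching real lock, and it leaves out rollback for unexpected conflicts.

```python
SHARED = {"READ", "TEST"}

def compatible(kind_a, kind_b):
    """READ and TEST can coexist; anything involving CREATE cannot."""
    return kind_a in SHARED and kind_b in SHARED

class PredictingLockManager:
    def __init__(self):
        self.held = {}       # path -> set of (tid, kind): real locks
        self.predicted = {}  # path -> set of (tid, kind): predicted locks

    def predict(self, tid, kind, path):
        # Registering a prediction never blocks and grants no access.
        self.predicted.setdefault(path, set()).add((tid, kind))
        return "OK"

    def lock(self, tid, kind, path):
        real = self.held.get(path, set())
        hints = self.predicted.get(path, set())
        # Real or predicted locks of earlier transactions block later ones.
        for other, other_kind in real | hints:
            if other < tid and not compatible(other_kind, kind):
                return "BLOCKED"
        self.held.setdefault(path, set()).add((tid, kind))
        return "OK"

    def unlock(self, tid, kind, path):
        self.held[path].discard((tid, kind))
        # Assumption: releasing the real lock also retires the prediction.
        self.predicted.get(path, set()).discard((tid, kind))
        return "OK"

# Compile (tid 1) is predicted to create test.o; link (tid 2) tests for it.
lm = PredictingLockManager()
results = [
    lm.predict(1, "CREATE", "test.o"),
    lm.lock(1, "READ", "/etc/ld.so.cache"),
    lm.lock(2, "READ", "/etc/ld.so.cache"),
    lm.lock(2, "TEST", "test.o"),
    lm.lock(1, "CREATE", "test.o"),
    lm.unlock(1, "CREATE", "test.o"),
    lm.lock(2, "TEST", "test.o"),
]
```

Because the prediction is in place before either step touches test.o, the link step waits instead of being granted a lock that would later force a rollback.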


Build transactions example

• Compile step followed by a link step:

tid  Lock/unlock     Lock type  Path              Result
1    PREDICTED LOCK  CREATE     test.o            OK
1    LOCK            READ       /etc/ld.so.cache  OK
2    LOCK            READ       /etc/ld.so.cache  OK
…    …               …          …                 …
2    LOCK            TEST       test.o            BLOCKED
…    …               …          …                 …
1    LOCK            CREATE     test.o            OK
…    …               …          …                 …
1    UNLOCK          CREATE     test.o            OK
2    LOCK            TEST       test.o            OK


Preliminary results - Linux kernel build

[Figure: speedup of apmake and of make -j (y-axis, 1 to 3.5) against number of concurrent processes (x-axis, 0 to 40)]


Preliminary results - Linux kernel build

• Statistics:
  – Number of transactions/build steps: 2,949
  – Parallel build time: 3m9s
  – Total lock requests: 1,859,172
  – Lock requests blocked due to conflict: 1,697
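
As a quick reading aid for these statistics, the ratios below are derived purely from the four figures above. The variable names are illustrative.

```python
build_steps = 2949
parallel_build_seconds = 3 * 60 + 9   # 3m9s
total_lock_requests = 1_859_172
blocked_lock_requests = 1_697

# Share of lock requests that had to wait on a conflicting lock.
blocked_fraction = blocked_lock_requests / total_lock_requests
# Average number of lock requests issued per build step.
locks_per_step = total_lock_requests / build_steps

print(f"blocked: {blocked_fraction:.4%}")
print(f"lock requests per build step: {locks_per_step:.1f}")
```

Roughly 0.09% of lock requests were blocked, and each build step issued about 630 lock requests on average.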

[Figure: histogram of time waiting on lock (sec), x-axis 0.00 to 2.40, against frequency, y-axis 0 to 180]


Future work: Unimplemented stuff

• Haven’t yet implemented rollback
  – Needed for “unexpected dependencies”
• Fast cross-platform system call interception
  – ptrace, binary translation, custom filesystem?
• Multiversion timestamping
  – Useful for builds that read/write the same file multiple times
• Append-only files
  – Log files, standard out


Future work: Diagnosing make build bugs

• If two build steps experience a conflict, but neither depends on the other directly or indirectly…
  – This proves the make build is nondeterministic
  – Isolates most important missing dependencies
• Filter dependency graph by “files in my source repository”
  – Finds other interesting dependencies (e.g. headers)
• Easy bug-finding tool for existing projects
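
The diagnosis idea above can be sketched as a check over recorded file accesses. This is an illustration under stated assumptions, not the proposed tool: it assumes the declared dependency graph and the per-step read and write sets are already available as plain dictionaries, and every step and file name in the example is hypothetical.

```python
from itertools import combinations

def reachable(deps, start):
    """All steps that start depends on, directly or indirectly."""
    seen, stack = set(), [start]
    while stack:
        for dep in deps.get(stack.pop(), ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen

def unordered_conflicts(deps, accesses):
    """Pairs of steps that conflict on a file although neither depends on
    the other; each pair points at a likely missing dependency."""
    closure = {step: reachable(deps, step) for step in accesses}
    found = []
    for a, b in combinations(sorted(accesses), 2):
        if b in closure[a] or a in closure[b]:
            continue  # the declared graph already orders these steps
        reads_a, writes_a = accesses[a]["reads"], accesses[a]["writes"]
        reads_b, writes_b = accesses[b]["reads"], accesses[b]["writes"]
        shared = (writes_a & (reads_b | writes_b)) | (writes_b & reads_a)
        if shared:
            found.append((a, b, sorted(shared)))
    return found

# parser.o includes the generated header, but the graph does not say so.
deps = {"yy.lex.h": [], "parser.o": [], "main.o": [],
        "app": ["parser.o", "main.o"]}
accesses = {
    "yy.lex.h": {"reads": {"lexer.l"}, "writes": {"yy.lex.h"}},
    "parser.o": {"reads": {"parser.c", "yy.lex.h"}, "writes": {"parser.o"}},
    "main.o": {"reads": {"main.c"}, "writes": {"main.o"}},
    "app": {"reads": {"parser.o", "main.o"}, "writes": {"app"}},
}
report = unordered_conflicts(deps, accesses)
```

Restricting the reads and writes sets to files inside the source repository before calling the function would correspond to the filtering step in the second bullet.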


Future work: Process hierarchies

• Long-running process spawning many short-lived processes (e.g. make)
• Rolling back make would be very bad
• Solution is virtualization:
  – Lie to make (your children have completed)
  – Predict outputs of children based on previous build – block make if it tries to access these
  – Rolling back make (if necessary) isn’t so bad now


Future work: Intra-build step parallelism

• Efficient parallel parsing for compilation
  – Ref Par Lab Browser’s work (Seth Fowler)
• Efficient parallel optimization
  – Unexplored?
• Efficient parallel linking
  – Ref Google’s gold linker


Questions?


Future work: Validated incremental builds

• Observation: most build steps produce same output files as in previous build

• Go ahead and use the old versions – if they’re wrong, we’ll find out when that file is rebuilt

• Eliminates blocking for a faster parallel build, at the cost of more rollbacks
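
One way to picture the validation step is sketched below, under stated assumptions. It assumes the build keeps a SHA-256 digest of each output from the previous build, and that it records which steps speculatively read an old output before its producer finished. The function and parameter names are hypothetical, and the slides do not specify how the comparison would be done.

```python
import hashlib

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def steps_to_roll_back(previous_digests, rebuilt_outputs, speculative_readers):
    """Return the steps that consumed an old output which turned out to
    differ from the freshly rebuilt one."""
    stale = set()
    for path, new_data in rebuilt_outputs.items():
        if previous_digests.get(path) != digest(new_data):
            # The guess was wrong: everyone who used the old file is suspect.
            stale.update(speculative_readers.get(path, []))
    return sorted(stale)

# a.o rebuilt identically; b.o changed, so its speculative reader must redo.
previous_digests = {"a.o": digest(b"old a"), "b.o": digest(b"old b")}
rebuilt_outputs = {"a.o": b"old a", "b.o": b"new b"}
speculative_readers = {"a.o": ["link"], "b.o": ["link-tests"]}
to_redo = steps_to_roll_back(previous_digests, rebuilt_outputs,
                             speculative_readers)
```

When most outputs are unchanged, the returned list stays short and the readers never had to wait. That is the trade-off named in the last bullet: less blocking in exchange for occasional rollbacks.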


Future work: Distributed parallel builds

• How to automatically partition builds between machines based on dependency graph?

• How to efficiently handle unexpected dependencies?

