Date post: 24-Jul-2020
Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015
Transcript
Page 1: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

Everything’s a File Descriptor

Josh Triplett

josh@joshtriplett.org

Linux Plumbers Conference 2015

“Everything’s a file”

- /home/josh/doc/presentations/lpc-2015/fd/fd.pdf
- /etc/hostname
- /dev/null
- /dev/zero
- /dev/ttyS0
- /dev/dri/card0
- /dev/cpu/0/cpuid
- /tmp/.X11-unix/X0
- /proc/1/environ
- /proc/cmdline
- /sys/class/block/sda/queue/rotational
- /sys/firmware/acpi/tables/DSDT

Everything has a filename?

Everything has a filename? (crossed out on the slide)

- Pipes
- Sockets
- epoll
- memfd
- KVM virtual machines and CPUs
- ...

Everything’s a file descriptor

- What is a file descriptor, really?
- What can you do with a file descriptor?
- What interesting file descriptors exist?
- How do you build a new type of file descriptor?
- What interesting file descriptors don’t exist yet?

What is a file descriptor, really?

- struct fd, struct fdtable
- struct file

struct fd versus struct file

testfile contains “0123456789”

    x = open("testfile", O_RDONLY);
    xdup = dup(x);
    y = open("testfile", O_RDONLY);
    read(x, &c, 1);
    putchar(c); /* Prints '0' */
    read(xdup, &c, 1);
    putchar(c); /* Prints '1' */
    read(y, &c, 1);
    putchar(c); /* Prints '0' */
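The sequence above can be tried as a complete function. This is a minimal sketch, assuming Linux or another POSIX system; the helper name read_via_x_xdup_y and the temporary file standing in for testfile are illustrative and not from the slides.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: replays the slide's open/dup/open/read sequence on a
 * temporary file containing "0123456789". The three bytes read, in order,
 * are stored in out[0..2] (NUL-terminated). Returns 0 on success, -1 on error. */
int read_via_x_xdup_y(char out[4])
{
    char path[] = "/tmp/fd-demo-XXXXXX";   /* stand-in for the slide's testfile */
    int x = -1, xdup = -1, y = -1, rc = -1;

    int w = mkstemp(path);
    if (w < 0)
        return -1;
    if (write(w, "0123456789", 10) != 10)
        goto out;

    x = open(path, O_RDONLY);              /* new struct file, position 0 */
    xdup = dup(x);                         /* second fd, same struct file as x */
    y = open(path, O_RDONLY);              /* separate struct file, own position */
    if (x < 0 || xdup < 0 || y < 0)
        goto out;

    if (read(x, &out[0], 1) != 1 ||        /* shared position 0 -> 1 */
        read(xdup, &out[1], 1) != 1 ||     /* shared position 1 -> 2 */
        read(y, &out[2], 1) != 1)          /* y's own position 0 -> 1 */
        goto out;
    out[3] = '\0';
    rc = 0;
out:
    if (x >= 0) close(x);
    if (xdup >= 0) close(xdup);
    if (y >= 0) close(y);
    close(w);
    unlink(path);
    return rc;
}
```

On success the buffer holds "010", matching the slide comments: x and xdup share one file position, while y has its own.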

[Diagram, built up over several slides. Left column, "struct fd" (userspace int): x, xdup, y. Middle column, "struct file" (kernel object): one struct file shared by x and xdup, and a separate one for y. Right column (driver-specific): testfile, with contents 0 1 2 3 ...
The shared struct file starts at f_pos: 0, f_count: 1; dup(x) raises f_count to 2, and the reads through x and xdup advance f_pos to 1 and then 2. The struct file behind y stays at f_count: 1 and its single read advances f_pos from 0 to 1.]
File descriptor: userspace reference to kernel object

What can you do with a file descriptor?

- read, write
- seek
- preadv, pwritev
- stat
- Blocking or non-blocking
- poll, select, epoll
- dup, dup2
- Send over a UNIX socket via SCM_RIGHTS
- Inherited over exec
- mmap
- sendfile, splice, tee
- openat
- ...
- ioctl
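Of the operations listed, SCM_RIGHTS is probably the least familiar, so here is a minimal sketch of it, assuming Linux. The helper names send_fd, recv_fd and fd_passing_demo are hypothetical. To stay self-contained it passes the read end of a pipe across a socketpair within one process; the same calls work between unrelated processes connected by a UNIX socket.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send fd over the UNIX socket sock as SCM_RIGHTS ancillary data. */
int send_fd(int sock, int fd)
{
    char byte = 'F';                       /* carry one byte of ordinary data */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;

    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock; returns the new fd or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    int fd;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Hypothetical demo: returns 0 if data written to a pipe can be read back
 * through a descriptor that travelled over the socket. */
int fd_passing_demo(void)
{
    int sv[2], p[2], got, ok = 0;
    char buf[5];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (pipe(p) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (write(p[1], "hello", 5) == 5 && send_fd(sv[0], p[0]) == 0 &&
        (got = recv_fd(sv[1])) >= 0) {
        /* got is a new descriptor number referring to the same pipe end */
        ok = got != p[0] && read(got, buf, 5) == 5 &&
             memcmp(buf, "hello", 5) == 0;
        close(got);
    }
    close(p[0]);
    close(p[1]);
    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

The receiver gets its own descriptor number, but it refers to the same kernel object as the sender's.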

Use file descriptors!

What interesting file descriptors exist?

eventfd

- 64-bit counter used as an event queue
- write: Add value to counter
- read: Block until non-zero; read value and reset to 0
- “Semaphore mode”: Read 1 and decrement by 1
- poll: Ready for reading if non-zero
- Several drivers use eventfd to signal events between kernel and userspace
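The counter behaviour described above can be checked from userspace. A minimal sketch, assuming Linux with glibc's eventfd() wrapper; the helper names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: write a, then b, to a fresh eventfd created with the
 * given flags, and return what one read() yields (0 on any failure). */
uint64_t eventfd_write2_read1(uint64_t a, uint64_t b, int flags)
{
    uint64_t value = 0;
    int efd = eventfd(0, flags);

    if (efd < 0)
        return 0;
    if (write(efd, &a, sizeof a) != (ssize_t)sizeof a ||
        write(efd, &b, sizeof b) != (ssize_t)sizeof b ||
        read(efd, &value, sizeof value) != (ssize_t)sizeof value)
        value = 0;
    close(efd);
    return value;
}

/* Returns 1 if reading a drained, non-blocking eventfd fails with EAGAIN,
 * i.e. the first read really did reset the counter to 0. */
int eventfd_drained_read_would_block(void)
{
    uint64_t v = 5;
    int efd = eventfd(0, EFD_NONBLOCK);

    if (efd < 0)
        return 0;
    int ok = write(efd, &v, sizeof v) == (ssize_t)sizeof v &&
             read(efd, &v, sizeof v) == (ssize_t)sizeof v &&
             read(efd, &v, sizeof v) == -1 && errno == EAGAIN;
    close(efd);
    return ok;
}
```

Writing 3 and then 4 makes a default eventfd read back 7, while one created with EFD_SEMAPHORE reads back 1 and leaves 6 in the counter.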

timerfd

- Allows handling timers as file descriptors
- Throw them in the poll loop with everything else
- Create with specified timeout
- read: Block until timeout; return number of times expired
- poll: Ready for reading if timeout passed

Signals

- Receive asynchronous events in a process
- Suspend execution, save registers, move execution to handler
- Restore registers and resume execution when handler done
- Assume a userspace stack to push and pop state
- sigaltstack sets an alternate stack to switch to
- Set up stack to return into call to sigreturn for cleanup
- Can receive signals while in a kernel syscall
- Some syscalls restart afterward
- Syscalls with timeouts adjust them (restart_syscall)
- Other syscalls return EINTR
- Can mask signals to avoid interruption
- Special syscalls that also set signal mask (ppoll, pselect, KVM_SET_SIGNAL_MASK ioctl)
- “async-signal-safe” library functions

Signed-off-by: <(;,;)@r’lyeh>

signalfd

- File descriptor to receive a given set of signals
- Block “normal” signal delivery; receive via signalfd instead
- read: Block until signal, return struct signalfd_siginfo
- poll: Readable when signal received
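The steps above can be sketched as follows, assuming Linux with glibc's signalfd() wrapper and a single-threaded caller; receive_sigusr1_via_fd is a hypothetical helper. It blocks SIGUSR1, raises it, and collects it with an ordinary read instead of a handler.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Hypothetical helper: returns the signal number read from the signalfd
 * (SIGUSR1 on success), or -1 on error. */
int receive_sigusr1_via_fd(void)
{
    sigset_t mask, old;
    struct signalfd_siginfo si;

    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR1);

    /* Block "normal" delivery, so the signal stays pending for the fd. */
    if (sigprocmask(SIG_BLOCK, &mask, &old) < 0)
        return -1;

    int sfd = signalfd(-1, &mask, SFD_CLOEXEC);
    if (sfd < 0)
        return -1;

    raise(SIGUSR1);                        /* now pending, not delivered */

    ssize_t n = read(sfd, &si, sizeof si); /* consumes the pending signal */
    close(sfd);
    if (n != (ssize_t)sizeof si)
        return -1;                         /* mask left blocked on failure */

    sigprocmask(SIG_SETMASK, &old, NULL);  /* safe: nothing left pending */
    return (int)si.ssi_signo;
}
```

No handler runs at any point, so none of the stack, restart, or async-signal-safety concerns from the previous slide apply to the code that processes the signal.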

How do you build a new type of file descriptor?

Semantics

- read and write
  - Nothing
  - Raw data
  - Specific data structure
- poll/select/epoll
  - Must match read/write blocking behavior if any
  - Can have pollable fd even if read/write do nothing
- seek and file position
- mmap
- What happens with multiple processes, or dup?
- For everything else: ioctl
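The rule that poll must match read/write blocking behavior is easiest to see on an existing descriptor type. A minimal sketch from the userspace side, using a non-blocking pipe as the example; poll_matches_read is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical check: returns 0 if poll() and a non-blocking read() agree
 * about readiness both before and after data is written, -1 otherwise. */
int poll_matches_read(void)
{
    int p[2];
    char c = 0;

    if (pipe(p) < 0)
        return -1;
    /* Put the read end into non-blocking mode. */
    int fl = fcntl(p[0], F_GETFL);
    if (fl < 0 || fcntl(p[0], F_SETFL, fl | O_NONBLOCK) < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    struct pollfd pfd = { .fd = p[0], .events = POLLIN };

    /* Empty pipe: poll reports nothing ready, read would block. */
    int idle_ok = poll(&pfd, 1, 0) == 0 &&
                  read(p[0], &c, 1) == -1 && errno == EAGAIN;

    /* One byte written: poll reports readable, read succeeds. */
    int ready_ok = write(p[1], "x", 1) == 1 &&
                   poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLIN) &&
                   read(p[0], &c, 1) == 1 && c == 'x';

    close(p[0]);
    close(p[1]);
    return (idle_ok && ready_ok) ? 0 : -1;
}
```

A new descriptor type that reported POLLIN while read still blocked, or the reverse, would break event loops written against this contract.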

Implementation

I anon_inode_getfd
    I Doesn’t need a backing inode or filesystem
    I Provide an ops structure and private data pointer
    I Private data points to your kernel object

I simple_read_from_buffer, simple_write_to_buffer

I no_llseek, fixed_size_llseek

I Check file->f_flags & O_NONBLOCK
    I Blocking: wait_queue_head
    I Non-blocking: return -EAGAIN
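
The effect of this checklist can be observed from userspace on an existing anonymous-inode descriptor. The sketch below uses eventfd (Linux-specific) to show that a non-blocking read fails with EAGAIN exactly when poll reports nothing readable, and that read returns a fixed 8-byte structure rather than raw bytes. It is the consumer's view only, not kernel code; the function name eventfd_contract is illustrative.

```c
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 if the eventfd behaves as described above, -1 otherwise. */
int eventfd_contract(void)
{
    uint64_t val = 0, three = 3;
    struct pollfd p;
    int ok = -1;
    int fd = eventfd(0, EFD_NONBLOCK);

    if (fd < 0)
        return -1;
    p.fd = fd;
    p.events = POLLIN;
    p.revents = 0;

    /* Counter is zero: poll reports nothing readable, and a read that
     * would otherwise block returns EAGAIN because of O_NONBLOCK. */
    if (poll(&p, 1, 0) != 0)
        goto out;
    if (read(fd, &val, sizeof val) != -1 || errno != EAGAIN)
        goto out;

    /* After a write, poll reports POLLIN and read returns the counter. */
    if (write(fd, &three, sizeof three) != (ssize_t)sizeof three)
        goto out;
    if (poll(&p, 1, 0) != 1 || !(p.revents & POLLIN))
        goto out;
    if (read(fd, &val, sizeof val) != (ssize_t)sizeof val || val != 3)
        goto out;
    ok = 0;
out:
    close(fd);
    return ok;
}
```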

Page 84: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

What interesting file descriptors don’t exist yet?

Page 85: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

Child processes

Page 86: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

I fork/clone

I Parent process gets the child PID

I Parent uses dedicated syscalls (waitpid) to wait for child exit

I When child exits, parent gets SIGCHLD signal

I Parent makes waitpid call to get exit status

Problems:

I Waiting not integrated with poll loops

Signals

I Process-global; libraries can’t manage only their own processes
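
The traditional flow listed above, as a minimal C sketch. The function name run_child_and_wait is illustrative. Note that the waitpid call blocks on its own and cannot be folded into a poll loop over other descriptors, which is the first problem named on this slide.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`; return the exit status the
 * parent collects through waitpid, or -1 on error. */
int run_child_and_wait(int code)
{
    int status;
    pid_t pid = fork();     /* parent receives the child's PID */

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);        /* child */

    /* Dedicated syscall: blocks until this particular child exits. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```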

Page 95: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

Alternatives

I Set SIGCHLD handler, write to pipe or eventfd
    I Still process-global; gets all child exit notifications
    I Requires coordinating global signal handling between libraries

Signals

I signalfd for SIGCHLD
    I Still process-global; gets all child exit notifications
    I Requires coordinating global signal handling between libraries
    I Must block SIGCHLD; breaks code expecting SIGCHLD

Page 97: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

clonefd

Page 98: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

clonefd

I New flag for clone

I Return a file descriptor for the child process

I read: block until child exits, return exit information

I poll: becomes readable when child exits

I Maintains a reference to the child’s task_struct

I Relatively simple, except...
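
clonefd as proposed on this slide is not an interface this sketch can call. As a stand-in, later mainline kernels (Linux 5.3 and newer) provide pidfd_open, a different interface with the same "poll reports readable when the process exits" property. The C sketch below uses that instead, through the raw syscall. The fallback syscall number and the function name wait_child_via_pidfd are assumptions of this example, and it does not reproduce the read-returns-exit-information behavior described above.

```c
#include <errno.h>
#include <poll.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434  /* assumption: number on x86-64 and arm64 */
#endif

/* Fork a child exiting with `code` and wait for it by polling a process
 * file descriptor. Returns the exit status, -2 if the running kernel
 * has no pidfd_open, or -1 on any other error. */
int wait_child_via_pidfd(int code)
{
    struct pollfd p;
    int status, pidfd, result = -1;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);        /* child */

    /* No PID-reuse race here: we are the parent and have not reaped
     * the child yet, so the PID still names it. */
    pidfd = (int)syscall(SYS_pidfd_open, pid, 0);
    if (pidfd < 0) {
        result = (errno == ENOSYS) ? -2 : -1;
        waitpid(pid, &status, 0);   /* still reap the child */
        return result;
    }

    p.fd = pidfd;
    p.events = POLLIN;
    p.revents = 0;
    /* The descriptor becomes readable when the process exits. */
    if (poll(&p, 1, -1) == 1 && (p.revents & POLLIN) &&
        waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);
    close(pidfd);
    return result;
}
```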

Page 103: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

Complications

Need a new clone system call for the fd out parameter

clone syscall parameters vary by architecture
    I Avoided in the new syscall

clone is out of parameters (6) on some architectures
    I Pass parameters via a struct and size

Low-level copy_thread function grabbed tls parameter directly from syscall register arguments; couldn’t move it
    I Pass parameter normally via C, fix assembly syscall entry
    I Fixed with copy_thread_tls (merged in 4.2)

ptrace and reparenting
    I Work in progress
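
"Pass parameters via a struct and size" is a general pattern for extensible syscall arguments: the caller states how large its version of the struct is, and the receiver copes with both older and newer callers. A userspace sketch of one way a receiver might do that. struct demo_args, its fields, and copy_sized_args are hypothetical, and the acceptance rules shown are an assumption of this example, not a description of the clonefd patches.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical argument block; new fields are only ever appended. */
struct demo_args {
    uint64_t flags;
    uint64_t fd_out_ptr;
};

/* Copy a caller's struct of `usize` bytes into `dst`.
 * Shorter struct (older caller): missing fields read as zero.
 * Longer struct (newer caller): accepted only if every byte beyond
 * what we understand is zero.
 * Returns 0, or -E2BIG if fields we do not know about are in use. */
int copy_sized_args(struct demo_args *dst, const void *src, size_t usize)
{
    const unsigned char *bytes = src;
    size_t ksize = sizeof *dst;
    size_t n = usize < ksize ? usize : ksize;

    for (size_t i = ksize; i < usize; i++)
        if (bytes[i] != 0)
            return -E2BIG;

    memset(dst, 0, ksize);
    memcpy(dst, src, n);
    return 0;
}
```

With this shape, adding a parameter later means appending a field rather than defining yet another syscall.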

Page 112: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

History and status

I Thiago Macieira originally proposed forkfd to simplify Qt

I Josh and Thiago started on clonefd earlier this year

I Some infrastructure merged into 4.2

I Syscall aimed for future kernel after resolving ptrace issues

Page 113: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

File descriptor: Userspace reference to kernel object

Page 114: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

What else can we do with a reference to task_struct?

Page 115: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

Process IDs

Page 116: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

Process IDs

I Small integers used to reference processes

I Used pervasively in process syscalls

I Enumerated as directories in /proc

I Unique within root container

I Container PID namespaces map a subset of these

I PIDs do not hold a reference; can be reused

I Race condition if used from non-parent process

Page 119: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

clonefd as process identifier

I Unique across the entire system

I Holds a reference to the process

I Race-free

I Can pass via exec, UNIX sockets

I Allows non-parent processes to obtain exit information
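
Passing over UNIX sockets already works for any file descriptor today, via SCM_RIGHTS ancillary data. The C sketch below passes the read end of a pipe across a socketpair as a stand-in for a process descriptor; the function names send_fd, recv_fd and fd_passing_demo are illustrative.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send descriptor `fd` over UNIX socket `sock`. Returns 0 or -1. */
int send_fd(int sock, int fd)
{
    char byte = 'x';    /* at least one byte of real data is required */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { struct cmsghdr hdr; char buf[CMSG_SPACE(sizeof(int))]; } u;
    struct msghdr msg = { 0 };
    struct cmsghdr *c;

    memset(&u, 0, sizeof u);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;
    c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(int));
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from `sock`. Returns the new fd or -1. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { struct cmsghdr hdr; char buf[CMSG_SPACE(sizeof(int))]; } u;
    struct msghdr msg = { 0 };
    struct cmsghdr *c;
    int fd;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof u.buf;
    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    c = CMSG_FIRSTHDR(&msg);
    if (!c || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS ||
        c->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(c), sizeof(int));
    return fd;
}

/* Pass a pipe's read end across a socketpair, then read through the
 * received copy. Returns the byte read, or -1 on error. */
int fd_passing_demo(void)
{
    int sv[2], pfd[2], got, result = -1;
    char ch = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (pipe(pfd) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (send_fd(sv[0], pfd[0]) == 0 && (got = recv_fd(sv[1])) >= 0) {
        /* `got` is a new number for the same underlying kernel object. */
        if (write(pfd[1], "k", 1) == 1 && read(got, &ch, 1) == 1)
            result = ch;
        close(got);
    }
    close(pfd[0]);
    close(pfd[1]);
    close(sv[0]);
    close(sv[1]);
    return result;
}
```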

Page 124: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

Next steps

I Merge clonefd

I For each PID syscall, add an fd variant

I Add ioctls to obtain process information

I Add process enumeration (next, child, root)

Page 125: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

Other future file descriptors

Warning: wild speculation and conjecture ahead

Page 127: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

User and group IDs

I Suppose users and groups were unique kernel objects?

I Unique across container user namespaces

I “Get unused user/group”

I Set up arbitrary mappings when mounting a filesystem

I Allow a process to hold multiple credentials (like setgroups)

Page 132: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

Filesystem mounts

I Suppose mount returned a directory file descriptor

I openat relative to the filesystem

I Separate call to bind into the filesystem namespace

I Bind existing dirfd for bind mounts

Page 136: Everything's a File Descriptor · Everything’s a File Descriptor Josh Triplett josh@joshtriplett.org Linux Plumbers Conference 2015

Summary

I File descriptor: Userspace reference to kernel object

I Reference-counted, race-free, unambiguous ID

I Well-defined semantics

I Extensive operations

I poll and blocking

I Use file descriptors in new APIs

I Don’t invent new identifier namespaces
