
Advanced Char Driver Operations Linux Kernel Programming CIS 4930/COP 5641.

Date post: 13-Dec-2015
Upload: todd-perkins

Advanced Char Driver Operations

Linux Kernel Programming

CIS 4930/COP 5641

Topics

Managing ioctl command numbers
Putting a thread to sleep
Seeking on a device
Access control

ioctl

Input/output control system call
For operations beyond simple data transfers
  Eject the media
  Report error information
  Change hardware settings
  Self destruct
Alternatives
  Embedded commands in the data stream
  Driver-specific file systems

ioctl

User-level interface (application view)
  int ioctl(int fd, int request, ...);
The ... does not indicate a variable number of arguments
  That would be problematic for the system call interface
In this context, it is meant to pass a single optional argument
  Traditionally a char *argp
  Just a way to bypass compile-time type checking
For more information, see the ioctl(2) man page

ioctl

Driver-level interface
  long (*unlocked_ioctl) (struct file *filp, unsigned int cmd, unsigned long arg);
  cmd is passed from the user unchanged
  arg can be an integer or a pointer
  Compiler does not type check
ioctl() has changed from the LDD3 era
  Modified to remove the big kernel lock (BKL)
  http://lwn.net/Articles/119652/

Choosing the ioctl Commands

Desire a numbering scheme to avoid mistakes
  E.g., issuing a command to the wrong device (changing the baud rate of an audio device)
Unique ioctl command numbers across the system
  Check the ioctl.h files in the kernel source and the directory Documentation/ioctl/

Choosing the ioctl Commands

A command number uses four bitfields
  Defined in include/uapi/asm-generic/ioctl.h (for most architectures)
  <direction, type, number, size>
direction: direction of data transfer
  _IOC_NONE
  _IOC_READ
  _IOC_WRITE
  _IOC_READ | _IOC_WRITE

Choosing the ioctl Commands

<direction, type, number, size>
type (ioctl device type)
  8-bit (_IOC_TYPEBITS) magic number
  Associated with the device
number
  8-bit (_IOC_NRBITS) sequential number
  Unique within device

Choosing the ioctl Commands

<direction, type, number, size>
size: size of user data involved
  _IOC_SIZEBITS
  Usually 14 bits, but can be overridden by the architecture

#define SCULL_IOCSQUANTUM _IOW(SCULL_IOC_MAGIC, 1, int)

/* provoke compile error for invalid uses of size argument */
extern unsigned int __invalid_size_argument_for_IOC;
#define _IOC_TYPECHECK(t) \
        ((sizeof(t) == sizeof(t[1]) && \
          sizeof(t) < (1 << _IOC_SIZEBITS)) ? \
          sizeof(t) : __invalid_size_argument_for_IOC)

/* See http://lwn.net/Articles/48354/ */
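The way the four fields combine into one number can be sketched in user space. This is a minimal illustration, assuming a Linux system where <linux/ioctl.h> exposes the same macros and the asm-generic layout (number in the lowest 8 bits, then type, then size, with direction in the top 2 bits). The helper names pack_ioctl and manual_packing_matches_macro are made up for this example.

```c
#include <linux/ioctl.h>   /* _IOW, _IOC_WRITE, _IOC_*SHIFT */

#define SCULL_IOC_MAGIC   'k'
#define SCULL_IOCSQUANTUM _IOW(SCULL_IOC_MAGIC, 1, int)

/* Pack the four fields by hand, shifting each one into its bit position. */
unsigned int pack_ioctl(unsigned int dir, unsigned int type,
                        unsigned int nr, unsigned int size)
{
    return (dir  << _IOC_DIRSHIFT)  |
           (type << _IOC_TYPESHIFT) |
           (nr   << _IOC_NRSHIFT)   |
           (size << _IOC_SIZESHIFT);
}

/* Returns 1 when the hand-packed value equals the macro-generated one. */
int manual_packing_matches_macro(void)
{
    unsigned int by_hand = pack_ioctl(_IOC_WRITE, SCULL_IOC_MAGIC,
                                      1, sizeof(int));

    return by_hand == (unsigned int)SCULL_IOCSQUANTUM;
}
```

In real code the _IO* macros should always be used; the manual packing is only meant to show what they compute.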

Choosing the ioctl Commands

Useful macros to create ioctl command numbers
  _IO(type, nr)               (arg is an unsigned long integer)
  _IOR(type, nr, datatype)    (arg is a pointer)
  _IOW(type, nr, datatype)    (arg is a pointer)
  _IOWR(type, nr, datatype)   (arg is a pointer)
_IO*_BAD used for backward compatibility
  Uses number (of bytes) rather than datatype
  http://lkml.iu.edu//hypermail/linux/kernel/0310.1/0019.html

Choosing the ioctl Commands

Useful macros to decode ioctl command numbers
  _IOC_DIR(nr)
  _IOC_TYPE(nr)
  _IOC_NR(nr)
  _IOC_SIZE(nr)
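The decode macros can be tried from user space as well. The sketch below assumes <linux/ioctl.h> is available and uses one of the scull command numbers; struct ioctl_fields, decode_ioctl, and print_ioctl are hypothetical names introduced only for illustration.

```c
#include <linux/ioctl.h>   /* _IOWR and the _IOC_* decode macros */
#include <stdio.h>

#define SCULL_IOC_MAGIC   'k'
#define SCULL_IOCXQUANTUM _IOWR(SCULL_IOC_MAGIC, 9, int)

/* The four fields of a command number, pulled apart. */
struct ioctl_fields {
    unsigned int dir;   /* _IOC_NONE, _IOC_READ, _IOC_WRITE, or both */
    unsigned int type;  /* magic number */
    unsigned int nr;    /* sequential number within the device */
    unsigned int size;  /* size of the user data, in bytes */
};

struct ioctl_fields decode_ioctl(unsigned int cmd)
{
    struct ioctl_fields f;

    f.dir  = _IOC_DIR(cmd);
    f.type = _IOC_TYPE(cmd);
    f.nr   = _IOC_NR(cmd);
    f.size = _IOC_SIZE(cmd);
    return f;
}

/* Print the decoded fields in a human-readable form. */
void print_ioctl(unsigned int cmd)
{
    struct ioctl_fields f = decode_ioctl(cmd);

    printf("cmd 0x%08x: read=%d write=%d type='%c' nr=%u size=%u\n",
           cmd, (f.dir & _IOC_READ) != 0, (f.dir & _IOC_WRITE) != 0,
           (char)f.type, f.nr, f.size);
}
```

Calling print_ioctl(SCULL_IOCXQUANTUM) shows both direction bits set, because an exchange command transfers data in both directions.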

Choosing the ioctl Commands The scull example

/* Use 'k' as magic number (type) field */

#define SCULL_IOC_MAGIC 'k'

/* Please use a different 8-bit number in your code */

#define SCULL_IOCRESET _IO(SCULL_IOC_MAGIC, 0)

Choosing the ioctl Commands

The scull example

/*

* S means "Set" through a ptr,

* T means "Tell" directly with the argument value

* G means "Get": reply by setting through a pointer

* Q means "Query": response is on the return value

* X means "eXchange": switch G and S atomically

* H means "sHift": switch T and Q atomically

*/

#define SCULL_IOCSQUANTUM _IOW(SCULL_IOC_MAGIC, 1, int)

#define SCULL_IOCSQSET _IOW(SCULL_IOC_MAGIC, 2, int)

#define SCULL_IOCTQUANTUM _IO(SCULL_IOC_MAGIC, 3)

#define SCULL_IOCTQSET _IO(SCULL_IOC_MAGIC, 4)

#define SCULL_IOCGQUANTUM _IOR(SCULL_IOC_MAGIC, 5, int)

(X and H commands: set a new value and return the old value)

Choosing the ioctl Commands The scull example

#define SCULL_IOCGQSET _IOR(SCULL_IOC_MAGIC, 6, int)

#define SCULL_IOCQQUANTUM _IO(SCULL_IOC_MAGIC, 7)

#define SCULL_IOCQQSET _IO(SCULL_IOC_MAGIC, 8)

#define SCULL_IOCXQUANTUM _IOWR(SCULL_IOC_MAGIC, 9, int)

#define SCULL_IOCXQSET _IOWR(SCULL_IOC_MAGIC,10, int)

#define SCULL_IOCHQUANTUM _IO(SCULL_IOC_MAGIC, 11)

#define SCULL_IOCHQSET _IO(SCULL_IOC_MAGIC, 12)

...

#define SCULL_IOC_MAXNR 14

The Return Value

When the command number is not supported
  -ENOTTY (according to the POSIX standard)
  Some drivers may (in conflict with the POSIX standard) return -EINVAL
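The ENOTTY convention is easy to observe from user space. The sketch below, assuming a Linux system, issues the terminal command TCGETS on a pipe, which has no notion of terminal settings, and reports the errno that comes back. The function name errno_for_tty_ioctl_on_pipe is made up for this example.

```c
#include <errno.h>
#include <sys/ioctl.h>   /* ioctl(), TCGETS */
#include <termios.h>     /* struct termios */
#include <unistd.h>      /* pipe(), close() */

/*
 * Issue a terminal-only command on a pipe and return the resulting errno.
 * Returns 0 if the ioctl unexpectedly succeeded, -1 if pipe() failed.
 */
int errno_for_tty_ioctl_on_pipe(void)
{
    int fds[2];
    int saved;
    struct termios t;

    if (pipe(fds) != 0)
        return -1;

    errno = 0;
    saved = (ioctl(fds[0], TCGETS, &t) == -1) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return saved;
}
```

A driver that returns -ENOTTY for unknown commands gives applications the same, predictable error that they already see from other file types.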

The Predefined Commands

Handled by the kernel first
  Will not be passed down to device drivers
Three groups
  For any file (regular, device, FIFO, socket)
    Magic number: 'T'
  For regular files only
  Specific to the file system type
    E.g., see ext2_ioctl()
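Two of the predefined commands that work on any file, FIONBIO and FIOCLEX, can be exercised on a plain pipe. This is a user-space sketch under the assumption of a Linux system; the function names are hypothetical. Each helper issues the command through ioctl() and then confirms the effect through fcntl().

```c
#include <fcntl.h>       /* fcntl(), O_NONBLOCK, FD_CLOEXEC */
#include <sys/ioctl.h>   /* ioctl(), FIONBIO, FIOCLEX */
#include <unistd.h>      /* pipe(), close() */

/* Set O_NONBLOCK via FIONBIO; returns 1 if the flag is now set, -1 on error. */
int nonblock_via_ioctl(int fd)
{
    int on = 1;
    int flags;

    if (ioctl(fd, FIONBIO, &on) == -1)
        return -1;
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) != 0;
}

/* Set close-on-exec via FIOCLEX; returns 1 if the flag is now set, -1 on error. */
int cloexec_via_ioctl(int fd)
{
    int flags;

    if (ioctl(fd, FIOCLEX) == -1)
        return -1;
    flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return (flags & FD_CLOEXEC) != 0;
}

/* Run both checks on a fresh pipe; returns 1 only if both flags were set. */
int predefined_commands_work_on_pipe(void)
{
    int fds[2];
    int ok;

    if (pipe(fds) != 0)
        return -1;
    ok = nonblock_via_ioctl(fds[0]) == 1 && cloexec_via_ioctl(fds[0]) == 1;
    close(fds[0]);
    close(fds[1]);
    return ok;
}
```

Neither command reaches the pipe's own ioctl handling, which is why a driver should not reuse these numbers for its own commands.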

Using the ioctl Argument

If it is an integer, just use it directly
If it is a pointer
  Need to check for a valid user address
  int access_ok(int type, const void *addr, unsigned long size);
    type: either VERIFY_READ or VERIFY_WRITE
    Returns 1 for success, 0 for failure
      Driver then returns -EFAULT to the caller
    Defined in <linux/uaccess.h>
    Mostly called by memory-access routines
    Since Linux 5.0 the type argument is gone: access_ok(addr, size)

Using the ioctl Argument

The scull example

long scull_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) {

int err = 0, tmp;

int retval = 0;

/* check the magic number and whether the command is defined */

if (_IOC_TYPE(cmd) != SCULL_IOC_MAGIC) {

return -ENOTTY;

}

if (_IOC_NR(cmd) > SCULL_IOC_MAXNR) {

return -ENOTTY;

}

Using the ioctl Argument

The scull example…

/* the concept of "read" and "write" is reversed here */

if (_IOC_DIR(cmd) & _IOC_READ) {

err = !access_ok(VERIFY_WRITE, (void __user *) arg,

_IOC_SIZE(cmd));

} else if (_IOC_DIR(cmd) & _IOC_WRITE) {

err = !access_ok(VERIFY_READ, (void __user *) arg,

_IOC_SIZE(cmd));

}

if (err) return -EFAULT;

Capabilities and Restricted Operations

Limit certain ioctl operations to privileged users
See <linux/capability.h> for the full set of capabilities
To check a certain capability, call
  int capable(int capability);
In the scull example

if (!capable(CAP_SYS_ADMIN)) {
        return -EPERM;
}

CAP_SYS_ADMIN: a catch-all capability for many system administration operations
  http://lwn.net/Articles/486306/

The Implementation of the ioctl Commands

A giant switch statement…

switch(cmd) {

case SCULL_IOCRESET:

scull_quantum = SCULL_QUANTUM;

scull_qset = SCULL_QSET;

break;

case SCULL_IOCSQUANTUM: /* Set: arg points to the value */

if (!capable(CAP_SYS_ADMIN)) {

return -EPERM;

}

retval = __get_user(scull_quantum, (int __user *)arg);

break;

The Implementation of the ioctl Commands

Six ways to pass and receive arguments from user space
  Need to know the command number

int quantum;

ioctl(fd,SCULL_IOCSQUANTUM, &quantum); /* Set by pointer */

ioctl(fd,SCULL_IOCTQUANTUM, quantum); /* Set by value */

ioctl(fd,SCULL_IOCGQUANTUM, &quantum); /* Get by pointer */

quantum = ioctl(fd,SCULL_IOCQQUANTUM); /* Get by return value */

ioctl(fd,SCULL_IOCXQUANTUM, &quantum); /* Exchange by pointer */

quantum = ioctl(fd,SCULL_IOCHQUANTUM, quantum); /* Exchange by value */
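The six calling styles can be tried without loading the driver. The sketch below is a user-space stand-in, not the scull driver itself: sim_scull_ioctl, sim_quantum, and six_styles_demo are hypothetical names, and plain pointer dereferences replace the kernel's user-memory accessors. It assumes <linux/ioctl.h> is available for the command macros.

```c
#include <errno.h>
#include <linux/ioctl.h>

#define SCULL_IOC_MAGIC   'k'
#define SCULL_IOCSQUANTUM _IOW(SCULL_IOC_MAGIC, 1, int)
#define SCULL_IOCTQUANTUM _IO(SCULL_IOC_MAGIC, 3)
#define SCULL_IOCGQUANTUM _IOR(SCULL_IOC_MAGIC, 5, int)
#define SCULL_IOCQQUANTUM _IO(SCULL_IOC_MAGIC, 7)
#define SCULL_IOCXQUANTUM _IOWR(SCULL_IOC_MAGIC, 9, int)
#define SCULL_IOCHQUANTUM _IO(SCULL_IOC_MAGIC, 11)

static int sim_quantum = 4000;  /* stand-in for the driver's scull_quantum */

/* Hypothetical user-space imitation of the driver's switch statement. */
long sim_scull_ioctl(unsigned int cmd, unsigned long arg)
{
    int tmp;

    switch (cmd) {
    case SCULL_IOCSQUANTUM:     /* Set: arg points to the new value */
        sim_quantum = *(int *)arg;
        return 0;
    case SCULL_IOCTQUANTUM:     /* Tell: arg is the new value */
        sim_quantum = (int)arg;
        return 0;
    case SCULL_IOCGQUANTUM:     /* Get: store the value through arg */
        *(int *)arg = sim_quantum;
        return 0;
    case SCULL_IOCQQUANTUM:     /* Query: value is the return value */
        return sim_quantum;
    case SCULL_IOCXQUANTUM:     /* eXchange: swap through the pointer */
        tmp = sim_quantum;
        sim_quantum = *(int *)arg;
        *(int *)arg = tmp;
        return 0;
    case SCULL_IOCHQUANTUM:     /* sHift: set from arg, return old value */
        tmp = sim_quantum;
        sim_quantum = (int)arg;
        return tmp;
    default:
        return -ENOTTY;
    }
}

/* Walk through the six styles; returns 1 if every step behaved as described. */
int six_styles_demo(void)
{
    int q = 10;

    if (sim_scull_ioctl(SCULL_IOCSQUANTUM, (unsigned long)&q) != 0)
        return 0;                                   /* quantum is now 10 */
    if (sim_scull_ioctl(SCULL_IOCQQUANTUM, 0) != 10)
        return 0;
    if (sim_scull_ioctl(SCULL_IOCHQUANTUM, 20) != 10)
        return 0;                                   /* old value back, now 20 */
    q = 30;
    sim_scull_ioctl(SCULL_IOCXQUANTUM, (unsigned long)&q);
    if (q != 20)
        return 0;                                   /* q got the old value */
    sim_scull_ioctl(SCULL_IOCGQUANTUM, (unsigned long)&q);
    if (q != 30)
        return 0;
    sim_scull_ioctl(SCULL_IOCTQUANTUM, 40);
    return sim_scull_ioctl(SCULL_IOCQQUANTUM, 0) == 40;
}
```

The by-value styles (T, Q, H) save a user-memory access but can only carry what fits in the argument or the non-negative range of the return value, since negative returns are reported as errors.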

Pros/Cons of ioctl

Cons
  Unregulated means to add new system call API
    Not reviewed
    Different for each device
  32/64-bit compatibility
  No way to enumerate
Pros
  read and write with one call
Ref
  http://lwn.net/Articles/191653/

Device Control Without ioctl

Writing control sequences into the data stream itself
  Example: console escape sequences
Advantages:
  No need to implement ioctl methods
Disadvantages:
  Need to make sure that escape sequences do not appear in the normal data stream (e.g., cat a binary file)
  Need to parse the data stream

Device Control Without ioctl

sysfs
  Can be used to enumerate all exported components
  Use standard unix shell commands
Netlink
  Getting/setting socket options
debugfs
  Probably not a good choice since its purpose is debugging
relay interface
  https://www.kernel.org/doc/Documentation/filesystems/relay.txt

SLEEPING

Sleeping

Suspend thread waiting for some condition
Example usage: blocking I/O
  Data is not immediately available for reads
  When the device is not ready to accept data
    Output buffer is full

Introduction to Sleeping

A process is removed from the scheduler’s run queue
Certain rules
  Generally never sleep when running in an atomic context
    Multiple steps must be performed without concurrent accesses
    Not while holding a spinlock, seqlock, or RCU lock
    Not while disabling interrupts

Introduction to Sleeping

After waking up
  Make no assumptions about the state of the system
  The resource one is waiting for might be gone again
  Must check the wait condition again

Introduction to Sleeping

Wait queue: contains a list of processes waiting for a specific event
  #include <linux/wait.h>
To initialize statically, call
  DECLARE_WAIT_QUEUE_HEAD(my_queue);
To initialize dynamically, call
  wait_queue_head_t my_queue;
  init_waitqueue_head(&my_queue);

Simple Sleeping

Call variants of wait_event macros
wait_event(queue, condition)
  queue = wait queue head
    Passed by value
  Waits until the boolean condition becomes true
  Puts the process into an uninterruptible sleep
    Usually not what you want
wait_event_interruptible(queue, condition)
  Can be interrupted by signals
  Returns nonzero if the sleep was interrupted
    Your driver should return -ERESTARTSYS

Simple Sleeping

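wait_event_timeout(queue, condition, timeout)
  Wait for a limited time (in jiffies)
  Returns 0 if the timeout elapsed with the condition still false
  Otherwise returns the remaining jiffies (at least 1)
wait_event_interruptible_timeout(queue, condition, timeout)
  Same, but also returns -ERESTARTSYS if interrupted by a signal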

Simple Sleeping

To wake up, call variants of wake_up functions
void wake_up(wait_queue_head_t *queue);
  Wakes up all processes waiting on the queue
void wake_up_interruptible(wait_queue_head_t *queue);
  Wakes up processes that perform an interruptible sleep

Simple Sleeping

Example module: sleepy

static DECLARE_WAIT_QUEUE_HEAD(wq);

static int flag = 0;

ssize_t sleepy_read(struct file *filp, char __user *buf,

size_t count, loff_t *pos) {

printk(KERN_DEBUG "process %i (%s) going to sleep\n",

current->pid, current->comm);

wait_event_interruptible(wq, flag != 0);

flag = 0;

printk(KERN_DEBUG "awoken %i (%s)\n", current->pid,

current->comm);

return 0; /* EOF */

}

Note: multiple threads can wake up between wait_event_interruptible() and flag = 0, so more than one reader may see flag != 0

Simple Sleeping

Example module: sleepy

ssize_t sleepy_write(struct file *filp, const char __user *buf,

size_t count, loff_t *pos) {

printk(KERN_DEBUG "process %i (%s) awakening the readers...\n",

current->pid, current->comm);

flag = 1;

wake_up_interruptible(&wq);

return count; /* succeed, to avoid retrial */

}

Blocking and Nonblocking Operations

By default, operations block
  If no data is available for reads
  If no space is available for writes
Non-blocking I/O is indicated by the O_NONBLOCK flag in filp->f_flags
  Defined in <linux/fcntl.h>
  Only open, read, and write calls are affected
  Returns -EAGAIN immediately instead of blocking
  Applications need to distinguish non-blocking returns vs. EOFs
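From the application side, the difference between "no data yet" and end-of-file looks like this. The sketch uses an ordinary pipe rather than a driver, on the assumption that a driver honoring O_NONBLOCK behaves the same way; classify_read and nonblocking_pipe_demo are names invented for the example.

```c
#include <errno.h>
#include <fcntl.h>    /* fcntl(), O_NONBLOCK */
#include <unistd.h>   /* pipe(), read(), close() */

/*
 * Classify one single-byte read() on fd:
 *   1        data was read
 *   0        end of file
 *   -EAGAIN  nothing available right now (non-blocking descriptor)
 *   other    negative errno of a real error
 */
int classify_read(int fd)
{
    char byte;
    ssize_t n = read(fd, &byte, 1);

    if (n > 0)
        return 1;
    if (n == 0)
        return 0;
    return -errno;
}

/* Returns 1 if an empty pipe reports -EAGAIN and a closed one reports EOF. */
int nonblocking_pipe_demo(void)
{
    int fds[2];
    int flags, would_block, eof;

    if (pipe(fds) != 0)
        return -1;

    flags = fcntl(fds[0], F_GETFL);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    would_block = classify_read(fds[0]);  /* writer open, no data yet */
    close(fds[1]);
    eof = classify_read(fds[0]);          /* writer gone: end of file */
    close(fds[0]);

    return would_block == -EAGAIN && eof == 0;
}
```

An application that treats a return of -1 with EAGAIN as end-of-file would stop reading too early, which is why the two cases must be told apart.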

A Blocking I/O Example

scullpipe
A read process
  Blocks when no data is available
  Wakes a blocking write when buffer space becomes available
A write process
  Blocks when no buffer space is available
  Wakes a blocking read process when data arrives

A Blocking I/O Example

scullpipe data structure

struct scull_pipe {

wait_queue_head_t inq, outq; /* read and write queues */

char *buffer, *end; /* begin of buf, end of buf */

int buffersize; /* used in pointer arithmetic */

char *rp, *wp; /* where to read, where to write */

int nreaders, nwriters; /* number of openings for r/w */

struct fasync_struct *async_queue; /* asynchronous readers */

struct mutex mutex; /* mutual exclusion */

struct cdev cdev; /* Char device structure */

};

A Blocking I/O Example

static ssize_t scull_p_read(struct file *filp, char __user *buf,

size_t count, loff_t *f_pos) {

struct scull_pipe *dev = filp->private_data;

if (mutex_lock_interruptible(&dev->mutex)) return -ERESTARTSYS;

while (dev->rp == dev->wp) { /* nothing to read */

mutex_unlock(&dev->mutex); /* release the lock */

if (filp->f_flags & O_NONBLOCK)

return -EAGAIN;

if (wait_event_interruptible(dev->inq, (dev->rp != dev->wp)))

return -ERESTARTSYS;

if (mutex_lock_interruptible(&dev->mutex))

return -ERESTARTSYS;

}

A Blocking I/O Example

if (dev->wp > dev->rp)

count = min(count, (size_t)(dev->wp - dev->rp));

else /* the write pointer has wrapped */

count = min(count, (size_t)(dev->end - dev->rp));

if (copy_to_user(buf, dev->rp, count)) {

mutex_unlock(&dev->mutex); /* release the lock before bailing out */

return -EFAULT;

}

dev->rp += count;

if (dev->rp == dev->end) dev->rp = dev->buffer; /* wrapped */

mutex_unlock(&dev->mutex);

/* finally, awake any writers and return */

wake_up_interruptible(&dev->outq);

return count;

}
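The pointer arithmetic in the read path can be tested in isolation. Below is a user-space sketch of a circular buffer with the same rp/wp/end convention (rp == wp means empty). The read side follows the code above; the write side is not shown in these slides, so ring_write and ring_space are assumptions based on the usual "leave one byte free" design. All names are hypothetical, and memcpy stands in for copy_to_user and copy_from_user.

```c
#include <stddef.h>
#include <string.h>

#define RING_SIZE 8

struct ring {
    char buffer[RING_SIZE];
    char *end;       /* one past the last byte of buffer */
    char *rp, *wp;   /* where to read, where to write */
};

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

void ring_init(struct ring *r)
{
    r->end = r->buffer + RING_SIZE;
    r->rp = r->wp = r->buffer;           /* rp == wp means "empty" */
}

/* Free space; one byte always stays unused so that full != empty. */
size_t ring_space(const struct ring *r)
{
    size_t ri = (size_t)(r->rp - r->buffer);
    size_t wi = (size_t)(r->wp - r->buffer);

    if (ri == wi)
        return RING_SIZE - 1;
    return (ri + RING_SIZE - wi) % RING_SIZE - 1;
}

/* Copy in at most count bytes, never crossing the end of the buffer. */
size_t ring_write(struct ring *r, const char *src, size_t count)
{
    count = min_size(count, ring_space(r));
    if (r->wp >= r->rp)
        count = min_size(count, (size_t)(r->end - r->wp));
    else  /* wp has wrapped: fill up to rp - 1 */
        count = min_size(count, (size_t)(r->rp - r->wp - 1));
    memcpy(r->wp, src, count);
    r->wp += count;
    if (r->wp == r->end)
        r->wp = r->buffer;               /* wrapped */
    return count;
}

/* Copy out at most count bytes, never crossing the end of the buffer. */
size_t ring_read(struct ring *r, char *dst, size_t count)
{
    if (r->rp == r->wp)
        return 0;                        /* nothing to read */
    if (r->wp > r->rp)
        count = min_size(count, (size_t)(r->wp - r->rp));
    else  /* the write pointer has wrapped */
        count = min_size(count, (size_t)(r->end - r->rp));
    memcpy(dst, r->rp, count);
    r->rp += count;
    if (r->rp == r->end)
        r->rp = r->buffer;               /* wrapped */
    return count;
}

/* Returns 1 if short writes and wrapped reads behave as described. */
int ring_demo(void)
{
    struct ring r;
    char out[RING_SIZE];

    ring_init(&r);
    if (ring_write(&r, "abcdef", 6) != 6)
        return 0;
    if (ring_read(&r, out, 4) != 4 || memcmp(out, "abcd", 4) != 0)
        return 0;
    if (ring_write(&r, "ghijk", 5) != 2)  /* short write: stops at end */
        return 0;
    if (ring_read(&r, out, sizeof(out)) != 4 || memcmp(out, "efgh", 4) != 0)
        return 0;
    return ring_read(&r, out, sizeof(out)) == 0;  /* empty again */
}
```

Both operations may transfer fewer bytes than requested, which is acceptable because read and write callers are expected to handle short counts.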

LLSEEK()

The llseek Implementation

Implements lseek and llseek system calls
  Modifies filp->f_pos

loff_t scull_llseek(struct file *filp, loff_t off, int whence) {

struct scull_dev *dev = filp->private_data;

loff_t newpos;

switch(whence) {

case 0: /* SEEK_SET */

newpos = off;

break;

case 1: /* SEEK_CUR, relative to the current position */

newpos = filp->f_pos + off;

break;

The llseek Implementation

case 2: /* SEEK_END, relative to the end of the file */

newpos = dev->size + off;

break;

default: /* can't happen */

return -EINVAL;

}

if (newpos < 0) return -EINVAL;

filp->f_pos = newpos;

return newpos;

}

The llseek Implementation

May not make sense for serial ports and keyboard inputs
  Need to inform the kernel by calling nonseekable_open in the open method
    int nonseekable_open(struct inode *inode, struct file *filp);
  Replace the llseek method with no_llseek (defined in <linux/fs.h>) in your file_operations structure
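What an application sees on a nonseekable file can be shown with a pipe, a familiar file type that cannot seek. A driver that calls nonseekable_open is expected to make lseek fail with the same ESPIPE error. This is a user-space sketch; the function name errno_for_lseek_on_pipe is invented for the example.

```c
#include <errno.h>
#include <sys/types.h>   /* off_t */
#include <unistd.h>      /* pipe(), lseek(), close() */

/*
 * Try to query the file position of a pipe and return the resulting errno.
 * Returns 0 if lseek unexpectedly succeeded, -1 if pipe() failed.
 */
int errno_for_lseek_on_pipe(void)
{
    int fds[2];
    int saved;

    if (pipe(fds) != 0)
        return -1;

    errno = 0;
    saved = (lseek(fds[0], 0, SEEK_CUR) == (off_t)-1) ? errno : 0;

    close(fds[0]);
    close(fds[1]);
    return saved;
}
```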

