===========================================================================
Proper Locking Under a Preemptible Kernel: Keeping Kernel Code Preempt-Safe
===========================================================================

:Author: Robert Love <rml@tech9.net>


Introduction
============


A preemptible kernel creates new locking issues.  The issues are the same as
those under SMP: concurrency and reentrancy.  Thankfully, the Linux preemptible
kernel model leverages existing SMP locking mechanisms.  Thus, the kernel
requires explicit additional locking for very few additional situations.

This document is for all kernel hackers.  Developing code in the kernel
requires protecting these situations.


RULE #1: Per-CPU data structures need explicit protection
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


Two similar problems arise. An example code snippet::

        struct this_needs_locking tux[NR_CPUS];
        tux[smp_processor_id()] = some_value;
        /* task is preempted here... */
        something = tux[smp_processor_id()];

First, since the data is per-CPU, it may not have explicit SMP locking, yet it
still needs protection against preemption.  Second, when a preempted task is
finally rescheduled, the previous value of smp_processor_id may not equal the
current one.  You must protect these situations by disabling preemption around
them.

You can also use get_cpu() and put_cpu(): get_cpu() disables preemption and
returns the current CPU number, and put_cpu() re-enables preemption.

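The get_cpu()/put_cpu() pairing can be sketched as follows.  This is a
user-space model, not kernel code: the counter, the fixed CPU number, and the
update_tux() helper are assumptions made for illustration, and the two stubs
only mimic what the real macros from <linux/smp.h> do.

```c
/*
 * User-space stand-ins for the kernel API (assumptions for illustration).
 */
#define NR_CPUS 4

static int preempt_counter;     /* models the task's preempt counter */
static int current_cpu = 2;     /* hypothetical CPU this task runs on */

/* Disable preemption and return the current CPU number. */
static int get_cpu(void)
{
        preempt_counter++;
        return current_cpu;
}

/* Re-enable preemption. */
static void put_cpu(void)
{
        preempt_counter--;
}

static int tux[NR_CPUS];

/* Write this CPU's slot and read it back with no migration in between. */
static int update_tux(int some_value)
{
        int cpu = get_cpu();    /* preemption off: cpu stays valid */
        int something;

        tux[cpu] = some_value;
        something = tux[cpu];   /* same slot as the write above */
        put_cpu();              /* preemption back on */
        return something;
}
```

The CPU number is read once and reused, so both accesses are guaranteed to
hit the same slot for as long as preemption stays disabled.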

RULE #2: CPU state must be protected.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


Under preemption, the state of the CPU must be protected.  This is
arch-dependent, but includes CPU structures and state not preserved over a
context switch.  For example, on x86, entering and exiting FPU mode is now a
critical section that must occur while preemption is disabled.  Think about
what would happen if the kernel were executing a floating-point instruction
and were then preempted.  Remember, the kernel does not save FPU state except
for user tasks.  Therefore, upon preemption, the FPU registers will be sold to
the lowest bidder.  Thus, preemption must be disabled around such regions.

Note that some FPU functions are already explicitly preempt-safe.  For example,
kernel_fpu_begin and kernel_fpu_end will disable and enable preemption.


RULE #3: Lock acquire and release must be performed by same task
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


A lock acquired in one task must be released by the same task.  This
means you can't do oddball things like acquire a lock and go off to
play while another task releases it.  If you want to do something
like this, acquire and release the lock in the same code path and
have the caller wait on an event by the other task.


Solution
========


Data protection under preemption is achieved by disabling preemption for the
duration of the critical region.

::

  preempt_enable()              decrement the preempt counter
  preempt_disable()             increment the preempt counter
  preempt_enable_no_resched()   decrement, but do not immediately preempt
  preempt_check_resched()       if needed, reschedule
  preempt_count()               return the preempt counter

The functions are nestable.  In other words, you can call preempt_disable
n times in a code path, and preemption will not be reenabled until the n-th
call to preempt_enable.  The preempt statements are defined to do nothing if
preemption is not enabled.

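The nesting behaviour can be illustrated with a small user-space model of the
counter.  Everything here is a sketch under stated assumptions: the stubs
reuse the kernel's names but are plain C functions, and model_preemptible(),
inner(), and outer() are hypothetical helpers that exist only in this example.

```c
/* Toy preempt counter; the real one is kept per task by the kernel. */
static int preempt_counter;

static void preempt_disable(void) { preempt_counter++; }
static void preempt_enable(void)  { preempt_counter--; }
static int  preempt_count(void)   { return preempt_counter; }

/* In this model, preemption is possible only when the counter is zero. */
static int model_preemptible(void)
{
        return preempt_count() == 0;
}

/* A helper that protects its own critical region. */
static void inner(void)
{
        preempt_disable();      /* count: 1 -> 2 */
        /* ... touch per-CPU data ... */
        preempt_enable();       /* count: 2 -> 1, still not preemptible */
}

/* Returns 1 if preemption was still disabled after inner() returned. */
static int outer(void)
{
        int still_disabled;

        preempt_disable();      /* count: 0 -> 1 */
        inner();
        still_disabled = !model_preemptible();
        preempt_enable();       /* count: 1 -> 0, preemptible again */
        return still_disabled;
}
```

Because the inner preempt_enable only drops the count from 2 to 1, a helper
can safely protect its own region without knowing whether its caller already
disabled preemption.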
Note that you do not need to explicitly prevent preemption if you are holding
any locks or interrupts are disabled, since preemption is implicitly disabled
in those cases.

But keep in mind that 'irqs disabled' is a fundamentally unsafe way of
disabling preemption - any cond_resched() or cond_resched_lock() might trigger
a reschedule if the preempt count is 0. A simple printk() might trigger a
reschedule. So use this implicit preemption-disabling property only if you
know that the affected codepath does not do any of this. The best policy is to
use this only for small, atomic code that you wrote and which calls no complex
functions.

Example::

        cpucache_t *cc; /* this is per-CPU */
        preempt_disable();
        cc = cc_data(searchp);
        if (cc && cc->avail) {
                __free_block(searchp, cc_entry(cc), cc->avail);
                cc->avail = 0;
        }
        preempt_enable();
        return 0;

Notice how the preemption statements must encompass every reference to the
critical variables.  Another example::

        int buf[NR_CPUS];
        set_cpu_val(buf);
        if (buf[smp_processor_id()] == -1) printk(KERN_INFO "wee!\n");
        spin_lock(&buf_lock);
        /* ... */

This code is not preempt-safe, but see how easily we can fix it by simply
moving the spin_lock up two lines.

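A user-space sketch of the corrected ordering is shown here.  It is a
single-threaded model only: the toy spinlock_t, the stubbed spin_lock and
spin_unlock, the body given to set_cpu_val(), and the fixed_path() wrapper
are all assumptions, and printf stands in for printk.

```c
#include <stdio.h>

#define NR_CPUS 4

typedef struct { int locked; } spinlock_t;  /* toy lock, not the kernel's */

static int preempt_counter;     /* models the task's preempt counter */
static int current_cpu = 1;     /* hypothetical CPU this task runs on */
static spinlock_t buf_lock;

static int smp_processor_id(void) { return current_cpu; }

static void spin_lock(spinlock_t *lock)
{
        preempt_counter++;      /* models the implicit preempt disable */
        lock->locked = 1;
}

static void spin_unlock(spinlock_t *lock)
{
        lock->locked = 0;
        preempt_counter--;      /* models the implicit preempt enable */
}

/* Hypothetical body for the set_cpu_val() named in the text. */
static void set_cpu_val(int *buf) { buf[smp_processor_id()] = -1; }

/* Returns 1 if the per-CPU slot held -1 when it was read back. */
static int fixed_path(void)
{
        int buf[NR_CPUS] = {0};
        int hit = 0;

        spin_lock(&buf_lock);   /* moved up: covers both per-CPU accesses */
        set_cpu_val(buf);
        if (buf[smp_processor_id()] == -1) {
                printf("wee!\n");       /* printk(KERN_INFO ...) in the kernel */
                hit = 1;
        }
        /* ... */
        spin_unlock(&buf_lock);
        return hit;
}
```

With the lock taken first, the write in set_cpu_val() and the later read of
buf[smp_processor_id()] both happen with preemption implicitly disabled, so
they refer to the same CPU.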

Preventing preemption using interrupt disabling
===============================================


It is possible to prevent a preemption event using local_irq_disable and
local_irq_save.  Note, when doing so, you must be very careful to not cause
an event that would set need_resched and result in a preemption check.  When
in doubt, rely on locking or explicit preemption disabling.

Note that in 2.5, interrupt disabling is now only per-CPU (i.e., local).

An additional concern is proper usage of local_irq_disable and local_irq_save.
These may be used to protect against preemption; however, on exit, if
preemption may be enabled, a test to see whether preemption is required should
be done.  If these are called from the spin_lock and read/write lock macros,
the right thing is done.  They may also be called within a spin-lock protected
region; however, if they are ever called outside of this context, a test for
preemption should be made.  Do note that calls from interrupt context or
bottom halves/tasklets are also protected by preemption locks and so may use
the versions which do not check preemption.