.. SPDX-License-Identifier: GPL-2.0

====================================================================
Reference-count design for elements of lists/arrays protected by RCU
====================================================================


Please note that the percpu-ref feature is likely your first
stop if you need to combine reference counts and RCU.  Please see
include/linux/percpu-refcount.h for more information.  However, in
those unusual cases where percpu-ref would consume too much memory,
please read on.

------------------------------------------------------------------------

Reference counting on elements of lists that are protected by traditional
reader/writer spinlocks or semaphores is straightforward:

CODE LISTING A::

    1.                                      2.
    add()                                   search_and_reference()
    {                                       {
        alloc_object                            read_lock(&list_lock);
        ...                                     search_for_element
        atomic_set(&el->rc, 1);                 atomic_inc(&el->rc);
        write_lock(&list_lock);                 ...
        add_element                             read_unlock(&list_lock);
        ...                                     ...
        write_unlock(&list_lock);           }
    }

    3.                                      4.
    release_referenced()                    delete()
    {                                       {
        ...                                     write_lock(&list_lock);
        if (atomic_dec_and_test(&el->rc))       ...
            kfree(el);
        ...                                     remove_element
    }                                           write_unlock(&list_lock);
                                                ...
                                                if (atomic_dec_and_test(&el->rc))
                                                    kfree(el);
                                                ...
                                            }

If this list/array is made lock-free using RCU, by changing the
write_lock() in add() and delete() to spin_lock() and the read_lock()
in search_and_reference() to rcu_read_lock(), then the atomic_inc() in
search_and_reference() could acquire a reference to an element that
has already been deleted from the list/array.  Use atomic_inc_not_zero()
in this scenario, as follows:

CODE LISTING B::

    1.                                      2.
    add()                                   search_and_reference()
    {                                       {
        alloc_object                            rcu_read_lock();
        ...                                     search_for_element
        atomic_set(&el->rc, 1);                 if (!atomic_inc_not_zero(&el->rc)) {
        spin_lock(&list_lock);                      rcu_read_unlock();
                                                    return FAIL;
        add_element                             }
        ...                                     ...
        spin_unlock(&list_lock);                rcu_read_unlock();
    }                                       }

    3.                                      4.
    release_referenced()                    delete()
    {                                       {
        ...                                     spin_lock(&list_lock);
        if (atomic_dec_and_test(&el->rc))       ...
            call_rcu(&el->head, el_free);       remove_element
        ...                                     spin_unlock(&list_lock);
    }                                           ...
                                                if (atomic_dec_and_test(&el->rc))
                                                    call_rcu(&el->head, el_free);
                                                ...
                                            }

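The counter semantics that listing B depends on can be modelled in
userspace.  The following is a minimal sketch using C11 atomics, not the
kernel implementation: ref_inc_not_zero() and ref_dec_and_test() are
hypothetical stand-ins for atomic_inc_not_zero() and atomic_dec_and_test(),
and no RCU or list handling is modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for atomic_inc_not_zero():
 * increment the counter only if it is currently nonzero.
 */
static bool ref_inc_not_zero(atomic_int *rc)
{
    int old = atomic_load(rc);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(rc, &old, old + 1))
            return true;
    }
    return false;   /* Count already reached zero: caller must fail. */
}

/*
 * Hypothetical stand-in for atomic_dec_and_test():
 * returns true when this decrement drops the count to zero.
 */
static bool ref_dec_and_test(atomic_int *rc)
{
    int old = atomic_fetch_sub(rc, 1);

    assert(old > 0);    /* Dropping a reference that was never held is a bug. */
    return old == 1;
}
```

Once the count has reached zero, ref_inc_not_zero() keeps returning false,
which corresponds to the "return FAIL" branch of search_and_reference()
in listing B.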
Sometimes a reference to the element needs to be obtained in the
update (write) stream.  In such cases, atomic_inc_not_zero() might be
overkill, since the update-side spinlock is held.  One might instead
use atomic_inc().

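As a rough userspace illustration of the update-side case, the sketch
below takes a reference with an unconditional increment while holding
the list lock.  Everything here is an assumption made for illustration:
struct el, its key field, the atomic_flag-based lock helpers and
update_side_get() are hypothetical names, and the spin loop merely stands
in for spin_lock().  The increment is safe under listing B's rules because
delete() drops the initial reference only after removing the element under
this same lock, so an element still on the list has a count of at least one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list element; 'key' is used only for lookup here. */
struct el {
    atomic_int rc;
    int key;
    struct el *next;
};

static atomic_flag list_lock = ATOMIC_FLAG_INIT;   /* models list_lock */
static struct el *list_head;

static void lock_list(void)
{
    while (atomic_flag_test_and_set(&list_lock))
        ;   /* Spin until the flag is released. */
}

static void unlock_list(void)
{
    atomic_flag_clear(&list_lock);
}

/*
 * Update-side lookup: with the lock held, no delete() can be removing
 * the element, so its initial reference is still in place and a plain
 * increment cannot resurrect a zero count.
 */
static struct el *update_side_get(int key)
{
    struct el *p;

    lock_list();
    for (p = list_head; p != NULL; p = p->next) {
        if (p->key == key) {
            atomic_fetch_add(&p->rc, 1);
            break;
        }
    }
    unlock_list();
    return p;   /* NULL if no element with this key is on the list. */
}
```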
It is not always convenient to deal with "FAIL" in the
search_and_reference() code path.  In such cases, the
atomic_dec_and_test() may be moved from delete() to el_free()
as follows:

CODE LISTING C::

    1.                                      2.
    add()                                   search_and_reference()
    {                                       {
        alloc_object                            rcu_read_lock();
        ...                                     search_for_element
        atomic_set(&el->rc, 1);                 atomic_inc(&el->rc);
        spin_lock(&list_lock);                  ...
        add_element                             rcu_read_unlock();
        ...                                 }
        spin_unlock(&list_lock);
    }

    3.                                      4.
    release_referenced()                    delete()
    {                                       {
        ...                                     spin_lock(&list_lock);
        if (atomic_dec_and_test(&el->rc))       ...
            kfree(el);                          remove_element
        ...                                     spin_unlock(&list_lock);
    }                                           ...
                                                call_rcu(&el->head, el_free);
                                                ...
                                            }

    5.
    void el_free(struct rcu_head *rhp)
    {
        release_referenced();
    }

The key point is that the initial reference added by add() is not removed
until after a grace period has elapsed following removal.  This means that
by the time the initial reference is dropped, search_and_reference() can
no longer find this element, so the value of el->rc can no longer
increase.  Thus, once it reaches zero, there are no readers that can or
ever will be able to reference the element.  The element can therefore
safely be freed.  This in turn guarantees that if any reader finds the
element, that reader may safely acquire a reference without checking the
value of the reference counter.

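The lifetime rule described above can be traced with a small
single-threaded userspace model.  This is only a sketch under stated
assumptions: list membership is reduced to an on_list flag, the pending
call_rcu() callback to a deferred flag, and grace_period_ends() is a
hypothetical helper that plays the role of RCU invoking el_free().
delete() is named delete_element() here, and freed_count exists only so
that the outcome can be observed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct el {
    atomic_int rc;
    bool on_list;    /* models presence on the RCU-protected list */
    bool deferred;   /* models a pending call_rcu() callback */
};

static int freed_count;   /* illustration only: counts freed elements */

static struct el *add(void)
{
    struct el *e = malloc(sizeof(*e));

    if (e == NULL)
        return NULL;
    atomic_init(&e->rc, 1);   /* initial reference, owned by the list */
    e->on_list = true;
    e->deferred = false;
    return e;
}

/* Reader side: an element that can be found may be referenced unconditionally. */
static struct el *search_and_reference(struct el *e)
{
    if (!e->on_list)
        return NULL;
    atomic_fetch_add(&e->rc, 1);
    return e;
}

static void release_referenced(struct el *e)
{
    if (atomic_fetch_sub(&e->rc, 1) == 1) {
        free(e);
        freed_count++;
    }
}

/* Unlink now, but keep the initial reference until the grace period ends. */
static void delete_element(struct el *e)
{
    e->on_list = false;
    e->deferred = true;       /* stands in for call_rcu(&el->head, el_free) */
}

/* Stands in for RCU invoking el_free() after the grace period. */
static void grace_period_ends(struct el *e)
{
    if (e->deferred) {
        e->deferred = false;
        release_referenced(e);
    }
}
```

In this model, a reader that obtained its reference before
delete_element() keeps the element alive past the simulated grace period,
and the element is freed only when that reader calls release_referenced().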
A clear advantage of the RCU-based pattern in listing C over the one
in listing B is that any call to search_and_reference() that locates
a given object will succeed in obtaining a reference to that object,
even given a concurrent invocation of delete() for that same object.
Similarly, a clear advantage of both listings B and C over listing A is
that a call to delete() is not delayed even if there is an arbitrarily
large number of calls to search_and_reference() searching for the same
object that delete() was invoked on.  Instead, all that is delayed is
the eventual invocation of kfree(), which is usually not a problem on
modern computer systems, even the small ones.

In cases where delete() can sleep, synchronize_rcu() can be called from
delete(), so that el_free() can be subsumed into delete() as follows::

    4.
    delete()
    {
        spin_lock(&list_lock);
        ...
        remove_element
        spin_unlock(&list_lock);
        ...
        synchronize_rcu();
        if (atomic_dec_and_test(&el->rc))
            kfree(el);
        ...
    }

As additional examples in the kernel, the pattern in listing C is used by
reference counting of struct pid, while the pattern in listing B is used by
struct posix_acl.