From 863981e96738983919de841ec669e157e6bdaeb0 Mon Sep 17 00:00:00 2001 From: André Fabian Silva Delgado Date: Sun, 11 Sep 2016 04:34:46 -0300 Subject: Linux-libre 4.7.1-gnu --- .../RCU/Design/Requirements/Requirements.html | 941 ++++++++++++--------- 1 file changed, 534 insertions(+), 407 deletions(-) (limited to 'Documentation/RCU/Design/Requirements/Requirements.html') diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html index a725f9900..e7e24b3e8 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.html +++ b/Documentation/RCU/Design/Requirements/Requirements.html @@ -1,5 +1,3 @@ - - @@ -65,8 +63,8 @@ All that aside, here are the categories of currently known RCU requirements:

This is followed by a summary, -which is in turn followed by the inevitable -answers to the quick quizzes. +however, the answers to each quick quiz immediately follows the quiz. +Select the big white space with your mouse to see the answer.

Fundamental Requirements

@@ -153,13 +151,27 @@ Therefore, the outcome: cannot happen. -

Quick Quiz 1: -Wait a minute! -You said that updaters can make useful forward progress concurrently -with readers, but pre-existing readers will block -synchronize_rcu()!!! -Just who are you trying to fool??? -
Answer + + + + + + + +
 
Quick Quiz:
+ Wait a minute! + You said that updaters can make useful forward progress concurrently + with readers, but pre-existing readers will block + synchronize_rcu()!!! + Just who are you trying to fool??? +
Answer:
+ First, if updaters do not wish to be blocked by readers, they can use + call_rcu() or kfree_rcu(), which will + be discussed later. + Second, even when using synchronize_rcu(), the other + update-side code does run concurrently with readers, whether + pre-existing or not. +
 

This scenario resembles one of the first uses of RCU in @@ -210,9 +222,20 @@ to guarantee that do_something() never runs concurrently with recovery(), but with little or no synchronization overhead in do_something_dlm(). -

Quick Quiz 2: -Why is the synchronize_rcu() on line 28 needed? -
Answer + + + + + + + +
 
Quick Quiz:
+ Why is the synchronize_rcu() on line 28 needed? +
Answer:
+ Without that extra grace period, memory reordering could result in + do_something_dlm() executing do_something() + concurrently with the last bits of recovery(). +
 

In order to avoid fatal problems such as deadlocks, @@ -332,12 +355,27 @@ It also prevents any number of “interesting” compiler optimizations, for example, the use of gp as a scratch location immediately preceding the assignment. -

Quick Quiz 3: -But rcu_assign_pointer() does nothing to prevent the -two assignments to p->a and p->b -from being reordered. -Can't that also cause problems? -
Answer + + + + + + + +
 
Quick Quiz:
+ But rcu_assign_pointer() does nothing to prevent the + two assignments to p->a and p->b + from being reordered. + Can't that also cause problems? +
Answer:
+ No, it cannot. + The readers cannot see either of these two fields until + the assignment to gp, by which time both fields are + fully initialized. + So reordering the assignments + to p->a and p->b cannot possibly + cause any problems. +
 

It is tempting to assume that the reader need not do anything special @@ -494,11 +532,42 @@ The rcu_access_pointer() on line 6 is similar to code protected by the corresponding update-side lock. -

Quick Quiz 4: -Without the rcu_dereference() or the -rcu_access_pointer(), what destructive optimizations -might the compiler make use of? -
Answer + + + + + + + +
 
Quick Quiz:
+ Without the rcu_dereference() or the + rcu_access_pointer(), what destructive optimizations + might the compiler make use of? +
Answer:
+ Let's start with what happens to do_something_gp() + if it fails to use rcu_dereference(). + It could reuse a value formerly fetched from this same pointer. + It could also fetch the pointer from gp in a byte-at-a-time + manner, resulting in load tearing, in turn resulting a bytewise + mash-up of two distince pointer values. + It might even use value-speculation optimizations, where it makes + a wrong guess, but by the time it gets around to checking the + value, an update has changed the pointer to match the wrong guess. + Too bad about any dereferences that returned pre-initialization garbage + in the meantime! + + +

+ For remove_gp_synchronous(), as long as all modifications + to gp are carried out while holding gp_lock, + the above optimizations are harmless. + However, + with CONFIG_SPARSE_RCU_POINTER=y, + sparse will complain if you + define gp with __rcu and then + access it without using + either rcu_access_pointer() or rcu_dereference(). +

 

In short, RCU's publish-subscribe guarantee is provided by the combination @@ -571,17 +640,156 @@ systems with more than one CPU: synchronize_rcu() migrates in the meantime. -

Quick Quiz 5: -Given that multiple CPUs can start RCU read-side critical sections -at any time without any ordering whatsoever, how can RCU possibly tell whether -or not a given RCU read-side critical section starts before a -given instance of synchronize_rcu()? -
Answer - -

Quick Quiz 6: -The first and second guarantees require unbelievably strict ordering! -Are all these memory barriers really required? -
Answer + + + + + + + +
 
Quick Quiz:
+ Given that multiple CPUs can start RCU read-side critical sections + at any time without any ordering whatsoever, how can RCU possibly + tell whether or not a given RCU read-side critical section starts + before a given instance of synchronize_rcu()? +
Answer:
+ If RCU cannot tell whether or not a given + RCU read-side critical section starts before a + given instance of synchronize_rcu(), + then it must assume that the RCU read-side critical section + started first. + In other words, a given instance of synchronize_rcu() + can avoid waiting on a given RCU read-side critical section only + if it can prove that synchronize_rcu() started first. +
 
+ + + + + + + + +
 
Quick Quiz:
+ The first and second guarantees require unbelievably strict ordering! + Are all these memory barriers really required? +
Answer:
+ Yes, they really are required. + To see why the first guarantee is required, consider the following + sequence of events: + + +
    +
  1. + CPU 1: rcu_read_lock() + +
  2. + CPU 1: q = rcu_dereference(gp); + /* Very likely to return p. */ + +
  3. + CPU 0: list_del_rcu(p); + +
  4. + CPU 0: synchronize_rcu() starts. + +
  5. + CPU 1: do_something_with(q->a); + /* No smp_mb(), so might happen after kfree(). */ + +
  6. + CPU 1: rcu_read_unlock() + +
  7. + CPU 0: synchronize_rcu() returns. + +
  8. + CPU 0: kfree(p); + +
+ +

+ Therefore, there absolutely must be a full memory barrier between the + end of the RCU read-side critical section and the end of the + grace period. + + +

+ The sequence of events demonstrating the necessity of the second rule + is roughly similar: + + +

    +
  1. CPU 0: list_del_rcu(p); + +
  2. CPU 0: synchronize_rcu() starts. + +
  3. CPU 1: rcu_read_lock() + +
  4. CPU 1: q = rcu_dereference(gp); + /* Might return p if no memory barrier. */ + +
  5. CPU 0: synchronize_rcu() returns. + +
  6. CPU 0: kfree(p); + +
  7. + CPU 1: do_something_with(q->a); /* Boom!!! */ + +
  8. CPU 1: rcu_read_unlock() + +
+ +

+ And similarly, without a memory barrier between the beginning of the + grace period and the beginning of the RCU read-side critical section, + CPU 1 might end up accessing the freelist. + + +

+ The “as if” rule of course applies, so that any + implementation that acts as if the appropriate memory barriers + were in place is a correct implementation. + That said, it is much easier to fool yourself into believing + that you have adhered to the as-if rule than it is to actually + adhere to it! +

 
+ + + + + + + + +
 
Quick Quiz:
+ You claim that rcu_read_lock() and rcu_read_unlock() + generate absolutely no code in some kernel builds. + This means that the compiler might arbitrarily rearrange consecutive + RCU read-side critical sections. + Given such rearrangement, if a given RCU read-side critical section + is done, how can you be sure that all prior RCU read-side critical + sections are done? + Won't the compiler rearrangements make that impossible to determine? +
Answer:
+ In cases where rcu_read_lock() and rcu_read_unlock() + generate absolutely no code, RCU infers quiescent states only at + special locations, for example, within the scheduler. + Because calls to schedule() had better prevent calling-code + accesses to shared variables from being rearranged across the call to + schedule(), if RCU detects the end of a given RCU read-side + critical section, it will necessarily detect the end of all prior + RCU read-side critical sections, no matter how aggressively the + compiler scrambles the code. + + +

+ Again, this all assumes that the compiler cannot scramble code across + calls to the scheduler, out of interrupt handlers, into the idle loop, + into user-mode code, and so on. + But if your kernel build allows that sort of scrambling, you have broken + far more than just RCU! +

 

Note that these memory-barrier requirements do not replace the fundamental @@ -626,9 +834,19 @@ inconvenience can be avoided through use of the call_rcu() and kfree_rcu() API members described later in this document. -

Quick Quiz 7: -But how does the upgrade-to-write operation exclude other readers? -
Answer + + + + + + + +
 
Quick Quiz:
+ But how does the upgrade-to-write operation exclude other readers? +
Answer:
+ It doesn't, just like normal RCU updates, which also do not exclude + RCU readers. +
 

This guarantee allows lookup code to be shared between read-side @@ -714,9 +932,20 @@ to do significant reordering. This is by design: Any significant ordering constraints would slow down these fast-path APIs. -

Quick Quiz 8: -Can't the compiler also reorder this code? -
Answer + + + + + + + +
 
Quick Quiz:
+ Can't the compiler also reorder this code? +
Answer:
+ No, the volatile casts in READ_ONCE() and + WRITE_ONCE() prevent the compiler from reordering in + this particular case. +
 

Readers Do Not Exclude Updaters

@@ -769,10 +998,28 @@ new readers can start immediately after synchronize_rcu() starts, and synchronize_rcu() is under no obligation to wait for these new readers. -

Quick Quiz 9: -Suppose that synchronize_rcu() did wait until all readers had completed. -Would the updater be able to rely on this? -
Answer + + + + + + + +
 
Quick Quiz:
+ Suppose that synchronize_rcu() did wait until all + readers had completed instead of waiting only on + pre-existing readers. + For how long would the updater be able to rely on there + being no readers? +
Answer:
+ For no time at all. + Even if synchronize_rcu() were to wait until + all readers had completed, a new reader might start immediately after + synchronize_rcu() completed. + Therefore, the code following + synchronize_rcu() can never rely on there being + no readers. +
 

Grace Periods Don't Partition Read-Side Critical Sections

@@ -969,11 +1216,24 @@ grace period. As a result, an RCU read-side critical section cannot partition a pair of RCU grace periods. -

Quick Quiz 10: -How long a sequence of grace periods, each separated by an RCU read-side -critical section, would be required to partition the RCU read-side -critical sections at the beginning and end of the chain? -
Answer + + + + + + + +
 
Quick Quiz:
+ How long a sequence of grace periods, each separated by an RCU + read-side critical section, would be required to partition the RCU + read-side critical sections at the beginning and end of the chain? +
Answer:
+ In theory, an infinite number. + In practice, an unknown number that is sensitive to both implementation + details and timing considerations. + Therefore, even in practice, RCU users must abide by the + theoretical rather than the practical answer. +
 

Disabling Preemption Does Not Block Grace Periods

@@ -1109,12 +1369,27 @@ These classes is covered in the following sections.

Specialization

-RCU is and always has been intended primarily for read-mostly situations, as -illustrated by the following figure. -This means that RCU's read-side primitives are optimized, often at the +RCU is and always has been intended primarily for read-mostly situations, +which means that RCU's read-side primitives are optimized, often at the expense of its update-side primitives. +Experience thus far is captured by the following list of situations: -

RCUApplicability.svg

+
    +
  1. Read-mostly data, where stale and inconsistent data is not + a problem: RCU works great! +
  2. Read-mostly data, where data must be consistent: + RCU works well. +
  3. Read-write data, where data must be consistent: + RCU might work OK. + Or not. +
  4. Write-mostly data, where data must be consistent: + RCU is very unlikely to be the right tool for the job, + with the following exceptions, where RCU can provide: +
      +
    1. Existence guarantees for update-friendly mechanisms. +
    2. Wait-free read-side primitives for real-time use. +
    +

This focus on read-mostly situations means that RCU must interoperate @@ -1127,9 +1402,43 @@ synchronization primitives be legal within RCU read-side critical sections, including spinlocks, sequence locks, atomic operations, reference counters, and memory barriers. -

Quick Quiz 11: -What about sleeping locks? -
Answer + + + + + + + +
 
Quick Quiz:
+ What about sleeping locks? +
Answer:
+ These are forbidden within Linux-kernel RCU read-side critical + sections because it is not legal to place a quiescent state + (in this case, voluntary context switch) within an RCU read-side + critical section. + However, sleeping locks may be used within userspace RCU read-side + critical sections, and also within Linux-kernel sleepable RCU + (SRCU) + read-side critical sections. + In addition, the -rt patchset turns spinlocks into a + sleeping locks so that the corresponding critical sections + can be preempted, which also means that these sleeplockified + spinlocks (but not other sleeping locks!) may be acquire within + -rt-Linux-kernel RCU read-side critical sections. + + +

+ Note that it is legal for a normal RCU read-side + critical section to conditionally acquire a sleeping locks + (as in mutex_trylock()), but only as long as it does + not loop indefinitely attempting to conditionally acquire that + sleeping locks. + The key point is that things like mutex_trylock() + either return with the mutex held, or return an error indication if + the mutex was not immediately available. + Either way, mutex_trylock() returns immediately without + sleeping. +

 

It often comes as a surprise that many algorithms do not require a @@ -1160,10 +1469,7 @@ some period of time, so the exact wait period is a judgment call. One of our pair of veternarians might wait 30 seconds before pronouncing the cat dead, while the other might insist on waiting a full minute. The two veternarians would then disagree on the state of the cat during -the final 30 seconds of the minute following the last heartbeat, as -fancifully illustrated below: - -

2013-08-is-it-dead.png

+the final 30 seconds of the minute following the last heartbeat.

Interestingly enough, this same situation applies to hardware. @@ -1343,7 +1649,8 @@ situations where neither synchronize_rcu() nor synchronize_rcu_expedited() would be legal, including within preempt-disable code, local_bh_disable() code, interrupt-disable code, and interrupt handlers. -However, even call_rcu() is illegal within NMI handlers. +However, even call_rcu() is illegal within NMI handlers +and from idle and offline CPUs. The callback function (remove_gp_cb() in this case) will be executed within softirq (software interrupt) environment within the Linux kernel, @@ -1354,12 +1661,27 @@ write an RCU callback function that takes too long. Long-running operations should be relegated to separate threads or (in the Linux kernel) workqueues. -

Quick Quiz 12: -Why does line 19 use rcu_access_pointer()? -After all, call_rcu() on line 25 stores into the -structure, which would interact badly with concurrent insertions. -Doesn't this mean that rcu_dereference() is required? -
Answer + + + + + + + +
 
Quick Quiz:
+ Why does line 19 use rcu_access_pointer()? + After all, call_rcu() on line 25 stores into the + structure, which would interact badly with concurrent insertions. + Doesn't this mean that rcu_dereference() is required? +
Answer:
+ Presumably the ->gp_lock acquired on line 18 excludes + any changes, including any insertions that rcu_dereference() + would protect against. + Therefore, any insertions will be delayed until after + ->gp_lock + is released on line 25, which in turn means that + rcu_access_pointer() suffices. +
 

However, all that remove_gp_cb() is doing is @@ -1406,14 +1728,31 @@ This was due to the fact that RCU was not heavily used within DYNIX/ptx, so the very few places that needed something like synchronize_rcu() simply open-coded it. -

Quick Quiz 13: -Earlier it was claimed that call_rcu() and -kfree_rcu() allowed updaters to avoid being blocked -by readers. -But how can that be correct, given that the invocation of the callback -and the freeing of the memory (respectively) must still wait for -a grace period to elapse? -
Answer + + + + + + + +
 
Quick Quiz:
+ Earlier it was claimed that call_rcu() and + kfree_rcu() allowed updaters to avoid being blocked + by readers. + But how can that be correct, given that the invocation of the callback + and the freeing of the memory (respectively) must still wait for + a grace period to elapse? +
Answer:
+ We could define things this way, but keep in mind that this sort of + definition would say that updates in garbage-collected languages + cannot complete until the next time the garbage collector runs, + which does not seem at all reasonable. + The key point is that in most cases, an updater using either + call_rcu() or kfree_rcu() can proceed to the + next update as soon as it has invoked call_rcu() or + kfree_rcu(), without having to wait for a subsequent + grace period. +
 

But what if the updater must wait for the completion of code to be @@ -1838,11 +2177,26 @@ kthreads to be spawned. Therefore, invoking synchronize_rcu() during scheduler initialization can result in deadlock. -

Quick Quiz 14: -So what happens with synchronize_rcu() during -scheduler initialization for CONFIG_PREEMPT=n -kernels? -
Answer + + + + + + + +
 
Quick Quiz:
+ So what happens with synchronize_rcu() during + scheduler initialization for CONFIG_PREEMPT=n + kernels? +
Answer:
+ In CONFIG_PREEMPT=n kernel, synchronize_rcu() + maps directly to synchronize_sched(). + Therefore, synchronize_rcu() works normally throughout + boot in CONFIG_PREEMPT=n kernels. + However, your code must also work in CONFIG_PREEMPT=y kernels, + so it is still necessary to avoid invoking synchronize_rcu() + during scheduler initialization. +
 

I learned of these boot-time requirements as a result of a series of @@ -2170,6 +2524,14 @@ up to and including systems with 4096 CPUs. This real-time requirement motivated the grace-period kthread, which also simplified handling of a number of race conditions. +

+RCU must avoid degrading real-time response for CPU-bound threads, whether +executing in usermode (which is one use case for +CONFIG_NO_HZ_FULL=y) or in the kernel. +That said, CPU-bound loops in the kernel must execute +cond_resched_rcu_qs() at least once per few tens of milliseconds +in order to avoid receiving an IPI from RCU. +

Finally, RCU's status as a synchronization primitive means that any RCU failure can result in arbitrary memory corruption that can be @@ -2223,6 +2585,8 @@ described in a separate section.

  • Sched Flavor
  • Sleepable RCU
  • Tasks RCU +
  • + Waiting for Multiple Grace Periods

    Bottom-Half Flavor

    @@ -2472,6 +2836,94 @@ The tasks-RCU API is quite compact, consisting only of synchronize_rcu_tasks(), and rcu_barrier_tasks(). +

    +Waiting for Multiple Grace Periods

    + +

    +Perhaps you have an RCU protected data structure that is accessed from +RCU read-side critical sections, from softirq handlers, and from +hardware interrupt handlers. +That is three flavors of RCU, the normal flavor, the bottom-half flavor, +and the sched flavor. +How to wait for a compound grace period? + +

    +The best approach is usually to “just say no!” and +insert rcu_read_lock() and rcu_read_unlock() +around each RCU read-side critical section, regardless of what +environment it happens to be in. +But suppose that some of the RCU read-side critical sections are +on extremely hot code paths, and that use of CONFIG_PREEMPT=n +is not a viable option, so that rcu_read_lock() and +rcu_read_unlock() are not free. +What then? + +

    +You could wait on all three grace periods in succession, as follows: + +

    +
    + 1 synchronize_rcu();
    + 2 synchronize_rcu_bh();
    + 3 synchronize_sched();
    +
    +
    + +

    +This works, but triples the update-side latency penalty. +In cases where this is not acceptable, synchronize_rcu_mult() +may be used to wait on all three flavors of grace period concurrently: + +

    +
    + 1 synchronize_rcu_mult(call_rcu, call_rcu_bh, call_rcu_sched);
    +
    +
    + +

    +But what if it is necessary to also wait on SRCU? +This can be done as follows: + +

    +
    + 1 static void call_my_srcu(struct rcu_head *head,
    + 2        void (*func)(struct rcu_head *head))
    + 3 {
    + 4   call_srcu(&my_srcu, head, func);
    + 5 }
    + 6
    + 7 synchronize_rcu_mult(call_rcu, call_rcu_bh, call_rcu_sched, call_my_srcu);
    +
    +
    + +

    +If you needed to wait on multiple different flavors of SRCU +(but why???), you would need to create a wrapper function resembling +call_my_srcu() for each SRCU flavor. + + + + + + + + +
     
    Quick Quiz:
    + But what if I need to wait for multiple RCU flavors, but I also need + the grace periods to be expedited? +
    Answer:
    + If you are using expedited grace periods, there should be less penalty + for waiting on them in succession. + But if that is nevertheless a problem, you can use workqueues + or multiple kthreads to wait on the various expedited grace + periods concurrently. +
     
    + +

    +Again, it is usually better to adjust the RCU read-side critical sections +to use a single flavor of RCU, but when this is not feasible, you can use +synchronize_rcu_mult(). +

    Possible Future Changes

    @@ -2569,329 +3021,4 @@ and is provided under the terms of the Creative Commons Attribution-Share Alike 3.0 United States license. -

    -Answers to Quick Quizzes

    - - -

    Quick Quiz 1: -Wait a minute! -You said that updaters can make useful forward progress concurrently -with readers, but pre-existing readers will block -synchronize_rcu()!!! -Just who are you trying to fool??? - - -

    Answer: -First, if updaters do not wish to be blocked by readers, they can use -call_rcu() or kfree_rcu(), which will -be discussed later. -Second, even when using synchronize_rcu(), the other -update-side code does run concurrently with readers, whether pre-existing -or not. - - -

    Back to Quick Quiz 1. - - -

    Quick Quiz 2: -Why is the synchronize_rcu() on line 28 needed? - - -

    Answer: -Without that extra grace period, memory reordering could result in -do_something_dlm() executing do_something() -concurrently with the last bits of recovery(). - - -

    Back to Quick Quiz 2. - - -

    Quick Quiz 3: -But rcu_assign_pointer() does nothing to prevent the -two assignments to p->a and p->b -from being reordered. -Can't that also cause problems? - - -

    Answer: -No, it cannot. -The readers cannot see either of these two fields until -the assignment to gp, by which time both fields are -fully initialized. -So reordering the assignments -to p->a and p->b cannot possibly -cause any problems. - - -

    Back to Quick Quiz 3. - - -

    Quick Quiz 4: -Without the rcu_dereference() or the -rcu_access_pointer(), what destructive optimizations -might the compiler make use of? - - -

    Answer: -Let's start with what happens to do_something_gp() -if it fails to use rcu_dereference(). -It could reuse a value formerly fetched from this same pointer. -It could also fetch the pointer from gp in a byte-at-a-time -manner, resulting in load tearing, in turn resulting a bytewise -mash-up of two distince pointer values. -It might even use value-speculation optimizations, where it makes a wrong -guess, but by the time it gets around to checking the value, an update -has changed the pointer to match the wrong guess. -Too bad about any dereferences that returned pre-initialization garbage -in the meantime! - -

    -For remove_gp_synchronous(), as long as all modifications -to gp are carried out while holding gp_lock, -the above optimizations are harmless. -However, -with CONFIG_SPARSE_RCU_POINTER=y, -sparse will complain if you -define gp with __rcu and then -access it without using -either rcu_access_pointer() or rcu_dereference(). - - -

    Back to Quick Quiz 4. - - -

    Quick Quiz 5: -Given that multiple CPUs can start RCU read-side critical sections -at any time without any ordering whatsoever, how can RCU possibly tell whether -or not a given RCU read-side critical section starts before a -given instance of synchronize_rcu()? - - -

    Answer: -If RCU cannot tell whether or not a given -RCU read-side critical section starts before a -given instance of synchronize_rcu(), -then it must assume that the RCU read-side critical section -started first. -In other words, a given instance of synchronize_rcu() -can avoid waiting on a given RCU read-side critical section only -if it can prove that synchronize_rcu() started first. - - -

    Back to Quick Quiz 5. - - -

    Quick Quiz 6: -The first and second guarantees require unbelievably strict ordering! -Are all these memory barriers really required? - - -

    Answer: -Yes, they really are required. -To see why the first guarantee is required, consider the following -sequence of events: - -

      -
    1. CPU 1: rcu_read_lock() -
    2. CPU 1: q = rcu_dereference(gp); - /* Very likely to return p. */ -
    3. CPU 0: list_del_rcu(p); -
    4. CPU 0: synchronize_rcu() starts. -
    5. CPU 1: do_something_with(q->a); - /* No smp_mb(), so might happen after kfree(). */ -
    6. CPU 1: rcu_read_unlock() -
    7. CPU 0: synchronize_rcu() returns. -
    8. CPU 0: kfree(p); -
    - -

    -Therefore, there absolutely must be a full memory barrier between the -end of the RCU read-side critical section and the end of the -grace period. - -

    -The sequence of events demonstrating the necessity of the second rule -is roughly similar: - -

      -
    1. CPU 0: list_del_rcu(p); -
    2. CPU 0: synchronize_rcu() starts. -
    3. CPU 1: rcu_read_lock() -
    4. CPU 1: q = rcu_dereference(gp); - /* Might return p if no memory barrier. */ -
    5. CPU 0: synchronize_rcu() returns. -
    6. CPU 0: kfree(p); -
    7. CPU 1: do_something_with(q->a); /* Boom!!! */ -
    8. CPU 1: rcu_read_unlock() -
    - -

    -And similarly, without a memory barrier between the beginning of the -grace period and the beginning of the RCU read-side critical section, -CPU 1 might end up accessing the freelist. - -

    -The “as if” rule of course applies, so that any implementation -that acts as if the appropriate memory barriers were in place is a -correct implementation. -That said, it is much easier to fool yourself into believing that you have -adhered to the as-if rule than it is to actually adhere to it! - - -

    Back to Quick Quiz 6. - - -

    Quick Quiz 7: -But how does the upgrade-to-write operation exclude other readers? - - -

    Answer: -It doesn't, just like normal RCU updates, which also do not exclude -RCU readers. - - -

    Back to Quick Quiz 7. - - -

    Quick Quiz 8: -Can't the compiler also reorder this code? - - -

    Answer: -No, the volatile casts in READ_ONCE() and -WRITE_ONCE() prevent the compiler from reordering in -this particular case. - - -

    Back to Quick Quiz 8. - - -

    Quick Quiz 9: -Suppose that synchronize_rcu() did wait until all readers had completed. -Would the updater be able to rely on this? - - -

    Answer: -No. -Even if synchronize_rcu() were to wait until -all readers had completed, a new reader might start immediately after -synchronize_rcu() completed. -Therefore, the code following -synchronize_rcu() cannot rely on there being no readers -in any case. - - -

    Back to Quick Quiz 9. - - -

    Quick Quiz 10: -How long a sequence of grace periods, each separated by an RCU read-side -critical section, would be required to partition the RCU read-side -critical sections at the beginning and end of the chain? - - -

    Answer: -In theory, an infinite number. -In practice, an unknown number that is sensitive to both implementation -details and timing considerations. -Therefore, even in practice, RCU users must abide by the theoretical rather -than the practical answer. - - -

    Back to Quick Quiz 10. - - -

    Quick Quiz 11: -What about sleeping locks? - - -

    Answer: -These are forbidden within Linux-kernel RCU read-side critical sections -because it is not legal to place a quiescent state (in this case, -voluntary context switch) within an RCU read-side critical section. -However, sleeping locks may be used within userspace RCU read-side critical -sections, and also within Linux-kernel sleepable RCU -(SRCU) -read-side critical sections. -In addition, the -rt patchset turns spinlocks into a sleeping locks so -that the corresponding critical sections can be preempted, which -also means that these sleeplockified spinlocks (but not other sleeping locks!) -may be acquire within -rt-Linux-kernel RCU read-side critical sections. - -

    -Note that it is legal for a normal RCU read-side critical section -to conditionally acquire a sleeping locks (as in mutex_trylock()), -but only as long as it does not loop indefinitely attempting to -conditionally acquire that sleeping locks. -The key point is that things like mutex_trylock() -either return with the mutex held, or return an error indication if -the mutex was not immediately available. -Either way, mutex_trylock() returns immediately without sleeping. - - -

    Back to Quick Quiz 11. - - -

    Quick Quiz 12: -Why does line 19 use rcu_access_pointer()? -After all, call_rcu() on line 25 stores into the -structure, which would interact badly with concurrent insertions. -Doesn't this mean that rcu_dereference() is required? - - -

    Answer: -Presumably the ->gp_lock acquired on line 18 excludes -any changes, including any insertions that rcu_dereference() -would protect against. -Therefore, any insertions will be delayed until after ->gp_lock -is released on line 25, which in turn means that -rcu_access_pointer() suffices. - - -

    Back to Quick Quiz 12. - - -

    Quick Quiz 13: -Earlier it was claimed that call_rcu() and -kfree_rcu() allowed updaters to avoid being blocked -by readers. -But how can that be correct, given that the invocation of the callback -and the freeing of the memory (respectively) must still wait for -a grace period to elapse? - - -

    Answer: -We could define things this way, but keep in mind that this sort of -definition would say that updates in garbage-collected languages -cannot complete until the next time the garbage collector runs, -which does not seem at all reasonable. -The key point is that in most cases, an updater using either -call_rcu() or kfree_rcu() can proceed to the -next update as soon as it has invoked call_rcu() or -kfree_rcu(), without having to wait for a subsequent -grace period. - - -

    Back to Quick Quiz 13. - - -

    Quick Quiz 14: -So what happens with synchronize_rcu() during -scheduler initialization for CONFIG_PREEMPT=n -kernels? - - -

    Answer: -In CONFIG_PREEMPT=n kernel, synchronize_rcu() -maps directly to synchronize_sched(). -Therefore, synchronize_rcu() works normally throughout -boot in CONFIG_PREEMPT=n kernels. -However, your code must also work in CONFIG_PREEMPT=y kernels, -so it is still necessary to avoid invoking synchronize_rcu() -during scheduler initialization. - - -

    Back to Quick Quiz 14. - - -- cgit v1.2.3-54-g00ecf