Avoiding Locks: Read And Write Ordering

Sometimes it is possible to avoid locking. Consider the following case from the 2.2 firewall code, which inserted an element into a single linked list in user context:

        new->next = i->next;
        i->next = new;
    

Here the author (Alan Cox, who knows what he's doing) assumes that the pointer assignments are atomic. This is important, because networking packets would traverse this list on bottom halves without a lock. Depending on their exact timing, they would either see the new element in the list with a valid next pointer, or it would not be in the list yet. A lock is still required against other CPUs inserting or deleting from the list, of course.

Of course, the writes must be in this order, otherwise the new element appears in the list with an invalid next pointer, and any other CPU iterating at the wrong time will jump through it into garbage. Because modern CPUs reorder, Alan's code actually read as follows:

        new->next = i->next;
        wmb();
        i->next = new;
    

The wmb() is a write memory barrier (include/asm/system.h): neither the compiler nor the CPU will allow any writes to memory after the wmb() to be visible to other hardware before any of the writes before the wmb().

As i386 does not do write reordering, this bug would never show up on that platform. On other SMP platforms, however, it will.

There is also rmb() for read ordering: to ensure any previous variable reads occur before following reads. The simple mb() macro combines both rmb() and wmb().

Some atomic operations are defined to act as a memory barrier (ie. as per the mb() macro, but if in doubt, be explicit. Also, spinlock operations act as partial barriers: operations after gaining a spinlock will never be moved to precede the spin_lock() call, and operations before releasing a spinlock will never be moved after the spin_unlock() call.