Welcome to Rusty's Remarkably Unreliable Guide to Kernel Locking issues. This document describes the locking systems in the Linux Kernel as we approach 2.4.
It looks like SMP is here to stay, so everyone hacking on the kernel these days needs to know the fundamentals of concurrency and locking for SMP.
(Skip this if you know what a Race Condition is).
In a normal program, you can increment a counter like so:
very_important_count++;
If two instances of this code run at the same time, this is what you would expect to happen:
Table 1-1. Expected Results

| Instance 1 | Instance 2 |
|---|---|
| read very_important_count (5) | |
| add 1 (6) | |
| write very_important_count (6) | |
| | read very_important_count (6) |
| | add 1 (7) |
| | write very_important_count (7) |
This is what might happen:
Table 1-2. Possible Results

| Instance 1 | Instance 2 |
|---|---|
| read very_important_count (5) | |
| | read very_important_count (5) |
| add 1 (6) | |
| | add 1 (6) |
| write very_important_count (6) | |
| | write very_important_count (6) |
This overlap, where what actually happens depends on the relative timing of multiple tasks, is called a race condition. The piece of code containing the concurrency issue is called a critical region. Especially since Linux started running on SMP machines, race conditions have become one of the major issues in kernel design and implementation.
The solution is to recognize when these simultaneous accesses occur, and use locks to make sure that only one instance can enter the critical region at any time. There are many friendly primitives in the Linux kernel to help you do this. And then there are the unfriendly primitives, but I'll pretend they don't exist.