Preface
Some weeks back, on the LetsGoRusty Discord server, a topic of discussion was the behaviour of a match expression vis-a-vis a lock. The discussion thread was quite interesting for me. Fellow members of the channel exchanged messages, much to my benefit. After all, I am learning Rust! 😁
In short, the problem was this:
match receiver.lock().unwrap().recv().unwrap() {
// ...
}
and
// ...
let message = receiver.lock().unwrap().recv().unwrap();
match message {
// ....
}
behave differently. Why?
As someone who is learning Rust enthusiastically, and fighting the Borrow Checker valiantly along the way 😃, I was intrigued by this question. I wanted to find out - to the extent I could - the exact reason behind this behaviour. Whatever I have been able to understand, I have captured in a series of articles (listed below). This is the final article of that series.
Build up
Perhaps, it will be easier to understand the background of this article, if the ones just preceding it (in order as below) are referred to:
Explores Method-call expressions and binding. (link)
Explores RAII, OBRM and how these are used in establishing the behaviour of a Mutex (link)
Explores the interplay between the scope of a temporary and match expressions - 3 (link)
What are the main takeaways so far (summary of the earlier articles)
- An expression like a.b().c() is called a Method-call expression. It is kind of obvious that c() can be called only when a receiver object exists which implements the method c(). A method/function cannot be called in isolation. The question is, which is the receiver object in that expression, on which c() is available.
- Any expression evaluates to a value. So does a Method-call expression. How we deal with that value is important.
- In order to facilitate execution of a Method-call expression, the Rust compiler brings forth temporary objects as receivers of the methods. The scope of these temporaries - i.e. how long they are available - is governed by a few rules.
- The purpose of having a Mutex is to set up a fence around a piece of data. When a thread of execution needs access to this piece of data, the Mutex must set up a lock around the data, ensuring exclusive access by this particular thread. While the lock is in place, no other thread can access the data. Therefore, it is important to remove the exclusive lock as soon as the current thread of execution is done with it.
- Rust employs a technique named OBRM (more commonly known as RAII) to implement the lock by creating a temporary object of type MutexGuard and using the scoping rules to ensure its timely destruction (thereby, releasing the lock).
- When used in a match expression, the rules of scoping the temporaries determine how the lock is held and released. This is an important observation.
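The takeaways above can be made visible with a small sketch. It does not use a real Mutex; instead, a hypothetical Guard type (my own stand-in for MutexGuard, not something from the earlier articles) records when it is created and when it is dropped, so the two forms of match can be compared side by side. The names acquire(), value() and the event strings are all assumptions made for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log, so the order of events can be inspected afterwards.
type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical stand-in for a `MutexGuard`: it records when it is dropped.
struct Guard {
    log: Log,
}

impl Guard {
    /// Plays the role of `recv()`: returns a plain value computed via `&self`.
    fn value(&self) -> u32 {
        7
    }
}

impl Drop for Guard {
    fn drop(&mut self) {
        self.log.borrow_mut().push("guard dropped");
    }
}

/// Plays the role of `lock().unwrap()`: hands out a fresh guard.
fn acquire(log: &Log) -> Guard {
    log.borrow_mut().push("guard created");
    Guard { log: Rc::clone(log) }
}

/// Match directly on the method-call expression.
fn match_on_expression() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    // The temporary `Guard` lives until the end of the whole `match`.
    match acquire(&log).value() {
        _ => log.borrow_mut().push("arm body ran"),
    }
    // Copy the events out before `log` goes out of scope.
    let events = log.borrow().clone();
    events
}

/// Bind the value first, then match on the binding.
fn match_on_binding() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let value = acquire(&log).value(); // temporary `Guard` dropped at this `;`
    match value {
        _ => log.borrow_mut().push("arm body ran"),
    }
    let events = log.borrow().clone();
    events
}
```

Calling match_on_expression() should report the guard being dropped only after the arm body has run, whereas match_on_binding() should report the drop before the arm body. That ordering is the whole story of this series in miniature.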
A brief description of the application where it started
The Discord discussion thread is about a multi-threaded webserver toy application from the Rust Book. This application demonstrates - amongst other things - inter-thread communication using Channels. A bunch of workers is looking for jobs to arrive at a channel: when a job arrives, a worker picks it up and executes it. This article is not about threads and channels, so we don't go that way. It is sufficient for us to know that the Channel (the receiving end of the channel, to be accurate, but that is not important) is held inside a Mutex. Any thread that needs to peep in the channel to see if a job is waiting to be picked up must acquire the lock!
What is the real issue with the behaviour here?
Well, imagine a queue of jobs. One thread acquires the lock on the queue and picks up a job which is at the mouth of the queue. Once it picks up, it is going to execute the job. During the execution, it must release the lock, so that the next job at the mouth of the queue, may be picked up by another thread. Nothing really very complicated, so far.
But, what if the first thread never releases the lock? In that case, even if jobs accumulate in the queue, other threads - even if free - will remain idle. The first thread, when it is done with the current job, will get a chance to peek again and extract the next job. Only this thread will be working, and in effect, the application will behave as if it is single-threaded.
Explaining the behavior
The code snippet that began the discussion on 'Let's Go Rusty' Discord server, was:
// ...
let message = receiver.lock().unwrap().recv().unwrap();
match message { // <-- match on the bound variable
Message::NewJob(job) => {
println!("Worker {} got a new job, executing ..", id);
job();
},
Message::Terminate => {
println!("Worker {} was told to terminate!", id);
break;
}
}
// ...
The point was that the behaviour was different (as outlined earlier) when the first line in the snippet is changed to this:
// ...
match receiver.lock().unwrap().recv().unwrap() { // <-- match on the expression
Message::NewJob(job) => {
println!("Worker {} got a new job, executing ..", id);
job();
},
Message::Terminate => {
println!("Worker {} was told to terminate!", id);
break;
}
}
// ...
In the second case, the lock is held till the end of the match expression, as we have seen in the earlier article. Therefore, while the job() is being done, receiver's channel - the job-queue's head - is locked (the MutexGuard is alive). No other thread can access the receiver's channel during this time. In effect, the application behaves as if it is single threaded.
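One way to observe this without spawning any threads is to probe the Mutex from inside the match arm with try_lock(), which never blocks and returns an error while a guard is alive. What follows is a minimal sketch under my own assumptions: the jobs are plain u32 values instead of closures, and the helper is_free() and both function names are hypothetical, not part of the application discussed on Discord.

```rust
use std::sync::mpsc;
use std::sync::Mutex;

/// Returns true if the mutex can be locked right now,
/// i.e. no `MutexGuard` for it is currently alive.
fn is_free<T>(mutex: &Mutex<T>) -> bool {
    mutex.try_lock().is_ok()
}

/// Match directly on the expression, then probe the lock inside the arm.
fn free_inside_arm_matching_expression() -> bool {
    let (sender, receiver) = mpsc::channel::<u32>();
    let receiver = Mutex::new(receiver);
    sender.send(1).unwrap();

    // The temporary guard created by `lock().unwrap()` is still alive
    // while the arm runs, so the probe finds the mutex locked.
    let free = match receiver.lock().unwrap().recv().unwrap() {
        _job => is_free(&receiver),
    };
    free
}

/// Bind the message first, then match, then probe the lock inside the arm.
fn free_inside_arm_matching_binding() -> bool {
    let (sender, receiver) = mpsc::channel::<u32>();
    let receiver = Mutex::new(receiver);
    sender.send(1).unwrap();

    let job = receiver.lock().unwrap().recv().unwrap(); // guard dropped at `;`
    let free = match job {
        _ => is_free(&receiver),
    };
    free
}
```

The first function should return false (the lock is still held inside the arm) and the second should return true. In the real application, "the lock is still held" translates to every other worker waiting while one worker executes its job.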
Expressions while let and match
The example shared on Discord is almost the same as the sample multi-threaded application that is implemented in the Rust book. The book also mentions the same problem as described above, but as a slightly different case.
This works as a multi-threaded server alright:
// --snip--
impl Worker {
fn new(id: usize, receiver: Arc<Mutex<mpsc::Receiver<Job>>>) -> Worker {
let thread = thread::spawn(move || loop {
let job = receiver.lock().unwrap().recv().unwrap(); // <== the lock is released here (temporary dropped)
println!("Worker {id} got a job; executing.");
job(); // <-- do the job and then loop back again
});
Worker { id, thread }
}
}
But, this behaves as a single-threaded server:
// --snip--
impl Worker {
fn new(id: usize, receiver: Arc<Mutex<mpsc::Receiver<Job>>>) -> Worker {
let thread = thread::spawn(move || {
while let Ok(job) = receiver.lock().unwrap().recv() { // <-- lock is not released ...
println!("Worker {id} got a job; executing.");
job();
} // <-- till here!
});
Worker { id, thread }
}
}
There is no match, yet the behaviour is the same as in match receiver.lock().unwrap().recv(). Why?
This is so because the Rust compiler translates a while let loop into a loop wrapped around a match expression!
The while let Ok(job) = receiver.lock().unwrap().recv() is made into something equivalent to this:
loop {
match receiver.lock().unwrap().recv() { // <-- lock is not released
Ok(job) => job(),
Err(_) => { break; }
} // <-- till here!
}
Like earlier, the lock is still held while the match arm runs and is not released till the end of the match expression, and then it loops!
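The same try_lock() probe can be used to compare the while let form with a loop that binds the message first. This is only a sketch under assumptions of my own: jobs are u32 values rather than closures, the queue is preloaded and its sender dropped so that recv() eventually returns Err, and the names preloaded_queue(), free_iterations_with_while_let() and free_iterations_with_loop_and_let() are hypothetical.

```rust
use std::sync::mpsc::{self, Receiver};
use std::sync::Mutex;

/// Hypothetical helper: a receiver behind a Mutex, preloaded with three jobs.
fn preloaded_queue() -> Mutex<Receiver<u32>> {
    let (sender, receiver) = mpsc::channel();
    for job in 1..=3 {
        sender.send(job).unwrap();
    }
    // `sender` is dropped on return, so `recv()` yields `Err`
    // once the three buffered jobs have been taken.
    Mutex::new(receiver)
}

/// Counts the iterations in which the lock was free while the body ran.
fn free_iterations_with_while_let() -> usize {
    let queue = preloaded_queue();
    let mut free = 0;
    // The guard is a temporary of the scrutinee: it lives for the whole body.
    while let Ok(_job) = queue.lock().unwrap().recv() {
        if queue.try_lock().is_ok() {
            free += 1;
        }
    }
    free
}

/// Same loop, but the message is bound with `let` before it is matched.
fn free_iterations_with_loop_and_let() -> usize {
    let queue = preloaded_queue();
    let mut free = 0;
    loop {
        let message = queue.lock().unwrap().recv(); // guard dropped at `;`
        match message {
            Ok(_job) => {
                if queue.try_lock().is_ok() {
                    free += 1;
                }
            }
            Err(_) => break,
        }
    }
    free
}
```

With three jobs in the queue, the while let version should find the lock free in none of the iterations, while the loop-and-let version should find it free in all three.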
Conclusion
An idiomatic use of Mutex requires us to understand the important role that a temporary MutexGuard plays in fencing and freeing the data that is under the Mutex's lock. The MutexGuard's construction and destruction ensure that the lock is held and released at the appropriate time, so that access to data is not prevented when it is necessary. The rules that govern the scope of a temporary in a Rust expression are crucial here, because they determine how long the MutexGuard lives!
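When relying on the temporary's scope feels too subtle, the guard's lifetime can also be spelled out. The sketch below shows two common ways, an inner block and an explicit drop(); it is an illustration of mine rather than something taken from the Discord thread, and it uses a Mutex around a Vec of u32 values with hypothetical function names.

```rust
use std::sync::Mutex;

/// Takes one item under the lock, using an inner block to end the guard's life.
/// The returned bool tells whether the lock was free again afterwards.
fn pop_with_block(queue: &Mutex<Vec<u32>>) -> (Option<u32>, bool) {
    let item = {
        let mut guard = queue.lock().unwrap();
        guard.pop()
    }; // `guard` goes out of scope here, releasing the lock

    // This is where the long-running work on `item` would happen.
    let lock_was_free = queue.try_lock().is_ok();
    (item, lock_was_free)
}

/// Same idea, but the guard is released with an explicit `drop`.
fn pop_with_drop(queue: &Mutex<Vec<u32>>) -> (Option<u32>, bool) {
    let mut guard = queue.lock().unwrap();
    let item = guard.pop();
    drop(guard); // explicit release of the lock

    let lock_was_free = queue.try_lock().is_ok();
    (item, lock_was_free)
}
```

Both functions should report that the lock is free once the item has been taken. Which form reads better is a matter of taste; the point is only that the release of the lock becomes visible in the code.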
It will be fantastic to hear your comments on this series. If this benefits someone, I will be happy. I will be happier if someone spots and points out the gaps in my understanding. I am learning and any help is welcome! 😃
Top comments (2)
The other day I had some trouble with temporaries not going out of scope, and it had nothing to do with match statements.
The trouble was in this line:
tokio::time::sleep(Duration::from_millis(thread_rng().gen_range(0..=1000))).await;
The ThreadRng returned by thread_rng() is not Send, and .await needs Send on everything in its Future.
Storing the integer from gen_range() in a separate variable and calling sleep() on that variable compiled ok. Turns out that temporaries are kept alive until the expression where they are created has ended: at the semicolon or at the end of the encompassing expression, like a match statement. I do not know the exact details though.
This here seems like the same problem to me. But I am no expert :-)
Thank you very much for taking time to drop in and share your observations.
Sorry for responding to your comment so much later. The day-to-day job of earning a living comes in the way of tasks I should have been more regular with. :-)
This is an excellent observation. I have gone through your interaction with others in 'rust-lang.org'. Your conclusion at the end of that conversation, is valid, IMO.
In my blog, I didn't explore the aspect of temporaries being used in expressions that are shared amongst the threads. Because yielding and awaiting require the Future to remember the previously computed state, the return value of gen_range() being an integer helps. It is Copy.
For that Rc<> discussion in there, my point is that we create an Rc before we share it with the threads. In other words, I know what I am sharing and I am preparing for that, before I begin to share.
Please keep sharing your viewpoints, whenever you can.