Mutex #41
Conversation
Draft for now, so I can first use it somewhere.
Used it in the example as an example. Ready from my side.
It's a 3p library, but it may make sense to consider using https://docs.rs/lock_api/0.4.2/lock_api/
I would like to keep the dependencies for the kernel at a bare minimum, so I would like to implement it myself instead of using this crate.
Renamed to only Mutex, as I think the example is enough to test, and so the change can be kept smaller.
Something broke now. Should sleep now, will fix tomorrow.
Signed-off-by: Finn Behrens <[email protected]>
}

impl<T: ?Sized> Mutex<T> {
    /// acquire a lock on the mutex
Nitpicks: uppercase on the first letter, end with a period (`.`).
    /// acquire a lock on the mutex
    /// # unsafe
    /// This is unsafe, as it returns a second lock if one is already help by the current process
    // with CONFIG_DEBUG_LOCK_ALLOW mutex_lock is a macro, which calls mutex_lock_nested(&mutex, 0)
Nitpicks: missing `/`, typo "help" -> "held", use Markdown syntax, i.e. `CONFIG_DEBUG_LOCK_ALLOW` and `mutex_lock_nested(&mutex, 0)`, end with `.`. Also "task" instead of "process".
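For reference, the doc comment with those nitpicks applied might read roughly as follows (a sketch only; my reading of "missing `/`" is that the trailing `//` line should become a `///` doc comment, and the exact wording that eventually landed may differ):

```rust
/// Acquire a lock on the mutex.
/// # unsafe
/// This is unsafe, as it returns a second lock if one is already held by the current task.
/// With `CONFIG_DEBUG_LOCK_ALLOW`, `mutex_lock` is a macro which calls `mutex_lock_nested(&mutex, 0)`.
```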
impl<T: ?Sized> Mutex<T> {
    /// acquire a lock on the mutex
    /// # unsafe
    /// This is unsafe, as it returns a second lock if one is already help by the current process
Are there any kernel APIs we can use to make this safe? It'd be a shame if all mutex usage requires `unsafe`.
I guess we could test if the mutex is already locked and panic in that case (similar to what Rust's std guarantees: a deadlock or a panic but never returning), but that would pretty much be like redoing the kernel mutex debugging infrastructure, so not sure how it will be received. Even if we go that route, we should have both APIs.
^ "panic in that case" -> "deadlock in that case" -- considering we are the kernel, it is way better to deadlock a task (with perhaps a warning or even an oops prior to the deadlock itself) than kill the entire system.
Hmm... I wonder, could we do this statically for a subset of the use cases? Say, two types, `Mutex` and `MutexLocked`, which move themselves into each other on every operation (and forget themselves to avoid calling unnecessary drops, which would require a test). Then we could have code like:
let m = m.lock();
...
let m = m.unlock();
Things like `trylock` could return an enum of either locked/unlocked, there could be a `m.take(|...| { ... })`, etc.
Of course, this doesn't work if one needs to keep the mutex stored in different states, but in some use cases this could be enough.
What do you think?
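To make the idea concrete, here is a minimal standalone sketch of such a typestate API (hypothetical names, no real locking; the kernel version would wrap a pinned `bindings::mutex` and call `mutex_lock`/`mutex_unlock` inside `lock`/`unlock`):

```rust
// Standalone sketch: holding the lock is encoded in the type, so a
// double lock or double unlock simply does not compile.
pub struct Mutex<T> {
    data: T,
    // A real version would also hold the underlying kernel mutex.
}

pub struct MutexLocked<T> {
    data: T,
}

impl<T> Mutex<T> {
    pub fn new(data: T) -> Self {
        Mutex { data }
    }

    /// Consumes the unlocked state; the real version would call mutex_lock() here.
    pub fn lock(self) -> MutexLocked<T> {
        MutexLocked { data: self.data }
    }
}

impl<T> MutexLocked<T> {
    /// The protected data is only reachable in the locked state.
    pub fn get_mut(&mut self) -> &mut T {
        &mut self.data
    }

    /// Consumes the locked state; the real version would call mutex_unlock() here.
    pub fn unlock(self) -> Mutex<T> {
        Mutex { data: self.data }
    }
}

fn main() {
    let m = Mutex::new(0u32);
    let mut m = m.lock();
    *m.get_mut() += 1;
    let _m = m.unlock();
    // let _again = _m.unlock(); // would not compile: no unlock() on Mutex
}
```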
Not a fan. With `MutexGuard` you only need a reference to the mutex. Your idea needs an owned type, which cannot live in a struct or similar.
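For contrast, a minimal standalone sketch of the guard pattern being argued for here (again no real locking; the names mirror the PR but the bodies are placeholders): `lock()` only needs `&self`, so the mutex can sit inside a larger struct, and unlocking happens automatically when the guard is dropped.

```rust
use std::cell::UnsafeCell;
use std::ops::{Deref, DerefMut};

pub struct Mutex<T> {
    data: UnsafeCell<T>,
    // The kernel version wraps a pinned `bindings::mutex` here.
}

pub struct MutexGuard<'a, T> {
    mutex: &'a Mutex<T>,
}

impl<T> Mutex<T> {
    pub fn new(data: T) -> Self {
        Mutex { data: UnsafeCell::new(data) }
    }

    /// Locks through a shared reference; the real version would call mutex_lock().
    pub fn lock(&self) -> MutexGuard<'_, T> {
        MutexGuard { mutex: self }
    }
}

impl<'a, T> Deref for MutexGuard<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { &*self.mutex.data.get() }
    }
}

impl<'a, T> DerefMut for MutexGuard<'a, T> {
    fn deref_mut(&mut self) -> &mut T {
        unsafe { &mut *self.mutex.data.get() }
    }
}

impl<'a, T> Drop for MutexGuard<'a, T> {
    fn drop(&mut self) {
        // The real version would call mutex_unlock() here.
    }
}

fn main() {
    struct Driver {
        count: Mutex<u32>, // the mutex can live inside a struct
    }
    let d = Driver { count: Mutex::new(0) };
    *d.count.lock() += 1; // locked via &self, unlocked when the guard drops
}
```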
I must be missing something here. Given that mutexes are not recursive, how is it that lock() can return a second MutexGuard when the mutex is already locked (by whomever)?
Currently it's unsafe because, if you call it two times from the same function, it calls `mutex_unlock` two times, which results in a kernel panic.
I tested this and it just deadlocked the task, perhaps we have different debugging configurations. In any case, whether it deadlocks or panics, why does it make this unsafe?
Mutex from the std library is not unsafe and behaves similarly:
> The exact behavior on locking a mutex in the thread which already holds the lock is left unspecified. However, this function will not return on the second call (it might panic or deadlock, for example).
> why does it make this unsafe?

It depends whether the panic is a controlled one or one triggered due to a double free, GPF or similar.

> Mutex from the std library is not unsafe and behaves similarly:

Yeah, this is what I mentioned in my first comment above.
We can remove the unsafe again and write some documentation. I can only do it after my exams, so either feel free to take my work, or wait until I have more time again.
Don't worry about it. Good luck on your exams! :)
use crate::bindings;

pub struct Mutex<T: ?Sized> {
    lock: bindings::mutex,
`struct mutex` contains a `struct list_head`, which is a self-referential data structure. So we need to pin `Mutex` before we can safely initialise and use it.
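As a standalone illustration of that point (no kernel bindings; `RawLock` and its `self_ptr` field are stand-ins for `struct mutex` and its `list_head`): once an init function records the address of a field inside the struct, the value must never move again, which is exactly what `Pin` expresses.

```rust
use core::marker::PhantomPinned;
use core::pin::Pin;
use core::ptr;

struct RawLock {
    // Stand-in for list_head: after init() this points back into the
    // containing allocation, so moving the value would dangle it.
    self_ptr: *const RawLock,
    _pin: PhantomPinned, // opt out of Unpin
}

impl RawLock {
    fn new() -> Self {
        RawLock { self_ptr: ptr::null(), _pin: PhantomPinned }
    }

    // Mirrors mutex_init(): only callable once the value is pinned and
    // therefore guaranteed not to move anymore.
    fn init(self: Pin<&mut Self>) {
        // SAFETY: we only write a field and never move the value out.
        let this = unsafe { self.get_unchecked_mut() };
        this.self_ptr = this as *const RawLock;
    }
}

fn main() {
    // Box::pin gives the value a stable address for its whole lifetime.
    let mut lock = Box::pin(RawLock::new());
    lock.as_mut().init();
    assert_eq!(lock.self_ptr, &*lock as *const RawLock);
}
```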
#[cfg(CONFIG_DEBUG_LOCK_ALLOC)]
pub unsafe fn lock<'a>(&'a self) -> MutexGuard<'a, T> {
    unsafe {
        bindings::mutex_lock_nested(
You need to call `mutex_init` before you can safely use it. Where is it being called?
For some mutexes (I think it was in the rtlink code) I didn't see it called anywhere. So I thought it works without calling `mutex_init` if you set the `list_head`.
> For some mutexes (I think it was in the rtlink code) I didn't see it called anywhere.

In such cases they were probably defined with `DEFINE_MUTEX`, which allows them to skip `mutex_init`.

> So I thought it works without calling `mutex_init` if you set the `list_head`.

Where are you setting the head of the list?

In any case, check out the Generic Mutex Subsystem documentation, in particular:

> A mutex must only be initialized via the API (see below).
Hi Finn, I just saw on the rust-for-linux list archive that you are doing this to implement pr_*; I'm not responding there because I just joined the list, so I don't have the thread to reply to. Anyway, I don't think mutex is what you need for this, because it will put the task to sleep when it is locked, which means that the pr_* functions you implement won't be usable in contexts where sleeping is not an option, for example, interrupt context. You'll likely need a spinlock -- I was working on an implementation before, I'll upload it to a branch here so that you can take a look. But if you only need to set a value once, other sync primitives may be cheaper/better. Could you give more details on what you're trying to do? Thanks!
Yes, I only want to set the value once. So I guess other sync primitives are better.
Yes, closing this sounds like a good plan.
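For the "set a value once" case, a cheaper primitive could look something like the following standalone sketch (plain core atomics, not a kernel API; the names are hypothetical): the write is claimed exactly once, published with release/acquire ordering, and nothing ever sleeps, so it works in any context.

```rust
use core::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

// Set-once, read-many value: the first set() wins, later calls are ignored.
struct SetOnce {
    claimed: AtomicBool, // has some caller started writing?
    ready: AtomicBool,   // is the value published and safe to read?
    value: AtomicUsize,
}

impl SetOnce {
    const fn new() -> Self {
        SetOnce {
            claimed: AtomicBool::new(false),
            ready: AtomicBool::new(false),
            value: AtomicUsize::new(0),
        }
    }

    fn set(&self, v: usize) {
        // Only the first caller wins the claim; all others return immediately.
        if self
            .claimed
            .compare_exchange(false, true, Ordering::Relaxed, Ordering::Relaxed)
            .is_ok()
        {
            self.value.store(v, Ordering::Relaxed);
            // Release pairs with the Acquire in get(), making `value` visible.
            self.ready.store(true, Ordering::Release);
        }
    }

    fn get(&self) -> Option<usize> {
        if self.ready.load(Ordering::Acquire) {
            Some(self.value.load(Ordering::Relaxed))
        } else {
            None
        }
    }
}

static CONFIG: SetOnce = SetOnce::new();

fn main() {
    CONFIG.set(42);
    CONFIG.set(7); // ignored: the value was already set
    assert_eq!(CONFIG.get(), Some(42));
}
```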
Create a Mutex and pr_* macros