Message ID | 20240214084419.6194-1-brgl@bgdev.pl |
---|---|
Headers | show |
Series | gpio: fix SRCU bugs | expand |
On Wed, Feb 14, 2024 at 09:44:15AM +0100, Bartosz Golaszewski wrote: > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > Here are four fixes to some bugs in recent SRCU changes. The first one fixes > an actual race condition. The other three just make lockdep happy. Same comment here, can we simply redo this work so we won't have tons of fixes on top of the nice RCU rework?
On Wed, Feb 14, 2024 at 1:53 PM Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote: > > On Wed, Feb 14, 2024 at 09:44:15AM +0100, Bartosz Golaszewski wrote: > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > > > Here are four fixes to some bugs in recent SRCU changes. The first one fixes > > an actual race condition. The other three just make lockdep happy. > > Same comment here, can we simply redo this work so we won't have tons of fixes > on top of the nice RCU rework? > The rework is clearly a new development - not meant for backporting. I don't see any benefit from rebasing. These are normal fixes for a big rework. Bart
On Wed, Feb 14, 2024 at 09:44:15AM +0100, Bartosz Golaszewski wrote: > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > Here are four fixes to some bugs in recent SRCU changes. The first one fixes > an actual race condition. The other three just make lockdep happy. This doesn't fix the issue I reported yesterday when applied on top of today's next: https://lava.sirena.org.uk/scheduler/job/585469 [ 1.995518] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000078 ... [ 2.176162] Call trace: [ 2.178610] check_init_srcu_struct+0x1c/0xa0 [ 2.182974] synchronize_srcu+0x1c/0x100 [ 2.186904] gpiod_request_commit+0xec/0x1e0 [ 2.191183] gpiochip_request_own_desc+0x58/0x124 [ 2.195894] gpiod_hog+0x114/0x1b4 [ 2.199305] of_gpiochip_add+0x208/0x370 [ 2.203232] gpiochip_add_data_with_key+0x71c/0xf10 [ 2.208117] devm_gpiochip_add_data_with_key+0x30/0x7c [ 2.213261] mxc_gpio_probe+0x208/0x4b0
On Wed, Feb 14, 2024 at 2:17 PM Mark Brown <broonie@kernel.org> wrote: > > On Wed, Feb 14, 2024 at 09:44:15AM +0100, Bartosz Golaszewski wrote: > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > > > Here are four fixes to some bugs in recent SRCU changes. The first one fixes > > an actual race condition. The other three just make lockdep happy. > > This doesn't fix the issue I reported yesterday when applied on top of > today's next: > > https://lava.sirena.org.uk/scheduler/job/585469 > > [ 1.995518] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000078 > > ... > > [ 2.176162] Call trace: > [ 2.178610] check_init_srcu_struct+0x1c/0xa0 > [ 2.182974] synchronize_srcu+0x1c/0x100 > [ 2.186904] gpiod_request_commit+0xec/0x1e0 > [ 2.191183] gpiochip_request_own_desc+0x58/0x124 > [ 2.195894] gpiod_hog+0x114/0x1b4 > [ 2.199305] of_gpiochip_add+0x208/0x370 > [ 2.203232] gpiochip_add_data_with_key+0x71c/0xf10 > [ 2.208117] devm_gpiochip_add_data_with_key+0x30/0x7c > [ 2.213261] mxc_gpio_probe+0x208/0x4b0 No, the fix for this issue is in my tree as commit ba5c5effe02c ("gpio: initialize descriptor SRCU structure before adding OF-based chips"). These are mostly fixes for lockdep warnings. Bart
On Wed, Feb 14, 2024 at 09:44:15AM +0100, Bartosz Golaszewski wrote: > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > Here are four fixes to some bugs in recent SRCU changes. The first one fixes > an actual race condition. The other three just make lockdep happy. For 1/4-3/4: Acked-by: Paul E. McKenney <paulmck@kernel.org> For 4/4, you are playing with fire, but I will assume that you know what you are doing. ;-) Thanx, Paul > v1 -> v2: > - use srcu_dereference() instead of rcu_dereference_protected() as > advised by Paul > - add a patch using rcu_dereference_check(..., 1) in deprecated > interfaces that return the address of the RCU-protected chip structure > to external users (who shouldn't use it anyway but well...) > - pick up review tags for patches 1/4 and 2/4 > > Bartosz Golaszewski (4): > gpio: take the SRCU read lock in gpiod_hog() > gpio: cdev: use correct pointer accessors with SRCU > gpio: use srcu_dereference() with SRCU-protected pointers > gpio: don't let lockdep complain about inherently dangerous RCU usage > > drivers/gpio/gpiolib-cdev.c | 25 ++++++++++++------------- > drivers/gpio/gpiolib-sysfs.c | 5 +++-- > drivers/gpio/gpiolib.c | 32 ++++++++++++++++++-------------- > drivers/gpio/gpiolib.h | 3 ++- > 4 files changed, 35 insertions(+), 30 deletions(-) > > -- > 2.40.1 >
On Wed, Feb 14, 2024 at 7:44 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > On Wed, Feb 14, 2024 at 09:44:15AM +0100, Bartosz Golaszewski wrote: > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > > > Here are four fixes to some bugs in recent SRCU changes. The first one fixes > > an actual race condition. The other three just make lockdep happy. > > For 1/4-3/4: > > Acked-by: Paul E. McKenney <paulmck@kernel.org> > > For 4/4, you are playing with fire, but I will assume that you know what > you are doing. ;-) > Up until this rework, this gdev->chip pointer could go from under any user at any point. Now we have this gpio_device wrapper that provides an entry point to using the chip safely while protected by the SRCU read lock. Anyone who is still accessing gpio_chip directly (and not being the GPIO provider themselves) is asking for trouble. There's however no point in spamming lockdep splats in this case. I may end up adding a warning to these routines. Unfortunately, it's hard to fix 15 years of technical debt. :( Thanks for the Acks. Bartosz [snip]
On Wed, Feb 14, 2024 at 08:08:33PM +0100, Bartosz Golaszewski wrote: > On Wed, Feb 14, 2024 at 7:44 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > > > On Wed, Feb 14, 2024 at 09:44:15AM +0100, Bartosz Golaszewski wrote: > > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > > > > > Here are four fixes to some bugs in recent SRCU changes. The first one fixes > > > an actual race condition. The other three just make lockdep happy. > > > > For 1/4-3/4: > > > > Acked-by: Paul E. McKenney <paulmck@kernel.org> > > > > For 4/4, you are playing with fire, but I will assume that you know what > > you are doing. ;-) > > Up until this rework, this gdev->chip pointer could go from under any > user at any point. Now we have this gpio_device wrapper that provides > an entry point to using the chip safely while protected by the SRCU > read lock. Anyone who is still accessing gpio_chip directly (and not > being the GPIO provider themselves) is asking for trouble. There's > however no point in spamming lockdep splats in this case. I may end up > adding a warning to these routines. > > Unfortunately, it's hard to fix 15 years of technical debt. :( Indeed, life is sometimes hard! Thanx, Paul > Thanks for the Acks. > Bartosz > > [snip]
On Wed, Feb 14, 2024 at 9:44 AM Bartosz Golaszewski <brgl@bgdev.pl> wrote: > > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> > > Here are four fixes to some bugs in recent SRCU changes. The first one fixes > an actual race condition. The other three just make lockdep happy. > > v1 -> v2: > - use srcu_dereference() instead of rcu_dereference_protected() as > advised by Paul > - add a patch using rcu_dereference_check(..., 1) in deprecated > interfaces that return the address of the RCU-protected chip structure > to external users (who shouldn't use it anyway but well...) > - pick up review tags for patches 1/4 and 2/4 > > Bartosz Golaszewski (4): > gpio: take the SRCU read lock in gpiod_hog() > gpio: cdev: use correct pointer accessors with SRCU > gpio: use srcu_dereference() with SRCU-protected pointers > gpio: don't let lockdep complain about inherently dangerous RCU usage > > drivers/gpio/gpiolib-cdev.c | 25 ++++++++++++------------- > drivers/gpio/gpiolib-sysfs.c | 5 +++-- > drivers/gpio/gpiolib.c | 32 ++++++++++++++++++-------------- > drivers/gpio/gpiolib.h | 3 ++- > 4 files changed, 35 insertions(+), 30 deletions(-) > > -- > 2.40.1 > Series queued. Bart
On Wed, Feb 14, 2024 at 8:08 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
> Unfortunately, it's hard to fix 15 years of technical debt. :(
If it's any consolation I didn't create this one technical debt on purpose.
The actual mistake creating it goes something like: /dev/gpiochipN files are
nice and we can clean up after a file handle is closed or terminated,
I wonder why people insist on using sysfs for so many things.
All the file semantics of handles going away by being used etc, that's
why sysfs is good. I was too inexperienced to understand that, or I would
have paid more attention...
Yet I think we ended up in a reasonable place.
Yours,
Linus Walleij
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Here are four fixes to some bugs in recent SRCU changes. The first one fixes an actual race condition. The other three just make lockdep happy. v1 -> v2: - use srcu_dereference() instead of rcu_dereference_protected() as advised by Paul - add a patch using rcu_dereference_check(..., 1) in deprecated interfaces that return the address of the RCU-protected chip structure to external users (who shouldn't use it anyway but well...) - pick up review tags for patches 1/4 and 2/4 Bartosz Golaszewski (4): gpio: take the SRCU read lock in gpiod_hog() gpio: cdev: use correct pointer accessors with SRCU gpio: use srcu_dereference() with SRCU-protected pointers gpio: don't let lockdep complain about inherently dangerous RCU usage drivers/gpio/gpiolib-cdev.c | 25 ++++++++++++------------- drivers/gpio/gpiolib-sysfs.c | 5 +++-- drivers/gpio/gpiolib.c | 32 ++++++++++++++++++-------------- drivers/gpio/gpiolib.h | 3 ++- 4 files changed, 35 insertions(+), 30 deletions(-)