Message ID | 20230504151100.v4.5.I4e47cbfa1bb2ebbcdb5ca16817aa2887f15dc82c@changeid (mailing list archive) |
---|---|
State | Handled Elsewhere, archived |
Headers | show |
Series | watchdog/hardlockup: Add the buddy hardlockup detector | expand |
On Fri May 5, 2023 at 8:13 AM AEST, Douglas Anderson wrote: > In preparation for the buddy hardlockup detector, rename > touch_nmi_watchdog() to touch_hardlockup_watchdog() to make it clear > that it will touch whatever hardlockup detector is configured. We'll > add a #define for the old name (touch_nmi_watchdog) so that we don't > have to touch every piece of code referring to the old name. Is this really helpful? Now it's got two names Could just leave it. If you insist then it'd be better just to rename everything in one go at the end of a merge window IMO. Conflicts would be trivial. > Ideally this change would also rename the arch_touch_nmi_watchdog(), > but that is harder since arch_touch_nmi_watchdog() is exported with > EXPORT_SYMBOL() and thus is ABI. Add a comment next to the call to > hopefully alleviate some of the confusion here. We don't keep ABI fixed upstream. Thanks, Nick > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > --- > > Changes in v4: > - ("Rename touch_nmi_watchdog() to ...") new for v4. > > include/linux/nmi.h | 27 ++++++++++++++++++++++----- > 1 file changed, 22 insertions(+), 5 deletions(-) > > diff --git a/include/linux/nmi.h b/include/linux/nmi.h > index 454fe99c4874..35d09d70f394 100644 > --- a/include/linux/nmi.h > +++ b/include/linux/nmi.h > @@ -125,18 +125,35 @@ void watchdog_nmi_disable(unsigned int cpu); > void lockup_detector_reconfigure(void); > > /** > - * touch_nmi_watchdog - restart NMI watchdog timeout. > + * touch_hardlockup_watchdog - manually pet the hardlockup watchdog. > * > - * If the architecture supports the NMI watchdog, touch_nmi_watchdog() > - * may be used to reset the timeout - for code which intentionally > - * disables interrupts for a long time. This call is stateless. > + * If we support detecting hardlockups, touch_hardlockup_watchdog() may be > + * used to pet the watchdog (reset the timeout) - for code which > + * intentionally disables interrupts for a long time. This call is stateless. > */ > -static inline void touch_nmi_watchdog(void) > +static inline void touch_hardlockup_watchdog(void) > { > + /* > + * Pass on to the hardlockup detector selected via CONFIG_. Note that > + * the hardlockup detector may not be arch-specific nor using NMIs, > + * but arch_touch_nmi_watchdog() is exported with EXPORT_SYMBOL() and > + * is thus ABI. > + */ > arch_touch_nmi_watchdog(); > + > + /* > + * Touching the hardlock detector implcitily pets the > + * softlockup detector too > + */ > touch_softlockup_watchdog(); > } > > +/* > + * It's encouraged for code to refer to the new name, but allow the old > + * name as well. > + */ > +#define touch_nmi_watchdog touch_hardlockup_watchdog > + > /* > * Create trigger_all_cpu_backtrace() out of the arch-provided > * base function. Return whether such support was available, > -- > 2.40.1.521.gf1e218fcd8-goog
Hi, On Thu, May 4, 2023 at 7:51 PM Nicholas Piggin <npiggin@gmail.com> wrote: > > On Fri May 5, 2023 at 8:13 AM AEST, Douglas Anderson wrote: > > In preparation for the buddy hardlockup detector, rename > > touch_nmi_watchdog() to touch_hardlockup_watchdog() to make it clear > > that it will touch whatever hardlockup detector is configured. We'll > > add a #define for the old name (touch_nmi_watchdog) so that we don't > > have to touch every piece of code referring to the old name. > > Is this really helpful? Now it's got two names Could just leave it. > If you insist then it'd be better just to rename everything in one > go at the end of a merge window IMO. Conflicts would be trivial. I'm not picky here. I changed the name since Petr requested names to be changed for any code I was touching [1] and so I threw this out as a proposal. I agree that having two names can be confusing, but in this case it didn't feel too terrible to me. I'd love to hear Petr's opinion on this name change. I'm happy with: a) This patch as it is. b) Dropping this patch (or perhaps just changing it to add comments). c) Changing this patch to rename all 70 uses of the old name. Assuming this will go through Andrew Morton's tree, I'd be interested in whether he's OK w/ this. d) Dropping this patch from this series but putting it on the backburner to try to do later (so that the rename can happen at a time when it's least disruptive). > > Ideally this change would also rename the arch_touch_nmi_watchdog(), > > but that is harder since arch_touch_nmi_watchdog() is exported with > > EXPORT_SYMBOL() and thus is ABI. Add a comment next to the call to > > hopefully alleviate some of the confusion here. > > We don't keep ABI fixed upstream. I'm happy to be corrected, but my understanding was that kernel devs made an effort not to mess with things exported via "EXPORT_SYMBOL", but things exported via "EXPORT_SYMBOL_GPL" were fair game. I guess maybe my patch calling it "ABI" is a stronger statement than that, though. Doing a little more research, nobody wants to say that things exported with "EXPORT_SYMBOL" are ABI, they just want to say that we make an effort to have them be more stable. So certainly I should adjust my patch series not to call it ABI, but I'm still on the fence about whether I should rename this or not. I'd love to hear other opinions. This rename actually would be a lot easier than the touch_nmi_watchdog() one since the code referencing the name "arch_touch_nmi_watchdog" isn't spread so broadly through the kernel. [1] https://lore.kernel.org/r/ZFErmshcrcikrSU1@alley -Doug
On Sat May 6, 2023 at 2:37 AM AEST, Doug Anderson wrote: > Hi, > > On Thu, May 4, 2023 at 7:51 PM Nicholas Piggin <npiggin@gmail.com> wrote: > > > > On Fri May 5, 2023 at 8:13 AM AEST, Douglas Anderson wrote: > > > In preparation for the buddy hardlockup detector, rename > > > touch_nmi_watchdog() to touch_hardlockup_watchdog() to make it clear > > > that it will touch whatever hardlockup detector is configured. We'll > > > add a #define for the old name (touch_nmi_watchdog) so that we don't > > > have to touch every piece of code referring to the old name. > > > > Is this really helpful? Now it's got two names Could just leave it. > > If you insist then it'd be better just to rename everything in one > > go at the end of a merge window IMO. Conflicts would be trivial. > > I'm not picky here. I changed the name since Petr requested names to > be changed for any code I was touching [1] and so I threw this out as > a proposal. I agree that having two names can be confusing, but in > this case it didn't feel too terrible to me. > > I'd love to hear Petr's opinion on this name change. I'm happy with: > > a) This patch as it is. > > b) Dropping this patch (or perhaps just changing it to add comments). > > c) Changing this patch to rename all 70 uses of the old name. Assuming > this will go through Andrew Morton's tree, I'd be interested in > whether he's OK w/ this. > > d) Dropping this patch from this series but putting it on the > backburner to try to do later (so that the rename can happen at a time > when it's least disruptive). > > > > > Ideally this change would also rename the arch_touch_nmi_watchdog(), > > > but that is harder since arch_touch_nmi_watchdog() is exported with > > > EXPORT_SYMBOL() and thus is ABI. Add a comment next to the call to > > > hopefully alleviate some of the confusion here. > > > > We don't keep ABI fixed upstream. > > I'm happy to be corrected, but my understanding was that kernel devs > made an effort not to mess with things exported via "EXPORT_SYMBOL", > but things exported via "EXPORT_SYMBOL_GPL" were fair game. I don't think that's the case. If anything people might be a bit more inclined to accommodate GPL exports for out of tree modules that use them. > I guess maybe my patch calling it "ABI" is a stronger statement than > that, though. Doing a little more research, nobody wants to say that > things exported with "EXPORT_SYMBOL" are ABI, they just want to say > that we make an effort to have them be more stable. We wouldn't break any symbol for no reason, but in this case there is a good reason. If the name change is important for clarity then we change it. And this is about the easiest change for an out of tree module to deal with, so it should be no big deal for them. Thanks, Nick
Hi, On Sun, May 7, 2023 at 6:35 PM Nicholas Piggin <npiggin@gmail.com> wrote: > > On Sat May 6, 2023 at 2:37 AM AEST, Doug Anderson wrote: > > Hi, > > > > On Thu, May 4, 2023 at 7:51 PM Nicholas Piggin <npiggin@gmail.com> wrote: > > > > > > On Fri May 5, 2023 at 8:13 AM AEST, Douglas Anderson wrote: > > > > In preparation for the buddy hardlockup detector, rename > > > > touch_nmi_watchdog() to touch_hardlockup_watchdog() to make it clear > > > > that it will touch whatever hardlockup detector is configured. We'll > > > > add a #define for the old name (touch_nmi_watchdog) so that we don't > > > > have to touch every piece of code referring to the old name. > > > > > > Is this really helpful? Now it's got two names Could just leave it. > > > If you insist then it'd be better just to rename everything in one > > > go at the end of a merge window IMO. Conflicts would be trivial. > > > > I'm not picky here. I changed the name since Petr requested names to > > be changed for any code I was touching [1] and so I threw this out as > > a proposal. I agree that having two names can be confusing, but in > > this case it didn't feel too terrible to me. > > > > I'd love to hear Petr's opinion on this name change. I'm happy with: > > > > a) This patch as it is. > > > > b) Dropping this patch (or perhaps just changing it to add comments). > > > > c) Changing this patch to rename all 70 uses of the old name. Assuming > > this will go through Andrew Morton's tree, I'd be interested in > > whether he's OK w/ this. > > > > d) Dropping this patch from this series but putting it on the > > backburner to try to do later (so that the rename can happen at a time > > when it's least disruptive). > > > > > > > > Ideally this change would also rename the arch_touch_nmi_watchdog(), > > > > but that is harder since arch_touch_nmi_watchdog() is exported with > > > > EXPORT_SYMBOL() and thus is ABI. Add a comment next to the call to > > > > hopefully alleviate some of the confusion here. > > > > > > We don't keep ABI fixed upstream. > > > > I'm happy to be corrected, but my understanding was that kernel devs > > made an effort not to mess with things exported via "EXPORT_SYMBOL", > > but things exported via "EXPORT_SYMBOL_GPL" were fair game. > > I don't think that's the case. If anything people might be a bit more > inclined to accommodate GPL exports for out of tree modules that use > them. > > > I guess maybe my patch calling it "ABI" is a stronger statement than > > that, though. Doing a little more research, nobody wants to say that > > things exported with "EXPORT_SYMBOL" are ABI, they just want to say > > that we make an effort to have them be more stable. > > We wouldn't break any symbol for no reason, but in this case there is a > good reason. If the name change is important for clarity then we change > it. And this is about the easiest change for an out of tree module to > deal with, so it should be no big deal for them. OK, fair enough. My current plan is to wait a few more days to see if anyone else chimes in with opinions. If I don't hear anything, in my next version I will rename _neither_ touch_nmi_watchdog() nor arch_touch_nmi_watchdog(). I'll still add comments indicating that these functions touch the "hardlockup" watchdog but I won't attempt the rename just to keep the series simpler. -Doug
On Fri 2023-05-05 09:37:35, Doug Anderson wrote: > Hi, > > On Thu, May 4, 2023 at 7:51 PM Nicholas Piggin <npiggin@gmail.com> wrote: > > > > On Fri May 5, 2023 at 8:13 AM AEST, Douglas Anderson wrote: > > > In preparation for the buddy hardlockup detector, rename > > > touch_nmi_watchdog() to touch_hardlockup_watchdog() to make it clear > > > that it will touch whatever hardlockup detector is configured. We'll > > > add a #define for the old name (touch_nmi_watchdog) so that we don't > > > have to touch every piece of code referring to the old name. > > > > Is this really helpful? Now it's got two names Could just leave it. > > If you insist then it'd be better just to rename everything in one > > go at the end of a merge window IMO. Conflicts would be trivial. > > I'm not picky here. I changed the name since Petr requested names to > be changed for any code I was touching [1] and so I threw this out as > a proposal. I agree that having two names can be confusing, but in > this case it didn't feel too terrible to me. IMHO, it is worth renaming to make the code easier to follow. Especially after adding the buddy hardlockup detector that is not using NMI context. And I agree that that we should rename all callers as well. Otherwise, it might be seen just as an extra churn. > I'd love to hear Petr's opinion on this name change. I'm happy with: > > a) This patch as it is. > > b) Dropping this patch (or perhaps just changing it to add comments). > > c) Changing this patch to rename all 70 uses of the old name. Assuming > this will go through Andrew Morton's tree, I'd be interested in > whether he's OK w/ this. > > d) Dropping this patch from this series but putting it on the > backburner to try to do later (so that the rename can happen at a time > when it's least disruptive). d) sounds reasonable given that there is about 70 callers. > > > > Ideally this change would also rename the arch_touch_nmi_watchdog(), > > > but that is harder since arch_touch_nmi_watchdog() is exported with > > > EXPORT_SYMBOL() and thus is ABI. Add a comment next to the call to > > > hopefully alleviate some of the confusion here. > > > > We don't keep ABI fixed upstream. > > I'm happy to be corrected, but my understanding was that kernel devs > made an effort not to mess with things exported via "EXPORT_SYMBOL", > but things exported via "EXPORT_SYMBOL_GPL" were fair game. My understanding is that kernel guarantees ABI compatibility only for the userspace (do-not-break-userspace rule). But the kernel ABI is not guaranteed [*] It actually has even a positive side effect because it motivates module developers to upstream the code. Of course, there should be a good reason for the change. And I think that we have a good reason. [*] This is valid for upstream. Another story is with linux distributions. They usually maintain the kernel KABI stability to some degree when backporting upstream changes. Best Regards, Petr
diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 454fe99c4874..35d09d70f394 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -125,18 +125,35 @@ void watchdog_nmi_disable(unsigned int cpu); void lockup_detector_reconfigure(void); /** - * touch_nmi_watchdog - restart NMI watchdog timeout. + * touch_hardlockup_watchdog - manually pet the hardlockup watchdog. * - * If the architecture supports the NMI watchdog, touch_nmi_watchdog() - * may be used to reset the timeout - for code which intentionally - * disables interrupts for a long time. This call is stateless. + * If we support detecting hardlockups, touch_hardlockup_watchdog() may be + * used to pet the watchdog (reset the timeout) - for code which + * intentionally disables interrupts for a long time. This call is stateless. */ -static inline void touch_nmi_watchdog(void) +static inline void touch_hardlockup_watchdog(void) { + /* + * Pass on to the hardlockup detector selected via CONFIG_. Note that + * the hardlockup detector may not be arch-specific nor using NMIs, + * but arch_touch_nmi_watchdog() is exported with EXPORT_SYMBOL() and + * is thus ABI. + */ arch_touch_nmi_watchdog(); + + /* + * Touching the hardlock detector implcitily pets the + * softlockup detector too + */ touch_softlockup_watchdog(); } +/* + * It's encouraged for code to refer to the new name, but allow the old + * name as well. + */ +#define touch_nmi_watchdog touch_hardlockup_watchdog + /* * Create trigger_all_cpu_backtrace() out of the arch-provided * base function. Return whether such support was available,
In preparation for the buddy hardlockup detector, rename touch_nmi_watchdog() to touch_hardlockup_watchdog() to make it clear that it will touch whatever hardlockup detector is configured. We'll add a #define for the old name (touch_nmi_watchdog) so that we don't have to touch every piece of code referring to the old name. Ideally this change would also rename the arch_touch_nmi_watchdog(), but that is harder since arch_touch_nmi_watchdog() is exported with EXPORT_SYMBOL() and thus is ABI. Add a comment next to the call to hopefully alleviate some of the confusion here. Signed-off-by: Douglas Anderson <dianders@chromium.org> --- Changes in v4: - ("Rename touch_nmi_watchdog() to ...") new for v4. include/linux/nmi.h | 27 ++++++++++++++++++++++----- 1 file changed, 22 insertions(+), 5 deletions(-)