Message ID | 20200515212846.1347-2-mcgrof@kernel.org |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Series | net: taint when the device driver firmware crashes |
On Fri, May 15, 2020 at 09:28:32PM +0000, Luis Chamberlain wrote:
> Device driver firmware can crash, and sometimes, this can leave your
> system in a state which makes the device or subsystem completely
> useless. Detecting this by inspecting /proc/sys/kernel/tainted instead
> of scraping some magical words from the kernel log, which is driver
> specific, is much easier. So instead provide a helper which lets drivers
> annotate this.
>
> Once this happens, scrapers can easily look for modules taint flags
> for a firmware crash. This will taint both the kernel and respective
> calling module.
>
> The new helper module_firmware_crashed() uses LOCKDEP_STILL_OK as this
> fact should in no way shape or form affect lockdep. This taint is device
> driver specific.
>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  Documentation/admin-guide/tainted-kernels.rst |  6 ++++++
>  include/linux/kernel.h                        |  3 ++-
>  include/linux/module.h                        | 13 +++++++++++++
>  include/trace/events/module.h                 |  3 ++-
>  kernel/module.c                               |  5 +++--
>  kernel/panic.c                                |  1 +
>  tools/debugging/kernel-chktaint               |  7 +++++++
>  7 files changed, 34 insertions(+), 4 deletions(-)
>

Reviewed-by: Rafael Aquini <aquini@redhat.com>
+++ Luis Chamberlain [15/05/20 21:28 +0000]:
>Device driver firmware can crash, and sometimes, this can leave your
>system in a state which makes the device or subsystem completely
>useless. Detecting this by inspecting /proc/sys/kernel/tainted instead
>of scraping some magical words from the kernel log, which is driver
>specific, is much easier. So instead provide a helper which lets drivers
>annotate this.
>
>Once this happens, scrapers can easily look for modules taint flags
>for a firmware crash. This will taint both the kernel and respective
>calling module.
>
>The new helper module_firmware_crashed() uses LOCKDEP_STILL_OK as this
>fact should in no way shape or form affect lockdep. This taint is device
>driver specific.
>
>Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
>---
> Documentation/admin-guide/tainted-kernels.rst |  6 ++++++
> include/linux/kernel.h                        |  3 ++-
> include/linux/module.h                        | 13 +++++++++++++
> include/trace/events/module.h                 |  3 ++-
> kernel/module.c                               |  5 +++--
> kernel/panic.c                                |  1 +
> tools/debugging/kernel-chktaint               |  7 +++++++
> 7 files changed, 34 insertions(+), 4 deletions(-)
>
>diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst
>index 71e9184a9079..92530f1d60ae 100644
>--- a/Documentation/admin-guide/tainted-kernels.rst
>+++ b/Documentation/admin-guide/tainted-kernels.rst
>@@ -100,6 +100,7 @@ Bit  Log  Number  Reason that got the kernel tainted
>  15  _/K   32768  kernel has been live patched
>  16  _/X   65536  auxiliary taint, defined for and used by distros
>  17  _/T  131072  kernel was built with the struct randomization plugin
>+ 18  _/Q  262144  driver firmware crash annotation
> ===  ===  ======  ========================================================
>
> Note: The character ``_`` is representing a blank in this table to make reading
>@@ -162,3 +163,8 @@ More detailed explanation for tainting
>      produce extremely unusual kernel structure layouts (even performance
>      pathological ones), which is important to know when debugging. Set at
>      build time.
>+
>+ 18) ``Q`` used by device drivers to annotate that the device driver's firmware
>+     has crashed and the device's operation has been severely affected. The
>+     device may be left in a crippled state, requiring full driver removal /
>+     addition, system reboot, or it is unclear how long recovery will take.
>diff --git a/include/linux/kernel.h b/include/linux/kernel.h
>index 04a5885cec1b..19e1541c82c7 100644
>--- a/include/linux/kernel.h
>+++ b/include/linux/kernel.h
>@@ -601,7 +601,8 @@ extern enum system_states {
> #define TAINT_LIVEPATCH			15
> #define TAINT_AUX			16
> #define TAINT_RANDSTRUCT		17
>-#define TAINT_FLAGS_COUNT		18
>+#define TAINT_FIRMWARE_CRASH		18
>+#define TAINT_FLAGS_COUNT		19
>
> struct taint_flag {
> 	char c_true;	/* character printed when tainted */
>diff --git a/include/linux/module.h b/include/linux/module.h
>index 2c2e988bcf10..221200078180 100644
>--- a/include/linux/module.h
>+++ b/include/linux/module.h
>@@ -697,6 +697,14 @@ static inline bool is_livepatch_module(struct module *mod)
> bool is_module_sig_enforced(void);
> void set_module_sig_enforced(void);
>
>+void add_taint_module(struct module *mod, unsigned flag,
>+		      enum lockdep_ok lockdep_ok);
>+
>+static inline void module_firmware_crashed(void)
>+{
>+	add_taint_module(THIS_MODULE, TAINT_FIRMWARE_CRASH, LOCKDEP_STILL_OK);
>+}

Just a nit: I think module_firmware_crashed() is a confusing name - it
doesn't really tell me what it's doing, and it's not really related to
the rest of the module_* symbols, which mostly have to do with module
loader/module specifics. Especially since a driver can be built-in, too.
How about taint_firmware_crashed() or something similar?

Also, I think we might crash in add_taint_module() if a driver is
built into the kernel, because THIS_MODULE will be null and there is
no null pointer check in add_taint_module(). We could unify the
CONFIG_MODULES and !CONFIG_MODULES stubs and either add an `if (mod)`
check in add_taint_module() or add an #ifdef MODULE check in the stub
itself to call add_taint() or add_taint_module() as appropriate. Hope
that makes sense.

Thanks!

Jessica
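The first option raised in the review above, an `if (mod)` check inside add_taint_module(), can be modeled in userspace roughly as follows. This is only a sketch under stated assumptions: `struct module`, `tainted_mask`, and the simplified add_taint() are stand-ins written for this example rather than the kernel's definitions, and taint_firmware_crashed() takes the module pointer as a parameter (instead of using THIS_MODULE) purely so both cases can be exercised. It is not the change that was eventually posted.

```c
#include <stddef.h>

/* Userspace stand-ins; names mirror the patch, definitions are simplified. */
#define TAINT_FIRMWARE_CRASH 18

enum lockdep_ok { LOCKDEP_STILL_OK, LOCKDEP_NOW_UNRELIABLE };

struct module {
	const char *name;
	unsigned long taints;	/* per-module taint bits */
};

unsigned long tainted_mask;	/* models the global kernel taint mask */

void add_taint(unsigned flag, enum lockdep_ok lockdep_ok)
{
	(void)lockdep_ok;	/* lockdep handling is not modeled here */
	tainted_mask |= 1UL << flag;
}

/* Always taint the kernel; taint the module only when there is one. */
void add_taint_module(struct module *mod, unsigned flag,
		      enum lockdep_ok lockdep_ok)
{
	add_taint(flag, lockdep_ok);
	if (mod)		/* NULL models THIS_MODULE of a built-in driver */
		mod->taints |= 1UL << flag;
}

/* Hypothetical helper using the name suggested in the review. */
void taint_firmware_crashed(struct module *this_module)
{
	add_taint_module(this_module, TAINT_FIRMWARE_CRASH, LOCKDEP_STILL_OK);
}
```

Calling taint_firmware_crashed(NULL) sets only the global bit, while passing a real `struct module` sets both the global bit and the module's own taint bit. The second option from the review, selecting add_taint() or add_taint_module() with `#ifdef MODULE` in the stub, would avoid the runtime check at the cost of two code paths.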
On Tue, May 19, 2020 at 06:42:31PM +0200, Jessica Yu wrote:
> +++ Luis Chamberlain [15/05/20 21:28 +0000]:
> > Device driver firmware can crash, and sometimes, this can leave your
> > system in a state which makes the device or subsystem completely
> > useless. Detecting this by inspecting /proc/sys/kernel/tainted instead
> > of scraping some magical words from the kernel log, which is driver
> > specific, is much easier. So instead provide a helper which lets drivers
> > annotate this.
> >
> > Once this happens, scrapers can easily look for modules taint flags
> > for a firmware crash. This will taint both the kernel and respective
> > calling module.
> >
> > The new helper module_firmware_crashed() uses LOCKDEP_STILL_OK as this
> > fact should in no way shape or form affect lockdep. This taint is device
> > driver specific.
> >
> > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> > ---
> >  Documentation/admin-guide/tainted-kernels.rst |  6 ++++++
> >  include/linux/kernel.h                        |  3 ++-
> >  include/linux/module.h                        | 13 +++++++++++++
> >  include/trace/events/module.h                 |  3 ++-
> >  kernel/module.c                               |  5 +++--
> >  kernel/panic.c                                |  1 +
> >  tools/debugging/kernel-chktaint               |  7 +++++++
> >  7 files changed, 34 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst
> > index 71e9184a9079..92530f1d60ae 100644
> > --- a/Documentation/admin-guide/tainted-kernels.rst
> > +++ b/Documentation/admin-guide/tainted-kernels.rst
> > @@ -100,6 +100,7 @@ Bit  Log  Number  Reason that got the kernel tainted
> >   15  _/K   32768  kernel has been live patched
> >   16  _/X   65536  auxiliary taint, defined for and used by distros
> >   17  _/T  131072  kernel was built with the struct randomization plugin
> > + 18  _/Q  262144  driver firmware crash annotation
> >  ===  ===  ======  ========================================================
> >
> >  Note: The character ``_`` is representing a blank in this table to make reading
> > @@ -162,3 +163,8 @@ More detailed explanation for tainting
> >       produce extremely unusual kernel structure layouts (even performance
> >       pathological ones), which is important to know when debugging. Set at
> >       build time.
> > +
> > + 18) ``Q`` used by device drivers to annotate that the device driver's firmware
> > +     has crashed and the device's operation has been severely affected. The
> > +     device may be left in a crippled state, requiring full driver removal /
> > +     addition, system reboot, or it is unclear how long recovery will take.
> > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > index 04a5885cec1b..19e1541c82c7 100644
> > --- a/include/linux/kernel.h
> > +++ b/include/linux/kernel.h
> > @@ -601,7 +601,8 @@ extern enum system_states {
> >  #define TAINT_LIVEPATCH			15
> >  #define TAINT_AUX			16
> >  #define TAINT_RANDSTRUCT		17
> > -#define TAINT_FLAGS_COUNT		18
> > +#define TAINT_FIRMWARE_CRASH		18
> > +#define TAINT_FLAGS_COUNT		19
> >
> >  struct taint_flag {
> >  	char c_true;	/* character printed when tainted */
> > diff --git a/include/linux/module.h b/include/linux/module.h
> > index 2c2e988bcf10..221200078180 100644
> > --- a/include/linux/module.h
> > +++ b/include/linux/module.h
> > @@ -697,6 +697,14 @@ static inline bool is_livepatch_module(struct module *mod)
> >  bool is_module_sig_enforced(void);
> >  void set_module_sig_enforced(void);
> >
> > +void add_taint_module(struct module *mod, unsigned flag,
> > +		      enum lockdep_ok lockdep_ok);
> > +
> > +static inline void module_firmware_crashed(void)
> > +{
> > +	add_taint_module(THIS_MODULE, TAINT_FIRMWARE_CRASH, LOCKDEP_STILL_OK);
> > +}
>
> Just a nit: I think module_firmware_crashed() is a confusing name - it
> doesn't really tell me what it's doing, and it's not really related to
> the rest of the module_* symbols, which mostly have to do with module
> loader/module specifics. Especially since a driver can be built-in, too.
> How about taint_firmware_crashed() or something similar?

Sure.

> Also, I think we might crash in add_taint_module() if a driver is
> built into the kernel, because THIS_MODULE will be null and there is
> no null pointer check in add_taint_module(). We could unify the
> CONFIG_MODULES and !CONFIG_MODULES stubs and either add an `if (mod)`
> check in add_taint_module() or add an #ifdef MODULE check in the stub
> itself to call add_taint() or add_taint_module() as appropriate. Hope
> that makes sense.

I had to do something a bit different but I think you'll agree with it.
Will include it in my next iteration.

  Luis
diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst
index 71e9184a9079..92530f1d60ae 100644
--- a/Documentation/admin-guide/tainted-kernels.rst
+++ b/Documentation/admin-guide/tainted-kernels.rst
@@ -100,6 +100,7 @@ Bit  Log  Number  Reason that got the kernel tainted
  15  _/K   32768  kernel has been live patched
  16  _/X   65536  auxiliary taint, defined for and used by distros
  17  _/T  131072  kernel was built with the struct randomization plugin
+ 18  _/Q  262144  driver firmware crash annotation
 ===  ===  ======  ========================================================

 Note: The character ``_`` is representing a blank in this table to make reading
@@ -162,3 +163,8 @@ More detailed explanation for tainting
      produce extremely unusual kernel structure layouts (even performance
      pathological ones), which is important to know when debugging. Set at
      build time.
+
+ 18) ``Q`` used by device drivers to annotate that the device driver's firmware
+     has crashed and the device's operation has been severely affected. The
+     device may be left in a crippled state, requiring full driver removal /
+     addition, system reboot, or it is unclear how long recovery will take.
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 04a5885cec1b..19e1541c82c7 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -601,7 +601,8 @@ extern enum system_states {
 #define TAINT_LIVEPATCH			15
 #define TAINT_AUX			16
 #define TAINT_RANDSTRUCT		17
-#define TAINT_FLAGS_COUNT		18
+#define TAINT_FIRMWARE_CRASH		18
+#define TAINT_FLAGS_COUNT		19

 struct taint_flag {
 	char c_true;	/* character printed when tainted */
diff --git a/include/linux/module.h b/include/linux/module.h
index 2c2e988bcf10..221200078180 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -697,6 +697,14 @@ static inline bool is_livepatch_module(struct module *mod)
 bool is_module_sig_enforced(void);
 void set_module_sig_enforced(void);

+void add_taint_module(struct module *mod, unsigned flag,
+		      enum lockdep_ok lockdep_ok);
+
+static inline void module_firmware_crashed(void)
+{
+	add_taint_module(THIS_MODULE, TAINT_FIRMWARE_CRASH, LOCKDEP_STILL_OK);
+}
+
 #else /* !CONFIG_MODULES... */

 static inline struct module *__module_address(unsigned long addr)
@@ -844,6 +852,11 @@ void *dereference_module_function_descriptor(struct module *mod, void *ptr)
 	return ptr;
 }

+static inline void module_firmware_crashed(void)
+{
+	add_taint(TAINT_FIRMWARE_CRASH, LOCKDEP_STILL_OK);
+}
+
 #endif /* CONFIG_MODULES */

 #ifdef CONFIG_SYSFS
diff --git a/include/trace/events/module.h b/include/trace/events/module.h
index 097485c73c01..b749ea25affd 100644
--- a/include/trace/events/module.h
+++ b/include/trace/events/module.h
@@ -26,7 +26,8 @@ struct module;
 	{ (1UL << TAINT_OOT_MODULE),		"O" },		\
 	{ (1UL << TAINT_FORCED_MODULE),		"F" },		\
 	{ (1UL << TAINT_CRAP),			"C" },		\
-	{ (1UL << TAINT_UNSIGNED_MODULE),	"E" })
+	{ (1UL << TAINT_UNSIGNED_MODULE),	"E" },		\
+	{ (1UL << TAINT_FIRMWARE_CRASH),	"Q" })

 TRACE_EVENT(module_load,

diff --git a/kernel/module.c b/kernel/module.c
index 80faaf2116dd..f98e8c25c6b4 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -325,12 +325,13 @@ static inline int strong_try_module_get(struct module *mod)
 		return -ENOENT;
 }

-static inline void add_taint_module(struct module *mod, unsigned flag,
-				    enum lockdep_ok lockdep_ok)
+void add_taint_module(struct module *mod, unsigned flag,
+		      enum lockdep_ok lockdep_ok)
 {
 	add_taint(flag, lockdep_ok);
 	set_bit(flag, &mod->taints);
 }
+EXPORT_SYMBOL_GPL(add_taint_module);

 /*
  * A thread that wants to hold a reference to a module only while it
diff --git a/kernel/panic.c b/kernel/panic.c
index ec6d7d788ce7..504fb926947e 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -384,6 +384,7 @@ const struct taint_flag taint_flags[TAINT_FLAGS_COUNT] = {
 	[ TAINT_LIVEPATCH ]		= { 'K', ' ', true },
 	[ TAINT_AUX ]			= { 'X', ' ', true },
 	[ TAINT_RANDSTRUCT ]		= { 'T', ' ', true },
+	[ TAINT_FIRMWARE_CRASH ]	= { 'Q', ' ', true },
 };

 /**
diff --git a/tools/debugging/kernel-chktaint b/tools/debugging/kernel-chktaint
index 2240cb56e6e5..c397c6aabea7 100755
--- a/tools/debugging/kernel-chktaint
+++ b/tools/debugging/kernel-chktaint
@@ -194,6 +194,13 @@ else
 	addout "T"
 	echo " * kernel was built with the struct randomization plugin (#17)"
 fi
+T=`expr $T / 2`
+if [ `expr $T % 2` -eq 0 ]; then
+	addout " "
+else
+	addout "Q"
+	echo " * a device driver's firmware has crashed (#18)"
+fi

 echo "For a more detailed explanation of the various taint flags see"
 echo " Documentation/admin-guide/tainted-kernels.rst in the the Linux kernel sources"
Device driver firmware can crash, and sometimes, this can leave your
system in a state which makes the device or subsystem completely
useless. Detecting this by inspecting /proc/sys/kernel/tainted instead
of scraping some magical words from the kernel log, which is driver
specific, is much easier. So instead provide a helper which lets drivers
annotate this.

Once this happens, scrapers can easily look for modules taint flags
for a firmware crash. This will taint both the kernel and respective
calling module.

The new helper module_firmware_crashed() uses LOCKDEP_STILL_OK as this
fact should in no way shape or form affect lockdep. This taint is device
driver specific.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 Documentation/admin-guide/tainted-kernels.rst |  6 ++++++
 include/linux/kernel.h                        |  3 ++-
 include/linux/module.h                        | 13 +++++++++++++
 include/trace/events/module.h                 |  3 ++-
 kernel/module.c                               |  5 +++--
 kernel/panic.c                                |  1 +
 tools/debugging/kernel-chktaint               |  7 +++++++
 7 files changed, 34 insertions(+), 4 deletions(-)
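
For readers who want to see what "inspecting /proc/sys/kernel/tainted" could look like from a monitoring tool, here is a minimal userspace sketch. It assumes the bit number 18 (value 262144) proposed by this patch; that assignment is specific to this series, so check Documentation/admin-guide/tainted-kernels.rst for the kernel actually being run. The function names are invented for this example.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Bit number proposed by this patch; an assumption on any other kernel. */
#define TAINT_FIRMWARE_CRASH 18

/* True when the firmware-crash bit is set in a taint mask. */
bool taint_has_firmware_crash(unsigned long mask)
{
	return ((mask >> TAINT_FIRMWARE_CRASH) & 1UL) != 0;
}

/* Parse the decimal value found in the tainted file; false on bad input. */
bool parse_taint(const char *text, unsigned long *mask)
{
	char *end;
	unsigned long value;

	if (!isdigit((unsigned char)text[0]))
		return false;
	errno = 0;
	value = strtoul(text, &end, 10);
	if (errno != 0)
		return false;
	if (*end != '\0' && *end != '\n')
		return false;
	*mask = value;
	return true;
}

/* Read and parse a taint file such as /proc/sys/kernel/tainted. */
bool read_taint_file(const char *path, unsigned long *mask)
{
	char buf[64];
	FILE *fp = fopen(path, "r");
	bool ok;

	if (!fp)
		return false;
	ok = fgets(buf, sizeof(buf), fp) != NULL && parse_taint(buf, mask);
	fclose(fp);
	return ok;
}
```

A caller would pass "/proc/sys/kernel/tainted" to read_taint_file() and then test the result with taint_has_firmware_crash(). The shift-and-mask test is equivalent to what the kernel-chktaint hunk does by halving `$T` once per flag and checking the remainder modulo 2.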