Message ID | 20200225164540.4520-1-luca@lucaceresoli.net |
---|---|
State | Superseded, archived |
Headers | show |
Series | [v2] of: overlay: log the error cause on resolver failure | expand |
Context | Check | Description |
---|---|---|
robh/checkpatch | success |
On 2/25/20 10:45 AM, Luca Ceresoli wrote: > For some of its error paths, of_resolve_phandles() only logs a very generic > error which does not help much in finding the origin of the problem: > > OF: resolver: overlay phandle fixup failed: -22 > > Add error messages for all the error paths that don't have one. Now a > specific message is always emitted, thus also remove the generic catch-all > message emitted before returning. > > For example, in case a DT overlay has a fixup node that is not present in > the base DT __symbols__, this error is now logged: > > OF: resolver: node gpio9 not found in base DT, fixup failed > > Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> > Cc: Geert Uytterhoeven <geert@linux-m68k.org> > --- > > I don't know in detail the meaning of the adjust_local_phandle_references() > and update_usages_of_a_phandle_reference() error paths, thus I have put > pretty generic messages. Any suggestion on better wording would be welcome. If you have not read the code to understand what the meaning of the errors are, you should not be suggesting changes to the error messages. Only one of the issues detected as errors can possibly be something other than an error either in the resolver.c code or the dtc compiler -- a missing symbol in the live devicetree. This may be because of failing to compile the base devicetree without symbols, depending on a symbol from another overlay where the other overlay has not been applied, or depending on a symbol from another overlay where the other overlay is applied but the overlay was not compiled with symbols. (Not meant to be an exhaustive list, but it might be.) Thus the missing symbol problem might be fixable without a fix to kernel code. The error message philosophy for overlay related errors is to minimize error messages that help diagnose the precise cause of a kernel code bug, with the intent of keeping the code more compact and readable. When a bug occurs, debugging messages can be added for the debug session. Following this philosophy, only the message in the second patch chunk is ok. I will include an example of more precise error messages in the other locations, just for education on what is going wrong at those points. > > Changed in v2: > > - add a message for each error path that does not have one yet > --- > drivers/of/resolver.c | 17 ++++++++++++----- > 1 file changed, 12 insertions(+), 5 deletions(-) > > diff --git a/drivers/of/resolver.c b/drivers/of/resolver.c > index 83c766233181..a80d673621bc 100644 > --- a/drivers/of/resolver.c > +++ b/drivers/of/resolver.c > @@ -291,8 +291,10 @@ int of_resolve_phandles(struct device_node *overlay) > break; > > err = adjust_local_phandle_references(local_fixups, overlay, phandle_delta); > - if (err) > + if (err) { > + pr_err("cannot adjust local phandle references\n"); > goto out; > + } Delete this message. But if there was a message, it could be: "invalid overlay, adjust local phandle references failed\n" > > overlay_fixups = NULL; > > @@ -321,11 +323,15 @@ int of_resolve_phandles(struct device_node *overlay) > > err = of_property_read_string(tree_symbols, > prop->name, &refpath); > - if (err) > + if (err) { > + pr_err("node %s not found in base DT, fixup failed\n", > + prop->name); "symbol '%s' not found in live devicetree symbols table\n", prop->name > goto out; > + } > > refnode = of_find_node_by_path(refpath); > if (!refnode) { > + pr_err("cannot find node for %s\n", refpath); > err = -ENOENT; > goto out; > } > @@ -334,13 +340,14 @@ int of_resolve_phandles(struct device_node *overlay) > of_node_put(refnode); > > err = update_usages_of_a_phandle_reference(overlay, prop, phandle); > - if (err) > + if (err) { > + pr_err("cannot update usages of a phandle reference (%s)\n", > + prop->name); > break; > + } Delete this message. But if there was a message, it could be: "invalid fixup for symbol '%s'\n", prop->name > } > > out: > - if (err) > - pr_err("overlay phandle fixup failed: %d\n", err); Do not remove this message. The other messages do not explain that phandle fixup failed - they provide a more detailed description of a specific reason _why_ the phandle fixup failed. > of_node_put(tree_symbols); > > return err; >
Hi Frank, On 26/02/20 04:53, Frank Rowand wrote: > On 2/25/20 10:45 AM, Luca Ceresoli wrote: >> For some of its error paths, of_resolve_phandles() only logs a very generic >> error which does not help much in finding the origin of the problem: >> >> OF: resolver: overlay phandle fixup failed: -22 >> >> Add error messages for all the error paths that don't have one. Now a >> specific message is always emitted, thus also remove the generic catch-all >> message emitted before returning. >> >> For example, in case a DT overlay has a fixup node that is not present in >> the base DT __symbols__, this error is now logged: >> >> OF: resolver: node gpio9 not found in base DT, fixup failed >> >> Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> >> Cc: Geert Uytterhoeven <geert@linux-m68k.org> >> --- >> >> I don't know in detail the meaning of the adjust_local_phandle_references() >> and update_usages_of_a_phandle_reference() error paths, thus I have put >> pretty generic messages. Any suggestion on better wording would be welcome. > > If you have not read the code to understand what the meaning of > the errors are, you should not be suggesting changes to the error > messages. > > Only one of the issues detected as errors can possibly be something > other than an error either in the resolver.c code or the dtc > compiler -- a missing symbol in the live devicetree. This may > be because of failing to compile the base devicetree without > symbols, depending on a symbol from another overlay where the > other overlay has not been applied, or depending on a symbol > from another overlay where the other overlay is applied but > the overlay was not compiled with symbols. (Not meant to be > an exhaustive list, but it might be.) Thus the missing > symbol problem might be fixable without a fix to kernel > code. The error message philosophy for overlay related > errors is to minimize error messages that help diagnose > the precise cause of a kernel code bug, with the intent > of keeping the code more compact and readable. When a > bug occurs, debugging messages can be added for the > debug session. Got it, sorry about that. > Following this philosophy, only the message in the second > patch chunk is ok. Then I think you can apply the v1 patch which only contains the message about the problem I experienced, and which was caused by an incorrect DTO: https://patchwork.ozlabs.org/patch/1243987/ Just ignore the note saying the patch is not for mainline, it's wrong.
On 2/27/20 2:11 AM, Luca Ceresoli wrote: > Hi Frank, > > On 26/02/20 04:53, Frank Rowand wrote: >> On 2/25/20 10:45 AM, Luca Ceresoli wrote: >>> For some of its error paths, of_resolve_phandles() only logs a very generic >>> error which does not help much in finding the origin of the problem: >>> >>> OF: resolver: overlay phandle fixup failed: -22 >>> >>> Add error messages for all the error paths that don't have one. Now a >>> specific message is always emitted, thus also remove the generic catch-all >>> message emitted before returning. >>> >>> For example, in case a DT overlay has a fixup node that is not present in >>> the base DT __symbols__, this error is now logged: >>> >>> OF: resolver: node gpio9 not found in base DT, fixup failed >>> >>> Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> >>> Cc: Geert Uytterhoeven <geert@linux-m68k.org> >>> --- >>> >>> I don't know in detail the meaning of the adjust_local_phandle_references() >>> and update_usages_of_a_phandle_reference() error paths, thus I have put >>> pretty generic messages. Any suggestion on better wording would be welcome. >> >> If you have not read the code to understand what the meaning of >> the errors are, you should not be suggesting changes to the error >> messages. >> >> Only one of the issues detected as errors can possibly be something >> other than an error either in the resolver.c code or the dtc >> compiler -- a missing symbol in the live devicetree. This may >> be because of failing to compile the base devicetree without >> symbols, depending on a symbol from another overlay where the >> other overlay has not been applied, or depending on a symbol >> from another overlay where the other overlay is applied but >> the overlay was not compiled with symbols. (Not meant to be >> an exhaustive list, but it might be.) Thus the missing >> symbol problem might be fixable without a fix to kernel >> code. The error message philosophy for overlay related >> errors is to minimize error messages that help diagnose >> the precise cause of a kernel code bug, with the intent >> of keeping the code more compact and readable. When a >> bug occurs, debugging messages can be added for the >> debug session. > > Got it, sorry about that. > >> Following this philosophy, only the message in the second >> patch chunk is ok. > > Then I think you can apply the v1 patch which only contains the message > about the problem I experienced, and which was caused by an incorrect DTO: > > https://patchwork.ozlabs.org/patch/1243987/ > > Just ignore the note saying the patch is not for mainline, it's wrong. > Mostly yes, v1 contains the one place a message should be added. Let me bike shed a little bit though. I suggested a different wording for the message in v2, but I do not think my attempt at wording was precise enough. I would instead suggest: "node label '%s' not found in live devicetree symbols table\n" Some subtle differences. - It is a node label, not a node name. - If multiple overlays are applied, then the intention may have been to supply the node label via a previously applied overlay instead of from the base devicetree. So specifying the live devicetree is more accurate. Please submit v3 for mainline. Thanks, Frank
diff --git a/drivers/of/resolver.c b/drivers/of/resolver.c index 83c766233181..a80d673621bc 100644 --- a/drivers/of/resolver.c +++ b/drivers/of/resolver.c @@ -291,8 +291,10 @@ int of_resolve_phandles(struct device_node *overlay) break; err = adjust_local_phandle_references(local_fixups, overlay, phandle_delta); - if (err) + if (err) { + pr_err("cannot adjust local phandle references\n"); goto out; + } overlay_fixups = NULL; @@ -321,11 +323,15 @@ int of_resolve_phandles(struct device_node *overlay) err = of_property_read_string(tree_symbols, prop->name, &refpath); - if (err) + if (err) { + pr_err("node %s not found in base DT, fixup failed\n", + prop->name); goto out; + } refnode = of_find_node_by_path(refpath); if (!refnode) { + pr_err("cannot find node for %s\n", refpath); err = -ENOENT; goto out; } @@ -334,13 +340,14 @@ int of_resolve_phandles(struct device_node *overlay) of_node_put(refnode); err = update_usages_of_a_phandle_reference(overlay, prop, phandle); - if (err) + if (err) { + pr_err("cannot update usages of a phandle reference (%s)\n", + prop->name); break; + } } out: - if (err) - pr_err("overlay phandle fixup failed: %d\n", err); of_node_put(tree_symbols); return err;
For some of its error paths, of_resolve_phandles() only logs a very generic error which does not help much in finding the origin of the problem: OF: resolver: overlay phandle fixup failed: -22 Add error messages for all the error paths that don't have one. Now a specific message is always emitted, thus also remove the generic catch-all message emitted before returning. For example, in case a DT overlay has a fixup node that is not present in the base DT __symbols__, this error is now logged: OF: resolver: node gpio9 not found in base DT, fixup failed Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> --- I don't know in detail the meaning of the adjust_local_phandle_references() and update_usages_of_a_phandle_reference() error paths, thus I have put pretty generic messages. Any suggestion on better wording would be welcome. Changed in v2: - add a message for each error path that does not have one yet --- drivers/of/resolver.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)