Message ID | 20200716045542.33554-1-aik@ozlabs.ru |
---|---|
State | Accepted |
Headers | show |
Series | fdt: Avoid recursion when traversing tree | expand |
Hi Alexey, It seems it did again ! I haven't received this in my inbox, despite you've put me on Cc... :-\ On Thu, 16 Jul 2020 14:55:42 +1000 Alexey Kardashevskiy <aik@ozlabs.ru> wrote: > A loop over peers does not need recursion which becomes a problem with > hundreds devices. > You're likely right but, per curiosity, do you have some numbers to share ? > Cc: Greg Kurz <groug@kaod.org> > Fixes: efa56b851fab ("fdt: Delete nodes of devices removed between boot and CAS") > Suggested-by: Jordan Niethe <jniethe5@gmail.com> > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> > --- > board-qemu/slof/fdt.fs | 27 +++++++++++++++------------ > 1 file changed, 15 insertions(+), 12 deletions(-) > > diff --git a/board-qemu/slof/fdt.fs b/board-qemu/slof/fdt.fs > index 7da4ff066702..03b1d4034d0a 100644 > --- a/board-qemu/slof/fdt.fs > +++ b/board-qemu/slof/fdt.fs > @@ -636,20 +636,23 @@ r> drop > THEN > ; > > -: (fdt-cas-search-obsolete-nodes) ( start node -- ) > - dup IF > - dup child 2 pick swap recurse > - dup peer 2 pick swap recurse > - > - dup fdt-cas-node-obsolete? IF > - fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN > - dup delete-node > - THEN > +: (fdt-cas-search-obsolete-nodes) ( node -- ) > + dup child > + BEGIN > + dup > + WHILE > + dup recurse > + peer > + REPEAT > + drop > + dup fdt-cas-node-obsolete? IF > + fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN > + dup delete-node > THEN > - 2drop > + drop > ; > > -: fdt-cas-delete-obsolete-nodes ( start -- ) > +: fdt-cas-delete-obsolete-nodes ( -- ) > s" /" find-device get-node (fdt-cas-search-obsolete-nodes) > fdt-cas-delete-obsolete-aliases > ; > @@ -657,7 +660,7 @@ r> drop > : fdt-fix-cas-node ( start -- ) > fdt-generation# 1+ to fdt-generation# > 0 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Add phandles > - dup fdt-cas-delete-obsolete-nodes \ Delete removed devices > + fdt-cas-delete-obsolete-nodes \ Delete removed devices Maybe keep the comment aligned with the other ones ? Reviewed-by: Greg Kurz <groug@kaod.org> > 1 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Patch+add other properties > 2 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Delete phandles from pass 0 > drop
Hi! On Thu, Jul 16, 2020 at 05:54:30PM +0200, Greg Kurz wrote: > On Thu, 16 Jul 2020 14:55:42 +1000 > Alexey Kardashevskiy <aik@ozlabs.ru> wrote: > > A loop over peers does not need recursion which becomes a problem with > > hundreds devices. > > You're likely right but, per curiosity, do you have some numbers to > share ? It's two items on the data stack per recursion (and one on the return stack). This would probably not take too much stack (just a lot, not *too* much though), except that to do "peer" we also recurse, so if there are a lot of entries in any of your parent nodes, you lose. And most trees have some nodes with many children. You can code any of the tree walks without any recursion, using just O(1) extra memory (just keep track of the previous node visited, for example), but in this case, it is probably fine to just do a loop to handle the nodes on one level. > > -: (fdt-cas-search-obsolete-nodes) ( start node -- ) > > - dup IF > > - dup child 2 pick swap recurse > > - dup peer 2 pick swap recurse > > - > > - dup fdt-cas-node-obsolete? IF > > - fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN > > - dup delete-node > > - THEN > > +: (fdt-cas-search-obsolete-nodes) ( node -- ) > > + dup child > > + BEGIN > > + dup > > + WHILE > > + dup recurse > > + peer > > + REPEAT > > + drop > > + dup fdt-cas-node-obsolete? IF > > + fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN > > + dup delete-node > > THEN > > - 2drop > > + drop > > ; ... which is what this does :-) The patch looks fine to me. Segher
On 17/07/2020 01:54, Greg Kurz wrote: > Hi Alexey, > > It seems it did again ! I haven't received this in my inbox, despite you've > put me on Cc... :-\ > > On Thu, 16 Jul 2020 14:55:42 +1000 > Alexey Kardashevskiy <aik@ozlabs.ru> wrote: > >> A loop over peers does not need recursion which becomes a problem with >> hundreds devices. >> > > You're likely right but, per curiosity, do you have some numbers to > share ? This was discovered with just "-smp 2048,cores=512,threads=4" and the results were either "cas not implemented" or some other weird XIVE messages (I did not see any but Anton did). > >> Cc: Greg Kurz <groug@kaod.org> >> Fixes: efa56b851fab ("fdt: Delete nodes of devices removed between boot and CAS") >> Suggested-by: Jordan Niethe <jniethe5@gmail.com> >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> >> --- >> board-qemu/slof/fdt.fs | 27 +++++++++++++++------------ >> 1 file changed, 15 insertions(+), 12 deletions(-) >> >> diff --git a/board-qemu/slof/fdt.fs b/board-qemu/slof/fdt.fs >> index 7da4ff066702..03b1d4034d0a 100644 >> --- a/board-qemu/slof/fdt.fs >> +++ b/board-qemu/slof/fdt.fs >> @@ -636,20 +636,23 @@ r> drop >> THEN >> ; >> >> -: (fdt-cas-search-obsolete-nodes) ( start node -- ) >> - dup IF >> - dup child 2 pick swap recurse >> - dup peer 2 pick swap recurse >> - >> - dup fdt-cas-node-obsolete? IF >> - fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN >> - dup delete-node >> - THEN >> +: (fdt-cas-search-obsolete-nodes) ( node -- ) >> + dup child >> + BEGIN >> + dup >> + WHILE >> + dup recurse >> + peer >> + REPEAT >> + drop >> + dup fdt-cas-node-obsolete? IF >> + fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN >> + dup delete-node >> THEN >> - 2drop >> + drop >> ; >> >> -: fdt-cas-delete-obsolete-nodes ( start -- ) >> +: fdt-cas-delete-obsolete-nodes ( -- ) >> s" /" find-device get-node (fdt-cas-search-obsolete-nodes) >> fdt-cas-delete-obsolete-aliases >> ; >> @@ -657,7 +660,7 @@ r> drop >> : fdt-fix-cas-node ( start -- ) >> fdt-generation# 1+ to fdt-generation# >> 0 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Add phandles >> - dup fdt-cas-delete-obsolete-nodes \ Delete removed devices >> + fdt-cas-delete-obsolete-nodes \ Delete removed devices > > Maybe keep the comment aligned with the other ones ? May be :) > > Reviewed-by: Greg Kurz <groug@kaod.org> Thanks! > >> 1 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Patch+add other properties >> 2 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Delete phandles from pass 0 >> drop >
On 17/07/2020 07:21, Segher Boessenkool wrote: > Hi! > > On Thu, Jul 16, 2020 at 05:54:30PM +0200, Greg Kurz wrote: >> On Thu, 16 Jul 2020 14:55:42 +1000 >> Alexey Kardashevskiy <aik@ozlabs.ru> wrote: >>> A loop over peers does not need recursion which becomes a problem with >>> hundreds devices. >> >> You're likely right but, per curiosity, do you have some numbers to >> share ? > > It's two items on the data stack per recursion (and one on the return > stack). This would probably not take too much stack (just a lot, not > *too* much though), except that to do "peer" we also recurse, so if > there are a lot of entries in any of your parent nodes, you lose. And > most trees have some nodes with many children. > > You can code any of the tree walks without any recursion, using just > O(1) extra memory (just keep track of the previous node visited, for > example) for O(1) we need to keep a parent in the current node (which we do). I wonder now if SLOF does O(1) walks anywhere. >, but in this case, it is probably fine to just do a loop to > handle the nodes on one level. > >>> -: (fdt-cas-search-obsolete-nodes) ( start node -- ) >>> - dup IF >>> - dup child 2 pick swap recurse >>> - dup peer 2 pick swap recurse >>> - >>> - dup fdt-cas-node-obsolete? IF >>> - fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN >>> - dup delete-node >>> - THEN >>> +: (fdt-cas-search-obsolete-nodes) ( node -- ) >>> + dup child >>> + BEGIN >>> + dup >>> + WHILE >>> + dup recurse >>> + peer >>> + REPEAT >>> + drop >>> + dup fdt-cas-node-obsolete? IF >>> + fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN >>> + dup delete-node >>> THEN >>> - 2drop >>> + drop >>> ; > > ... which is what this does :-)
On Fri, Jul 17, 2020 at 10:45:58AM +1000, Alexey Kardashevskiy wrote: > > You can code any of the tree walks without any recursion, using just > > O(1) extra memory (just keep track of the previous node visited, for > > example) > > for O(1) we need to keep a parent in the current node (which we do). I > wonder now if SLOF does O(1) walks anywhere. Before the FDT stuff the only (full) tree walks were done via the client interface, and then all the traversion logic is done there. In OF you typically open a node by (partial) path name, unit addresses, etc.; you are never interested in a whole (sub-)tree, as far as I can see. Maybe there is some case, but it then escapes me. Segher
On Fri, 17 Jul 2020 10:34:40 +1000 Alexey Kardashevskiy <aik@ozlabs.ru> wrote: > > > On 17/07/2020 01:54, Greg Kurz wrote: > > Hi Alexey, > > > > It seems it did again ! I haven't received this in my inbox, despite you've > > put me on Cc... :-\ > > > > On Thu, 16 Jul 2020 14:55:42 +1000 > > Alexey Kardashevskiy <aik@ozlabs.ru> wrote: > > > >> A loop over peers does not need recursion which becomes a problem with > >> hundreds devices. > >> > > > > You're likely right but, per curiosity, do you have some numbers to > > share ? > > > This was discovered with just "-smp 2048,cores=512,threads=4" and the > results were either "cas not implemented" or some other weird XIVE > messages (I did not see any but Anton did). > Oh, I was thinking to something like "it took 30 minutes to boot", but you seem to be talking about dysfunction of SLOF, which would be caused by stack exhaustion if I understand Segher's anwser correctly, right ? > > > > > >> Cc: Greg Kurz <groug@kaod.org> > >> Fixes: efa56b851fab ("fdt: Delete nodes of devices removed between boot and CAS") > >> Suggested-by: Jordan Niethe <jniethe5@gmail.com> > >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> > >> --- > >> board-qemu/slof/fdt.fs | 27 +++++++++++++++------------ > >> 1 file changed, 15 insertions(+), 12 deletions(-) > >> > >> diff --git a/board-qemu/slof/fdt.fs b/board-qemu/slof/fdt.fs > >> index 7da4ff066702..03b1d4034d0a 100644 > >> --- a/board-qemu/slof/fdt.fs > >> +++ b/board-qemu/slof/fdt.fs > >> @@ -636,20 +636,23 @@ r> drop > >> THEN > >> ; > >> > >> -: (fdt-cas-search-obsolete-nodes) ( start node -- ) > >> - dup IF > >> - dup child 2 pick swap recurse > >> - dup peer 2 pick swap recurse > >> - > >> - dup fdt-cas-node-obsolete? IF > >> - fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN > >> - dup delete-node > >> - THEN > >> +: (fdt-cas-search-obsolete-nodes) ( node -- ) > >> + dup child > >> + BEGIN > >> + dup > >> + WHILE > >> + dup recurse > >> + peer > >> + REPEAT > >> + drop > >> + dup fdt-cas-node-obsolete? IF > >> + fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN > >> + dup delete-node > >> THEN > >> - 2drop > >> + drop > >> ; > >> > >> -: fdt-cas-delete-obsolete-nodes ( start -- ) > >> +: fdt-cas-delete-obsolete-nodes ( -- ) > >> s" /" find-device get-node (fdt-cas-search-obsolete-nodes) > >> fdt-cas-delete-obsolete-aliases > >> ; > >> @@ -657,7 +660,7 @@ r> drop > >> : fdt-fix-cas-node ( start -- ) > >> fdt-generation# 1+ to fdt-generation# > >> 0 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Add phandles > >> - dup fdt-cas-delete-obsolete-nodes \ Delete removed devices > >> + fdt-cas-delete-obsolete-nodes \ Delete removed devices > > > > Maybe keep the comment aligned with the other ones ? > > May be :) > > > > > Reviewed-by: Greg Kurz <groug@kaod.org> > > > Thanks! > > > > >> 1 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Patch+add other properties > >> 2 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Delete phandles from pass 0 > >> drop > > >
On 17/07/2020 17:42, Greg Kurz wrote: > On Fri, 17 Jul 2020 10:34:40 +1000 > Alexey Kardashevskiy <aik@ozlabs.ru> wrote: > >> >> >> On 17/07/2020 01:54, Greg Kurz wrote: >>> Hi Alexey, >>> >>> It seems it did again ! I haven't received this in my inbox, despite you've >>> put me on Cc... :-\ >>> >>> On Thu, 16 Jul 2020 14:55:42 +1000 >>> Alexey Kardashevskiy <aik@ozlabs.ru> wrote: >>> >>>> A loop over peers does not need recursion which becomes a problem with >>>> hundreds devices. >>>> >>> >>> You're likely right but, per curiosity, do you have some numbers to >>> share ? >> >> >> This was discovered with just "-smp 2048,cores=512,threads=4" and the >> results were either "cas not implemented" or some other weird XIVE >> messages (I did not see any but Anton did). >> > > Oh, I was thinking to something like "it took 30 minutes to boot", but Well, this too but it is just some waiting :) > you seem to be talking about dysfunction of SLOF, which would be caused > by stack exhaustion if I understand Segher's anwser correctly, right ? Right. I am a bit surprised SLOF did not barf with "stack overflow" though. > >> >> >>> >>>> Cc: Greg Kurz <groug@kaod.org> >>>> Fixes: efa56b851fab ("fdt: Delete nodes of devices removed between boot and CAS") >>>> Suggested-by: Jordan Niethe <jniethe5@gmail.com> >>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> >>>> --- >>>> board-qemu/slof/fdt.fs | 27 +++++++++++++++------------ >>>> 1 file changed, 15 insertions(+), 12 deletions(-) >>>> >>>> diff --git a/board-qemu/slof/fdt.fs b/board-qemu/slof/fdt.fs >>>> index 7da4ff066702..03b1d4034d0a 100644 >>>> --- a/board-qemu/slof/fdt.fs >>>> +++ b/board-qemu/slof/fdt.fs >>>> @@ -636,20 +636,23 @@ r> drop >>>> THEN >>>> ; >>>> >>>> -: (fdt-cas-search-obsolete-nodes) ( start node -- ) >>>> - dup IF >>>> - dup child 2 pick swap recurse >>>> - dup peer 2 pick swap recurse >>>> - >>>> - dup fdt-cas-node-obsolete? IF >>>> - fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN >>>> - dup delete-node >>>> - THEN >>>> +: (fdt-cas-search-obsolete-nodes) ( node -- ) >>>> + dup child >>>> + BEGIN >>>> + dup >>>> + WHILE >>>> + dup recurse >>>> + peer >>>> + REPEAT >>>> + drop >>>> + dup fdt-cas-node-obsolete? IF >>>> + fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN >>>> + dup delete-node >>>> THEN >>>> - 2drop >>>> + drop >>>> ; >>>> >>>> -: fdt-cas-delete-obsolete-nodes ( start -- ) >>>> +: fdt-cas-delete-obsolete-nodes ( -- ) >>>> s" /" find-device get-node (fdt-cas-search-obsolete-nodes) >>>> fdt-cas-delete-obsolete-aliases >>>> ; >>>> @@ -657,7 +660,7 @@ r> drop >>>> : fdt-fix-cas-node ( start -- ) >>>> fdt-generation# 1+ to fdt-generation# >>>> 0 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Add phandles >>>> - dup fdt-cas-delete-obsolete-nodes \ Delete removed devices >>>> + fdt-cas-delete-obsolete-nodes \ Delete removed devices >>> >>> Maybe keep the comment aligned with the other ones ? >> >> May be :) >> >>> >>> Reviewed-by: Greg Kurz <groug@kaod.org> >> >> >> Thanks! >> >>> >>>> 1 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Patch+add other properties >>>> 2 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Delete phandles from pass 0 >>>> drop >>> >> >
On Fri, Jul 17, 2020 at 07:39:53PM +1000, Alexey Kardashevskiy wrote: > Well, this too but it is just some waiting :) > > > you seem to be talking about dysfunction of SLOF, which would be caused > > by stack exhaustion if I understand Segher's anwser correctly, right ? > > > Right. I am a bit surprised SLOF did not barf with "stack overflow" though. I do not know the current state of this well, so apply salt as needed... It is much too much overhead to check the stack pointer always (executing a single virtual machine insn is just a handful of real machine instructions!) A traditional approach is to only check the stack in the outer interpreter loop (where you have to find every name that is used in the dictionary anyway, or parse numbers, or anything else that costs thousands of cycles *anyway*!) Many paflof configurations unmap the pages right above and below the stacks, to trap such things. That does let slight underflows through, because the top element is not usually actually stored on the stack. SLOF runs in real mode, so there are no guard pages, you can write to all RAM always. By default the interactive interpreter prints the stack depth at every prompt. This is very helpful while writing and testing code. And with the "test everything you write as soon as you write it" approach, most problems are avoided. Exceptions are capacity problems (like here), or anything else that cannot be tested for easily (hardware problems for example). So, perhaps this tree traversal code should call some "test the stack pointers and scream" word at strategic places? At the start of handling any subtree for example. (If some heavier routine is called for every node, you can put the test *there* instead, without making everything slow). Segher
On 17/07/2020 20:28, Segher Boessenkool wrote: > On Fri, Jul 17, 2020 at 07:39:53PM +1000, Alexey Kardashevskiy wrote: >> Well, this too but it is just some waiting :) >> >>> you seem to be talking about dysfunction of SLOF, which would be caused >>> by stack exhaustion if I understand Segher's anwser correctly, right ? >> >> >> Right. I am a bit surprised SLOF did not barf with "stack overflow" though. > > I do not know the current state of this well, so apply salt as needed... > > It is much too much overhead to check the stack pointer always > (executing a single virtual machine insn is just a handful of real > machine instructions!) This all true but yet there is "ERROR: stack overflow in engine()" and I saw these in the past, and there others. But I never got to understand it completely their limits, Thomas would probably know. > A traditional approach is to only check the > stack in the outer interpreter loop (where you have to find every name > that is used in the dictionary anyway, or parse numbers, or anything > else that costs thousands of cycles *anyway*!) > > Many paflof configurations unmap the pages right above and below the > stacks, to trap such things. That does let slight underflows through, > because the top element is not usually actually stored on the stack. > > SLOF runs in real mode, so there are no guard pages, you can write to > all RAM always. Yes :( > By default the interactive interpreter prints the stack depth at every > prompt. This is very helpful while writing and testing code. And with > the "test everything you write as soon as you write it" approach, most > problems are avoided. Exceptions are capacity problems (like here), or > anything else that cannot be tested for easily (hardware problems for > example). > > So, perhaps this tree traversal code should call some "test the stack > pointers and scream" word at strategic places? At the start of handling > any subtree for example. > > (If some heavier routine is called for every node, you can put the test > *there* instead, without making everything slow). Making everything slow is perfectly fine for debugging SLOF. Thankfully SLOF is not that complicated (real mode, 1 CPU, 1 device open at the time) and normally a VM config is enough to reproduce the problem.
On 17/07/2020 13.02, Alexey Kardashevskiy wrote: > > > On 17/07/2020 20:28, Segher Boessenkool wrote: >> On Fri, Jul 17, 2020 at 07:39:53PM +1000, Alexey Kardashevskiy wrote: >>> Well, this too but it is just some waiting :) >>> >>>> you seem to be talking about dysfunction of SLOF, which would be caused >>>> by stack exhaustion if I understand Segher's anwser correctly, right ? >>> >>> >>> Right. I am a bit surprised SLOF did not barf with "stack overflow" though. >> >> I do not know the current state of this well, so apply salt as needed... >> >> It is much too much overhead to check the stack pointer always >> (executing a single virtual machine insn is just a handful of real >> machine instructions!) > > > This all true but yet there is "ERROR: stack overflow in engine()" and I > saw these in the past, and there others. But I never got to understand > it completely their limits, Thomas would probably know. IIRC that is issued by a check at the beginning of engine(), i.e. you'll only see that if you e.g. call from Forth into C code and that calls back into Forth via engine(). Thomas
On Fri, Jul 17, 2020 at 09:02:34PM +1000, Alexey Kardashevskiy wrote: > > (If some heavier routine is called for every node, you can put the test > > *there* instead, without making everything slow). > > Making everything slow is perfectly fine for debugging SLOF. Not if you have to debug on real hardware, or debug the hardware driver itself, or even the actual hardware itself. > Thankfully > SLOF is not that complicated (real mode, 1 CPU, 1 device open at the > time) and normally a VM config is enough to reproduce the problem. SLOF always has many devices open at the same time. Only one ihandle is in my-self at any point of time, of course. This very nicely simplifies how you have to deal with things. Segher
On 18/07/2020 03:49, Segher Boessenkool wrote: > On Fri, Jul 17, 2020 at 09:02:34PM +1000, Alexey Kardashevskiy wrote: >>> (If some heavier routine is called for every node, you can put the test >>> *there* instead, without making everything slow). >> >> Making everything slow is perfectly fine for debugging SLOF. > > Not if you have to debug on real hardware, or debug the hardware driver > itself, or even the actual hardware itself. SLOF on real hardware in 2020? :) >> Thankfully >> SLOF is not that complicated (real mode, 1 CPU, 1 device open at the >> time) and normally a VM config is enough to reproduce the problem. > > SLOF always has many devices open at the same time. Only one ihandle > is in my-self at any point of time, of course. This very nicely > simplifies how you have to deal with things. I meant it does not open more than a single boot device (one or more instances, with packages), and yeah, it keeps serial/vga and TPM open but 1) it is not really many 2) checking stack boundary at every step would make a useful debug tool. Well, I got my answer anyway, thanks for sharing :)
Hi! On Mon, Jul 20, 2020 at 10:50:35AM +1000, Alexey Kardashevskiy wrote: > On 18/07/2020 03:49, Segher Boessenkool wrote: > > On Fri, Jul 17, 2020 at 09:02:34PM +1000, Alexey Kardashevskiy wrote: > >>> (If some heavier routine is called for every node, you can put the test > >>> *there* instead, without making everything slow). > >> > >> Making everything slow is perfectly fine for debugging SLOF. > > > > Not if you have to debug on real hardware, or debug the hardware driver > > itself, or even the actual hardware itself. > > SLOF on real hardware in 2020? :) *Shrug*. It should still work! > >> Thankfully > >> SLOF is not that complicated (real mode, 1 CPU, 1 device open at the > >> time) and normally a VM config is enough to reproduce the problem. > > > > SLOF always has many devices open at the same time. Only one ihandle > > is in my-self at any point of time, of course. This very nicely > > simplifies how you have to deal with things. > > I meant it does not open more than a single boot device (one or more > instances, with packages), and yeah, it keeps serial/vga and TPM open > but 1) it is not really many In the normal case, whenever a device is opened first all its bus parent devices are opened, recursively. You also can have multiple files (on block devices) open at once, etc. This is the same on all Open Firmware implementations. > 2) checking stack boundary at every step > would make a useful debug tool. Yes, and very expensive. I don't dispute that it can be useful in certain cases :-) (It is easy to add in such cases, as well.) Segher
diff --git a/board-qemu/slof/fdt.fs b/board-qemu/slof/fdt.fs index 7da4ff066702..03b1d4034d0a 100644 --- a/board-qemu/slof/fdt.fs +++ b/board-qemu/slof/fdt.fs @@ -636,20 +636,23 @@ r> drop THEN ; -: (fdt-cas-search-obsolete-nodes) ( start node -- ) - dup IF - dup child 2 pick swap recurse - dup peer 2 pick swap recurse - - dup fdt-cas-node-obsolete? IF - fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN - dup delete-node - THEN +: (fdt-cas-search-obsolete-nodes) ( node -- ) + dup child + BEGIN + dup + WHILE + dup recurse + peer + REPEAT + drop + dup fdt-cas-node-obsolete? IF + fdt-debug IF dup ." Deleting obsolete node: " dup .node ." = " . cr THEN + dup delete-node THEN - 2drop + drop ; -: fdt-cas-delete-obsolete-nodes ( start -- ) +: fdt-cas-delete-obsolete-nodes ( -- ) s" /" find-device get-node (fdt-cas-search-obsolete-nodes) fdt-cas-delete-obsolete-aliases ; @@ -657,7 +660,7 @@ r> drop : fdt-fix-cas-node ( start -- ) fdt-generation# 1+ to fdt-generation# 0 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Add phandles - dup fdt-cas-delete-obsolete-nodes \ Delete removed devices + fdt-cas-delete-obsolete-nodes \ Delete removed devices 1 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Patch+add other properties 2 to fdt-cas-pass dup (fdt-fix-cas-node) drop \ Delete phandles from pass 0 drop
A loop over peers does not need recursion which becomes a problem with hundreds devices. Cc: Greg Kurz <groug@kaod.org> Fixes: efa56b851fab ("fdt: Delete nodes of devices removed between boot and CAS") Suggested-by: Jordan Niethe <jniethe5@gmail.com> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> --- board-qemu/slof/fdt.fs | 27 +++++++++++++++------------ 1 file changed, 15 insertions(+), 12 deletions(-)