Message ID | 20171003051523.17650-6-aik@ozlabs.ru |
---|---|
State | Superseded |
Headers | show |
Series | fdt: Pass the resulting device tree to QEMU | expand |
On Tue, Oct 03, 2017 at 04:15:23PM +1100, Alexey Kardashevskiy wrote: > This creates flatten device tree and passes it to QEMU via a custom > hypercall right before jumping to RTAS. Uh.. "right before jumping to RTAS" isn't very clear to me. It's called at quiesce time, right, which we anticipate is the last thing SLOF will do before being wiped away by the OS. That's not really connected to RTAS AFAICT. > On a machine with 256 CPUs and 256 virtual Intel E1000 devices the blob > is 360KB (356KB structs and 20KB of strings), building such a tree takes > ~2s on a POWER8 box. A simple tree with 1 CPU and a couple of devices > takes 38ms and creates 16KB blob. Hrm. 360/16 * 38ms = 855ms < 1s. Which suggests we have something that's slower than O(n) here, which is concerning. 256 cpus and 256 devices is largish, but it's not a ludicrous size. > This preloads strings with 40 property names from CPU and PCI device nodes > and the strings lookup only searches within these. Without string reusing > at all, the strings blob is 200KB and rendering time is 1.7sec; with > unlimited reusing, the strings blob is 4KB and rendering time is > 2.8sec. Blech, that's a prety ugly set of tradeoffs. When I suggested this approach I hadn't thought it would take so long to flatten the tree in Forth. > > While we are here, fix the version handling in fdt-init. It only matters > a little for the fdt-debug==1 case though. > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> > --- > > Changes: > v3: > * fixed stack handling after hcall returned > * fixed format versions in both rendering and parsing paths > * rebased on top of removed unused hvcalls > * renamed used variables to have fdtfl- prefixes as there are already > some for parsing the initial dt > > v2: > * fixed comments from review > * added strings cache > * changed last_compat_vers from 0x17 to 0x16 as suggested by dwg > > --- > > I tested the blob by storing it from QEMU to a file and decompiling it; > this produces error which I do not really > understand as the name of the root is an empty string (literaly: > 00 00 00 01 00 00 00 00) and yet this error: > > aik@fstn1-p1:~$ dtc -f -I dtb -O dts -o dbg.dts dbg.dtb > ERROR (name_properties): "name" property in / is incorrect ("/" instead of base node name) > Warning: Input tree has errors, output forced The reason for this error is it's not to do with the "node name" as given after the FDT_BEGIN_TAG, but with the "name" property. That's not required in FDT, but if supplied should match that node name exactly. IIRC, however, for human readability runtime OF treats 'name' of the root node as being '/'. Arguably that is a bug in dtc - '/' as the name property in the root is probably acceptable. On the other hand, you should probably just omit the 'name' properties when you flatten the tree. > --- > lib/libhvcall/libhvcall.h | 3 +- > board-qemu/slof/fdt.fs | 284 +++++++++++++++++++++++++++++++++++++++++++++- > board-qemu/slof/rtas.fs | 7 ++ > lib/libhvcall/hvcall.code | 5 + > lib/libhvcall/hvcall.in | 1 + > 5 files changed, 297 insertions(+), 3 deletions(-) > > diff --git a/lib/libhvcall/libhvcall.h b/lib/libhvcall/libhvcall.h > index 6356a62..3fa4398 100644 > --- a/lib/libhvcall/libhvcall.h > +++ b/lib/libhvcall/libhvcall.h > @@ -24,7 +24,8 @@ > #define KVMPPC_H_LOGICAL_MEMOP (KVMPPC_HCALL_BASE + 0x1) > /* Client Architecture support */ > #define KVMPPC_H_CAS (KVMPPC_HCALL_BASE + 0x2) > -#define KVMPPC_HCALL_MAX KVMPPC_H_CAS > +#define KVMPPC_H_UPDATE_DT (KVMPPC_HCALL_BASE + 0x3) > +#define KVMPPC_HCALL_MAX KVMPPC_H_UPDATE_DT > > #ifndef __ASSEMBLY__ > > diff --git a/board-qemu/slof/fdt.fs b/board-qemu/slof/fdt.fs > index 851645e..548cc25 100644 > --- a/board-qemu/slof/fdt.fs > +++ b/board-qemu/slof/fdt.fs > @@ -27,7 +27,7 @@ struct > 4 field >fdth_boot_cpu > 4 field >fdth_string_size > 4 field >fdth_struct_size > -drop > +constant /fdth > > h# d00dfeed constant OF_DT_HEADER > h# 1 constant OF_DT_BEGIN_NODE > @@ -69,7 +69,7 @@ fdt-start fdt-init > dup >fdth_version l@ 3 >= IF > ." strings size : 0x" dup >fdth_string_size l@ . cr > THEN > - dup >fdth_version l@ 17 >= IF > + dup >fdth_version l@ 11 >= IF > ." struct size : 0x" dup >fdth_struct_size l@ . cr > THEN > THEN > @@ -439,4 +439,284 @@ r> drop > fdt-cas-fix? > ; > > +VARIABLE fdtfl-struct > +VARIABLE fdtfl-struct-here > +VARIABLE fdtfl-strings > +VARIABLE fdtfl-strings-cache > +VARIABLE fdtfl-strings-here > +VARIABLE fdtfl-strings-reused \ debug only > +VARIABLE fdt-ms \ debug only > + > +: fdt-skip-string ( cur -- cur ) > + BEGIN > + dup c@ > + WHILE > + 1+ > + REPEAT > + 4 + -4 and > +; > + > +: zstring= ( str len zstr -- flag ) > + 2dup + c@ 0<> IF > + 3drop false > + EXIT > + THEN > + swap comp 0= > +; > + > +: fdt-find-string ( name namelen -- nameoff true | false ) > + fdtfl-strings @ > + BEGIN > + dup fdtfl-strings-cache @ < > + WHILE > + 3dup zstring= IF > + fdtfl-strings @ - > + -rot > + 2drop > + true > + EXIT > + THEN > + fdt-skip-string > + REPEAT > + 3drop > + false > +; > + > +: fdt-str-allot ( len -- ) fdtfl-strings-here @ + to fdtfl-strings-here ; > +: fdt-str-c, ( char -- ) fdtfl-strings-here @ 1 fdt-str-allot c! ; > +: fdt-str-align ( -- ) > + fdtfl-strings-here @ > + dup dup 4 #aligned swap - ( here bytes-to-erase ) > + dup -rot > + erase > + fdt-str-allot > +; > +: fdt-str-bytes, ( data len -- ) fdtfl-strings-here @ over fdt-str-allot swap move ; > +: fdt-str-ztr, ( str len -- ) fdt-str-bytes, 0 fdt-str-c, ; > + > +: fdt-add-string ( name namelen -- nameoff ) > + fdtfl-strings-here @ -rot > + fdt-str-ztr, > + fdt-str-align > + fdtfl-strings @ - > +; > + > +: fdt-get-string ( name namelen -- nameoff ) > + 2dup fdt-find-string IF > + -rot 2drop > + fdt-debug IF > + 1 fdtfl-strings-reused +! > + THEN > + EXIT > + THEN > + fdt-add-string > +; > + > +: fdt-allot ( len -- ) fdtfl-struct-here @ + to fdtfl-struct-here ; > +: fdt-c, ( char -- ) fdtfl-struct-here @ 1 fdt-allot c! ; > +: fdt-align ( -- ) > + fdtfl-struct-here @ > + dup dup 4 #aligned swap - ( here bytes-to-erase ) > + dup -rot > + erase > + fdt-allot > +; > +: fdt-bytes, ( data len -- ) fdtfl-struct-here @ over fdt-allot swap move ; > +: fdt-ztr, ( str len -- ) fdt-bytes, 0 fdt-c, ; > +: fdt-l, ( token -- ) fdtfl-struct-here @ l! /l fdt-allot ; > + > +: fdt-begin-node ( name namelen -- ) > + OF_DT_BEGIN_NODE fdt-l, > + 2dup 1 = swap c@ [char] / = and \ is it "/"? > + IF > + 2drop s" " \ dtc is still unhappy though > + THEN > + fdt-ztr, > + fdt-align > +; > + > +: fdt-end-node ( -- ) OF_DT_END_NODE fdt-l, ; > + > +: fdt-prop ( prop len name namelen -- ) > + OF_DT_PROP fdt-l, > + > + \ get string offset > + fdt-get-string ( prop len nameoff ) > + > + \ store len and nameoff > + over fdt-l, > + fdt-l, ( prop len ) > + > + \ now store the bytes > + fdt-bytes, > + fdt-align > +; > + > +: fdt-end ( -- ) OF_DT_END fdt-l, ; > + > +: fdt-properties ( phandle -- ) > + dup encode-int s" phandle" fdt-prop > + >r > + s" " > + BEGIN > + r@ next-property > + WHILE > + 2dup > + 2dup r@ get-property > + not IF > + 2swap fdt-prop > + THEN > + REPEAT > + r> > + drop > +; > + > +: fdt-flatten-node ( node -- ) > + fdt-debug 1 > IF dup node>path type cr THEN > + dup node>qname fdt-begin-node > + dup fdt-properties > + child > + BEGIN > + dup > + WHILE > + dup recurse > + peer > + REPEAT > + drop > + fdt-end-node > +; > + > +: fdtfl-strings-preload ( -- ) > + s" reg" fdt-add-string drop > + s" status" fdt-add-string drop > + s" 64-bit" fdt-add-string drop > + s" phandle" fdt-add-string drop > + s" ibm,vmx" fdt-add-string drop > + s" ibm,dfp" fdt-add-string drop > + s" slb-size" fdt-add-string drop > + s" ibm,purr" fdt-add-string drop > + s" vendor-id" fdt-add-string drop > + s" device-id" fdt-add-string drop > + s" min-grant" fdt-add-string drop > + s" class-code" fdt-add-string drop > + s" compatible" fdt-add-string drop > + s" interrupts" fdt-add-string drop > + s" cpu-version" fdt-add-string drop > + s" #size-cells" fdt-add-string drop > + s" ibm,req#msi" fdt-add-string drop > + s" revision-id" fdt-add-string drop > + s" device_type" fdt-add-string drop > + s" max-latency" fdt-add-string drop > + s" ibm,chip-id" fdt-add-string drop > + s" ibm,pft-size" fdt-add-string drop > + s" ibm,slb-size" fdt-add-string drop > + s" devsel-speed" fdt-add-string drop > + s" ibm,loc-code" fdt-add-string drop > + s" subsystem-id" fdt-add-string drop > + s" d-cache-size" fdt-add-string drop > + s" i-cache-size" fdt-add-string drop > + s" #address-cells" fdt-add-string drop > + s" clock-frequency" fdt-add-string drop > + s" cache-line-size" fdt-add-string drop > + s" ibm,pa-features" fdt-add-string drop > + s" ibm,my-drc-index" fdt-add-string drop > + s" d-cache-line-size" fdt-add-string drop > + s" i-cache-line-size" fdt-add-string drop > + s" assigned-addresses" fdt-add-string drop > + s" d-cache-block-size" fdt-add-string drop > + s" i-cache-block-size" fdt-add-string drop > + s" timebase-frequency" fdt-add-string drop > + s" subsystem-vendor-id" fdt-add-string drop > + s" ibm,segment-page-sizes" fdt-add-string drop > + s" ibm,ppc-interrupt-server#s" fdt-add-string drop > + s" ibm,processor-segment-sizes" fdt-add-string drop > + s" ibm,ppc-interrupt-gserver#s" fdt-add-string drop > +; > + > +: fdt-append-blob ( bytes cur blob -- cur ) > + 3dup -rot swap move > + drop + > +; > + > +: fdt-flatten-tree ( root -- tree ) > + 200000 alloc-mem dup fdtfl-struct-here ! fdtfl-struct ! > + 200000 alloc-mem dup fdtfl-strings-here ! fdtfl-strings ! > + > + fdt-debug IF > + 0 fdtfl-strings-reused ! > + milliseconds fdt-ms ! > + THEN > + > + \ Preload strings cache > + fdtfl-strings-preload > + fdtfl-strings-here @ fdtfl-strings-cache ! > + \ Render the blobs > + fdt-flatten-node > + fdt-end > + > + \ Calculate strings and struct sizes > + fdtfl-struct-here @ fdtfl-struct @ - > + fdtfl-strings-here @ fdtfl-strings @ - ( struct-len strings-len ) > + > + 2dup + /fdth + > + 10 + \ Reserve 16 bytes and an empty reserved block > + > + fdt-debug IF > + 3dup > + ." FDT flat size=" .d cr > + ." Strings size=" .d cr > + ." Struct size=" .d cr > + ." Reused strings=" fdtfl-strings-reused @ .d cr > + milliseconds fdt-ms @ - > + ." Took " .d ." ms" cr > + THEN > + > + \ Allocate flatten DT blob > + dup alloc-mem ( struct-len strings-len total-len fdt ) > + >r ( struct-len strings-len total-len r: fdt ) > + > + \ Write header > + OF_DT_HEADER r@ >fdth_magic l! > + dup r@ >fdth_tsize l! > + /fdth 10 + 2 pick + r@ >fdth_struct_off l! > + /fdth 10 + r@ >fdth_string_off l! > + /fdth r@ >fdth_rsvmap_off l! > + 11 r@ >fdth_version l! > + 10 r@ >fdth_compat_vers l! > + 0 r@ >fdth_boot_cpu l! > + over r@ >fdth_string_size l! > + 2 pick r@ >fdth_struct_size l! > + ( struct-len strings-len total-len r: fdt ) > + > + drop ( struct-len strings-len r: fdt ) > + r@ /fdth + ( struct-len strings-len cur r: fdt ) > + > + \ Write the reserved entry > + 0 over ! > + cell+ > + 0 over ! > + cell+ ( struct-len strings-len cur r: fdt ) > + > + \ Write strings and struct blobs > + fdtfl-strings @ fdt-append-blob > + fdtfl-struct @ fdt-append-blob > + drop > + > + \ Free temporary blobs > + fdtfl-struct @ 200000 free-mem > + fdtfl-strings @ 200000 free-mem > + > + \ Return fdt > + r> > +; > + > +: fdt-flatten-tree-free ( tree ) > + dup >fdth_tsize l@ free-mem > +; > + > +: fdt ( -- ) > + " /" find-node > + fdt-flatten-tree > +; > + > s" /" find-node fdt-fix-phandles > diff --git a/board-qemu/slof/rtas.fs b/board-qemu/slof/rtas.fs > index b17157e..219bcda 100644 > --- a/board-qemu/slof/rtas.fs > +++ b/board-qemu/slof/rtas.fs > @@ -98,6 +98,13 @@ find-qemu-rtas > ; > > : rtas-quiesce ( -- ) > + " /" find-node > + fdt-flatten-tree > + dup hv-update-dt ?dup IF > + \ Ignore hcall not implemented error, print error otherwise > + dup -2 <> IF ." HV-UPDATE-DT error: " . cr ELSE drop THEN > + THEN > + fdt-flatten-tree-free > " quiesce" rtas-get-token rtas-cb rtas>token l! > 0 rtas-cb rtas>nargs l! > 0 rtas-cb rtas>nret l! > diff --git a/lib/libhvcall/hvcall.code b/lib/libhvcall/hvcall.code > index 744469f..5918c90 100644 > --- a/lib/libhvcall/hvcall.code > +++ b/lib/libhvcall/hvcall.code > @@ -123,3 +123,8 @@ PRIM(check_X2d_and_X2d_patch_X2d_sc1) > > patch_broken_sc1((void*)start, (void*)end, (void*)patch_ins); > MIRP > + > +PRIM(hv_X2d_update_X2d_dt) > + unsigned long dt = TOS.u; > + TOS.u = hv_generic(KVMPPC_H_UPDATE_DT, dt); > +MIRP > diff --git a/lib/libhvcall/hvcall.in b/lib/libhvcall/hvcall.in > index e99d6d1..9193162 100644 > --- a/lib/libhvcall/hvcall.in > +++ b/lib/libhvcall/hvcall.in > @@ -30,4 +30,5 @@ cod(RX!) > > cod(hv-logical-memop) > cod(hv-cas) > +cod(hv-update-dt) > cod(get-print-version)
On Tue, 3 Oct 2017 16:15:23 +1100 Alexey Kardashevskiy <aik@ozlabs.ru> wrote: > This creates flatten device tree and passes it to QEMU via a custom > hypercall right before jumping to RTAS. > > On a machine with 256 CPUs and 256 virtual Intel E1000 devices the blob > is 360KB (356KB structs and 20KB of strings), building such a tree takes > ~2s on a POWER8 box. A simple tree with 1 CPU and a couple of devices > takes 38ms and creates 16KB blob. > > This preloads strings with 40 property names from CPU and PCI device nodes > and the strings lookup only searches within these. Without string reusing > at all, the strings blob is 200KB and rendering time is 1.7sec; with > unlimited reusing, the strings blob is 4KB and rendering time is 2.8sec. > > While we are here, fix the version handling in fdt-init. It only matters > a little for the fdt-debug==1 case though. > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> > --- > > Changes: > v3: > * fixed stack handling after hcall returned > * fixed format versions in both rendering and parsing paths > * rebased on top of removed unused hvcalls > * renamed used variables to have fdtfl- prefixes as there are already > some for parsing the initial dt > Yeah and this was confusing CAS because ibm,client-architecture-support uses fdt-struct, but wasn't accessing the right one... it took me some time to understand what was happening :-\ Maybe we should have word in fdt.fs to avoid using fdt-struct in some other file ? > v2: > * fixed comments from review > * added strings cache > * changed last_compat_vers from 0x17 to 0x16 as suggested by dwg > > --- > > I tested the blob by storing it from QEMU to a file and decompiling it; > this produces error which I do not really > understand as the name of the root is an empty string (literaly: > 00 00 00 01 00 00 00 00) and yet this error: > > aik@fstn1-p1:~$ dtc -f -I dtb -O dts -o dbg.dts dbg.dtb > ERROR (name_properties): "name" property in / is incorrect ("/" instead of base node name) > Warning: Input tree has errors, output forced > --- > lib/libhvcall/libhvcall.h | 3 +- > board-qemu/slof/fdt.fs | 284 +++++++++++++++++++++++++++++++++++++++++++++- > board-qemu/slof/rtas.fs | 7 ++ > lib/libhvcall/hvcall.code | 5 + > lib/libhvcall/hvcall.in | 1 + > 5 files changed, 297 insertions(+), 3 deletions(-) > > diff --git a/lib/libhvcall/libhvcall.h b/lib/libhvcall/libhvcall.h > index 6356a62..3fa4398 100644 > --- a/lib/libhvcall/libhvcall.h > +++ b/lib/libhvcall/libhvcall.h > @@ -24,7 +24,8 @@ > #define KVMPPC_H_LOGICAL_MEMOP (KVMPPC_HCALL_BASE + 0x1) > /* Client Architecture support */ > #define KVMPPC_H_CAS (KVMPPC_HCALL_BASE + 0x2) > -#define KVMPPC_HCALL_MAX KVMPPC_H_CAS > +#define KVMPPC_H_UPDATE_DT (KVMPPC_HCALL_BASE + 0x3) > +#define KVMPPC_HCALL_MAX KVMPPC_H_UPDATE_DT > > #ifndef __ASSEMBLY__ > > diff --git a/board-qemu/slof/fdt.fs b/board-qemu/slof/fdt.fs > index 851645e..548cc25 100644 > --- a/board-qemu/slof/fdt.fs > +++ b/board-qemu/slof/fdt.fs > @@ -27,7 +27,7 @@ struct > 4 field >fdth_boot_cpu > 4 field >fdth_string_size > 4 field >fdth_struct_size > -drop > +constant /fdth > > h# d00dfeed constant OF_DT_HEADER > h# 1 constant OF_DT_BEGIN_NODE > @@ -69,7 +69,7 @@ fdt-start fdt-init > dup >fdth_version l@ 3 >= IF > ." strings size : 0x" dup >fdth_string_size l@ . cr > THEN > - dup >fdth_version l@ 17 >= IF > + dup >fdth_version l@ 11 >= IF > ." struct size : 0x" dup >fdth_struct_size l@ . cr > THEN > THEN > @@ -439,4 +439,284 @@ r> drop > fdt-cas-fix? > ; > > +VARIABLE fdtfl-struct > +VARIABLE fdtfl-struct-here > +VARIABLE fdtfl-strings > +VARIABLE fdtfl-strings-cache > +VARIABLE fdtfl-strings-here > +VARIABLE fdtfl-strings-reused \ debug only > +VARIABLE fdt-ms \ debug only > + > +: fdt-skip-string ( cur -- cur ) > + BEGIN > + dup c@ > + WHILE > + 1+ > + REPEAT > + 4 + -4 and > +; > + > +: zstring= ( str len zstr -- flag ) > + 2dup + c@ 0<> IF > + 3drop false > + EXIT > + THEN > + swap comp 0= > +; > + > +: fdt-find-string ( name namelen -- nameoff true | false ) > + fdtfl-strings @ > + BEGIN > + dup fdtfl-strings-cache @ < > + WHILE > + 3dup zstring= IF > + fdtfl-strings @ - > + -rot > + 2drop > + true > + EXIT > + THEN > + fdt-skip-string > + REPEAT > + 3drop > + false > +; > + > +: fdt-str-allot ( len -- ) fdtfl-strings-here @ + to fdtfl-strings-here ; > +: fdt-str-c, ( char -- ) fdtfl-strings-here @ 1 fdt-str-allot c! ; > +: fdt-str-align ( -- ) > + fdtfl-strings-here @ > + dup dup 4 #aligned swap - ( here bytes-to-erase ) > + dup -rot > + erase > + fdt-str-allot > +; > +: fdt-str-bytes, ( data len -- ) fdtfl-strings-here @ over fdt-str-allot swap move ; > +: fdt-str-ztr, ( str len -- ) fdt-str-bytes, 0 fdt-str-c, ; > + > +: fdt-add-string ( name namelen -- nameoff ) > + fdtfl-strings-here @ -rot > + fdt-str-ztr, > + fdt-str-align > + fdtfl-strings @ - > +; > + > +: fdt-get-string ( name namelen -- nameoff ) > + 2dup fdt-find-string IF > + -rot 2drop > + fdt-debug IF > + 1 fdtfl-strings-reused +! > + THEN > + EXIT > + THEN > + fdt-add-string > +; > + > +: fdt-allot ( len -- ) fdtfl-struct-here @ + to fdtfl-struct-here ; > +: fdt-c, ( char -- ) fdtfl-struct-here @ 1 fdt-allot c! ; > +: fdt-align ( -- ) > + fdtfl-struct-here @ > + dup dup 4 #aligned swap - ( here bytes-to-erase ) > + dup -rot > + erase > + fdt-allot > +; > +: fdt-bytes, ( data len -- ) fdtfl-struct-here @ over fdt-allot swap move ; > +: fdt-ztr, ( str len -- ) fdt-bytes, 0 fdt-c, ; > +: fdt-l, ( token -- ) fdtfl-struct-here @ l! /l fdt-allot ; > + > +: fdt-begin-node ( name namelen -- ) > + OF_DT_BEGIN_NODE fdt-l, > + 2dup 1 = swap c@ [char] / = and \ is it "/"? > + IF > + 2drop s" " \ dtc is still unhappy though > + THEN > + fdt-ztr, > + fdt-align > +; > + > +: fdt-end-node ( -- ) OF_DT_END_NODE fdt-l, ; > + > +: fdt-prop ( prop len name namelen -- ) > + OF_DT_PROP fdt-l, > + > + \ get string offset > + fdt-get-string ( prop len nameoff ) > + > + \ store len and nameoff > + over fdt-l, > + fdt-l, ( prop len ) > + > + \ now store the bytes > + fdt-bytes, > + fdt-align > +; > + > +: fdt-end ( -- ) OF_DT_END fdt-l, ; > + > +: fdt-properties ( phandle -- ) > + dup encode-int s" phandle" fdt-prop > + >r > + s" " > + BEGIN > + r@ next-property > + WHILE > + 2dup > + 2dup r@ get-property > + not IF > + 2swap fdt-prop > + THEN > + REPEAT > + r> > + drop > +; > + > +: fdt-flatten-node ( node -- ) > + fdt-debug 1 > IF dup node>path type cr THEN > + dup node>qname fdt-begin-node > + dup fdt-properties > + child > + BEGIN > + dup > + WHILE > + dup recurse > + peer > + REPEAT > + drop > + fdt-end-node > +; > + > +: fdtfl-strings-preload ( -- ) > + s" reg" fdt-add-string drop > + s" status" fdt-add-string drop > + s" 64-bit" fdt-add-string drop > + s" phandle" fdt-add-string drop > + s" ibm,vmx" fdt-add-string drop > + s" ibm,dfp" fdt-add-string drop > + s" slb-size" fdt-add-string drop > + s" ibm,purr" fdt-add-string drop > + s" vendor-id" fdt-add-string drop > + s" device-id" fdt-add-string drop > + s" min-grant" fdt-add-string drop > + s" class-code" fdt-add-string drop > + s" compatible" fdt-add-string drop > + s" interrupts" fdt-add-string drop > + s" cpu-version" fdt-add-string drop > + s" #size-cells" fdt-add-string drop > + s" ibm,req#msi" fdt-add-string drop > + s" revision-id" fdt-add-string drop > + s" device_type" fdt-add-string drop > + s" max-latency" fdt-add-string drop > + s" ibm,chip-id" fdt-add-string drop > + s" ibm,pft-size" fdt-add-string drop > + s" ibm,slb-size" fdt-add-string drop > + s" devsel-speed" fdt-add-string drop > + s" ibm,loc-code" fdt-add-string drop > + s" subsystem-id" fdt-add-string drop > + s" d-cache-size" fdt-add-string drop > + s" i-cache-size" fdt-add-string drop > + s" #address-cells" fdt-add-string drop > + s" clock-frequency" fdt-add-string drop > + s" cache-line-size" fdt-add-string drop > + s" ibm,pa-features" fdt-add-string drop > + s" ibm,my-drc-index" fdt-add-string drop > + s" d-cache-line-size" fdt-add-string drop > + s" i-cache-line-size" fdt-add-string drop > + s" assigned-addresses" fdt-add-string drop > + s" d-cache-block-size" fdt-add-string drop > + s" i-cache-block-size" fdt-add-string drop > + s" timebase-frequency" fdt-add-string drop > + s" subsystem-vendor-id" fdt-add-string drop > + s" ibm,segment-page-sizes" fdt-add-string drop > + s" ibm,ppc-interrupt-server#s" fdt-add-string drop > + s" ibm,processor-segment-sizes" fdt-add-string drop > + s" ibm,ppc-interrupt-gserver#s" fdt-add-string drop > +; > + > +: fdt-append-blob ( bytes cur blob -- cur ) > + 3dup -rot swap move > + drop + > +; > + > +: fdt-flatten-tree ( root -- tree ) > + 200000 alloc-mem dup fdtfl-struct-here ! fdtfl-struct ! > + 200000 alloc-mem dup fdtfl-strings-here ! fdtfl-strings ! > + > + fdt-debug IF > + 0 fdtfl-strings-reused ! > + milliseconds fdt-ms ! > + THEN > + > + \ Preload strings cache > + fdtfl-strings-preload > + fdtfl-strings-here @ fdtfl-strings-cache ! > + \ Render the blobs > + fdt-flatten-node > + fdt-end > + > + \ Calculate strings and struct sizes > + fdtfl-struct-here @ fdtfl-struct @ - > + fdtfl-strings-here @ fdtfl-strings @ - ( struct-len strings-len ) > + > + 2dup + /fdth + > + 10 + \ Reserve 16 bytes and an empty reserved block > + > + fdt-debug IF > + 3dup > + ." FDT flat size=" .d cr > + ." Strings size=" .d cr > + ." Struct size=" .d cr > + ." Reused strings=" fdtfl-strings-reused @ .d cr > + milliseconds fdt-ms @ - > + ." Took " .d ." ms" cr > + THEN > + > + \ Allocate flatten DT blob > + dup alloc-mem ( struct-len strings-len total-len fdt ) > + >r ( struct-len strings-len total-len r: fdt ) > + > + \ Write header > + OF_DT_HEADER r@ >fdth_magic l! > + dup r@ >fdth_tsize l! > + /fdth 10 + 2 pick + r@ >fdth_struct_off l! > + /fdth 10 + r@ >fdth_string_off l! > + /fdth r@ >fdth_rsvmap_off l! > + 11 r@ >fdth_version l! > + 10 r@ >fdth_compat_vers l! > + 0 r@ >fdth_boot_cpu l! > + over r@ >fdth_string_size l! > + 2 pick r@ >fdth_struct_size l! > + ( struct-len strings-len total-len r: fdt ) > + > + drop ( struct-len strings-len r: fdt ) > + r@ /fdth + ( struct-len strings-len cur r: fdt ) > + > + \ Write the reserved entry > + 0 over ! > + cell+ > + 0 over ! > + cell+ ( struct-len strings-len cur r: fdt ) > + > + \ Write strings and struct blobs > + fdtfl-strings @ fdt-append-blob > + fdtfl-struct @ fdt-append-blob > + drop > + > + \ Free temporary blobs > + fdtfl-struct @ 200000 free-mem > + fdtfl-strings @ 200000 free-mem > + > + \ Return fdt > + r> > +; > + > +: fdt-flatten-tree-free ( tree ) > + dup >fdth_tsize l@ free-mem > +; > + > +: fdt ( -- ) > + " /" find-node > + fdt-flatten-tree > +; > + > s" /" find-node fdt-fix-phandles > diff --git a/board-qemu/slof/rtas.fs b/board-qemu/slof/rtas.fs > index b17157e..219bcda 100644 > --- a/board-qemu/slof/rtas.fs > +++ b/board-qemu/slof/rtas.fs > @@ -98,6 +98,13 @@ find-qemu-rtas > ; > > : rtas-quiesce ( -- ) > + " /" find-node > + fdt-flatten-tree > + dup hv-update-dt ?dup IF > + \ Ignore hcall not implemented error, print error otherwise > + dup -2 <> IF ." HV-UPDATE-DT error: " . cr ELSE drop THEN > + THEN > + fdt-flatten-tree-free > " quiesce" rtas-get-token rtas-cb rtas>token l! > 0 rtas-cb rtas>nargs l! > 0 rtas-cb rtas>nret l! > diff --git a/lib/libhvcall/hvcall.code b/lib/libhvcall/hvcall.code > index 744469f..5918c90 100644 > --- a/lib/libhvcall/hvcall.code > +++ b/lib/libhvcall/hvcall.code > @@ -123,3 +123,8 @@ PRIM(check_X2d_and_X2d_patch_X2d_sc1) > > patch_broken_sc1((void*)start, (void*)end, (void*)patch_ins); > MIRP > + > +PRIM(hv_X2d_update_X2d_dt) > + unsigned long dt = TOS.u; > + TOS.u = hv_generic(KVMPPC_H_UPDATE_DT, dt); > +MIRP > diff --git a/lib/libhvcall/hvcall.in b/lib/libhvcall/hvcall.in > index e99d6d1..9193162 100644 > --- a/lib/libhvcall/hvcall.in > +++ b/lib/libhvcall/hvcall.in > @@ -30,4 +30,5 @@ cod(RX!) > > cod(hv-logical-memop) > cod(hv-cas) > +cod(hv-update-dt) > cod(get-print-version)
On 03/10/17 19:48, Greg Kurz wrote: > On Tue, 3 Oct 2017 16:15:23 +1100 > Alexey Kardashevskiy <aik@ozlabs.ru> wrote: > >> This creates flatten device tree and passes it to QEMU via a custom >> hypercall right before jumping to RTAS. >> >> On a machine with 256 CPUs and 256 virtual Intel E1000 devices the blob >> is 360KB (356KB structs and 20KB of strings), building such a tree takes >> ~2s on a POWER8 box. A simple tree with 1 CPU and a couple of devices >> takes 38ms and creates 16KB blob. >> >> This preloads strings with 40 property names from CPU and PCI device nodes >> and the strings lookup only searches within these. Without string reusing >> at all, the strings blob is 200KB and rendering time is 1.7sec; with >> unlimited reusing, the strings blob is 4KB and rendering time is 2.8sec. >> >> While we are here, fix the version handling in fdt-init. It only matters >> a little for the fdt-debug==1 case though. >> >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> >> --- >> >> Changes: >> v3: >> * fixed stack handling after hcall returned >> * fixed format versions in both rendering and parsing paths >> * rebased on top of removed unused hvcalls >> * renamed used variables to have fdtfl- prefixes as there are already >> some for parsing the initial dt >> > > Yeah and this was confusing CAS because ibm,client-architecture-support > uses fdt-struct, but wasn't accessing the right one... it took me some > time to understand what was happening :-\ > > Maybe we should have word in fdt.fs to avoid using fdt-struct in some > other file ? This would not help - these variables are global.
On 03/10/17 17:28, David Gibson wrote: > On Tue, Oct 03, 2017 at 04:15:23PM +1100, Alexey Kardashevskiy wrote: >> This creates flatten device tree and passes it to QEMU via a custom >> hypercall right before jumping to RTAS. > > Uh.. "right before jumping to RTAS" isn't very clear to me. It's > called at quiesce time, right, which we anticipate is the last thing > SLOF will do before being wiped away by the OS. That's not really > connected to RTAS AFAICT. > I did not think much on this paragraph, sorry :) >> On a machine with 256 CPUs and 256 virtual Intel E1000 devices the blob >> is 360KB (356KB structs and 20KB of strings), building such a tree takes >> ~2s on a POWER8 box. A simple tree with 1 CPU and a couple of devices >> takes 38ms and creates 16KB blob. > > Hrm. 360/16 * 38ms = 855ms < 1s. My simple case does not do PCI, with a PCI bridge and a E1000 it is 45ms. But I see your point. It is j > Which suggests we have something > that's slower than O(n) here, which is concerning. 256 cpus and 256 > devices is largish, but it's not a ludicrous size. With 255 PCI devices, there are quite many bridges with property names which I do not preload, that could explain. >> This preloads strings with 40 property names from CPU and PCI device nodes >> and the strings lookup only searches within these. Without string reusing >> at all, the strings blob is 200KB and rendering time is 1.7sec; with >> unlimited reusing, the strings blob is 4KB and rendering time is >> 2.8sec. > > Blech, that's a prety ugly set of tradeoffs. When I suggested this > approach I hadn't thought it would take so long to flatten the tree in > Forth. Fetching the tree to the guest kernel takes close to 10 seconds though, using the client interface. Just to compare. I can look at what Segher suggested when mentioned a word list, not sure what this is and how it can help but worth to try I guess. >> >> While we are here, fix the version handling in fdt-init. It only matters >> a little for the fdt-debug==1 case though. >> >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> >> --- >> >> Changes: >> v3: >> * fixed stack handling after hcall returned >> * fixed format versions in both rendering and parsing paths >> * rebased on top of removed unused hvcalls >> * renamed used variables to have fdtfl- prefixes as there are already >> some for parsing the initial dt >> >> v2: >> * fixed comments from review >> * added strings cache >> * changed last_compat_vers from 0x17 to 0x16 as suggested by dwg >> >> --- >> >> I tested the blob by storing it from QEMU to a file and decompiling it; >> this produces error which I do not really >> understand as the name of the root is an empty string (literaly: >> 00 00 00 01 00 00 00 00) and yet this error: >> >> aik@fstn1-p1:~$ dtc -f -I dtb -O dts -o dbg.dts dbg.dtb >> ERROR (name_properties): "name" property in / is incorrect ("/" instead of base node name) >> Warning: Input tree has errors, output forced > > The reason for this error is it's not to do with the "node name" as > given after the FDT_BEGIN_TAG, but with the "name" property. That's > not required in FDT, but if supplied should match that node name > exactly. IIRC, however, for human readability runtime OF treats > 'name' of the root node as being '/'. > > Arguably that is a bug in dtc - '/' as the name property in the root > is probably acceptable. On the other hand, you should probably just > omit the 'name' properties when you flatten the tree. Ok, will do this.
On Tue, 3 Oct 2017 19:53:23 +1100 Alexey Kardashevskiy <aik@ozlabs.ru> wrote: > On 03/10/17 19:48, Greg Kurz wrote: > > On Tue, 3 Oct 2017 16:15:23 +1100 > > Alexey Kardashevskiy <aik@ozlabs.ru> wrote: > > > >> This creates flatten device tree and passes it to QEMU via a custom > >> hypercall right before jumping to RTAS. > >> > >> On a machine with 256 CPUs and 256 virtual Intel E1000 devices the blob > >> is 360KB (356KB structs and 20KB of strings), building such a tree takes > >> ~2s on a POWER8 box. A simple tree with 1 CPU and a couple of devices > >> takes 38ms and creates 16KB blob. > >> > >> This preloads strings with 40 property names from CPU and PCI device nodes > >> and the strings lookup only searches within these. Without string reusing > >> at all, the strings blob is 200KB and rendering time is 1.7sec; with > >> unlimited reusing, the strings blob is 4KB and rendering time is 2.8sec. > >> > >> While we are here, fix the version handling in fdt-init. It only matters > >> a little for the fdt-debug==1 case though. > >> > >> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> > >> --- > >> > >> Changes: > >> v3: > >> * fixed stack handling after hcall returned > >> * fixed format versions in both rendering and parsing paths > >> * rebased on top of removed unused hvcalls > >> * renamed used variables to have fdtfl- prefixes as there are already > >> some for parsing the initial dt > >> > > > > Yeah and this was confusing CAS because ibm,client-architecture-support > > uses fdt-struct, but wasn't accessing the right one... it took me some > > time to understand what was happening :-\ > > > > Maybe we should have word in fdt.fs to avoid using fdt-struct in some > > other file ? > > This would not help - these variables are global. > They do have global scope indeed, but it doesn't mean they necessarily have to be used as such... >
On 03/10/17 16:15, Alexey Kardashevskiy wrote: > This creates flatten device tree and passes it to QEMU via a custom > hypercall right before jumping to RTAS. > > On a machine with 256 CPUs and 256 virtual Intel E1000 devices the blob > is 360KB (356KB structs and 20KB of strings), building such a tree takes > ~2s on a POWER8 box. A simple tree with 1 CPU and a couple of devices > takes 38ms and creates 16KB blob. > > This preloads strings with 40 property names from CPU and PCI device nodes > and the strings lookup only searches within these. Without string reusing > at all, the strings blob is 200KB and rendering time is 1.7sec; with > unlimited reusing, the strings blob is 4KB and rendering time is 2.8sec. > > While we are here, fix the version handling in fdt-init. It only matters > a little for the fdt-debug==1 case though. > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> > --- > > Changes: > v3: > * fixed stack handling after hcall returned > * fixed format versions in both rendering and parsing paths > * rebased on top of removed unused hvcalls > * renamed used variables to have fdtfl- prefixes as there are already > some for parsing the initial dt > > v2: > * fixed comments from review > * added strings cache > * changed last_compat_vers from 0x17 to 0x16 as suggested by dwg > > --- > > I tested the blob by storing it from QEMU to a file and decompiling it; > this produces error which I do not really > understand as the name of the root is an empty string (literaly: > 00 00 00 01 00 00 00 00) and yet this error: > > aik@fstn1-p1:~$ dtc -f -I dtb -O dts -o dbg.dts dbg.dtb > ERROR (name_properties): "name" property in / is incorrect ("/" instead of base node name) > Warning: Input tree has errors, output forced > --- > lib/libhvcall/libhvcall.h | 3 +- > board-qemu/slof/fdt.fs | 284 +++++++++++++++++++++++++++++++++++++++++++++- > board-qemu/slof/rtas.fs | 7 ++ > lib/libhvcall/hvcall.code | 5 + > lib/libhvcall/hvcall.in | 1 + > 5 files changed, 297 insertions(+), 3 deletions(-) > [...] > diff --git a/board-qemu/slof/fdt.fs b/board-qemu/slof/fdt.fs > index 851645e..548cc25 100644 > --- a/board-qemu/slof/fdt.fs > +++ b/board-qemu/slof/fdt.fs > @@ -27,7 +27,7 @@ struct [...] > +: fdt-properties ( phandle -- ) > + dup encode-int s" phandle" fdt-prop > + >r > + s" " > + BEGIN > + r@ next-property > + WHILE > + 2dup > + 2dup r@ get-property > + not IF > + 2swap fdt-prop > + THEN > + REPEAT > + r> > + drop > +; Doing as below still works but brings the time for the huge guest (256 CPUs, 256 E1000) down to 366ms as it does not search for properties every time but just iterates through them, I picked this from ".properties". So we are back in business :) But I'd really love someone to explain me how that works as I am unable to parse this kind of internals myself... Segher, please. : fdt-properties ( phandle -- ) dup encode-int s" phandle" fdt-prop node>properties @ cell+ @ BEGIN dup WHILE dup link> dup >name name>string 2>r execute 2r> fdt-prop @ REPEAT drop ;
On Tue, Oct 03, 2017 at 11:13:51PM +1100, Alexey Kardashevskiy wrote: > On 03/10/17 16:15, Alexey Kardashevskiy wrote: > > This creates flatten device tree and passes it to QEMU via a custom > > hypercall right before jumping to RTAS. > > > > On a machine with 256 CPUs and 256 virtual Intel E1000 devices the blob > > is 360KB (356KB structs and 20KB of strings), building such a tree takes > > ~2s on a POWER8 box. A simple tree with 1 CPU and a couple of devices > > takes 38ms and creates 16KB blob. > > > > This preloads strings with 40 property names from CPU and PCI device nodes > > and the strings lookup only searches within these. Without string reusing > > at all, the strings blob is 200KB and rendering time is 1.7sec; with > > unlimited reusing, the strings blob is 4KB and rendering time is 2.8sec. > > > > While we are here, fix the version handling in fdt-init. It only matters > > a little for the fdt-debug==1 case though. > > > > Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> > > --- > > > > Changes: > > v3: > > * fixed stack handling after hcall returned > > * fixed format versions in both rendering and parsing paths > > * rebased on top of removed unused hvcalls > > * renamed used variables to have fdtfl- prefixes as there are already > > some for parsing the initial dt > > > > v2: > > * fixed comments from review > > * added strings cache > > * changed last_compat_vers from 0x17 to 0x16 as suggested by dwg > > > > --- > > > > I tested the blob by storing it from QEMU to a file and decompiling it; > > this produces error which I do not really > > understand as the name of the root is an empty string (literaly: > > 00 00 00 01 00 00 00 00) and yet this error: > > > > aik@fstn1-p1:~$ dtc -f -I dtb -O dts -o dbg.dts dbg.dtb > > ERROR (name_properties): "name" property in / is incorrect ("/" instead of base node name) > > Warning: Input tree has errors, output forced > > --- > > lib/libhvcall/libhvcall.h | 3 +- > > board-qemu/slof/fdt.fs | 284 +++++++++++++++++++++++++++++++++++++++++++++- > > board-qemu/slof/rtas.fs | 7 ++ > > lib/libhvcall/hvcall.code | 5 + > > lib/libhvcall/hvcall.in | 1 + > > 5 files changed, 297 insertions(+), 3 deletions(-) > > > > [...] > > > diff --git a/board-qemu/slof/fdt.fs b/board-qemu/slof/fdt.fs > > index 851645e..548cc25 100644 > > --- a/board-qemu/slof/fdt.fs > > +++ b/board-qemu/slof/fdt.fs > > @@ -27,7 +27,7 @@ struct > > [...] > > > +: fdt-properties ( phandle -- ) > > + dup encode-int s" phandle" fdt-prop > > + >r > > + s" " > > + BEGIN > > + r@ next-property > > + WHILE > > + 2dup > > + 2dup r@ get-property > > + not IF > > + 2swap fdt-prop > > + THEN > > + REPEAT > > + r> > > + drop > > +; > > > Doing as below still works but brings the time for the huge guest (256 > CPUs, 256 E1000) down to 366ms as it does not search for properties every Ah, much better. That's a cost I can live with. > time but just iterates through them, I picked this from ".properties". So > we are back in business :) But I'd really love someone to explain me how > that works as I am unable to parse this kind of internals myself... Segher, > please. > > > : fdt-properties ( phandle -- ) > dup encode-int s" phandle" fdt-prop > > node>properties @ cell+ @ > BEGIN > dup > WHILE > dup > > link> > dup >name name>string 2>r > execute > 2r> > fdt-prop > > @ > REPEAT > > drop > ; > >
On Tue, Oct 03, 2017 at 05:28:43PM +1100, David Gibson wrote: > On Tue, Oct 03, 2017 at 04:15:23PM +1100, Alexey Kardashevskiy wrote: > > I tested the blob by storing it from QEMU to a file and decompiling it; > > this produces error which I do not really > > understand as the name of the root is an empty string (literaly: > > 00 00 00 01 00 00 00 00) and yet this error: > > > > aik@fstn1-p1:~$ dtc -f -I dtb -O dts -o dbg.dts dbg.dtb > > ERROR (name_properties): "name" property in / is incorrect ("/" instead of base node name) > > Warning: Input tree has errors, output forced > > The reason for this error is it's not to do with the "node name" as > given after the FDT_BEGIN_TAG, but with the "name" property. That's > not required in FDT, but if supplied should match that node name > exactly. IIRC, however, for human readability runtime OF treats > 'name' of the root node as being '/'. SLOF usually sets the "name" property in the root node to "/". The standard actually requires "Name of system’s manufacturer and model number, e.g., “ABC,mat750”", which is also not an empty string (nor an absent property). > Arguably that is a bug in dtc - '/' as the name property in the root > is probably acceptable. On the other hand, you should probably just > omit the 'name' properties when you flatten the tree. Yup, it is one of the differences between FDT and OF device trees. Segher
On Tue, Oct 03, 2017 at 11:13:51PM +1100, Alexey Kardashevskiy wrote: > > +: fdt-properties ( phandle -- ) > > + dup encode-int s" phandle" fdt-prop > > + >r > > + s" " > > + BEGIN > > + r@ next-property > > + WHILE > > + 2dup > > + 2dup r@ get-property > > + not IF > > + 2swap fdt-prop > > + THEN > > + REPEAT > > + r> > > + drop > > +; > > > Doing as below still works but brings the time for the huge guest (256 > CPUs, 256 E1000) down to 366ms as it does not search for properties every > time but just iterates through them, I picked this from ".properties". So > we are back in business :) But I'd really love someone to explain me how > that works as I am unable to parse this kind of internals myself... Segher, > please. I'll try. : link>name cell+ ; : name>string char+ count ; A wordlist is a linked list, offset 0 of each record is a pointer to the next. At offset 1 cell is the per-name data: a flags byte, a byte that is the number of chars in the name, and the actual chars. Well. The wordlists are themselves a linked list; offset 0 holds the address of the next wordlist, and offset 1 cell holds the address of the first name record. Like: STRUCT cell FIELD wid>next cell FIELD wid>names \ the head of the list of name records END-STRUCT STRUCT cell FIELD name>next char FIELD name>flags char FIELD name>count 0 FIELD name>chars END-STRUCT The default (find) implementation (from engine.in): : ((find)) ( str len head -- 0 | link ) BEGIN dup WHILE >r 2dup r@ link>name name>string string=ci IF \ if found the name: 2drop r> EXIT THEN \ return a pointer to this name record r> @ \ follow the linked list REPEAT 3drop false ; \ nope, not found > : fdt-properties ( phandle -- ) > dup encode-int s" phandle" fdt-prop > > node>properties @ cell+ @ "properties" in a node struct is a wordlist, which is just a pointer to the first name record in that wordlist. > BEGIN > dup > WHILE > dup > > link> : link> link>name name> ; where name> moves to after the name (cell-aligned): : name> char+ dup c@ 1+ chars+ aligned ; (skip the flags byte, get the char count byte, skip that many bytes (and the count byte itself), align). > dup >name name>string 2>r > execute > 2r> > fdt-prop >name tries to find the name, given an xt. You do not need it here. > > @ > REPEAT > > drop > ; So maybe more like : fdt-property ( link -- ) dup link> execute rot link>name name>string fdt-prop ; : fdt-properties ( phandle -- ) dup encode-int s" phandle" fdt-prop node>properties @ cell+ @ BEGIN dup WHILE dup fdt-property @ REPEAT drop ; (and maybe better names? Something that makes clear it is copying properties to the flat tree). Segher
On 05/10/17 03:40, Segher Boessenkool wrote: > On Tue, Oct 03, 2017 at 11:13:51PM +1100, Alexey Kardashevskiy wrote: >>> +: fdt-properties ( phandle -- ) >>> + dup encode-int s" phandle" fdt-prop >>> + >r >>> + s" " >>> + BEGIN >>> + r@ next-property >>> + WHILE >>> + 2dup >>> + 2dup r@ get-property >>> + not IF >>> + 2swap fdt-prop >>> + THEN >>> + REPEAT >>> + r> >>> + drop >>> +; >> >> >> Doing as below still works but brings the time for the huge guest (256 >> CPUs, 256 E1000) down to 366ms as it does not search for properties every >> time but just iterates through them, I picked this from ".properties". So >> we are back in business :) But I'd really love someone to explain me how >> that works as I am unable to parse this kind of internals myself... Segher, >> please. > > I'll try. > > : link>name cell+ ; > : name>string char+ count ; > > A wordlist is a linked list, offset 0 of each record is a pointer to > the next. At offset 1 cell is the per-name data: a flags byte, a byte > that is the number of chars in the name, and the actual chars. > > Well. The wordlists are themselves a linked list; offset 0 holds the > address of the next wordlist, and offset 1 cell holds the address of > the first name record. Like: > > STRUCT > cell FIELD wid>next > cell FIELD wid>names \ the head of the list of name records > END-STRUCT > > STRUCT > cell FIELD name>next > char FIELD name>flags > char FIELD name>count > 0 FIELD name>chars > END-STRUCT These do not exist in the existing SLOF code, you just typed them here, right? I want to put these into slof/fs/node.fs right after the node struct definition with some comments about what points to what. > > > The default (find) implementation (from engine.in): > > : ((find)) ( str len head -- 0 | link ) > BEGIN dup WHILE > >r 2dup r@ link>name name>string string=ci IF \ if found the name: > 2drop r> EXIT THEN \ return a pointer to this name record > r> @ \ follow the linked list > REPEAT > 3drop false ; \ nope, not found > > >> : fdt-properties ( phandle -- ) >> dup encode-int s" phandle" fdt-prop >> >> node>properties @ cell+ @ > > "properties" in a node struct is a wordlist, which is just a pointer > to the first name record in that wordlist. ( phandle ) node>properties @ ( properties-wordlist ) cell+ @ ( properties-names-next ) And properties-names-next points to the first 'name>next' thingy. Ok. > >> BEGIN >> dup >> WHILE >> dup >> >> link> > > : link> link>name name> ; Ahhh. I was searching for 'LINK_X3E' but it is: col(LINK> LINK>NAME NAME>) Ok. Another confusion was that 'NAME' here does not point to a string but to 'name>flags'. Ok. btw what does 'lfa' stand for in stack comments? So, after 'link>', the stack is: ( properties-names-next end-of-name ) And after that end-of-name there is/are what? One 'xt' there returns data+len, where does that token come from? I assume next one is 'xt' which prints from 'ls', right? All I am getting it all wrong and there is just one xt? > where name> moves to after the name (cell-aligned): > > : name> char+ dup c@ 1+ chars+ aligned ; > (skip the flags byte, get the char count byte, skip that many bytes > (and the count byte itself), align). > >> dup >name name>string 2>r >> execute >> 2r> >> fdt-prop > >> name tries to find the name, given an xt. You do not need it here. Well, with this new knowledge - sounds like the name is right there between properties-names-next and what 'link>' returns so I do not need extra work to extract it. '>name' simply goes backwards to the beginning of the name, right (if I am reading the beginning of slof/fs/base.fs correctly)? btw '.property' still does search, what is that - room for a little improvement or I am missing something again? > >> >> @ >> REPEAT >> >> drop >> ; > > So maybe more like > > : fdt-property ( link -- ) > dup link> execute rot link>name name>string fdt-prop ; Without stack comments, I'll forget this in 2 months maximum. > : fdt-properties ( phandle -- ) > dup encode-int s" phandle" fdt-prop > node>properties @ cell+ @ BEGIN dup WHILE dup fdt-property @ REPEAT drop ; With this formatting and missing stack comments I will stop understanding it in a week :( > (and maybe better names? Something that makes clear it is copying > properties to the flat tree). It does not work like this, if you do not like the name - do not just say it, suggest a better one ;) and thanks for the lesson!
On 05/10/17 14:26, Alexey Kardashevskiy wrote: > On 05/10/17 03:40, Segher Boessenkool wrote: >> On Tue, Oct 03, 2017 at 11:13:51PM +1100, Alexey Kardashevskiy wrote: >>>> +: fdt-properties ( phandle -- ) >>>> + dup encode-int s" phandle" fdt-prop >>>> + >r >>>> + s" " >>>> + BEGIN >>>> + r@ next-property >>>> + WHILE >>>> + 2dup >>>> + 2dup r@ get-property >>>> + not IF >>>> + 2swap fdt-prop >>>> + THEN >>>> + REPEAT >>>> + r> >>>> + drop >>>> +; >>> >>> >>> Doing as below still works but brings the time for the huge guest (256 >>> CPUs, 256 E1000) down to 366ms as it does not search for properties every >>> time but just iterates through them, I picked this from ".properties". So >>> we are back in business :) But I'd really love someone to explain me how >>> that works as I am unable to parse this kind of internals myself... Segher, >>> please. >> >> I'll try. >> >> : link>name cell+ ; >> : name>string char+ count ; >> >> A wordlist is a linked list, offset 0 of each record is a pointer to >> the next. At offset 1 cell is the per-name data: a flags byte, a byte >> that is the number of chars in the name, and the actual chars. >> >> Well. The wordlists are themselves a linked list; offset 0 holds the >> address of the next wordlist, and offset 1 cell holds the address of >> the first name record. Like: >> >> STRUCT >> cell FIELD wid>next >> cell FIELD wid>names \ the head of the list of name records >> END-STRUCT >> >> STRUCT >> cell FIELD name>next >> char FIELD name>flags >> char FIELD name>count >> 0 FIELD name>chars >> END-STRUCT > > > These do not exist in the existing SLOF code, you just typed them here, right? > > I want to put these into slof/fs/node.fs right after the node struct > definition with some comments about what points to what. > > >> >> >> The default (find) implementation (from engine.in): >> >> : ((find)) ( str len head -- 0 | link ) >> BEGIN dup WHILE >> >r 2dup r@ link>name name>string string=ci IF \ if found the name: >> 2drop r> EXIT THEN \ return a pointer to this name record >> r> @ \ follow the linked list >> REPEAT >> 3drop false ; \ nope, not found >> >> >>> : fdt-properties ( phandle -- ) >>> dup encode-int s" phandle" fdt-prop >>> >>> node>properties @ cell+ @ >> >> "properties" in a node struct is a wordlist, which is just a pointer >> to the first name record in that wordlist. > > ( phandle ) > node>properties @ ( properties-wordlist ) > cell+ @ ( properties-names-next ) > > And properties-names-next points to the first 'name>next' thingy. Ok. > >> >>> BEGIN >>> dup >>> WHILE >>> dup >>> >>> link> >> >> : link> link>name name> ; > > Ahhh. I was searching for 'LINK_X3E' but it is: > col(LINK> LINK>NAME NAME>) > > Ok. Another confusion was that 'NAME' here does not point to a string but > to 'name>flags'. Ok. > > btw what does 'lfa' stand for in stack comments? > > > So, after 'link>', the stack is: > ( properties-names-next end-of-name ) > > > And after that end-of-name there is/are what? One 'xt' there returns > data+len, where does that token come from? I assume next one is 'xt' which > prints from 'ls', right? All I am getting it all wrong and there is just > one xt? > > >> where name> moves to after the name (cell-aligned): >> >> : name> char+ dup c@ 1+ chars+ aligned ; >> (skip the flags byte, get the char count byte, skip that many bytes >> (and the count byte itself), align). >> >>> dup >name name>string 2>r >>> execute >>> 2r> >>> fdt-prop >> >>> name tries to find the name, given an xt. You do not need it here. > > > Well, with this new knowledge - sounds like the name is right there between > properties-names-next and what 'link>' returns so I do not need extra work > to extract it. btw 336ms now (was 351ms), nice. See "[PATCH slof v4 2/3] fdt: Pass the resulting device tree to QEMU" commit log for all numbers.
On Thu, Oct 05, 2017 at 06:39:03PM +1100, Alexey Kardashevskiy wrote: > btw 336ms now (was 351ms), nice. See "[PATCH slof v4 2/3] fdt: Pass the > resulting device tree to QEMU" commit log for all numbers. Nice indeed :-) If you want further speedups, a juicy target is improving how the wordlists are searched, since that will help everything. Segher
On 05/10/17 23:21, Segher Boessenkool wrote: > On Thu, Oct 05, 2017 at 06:39:03PM +1100, Alexey Kardashevskiy wrote: >> btw 336ms now (was 351ms), nice. See "[PATCH slof v4 2/3] fdt: Pass the >> resulting device tree to QEMU" commit log for all numbers. > > Nice indeed :-) > > If you want further speedups, a juicy target is improving how the > wordlists are searched, since that will help everything. I was pretty happy with 1.7sec really :) With fdt, I would stop now and probably come back later to look at wordlists when/if you answer my questions 3 mails above in this thread. And is there much to improve just search? I'd think the linked list needs to be replaced with something sorted so it is a different data structure.
Hi Alexey, On Fri, Oct 06, 2017 at 12:40:49PM +1100, Alexey Kardashevskiy wrote: > On 05/10/17 23:21, Segher Boessenkool wrote: > > On Thu, Oct 05, 2017 at 06:39:03PM +1100, Alexey Kardashevskiy wrote: > >> btw 336ms now (was 351ms), nice. See "[PATCH slof v4 2/3] fdt: Pass the > >> resulting device tree to QEMU" commit log for all numbers. > > > > Nice indeed :-) > > > > If you want further speedups, a juicy target is improving how the > > wordlists are searched, since that will help everything. > > I was pretty happy with 1.7sec really :) With fdt, I would stop now and > probably come back later to look at wordlists when/if you answer my > questions 3 mails above in this thread. Good plan :-) (I hope I can find that mail, if not, just ping me?) > And is there much to improve just search? I'd think the linked list needs > to be replaced with something sorted so it is a different data structure. You still need something like a linked list because there can be two words in the same wordlist with the same name, and you can still find both in some ways, but the normal ways always should find the newer. The current scheme to accelerate lookup is very simplistic: it takes a hash of the word name, and has a cache indexed by that. The cache has just one way (no chaining, no nothing), so two conflicting names will keep fighting for the slot. If something is not found in the cache, the slow way (walked the linked lists) is used. The whole cache is blown away whenever the search order is changed, etc., too. As you see it could all be a lot faster (at the cost of complexity). It always was fast enough ;-) Some ways things could be improved: * Keep caches per word list, not just one globally. * Actually reorder the word lists when a word is found, i.e. pull a found word closer to the word list head; but you also need to maintain the original order some way, for words like MARKER or FORGET. Segher
This is the mail I wanted you to comment on. Thanks! On 05/10/17 14:26, Alexey Kardashevskiy wrote: > On 05/10/17 03:40, Segher Boessenkool wrote: >> On Tue, Oct 03, 2017 at 11:13:51PM +1100, Alexey Kardashevskiy wrote: >>>> +: fdt-properties ( phandle -- ) >>>> + dup encode-int s" phandle" fdt-prop >>>> + >r >>>> + s" " >>>> + BEGIN >>>> + r@ next-property >>>> + WHILE >>>> + 2dup >>>> + 2dup r@ get-property >>>> + not IF >>>> + 2swap fdt-prop >>>> + THEN >>>> + REPEAT >>>> + r> >>>> + drop >>>> +; >>> >>> >>> Doing as below still works but brings the time for the huge guest (256 >>> CPUs, 256 E1000) down to 366ms as it does not search for properties every >>> time but just iterates through them, I picked this from ".properties". So >>> we are back in business :) But I'd really love someone to explain me how >>> that works as I am unable to parse this kind of internals myself... Segher, >>> please. >> >> I'll try. >> >> : link>name cell+ ; >> : name>string char+ count ; >> >> A wordlist is a linked list, offset 0 of each record is a pointer to >> the next. At offset 1 cell is the per-name data: a flags byte, a byte >> that is the number of chars in the name, and the actual chars. >> >> Well. The wordlists are themselves a linked list; offset 0 holds the >> address of the next wordlist, and offset 1 cell holds the address of >> the first name record. Like: >> >> STRUCT >> cell FIELD wid>next >> cell FIELD wid>names \ the head of the list of name records >> END-STRUCT >> >> STRUCT >> cell FIELD name>next >> char FIELD name>flags >> char FIELD name>count >> 0 FIELD name>chars >> END-STRUCT > > > These do not exist in the existing SLOF code, you just typed them here, right? > > I want to put these into slof/fs/node.fs right after the node struct > definition with some comments about what points to what. > > >> >> >> The default (find) implementation (from engine.in): >> >> : ((find)) ( str len head -- 0 | link ) >> BEGIN dup WHILE >> >r 2dup r@ link>name name>string string=ci IF \ if found the name: >> 2drop r> EXIT THEN \ return a pointer to this name record >> r> @ \ follow the linked list >> REPEAT >> 3drop false ; \ nope, not found >> >> >>> : fdt-properties ( phandle -- ) >>> dup encode-int s" phandle" fdt-prop >>> >>> node>properties @ cell+ @ >> >> "properties" in a node struct is a wordlist, which is just a pointer >> to the first name record in that wordlist. > > ( phandle ) > node>properties @ ( properties-wordlist ) > cell+ @ ( properties-names-next ) > > And properties-names-next points to the first 'name>next' thingy. Ok. > >> >>> BEGIN >>> dup >>> WHILE >>> dup >>> >>> link> >> >> : link> link>name name> ; > > Ahhh. I was searching for 'LINK_X3E' but it is: > col(LINK> LINK>NAME NAME>) > > Ok. Another confusion was that 'NAME' here does not point to a string but > to 'name>flags'. Ok. > > btw what does 'lfa' stand for in stack comments? > > > So, after 'link>', the stack is: > ( properties-names-next end-of-name ) > > > And after that end-of-name there is/are what? One 'xt' there returns > data+len, where does that token come from? I assume next one is 'xt' which > prints from 'ls', right? All I am getting it all wrong and there is just > one xt? > > >> where name> moves to after the name (cell-aligned): >> >> : name> char+ dup c@ 1+ chars+ aligned ; >> (skip the flags byte, get the char count byte, skip that many bytes >> (and the count byte itself), align). >> >>> dup >name name>string 2>r >>> execute >>> 2r> >>> fdt-prop >> >>> name tries to find the name, given an xt. You do not need it here. > > > Well, with this new knowledge - sounds like the name is right there between > properties-names-next and what 'link>' returns so I do not need extra work > to extract it. > > '>name' simply goes backwards to the beginning of the name, right (if I am > reading the beginning of slof/fs/base.fs correctly)? > > btw '.property' still does search, what is that - room for a little > improvement or I am missing something again? > > >> >>> >>> @ >>> REPEAT >>> >>> drop >>> ; >> >> So maybe more like >> >> : fdt-property ( link -- ) >> dup link> execute rot link>name name>string fdt-prop ; > > Without stack comments, I'll forget this in 2 months maximum. > >> : fdt-properties ( phandle -- ) >> dup encode-int s" phandle" fdt-prop >> node>properties @ cell+ @ BEGIN dup WHILE dup fdt-property @ REPEAT drop ; > > > With this formatting and missing stack comments I will stop understanding > it in a week :( > > >> (and maybe better names? Something that makes clear it is copying >> properties to the flat tree). > > It does not work like this, if you do not like the name - do not just say > it, suggest a better one ;) > > > > and thanks for the lesson! > >
On Mon, Oct 09, 2017 at 01:06:08PM +1100, Alexey Kardashevskiy wrote: > This is the mail I wanted you to comment on. Thanks! Ah ok, thanks. > >> Well. The wordlists are themselves a linked list; offset 0 holds the > >> address of the next wordlist, and offset 1 cell holds the address of > >> the first name record. Like: > >> > >> STRUCT > >> cell FIELD wid>next > >> cell FIELD wid>names \ the head of the list of name records > >> END-STRUCT > >> > >> STRUCT > >> cell FIELD name>next > >> char FIELD name>flags > >> char FIELD name>count > >> 0 FIELD name>chars > >> END-STRUCT > > > > > > These do not exist in the existing SLOF code, you just typed them here, right? Yes, this is just a description of what the core code (engine.in) does. You do not actually want these words, certainly not in node.fs . For example, "whatever>next" is just a no-op: writing it as no code at all is faster, this is quite hot code. > >> : link> link>name name> ; > > > > Ahhh. I was searching for 'LINK_X3E' but it is: > > col(LINK> LINK>NAME NAME>) > > > > Ok. Another confusion was that 'NAME' here does not point to a string but > > to 'name>flags'. Ok. Yeah, engine.in is mostly like Forth syntax (ignoring all the immediate fields, and the words headers ("col" etc.) > > btw what does 'lfa' stand for in stack comments? lfa is "link field address". nfa is "name field address". xt is "execution token". SLOF uses a very traditional dictionary structure. Each word is laid out like this: lfa: cell, a pointer to the previous word in this list. nfa: at least one cell; a flags field, and the name of the word as a counted string, and padding to the next word boundary. xt: the "body" of the word, starting with the code field, and then for e.g. colon words a list of (mostly) xts, etc. > > So, after 'link>', the stack is: > > ( properties-names-next end-of-name ) : link> ( lfa -- xt ) > > '>name' simply goes backwards to the beginning of the name, right (if I am > > reading the beginning of slof/fs/base.fs correctly)? Yes, but it guesses: it searches back to a byte that could be the correct length byte. It is also slow. You do not want to use it (unless you have to, to get a name when only knowing the xt: it's really only for debugging things). > > btw '.property' still does search, what is that - room for a little > > improvement or I am missing something again? Probably, yes. Seems it could just use link>name instead of link> >name . Well, it is link> dup >name name>string so it could do link>name dup name> swap name>string or something like that. > >> : fdt-property ( link -- ) > >> dup link> execute rot link>name name>string fdt-prop ; > > > > Without stack comments, I'll forget this in 2 months maximum. (There is a stack comment :-P ) I don't write "what is on the stack" after every word, certainly not in short phrases, where it should be obvious. I suspect your problem reading this is that you do not know the stack signatures of link> etc. by heart. > >> : fdt-properties ( phandle -- ) > >> dup encode-int s" phandle" fdt-prop > >> node>properties @ cell+ @ BEGIN dup WHILE dup fdt-property @ REPEAT drop ; > > > > > > With this formatting and missing stack comments I will stop understanding > > it in a week :( A bit better factored then? : (fdt-properties) ( lfa -- ) BEGIN dup WHILE dup fdt-property @ REPEAT drop ; : fdt-properties ( phandle -- ) dup encode-int s" phandle" fdt-prop node>properties @ cell+ @ (fdt-properties) ; > >> (and maybe better names? Something that makes clear it is copying > >> properties to the flat tree). > > > > It does not work like this, if you do not like the name - do not just say > > it, suggest a better one ;) It certainly does work like that! Coming up with good names is the hardest part of programming (and a very important part). Maybe just names like "make-fdt-properties"? Cheers, HTH, Segher
On 10/10/17 04:28, Segher Boessenkool wrote: > On Mon, Oct 09, 2017 at 01:06:08PM +1100, Alexey Kardashevskiy wrote: >> This is the mail I wanted you to comment on. Thanks! > > Ah ok, thanks. > >>>> Well. The wordlists are themselves a linked list; offset 0 holds the >>>> address of the next wordlist, and offset 1 cell holds the address of >>>> the first name record. Like: >>>> >>>> STRUCT >>>> cell FIELD wid>next >>>> cell FIELD wid>names \ the head of the list of name records >>>> END-STRUCT >>>> >>>> STRUCT >>>> cell FIELD name>next >>>> char FIELD name>flags >>>> char FIELD name>count >>>> 0 FIELD name>chars >>>> END-STRUCT >>> >>> >>> These do not exist in the existing SLOF code, you just typed them here, right? > > Yes, this is just a description of what the core code (engine.in) does. > You do not actually want these words, certainly not in node.fs . Even declaring these is bad? Elsewhere you suggested: : fdt-properties ( phandle -- ) dup encode-int s" phandle" fdt-prop node>properties @ cell+ @ BEGIN dup WHILE dup fdt-property @ REPEAT drop ; How bad is replacing an absolutely magic 'cell+' with a kinda documented 'wid>names'? > > For example, "whatever>next" is just a no-op: writing it as no code at > all is faster, this is quite hot code. Sure, wid>next and name>next are useless in the code but they still serve a documentation purpose, and other struct fields are no no-op. > >>>> : link> link>name name> ; >>> >>> Ahhh. I was searching for 'LINK_X3E' but it is: >>> col(LINK> LINK>NAME NAME>) >>> >>> Ok. Another confusion was that 'NAME' here does not point to a string but >>> to 'name>flags'. Ok. > > Yeah, engine.in is mostly like Forth syntax (ignoring all the immediate > fields, and the words headers ("col" etc.) > >>> btw what does 'lfa' stand for in stack comments? > > lfa is "link field address". nfa is "name field address". xt is > "execution token". SLOF uses a very traditional dictionary structure. > Each word is laid out like this: > > lfa: cell, a pointer to the previous word in this list. > nfa: at least one cell; a flags field, and the name of the word as > a counted string, and padding to the next word boundary. So "nfa" is name>next + name>flags + name>count + name>chars + padding. Ok. > xt: the "body" of the word, starting with the code field, and then > for e.g. colon words a list of (mostly) xts, etc. So it is a wordlist where the very first word has the node's name and its xt returns the raw date (pointer + len), and other words do some other stuff (.print-xxxx?), is that correct? Where is that very first word compiled to the xt which returns the raw data? > >>> So, after 'link>', the stack is: >>> ( properties-names-next end-of-name ) > > : link> ( lfa -- xt ) > >>> '>name' simply goes backwards to the beginning of the name, right (if I am >>> reading the beginning of slof/fs/base.fs correctly)? > > Yes, but it guesses: it searches back to a byte that could be the > correct length byte. It is also slow. You do not want to use it > (unless you have to, to get a name when only knowing the xt: it's > really only for debugging things). > >>> btw '.property' still does search, what is that - room for a little >>> improvement or I am missing something again? > > Probably, yes. Seems it could just use link>name instead of > link> >name . > Well, it is link> dup >name name>string so it could do > link>name dup name> swap name>string or something like that. Aha! :) > >>>> : fdt-property ( link -- ) >>>> dup link> execute rot link>name name>string fdt-prop ; >>> >>> Without stack comments, I'll forget this in 2 months maximum. > > (There is a stack comment :-P ) > > I don't write "what is on the stack" after every word, certainly not > in short phrases, where it should be obvious. > > I suspect your problem reading this is that you do not know the stack > signatures of link> etc. by heart. Exactly :( >>>> : fdt-properties ( phandle -- ) >>>> dup encode-int s" phandle" fdt-prop >>>> node>properties @ cell+ @ BEGIN dup WHILE dup fdt-property @ REPEAT drop ; >>> >>> >>> With this formatting and missing stack comments I will stop understanding >>> it in a week :( > > A bit better factored then? > > : (fdt-properties) ( lfa -- ) > BEGIN dup WHILE dup fdt-property @ REPEAT drop ; > : fdt-properties ( phandle -- ) > dup encode-int s" phandle" fdt-prop > node>properties @ cell+ @ (fdt-properties) ; May be... I'll do this but I am still struggling with a single line loops. Why exactly is this style bad? : fdt-add-properties ( phandle -- ) dup encode-int s" phandle" fdt-prop node>properties @ wid>names @ BEGIN dup WHILE dup fdt-add-property @ REPEAT drop ; > >>>> (and maybe better names? Something that makes clear it is copying >>>> properties to the flat tree). >>> >>> It does not work like this, if you do not like the name - do not just say >>> it, suggest a better one ;) > > It certainly does work like that! Coming up with good names is the > hardest part of programming (and a very important part). You have a better idea what is a good forth name ;) > Maybe just names like "make-fdt-properties"? I have now fdt-add-properties and fdt-add-property, too bad? > > Cheers, HTH, > > > Segher >
On 04/10/17 12:02, David Gibson wrote: > On Tue, Oct 03, 2017 at 11:13:51PM +1100, Alexey Kardashevskiy wrote: >> On 03/10/17 16:15, Alexey Kardashevskiy wrote: >>> This creates flatten device tree and passes it to QEMU via a custom >>> hypercall right before jumping to RTAS. >>> >>> On a machine with 256 CPUs and 256 virtual Intel E1000 devices the blob >>> is 360KB (356KB structs and 20KB of strings), building such a tree takes >>> ~2s on a POWER8 box. A simple tree with 1 CPU and a couple of devices >>> takes 38ms and creates 16KB blob. >>> >>> This preloads strings with 40 property names from CPU and PCI device nodes >>> and the strings lookup only searches within these. Without string reusing >>> at all, the strings blob is 200KB and rendering time is 1.7sec; with >>> unlimited reusing, the strings blob is 4KB and rendering time is 2.8sec. >>> >>> While we are here, fix the version handling in fdt-init. It only matters >>> a little for the fdt-debug==1 case though. >>> >>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> >>> --- [...] Adding Ben to cc:. >> >> Doing as below still works but brings the time for the huge guest (256 >> CPUs, 256 E1000) down to 366ms as it does not search for properties every > > Ah, much better. That's a cost I can live with. I did some more debugging. There is a noticeable delay when booting such a big guest, I cannot really measure prom_init() as the "jiffies" symbol is not available there but a stopwatch shows 8.5sec in the guest's flatten_device_tree() which fetches the entire device tree via the client interface, node-by-node, property-by-property. With two small patches (below, for the guest kernel and for SLOF) the delay is not noticeable any more. The questions are: 1. are we interested in such optimization? 2. do I have to add lunux,phandle as well, or just "phandle" is enough (it is now for my tests) 3. what should allocate the fdt - the guest or slof? In my draft - it is the guest as it simply reserves 1MB for this and that's it 4. reserved map - SLOF puts a zero and my draft adds the real map to the end - is that hacky or ok? diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c index 613f79f03877..6102eb682c71 100644 --- a/arch/powerpc/kernel/prom_init.c +++ b/arch/powerpc/kernel/prom_init.c @@ -2466,6 +2466,19 @@ static void __init flatten_device_tree(void) prom_panic("Can't allocate initial device-tree chunk\n"); mem_end = mem_start + room; + if (!call_prom_ret("fdt-fetch", 2, 1, NULL, mem_start, room)) { + hdr = (void *) mem_start; + dt_header_start = (unsigned long)hdr; + hdr->boot_cpuid_phys = cpu_to_be32(prom.cpu); + hdr->off_mem_rsvmap = hdr->totalsize; + rsvmap = (void*)hdr + be32_to_cpu(hdr->totalsize); + hdr->totalsize = be32_to_cpu(hdr->totalsize) + sizeof(mem_reserve_map); + hdr->totalsize = cpu_to_be32(hdr->totalsize); + memcpy(rsvmap, mem_reserve_map, sizeof(mem_reserve_map)); + mem_reserve_cnt = MEM_RESERVE_MAP_SIZE; + return; + } + /* Get root of tree */ root = call_prom("peer", 1, 1, (phandle)0); if (root == (phandle)0) diff --git a/slof/fs/client.fs b/slof/fs/client.fs index 7d537a6..67976fb 100644 --- a/slof/fs/client.fs +++ b/slof/fs/client.fs @@ -308,4 +308,18 @@ ALSO client-voc DEFINITIONS : set-callback ( newfunc -- oldfunc ) client-callback @ swap client-callback ! ; +\ Curtom method to get FDT blob +: fdt-fetch ( buf len -- 0 ) + " /" find-node fdt-flatten-tree ( buf len dtb ) + dup >r + >fdth_tsize l@ ( buf len size r: dtb ) + min ( buf minsize r: dtb ) + r@ -rot + move + + r> fdt-flatten-tree-free + + 0 +; +
On Fri, Oct 13, 2017 at 07:26:02PM +1100, Alexey Kardashevskiy wrote: > diff --git a/slof/fs/client.fs b/slof/fs/client.fs > index 7d537a6..67976fb 100644 > --- a/slof/fs/client.fs > +++ b/slof/fs/client.fs > @@ -308,4 +308,18 @@ ALSO client-voc DEFINITIONS > : set-callback ( newfunc -- oldfunc ) > client-callback @ swap client-callback ! ; > > +\ Curtom method to get FDT blob s/r/s/ > +: fdt-fetch ( buf len -- 0 ) > + " /" find-node fdt-flatten-tree ( buf len dtb ) " /" find-node is device-tree @ > + dup >r > + >fdth_tsize l@ ( buf len size r: dtb ) > + min ( buf minsize r: dtb ) > + r@ -rot > + move > + > + r> fdt-flatten-tree-free > + > + 0 > +; That "min" isn't correct I think -- if the DTB doesn't fit, you want to complain loudly and/or return an error to the caller. Segher
On 13/10/17 23:45, Segher Boessenkool wrote: > On Fri, Oct 13, 2017 at 07:26:02PM +1100, Alexey Kardashevskiy wrote: >> diff --git a/slof/fs/client.fs b/slof/fs/client.fs >> index 7d537a6..67976fb 100644 >> --- a/slof/fs/client.fs >> +++ b/slof/fs/client.fs >> @@ -308,4 +308,18 @@ ALSO client-voc DEFINITIONS >> : set-callback ( newfunc -- oldfunc ) >> client-callback @ swap client-callback ! ; >> >> +\ Curtom method to get FDT blob > > s/r/s/ > >> +: fdt-fetch ( buf len -- 0 ) >> + " /" find-node fdt-flatten-tree ( buf len dtb ) > > " /" find-node is device-tree @ Hm. Lovely :) >> + dup >r >> + >fdth_tsize l@ ( buf len size r: dtb ) >> + min ( buf minsize r: dtb ) >> + r@ -rot >> + move >> + >> + r> fdt-flatten-tree-free >> + >> + 0 >> +; > > That "min" isn't correct I think -- if the DTB doesn't fit, you want to > complain loudly and/or return an error to the caller. Well, this is not a real patch, this is an RFC to hear about the idea in general. I'll add a complain here if we decide it is worth doing.
Hi Alexey, On Sat, Oct 14, 2017 at 11:24:05AM +1100, Alexey Kardashevskiy wrote: > >> + dup >r > >> + >fdth_tsize l@ ( buf len size r: dtb ) > >> + min ( buf minsize r: dtb ) > >> + r@ -rot > >> + move > >> + > >> + r> fdt-flatten-tree-free > >> + > >> + 0 > >> +; > > > > That "min" isn't correct I think -- if the DTB doesn't fit, you want to > > complain loudly and/or return an error to the caller. > > Well, this is not a real patch, this is an RFC to hear about the idea in > general. I'll add a complain here if we decide it is worth doing. Returning an error is more important (and probably enough, the kernel is in control at this point). Segher
On Tue, Oct 10, 2017 at 01:23:15PM +1100, Alexey Kardashevskiy wrote: > On 10/10/17 04:28, Segher Boessenkool wrote: > >>>> STRUCT > >>>> cell FIELD wid>next > >>>> cell FIELD wid>names \ the head of the list of name records > >>>> END-STRUCT > >>>> > >>>> STRUCT > >>>> cell FIELD name>next > >>>> char FIELD name>flags > >>>> char FIELD name>count > >>>> 0 FIELD name>chars > >>>> END-STRUCT > >>> > >>> > >>> These do not exist in the existing SLOF code, you just typed them here, right? > > > > Yes, this is just a description of what the core code (engine.in) does. > > You do not actually want these words, certainly not in node.fs . > > Even declaring these is bad? The name lookup is *the* slowest part of the engine, you do not want to make it slower for no good reason. The word fields are not a fixed-length struct, so you cannot describe that with a STRUCT (and there already are NAME>LINK LINK> NAME> words, and no code outside of the engine has any business accessing flags, count, chars directly). It's fine to document this of course. If other code wants to do unusual things to wordlists we really should abstract that to some other words, not "manually" manipulate memory (which you will still do with the "struct" definitions). Yup I'm to blame for the first occurrence of this here ;-) > Elsewhere you suggested: > > : fdt-properties ( phandle -- ) > dup encode-int s" phandle" fdt-prop > node>properties @ cell+ @ BEGIN dup WHILE dup fdt-property @ REPEAT drop ; > > How bad is replacing an absolutely magic 'cell+' with a kinda documented > 'wid>names'? It's not really good either ;-) : for-all-words ( wid xt -- ) \ xt has sig ( lfa -- ) dup >r cell+ @ BEGIN dup WHILE dup r@ execute @ REPEAT r> drop ; : fdt-copy-properties ( phandle -- ) dup encode-int s" phandle" fdt-prop node>properties @ ['] fdt-copy-property for-all-words ; Maybe it's nicer to use the nfa... dunno. > > lfa is "link field address". nfa is "name field address". xt is > > "execution token". SLOF uses a very traditional dictionary structure. > > Each word is laid out like this: > > > > lfa: cell, a pointer to the previous word in this list. > > nfa: at least one cell; a flags field, and the name of the word as > > a counted string, and padding to the next word boundary. > > So "nfa" is name>next + name>flags + name>count + name>chars + padding. Ok. \ link>name ( lfa -- nfa ) \ name> ( nfa -- xt ) \ link> ( lfa -- xt ) The name field is the flags, the count, the chars, and padding to cell size. The nfa is the address of the name field, so the address of the flags. > > xt: the "body" of the word, starting with the code field, and then > > for e.g. colon words a list of (mostly) xts, etc. > > > So it is a wordlist Nope, a wordlist is something else. I shouldn't have said list: it's just a bunch of xt's one after another in memory. Like an array. > where the very first word has the node's name and its > xt returns the raw date (pointer + len), and other words do some other > stuff (.print-xxxx?), is that correct? Where is that very first word > compiled to the xt which returns the raw data? A colon def like: : over 2dup drop ; \ a silly way to implement over of course :-) is laid out in memory like (assuming cell size 8): link \ to the previous word in this word list 00 04 6f 76 65 72 00 00 \ flags=0, "over" as counted string, padding ' DOCOL \ the address of docol; "nest" ' 2dup ' drop ' EXIT \ or SEMIS or whatever is the end of a colon \ word these days; "unnest" > > A bit better factored then? > > > > : (fdt-properties) ( lfa -- ) > > BEGIN dup WHILE dup fdt-property @ REPEAT drop ; > > : fdt-properties ( phandle -- ) > > dup encode-int s" phandle" fdt-prop > > node>properties @ cell+ @ (fdt-properties) ; > > May be... I'll do this but I am still struggling with a single line loops. > Why exactly is this style bad? > > : fdt-add-properties ( phandle -- ) > dup encode-int s" phandle" fdt-prop > > node>properties @ wid>names @ > BEGIN > dup > WHILE > dup fdt-add-property > @ > REPEAT > drop > ; Because that won't even fit more than two definitions on a screen. Rigid identation practices like that also make it impossible to use spacing to indicate where phrases end. And, importantly, the only reason you would want indents is if you are nesting too deep (i.e. at all). Most definitions should fit on a single line! > >>>> (and maybe better names? Something that makes clear it is copying > >>>> properties to the flat tree). > >>> > >>> It does not work like this, if you do not like the name - do not just say > >>> it, suggest a better one ;) > > > > It certainly does work like that! Coming up with good names is the > > hardest part of programming (and a very important part). > > You have a better idea what is a good forth name ;) If a name makes clear what the word does, it is a good name, especially if it is a nice short name as well. If it is hard to make a nice short name that describes exactly what the word does, chances are the word does too much. > I have now fdt-add-properties and fdt-add-property, too bad? I think "copy" is better than "add"? Segher
diff --git a/lib/libhvcall/libhvcall.h b/lib/libhvcall/libhvcall.h index 6356a62..3fa4398 100644 --- a/lib/libhvcall/libhvcall.h +++ b/lib/libhvcall/libhvcall.h @@ -24,7 +24,8 @@ #define KVMPPC_H_LOGICAL_MEMOP (KVMPPC_HCALL_BASE + 0x1) /* Client Architecture support */ #define KVMPPC_H_CAS (KVMPPC_HCALL_BASE + 0x2) -#define KVMPPC_HCALL_MAX KVMPPC_H_CAS +#define KVMPPC_H_UPDATE_DT (KVMPPC_HCALL_BASE + 0x3) +#define KVMPPC_HCALL_MAX KVMPPC_H_UPDATE_DT #ifndef __ASSEMBLY__ diff --git a/board-qemu/slof/fdt.fs b/board-qemu/slof/fdt.fs index 851645e..548cc25 100644 --- a/board-qemu/slof/fdt.fs +++ b/board-qemu/slof/fdt.fs @@ -27,7 +27,7 @@ struct 4 field >fdth_boot_cpu 4 field >fdth_string_size 4 field >fdth_struct_size -drop +constant /fdth h# d00dfeed constant OF_DT_HEADER h# 1 constant OF_DT_BEGIN_NODE @@ -69,7 +69,7 @@ fdt-start fdt-init dup >fdth_version l@ 3 >= IF ." strings size : 0x" dup >fdth_string_size l@ . cr THEN - dup >fdth_version l@ 17 >= IF + dup >fdth_version l@ 11 >= IF ." struct size : 0x" dup >fdth_struct_size l@ . cr THEN THEN @@ -439,4 +439,284 @@ r> drop fdt-cas-fix? ; +VARIABLE fdtfl-struct +VARIABLE fdtfl-struct-here +VARIABLE fdtfl-strings +VARIABLE fdtfl-strings-cache +VARIABLE fdtfl-strings-here +VARIABLE fdtfl-strings-reused \ debug only +VARIABLE fdt-ms \ debug only + +: fdt-skip-string ( cur -- cur ) + BEGIN + dup c@ + WHILE + 1+ + REPEAT + 4 + -4 and +; + +: zstring= ( str len zstr -- flag ) + 2dup + c@ 0<> IF + 3drop false + EXIT + THEN + swap comp 0= +; + +: fdt-find-string ( name namelen -- nameoff true | false ) + fdtfl-strings @ + BEGIN + dup fdtfl-strings-cache @ < + WHILE + 3dup zstring= IF + fdtfl-strings @ - + -rot + 2drop + true + EXIT + THEN + fdt-skip-string + REPEAT + 3drop + false +; + +: fdt-str-allot ( len -- ) fdtfl-strings-here @ + to fdtfl-strings-here ; +: fdt-str-c, ( char -- ) fdtfl-strings-here @ 1 fdt-str-allot c! ; +: fdt-str-align ( -- ) + fdtfl-strings-here @ + dup dup 4 #aligned swap - ( here bytes-to-erase ) + dup -rot + erase + fdt-str-allot +; +: fdt-str-bytes, ( data len -- ) fdtfl-strings-here @ over fdt-str-allot swap move ; +: fdt-str-ztr, ( str len -- ) fdt-str-bytes, 0 fdt-str-c, ; + +: fdt-add-string ( name namelen -- nameoff ) + fdtfl-strings-here @ -rot + fdt-str-ztr, + fdt-str-align + fdtfl-strings @ - +; + +: fdt-get-string ( name namelen -- nameoff ) + 2dup fdt-find-string IF + -rot 2drop + fdt-debug IF + 1 fdtfl-strings-reused +! + THEN + EXIT + THEN + fdt-add-string +; + +: fdt-allot ( len -- ) fdtfl-struct-here @ + to fdtfl-struct-here ; +: fdt-c, ( char -- ) fdtfl-struct-here @ 1 fdt-allot c! ; +: fdt-align ( -- ) + fdtfl-struct-here @ + dup dup 4 #aligned swap - ( here bytes-to-erase ) + dup -rot + erase + fdt-allot +; +: fdt-bytes, ( data len -- ) fdtfl-struct-here @ over fdt-allot swap move ; +: fdt-ztr, ( str len -- ) fdt-bytes, 0 fdt-c, ; +: fdt-l, ( token -- ) fdtfl-struct-here @ l! /l fdt-allot ; + +: fdt-begin-node ( name namelen -- ) + OF_DT_BEGIN_NODE fdt-l, + 2dup 1 = swap c@ [char] / = and \ is it "/"? + IF + 2drop s" " \ dtc is still unhappy though + THEN + fdt-ztr, + fdt-align +; + +: fdt-end-node ( -- ) OF_DT_END_NODE fdt-l, ; + +: fdt-prop ( prop len name namelen -- ) + OF_DT_PROP fdt-l, + + \ get string offset + fdt-get-string ( prop len nameoff ) + + \ store len and nameoff + over fdt-l, + fdt-l, ( prop len ) + + \ now store the bytes + fdt-bytes, + fdt-align +; + +: fdt-end ( -- ) OF_DT_END fdt-l, ; + +: fdt-properties ( phandle -- ) + dup encode-int s" phandle" fdt-prop + >r + s" " + BEGIN + r@ next-property + WHILE + 2dup + 2dup r@ get-property + not IF + 2swap fdt-prop + THEN + REPEAT + r> + drop +; + +: fdt-flatten-node ( node -- ) + fdt-debug 1 > IF dup node>path type cr THEN + dup node>qname fdt-begin-node + dup fdt-properties + child + BEGIN + dup + WHILE + dup recurse + peer + REPEAT + drop + fdt-end-node +; + +: fdtfl-strings-preload ( -- ) + s" reg" fdt-add-string drop + s" status" fdt-add-string drop + s" 64-bit" fdt-add-string drop + s" phandle" fdt-add-string drop + s" ibm,vmx" fdt-add-string drop + s" ibm,dfp" fdt-add-string drop + s" slb-size" fdt-add-string drop + s" ibm,purr" fdt-add-string drop + s" vendor-id" fdt-add-string drop + s" device-id" fdt-add-string drop + s" min-grant" fdt-add-string drop + s" class-code" fdt-add-string drop + s" compatible" fdt-add-string drop + s" interrupts" fdt-add-string drop + s" cpu-version" fdt-add-string drop + s" #size-cells" fdt-add-string drop + s" ibm,req#msi" fdt-add-string drop + s" revision-id" fdt-add-string drop + s" device_type" fdt-add-string drop + s" max-latency" fdt-add-string drop + s" ibm,chip-id" fdt-add-string drop + s" ibm,pft-size" fdt-add-string drop + s" ibm,slb-size" fdt-add-string drop + s" devsel-speed" fdt-add-string drop + s" ibm,loc-code" fdt-add-string drop + s" subsystem-id" fdt-add-string drop + s" d-cache-size" fdt-add-string drop + s" i-cache-size" fdt-add-string drop + s" #address-cells" fdt-add-string drop + s" clock-frequency" fdt-add-string drop + s" cache-line-size" fdt-add-string drop + s" ibm,pa-features" fdt-add-string drop + s" ibm,my-drc-index" fdt-add-string drop + s" d-cache-line-size" fdt-add-string drop + s" i-cache-line-size" fdt-add-string drop + s" assigned-addresses" fdt-add-string drop + s" d-cache-block-size" fdt-add-string drop + s" i-cache-block-size" fdt-add-string drop + s" timebase-frequency" fdt-add-string drop + s" subsystem-vendor-id" fdt-add-string drop + s" ibm,segment-page-sizes" fdt-add-string drop + s" ibm,ppc-interrupt-server#s" fdt-add-string drop + s" ibm,processor-segment-sizes" fdt-add-string drop + s" ibm,ppc-interrupt-gserver#s" fdt-add-string drop +; + +: fdt-append-blob ( bytes cur blob -- cur ) + 3dup -rot swap move + drop + +; + +: fdt-flatten-tree ( root -- tree ) + 200000 alloc-mem dup fdtfl-struct-here ! fdtfl-struct ! + 200000 alloc-mem dup fdtfl-strings-here ! fdtfl-strings ! + + fdt-debug IF + 0 fdtfl-strings-reused ! + milliseconds fdt-ms ! + THEN + + \ Preload strings cache + fdtfl-strings-preload + fdtfl-strings-here @ fdtfl-strings-cache ! + \ Render the blobs + fdt-flatten-node + fdt-end + + \ Calculate strings and struct sizes + fdtfl-struct-here @ fdtfl-struct @ - + fdtfl-strings-here @ fdtfl-strings @ - ( struct-len strings-len ) + + 2dup + /fdth + + 10 + \ Reserve 16 bytes and an empty reserved block + + fdt-debug IF + 3dup + ." FDT flat size=" .d cr + ." Strings size=" .d cr + ." Struct size=" .d cr + ." Reused strings=" fdtfl-strings-reused @ .d cr + milliseconds fdt-ms @ - + ." Took " .d ." ms" cr + THEN + + \ Allocate flatten DT blob + dup alloc-mem ( struct-len strings-len total-len fdt ) + >r ( struct-len strings-len total-len r: fdt ) + + \ Write header + OF_DT_HEADER r@ >fdth_magic l! + dup r@ >fdth_tsize l! + /fdth 10 + 2 pick + r@ >fdth_struct_off l! + /fdth 10 + r@ >fdth_string_off l! + /fdth r@ >fdth_rsvmap_off l! + 11 r@ >fdth_version l! + 10 r@ >fdth_compat_vers l! + 0 r@ >fdth_boot_cpu l! + over r@ >fdth_string_size l! + 2 pick r@ >fdth_struct_size l! + ( struct-len strings-len total-len r: fdt ) + + drop ( struct-len strings-len r: fdt ) + r@ /fdth + ( struct-len strings-len cur r: fdt ) + + \ Write the reserved entry + 0 over ! + cell+ + 0 over ! + cell+ ( struct-len strings-len cur r: fdt ) + + \ Write strings and struct blobs + fdtfl-strings @ fdt-append-blob + fdtfl-struct @ fdt-append-blob + drop + + \ Free temporary blobs + fdtfl-struct @ 200000 free-mem + fdtfl-strings @ 200000 free-mem + + \ Return fdt + r> +; + +: fdt-flatten-tree-free ( tree ) + dup >fdth_tsize l@ free-mem +; + +: fdt ( -- ) + " /" find-node + fdt-flatten-tree +; + s" /" find-node fdt-fix-phandles diff --git a/board-qemu/slof/rtas.fs b/board-qemu/slof/rtas.fs index b17157e..219bcda 100644 --- a/board-qemu/slof/rtas.fs +++ b/board-qemu/slof/rtas.fs @@ -98,6 +98,13 @@ find-qemu-rtas ; : rtas-quiesce ( -- ) + " /" find-node + fdt-flatten-tree + dup hv-update-dt ?dup IF + \ Ignore hcall not implemented error, print error otherwise + dup -2 <> IF ." HV-UPDATE-DT error: " . cr ELSE drop THEN + THEN + fdt-flatten-tree-free " quiesce" rtas-get-token rtas-cb rtas>token l! 0 rtas-cb rtas>nargs l! 0 rtas-cb rtas>nret l! diff --git a/lib/libhvcall/hvcall.code b/lib/libhvcall/hvcall.code index 744469f..5918c90 100644 --- a/lib/libhvcall/hvcall.code +++ b/lib/libhvcall/hvcall.code @@ -123,3 +123,8 @@ PRIM(check_X2d_and_X2d_patch_X2d_sc1) patch_broken_sc1((void*)start, (void*)end, (void*)patch_ins); MIRP + +PRIM(hv_X2d_update_X2d_dt) + unsigned long dt = TOS.u; + TOS.u = hv_generic(KVMPPC_H_UPDATE_DT, dt); +MIRP diff --git a/lib/libhvcall/hvcall.in b/lib/libhvcall/hvcall.in index e99d6d1..9193162 100644 --- a/lib/libhvcall/hvcall.in +++ b/lib/libhvcall/hvcall.in @@ -30,4 +30,5 @@ cod(RX!) cod(hv-logical-memop) cod(hv-cas) +cod(hv-update-dt) cod(get-print-version)
This creates flatten device tree and passes it to QEMU via a custom hypercall right before jumping to RTAS. On a machine with 256 CPUs and 256 virtual Intel E1000 devices the blob is 360KB (356KB structs and 20KB of strings), building such a tree takes ~2s on a POWER8 box. A simple tree with 1 CPU and a couple of devices takes 38ms and creates 16KB blob. This preloads strings with 40 property names from CPU and PCI device nodes and the strings lookup only searches within these. Without string reusing at all, the strings blob is 200KB and rendering time is 1.7sec; with unlimited reusing, the strings blob is 4KB and rendering time is 2.8sec. While we are here, fix the version handling in fdt-init. It only matters a little for the fdt-debug==1 case though. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> --- Changes: v3: * fixed stack handling after hcall returned * fixed format versions in both rendering and parsing paths * rebased on top of removed unused hvcalls * renamed used variables to have fdtfl- prefixes as there are already some for parsing the initial dt v2: * fixed comments from review * added strings cache * changed last_compat_vers from 0x17 to 0x16 as suggested by dwg --- I tested the blob by storing it from QEMU to a file and decompiling it; this produces error which I do not really understand as the name of the root is an empty string (literaly: 00 00 00 01 00 00 00 00) and yet this error: aik@fstn1-p1:~$ dtc -f -I dtb -O dts -o dbg.dts dbg.dtb ERROR (name_properties): "name" property in / is incorrect ("/" instead of base node name) Warning: Input tree has errors, output forced --- lib/libhvcall/libhvcall.h | 3 +- board-qemu/slof/fdt.fs | 284 +++++++++++++++++++++++++++++++++++++++++++++- board-qemu/slof/rtas.fs | 7 ++ lib/libhvcall/hvcall.code | 5 + lib/libhvcall/hvcall.in | 1 + 5 files changed, 297 insertions(+), 3 deletions(-)