Message ID | 1225958144.25986.9.camel@localhost (mailing list archive) |
---|---|
State | Not Applicable, archived |
On Thu, Nov 06, 2008 at 06:55:44PM +1100, Michael Ellerman wrote:
> This commit adds an output format, which produces python
> code. When run, the python produces a data structure that
> can then be inspected in order to do various things.
>
> Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
> ---
>
> I'm not sure if this is generally useful (or sane) but it was for me so
> I thought I'd post it.

Hrm, well the idea of language source output seems reasonable.  But
the actual data structure emitted, and the method of construction in
Python both seem a bit odd to me.

> I have a dts that I want to use to configure a simulator, and this
> seemed like the nicest way to get there. dtc spits out the pythonised
> device tree, and then I have a 10 line python script that does the
> configuring.

[snip]
> diff --git a/python.c b/python.c
> new file mode 100644
> index 0000000..8ad0433
> --- /dev/null
> +++ b/python.c

AFAICT this is based roughly on the output side of treesource.c.  It
would be kind of nice if the two could be combined, with the same
basic structure looping over the device tree, and different emitters
for either python or dts source.  This would be similar to what we do
in flattree.c to emit either binary or asm versions of the flat tree.

> @@ -0,0 +1,129 @@
> +/*
> + * (C) Copyright David Gibson <dwg@au1.ibm.com>, IBM Corporation.  2005.
> + * (C) Copyright Michael Ellerman, IBM Corporation.  2008.
> + *
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation; either version 2 of the
> + * License, or (at your option) any later version.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + *  General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License
> + *  along with this program; if not, write to the Free Software
> + *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307
> + *                                                                   USA
> + */
> +
> +#include "dtc.h"
> +#include "srcpos.h"
> +
> +
> +static void write_propval_cells(FILE *f, struct property *prop)
> +{
> +	cell_t *cp = (cell_t *)prop->val.val;
> +	int i;
> +
> +	fprintf(f, "    p = Property('%s', [", prop->name);
> +
> +	for (i = 0; i < prop->val.len / sizeof(cell_t); i++)
> +		fprintf(f, "0x%x, ", fdt32_to_cpu(cp[i]));
> +
> +	fprintf(f, "])\n");
> +}
> +
> +static int isstring(char c)
> +{
> +	return (isprint(c)
> +		|| (c == '\0')
> +		|| strchr("\a\b\t\n\v\f\r", c));
> +}
> +
> +static void write_property(FILE *f, struct property *prop)
> +{
> +	const char *p = prop->val.val;
> +	int i, strtype, len = prop->val.len;
> +
> +	if (len == 0) {
> +		fprintf(f, "    p = Property('%s', None)\n", prop->name);
> +		goto out;
> +	}
> +
> +	strtype = 1;
> +	for (i = 0; i < len; i++) {
> +		if (!isstring(p[i])) {
> +			strtype = 0;
> +			break;
> +		}
> +	}
> +
> +	if (strtype)
> +		fprintf(f, "    p = Property('%s', '%s')\n", prop->name,
> +			prop->val.val);

This isn't correct.  The property value could contain \0s or other
control characters which won't be preserved properly if emitted
directly into the python source.  You'd need to escape the string, as
write_propval_string() does in treesource.c.

Uh.. there's also an interesting ambiguity here.  In OF and flat
trees, strings are NUL-terminated and the final '\0' is included as
part of the property length.  Python strings are not NUL-terminated,
they're bytestrings that know their own length.  I think making sure
all the conversions correctly preserve the presence/lack of a terminal
NUL requires a bit more care here..

> +	else if (len == 4)
> +		fprintf(f, "    p = Property('%s', 0x%x)\n", prop->name,
> +			fdt32_to_cpu(*(const cell_t *)p));

There's a propval_cell() function in livetree.c you can use to
simplify this.

> +	else
> +		write_propval_cells(f, prop);

Uh.. this branch could be called in the case where prop is not a
string, but also doesn't have length a multiple of 4, which
write_propval_cells() won't correctly deal with.

These branches also result in the value having different Python types
depending on the context.  That's not necessarily a bad thing, but
since which Python type is chosen depends on a heuristic only, it
certainly needs some care.  You certainly need to be certain that you
can always deduce the exact, byte-for-byte correct version of the
property value from whatever you put into the Python data structure.

> +
> +out:
> +	fprintf(f, "    n.properties.append(p)\n");

So, emitting Python procedural code to build up the data structure,
rather than a great big Python literal that the Python parser will
just turn into the right thing seems a bit of a roundabout way of
doing this.

> +}
> +
> +static void write_tree_source_node(FILE *f, struct node *tree, int level)
> +{
> +	char name[MAX_NODENAME_LEN + 1] = "root";

Why not just have the root node's name be the empty string, as we do
in the flat tree?

> +	struct property *prop;
> +	struct node *child;
> +
> +	if (tree->name && (*tree->name))
> +		strncpy(name, tree->name, MAX_NODENAME_LEN);
> +
> +	fprintf(f, "    n = Node('%s', parents[-1])\n", name);
> +
> +	if (level > 0)
> +		fprintf(f, "    parents[-1].children.append(n)\n");
> +	else
> +		fprintf(f, "    root = n\n");
> +
> +	for_each_property(tree, prop)
> +		write_property(f, prop);
> +
> +	fprintf(f, "    parents.append(n)\n");
> +
> +	for_each_child(tree, child) {
> +		write_tree_source_node(f, child, level + 1);
> +	}
> +
> +	fprintf(f, "    parents.pop()\n");
> +}
> +
> +
> +static char *header = "#!/usr/bin/python\n\
> +\n\
> +class Node(object):\n\
> +    def __init__(self, name, parent, unitaddr=None):\n\

The unitaddr parameter is never used afaict.

> +        self.__dict__.update(locals())\n\
> +        self.children = []\n\
> +        self.properties = []\n\
> +\n\
> +class Property(object):\n\
> +    def __init__(self, name, value):\n\
> +        self.__dict__.update(locals())\n\
> +";
> +
> +void dt_to_python(FILE *f, struct boot_info *bi, int version)
> +{
> +	fprintf(f, "%s\n", header);
> +	fprintf(f, "def generate_tree():\n");
> +	fprintf(f, "    parents = [None]\n");
> +
> +	write_tree_source_node(f, bi->dt, 0);
> +
> +	fprintf(f, "    root.version = %d\n", version);

Since you're not emitting a flat tree, the version is not relevant
here, and should not be a parameter (again, like dt_to_source()).

> +	fprintf(f, "    return root\n");
> +}
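
To make two of the points above concrete (escape values properly, and emit one big literal instead of procedural code), here is a minimal Python sketch. It is not part of the patch and not an agreed format: the function name `emit_node` and the dict layout with `name`, `props` and `children` keys are assumptions made for illustration. Every property value is kept as raw bytes, so a terminal NUL or an odd length is preserved, and `repr()` is relied on for the escaping.

```python
import ast


def emit_node(node, depth=0):
    """Return Python source for one node as a nested dict literal.

    Hypothetical layout: {'name': str, 'props': {str: bytes}, 'children': [...]}.
    Values stay raw bytes, so no string/cell heuristic is involved.
    """
    pad = "    " * depth
    lines = [pad + "{"]
    lines.append(f"{pad}    'name': {node['name']!r},")
    lines.append(f"{pad}    'props': {{")
    for pname, raw in node["props"].items():
        # repr() of bytes escapes NULs, quotes and control characters.
        lines.append(f"{pad}        {pname!r}: {bytes(raw)!r},")
    lines.append(f"{pad}    }},")
    lines.append(f"{pad}    'children': [")
    for child in node["children"]:
        lines.append(emit_node(child, depth + 2) + ",")
    lines.append(f"{pad}    ],")
    lines.append(pad + "}")
    return "\n".join(lines)


# Root node named with the empty string, as in the flat tree.
tree = {
    "name": "",
    "props": {
        "model": b"acme,board\x00",   # terminal NUL is part of the value
        "odd": b"\x01\x02\x03",       # length is not a multiple of 4
    },
    "children": [
        {"name": "cpus", "props": {"#size-cells": b"\x00\x00\x00\x00"}, "children": []},
    ],
}

source = emit_node(tree)
# The Python parser alone turns the literal back into the same structure.
roundtrip = ast.literal_eval(source)
```

Keeping the values as bytes leaves interpretation to the consumer, which avoids having the emitter guess a type.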
On 2008-11-07 at 02:31:40, David Gibson wrote:
> On Thu, Nov 06, 2008 at 06:55:44PM +1100, Michael Ellerman wrote:
>> This commit adds an output format, which produces python
>> code. When run, the python produces a data structure that
>> can then be inspected in order to do various things.
...
>> I'm not sure if this is generally useful (or sane) but it was for me so
>> I thought I'd post it.
>
> Hrm, well the idea of language source output seems reasonable.  But
> the actual data structure emitted, and the method of construction in
> Python both seem a bit odd to me.
>
>> I have a dts that I want to use to configure a simulator, and this
>> seemed like the nicest way to get there. dtc spits out the pythonised
>> device tree, and then I have a 10 line python script that does the
>> configuring.

[snip]
> These branches also result in the value having different Python types
> depending on the context.  That's not necessarily a bad thing, but
> since which Python type is chosen depends on a heuristic only, it
> certainly needs some care.  You certainly need to be certain that you
> can always deduce the exact, byte-for-byte correct version of the
> property value from whatever you put into the Python data structure.
>> +
>> +out:
>> +	fprintf(f, "    n.properties.append(p)\n");
>
> So, emitting Python procedural code to build up the data structure,
> rather than a great big Python literal that the Python parser will
> just turn into the right thing seems a bit of a roundabout way of
> doing this.

I would think so too.  I haven't looked at the output, only at
David's comments.  If the data structure is ambiguous, then I do think
more thought is needed.

Have you considered just parsing the flat tree binary?  Either
creating a python binding to libfdt or even just parsing the dtb
directly?

I have written perl code to parse a dtb and query it for nodes and
properties, it wasn't too bad.  I need to look at a bug report by
another user and comment it, then I should seek the okays to post it.
It is currently read-only and iterative callback based (like the
kernel's early-scan-flat-tree stuff), but I have planned creating a
tree for querying, editing, and re-flattening.  Perl strings are
counted length binary blobs, so property contents are interpreted with
pack and unpack.  The library has been used to search a dtb to build a
list of cpu instances and memory blocks, and it has been used to query
the properties of a known node in the tree.

milton
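
For comparison, a read-only, callback-based walk over a dtb is fairly compact in Python as well. The following is a minimal sketch, not the perl library mentioned above: `scan_dtb` and its `on_prop` callback are hypothetical names, and it assumes a version 17 blob (ten 32-bit big-endian header words) with a 4-byte-aligned structure block.

```python
import struct

FDT_MAGIC = 0xD00DFEED
FDT_BEGIN_NODE, FDT_END_NODE, FDT_PROP, FDT_NOP, FDT_END = 1, 2, 3, 4, 9


def _align4(offset):
    """Round an offset up to the next multiple of 4."""
    return (offset + 3) & ~3


def scan_dtb(blob, on_prop):
    """Walk the structure block, calling on_prop(path, name, raw_bytes)."""
    (magic, _totalsize, off_struct, off_strings, _off_rsvmap, _version,
     _last_comp, _boot_cpu, _size_strings, _size_struct) = struct.unpack_from(">10I", blob, 0)
    if magic != FDT_MAGIC:
        raise ValueError("not a flattened device tree")

    pos = off_struct
    path = []  # node names from the root down; the root's name is ""
    while True:
        (token,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        if token == FDT_BEGIN_NODE:
            end = blob.index(b"\0", pos)           # name is NUL-terminated
            path.append(blob[pos:end].decode("ascii"))
            pos = _align4(end + 1)                 # then padded to 4 bytes
        elif token == FDT_END_NODE:
            path.pop()
        elif token == FDT_PROP:
            length, nameoff = struct.unpack_from(">II", blob, pos)
            pos += 8
            name_start = off_strings + nameoff     # name lives in the strings block
            name_end = blob.index(b"\0", name_start)
            name = blob[name_start:name_end].decode("ascii")
            on_prop("/" + "/".join(path[1:]), name, blob[pos:pos + length])
            pos = _align4(pos + length)
        elif token == FDT_NOP:
            continue
        elif token == FDT_END:
            break
        else:
            raise ValueError("unknown token 0x%x at offset %d" % (token, pos - 4))
```

The callback receives the raw property bytes untouched, so nothing about their type has to be guessed during the scan.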
On Nov 10, 2008, at 10:11 AM, Milton Miller wrote:
> On 2008-11-07 at 02:31:40, David Gibson wrote:
>> On Thu, Nov 06, 2008 at 06:55:44PM +1100, Michael Ellerman wrote:
>>> This commit adds an output format, which produces python
>>> code. When run, the python produces a data structure that
>>> can then be inspected in order to do various things.
> ...
>>> I'm not sure if this is generally useful (or sane) but it was for me so
>>> I thought I'd post it.
>>
>> Hrm, well the idea of language source output seems reasonable.  But
>> the actual data structure emitted, and the method of construction in
>> Python both seem a bit odd to me.
>>
>>> I have a dts that I want to use to configure a simulator, and this
>>> seemed like the nicest way to get there. dtc spits out the pythonised
>>> device tree, and then I have a 10 line python script that does the
>>> configuring.
>
> [snip]
>> These branches also result in the value having different Python types
>> depending on the context.  That's not necessarily a bad thing, but
>> since which Python type is chosen depends on a heuristic only, it
>> certainly needs some care.  You certainly need to be certain that you
>> can always deduce the exact, byte-for-byte correct version of the
>> property value from whatever you put into the Python data structure.
>>> +
>>> +out:
>>> +	fprintf(f, "    n.properties.append(p)\n");
>>
>> So, emitting Python procedural code to build up the data structure,
>> rather than a great big Python literal that the Python parser will
>> just turn into the right thing seems a bit of a roundabout way of
>> doing this.
>
> I would think so too.  I haven't looked at the output, only at
> David's comments.  If the data structure is ambiguous, then I do
> think more thought is needed.

There is value in the DTC (optionally) emitting a python library and
then having the DTC result use it.
It would allow for python to easily, at runtime, be able to modify the
contents and not have to inline-edit, emit, compile a DTS.

BTW: it would also be nice if the python library could dump the dts (or
even dtb)

> Have you considered just parsing the flat tree binary?  Either
> creating a python binding to libfdt or even just parsing the dtb
> directly?
>
> I have written perl code to parse a dtb and query it for nodes and
> properties, it wasn't too bad.  I need to look at a bug report by
> another user and comment it, then I should seek the okays to post it.
> It is currently read-only and iterative callback based (like the
> kernel's early-scan-flat-tree stuff), but I have planned creating a
> tree for querying, editing, and re-flattening.  Perl strings are
> counted length binary blobs, so property contents are interpreted
> with pack and unpack.  The library has been used to search a dtb to
> build a list of cpu instances and memory blocks, and it has been
> used to query the properties of a known node in the tree.
>
> milton
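
One way to read the suggestion above is a small Python-side tree that can be edited at runtime and then dumped back as dts. The sketch below is only an illustration under that reading: the `Node` class here and `dump_dts` are hypothetical and are not the classes emitted by the patch. Every non-empty value is written in the dts bytestring form `[..]`, which is less readable than strings or cells but needs no type guessing.

```python
class Node:
    """Hypothetical in-memory node: raw bytes per property, list of children."""

    def __init__(self, name=""):
        self.name = name        # the root keeps the empty name
        self.props = {}         # property name -> bytes
        self.children = []

    def to_dts(self, depth=0):
        pad = "\t" * depth
        lines = [f"{pad}{self.name or '/'} {{"]
        for pname, raw in self.props.items():
            if raw:
                hexbytes = " ".join(f"{b:02x}" for b in raw)
                lines.append(f"{pad}\t{pname} = [{hexbytes}];")
            else:
                lines.append(f"{pad}\t{pname};")   # empty property
        for child in self.children:
            lines.append(child.to_dts(depth + 1))
        lines.append(f"{pad}}};")
        return "\n".join(lines)


def dump_dts(root):
    """Return dts source text for the whole tree."""
    return "/dts-v1/;\n\n" + root.to_dts() + "\n"


# Build a tiny tree, modify it at runtime, then dump it.
root = Node()
root.props["model"] = b"hi\x00"
cpus = Node("cpus")
cpus.props["ranges"] = b""
root.children.append(cpus)
text = dump_dts(root)
```

The resulting text could then be fed back through dtc to produce a dtb, so no separate Python-to-dtb writer is needed for this sketch.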
On Nov 10, 2008, at 11:00 AM, Jimi Xenidis wrote:
> On Nov 10, 2008, at 10:11 AM, Milton Miller wrote:
>> On 2008-11-07 at 02:31:40, David Gibson wrote:
>>> On Thu, Nov 06, 2008 at 06:55:44PM +1100, Michael Ellerman wrote:
>>>> This commit adds an output format, which produces python
>>>> code. When run, the python produces a data structure that
>>>> can then be inspected in order to do various things.
>> ...
>>>> I'm not sure if this is generally useful (or sane) but it was for me so
>>>> I thought I'd post it.
>>>
>>> Hrm, well the idea of language source output seems reasonable.  But
>>> the actual data structure emitted, and the method of construction in
>>> Python both seem a bit odd to me.
>>>
>>>> I have a dts that I want to use to configure a simulator, and this
>>>> seemed like the nicest way to get there. dtc spits out the pythonised
>>>> device tree, and then I have a 10 line python script that does the
>>>> configuring.
>>
>> [snip]
>>> These branches also result in the value having different Python types
>>> depending on the context.  That's not necessarily a bad thing, but
>>> since which Python type is chosen depends on a heuristic only, it
>>> certainly needs some care.  You certainly need to be certain that you
>>> can always deduce the exact, byte-for-byte correct version of the
>>> property value from whatever you put into the Python data structure.
...
>>> So, emitting Python procedural code to build up the data structure,
>>> rather than a great big Python literal that the Python parser will
>>> just turn into the right thing seems a bit of a roundabout way of
>>> doing this.
>>
>> I would think so too.  I haven't looked at the output, only at
>> David's comments.  If the data structure is ambiguous, then I do think
>> more thought is needed.
>
> There is value in the DTC (optionally) emitting a python library and
> then having the DTC result use it.

I'm not sure what you are trying to say here, Jimi.  Are you asking
that dtc emit dtlib.py?  And then have it parse the python later?

> It would allow for python to easily, at runtime, be able to modify the
> contents and not have to inline-edit, emit, compile a DTS.

Are you saying you want to modify a device tree in some
python-specific syntax, and just dump it and have dtc understand that
format so we don't have to translate to a dts?

Admittedly this is not the impression I got when I interrogated you
over chat.  But it's still how I'm parsing this email.

> BTW: it would also be nice if the python library could dump the dts (or
> even dtb)

Ok so you want to see the standard output too.

>> Have you considered just parsing the flat tree binary?  Either
>> creating a python binding to libfdt or even just parsing the dtb
>> directly?

I know that just parsing the dtb in python (and even changing and
emitting a changed dtb) will be easier than teaching dtc to read
something that looks like python code, because I have written the perl
and have dabbled in others' python code (but I don't plan on writing
the python version).

Based on my experience with parsing dtb in perl, I think handling
property conversion in python, where one can explicitly request type
conversion by how one intends to use the property value, is preferable
to emitting the data structure in another language and relying on
heuristics to guess the right type based on its value.

I'm saying let's add decode_string / decode_int (direct translations
to pack and unpack, or just call them explicitly) to interpret the
properties rather than expect a translated python string but get a
byte array because it had some special character, or worse expect an
integer or byte array but get a string because its value happened to
look like a string.

Doing these heuristics when creating a dts is ok because the result
will still compile correctly back to a dtb -- it just makes it harder
for the human to read, not the machine to parse, but expecting another
language environment to use the result without having encode/decode
available is likely to lead to data-dependent bugs.

So then the question becomes what is the value of emitting a python
tree structure more natively for python to read versus decoding the
dtb and building the tree in python?

milton
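
The explicit-decoding idea can be sketched in a few lines of Python. The names `decode_string` and `decode_int` come from the message above; their exact behaviour here, and the extra `decode_cells` helper, are assumptions made for illustration. Each helper is a thin wrapper over `struct` and fails loudly when the raw bytes cannot be what the caller asked for.

```python
import struct


def decode_int(raw):
    """Interpret a property value as one big-endian 32-bit cell."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes, got %d" % len(raw))
    return struct.unpack(">I", raw)[0]


def decode_cells(raw):
    """Interpret a property value as a list of big-endian 32-bit cells."""
    if len(raw) % 4:
        raise ValueError("length %d is not a multiple of 4" % len(raw))
    return list(struct.unpack(">%dI" % (len(raw) // 4), raw))


def decode_string(raw):
    """Interpret a property value as one NUL-terminated ASCII string."""
    if not raw.endswith(b"\0"):
        raise ValueError("string property lacks its terminal NUL")
    return raw[:-1].decode("ascii")


# The same four bytes are both a plausible string and a plausible cell.
# A value-based heuristic has to pick one; the caller knows which it wants.
raw = b"abc\0"
as_text = decode_string(raw)
as_cell = decode_int(raw)
```

Because the caller states the intended type, a value such as `b"abc\0"` never silently changes type depending on the bytes it happens to contain.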
diff --git a/Makefile.dtc b/Makefile.dtc
index bece49b..92164de 100644
--- a/Makefile.dtc
+++ b/Makefile.dtc
@@ -12,6 +12,7 @@ DTC_SRCS = \
 	livetree.c \
 	srcpos.c \
 	treesource.c \
+	python.c \
 	util.c
 
 DTC_GEN_SRCS = dtc-lexer.lex.c dtc-parser.tab.c
diff --git a/dtc.c b/dtc.c
index 84bee2d..496aebf 100644
--- a/dtc.c
+++ b/dtc.c
@@ -92,6 +92,7 @@ static void  __attribute__ ((noreturn)) usage(void)
 	fprintf(stderr, "\t\t\tdts - device tree source text\n");
 	fprintf(stderr, "\t\t\tdtb - device tree blob\n");
 	fprintf(stderr, "\t\t\tasm - assembler source\n");
+	fprintf(stderr, "\t\t\tpy - python source\n");
 	fprintf(stderr, "\t-V <output version>\n");
 	fprintf(stderr, "\t\tBlob version to produce, defaults to %d (relevant for dtb\n\t\tand asm output only)\n", DEFAULT_FDT_VERSION);
 	fprintf(stderr, "\t-R <number>\n");
@@ -219,6 +220,8 @@ int main(int argc, char *argv[])
 		dt_to_blob(outf, bi, outversion);
 	} else if (streq(outform, "asm")) {
 		dt_to_asm(outf, bi, outversion);
+	} else if (streq(outform, "py")) {
+		dt_to_python(outf, bi, outversion);
 	} else if (streq(outform, "null")) {
 		/* do nothing */
 	} else {
diff --git a/dtc.h b/dtc.h
index 5cb9f58..45252fe 100644
--- a/dtc.h
+++ b/dtc.h
@@ -237,6 +237,7 @@ void process_checks(int force, struct boot_info *bi);
 
 void dt_to_blob(FILE *f, struct boot_info *bi, int version);
 void dt_to_asm(FILE *f, struct boot_info *bi, int version);
+void dt_to_python(FILE *f, struct boot_info *bi, int version);
 
 struct boot_info *dt_from_blob(const char *fname);
 
diff --git a/python.c b/python.c
new file mode 100644
index 0000000..8ad0433
--- /dev/null
+++ b/python.c
@@ -0,0 +1,129 @@
+/*
+ * (C) Copyright David Gibson <dwg@au1.ibm.com>, IBM Corporation.  2005.
+ * (C) Copyright Michael Ellerman, IBM Corporation.  2008.
+ *
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License, or (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307
+ *                                                                   USA
+ */
+
+#include "dtc.h"
+#include "srcpos.h"
+
+
+static void write_propval_cells(FILE *f, struct property *prop)
+{
+	cell_t *cp = (cell_t *)prop->val.val;
+	int i;
+
+	fprintf(f, "    p = Property('%s', [", prop->name);
+
+	for (i = 0; i < prop->val.len / sizeof(cell_t); i++)
+		fprintf(f, "0x%x, ", fdt32_to_cpu(cp[i]));
+
+	fprintf(f, "])\n");
+}
+
+static int isstring(char c)
+{
+	return (isprint(c)
+		|| (c == '\0')
+		|| strchr("\a\b\t\n\v\f\r", c));
+}
+
+static void write_property(FILE *f, struct property *prop)
+{
+	const char *p = prop->val.val;
+	int i, strtype, len = prop->val.len;
+
+	if (len == 0) {
+		fprintf(f, "    p = Property('%s', None)\n", prop->name);
+		goto out;
+	}
+
+	strtype = 1;
+	for (i = 0; i < len; i++) {
+		if (!isstring(p[i])) {
+			strtype = 0;
+			break;
+		}
+	}
+
+	if (strtype)
+		fprintf(f, "    p = Property('%s', '%s')\n", prop->name,
+			prop->val.val);
+	else if (len == 4)
+		fprintf(f, "    p = Property('%s', 0x%x)\n", prop->name,
+			fdt32_to_cpu(*(const cell_t *)p));
+	else
+		write_propval_cells(f, prop);
+
+out:
+	fprintf(f, "    n.properties.append(p)\n");
+}
+
+static void write_tree_source_node(FILE *f, struct node *tree, int level)
+{
+	char name[MAX_NODENAME_LEN + 1] = "root";
+	struct property *prop;
+	struct node *child;
+
+	if (tree->name && (*tree->name))
+		strncpy(name, tree->name, MAX_NODENAME_LEN);
+
+	fprintf(f, "    n = Node('%s', parents[-1])\n", name);
+
+	if (level > 0)
+		fprintf(f, "    parents[-1].children.append(n)\n");
+	else
+		fprintf(f, "    root = n\n");
+
+	for_each_property(tree, prop)
+		write_property(f, prop);
+
+	fprintf(f, "    parents.append(n)\n");
+
+	for_each_child(tree, child) {
+		write_tree_source_node(f, child, level + 1);
+	}
+
+	fprintf(f, "    parents.pop()\n");
+}
+
+
+static char *header = "#!/usr/bin/python\n\
+\n\
+class Node(object):\n\
+    def __init__(self, name, parent, unitaddr=None):\n\
+        self.__dict__.update(locals())\n\
+        self.children = []\n\
+        self.properties = []\n\
+\n\
+class Property(object):\n\
+    def __init__(self, name, value):\n\
+        self.__dict__.update(locals())\n\
+";
+
+void dt_to_python(FILE *f, struct boot_info *bi, int version)
+{
+	fprintf(f, "%s\n", header);
+	fprintf(f, "def generate_tree():\n");
+	fprintf(f, "    parents = [None]\n");
+
+	write_tree_source_node(f, bi->dt, 0);
+
+	fprintf(f, "    root.version = %d\n", version);
+	fprintf(f, "    return root\n");
+}
This commit adds an output format, which produces python
code. When run, the python produces a data structure that
can then be inspected in order to do various things.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
---

I'm not sure if this is generally useful (or sane) but it was for me so
I thought I'd post it.

I have a dts that I want to use to configure a simulator, and this
seemed like the nicest way to get there. dtc spits out the pythonised
device tree, and then I have a 10 line python script that does the
configuring.

cheers

 Makefile.dtc |    1 +
 dtc.c        |    3 +
 dtc.h        |    1 +
 python.c     |  129 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 134 insertions(+), 0 deletions(-)
 create mode 100644 python.c