Message ID | 55B1B121.7040404@redhat.com |
---|---|
State | New |
Headers | show |
Eric Blake <eblake@redhat.com> writes: > On 07/01/2015 02:22 PM, Markus Armbruster wrote: >> Caution, rough edges. > > No joke. It doesn't even compile without this fixup to a rebase snafu > (see [0] below): Uh, how did that happen? I compiled it a million times... (except for the last time, obviously). > diff --git i/scripts/qapi-types.py w/scripts/qapi-types.py > index 79e8d24..12f3767 100644 > --- i/scripts/qapi-types.py > +++ w/scripts/qapi-types.py > @@ -184,6 +184,7 @@ class QAPISchemaGenTypeVisitor(QAPISchemaVisitor): > self.fwdecl = None > self.fwdefn = None > self.btin = None > + def visit_begin(self, schema): > self.decl = '' > self.defn = '' > self.fwdecl = '' > > I already know you'll be editing this further, but here's some things to > look for. > >> >> qapi/introspect.json defines the introspection schema. It should do >> for uses other than QMP. >> FIXME it's almost entirely devoid of comments. > > It generates quite a bit of code to support itself :) > > qapi-types.c | 383 ++++++++++++++++++++++++++++ > qapi-types.h | 293 +++++++++++++++++++++ > qapi-visit.c | 797 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > qapi-visit.h | 26 + > 4 files changed, 1499 insertions(+) Can't be helped. We need the types to define query-schema properly. And then the generators are eager to generate everything that could be possibly useful for these types. Due to the hacky way I implement qmp_query_schema(), most of it shouldn't be needed in qemu proper. But it *is* needed for the new test-qmp-input-visitor test case. >> The introspection schema does not reflect all the rules and >> restrictions that apply to QAPI schemata. A valid QAPI schema has an >> introspection value conforming to the introspection schema, but the >> converse is not true. >> >> Introspection lowers away a number of schema details: >> >> * The built-in types are declared with their JSON type. >> >> TODO Should we map all the integer types to just int? > > Just int for now seems fine. It's easier to add refinement later when > we have a proven need, than it is to be stuck advertising too much > information now and being stuck with it. Matches my current thinking. >> * Implicit type definitions are made explicit, and given >> auto-generated names. These names start with ':' so they don't >> clash with the user's names. >> >> Example: a simple union implicitly defines an enumeration type for >> its discriminator. >> >> * All type references are by name. >> >> * Base types are flattened. >> >> * The top type (named 'any') can hold any value. >> >> * The struct and union types are generalized into an object type. > > You've mentioned that idea on list quite some time ago, and it appears > to be working well. > >> >> * Commands take a single argument and return a single result. >> >> Dictionary argument/result or list result is an implicit type >> definition. >> >> The empty object type is used when a command takes no arguments or >> produces no results. >> >> The argument is always of object type, but the introspection schema >> doesn't reflect that. > > And if we ever change QMP to allow non-objects for the arguments, > introspection will still work. Yes. >> The 'gen': false and 'success-response': false directives are >> omitted as implementation detail. > > I understand omitting 'gen' (hmm, I though we could get rid of it by > making netdev_add typesafe, but now this patch adds another 'gen':false > and for good reason). But 'success-response' is part of the ABI; we > already advertise it via qga's 'guest-info' command exposing > 'success-response':'bool' for each command. Adding it would allow a > single introspection command to learn everything, instead of having to > pair introspection with 'guest-info'. On the other hand, as above, it's > easier to start thin and add more later if needed, than it is to start > full and then have to support it forever. You're right, 'success-response' is not an implementation detail. At the very least, I need to come up with a different reason for omitting it, say "TODO don't omit it". >> * Events carry a single data value. >> >> Implicit type definition and empty object type use, just like for >> commands. >> >> The value is of object type, but the introspection schema doesn't >> reflect that. >> >> * Types not used by commands or events are omitted. >> >> Indirect use counts as use. >> >> * Optional members have a default, which can only be null right now >> >> Instead of a mandatory "optional" flag, we have an optional default. >> No default means mandatory, default null means optional without >> default value. Non-null is available for optional with default. >> >> Alternate members can't have defaults, but the introspection schema >> doesn't reflect that. >> >> * Clients should *not* look up types by name, because type names are >> not ABI. Look up the command or event you're interested in, then >> follow the references. >> >> TODO Should we hide the type names to eliminate the temptation? > > Might be worth doing; I see you have a later patch to experiment with it. > >> >> * Likewise, the names of alternate members are not ABI, and should not >> be examined. >> >> TODO Should we hide them, too? > > How, by making the name of an object member optional? I'd do the same as I do for non-ABI type names in my PATCH 47: replace them by obviously useless names :) >> TODO much of the above should go into docs. > > Indeed. Hence why this is RFC still. > >> >> New generator scripts/qapi-introspect.py computes an introspection >> value for its input, and generates a C variable holding it. >> >> FIXME it can generate awfully long lines > > We already have long lines in generated output, but I agree that finding > ways to break it up might be nice. Actually, This one's different in that the length is due to string literals. indent is happy to break lines for us, but not string literals. Longest line is a bit over 4KiB for me. >> A new test-qmp-input-visitor test case feeds its result for both >> tests/qapi-schema/qapi-schema-test.json and qapi-schema.json to a >> QmpInputVisitor to verify it actually conforms to the schema. >> >> New QMP command query-schema takes its return value from that >> variable. Command documentation is incomplete, and marked FIXME. Its >> reply is some 80KiBytes for me right now. > > $ ll qmp-introspect.[ch] > -rw-rw-r--. 1 eblake eblake 89655 Jul 23 17:20 qmp-introspect.c > -rw-rw-r--. 1 eblake eblake 358 Jul 23 17:20 qmp-introspect.h Source is a bit longer than the string literal it defines. >> If this turns out to be too much, we have a couple of options: >> >> * We can use shorter names in the JSON. Not the QMP style. >> >> * Optionally return the sub-schema for commands and events given as >> arguments. > > Filtering is probably worth having, but I'm not sure how easy or hard it > gets... >> >> Right now qmp_query_schema() sends the string literal computed by >> qmp-introspect.py. To compute sub-schema at run time, we'd have to >> duplicate parts of qapi-introspect.py in C. Unattractive. > > ...might be simpler than that. You can do some pre-processing, so that > the C code can take shortcuts of having pre-computed dependency > information. Instead of one giant string, you instead have an array of > structs, one array entry per schema entity: > > struct foo { > const char *name; /* example: ":empty" */ > int metatype; /* SCHEMA_META_TYPE_OBJECT */ > const char *json; /* "{ 'name': ':empty', 'meta-type': 'object', > 'members': [ ] }, " */ > > int ndepends; /* length of depends array */ > int *depends; /* indices of other entities this depends on */ > bool visited; /* helper for filtering */ > }; > > If the user requests filtering, you then do a pre-pass that sets > array[i]->visited = filter_match(), then an iterative pass that says for > every array[i]->visited, you also want to set > array[array[i]->depends[j]]->visited for each j up to > array[i]->ndepends; then finally produce output of "[" + each > array[i]->json where visited is true + "]". Not thrilled by the additional complexity... > Of course, adding filtering as a follow-on patch is the only way to go; > initial implementation is fine as global information. Agreed. >> * Let clients cache the output of query-schema. >> >> It changes only on QEMU upgrades, i.e. rarely. Provide a command >> query-schema-hash. Clients can have a cache indexed by hash, and >> re-query the schema only when they don't have it cached. > > Libvirt already caches things. I don't know if having qemu output a > hash helps libvirt, or if libvirt won't mind just eating the cost of > reading the entire array every time anything else causes it to > repopulate the cache (such as timestamp on qemu binary changing). We > can play with it later. Agreed. >> Signed-off-by: Markus Armbruster <armbru@redhat.com> >> --- > > Missing a change to docs/qapi-code-gen.txt describing the new script and > output. Will fix. >> 30 files changed, 345 insertions(+), 12 deletions(-) >> create mode 100644 qapi/introspect.json >> create mode 100644 scripts/qapi-introspect.py >> >> diff --git a/.gitignore b/.gitignore >> index aed0e1f..a6a02db 100644 >> --- a/.gitignore >> +++ b/.gitignore >> @@ -32,6 +32,7 @@ >> /qapi-visit.[ch] >> /qapi-event.[ch] >> /qmp-commands.h >> +/qmp-introspect.[ch] >> /qmp-marshal.c >> /qemu-doc.html >> /qemu-tech.html > > Missing an addition for the new tests/test-qmp-introspect.h file > leftover after 'make'. On second thought, that belongs at [2] below. > >> +++ b/monitor.c >> @@ -74,6 +74,7 @@ >> #include "block/qapi.h" >> #include "qapi/qmp-event.h" >> #include "qapi-event.h" >> +#include "qmp-introspect.h" >> #include "sysemu/block-backend.h" >> >> /* for hmp_info_irq/pic */ >> @@ -924,6 +925,20 @@ EventInfoList *qmp_query_events(Error **errp) >> return ev_list; >> } >> >> +/* >> + * Minor hack: generated marshalling suppressed for this command >> + * ('gen': false in the schema) so we can simply send the JSON string >> + * instead of first parsing it with visit_type_SchemaInfoList() into a >> + * SchemaInfoList, then unparse it right back in the generated output >> + * marshaller, every time. >> + * Instead, we do it in test-qmp-input-visitor.c, just to make sure >> + * qapi-introspect.py's output actually conforms to the schema. >> + */ >> +static void qmp_query_schema(QDict *qdict, QObject **ret_data, Error **errp) >> +{ >> + *ret_data = qobject_from_json(qmp_schema_json); > > Works for me :) Should this ever become the last user of 'gen': false, we can still go the parse - unparse route. >> +++ b/qapi/introspect.json >> @@ -0,0 +1,69 @@ >> +# -*- Mode: Python -*- >> +# >> +# QAPI introspection >> +# >> +# Copyright (C) 2015 Red Hat, Inc. >> +# >> +# Authors: >> +# Markus Armbruster <armbru@redhat.com> >> +# >> +# This work is licensed under the terms of the GNU GPL, version 2 or later. >> +# See the COPYING file in the top-level directory. >> + >> +{ 'enum': 'SchemaMetaType', >> + 'data': [ 'builtin', 'enum', 'array', 'object', 'alternate', >> + 'command', 'event' ] } >> + >> +{ 'struct': 'SchemaInfoBase', >> + 'data': { 'name': 'str', 'meta-type': 'SchemaMetaType' } } >> + >> +{ 'enum': 'JSONType', >> + 'data': [ 'string', 'number', 'int', 'boolean', 'null', >> + 'object', 'array', 'value' ] } >> + >> +{ 'struct': 'SchemaInfoBuiltin', >> + 'data': { 'json-type': 'JSONType' } } >> + >> +{ 'struct': 'SchemaInfoEnum', >> + 'data': { 'values': ['str'] } } > > Do we want to document anything about sort ordering of this list? Is it > worth sorting the array by name, to allow clients to bsearch for whether > a particular enum value is supported, rather than having to linear search? Haven't thought about it. We currently emit them in definition order, which is not sorted. Largest enums: Q_KEY_CODE_MAX = 125, BLKDEBUG_EVENT_MAX = 43, BLOCKDEV_DRIVER_MAX = 28, CHARDEV_BACKEND_KIND_MAX = 19, RUN_STATE_MAX = 15, NET_CLIENT_OPTIONS_KIND_MAX = 12, Most enums are small: median is 3. Would libvirt prefer them sorted? >> + >> +{ 'struct': 'SchemaInfoArray', >> + 'data': { 'element-type': 'str' } } >> + >> +{ 'struct': 'SchemaInfoObjectMember', >> + 'data': { 'name': 'str', 'type': 'str', '*default': 'any' } } >> +# @default's type must match @type > > or be null; as you mentioned above, this sets up a tri-state of > mandatory, optional with no default, and (for future extension) optional > with default. Yes. >> + >> +{ 'struct': 'SchemaInfoObjectVariant', >> + 'data': { 'case': 'str', >> + 'members': [ 'SchemaInfoObjectMember' ] } } > > Would it be simpler to just have: > > 'data': { 'case': 'str', 'type': 'str' } > > and make the user refer recursively to the (possibly-implicit) type for > the members? Hmmm... QAPISchemaObjectTypeVariant has members name, type, flat, and a few more that don't matter here. For a non-flat variant with name=N, type=T, my code creates { 'case': 'N', 'members': [ { 'name': 'data', 'type': 'T' } ] } This means when the tag is 'N', we have a member 'data' of type 'T'. For a flat variant, it creates { 'case': 'N', 'members': [ { ... the members of T ... } ] } This means when the tag is 'N', we have all the members of T. If I understand you correctly, you're proposing { 'case': 'N', 'type': 'T' } to mean when the tag is 'N', we have all the members of T. For the flat variant above, we'd create exactly that. T must not have variants, but the schema doesn't reflect that. For the simple variant, we'd then create { 'case': 'N', 'type': 'TT' } where TT is a new implicit object type with a single member with name data and type T. Correct? > In particular, if we ever decide to allow a flat union to have another > union as a branch, rather than the current restriction that all branches > must be structs, then referring to the type of a branch may be easier > than breaking out all members of a struct. Not sure I completely get you here. Using an object type instead of a member list is obviously more flexible, because for any member list we can make up an object type with the same meaning, but not vice versa. > And if that's the case, it > may have knock-on simplifications to your earlier patches for tracking > variants. See [1] below for more thoughts... > > Do we want to guarantee anything about the sort ordering in this list? Again, haven't thought about it. Do we expect member lists to get so large binary search is called for? >> + >> +{ 'struct': 'SchemaInfoObject', >> + 'data': { 'members': [ 'SchemaInfoObjectMember' ], >> + '*tag': 'str', >> + '*variants': [ 'SchemaInfoObjectVariant' ] } } > > or these? Same question. >> + >> +{ 'struct': 'SchemaInfoAlternate', >> + 'data': { 'members': [ 'SchemaInfoObjectMember' ] } } > > Here's an example of what you generated: > "{ 'name': 'BlockdevRef', 'meta-type': 'alternate', 'members': [ { > 'name': 'definition', 'type': 'BlockdevOptions' }, { 'name': > 'reference', 'type': 'str' } ] }, " > > I think you could get away with something simpler: > > 'data': { 'types': [ 'str' ] } > > as in: > "{ 'name': 'BlockdevRef', 'meta-type': 'alternate', 'types': [ > 'BlockdevOptions', 'str' ] }, " I.e. have a list of types instead of a list of members. Let's see what we'd lose, by enumerating the members of SchemaInfoObjectMember: * name: not ABI, should not be examined (see commit message), thus no loss. * type: kept. * default: never present (see commit message), thus no loss > the only worry is whether we might want future extensions, where we'd > want additional information per element of that array, vs. being forced > to return two arrays in parallel (arrays of structs are more extensible > than arrays of strings). > Seems like this would be just a Yes? Choices: * The only piece of information we need on an alternative right now is the type, so make members a list of types. Nice now, awkward if we ever need more, * To provide for future additions, make it a list of SchemaInfoAlternateMember, where SchemaInfoAlternateMember has just one member type now. * Reuse existing SchemaInfoObjectMember, because that's close and I'm lazy. Preferences? >> + >> +{ 'struct': 'SchemaInfoCommand', >> + 'data': { 'args': 'str', 'returns': 'str' } } > > Again, we can add 'success-return':'bool' later, if desired. We probably will. >> + >> +{ 'struct': 'SchemaInfoEvent', >> + 'data': { 'data': 'str' } } >> + >> +{ 'union': 'SchemaInfo', >> + 'base': 'SchemaInfoBase', >> + 'discriminator': 'meta-type', >> + 'data': { >> + 'builtin': 'SchemaInfoBuiltin', >> + 'enum': 'SchemaInfoEnum', >> + 'array': 'SchemaInfoArray', >> + 'object': 'SchemaInfoObject', >> + 'alternate': 'SchemaInfoAlternate', >> + 'command': 'SchemaInfoCommand', >> + 'event': 'SchemaInfoEvent' } } >> + >> +{ 'command': 'query-schema', >> + 'returns': [ 'SchemaInfo' ], >> + 'gen': false } # just to simplify qmp_query_json() > > with an optional 'data':{'*filter':['str']} as a possible future > extension. Looks nice, in spite of being under-documented! Thanks :) >> +++ b/scripts/qapi-commands.py >> @@ -265,7 +265,7 @@ class QAPISchemaGenCommandVisitor(QAPISchemaVisitor): >> self.defn = None >> self.regy = None >> self.visited_rets = None >> - def visit_begin(self): >> + def visit_begin(self, schema): > > And again my python object-oriented newness is showing through; where I > guess all children have to update signatures to still be polymorphic to > a parent adding a parameter. Maybe they need to be updated, maybe not, but updating them neatly sidesteps Python polymorphism questions from mere mortals like you and me. > Might be worth separating the patch to add the schema parameter so that > there is less churn to the existing scripts - or even rebase the series > up-front to always use the schema parameter and not change the signature > here. I'll consider it. >> self.decl = '' >> self.defn = '' >> self.regy = '' >> @@ -273,7 +273,8 @@ class QAPISchemaGenCommandVisitor(QAPISchemaVisitor): >> def visit_end(self): >> if not middle_mode: >> self.defn += gen_registry(self.regy) >> - self.regy = None >> + self.regy = None >> + self.visited_rets = None > > Is it worth squashing this reset into the patch where the visitor was > first written? I'll double-check these resets are added in the correct patch. >> def visit_command(self, name, info, args, rets, gen, success_response): >> if not gen: >> return >> diff --git a/scripts/qapi-event.py b/scripts/qapi-event.py >> index 184a81f..71da7a9 100644 >> --- a/scripts/qapi-event.py >> +++ b/scripts/qapi-event.py >> @@ -139,13 +139,14 @@ class QAPISchemaGenEventVisitor(QAPISchemaVisitor): >> self.decl = None >> self.defn = None >> self.event_names = None >> - def visit_begin(self): >> + def visit_begin(self, schema): >> self.decl = '' >> self.defn = '' >> self.event_names = [] >> def visit_end(self): >> self.decl += gen_enum(event_enum_name, self.event_names) >> self.defn += gen_enum_lookup(event_enum_name, self.event_names) >> + self.event_names = None > > and again I'll double-check. >> def visit_event(self, name, info, data): >> self.decl += gen_event_send_decl(name, data) >> self.defn += gen_event_send(name, data) >> diff --git a/scripts/qapi-introspect.py b/scripts/qapi-introspect.py >> new file mode 100644 >> index 0000000..e7efc4a >> --- /dev/null >> +++ b/scripts/qapi-introspect.py >> @@ -0,0 +1,159 @@ >> +# >> +# QAPI introspection generator >> +# >> +# Copyright (C) 2015 Red Hat, Inc. >> +# >> +# Authors: >> +# Markus Armbruster <armbru@redhat.com> >> +# >> +# This work is licensed under the terms of the GNU GPL, version 2. >> +# See the COPYING file in the top-level directory. >> + >> +from qapi import * >> + >> +class QAPISchemaGenIntrospectVisitor(QAPISchemaVisitor): >> + def __init__(self): >> + self.schema = None >> + self.jsons = None >> + self.used_types = None >> + self.defn = None >> + self.decl = None >> + >> + def visit_begin(self, schema): >> + self.schema = schema >> + self.jsons = [] >> + self.used_types = [] >> + return QAPISchemaType # don't visit types for now >> + >> + def visit_end(self): >> + # visit the types that are actually used >> + for typ in self.used_types: >> + typ.visit(self) >> + self.jsons.sort() >> + name = prefix + 'qmp_schema_json' >> + self.decl = mcgen(''' >> +extern char %(c_name)s[]; > > Missing 'const' Will fix. >> +''', >> + c_name=c_name(name)) >> + self.defn = mcgen(''' >> +char %(c_name)s[] = "[" >> + "%(c_jsons)s]"; > > And again. Also, I'd consider putting the "]" on its own line, like the > "[" was, so that you can more easily cut and paste individual lines of > generated output (but since JSON doesn't allow trailing comma, I guess > the last line is still always going to be special). That's my reason for keepint the "]" there. >> +''', >> + c_name=c_name(name), >> + c_jsons=', "\n "'.join(self.jsons)) > > Cool syntax :) Possibly bordering on too cool :) >> + self.schema = None >> + self.jsons = None >> + self.used_types = None >> + >> + def _use_type(self, typ): >> + if typ not in self.used_types: >> + self.used_types.append(typ) >> + return typ.name >> + >> + def _gen_json(self, name, mtype, extra): >> + self.jsons.append("{ 'name': '%s', 'meta-type': '%s', %s }" >> + % (name, mtype, extra)) > > Problem. We document that our QMP engine accepts 'single-quoted' input > as an extension to JSON, but that our output will always be > "double-quoted". That means you _must_ write \" everywhere in the > generated C code. (Also, you set the style guide earlier of using '' > quoting around any text destined for generated C code) > > self.jsons.append('''{ \"name\": \"%s\", \"meta-type\": \"%s\", %s }''' > % (name, mtype, extra)) > > (at least, I assume '''...\"...''' is nicer than '...\\\"...') You're right, I shouldn't emit single-quoted JSON. >> + >> + def _gen_members(self, members): >> + return ("'members': [ " >> + + ", ".join([self._gen_member(m) for m in members]) >> + + " ]") > > and more choice-of-quotes issues. Lots more below; I'll quit pointing > them out on this round of review. > >> + >> + def _gen_member(self, member): >> + default = '' >> + if member.optional: >> + default = ", 'default': null" >> + return "{ 'name': '%s', 'type': '%s'%s }" \ >> + % (member.name, self._use_type(member.type), default) >> + >> + def _gen_variants(self, tag_name, variants): >> + return ("'tag': '%s'" % tag_name >> + + ", 'variants': [ " >> + + ", ".join([self._gen_variant(v) for v in variants]) >> + + " ]") >> + >> + def _gen_variant(self, variant): >> + if variant.flat: >> + members = self._gen_members(variant.type.members) >> + else: >> + members = "'members': [ { 'name': 'data', 'type': '%s' } ]" \ >> + % self._use_type(variant.type) >> + return "{ 'case': '%s', %s }" % (variant.name, members) > > [1] Ah, so .flat is still in use here, to avoid having to create > implicit types everywhere. But if we create implicit types for simple > unions, and just track variants by their case name/type instead of case > name/[members], it will allow us to have a union as a case branch (I > don't know that we need that much flexibility), and not have to worry > about exposing .flat everywhere. It may even result in a smaller JSON > string (you'd have to play with it to know for sure). We should try hard to get the introspection schema right from the start. But the internal representation is malleable. I know we can implement non-flat unions completely in terms of flat ones (and that means no .flat), but I also know I failed at doing it in this series. I'm not sure your idea will do the trick completely. It's fine, we can finish the job later. >> + >> + def visit_builtin_type(self, name, info, json_type): >> + self._gen_json(name, 'builtin', >> + "'json-type': '%s'" % json_type) >> + >> + def visit_enum_type(self, name, info, values): >> + self._gen_json(name, 'enum', >> + "'values': [ %s ]" % ", ".join(["'%s'" % v >> + for v in values])) >> + >> + def visit_array_type(self, name, info, element_type): >> + self._gen_json(name, 'array', >> + "'element-type': '%s'" % self._use_type(element_type)) >> + >> + def visit_object_type_flat(self, name, info, members, variants): >> + extra = self._gen_members(members) >> + if variants: >> + extra += ", " + self._gen_variants(variants.tag_name or "type", >> + variants.variants) >> + self._gen_json(name, 'object', extra) >> + > > Do we really need two types of object visitors, or can you reuse the > signature all the other visitors have? The other generators all work with high-level information: base, local members, variants. qapi-introspect.py wants a different, more low-level view of the members: flattened members (including inherited members and any implicit tag member). I could shoehorn both views into a single visitor function, by passing both views, .base + .local_members, and .members. All implementations will use only one of the views, but it's not immediately obvious which one. So I chose to have two visitor functions. Matter of taste. >> + def visit_alternate_type(self, name, info, variants): >> + self._gen_json(name, 'alternate', >> + self._gen_members(variants.variants)) >> + >> + def visit_command(self, name, info, args, rets, gen, success_response): >> + args = args or self.schema.the_empty_object_type >> + rets = rets or self.schema.the_empty_object_type >> + self._gen_json(name, 'command', >> + "'args': '%s', 'returns': '%s'" \ >> + % (self._use_type(args), self._use_type(rets))) >> + >> + def visit_event(self, name, info, data): >> + data = data or self.schema.the_empty_object_type >> + self._gen_json(name, 'event', "'data': '%s'" % self._use_type(data)) >> + >> +(input_file, output_dir, do_c, do_h, prefix, dummy) = parse_command_line() >> + > >> +++ b/scripts/qapi-types.py >> @@ -184,7 +184,6 @@ class QAPISchemaGenTypeVisitor(QAPISchemaVisitor): >> self.fwdecl = None >> self.fwdefn = None >> self.btin = None >> - def visit_begin(self): >> self.decl = '' > > Rebase botch mentioned above [0] Will fix. >> self.defn = '' >> self.fwdecl = '' >> diff --git a/scripts/qapi-visit.py b/scripts/qapi-visit.py >> index 2813bb3..7a03292 100644 >> --- a/scripts/qapi-visit.py >> +++ b/scripts/qapi-visit.py >> @@ -326,7 +326,7 @@ class QAPISchemaGenVisitVisitor(QAPISchemaVisitor): >> self.decl = None >> self.defn = None >> self.btin = None >> - def visit_begin(self): >> + def visit_begin(self, schema): >> self.decl = '' >> self.defn = '' >> self.btin = guardstart('QAPI_VISIT_BUILTIN_VISITOR_DECL') >> diff --git a/scripts/qapi.py b/scripts/qapi.py >> index 6ddb33e..7f9a159 100644 >> --- a/scripts/qapi.py >> +++ b/scripts/qapi.py >> @@ -764,7 +764,7 @@ class QAPISchemaEntity(object): >> pass >> >> class QAPISchemaVisitor(object): >> - def visit_begin(self): >> + def visit_begin(self, schema): > > I don't know enough python to know if making schema optional in the > parent class affects what the child class is allowed to implement while > still overriding things. Not sure I get you here. >> pass >> def visit_end(self): >> pass >> @@ -776,6 +776,8 @@ class QAPISchemaVisitor(object): >> pass >> def visit_object_type(self, name, info, base, members, variants): >> pass >> + def visit_object_type_flat(self, name, info, members, variants): >> + pass >> def visit_alternate_type(self, name, info, variants): >> pass >> def visit_command(self, name, info, args, rets, gen, success_response): >> @@ -892,6 +894,8 @@ class QAPISchemaObjectType(QAPISchemaType): >> def visit(self, visitor): >> visitor.visit_object_type(self.name, self.info, >> self.base, self.local_members, self.variants) >> + visitor.visit_object_type_flat(self.name, self.info, >> + self.members, self.variants) > > Instead of having two object visitor signatures, would it be worth > having a boolean parameter to QAPISchema.visit() that says whether to > pass base, members as (base type, local members) vs. (None, all members)? Then semantics of the single function depends on state set elsewhere. Doesn't feel simpler to me. >> class QAPISchemaObjectTypeMember(object): >> def __init__(self, name, typ, optional): >> @@ -1042,6 +1046,9 @@ class QAPISchema(object): >> ('bool', 'boolean', 'bool', 'false'), >> ('any', 'value', 'QObject' + pointer_suffix , 'NULL')]: >> self._def_builtin_type(*t) >> + self.the_empty_object_type = QAPISchemaObjectType(':empty', None, None, >> + [], None) > > Cool global. Worth adding in a separate patch? Three-liner patch without a user. I'd expect reviewers suggesting to squash it into the patch adding a user :) >> + self._def_entity(self.the_empty_object_type) >> >> def _make_implicit_enum_type(self, name, values): >> name = name + 'Kind' >> @@ -1180,9 +1187,10 @@ class QAPISchema(object): >> ent.check(self) >> >> def visit(self, visitor): >> - visitor.visit_begin() >> + ignore = visitor.visit_begin(self) >> for name in sorted(self.entity_dict.keys()): >> - self.entity_dict[name].visit(visitor) >> + if not ignore or not isinstance(self.entity_dict[name], ignore): >> + self.entity_dict[name].visit(visitor) > > So this lets introspection bypass visiting types on the first pass, and > then collect used types during visit_end(). It means you can only > bypass a single metatype, but that is sufficient for your use; I don't > know if there is any better idiom for this paradigm. I toyed around with several solutions for my problem, and this one came out simplest. I don't exactly love it, but I dislike it less than the other ones. >> visitor.visit_end() >> >> # >> diff --git a/tests/.gitignore b/tests/.gitignore >> index dc813c2..dda86cc 100644 >> --- a/tests/.gitignore >> +++ b/tests/.gitignore >> @@ -19,6 +19,7 @@ test-opts-visitor >> test-qapi-event.[ch] >> test-qapi-types.[ch] >> test-qapi-visit.[ch] >> +test-qapi-introspect.[ch] > > [2] Ah, maybe this is the file that wasn't quite right. Anything I need to fix? >> test-qdev-global-props >> test-qemu-opts >> test-qmp-commands >> diff --git a/tests/Makefile b/tests/Makefile >> index 60b82e2..7b1bf92 100644 >> --- a/tests/Makefile >> +++ b/tests/Makefile >> @@ -253,7 +253,8 @@ check-qapi-schema-y := $(addprefix tests/qapi-schema/, \ >> struct-base-clash.json struct-base-clash-deep.json ) >> >> GENERATED_HEADERS += tests/test-qapi-types.h tests/test-qapi-visit.h \ >> - tests/test-qmp-commands.h tests/test-qapi-event.h >> + tests/test-qmp-commands.h tests/test-qapi-event.h \ >> + tests/test-qmp-introspect.h >> >> test-obj-y = tests/check-qint.o tests/check-qstring.o tests/check-qdict.o \ >> tests/check-qlist.o tests/check-qfloat.o tests/check-qjson.o \ >> @@ -266,7 +267,7 @@ test-obj-y = tests/check-qint.o tests/check-qstring.o tests/check-qdict.o \ >> tests/rcutorture.o tests/test-rcu-list.o >> >> test-qapi-obj-y = tests/test-qapi-visit.o tests/test-qapi-types.o \ >> - tests/test-qapi-event.o >> + tests/test-qapi-event.o tests/test-qmp-introspect.o >> >> $(test-obj-y): QEMU_INCLUDES += -Itests >> QEMU_CFLAGS += -I$(SRC_PATH)/tests >> @@ -327,6 +328,11 @@ $(SRC_PATH)/tests/qapi-schema/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-eve >> $(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-event.py \ >> $(gen-out-type) -o tests -p "test-" $<, \ >> " GEN $@") >> +tests/test-qmp-introspect.c tests/test-qmp-introspect.h :\ >> +$(SRC_PATH)/tests/qapi-schema/qapi-schema-test.json $(SRC_PATH)/scripts/qapi-introspect.py $(qapi-py) >> + $(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-introspect.py \ >> + $(gen-out-type) -o tests -p "test-" $<, \ >> + " GEN $@") >> >> tests/test-string-output-visitor$(EXESUF): tests/test-string-output-visitor.o $(test-qapi-obj-y) libqemuutil.a libqemustub.a >> tests/test-string-input-visitor$(EXESUF): tests/test-string-input-visitor.o $(test-qapi-obj-y) libqemuutil.a libqemustub.a >> diff --git a/tests/qapi-schema/alternate-good.out b/tests/qapi-schema/alternate-good.out >> index 0cbdfa1..aede1ae 100644 >> --- a/tests/qapi-schema/alternate-good.out >> +++ b/tests/qapi-schema/alternate-good.out >> @@ -1,3 +1,4 @@ >> +object :empty >> alternate Alt > > Again, adding the magic :empty object in a separate patch from the new > introspection code may help minimize the number of files being touched > with the new code. Okay, it's not a three-liner, it's a three-liner plus a trivial expected test output churn. I'll think about it. >> +++ b/tests/test-qmp-input-visitor.c >> @@ -17,6 +17,9 @@ >> #include "qapi/qmp-input-visitor.h" >> #include "test-qapi-types.h" >> #include "test-qapi-visit.h" >> +#include "test-qmp-introspect.h" >> +#include "qmp-introspect.h" >> +#include "qapi-visit.h" >> #include "qapi/qmp/types.h" >> >> typedef struct TestInputVisitorData { >> @@ -660,6 +663,31 @@ static void test_visitor_in_native_list_number(TestInputVisitorData *data, >> qapi_free_UserDefNativeListUnion(cvalue); >> } >> >> +static void do_test_visitor_in_qmp_introspect(TestInputVisitorData *data, >> + const char *schema_json) >> +{ >> + SchemaInfoList *schema = NULL; >> + Error *err = NULL; >> + Visitor *v; >> + >> + v = visitor_input_test_init_raw(data, schema_json); >> + >> + visit_type_SchemaInfoList(v, &schema, NULL, &err); >> + if (err) >> + fprintf(stderr, "%s", error_get_pretty(err)); >> + g_assert(!err); > > Don't you want: > visit_type_SchemaInfoList(..., &error_abort); I'm mimicking existing usage in this file. >> + g_assert(schema); >> + >> + qapi_free_SchemaInfoList(schema); >> +} >> + >> +static void test_visitor_in_qmp_introspect(TestInputVisitorData *data, >> + const void *unused) >> +{ >> + do_test_visitor_in_qmp_introspect(data, test_qmp_schema_json); >> + do_test_visitor_in_qmp_introspect(data, qmp_schema_json); >> +} >> + >> static void input_visitor_test_add(const char *testpath, >> TestInputVisitorData *data, >> void (*test_func)(TestInputVisitorData *data, const void *user_data)) >> @@ -753,6 +781,9 @@ int main(int argc, char **argv) >> input_visitor_test_add("/visitor/input/native_list/number", >> &in_visitor_data, >> test_visitor_in_native_list_number); >> + input_visitor_test_add("/visitor/input/qmp_introspect", >> + &in_visitor_data, >> + test_visitor_in_qmp_introspect); >> >> g_test_run(); >> >> > > Starting to shape up nicely. Thanks! I really appreciate your review. Must have been a big chunk of work.
On 07/28/2015 08:33 AM, Markus Armbruster wrote: > Eric Blake <eblake@redhat.com> writes: > >> On 07/01/2015 02:22 PM, Markus Armbruster wrote: >>> Caution, rough edges. >> >> No joke. It doesn't even compile without this fixup to a rebase snafu >> (see [0] below): > > Uh, how did that happen? I compiled it a million times... (except for > the last time, obviously). At least you're not alone; I've done dumb things like that, too. And at least it was easy to figure out, so I could test it. >>> >>> FIXME it can generate awfully long lines >> >> We already have long lines in generated output, but I agree that finding >> ways to break it up might be nice. Actually, > > This one's different in that the length is due to string literals. > indent is happy to break lines for us, but not string literals. > > Longest line is a bit over 4KiB for me. > If we break up string literals, at least use some indentation to make it obvious that multiple lines merge to a single array entry. For example (after patch 47): ... "{ 'name': ':abr', 'meta-type': 'object', " "'members': [ " "{ 'name': 'device', 'type': ':acg', 'default': null }, " "{ 'name': 'node-name', 'type': ':acg', 'default': null }, " "{ 'name': 'snapshot-file', 'type': ':acg' }, " "{ 'name': 'snapshot-node-name', 'type': ':acg', 'default': null }, " "{ 'name': 'format', 'type': ':acg', 'default': null }, " "{ 'name': 'mode', 'type': ':afo', 'default': null } ] }, " "{ 'name': ... " Oh, and just thought of something while typing: although patch 47 masks type names from the end user, it would be VERY worthwhile if the C code had strategic comments that pointed back to the qapi type name: "{ 'name': ':abr', " /* TypeFoo */ "'meta-type': 'object'" ... (and then there's the '' vs. "" issue which impacts the output, as well). And if you do the "," between entries as a separate line, that might impact whether you do a trailing "]" on a line by its own (see [3] below). >>> + >>> +{ 'struct': 'SchemaInfoEnum', >>> + 'data': { 'values': ['str'] } } >> >> Do we want to document anything about sort ordering of this list? Is it >> worth sorting the array by name, to allow clients to bsearch for whether >> a particular enum value is supported, rather than having to linear search? > > Haven't thought about it. We currently emit them in definition order, > which is not sorted. Largest enums: > > Q_KEY_CODE_MAX = 125, > BLKDEBUG_EVENT_MAX = 43, > BLOCKDEV_DRIVER_MAX = 28, > CHARDEV_BACKEND_KIND_MAX = 19, > RUN_STATE_MAX = 15, > NET_CLIENT_OPTIONS_KIND_MAX = 12, > > Most enums are small: median is 3. > > Would libvirt prefer them sorted? Libvirt can probably live without sorting of enum constants (searching Q_KEY_CODE_MAX isn't going to be that noticeable of an impact on O(n) vs. O(logn); I predict much more time will be spent searching for type "xyz" referenced by command "abc" from the overall array, and repeating that search for multiple feature probes). But if we do want sorting, we need it up front (it will be easier to decide to use bsearch down the road if we are guaranteed that output is sorted, than it would be to try and learn whether whether we are talking to a new vs. old qemu to learn if sorting is in effect because it was added as an after-thought). And if we don't want sorting, documenting that data is NOT guaranteed to be position-dependent, in spite of being in a JSON array, is a nice touch. >>> + >>> +{ 'struct': 'SchemaInfoObjectVariant', >>> + 'data': { 'case': 'str', >>> + 'members': [ 'SchemaInfoObjectMember' ] } } >> >> Would it be simpler to just have: >> >> 'data': { 'case': 'str', 'type': 'str' } >> >> and make the user refer recursively to the (possibly-implicit) type for >> the members? > > Hmmm... > > QAPISchemaObjectTypeVariant has members name, type, flat, and a few more > that don't matter here. > > For a non-flat variant with name=N, type=T, my code creates > > { 'case': 'N', 'members': [ { 'name': 'data', 'type': 'T' } ] } > > This means when the tag is 'N', we have a member 'data' of type 'T'. > > For a flat variant, it creates > > { 'case': 'N', 'members': [ { ... the members of T ... } ] } > > This means when the tag is 'N', we have all the members of T. > > If I understand you correctly, you're proposing > > { 'case': 'N', 'type': 'T' } > > to mean when the tag is 'N', we have all the members of T. For the flat > variant above, we'd create exactly that. > > T must not have variants, but the schema doesn't reflect that. That's our current restriction, but it's one we might decide to lift in the future. Having a type with two different discriminators could get a bit weird to think about, but doesn't seem to be technically impossible. > > For the simple variant, we'd then create > > { 'case': 'N', 'type': 'TT' } > > where TT is a new implicit object type with a single member with name > data and type T. > > Correct? Yes. > >> In particular, if we ever decide to allow a flat union to have another >> union as a branch, rather than the current restriction that all branches >> must be structs, then referring to the type of a branch may be easier >> than breaking out all members of a struct. > > Not sure I completely get you here. Using an object type instead of a > member list is obviously more flexible, because for any member list we > can make up an object type with the same meaning, but not vice versa. Indeed, and that's the restriction that I mention we might someday want to lift. Or, in (modified, due to inline {} types) qapi terms, if we ever want to allow: { 'union': 'Helper', 'data': { 'number': 'int', 'string': 'str' } } { 'enum': 'MyEnum', 'data': [ 'a', 'b' ] } { 'union': 'Main', 'base': { 'switch': 'MyEnum' }, 'discriminator': 'switch', 'data': { 'a': { 'boolean': 'bool' }, 'b': 'Helper' } } { 'command': 'foo', 'data': { 'data': 'Main' } } then that would permit QMP invocations of: { "execute": "foo", "arguments": { "data": { "switch": "a", "boolean": true } } } { "execute": "foo", "arguments": { "data": { "switch": "b", "type": "number", "data": 1 } } } { "execute": "foo", "arguments": { "data": { "switch": "b", "type": "string", "data": "hello" } } } which we can express as a list of case/types for the primary variants of 'Main' (those types in turn refer to the secondary variants of type Helper), but which we cannot express as a [ 'SchemaInfoObjectMember' ] list, because the type of the "data" member depends on the secondary discriminator that is called into use on the "b" case of the primary discriminator. So I think we're saying the same thing, that a [ 'SchemaInfoObjectMember' ] can always be written as a reference to an object type name, but not all object type names can be broken back into an array of SchemaInfoObjectMember (only those types that are pure structs without variants); and that although we currently do not allow sub-variants within a union, we should not get in the way of that being a possible future extension. > >> And if that's the case, it >> may have knock-on simplifications to your earlier patches for tracking >> variants. See [1] below for more thoughts... >> >> Do we want to guarantee anything about the sort ordering in this list? > > Again, haven't thought about it. > > Do we expect member lists to get so large binary search is called for? Probably not, since such a list would be unwieldy for both client and server. We tend to add boxing and optional sub-structs rather than direct parameters if we have that much information to pass along (think about how adding throttling parameters to a new block device was done with a single top-level parameter pointing to a throttling sub-struct, rather than adding lots of throttling parameters at top level). But, as with enum sorting, actually documenting our choice will help cement the expectations of clients on what they have to do when learning if a parameter was added. > >>> + >>> +{ 'struct': 'SchemaInfoObject', >>> + 'data': { 'members': [ 'SchemaInfoObjectMember' ], >>> + '*tag': 'str', >>> + '*variants': [ 'SchemaInfoObjectVariant' ] } } >> >> or these? > > Same question. > Here, if enums are sorted, then case branches within variants should be sorted. If enums are unsorted, then I'm fine if case branches are also unsorted (and possibly in a different order than the enum was), but be sure to document that. >>> + >>> +{ 'struct': 'SchemaInfoAlternate', >>> + 'data': { 'members': [ 'SchemaInfoObjectMember' ] } } >> >> Here's an example of what you generated: >> "{ 'name': 'BlockdevRef', 'meta-type': 'alternate', 'members': [ { >> 'name': 'definition', 'type': 'BlockdevOptions' }, { 'name': >> 'reference', 'type': 'str' } ] }, " >> >> I think you could get away with something simpler: >> >> 'data': { 'types': [ 'str' ] } >> >> as in: >> "{ 'name': 'BlockdevRef', 'meta-type': 'alternate', 'types': [ >> 'BlockdevOptions', 'str' ] }, " > > I.e. have a list of types instead of a list of members. > > Let's see what we'd lose, by enumerating the members of > SchemaInfoObjectMember: > > * name: not ABI, should not be examined (see commit message), thus no > loss. > > * type: kept. > > * default: never present (see commit message), thus no loss > >> the only worry is whether we might want future extensions, where we'd >> want additional information per element of that array, vs. being forced >> to return two arrays in parallel (arrays of structs are more extensible >> than arrays of strings). >> Seems like this would be just a > > Yes? > > Choices: > > * The only piece of information we need on an alternative right now is > the type, so make members a list of types. Nice now, awkward if we > ever need more, > > * To provide for future additions, make it a list of > SchemaInfoAlternateMember, where SchemaInfoAlternateMember has just > one member type now. > > * Reuse existing SchemaInfoObjectMember, because that's close and I'm > lazy. > > Preferences? At this point, my vote is with a new SchemaInfoAlternateMember class (SchemaInfoObjectMember may diverge in a different direction, and it would eliminate the question of how to not expose the branch names as ABI; but keeping things as a (one-member, for now) dictionary will allow future extensions). >>> @@ -265,7 +265,7 @@ class QAPISchemaGenCommandVisitor(QAPISchemaVisitor): >>> self.defn = None >>> self.regy = None >>> self.visited_rets = None >>> - def visit_begin(self): >>> + def visit_begin(self, schema): >> >> And again my python object-oriented newness is showing through; where I >> guess all children have to update signatures to still be polymorphic to >> a parent adding a parameter. > > Maybe they need to be updated, maybe not, but updating them neatly > sidesteps Python polymorphism questions from mere mortals like you and > me. see reference to [4] below > >>> +''', >>> + c_name=c_name(name)) >>> + self.defn = mcgen(''' >>> +char %(c_name)s[] = "[" >>> + "%(c_jsons)s]"; >> >> And again. Also, I'd consider putting the "]" on its own line, like the >> "[" was, so that you can more easily cut and paste individual lines of >> generated output (but since JSON doesn't allow trailing comma, I guess >> the last line is still always going to be special). > > That's my reason for keepint the "]" there. > [3] and that works for me, unless we want to break long lines into shorter ones by virtue of putting "," (and thus "]") on their own line to make it even more obvious where we are breaking between elements of the overall array. >>> +''', >>> + c_name=c_name(name), >>> + c_jsons=', "\n "'.join(self.jsons)) >> >> Cool syntax :) > > Possibly bordering on too cool :) Java was the first language I encountered where "".foo() was valid syntax; sometimes, I wish C had made strings a first class type. I'm fine with it as it is. >> [1] Ah, so .flat is still in use here, to avoid having to create >> implicit types everywhere. But if we create implicit types for simple >> unions, and just track variants by their case name/type instead of case >> name/[members], it will allow us to have a union as a case branch (I >> don't know that we need that much flexibility), and not have to worry >> about exposing .flat everywhere. It may even result in a smaller JSON >> string (you'd have to play with it to know for sure). > > We should try hard to get the introspection schema right from the start. > > But the internal representation is malleable. I know we can implement > non-flat unions completely in terms of flat ones (and that means no > .flat), but I also know I failed at doing it in this series. I'm not > sure your idea will do the trick completely. It's fine, we can finish > the job later. Yes, I've come to that conclusion myself - doesn't matter what we do on the first round internally as long as the output is right; cleaning up the internals can come later. > I could shoehorn both views into a single visitor function, by passing > both views, .base + .local_members, and .members. All implementations > will use only one of the views, but it's not immediately obvious which > one. So I chose to have two visitor functions. Matter of taste. I can live with it (documentation that sub-classes should override at most only one of the two visitors might help the cause, though). >>> +++ b/scripts/qapi.py >>> @@ -764,7 +764,7 @@ class QAPISchemaEntity(object): >>> pass >>> >>> class QAPISchemaVisitor(object): >>> - def visit_begin(self): >>> + def visit_begin(self, schema): >> >> I don't know enough python to know if making schema optional in the >> parent class affects what the child class is allowed to implement while >> still overriding things. > > Not sure I get you here. It was an idle musing on whether class QAPISchemaVisitor(object): def visit_begin(self, schema=None): would permit: class QAPIIntrospectVisitor(QAPISchemaVisitor): def visit_begin(self): with proper polymorphism. But you've already come to the conclusion above [4] that it's easier to not mess with optional parameters (leave that for the python gurus), and that mere mortals are better off using something that obviously works instead of requiring knowledge about language internals. >>> +++ b/tests/.gitignore >>> @@ -19,6 +19,7 @@ test-opts-visitor >>> test-qapi-event.[ch] >>> test-qapi-types.[ch] >>> test-qapi-visit.[ch] >>> +test-qapi-introspect.[ch] >> >> [2] Ah, maybe this is the file that wasn't quite right. > > Anything I need to fix? The generated files created by 'make check' wer named test-qmp-introspect.[ch] (either s/qapi/qmp/ here, or else fix a Makefile rule to generate the desired name). > > Thanks! I really appreciate your review. Must have been a big chunk of > work. Yes, doing a thorough review took me the better part of a week to go through it all, so I can only imagine how much time you've invested in it. Hopefully v3 will be easier to review, because it will be diffs against this version and mainly focusing on whether review comments were addressed.
Eric Blake <eblake@redhat.com> writes: > On 07/28/2015 08:33 AM, Markus Armbruster wrote: >> Eric Blake <eblake@redhat.com> writes: >> >>> On 07/01/2015 02:22 PM, Markus Armbruster wrote: >>>> Caution, rough edges. >>> >>> No joke. It doesn't even compile without this fixup to a rebase snafu >>> (see [0] below): >> >> Uh, how did that happen? I compiled it a million times... (except for >> the last time, obviously). > > At least you're not alone; I've done dumb things like that, too. And at > least it was easy to figure out, so I could test it. > >>>> >>>> FIXME it can generate awfully long lines >>> >>> We already have long lines in generated output, but I agree that finding >>> ways to break it up might be nice. Actually, >> >> This one's different in that the length is due to string literals. >> indent is happy to break lines for us, but not string literals. >> >> Longest line is a bit over 4KiB for me. >> > > If we break up string literals, at least use some indentation to make it > obvious that multiple lines merge to a single array entry. For example > (after patch 47): > > ... > "{ 'name': ':abr', 'meta-type': 'object', " > "'members': [ " > "{ 'name': 'device', 'type': ':acg', 'default': null }, " > "{ 'name': 'node-name', 'type': ':acg', 'default': null }, " > "{ 'name': 'snapshot-file', 'type': ':acg' }, " > "{ 'name': 'snapshot-node-name', 'type': ':acg', 'default': null > }, " > "{ 'name': 'format', 'type': ':acg', 'default': null }, " > "{ 'name': 'mode', 'type': ':afo', 'default': null } ] }, " > "{ 'name': ... " Unconventional indentation, but if it helps the reader... > Oh, and just thought of something while typing: although patch 47 masks > type names from the end user, it would be VERY worthwhile if the C code > had strategic comments that pointed back to the qapi type name: > > "{ 'name': ':abr', " /* TypeFoo */ > "'meta-type': 'object'" > ... If we do that for uses and definitions, we don't need a flag to generate nice typenames. But doing it for uses might complicate the code too much. > (and then there's the '' vs. "" issue which impacts the output, as > well). And if you do the "," between entries as a separate line, that > might impact whether you do a trailing "]" on a line by its own (see [3] > below). > >>>> + >>>> +{ 'struct': 'SchemaInfoEnum', >>>> + 'data': { 'values': ['str'] } } >>> >>> Do we want to document anything about sort ordering of this list? Is it >>> worth sorting the array by name, to allow clients to bsearch for whether >>> a particular enum value is supported, rather than having to linear search? >> >> Haven't thought about it. We currently emit them in definition order, >> which is not sorted. Largest enums: >> >> Q_KEY_CODE_MAX = 125, >> BLKDEBUG_EVENT_MAX = 43, >> BLOCKDEV_DRIVER_MAX = 28, >> CHARDEV_BACKEND_KIND_MAX = 19, >> RUN_STATE_MAX = 15, >> NET_CLIENT_OPTIONS_KIND_MAX = 12, >> >> Most enums are small: median is 3. >> >> Would libvirt prefer them sorted? > > Libvirt can probably live without sorting of enum constants (searching > Q_KEY_CODE_MAX isn't going to be that noticeable of an impact on O(n) > vs. O(logn); I predict much more time will be spent searching for type > "xyz" referenced by command "abc" from the overall array, and repeating > that search for multiple feature probes). > > But if we do want sorting, we need it up front (it will be easier to > decide to use bsearch down the road if we are guaranteed that output is > sorted, than it would be to try and learn whether whether we are talking > to a new vs. old qemu to learn if sorting is in effect because it was > added as an after-thought). I agree adding sorting later is prone to add another instance of rarely used, bit-rotting backward-compatibility code to libvirt, along with the logic whether to use it. Best avoided. > And if we don't want sorting, documenting > that data is NOT guaranteed to be position-dependent, in spite of being > in a JSON array, is a nice touch. What do you mean by "position-dependent"? >>>> + >>>> +{ 'struct': 'SchemaInfoObjectVariant', >>>> + 'data': { 'case': 'str', >>>> + 'members': [ 'SchemaInfoObjectMember' ] } } >>> >>> Would it be simpler to just have: >>> >>> 'data': { 'case': 'str', 'type': 'str' } >>> >>> and make the user refer recursively to the (possibly-implicit) type for >>> the members? >> >> Hmmm... >> >> QAPISchemaObjectTypeVariant has members name, type, flat, and a few more >> that don't matter here. >> >> For a non-flat variant with name=N, type=T, my code creates >> >> { 'case': 'N', 'members': [ { 'name': 'data', 'type': 'T' } ] } >> >> This means when the tag is 'N', we have a member 'data' of type 'T'. >> >> For a flat variant, it creates >> >> { 'case': 'N', 'members': [ { ... the members of T ... } ] } >> >> This means when the tag is 'N', we have all the members of T. >> >> If I understand you correctly, you're proposing >> >> { 'case': 'N', 'type': 'T' } >> >> to mean when the tag is 'N', we have all the members of T. For the flat >> variant above, we'd create exactly that. >> >> T must not have variants, but the schema doesn't reflect that. > > That's our current restriction, but it's one we might decide to lift in > the future. Having a type with two different discriminators could get a > bit weird to think about, but doesn't seem to be technically impossible. > >> >> For the simple variant, we'd then create >> >> { 'case': 'N', 'type': 'TT' } >> >> where TT is a new implicit object type with a single member with name >> data and type T. >> >> Correct? > > Yes. I'm starting to like it. >>> In particular, if we ever decide to allow a flat union to have another >>> union as a branch, rather than the current restriction that all branches >>> must be structs, then referring to the type of a branch may be easier >>> than breaking out all members of a struct. >> >> Not sure I completely get you here. Using an object type instead of a >> member list is obviously more flexible, because for any member list we >> can make up an object type with the same meaning, but not vice versa. > > Indeed, and that's the restriction that I mention we might someday want > to lift. Or, in (modified, due to inline {} types) qapi terms, if we > ever want to allow: > > { 'union': 'Helper', 'data': { 'number': 'int', 'string': 'str' } } > { 'enum': 'MyEnum', 'data': [ 'a', 'b' ] } > { 'union': 'Main', 'base': { 'switch': 'MyEnum' }, > 'discriminator': 'switch', 'data': { 'a': { 'boolean': 'bool' }, > 'b': 'Helper' } } > { 'command': 'foo', 'data': { 'data': 'Main' } } > > then that would permit QMP invocations of: > { "execute": "foo", "arguments": { "data": { > "switch": "a", "boolean": true } } } > { "execute": "foo", "arguments": { "data": { > "switch": "b", "type": "number", "data": 1 } } } > { "execute": "foo", "arguments": { "data": { > "switch": "b", "type": "string", "data": "hello" } } } > > which we can express as a list of case/types for the primary variants of > 'Main' (those types in turn refer to the secondary variants of type > Helper), but which we cannot express as a [ 'SchemaInfoObjectMember' ] > list, because the type of the "data" member depends on the secondary > discriminator that is called into use on the "b" case of the primary > discriminator. > > So I think we're saying the same thing, that a [ > 'SchemaInfoObjectMember' ] can always be written as a reference to an > object type name, but not all object type names can be broken back into > an array of SchemaInfoObjectMember (only those types that are pure > structs without variants); and that although we currently do not allow > sub-variants within a union, we should not get in the way of that being > a possible future extension. > >> >>> And if that's the case, it >>> may have knock-on simplifications to your earlier patches for tracking >>> variants. See [1] below for more thoughts... >>> >>> Do we want to guarantee anything about the sort ordering in this list? >> >> Again, haven't thought about it. >> >> Do we expect member lists to get so large binary search is called for? > > Probably not, since such a list would be unwieldy for both client and > server. We tend to add boxing and optional sub-structs rather than > direct parameters if we have that much information to pass along (think > about how adding throttling parameters to a new block device was done > with a single top-level parameter pointing to a throttling sub-struct, > rather than adding lots of throttling parameters at top level). > > But, as with enum sorting, actually documenting our choice will help > cement the expectations of clients on what they have to do when learning > if a parameter was added. We may want to adopt a consistent rule on sorting stuff. >>>> + >>>> +{ 'struct': 'SchemaInfoObject', >>>> + 'data': { 'members': [ 'SchemaInfoObjectMember' ], >>>> + '*tag': 'str', >>>> + '*variants': [ 'SchemaInfoObjectVariant' ] } } >>> >>> or these? >> >> Same question. >> > > Here, if enums are sorted, then case branches within variants should be > sorted. If enums are unsorted, then I'm fine if case branches are also > unsorted (and possibly in a different order than the enum was), Okay. > but be > sure to document that. Documentation I produce tends to err on the side of brevity, sometimes too much. If something is meant to produce sorted results, I document the sortedness. If not, I habitually say nothing. Since nothing's said, you better assume nothing. Please keep calling out instances of excessive brevity :) >>>> + >>>> +{ 'struct': 'SchemaInfoAlternate', >>>> + 'data': { 'members': [ 'SchemaInfoObjectMember' ] } } >>> >>> Here's an example of what you generated: >>> "{ 'name': 'BlockdevRef', 'meta-type': 'alternate', 'members': [ { >>> 'name': 'definition', 'type': 'BlockdevOptions' }, { 'name': >>> 'reference', 'type': 'str' } ] }, " >>> >>> I think you could get away with something simpler: >>> >>> 'data': { 'types': [ 'str' ] } >>> >>> as in: >>> "{ 'name': 'BlockdevRef', 'meta-type': 'alternate', 'types': [ >>> 'BlockdevOptions', 'str' ] }, " >> >> I.e. have a list of types instead of a list of members. >> >> Let's see what we'd lose, by enumerating the members of >> SchemaInfoObjectMember: >> >> * name: not ABI, should not be examined (see commit message), thus no >> loss. >> >> * type: kept. >> >> * default: never present (see commit message), thus no loss >> >>> the only worry is whether we might want future extensions, where we'd >>> want additional information per element of that array, vs. being forced >>> to return two arrays in parallel (arrays of structs are more extensible >>> than arrays of strings). >>> Seems like this would be just a >> >> Yes? >> >> Choices: >> >> * The only piece of information we need on an alternative right now is >> the type, so make members a list of types. Nice now, awkward if we >> ever need more, >> >> * To provide for future additions, make it a list of >> SchemaInfoAlternateMember, where SchemaInfoAlternateMember has just >> one member type now. >> >> * Reuse existing SchemaInfoObjectMember, because that's close and I'm >> lazy. >> >> Preferences? > > At this point, my vote is with a new SchemaInfoAlternateMember class > (SchemaInfoObjectMember may diverge in a different direction, and it > would eliminate the question of how to not expose the branch names as > ABI; but keeping things as a (one-member, for now) dictionary will allow > future extensions). Since Kevin also questioned my reuse of SchemaInfoObjectMember for alternates in his review of RFC v1, let's go with a separate SchemaInfoAlternateMember. >>>> @@ -265,7 +265,7 @@ class QAPISchemaGenCommandVisitor(QAPISchemaVisitor): >>>> self.defn = None >>>> self.regy = None >>>> self.visited_rets = None >>>> - def visit_begin(self): >>>> + def visit_begin(self, schema): >>> >>> And again my python object-oriented newness is showing through; where I >>> guess all children have to update signatures to still be polymorphic to >>> a parent adding a parameter. >> >> Maybe they need to be updated, maybe not, but updating them neatly >> sidesteps Python polymorphism questions from mere mortals like you and >> me. > > see reference to [4] below > >> >>>> +''', >>>> + c_name=c_name(name)) >>>> + self.defn = mcgen(''' >>>> +char %(c_name)s[] = "[" >>>> + "%(c_jsons)s]"; >>> >>> And again. Also, I'd consider putting the "]" on its own line, like the >>> "[" was, so that you can more easily cut and paste individual lines of >>> generated output (but since JSON doesn't allow trailing comma, I guess >>> the last line is still always going to be special). >> >> That's my reason for keepint the "]" there. >> > > [3] and that works for me, unless we want to break long lines into > shorter ones by virtue of putting "," (and thus "]") on their own line > to make it even more obvious where we are breaking between elements of > the overall array. > >>>> +''', >>>> + c_name=c_name(name), >>>> + c_jsons=', "\n "'.join(self.jsons)) >>> >>> Cool syntax :) >> >> Possibly bordering on too cool :) > > Java was the first language I encountered where "".foo() was valid > syntax; sometimes, I wish C had made strings a first class type. I'm > fine with it as it is. > >>> [1] Ah, so .flat is still in use here, to avoid having to create >>> implicit types everywhere. But if we create implicit types for simple >>> unions, and just track variants by their case name/type instead of case >>> name/[members], it will allow us to have a union as a case branch (I >>> don't know that we need that much flexibility), and not have to worry >>> about exposing .flat everywhere. It may even result in a smaller JSON >>> string (you'd have to play with it to know for sure). >> >> We should try hard to get the introspection schema right from the start. >> >> But the internal representation is malleable. I know we can implement >> non-flat unions completely in terms of flat ones (and that means no >> .flat), but I also know I failed at doing it in this series. I'm not >> sure your idea will do the trick completely. It's fine, we can finish >> the job later. > > Yes, I've come to that conclusion myself - doesn't matter what we do on > the first round internally as long as the output is right; cleaning up > the internals can come later. > > >> I could shoehorn both views into a single visitor function, by passing >> both views, .base + .local_members, and .members. All implementations >> will use only one of the views, but it's not immediately obvious which >> one. So I chose to have two visitor functions. Matter of taste. > > I can live with it (documentation that sub-classes should override at > most only one of the two visitors might help the cause, though). Actually, overriding both would be just fine. We just haven't had a use for that. >>>> +++ b/scripts/qapi.py >>>> @@ -764,7 +764,7 @@ class QAPISchemaEntity(object): >>>> pass >>>> >>>> class QAPISchemaVisitor(object): >>>> - def visit_begin(self): >>>> + def visit_begin(self, schema): >>> >>> I don't know enough python to know if making schema optional in the >>> parent class affects what the child class is allowed to implement while >>> still overriding things. >> >> Not sure I get you here. > > It was an idle musing on whether > > class QAPISchemaVisitor(object): > def visit_begin(self, schema=None): > > would permit: > class QAPIIntrospectVisitor(QAPISchemaVisitor): > def visit_begin(self): > > with proper polymorphism. But you've already come to the conclusion > above [4] that it's easier to not mess with optional parameters (leave > that for the python gurus), and that mere mortals are better off using > something that obviously works instead of requiring knowledge about > language internals. > > >>>> +++ b/tests/.gitignore >>>> @@ -19,6 +19,7 @@ test-opts-visitor >>>> test-qapi-event.[ch] >>>> test-qapi-types.[ch] >>>> test-qapi-visit.[ch] >>>> +test-qapi-introspect.[ch] >>> >>> [2] Ah, maybe this is the file that wasn't quite right. >> >> Anything I need to fix? > > The generated files created by 'make check' wer named > test-qmp-introspect.[ch] (either s/qapi/qmp/ here, or else fix a > Makefile rule to generate the desired name). Scales falling from my eyes... >> Thanks! I really appreciate your review. Must have been a big chunk of >> work. > > Yes, doing a thorough review took me the better part of a week to go > through it all, so I can only imagine how much time you've invested in > it. Hopefully v3 will be easier to review, because it will be diffs > against this version and mainly focusing on whether review comments were > addressed. I'll do my best to keep the diffs in check.
On 07/29/2015 03:19 AM, Markus Armbruster wrote: >>> Longest line is a bit over 4KiB for me. >>> >> >> If we break up string literals, at least use some indentation to make it >> obvious that multiple lines merge to a single array entry. For example >> (after patch 47): >> >> ... >> "{ 'name': ':abr', 'meta-type': 'object', " >> "'members': [ " >> "{ 'name': 'device', 'type': ':acg', 'default': null }, " >> "{ 'name': 'node-name', 'type': ':acg', 'default': null }, " >> "{ 'name': 'snapshot-file', 'type': ':acg' }, " >> "{ 'name': 'snapshot-node-name', 'type': ':acg', 'default': null >> }, " >> "{ 'name': 'format', 'type': ':acg', 'default': null }, " >> "{ 'name': 'mode', 'type': ':afo', 'default': null } ] }, " >> "{ 'name': ... " > > Unconventional indentation, but if it helps the reader... I'm not a stickler about the particular spacing I used, so much as demonstrating an idea. Pick any indentation you like; I was just demonstrating that some well-chosen line breaks, coupled with visual clues on what belongs together, can help in reading the string literal in the generated file. In fact, doesn't python have a way to pretty-print JSON, and then post-process the pretty-printed string to add C \" escaping? >> And if we don't want sorting, documenting >> that data is NOT guaranteed to be position-dependent, in spite of being >> in a JSON array, is a nice touch. > > What do you mean by "position-dependent"? If qemu 2.5 has { 'struct':'Foo', 'data': { 'b': 'int', 'c': 'int' } }, then qemu 2.6 adds '*a': 'int' to the end of that list, then either we guarantee sorting (if you read the members and see 'b' first, then you know 'a' was not added) or we don't (you must read the entire list to see if 'a' has been added; and you cannot assume that 'a' will be last even if it was listed last in the .json file of 2.6); the position in the .json file need not determine the position in the introspection output. >> But, as with enum sorting, actually documenting our choice will help >> cement the expectations of clients on what they have to do when learning >> if a parameter was added. > > We may want to adopt a consistent rule on sorting stuff. To be consistent, the rule would be that either everything is sorted, or nothing is; and if we choose nothing to be sorted, we are unlikely to ever want to add sorting in the future. >>> I could shoehorn both views into a single visitor function, by passing >>> both views, .base + .local_members, and .members. All implementations >>> will use only one of the views, but it's not immediately obvious which >>> one. So I chose to have two visitor functions. Matter of taste. >> >> I can live with it (documentation that sub-classes should override at >> most only one of the two visitors might help the cause, though). > > Actually, overriding both would be just fine. We just haven't had a use > for that. I guess it's hard to predict why a visitor would want to have both callbacks called for every object, so we can't outright state that it is useless. The whole point of a visitor interface is to provide flexibility in designing a visitor later down the road :)
Eric Blake <eblake@redhat.com> writes: > On 07/29/2015 03:19 AM, Markus Armbruster wrote: >>>> Longest line is a bit over 4KiB for me. >>>> >>> >>> If we break up string literals, at least use some indentation to make it >>> obvious that multiple lines merge to a single array entry. For example >>> (after patch 47): >>> >>> ... >>> "{ 'name': ':abr', 'meta-type': 'object', " >>> "'members': [ " >>> "{ 'name': 'device', 'type': ':acg', 'default': null }, " >>> "{ 'name': 'node-name', 'type': ':acg', 'default': null }, " >>> "{ 'name': 'snapshot-file', 'type': ':acg' }, " >>> "{ 'name': 'snapshot-node-name', 'type': ':acg', 'default': null >>> }, " >>> "{ 'name': 'format', 'type': ':acg', 'default': null }, " >>> "{ 'name': 'mode', 'type': ':afo', 'default': null } ] }, " >>> "{ 'name': ... " >> >> Unconventional indentation, but if it helps the reader... > > I'm not a stickler about the particular spacing I used, so much as > demonstrating an idea. Pick any indentation you like; I was just > demonstrating that some well-chosen line breaks, coupled with visual > clues on what belongs together, can help in reading the string literal > in the generated file. > > In fact, doesn't python have a way to pretty-print JSON, and then > post-process the pretty-printed string to add C \" escaping? Interesting idea, definitely worth a doc search. Prettier output can of course be punted to a followup-patch. >>> And if we don't want sorting, documenting >>> that data is NOT guaranteed to be position-dependent, in spite of being >>> in a JSON array, is a nice touch. >> >> What do you mean by "position-dependent"? > > If qemu 2.5 has { 'struct':'Foo', 'data': { 'b': 'int', 'c': 'int' } }, > then qemu 2.6 adds '*a': 'int' to the end of that list, then either we > guarantee sorting (if you read the members and see 'b' first, then you > know 'a' was not added) or we don't (you must read the entire list to > see if 'a' has been added; and you cannot assume that 'a' will be last > even if it was listed last in the .json file of 2.6); the position in > the .json file need not determine the position in the introspection output. Got it. >>> But, as with enum sorting, actually documenting our choice will help >>> cement the expectations of clients on what they have to do when learning >>> if a parameter was added. >> >> We may want to adopt a consistent rule on sorting stuff. > > To be consistent, the rule would be that either everything is sorted, or > nothing is; and if we choose nothing to be sorted, we are unlikely to > ever want to add sorting in the future. Agree. If sorting everything turns out to be cheap (in complexity; other metrics don't matter much), I'm inclined to sort. >>>> I could shoehorn both views into a single visitor function, by passing >>>> both views, .base + .local_members, and .members. All implementations >>>> will use only one of the views, but it's not immediately obvious which >>>> one. So I chose to have two visitor functions. Matter of taste. >>> >>> I can live with it (documentation that sub-classes should override at >>> most only one of the two visitors might help the cause, though). >> >> Actually, overriding both would be just fine. We just haven't had a use >> for that. > > I guess it's hard to predict why a visitor would want to have both > callbacks called for every object, so we can't outright state that it is > useless. The whole point of a visitor interface is to provide > flexibility in designing a visitor later down the road :)
Markus Armbruster <armbru@redhat.com> writes: > Eric Blake <eblake@redhat.com> writes: > >> On 07/29/2015 03:19 AM, Markus Armbruster wrote: >>>>> Longest line is a bit over 4KiB for me. >>>>> >>>> >>>> If we break up string literals, at least use some indentation to make it >>>> obvious that multiple lines merge to a single array entry. For example >>>> (after patch 47): >>>> >>>> ... >>>> "{ 'name': ':abr', 'meta-type': 'object', " >>>> "'members': [ " >>>> "{ 'name': 'device', 'type': ':acg', 'default': null }, " >>>> "{ 'name': 'node-name', 'type': ':acg', 'default': null }, " >>>> "{ 'name': 'snapshot-file', 'type': ':acg' }, " >>>> "{ 'name': 'snapshot-node-name', 'type': ':acg', 'default': null >>>> }, " >>>> "{ 'name': 'format', 'type': ':acg', 'default': null }, " >>>> "{ 'name': 'mode', 'type': ':afo', 'default': null } ] }, " >>>> "{ 'name': ... " >>> >>> Unconventional indentation, but if it helps the reader... >> >> I'm not a stickler about the particular spacing I used, so much as >> demonstrating an idea. Pick any indentation you like; I was just >> demonstrating that some well-chosen line breaks, coupled with visual >> clues on what belongs together, can help in reading the string literal >> in the generated file. >> >> In fact, doesn't python have a way to pretty-print JSON, and then >> post-process the pretty-printed string to add C \" escaping? > > Interesting idea, definitely worth a doc search. Module json, new in 2.6 (Oct 2008). As usual, we're a decade behind: 2.4 (Nov 2004). > Prettier output can of course be punted to a followup-patch. I guess it'll have to wait for 2.6. [...]
diff --git i/scripts/qapi-types.py w/scripts/qapi-types.py index 79e8d24..12f3767 100644 --- i/scripts/qapi-types.py +++ w/scripts/qapi-types.py @@ -184,6 +184,7 @@ class QAPISchemaGenTypeVisitor(QAPISchemaVisitor): self.fwdecl = None self.fwdefn = None self.btin = None + def visit_begin(self, schema): self.decl = '' self.defn = '' self.fwdecl = ''