diff mbox series

[v3,10/10] qapi: scripts: add a generator for qapi's examples

Message ID 20230919201857.675913-11-victortoso@redhat.com
State New
Headers show
Series Validate and test qapi examples | expand

Commit Message

Victor Toso Sept. 19, 2023, 8:18 p.m. UTC
This generator has two goals:
 1. Mechanical validation of QAPI examples
 2. Generate the examples in a JSON format to be consumed for extra
    validation.

The generator iterates over every Example section, parsing both server
and client messages. The generator prints any inconsistency found, for
example:

 |  Error: Extra data: line 1 column 39 (char 38)
 |  Location: cancel-vcpu-dirty-limit at qapi/migration.json:2017
 |  Data: {"execute": "cancel-vcpu-dirty-limit"},
 |      "arguments": { "cpu-index": 1 } }

The generator will output other JSON file with all the examples in the
QAPI module that they came from. This can be used to validate the
introspection between QAPI/QMP to language bindings, for example:

 | { "examples": [
 |   {
 |     "id": "ksuxwzfayw",
 |     "client": [
 |     {
 |       "sequence-order": 1
 |       "message-type": "command",
 |       "message":
 |       { "arguments":
 |         { "device": "scratch", "size": 1073741824 },
 |         "execute": "block_resize"
 |       },
 |    } ],
 |    "server": [
 |    {
 |      "sequence-order": 2
 |      "message-type": "return",
 |      "message": { "return": {} },
 |    } ]
 |    }
 |  ] }

Note that the order matters, as read by the Example section and
translated into "sequence-order". A language binding project can then
consume this files to Marshal and Unmarshal, comparing if the results
are what is to be expected.

RFC discussion:
    https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg04641.html

Signed-off-by: Victor Toso <victortoso@redhat.com>
---
 scripts/qapi/dumpexamples.py | 208 +++++++++++++++++++++++++++++++++++
 scripts/qapi/main.py         |   3 +-
 2 files changed, 210 insertions(+), 1 deletion(-)
 create mode 100644 scripts/qapi/dumpexamples.py

Comments

Markus Armbruster Sept. 21, 2023, 11:06 a.m. UTC | #1
Victor Toso <victortoso@redhat.com> writes:

> This generator has two goals:
>  1. Mechanical validation of QAPI examples
>  2. Generate the examples in a JSON format to be consumed for extra
>     validation.
>
> The generator iterates over every Example section, parsing both server
> and client messages. The generator prints any inconsistency found, for
> example:
>
>  |  Error: Extra data: line 1 column 39 (char 38)
>  |  Location: cancel-vcpu-dirty-limit at qapi/migration.json:2017
>  |  Data: {"execute": "cancel-vcpu-dirty-limit"},
>  |      "arguments": { "cpu-index": 1 } }

What language does it parse?

Can you give a grammar?

Should the parser be integrated into the doc parser, i.e. class QAPIDoc?

> The generator will output other JSON file with all the examples in the

Another JSON file, or other JSON files?

> QAPI module that they came from. This can be used to validate the
> introspection between QAPI/QMP to language bindings, for example:
>
>  | { "examples": [
>  |   {
>  |     "id": "ksuxwzfayw",
>  |     "client": [
>  |     {
>  |       "sequence-order": 1

Missing comma

>  |       "message-type": "command",
>  |       "message":
>  |       { "arguments":
>  |         { "device": "scratch", "size": 1073741824 },
>  |         "execute": "block_resize"
>  |       },
>  |    } ],
>  |    "server": [
>  |    {
>  |      "sequence-order": 2

Another one.

>  |      "message-type": "return",
>  |      "message": { "return": {} },

Extra comma.

>  |    } ]
>  |    }
>  |  ] }

Indentation is kind of weird.

The JSON's Valid structure and semantics are not documented.  We've
developed a way specify that, you might've heard of it, it's called
"QAPI schema" ;-P

Kidding aside, we've done this before.  E.g. docs/interop/firmware.json
specifies firmware descriptors.  We have some in pc-bios/descriptors/.

> Note that the order matters, as read by the Example section and
> translated into "sequence-order". A language binding project can then
> consume this files to Marshal and Unmarshal, comparing if the results
> are what is to be expected.
>
> RFC discussion:
>     https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg04641.html
>
> Signed-off-by: Victor Toso <victortoso@redhat.com>
> ---
>  scripts/qapi/dumpexamples.py | 208 +++++++++++++++++++++++++++++++++++
>  scripts/qapi/main.py         |   3 +-
>  2 files changed, 210 insertions(+), 1 deletion(-)
>  create mode 100644 scripts/qapi/dumpexamples.py
>
> diff --git a/scripts/qapi/dumpexamples.py b/scripts/qapi/dumpexamples.py
> new file mode 100644
> index 0000000000..55d9f13ab7
> --- /dev/null
> +++ b/scripts/qapi/dumpexamples.py

Let's name this examples.py.  It already does a bit more than dump
(namely validate), and any future code dealing with examples will likely
go into this file.

> @@ -0,0 +1,208 @@
> +"""
> +Dump examples for Developers
> +"""
> +# Copyright (c) 2023 Red Hat Inc.
> +#
> +# Authors:
> +#  Victor Toso <victortoso@redhat.com>
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2.

We should've insisted on v2+ for the QAPI generator back when we had a
chance.  *Sigh*

> +# See the COPYING file in the top-level directory.
> +
> +# Just for type hint on self
> +from __future__ import annotations
> +
> +import os
> +import json
> +import random
> +import string
> +
> +from typing import Dict, List, Optional
> +
> +from .schema import (
> +    QAPISchema,
> +    QAPISchemaType,
> +    QAPISchemaVisitor,
> +    QAPISchemaEnumMember,
> +    QAPISchemaFeature,
> +    QAPISchemaIfCond,
> +    QAPISchemaObjectType,
> +    QAPISchemaObjectTypeMember,
> +    QAPISchemaVariants,

pylint warns

    scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaEnumMember imported from schema (unused-import)
    scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaObjectTypeMember imported from schema (unused-import)
    scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaVariants imported from schema (unused-import)

> +)
> +from .source import QAPISourceInfo
> +
> +
> +def gen_examples(schema: QAPISchema,
> +                 output_dir: str,
> +                 prefix: str) -> None:
> +    vis = QAPISchemaGenExamplesVisitor(prefix)
> +    schema.visit(vis)
> +    vis.write(output_dir)

The other backends have this at the end of the file.  Either order
works, but consistency is nice.

> +
> +
> +def get_id(random, size: int) -> str:

pylint warns

    scripts/qapi/dumpexamples.py:44:11: W0621: Redefining name 'random' from outer scope (line 17) (redefined-outer-name)

> +    letters = string.ascii_lowercase
> +    return ''.join(random.choice(letters) for i in range(size))
> +
> +
> +def next_object(text, start, end, context) -> (Dict, bool):
> +    # Start of json object
> +    start = text.find("{", start)
> +    end = text.rfind("}", start, end+1)

Single quotes, please, for consistency with quote use elsewhere.

Rules of thumb:

* Double quotes for english text (e.g. error messages), so we don't have
  to escape apostrophes.

* Double quotes when they reduce the escaping

* Else single quotes

More elsewhere, not flagged.

> +
> +    # try catch, pretty print issues

Is this comment useful?

> +    try:
> +        ret = json.loads(text[start:end+1])
> +    except Exception as e:

pylint warns

    scripts/qapi/dumpexamples.py:57:11: W0703: Catching too general exception Exception (broad-except)

Catch JSONDecodeError?

> +        print("Error: {}\nLocation: {}\nData: {}\n".format(
> +              str(e), context, text[start:end+1]))

Errors need to go to stderr.

Have you considered using QAPIError to report these errors?

> +        return {}, True
> +    else:
> +        return ret, False
> +
> +
> +def parse_text_to_dicts(text: str, context: str) -> (List[Dict], bool):

Before I review the parser, I'd like to know the (intended) language
being parsed.

> +    examples, clients, servers = [], [], []
> +    failed = False
> +
> +    count = 1
> +    c, s = text.find("->"), text.find("<-")
> +    while c != -1 or s != -1:
> +        if c == -1 or (s != -1 and s < c):
> +            start, target = s, servers
> +        else:
> +            start, target = c, clients
> +
> +        # Find the client and server, if any
> +        if c != -1:
> +            c = text.find("->", start + 1)
> +        if s != -1:
> +            s = text.find("<-", start + 1)
> +
> +        # Find the limit of current's object.
> +        # We first look for the next message, either client or server. If none
> +        # is avaible, we set the end of the text as limit.
> +        if c == -1 and s != -1:
> +            end = s
> +        elif c != -1 and s == -1:
> +            end = c
> +        elif c != -1 and s != -1:
> +            end = (c < s) and c or s

pylint advises

    scripts/qapi/dumpexamples.py:91:12: R1706: Consider using ternary (c if c < s else s) (consider-using-ternary)

> +        else:
> +            end = len(text) - 1
> +
> +        message, error = next_object(text, start, end, context)
> +        if error:
> +            failed = True
> +
> +        if len(message) > 0:
> +            message_type = "return"
> +            if "execute" in message:
> +                message_type = "command"
> +            elif "event" in message:
> +                message_type = "event"
> +
> +            target.append({
> +                "sequence-order": count,
> +                "message-type": message_type,
> +                "message": message
> +            })
> +            count += 1
> +
> +    examples.append({"client": clients, "server": servers})
> +    return examples, failed
> +
> +
> +def parse_examples_of(self: QAPISchemaGenExamplesVisitor,

Uh, shouldn't this be a method of QAPISchemaGenExamplesVisitor?

> +                      name: str):
> +
> +    assert(name in self.schema._entity_dict)

Makes both pycodestyle and pylint unhappy.  Better:

       assert name in self.schema._entity_dict

pylint warns

    scripts/qapi/dumpexamples.py:120:19: W0212: Access to a protected member _entity_dict of a client class (protected-access)

here and two more times below.

> +    obj = self.schema._entity_dict[name]
> +
> +    if not obj.info.pragma.doc_required:
> +        return
> +
> +    assert((obj.doc is not None))

Better:

       assert obj.doc is not None

> +    module_name = obj._module.name
> +
> +    # We initialize random with the name so that we get consistent example
> +    # ids over different generations. The ids of a given example might
> +    # change when adding/removing examples, but that's acceptable as the
> +    # goal is just to grep $id to find what example failed at a given test
> +    # with minimum chorn over regenerating.

churn from?

> +    random.seed(name, version=2)

You're reinitializing the global PRNG.  Feels unclean.  Create your own
here?

> +
> +    for s in obj.doc.sections:
> +        if s.name != "Example":

docs/devel/qapi-code-gen.rst section "Definition documentation":

    A tagged section starts with one of the following words:
    "Note:"/"Notes:", "Since:", "Example"/"Examples", "Returns:", "TODO:".
    The section ends with the start of a new section.

You're missing "Examples".

docs/sphinx/qapidoc.py uses s.name.startswith('Example').

> +            continue
> +
> +        if module_name not in self.target:
> +            self.target[module_name] = []
> +
> +        context = f'''{name} at {obj.info.fname}:{obj.info.line}'''
> +        examples, failed = parse_text_to_dicts(s.text, context)
> +        if failed:
> +            # To warn user that docs needs fixing
> +            self.failed = True
> +
> +        for example in examples:
> +            self.target[module_name].append({
> +                    "id": get_id(random, 10),
> +                    "client": example["client"],
> +                    "server": example["server"]
> +            })
> +
> +
> +class QAPISchemaGenExamplesVisitor(QAPISchemaVisitor):
> +
> +    def __init__(self, prefix: str):
> +        super().__init__()
> +        self.target = {}

@target maps what to what?

> +        self.schema = None
> +        self.failed = False
> +
> +    def visit_begin(self, schema):
> +        self.schema = schema
> +
> +    def visit_end(self):
> +        self.schema = None
> +        assert not self.failed, "Should fix the docs"

Unless I'm misreading the code, this asserts "all the examples parse
fine."  Misuse of assert for reporting errors.

> +
> +    def write(self: QAPISchemaGenExamplesVisitor,
> +              output_dir: str) -> None:
> +        for filename, content in self.target.items():
> +            pathname = os.path.join(output_dir, "examples", filename)
> +            odir = os.path.dirname(pathname)
> +            os.makedirs(odir, exist_ok=True)
> +            result = {"examples": content}
> +
> +            with open(pathname, "w") as outfile:

pylint warns

    scripts/qapi/dumpexamples.py:180:17: W1514: Using open without explicitly specifying an encoding (unspecified-encoding)

Recommend to pass encoding='utf=8'.

> +                outfile.write(json.dumps(result, indent=2, sort_keys=True))
> +
> +    def visit_command(self: QAPISchemaGenExamplesVisitor,
> +                      name: str,
> +                      info: Optional[QAPISourceInfo],
> +                      ifcond: QAPISchemaIfCond,
> +                      features: List[QAPISchemaFeature],
> +                      arg_type: Optional[QAPISchemaObjectType],
> +                      ret_type: Optional[QAPISchemaType],
> +                      gen: bool,
> +                      success_response: bool,
> +                      boxed: bool,
> +                      allow_oob: bool,
> +                      allow_preconfig: bool,
> +                      coroutine: bool) -> None:
> +
> +        if gen:
> +            parse_examples_of(self, name)

Why only if gen?

> +
> +    def visit_event(self: QAPISchemaGenExamplesVisitor,
> +                    name: str,
> +                    info: Optional[QAPISourceInfo],
> +                    ifcond: QAPISchemaIfCond,
> +                    features: List[QAPISchemaFeature],
> +                    arg_type: Optional[QAPISchemaObjectType],
> +                    boxed: bool):
> +
> +        parse_examples_of(self, name)

Examples in definition comments for types are silently ignored.  Should
we do something with them?

> diff --git a/scripts/qapi/main.py b/scripts/qapi/main.py
> index 316736b6a2..9482439fa9 100644
> --- a/scripts/qapi/main.py
> +++ b/scripts/qapi/main.py
> @@ -13,6 +13,7 @@
>  
>  from .commands import gen_commands
>  from .common import must_match
> +from .dumpexamples import gen_examples
>  from .error import QAPIError
>  from .events import gen_events
>  from .introspect import gen_introspect
> @@ -53,7 +54,7 @@ def generate(schema_file: str,
>      gen_commands(schema, output_dir, prefix, gen_tracing)
>      gen_events(schema, output_dir, prefix)
>      gen_introspect(schema, output_dir, prefix, unmask)
> -
> +    gen_examples(schema, output_dir, prefix)
>  
>  def main() -> int:
>      """

You provide some type annotations, but mypy isn't happy:

    $ mypy --config-file scripts/qapi/mypy.ini scripts/qapi/dumpexamples.py 
    scripts/qapi/parser.py:566: error: Function is missing a return type annotation
    scripts/qapi/parser.py:570: error: Function is missing a return type annotation
    scripts/qapi/dumpexamples.py:44: error: Function is missing a type annotation for one or more arguments
    scripts/qapi/dumpexamples.py:49: error: Function is missing a type annotation for one or more arguments
    scripts/qapi/dumpexamples.py:49: error: Syntax error in type annotation
    scripts/qapi/dumpexamples.py:49: note: Suggestion: Use Tuple[T1, ..., Tn] instead of (T1, ..., Tn)
    scripts/qapi/dumpexamples.py:65: error: Syntax error in type annotation
    scripts/qapi/dumpexamples.py:65: note: Suggestion: Use Tuple[T1, ..., Tn] instead of (T1, ..., Tn)
    scripts/qapi/dumpexamples.py:66: error: Need type annotation for "clients" (hint: "clients: List[<type>] = ...")
    scripts/qapi/dumpexamples.py:66: error: Need type annotation for "servers" (hint: "servers: List[<type>] = ...")
    scripts/qapi/dumpexamples.py:117: error: Function is missing a return type annotation
    scripts/qapi/dumpexamples.py:120: error: "None" has no attribute "_entity_dict"
    scripts/qapi/dumpexamples.py:121: error: "None" has no attribute "_entity_dict"
    scripts/qapi/dumpexamples.py:161: error: Need type annotation for "target" (hint: "target: Dict[<type>, <type>] = ...")
    scripts/qapi/dumpexamples.py:165: error: Function is missing a type annotation
    scripts/qapi/dumpexamples.py:168: error: Function is missing a return type annotation
    scripts/qapi/dumpexamples.py:168: note: Use "-> None" if function does not return a value
    scripts/qapi/dumpexamples.py:200: error: Function is missing a return type annotation
    Found 15 errors in 2 files (checked 1 source file)

I think before I dig deeper, we should discuss my findings so far.


Here's my .pylintrc, in case you want to run pylint yourself:

disable=
    consider-using-f-string,
    fixme,
    invalid-name,
    missing-docstring,
    too-few-public-methods,
    too-many-arguments,
    too-many-branches,
    too-many-instance-attributes,
    too-many-lines,
    too-many-locals,
    too-many-statements,
    unused-argument,
    unused-wildcard-import,
Victor Toso Oct. 18, 2023, 7:35 p.m. UTC | #2
Hi Markus,

Sorry the delay on reply here.

On Thu, Sep 21, 2023 at 01:06:01PM +0200, Markus Armbruster wrote:
> Victor Toso <victortoso@redhat.com> writes:
> 
> > This generator has two goals:
> >  1. Mechanical validation of QAPI examples
> >  2. Generate the examples in a JSON format to be consumed for extra
> >     validation.
> >
> > The generator iterates over every Example section, parsing both server
> > and client messages. The generator prints any inconsistency found, for
> > example:
> >
> >  |  Error: Extra data: line 1 column 39 (char 38)
> >  |  Location: cancel-vcpu-dirty-limit at qapi/migration.json:2017
> >  |  Data: {"execute": "cancel-vcpu-dirty-limit"},
> >  |      "arguments": { "cpu-index": 1 } }
> 
> What language does it parse?

It parsers the Example section. Fetches client and server JSON
messages. Validate them.

> Can you give a grammar?

Not sure if needed? I just fetch the json from the example
section and validate it.
 
> Should the parser be integrated into the doc parser, i.e. class QAPIDoc?

Yes, that would be better. I'll try that in the next iteration.

> > The generator will output other JSON file with all the examples in the
> 
> Another JSON file, or other JSON files?

JSON files. QEMU'S QAPI qapi/migration.json will generate a
migration.json with the format mentioned bellow.
> 
> > QAPI module that they came from. This can be used to validate the
> > introspection between QAPI/QMP to language bindings, for example:
> >
> >  | { "examples": [
> >  |   {
> >  |     "id": "ksuxwzfayw",
> >  |     "client": [
> >  |     {
> >  |       "sequence-order": 1
> 
> Missing comma
> 
> >  |       "message-type": "command",
> >  |       "message":
> >  |       { "arguments":
> >  |         { "device": "scratch", "size": 1073741824 },
> >  |         "execute": "block_resize"
> >  |       },
> >  |    } ],
> >  |    "server": [
> >  |    {
> >  |      "sequence-order": 2
> 
> Another one.
> 
> >  |      "message-type": "return",
> >  |      "message": { "return": {} },
> 
> Extra comma.
> 
> >  |    } ]
> >  |    }
> >  |  ] }
> 
> Indentation is kind of weird.

The missing comma was likely my fault on copy & paste.  The
indentation should be what json.dumps() provides, but I think I
changed somehow in the git log too.
 
> The JSON's Valid structure and semantics are not documented.
> We've developed a way specify that, you might've heard of it,
> it's called "QAPI schema" ;-P
> 
> Kidding aside, we've done this before.  E.g. docs/interop/firmware.json
> specifies firmware descriptors.  We have some in pc-bios/descriptors/.

Oh, that's neat. You meant to use QAPI schema as a way to
document output from examples.py?
 
> > Note that the order matters, as read by the Example section and
> > translated into "sequence-order". A language binding project can then
> > consume this files to Marshal and Unmarshal, comparing if the results
> > are what is to be expected.
> >
> > RFC discussion:
> >     https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg04641.html
> >
> > Signed-off-by: Victor Toso <victortoso@redhat.com>
> > ---
> >  scripts/qapi/dumpexamples.py | 208 +++++++++++++++++++++++++++++++++++
> >  scripts/qapi/main.py         |   3 +-
> >  2 files changed, 210 insertions(+), 1 deletion(-)
> >  create mode 100644 scripts/qapi/dumpexamples.py
> >
> > diff --git a/scripts/qapi/dumpexamples.py b/scripts/qapi/dumpexamples.py
> > new file mode 100644
> > index 0000000000..55d9f13ab7
> > --- /dev/null
> > +++ b/scripts/qapi/dumpexamples.py
> 
> Let's name this examples.py.  It already does a bit more than
> dump (namely validate), and any future code dealing with
> examples will likely go into this file.

Ack.

> > @@ -0,0 +1,208 @@
> > +"""
> > +Dump examples for Developers
> > +"""
> > +# Copyright (c) 2023 Red Hat Inc.
> > +#
> > +# Authors:
> > +#  Victor Toso <victortoso@redhat.com>
> > +#
> > +# This work is licensed under the terms of the GNU GPL, version 2.
> 
> We should've insisted on v2+ for the QAPI generator back when we had a
> chance.  *Sigh*

:)
 
> > +# See the COPYING file in the top-level directory.
> > +
> > +# Just for type hint on self
> > +from __future__ import annotations
> > +
> > +import os
> > +import json
> > +import random
> > +import string
> > +
> > +from typing import Dict, List, Optional
> > +
> > +from .schema import (
> > +    QAPISchema,
> > +    QAPISchemaType,
> > +    QAPISchemaVisitor,
> > +    QAPISchemaEnumMember,
> > +    QAPISchemaFeature,
> > +    QAPISchemaIfCond,
> > +    QAPISchemaObjectType,
> > +    QAPISchemaObjectTypeMember,
> > +    QAPISchemaVariants,
> 
> pylint warns
> 
>     scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaEnumMember imported from schema (unused-import)
>     scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaObjectTypeMember imported from schema (unused-import)
>     scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaVariants imported from schema (unused-import)

Yes, now I learned a few tricks around python linters and
formatting. Next iteration will be better.
 
> > +)
> > +from .source import QAPISourceInfo
> > +
> > +
> > +def gen_examples(schema: QAPISchema,
> > +                 output_dir: str,
> > +                 prefix: str) -> None:
> > +    vis = QAPISchemaGenExamplesVisitor(prefix)
> > +    schema.visit(vis)
> > +    vis.write(output_dir)
> 
> The other backends have this at the end of the file.  Either order
> works, but consistency is nice.

Ok.
 
> > +
> > +
> > +def get_id(random, size: int) -> str:
> 
> pylint warns
> 
>     scripts/qapi/dumpexamples.py:44:11: W0621: Redefining name
>     'random' from outer scope (line 17) (redefined-outer-name)
> 
> > +    letters = string.ascii_lowercase
> > +    return ''.join(random.choice(letters) for i in range(size))
> > +
> > +
> > +def next_object(text, start, end, context) -> (Dict, bool):
> > +    # Start of json object
> > +    start = text.find("{", start)
> > +    end = text.rfind("}", start, end+1)
> 
> Single quotes, please, for consistency with quote use elsewhere.
> 
> Rules of thumb:
> 
> * Double quotes for english text (e.g. error messages), so we don't have
>   to escape apostrophes.
> 
> * Double quotes when they reduce the escaping
> 
> * Else single quotes
> 
> More elsewhere, not flagged.
> 
> > +
> > +    # try catch, pretty print issues
> 
> Is this comment useful?

I'll remove.
 
> > +    try:
> > +        ret = json.loads(text[start:end+1])
> > +    except Exception as e:
> 
> pylint warns
> 
>     scripts/qapi/dumpexamples.py:57:11: W0703: Catching too general exception Exception (broad-except)
> 
> Catch JSONDecodeError?

Ack.

> > +        print("Error: {}\nLocation: {}\nData: {}\n".format(
> > +              str(e), context, text[start:end+1]))
> 
> Errors need to go to stderr.
> 
> Have you considered using QAPIError to report these errors?

Did not cross my mind, no. I'll take a look.
 
> > +        return {}, True
> > +    else:
> > +        return ret, False
> > +
> > +
> > +def parse_text_to_dicts(text: str, context: str) -> (List[Dict], bool):
> 
> Before I review the parser, I'd like to know the (intended) language
> being parsed.

Ok, I'll add documentation about it in the next iteration.
 
> > +    examples, clients, servers = [], [], []
> > +    failed = False
> > +
> > +    count = 1
> > +    c, s = text.find("->"), text.find("<-")
> > +    while c != -1 or s != -1:
> > +        if c == -1 or (s != -1 and s < c):
> > +            start, target = s, servers
> > +        else:
> > +            start, target = c, clients
> > +
> > +        # Find the client and server, if any
> > +        if c != -1:
> > +            c = text.find("->", start + 1)
> > +        if s != -1:
> > +            s = text.find("<-", start + 1)
> > +
> > +        # Find the limit of current's object.
> > +        # We first look for the next message, either client or server. If none
> > +        # is avaible, we set the end of the text as limit.
> > +        if c == -1 and s != -1:
> > +            end = s
> > +        elif c != -1 and s == -1:
> > +            end = c
> > +        elif c != -1 and s != -1:
> > +            end = (c < s) and c or s
> 
> pylint advises
> 
>     scripts/qapi/dumpexamples.py:91:12: R1706: Consider using ternary (c if c < s else s) (consider-using-ternary)
> 
> > +        else:
> > +            end = len(text) - 1
> > +
> > +        message, error = next_object(text, start, end, context)
> > +        if error:
> > +            failed = True
> > +
> > +        if len(message) > 0:
> > +            message_type = "return"
> > +            if "execute" in message:
> > +                message_type = "command"
> > +            elif "event" in message:
> > +                message_type = "event"
> > +
> > +            target.append({
> > +                "sequence-order": count,
> > +                "message-type": message_type,
> > +                "message": message
> > +            })
> > +            count += 1
> > +
> > +    examples.append({"client": clients, "server": servers})
> > +    return examples, failed
> > +
> > +
> > +def parse_examples_of(self: QAPISchemaGenExamplesVisitor,
> 
> Uh, shouldn't this be a method of QAPISchemaGenExamplesVisitor?

Indeed.
 
> > +                      name: str):
> > +
> > +    assert(name in self.schema._entity_dict)
> 
> Makes both pycodestyle and pylint unhappy.  Better:
> 
>        assert name in self.schema._entity_dict
> 
> pylint warns
> 
>     scripts/qapi/dumpexamples.py:120:19: W0212: Access to a protected member _entity_dict of a client class (protected-access)
> 
> here and two more times below.

Thanks, I'll have all of those fixed.
 
> > +    obj = self.schema._entity_dict[name]
> > +
> > +    if not obj.info.pragma.doc_required:
> > +        return
> > +
> > +    assert((obj.doc is not None))
> 
> Better:
> 
>        assert obj.doc is not None
> 
> > +    module_name = obj._module.name
> > +
> > +    # We initialize random with the name so that we get consistent example
> > +    # ids over different generations. The ids of a given example might
> > +    # change when adding/removing examples, but that's acceptable as the
> > +    # goal is just to grep $id to find what example failed at a given test
> > +    # with minimum chorn over regenerating.
> 
> churn from?

There is one id per example section. The idea of having an id is
that, if a test failed, we can easily find what test failed.

The comment tries to clarify that the goal is the id to be kept
intact (hence, we seed from its name), reducing the churn over
regenerating the output.
 
> > +    random.seed(name, version=2)
> 
> You're reinitializing the global PRNG.  Feels unclean.  Create your own
> here?

I don't see much a problem with it, but sure, feels like local
would be cleaner indeed.
 
> > +
> > +    for s in obj.doc.sections:
> > +        if s.name != "Example":
> 
> docs/devel/qapi-code-gen.rst section "Definition documentation":
> 
>     A tagged section starts with one of the following words:
>     "Note:"/"Notes:", "Since:", "Example"/"Examples", "Returns:", "TODO:".
>     The section ends with the start of a new section.
> 
> You're missing "Examples".

Aha! I see several tests that I'm skipping, thanks! (19 Examples
sections)

> docs/sphinx/qapidoc.py uses s.name.startswith('Example').
> 
> > +            continue
> > +
> > +        if module_name not in self.target:
> > +            self.target[module_name] = []
> > +
> > +        context = f'''{name} at {obj.info.fname}:{obj.info.line}'''
> > +        examples, failed = parse_text_to_dicts(s.text, context)
> > +        if failed:
> > +            # To warn user that docs needs fixing
> > +            self.failed = True
> > +
> > +        for example in examples:
> > +            self.target[module_name].append({
> > +                    "id": get_id(random, 10),
> > +                    "client": example["client"],
> > +                    "server": example["server"]
> > +            })
> > +
> > +
> > +class QAPISchemaGenExamplesVisitor(QAPISchemaVisitor):
> > +
> > +    def __init__(self, prefix: str):
> > +        super().__init__()
> > +        self.target = {}
> 
> @target maps what to what?

Yes, lacking type hint and some documentation.

It is a dictionary that maps "filename" to an array of the
undocumented object. Each object has 3 fields:
    * "id"     : defines unique id for each test
    * "client" : object that wraps message sent by client (->)
    * "server" : object that wraps message sent by server (<-)

About client and server object, they contain 3 fields:
    * "message"      : The actual payload from the examples
                       section
    * "message-type" : Either "command", "return" or "event"
    * "sequence-order: (int) what index it had in the sequence of
                        examples section.

I'll have it properly documented.
 
> > +        self.schema = None
> > +        self.failed = False
> > +
> > +    def visit_begin(self, schema):
> > +        self.schema = schema
> > +
> > +    def visit_end(self):
> > +        self.schema = None
> > +        assert not self.failed, "Should fix the docs"
> 
> Unless I'm misreading the code, this asserts "all the examples parse
> fine."  Misuse of assert for reporting errors.
> 
> > +
> > +    def write(self: QAPISchemaGenExamplesVisitor,
> > +              output_dir: str) -> None:
> > +        for filename, content in self.target.items():
> > +            pathname = os.path.join(output_dir, "examples", filename)
> > +            odir = os.path.dirname(pathname)
> > +            os.makedirs(odir, exist_ok=True)
> > +            result = {"examples": content}
> > +
> > +            with open(pathname, "w") as outfile:
> 
> pylint warns
> 
>     scripts/qapi/dumpexamples.py:180:17: W1514: Using open without explicitly specifying an encoding (unspecified-encoding)
> 
> Recommend to pass encoding='utf=8'.

Ack.

> > +                outfile.write(json.dumps(result, indent=2, sort_keys=True))
> > +
> > +    def visit_command(self: QAPISchemaGenExamplesVisitor,
> > +                      name: str,
> > +                      info: Optional[QAPISourceInfo],
> > +                      ifcond: QAPISchemaIfCond,
> > +                      features: List[QAPISchemaFeature],
> > +                      arg_type: Optional[QAPISchemaObjectType],
> > +                      ret_type: Optional[QAPISchemaType],
> > +                      gen: bool,
> > +                      success_response: bool,
> > +                      boxed: bool,
> > +                      allow_oob: bool,
> > +                      allow_preconfig: bool,
> > +                      coroutine: bool) -> None:
> > +
> > +        if gen:
> > +            parse_examples_of(self, name)
> 
> Why only if gen?

I honestly don't remember. I'll test it out and document it
properly.
 
> > +
> > +    def visit_event(self: QAPISchemaGenExamplesVisitor,
> > +                    name: str,
> > +                    info: Optional[QAPISourceInfo],
> > +                    ifcond: QAPISchemaIfCond,
> > +                    features: List[QAPISchemaFeature],
> > +                    arg_type: Optional[QAPISchemaObjectType],
> > +                    boxed: bool):
> > +
> > +        parse_examples_of(self, name)
> 
> Examples in definition comments for types are silently ignored.  Should
> we do something with them?

The more the merrier. I'll tackle it in the next iteration too.
 
> > diff --git a/scripts/qapi/main.py b/scripts/qapi/main.py
> > index 316736b6a2..9482439fa9 100644
> > --- a/scripts/qapi/main.py
> > +++ b/scripts/qapi/main.py
> > @@ -13,6 +13,7 @@
> >  
> >  from .commands import gen_commands
> >  from .common import must_match
> > +from .dumpexamples import gen_examples
> >  from .error import QAPIError
> >  from .events import gen_events
> >  from .introspect import gen_introspect
> > @@ -53,7 +54,7 @@ def generate(schema_file: str,
> >      gen_commands(schema, output_dir, prefix, gen_tracing)
> >      gen_events(schema, output_dir, prefix)
> >      gen_introspect(schema, output_dir, prefix, unmask)
> > -
> > +    gen_examples(schema, output_dir, prefix)
> >  
> >  def main() -> int:
> >      """
> 
> You provide some type annotations, but mypy isn't happy:
> 
>     $ mypy --config-file scripts/qapi/mypy.ini scripts/qapi/dumpexamples.py 
>     scripts/qapi/parser.py:566: error: Function is missing a return type annotation
>     scripts/qapi/parser.py:570: error: Function is missing a return type annotation
>     scripts/qapi/dumpexamples.py:44: error: Function is missing a type annotation for one or more arguments
>     scripts/qapi/dumpexamples.py:49: error: Function is missing a type annotation for one or more arguments
>     scripts/qapi/dumpexamples.py:49: error: Syntax error in type annotation
>     scripts/qapi/dumpexamples.py:49: note: Suggestion: Use Tuple[T1, ..., Tn] instead of (T1, ..., Tn)
>     scripts/qapi/dumpexamples.py:65: error: Syntax error in type annotation
>     scripts/qapi/dumpexamples.py:65: note: Suggestion: Use Tuple[T1, ..., Tn] instead of (T1, ..., Tn)
>     scripts/qapi/dumpexamples.py:66: error: Need type annotation for "clients" (hint: "clients: List[<type>] = ...")
>     scripts/qapi/dumpexamples.py:66: error: Need type annotation for "servers" (hint: "servers: List[<type>] = ...")
>     scripts/qapi/dumpexamples.py:117: error: Function is missing a return type annotation
>     scripts/qapi/dumpexamples.py:120: error: "None" has no attribute "_entity_dict"
>     scripts/qapi/dumpexamples.py:121: error: "None" has no attribute "_entity_dict"
>     scripts/qapi/dumpexamples.py:161: error: Need type annotation for "target" (hint: "target: Dict[<type>, <type>] = ...")
>     scripts/qapi/dumpexamples.py:165: error: Function is missing a type annotation
>     scripts/qapi/dumpexamples.py:168: error: Function is missing a return type annotation
>     scripts/qapi/dumpexamples.py:168: note: Use "-> None" if function does not return a value
>     scripts/qapi/dumpexamples.py:200: error: Function is missing a return type annotation
>     Found 15 errors in 2 files (checked 1 source file)
> 
> I think before I dig deeper, we should discuss my findings so far.

Yes, I think I agreed with mostly of your suggestions. I'm
learning how to write proper python, so sorry that we saw many
basic lint failures here.
 
> Here's my .pylintrc, in case you want to run pylint yourself:
> 
> disable=
>     consider-using-f-string,
>     fixme,
>     invalid-name,
>     missing-docstring,
>     too-few-public-methods,
>     too-many-arguments,
>     too-many-branches,
>     too-many-instance-attributes,
>     too-many-lines,
>     too-many-locals,
>     too-many-statements,
>     unused-argument,
>     unused-wildcard-import,

Thanks again for the review, I really appreciate it.

Cheers,
Victor
Markus Armbruster Oct. 19, 2023, 6:26 a.m. UTC | #3
Victor Toso <victortoso@redhat.com> writes:

> Hi Markus,
>
> Sorry the delay on reply here.
>
> On Thu, Sep 21, 2023 at 01:06:01PM +0200, Markus Armbruster wrote:
>> Victor Toso <victortoso@redhat.com> writes:
>> 
>> > This generator has two goals:
>> >  1. Mechanical validation of QAPI examples
>> >  2. Generate the examples in a JSON format to be consumed for extra
>> >     validation.
>> >
>> > The generator iterates over every Example section, parsing both server
>> > and client messages. The generator prints any inconsistency found, for
>> > example:
>> >
>> >  |  Error: Extra data: line 1 column 39 (char 38)
>> >  |  Location: cancel-vcpu-dirty-limit at qapi/migration.json:2017
>> >  |  Data: {"execute": "cancel-vcpu-dirty-limit"},
>> >  |      "arguments": { "cpu-index": 1 } }
>> 
>> What language does it parse?
>
> It parsers the Example section. Fetches client and server JSON
> messages. Validate them.
>
>> Can you give a grammar?
>
> Not sure if needed? I just fetch the json from the example
> section and validate it.

The C++ parser just fetches the code from the text file and compiles it
:)

There are way too many parsers out there that show signs of "I'm not
parsing / I don't to nail down the language" delusions.

Let's start with an informal description of the language.

An example is a sequence of QMP input, QMP output, and explanatory text.

QMP input / output is a sequence of lines where the first one starts
with "<- " / "-> ", and the remaining ones start with " ".

Lines that are not QMP input / output are explanatory text.

>> Should the parser be integrated into the doc parser, i.e. class QAPIDoc?
>
> Yes, that would be better. I'll try that in the next iteration.
>
>> > The generator will output other JSON file with all the examples in the
>> 
>> Another JSON file, or other JSON files?
>
> JSON files. QEMU'S QAPI qapi/migration.json will generate a
> migration.json with the format mentioned bellow.
>> 
>> > QAPI module that they came from. This can be used to validate the
>> > introspection between QAPI/QMP to language bindings, for example:
>> >
>> >  | { "examples": [
>> >  |   {
>> >  |     "id": "ksuxwzfayw",
>> >  |     "client": [
>> >  |     {
>> >  |       "sequence-order": 1
>> 
>> Missing comma
>> 
>> >  |       "message-type": "command",
>> >  |       "message":
>> >  |       { "arguments":
>> >  |         { "device": "scratch", "size": 1073741824 },
>> >  |         "execute": "block_resize"
>> >  |       },
>> >  |    } ],
>> >  |    "server": [
>> >  |    {
>> >  |      "sequence-order": 2
>> 
>> Another one.
>> 
>> >  |      "message-type": "return",
>> >  |      "message": { "return": {} },
>> 
>> Extra comma.
>> 
>> >  |    } ]
>> >  |    }
>> >  |  ] }
>> 
>> Indentation is kind of weird.
>
> The missing comma was likely my fault on copy & paste.  The
> indentation should be what json.dumps() provides, but I think I
> changed somehow in the git log too.
>  
>> The JSON's Valid structure and semantics are not documented.
>> We've developed a way specify that, you might've heard of it,
>> it's called "QAPI schema" ;-P
>> 
>> Kidding aside, we've done this before.  E.g. docs/interop/firmware.json
>> specifies firmware descriptors.  We have some in pc-bios/descriptors/.
>
> Oh, that's neat. You meant to use QAPI schema as a way to
> document output from examples.py?

Exactly!

>> > Note that the order matters, as read by the Example section and
>> > translated into "sequence-order". A language binding project can then
>> > consume this files to Marshal and Unmarshal, comparing if the results
>> > are what is to be expected.
>> >
>> > RFC discussion:
>> >     https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg04641.html
>> >
>> > Signed-off-by: Victor Toso <victortoso@redhat.com>
>> > ---
>> >  scripts/qapi/dumpexamples.py | 208 +++++++++++++++++++++++++++++++++++
>> >  scripts/qapi/main.py         |   3 +-
>> >  2 files changed, 210 insertions(+), 1 deletion(-)
>> >  create mode 100644 scripts/qapi/dumpexamples.py
>> >
>> > diff --git a/scripts/qapi/dumpexamples.py b/scripts/qapi/dumpexamples.py
>> > new file mode 100644
>> > index 0000000000..55d9f13ab7
>> > --- /dev/null
>> > +++ b/scripts/qapi/dumpexamples.py
>> 
>> Let's name this examples.py.  It already does a bit more than
>> dump (namely validate), and any future code dealing with
>> examples will likely go into this file.
>
> Ack.
>
>> > @@ -0,0 +1,208 @@
>> > +"""
>> > +Dump examples for Developers
>> > +"""
>> > +# Copyright (c) 2023 Red Hat Inc.
>> > +#
>> > +# Authors:
>> > +#  Victor Toso <victortoso@redhat.com>
>> > +#
>> > +# This work is licensed under the terms of the GNU GPL, version 2.
>> 
>> We should've insisted on v2+ for the QAPI generator back when we had a
>> chance.  *Sigh*
>
> :)
>  
>> > +# See the COPYING file in the top-level directory.
>> > +
>> > +# Just for type hint on self
>> > +from __future__ import annotations
>> > +
>> > +import os
>> > +import json
>> > +import random
>> > +import string
>> > +
>> > +from typing import Dict, List, Optional
>> > +
>> > +from .schema import (
>> > +    QAPISchema,
>> > +    QAPISchemaType,
>> > +    QAPISchemaVisitor,
>> > +    QAPISchemaEnumMember,
>> > +    QAPISchemaFeature,
>> > +    QAPISchemaIfCond,
>> > +    QAPISchemaObjectType,
>> > +    QAPISchemaObjectTypeMember,
>> > +    QAPISchemaVariants,
>> 
>> pylint warns
>> 
>>     scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaEnumMember imported from schema (unused-import)
>>     scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaObjectTypeMember imported from schema (unused-import)
>>     scripts/qapi/dumpexamples.py:22:0: W0611: Unused QAPISchemaVariants imported from schema (unused-import)
>
> Yes, now I learned a few tricks around python linters and
> formatting. Next iteration will be better.
>  
>> > +)
>> > +from .source import QAPISourceInfo
>> > +
>> > +
>> > +def gen_examples(schema: QAPISchema,
>> > +                 output_dir: str,
>> > +                 prefix: str) -> None:
>> > +    vis = QAPISchemaGenExamplesVisitor(prefix)
>> > +    schema.visit(vis)
>> > +    vis.write(output_dir)
>> 
>> The other backends have this at the end of the file.  Either order
>> works, but consistency is nice.
>
> Ok.
>  
>> > +
>> > +
>> > +def get_id(random, size: int) -> str:
>> 
>> pylint warns
>> 
>>     scripts/qapi/dumpexamples.py:44:11: W0621: Redefining name
>>     'random' from outer scope (line 17) (redefined-outer-name)
>> 
>> > +    letters = string.ascii_lowercase
>> > +    return ''.join(random.choice(letters) for i in range(size))
>> > +
>> > +
>> > +def next_object(text, start, end, context) -> (Dict, bool):
>> > +    # Start of json object
>> > +    start = text.find("{", start)
>> > +    end = text.rfind("}", start, end+1)
>> 
>> Single quotes, please, for consistency with quote use elsewhere.
>> 
>> Rules of thumb:
>> 
>> * Double quotes for english text (e.g. error messages), so we don't have
>>   to escape apostrophes.
>> 
>> * Double quotes when they reduce the escaping
>> 
>> * Else single quotes
>> 
>> More elsewhere, not flagged.
>> 
>> > +
>> > +    # try catch, pretty print issues
>> 
>> Is this comment useful?
>
> I'll remove.
>  
>> > +    try:
>> > +        ret = json.loads(text[start:end+1])
>> > +    except Exception as e:
>> 
>> pylint warns
>> 
>>     scripts/qapi/dumpexamples.py:57:11: W0703: Catching too general exception Exception (broad-except)
>> 
>> Catch JSONDecodeError?
>
> Ack.
>
>> > +        print("Error: {}\nLocation: {}\nData: {}\n".format(
>> > +              str(e), context, text[start:end+1]))
>> 
>> Errors need to go to stderr.
>> 
>> Have you considered using QAPIError to report these errors?
>
> Did not cross my mind, no. I'll take a look.
>  
>> > +        return {}, True
>> > +    else:
>> > +        return ret, False
>> > +
>> > +
>> > +def parse_text_to_dicts(text: str, context: str) -> (List[Dict], bool):
>> 
>> Before I review the parser, I'd like to know the (intended) language
>> being parsed.
>
> Ok, I'll add documentation about it in the next iteration.
>  
>> > +    examples, clients, servers = [], [], []
>> > +    failed = False
>> > +
>> > +    count = 1
>> > +    c, s = text.find("->"), text.find("<-")
>> > +    while c != -1 or s != -1:
>> > +        if c == -1 or (s != -1 and s < c):
>> > +            start, target = s, servers
>> > +        else:
>> > +            start, target = c, clients
>> > +
>> > +        # Find the client and server, if any
>> > +        if c != -1:
>> > +            c = text.find("->", start + 1)
>> > +        if s != -1:
>> > +            s = text.find("<-", start + 1)
>> > +
>> > +        # Find the limit of current's object.
>> > +        # We first look for the next message, either client or server. If none
>> > +        # is avaible, we set the end of the text as limit.
>> > +        if c == -1 and s != -1:
>> > +            end = s
>> > +        elif c != -1 and s == -1:
>> > +            end = c
>> > +        elif c != -1 and s != -1:
>> > +            end = (c < s) and c or s
>> 
>> pylint advises
>> 
>>     scripts/qapi/dumpexamples.py:91:12: R1706: Consider using ternary (c if c < s else s) (consider-using-ternary)
>> 
>> > +        else:
>> > +            end = len(text) - 1
>> > +
>> > +        message, error = next_object(text, start, end, context)
>> > +        if error:
>> > +            failed = True
>> > +
>> > +        if len(message) > 0:
>> > +            message_type = "return"
>> > +            if "execute" in message:
>> > +                message_type = "command"
>> > +            elif "event" in message:
>> > +                message_type = "event"
>> > +
>> > +            target.append({
>> > +                "sequence-order": count,
>> > +                "message-type": message_type,
>> > +                "message": message
>> > +            })
>> > +            count += 1
>> > +
>> > +    examples.append({"client": clients, "server": servers})
>> > +    return examples, failed
>> > +
>> > +
>> > +def parse_examples_of(self: QAPISchemaGenExamplesVisitor,
>> 
>> Uh, shouldn't this be a method of QAPISchemaGenExamplesVisitor?
>
> Indeed.
>  
>> > +                      name: str):
>> > +
>> > +    assert(name in self.schema._entity_dict)
>> 
>> Makes both pycodestyle and pylint unhappy.  Better:
>> 
>>        assert name in self.schema._entity_dict
>> 
>> pylint warns
>> 
>>     scripts/qapi/dumpexamples.py:120:19: W0212: Access to a protected member _entity_dict of a client class (protected-access)
>> 
>> here and two more times below.
>
> Thanks, I'll have all of those fixed.
>  
>> > +    obj = self.schema._entity_dict[name]
>> > +
>> > +    if not obj.info.pragma.doc_required:
>> > +        return
>> > +
>> > +    assert((obj.doc is not None))
>> 
>> Better:
>> 
>>        assert obj.doc is not None
>> 
>> > +    module_name = obj._module.name
>> > +
>> > +    # We initialize random with the name so that we get consistent example
>> > +    # ids over different generations. The ids of a given example might
>> > +    # change when adding/removing examples, but that's acceptable as the
>> > +    # goal is just to grep $id to find what example failed at a given test
>> > +    # with minimum chorn over regenerating.
>> 
>> churn from?
>
> There is one id per example section. The idea of having an id is
> that, if a test failed, we can easily find what test failed.
>
> The comment tries to clarify that the goal is the id to be kept
> intact (hence, we seed from its name), reducing the churn over
> regenerating the output.

I'm not sure "over" is the correct preposition here.

[...]

>> I think before I dig deeper, we should discuss my findings so far.
>
> Yes, I think I agreed with mostly of your suggestions. I'm
> learning how to write proper python, so sorry that we saw many
> basic lint failures here.

No problem at all!  Such things only become a problem when people refuse
/ are unable to learn ;)

>> Here's my .pylintrc, in case you want to run pylint yourself:
>> 
>> disable=
>>     consider-using-f-string,
>>     fixme,
>>     invalid-name,
>>     missing-docstring,
>>     too-few-public-methods,
>>     too-many-arguments,
>>     too-many-branches,
>>     too-many-instance-attributes,
>>     too-many-lines,
>>     too-many-locals,
>>     too-many-statements,
>>     unused-argument,
>>     unused-wildcard-import,
>
> Thanks again for the review, I really appreciate it.

You're welcome!
diff mbox series

Patch

diff --git a/scripts/qapi/dumpexamples.py b/scripts/qapi/dumpexamples.py
new file mode 100644
index 0000000000..55d9f13ab7
--- /dev/null
+++ b/scripts/qapi/dumpexamples.py
@@ -0,0 +1,208 @@ 
+"""
+Dump examples for Developers
+"""
+# Copyright (c) 2023 Red Hat Inc.
+#
+# Authors:
+#  Victor Toso <victortoso@redhat.com>
+#
+# This work is licensed under the terms of the GNU GPL, version 2.
+# See the COPYING file in the top-level directory.
+
+# Just for type hint on self
+from __future__ import annotations
+
+import os
+import json
+import random
+import string
+
+from typing import Dict, List, Optional
+
+from .schema import (
+    QAPISchema,
+    QAPISchemaType,
+    QAPISchemaVisitor,
+    QAPISchemaEnumMember,
+    QAPISchemaFeature,
+    QAPISchemaIfCond,
+    QAPISchemaObjectType,
+    QAPISchemaObjectTypeMember,
+    QAPISchemaVariants,
+)
+from .source import QAPISourceInfo
+
+
+def gen_examples(schema: QAPISchema,
+                 output_dir: str,
+                 prefix: str) -> None:
+    vis = QAPISchemaGenExamplesVisitor(prefix)
+    schema.visit(vis)
+    vis.write(output_dir)
+
+
+def get_id(random, size: int) -> str:
+    letters = string.ascii_lowercase
+    return ''.join(random.choice(letters) for i in range(size))
+
+
+def next_object(text, start, end, context) -> (Dict, bool):
+    # Start of json object
+    start = text.find("{", start)
+    end = text.rfind("}", start, end+1)
+
+    # try catch, pretty print issues
+    try:
+        ret = json.loads(text[start:end+1])
+    except Exception as e:
+        print("Error: {}\nLocation: {}\nData: {}\n".format(
+              str(e), context, text[start:end+1]))
+        return {}, True
+    else:
+        return ret, False
+
+
+def parse_text_to_dicts(text: str, context: str) -> (List[Dict], bool):
+    examples, clients, servers = [], [], []
+    failed = False
+
+    count = 1
+    c, s = text.find("->"), text.find("<-")
+    while c != -1 or s != -1:
+        if c == -1 or (s != -1 and s < c):
+            start, target = s, servers
+        else:
+            start, target = c, clients
+
+        # Find the client and server, if any
+        if c != -1:
+            c = text.find("->", start + 1)
+        if s != -1:
+            s = text.find("<-", start + 1)
+
+        # Find the limit of current's object.
+        # We first look for the next message, either client or server. If none
+        # is avaible, we set the end of the text as limit.
+        if c == -1 and s != -1:
+            end = s
+        elif c != -1 and s == -1:
+            end = c
+        elif c != -1 and s != -1:
+            end = (c < s) and c or s
+        else:
+            end = len(text) - 1
+
+        message, error = next_object(text, start, end, context)
+        if error:
+            failed = True
+
+        if len(message) > 0:
+            message_type = "return"
+            if "execute" in message:
+                message_type = "command"
+            elif "event" in message:
+                message_type = "event"
+
+            target.append({
+                "sequence-order": count,
+                "message-type": message_type,
+                "message": message
+            })
+            count += 1
+
+    examples.append({"client": clients, "server": servers})
+    return examples, failed
+
+
+def parse_examples_of(self: QAPISchemaGenExamplesVisitor,
+                      name: str):
+
+    assert(name in self.schema._entity_dict)
+    obj = self.schema._entity_dict[name]
+
+    if not obj.info.pragma.doc_required:
+        return
+
+    assert((obj.doc is not None))
+    module_name = obj._module.name
+
+    # We initialize random with the name so that we get consistent example
+    # ids over different generations. The ids of a given example might
+    # change when adding/removing examples, but that's acceptable as the
+    # goal is just to grep $id to find what example failed at a given test
+    # with minimum chorn over regenerating.
+    random.seed(name, version=2)
+
+    for s in obj.doc.sections:
+        if s.name != "Example":
+            continue
+
+        if module_name not in self.target:
+            self.target[module_name] = []
+
+        context = f'''{name} at {obj.info.fname}:{obj.info.line}'''
+        examples, failed = parse_text_to_dicts(s.text, context)
+        if failed:
+            # To warn user that docs needs fixing
+            self.failed = True
+
+        for example in examples:
+            self.target[module_name].append({
+                    "id": get_id(random, 10),
+                    "client": example["client"],
+                    "server": example["server"]
+            })
+
+
+class QAPISchemaGenExamplesVisitor(QAPISchemaVisitor):
+
+    def __init__(self, prefix: str):
+        super().__init__()
+        self.target = {}
+        self.schema = None
+        self.failed = False
+
+    def visit_begin(self, schema):
+        self.schema = schema
+
+    def visit_end(self):
+        self.schema = None
+        assert not self.failed, "Should fix the docs"
+
+    def write(self: QAPISchemaGenExamplesVisitor,
+              output_dir: str) -> None:
+        for filename, content in self.target.items():
+            pathname = os.path.join(output_dir, "examples", filename)
+            odir = os.path.dirname(pathname)
+            os.makedirs(odir, exist_ok=True)
+            result = {"examples": content}
+
+            with open(pathname, "w") as outfile:
+                outfile.write(json.dumps(result, indent=2, sort_keys=True))
+
+    def visit_command(self: QAPISchemaGenExamplesVisitor,
+                      name: str,
+                      info: Optional[QAPISourceInfo],
+                      ifcond: QAPISchemaIfCond,
+                      features: List[QAPISchemaFeature],
+                      arg_type: Optional[QAPISchemaObjectType],
+                      ret_type: Optional[QAPISchemaType],
+                      gen: bool,
+                      success_response: bool,
+                      boxed: bool,
+                      allow_oob: bool,
+                      allow_preconfig: bool,
+                      coroutine: bool) -> None:
+
+        if gen:
+            parse_examples_of(self, name)
+
+    def visit_event(self: QAPISchemaGenExamplesVisitor,
+                    name: str,
+                    info: Optional[QAPISourceInfo],
+                    ifcond: QAPISchemaIfCond,
+                    features: List[QAPISchemaFeature],
+                    arg_type: Optional[QAPISchemaObjectType],
+                    boxed: bool):
+
+        parse_examples_of(self, name)
diff --git a/scripts/qapi/main.py b/scripts/qapi/main.py
index 316736b6a2..9482439fa9 100644
--- a/scripts/qapi/main.py
+++ b/scripts/qapi/main.py
@@ -13,6 +13,7 @@ 
 
 from .commands import gen_commands
 from .common import must_match
+from .dumpexamples import gen_examples
 from .error import QAPIError
 from .events import gen_events
 from .introspect import gen_introspect
@@ -53,7 +54,7 @@  def generate(schema_file: str,
     gen_commands(schema, output_dir, prefix, gen_tracing)
     gen_events(schema, output_dir, prefix)
     gen_introspect(schema, output_dir, prefix, unmask)
-
+    gen_examples(schema, output_dir, prefix)
 
 def main() -> int:
     """