[RFC,1/2] util - add automated ID generation utility

Message ID 17356f309b182f0df47567aeed40ab0e3cfc99d3.1441127976.git.jcody@redhat.com
State New

Commit Message

Jeff Cody Sept. 1, 2015, 5:23 p.m. UTC
Multiple sub-systems in QEMU may find it useful to generated IDs
for objects that a user may reference via QMP or HMP.  This patch
presents a standardized way to do it, so that automatic ID generation
follows the same rules.

This patch enforces the following rules when generating an ID:

1.) Guarantee no collisions with a user-specified ID
2.) Identify the sub-system the ID belongs to
3.) Guarantee of uniqueness
4.) Spoiling predictibility, to avoid creating an assumption
    of object ordering and parsing (i.e., we don't want users to think
    they can guess the next ID based on prior behavior).

The scheme for this is as follows (no spaces):

                # subsys D RR
Reserved char --|    |   | |
Subsytem String -----|   | |
Unique number (64-bit) --| |
Two-digit random number ---|

For example, a generated node-name for the block sub-system may take the
look like this:

    #block076

The caller of id_generate() is responsible for freeing the generated
node name string with g_free().

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 include/qemu-common.h |  8 ++++++++
 util/id.c             | 35 +++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)

Comments

Eric Blake Sept. 1, 2015, 6:55 p.m. UTC | #1
On 09/01/2015 11:23 AM, Jeff Cody wrote:
> Multiple sub-systems in QEMU may find it useful to generated IDs
> for objects that a user may reference via QMP or HMP.  This patch
> presents a standardized way to do it, so that automatic ID generation
> follows the same rules.
> 
> This patch enforces the following rules when generating an ID:
> 
> 1.) Guarantee no collisions with a user-specified ID
> 2.) Identify the sub-system the ID belongs to
> 3.) Guarantee of uniqueness
> 4.) Spoiling predictibility, to avoid creating an assumption
>     of object ordering and parsing (i.e., we don't want users to think
>     they can guess the next ID based on prior behavior).
> 
> The scheme for this is as follows (no spaces):
> 
>                 # subsys D RR
> Reserved char --|    |   | |
> Subsytem String -----|   | |

s/Subsytem/Subsystem/

> Unique number (64-bit) --| |
> Two-digit random number ---|
> 
> For example, a generated node-name for the block sub-system may take the
> look like this:

s/take the//

> 
>     #block076
> 
> The caller of id_generate() is responsible for freeing the generated
> node name string with g_free().
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>
> ---
>  include/qemu-common.h |  8 ++++++++
>  util/id.c             | 35 +++++++++++++++++++++++++++++++++++
>  2 files changed, 43 insertions(+)
> 

> +char *id_generate(IdSubSystems id)
> +{
> +    const char *id_subsys_str[] = {

s/id_/const id_/

> +        [ID_QDEV]  = "qdev",
> +        [ID_BLOCK] = "block",
> +    };

Do we want some sort of compile-time assertion that we have entries for
all id values?...

> +
> +    static uint64_t id_counters[ID_MAX];
> +    uint32_t rnd;
> +
> +    assert(id < ID_MAX);

...maybe in the form of assert(id_subsys_str[id])
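
A minimal sketch of the two checks discussed here, assuming the table is moved to file scope and a C11 compiler is available. The helper name id_subsys_name() is hypothetical and not part of the patch. The static assertion only catches an enum value added after the last table entry; a hole in the middle of the table is still only caught at run time by assert(id_subsys_str[id]).

```c
#include <assert.h>
#include <stddef.h>

typedef enum IdSubSystems {
    ID_QDEV,
    ID_BLOCK,
    ID_MAX      /* last element, used as array size */
} IdSubSystems;

/* Unsized on purpose: the array length becomes (highest index used) + 1. */
static const char *const id_subsys_str[] = {
    [ID_QDEV]  = "qdev",
    [ID_BLOCK] = "block",
};

/* Build-time check: fails if a trailing enum value has no table entry. */
_Static_assert(sizeof(id_subsys_str) / sizeof(id_subsys_str[0]) == ID_MAX,
               "id_subsys_str[] needs one entry per IdSubSystems value");

/* Hypothetical accessor, for illustration only. */
static const char *id_subsys_name(IdSubSystems id)
{
    assert(id < ID_MAX);
    /* Run-time check: catches a NULL hole left in the middle of the table. */
    assert(id_subsys_str[id]);
    return id_subsys_str[id];
}
```

In the QEMU tree the same build-time check would more likely be spelled with the project's own build-assert macro; _Static_assert is used here only to keep the sketch standalone.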


> +
> +    rnd = g_random_int_range(0, 99);
> +
> +    return g_strdup_printf("%c%s%" PRIu64 "%" PRId32, ID_SPECIAL_CHAR,
> +                                                      id_subsys_str[id],
> +                                                      id_counters[id]++,
> +                                                      rnd);
> +}
> 

Looks reasonable to me.
John Snow Sept. 1, 2015, 7:13 p.m. UTC | #2
On 09/01/2015 01:23 PM, Jeff Cody wrote:
> Multiple sub-systems in QEMU may find it useful to generated IDs

generate

> for objects that a user may reference via QMP or HMP.  This patch
> presents a standardized way to do it, so that automatic ID generation
> follows the same rules.
> 
> This patch enforces the following rules when generating an ID:
> 
> 1.) Guarantee no collisions with a user-specified ID
> 2.) Identify the sub-system the ID belongs to
> 3.) Guarantee of uniqueness
> 4.) Spoiling predictibility, to avoid creating an assumption

predictability

>     of object ordering and parsing (i.e., we don't want users to think
>     they can guess the next ID based on prior behavior).
> 
> The scheme for this is as follows (no spaces):
> 
>                 # subsys D RR
> Reserved char --|    |   | |
> Subsytem String -----|   | |

Subsystem

> Unique number (64-bit) --| |
> Two-digit random number ---|
> 
> For example, a generated node-name for the block sub-system may take the
> look like this:
> 

"take this form" or "look like this"

>     #block076
> 
> The caller of id_generate() is responsible for freeing the generated
> node name string with g_free().
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>
> ---
>  include/qemu-common.h |  8 ++++++++
>  util/id.c             | 35 +++++++++++++++++++++++++++++++++++
>  2 files changed, 43 insertions(+)
> 
> diff --git a/include/qemu-common.h b/include/qemu-common.h
> index bbaffd1..f6b0105 100644
> --- a/include/qemu-common.h
> +++ b/include/qemu-common.h
> @@ -237,6 +237,14 @@ int64_t strtosz_suffix_unit(const char *nptr, char **end,
>  #define STR_OR_NULL(str) ((str) ? (str) : "null")
>  
>  /* id.c */
> +
> +typedef enum IdSubSystems {
> +    ID_QDEV,
> +    ID_BLOCK,
> +    ID_MAX      /* last element, used as array size */
> +} IdSubSystems;
> +
> +char *id_generate(IdSubSystems);
>  bool id_wellformed(const char *id);
>  
>  /* path.c */
> diff --git a/util/id.c b/util/id.c
> index 09b22fb..48e2935 100644
> --- a/util/id.c
> +++ b/util/id.c
> @@ -26,3 +26,38 @@ bool id_wellformed(const char *id)
>      }
>      return true;
>  }
> +
> +#define ID_SPECIAL_CHAR '#'
> +
> +/* Generates an ID of the form:
> + *
> + * "#block146",
> + *
> + *  where:
> + *      - "#" is always the reserved character '#'
> + *      - "block" refers to the subsystem identifed via IdSubSystems
> + *        and id_subsys_str[]
> + *      - "1" is a unique number (up to a uint64_t) for the subsystem,
> + *      - "46" is a pseudo-random numer to create uniqueness
> + *
> + * The caller is responsible for freeing the returned string with g_free()
> + */
> +char *id_generate(IdSubSystems id)
> +{
> +    const char *id_subsys_str[] = {
> +        [ID_QDEV]  = "qdev",
> +        [ID_BLOCK] = "block",
> +    };
> +

Do we want this local to this function? A lookup table may be useful for
utilities at some point.

> +    static uint64_t id_counters[ID_MAX];
> +    uint32_t rnd;
> +
> +    assert(id < ID_MAX);
> +
> +    rnd = g_random_int_range(0, 99);
> +
> +    return g_strdup_printf("%c%s%" PRIu64 "%" PRId32, ID_SPECIAL_CHAR,
> +                                                      id_subsys_str[id],
> +                                                      id_counters[id]++,
> +                                                      rnd);
> +}
> 

So basically, it's #<sys><counter><rnd>

So we could see:

|block|1|32|

For the block subsystem, 1st device, salt is 32.
But we could also see:

|block|13|2|

Block subsys, 13th device, salt is 2.

Forcing a zero-pad on the salt should be enough to disambiguate in all
cases:

block132
block1302

This way, the last two digits are *always* salt, making the ID
unambiguous and, I think, impossible to collide against regardless of
what the rng returns in the future for new IDs.
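
The ambiguity above can be reproduced with a small standalone sketch. The helper names format_id() and same_id() are hypothetical, and plain snprintf() stands in for g_strdup_printf(); only the format strings matter here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes '#' + subsystem + counter + salt into buf.
 * With zero_pad set, the salt always occupies exactly two digits. */
static int format_id(char *buf, size_t len, const char *subsys,
                     uint64_t counter, uint32_t salt, int zero_pad)
{
    assert(buf && subsys);
    return snprintf(buf, len,
                    zero_pad ? "#%s%" PRIu64 "%02" PRIu32
                             : "#%s%" PRIu64 "%" PRIu32,
                    subsys, counter, salt);
}

/* Hypothetical helper: true when two generated IDs are identical. */
static int same_id(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Without padding, counter 1 with salt 32 and counter 13 with salt 2 both format as "#block132". With the "%02" width they become "#block132" and "#block1302", so the counter can always be recovered by dropping the last two digits.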
Jeff Cody Sept. 1, 2015, 7:15 p.m. UTC | #3
On Tue, Sep 01, 2015 at 12:55:15PM -0600, Eric Blake wrote:
> On 09/01/2015 11:23 AM, Jeff Cody wrote:
> > Multiple sub-systems in QEMU may find it useful to generated IDs
> > for objects that a user may reference via QMP or HMP.  This patch
> > presents a standardized way to do it, so that automatic ID generation
> > follows the same rules.
> > 
> > This patch enforces the following rules when generating an ID:
> > 
> > 1.) Guarantee no collisions with a user-specified ID
> > 2.) Identify the sub-system the ID belongs to
> > 3.) Guarantee of uniqueness
> > 4.) Spoiling predictibility, to avoid creating an assumption
> >     of object ordering and parsing (i.e., we don't want users to think
> >     they can guess the next ID based on prior behavior).
> > 
> > The scheme for this is as follows (no spaces):
> > 
> >                 # subsys D RR
> > Reserved char --|    |   | |
> > Subsytem String -----|   | |
> 
> s/Subsytem/Subsystem/
> 
> > Unique number (64-bit) --| |
> > Two-digit random number ---|
> > 
> > For example, a generated node-name for the block sub-system may take the
> > look like this:
> 
> s/take the//
> 
> > 
> >     #block076
> > 
> > The caller of id_generate() is responsible for freeing the generated
> > node name string with g_free().
> > 
> > Signed-off-by: Jeff Cody <jcody@redhat.com>
> > ---
> >  include/qemu-common.h |  8 ++++++++
> >  util/id.c             | 35 +++++++++++++++++++++++++++++++++++
> >  2 files changed, 43 insertions(+)
> > 
> 
> > +char *id_generate(IdSubSystems id)
> > +{
> > +    const char *id_subsys_str[] = {
> 
> s/id_/const id_/
> 

Good point.

> > +        [ID_QDEV]  = "qdev",
> > +        [ID_BLOCK] = "block",
> > +    };
> 
> Do we want some sort of compile-time assertion that we have entries for
> all id values?...
> 
> > +
> > +    static uint64_t id_counters[ID_MAX];
> > +    uint32_t rnd;
> > +
> > +    assert(id < ID_MAX);
> 
> ...maybe in the form of assert(id_subsys_str[id])
> 

Yes, I think we do.  If one is missing, that is certainly a mistake,
and we run the risk of collisions as well.

> 
> > +
> > +    rnd = g_random_int_range(0, 99);
> > +
> > +    return g_strdup_printf("%c%s%" PRIu64 "%" PRId32, ID_SPECIAL_CHAR,
> > +                                                      id_subsys_str[id],
> > +                                                      id_counters[id]++,
> > +                                                      rnd);
> > +}
> > 
> 
> Looks reasonable to me.
>

Thanks

-Jeff
Jeff Cody Sept. 1, 2015, 7:21 p.m. UTC | #4
On Tue, Sep 01, 2015 at 03:13:52PM -0400, John Snow wrote:
> 
> 
> On 09/01/2015 01:23 PM, Jeff Cody wrote:
> > Multiple sub-systems in QEMU may find it useful to generated IDs
> 
> generate
> 
> > for objects that a user may reference via QMP or HMP.  This patch
> > presents a standardized way to do it, so that automatic ID generation
> > follows the same rules.
> > 
> > This patch enforces the following rules when generating an ID:
> > 
> > 1.) Guarantee no collisions with a user-specified ID
> > 2.) Identify the sub-system the ID belongs to
> > 3.) Guarantee of uniqueness
> > 4.) Spoiling predictibility, to avoid creating an assumption
> 
> predictability
> 
> >     of object ordering and parsing (i.e., we don't want users to think
> >     they can guess the next ID based on prior behavior).
> > 
> > The scheme for this is as follows (no spaces):
> > 
> >                 # subsys D RR
> > Reserved char --|    |   | |
> > Subsytem String -----|   | |
> 
> Subsystem
> 
> > Unique number (64-bit) --| |
> > Two-digit random number ---|
> > 
> > For example, a generated node-name for the block sub-system may take the
> > look like this:
> > 
> 
> "take this form" or "look like this"
> 

All I can say is, sometimes my fingers don't obey my brain.


> >     #block076
> > 
> > The caller of id_generate() is responsible for freeing the generated
> > node name string with g_free().
> > 
> > Signed-off-by: Jeff Cody <jcody@redhat.com>
> > ---
> >  include/qemu-common.h |  8 ++++++++
> >  util/id.c             | 35 +++++++++++++++++++++++++++++++++++
> >  2 files changed, 43 insertions(+)
> > 
> > diff --git a/include/qemu-common.h b/include/qemu-common.h
> > index bbaffd1..f6b0105 100644
> > --- a/include/qemu-common.h
> > +++ b/include/qemu-common.h
> > @@ -237,6 +237,14 @@ int64_t strtosz_suffix_unit(const char *nptr, char **end,
> >  #define STR_OR_NULL(str) ((str) ? (str) : "null")
> >  
> >  /* id.c */
> > +
> > +typedef enum IdSubSystems {
> > +    ID_QDEV,
> > +    ID_BLOCK,
> > +    ID_MAX      /* last element, used as array size */
> > +} IdSubSystems;
> > +
> > +char *id_generate(IdSubSystems);
> >  bool id_wellformed(const char *id);
> >  
> >  /* path.c */
> > diff --git a/util/id.c b/util/id.c
> > index 09b22fb..48e2935 100644
> > --- a/util/id.c
> > +++ b/util/id.c
> > @@ -26,3 +26,38 @@ bool id_wellformed(const char *id)
> >      }
> >      return true;
> >  }
> > +
> > +#define ID_SPECIAL_CHAR '#'
> > +
> > +/* Generates an ID of the form:
> > + *
> > + * "#block146",
> > + *
> > + *  where:
> > + *      - "#" is always the reserved character '#'
> > + *      - "block" refers to the subsystem identifed via IdSubSystems
> > + *        and id_subsys_str[]
> > + *      - "1" is a unique number (up to a uint64_t) for the subsystem,
> > + *      - "46" is a pseudo-random numer to create uniqueness
> > + *
> > + * The caller is responsible for freeing the returned string with g_free()
> > + */
> > +char *id_generate(IdSubSystems id)
> > +{
> > +    const char *id_subsys_str[] = {
> > +        [ID_QDEV]  = "qdev",
> > +        [ID_BLOCK] = "block",
> > +    };
> > +
> 
> Do we want this local to this function? A lookup table may be useful for
> utilities at some point.
> 

Possibly.  I'm neutral; we can move it out of the function and make it
static.
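
As one illustration of what a file-scope table could enable, here is a sketch of a reverse lookup from a generated ID back to its subsystem. The function id_subsystem_of() is hypothetical and not part of the patch; it assumes subsystem names contain no digits and that the counter starts right after the name.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

typedef enum IdSubSystems {
    ID_QDEV,
    ID_BLOCK,
    ID_MAX      /* last element, used as array size */
} IdSubSystems;

#define ID_SPECIAL_CHAR '#'

/* File scope and static, so other helpers in the same file can share it. */
static const char *const id_subsys_str[ID_MAX] = {
    [ID_QDEV]  = "qdev",
    [ID_BLOCK] = "block",
};

/* Hypothetical: returns the subsystem named in a generated ID,
 * or ID_MAX if the string does not look like a generated ID. */
static IdSubSystems id_subsystem_of(const char *id)
{
    assert(id);
    if (id[0] != ID_SPECIAL_CHAR) {
        return ID_MAX;
    }
    for (int i = 0; i < ID_MAX; i++) {
        size_t len = strlen(id_subsys_str[i]);
        /* The name must be followed directly by the numeric part. */
        if (strncmp(id + 1, id_subsys_str[i], len) == 0 &&
            isdigit((unsigned char)id[1 + len])) {
            return (IdSubSystems)i;
        }
    }
    return ID_MAX;
}
```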

> > +    static uint64_t id_counters[ID_MAX];
> > +    uint32_t rnd;
> > +
> > +    assert(id < ID_MAX);
> > +
> > +    rnd = g_random_int_range(0, 99);
> > +
> > +    return g_strdup_printf("%c%s%" PRIu64 "%" PRId32, ID_SPECIAL_CHAR,
> > +                                                      id_subsys_str[id],
> > +                                                      id_counters[id]++,
> > +                                                      rnd);
> > +}
> > 
> 
> So basically, it's #<sys><counter><rnd>
> 
> So we could see:
> 
> |block|1|32|
> 
> For the block subsystem, 1st device, salt is 32.
> But we could also see:
> 
> |block|13|2|
> 
> Block subsys, 13th device, salt is 2.
> 
> Forcing a zero-pad on the salt should be enough to disambiguate in all
> cases:
> 
> block132
> block1302
> 
> This way, the last two digits are *always* salt, making the ID
> unambiguous and, I think, impossible to collide against regardless of
> what the rng returns in the future for new IDs.

Yes - that is actually what I meant to do.  We definitely want to 
enforce two digits for the random element.
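
A standalone sketch of what the generator might look like with the two-digit rule enforced. This is not the respun patch: id_generate_sketch() is a hypothetical name, rand() and malloc() stand in for the GLib calls so the example builds without GLib, and the caller releases the result with free() rather than g_free(). Note that the upper bound of g_random_int_range() is exclusive, so the (0, 99) call in the patch yields 0..98; covering 00..99 would need an upper bound of 100.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum IdSubSystems {
    ID_QDEV,
    ID_BLOCK,
    ID_MAX      /* last element, used as array size */
} IdSubSystems;

#define ID_SPECIAL_CHAR '#'

static const char *const id_subsys_str[] = {
    [ID_QDEV]  = "qdev",
    [ID_BLOCK] = "block",
};

/* Hypothetical stand-in for id_generate(); the caller frees with free(). */
static char *id_generate_sketch(IdSubSystems id)
{
    static uint64_t id_counters[ID_MAX];
    char buf[64];   /* 1 + longest name + 20 counter digits + 2 salt digits */
    uint32_t rnd;
    char *out;
    int n;

    assert(id < ID_MAX);
    assert(id_subsys_str[id]);

    /* 0..99; rand() is unseeded here and only a placeholder RNG. */
    rnd = (uint32_t)(rand() % 100);

    /* "%02" forces the random part to exactly two digits. */
    n = snprintf(buf, sizeof(buf), "%c%s%" PRIu64 "%02" PRIu32,
                 ID_SPECIAL_CHAR, id_subsys_str[id], id_counters[id]++, rnd);
    assert(n > 0 && (size_t)n < sizeof(buf));

    out = malloc((size_t)n + 1);
    if (out) {
        memcpy(out, buf, (size_t)n + 1);
    }
    return out;
}
```

With this format, the first two block IDs are "#block0" and "#block1" followed by two salt digits each, so both are nine characters long.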

Thanks,

Jeff
Patch

diff --git a/include/qemu-common.h b/include/qemu-common.h
index bbaffd1..f6b0105 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -237,6 +237,14 @@  int64_t strtosz_suffix_unit(const char *nptr, char **end,
 #define STR_OR_NULL(str) ((str) ? (str) : "null")
 
 /* id.c */
+
+typedef enum IdSubSystems {
+    ID_QDEV,
+    ID_BLOCK,
+    ID_MAX      /* last element, used as array size */
+} IdSubSystems;
+
+char *id_generate(IdSubSystems);
 bool id_wellformed(const char *id);
 
 /* path.c */
diff --git a/util/id.c b/util/id.c
index 09b22fb..48e2935 100644
--- a/util/id.c
+++ b/util/id.c
@@ -26,3 +26,38 @@  bool id_wellformed(const char *id)
     }
     return true;
 }
+
+#define ID_SPECIAL_CHAR '#'
+
+/* Generates an ID of the form:
+ *
+ * "#block146",
+ *
+ *  where:
+ *      - "#" is always the reserved character '#'
+ *      - "block" refers to the subsystem identified via IdSubSystems
+ *        and id_subsys_str[]
+ *      - "1" is a unique number (up to a uint64_t) for the subsystem,
+ *      - "46" is a pseudo-random number to create uniqueness
+ *
+ * The caller is responsible for freeing the returned string with g_free()
+ */
+char *id_generate(IdSubSystems id)
+{
+    const char *id_subsys_str[] = {
+        [ID_QDEV]  = "qdev",
+        [ID_BLOCK] = "block",
+    };
+
+    static uint64_t id_counters[ID_MAX];
+    uint32_t rnd;
+
+    assert(id < ID_MAX);
+
+    rnd = g_random_int_range(0, 99);
+
+    return g_strdup_printf("%c%s%" PRIu64 "%" PRId32, ID_SPECIAL_CHAR,
+                                                      id_subsys_str[id],
+                                                      id_counters[id]++,
+                                                      rnd);
+}