
[ovs-dev,monitor_cond,V6,04/11] ovsdb: generate update notifications for monitor_cond session

Message ID 1463495229-26258-5-git-send-email-lirans@il.ibm.com
State Changes Requested

Commit Message

Liran Schour May 17, 2016, 2:27 p.m. UTC
Hold session's conditions in ovsdb_monitor_session_condition. Pass it
to ovsdb_monitor for generating "update2" notifications.
Add functions that can generate "update2" notification for a
"monitor_cond" session.
JSON cache is enabled only for sessions with a true condition.
"monitor_cond" and "monitor_cond_change" are RFC 7047 extensions
described by ovsdb-server(1) manpage.

Performance evaluation:
OVN is the main candidate for conditional monitoring usage. It is clear that
conditional monitoring reduces computation on the ovn-controller (client) side
due to the reduced size of flow tables and update messages. However,
performance evaluation shows also a reduction in computation on the SB
ovsdb-server side proportional to the degree that each logical network is
spread over physical hosts in the DC.

Evaluation on simulated environment of 50 hosts and 1000 logical ports shows
the following results (cycles #):

LN spread over # hosts|    master    | patch        | change
-------------------------------------------------------------
            1         | 58855158082  | 38175941755  | 35.1%
            3         | 54816462604  | 40255584120  | 26.5%
            6         | 52972265506  | 39481653891  | 25.4%
           12         | 57036827284  | 42008285519  | 26.3%
           18         | 61900476558  | 45903107035  | 25.8%
           24         | 64281399690  | 55617752599  | 13.4%
           30         | 66905128558  | 61835913623  |  7.5%
           42         | 76763742331  | 70522724721  |  8.1%
           50         | 85372146321  | 80130285454  |  6.1%
---
 ovsdb/jsonrpc-server.c  |  41 +++++--
 ovsdb/monitor.c         | 281 +++++++++++++++++++++++++++++++++++++++++++-----
 ovsdb/monitor.h         |  30 +++++-
 ovsdb/ovsdb-server.1.in | 230 ++++++++++++++++++++++++++++++++++++---
 4 files changed, 529 insertions(+), 53 deletions(-)

Comments

Ben Pfaff June 2, 2016, 12:07 a.m. UTC | #1
On Tue, May 17, 2016 at 05:27:02PM +0300, Liran Schour wrote:
> Hold session's conditions in ovsdb_monitor_session_condition. Pass it
> to ovsdb_monitor for generating "update2" notifications.
> Add functions that can generate "update2" notification for a
> "monitor_cond" session.
> JSON cache is enabled only for session's with true condition only.
> "monitor_cond" and "monitor_cond_change" are RFC 7047 extensions
> described by ovsdb-server(1) manpage.

Thanks for the updated patch.

Please make ovsdb_monitor_session_condition_destroy() accept a null
pointer argument and treat it as a no-op, and then remove the check from
the caller, following CodingStyle:

    Functions that destroy an instance of a dynamically-allocated type
    should accept and ignore a null pointer argument.  Code that calls
    such a function (including the C standard library function free())
    should omit a null-pointer check.  We find that this usually makes
    code easier to read.
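
A minimal sketch of that convention, assuming a hypothetical `struct widget` type with `widget_create()` and `widget_destroy()` helpers (none of these names come from the patch). The destroy function tolerates a null pointer the same way free() does, so callers need no check of their own:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type, for illustration only; not part of the patch. */
struct widget {
    char *name;
};

/* Returns a newly allocated widget, or NULL if allocation fails. */
struct widget *
widget_create(const char *name)
{
    struct widget *w = malloc(sizeof *w);
    size_t len = strlen(name) + 1;

    if (!w) {
        return NULL;
    }
    w->name = malloc(len);
    if (!w->name) {
        free(w);
        return NULL;
    }
    memcpy(w->name, name, len);
    return w;
}

/* Accepts and ignores a null pointer, like free(). */
void
widget_destroy(struct widget *w)
{
    if (w) {
        free(w->name);
        free(w);
    }
}
```

With this shape a caller can write `widget_destroy(m->w);` unconditionally, which is the simplification requested for ovsdb_monitor_session_condition_destroy().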

In the documentation, s/Replay/Reply/, s/deescribed/described/,
s/origin/original/.

In the documentation, there's a reference to [<conditions>*] but I
think that should be [<condition>*] (without the 's').  The updated
meaning of <condition> is described in a couple of places, but I think
that instead we should just describe an update to the definition of
<condition>, for whatever section defines <condition> (and probably
should do that for the commit a few patches ago that actually updates
the condition parser).

There's a stray comma on a line by itself in the documentation, here:

    +contains the contents of the tables for which "initial" rows are selected.
    +If no tables initial contents are requested, then "result" is an empty object.
    +,
    +.IP

The parentheticals look funny in the documentation here, I'd suggest
just removing them:

    +.IP \(bu
    +If "insert" is omitted or true, "update" notifications are sent for rows newly
    +inserted into the table that match conditions or for rows modified in the table
    +so that their old version does not match the condition and new version does.
    +(new row in the client's replica table)
    +.IP \(bu
    +If "delete" is omitted or true, "update" notifications are sent for rows deleted
    +from the table that match conditions or for rows modified in the table so that
    +their old version does match the conditions and new version does not. (deleted row
    +in the client's replica)
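
The insert/delete semantics quoted above can be sketched as a small classification function. This is only an illustration under stated assumptions: `enum update_kind` and `classify_update()` are hypothetical names, the patch itself works with the OJMS_* values, and filtering by the table's "select" flags is left out.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; the patch uses OJMS_* values. */
enum update_kind {
    UPDATE_NONE,
    UPDATE_INSERT,
    UPDATE_DELETE,
    UPDATE_MODIFY
};

/* 'old_exists' and 'new_exists' say whether the row exists before and after
 * the transaction.  'old_match' and 'new_match' say whether that version of
 * the row satisfies the session's condition. */
enum update_kind
classify_update(bool old_exists, bool old_match,
                bool new_exists, bool new_match)
{
    /* A row version is visible to the client only if it exists and
     * matches the condition. */
    bool old_visible = old_exists && old_match;
    bool new_visible = new_exists && new_match;

    if (!old_visible && !new_visible) {
        return UPDATE_NONE;        /* The client never sees this row. */
    } else if (!old_visible) {
        return UPDATE_INSERT;      /* New row in the client's replica. */
    } else if (!new_visible) {
        return UPDATE_DELETE;      /* Row leaves the client's replica. */
    }
    return UPDATE_MODIFY;
}
```

A row that is modified so that only its new version matches is therefore reported as an insert, and one that stops matching is reported as a delete, even though the database operation was a modify in both cases.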

In the documentation for monitor_cond_update, I'm not sure there's value
in describing what's currently unsupported.

We already have some messages with "update" in their names with quite
different semantics.  What if we renamed monitor_cond_update to
monitor_cond_change or similar?  Then there would be less confusion.

I don't understand the design for how the monitor_cond_update replies
are meant to be handled.  The client's goal is probably to know how the
database content as seen through the new condition differs from the
database content as seen through the old condition; let's call this
difference D, just to be clear.  The documentation says such an update
is only sent after the monitor_cond_update reply:

    Updates as a result of a condition change, will be sent only after
    the client received a response to the "monitor_cond_update" request.

With that design, I don't see how the client can find out whether D is
empty or nonempty without waiting for an indefinite time or for some
further change to occur.  For example, if D is empty and the database
does not change, then the client will wait forever to receive a new
update2 notification.  The client can't simply assume that D is empty;
after all, what if the database is just slow?  To solve the problem, I
would suggest that we change the design, so that the documentation would
be more like:

    An update, if any, as a result of a condition change, will be sent
    to the client before the reply to the "monitor_cond_update" request.
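
A rough client-side sketch of why that ordering helps, assuming messages arrive in the order the server sent them. The `struct msg`, `struct client`, and `client_handle()` names are hypothetical and do not correspond to any OVS API. Once the reply to the condition change arrives, the client knows that any update caused by the change has already been applied, so an empty D needs no waiting:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message model, for illustration only. */
enum msg_kind { MSG_UPDATE2, MSG_REPLY };

struct msg {
    enum msg_kind kind;
    int id;                   /* Request id; meaningful for MSG_REPLY only. */
};

struct client {
    int pending_id;           /* Id of the outstanding condition change. */
    bool cond_in_effect;      /* True once the reply has been received. */
    size_t updates_applied;   /* Stand-in for changes to the replica. */
};

/* Processes one message, in the order the server sent it. */
void
client_handle(struct client *c, const struct msg *m)
{
    if (m->kind == MSG_UPDATE2) {
        /* Applied to the replica before the reply is seen. */
        c->updates_applied++;
    } else if (m->kind == MSG_REPLY && m->id == c->pending_id) {
        /* Any update caused by the condition change was sent before this
         * reply, so the replica now reflects the new condition. */
        c->cond_in_effect = true;
    }
}
```

If no update precedes the reply, the client can conclude immediately that D is empty instead of waiting for a notification that may never come.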

Thanks,

Ben.
Liran Schour June 6, 2016, 10:44 a.m. UTC | #2
Ben Pfaff <blp@ovn.org> wrote on 02/06/2016 03:07:08 AM:

> On Tue, May 17, 2016 at 05:27:02PM +0300, Liran Schour wrote:
> > Hold session's conditions in ovsdb_monitor_session_condition. Pass it
> > to ovsdb_monitor for generating "update2" notifications.
> > Add functions that can generate "update2" notification for a
> > "monitor_cond" session.
> > JSON cache is enabled only for session's with true condition only.
> > "monitor_cond" and "monitor_cond_change" are RFC 7047 extensions
> > described by ovsdb-server(1) manpage.
> 
> Thanks for the updated patch.
> 
> Please make ovsdb_monitor_session_condition_destroy() accept a null
> pointer argument and treat it as a no-op, and then remove the check from
> the caller, following CodingStyle:
> 
>     Functions that destroy an instance of a dynamically-allocated type
>     should accept and ignore a null pointer argument.  Code that calls
>     such a function (including the C standard library function free())
>     should omit a null-pointer check.  We find that this usually makes
>     code easier to read.
>

OK.
 
> In the documentation, s/Replay/Reply/, s/deescribed/described/,
> s/origin/original/.
> 

Will fix that.

> In the documentation, there's a reference to [<conditions>*] but that I
> think that should be [<condition>*] (without the 's').  The updated
> meaning of <condition> is described in a couple of places, but I think
> that instead we should just describe an update to the definition of
> <condition>, for whatever section defines <condition> (and probably
> should do that for the commit a few patches ago that actually updates
> the condition parser).
> 

Will fix that and move the <condition> specification to the former patch.

> There's a stray comma on a line by itself in the documentation, here:
> 
>     +contains the contents of the tables for which "initial" rows are selected.
>     +If no tables initial contents are requested, then "result" is an empty object.
>     +,
>     +.IP
> 

Right. Will fix that.

> The parentheticals look funny in the documentation here, I'd suggest
> just removing them:
> 
>     +.IP \(bu
>     +If "insert" is omitted or true, "update" notifications are sent for rows newly
>     +inserted into the table that match conditions or for rows modified in the table
>     +so that their old version does not match the condition and new version does.
>     +(new row in the client's replica table)
>     +.IP \(bu
>     +If "delete" is omitted or true, "update" notifications are sent for rows deleted
>     +from the table that match conditions or for rows modified in the table so that
>     +their old version does match the conditions and new version does not. (deleted row
>     +in the client's replica)
> 

Will remove that.

> In the documentation for monitor_cond_update, I'm not sure there's value
> in describing what's currently unsupported.
> 

Will move the documentation of monitor_cond_update to the patch that 
implements this.

> We already have some messages with "update" in their names with quite
> different semantics.  What if we renamed monitor_cond_update to
> monitor_cond_change or similar?  Then there would be less confusion.
> 

Will change monitor_cond_update to monitor_cond_change.

> I don't understand the design for how the monitor_cond_update replies
> are meant to be handled.  The client's goal is probably to know how the
> database content as seen through the new condition differs from the
> database content as seen through the old condition; let's call this
> difference D, just to be clear.  The documentation says such an update
> is only sent after the monitor_cond_update reply:
> 
>     Updates as a result of a condition change, will be sent only after
>     the client received a response to the "monitor_cond_update" request.
> 
> With that design, I don't see how the client can find out whether D is
> empty or nonempty without waiting for an indefinite time or for some
> further change to occur.  For example, if D is empty and the database
> does not change, then the client will wait forever to receive a new
> update2 notification.  The client can't simply assume that D is empty;
> after all, what if the database is just slow?  To solve the problem, I
> would suggest that we change the design, so that the documentation would
> be more like:
> 
>     An update, if any, as a result of a condition change, will be sent
>     to the client before the reply to the "monitor_cond_update" request.
> 

Will change the code and documentation to follow the paragraph above.

Thanks for reviewing the code.
- Liran

Patch

diff --git a/ovsdb/jsonrpc-server.c b/ovsdb/jsonrpc-server.c
index 4ec9bb1..1663832 100644
--- a/ovsdb/jsonrpc-server.c
+++ b/ovsdb/jsonrpc-server.c
@@ -28,6 +28,7 @@ 
 #include "ovsdb-error.h"
 #include "ovsdb-parser.h"
 #include "ovsdb.h"
+#include "condition.h"
 #include "poll-loop.h"
 #include "reconnect.h"
 #include "row.h"
@@ -865,7 +866,8 @@  ovsdb_jsonrpc_session_got_request(struct ovsdb_jsonrpc_session *s,
             reply = execute_transaction(s, db, request);
         }
     } else if (!strcmp(request->method, "monitor") ||
-               (monitor2_enable__ && !strcmp(request->method, "monitor2"))) {
+               (monitor2_enable__ && !strcmp(request->method, "monitor2")) ||
+               !strcmp(request->method, "monitor_cond")) {
         struct ovsdb *db = ovsdb_jsonrpc_lookup_db(s, request, &reply);
         if (!reply) {
             int l = strlen(request->method) - strlen("monitor");
@@ -1064,6 +1066,7 @@  struct ovsdb_jsonrpc_monitor {
     uint64_t unflushed;         /* The first transaction that has not been
                                        flushed to the jsonrpc remote client. */
     enum ovsdb_monitor_version version;
+    struct ovsdb_monitor_session_condition *condition;/* Session's condition */
 };
 
 static struct ovsdb_jsonrpc_monitor *
@@ -1091,20 +1094,27 @@  parse_bool(struct ovsdb_parser *parser, const char *name, bool default_value)
 }
 
 static struct ovsdb_error * OVS_WARN_UNUSED_RESULT
-ovsdb_jsonrpc_parse_monitor_request(struct ovsdb_monitor *dbmon,
-                                    const struct ovsdb_table *table,
-                                    const struct json *monitor_request)
+ovsdb_jsonrpc_parse_monitor_request(
+                               struct ovsdb_monitor *dbmon,
+                               const struct ovsdb_table *table,
+                               struct ovsdb_monitor_session_condition *cond,
+                               const struct json *monitor_request)
 {
     const struct ovsdb_table_schema *ts = table->schema;
     enum ovsdb_monitor_selection select;
-    const struct json *columns, *select_json;
+    const struct json *columns, *select_json, *where = NULL;
     struct ovsdb_parser parser;
     struct ovsdb_error *error;
 
     ovsdb_parser_init(&parser, monitor_request, "table %s", ts->name);
+    if (cond) {
+        where = ovsdb_parser_member(&parser, "where", OP_ARRAY | OP_OPTIONAL);
+    }
     columns = ovsdb_parser_member(&parser, "columns", OP_ARRAY | OP_OPTIONAL);
+
     select_json = ovsdb_parser_member(&parser, "select",
                                       OP_OBJECT | OP_OPTIONAL);
+
     error = ovsdb_parser_finish(&parser);
     if (error) {
         return error;
@@ -1171,6 +1181,12 @@  ovsdb_jsonrpc_parse_monitor_request(struct ovsdb_monitor *dbmon,
             }
         }
     }
+    if (cond) {
+        error = ovsdb_monitor_table_condition_create(cond, table, where);
+        if (error) {
+            return error;
+        }
+    }
 
     return NULL;
 }
@@ -1209,6 +1225,9 @@  ovsdb_jsonrpc_monitor_create(struct ovsdb_jsonrpc_session *s, struct ovsdb *db,
     m->session = s;
     m->db = db;
     m->dbmon = ovsdb_monitor_create(db, m);
+    if (version == OVSDB_MONITOR_V2) {
+        m->condition = ovsdb_monitor_session_condition_create();
+    }
     m->unflushed = 0;
     m->version = version;
     hmap_insert(&s->monitors, &m->node, json_hash(monitor_id, 0));
@@ -1237,6 +1256,7 @@  ovsdb_jsonrpc_monitor_create(struct ovsdb_jsonrpc_session *s, struct ovsdb *db,
             for (i = 0; i < array->n; i++) {
                 error = ovsdb_jsonrpc_parse_monitor_request(m->dbmon,
                                                             table,
+                                                            m->condition,
                                                             array->elems[i]);
                 if (error) {
                     goto error;
@@ -1245,6 +1265,7 @@  ovsdb_jsonrpc_monitor_create(struct ovsdb_jsonrpc_session *s, struct ovsdb *db,
         } else {
             error = ovsdb_jsonrpc_parse_monitor_request(m->dbmon,
                                                         table,
+                                                        m->condition,
                                                         mr_value);
             if (error) {
                 goto error;
@@ -1269,6 +1290,11 @@  ovsdb_jsonrpc_monitor_create(struct ovsdb_jsonrpc_session *s, struct ovsdb *db,
         m->dbmon = dbmon;
     }
 
+    /* Only now we can bind session's condition to ovsdb_monitor */
+    if (m->condition) {
+        ovsdb_monitor_condition_bind(m->dbmon, m->condition);
+    }
+
     ovsdb_monitor_get_initial(m->dbmon);
     json = ovsdb_jsonrpc_monitor_compose_update(m, true);
     json = json ? json : json_object_create();
@@ -1325,7 +1351,7 @@  ovsdb_jsonrpc_monitor_compose_update(struct ovsdb_jsonrpc_monitor *m,
     }
 
     return ovsdb_monitor_get_update(m->dbmon, initial, &m->unflushed,
-                                    m->version);
+                                    m->condition, m->version);
 }
 
 static bool
@@ -1348,6 +1374,9 @@  ovsdb_jsonrpc_monitor_destroy(struct ovsdb_jsonrpc_monitor *m)
     json_destroy(m->monitor_id);
     hmap_remove(&m->session->monitors, &m->node);
     ovsdb_monitor_remove_jsonrpc_monitor(m->dbmon, m, m->unflushed);
+    if (m->condition) {
+        ovsdb_monitor_session_condition_destroy(m->condition);
+    }
     free(m);
 }
 
diff --git a/ovsdb/monitor.c b/ovsdb/monitor.c
index 7321082..231789c 100644
--- a/ovsdb/monitor.c
+++ b/ovsdb/monitor.c
@@ -27,6 +27,7 @@ 
 #include "ovsdb-parser.h"
 #include "ovsdb.h"
 #include "row.h"
+#include "condition.h"
 #include "simap.h"
 #include "hash.h"
 #include "table.h"
@@ -37,10 +38,27 @@ 
 #include "monitor.h"
 #include "openvswitch/vlog.h"
 
+VLOG_DEFINE_THIS_MODULE(ovsdb_monitor);
 
 static const struct ovsdb_replica_class ovsdb_jsonrpc_replica_class;
 static struct hmap ovsdb_monitors = HMAP_INITIALIZER(&ovsdb_monitors);
 
+/* Keep state of session's conditions */
+struct ovsdb_monitor_session_condition {
+    bool conditional;
+    size_t n_true_cnd;
+    struct shash tables;     /* Contains
+                              *   "struct ovsdb_monitor_table_condition *"s. */
+};
+
+/* Monitored table session's conditions */
+struct ovsdb_monitor_table_condition {
+    const struct ovsdb_table *table;
+    struct ovsdb_monitor_table *mt;
+    struct ovsdb_condition old_condition;
+    struct ovsdb_condition new_condition;
+};
+
 /*  Backend monitor.
  *
  *  ovsdb_monitor keep track of the ovsdb changes.
@@ -129,9 +147,11 @@  struct ovsdb_monitor_table {
 };
 
 typedef struct json *
-(*compose_row_update_cb_func)(const struct ovsdb_monitor_table *mt,
-                              const struct ovsdb_monitor_row *row,
-                              bool initial, unsigned long int *changed);
+(*compose_row_update_cb_func)
+    (const struct ovsdb_monitor_table *mt,
+     const struct ovsdb_monitor_session_condition * condition,
+     const struct ovsdb_monitor_row *row,
+     bool initial, unsigned long int *changed);
 
 static void ovsdb_monitor_destroy(struct ovsdb_monitor *dbmon);
 static struct ovsdb_monitor_changes * ovsdb_monitor_table_add_changes(
@@ -416,6 +436,42 @@  ovsdb_monitor_add_column(struct ovsdb_monitor *dbmon,
     }
 }
 
+static void
+ovsdb_monitor_condition_add_columns(struct ovsdb_monitor *dbmon,
+                                    const struct ovsdb_table *table,
+                                    struct ovsdb_condition *condition)
+{
+    size_t n_columns;
+    int i;
+    const struct ovsdb_column **columns =
+        ovsdb_condition_get_columns(condition, &n_columns);
+
+    for (i = 0; i < n_columns; i++) {
+        ovsdb_monitor_add_column(dbmon, table, columns[i],
+                                 OJMS_NONE, false);
+    }
+
+    free(columns);
+}
+
+/* Bind this session's condition to ovsdb_monitor */
+void
+ovsdb_monitor_condition_bind(struct ovsdb_monitor *dbmon,
+                          struct ovsdb_monitor_session_condition *cond)
+{
+    struct shash_node *node;
+
+    SHASH_FOR_EACH(node, &cond->tables) {
+        struct ovsdb_monitor_table_condition *mtc = node->data;
+        struct ovsdb_monitor_table *mt =
+            shash_find_data(&dbmon->tables, mtc->table->schema->name);
+
+        mtc->mt = mt;
+        ovsdb_monitor_condition_add_columns(dbmon, mtc->table,
+                                            &mtc->new_condition);
+    }
+}
+
 /* Check for duplicated column names. Return the first
  * duplicated column's name if found. Otherwise return
  * NULL.  */
@@ -526,6 +582,157 @@  ovsdb_monitor_row_update_type(bool initial, const bool old, const bool new)
             : !new ? OJMS_DELETE
             : OJMS_MODIFY;
 }
+
+/* Set conditional monitoring mode only if we have non-empty condition in one
+ * of the tables at least */
+static inline void
+ovsdb_monitor_session_condition_set_mode(
+                                  struct ovsdb_monitor_session_condition *cond)
+{
+    cond->conditional = shash_count(&cond->tables) !=
+        cond->n_true_cnd;
+}
+
+/* Returns an empty allocated session's condition state holder */
+struct ovsdb_monitor_session_condition *
+ovsdb_monitor_session_condition_create(void)
+{
+    struct ovsdb_monitor_session_condition *condition =
+        xzalloc(sizeof *condition);
+
+    condition->conditional = false;
+    shash_init(&condition->tables);
+    return condition;
+}
+
+void
+ovsdb_monitor_session_condition_destroy(
+                           struct ovsdb_monitor_session_condition *condition)
+{
+    struct shash_node *node, *next;
+
+    SHASH_FOR_EACH_SAFE (node, next, &condition->tables) {
+        struct ovsdb_monitor_table_condition *mtc = node->data;
+
+        ovsdb_condition_destroy(&mtc->new_condition);
+        ovsdb_condition_destroy(&mtc->old_condition);
+        shash_delete(&condition->tables, node);
+        free(mtc);
+    }
+    free(condition);
+}
+
+struct ovsdb_error *
+ovsdb_monitor_table_condition_create(
+                         struct ovsdb_monitor_session_condition *condition,
+                         const struct ovsdb_table *table,
+                         const struct json *json_cnd)
+{
+    struct ovsdb_monitor_table_condition *mtc;
+    struct ovsdb_error *error;
+
+    mtc = xzalloc(sizeof *mtc);
+    mtc->table = table;
+    ovsdb_condition_init(&mtc->old_condition);
+    ovsdb_condition_init(&mtc->new_condition);
+
+    if (json_cnd) {
+        error = ovsdb_condition_from_json(table->schema,
+                                          json_cnd,
+                                          NULL,
+                                          &mtc->old_condition);
+        if (error) {
+            free(mtc);
+            return error;
+        }
+    }
+
+    shash_add(&condition->tables, table->schema->name, mtc);
+    /* On session startup old == new condition */
+    ovsdb_condition_clone(&mtc->new_condition, &mtc->old_condition);
+    if (ovsdb_condition_is_true(&mtc->old_condition)) {
+        condition->n_true_cnd++;
+        ovsdb_monitor_session_condition_set_mode(condition);
+    }
+
+    return NULL;
+}
+
+static bool
+ovsdb_monitor_get_table_conditions(
+                      const struct ovsdb_monitor_table *mt,
+                      const struct ovsdb_monitor_session_condition *condition,
+                      struct ovsdb_condition **old_condition,
+                      struct ovsdb_condition **new_condition)
+{
+    if (!condition) {
+        return false;
+    }
+
+    struct ovsdb_monitor_table_condition *mtc =
+        shash_find_data(&condition->tables, mt->table->schema->name);
+
+    if (!mtc) {
+        return false;
+    }
+    *old_condition = &mtc->old_condition;
+    *new_condition = &mtc->new_condition;
+
+    return true;
+}
+
+static enum ovsdb_monitor_selection
+ovsdb_monitor_row_update_type_condition(
+                      const struct ovsdb_monitor_table *mt,
+                      const struct ovsdb_monitor_session_condition *condition,
+                      bool initial,
+                      const struct ovsdb_datum *old,
+                      const struct ovsdb_datum *new)
+{
+    struct ovsdb_condition *old_condition, *new_condition;
+    enum ovsdb_monitor_selection type =
+        ovsdb_monitor_row_update_type(initial, old, new);
+
+    if (ovsdb_monitor_get_table_conditions(mt,
+                                           condition,
+                                           &old_condition,
+                                           &new_condition)) {
+        bool old_cond = !old ? false
+            : ovsdb_condition_empty_or_match_any(old,
+                                                 old_condition,
+                                                 mt->columns_index_map);
+        bool new_cond = !new ? false
+            : ovsdb_condition_empty_or_match_any(new,
+                                                 new_condition,
+                                                 mt->columns_index_map);
+
+        if (!old_cond && !new_cond) {
+            type = OJMS_NONE;
+        }
+
+        switch (type) {
+        case OJMS_INITIAL:
+        case OJMS_INSERT:
+            if (!new_cond) {
+                type = OJMS_NONE;
+            }
+            break;
+        case OJMS_MODIFY:
+            type = !old_cond ? OJMS_INSERT : !new_cond
+                ? OJMS_DELETE : OJMS_MODIFY;
+            break;
+        case OJMS_DELETE:
+            if (!old_cond) {
+                type = OJMS_NONE;
+            }
+            break;
+        case OJMS_NONE:
+            break;
+        }
+    }
+    return type;
+}
+
 static bool
 ovsdb_monitor_row_skip_update(const struct ovsdb_monitor_table *mt,
                               const struct ovsdb_monitor_row *row,
@@ -570,6 +777,7 @@  ovsdb_monitor_row_skip_update(const struct ovsdb_monitor_table *mt,
 static struct json *
 ovsdb_monitor_compose_row_update(
     const struct ovsdb_monitor_table *mt,
+    const struct ovsdb_monitor_session_condition *condition OVS_UNUSED,
     const struct ovsdb_monitor_row *row,
     bool initial, unsigned long int *changed)
 {
@@ -631,6 +839,7 @@  ovsdb_monitor_compose_row_update(
 static struct json *
 ovsdb_monitor_compose_row_update2(
     const struct ovsdb_monitor_table *mt,
+    const struct ovsdb_monitor_session_condition *condition,
     const struct ovsdb_monitor_row *row,
     bool initial, unsigned long int *changed)
 {
@@ -638,7 +847,8 @@  ovsdb_monitor_compose_row_update2(
     struct json *row_update2, *diff_json;
     size_t i;
 
-    type = ovsdb_monitor_row_update_type(initial, row->old, row->new);
+    type = ovsdb_monitor_row_update_type_condition(mt, condition, initial,
+                                                   row->old, row->new);
     if (ovsdb_monitor_row_skip_update(mt, row, type, changed)) {
         return NULL;
     }
@@ -708,9 +918,11 @@  ovsdb_monitor_max_columns(struct ovsdb_monitor *dbmon)
  * RFC 7047) for all the outstanding changes within 'monitor', starting from
  * 'transaction'.  */
 static struct json*
-ovsdb_monitor_compose_update(struct ovsdb_monitor *dbmon,
-                             bool initial, uint64_t transaction,
-                             compose_row_update_cb_func row_update)
+ovsdb_monitor_compose_update(
+                      struct ovsdb_monitor *dbmon,
+                      bool initial, uint64_t transaction,
+                      const struct ovsdb_monitor_session_condition *condition,
+                      compose_row_update_cb_func row_update)
 {
     struct shash_node *node;
     struct json *json;
@@ -732,7 +944,7 @@  ovsdb_monitor_compose_update(struct ovsdb_monitor *dbmon,
         HMAP_FOR_EACH_SAFE (row, next, hmap_node, &changes->rows) {
             struct json *row_json;
 
-            row_json = (*row_update)(mt, row, initial, changed);
+            row_json = (*row_update)(mt, condition, row, initial, changed);
             if (row_json) {
                 char uuid[UUID_LEN + 1];
 
@@ -766,11 +978,13 @@  ovsdb_monitor_compose_update(struct ovsdb_monitor *dbmon,
  * be used as part of the initial reply to a "monitor" request, false if it is
  * going to be used as part of an "update" notification. */
 struct json *
-ovsdb_monitor_get_update(struct ovsdb_monitor *dbmon,
-                         bool initial, uint64_t *unflushed_,
-                         enum ovsdb_monitor_version version)
+ovsdb_monitor_get_update(
+             struct ovsdb_monitor *dbmon,
+             bool initial, uint64_t *unflushed_,
+             const struct ovsdb_monitor_session_condition *condition,
+             enum ovsdb_monitor_version version)
 {
-    struct ovsdb_monitor_json_cache_node *cache_node;
+    struct ovsdb_monitor_json_cache_node *cache_node = NULL;
     struct shash_node *node;
     struct json *json;
     const uint64_t unflushed = *unflushed_;
@@ -778,19 +992,28 @@  ovsdb_monitor_get_update(struct ovsdb_monitor *dbmon,
 
     /* Return a clone of cached json if one exists. Otherwise,
      * generate a new one and add it to the cache.  */
-    cache_node = ovsdb_monitor_json_cache_search(dbmon, version, unflushed);
+    if (!condition || !condition->conditional) {
+        cache_node = ovsdb_monitor_json_cache_search(dbmon, version,
+                                                     unflushed);
+    }
     if (cache_node) {
         json = cache_node->json ? json_clone(cache_node->json) : NULL;
     } else {
         if (version == OVSDB_MONITOR_V1) {
-            json = ovsdb_monitor_compose_update(dbmon, initial, unflushed,
-                                        ovsdb_monitor_compose_row_update);
+            json =
+               ovsdb_monitor_compose_update(dbmon, initial, unflushed,
+                                            condition,
+                                            ovsdb_monitor_compose_row_update);
         } else {
             ovs_assert(version == OVSDB_MONITOR_V2);
-            json = ovsdb_monitor_compose_update(dbmon, initial, unflushed,
-                                        ovsdb_monitor_compose_row_update2);
+            json =
+               ovsdb_monitor_compose_update(dbmon, initial, unflushed,
+                                            condition,
+                                            ovsdb_monitor_compose_row_update2);
+        }
+        if (!condition || !condition->conditional) {
+            ovsdb_monitor_json_cache_insert(dbmon, version, unflushed, json);
         }
-        ovsdb_monitor_json_cache_insert(dbmon, version, unflushed, json);
     }
 
     /* Maintain transaction id of 'changes'. */
@@ -928,6 +1151,11 @@  ovsdb_monitor_changes_classify(enum ovsdb_monitor_selection type,
         return OVSDB_CHANGES_NO_EFFECT;
     }
 
+    if (type == OJMS_MODIFY) {
+        /* Condition might turn a modify operation to insert or delete */
+        type |= OJMS_INSERT | OJMS_DELETE;
+    }
+
     return (mt->select & type)
                 ?  OVSDB_CHANGES_REQUIRE_EXTERNAL_UPDATE
                 :  OVSDB_CHANGES_REQUIRE_INTERNAL_UPDATE;
@@ -955,19 +1183,18 @@  ovsdb_monitor_change_cb(const struct ovsdb_row *old,
     }
     mt = aux->mt;
 
-    HMAP_FOR_EACH(changes, hmap_node, &mt->changes) {
-        enum ovsdb_monitor_changes_efficacy efficacy;
-        enum ovsdb_monitor_selection type;
+    enum ovsdb_monitor_selection type =
+        ovsdb_monitor_row_update_type(false, old, new);
+    enum ovsdb_monitor_changes_efficacy efficacy =
+        ovsdb_monitor_changes_classify(type, mt, changed);
 
-        type = ovsdb_monitor_row_update_type(false, old, new);
-        efficacy = ovsdb_monitor_changes_classify(type, mt, changed);
+    HMAP_FOR_EACH(changes, hmap_node, &mt->changes) {
         if (efficacy > OVSDB_CHANGES_NO_EFFECT) {
             ovsdb_monitor_changes_update(old, new, mt, changes);
         }
-
-        if (aux->efficacy < efficacy) {
-            aux->efficacy = efficacy;
-        }
+    }
+    if (aux->efficacy < efficacy) {
+        aux->efficacy = efficacy;
     }
 
     return true;
diff --git a/ovsdb/monitor.h b/ovsdb/monitor.h
index 067aef0..3bd8bdf 100644
--- a/ovsdb/monitor.h
+++ b/ovsdb/monitor.h
@@ -19,8 +19,11 @@ 
 
 struct ovsdb_monitor;
 struct ovsdb_jsonrpc_monitor;
+struct ovsdb_monitor_session_condition;
+struct ovsdb_condition;
 
 enum ovsdb_monitor_selection {
+    OJMS_NONE = 0,              /* None for this iteration */
     OJMS_INITIAL = 1 << 0,      /* All rows when monitor is created. */
     OJMS_INSERT = 1 << 1,       /* New rows. */
     OJMS_DELETE = 1 << 2,       /* Deleted rows. */
@@ -62,10 +65,12 @@  const char * OVS_WARN_UNUSED_RESULT
 ovsdb_monitor_table_check_duplicates(struct ovsdb_monitor *,
                           const struct ovsdb_table *);
 
-struct json *ovsdb_monitor_get_update(struct ovsdb_monitor *dbmon,
-                                      bool initial,
-                                      uint64_t *unflushed_transaction,
-                                      enum ovsdb_monitor_version version);
+struct json *ovsdb_monitor_get_update(
+               struct ovsdb_monitor *dbmon,
+               bool initial,
+               uint64_t *unflushed_transaction,
+               const struct ovsdb_monitor_session_condition *condition,
+               enum ovsdb_monitor_version version);
 
 void ovsdb_monitor_table_add_select(struct ovsdb_monitor *dbmon,
                                     const struct ovsdb_table *table,
@@ -77,4 +82,21 @@  bool ovsdb_monitor_needs_flush(struct ovsdb_monitor *dbmon,
 void ovsdb_monitor_get_initial(const struct ovsdb_monitor *dbmon);
 
 void ovsdb_monitor_get_memory_usage(struct simap *usage);
+
+struct ovsdb_monitor_session_condition *
+ovsdb_monitor_session_condition_create(void);
+
+void
+ovsdb_monitor_session_condition_destroy(
+                          struct ovsdb_monitor_session_condition *condition);
+struct ovsdb_error *
+ovsdb_monitor_table_condition_create(
+                          struct ovsdb_monitor_session_condition *condition,
+                          const struct ovsdb_table *table,
+                          const struct json *json_cnd);
+
+void
+ovsdb_monitor_condition_bind(struct ovsdb_monitor *dbmon,
+                             struct ovsdb_monitor_session_condition *cond);
+
 #endif
diff --git a/ovsdb/ovsdb-server.1.in b/ovsdb/ovsdb-server.1.in
index f348a3b..2f0a397 100644
--- a/ovsdb/ovsdb-server.1.in
+++ b/ovsdb/ovsdb-server.1.in
@@ -254,31 +254,228 @@  notifications (see below) to the request, it must be unique among all
 active monitors.  \fBovsdb\-server\fR rejects attempt to create two
 monitors with the same identifier.
 .
-.IP "4.1.12. Monitor2"
-A new monitor method added in Open vSwitch version 2.5. Monitor2 allows
-for more efficient update notifications (described below).
+.IP "4.1.12. Monitor_cond"
+A new monitor method added in Open vSwitch version 2.5. The monitor_cond
+request enables a client to replicate subsets of tables within an OVSDB
+database by requesting notifications of changes to rows matching one of
+the conditions specified in "where" by receiving the specified contents
+of these rows when table updates occur. Monitor_cond also allows more
+efficient update notifications by receiving table-updates2 notifications
+(described below).
+.
 .IP
-The monitor method described in Section 4.1.5 also applies to
-monitor2, with the following exceptions.
+The monitor method described in Section 4.1.5 also applies to monitor_cond,
+with the following exceptions:
 .
 .RS
 .IP \(bu
-RPC request method becomes "monitor2".
+RPC request method becomes "monitor_cond".
 .IP \(bu
-Replay result follows <table-updates2>, described in Section 4.1.13.
+Replay result follows <table-updates2>, described in Section 4.1.14.
 .IP \(bu
 Subsequent changes are sent to the client using the "update2" monitor
-notification, described in Section 4.1.13
+notification, described in Section 4.1.14.
+.IP \(bu
+Update notifications are sent only for rows matching [<condition>*].
+<condition> is specified in Section 5.1 of the RFC with the following
+change: a condition can be either a 3-element JSON array as described in
+the RFC or a boolean value. An empty array is treated as an implicit
+boolean true, so all rows are monitored.
+.RE
+.
+.IP
+The request object has the following members:
+.
+.PP
+.RS
+.nf
+"method": "monitor_cond"
+"params": [<db-name>, <json-value>, <monitor-cond-requests>]
+"id": <nonnull-json-value>
+.fi
+.RE
+.
+.IP
+The <json-value> parameter is used to match subsequent update notifications
+(see below) to this request. The <monitor-cond-requests> object maps the name
+of the table to an array of <monitor-cond-request>.
+.
+.IP
+Each <monitor-cond-request> is an object with the following members:
+.
+.PP
+.RS
+.nf
+"columns": [<column>*]            optional
+"where": [<condition>*]           optional
+"select": <monitor-select>        optional
+.fi
+.RE
+.
+.IP
+The "columns" member, if present, defines the columns to be monitored in rows
+that match the conditions. If it is not present, all columns are monitored.
+.
+.IP
+The "where" member, if present, is a JSON array of <condition> and boolean
+values. If it is not present or is an empty array, an implicit true is
+assumed and updates on all rows are sent. <condition> is specified in
+Section 5.1 of the RFC with the following change: a condition can be
+either a 3-element JSON array as described in the RFC or a boolean
+value.
+.
+.IP
+<monitor-select> is an object with the following members:
+.
+.PP
+.RS
+.nf
+"initial": <boolean>              optional
+"insert": <boolean>               optional
+"delete": <boolean>               optional
+"modify": <boolean>               optional
+.fi
+.RE
+.
+.IP
+The contents of this object specify how the columns or table are to be
+monitored as explained in more detail below.
+.
+.IP
+The response object has the following members:
+.
+.PP
+.RS
+.nf
+"result": <table-updates2>
+"error": null
+"id": same "id" as request
+.fi
+.RE
+.
+.IP
+The <table-updates2> object is described in detail in Section 4.1.14. It
+contains the contents of the tables for which "initial" rows are selected.
+If no table's initial contents are requested, then "result" is an empty object.
+.
+.IP
+Subsequently, when changes to a specified table that match one of the conditions
+in monitor-cond-request are committed, the changes are automatically sent to the
+client using the "update2" monitor notification (see Section 4.1.14). This
+monitoring persists until the JSON-RPC session terminates or until the client
+sends a "monitor_cancel" JSON-RPC request.
+.
+.IP
+Each <monitor-cond-request> specifies one or more conditions and the manner in
+which the rows that match the conditions are to be monitored. The circumstances in
+which an "update2" notification is sent for a row within the table are determined by
+<monitor-select>:
+.
+.RS
+.IP \(bu
+If "initial" is omitted or true, every row in the original table that matches one of
+the conditions is sent as part of the response to the "monitor_cond" request.
+.IP \(bu
+If "insert" is omitted or true, "update2" notifications are sent for rows newly
+inserted into the table that match the conditions, and for rows modified so that
+their old version does not match the conditions but their new version does
+(that is, rows that are new in the client's replica table).
+.IP \(bu
+If "delete" is omitted or true, "update2" notifications are sent for rows deleted
+from the table that match the conditions, and for rows modified so that their old
+version matches the conditions but their new version does not (that is, rows
+deleted from the client's replica).
+.IP \(bu
+If "modify" is omitted or true, "update2" notifications are sent whenever a row
+whose old and new versions both match the conditions is modified.
 .RE
 .
 .IP
-Both monitor and monitor2 sessions can exist concurrently. However,
-monitor and monitor2 shares the same <json-value> parameter space; it
-must be unique among all monitor and monitor2 sessions.
+Both monitor and monitor_cond sessions can exist concurrently. However,
+monitor and monitor_cond share the same <json-value> parameter space; it
+must be unique among all monitor and monitor_cond sessions.
+.
+.IP "4.1.13. Monitor_cond_update"
+The "monitor_cond_update" request enables a client to change an existing
+"monitor_cond" replication of the database by specifying a new condition
+and columns for each replicated table. Currently, changing the set of columns
+is not supported.
+.
+.IP
+The request object has the following members:
+.
+.IP
+.RS
+.nf
+"method": "monitor_cond_update"
+"params": [<json-value>, <json-value>, <monitor-cond-update-requests>]
+"id": <nonnull-json-value>
+.fi
+.RE
+.
+.IP
+The first <json-value> parameter must be the value of an existing conditional
+monitoring session from this client. The second <json-value> in the params array
+is the new value requested for this session. The new value is valid only after
+"monitor_cond_update" is committed. A client can use the two values to distinguish
+update messages sent before the condition change from those sent after it. The
+<monitor-cond-update-requests> object maps the name of the table to an array of
+<monitor-cond-update-request>.
+.
+.IP
+Each <monitor-cond-update-request> is an object with the following members:
+.
+.IP
+.RS
+.nf
+"columns": [<column>*]         optional
+"where": [<condition>*]        optional
+.fi
+.RE
+.
+.IP
+The "columns" member specifies a new array of columns to be monitored
+(currently unsupported).
+.
+.IP
+The "where" member specifies a new array of conditions to be applied to this
+monitoring session.
+.
+.IP
+<condition> is specified in Section 5.1 in the RFC with the following change:
+A condition can be either a 3-element JSON array as described in the RFC or a
+boolean value. In case of an empty array an implicit true boolean value will be
+considered, and all rows will be monitored.
+.
+.IP
+The response object has the following members:
+.
+.IP
+.RS
+.nf
+"result": null
+"error": null
+"id": same "id" as request
+.fi
+.RE
+.IP
+Subsequent <table-updates2> notifications are described in detail in Section
+4.1.14. If inserted contents are requested by the original monitor_cond
+request, <table-updates2> will contain rows that match the new condition and
+do not match the old condition.
+If deleted contents are requested by the original monitor_cond request,
+<table-updates2> will contain rows that match the old condition and do not
+match the new condition.
+.
+.IP
+Changes according to the new conditions are automatically sent to the client
+using the "update2" monitor notification. Updates that result from a condition
+change are sent only after the client has received a response to the
+"monitor_cond_update" request.
 .
-.IP "4.1.13. Update2 notification"
+.IP "4.1.14. Update2 notification"
 The "update2" notification is sent by the server to the client to report
-changes in tables that are being monitored following a "monitor2" request
+changes in tables that are being monitored following a "monitor_cond" request
 as described above. The notification has the following members:
 .
 .RS
@@ -293,7 +490,8 @@  as described above. The notification has the following members:
 The <json-value> in "params" is the same as the value passed as the
 <json-value>  in "params" for the corresponding "monitor" request.
 <table-updates2> is an object that maps from a table name to a <table-update2>.
-A <table-update2> is an object that maps from row's UUID to a <row-update2> object. A <row-update2> is an object with one of the following members:
+A <table-update2> is an object that maps from a row's UUID to a <row-update2>
+object. A <row-update2> is an object with one of the following members:
 .
 .RS
 .IP "\(dqinitial\(dq: <row>"
@@ -335,8 +533,8 @@  elements, <row> includes the value from the new column.
 .
 .IP
 Initial views of rows are not presented in update2 notifications,
-but in the response object to the monitor2 request. The formatting of the
-<table-updates2> object, however, is the same in either case.
+but in the response object to the monitor_cond request. The formatting
+of the <table-updates2> object, however, is the same in either case.
 .
 .IP "5.1. Notation"
 For <condition>, RFC 7047 only allows the use of \fB!=\fR, \fB==\fR,