Message ID | 20180218125116.5032-1-ligs@dtdream.com |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev,v2] python: Fix decoding error when the received data is larger than 4096. | expand |
On Sun, Feb 18, 2018 at 08:51:16PM +0800, Guoshuai Li wrote: > It can only receive 4096 bytes of data each time in jsonrpc, > when there are similar and Chinese characters occupy multiple bytes, > it may receive half a character, this time the decoding will be abnormal. > We need to receive the completed character to decode. > > Signed-off-by: Guoshuai Li <ligs@dtdream.com> Thanks for finding and fixing the problem. I am worried that, with this patch, any genuine encoding errors will be ignored. For example, bytes 0xc0 and 0xc1 never appear in a valid UTF-8 sequence, yet I believe that this code will silently ignore them, along with whatever other data appears along with them. Do you have an idea for how to deal with that? Thanks, Ben.
> On Sun, Feb 18, 2018 at 08:51:16PM +0800, Guoshuai Li wrote: >> It can only receive 4096 bytes of data each time in jsonrpc, >> when there are similar and Chinese characters occupy multiple bytes, >> it may receive half a character, this time the decoding will be abnormal. >> We need to receive the completed character to decode. >> >> Signed-off-by: Guoshuai Li <ligs@dtdream.com> > Thanks for finding and fixing the problem. > > I am worried that, with this patch, any genuine encoding errors will be > ignored. For example, bytes 0xc0 and 0xc1 never appear in a valid UTF-8 > sequence, yet I believe that this code will silently ignore them, along > with whatever other data appears along with them. Do you have an idea > for how to deal with that? > > Thanks, > > Ben. Hello Ben I sent a new version: https://mail.openvswitch.org/pipermail/ovs-dev/2018-February/344705.html But I am not satisfied with it. This patch can not solve the received data length is exactly a multiple of 4096, and with 0xc0 and 0xc1. I do not know how to judge that I have received all the data. I think I need more people's help. Thanks~!
On Mon, Feb 26, 2018 at 02:02:05PM +0800, Guoshuai Li wrote: > >On Sun, Feb 18, 2018 at 08:51:16PM +0800, Guoshuai Li wrote: > >>It can only receive 4096 bytes of data each time in jsonrpc, > >>when there are similar and Chinese characters occupy multiple bytes, > >>it may receive half a character, this time the decoding will be abnormal. > >>We need to receive the completed character to decode. > >> > >>Signed-off-by: Guoshuai Li <ligs@dtdream.com> > >Thanks for finding and fixing the problem. > > > >I am worried that, with this patch, any genuine encoding errors will be > >ignored. For example, bytes 0xc0 and 0xc1 never appear in a valid UTF-8 > >sequence, yet I believe that this code will silently ignore them, along > >with whatever other data appears along with them. Do you have an idea > >for how to deal with that? > > > >Thanks, > > > >Ben. > > Hello Ben > > I sent a new version: > https://mail.openvswitch.org/pipermail/ovs-dev/2018-February/344705.html > But I am not satisfied with it. > This patch can not solve the received data length is exactly a multiple of > 4096, and with 0xc0 and 0xc1. > I do not know how to judge that I have received all the data. > I think I need more people's help. It looks like we should be using IncrementalDecoder: https://docs.python.org/3/library/codecs.html?highlight=decode#codecs.IncrementalDecoder
Thanks Ben, this is really a good interface, it makes the code change is simple. A new patch: https://mail.openvswitch.org/pipermail/ovs-dev/2018-March/344845.html > It looks like we should be using IncrementalDecoder: > https://docs.python.org/3/library/codecs.html?highlight=decode#codecs.IncrementalDecoder
Thanks! I'll review it. On Thu, Mar 01, 2018 at 02:31:37PM +0800, Guoshuai Li wrote: > > Thanks Ben, this is really a good interface, it makes the code change is > simple. > > A new patch: > https://mail.openvswitch.org/pipermail/ovs-dev/2018-March/344845.html > > >It looks like we should be using IncrementalDecoder: > >https://docs.python.org/3/library/codecs.html?highlight=decode#codecs.IncrementalDecoder >
diff --git a/python/ovs/jsonrpc.py b/python/ovs/jsonrpc.py index d284190e0..a46821093 100644 --- a/python/ovs/jsonrpc.py +++ b/python/ovs/jsonrpc.py @@ -262,17 +262,21 @@ class Connection(object): if self.status: return self.status, None + recv_buffer = bytearray() while True: if not self.input: + recv_data = None error, data = self.stream.recv(4096) + recv_buffer.extend(bytes(data)) # Python 3 has separate types for strings and bytes. We # received bytes from a socket. We expect it to be string # data, so we convert it here as soon as possible. if data and not error: try: - data = data.decode('utf-8') + recv_data = recv_buffer.decode('utf-8') + recv_buffer = bytearray() except UnicodeError: - error = errno.EILSEQ + continue if error: if (sys.platform == "win32" and error == errno.WSAEWOULDBLOCK): @@ -287,12 +291,12 @@ class Connection(object): % (self.name, os.strerror(error))) self.error(error) return self.status, None - elif not data: + elif not recv_data: self.error(EOF) return EOF, None else: - self.input += data - self.received_bytes += len(data) + self.input += recv_data + self.received_bytes += len(recv_data) else: if self.parser is None: self.parser = ovs.json.Parser() diff --git a/tests/ovsdb-idl.at b/tests/ovsdb-idl.at index 59b2c1991..d63b7758f 100644 --- a/tests/ovsdb-idl.at +++ b/tests/ovsdb-idl.at @@ -328,6 +328,40 @@ OVSDB_CHECK_IDL([simple idl, writing via IDL with unicode], 003: done ]]) +# same as OVSDB_CHECK_IDL but uses the Python IDL implementation. +m4_define([OVSDB_CHECK_IDL_PYN_WITH_EXPOUT], + [AT_SETUP([$1]) + AT_SKIP_IF([test $7 = no]) + AT_KEYWORDS([ovsdb server idl positive Python $5]) + AT_CHECK([ovsdb_start_idltest]) + m4_if([$2], [], [], + [AT_CHECK([ovsdb-client transact unix:socket $2], [0], [ignore], [ignore])]) + AT_CHECK([$8 $srcdir/test-ovsdb.py -t10 idl $srcdir/idltest.ovsschema unix:socket $3], + [0], [stdout], [ignore]) + echo "$4" > expout + AT_CHECK([sort stdout | uuidfilt]m4_if([$6],,, [[| $6]]), + [0], [expout]) + OVSDB_SERVER_SHUTDOWN + AT_CLEANUP]) + +m4_define([OVSDB_CHECK_IDL_PY_WITH_EXPOUT], + [OVSDB_CHECK_IDL_PYN_WITH_EXPOUT([$1 - Python2], [$2], [$3], [$4], [$5], [$6], + [$HAVE_PYTHON], [$PYTHON]) + OVSDB_CHECK_IDL_PYN_WITH_EXPOUT([$1 - Python3], [$2], [$3], [$4], [$5], [$6], + [$HAVE_PYTHON3], [$PYTHON3])]) + +OVSDB_CHECK_IDL_PY_WITH_EXPOUT([simple idl, writing large data via IDL with unicode], + [['["idltest", + {"op": "insert", + "table": "simple", + "row": {"s": "'$(printf "测试超过四千零九十六个字节的中文字符串以使解码出现问题。%.0s" {1..50})'"}}]']], + [['set 0 b 1, insert 1, set 1 s '$(printf "测试超过四千零九十六个字节的中文字符串以使解码出现问题。%.0s" {1..100})'']], + [[000: i=0 r=0 b=false s=$(printf "测试超过四千零九十六个字节的中文字符串以使解码出现问题。%.0s" {1..50}) u=<0> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<1> +001: commit, status=success +002: i=0 r=0 b=true s=$(printf "测试超过四千零九十六个字节的中文字符串以使解码出现问题。%.0s" {1..50}) u=<0> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<1> +002: i=1 r=0 b=false s=$(printf "测试超过四千零九十六个字节的中文字符串以使解码出现问题。%.0s" {1..100}) u=<0> ia=[] ra=[] ba=[] sa=[] ua=[] uuid=<2> +003: done]]) + OVSDB_CHECK_IDL([simple idl, handling verification failure], [['["idltest", {"op": "insert",
It can only receive 4096 bytes of data each time in jsonrpc, when there are similar and Chinese characters occupy multiple bytes, it may receive half a character, this time the decoding will be abnormal. We need to receive the completed character to decode. Signed-off-by: Guoshuai Li <ligs@dtdream.com> --- python/ovs/jsonrpc.py | 14 +++++++++----- tests/ovsdb-idl.at | 34 ++++++++++++++++++++++++++++++++++ 2 files changed, 43 insertions(+), 5 deletions(-)