Message ID | 1470756370-136415-1-git-send-email-seth.forshee@canonical.com |
---|---|
State | New |
Headers | show |
Is working for me...
On Tue, Aug 09, 2016 at 10:26:10AM -0500, Seth Forshee wrote: > The bug number regular expression matching correctly identifies > lines containing launchpad bug numbers, but once such a line is > identified it treats any string matching "#[0-9]+" as a bug > number. For example, in this changelog entry #1 is identified as > a bug number: > > * [LTCTest][Opal][OP820] Machine crashed with Oops: Kernel access of bad area, > sig: 11 [#1] while executing Froze PE Error injection (LP: #1603449) > > Rather than matching the "LP: #NNNNNN" string and the bug number > separately, they can be combined into one regex which matches > against the full string but returns only the bug number portion > of the string. I believe the current implementation is designed to work wth changelog entries like the following, which would be broken by your patch: linux/xenial$ egrep 'LP: #[0-9].*#' debian.master/changelog - LP: #1395877, #1410480 - LP: #1395877, #1410480 - LP: #1310512, #1320070 - LP: #1085766, #462111 - LP: #1017717, #225 - LP: #978038, #987371 - LP: #915941, #918212 - LP: #915941, #918212 - LP: #907377, #911236 - LP: #737388, #782389, #794642 [...] How about implementing it like this?: First trim away everything up to and including the first "LP:" from the line buffer, then parse every remaining instance of '#([0-9]+)' and consider each to be an LP bug number. -Kamal > Since re.findall will return an empty list if > there are no matches, all that is needed is to append the list > returned by findall with this regex to the bug list for each line > of the changelog. > > Signed-off-by: Seth Forshee <seth.forshee@canonical.com> > --- > ktl/debian.py | 10 +++------- > 1 file changed, 3 insertions(+), 7 deletions(-) > > diff --git a/ktl/debian.py b/ktl/debian.py > index dd2703a..76435aa 100644 > --- a/ktl/debian.py > +++ b/ktl/debian.py > @@ -30,8 +30,7 @@ class Debian: > > package_rc = compile("^(linux[-\S])*.*$") > ver_rc = compile("^linux[-\S]* \(([0-9]+\.[0-9]+\.[0-9]+[-\.][0-9]+\.[0-9]+[~a-z0-9]*)\).*$") > - bug_line_rc = compile("LP:\s#[0-9]+") > - bug_rc = compile("#([0-9]+)") > + bug_rc = compile("LP:\s#([0-9]+)") > > # debian_directories > # > @@ -176,11 +175,8 @@ class Debian: > bugs = [] > else: > # find bug numbers and append them to the list > - bug_line_matches = cls.bug_line_rc.search(line) > - if bug_line_matches: > - bug_matches = findall(cls.bug_rc,line) > - if bug_matches: > - bugs.extend( bug_matches ) > + bug_matches = findall(cls.bug_rc,line) > + bugs.extend(bug_matches) > > content.append(line) > > -- > 2.7.4 > > > -- > kernel-team mailing list > kernel-team@lists.ubuntu.com > https://lists.ubuntu.com/mailman/listinfo/kernel-team
On Wed, Aug 10, 2016 at 07:34:51AM -0700, Kamal Mostafa wrote: > On Tue, Aug 09, 2016 at 10:26:10AM -0500, Seth Forshee wrote: > > The bug number regular expression matching correctly identifies > > lines containing launchpad bug numbers, but once such a line is > > identified it treats any string matching "#[0-9]+" as a bug > > number. For example, in this changelog entry #1 is identified as > > a bug number: > > > > * [LTCTest][Opal][OP820] Machine crashed with Oops: Kernel access of bad area, > > sig: 11 [#1] while executing Froze PE Error injection (LP: #1603449) > > > > Rather than matching the "LP: #NNNNNN" string and the bug number > > separately, they can be combined into one regex which matches > > against the full string but returns only the bug number portion > > of the string. > > I believe the current implementation is designed to work wth changelog > entries like the following, which would be broken by your patch: > > linux/xenial$ egrep 'LP: #[0-9].*#' debian.master/changelog > - LP: #1395877, #1410480 > - LP: #1395877, #1410480 > - LP: #1310512, #1320070 > - LP: #1085766, #462111 > - LP: #1017717, #225 > - LP: #978038, #987371 > - LP: #915941, #918212 > - LP: #915941, #918212 > - LP: #907377, #911236 > - LP: #737388, #782389, #794642 > [...] > > How about implementing it like this?: > > First trim away everything up to and including the first "LP:" from > the line buffer, then parse every remaining instance of '#([0-9]+)' and > consider each to be an LP bug number. Okay, I didn't realize they could be structured that way. In that case your suggestion makes more sense. Hopefully it's not valid for free form text to follow a "LP: #NNNNNNN" tag? Thanks, Seth
On Wed, Aug 10, 2016 at 10:30:39AM -0500, Seth Forshee wrote: > On Wed, Aug 10, 2016 at 07:34:51AM -0700, Kamal Mostafa wrote: > > On Tue, Aug 09, 2016 at 10:26:10AM -0500, Seth Forshee wrote: > > > The bug number regular expression matching correctly identifies > > > lines containing launchpad bug numbers, but once such a line is > > > identified it treats any string matching "#[0-9]+" as a bug > > > number. For example, in this changelog entry #1 is identified as > > > a bug number: > > > > > > * [LTCTest][Opal][OP820] Machine crashed with Oops: Kernel access of bad area, > > > sig: 11 [#1] while executing Froze PE Error injection (LP: #1603449) > > > > > > Rather than matching the "LP: #NNNNNN" string and the bug number > > > separately, they can be combined into one regex which matches > > > against the full string but returns only the bug number portion > > > of the string. > > > > I believe the current implementation is designed to work wth changelog > > entries like the following, which would be broken by your patch: > > > > linux/xenial$ egrep 'LP: #[0-9].*#' debian.master/changelog > > - LP: #1395877, #1410480 > > - LP: #1395877, #1410480 > > - LP: #1310512, #1320070 > > - LP: #1085766, #462111 > > - LP: #1017717, #225 > > - LP: #978038, #987371 > > - LP: #915941, #918212 > > - LP: #915941, #918212 > > - LP: #907377, #911236 > > - LP: #737388, #782389, #794642 > > [...] > > > > How about implementing it like this?: > > > > First trim away everything up to and including the first "LP:" from > > the line buffer, then parse every remaining instance of '#([0-9]+)' and > > consider each to be an LP bug number. > > Okay, I didn't realize they could be structured that way. In that case > your suggestion makes more sense. Hopefully it's not valid for free form > text to follow a "LP: #NNNNNNN" tag? I don't find any instances of that (extraneous free form text) in our changelogs -- just lines of the form: - LP: #737388, #782389, #794642 -Kamal
diff --git a/ktl/debian.py b/ktl/debian.py index dd2703a..76435aa 100644 --- a/ktl/debian.py +++ b/ktl/debian.py @@ -30,8 +30,7 @@ class Debian: package_rc = compile("^(linux[-\S])*.*$") ver_rc = compile("^linux[-\S]* \(([0-9]+\.[0-9]+\.[0-9]+[-\.][0-9]+\.[0-9]+[~a-z0-9]*)\).*$") - bug_line_rc = compile("LP:\s#[0-9]+") - bug_rc = compile("#([0-9]+)") + bug_rc = compile("LP:\s#([0-9]+)") # debian_directories # @@ -176,11 +175,8 @@ class Debian: bugs = [] else: # find bug numbers and append them to the list - bug_line_matches = cls.bug_line_rc.search(line) - if bug_line_matches: - bug_matches = findall(cls.bug_rc,line) - if bug_matches: - bugs.extend( bug_matches ) + bug_matches = findall(cls.bug_rc,line) + bugs.extend(bug_matches) content.append(line)
The bug number regular expression matching correctly identifies lines containing launchpad bug numbers, but once such a line is identified it treats any string matching "#[0-9]+" as a bug number. For example, in this changelog entry #1 is identified as a bug number: * [LTCTest][Opal][OP820] Machine crashed with Oops: Kernel access of bad area, sig: 11 [#1] while executing Froze PE Error injection (LP: #1603449) Rather than matching the "LP: #NNNNNN" string and the bug number separately, they can be combined into one regex which matches against the full string but returns only the bug number portion of the string. Since re.findall will return an empty list if there are no matches, all that is needed is to append the list returned by findall with this regex to the bug list for each line of the changelog. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> --- ktl/debian.py | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-)