From patchwork Wed Apr 15 09:06:56 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Donnellan
X-Patchwork-Id: 1271021
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id 492Ghh3S6rz9s71
 for ; Wed, 15 Apr 2020 19:07:32 +1000 (AEST)
Authentication-Results: ozlabs.org;
 dmarc=none (p=none dis=none) header.from=linux.ibm.com
Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])
 by lists.ozlabs.org (Postfix) with ESMTP id 492Ghh1VX4zDqvx
 for ; Wed, 15 Apr 2020 19:07:32 +1000 (AEST)
X-Original-To: patchwork@lists.ozlabs.org
Delivered-To: patchwork@lists.ozlabs.org
Authentication-Results: lists.ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=linux.ibm.com
 (client-ip=148.163.156.1; helo=mx0a-001b2d01.pphosted.com;
 envelope-from=ajd@linux.ibm.com; receiver=)
Authentication-Results: lists.ozlabs.org;
 dmarc=none (p=none dis=none) header.from=linux.ibm.com
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com
 [148.163.156.1])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by lists.ozlabs.org (Postfix) with ESMTPS id 492GhY1RPFzDqjQ
 for ; Wed, 15 Apr 2020 19:07:24 +1000 (AEST)
Received: from pps.filterd (m0098394.ppops.net [127.0.0.1])
 by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id
 03F93kOP001748
 for ; Wed, 15 Apr 2020 05:07:21 -0400
Received: from e06smtp04.uk.ibm.com (e06smtp04.uk.ibm.com [195.75.94.100])
 by mx0a-001b2d01.pphosted.com with ESMTP id 30dnmaeh5s-1
 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT)
 for ; Wed, 15 Apr 2020 05:07:21 -0400
Received: from localhost
 by e06smtp04.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only!
 Violators will be prosecuted
 for from ; Wed, 15 Apr 2020 10:06:43 +0100
Received: from b06cxnps4075.portsmouth.uk.ibm.com (9.149.109.197)
 by e06smtp04.uk.ibm.com (192.168.101.134) with IBM ESMTP SMTP Gateway:
 Authorized Use Only! Violators will be prosecuted;
 (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256)
 Wed, 15 Apr 2020 10:06:41 +0100
Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com
 [9.149.105.59])
 by b06cxnps4075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP
 id 03F97Ggp56885294
 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK);
 Wed, 15 Apr 2020 09:07:16 GMT
Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1])
 by IMSVA (Postfix) with ESMTP id 64E95A4065;
 Wed, 15 Apr 2020 09:07:16 +0000 (GMT)
Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1])
 by IMSVA (Postfix) with ESMTP id 1129CA4055;
 Wed, 15 Apr 2020 09:07:16 +0000 (GMT)
Received: from ozlabs.au.ibm.com (unknown [9.192.253.14])
 by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP;
 Wed, 15 Apr 2020 09:07:16 +0000 (GMT)
Received: from intelligence.ibm.com (unknown [9.81.221.202])
 (using TLSv1.2 with cipher DHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by ozlabs.au.ibm.com (Postfix) with ESMTPSA id 05FC7A00A5;
 Wed, 15 Apr 2020 19:07:09 +1000 (AEST)
From: Andrew Donnellan
To: patchwork@lists.ozlabs.org
Subject: [PATCH] parser: Don't crash when From: is list email but has weird
 mangle format
Date: Wed, 15 Apr 2020 19:06:56 +1000
X-Mailer: git-send-email 2.20.1
MIME-Version: 1.0
X-TM-AS-GCONF: 00
x-cbid: 20041509-0016-0000-0000-00000304E23C
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 20041509-0017-0000-0000-00003368DF41
Message-Id: <20200415090656.21101-1-ajd@linux.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure
 engine=2.50.10434:6.0.138, 18.0.676
 definitions=2020-04-15_01:2020-04-14, 2020-04-15 signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
 mlxscore=0 bulkscore=0 mlxlogscore=999 impostorscore=0 lowpriorityscore=0
 priorityscore=1501 suspectscore=1 clxscore=1015 malwarescore=0 phishscore=0
 adultscore=0 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1
 engine=8.12.0-2003020000 definitions=main-2004150071
X-BeenThere: patchwork@lists.ozlabs.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Patchwork development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Jeremy Kerr
Errors-To: patchwork-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org
Sender: "Patchwork"

get_original_sender() tries to demangle DMARC-mangled From headers, in
the case where the email's From address is the list address. It knows how
to handle Google Groups and Mailman style mangling, where the original
submitter's name will be turned into e.g. "Andrew Donnellan via
linuxppc-dev".

If an email has the From header set to the list address but has a name
that doesn't include " via ", we'll throw an exception because
stripped_name hasn't been set. Sometimes this is because the list name is
seemingly empty, resulting in a mangled name like "Andrew Donnellan via"
without the space after "via" that we detect.

Handle this as well as we can instead, and add a test.

Fixes: 8279a84238c10 ("parser: Unmangle From: headers that have been mangled for DMARC purposes")
Reported-by: Jeremy Kerr
Signed-off-by: Andrew Donnellan
Reviewed-by: Stephen Finucane
---

Backport to stable?
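
For anyone reviewing without the tree checked out, here is a minimal standalone sketch of the three name formats described above. It is an illustration only: strip_via_suffix() is a hypothetical helper, not a function in patchwork, and it leaves out the rest of what get_original_sender() does (for example the X-Original-From lookup).

```python
def strip_via_suffix(name):
    """Return the submitter name from a mangled display name, or None.

    Hypothetical helper for illustration; not part of patchwork.
    """
    if ' via ' in name:
        # "<name> via <list>" (Mailman) or "'<name>' via <list>" (Google Groups)
        return name[:name.rfind(' via ')].strip().strip("'")
    if name.endswith(' via'):
        # The list name is empty, so nothing follows "via"
        return name[:name.rfind(' via')].strip().strip("'")
    # Unrecognised format: report that no name could be recovered
    return None


print(strip_via_suffix("Andrew Donnellan via linuxppc-dev"))  # Andrew Donnellan
print(strip_via_suffix("'Andrew Donnellan' via"))             # Andrew Donnellan
print(strip_via_suffix("Andrew Donnellan"))                   # None
```

Returning None in the last case, rather than leaving the variable unset, is what lets the caller carry on instead of raising.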
---
 patchwork/parser.py            |  7 +++++++
 patchwork/tests/test_parser.py | 23 +++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/patchwork/parser.py b/patchwork/parser.py
index a09fd754c4be..dce03a4ff827 100644
--- a/patchwork/parser.py
+++ b/patchwork/parser.py
@@ -373,6 +373,13 @@ def get_original_sender(mail, name, email):
         # Mailman uses the format " via "
         # Google Groups uses "'' via "
         stripped_name = name[:name.rfind(' via ')].strip().strip("'")
+    elif name.endswith(' via'):
+        # Sometimes this seems to happen (perhaps if Mailman isn't set up with
+        # any list name)
+        stripped_name = name[:name.rfind(' via')].strip().strip("'")
+    else:
+        # We've hit a format that we don't expect
+        stripped_name = None
 
     original_from = clean_header(mail.get('X-Original-From', ''))
     if original_from:
diff --git a/patchwork/tests/test_parser.py b/patchwork/tests/test_parser.py
index f5631bee8329..a60eb6b4bac9 100644
--- a/patchwork/tests/test_parser.py
+++ b/patchwork/tests/test_parser.py
@@ -366,6 +366,29 @@ class SenderCorrelationTest(TestCase):
         self.assertEqual(person_b._state.adding, False)
         self.assertEqual(person_b.id, person_a.id)
 
+    def test_weird_dmarc_munging(self):
+        project = create_project()
+        real_sender = 'Existing Sender '
+        munged_sender1 = "'Existing Sender' via <{}>".format(project.listemail)
+        munged_sender2 = "'Existing Sender' <{}>".format(project.listemail)
+
+        # Unmunged author
+        mail = self._create_email(real_sender)
+        person_a = get_or_create_author(mail, project)
+        person_a.save()
+
+        # Munged with no list name
+        mail = self._create_email(munged_sender1, None, None, real_sender)
+        person_b = get_or_create_author(mail, project)
+        self.assertEqual(person_b._state.adding, False)
+        self.assertEqual(person_b.id, person_a.id)
+
+        # Munged with no 'via'
+        mail = self._create_email(munged_sender2, None, None, real_sender)
+        person_b = get_or_create_author(mail, project)
+        self.assertEqual(person_b._state.adding, False)
+        self.assertEqual(person_b.id, person_a.id)
+
 
 class SeriesCorrelationTest(TestCase):
     """Validate correct behavior of find_series."""