From patchwork Wed Aug 21 20:31:49 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mark Michelson
X-Patchwork-Id: 1975096
From: Mark Michelson <mmichels@redhat.com>
To: dev@openvswitch.org
Date: Wed, 21 Aug 2024 16:31:49 -0400
Message-ID: <20240821203154.175838-1-mmichels@redhat.com>
Subject: [ovs-dev] [PATCH ovn] checkpatch: Add check for non-inclusive language.

This takes the approach of hard-coding the words in a text file, rather
than trying to use a dynamic list of words.
The inclusive naming project provides a way to download their word list
as JSON, but this is avoided here for a few reasons:

* Only the base forms of words are provided. We would need to analyze
  the part of speech and try to generate the other forms of the words.
* Some entries are more "example" than anything. For example, they
  provide "blackhat-whitehat" as a single entry, even though you're much
  more likely to come across the individual words, rather than this
  specific hyphenated variety. Similarly, "whitelist" is provided in the
  word list but "blacklist" is not.

If it turns out that the word list updates frequently, then it may be
worth moving to a more dynamic approach, at the expense of accuracy. For
now, this seems like a nice approach.

We do not consider this an error in checkpatch, but a warning. There are
some cases where non-inclusive words are acceptable (such as
ovs_abort(), or referring to the "master" branch of a third-party repo).
This is why the warning only suggests considering an alternative.

On a side note, running this patch through the updated checkpatch.py is
pretty funny.

Signed-off-by: Mark Michelson <mmichels@redhat.com>
---
 tests/checkpatch.at              | 38 ++++++++++++++++++++
 utilities/checkpatch.py          | 30 ++++++++++++++++
 utilities/excluded_word_list.txt | 59 ++++++++++++++++++++++++++++++++
 3 files changed, 127 insertions(+)
 create mode 100644 utilities/excluded_word_list.txt

diff --git a/tests/checkpatch.at b/tests/checkpatch.at
index 6ac0e51f3..79c02b229 100755
--- a/tests/checkpatch.at
+++ b/tests/checkpatch.at
@@ -642,5 +642,43 @@ try_checkpatch \
     #8 FILE: tests/something.at:1:
     C_H_E_C_K([as gw1 ovs-ofctl dump-flows br-int table=42 | grep "dl_dst=00:00:02:01:02:04" | wc -l], [0], [[1]])
 "
+AT_CLEANUP
+
+AT_SETUP([checkpatch - non-inclusive words])
+# This test does not extensively test every single word in the list of
+# non-inclusive words.
+
+# Try a simple exact match with a single word
+try_checkpatch \
+   "COMMON_PATCH_HEADER
+    +/* Let's be sure this change doesn't cripple our performance */
+    " \
+    "WARNING: Non-inclusive word 'cripple' found. Consider replacing.
+    #8 FILE: A.c:1:
+    /* Let's be sure this change doesn't cripple our performance */
+"
+# Put more than one word on the line, and use different forms of the word.
+try_checkpatch \
+   "COMMON_PATCH_HEADER
+    +/* The grandfathers are hallucinating again */
+    " \
+    "WARNING: Non-inclusive word 'grandfathers' found. Consider replacing.
+    WARNING: Non-inclusive word 'hallucinating' found. Consider replacing.
+    #8 FILE: A.c:1:
+    /* The grandfathers are hallucinating again */
+"
+
+# And finally, make sure punctuation, etc. don't interfere.
+try_checkpatch \
+   "COMMON_PATCH_HEADER
+    +/* Set up master/slave tribe, but don't abort! */
+    " \
+    "WARNING: Non-inclusive word 'abort' found. Consider replacing.
+    WARNING: Non-inclusive word 'master' found. Consider replacing.
+    WARNING: Non-inclusive word 'slave' found. Consider replacing.
+    WARNING: Non-inclusive word 'tribe' found. Consider replacing.
+    #8 FILE: A.c:1:
+    /* Set up master/slave tribe, but don't abort! */
+"
 AT_CLEANUP
diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
index 35204daa2..9a06cf0a1 100755
--- a/utilities/checkpatch.py
+++ b/utilities/checkpatch.py
@@ -19,6 +19,8 @@ import getopt
 import os
 import re
 import sys
+import functools
+from pathlib import Path

 RETURN_CHECK_INITIAL_STATE = 0
 RETURN_CHECK_STATE_WITH_RETURN = 1
@@ -582,6 +584,32 @@ def empty_return_with_brace(line):
     return False


+@functools.cache
+def load_excluded_words():
+    parent_dir = Path(__file__).parent
+    with open(parent_dir / "excluded_word_list.txt", "r") as f:
+        return [line.strip() for line in f]
+
+
+def contains_non_inclusive_words(line):
+    # This returns true if a word is found that falls afoul of our inclusive
+    # language guidelines. The list of words is sourced from the Tier 1, Tier 2,
+    # and Tier 3 word lists from https://inclusivenaming.org/word-lists/ .
+
+    excluded_words = load_excluded_words()
+
+    problem_found = False
+    for word in excluded_words:
+        match = re.search(rf'\b{word}\b', line, flags=re.IGNORECASE)
+        if match:
+            print_warning(
+                f"Non-inclusive word '{word}' found. Consider replacing."
+            )
+            problem_found = True
+
+    return problem_found
+
+
 file_checks = [
     {'regex': __regex_added_doc_rst,
      'check': check_new_docs_index},
@@ -668,6 +696,8 @@ checks = [
      lambda: print_warning("Use of hardcoded table= or"
                            " resubmit=(,) is discouraged in tests."
                            " Consider using MACRO instead.")},
+    {'regex': None, 'match_name': None,
+     'check': lambda x: contains_non_inclusive_words(x)},
 ]


diff --git a/utilities/excluded_word_list.txt b/utilities/excluded_word_list.txt
new file mode 100644
index 000000000..7a2ce4e09
--- /dev/null
+++ b/utilities/excluded_word_list.txt
@@ -0,0 +1,59 @@
+abort
+aborts
+aborting
+aborted
+abortion
+blackhat
+blackhats
+whitehat
+whitehats
+cripple
+cripples
+crippling
+cripplingly
+crippled
+grandfather
+grandfathers
+grandfathered
+grandfathering
+master
+masters
+slave
+slaves
+slaved
+slaving
+slavery
+slavish
+slavishly
+tribe
+tribes
+tribal
+tribally
+whitelist
+whitelists
+whitelisted
+whitelisting
+blacklist
+blacklists
+blacklisted
+blacklisting
+sanity-check
+sanity-checks
+sanity-checked
+sanity-checking
+blast-radius
+blast-radii
+hallucinate
+hallucinates
+hallucinated
+hallucinating
+hallucination
+man-hour
+man-hours
+man-in-the-middle
+men-in-the-middle
+segregate
+segregates
+segregated
+segregating
+segregation
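
Illustrative sketch, not part of the patch: the snippet below approximates
the matching loop in contains_non_inclusive_words() to show how the \b word
boundaries treat punctuation, underscores, and hyphenated entries. The names
SAMPLE_WORDS and find_non_inclusive_words() are hypothetical and exist only
for this illustration, the word list is a small hand-picked subset, and the
re.escape() call is an addition that the patch itself does not make.

```python
import re

# Hypothetical stand-in for a few entries of utilities/excluded_word_list.txt.
SAMPLE_WORDS = ["abort", "master", "slave", "sanity-check", "tribe"]


def find_non_inclusive_words(line, words=SAMPLE_WORDS):
    """Return the listed words that occur in `line` as whole words."""
    found = []
    for word in words:
        # re.escape() is an addition in this sketch. The patch interpolates
        # the word directly, which behaves the same for entries made only of
        # letters and hyphens.
        if re.search(rf"\b{re.escape(word)}\b", line, flags=re.IGNORECASE):
            found.append(word)
    return found


# Punctuation next to a word still counts as a word boundary.
print(find_non_inclusive_words("/* Set up master/slave tribe, but don't abort! */"))

# "_" is a word character, so there is no boundary before "abort" here.
print(find_non_inclusive_words("ovs_abort();"))

# Matching is case-insensitive, and hyphenated entries match literally.
print(find_non_inclusive_words("Run a Sanity-Check first"))
```

Because "_" counts as a word character, \babort\b does not match inside an
identifier such as ovs_abort, so in this sketch such identifiers would not
trigger the warning, while a bare "abort" in a comment would.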