From: Nicholas Piggin
To: linuxppc-dev@lists.ozlabs.org
Cc: Nicholas Piggin
Subject: [PATCH] powerpc: implement clear_bit_unlock_is_negative_byte()
Date: Wed, 4 Jan 2017 04:58:28 +1000
Message-Id: <20170103185828.31311-1-npiggin@gmail.com>
X-Patchwork-Id: 710611

Commit b91e1302ad9b8 ("mm: optimize PageWaiters bit use for
unlock_page()") added a special bitop function to speed up
unlock_page().
Implement this for powerpc.

This improves the unlock_page() core code from this:

	li	9,1
	lwsync
1:	ldarx	10,0,3,0
	andc	10,10,9
	stdcx.	10,0,3
	bne-	1b
	ori	2,2,0
	ld	9,0(3)
	andi.	10,9,0x80
	beqlr
	li	4,0
	b	wake_up_page_bit

To this:

	li	10,1
	lwsync
1:	ldarx	9,0,3,0
	andc	9,9,10
	stdcx.	9,0,3
	bne-	1b
	andi.	10,9,0x80
	beqlr
	li	4,0
	b	wake_up_page_bit

In a test of elapsed time for dd writing into 16GB of already-dirty
pagecache on a POWER8 with 4K pages, which has one unlock_page per 4kB
this patch reduced overhead by 1.1%:

    N           Min           Max        Median           Avg        Stddev
x  19         2.578         2.619         2.594         2.595         0.011
+  19         2.552         2.592         2.564         2.565         0.008
Difference at 95.0% confidence
	-0.030 +/- 0.006
	-1.142% +/- 0.243%

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/include/asm/bitops.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/arch/powerpc/include/asm/bitops.h b/arch/powerpc/include/asm/bitops.h
index 59abc620f8e8..9add12ee13dd 100644
--- a/arch/powerpc/include/asm/bitops.h
+++ b/arch/powerpc/include/asm/bitops.h
@@ -154,6 +154,31 @@ static __inline__ int test_and_change_bit(unsigned long nr,
 	return test_and_change_bits(BIT_MASK(nr), addr + BIT_WORD(nr)) != 0;
 }
 
+static __inline__ unsigned long clear_bit_unlock_return_word(int nr,
+						volatile unsigned long *addr)
+{
+	unsigned long old, t;
+	unsigned long *p = (unsigned long *)addr + BIT_WORD(nr);
+	unsigned long mask = BIT_MASK(nr);
+
+	__asm__ __volatile__ (
+	PPC_RELEASE_BARRIER
+"1:"	PPC_LLARX(%0,0,%3,0) "\n"
+	"andc %1,%0,%2\n"
+	PPC405_ERR77(0,%3)
+	PPC_STLCX "%1,0,%3\n"
+	"bne- 1b\n"
+	: "=&r" (old), "=&r" (t)
+	: "r" (mask), "r" (p)
+	: "cc", "memory");
+
+	return old;
+}
+
+/* This is a special function for mm/filemap.c */
+#define clear_bit_unlock_is_negative_byte(nr, addr)	\
+	(clear_bit_unlock_return_word(nr, addr) & BIT_MASK(PG_waiters))
+
 #include <asm-generic/bitops/non-atomic.h>
 
 static __inline__ void __clear_bit_unlock(int nr, volatile unsigned long *addr)
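
For readers less familiar with the ll/sc loop above, the intended
semantics can be modelled in portable C11 atomics: clear one bit with
release ordering, get back the word as it was before the clear, and
test the waiters bit in that old value, so no second load is needed.
This is only a rough sketch and not part of the patch. The demo_* names
and the DEMO_LOCKED/DEMO_WAITERS bit numbers are hypothetical (bit 7 is
assumed only because the disassembly tests 0x80), and a release
fetch-and is assumed to be an adequate stand-in for
PPC_RELEASE_BARRIER followed by the larx/stcx. loop.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define BIT_MASK(nr)	(1UL << ((nr) % BITS_PER_LONG))
#define BIT_WORD(nr)	((nr) / BITS_PER_LONG)

/* Hypothetical bit numbers, chosen for illustration only. */
enum { DEMO_LOCKED = 0, DEMO_WAITERS = 7 };

/*
 * Atomically clear bit nr with release ordering and return the word
 * as it was before the clear (assumption: models the asm loop above).
 */
static unsigned long demo_clear_bit_unlock_return_word(int nr,
					_Atomic unsigned long *addr)
{
	return atomic_fetch_and_explicit(addr + BIT_WORD(nr),
					 ~BIT_MASK(nr),
					 memory_order_release);
}

/* True if the waiters bit was set in the word at the time of unlock. */
static bool demo_clear_bit_unlock_is_negative_byte(int nr,
					_Atomic unsigned long *addr)
{
	return (demo_clear_bit_unlock_return_word(nr, addr) &
		BIT_MASK(DEMO_WAITERS)) != 0;
}

/* Usage example: unlock once with a waiter present, once without. */
void demo_selfcheck(void)
{
	_Atomic unsigned long flags =
		BIT_MASK(DEMO_LOCKED) | BIT_MASK(DEMO_WAITERS);

	/* Waiter present: the caller would go on to wake it. */
	assert(demo_clear_bit_unlock_is_negative_byte(DEMO_LOCKED, &flags));
	assert(atomic_load(&flags) == BIT_MASK(DEMO_WAITERS));

	/* No waiter: the unlock is the whole job. */
	atomic_store(&flags, BIT_MASK(DEMO_LOCKED));
	assert(!demo_clear_bit_unlock_is_negative_byte(DEMO_LOCKED, &flags));
	assert(atomic_load(&flags) == 0UL);
}
```

Because the old word comes back from the atomic operation itself, the
caller can branch on the waiters bit directly, which is what removes
the separate ld from the "after" listing.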