From patchwork Wed Mar 26 15:58:15 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nishanth Aravamudan
X-Patchwork-Id: 333979
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from ozlabs.org (localhost [IPv6:::1])
	by ozlabs.org (Postfix) with ESMTP id 00FE3140194
	for ; Thu, 27 Mar 2014 02:59:16 +1100 (EST)
Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.152])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by ozlabs.org (Postfix) with ESMTPS id 3F617140098
	for ; Thu, 27 Mar 2014 02:58:42 +1100 (EST)
Received: from /spool/local
	by e34.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted
	for from ; Wed, 26 Mar 2014 09:58:38 -0600
Received: from d03dlp01.boulder.ibm.com (9.17.202.177)
	by e34.co.us.ibm.com (192.168.1.134) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted;
	Wed, 26 Mar 2014 09:58:34 -0600
Received: from b03cxnp08028.gho.boulder.ibm.com (b03cxnp08028.gho.boulder.ibm.com [9.17.130.20])
	by d03dlp01.boulder.ibm.com (Postfix) with ESMTP id 3FFF41FF0044
	for ; Wed, 26 Mar 2014 09:58:34 -0600 (MDT)
Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169])
	by b03cxnp08028.gho.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id s2QFwXU611338020
	for ; Wed, 26 Mar 2014 16:58:34 +0100
Received: from d03av03.boulder.ibm.com (localhost [127.0.0.1])
	by d03av03.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id s2QFwWwp023788
	for ; Wed, 26 Mar 2014 09:58:33 -0600
Received: from qbert.localdomain ([9.80.97.190])
	by d03av03.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id s2QFwV0W023715;
	Wed, 26 Mar 2014 09:58:32 -0600
Received: by qbert.localdomain (Postfix, from userid 1000)
	id 593FC48033C; Wed, 26 Mar 2014 08:58:15 -0700 (PDT)
Date: Wed, 26 Mar 2014 08:58:15 -0700
From: Nishanth Aravamudan
To: linux-mm@kvack.org
Subject: [RFC PATCH] hugetlb: ensure hugepage access is denied if hugepages
	are not supported
Message-ID: <20140326155815.GB15234@linux.vnet.ibm.com>
References: <20140324230256.GA18778@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <20140324230256.GA18778@linux.vnet.ibm.com>
X-Operating-System: Linux 3.11.0-18-generic (x86_64)
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 14032615-1542-0000-0000-000000A65C69
Cc: paulus@samba.org, linuxppc-dev@lists.ozlabs.org, anton@samba.org,
	nyc@holomorphy.com
X-BeenThere: linuxppc-dev@lists.ozlabs.org
X-Mailman-Version: 2.1.16
Precedence: list
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org
Sender: "Linuxppc-dev"

On 24.03.2014 [16:02:56 -0700], Nishanth Aravamudan wrote:
> In KVM guests on Power, if the guest is not backed by hugepages, we see
> the following in the guest:
>
> AnonHugePages:         0 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:         64 kB
>
> This seems like a configuration issue -- why is a hstate of 64k being
> registered?
>
> I did some debugging and found that the following does trigger,
> mm/hugetlb.c::hugetlb_init():
>
>         /* Some platform decide whether they support huge pages at boot
>          * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
>          * there is no such support
>          */
>         if (HPAGE_SHIFT == 0)
>                 return 0;
>
> That check is only during init-time. So we don't support hugepages, but
> none of the hugetlb APIs actually check this condition (HPAGE_SHIFT ==
> 0), so /proc/meminfo above falsely indicates there is a valid hstate (at
> least one). But note that there is no /sys/kernel/mm/hugepages meaning
> no hstate was actually registered.
>
> Further, it turns out that huge_page_order(default_hstate) is 0, so
> hugetlb_report_meminfo is doing:
>
>         1UL << (huge_page_order(h) + PAGE_SHIFT - 10)
>
> which ends up just doing 1 << (PAGE_SHIFT - 10) and since the base page
> size is 64k, we report a hugepage size of 64k... And allow the user to
> allocate hugepages via the sysctl, etc.
>
> What's the right thing to do here?
>
> 1) Should we add checks for HPAGE_SHIFT == 0 to all the hugetlb APIs? It
> seems like HPAGE_SHIFT == 0 should be the equivalent, functionally, of
> the config options being off. This seems like a lot of overhead, though,
> to put everywhere, so maybe I can do it in an arch-specific macro, that
> in asm-generic defaults to 0 (and so will hopefully be compiled out?).
>
> 2) What should hugetlbfs do when HPAGE_SHIFT == 0? Should it be
> mountable? Obviously if it's mountable, we can't create files there
> (since the fs will report insufficient space).

[1] Here is my solution to this. Comments appreciated!
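
As a sanity check of the arithmetic quoted above, here is a minimal
user-space sketch of the Hugepagesize computation. It is not kernel code:
hugepagesize_kb() and EXAMPLE_PAGE_SHIFT are illustrative names, and the
value 16 is an assumption that simply encodes the 64k base page size
mentioned in the quoted text.

```c
#include <assert.h>

/*
 * Assumption for illustration: 64k base pages, i.e. PAGE_SHIFT == 16.
 * This is a stand-in for the kernel's PAGE_SHIFT, not the real macro.
 */
#define EXAMPLE_PAGE_SHIFT 16

/*
 * Mirrors the expression quoted from hugetlb_report_meminfo():
 *	1UL << (huge_page_order(h) + PAGE_SHIFT - 10)
 * The "- 10" converts bytes to kB, so the shift must not go negative.
 */
static unsigned long hugepagesize_kb(unsigned int order, unsigned int page_shift)
{
	assert(order + page_shift >= 10);
	return 1UL << (order + page_shift - 10);
}
```

With an order of 0, as reported for the unregistered default hstate,
hugepagesize_kb(0, EXAMPLE_PAGE_SHIFT) evaluates to 64, which matches the
"Hugepagesize: 64 kB" line: the reported size is just the base page size.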

In KVM guests on Power, in a guest not backed by hugepages, we see the
following:

AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:         64 kB

HPAGE_SHIFT == 0 in this configuration, which indicates that hugepages
are not supported at boot-time, but this is only checked in
hugetlb_init(). Extract the check to a helper function, and use it in a
few relevant places.

This does make hugetlbfs not supported in this environment. I believe
this is fine, as there are no valid hugepages and that won't change at
runtime.

Signed-off-by: Nishanth Aravamudan

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index d19b30a..c7aa477 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -1017,6 +1017,11 @@ static int __init init_hugetlbfs_fs(void)
 	int error;
 	int i;
 
+	if (!hugepages_supported()) {
+		printk(KERN_ERR "hugetlbfs: Disabling because there are no supported page sizes\n");
+		return -ENOTSUPP;
+	}
+
 	error = bdi_init(&hugetlbfs_backing_dev_info);
 	if (error)
 		return error;
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 8c43cc4..0aea8de 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -450,4 +450,14 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h,
 	return ptl;
 }
 
+static inline bool hugepages_supported(void)
+{
+	/*
+	 * Some platform decide whether they support huge pages at boot
+	 * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
+	 * there is no such support
+	 */
+	return HPAGE_SHIFT != 0;
+}
+
 #endif /* _LINUX_HUGETLB_H */
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c01cb9f..1c99585 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1949,11 +1949,7 @@ module_exit(hugetlb_exit);
 
 static int __init hugetlb_init(void)
 {
-	/* Some platform decide whether they support huge pages at boot
-	 * time. On these, such as powerpc, HPAGE_SHIFT is set to 0 when
-	 * there is no such support
-	 */
-	if (HPAGE_SHIFT == 0)
+	if (!hugepages_supported())
 		return 0;
 
 	if (!size_to_hstate(default_hstate_size)) {
@@ -2069,6 +2065,9 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
 	unsigned long tmp;
 	int ret;
 
+	if (!hugepages_supported())
+		return -ENOTSUPP;
+
 	tmp = h->max_huge_pages;
 
 	if (write && h->order >= MAX_ORDER)
@@ -2122,6 +2121,9 @@ int hugetlb_overcommit_handler(struct ctl_table *table, int write,
 	unsigned long tmp;
 	int ret;
 
+	if (!hugepages_supported())
+		return -ENOTSUPP;
+
 	tmp = h->nr_overcommit_huge_pages;
 
 	if (write && h->order >= MAX_ORDER)
@@ -2147,6 +2149,8 @@ out:
 void hugetlb_report_meminfo(struct seq_file *m)
 {
 	struct hstate *h = &default_hstate;
+	if (!hugepages_supported())
+		return;
 	seq_printf(m,
 			"HugePages_Total: %5lu\n"
 			"HugePages_Free: %5lu\n"
@@ -2163,6 +2167,8 @@ void hugetlb_report_meminfo(struct seq_file *m)
 int hugetlb_report_node_meminfo(int nid, char *buf)
 {
 	struct hstate *h = &default_hstate;
+	if (!hugepages_supported())
+		return 0;
 	return sprintf(buf,
 		"Node %d HugePages_Total: %5u\n"
 		"Node %d HugePages_Free: %5u\n"
@@ -2177,6 +2183,9 @@ void hugetlb_show_meminfo(void)
 	struct hstate *h;
 	int nid;
 
+	if (!hugepages_supported())
+		return;
+
 	for_each_node_state(nid, N_MEMORY)
 		for_each_hstate(h)
 			pr_info("Node %d hugepages_total=%u hugepages_free=%u hugepages_surp=%u hugepages_size=%lukB\n",