mbox series

[RFC,v2,0/3] mm/gup: fix gup_fast with dynamic page table folding

Message ID 20200907180058.64880-1-gerald.schaefer@linux.ibm.com (mailing list archive)
Headers show
Series mm/gup: fix gup_fast with dynamic page table folding | expand

Message

Gerald Schaefer Sept. 7, 2020, 6 p.m. UTC
This is v2 of an RFC previously discussed here:
https://lore.kernel.org/lkml/20200828140314.8556-1-gerald.schaefer@linux.ibm.com/

Patch 1 is a fix for a regression in gup_fast on s390, after our conversion
to common gup_fast code. It will introduce special helper functions
pXd_addr_end_folded(), which have to be used in places where pagetable walk
is done w/o lock and with READ_ONCE, so currently only in gup_fast.

Patch 2 is an attempt to make that more generic, i.e. change pXd_addr_end()
themselves by adding an extra pXd value parameter. That was suggested by
Jason during v1 discussion, because he is already thinking of some other
places where he might want to switch to the READ_ONCE logic for pagetable
walks. In general, that would be the cleanest / safest solution, but there
is some impact on other architectures and common code, hence the new and
greatly enlarged recipient list.

Patch 3 is a "nice to have" add-on, which makes pXd_addr_end() inline
functions instead of #defines, so that we get some type checking for the
new pXd value parameter.

Not sure about Fixes/stable tags for the generic solution. Only patch 1
fixes a real bug on s390, and has Fixes/stable tags. Patches 2 + 3 might
still be nice to have in stable, to ease future backports, but I guess
"nice to have" does not really qualify for stable backports.

Changes in v2:
- Pick option 2 from v1 discussion (pXd_addr_end_folded helpers)
- Add patch 2 + 3 for more generic approach

Alexander Gordeev (3):
  mm/gup: fix gup_fast with dynamic page table folding
  mm: make pXd_addr_end() functions page-table entry aware
  mm: make generic pXd_addr_end() macros inline functions

 arch/arm/include/asm/pgtable-2level.h    |  2 +-
 arch/arm/mm/idmap.c                      |  6 ++--
 arch/arm/mm/mmu.c                        |  8 ++---
 arch/arm64/kernel/hibernate.c            | 16 +++++----
 arch/arm64/kvm/mmu.c                     | 16 ++++-----
 arch/arm64/mm/kasan_init.c               |  8 ++---
 arch/arm64/mm/mmu.c                      | 25 +++++++-------
 arch/powerpc/mm/book3s64/radix_pgtable.c |  7 ++--
 arch/powerpc/mm/hugetlbpage.c            |  6 ++--
 arch/s390/include/asm/pgtable.h          | 42 ++++++++++++++++++++++++
 arch/s390/mm/page-states.c               |  8 ++---
 arch/s390/mm/pageattr.c                  |  8 ++---
 arch/s390/mm/vmem.c                      |  8 ++---
 arch/sparc/mm/hugetlbpage.c              |  6 ++--
 arch/um/kernel/tlb.c                     |  8 ++---
 arch/x86/mm/init_64.c                    | 15 ++++-----
 arch/x86/mm/kasan_init_64.c              | 16 ++++-----
 include/asm-generic/pgtable-nop4d.h      |  2 +-
 include/asm-generic/pgtable-nopmd.h      |  2 +-
 include/asm-generic/pgtable-nopud.h      |  2 +-
 include/linux/pgtable.h                  | 38 ++++++++++++---------
 mm/gup.c                                 |  8 ++---
 mm/ioremap.c                             |  8 ++---
 mm/kasan/init.c                          | 17 +++++-----
 mm/madvise.c                             |  4 +--
 mm/memory.c                              | 40 +++++++++++-----------
 mm/mlock.c                               | 18 +++++++---
 mm/mprotect.c                            |  8 ++---
 mm/pagewalk.c                            |  8 ++---
 mm/swapfile.c                            |  8 ++---
 mm/vmalloc.c                             | 16 ++++-----
 31 files changed, 219 insertions(+), 165 deletions(-)

Comments

Mike Rapoport Sept. 7, 2020, 8:12 p.m. UTC | #1
On Mon, Sep 07, 2020 at 08:00:55PM +0200, Gerald Schaefer wrote:
> This is v2 of an RFC previously discussed here:
> https://lore.kernel.org/lkml/20200828140314.8556-1-gerald.schaefer@linux.ibm.com/
> 
> Patch 1 is a fix for a regression in gup_fast on s390, after our conversion
> to common gup_fast code. It will introduce special helper functions
> pXd_addr_end_folded(), which have to be used in places where pagetable walk
> is done w/o lock and with READ_ONCE, so currently only in gup_fast.
> 
> Patch 2 is an attempt to make that more generic, i.e. change pXd_addr_end()
> themselves by adding an extra pXd value parameter. That was suggested by
> Jason during v1 discussion, because he is already thinking of some other
> places where he might want to switch to the READ_ONCE logic for pagetable
> walks. In general, that would be the cleanest / safest solution, but there
> is some impact on other architectures and common code, hence the new and
> greatly enlarged recipient list.
> 
> Patch 3 is a "nice to have" add-on, which makes pXd_addr_end() inline
> functions instead of #defines, so that we get some type checking for the
> new pXd value parameter.
> 
> Not sure about Fixes/stable tags for the generic solution. Only patch 1
> fixes a real bug on s390, and has Fixes/stable tags. Patches 2 + 3 might
> still be nice to have in stable, to ease future backports, but I guess
> "nice to have" does not really qualify for stable backports.

I also think that adding pXd parameter to pXd_addr_end() is a cleaner
way and with this patch 1 is not really required. I would even merge
patches 2 and 3 into a single patch and use only it as the fix.

[ /me apologises to stable@ team :-) ]

> Changes in v2:
> - Pick option 2 from v1 discussion (pXd_addr_end_folded helpers)
> - Add patch 2 + 3 for more generic approach
> 
> Alexander Gordeev (3):
>   mm/gup: fix gup_fast with dynamic page table folding
>   mm: make pXd_addr_end() functions page-table entry aware
>   mm: make generic pXd_addr_end() macros inline functions
> 
>  arch/arm/include/asm/pgtable-2level.h    |  2 +-
>  arch/arm/mm/idmap.c                      |  6 ++--
>  arch/arm/mm/mmu.c                        |  8 ++---
>  arch/arm64/kernel/hibernate.c            | 16 +++++----
>  arch/arm64/kvm/mmu.c                     | 16 ++++-----
>  arch/arm64/mm/kasan_init.c               |  8 ++---
>  arch/arm64/mm/mmu.c                      | 25 +++++++-------
>  arch/powerpc/mm/book3s64/radix_pgtable.c |  7 ++--
>  arch/powerpc/mm/hugetlbpage.c            |  6 ++--
>  arch/s390/include/asm/pgtable.h          | 42 ++++++++++++++++++++++++
>  arch/s390/mm/page-states.c               |  8 ++---
>  arch/s390/mm/pageattr.c                  |  8 ++---
>  arch/s390/mm/vmem.c                      |  8 ++---
>  arch/sparc/mm/hugetlbpage.c              |  6 ++--
>  arch/um/kernel/tlb.c                     |  8 ++---
>  arch/x86/mm/init_64.c                    | 15 ++++-----
>  arch/x86/mm/kasan_init_64.c              | 16 ++++-----
>  include/asm-generic/pgtable-nop4d.h      |  2 +-
>  include/asm-generic/pgtable-nopmd.h      |  2 +-
>  include/asm-generic/pgtable-nopud.h      |  2 +-
>  include/linux/pgtable.h                  | 38 ++++++++++++---------
>  mm/gup.c                                 |  8 ++---
>  mm/ioremap.c                             |  8 ++---
>  mm/kasan/init.c                          | 17 +++++-----
>  mm/madvise.c                             |  4 +--
>  mm/memory.c                              | 40 +++++++++++-----------
>  mm/mlock.c                               | 18 +++++++---
>  mm/mprotect.c                            |  8 ++---
>  mm/pagewalk.c                            |  8 ++---
>  mm/swapfile.c                            |  8 ++---
>  mm/vmalloc.c                             | 16 ++++-----
>  31 files changed, 219 insertions(+), 165 deletions(-)
> 
> -- 
> 2.17.1
>
Christophe Leroy Sept. 8, 2020, 4:42 a.m. UTC | #2
Le 07/09/2020 à 20:00, Gerald Schaefer a écrit :
> This is v2 of an RFC previously discussed here:
> https://lore.kernel.org/lkml/20200828140314.8556-1-gerald.schaefer@linux.ibm.com/
> 
> Patch 1 is a fix for a regression in gup_fast on s390, after our conversion
> to common gup_fast code. It will introduce special helper functions
> pXd_addr_end_folded(), which have to be used in places where pagetable walk
> is done w/o lock and with READ_ONCE, so currently only in gup_fast.
> 
> Patch 2 is an attempt to make that more generic, i.e. change pXd_addr_end()
> themselves by adding an extra pXd value parameter. That was suggested by
> Jason during v1 discussion, because he is already thinking of some other
> places where he might want to switch to the READ_ONCE logic for pagetable
> walks. In general, that would be the cleanest / safest solution, but there
> is some impact on other architectures and common code, hence the new and
> greatly enlarged recipient list.
> 
> Patch 3 is a "nice to have" add-on, which makes pXd_addr_end() inline
> functions instead of #defines, so that we get some type checking for the
> new pXd value parameter.
> 
> Not sure about Fixes/stable tags for the generic solution. Only patch 1
> fixes a real bug on s390, and has Fixes/stable tags. Patches 2 + 3 might
> still be nice to have in stable, to ease future backports, but I guess
> "nice to have" does not really qualify for stable backports.

If one day you have to backport a fix that requires patch 2 and/or 3, 
just mark it "depends-on:" and the patches will go in stable at the 
relevant time.

Christophe
Christophe Leroy Sept. 8, 2020, 5:22 a.m. UTC | #3
Le 07/09/2020 à 22:12, Mike Rapoport a écrit :
> On Mon, Sep 07, 2020 at 08:00:55PM +0200, Gerald Schaefer wrote:
>> This is v2 of an RFC previously discussed here:
>> https://lore.kernel.org/lkml/20200828140314.8556-1-gerald.schaefer@linux.ibm.com/
>>
>> Patch 1 is a fix for a regression in gup_fast on s390, after our conversion
>> to common gup_fast code. It will introduce special helper functions
>> pXd_addr_end_folded(), which have to be used in places where pagetable walk
>> is done w/o lock and with READ_ONCE, so currently only in gup_fast.
>>
>> Patch 2 is an attempt to make that more generic, i.e. change pXd_addr_end()
>> themselves by adding an extra pXd value parameter. That was suggested by
>> Jason during v1 discussion, because he is already thinking of some other
>> places where he might want to switch to the READ_ONCE logic for pagetable
>> walks. In general, that would be the cleanest / safest solution, but there
>> is some impact on other architectures and common code, hence the new and
>> greatly enlarged recipient list.
>>
>> Patch 3 is a "nice to have" add-on, which makes pXd_addr_end() inline
>> functions instead of #defines, so that we get some type checking for the
>> new pXd value parameter.
>>
>> Not sure about Fixes/stable tags for the generic solution. Only patch 1
>> fixes a real bug on s390, and has Fixes/stable tags. Patches 2 + 3 might
>> still be nice to have in stable, to ease future backports, but I guess
>> "nice to have" does not really qualify for stable backports.
> 
> I also think that adding pXd parameter to pXd_addr_end() is a cleaner
> way and with this patch 1 is not really required. I would even merge
> patches 2 and 3 into a single patch and use only it as the fix.

Why not merging patches 2 and 3, but I would keep patch 1 separate but 
after the generic changes, so that we first do the generic changes, then 
we do the specific S390 use of it.

Christophe
Gerald Schaefer Sept. 8, 2020, 5:36 p.m. UTC | #4
On Tue, 8 Sep 2020 07:22:39 +0200
Christophe Leroy <christophe.leroy@csgroup.eu> wrote:

> 
> 
> Le 07/09/2020 à 22:12, Mike Rapoport a écrit :
> > On Mon, Sep 07, 2020 at 08:00:55PM +0200, Gerald Schaefer wrote:
> >> This is v2 of an RFC previously discussed here:
> >> https://lore.kernel.org/lkml/20200828140314.8556-1-gerald.schaefer@linux.ibm.com/
> >>
> >> Patch 1 is a fix for a regression in gup_fast on s390, after our conversion
> >> to common gup_fast code. It will introduce special helper functions
> >> pXd_addr_end_folded(), which have to be used in places where pagetable walk
> >> is done w/o lock and with READ_ONCE, so currently only in gup_fast.
> >>
> >> Patch 2 is an attempt to make that more generic, i.e. change pXd_addr_end()
> >> themselves by adding an extra pXd value parameter. That was suggested by
> >> Jason during v1 discussion, because he is already thinking of some other
> >> places where he might want to switch to the READ_ONCE logic for pagetable
> >> walks. In general, that would be the cleanest / safest solution, but there
> >> is some impact on other architectures and common code, hence the new and
> >> greatly enlarged recipient list.
> >>
> >> Patch 3 is a "nice to have" add-on, which makes pXd_addr_end() inline
> >> functions instead of #defines, so that we get some type checking for the
> >> new pXd value parameter.
> >>
> >> Not sure about Fixes/stable tags for the generic solution. Only patch 1
> >> fixes a real bug on s390, and has Fixes/stable tags. Patches 2 + 3 might
> >> still be nice to have in stable, to ease future backports, but I guess
> >> "nice to have" does not really qualify for stable backports.
> > 
> > I also think that adding pXd parameter to pXd_addr_end() is a cleaner
> > way and with this patch 1 is not really required. I would even merge
> > patches 2 and 3 into a single patch and use only it as the fix.
> 
> Why not merging patches 2 and 3, but I would keep patch 1 separate but 
> after the generic changes, so that we first do the generic changes, then 
> we do the specific S390 use of it.

Yes, we thought about that approach too. It would at least allow to
get all into stable, more or less nicely, as prerequisite for the s390
fix.

Two concerns kept us from going that way. For once, it might not be
the nicest way to get it all in stable, and we would not want to risk
further objections due to the imminent and rather scary data corruption
issue that we want to fix asap.

For the same reason, we thought that the generalization part might
need more time and agreement from various people, so that we could at
least get the first patch as short-term solution.

It seems now that the generalization is very well accepted so far,
apart from some apparent issues on arm. Also, merging 2 + 3 and
putting them first seems to be acceptable, so we could do that for
v3, if there are no objections.

Of course, we first need to address the few remaining issues for
arm(32?), which do look quite confusing to me so far. BTW, sorry for
the compile error with patch 3, I guess we did the cross-compile only
for 1 + 2 applied, to see the bloat-o-meter changes. But I guess
patch 3 already proved its usefulness by that :-)
Gerald Schaefer Sept. 9, 2020, 4:12 p.m. UTC | #5
On Tue, 8 Sep 2020 19:36:50 +0200
Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote:

[..]
> 
> It seems now that the generalization is very well accepted so far,
> apart from some apparent issues on arm. Also, merging 2 + 3 and
> putting them first seems to be acceptable, so we could do that for
> v3, if there are no objections.
> 
> Of course, we first need to address the few remaining issues for
> arm(32?), which do look quite confusing to me so far. BTW, sorry for
> the compile error with patch 3, I guess we did the cross-compile only
> for 1 + 2 applied, to see the bloat-o-meter changes. But I guess
> patch 3 already proved its usefulness by that :-)

Umm, replace "arm" with "power", sorry. No issues on arm so far, but
also no ack I think.

Thanks to Christophe for the power change, and to Mike for volunteering
for some cross compilation and cross-arch testing. Will send v3 with
merged and re-ordered patches after some more testing.