diff mbox

[3/6] 8xx: invalidate non present TLBs

Message ID 1254948364-30074-4-git-send-email-Joakim.Tjernlund@transmode.se (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Joakim Tjernlund Oct. 7, 2009, 8:46 p.m. UTC
8xx sometimes need to load a invalid/non-present TLBs in
it DTLB asm handler.
These must be invalidated separaly as linux mm don't.
---
 arch/powerpc/mm/fault.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

Comments

Benjamin Herrenschmidt Oct. 7, 2009, 9:18 p.m. UTC | #1
On Wed, 2009-10-07 at 22:46 +0200, Joakim Tjernlund wrote:
> 8xx sometimes need to load a invalid/non-present TLBs in
> it DTLB asm handler.
> These must be invalidated separaly as linux mm don't.

not sure about the dsisr test here, what is the point ?

Cheers,
Ben.

> ---
>  arch/powerpc/mm/fault.c |    8 +++++++-
>  1 files changed, 7 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 7699394..72941c7 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -39,7 +39,7 @@
>  #include <asm/uaccess.h>
>  #include <asm/tlbflush.h>
>  #include <asm/siginfo.h>
> -
> +#include <mm/mmu_decl.h>
>  
>  #ifdef CONFIG_KPROBES
>  static inline int notify_page_fault(struct pt_regs *regs)
> @@ -243,6 +243,12 @@ good_area:
>  		goto bad_area;
>  #endif /* CONFIG_6xx */
>  #if defined(CONFIG_8xx)
> +	/* 8xx sometimes need to load a invalid/non-present TLBs.
> +	 * These must be invalidated separately as linux mm don't.
> +	 */
> +	if (error_code & 0x40000000) /* no translation? */
> +		_tlbil_va(address);
> +
>          /* The MPC8xx seems to always set 0x80000000, which is
>           * "undefined".  Of those that can be set, this is the only
>           * one which seems bad.
Joakim Tjernlund Oct. 7, 2009, 10:12 p.m. UTC | #2
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 07/10/2009 23:18:05:
>
> On Wed, 2009-10-07 at 22:46 +0200, Joakim Tjernlund wrote:
> > 8xx sometimes need to load a invalid/non-present TLBs in
> > it DTLB asm handler.
> > These must be invalidated separaly as linux mm don't.
>
> not sure about the dsisr test here, what is the point ?

To remove the need to do the same in generic pte code. Then
we only need to do it when it counts. Lets see how it works out.

>
> Cheers,
> Ben.
>
> > ---
> >  arch/powerpc/mm/fault.c |    8 +++++++-
> >  1 files changed, 7 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> > index 7699394..72941c7 100644
> > --- a/arch/powerpc/mm/fault.c
> > +++ b/arch/powerpc/mm/fault.c
> > @@ -39,7 +39,7 @@
> >  #include <asm/uaccess.h>
> >  #include <asm/tlbflush.h>
> >  #include <asm/siginfo.h>
> > -
> > +#include <mm/mmu_decl.h>
> >
> >  #ifdef CONFIG_KPROBES
> >  static inline int notify_page_fault(struct pt_regs *regs)
> > @@ -243,6 +243,12 @@ good_area:
> >        goto bad_area;
> >  #endif /* CONFIG_6xx */
> >  #if defined(CONFIG_8xx)
> > +   /* 8xx sometimes need to load a invalid/non-present TLBs.
> > +    * These must be invalidated separately as linux mm don't.
> > +    */
> > +   if (error_code & 0x40000000) /* no translation? */
> > +      _tlbil_va(address);
> > +
> >          /* The MPC8xx seems to always set 0x80000000, which is
> >           * "undefined".  Of those that can be set, this is the only
> >           * one which seems bad.
>
>
>
>
Benjamin Herrenschmidt Oct. 7, 2009, 10:20 p.m. UTC | #3
On Thu, 2009-10-08 at 00:12 +0200, Joakim Tjernlund wrote:
> > not sure about the dsisr test here, what is the point ?
> 
> To remove the need to do the same in generic pte code. Then
> we only need to do it when it counts. Lets see how it works out.

But I'm not sure I trust that DSISR test. Just do it unconditionally...
How much does it cost anyway ?

Ben.
Joakim Tjernlund Oct. 8, 2009, 7:22 p.m. UTC | #4
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 07/10/2009 23:18:05:
>
> On Wed, 2009-10-07 at 22:46 +0200, Joakim Tjernlund wrote:
> > 8xx sometimes need to load a invalid/non-present TLBs in
> > it DTLB asm handler.
> > These must be invalidated separaly as linux mm don't.
>
> not sure about the dsisr test here, what is the point ?

Without this patch I get about twice as many DTLB errors( on 2.4)

I have also noted that all my dcbst DTLB has the store bit set:
  trap:300 address:10030b8c, dar:10030b8c,err:42000000 dcbst

Thare are comments in the kernel that dcbst wrongly
generates TLB Errors with store set on 8xx. Is this really so?
Should dcbst always trap as a load?

 Jocke

>
> Cheers,
> Ben.
>
> > ---
> >  arch/powerpc/mm/fault.c |    8 +++++++-
> >  1 files changed, 7 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> > index 7699394..72941c7 100644
> > --- a/arch/powerpc/mm/fault.c
> > +++ b/arch/powerpc/mm/fault.c
> > @@ -39,7 +39,7 @@
> >  #include <asm/uaccess.h>
> >  #include <asm/tlbflush.h>
> >  #include <asm/siginfo.h>
> > -
> > +#include <mm/mmu_decl.h>
> >
> >  #ifdef CONFIG_KPROBES
> >  static inline int notify_page_fault(struct pt_regs *regs)
> > @@ -243,6 +243,12 @@ good_area:
> >        goto bad_area;
> >  #endif /* CONFIG_6xx */
> >  #if defined(CONFIG_8xx)
> > +   /* 8xx sometimes need to load a invalid/non-present TLBs.
> > +    * These must be invalidated separately as linux mm don't.
> > +    */
> > +   if (error_code & 0x40000000) /* no translation? */
> > +      _tlbil_va(address);
> > +
> >          /* The MPC8xx seems to always set 0x80000000, which is
> >           * "undefined".  Of those that can be set, this is the only
> >           * one which seems bad.
>
>
>
>
Dan Malek Oct. 8, 2009, 8:11 p.m. UTC | #5
On Oct 8, 2009, at 12:22 PM, Joakim Tjernlund wrote:

> hare are comments in the kernel that dcbst wrongly
> generates TLB Errors with store set on 8xx. Is this really so?
> Should dcbst always trap as a load?

There are many comments written about 8xx as various
behavior was discovered.  Worse, some of these details
would be different among the different processor versions.
You need to be careful and test as many different part
versions as possible to ensure you have everything
covered.....  then someone will find a part that doesn't
quite work, "fix" it, and break others :-)

In this particular case, the PEM does state dcbst is treated
as a load, but from experience we know 8xx doesn't work
that way.  Of course, since dcbst is a store operation,
you could argue that 8xx got it correct :-)

Have fun!

	-- Dan
Benjamin Herrenschmidt Oct. 8, 2009, 8:18 p.m. UTC | #6
On Thu, 2009-10-08 at 13:11 -0700, Dan Malek wrote:
> 
> There are many comments written about 8xx as various
> behavior was discovered.  Worse, some of these details
> would be different among the different processor versions.
> You need to be careful and test as many different part
> versions as possible to ensure you have everything
> covered.....  then someone will find a part that doesn't
> quite work, "fix" it, and break others :-)
> 
> In this particular case, the PEM does state dcbst is treated
> as a load, but from experience we know 8xx doesn't work
> that way.  Of course, since dcbst is a store operation,
> you could argue that 8xx got it correct :-) 

Hehe. Well, it's architecturally incorrect, as dcbst is not really a
store operation in the sense that it doesn't modify the target cache
line, and as such doesn't (mustn't) be covered by write access
protection, shouldn't set DIRTY, etc...

So I would argue that 8xx got it wrong either way :-)

Cheers,
Ben.
Benjamin Herrenschmidt Oct. 8, 2009, 8:28 p.m. UTC | #7
Hoy Dan !

While you are around ... I have a question :-)

Do you happen to remember what the story is with the invalidation of
"unpopulated" (aka invalid) entries ?

IE. We create those in the 8xx TLB miss when the PTE is !present (or the
PMD is absent). Those then cause a TLB error on the next access which
allows us to process the page fault. But when/how are those invalid
entries supposed to be invalidated ?

The doco seems to hint that at least in the case of an entry with the
wrong C bit (store to an entry with C=0), the HW automatically
invalidates it before taking the TLB Error but that's all I found.

Is there a general HW policy on 8xx to invalidate TLB entries that cause
TLB errors ? Or is the kernel expected to do it most of the time ?

Cheers,
Ben.
Joakim Tjernlund Oct. 8, 2009, 8:37 p.m. UTC | #8
Dan Malek <dan@embeddedalley.com> wrote on 08/10/2009 22:11:07:
>
>
> On Oct 8, 2009, at 12:22 PM, Joakim Tjernlund wrote:
>
> > hare are comments in the kernel that dcbst wrongly
> > generates TLB Errors with store set on 8xx. Is this really so?
> > Should dcbst always trap as a load?

Hi, been a long time since I heard from you :)

>
> There are many comments written about 8xx as various
> behavior was discovered.  Worse, some of these details
> would be different among the different processor versions.
> You need to be careful and test as many different part
> versions as possible to ensure you have everything
> covered.....  then someone will find a part that doesn't
> quite work, "fix" it, and break others :-)
>
> In this particular case, the PEM does state dcbst is treated
> as a load, but from experience we know 8xx doesn't work
> that way.  Of course, since dcbst is a store operation,
> you could argue that 8xx got it correct :-)

One could try clearing the store bit in the page fault handler, but then
that might cause a loop.
Not sure it has any practical meaning though.

Anyhow, you are welcome to have a look at the patches I have been tossing out.

 Jocke
Benjamin Herrenschmidt Oct. 8, 2009, 8:42 p.m. UTC | #9
On Thu, 2009-10-08 at 21:22 +0200, Joakim Tjernlund wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 07/10/2009 23:18:05:
> >
> > On Wed, 2009-10-07 at 22:46 +0200, Joakim Tjernlund wrote:
> > > 8xx sometimes need to load a invalid/non-present TLBs in
> > > it DTLB asm handler.
> > > These must be invalidated separaly as linux mm don't.
> >
> > not sure about the dsisr test here, what is the point ?
> 
> Without this patch I get about twice as many DTLB errors( on 2.4)

Ok, so it is useful... I would have thought that invalidating a TLB
entry that just caused a fault mostly be a nop.. well, the tlbie on 8xx
ignores the ASID so maybe it's invalidating next door process entries
but even that doesn't sound right. The TLB is so tiny on these things...

Oh well, as I said, something else to look at more closely.

> I have also noted that all my dcbst DTLB has the store bit set:
>   trap:300 address:10030b8c, dar:10030b8c,err:42000000 dcbst
> 
> Thare are comments in the kernel that dcbst wrongly
> generates TLB Errors with store set on 8xx. Is this really so?
> Should dcbst always trap as a load?

Architecturally it should, that's a known 8xx core bug.

Cheers,
Ben.

>  Jocke
> 
> >
> > Cheers,
> > Ben.
> >
> > > ---
> > >  arch/powerpc/mm/fault.c |    8 +++++++-
> > >  1 files changed, 7 insertions(+), 1 deletions(-)
> > >
> > > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> > > index 7699394..72941c7 100644
> > > --- a/arch/powerpc/mm/fault.c
> > > +++ b/arch/powerpc/mm/fault.c
> > > @@ -39,7 +39,7 @@
> > >  #include <asm/uaccess.h>
> > >  #include <asm/tlbflush.h>
> > >  #include <asm/siginfo.h>
> > > -
> > > +#include <mm/mmu_decl.h>
> > >
> > >  #ifdef CONFIG_KPROBES
> > >  static inline int notify_page_fault(struct pt_regs *regs)
> > > @@ -243,6 +243,12 @@ good_area:
> > >        goto bad_area;
> > >  #endif /* CONFIG_6xx */
> > >  #if defined(CONFIG_8xx)
> > > +   /* 8xx sometimes need to load a invalid/non-present TLBs.
> > > +    * These must be invalidated separately as linux mm don't.
> > > +    */
> > > +   if (error_code & 0x40000000) /* no translation? */
> > > +      _tlbil_va(address);
> > > +
> > >          /* The MPC8xx seems to always set 0x80000000, which is
> > >           * "undefined".  Of those that can be set, this is the only
> > >           * one which seems bad.
> >
> >
> >
> >
Benjamin Herrenschmidt Oct. 8, 2009, 8:44 p.m. UTC | #10
> One could try clearing the store bit in the page fault handler, but then
> that might cause a loop.
> Not sure it has any practical meaning though.
> 
> Anyhow, you are welcome to have a look at the patches I have been tossing out.

The store bit in do_page_fault() is -very- important (and the only DSISR
bit that is as I wrote earlier). The generic code must know if a fault
is caused by a load or a store since the later is what will cause
copy_on_write to happen for example, or dirty to be set, etc..

Ben.
Dan Malek Oct. 8, 2009, 10:08 p.m. UTC | #11
Hi Ben.

On Oct 8, 2009, at 1:28 PM, Benjamin Herrenschmidt wrote:

> While you are around ... I have a question :-)

I'll try.  Many brain cells have been replaced or lost
over the years :-)

> Do you happen to remember what the story is with the invalidation of
> "unpopulated" (aka invalid) entries ?
>
> IE. We create those in the 8xx TLB miss when the PTE is !present  
> (or the
> PMD is absent). Those then cause a TLB error on the next access which
> allows us to process the page fault. But when/how are those invalid
> entries supposed to be invalidated ?

I thought we did a tlbie() (or whatever the equivalent is today)
when the PTE was updated in the table.  An optimization was to
load the TLB with the entry at that time to avoid a subsequent miss.
In any case, the TLB entry has to be modified by the software.

> The doco seems to hint that at least in the case of an entry with the
> wrong C bit (store to an entry with C=0), the HW automatically
> invalidates it before taking the TLB Error but that's all I found.

I don't remember how C was used in the past, but I suspect
it just mirrored the Linux VM state.  I'm quite certain the MMU
HW would never change a TLB entry.  Where did you read this?
For most of the 8xx "early days," I used to just mark all write
pages as dirty.   For some reason I just overloaded the write/changed
into one bit, it avoided TLB Error overhead and I think even some
silicon bugs.  Since they were tiny and fairly static embedded
systems, it didn't have any effect on the way VM was managed.

> Is there a general HW policy on 8xx to invalidate TLB entries that  
> cause
> TLB errors ? Or is the kernel expected to do it most of the time ?

The MMU HW on the 8xx is just a translator.  I'm now really
certain it won't ever change a TLB entry.  It's completely up to
software to make all TLB changes.

Just make it simple :-)

Thanks.

	-- Dan
Benjamin Herrenschmidt Oct. 8, 2009, 10:23 p.m. UTC | #12
On Thu, 2009-10-08 at 15:08 -0700, Dan Malek wrote:
> Hi Ben.
> 
> On Oct 8, 2009, at 1:28 PM, Benjamin Herrenschmidt wrote:
> 
> > While you are around ... I have a question :-)
> 
> I'll try.  Many brain cells have been replaced or lost
> over the years :-)

Replaced ? You lucky ! I only lose mines :-)

> I thought we did a tlbie() (or whatever the equivalent is today)
> when the PTE was updated in the table.  An optimization was to
> load the TLB with the entry at that time to avoid a subsequent miss.
> In any case, the TLB entry has to be modified by the software.

Ok, that's my understanding too and I think we had the tlbie in
update_mmu_cache to do the trick, though the comment is misleading
making it think that the only reason it's there is for the dcbst
problem. At least that's my understanding. That was lost recently in 2.6
so I'll have to put it back properly.

I don't think we do the pre-load to avoid the second fault, but we
certainly could and should.

> I don't remember how C was used in the past, but I suspect
> it just mirrored the Linux VM state.  I'm quite certain the MMU
> HW would never change a TLB entry.  Where did you read this?

MPC855UM, chapter 8.6 "Memory attributes":

<<
• Reference and change bit updates—The MPC850 does not generate an exception for
  an R (reference) bit update. In fact, there is no entry for an R bit in the TLB.
  The change bit (C) is bit 23 in the level-two descriptor, described in Table 8-4.
  Software updates C (changed) bits, but hardware treats the C bit (negated) as a
  write-protect attribute. Therefore, attempting to write to a page marked unmodified
  invalidates that entry and causes an implementation-specific DTLB error exception.
  ^^^^^^^^^^^^^^^^^^^^^^
  If change bits are not needed, set the C bit to one by default in the PTEs.
>>

And yes, it's weird and that's the only place I think where this is
mentioned which makes me think it could well be a doco bug.

> For most of the 8xx "early days," I used to just mark all write
> pages as dirty.   For some reason I just overloaded the write/changed
> into one bit, it avoided TLB Error overhead and I think even some
> silicon bugs.  Since they were tiny and fairly static embedded
> systems, it didn't have any effect on the way VM was managed.

Well, nowadays at least, most of the time, a writeable page is also
dirty unless it's a writeable shared mapping, and in that later case
you really want to do proper dirty tracking. So I suspect we can
simplify some of that and let the generic code handle dirty by mapping
_PAGE_DIRTY to C and removing _PAGE_HWWRITE. We can also remove all
of the asm munging from DataTLBError, and let the generic C code fixup
DIRTY and ACCESSED when needed, since those should only rarely need a
fixup.

> The MMU HW on the 8xx is just a translator.  I'm now really
> certain it won't ever change a TLB entry.  It's completely up to
> software to make all TLB changes.

That makes sense.

> Just make it simple :-)

Yeah. I think we can simplify the current code a lot, which will speed
up TLB misses (well, nothing much you can do about the infamous errata
#6 but that's another story). It won't give 2.6 back the 2.4 speed due
to sheer bloat of the generic code but it might at least offset some of
the loss by improving the overall TLB miss performance.

Now, I don't have any 8xx gear, so it will be up to Joakim, Scott etc...
to get that stuff right :-)

Thanks for your feedback.

Cheers,
Ben.

> Thanks.
> 
> 	-- Dan
Joakim Tjernlund Oct. 8, 2009, 11:01 p.m. UTC | #13
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 09/10/2009 00:23:48:
>
> On Thu, 2009-10-08 at 15:08 -0700, Dan Malek wrote:
> > Hi Ben.
> >
> > On Oct 8, 2009, at 1:28 PM, Benjamin Herrenschmidt wrote:
> >
> > > While you are around ... I have a question :-)
> >
> > I'll try.  Many brain cells have been replaced or lost
> > over the years :-)
>
> Replaced ? You lucky ! I only lose mines :-)
>
> > I thought we did a tlbie() (or whatever the equivalent is today)
> > when the PTE was updated in the table.  An optimization was to
> > load the TLB with the entry at that time to avoid a subsequent miss.
> > In any case, the TLB entry has to be modified by the software.
>
> Ok, that's my understanding too and I think we had the tlbie in
> update_mmu_cache to do the trick, though the comment is misleading
> making it think that the only reason it's there is for the dcbst
> problem. At least that's my understanding. That was lost recently in 2.6
> so I'll have to put it back properly.

So you don't think my invalidate "only !present pages" patch in do_page_fault
is enough?

>
> I don't think we do the pre-load to avoid the second fault, but we
> certainly could and should.
>
> > I don't remember how C was used in the past, but I suspect
> > it just mirrored the Linux VM state.  I'm quite certain the MMU
> > HW would never change a TLB entry.  Where did you read this?
>
> MPC855UM, chapter 8.6 "Memory attributes":
>
> <<
> • Reference and change bit updates—The MPC850 does not generate an exception for
>   an R (reference) bit update. In fact, there is no entry for an R bit in the TLB.
>   The change bit (C) is bit 23 in the level-two descriptor, described in Table 8-4.
>   Software updates C (changed) bits, but hardware treats the C bit (negated) as a
>   write-protect attribute. Therefore, attempting to write to a page marked unmodified
>   invalidates that entry and causes an implementation-specific DTLB error exception.
>   ^^^^^^^^^^^^^^^^^^^^^^
>   If change bits are not needed, set the C bit to one by default in the PTEs.
> >>
>
> And yes, it's weird and that's the only place I think where this is
> mentioned which makes me think it could well be a doco bug.
>
> > For most of the 8xx "early days," I used to just mark all write
> > pages as dirty.   For some reason I just overloaded the write/changed
> > into one bit, it avoided TLB Error overhead and I think even some
> > silicon bugs.  Since they were tiny and fairly static embedded
> > systems, it didn't have any effect on the way VM was managed.
>
> Well, nowadays at least, most of the time, a writeable page is also
> dirty unless it's a writeable shared mapping, and in that later case
> you really want to do proper dirty tracking. So I suspect we can
> simplify some of that and let the generic code handle dirty by mapping
> _PAGE_DIRTY to C and removing _PAGE_HWWRITE. We can also remove all
> of the asm munging from DataTLBError, and let the generic C code fixup
> DIRTY and ACCESSED when needed, since those should only rarely need a
> fixup.
>
> > The MMU HW on the 8xx is just a translator.  I'm now really
> > certain it won't ever change a TLB entry.  It's completely up to
> > software to make all TLB changes.
>
> That makes sense.
>
> > Just make it simple :-)
>
> Yeah. I think we can simplify the current code a lot, which will speed
> up TLB misses (well, nothing much you can do about the infamous errata
> #6 but that's another story). It won't give 2.6 back the 2.4 speed due
> to sheer bloat of the generic code but it might at least offset some of
> the loss by improving the overall TLB miss performance.

It won't get much faster than my current patch. Trapping all DTLB
Errors to C won't make it faster, only more correct should there be
a bug in the asm version. Actually there is one that has been there
all the time, guarded flag is not set by DTLB Error.

 Jocke

>
> Now, I don't have any 8xx gear, so it will be up to Joakim, Scott etc...
> to get that stuff right :-)

Waiting for Rex and Scott to comment/test.
Dan Malek Oct. 9, 2009, 12:05 a.m. UTC | #14
On Oct 8, 2009, at 1:37 PM, Joakim Tjernlund wrote:

> Hi, been a long time since I heard from you :)

Yeah, hiding among other projects :-)

> Anyhow, you are welcome to have a look at the patches I have been  
> tossing out.

I've been looking, but since I'm not familiar with the current
VM implementation, there isn't much I can contribute.  If
I see something where I can be useful, I'll speak up.

Thanks.

	-- Dan
Dan Malek Oct. 9, 2009, 12:36 a.m. UTC | #15
On Oct 8, 2009, at 3:23 PM, Benjamin Herrenschmidt wrote:

> <<
> • Reference and change bit updates—The MPC850 does not generate an  
> exception for
>   an R (reference) bit update. In fact, there is no entry for an R  
> bit in the TLB.
>   The change bit (C) is bit 23 in the level-two descriptor,  
> described in Table 8-4.
>   Software updates C (changed) bits, but hardware treats the C bit  
> (negated) as a
>   write-protect attribute. Therefore, attempting to write to a page  
> marked unmodified
>   invalidates that entry and causes an implementation-specific DTLB  
> error exception.
>   ^^^^^^^^^^^^^^^^^^^^^^
>   If change bits are not needed, set the C bit to one by default in  
> the PTEs.

How interesting....

I've looked at many 8xx docs and they all have the same text
(probably cut/paste :-))  I'd place some debug code in the C functions
to print out a few of the TLB Entry for various errors to see if this  
really
happens, and for other errors, too.  I guess I never stumbled into
this because I always thought I had to do everything from software,
so just made sure I did.

	-- Dan
Benjamin Herrenschmidt Oct. 9, 2009, 12:56 a.m. UTC | #16
On Fri, 2009-10-09 at 01:01 +0200, Joakim Tjernlund wrote:
> Ok, that's my understanding too and I think we had the tlbie in
> > update_mmu_cache to do the trick, though the comment is misleading
> > making it think that the only reason it's there is for the dcbst
> > problem. At least that's my understanding. That was lost recently in
> 2.6
> > so I'll have to put it back properly.
> 
> So you don't think my invalidate "only !present pages" patch in
> do_page_fault is enough?

It might well be the right solution, I was talking about the code as we
have upstream today.

> I don't think we do the pre-load to avoid the second fault, but we
> It won't get much faster than my current patch. Trapping all DTLB
> Errors to C won't make it faster, only more correct should there be
> a bug in the asm version. Actually there is one that has been there
> all the time, guarded flag is not set by DTLB Error.

There's other areas of improvements I suggested that can make it faster
such as avoiding the whole kernel/user test in the TLB misses.

Removing the stuff in DataTLBError can potentially make normal page
faults faster too by avoiding going through a bunch of useless code
before going to do the real thing in C :-)

As I said, the case of ACCESSED or DIRTY updates are rare enough to not
warrant code in the main page fault hot path.

Cheers,
Ben.
Benjamin Herrenschmidt Oct. 9, 2009, 12:57 a.m. UTC | #17
On Thu, 2009-10-08 at 17:36 -0700, Dan Malek wrote:
> On Oct 8, 2009, at 3:23 PM, Benjamin Herrenschmidt wrote:
> 
> > <<
> > • Reference and change bit updates—The MPC850 does not generate an  
> > exception for
> >   an R (reference) bit update. In fact, there is no entry for an R  
> > bit in the TLB.
> >   The change bit (C) is bit 23 in the level-two descriptor,  
> > described in Table 8-4.
> >   Software updates C (changed) bits, but hardware treats the C bit  
> > (negated) as a
> >   write-protect attribute. Therefore, attempting to write to a page  
> > marked unmodified
> >   invalidates that entry and causes an implementation-specific DTLB  
> > error exception.
> >   ^^^^^^^^^^^^^^^^^^^^^^
> >   If change bits are not needed, set the C bit to one by default in  
> > the PTEs.
> 
> How interesting....
> 
> I've looked at many 8xx docs and they all have the same text
> (probably cut/paste :-))  I'd place some debug code in the C functions
> to print out a few of the TLB Entry for various errors to see if this  
> really
> happens, and for other errors, too.  I guess I never stumbled into
> this because I always thought I had to do everything from software,
> so just made sure I did.

I'm not sure it's worth bothering :-) I'm happy to continue assuming we
need to tlbie and always make sure we do so.

Cheers,
Ben.
diff mbox

Patch

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 7699394..72941c7 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -39,7 +39,7 @@ 
 #include <asm/uaccess.h>
 #include <asm/tlbflush.h>
 #include <asm/siginfo.h>
-
+#include <mm/mmu_decl.h>
 
 #ifdef CONFIG_KPROBES
 static inline int notify_page_fault(struct pt_regs *regs)
@@ -243,6 +243,12 @@  good_area:
 		goto bad_area;
 #endif /* CONFIG_6xx */
 #if defined(CONFIG_8xx)
+	/* 8xx sometimes need to load a invalid/non-present TLBs.
+	 * These must be invalidated separately as linux mm don't.
+	 */
+	if (error_code & 0x40000000) /* no translation? */
+		_tlbil_va(address);
+
         /* The MPC8xx seems to always set 0x80000000, which is
          * "undefined".  Of those that can be set, this is the only
          * one which seems bad.