regression due to "flush SAD/SPD generate false events"

Message ID	1266449529.17794.157.camel@bigi
State	RFC, archived
Delegated to:	David Miller
Headers	show Return-Path: <netdev-owner@vger.kernel.org> DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:subject:from:reply-to:to:cc:in-reply-to:references :content-type:date:message-id:mime-version:x-mailer; b=Zcc4IM+inuNjrTYP0XoLnJdaF8zI3dJLkGqcmRPqeCrnaeWJSGEeXQngrJogOuLdrd 7BYT9xDn1FAvqIGoUhhPJGbG75ebYnkuILhGiGGULLDRyuJNIjPv3QpDrMWlpN7I341T PUBnpILOhPyczoQF08svnWuF6AA8t0iz4ioNk= Subject: Re: regression due to "flush SAD/SPD generate false events" From: jamal <hadi@cyberus.ca> Reply-To: hadi@cyberus.ca To: Alexey Dobriyan <adobriyan@gmail.com> Cc: davem@davemloft.net, netdev@vger.kernel.org In-Reply-To: <1266448038.17794.135.camel@bigi> References: <20100217191718.GA5256@x200> <1266442972.17794.51.camel@bigi> <1266448038.17794.135.camel@bigi> Content-Type: multipart/mixed; boundary="=-GgLDJnSf8b8TBsYJZOkF" Date: Wed, 17 Feb 2010 18:32:09 -0500 Message-Id: <1266449529.17794.157.camel@bigi> Mime-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk

jamal Feb. 17, 2010, 11:32 p.m. UTC

On Wed, 2010-02-17 at 18:07 -0500, jamal wrote:
> When you flush
> an empty table, the time goes up from about 1.5s
> to 3 secs with pfkey. You also see the EAGAIN for
> each flush ...
> I am going to dig a little more ..

Here is a fix. The speed is restored (actually looks a little better
now) - the only thing is if you try to flush an empty table we return
-ESRCH; this seems reasonable, no? So a script like following 

---
#!/usr/sbin/setkey -f
flush;
spdflush;
----

will get:
---
bigismall:~# time setkey -f ./setkey-sample
The result of line 2: No such process.
The result of line 3: No such process.

real	0m0.663s
user	0m0.080s
sys	0m0.128s
----

cheers,
jamal

Alexey Dobriyan Feb. 18, 2010, 6:55 p.m. UTC | #1

On Wed, Feb 17, 2010 at 06:32:09PM -0500, jamal wrote:
> On Wed, 2010-02-17 at 18:07 -0500, jamal wrote:
> > When you flush
> > an empty table, the time goes up from about 1.5s
> > to 3 secs with pfkey. You also see the EAGAIN for
> > each flush ...
> > I am going to dig a little more ..
> 
> Here is a fix. The speed is restored (actually looks a little better
> now) - the only thing is if you try to flush an empty table we return
> -ESRCH; this seems reasonable, no? So a script like following 
> 
> ---
> #!/usr/sbin/setkey -f
> flush;
> spdflush;
> ----
> 
> will get:
> ---
> bigismall:~# time setkey -f ./setkey-sample
> The result of line 2: No such process.
> The result of line 3: No such process.

I'd expect flushing empty SAD/SPD to be exact nops. :^)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

jamal Feb. 18, 2010, 8:53 p.m. UTC | #2

On Thu, 2010-02-18 at 20:55 +0200, Alexey Dobriyan wrote:

> I'd expect flushing empty SAD/SPD to be exact nops. :^)

Agree - that is a better alternative.
I was hoping returning an error code of 0 will do that.
But that triggers the EAGAIN you saw. It doesnt trigger
an EAGAIN for the netlink side (and it is a nop there).
Let me look at it. Clearly the EAGAIN is causing the 
latency.

cheers,
jamal

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

jamal Feb. 18, 2010, 9:45 p.m. UTC | #3

On Thu, 2010-02-18 at 15:53 -0500, jamal wrote:

> Agree - that is a better alternative.
> I was hoping returning an error code of 0 will do that.
> But that triggers the EAGAIN you saw. It doesnt trigger
> an EAGAIN for the netlink side (and it is a nop there).
> Let me look at it. Clearly the EAGAIN is causing the 
> latency.

Ok, looking closely at the strace i think ive figured out
what pfkey is doing. It basically sets non-blocking receive
and expects to see the flush event whether real or bogus
(EAGAIN is because it tries to recvmsg and fails the first
time because we dont send the event when table is empty).

The two options are: 1) Restore old behavior just for pfkey
where bogus event is sent always or 2) return -ESRCH as I
have done in the last patch.
I think #2 is more accurate. But if that is going to break
existing scripts then #1 is the way to go.

Thoughts?

cheers,
jamal

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

David Miller Feb. 18, 2010, 10:27 p.m. UTC | #4

From: jamal <hadi@cyberus.ca>
Date: Thu, 18 Feb 2010 16:45:58 -0500

> The two options are: 1) Restore old behavior just for pfkey
> where bogus event is sent always or 2) return -ESRCH as I
> have done in the last patch.
> I think #2 is more accurate. But if that is going to break
> existing scripts then #1 is the way to go.
> 
> Thoughts?

Let's see how #2 works out, if we get any more breakage
cases reported then we'll have to revert again and go with
#1.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

jamal Feb. 18, 2010, 10:52 p.m. UTC | #5

On Thu, 2010-02-18 at 14:27 -0800, David Miller wrote:

> Let's see how #2 works out, if we get any more breakage
> cases reported then we'll have to revert again and go with
> #1.

Actually, i think i found the root cause. The sequence should be:

1) pfkey->kernel: flush
kernel flushes SA/SPD
2)kernel->pfkey: respond to flush message
3)kernel->all user space listeners: i just flushed the SA/SPD

#2 is always needed.
#3 is wrong if SA/SPD is empty.
kernel unfortunately sends a single broadcast which combines
#2 and #3. So the proper fix is to break #2 and #3 into
separate messages. And follow the rules above.

I dont have time to fix it right now - but i will send a patch
tommorow.

cheers,
jamal

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

David Miller Feb. 18, 2010, 11:01 p.m. UTC | #6

From: jamal <hadi@cyberus.ca>
Date: Thu, 18 Feb 2010 17:52:41 -0500

> I dont have time to fix it right now - but i will send a patch
> tommorow.

Ok, thanks for the update.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

regression due to "flush SAD/SPD generate false events"

Commit Message

Comments

Patch