mbox series

[net,0/7] rxrpc: Fix use of skb_cow_data()

Message ID 156708405310.26102.7954021163316252673.stgit@warthog.procyon.org.uk
Headers show
Series rxrpc: Fix use of skb_cow_data() | expand

Message

David Howells Aug. 29, 2019, 1:07 p.m. UTC
Here's a series of patches that replaces the use of skb_cow_data() in rxrpc
with skb_unshare() early on in the input process.  The problem that is
being seen is that skb_cow_data() indirectly requires that the maximum
usage count on an sk_buff be 1, and it may generate an assertion failure in
pskb_expand_head() if not.

This can occur because rxrpc_input_data() may be still holding a ref when
it has just attached the sk_buff to the rx ring and given that attachment
its own ref.  If recvmsg happens fast enough, skb_cow_data() can see the
ref still held by the softirq handler.

Further, a packet may contain multiple subpackets, each of which gets its
own attachment to the ring and its own ref - also making skb_cow_data() go
bang.

Fix this by:

 (1) The DATA packet is currently parsed for subpackets twice by the input
     routines.  Parse it just once instead and make notes in the sk_buff
     private data.

 (2) Use the notes from (1) when attaching the packet to the ring multiple
     times.  Once the packet is attached to the ring, recvmsg can see it
     and start modifying it, so the softirq handler is not permitted to
     look inside it from that point.

 (3) Pass the ref from the input code to the ring rather than getting an
     extra ref.  rxrpc_input_data() uses a ref on the second refcount to
     prevent the packet from evaporating under it.

 (4) Call skb_unshare() on secured DATA packets in rxrpc_input_packet()
     before we take call->input_lock.  Other sorts of packets don't get
     modified and so can be left.

     A trace is emitted if skb_unshare() eats the skb.  Note that
     skb_share() for our accounting in this regard as we can't see the
     parameters in the packet to log in a trace line if it releases it.

 (5) Remove the calls to skb_cow_data().  These are then no longer
     necessary.

There are also patches to improve the rxrpc_skb tracepoint to make sure
that Tx-derived buffers are identified separately from Rx-derived buffers
in the trace.

The patches are tagged here:

	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
	rxrpc-fixes-20190827

and can also be found on the following branch:

	http://git.kernel.org/cgit/linux/kernel/git/dhowells/linux-fs.git/log/?h=rxrpc-fixes

David
---
David Howells (7):
      rxrpc: Improve jumbo packet counting
      rxrpc: Use info in skbuff instead of reparsing a jumbo packet
      rxrpc: Pass the input handler's data skb reference to the Rx ring
      rxrpc: Abstract out rxtx ring cleanup
      rxrpc: Add a private skb flag to indicate transmission-phase skbs
      rxrpc: Use the tx-phase skb flag to simplify tracing
      rxrpc: Use skb_unshare() rather than skb_cow_data()


 include/trace/events/rxrpc.h |   59 ++++----
 net/rxrpc/ar-internal.h      |   16 ++
 net/rxrpc/call_event.c       |    8 +
 net/rxrpc/call_object.c      |   33 ++---
 net/rxrpc/conn_event.c       |    6 -
 net/rxrpc/input.c            |  304 +++++++++++++++++++++++-------------------
 net/rxrpc/local_event.c      |    4 -
 net/rxrpc/output.c           |    6 -
 net/rxrpc/peer_event.c       |   10 +
 net/rxrpc/protocol.h         |    9 +
 net/rxrpc/recvmsg.c          |   47 ++++--
 net/rxrpc/rxkad.c            |   32 +---
 net/rxrpc/sendmsg.c          |   13 +-
 net/rxrpc/skbuff.c           |   40 ++++--
 14 files changed, 319 insertions(+), 268 deletions(-)

Comments

David Miller Aug. 30, 2019, 9:55 p.m. UTC | #1
From: David Howells <dhowells@redhat.com>
Date: Thu, 29 Aug 2019 14:07:33 +0100

> Here's a series of patches that replaces the use of skb_cow_data() in rxrpc
> with skb_unshare() early on in the input process.  The problem that is
> being seen is that skb_cow_data() indirectly requires that the maximum
> usage count on an sk_buff be 1, and it may generate an assertion failure in
> pskb_expand_head() if not.
> 
> This can occur because rxrpc_input_data() may be still holding a ref when
> it has just attached the sk_buff to the rx ring and given that attachment
> its own ref.  If recvmsg happens fast enough, skb_cow_data() can see the
> ref still held by the softirq handler.
> 
> Further, a packet may contain multiple subpackets, each of which gets its
> own attachment to the ring and its own ref - also making skb_cow_data() go
> bang.
> 
> Fix this by:
 ...
> There are also patches to improve the rxrpc_skb tracepoint to make sure
> that Tx-derived buffers are identified separately from Rx-derived buffers
> in the trace.

This looks great, thanks for reimplementing this using skb_unshare().

> The patches are tagged here:
> 
> 	git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
> 	rxrpc-fixes-20190827

Pulled, thanks again.
David Howells Sept. 1, 2019, 7:11 a.m. UTC | #2
Hillf Danton <hdanton@sina.com> wrote:

> > +		/* Unshare the packet so that it can be modified for in-place
> > +		 * decryption.
> > +		 */
> > +		if (sp->hdr.securityIndex != 0) {
> > +			struct sk_buff *nskb = skb_unshare(skb, GFP_ATOMIC);
> > +			if (!nskb) {
> > +				rxrpc_eaten_skb(skb, rxrpc_skb_unshared_nomem);
> > +				goto out;
> > +			}
> > +
> > +			if (nskb != skb) {
> > +				rxrpc_eaten_skb(skb, rxrpc_skb_received);
> > +				rxrpc_new_skb(skb, rxrpc_skb_unshared);
> > +				skb = nskb;
> > +				sp = rxrpc_skb(skb);
> > +			}
> > +		}
> 
> Unsharing skb makes it perilous to take a peep at it afterwards.

Ah, good point.  rxrpc_new_skb() should be after the assignment.

David