[v3,net] af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET

When an application is run that:
a) Sets its scheduler to be SCHED_FIFO
and
b) Opens a memory mapped AF_PACKET socket, and sends frames with the
MSG_DONTWAIT flag cleared, it's possible for the application to hang
forever in the kernel.  This occurs because, while waiting, the code in
tpacket_snd calls schedule, which under normal circumstances allows
other tasks to run, including ksoftirqd, which in some cases is
responsible for freeing the transmitted skb (which in AF_PACKET calls a
destructor that flips the status bit of the transmitted frame back to
available, allowing the transmitting task to complete).

However, when the calling application is SCHED_FIFO, its priority is
such that the schedule call immediately places the task back on the cpu,
preventing ksoftirqd from freeing the skb, which in turn prevents the
transmitting task from detecting that the transmission is complete.

We can fix this by converting the schedule call to a completion
mechanism.  By using a completion queue, we force the calling task, when
it detects there are no more frames to send, to schedule itself off the
cpu until such time as the last transmitted skb is freed, allowing
forward progress to be made.

Tested by myself and the reporter, with good results.

Applies to the net tree.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Matteo Croce <mcroce@redhat.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Willem de Bruijn <willemdebruijn.kernel@gmail.com>

Change Notes:

V1->V2:
	Enhance the sleep logic to be interruptible and to honor
SK_SNDTIMEO (Willem de Bruijn)

V2->V3:
	Rearrange the point at which we wait for the completion queue, to
avoid needing to check for ph/skb being null at the end of the loop.
Also move the complete call to the skb destructor to avoid needing to
modify __packet_set_status.  Also gate calling complete on
packet_read_pending returning zero to avoid multiple calls to complete.
(Willem de Bruijn)

	Move timeo computation within loop, to re-fetch the socket
timeout since we also use the timeo variable to record the return code
from the wait_for_complete call (Neil Horman)
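
A small sketch of why the re-fetch matters, assuming the timed wait
returns the time remaining on success and zero on expiry (as the kernel's
timed completion waits do).  fake_timed_wait and count_timeouts are
hypothetical helpers used only for illustration; they are not part of the
patch.

```c
#include <assert.h>

/* Hypothetical stand-in for a timed wait: returns the time left,
 * or 0 if the timeout expired before the event arrived. */
static long fake_timed_wait(long timeout, long elapsed)
{
	return timeout > elapsed ? timeout - elapsed : 0;
}

/* Runs `iterations` waits that each take `elapsed_per_wait` and returns
 * how many of them timed out.  When `refetch` is nonzero, timeo is
 * reloaded from the socket timeout at the top of every iteration. */
static int count_timeouts(long sndtimeo, long elapsed_per_wait,
			  int iterations, int refetch)
{
	long timeo = sndtimeo;
	int timeouts = 0;
	int i;

	for (i = 0; i < iterations; i++) {
		if (refetch)
			timeo = sndtimeo;
		/* timeo doubles as the return code of the wait */
		timeo = fake_timed_wait(timeo, elapsed_per_wait);
		if (timeo == 0)
			timeouts++;
	}
	return timeouts;
}
```

With a timeout of 100 and waits of 40 each, three iterations never time
out when timeo is re-fetched, but the third wait expires when the leftover
value is carried across iterations.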
---
 net/packet/af_packet.c | 59 +++++++++++++++++++++++++++++++++++++-----
 net/packet/internal.h  |  2 ++
 2 files changed, 55 insertions(+), 6 deletions(-)

Message ID	20190624004604.25607-1-nhorman@tuxdriver.com
State	Superseded
Delegated to:	David Miller
Headers	Return-Path: <netdev-owner@vger.kernel.org>
	From: Neil Horman <nhorman@tuxdriver.com>
	To: netdev@vger.kernel.org
	Cc: Neil Horman <nhorman@tuxdriver.com>, Matteo Croce <mcroce@redhat.com>, "David S. Miller" <davem@davemloft.net>, Willem de Bruijn <willemdebruijn.kernel@gmail.com>
	Subject: [PATCH v3 net] af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET
	Date: Sun, 23 Jun 2019 20:46:04 -0400
	Message-Id: <20190624004604.25607-1-nhorman@tuxdriver.com>
	In-Reply-To: <20190619202533.4856-1-nhorman@tuxdriver.com>
	References: <20190619202533.4856-1-nhorman@tuxdriver.com>
	MIME-Version: 1.0
	Content-Transfer-Encoding: 8bit
	Sender: netdev-owner@vger.kernel.org
	Precedence: bulk
Series	[v3,net] af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET
