Message ID | 1594923010-6234-2-git-send-email-akiyano@amazon.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Headers | show |
Series | ENA driver new features | expand |
On Thu, 16 Jul 2020 21:10:03 +0300 akiyano@amazon.com wrote: > This patch doesn't require smp_rmb() instruction in the napi routine > because it assumes cache coherency between two cores. I.e. the > 'interrupts_masked' flag set would be seen by the napi routine, even if > the flag is stored in L1 cache. > To the best of my knowledge this assumption holds for ARM64 and x86_64 > architecture which use a MESI like cache coherency model. If that's the case - for those architectures smb_rmb() should be defined to barrier(). Why can't you adhere to kernel's memory model, rather than guessing the architecture in the driver.
From: <akiyano@amazon.com> Date: Thu, 16 Jul 2020 21:10:03 +0300 > To the best of my knowledge this assumption holds for ARM64 and x86_64 > architecture which use a MESI like cache coherency model. Use the well defined kernel memory model correctly please. This is no place for architectural assumptions. The memory model of the kernel defines the rules, and in what locations various memory barriers are required for correct operation. Thank you.
On 7/17/20, 12:53 PM, "David Miller" <davem@davemloft.net> wrote: CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. From: <akiyano@amazon.com> Date: Thu, 16 Jul 2020 21:10:03 +0300 > To the best of my knowledge this assumption holds for ARM64 and x86_64 > architecture which use a MESI like cache coherency model. Use the well defined kernel memory model correctly please. This is no place for architectural assumptions. The memory model of the kernel defines the rules, and in what locations various memory barriers are required for correct operation. Thank you. True and we will add smp_rmb() And I wouldn’t worry about the perf hit here, both x86 and modern arm v8 (specifically the Graviton2 that uses ENA) are pretty efficient and close enough to no-op
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index 91be3ffa1c5c..90c0fe15cd23 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -1913,7 +1913,9 @@ static int ena_io_poll(struct napi_struct *napi, int budget) /* Update numa and unmask the interrupt only when schedule * from the interrupt context (vs from sk_busy_loop) */ - if (napi_complete_done(napi, rx_work_done)) { + if (napi_complete_done(napi, rx_work_done) && + READ_ONCE(ena_napi->interrupts_masked)) { + WRITE_ONCE(ena_napi->interrupts_masked, false); /* We apply adaptive moderation on Rx path only. * Tx uses static interrupt moderation. */ @@ -1961,6 +1963,9 @@ static irqreturn_t ena_intr_msix_io(int irq, void *data) ena_napi->first_interrupt = true; + WRITE_ONCE(ena_napi->interrupts_masked, true); + smp_wmb(); /* write interrupts_masked before calling napi */ + napi_schedule_irqoff(&ena_napi->napi); return IRQ_HANDLED; diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h index ba030d260940..89304b403995 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.h +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h @@ -167,6 +167,7 @@ struct ena_napi { struct ena_ring *rx_ring; struct ena_ring *xdp_ring; bool first_interrupt; + bool interrupts_masked; u32 qid; struct dim dim; };