@@ -13,6 +13,7 @@ Post-v2.15.0
* Enable the AVX512 DPCLS implementation to use VPOPCNT instruction if the
CPU supports it. This enhances performance by using the native vpopcount
instructions, instead of the emulated version of vpopcount.
+ * Optimize dp_netdev_output by enhancing compiler optimization potential.
v2.15.0 - xx xxx xxxx
---------------------
@@ -7254,12 +7254,25 @@ dp_execute_output_action(struct dp_netdev_pmd_thread *pmd,
pmd->n_output_batches++;
}
- struct dp_packet *packet;
- DP_PACKET_BATCH_FOR_EACH (i, packet, packets_) {
- p->output_pkts_rxqs[dp_packet_batch_size(&p->output_pkts)] =
- pmd->ctx.last_rxq;
- dp_packet_batch_add(&p->output_pkts, packet);
+ /* The above checks ensure that there is enough space in the output batch.
+ * Using dp_packet_batch_add() has a branch to check if the batch is full.
+ * This branch reduces the compiler's ability to optimize efficiently. The
+ * below code implements packet movement between batches without checks,
+ * with the required semantics of output batch perhaps containing packets.
+ */
+ int batch_size = dp_packet_batch_size(packets_);
+ int out_batch_idx = dp_packet_batch_size(&p->output_pkts);
+ struct dp_netdev_rxq *rxq = pmd->ctx.last_rxq;
+ struct dp_packet_batch *output_batch = &p->output_pkts;
+
+ for (int i = 0; i < batch_size; i++) {
+ struct dp_packet *packet = packets_->packets[i];
+ p->output_pkts_rxqs[out_batch_idx] = rxq;
+ output_batch->packets[out_batch_idx] = packet;
+ out_batch_idx++;
}
+ output_batch->count += batch_size;
+
return true;
}
This commit optimizes the output action, by enabling the compiler to optimize the code better through reducing code complexity. The core concept of this optimization is that the array-length checks have already been performed above the copying code, so can be removed. Removing of the per-packet length checks allows the compiler to auto-vectorize the stores using SIMD registers. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> --- v8: Add NEWS entry. --- NEWS | 1 + lib/dpif-netdev.c | 23 ++++++++++++++++++----- 2 files changed, 19 insertions(+), 5 deletions(-)