Message ID | 20191007135933.12483-3-leon@kernel.org |
---|---|
State | Awaiting Upstream |
Delegated to: | David Miller |
Headers | show |
Series | Optimize SGL registration | expand |
> */ > + > static inline bool rdma_rw_io_needs_mr(struct ib_device *dev, u8 port_num, Same like an empty line sneaked in here. Except for that the whole series looks fine: Reviewed-by: Christoph Hellwig <hch@lst.de>
On 10/7/19 6:59 AM, Leon Romanovsky wrote: > /* > - * Check if the device might use memory registration. This is currently only > - * true for iWarp devices. In the future we can hopefully fine tune this based > - * on HCA driver input. > + * Check if the device might use memory registration. This is currently > + * true for iWarp devices and devices that have optimized SGL registration > + * logic. > */ The following sentence in the above comment looks confusing to me: "Check if the device might use memory registration." That sentence suggests that the HCA decides whether or not to use memory registration. Isn't it the RDMA R/W code that decides whether or not to use memory registration? > + * For RDMA READs we must use MRs on iWarp and can optionaly use them as an > + * optimaztion otherwise. Additionally we have a debug option to force usage > + * of MRs to help testing this code path. You may want to change "optionaly" into "optionally" and "optimaztion" into "optimization". > static inline bool rdma_rw_io_needs_mr(struct ib_device *dev, u8 port_num, > enum dma_data_direction dir, int dma_nents) > { > - if (rdma_protocol_iwarp(dev, port_num) && dir == DMA_FROM_DEVICE) > - return true; > + if (dir == DMA_FROM_DEVICE) { > + if (rdma_protocol_iwarp(dev, port_num)) > + return true; > + if (dev->attrs.max_sgl_rd && dma_nents > dev->attrs.max_sgl_rd) > + return true; > + } > if (unlikely(rdma_rw_force_mr)) > return true; > return false; Should this function be renamed? The function name suggests if this function returns 'true' that using memory registration is mandatory. My understanding is if this function returns true for the mlx5 HCA that using memory registration improves performance but is not mandatory. Thanks, Bart.
On 10/7/19 6:59 AM, Leon Romanovsky wrote: > diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h > index 4f671378dbfc..60fd98a9b7e8 100644 > --- a/include/rdma/ib_verbs.h > +++ b/include/rdma/ib_verbs.h > @@ -445,6 +445,8 @@ struct ib_device_attr { > struct ib_tm_caps tm_caps; > struct ib_cq_caps cq_caps; > u64 max_dm_size; > + /* Max entries for sgl for optimized performance per READ */ ^^^^^^^^^ optimal? Should it be mentioned that zero means that the HCA has not set this parameter? > + u32 max_sgl_rd; Thanks, Bart.
On Mon, Oct 07, 2019 at 08:00:41AM -0700, Christoph Hellwig wrote: > > */ > > + > > static inline bool rdma_rw_io_needs_mr(struct ib_device *dev, u8 port_num, > > Same like an empty line sneaked in here. Except for that the whole > series looks fine: > > Reviewed-by: Christoph Hellwig <hch@lst.de> Thanks
On Mon, Oct 07, 2019 at 08:12:37AM -0700, Bart Van Assche wrote: > On 10/7/19 6:59 AM, Leon Romanovsky wrote: > > diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h > > index 4f671378dbfc..60fd98a9b7e8 100644 > > --- a/include/rdma/ib_verbs.h > > +++ b/include/rdma/ib_verbs.h > > @@ -445,6 +445,8 @@ struct ib_device_attr { > > struct ib_tm_caps tm_caps; > > struct ib_cq_caps cq_caps; > > u64 max_dm_size; > > + /* Max entries for sgl for optimized performance per READ */ > ^^^^^^^^^ > optimal? > > Should it be mentioned that zero means that the HCA has not set this > parameter? It is always the case for other max_* values. > > > + u32 max_sgl_rd; > > Thanks, > > Bart.
On Mon, Oct 07, 2019 at 08:07:55AM -0700, Bart Van Assche wrote: > On 10/7/19 6:59 AM, Leon Romanovsky wrote: > > /* > > - * Check if the device might use memory registration. This is currently only > > - * true for iWarp devices. In the future we can hopefully fine tune this based > > - * on HCA driver input. > > + * Check if the device might use memory registration. This is currently > > + * true for iWarp devices and devices that have optimized SGL registration > > + * logic. > > */ > > The following sentence in the above comment looks confusing to me: "Check if > the device might use memory registration." That sentence suggests that the > HCA decides whether or not to use memory registration. Isn't it the RDMA R/W > code that decides whether or not to use memory registration? I'm open for any reasonable text, what do you expect to be written there? > > > + * For RDMA READs we must use MRs on iWarp and can optionaly use them as an > > + * optimaztion otherwise. Additionally we have a debug option to force usage > > + * of MRs to help testing this code path. > > You may want to change "optionaly" into "optionally" and "optimaztion" into > "optimization". Thanks > > > static inline bool rdma_rw_io_needs_mr(struct ib_device *dev, u8 port_num, > > enum dma_data_direction dir, int dma_nents) > > { > > - if (rdma_protocol_iwarp(dev, port_num) && dir == DMA_FROM_DEVICE) > > - return true; > > + if (dir == DMA_FROM_DEVICE) { > > + if (rdma_protocol_iwarp(dev, port_num)) > > + return true; > > + if (dev->attrs.max_sgl_rd && dma_nents > dev->attrs.max_sgl_rd) > > + return true; > > + } > > if (unlikely(rdma_rw_force_mr)) > > return true; > > return false; > > Should this function be renamed? The function name suggests if this function > returns 'true' that using memory registration is mandatory. My understanding > is if this function returns true for the mlx5 HCA that using memory > registration improves performance but is not mandatory. The end result the same, better to work with MR while working with mlx5 for "dma_nents > dev->attrs.max_sgl_rd", Thanks > > Thanks, > > Bart.
On 10/7/19 9:03 AM, Leon Romanovsky wrote: > On Mon, Oct 07, 2019 at 08:07:55AM -0700, Bart Van Assche wrote: >> On 10/7/19 6:59 AM, Leon Romanovsky wrote: >>> /* >>> - * Check if the device might use memory registration. This is currently only >>> - * true for iWarp devices. In the future we can hopefully fine tune this based >>> - * on HCA driver input. >>> + * Check if the device might use memory registration. This is currently >>> + * true for iWarp devices and devices that have optimized SGL registration >>> + * logic. >>> */ >> >> The following sentence in the above comment looks confusing to me: "Check if >> the device might use memory registration." That sentence suggests that the >> HCA decides whether or not to use memory registration. Isn't it the RDMA R/W >> code that decides whether or not to use memory registration? > > I'm open for any reasonable text, what do you expect to be written there? Hi Leon, How about the following (not sure whether this is correct)? /* * Report whether memory registration should be used. Memory * registration must be used for iWarp devices because of * iWARP-specific limitations. Memory registration is also enabled if * registering memory will yield better performance than using multiple * SGE entries. */ Thanks, Bart.
On Mon, Oct 07, 2019 at 03:22:30PM -0700, Bart Van Assche wrote: > On 10/7/19 9:03 AM, Leon Romanovsky wrote: > > On Mon, Oct 07, 2019 at 08:07:55AM -0700, Bart Van Assche wrote: > > > On 10/7/19 6:59 AM, Leon Romanovsky wrote: > > > > /* > > > > - * Check if the device might use memory registration. This is currently only > > > > - * true for iWarp devices. In the future we can hopefully fine tune this based > > > > - * on HCA driver input. > > > > + * Check if the device might use memory registration. This is currently > > > > + * true for iWarp devices and devices that have optimized SGL registration > > > > + * logic. > > > > */ > > > > > > The following sentence in the above comment looks confusing to me: "Check if > > > the device might use memory registration." That sentence suggests that the > > > HCA decides whether or not to use memory registration. Isn't it the RDMA R/W > > > code that decides whether or not to use memory registration? > > > > I'm open for any reasonable text, what do you expect to be written there? > > Hi Leon, > > How about the following (not sure whether this is correct)? > > /* > * Report whether memory registration should be used. Memory > * registration must be used for iWarp devices because of > * iWARP-specific limitations. Memory registration is also enabled if > * registering memory will yield better performance than using multiple > * SGE entries. > */ "Better performance" is relevant for mlx5 only, maybe others will use this max_.. field to overcome their HW limitations. Thanks > > Thanks, > > Bart.
diff --git a/drivers/infiniband/core/rw.c b/drivers/infiniband/core/rw.c index 5337393d4dfe..c27a543b58ef 100644 --- a/drivers/infiniband/core/rw.c +++ b/drivers/infiniband/core/rw.c @@ -20,14 +20,16 @@ module_param_named(force_mr, rdma_rw_force_mr, bool, 0); MODULE_PARM_DESC(force_mr, "Force usage of MRs for RDMA READ/WRITE operations"); /* - * Check if the device might use memory registration. This is currently only - * true for iWarp devices. In the future we can hopefully fine tune this based - * on HCA driver input. + * Check if the device might use memory registration. This is currently + * true for iWarp devices and devices that have optimized SGL registration + * logic. */ static inline bool rdma_rw_can_use_mr(struct ib_device *dev, u8 port_num) { if (rdma_protocol_iwarp(dev, port_num)) return true; + if (dev->attrs.max_sgl_rd) + return true; if (unlikely(rdma_rw_force_mr)) return true; return false; @@ -35,17 +37,20 @@ static inline bool rdma_rw_can_use_mr(struct ib_device *dev, u8 port_num) /* * Check if the device will use memory registration for this RW operation. - * We currently always use memory registrations for iWarp RDMA READs, and - * have a debug option to force usage of MRs. - * - * XXX: In the future we can hopefully fine tune this based on HCA driver - * input. + * For RDMA READs we must use MRs on iWarp and can optionaly use them as an + * optimaztion otherwise. Additionally we have a debug option to force usage + * of MRs to help testing this code path. */ + static inline bool rdma_rw_io_needs_mr(struct ib_device *dev, u8 port_num, enum dma_data_direction dir, int dma_nents) { - if (rdma_protocol_iwarp(dev, port_num) && dir == DMA_FROM_DEVICE) - return true; + if (dir == DMA_FROM_DEVICE) { + if (rdma_protocol_iwarp(dev, port_num)) + return true; + if (dev->attrs.max_sgl_rd && dma_nents > dev->attrs.max_sgl_rd) + return true; + } if (unlikely(rdma_rw_force_mr)) return true; return false; diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 4f671378dbfc..60fd98a9b7e8 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -445,6 +445,8 @@ struct ib_device_attr { struct ib_tm_caps tm_caps; struct ib_cq_caps cq_caps; u64 max_dm_size; + /* Max entries for sgl for optimized performance per READ */ + u32 max_sgl_rd; }; enum ib_mtu {