Message ID | 20190428073932.9898-1-ming.lei@redhat.com |
---|---|
Series | scsi: core: avoid big pre-allocation for sg list |

On Sun, Apr 28, 2019 at 03:39:29PM +0800, Ming Lei wrote:
> Hi,
>
> Since supporting to blk-mq, big pre-allocation for sg list is introduced,
> this way is very unfriendly wrt. memory consumption.
>
> There were Red Hat internal reports that some scsi_debug based tests
> can't be run any more because the pre-allocation is too big.
>
> Also, lpfc users complained that 1GB+ of RAM is pre-allocated for a single
> HBA.
>
> sg_alloc_table_chained() is improved to support a variable size for the 1st
> pre-allocated SGL in the 1st patch, as suggested by Christoph.
>
> The other two patches try to address this issue by allocating the sg list
> at runtime, while pre-allocating one or two inline sg entries for small IO.
> This follows NVMe's approach wrt. sg list allocation.
>
> V4:
> 	- add parameter to sg_alloc_table_chained()/sg_free_table_chained()
> 	  directly, and update current callers
>
> V3:
> 	- improve sg_alloc_table_chained() to accept a variable size for
> 	  the 1st pre-allocated SGL
> 	- apply the improved sg API to address the big pre-allocation
> 	  issue
>
> V2:
> 	- move inline sg table initialization into one helper
> 	- introduce new helper for getting inline sg
> 	- comment log fix
>
>
> Ming Lei (3):
>   lib/sg_pool.c: improve APIs for allocating sg pool
>   scsi: core: avoid to pre-allocate big chunk for protection meta data
>   scsi: core: avoid to pre-allocate big chunk for sg list
>
>  drivers/nvme/host/fc.c            |  7 ++++---
>  drivers/nvme/host/rdma.c          |  7 ++++---
>  drivers/nvme/target/loop.c        |  4 ++--
>  drivers/scsi/scsi_lib.c           | 31 ++++++++++++++++++++++---------
>  include/linux/scatterlist.h       | 11 +++++++----
>  lib/scatterlist.c                 | 36 +++++++++++++++++++++++-------------
>  lib/sg_pool.c                     | 37 +++++++++++++++++++++++++++----------
>  net/sunrpc/xprtrdma/svc_rdma_rw.c |  5 +++--
>  8 files changed, 92 insertions(+), 46 deletions(-)
>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: Ewan D. Milne <emilne@redhat.com>
> Cc: Hannes Reinecke <hare@suse.com>
> Cc: Sagi Grimberg <sagi@grimberg.me>
> Cc: Chuck Lever <chuck.lever@oracle.com>
> Cc: netdev@vger.kernel.org
> Cc: linux-nvme@lists.infradead.org

Hi Martin,

Could you consider merging this patchset for 5.2 if you are fine with it?

Thanks,
Ming
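
The idea described in the cover letter (keep one or two sg entries inline in the command, and allocate anything larger at runtime) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the code from the patches: `struct demo_sg`, `struct demo_cmd`, `DEMO_INLINE_SG_CNT` and the `demo_cmd_*` helpers are hypothetical names, and the model allocates one flat array with `calloc()` instead of chaining chunks from a pool the way `sg_alloc_table_chained()` does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct scatterlist. */
struct demo_sg {
	void   *addr;
	size_t  len;
};

/* Assumed inline entry count; the cover letter says "one or two". */
#define DEMO_INLINE_SG_CNT 2

/* Hypothetical per-command data: only the inline entries are pre-allocated. */
struct demo_cmd {
	struct demo_sg  inline_sg[DEMO_INLINE_SG_CNT];
	struct demo_sg *sgl;    /* points at inline_sg or at a heap array */
	unsigned int    nents;
};

/* Returns nonzero when the command is served from its inline entries. */
static int demo_cmd_uses_inline(const struct demo_cmd *cmd)
{
	return cmd->sgl == cmd->inline_sg;
}

/*
 * Pick storage for nents entries: inline for small I/O, heap otherwise.
 * Returns 0 on success, -1 if the runtime allocation fails.
 */
static int demo_cmd_map(struct demo_cmd *cmd, unsigned int nents)
{
	assert(nents > 0);

	if (nents <= DEMO_INLINE_SG_CNT) {
		cmd->sgl = cmd->inline_sg;
	} else {
		cmd->sgl = calloc(nents, sizeof(*cmd->sgl));
		if (!cmd->sgl) {
			cmd->nents = 0;
			return -1;
		}
	}
	cmd->nents = nents;
	return 0;
}

/* Release runtime-allocated storage; inline storage needs no free. */
static void demo_cmd_unmap(struct demo_cmd *cmd)
{
	if (cmd->sgl && !demo_cmd_uses_inline(cmd))
		free(cmd->sgl);
	cmd->sgl = NULL;
	cmd->nents = 0;
}
```

In this model a command costs only `DEMO_INLINE_SG_CNT` entries of standing memory, and a request that needs more entries pays for them only while it is in flight. That is the trade-off the series makes in place of a large per-command pre-allocation.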

Ming,

> Since supporting to blk-mq, big pre-allocation for sg list is
> introduced, this way is very unfriendly wrt. memory consumption.

Applied to 5.3/scsi-queue with some clarifications to the commit
descriptions.

I am not entirely sold on 1 for the inline protection SGL size. NVMe
over PCIe is pretty constrained thanks to the metadata pointer whereas
SCSI DIX uses a real SGL for the PI. Consequently, straddling a page is
not that uncommon for large, sequential I/Os.

But let's try it out. If performance suffers substantially, we may want
to bump it to 2.
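
The point about protection information straddling a page can be made concrete with some arithmetic. The sketch below is illustrative only: it assumes 4 KiB pages, the 8-byte T10 PI tuple per logical block, and that each page touched by the PI buffer needs its own SGL entry (physically contiguous pages that get merged into one entry would lower the count). The `DEMO_*` constants and `demo_*` helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u /* assumed page size */
#define DEMO_PI_TUPLE  8u    /* bytes of T10 PI per logical block */

/* Bytes of protection information for an I/O of io_bytes. */
static size_t demo_pi_bytes(size_t io_bytes, size_t block_size)
{
	assert(block_size > 0 && io_bytes % block_size == 0);
	return io_bytes / block_size * DEMO_PI_TUPLE;
}

/* Number of pages touched by a buffer of len bytes starting at the
 * given offset inside its first page. */
static size_t demo_pages_spanned(size_t offset_in_page, size_t len)
{
	assert(offset_in_page < DEMO_PAGE_SIZE);
	if (len == 0)
		return 0;
	return (offset_in_page + len + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}
```

Under these assumptions, a 4 KiB I/O on 512-byte blocks carries 64 bytes of PI, which fits in one page unless the buffer starts within the last 63 bytes of a page. A 1 MiB sequential I/O carries 16 KiB of PI, which covers at least four pages, so a single inline entry is exceeded as soon as those pages are not contiguous.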