Message ID: 20190821003427.13593-1-awilliams@marvell.com
State: Superseded
Series: [U-Boot,1/1] nvme: Fix PRP Offset Invalid
Hi Aaron,

On Wed, Aug 21, 2019 at 8:34 AM Aaron Williams <awilliams@marvell.com> wrote:
>
> When large writes take place I saw a Samsung EVO 970+ return a status
> value of 0x13, PRP Offset Invalid. I tracked this down to the improper
> handling of PRP entries. The blocks the PRP entries are placed in
> cannot cross a page boundary and thus should be allocated on page
> boundaries. This is how the Linux kernel driver works.
>
> With this patch, the PRP pool is allocated on a page boundary and
> other than the very first allocation, the pool size is a multiple of
> the page size. Each page can hold (4096 / 8) - 1 entries since the
> last entry must point to the next page in the pool.

Please write more words in a line, about 70 characters in a line.

> Signed-off-by: Aaron Williams <awilliams@marvell.com>
> ---
>  drivers/nvme/nvme.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/nvme/nvme.c b/drivers/nvme/nvme.c
> index 7008a54a6d..ae64459edf 100644
> --- a/drivers/nvme/nvme.c
> +++ b/drivers/nvme/nvme.c
> @@ -75,6 +75,8 @@ static int nvme_setup_prps(struct nvme_dev *dev, u64 *prp2,
>  	int length = total_len;
>  	int i, nprps;
>  	length -= (page_size - offset);
> +	u32 prps_per_page = (page_size >> 3) - 1;
> +	u32 num_pages;

nits: please move these 2 above the line "length -= (page_size - offset);"

>
>  	if (length <= 0) {
>  		*prp2 = 0;
>
> @@ -90,15 +92,16 @@ static int nvme_setup_prps(struct nvme_dev *dev, u64 *prp2,
>  	}
>
>  	nprps = DIV_ROUND_UP(length, page_size);
> +	num_pages = (nprps + prps_per_page - 1) / prps_per_page;

use DIV_ROUND_UP()

>
>  	if (nprps > dev->prp_entry_num) {

I think we should adjust nprps before the comparison here.

    nprps += num_pages - 1;

>  		free(dev->prp_pool);
> -		dev->prp_pool = malloc(nprps << 3);
> +		dev->prp_pool = memalign(page_size, num_pages * page_size);

Then we need only do: dev->prp_pool = memalign(page_size, nprps << 3)?

>  		if (!dev->prp_pool) {
>  			printf("Error: malloc prp_pool fail\n");
>  			return -ENOMEM;
>  		}
> -		dev->prp_entry_num = nprps;
> +		dev->prp_entry_num = ((page_size >> 3) - 1) * num_pages;

and no need to change this line, but we need change the while (nprps) {} loop.

>  	}
>
>  	prp_pool = dev->prp_pool;
>
> @@ -791,7 +794,7 @@ static int nvme_probe(struct udevice *udev)
>  	}
>  	memset(ndev->queues, 0, NVME_Q_NUM * sizeof(struct nvme_queue *));
>
> -	ndev->prp_pool = malloc(MAX_PRP_POOL);
> +	ndev->prp_pool = memalign(1 << 12, MAX_PRP_POOL);

I think we need use ndev->pagesize instead of (1 << 12), but
ndev->pagesize is initialized in nvme_configure_admin_queue(), so we
need move the initialization of ndev->pagesize out of that function
and do it before the memalign() here.

>  	if (!ndev->prp_pool) {
>  		ret = -ENOMEM;
>  		printf("Error: %s: Out of memory!\n", udev->name);

--
Regards,
Bin
Hopefully this addresses your concerns. Note that nprps can't be used
directly for the size, since one entry must be removed from each page to
point to the next page.

-Aaron
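The sizing math being debated above can be captured in a small helper. This is an illustrative sketch, not the actual U-Boot code: the function name is hypothetical, and the open-coded division mirrors what the kernel's DIV_ROUND_UP() macro expands to. Each pool page holds (page_size / 8) - 1 usable PRP entries because the last 8-byte slot chains to the next page.

```c
#include <assert.h>

/* Hypothetical helper: how many whole pages the PRP pool needs in order
 * to hold nprps data pointers, given that each page sacrifices its last
 * slot as a link to the next page. Equivalent to
 * DIV_ROUND_UP(nprps, prps_per_page). */
unsigned int prp_pool_pages(unsigned int nprps, unsigned int page_size)
{
	unsigned int prps_per_page = (page_size >> 3) - 1; /* 511 for 4 KiB */

	return (nprps + prps_per_page - 1) / prps_per_page;
}
```

For a 4 KiB page, 511 PRPs still fit in one page, but the 512th forces a second page, which is exactly why `nprps << 3` undercounts once the list spans multiple pages.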
Hi Bin,

I submitted another patch via git. Hopefully it went through. I'm new to
trying to get email to work with git, since until now nobody in my group
has had access to a working SMTP server, so I'm still learning how to
use git send-email.

-Aaron

On Wednesday, August 21, 2019 12:55:59 AM PDT Bin Meng wrote:
> Hi Aaron,
>
> On Wed, Aug 21, 2019 at 8:34 AM Aaron Williams <awilliams@marvell.com> wrote:
> > When large writes take place I saw a Samsung EVO 970+ return a status
> > value of 0x13, PRP Offset Invalid. [snip]
>
> Please write more words in a line, about 70 characters in a line.
>
> > +	u32 prps_per_page = (page_size >> 3) - 1;
> > +	u32 num_pages;
>
> nits: please move these 2 above the line "length -= (page_size - offset);"

Done.

> > 	nprps = DIV_ROUND_UP(length, page_size);
> > +	num_pages = (nprps + prps_per_page - 1) / prps_per_page;
>
> use DIV_ROUND_UP()

Done.

> > 	if (nprps > dev->prp_entry_num) {
>
> I think we should adjust nprps before the comparison here.
>
>     nprps += num_pages - 1;
>
> > 		free(dev->prp_pool);
> > -		dev->prp_pool = malloc(nprps << 3);
> > +		dev->prp_pool = memalign(page_size, num_pages * page_size);
>
> Then we need only do: dev->prp_pool = memalign(page_size, nprps << 3)?

We can't use nprps << 3 because if the prps span more than a single page
then we lose a prp per page.

> > -		dev->prp_entry_num = nprps;
> > +		dev->prp_entry_num = ((page_size >> 3) - 1) * num_pages;
>
> and no need to change this line, but we need change the while (nprps) {}
> loop.
>
> > -	ndev->prp_pool = malloc(MAX_PRP_POOL);
> > +	ndev->prp_pool = memalign(1 << 12, MAX_PRP_POOL);
>
> I think we need use ndev->pagesize instead of (1 << 12), but
> ndev->pagesize is initialized in nvme_configure_admin_queue(), so we
> need move the initialization of ndev->pagesize out of that function
> and do it before the memalign() here.
>
> --
> Regards,
> Bin
Hi Aaron,

On Wed, Aug 21, 2019 at 7:26 PM Aaron Williams <awilliams@marvell.com> wrote:
>
> Hi Bin,
>
> I submitted another patch via git. Hopefully it went through. I'm new
> to trying to get email to work with git, since until now nobody in my
> group has had access to a working SMTP server, so I'm still learning
> how to use git send-email.
>
> -Aaron
[snip]
> > > 	if (nprps > dev->prp_entry_num) {
> >
> > I think we should adjust nprps before the comparison here.
> >
> >     nprps += num_pages - 1;
> >
> > > 		free(dev->prp_pool);
> > > -		dev->prp_pool = malloc(nprps << 3);
> > > +		dev->prp_pool = memalign(page_size, num_pages * page_size);
> >
> > Then we need only do: dev->prp_pool = memalign(page_size, nprps << 3)?
>
> We can't use nprps << 3 because if the prps span more than a single
> page then we lose a prp per page.

Looks like you missed my comment above.

If we adjust nprps by "nprps += num_pages - 1", then we don't lose the
last prp per page.

Mallocing num_pages * page_size exceeds the real needs. We should only
allocate the exact prp size we need.

[snip]

Regards,
Bin
Hi Bin,

On Wednesday, August 21, 2019 8:23:50 AM PDT Bin Meng wrote:
[snip]
> > We can't use nprps << 3 because if the prps span more than a single
> > page then we lose a prp per page.
>
> Looks you missed my comment above.
>
> If we adjust nprps by "nprps += num_pages - 1", then we don't lose the
> last prp per page.

This would work.

> Mallocing num_pages * page_size exceeds the real needs. We should only
> allocate the exact prp size we need.

This is true, but if we were forced to increase it once, there is a good
chance it would need to be increased again. I do not see an issue in
allocating based on the number of pages required rather than the number
of bytes: at most, 4K - 8 bytes will be wasted. On the other hand, this
can prevent additional memalign() calls.

[snip]
Hi Aaron,

On Thu, Aug 22, 2019 at 6:06 AM Aaron Williams <awilliams@marvell.com> wrote:
[snip]
> > Mallocing num_pages * page_size exceeds the real needs. We should
> > only allocate the exact prp size we need.
>
> This is true, but if we were forced to increase it, there may be a good
> chance that it may need to be increased again. I do not see an issue in
> allocating based on the number of pages required rather than the number
> of bytes. At most, 4K-8 will be wasted. On the other hand, this can
> prevent additional memalign calls from being called.

OK, please add some comments there indicating why we are doing this. Thanks.

[snip]

Regards,
Bin
diff --git a/drivers/nvme/nvme.c b/drivers/nvme/nvme.c
index 7008a54a6d..ae64459edf 100644
--- a/drivers/nvme/nvme.c
+++ b/drivers/nvme/nvme.c
@@ -75,6 +75,8 @@ static int nvme_setup_prps(struct nvme_dev *dev, u64 *prp2,
 	int length = total_len;
 	int i, nprps;
 	length -= (page_size - offset);
+	u32 prps_per_page = (page_size >> 3) - 1;
+	u32 num_pages;
 
 	if (length <= 0) {
 		*prp2 = 0;
@@ -90,15 +92,16 @@ static int nvme_setup_prps(struct nvme_dev *dev, u64 *prp2,
 	}
 
 	nprps = DIV_ROUND_UP(length, page_size);
+	num_pages = (nprps + prps_per_page - 1) / prps_per_page;
 
 	if (nprps > dev->prp_entry_num) {
 		free(dev->prp_pool);
-		dev->prp_pool = malloc(nprps << 3);
+		dev->prp_pool = memalign(page_size, num_pages * page_size);
 		if (!dev->prp_pool) {
 			printf("Error: malloc prp_pool fail\n");
 			return -ENOMEM;
 		}
-		dev->prp_entry_num = nprps;
+		dev->prp_entry_num = ((page_size >> 3) - 1) * num_pages;
 	}
 
 	prp_pool = dev->prp_pool;
@@ -791,7 +794,7 @@ static int nvme_probe(struct udevice *udev)
 	}
 	memset(ndev->queues, 0, NVME_Q_NUM * sizeof(struct nvme_queue *));
 
-	ndev->prp_pool = malloc(MAX_PRP_POOL);
+	ndev->prp_pool = memalign(1 << 12, MAX_PRP_POOL);
 	if (!ndev->prp_pool) {
 		ret = -ENOMEM;
 		printf("Error: %s: Out of memory!\n", udev->name);
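The "last entry must point to the next page" rule that the diff above accounts for can be seen in how a chained PRP list is filled. The following is a rough, hypothetical sketch of such a fill loop, not U-Boot's actual `while (nprps)` loop; names and the in-memory (non-DMA) pointer arithmetic are illustrative only.

```c
#include <stdint.h>

/* Illustrative sketch of filling a chained PRP list: every page of the
 * pool holds prps_per_page data pointers, and its final 8-byte slot
 * stores the address of the next pool page instead of a data pointer. */
void fill_prp_list(uint64_t *prp_pool, uint64_t dma_addr,
		   unsigned int nprps, unsigned int page_size)
{
	unsigned int prps_per_page = (page_size >> 3) - 1;
	unsigned int i = 0;

	while (nprps) {
		if (i == prps_per_page) {
			/* Last slot of this page: chain to the next page
			 * of the pool, then restart the index there. */
			prp_pool[i] = (uint64_t)(uintptr_t)&prp_pool[i + 1];
			prp_pool += i + 1;
			i = 0;
			continue;
		}
		prp_pool[i++] = dma_addr;
		dma_addr += page_size;
		nprps--;
	}
}
```

With a toy 32-byte "page" (four 8-byte slots, three usable), five PRPs occupy two pool pages, and slot 3 of the first page holds the address of the second page rather than a data address.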
When large writes take place I saw a Samsung EVO 970+ return a status
value of 0x13, PRP Offset Invalid. I tracked this down to the improper
handling of PRP entries. The blocks the PRP entries are placed in cannot
cross a page boundary and thus should be allocated on page boundaries.
This is how the Linux kernel driver works.

With this patch, the PRP pool is allocated on a page boundary and, other
than the very first allocation, the pool size is a multiple of the page
size. Each page can hold (4096 / 8) - 1 entries, since the last entry
must point to the next page in the pool.

Signed-off-by: Aaron Williams <awilliams@marvell.com>
---
 drivers/nvme/nvme.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

--
2.16.4
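The commit message's core fix is swapping malloc() for a page-aligned allocation. Outside U-Boot, C11's standard aligned_alloc() plays the same role as memalign(); the sketch below is a hypothetical stand-in, not the driver code, and the function name is invented for illustration.

```c
#include <stdlib.h>
#include <stdint.h>

/* Illustrative stand-in for U-Boot's memalign()-based pool allocation.
 * A plain malloc() may return a pointer that starts mid-page, so a PRP
 * list placed in it could straddle a page boundary, which the NVMe
 * controller rejects (status 0x13, PRP Offset Invalid). aligned_alloc()
 * requires the size to be a multiple of the alignment, which
 * num_pages * page_size satisfies by construction. */
void *alloc_prp_pool(size_t page_size, size_t num_pages)
{
	return aligned_alloc(page_size, num_pages * page_size);
}
```

A caller can verify the alignment guarantee by checking that the returned address has no low-order bits set modulo the page size.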