Message ID | e3982d5118a90db2442c6ac18f339ec8ba006df2.1578596897.git.berto@igalia.com |
---|---|
State | New |
Headers | show |
Series | qcow2: Misc BDRV_SECTOR_SIZE updates | expand |
On 09.01.20 20:13, Alberto Garcia wrote: > This replaces all remaining instances in the qcow2 code. > > Signed-off-by: Alberto Garcia <berto@igalia.com> > --- > block/qcow2-cluster.c | 2 +- > block/qcow2.c | 10 ++++++---- > 2 files changed, 7 insertions(+), 5 deletions(-) The question of course is why we would even have such instances still. > diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c > index 932fc48919..777ca2d409 100644 > --- a/block/qcow2-cluster.c > +++ b/block/qcow2-cluster.c > @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset, > * Writes one sector of the L1 table to the disk (can't update single entries > * and we really don't want bdrv_pread to perform a read-modify-write) > */ > -#define L1_ENTRIES_PER_SECTOR (512 / 8) > +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8) > int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index) Here it’s because the comment is wrong: “Can’t update single entries” – yes, we can. We’d just have to do a bdrv_pwrite() to a single entry. > { > BDRVQcow2State *s = bs->opaque; > diff --git a/block/qcow2.c b/block/qcow2.c > index 783d2b9578..c0f3e715ef 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -3280,7 +3280,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp) > > /* Validate options and set default values */ > if (!QEMU_IS_ALIGNED(qcow2_opts->size, BDRV_SECTOR_SIZE)) { > - error_setg(errp, "Image size must be a multiple of 512 bytes"); > + error_setg(errp, "Image size must be a multiple of %u bytes", > + (unsigned) BDRV_SECTOR_SIZE); Either way how patch 1 goes, this is due to a limitation in qemu. > ret = -EINVAL; > goto out; > } > @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs, > case QCOW2_CLUSTER_NORMAL: > child = s->data_file; > copy_offset += offset_into_cluster(s, src_offset); > - if ((copy_offset & 511) != 0) { > + if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) { Hm. I don’t get this one. > ret = -EIO; > goto out; > } > @@ -3958,8 +3959,9 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, > return -ENOTSUP; > } > > - if (offset & 511) { > - error_setg(errp, "The new size must be a multiple of 512"); > + if (!QEMU_IS_ALIGNED(offset, BDRV_SECTOR_SIZE)) { > + error_setg(errp, "The new size must be a multiple of %u", > + (unsigned) BDRV_SECTOR_SIZE); This too is due to the limitation in qemu that BDS lengths are only stored in multiples of BDRV_SECTOR_SIZE, so we disallow any other image sizes. > return -EINVAL; > } So in 3/4 instances this patch looks good to me (it’s really about the block layers concept of a “sector”, even though that notion is outdated in the first hunk), but I don’t quite get the third hunk. (I’m sure it has to do something with the block layer, so it’s OK to change in this patch, but I’m still interested in why we have that limitation there.) Max
On Tue 14 Jan 2020 03:15:48 PM CET, Max Reitz wrote: >> @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset, >> * Writes one sector of the L1 table to the disk (can't update single entries >> * and we really don't want bdrv_pread to perform a read-modify-write) >> */ >> -#define L1_ENTRIES_PER_SECTOR (512 / 8) >> +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8) >> int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index) > > Here it’s because the comment is wrong: “Can’t update single entries” – > yes, we can. We’d just have to do a bdrv_pwrite() to a single entry. What's the point of qcow2_write_l1_entry() then? >> @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs, >> case QCOW2_CLUSTER_NORMAL: >> child = s->data_file; >> copy_offset += offset_into_cluster(s, src_offset); >> - if ((copy_offset & 511) != 0) { >> + if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) { > > Hm. I don’t get this one. Checking the code (e.g. block_copy_do_copy()) it seems that the whole chunk must be cluster aligned so I don't get this one either. Berto
On 17.01.20 00:26, Alberto Garcia wrote: > On Tue 14 Jan 2020 03:15:48 PM CET, Max Reitz wrote: >>> @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset, >>> * Writes one sector of the L1 table to the disk (can't update single entries >>> * and we really don't want bdrv_pread to perform a read-modify-write) >>> */ >>> -#define L1_ENTRIES_PER_SECTOR (512 / 8) >>> +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8) >>> int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index) >> >> Here it’s because the comment is wrong: “Can’t update single entries” – >> yes, we can. We’d just have to do a bdrv_pwrite() to a single entry. > > What's the point of qcow2_write_l1_entry() then? I think the point was that we couldn’t, for a long time, because the block layer only provided sector-granularity access. This function simply was never changed when the block layer gained the ability to do byte-granularity I/O. (We’d still need this function, but only for the endian swap, I think.) >>> @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs, >>> case QCOW2_CLUSTER_NORMAL: >>> child = s->data_file; >>> copy_offset += offset_into_cluster(s, src_offset); >>> - if ((copy_offset & 511) != 0) { >>> + if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) { >> >> Hm. I don’t get this one. > > Checking the code (e.g. block_copy_do_copy()) it seems that the whole > chunk must be cluster aligned so I don't get this one either. Hm, how did you get to block_copy_do_copy()? That’s part of the block-copy infrastructure that’s only used for the backup job, as far as I’m aware. It’s different from copy_range. I don’t see any limitation for copy_range. I suppose maybe it doesn’t work for anything that isn’t aligned to physical sectors? But the qcow2 driver shouldn’t care about that. On thing’s for sure, the raw driver doesn’t care about it. Max
Am 17.01.2020 um 10:12 hat Max Reitz geschrieben: > On 17.01.20 00:26, Alberto Garcia wrote: > > On Tue 14 Jan 2020 03:15:48 PM CET, Max Reitz wrote: > >>> @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset, > >>> * Writes one sector of the L1 table to the disk (can't update single entries > >>> * and we really don't want bdrv_pread to perform a read-modify-write) > >>> */ > >>> -#define L1_ENTRIES_PER_SECTOR (512 / 8) > >>> +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8) > >>> int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index) > >> > >> Here it’s because the comment is wrong: “Can’t update single entries” – > >> yes, we can. We’d just have to do a bdrv_pwrite() to a single entry. > > > > What's the point of qcow2_write_l1_entry() then? > > I think the point was that we couldn’t, for a long time, because the > block layer only provided sector-granularity access. This function > simply was never changed when the block layer gained the ability to do > byte-granularity I/O. > > (We’d still need this function, but only for the endian swap, I think.) We still can't do byte-granularity writes with O_DIRECT, because that's a kernel requirement. The comment explains that we don't want to do a RMW cycle to write a single entry because that would be slower than just writing a whole sector. I think this is still accurate. Maybe we should change the comment to say "can't necessarily update". (The part that looks really wrong in the comment is "bdrv_pread", that should be "bdrv_pwrite"...) Now, what's wrong about the logic to avoid the RMW is that it assumes a fixed required alignment of 512. What it should do is looking at bs->file->bl.request_alignment and rounding accordingly. > >>> @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs, > >>> case QCOW2_CLUSTER_NORMAL: > >>> child = s->data_file; > >>> copy_offset += offset_into_cluster(s, src_offset); > >>> - if ((copy_offset & 511) != 0) { > >>> + if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) { > >> > >> Hm. I don’t get this one. > > > > Checking the code (e.g. block_copy_do_copy()) it seems that the whole > > chunk must be cluster aligned so I don't get this one either. > > Hm, how did you get to block_copy_do_copy()? That’s part of the > block-copy infrastructure that’s only used for the backup job, as far as > I’m aware. It’s different from copy_range. > > I don’t see any limitation for copy_range. I suppose maybe it doesn’t > work for anything that isn’t aligned to physical sectors? But the qcow2 > driver shouldn’t care about that. > > On thing’s for sure, the raw driver doesn’t care about it. I don't understand this one either. Kevin
On 17.01.20 10:55, Kevin Wolf wrote: > Am 17.01.2020 um 10:12 hat Max Reitz geschrieben: >> On 17.01.20 00:26, Alberto Garcia wrote: >>> On Tue 14 Jan 2020 03:15:48 PM CET, Max Reitz wrote: >>>>> @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset, >>>>> * Writes one sector of the L1 table to the disk (can't update single entries >>>>> * and we really don't want bdrv_pread to perform a read-modify-write) >>>>> */ >>>>> -#define L1_ENTRIES_PER_SECTOR (512 / 8) >>>>> +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8) >>>>> int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index) >>>> >>>> Here it’s because the comment is wrong: “Can’t update single entries” – >>>> yes, we can. We’d just have to do a bdrv_pwrite() to a single entry. >>> >>> What's the point of qcow2_write_l1_entry() then? >> >> I think the point was that we couldn’t, for a long time, because the >> block layer only provided sector-granularity access. This function >> simply was never changed when the block layer gained the ability to do >> byte-granularity I/O. >> >> (We’d still need this function, but only for the endian swap, I think.) > > We still can't do byte-granularity writes with O_DIRECT, because that's > a kernel requirement. Ah, yes. But that makes BDRV_SECTOR_SIZE the wrong choice. > The comment explains that we don't want to do a RMW cycle to write a > single entry because that would be slower than just writing a whole > sector. I think this is still accurate. Maybe we should change the > comment to say "can't necessarily update". (The part that looks really > wrong in the comment is "bdrv_pread", that should be "bdrv_pwrite"...) Hm. But we wouldn’t do an RMW cycle without O_DIRECT, would we? > Now, what's wrong about the logic to avoid the RMW is that it assumes > a fixed required alignment of 512. What it should do is looking at > bs->file->bl.request_alignment and rounding accordingly. Yes. >>>>> @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs, >>>>> case QCOW2_CLUSTER_NORMAL: >>>>> child = s->data_file; >>>>> copy_offset += offset_into_cluster(s, src_offset); >>>>> - if ((copy_offset & 511) != 0) { >>>>> + if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) { >>>> >>>> Hm. I don’t get this one. >>> >>> Checking the code (e.g. block_copy_do_copy()) it seems that the whole >>> chunk must be cluster aligned so I don't get this one either. >> >> Hm, how did you get to block_copy_do_copy()? That’s part of the >> block-copy infrastructure that’s only used for the backup job, as far as >> I’m aware. It’s different from copy_range. >> >> I don’t see any limitation for copy_range. I suppose maybe it doesn’t >> work for anything that isn’t aligned to physical sectors? But the qcow2 >> driver shouldn’t care about that. >> >> On thing’s for sure, the raw driver doesn’t care about it. > > I don't understand this one either. Good. :-) Max
Am 17.01.2020 um 12:01 hat Max Reitz geschrieben: > On 17.01.20 10:55, Kevin Wolf wrote: > > Am 17.01.2020 um 10:12 hat Max Reitz geschrieben: > >> On 17.01.20 00:26, Alberto Garcia wrote: > >>> On Tue 14 Jan 2020 03:15:48 PM CET, Max Reitz wrote: > >>>>> @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset, > >>>>> * Writes one sector of the L1 table to the disk (can't update single entries > >>>>> * and we really don't want bdrv_pread to perform a read-modify-write) > >>>>> */ > >>>>> -#define L1_ENTRIES_PER_SECTOR (512 / 8) > >>>>> +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8) > >>>>> int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index) > >>>> > >>>> Here it’s because the comment is wrong: “Can’t update single entries” – > >>>> yes, we can. We’d just have to do a bdrv_pwrite() to a single entry. > >>> > >>> What's the point of qcow2_write_l1_entry() then? > >> > >> I think the point was that we couldn’t, for a long time, because the > >> block layer only provided sector-granularity access. This function > >> simply was never changed when the block layer gained the ability to do > >> byte-granularity I/O. > >> > >> (We’d still need this function, but only for the endian swap, I think.) > > > > We still can't do byte-granularity writes with O_DIRECT, because that's > > a kernel requirement. > > Ah, yes. But that makes BDRV_SECTOR_SIZE the wrong choice. > > > The comment explains that we don't want to do a RMW cycle to write a > > single entry because that would be slower than just writing a whole > > sector. I think this is still accurate. Maybe we should change the > > comment to say "can't necessarily update". (The part that looks really > > wrong in the comment is "bdrv_pread", that should be "bdrv_pwrite"...) > > Hm. But we wouldn’t do an RMW cycle without O_DIRECT, would we? file-posix with a regular file has request_alignment == 1 and wouldn't cause an RMW cycle, I think. I won't try to make a statement about all non-O_DIRECT cases in all protocols. The important point is that some cases exist which would get us RMW. > > Now, what's wrong about the logic to avoid the RMW is that it assumes > > a fixed required alignment of 512. What it should do is looking at > > bs->file->bl.request_alignment and rounding accordingly. > > Yes. Who'll write the patch? :-) Kevin
On Fri 17 Jan 2020 12:51:04 PM CET, Kevin Wolf wrote: >> > Now, what's wrong about the logic to avoid the RMW is that it assumes >> > a fixed required alignment of 512. What it should do is looking at >> > bs->file->bl.request_alignment and rounding accordingly. >> >> Yes. > > Who'll write the patch? :-) I have to resend the series anyway so I can do it. Berto
On Tue 14 Jan 2020 03:15:48 PM CET, Max Reitz <mreitz@redhat.com> wrote: >> @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs, >> case QCOW2_CLUSTER_NORMAL: >> child = s->data_file; >> copy_offset += offset_into_cluster(s, src_offset); >> - if ((copy_offset & 511) != 0) { >> + if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) { > > Hm. I don’t get this one. Ok, this came with Fam's "qemu-img convert with copy offloading" series: https://lists.gnu.org/archive/html/qemu-block/2018-06/msg00015.html and qemu-img uses sectors here: blk_co_copy_range(..., sector_num << BDRV_SECTOR_BITS, n << BDRV_SECTOR_BITS, ...) so I guess that's why the check is there. Again, I think this should be bl.request_alignment, because as far as I can tell copy_file_range() works just fine unless O_DIRECT is used. Berto
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 932fc48919..777ca2d409 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset, * Writes one sector of the L1 table to the disk (can't update single entries * and we really don't want bdrv_pread to perform a read-modify-write) */ -#define L1_ENTRIES_PER_SECTOR (512 / 8) +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8) int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index) { BDRVQcow2State *s = bs->opaque; diff --git a/block/qcow2.c b/block/qcow2.c index 783d2b9578..c0f3e715ef 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -3280,7 +3280,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp) /* Validate options and set default values */ if (!QEMU_IS_ALIGNED(qcow2_opts->size, BDRV_SECTOR_SIZE)) { - error_setg(errp, "Image size must be a multiple of 512 bytes"); + error_setg(errp, "Image size must be a multiple of %u bytes", + (unsigned) BDRV_SECTOR_SIZE); ret = -EINVAL; goto out; } @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs, case QCOW2_CLUSTER_NORMAL: child = s->data_file; copy_offset += offset_into_cluster(s, src_offset); - if ((copy_offset & 511) != 0) { + if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) { ret = -EIO; goto out; } @@ -3958,8 +3959,9 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, return -ENOTSUP; } - if (offset & 511) { - error_setg(errp, "The new size must be a multiple of 512"); + if (!QEMU_IS_ALIGNED(offset, BDRV_SECTOR_SIZE)) { + error_setg(errp, "The new size must be a multiple of %u", + (unsigned) BDRV_SECTOR_SIZE); return -EINVAL; }
This replaces all remaining instances in the qcow2 code. Signed-off-by: Alberto Garcia <berto@igalia.com> --- block/qcow2-cluster.c | 2 +- block/qcow2.c | 10 ++++++---- 2 files changed, 7 insertions(+), 5 deletions(-)