Message ID | 1371934712-11714-8-git-send-email-pl@kamp.de |
---|---|
State | New |
Headers | show |
Il 22/06/2013 22:58, Peter Lieven ha scritto: > if the blocksize of an iSCSI LUN is bigger than the BDRV_SECTOR_SIZE > it is possible that sector_num or nb_sectors are not correctly > alligned. > > for now assert that there is no misalignment to avoid data > corruption. You should just fail to open the LUN instead. Paolo > Signed-off-by: Peter Lieven <pl@kamp.de> > --- > block/iscsi.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/block/iscsi.c b/block/iscsi.c > index 683692a..2c410b1 100644 > --- a/block/iscsi.c > +++ b/block/iscsi.c > @@ -249,6 +249,7 @@ iscsi_aio_write16_cb(struct iscsi_context *iscsi, int status, > > static int64_t sector_qemu2lun(int64_t sector, IscsiLun *iscsilun) > { > + assert((sector * BDRV_SECTOR_SIZE) % iscsilun->block_size == 0); > return sector * BDRV_SECTOR_SIZE / iscsilun->block_size; > } > > @@ -269,6 +270,8 @@ iscsi_aio_writev_acb(IscsiAIOCB *acb) > acb->status = -EINPROGRESS; > acb->buf = NULL; > > + assert((acb->nb_sectors * BDRV_SECTOR_SIZE) % acb->iscsilun->block_size == 0); > + > /* this will allow us to get rid of 'buf' completely */ > size = acb->nb_sectors * BDRV_SECTOR_SIZE; > > @@ -618,6 +621,8 @@ static int iscsi_aio_discard_acb(IscsiAIOCB *acb) { > acb->buf = NULL; > > list[0].lba = sector_qemu2lun(acb->sector_num, acb->iscsilun); > + > + assert((acb->nb_sectors * BDRV_SECTOR_SIZE) % acb->iscsilun->block_size == 0); > list[0].num = acb->nb_sectors * BDRV_SECTOR_SIZE / acb->iscsilun->block_size; > > acb->task = iscsi_unmap_task(iscsi, acb->iscsilun->lun, > @@ -947,6 +952,8 @@ coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num, > iTask.complete = 0; > iTask.bs = bs; > > + assert((nb_sectors * BDRV_SECTOR_SIZE) % iscsilun->block_size == 0); > + > list[0].lba = sector_qemu2lun(sector_num, iscsilun); > list[0].num = nb_sectors * BDRV_SECTOR_SIZE / iscsilun->block_size; > >
Am 24.06.2013 16:30, schrieb Paolo Bonzini: > Il 22/06/2013 22:58, Peter Lieven ha scritto: >> if the blocksize of an iSCSI LUN is bigger than the BDRV_SECTOR_SIZE >> it is possible that sector_num or nb_sectors are not correctly >> alligned. >> >> for now assert that there is no misalignment to avoid data >> corruption. > You should just fail to open the LUN instead. Ronnie added support for reading fragments of blocksize if the offset is aligned. I don't know what the idea was, maybe for qemu reading the boot sector. If the OS does only read multiple of blocksize at aligned offsets everything should work. I now that our storages support 4K blocksize. I can check if it is usable. I could also just fail the operations instead of asserting. Peter > > Paolo > >> Signed-off-by: Peter Lieven <pl@kamp.de> >> --- >> block/iscsi.c | 7 +++++++ >> 1 file changed, 7 insertions(+) >> >> diff --git a/block/iscsi.c b/block/iscsi.c >> index 683692a..2c410b1 100644 >> --- a/block/iscsi.c >> +++ b/block/iscsi.c >> @@ -249,6 +249,7 @@ iscsi_aio_write16_cb(struct iscsi_context *iscsi, int status, >> >> static int64_t sector_qemu2lun(int64_t sector, IscsiLun *iscsilun) >> { >> + assert((sector * BDRV_SECTOR_SIZE) % iscsilun->block_size == 0); >> return sector * BDRV_SECTOR_SIZE / iscsilun->block_size; >> } >> >> @@ -269,6 +270,8 @@ iscsi_aio_writev_acb(IscsiAIOCB *acb) >> acb->status = -EINPROGRESS; >> acb->buf = NULL; >> >> + assert((acb->nb_sectors * BDRV_SECTOR_SIZE) % acb->iscsilun->block_size == 0); >> + >> /* this will allow us to get rid of 'buf' completely */ >> size = acb->nb_sectors * BDRV_SECTOR_SIZE; >> >> @@ -618,6 +621,8 @@ static int iscsi_aio_discard_acb(IscsiAIOCB *acb) { >> acb->buf = NULL; >> >> list[0].lba = sector_qemu2lun(acb->sector_num, acb->iscsilun); >> + >> + assert((acb->nb_sectors * BDRV_SECTOR_SIZE) % acb->iscsilun->block_size == 0); >> list[0].num = acb->nb_sectors * BDRV_SECTOR_SIZE / acb->iscsilun->block_size; >> >> acb->task = iscsi_unmap_task(iscsi, acb->iscsilun->lun, >> @@ -947,6 +952,8 @@ coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num, >> iTask.complete = 0; >> iTask.bs = bs; >> >> + assert((nb_sectors * BDRV_SECTOR_SIZE) % iscsilun->block_size == 0); >> + >> list[0].lba = sector_qemu2lun(sector_num, iscsilun); >> list[0].num = nb_sectors * BDRV_SECTOR_SIZE / iscsilun->block_size; >> >>
Il 24/06/2013 18:10, Peter Lieven ha scritto: > Am 24.06.2013 16:30, schrieb Paolo Bonzini: >> Il 22/06/2013 22:58, Peter Lieven ha scritto: >>> if the blocksize of an iSCSI LUN is bigger than the BDRV_SECTOR_SIZE >>> it is possible that sector_num or nb_sectors are not correctly >>> alligned. >>> >>> for now assert that there is no misalignment to avoid data >>> corruption. >> You should just fail to open the LUN instead. > Ronnie added support for reading fragments of blocksize if the > offset is aligned. I don't know what the idea was, maybe for qemu > reading the boot sector. If the OS does only read multiple of > blocksize at aligned offsets everything should work. I now that > our storages support 4K blocksize. I can check if it is usable. > I could also just fail the operations instead of asserting. So far, 4K blocksize is usable if you also specify the same block size for the guest device. I have posted once the patches to do read-modify-write, but I never really pursued inclusion of those. In any case, the right place to fix this is the block layer; driver-specific hacks are... hacks. :) Paolo
Am 24.06.2013 18:13, schrieb Paolo Bonzini: > Il 24/06/2013 18:10, Peter Lieven ha scritto: >> Am 24.06.2013 16:30, schrieb Paolo Bonzini: >>> Il 22/06/2013 22:58, Peter Lieven ha scritto: >>>> if the blocksize of an iSCSI LUN is bigger than the BDRV_SECTOR_SIZE >>>> it is possible that sector_num or nb_sectors are not correctly >>>> alligned. >>>> >>>> for now assert that there is no misalignment to avoid data >>>> corruption. >>> You should just fail to open the LUN instead. >> Ronnie added support for reading fragments of blocksize if the >> offset is aligned. I don't know what the idea was, maybe for qemu >> reading the boot sector. If the OS does only read multiple of >> blocksize at aligned offsets everything should work. I now that >> our storages support 4K blocksize. I can check if it is usable. >> I could also just fail the operations instead of asserting. > So far, 4K blocksize is usable if you also specify the same block size > for the guest device. I have posted once the patches to do > read-modify-write, but I never really pursued inclusion of those. > > In any case, the right place to fix this is the block layer; > driver-specific hacks are... hacks. :) Where do I find the sector size if not in BDRV_SECTOR_SIZE? Is there a dynamic field or is the answer qemu only support 512 Byte sector size at the moment? So you would go for fail to open a device if the LUN blocksize is not equal to BDRV_SECTOR_SIZE? Peter
Il 24/06/2013 18:24, Peter Lieven ha scritto: >> > So far, 4K blocksize is usable if you also specify the same block size >> > for the guest device. I have posted once the patches to do >> > read-modify-write, but I never really pursued inclusion of those. >> > >> > In any case, the right place to fix this is the block layer; >> > driver-specific hacks are... hacks. :) > Where do I find the sector size if not in BDRV_SECTOR_SIZE? > Is there a dynamic field or is the answer qemu only support 512 Byte > sector size at the moment? bdrv_read/write and friends are always in 512-byte units. There is bs->buffer_alignment for the guest sector size, but it is not set at bdrv_open, only later. So I think QEMU support for bigger sectors is too sparse to be useful. > So you would go for fail to open a device if the LUN blocksize is > not equal to BDRV_SECTOR_SIZE? Either that, or fails reads/writes that are not aligned. Paolo
Am 24.06.2013 18:27, schrieb Paolo Bonzini: > Il 24/06/2013 18:24, Peter Lieven ha scritto: >>>> So far, 4K blocksize is usable if you also specify the same block size >>>> for the guest device. I have posted once the patches to do >>>> read-modify-write, but I never really pursued inclusion of those. >>>> >>>> In any case, the right place to fix this is the block layer; >>>> driver-specific hacks are... hacks. :) >> Where do I find the sector size if not in BDRV_SECTOR_SIZE? >> Is there a dynamic field or is the answer qemu only support 512 Byte >> sector size at the moment? > bdrv_read/write and friends are always in 512-byte units. > > There is bs->buffer_alignment for the guest sector size, but it is not > set at bdrv_open, only later. So I think QEMU support for bigger > sectors is too sparse to be useful. > >> So you would go for fail to open a device if the LUN blocksize is >> not equal to BDRV_SECTOR_SIZE? > Either that, or fails reads/writes that are not aligned. It would be the easiest. Maybe Ronnie can point out what the hack with the acb->buffer_offset in aio_read was meant for. All other functions have the requirement to be properly aligned regarding to the xfersize. I am also not sure if the e.g. reading of 512 byte at sector 0 from a 4K block device does still work with the iovector patches we added earlier. Peter > > Paolo
diff --git a/block/iscsi.c b/block/iscsi.c index 683692a..2c410b1 100644 --- a/block/iscsi.c +++ b/block/iscsi.c @@ -249,6 +249,7 @@ iscsi_aio_write16_cb(struct iscsi_context *iscsi, int status, static int64_t sector_qemu2lun(int64_t sector, IscsiLun *iscsilun) { + assert((sector * BDRV_SECTOR_SIZE) % iscsilun->block_size == 0); return sector * BDRV_SECTOR_SIZE / iscsilun->block_size; } @@ -269,6 +270,8 @@ iscsi_aio_writev_acb(IscsiAIOCB *acb) acb->status = -EINPROGRESS; acb->buf = NULL; + assert((acb->nb_sectors * BDRV_SECTOR_SIZE) % acb->iscsilun->block_size == 0); + /* this will allow us to get rid of 'buf' completely */ size = acb->nb_sectors * BDRV_SECTOR_SIZE; @@ -618,6 +621,8 @@ static int iscsi_aio_discard_acb(IscsiAIOCB *acb) { acb->buf = NULL; list[0].lba = sector_qemu2lun(acb->sector_num, acb->iscsilun); + + assert((acb->nb_sectors * BDRV_SECTOR_SIZE) % acb->iscsilun->block_size == 0); list[0].num = acb->nb_sectors * BDRV_SECTOR_SIZE / acb->iscsilun->block_size; acb->task = iscsi_unmap_task(iscsi, acb->iscsilun->lun, @@ -947,6 +952,8 @@ coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num, iTask.complete = 0; iTask.bs = bs; + assert((nb_sectors * BDRV_SECTOR_SIZE) % iscsilun->block_size == 0); + list[0].lba = sector_qemu2lun(sector_num, iscsilun); list[0].num = nb_sectors * BDRV_SECTOR_SIZE / iscsilun->block_size;
if the blocksize of an iSCSI LUN is bigger than the BDRV_SECTOR_SIZE it is possible that sector_num or nb_sectors are not correctly alligned. for now assert that there is no misalignment to avoid data corruption. Signed-off-by: Peter Lieven <pl@kamp.de> --- block/iscsi.c | 7 +++++++ 1 file changed, 7 insertions(+)