diff mbox series

[v2,1/1] qemu-img: do not erase destination file in qemu-img dd command

Message ID 20230930203157.85766-1-mike.maslenkin@gmail.com
State New
Headers show
Series [v2,1/1] qemu-img: do not erase destination file in qemu-img dd command | expand

Commit Message

Mike Maslenkin Sept. 30, 2023, 8:31 p.m. UTC
Add a check that destination file exists and do not call bdrv_create for
this case.

Currently `qemu-img dd` command destroys content of destination file.
Effectively this means that parameters (geometry) of destination image
file are changing. This can be undesirable behavior for user especially
if format of destination image does not support resizing.

Steps to reproduce:
  1. Create empty disk image with some non default size.
       `qemu-img  create -f qcow2 $DEST_IMG 3T`
     Remember that `qemu-img info $DEST_IMG` returns:
       virtual size: 3 TiB (3298534883328 bytes)
       disk size: 240 KiB
       cluster_size: 65536
  2. Run `qemu-img dd -O qcow2 of=$DEST_IMG if=$SRC_IMG bs=1M count=100`
  3. Check `qemu-img info $DEST_IMG` output:
       virtual size: 100 MiB (104857600 bytes)
       disk size: 112 MiB
       cluster_size: 65536

Parameters of $DEST_IMG were changed. Actually `qemu-img dd` has created
a new disk based on current default geometry for particular format.
For example for "parallels" format default BAT for 256GB disk is written
to empty file prior writing disk image data.

With this patch virtual disk metadata and geometry of a destination image
are preserved. As another visible change of `qemu-img dd` behavior is that
if destination image is less than source it can finish with error (similar
to "dd" utility):
  qemu-img: error while writing to output image file: Input/output error

Signed-off-by: Mike Maslenkin <mike.maslenkin@gmail.com>
---
  diff from v1: removed additional fprintf call leaved in patch by accident
---
 qemu-img.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

Comments

Denis V. Lunev Oct. 1, 2023, 12:25 p.m. UTC | #1
On 9/30/23 22:31, Mike Maslenkin wrote:
> Add a check that destination file exists and do not call bdrv_create for
> this case.
>
> Currently `qemu-img dd` command destroys content of destination file.
> Effectively this means that parameters (geometry) of destination image
> file are changing. This can be undesirable behavior for user especially
> if format of destination image does not support resizing.
>
> Steps to reproduce:
>    1. Create empty disk image with some non default size.
>         `qemu-img  create -f qcow2 $DEST_IMG 3T`
>       Remember that `qemu-img info $DEST_IMG` returns:
>         virtual size: 3 TiB (3298534883328 bytes)
>         disk size: 240 KiB
>         cluster_size: 65536
>    2. Run `qemu-img dd -O qcow2 of=$DEST_IMG if=$SRC_IMG bs=1M count=100`
>    3. Check `qemu-img info $DEST_IMG` output:
>         virtual size: 100 MiB (104857600 bytes)
>         disk size: 112 MiB
>         cluster_size: 65536
>
> Parameters of $DEST_IMG were changed. Actually `qemu-img dd` has created
> a new disk based on current default geometry for particular format.
> For example for "parallels" format default BAT for 256GB disk is written
> to empty file prior writing disk image data.
>
> With this patch virtual disk metadata and geometry of a destination image
> are preserved. As another visible change of `qemu-img dd` behavior is that
> if destination image is less than source it can finish with error (similar
> to "dd" utility):
>    qemu-img: error while writing to output image file: Input/output error
>
> Signed-off-by: Mike Maslenkin <mike.maslenkin@gmail.com>
> ---
>    diff from v1: removed additional fprintf call leaved in patch by accident
> ---
>   qemu-img.c | 17 ++++++++++-------
>   1 file changed, 10 insertions(+), 7 deletions(-)
>
> diff --git a/qemu-img.c b/qemu-img.c
> index a48edb71015c..1a83c14212fb 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -5150,13 +5150,15 @@ static int img_dd(int argc, char **argv)
>                               size - in.bsz * in.offset, &error_abort);
>       }
>   
> -    ret = bdrv_create(drv, out.filename, opts, &local_err);
> -    if (ret < 0) {
> -        error_reportf_err(local_err,
> -                          "%s: error while creating output image: ",
> -                          out.filename);
> -        ret = -1;
> -        goto out;
> +    if (!g_file_test(out.filename, G_FILE_TEST_EXISTS)) {
> +        ret = bdrv_create(drv, out.filename, opts, &local_err);
> +        if (ret < 0) {
> +            error_reportf_err(local_err,
> +                               "%s: error while creating output image: ",
> +                               out.filename);
> +            ret = -1;
> +            goto out;
> +        }
>       }
>   
>       /* TODO, we can't honour --image-opts for the target,
may be it would be worth to follow conventional
'dd' approach, i.e. add conv=nocreat option which
will do the trick?

Den
Mike Maslenkin Oct. 1, 2023, 5:08 p.m. UTC | #2
I thought about "conv=notrunc", but my main concern is changed virtual
disk metadata.
It depends on how qemu-img used.
May be I followed to wrong pattern, but pros and cons of adding "conv"
parameter was not in my mind in scope of the first patch version.
I see 4 obvious ways of using `qemu-img dd`:
1. Copy virtual disk data between images of same format. I think disk
geometry must be preserved in this case.
2. Copy virtual disk data between different formats. It is a valid
pattern? May be `qemu-img convert` should to be used instead?
3. Merge snapshots to specified disk image, i.e read current state and
write it to new disk image.
4. Copy virtual disk data to raw binary file. Actually this patch
breaks 'dd' behavior for this case when source image is less (in terms
of logical blocks) than existed raw binary file.
    May be for this case condition can be improved to smth like
   if (strcmp(fmt, "raw") || !g_file_test(out.filename,
G_FILE_TEST_EXISTS)) . And parameter "conv=notrunc" may be implemented
additionally for this case.

Three of above do not require  "conv=" parameter from my point of view.

I would be glad to hear other opinions.

Regards,
Mike.


On Sun, Oct 1, 2023 at 3:25 PM Denis V. Lunev <den@virtuozzo.com> wrote:
>
> On 9/30/23 22:31, Mike Maslenkin wrote:
> > Add a check that destination file exists and do not call bdrv_create for
> > this case.
> >
> > Currently `qemu-img dd` command destroys content of destination file.
> > Effectively this means that parameters (geometry) of destination image
> > file are changing. This can be undesirable behavior for user especially
> > if format of destination image does not support resizing.
> >
> > Steps to reproduce:
> >    1. Create empty disk image with some non default size.
> >         `qemu-img  create -f qcow2 $DEST_IMG 3T`
> >       Remember that `qemu-img info $DEST_IMG` returns:
> >         virtual size: 3 TiB (3298534883328 bytes)
> >         disk size: 240 KiB
> >         cluster_size: 65536
> >    2. Run `qemu-img dd -O qcow2 of=$DEST_IMG if=$SRC_IMG bs=1M count=100`
> >    3. Check `qemu-img info $DEST_IMG` output:
> >         virtual size: 100 MiB (104857600 bytes)
> >         disk size: 112 MiB
> >         cluster_size: 65536
> >
> > Parameters of $DEST_IMG were changed. Actually `qemu-img dd` has created
> > a new disk based on current default geometry for particular format.
> > For example for "parallels" format default BAT for 256GB disk is written
> > to empty file prior writing disk image data.
> >
> > With this patch virtual disk metadata and geometry of a destination image
> > are preserved. As another visible change of `qemu-img dd` behavior is that
> > if destination image is less than source it can finish with error (similar
> > to "dd" utility):
> >    qemu-img: error while writing to output image file: Input/output error
> >
> > Signed-off-by: Mike Maslenkin <mike.maslenkin@gmail.com>
> > ---
> >    diff from v1: removed additional fprintf call leaved in patch by accident
> > ---
> >   qemu-img.c | 17 ++++++++++-------
> >   1 file changed, 10 insertions(+), 7 deletions(-)
> >
> > diff --git a/qemu-img.c b/qemu-img.c
> > index a48edb71015c..1a83c14212fb 100644
> > --- a/qemu-img.c
> > +++ b/qemu-img.c
> > @@ -5150,13 +5150,15 @@ static int img_dd(int argc, char **argv)
> >                               size - in.bsz * in.offset, &error_abort);
> >       }
> >
> > -    ret = bdrv_create(drv, out.filename, opts, &local_err);
> > -    if (ret < 0) {
> > -        error_reportf_err(local_err,
> > -                          "%s: error while creating output image: ",
> > -                          out.filename);
> > -        ret = -1;
> > -        goto out;
> > +    if (!g_file_test(out.filename, G_FILE_TEST_EXISTS)) {
> > +        ret = bdrv_create(drv, out.filename, opts, &local_err);
> > +        if (ret < 0) {
> > +            error_reportf_err(local_err,
> > +                               "%s: error while creating output image: ",
> > +                               out.filename);
> > +            ret = -1;
> > +            goto out;
> > +        }
> >       }
> >
> >       /* TODO, we can't honour --image-opts for the target,
> may be it would be worth to follow conventional
> 'dd' approach, i.e. add conv=nocreat option which
> will do the trick?
>
> Den
Denis V. Lunev Oct. 1, 2023, 8:46 p.m. UTC | #3
Can you please not top-post. This makes the discussion complex. This
approach is followed in this mailing list and in other similar lists
like LKML.

On 10/1/23 19:08, Mike Maslenkin wrote:
> I thought about "conv=notrunc", but my main concern is changed virtual
> disk metadata.
> It depends on how qemu-img used.
> May be I followed to wrong pattern, but pros and cons of adding "conv"
> parameter was not in my mind in scope of the first patch version.
> I see 4 obvious ways of using `qemu-img dd`:
> 1. Copy virtual disk data between images of same format. I think disk
> geometry must be preserved in this case.
> 2. Copy virtual disk data between different formats. It is a valid
> pattern? May be `qemu-img convert` should to be used instead?
> 3. Merge snapshots to specified disk image, i.e read current state and
> write it to new disk image.
> 4. Copy virtual disk data to raw binary file. Actually this patch
> breaks 'dd' behavior for this case when source image is less (in terms
> of logical blocks) than existed raw binary file.
>      May be for this case condition can be improved to smth like
>     if (strcmp(fmt, "raw") || !g_file_test(out.filename,
> G_FILE_TEST_EXISTS)) . And parameter "conv=notrunc" may be implemented
> additionally for this case.
My personal opinion is that qemu dd when you will need to
extract the SOME data from the original image and process
it further. Thus I use it to copy some data into raw binary
file. My next goal here would add ability to put data into
stdout that would be beneficial for. Though this is out of the
equation at the moment.

Though, speaking about the approach, I would say that the
patch changes current behavior which is not totally buggy
under a matter of this or that taste. It should be noted that
we are here in Linux world, not in the Mac world where we
were in position to avoid options and selections.

Thus my opinion that original behavior is to be preserved
as somebody is relying on it. The option you are proposing
seems valuable to me also and thus the switch is to be added.
The switch is well-defined in the original 'dd' world thus
either conv= option would be good, either nocreat or notrunc.
For me 'nocreat' seems more natural.

Anyway, the last word here belongs to either Hanna or Kevin ;)

Den

> Three of above do not require  "conv=" parameter from my point of view.
>
> I would be glad to hear other opinions.
>
> Regards,
> Mike.
>
>
> On Sun, Oct 1, 2023 at 3:25 PM Denis V. Lunev <den@virtuozzo.com> wrote:
>> On 9/30/23 22:31, Mike Maslenkin wrote:
>>> Add a check that destination file exists and do not call bdrv_create for
>>> this case.
>>>
>>> Currently `qemu-img dd` command destroys content of destination file.
>>> Effectively this means that parameters (geometry) of destination image
>>> file are changing. This can be undesirable behavior for user especially
>>> if format of destination image does not support resizing.
>>>
>>> Steps to reproduce:
>>>     1. Create empty disk image with some non default size.
>>>          `qemu-img  create -f qcow2 $DEST_IMG 3T`
>>>        Remember that `qemu-img info $DEST_IMG` returns:
>>>          virtual size: 3 TiB (3298534883328 bytes)
>>>          disk size: 240 KiB
>>>          cluster_size: 65536
>>>     2. Run `qemu-img dd -O qcow2 of=$DEST_IMG if=$SRC_IMG bs=1M count=100`
>>>     3. Check `qemu-img info $DEST_IMG` output:
>>>          virtual size: 100 MiB (104857600 bytes)
>>>          disk size: 112 MiB
>>>          cluster_size: 65536
>>>
>>> Parameters of $DEST_IMG were changed. Actually `qemu-img dd` has created
>>> a new disk based on current default geometry for particular format.
>>> For example for "parallels" format default BAT for 256GB disk is written
>>> to empty file prior writing disk image data.
>>>
>>> With this patch virtual disk metadata and geometry of a destination image
>>> are preserved. As another visible change of `qemu-img dd` behavior is that
>>> if destination image is less than source it can finish with error (similar
>>> to "dd" utility):
>>>     qemu-img: error while writing to output image file: Input/output error
>>>
>>> Signed-off-by: Mike Maslenkin <mike.maslenkin@gmail.com>
>>> ---
>>>     diff from v1: removed additional fprintf call leaved in patch by accident
>>> ---
>>>    qemu-img.c | 17 ++++++++++-------
>>>    1 file changed, 10 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/qemu-img.c b/qemu-img.c
>>> index a48edb71015c..1a83c14212fb 100644
>>> --- a/qemu-img.c
>>> +++ b/qemu-img.c
>>> @@ -5150,13 +5150,15 @@ static int img_dd(int argc, char **argv)
>>>                                size - in.bsz * in.offset, &error_abort);
>>>        }
>>>
>>> -    ret = bdrv_create(drv, out.filename, opts, &local_err);
>>> -    if (ret < 0) {
>>> -        error_reportf_err(local_err,
>>> -                          "%s: error while creating output image: ",
>>> -                          out.filename);
>>> -        ret = -1;
>>> -        goto out;
>>> +    if (!g_file_test(out.filename, G_FILE_TEST_EXISTS)) {
>>> +        ret = bdrv_create(drv, out.filename, opts, &local_err);
>>> +        if (ret < 0) {
>>> +            error_reportf_err(local_err,
>>> +                               "%s: error while creating output image: ",
>>> +                               out.filename);
>>> +            ret = -1;
>>> +            goto out;
>>> +        }
>>>        }
>>>
>>>        /* TODO, we can't honour --image-opts for the target,
>> may be it would be worth to follow conventional
>> 'dd' approach, i.e. add conv=nocreat option which
>> will do the trick?
>>
>> Den
Hanna Czenczek Oct. 31, 2023, 2:33 p.m. UTC | #4
On 01.10.23 22:46, Denis V. Lunev wrote:
> Can you please not top-post. This makes the discussion complex. This
> approach is followed in this mailing list and in other similar lists
> like LKML.
>
> On 10/1/23 19:08, Mike Maslenkin wrote:
>> I thought about "conv=notrunc", but my main concern is changed virtual
>> disk metadata.
>> It depends on how qemu-img used.
>> May be I followed to wrong pattern, but pros and cons of adding "conv"
>> parameter was not in my mind in scope of the first patch version.
>> I see 4 obvious ways of using `qemu-img dd`:
>> 1. Copy virtual disk data between images of same format. I think disk
>> geometry must be preserved in this case.
>> 2. Copy virtual disk data between different formats. It is a valid
>> pattern? May be `qemu-img convert` should to be used instead?
>> 3. Merge snapshots to specified disk image, i.e read current state and
>> write it to new disk image.
>> 4. Copy virtual disk data to raw binary file. Actually this patch
>> breaks 'dd' behavior for this case when source image is less (in terms
>> of logical blocks) than existed raw binary file.
>>      May be for this case condition can be improved to smth like
>>     if (strcmp(fmt, "raw") || !g_file_test(out.filename,
>> G_FILE_TEST_EXISTS)) . And parameter "conv=notrunc" may be implemented
>> additionally for this case.
> My personal opinion is that qemu dd when you will need to
> extract the SOME data from the original image and process
> it further. Thus I use it to copy some data into raw binary
> file. My next goal here would add ability to put data into
> stdout that would be beneficial for. Though this is out of the
> equation at the moment.
>
> Though, speaking about the approach, I would say that the
> patch changes current behavior which is not totally buggy
> under a matter of this or that taste. It should be noted that
> we are here in Linux world, not in the Mac world where we
> were in position to avoid options and selections.
>
> Thus my opinion that original behavior is to be preserved
> as somebody is relying on it. The option you are proposing
> seems valuable to me also and thus the switch is to be added.
> The switch is well-defined in the original 'dd' world thus
> either conv= option would be good, either nocreat or notrunc.
> For me 'nocreat' seems more natural.
>
> Anyway, the last word here belongs to either Hanna or Kevin ;)

Personally, and honestly, I see no actual use for qemu-img dd at all, 
because we’re trying to mimic a subset of an interface of a rather 
complex program that has been designed to do what it does. We can only 
fail at that.  Personally, whenever I need dd functionality, I use 
qemu-storage-daemon’s fuse export, and then use the actual dd program on 
top.  Alternatively, qemu-img convert is our native interface; 
unfortunately, its feature set is lacking when compared to qemu-img dd, 
but I think it would be better to improve that rather than working on 
qemu-img dd.

I tend to agree with you, Denis, though, that this patch changes 
behavior, and users may be relying on the current behavior.

Comparing to “what would dd do” is difficult because dd just has no 
concept of file formats.  (That’s another reason why I think it’s bad to 
provide a dd-like interface, and users should rather use dd itself, or 
qemu-img convert.)  But in any case, adding conv=notrunc to keep the 
existing target file would seem fair to me.  I understand conv=nocreat 
would advise dd to never create the target file, which is something 
slightly different (i.e. conv=notrunc will still create a new target 
file if it doesn’t exist yet, but won’t create/truncate one if it 
already exists; while conv=nocreat would disable creating a new target 
file if it doesn’t exist yet, but still truncate it if it does exist; 
using both together would ensure that the target file is never 
created/truncated).

Summary: If we do this under a new conv=notrunc, fine with me.  I just 
don’t think qemu-img dd is something that should be used at all.

Hanna
Eric Blake Nov. 1, 2023, 4:26 p.m. UTC | #5
On Tue, Oct 31, 2023 at 03:33:52PM +0100, Hanna Czenczek wrote:
> Personally, and honestly, I see no actual use for qemu-img dd at all,
> because we’re trying to mimic a subset of an interface of a rather complex
> program that has been designed to do what it does. We can only fail at
> that.  Personally, whenever I need dd functionality, I use
> qemu-storage-daemon’s fuse export, and then use the actual dd program on
> top.  Alternatively, qemu-img convert is our native interface;
> unfortunately, its feature set is lacking when compared to qemu-img dd, but
> I think it would be better to improve that rather than working on qemu-img
> dd.

I also agree that 'qemu-img dd' is not where we should focus; we have
a two-way feature gap (dd can do things convert can't, and convert can
do things dd can't), where the IDEAL world would be convert can do
everything and then dd is then a thin wrapper that calls into convert
under the hood.  Otherwise, we will forever be chasing bugs between
two similar but divergent implementations, and the one that gets more
testing (convert) will be the only one that gets timely fixes.

I should have objected much harder when 'qemu-img dd' was first
proposed.  Oh well.

Here's at least one prior discussion on the mailing list, in 2016(!),
where we proposed enhancing qemu-img dd to add at least seek=; but the
patch was never accepted.  I know there are other bugs lurking in dd
(being unable to choose different offsets for the input and output
file, via skip= vs. seek=, makes it hard to use compared to regular
dd); which backs up my claim that qemu-img dd is low priority.

https://lists.gnu.org/archive/html/qemu-devel/2016-08/msg03896.html

And to some extent, with some convoluted JSON or --image-opts (I
recommend trying it with qemu-storage-daemon, rather than directly in
qemu-img), you CAN open subsets of any other image by layering a raw
BDS with offset/size clamps on top of anything else.  It's just that
we don't have many written examples stating how to do that, although
we DO have such an example in the qemu-nbd documentation:

| Start a long-running server listening with encryption on port 10810,
| and allow clients with a specific X.509 certificate to connect to
| a 1 megabyte subset of a raw file, using the export name 'subset':
| 
| ::
| 
|   qemu-nbd \
|     --object tls-creds-x509,id=tls0,endpoint=server,dir=/path/to/qemutls \
|     --object 'authz-simple,id=auth0,identity=CN=laptop.example.com,,\
|               O=Example Org,,L=London,,ST=London,,C=GB' \
|     --tls-creds tls0 --tls-authz auth0 \
|     -t -x subset -p 10810 \
|     --image-opts driver=raw,offset=1M,size=1M,file.driver=file,file.filename=file.raw

> 
> Summary: If we do this under a new conv=notrunc, fine with me.  I just don’t
> think qemu-img dd is something that should be used at all.

I'm not even sure it is worth adding anything to qemu-img dd.  At this
point, that would just be increasing its technical debt; we're better
off putting lipstick on a pig.
Daniel P. Berrangé Nov. 1, 2023, 4:51 p.m. UTC | #6
On Tue, Oct 31, 2023 at 03:33:52PM +0100, Hanna Czenczek wrote:
> On 01.10.23 22:46, Denis V. Lunev wrote:
> > Can you please not top-post. This makes the discussion complex. This
> > approach is followed in this mailing list and in other similar lists
> > like LKML.
> > 
> > On 10/1/23 19:08, Mike Maslenkin wrote:
> > > I thought about "conv=notrunc", but my main concern is changed virtual
> > > disk metadata.
> > > It depends on how qemu-img used.
> > > May be I followed to wrong pattern, but pros and cons of adding "conv"
> > > parameter was not in my mind in scope of the first patch version.
> > > I see 4 obvious ways of using `qemu-img dd`:
> > > 1. Copy virtual disk data between images of same format. I think disk
> > > geometry must be preserved in this case.
> > > 2. Copy virtual disk data between different formats. It is a valid
> > > pattern? May be `qemu-img convert` should to be used instead?
> > > 3. Merge snapshots to specified disk image, i.e read current state and
> > > write it to new disk image.
> > > 4. Copy virtual disk data to raw binary file. Actually this patch
> > > breaks 'dd' behavior for this case when source image is less (in terms
> > > of logical blocks) than existed raw binary file.
> > >      May be for this case condition can be improved to smth like
> > >     if (strcmp(fmt, "raw") || !g_file_test(out.filename,
> > > G_FILE_TEST_EXISTS)) . And parameter "conv=notrunc" may be implemented
> > > additionally for this case.
> > My personal opinion is that qemu dd when you will need to
> > extract the SOME data from the original image and process
> > it further. Thus I use it to copy some data into raw binary
> > file. My next goal here would add ability to put data into
> > stdout that would be beneficial for. Though this is out of the
> > equation at the moment.
> > 
> > Though, speaking about the approach, I would say that the
> > patch changes current behavior which is not totally buggy
> > under a matter of this or that taste. It should be noted that
> > we are here in Linux world, not in the Mac world where we
> > were in position to avoid options and selections.
> > 
> > Thus my opinion that original behavior is to be preserved
> > as somebody is relying on it. The option you are proposing
> > seems valuable to me also and thus the switch is to be added.
> > The switch is well-defined in the original 'dd' world thus
> > either conv= option would be good, either nocreat or notrunc.
> > For me 'nocreat' seems more natural.
> > 
> > Anyway, the last word here belongs to either Hanna or Kevin ;)
> 
> Personally, and honestly, I see no actual use for qemu-img dd at all,
> because we’re trying to mimic a subset of an interface of a rather complex
> program that has been designed to do what it does. We can only fail at
> that.  Personally, whenever I need dd functionality, I use
> qemu-storage-daemon’s fuse export, and then use the actual dd program on
> top.  Alternatively, qemu-img convert is our native interface;
> unfortunately, its feature set is lacking when compared to qemu-img dd, but
> I think it would be better to improve that rather than working on qemu-img
> dd.

Is there a clear view of what gaps exist in 'qemu-img convert', and more
importantly, how much work is it to close the gaps, such that 'dd' could
potentially be deprecated & eventually removed ? 


With regards,
Daniel
Denis V. Lunev Nov. 1, 2023, 5:03 p.m. UTC | #7
On 11/1/23 17:51, Daniel P. Berrangé wrote:
> On Tue, Oct 31, 2023 at 03:33:52PM +0100, Hanna Czenczek wrote:
>> On 01.10.23 22:46, Denis V. Lunev wrote:
>>> Can you please not top-post. This makes the discussion complex. This
>>> approach is followed in this mailing list and in other similar lists
>>> like LKML.
>>>
>>> On 10/1/23 19:08, Mike Maslenkin wrote:
>>>> I thought about "conv=notrunc", but my main concern is changed virtual
>>>> disk metadata.
>>>> It depends on how qemu-img used.
>>>> May be I followed to wrong pattern, but pros and cons of adding "conv"
>>>> parameter was not in my mind in scope of the first patch version.
>>>> I see 4 obvious ways of using `qemu-img dd`:
>>>> 1. Copy virtual disk data between images of same format. I think disk
>>>> geometry must be preserved in this case.
>>>> 2. Copy virtual disk data between different formats. It is a valid
>>>> pattern? May be `qemu-img convert` should to be used instead?
>>>> 3. Merge snapshots to specified disk image, i.e read current state and
>>>> write it to new disk image.
>>>> 4. Copy virtual disk data to raw binary file. Actually this patch
>>>> breaks 'dd' behavior for this case when source image is less (in terms
>>>> of logical blocks) than existed raw binary file.
>>>>       May be for this case condition can be improved to smth like
>>>>      if (strcmp(fmt, "raw") || !g_file_test(out.filename,
>>>> G_FILE_TEST_EXISTS)) . And parameter "conv=notrunc" may be implemented
>>>> additionally for this case.
>>> My personal opinion is that qemu dd when you will need to
>>> extract the SOME data from the original image and process
>>> it further. Thus I use it to copy some data into raw binary
>>> file. My next goal here would add ability to put data into
>>> stdout that would be beneficial for. Though this is out of the
>>> equation at the moment.
>>>
>>> Though, speaking about the approach, I would say that the
>>> patch changes current behavior which is not totally buggy
>>> under a matter of this or that taste. It should be noted that
>>> we are here in Linux world, not in the Mac world where we
>>> were in position to avoid options and selections.
>>>
>>> Thus my opinion that original behavior is to be preserved
>>> as somebody is relying on it. The option you are proposing
>>> seems valuable to me also and thus the switch is to be added.
>>> The switch is well-defined in the original 'dd' world thus
>>> either conv= option would be good, either nocreat or notrunc.
>>> For me 'nocreat' seems more natural.
>>>
>>> Anyway, the last word here belongs to either Hanna or Kevin ;)
>> Personally, and honestly, I see no actual use for qemu-img dd at all,
>> because we’re trying to mimic a subset of an interface of a rather complex
>> program that has been designed to do what it does. We can only fail at
>> that.  Personally, whenever I need dd functionality, I use
>> qemu-storage-daemon’s fuse export, and then use the actual dd program on
>> top.  Alternatively, qemu-img convert is our native interface;
>> unfortunately, its feature set is lacking when compared to qemu-img dd, but
>> I think it would be better to improve that rather than working on qemu-img
>> dd.
> Is there a clear view of what gaps exist in 'qemu-img convert', and more
> importantly, how much work is it to close the gaps, such that 'dd' could
> potentially be deprecated & eventually removed ?
>
I am using 'qemu-img dd' as a way to get (some) content
from the image. I have dreamed about getting it to
stdout.

Den
Daniel P. Berrangé Nov. 1, 2023, 5:22 p.m. UTC | #8
On Wed, Nov 01, 2023 at 06:03:36PM +0100, Denis V. Lunev wrote:
> On 11/1/23 17:51, Daniel P. Berrangé wrote:
> > On Tue, Oct 31, 2023 at 03:33:52PM +0100, Hanna Czenczek wrote:
> > > On 01.10.23 22:46, Denis V. Lunev wrote:
> > > > Can you please not top-post. This makes the discussion complex. This
> > > > approach is followed in this mailing list and in other similar lists
> > > > like LKML.
> > > > 
> > > > On 10/1/23 19:08, Mike Maslenkin wrote:
> > > > > I thought about "conv=notrunc", but my main concern is changed virtual
> > > > > disk metadata.
> > > > > It depends on how qemu-img used.
> > > > > May be I followed to wrong pattern, but pros and cons of adding "conv"
> > > > > parameter was not in my mind in scope of the first patch version.
> > > > > I see 4 obvious ways of using `qemu-img dd`:
> > > > > 1. Copy virtual disk data between images of same format. I think disk
> > > > > geometry must be preserved in this case.
> > > > > 2. Copy virtual disk data between different formats. It is a valid
> > > > > pattern? May be `qemu-img convert` should to be used instead?
> > > > > 3. Merge snapshots to specified disk image, i.e read current state and
> > > > > write it to new disk image.
> > > > > 4. Copy virtual disk data to raw binary file. Actually this patch
> > > > > breaks 'dd' behavior for this case when source image is less (in terms
> > > > > of logical blocks) than existed raw binary file.
> > > > >       May be for this case condition can be improved to smth like
> > > > >      if (strcmp(fmt, "raw") || !g_file_test(out.filename,
> > > > > G_FILE_TEST_EXISTS)) . And parameter "conv=notrunc" may be implemented
> > > > > additionally for this case.
> > > > My personal opinion is that qemu dd when you will need to
> > > > extract the SOME data from the original image and process
> > > > it further. Thus I use it to copy some data into raw binary
> > > > file. My next goal here would add ability to put data into
> > > > stdout that would be beneficial for. Though this is out of the
> > > > equation at the moment.
> > > > 
> > > > Though, speaking about the approach, I would say that the
> > > > patch changes current behavior which is not totally buggy
> > > > under a matter of this or that taste. It should be noted that
> > > > we are here in Linux world, not in the Mac world where we
> > > > were in position to avoid options and selections.
> > > > 
> > > > Thus my opinion that original behavior is to be preserved
> > > > as somebody is relying on it. The option you are proposing
> > > > seems valuable to me also and thus the switch is to be added.
> > > > The switch is well-defined in the original 'dd' world thus
> > > > either conv= option would be good, either nocreat or notrunc.
> > > > For me 'nocreat' seems more natural.
> > > > 
> > > > Anyway, the last word here belongs to either Hanna or Kevin ;)
> > > Personally, and honestly, I see no actual use for qemu-img dd at all,
> > > because we’re trying to mimic a subset of an interface of a rather complex
> > > program that has been designed to do what it does. We can only fail at
> > > that.  Personally, whenever I need dd functionality, I use
> > > qemu-storage-daemon’s fuse export, and then use the actual dd program on
> > > top.  Alternatively, qemu-img convert is our native interface;
> > > unfortunately, its feature set is lacking when compared to qemu-img dd, but
> > > I think it would be better to improve that rather than working on qemu-img
> > > dd.
> > Is there a clear view of what gaps exist in 'qemu-img convert', and more
> > importantly, how much work is it to close the gaps, such that 'dd' could
> > potentially be deprecated & eventually removed ?
> > 
> I am using 'qemu-img dd' as a way to get (some) content
> from the image. I have dreamed about getting it to
> stdout.

FWIW, you can use qemu-img convert to extract a subset of data from an
image by layering in the 'raw' format.

  qemu-img convert --image-opts \
      driver=raw,offset=1024,size=512,file.driver=file,file.filename=demo.img data.bin

somewhat annoyingly it forces 'size' to be a multiple of 512. That makes
sense if you're using the output as backing for a VM disk, but for simply
extracting data, conceptually it shouldn't be needed.

Yes, the syntax gets hairy with image opts :-)

With regards,
Daniel
diff mbox series

Patch

diff --git a/qemu-img.c b/qemu-img.c
index a48edb71015c..1a83c14212fb 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -5150,13 +5150,15 @@  static int img_dd(int argc, char **argv)
                             size - in.bsz * in.offset, &error_abort);
     }
 
-    ret = bdrv_create(drv, out.filename, opts, &local_err);
-    if (ret < 0) {
-        error_reportf_err(local_err,
-                          "%s: error while creating output image: ",
-                          out.filename);
-        ret = -1;
-        goto out;
+    if (!g_file_test(out.filename, G_FILE_TEST_EXISTS)) {
+        ret = bdrv_create(drv, out.filename, opts, &local_err);
+        if (ret < 0) {
+            error_reportf_err(local_err,
+                               "%s: error while creating output image: ",
+                               out.filename);
+            ret = -1;
+            goto out;
+        }
     }
 
     /* TODO, we can't honour --image-opts for the target,