[v2] qcow2: keep reference on zeroize with discard-no-unref enabled

Message ID 20230905130839.923041-2-jean-louis@dupond.be
State New

Commit Message

Jean-Louis Dupond Sept. 5, 2023, 1:08 p.m. UTC
When the discard-no-unref flag is enabled, we keep the reference for
normal discard requests.
But when a discard is executed on a snapshot/qcow2 image with a backing
file, the discards are saved as zero clusters in the snapshot image.

When committing the snapshot to the backing file, zero_in_l2_slice is
called rather than discard_in_l2_slice, and it did not have any logic
to keep the reference when discard-no-unref is enabled.

Therefore we add logic to zero_in_l2_slice to keep the reference on
commit.

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621
Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
---
 block/qcow2-cluster.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

Comments

Hanna Czenczek Sept. 15, 2023, 11:21 a.m. UTC | #1
On 05.09.23 15:08, Jean-Louis Dupond wrote:
> When the discard-no-unref flag is enabled, we keep the reference for
> normal discard requests.
> But when a discard is executed on a snapshot/qcow2 image with backing,
> the discards are saved as zero clusters in the snapshot image.
>
> When committing the snapshot to the backing file, not
> discard_in_l2_slice is called but zero_in_l2_slice. Which did not had
> any logic to keep the reference when discard-no-unref is enabled.
>
> Therefor we add logic in the zero_in_l2_slice call to keep the reference
> on commit.
>
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621
> Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
> ---
>   block/qcow2-cluster.c | 22 ++++++++++++++++++----
>   1 file changed, 18 insertions(+), 4 deletions(-)

The code looks OK, but the obvious problem I find is that this is not 
what the discard-no-unref option describes.  It talks about discards, 
but this now changes the zero-write path.

I’m fairly certain that you are the only one using this option for now, 
so we might as well change its definition to include zero writes for 
8.2, but we should do that.

Hanna
Jean-Louis Dupond Sept. 25, 2023, 11:40 a.m. UTC | #2
On 15/09/2023 13:21, Hanna Czenczek wrote:
> On 05.09.23 15:08, Jean-Louis Dupond wrote:
>> When the discard-no-unref flag is enabled, we keep the reference for
>> normal discard requests.
>> But when a discard is executed on a snapshot/qcow2 image with backing,
>> the discards are saved as zero clusters in the snapshot image.
>>
>> When committing the snapshot to the backing file, not
>> discard_in_l2_slice is called but zero_in_l2_slice. Which did not had
>> any logic to keep the reference when discard-no-unref is enabled.
>>
>> Therefor we add logic in the zero_in_l2_slice call to keep the reference
>> on commit.
>>
>> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621
>> Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
>> ---
>>   block/qcow2-cluster.c | 22 ++++++++++++++++++----
>>   1 file changed, 18 insertions(+), 4 deletions(-)
>
> The code looks OK, but the obvious problem I find is that this is not 
> what the discard-no-unref option describes.  It talks about discards, 
> but this now changes the zero-write path.
But it's still touching the discard code in the zeroize code path,
because we modify the way zeroize does its discard (when
BDRV_REQ_MAY_UNMAP is set).
>
> I’m fairly certain that you are the only one using this option for 
> now, so we might as well change its definition to include zero writes 
> for 8.2, but we should do that.
I agree. How would you name the option then? It still involves 
discard-only code.
Besides that, the option has already been added to libvirt as well (so 
this needs to be fixed afterwards as well).
>
> Hanna
>
Hanna Czenczek Sept. 25, 2023, 2:17 p.m. UTC | #3
On 25.09.23 13:40, Jean-Louis Dupond wrote:
> On 15/09/2023 13:21, Hanna Czenczek wrote:
>> On 05.09.23 15:08, Jean-Louis Dupond wrote:
>>> When the discard-no-unref flag is enabled, we keep the reference for
>>> normal discard requests.
>>> But when a discard is executed on a snapshot/qcow2 image with backing,
>>> the discards are saved as zero clusters in the snapshot image.
>>>
>>> When committing the snapshot to the backing file, not
>>> discard_in_l2_slice is called but zero_in_l2_slice. Which did not had
>>> any logic to keep the reference when discard-no-unref is enabled.
>>>
>>> Therefor we add logic in the zero_in_l2_slice call to keep the 
>>> reference
>>> on commit.
>>>
>>> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621
>>> Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
>>> ---
>>>   block/qcow2-cluster.c | 22 ++++++++++++++++++----
>>>   1 file changed, 18 insertions(+), 4 deletions(-)
>>
>> The code looks OK, but the obvious problem I find is that this is not 
>> what the discard-no-unref option describes.  It talks about discards, 
>> but this now changes the zero-write path.
> But it's still touching the discard code in the zeroize code path.
> Cause we modify the way zeroize does its discard (when 
> BDRV_REQ_MAY_UNMAP)

I find there’s a difference between discard code handling discards from 
the guest, and code handling zero-writes from the guest that internally 
issues discards.  I see your POV, but the documentation isn’t clear that 
not unref'ing on discards not only affects discards issued by the guest, 
but also internal discards that have been generated upon write-zero from 
the guest.

>>
>> I’m fairly certain that you are the only one using this option for 
>> now, so we might as well change its definition to include zero writes 
>> for 8.2, but we should do that.
> I agree. How would you name the option then? Cause it still involves 
> discard-only code.

I wouldn’t change the name, just the definition (description).

Hanna

> Next to that, the option was already added to libvirt also (so this 
> needs to be fixed afterwards also).
>>
>> Hanna
>>
>
Jean-Louis Dupond Oct. 3, 2023, 12:53 p.m. UTC | #4
On 25/09/2023 16:17, Hanna Czenczek wrote:
> On 25.09.23 13:40, Jean-Louis Dupond wrote:
>> On 15/09/2023 13:21, Hanna Czenczek wrote:
>>> On 05.09.23 15:08, Jean-Louis Dupond wrote:
>>>> When the discard-no-unref flag is enabled, we keep the reference for
>>>> normal discard requests.
>>>> But when a discard is executed on a snapshot/qcow2 image with backing,
>>>> the discards are saved as zero clusters in the snapshot image.
>>>>
>>>> When committing the snapshot to the backing file, not
>>>> discard_in_l2_slice is called but zero_in_l2_slice. Which did not had
>>>> any logic to keep the reference when discard-no-unref is enabled.
>>>>
>>>> Therefor we add logic in the zero_in_l2_slice call to keep the 
>>>> reference
>>>> on commit.
>>>>
>>>> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621
>>>> Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
>>>> ---
>>>>   block/qcow2-cluster.c | 22 ++++++++++++++++++----
>>>>   1 file changed, 18 insertions(+), 4 deletions(-)
>>>
>>> The code looks OK, but the obvious problem I find is that this is 
>>> not what the discard-no-unref option describes.  It talks about 
>>> discards, but this now changes the zero-write path.
>> But it's still touching the discard code in the zeroize code path.
>> Cause we modify the way zeroize does its discard (when 
>> BDRV_REQ_MAY_UNMAP)
>
> I find there’s a difference between discard code handling discards 
> from the guest, and code handling zero-writes from the guest that 
> internally issues discards.  I see your POV, but the documentation 
> isn’t clear that not unref'ing on discards not only affects discards 
> issued by the guest, but also internal discards that have been 
> generated upon write-zero from the guest.
>
>>>
>>> I’m fairly certain that you are the only one using this option for 
>>> now, so we might as well change its definition to include zero 
>>> writes for 8.2, but we should do that.
>> I agree. How would you name the option then? Cause it still involves 
>> discard-only code.
>
> I wouldn’t change the name, just the definition (description).
Posted a new version with fixed description.
>
> Hanna
>
>> Next to that, the option was already added to libvirt also (so this 
>> needs to be fixed afterwards also).
>>>
>>> Hanna
>>>
>>
>

Patch

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index f4f6cd6ad0..fc764aea4d 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -1984,7 +1984,7 @@  static int discard_in_l2_slice(BlockDriverState *bs, uint64_t offset,
             /* If we keep the reference, pass on the discard still */
             bdrv_pdiscard(s->data_file, old_l2_entry & L2E_OFFSET_MASK,
                           s->cluster_size);
-       }
+        }
     }
 
     qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
@@ -2062,9 +2062,15 @@  zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
         QCow2ClusterType type = qcow2_get_cluster_type(bs, old_l2_entry);
         bool unmap = (type == QCOW2_CLUSTER_COMPRESSED) ||
             ((flags & BDRV_REQ_MAY_UNMAP) && qcow2_cluster_is_allocated(type));
-        uint64_t new_l2_entry = unmap ? 0 : old_l2_entry;
+        bool keep_reference =
+            (s->discard_no_unref && type != QCOW2_CLUSTER_COMPRESSED);
+        uint64_t new_l2_entry = old_l2_entry;
         uint64_t new_l2_bitmap = old_l2_bitmap;
 
+        if (unmap && !keep_reference) {
+            new_l2_entry = 0;
+        }
+
         if (has_subclusters(s)) {
             new_l2_bitmap = QCOW_L2_BITMAP_ALL_ZEROES;
         } else {
@@ -2082,9 +2088,17 @@  zero_in_l2_slice(BlockDriverState *bs, uint64_t offset,
             set_l2_bitmap(s, l2_slice, l2_index + i, new_l2_bitmap);
         }
 
-        /* Then decrease the refcount */
         if (unmap) {
-            qcow2_free_any_cluster(bs, old_l2_entry, QCOW2_DISCARD_REQUEST);
+            if (!keep_reference) {
+                /* Then decrease the refcount */
+                qcow2_free_any_cluster(bs, old_l2_entry, QCOW2_DISCARD_REQUEST);
+            } else if (s->discard_passthrough[QCOW2_DISCARD_REQUEST] &&
+                       (type == QCOW2_CLUSTER_NORMAL ||
+                        type == QCOW2_CLUSTER_ZERO_ALLOC)) {
+                /* If we keep the reference, pass on the discard still */
+                bdrv_pdiscard(s->data_file, old_l2_entry & L2E_OFFSET_MASK,
+                            s->cluster_size);
+            }
         }
     }
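
The decision the patched zero_in_l2_slice makes for each cluster can be
summarized with a small standalone model. This is only a sketch for
illustration, not QEMU code: the ClusterType and ZeroAction enums and the
zero_action() helper are hypothetical names, and the boolean parameters
stand in for (flags & BDRV_REQ_MAY_UNMAP), s->discard_no_unref and
s->discard_passthrough[QCOW2_DISCARD_REQUEST] from the hunks above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the QCow2ClusterType values used in the patch. */
typedef enum {
    CLUSTER_UNALLOCATED,
    CLUSTER_ZERO_PLAIN,
    CLUSTER_ZERO_ALLOC,
    CLUSTER_NORMAL,
    CLUSTER_COMPRESSED,
} ClusterType;

/* What happens to the old cluster once the L2 entry is marked as zero. */
typedef enum {
    ACTION_KEEP_MAPPING,     /* no unmap: offset stays, nothing is freed */
    ACTION_FREE_CLUSTER,     /* offset cleared, refcount decreased */
    ACTION_KEEP_AND_DISCARD, /* reference kept, discard passed to data file */
    ACTION_KEEP_ONLY,        /* reference kept, no discard passed on */
} ZeroAction;

/* Assumed to mirror qcow2_cluster_is_allocated(): these types own a cluster. */
static bool cluster_is_allocated(ClusterType type)
{
    return type == CLUSTER_NORMAL || type == CLUSTER_ZERO_ALLOC ||
           type == CLUSTER_COMPRESSED;
}

static ZeroAction zero_action(ClusterType type, bool may_unmap,
                              bool discard_no_unref, bool passthrough)
{
    /* Same conditions as the 'unmap' and 'keep_reference' locals in the diff. */
    bool unmap = (type == CLUSTER_COMPRESSED) ||
                 (may_unmap && cluster_is_allocated(type));
    bool keep_reference = discard_no_unref && type != CLUSTER_COMPRESSED;

    if (!unmap) {
        return ACTION_KEEP_MAPPING;
    }
    if (!keep_reference) {
        return ACTION_FREE_CLUSTER;
    }
    if (passthrough &&
        (type == CLUSTER_NORMAL || type == CLUSTER_ZERO_ALLOC)) {
        return ACTION_KEEP_AND_DISCARD;
    }
    return ACTION_KEEP_ONLY;
}

/* A few cases that follow directly from the conditions above. */
static void model_self_check(void)
{
    /* Without discard-no-unref, an unmapped normal cluster is freed. */
    assert(zero_action(CLUSTER_NORMAL, true, false, true) ==
           ACTION_FREE_CLUSTER);
    /* With discard-no-unref, the reference stays and the discard is passed on. */
    assert(zero_action(CLUSTER_NORMAL, true, true, true) ==
           ACTION_KEEP_AND_DISCARD);
    /* Compressed clusters are always freed, regardless of the option. */
    assert(zero_action(CLUSTER_COMPRESSED, false, true, true) ==
           ACTION_FREE_CLUSTER);
}
```

In this model, compressed clusters never keep their reference, which
matches the "type != QCOW2_CLUSTER_COMPRESSED" condition in the patch.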