diff mbox series

[v2,2/2] e1000e: fix link state on resume

Message ID 20240124104006.335166-1-lvivier@redhat.com
State New
Headers show
Series [v2,1/2] igb: fix link state on resume | expand

Commit Message

Laurent Vivier Jan. 24, 2024, 10:40 a.m. UTC
On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume()
that sets link_down to false, and thus activates the link even
if we have disabled it.

The problem can be reproduced starting qemu in paused state (-S) and
then set the link to down. When we resume the machine the link appears
to be up.

Reproducer:

   # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S

   {"execute": "qmp_capabilities" }
   {"execute": "set_link", "arguments": {"name": "net0", "up": false}}
   {"execute": "cont" }

To fix the problem, merge the content of e1000e_vm_state_change()
into e1000e_core_post_load() as e1000 does.

Buglink: https://issues.redhat.com/browse/RHEL-21867
Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation")
Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---

Notes:
    v2: Add Fixes: and a comment about e1000e_intrmgr_resume() purpose.

 hw/net/e1000e_core.c | 54 ++++++--------------------------------------
 hw/net/e1000e_core.h |  2 --
 2 files changed, 7 insertions(+), 49 deletions(-)

Comments

Akihiko Odaki Jan. 28, 2024, 11:37 a.m. UTC | #1
On 2024/01/24 19:40, Laurent Vivier wrote:
> On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume()
> that sets link_down to false, and thus activates the link even
> if we have disabled it.
> 
> The problem can be reproduced starting qemu in paused state (-S) and
> then set the link to down. When we resume the machine the link appears
> to be up.
> 
> Reproducer:
> 
>     # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S
> 
>     {"execute": "qmp_capabilities" }
>     {"execute": "set_link", "arguments": {"name": "net0", "up": false}}
>     {"execute": "cont" }
> 
> To fix the problem, merge the content of e1000e_vm_state_change()
> into e1000e_core_post_load() as e1000 does.
> 
> Buglink: https://issues.redhat.com/browse/RHEL-21867
> Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation")
> Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> ---
> 
> Notes:
>      v2: Add Fixes: and a comment about e1000e_intrmgr_resume() purpose.

Thanks for taking care of this.

Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Jason Wang Feb. 1, 2024, 5:45 a.m. UTC | #2
On Wed, Jan 24, 2024 at 6:40 PM Laurent Vivier <lvivier@redhat.com> wrote:
>
> On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume()
> that sets link_down to false, and thus activates the link even
> if we have disabled it.
>
> The problem can be reproduced starting qemu in paused state (-S) and
> then set the link to down. When we resume the machine the link appears
> to be up.
>
> Reproducer:
>
>    # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S
>
>    {"execute": "qmp_capabilities" }
>    {"execute": "set_link", "arguments": {"name": "net0", "up": false}}
>    {"execute": "cont" }
>
> To fix the problem, merge the content of e1000e_vm_state_change()
> into e1000e_core_post_load() as e1000 does.
>
> Buglink: https://issues.redhat.com/browse/RHEL-21867
> Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation")
> Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> ---
>

I've queued this.

Thanks
Laurent Vivier March 5, 2024, 10:06 a.m. UTC | #3
On 2/1/24 06:45, Jason Wang wrote:
> On Wed, Jan 24, 2024 at 6:40 PM Laurent Vivier <lvivier@redhat.com> wrote:
>>
>> On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume()
>> that sets link_down to false, and thus activates the link even
>> if we have disabled it.
>>
>> The problem can be reproduced starting qemu in paused state (-S) and
>> then set the link to down. When we resume the machine the link appears
>> to be up.
>>
>> Reproducer:
>>
>>     # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S
>>
>>     {"execute": "qmp_capabilities" }
>>     {"execute": "set_link", "arguments": {"name": "net0", "up": false}}
>>     {"execute": "cont" }
>>
>> To fix the problem, merge the content of e1000e_vm_state_change()
>> into e1000e_core_post_load() as e1000 does.
>>
>> Buglink: https://issues.redhat.com/browse/RHEL-21867
>> Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation")
>> Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>> ---
>>
> 
> I've queued this.

Ping?

Thanks,
Laurent
Jason Wang March 6, 2024, 5:29 a.m. UTC | #4
Hi Laurent:

On Tue, Mar 5, 2024 at 6:07 PM Laurent Vivier <lvivier@redhat.com> wrote:
>
> On 2/1/24 06:45, Jason Wang wrote:
> > On Wed, Jan 24, 2024 at 6:40 PM Laurent Vivier <lvivier@redhat.com> wrote:
> >>
> >> On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume()
> >> that sets link_down to false, and thus activates the link even
> >> if we have disabled it.
> >>
> >> The problem can be reproduced starting qemu in paused state (-S) and
> >> then set the link to down. When we resume the machine the link appears
> >> to be up.
> >>
> >> Reproducer:
> >>
> >>     # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S
> >>
> >>     {"execute": "qmp_capabilities" }
> >>     {"execute": "set_link", "arguments": {"name": "net0", "up": false}}
> >>     {"execute": "cont" }
> >>
> >> To fix the problem, merge the content of e1000e_vm_state_change()
> >> into e1000e_core_post_load() as e1000 does.
> >>
> >> Buglink: https://issues.redhat.com/browse/RHEL-21867
> >> Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation")
> >> Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> >> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> >> ---
> >>
> >
> > I've queued this.
>
> Ping?

Plan to send a pull request with this by the end of this week.

Thanks

>
> Thanks,
> Laurent
>
Jason Wang March 8, 2024, 8:09 a.m. UTC | #5
On Tue, Mar 5, 2024 at 6:07 PM Laurent Vivier <lvivier@redhat.com> wrote:
>
> On 2/1/24 06:45, Jason Wang wrote:
> > On Wed, Jan 24, 2024 at 6:40 PM Laurent Vivier <lvivier@redhat.com> wrote:
> >>
> >> On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume()
> >> that sets link_down to false, and thus activates the link even
> >> if we have disabled it.
> >>
> >> The problem can be reproduced starting qemu in paused state (-S) and
> >> then set the link to down. When we resume the machine the link appears
> >> to be up.
> >>
> >> Reproducer:
> >>
> >>     # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S
> >>
> >>     {"execute": "qmp_capabilities" }
> >>     {"execute": "set_link", "arguments": {"name": "net0", "up": false}}
> >>     {"execute": "cont" }
> >>
> >> To fix the problem, merge the content of e1000e_vm_state_change()
> >> into e1000e_core_post_load() as e1000 does.
> >>
> >> Buglink: https://issues.redhat.com/browse/RHEL-21867
> >> Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation")
> >> Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> >> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> >> ---
> >>
> >
> > I've queued this.
>
> Ping?
>
> Thanks,
> Laurent
>

This fail CI at:

https://gitlab.com/jasowang/qemu/-/jobs/6348725267

It looks to me we can safely drop e1000e_autoneg_pause()?

Thanks
Laurent Vivier March 8, 2024, 8:21 a.m. UTC | #6
On 3/8/24 09:09, Jason Wang wrote:
> On Tue, Mar 5, 2024 at 6:07 PM Laurent Vivier <lvivier@redhat.com> wrote:
>>
>> On 2/1/24 06:45, Jason Wang wrote:
>>> On Wed, Jan 24, 2024 at 6:40 PM Laurent Vivier <lvivier@redhat.com> wrote:
>>>>
>>>> On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume()
>>>> that sets link_down to false, and thus activates the link even
>>>> if we have disabled it.
>>>>
>>>> The problem can be reproduced starting qemu in paused state (-S) and
>>>> then set the link to down. When we resume the machine the link appears
>>>> to be up.
>>>>
>>>> Reproducer:
>>>>
>>>>      # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S
>>>>
>>>>      {"execute": "qmp_capabilities" }
>>>>      {"execute": "set_link", "arguments": {"name": "net0", "up": false}}
>>>>      {"execute": "cont" }
>>>>
>>>> To fix the problem, merge the content of e1000e_vm_state_change()
>>>> into e1000e_core_post_load() as e1000 does.
>>>>
>>>> Buglink: https://issues.redhat.com/browse/RHEL-21867
>>>> Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation")
>>>> Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>>> ---
>>>>
>>>
>>> I've queued this.
>>
>> Ping?
>>
>> Thanks,
>> Laurent
>>
> 
> This fail CI at:
> 
> https://gitlab.com/jasowang/qemu/-/jobs/6348725267
> 
> It looks to me we can safely drop e1000e_autoneg_pause()?
> 
> Thanks
> 

Yes, I'm going to send a new version of the patch.

Thanks,
Laurent
diff mbox series

Patch

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index e324c02dd589..ac8d4bb2433a 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -123,14 +123,6 @@  e1000e_intmgr_timer_resume(E1000IntrDelayTimer *timer)
     }
 }
 
-static void
-e1000e_intmgr_timer_pause(E1000IntrDelayTimer *timer)
-{
-    if (timer->running) {
-        timer_del(timer->timer);
-    }
-}
-
 static inline void
 e1000e_intrmgr_stop_timer(E1000IntrDelayTimer *timer)
 {
@@ -398,24 +390,6 @@  e1000e_intrmgr_resume(E1000ECore *core)
     }
 }
 
-static void
-e1000e_intrmgr_pause(E1000ECore *core)
-{
-    int i;
-
-    e1000e_intmgr_timer_pause(&core->radv);
-    e1000e_intmgr_timer_pause(&core->rdtr);
-    e1000e_intmgr_timer_pause(&core->raid);
-    e1000e_intmgr_timer_pause(&core->tidv);
-    e1000e_intmgr_timer_pause(&core->tadv);
-
-    e1000e_intmgr_timer_pause(&core->itr);
-
-    for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) {
-        e1000e_intmgr_timer_pause(&core->eitr[i]);
-    }
-}
-
 static void
 e1000e_intrmgr_reset(E1000ECore *core)
 {
@@ -3351,22 +3325,6 @@  e1000e_autoneg_resume(E1000ECore *core)
     }
 }
 
-static void
-e1000e_vm_state_change(void *opaque, bool running, RunState state)
-{
-    E1000ECore *core = opaque;
-
-    if (running) {
-        trace_e1000e_vm_state_running();
-        e1000e_intrmgr_resume(core);
-        e1000e_autoneg_resume(core);
-    } else {
-        trace_e1000e_vm_state_stopped();
-        e1000e_autoneg_pause(core);
-        e1000e_intrmgr_pause(core);
-    }
-}
-
 void
 e1000e_core_pci_realize(E1000ECore     *core,
                         const uint16_t *eeprom_templ,
@@ -3379,9 +3337,6 @@  e1000e_core_pci_realize(E1000ECore     *core,
                                        e1000e_autoneg_timer, core);
     e1000e_intrmgr_pci_realize(core);
 
-    core->vmstate =
-        qemu_add_vm_change_state_handler(e1000e_vm_state_change, core);
-
     for (i = 0; i < E1000E_NUM_QUEUES; i++) {
         net_tx_pkt_init(&core->tx[i].tx_pkt, E1000E_MAX_TX_FRAGS);
     }
@@ -3405,8 +3360,6 @@  e1000e_core_pci_uninit(E1000ECore *core)
 
     e1000e_intrmgr_pci_unint(core);
 
-    qemu_del_vm_change_state_handler(core->vmstate);
-
     for (i = 0; i < E1000E_NUM_QUEUES; i++) {
         net_tx_pkt_uninit(core->tx[i].tx_pkt);
     }
@@ -3576,5 +3529,12 @@  e1000e_core_post_load(E1000ECore *core)
      */
     nc->link_down = (core->mac[STATUS] & E1000_STATUS_LU) == 0;
 
+    /*
+     * we need to restart intrmgr timers, as an older version of
+     * QEMU can have stopped them before migration
+     */
+    e1000e_intrmgr_resume(core);
+    e1000e_autoneg_resume(core);
+
     return 0;
 }
diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h
index 66b025cc43f1..01510ca78b47 100644
--- a/hw/net/e1000e_core.h
+++ b/hw/net/e1000e_core.h
@@ -98,8 +98,6 @@  struct E1000Core {
 
     E1000IntrDelayTimer eitr[E1000E_MSIX_VEC_NUM];
 
-    VMChangeStateEntry *vmstate;
-
     uint32_t itr_guest_value;
     uint32_t eitr_guest_value[E1000E_MSIX_VEC_NUM];