Message ID | 20240124104006.335166-1-lvivier@redhat.com |
---|---|
State | New |
Headers | show |
Series | [v2,1/2] igb: fix link state on resume | expand |
On 2024/01/24 19:40, Laurent Vivier wrote: > On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume() > that sets link_down to false, and thus activates the link even > if we have disabled it. > > The problem can be reproduced starting qemu in paused state (-S) and > then set the link to down. When we resume the machine the link appears > to be up. > > Reproducer: > > # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S > > {"execute": "qmp_capabilities" } > {"execute": "set_link", "arguments": {"name": "net0", "up": false}} > {"execute": "cont" } > > To fix the problem, merge the content of e1000e_vm_state_change() > into e1000e_core_post_load() as e1000 does. > > Buglink: https://issues.redhat.com/browse/RHEL-21867 > Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation") > Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com> > Signed-off-by: Laurent Vivier <lvivier@redhat.com> > --- > > Notes: > v2: Add Fixes: and a comment about e1000e_intrmgr_resume() purpose. Thanks for taking care of this. Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
On Wed, Jan 24, 2024 at 6:40 PM Laurent Vivier <lvivier@redhat.com> wrote: > > On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume() > that sets link_down to false, and thus activates the link even > if we have disabled it. > > The problem can be reproduced starting qemu in paused state (-S) and > then set the link to down. When we resume the machine the link appears > to be up. > > Reproducer: > > # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S > > {"execute": "qmp_capabilities" } > {"execute": "set_link", "arguments": {"name": "net0", "up": false}} > {"execute": "cont" } > > To fix the problem, merge the content of e1000e_vm_state_change() > into e1000e_core_post_load() as e1000 does. > > Buglink: https://issues.redhat.com/browse/RHEL-21867 > Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation") > Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com> > Signed-off-by: Laurent Vivier <lvivier@redhat.com> > --- > I've queued this. Thanks
On 2/1/24 06:45, Jason Wang wrote: > On Wed, Jan 24, 2024 at 6:40 PM Laurent Vivier <lvivier@redhat.com> wrote: >> >> On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume() >> that sets link_down to false, and thus activates the link even >> if we have disabled it. >> >> The problem can be reproduced starting qemu in paused state (-S) and >> then set the link to down. When we resume the machine the link appears >> to be up. >> >> Reproducer: >> >> # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S >> >> {"execute": "qmp_capabilities" } >> {"execute": "set_link", "arguments": {"name": "net0", "up": false}} >> {"execute": "cont" } >> >> To fix the problem, merge the content of e1000e_vm_state_change() >> into e1000e_core_post_load() as e1000 does. >> >> Buglink: https://issues.redhat.com/browse/RHEL-21867 >> Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation") >> Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com> >> Signed-off-by: Laurent Vivier <lvivier@redhat.com> >> --- >> > > I've queued this. Ping? Thanks, Laurent
Hi Laurent: On Tue, Mar 5, 2024 at 6:07 PM Laurent Vivier <lvivier@redhat.com> wrote: > > On 2/1/24 06:45, Jason Wang wrote: > > On Wed, Jan 24, 2024 at 6:40 PM Laurent Vivier <lvivier@redhat.com> wrote: > >> > >> On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume() > >> that sets link_down to false, and thus activates the link even > >> if we have disabled it. > >> > >> The problem can be reproduced starting qemu in paused state (-S) and > >> then set the link to down. When we resume the machine the link appears > >> to be up. > >> > >> Reproducer: > >> > >> # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S > >> > >> {"execute": "qmp_capabilities" } > >> {"execute": "set_link", "arguments": {"name": "net0", "up": false}} > >> {"execute": "cont" } > >> > >> To fix the problem, merge the content of e1000e_vm_state_change() > >> into e1000e_core_post_load() as e1000 does. > >> > >> Buglink: https://issues.redhat.com/browse/RHEL-21867 > >> Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation") > >> Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com> > >> Signed-off-by: Laurent Vivier <lvivier@redhat.com> > >> --- > >> > > > > I've queued this. > > Ping? Plan to send a pull request with this by the end of this week. Thanks > > Thanks, > Laurent >
On Tue, Mar 5, 2024 at 6:07 PM Laurent Vivier <lvivier@redhat.com> wrote: > > On 2/1/24 06:45, Jason Wang wrote: > > On Wed, Jan 24, 2024 at 6:40 PM Laurent Vivier <lvivier@redhat.com> wrote: > >> > >> On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume() > >> that sets link_down to false, and thus activates the link even > >> if we have disabled it. > >> > >> The problem can be reproduced starting qemu in paused state (-S) and > >> then set the link to down. When we resume the machine the link appears > >> to be up. > >> > >> Reproducer: > >> > >> # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S > >> > >> {"execute": "qmp_capabilities" } > >> {"execute": "set_link", "arguments": {"name": "net0", "up": false}} > >> {"execute": "cont" } > >> > >> To fix the problem, merge the content of e1000e_vm_state_change() > >> into e1000e_core_post_load() as e1000 does. > >> > >> Buglink: https://issues.redhat.com/browse/RHEL-21867 > >> Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation") > >> Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com> > >> Signed-off-by: Laurent Vivier <lvivier@redhat.com> > >> --- > >> > > > > I've queued this. > > Ping? > > Thanks, > Laurent > This fail CI at: https://gitlab.com/jasowang/qemu/-/jobs/6348725267 It looks to me we can safely drop e1000e_autoneg_pause()? Thanks
On 3/8/24 09:09, Jason Wang wrote: > On Tue, Mar 5, 2024 at 6:07 PM Laurent Vivier <lvivier@redhat.com> wrote: >> >> On 2/1/24 06:45, Jason Wang wrote: >>> On Wed, Jan 24, 2024 at 6:40 PM Laurent Vivier <lvivier@redhat.com> wrote: >>>> >>>> On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume() >>>> that sets link_down to false, and thus activates the link even >>>> if we have disabled it. >>>> >>>> The problem can be reproduced starting qemu in paused state (-S) and >>>> then set the link to down. When we resume the machine the link appears >>>> to be up. >>>> >>>> Reproducer: >>>> >>>> # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S >>>> >>>> {"execute": "qmp_capabilities" } >>>> {"execute": "set_link", "arguments": {"name": "net0", "up": false}} >>>> {"execute": "cont" } >>>> >>>> To fix the problem, merge the content of e1000e_vm_state_change() >>>> into e1000e_core_post_load() as e1000 does. >>>> >>>> Buglink: https://issues.redhat.com/browse/RHEL-21867 >>>> Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation") >>>> Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com> >>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com> >>>> --- >>>> >>> >>> I've queued this. >> >> Ping? >> >> Thanks, >> Laurent >> > > This fail CI at: > > https://gitlab.com/jasowang/qemu/-/jobs/6348725267 > > It looks to me we can safely drop e1000e_autoneg_pause()? > > Thanks > Yes, I'm going to send a new version of the patch. Thanks, Laurent
diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c index e324c02dd589..ac8d4bb2433a 100644 --- a/hw/net/e1000e_core.c +++ b/hw/net/e1000e_core.c @@ -123,14 +123,6 @@ e1000e_intmgr_timer_resume(E1000IntrDelayTimer *timer) } } -static void -e1000e_intmgr_timer_pause(E1000IntrDelayTimer *timer) -{ - if (timer->running) { - timer_del(timer->timer); - } -} - static inline void e1000e_intrmgr_stop_timer(E1000IntrDelayTimer *timer) { @@ -398,24 +390,6 @@ e1000e_intrmgr_resume(E1000ECore *core) } } -static void -e1000e_intrmgr_pause(E1000ECore *core) -{ - int i; - - e1000e_intmgr_timer_pause(&core->radv); - e1000e_intmgr_timer_pause(&core->rdtr); - e1000e_intmgr_timer_pause(&core->raid); - e1000e_intmgr_timer_pause(&core->tidv); - e1000e_intmgr_timer_pause(&core->tadv); - - e1000e_intmgr_timer_pause(&core->itr); - - for (i = 0; i < E1000E_MSIX_VEC_NUM; i++) { - e1000e_intmgr_timer_pause(&core->eitr[i]); - } -} - static void e1000e_intrmgr_reset(E1000ECore *core) { @@ -3351,22 +3325,6 @@ e1000e_autoneg_resume(E1000ECore *core) } } -static void -e1000e_vm_state_change(void *opaque, bool running, RunState state) -{ - E1000ECore *core = opaque; - - if (running) { - trace_e1000e_vm_state_running(); - e1000e_intrmgr_resume(core); - e1000e_autoneg_resume(core); - } else { - trace_e1000e_vm_state_stopped(); - e1000e_autoneg_pause(core); - e1000e_intrmgr_pause(core); - } -} - void e1000e_core_pci_realize(E1000ECore *core, const uint16_t *eeprom_templ, @@ -3379,9 +3337,6 @@ e1000e_core_pci_realize(E1000ECore *core, e1000e_autoneg_timer, core); e1000e_intrmgr_pci_realize(core); - core->vmstate = - qemu_add_vm_change_state_handler(e1000e_vm_state_change, core); - for (i = 0; i < E1000E_NUM_QUEUES; i++) { net_tx_pkt_init(&core->tx[i].tx_pkt, E1000E_MAX_TX_FRAGS); } @@ -3405,8 +3360,6 @@ e1000e_core_pci_uninit(E1000ECore *core) e1000e_intrmgr_pci_unint(core); - qemu_del_vm_change_state_handler(core->vmstate); - for (i = 0; i < E1000E_NUM_QUEUES; i++) { net_tx_pkt_uninit(core->tx[i].tx_pkt); } @@ -3576,5 +3529,12 @@ e1000e_core_post_load(E1000ECore *core) */ nc->link_down = (core->mac[STATUS] & E1000_STATUS_LU) == 0; + /* + * we need to restart intrmgr timers, as an older version of + * QEMU can have stopped them before migration + */ + e1000e_intrmgr_resume(core); + e1000e_autoneg_resume(core); + return 0; } diff --git a/hw/net/e1000e_core.h b/hw/net/e1000e_core.h index 66b025cc43f1..01510ca78b47 100644 --- a/hw/net/e1000e_core.h +++ b/hw/net/e1000e_core.h @@ -98,8 +98,6 @@ struct E1000Core { E1000IntrDelayTimer eitr[E1000E_MSIX_VEC_NUM]; - VMChangeStateEntry *vmstate; - uint32_t itr_guest_value; uint32_t eitr_guest_value[E1000E_MSIX_VEC_NUM];
On resume e1000e_vm_state_change() always calls e1000e_autoneg_resume() that sets link_down to false, and thus activates the link even if we have disabled it. The problem can be reproduced starting qemu in paused state (-S) and then set the link to down. When we resume the machine the link appears to be up. Reproducer: # qemu-system-x86_64 ... -device e1000e,netdev=netdev0,id=net0 -S {"execute": "qmp_capabilities" } {"execute": "set_link", "arguments": {"name": "net0", "up": false}} {"execute": "cont" } To fix the problem, merge the content of e1000e_vm_state_change() into e1000e_core_post_load() as e1000 does. Buglink: https://issues.redhat.com/browse/RHEL-21867 Fixes: 6f3fbe4ed06a ("net: Introduce e1000e device emulation") Suggested-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Laurent Vivier <lvivier@redhat.com> --- Notes: v2: Add Fixes: and a comment about e1000e_intrmgr_resume() purpose. hw/net/e1000e_core.c | 54 ++++++-------------------------------------- hw/net/e1000e_core.h | 2 -- 2 files changed, 7 insertions(+), 49 deletions(-)