diff mbox

qom/cpu: Remove cpu->exit_request from reset state

Message ID 1392268034-6220-1-git-send-email-edgar.iglesias@gmail.com
State New
Headers show

Commit Message

Edgar E. Iglesias Feb. 13, 2014, 5:07 a.m. UTC
From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>

cpu->exit_request is part of the execution environment and should
not be cleared when a CPU resets.

Otherwise, we might deadlock QEMU if a CPU resets while there is
I/O going on.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
---
 qom/cpu.c | 1 -
 1 file changed, 1 deletion(-)

Comments

Peter Maydell Feb. 15, 2014, 3:42 p.m. UTC | #1
On 13 February 2014 05:07,  <edgar.iglesias@gmail.com> wrote:
> From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
>
> cpu->exit_request is part of the execution environment and should
> not be cleared when a CPU resets.
>
> Otherwise, we might deadlock QEMU if a CPU resets while there is
> I/O going on.
>
> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
> ---
>  qom/cpu.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/qom/cpu.c b/qom/cpu.c
> index 9d62479..40d82dd 100644
> --- a/qom/cpu.c
> +++ b/qom/cpu.c
> @@ -195,7 +195,6 @@ static void cpu_common_reset(CPUState *cpu)
>          log_cpu_state(cpu, cc->reset_dump_flags);
>      }
>
> -    cpu->exit_request = 0;
>      cpu->interrupt_request = 0;
>      cpu->current_tb = NULL;
>      cpu->halted = 0;

This looks kind of odd to me. What's the situation you see where
this matters -- is the CPU resetting itself, or is some other device
in another thread triggering the CPU reset? TCG or KVM?

thanks
-- PMM
Edgar E. Iglesias Feb. 16, 2014, 2:07 a.m. UTC | #2
On Sat, Feb 15, 2014 at 03:42:56PM +0000, Peter Maydell wrote:
> On 13 February 2014 05:07,  <edgar.iglesias@gmail.com> wrote:
> > From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
> >
> > cpu->exit_request is part of the execution environment and should
> > not be cleared when a CPU resets.
> >
> > Otherwise, we might deadlock QEMU if a CPU resets while there is
> > I/O going on.
> >
> > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
> > ---
> >  qom/cpu.c | 1 -
> >  1 file changed, 1 deletion(-)
> >
> > diff --git a/qom/cpu.c b/qom/cpu.c
> > index 9d62479..40d82dd 100644
> > --- a/qom/cpu.c
> > +++ b/qom/cpu.c
> > @@ -195,7 +195,6 @@ static void cpu_common_reset(CPUState *cpu)
> >          log_cpu_state(cpu, cc->reset_dump_flags);
> >      }
> >
> > -    cpu->exit_request = 0;
> >      cpu->interrupt_request = 0;
> >      cpu->current_tb = NULL;
> >      cpu->halted = 0;
> 
> This looks kind of odd to me. What's the situation you see where
> this matters -- is the CPU resetting itself, or is some other device
> in another thread triggering the CPU reset? TCG or KVM?

Seeing this in TCG. The CPU gets signaled by the IO thread while the
CPU is resetting itself. If the CPU looses the race, it clears its
exit_request leaving the IO thread waiting for the global lock
potentially forever.

The CPU actually exits generated code but goes right back in because
there is no exit_request pending.

Cheers,
Edgar
Peter Maydell Feb. 20, 2014, 3:58 p.m. UTC | #3
On 16 February 2014 02:07, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
> On Sat, Feb 15, 2014 at 03:42:56PM +0000, Peter Maydell wrote:
>> On 13 February 2014 05:07,  <edgar.iglesias@gmail.com> wrote:
>> > From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
>> >
>> > cpu->exit_request is part of the execution environment and should
>> > not be cleared when a CPU resets.
>> >
>> > Otherwise, we might deadlock QEMU if a CPU resets while there is
>> > I/O going on.
>> >
>> > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
>> > ---
>> >  qom/cpu.c | 1 -
>> >  1 file changed, 1 deletion(-)
>> >
>> > diff --git a/qom/cpu.c b/qom/cpu.c
>> > index 9d62479..40d82dd 100644
>> > --- a/qom/cpu.c
>> > +++ b/qom/cpu.c
>> > @@ -195,7 +195,6 @@ static void cpu_common_reset(CPUState *cpu)
>> >          log_cpu_state(cpu, cc->reset_dump_flags);
>> >      }
>> >
>> > -    cpu->exit_request = 0;
>> >      cpu->interrupt_request = 0;
>> >      cpu->current_tb = NULL;
>> >      cpu->halted = 0;
>>
>> This looks kind of odd to me. What's the situation you see where
>> this matters -- is the CPU resetting itself, or is some other device
>> in another thread triggering the CPU reset? TCG or KVM?
>
> Seeing this in TCG. The CPU gets signaled by the IO thread while the
> CPU is resetting itself. If the CPU looses the race, it clears its
> exit_request leaving the IO thread waiting for the global lock
> potentially forever.
>
> The CPU actually exits generated code but goes right back in because
> there is no exit_request pending.

Yes, having looked at the code I agree with you, so:

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

However, doesn't this also apply to interrupt_request ?
If we have a pending asserted interrupt on the CPU
(ie the IRQ line into the chip is being held high)
this should result in an interrupt as soon as the
CPU reenables interrupts after reset, I would have
thought. Clearing cpu->interrupt_request here will
make us drop it on the floor.

thanks
-- PMM
Andreas Färber Feb. 20, 2014, 4:15 p.m. UTC | #4
Am 20.02.2014 16:58, schrieb Peter Maydell:
> On 16 February 2014 02:07, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
>> On Sat, Feb 15, 2014 at 03:42:56PM +0000, Peter Maydell wrote:
>>> On 13 February 2014 05:07,  <edgar.iglesias@gmail.com> wrote:
>>>> From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
>>>>
>>>> cpu->exit_request is part of the execution environment and should
>>>> not be cleared when a CPU resets.
>>>>
>>>> Otherwise, we might deadlock QEMU if a CPU resets while there is
>>>> I/O going on.
>>>>
>>>> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
>>>> ---
>>>>  qom/cpu.c | 1 -
>>>>  1 file changed, 1 deletion(-)
>>>>
>>>> diff --git a/qom/cpu.c b/qom/cpu.c
>>>> index 9d62479..40d82dd 100644
>>>> --- a/qom/cpu.c
>>>> +++ b/qom/cpu.c
>>>> @@ -195,7 +195,6 @@ static void cpu_common_reset(CPUState *cpu)
>>>>          log_cpu_state(cpu, cc->reset_dump_flags);
>>>>      }
>>>>
>>>> -    cpu->exit_request = 0;
>>>>      cpu->interrupt_request = 0;
>>>>      cpu->current_tb = NULL;
>>>>      cpu->halted = 0;
>>>
>>> This looks kind of odd to me. What's the situation you see where
>>> this matters -- is the CPU resetting itself, or is some other device
>>> in another thread triggering the CPU reset? TCG or KVM?
>>
>> Seeing this in TCG. The CPU gets signaled by the IO thread while the
>> CPU is resetting itself. If the CPU looses the race, it clears its
>> exit_request leaving the IO thread waiting for the global lock
>> potentially forever.
>>
>> The CPU actually exits generated code but goes right back in because
>> there is no exit_request pending.
> 
> Yes, having looked at the code I agree with you, so:
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> 
> However, doesn't this also apply to interrupt_request ?

I was wondering the same thing but didn't find time to investigate yet.

Is it possible that we rather need to register some reset hook or bottom
half to process the exit_request *before* this reset code runs?

Regards,
Andreas

> If we have a pending asserted interrupt on the CPU
> (ie the IRQ line into the chip is being held high)
> this should result in an interrupt as soon as the
> CPU reenables interrupts after reset, I would have
> thought. Clearing cpu->interrupt_request here will
> make us drop it on the floor.
> 
> thanks
> -- PMM
>
Peter Maydell Feb. 20, 2014, 4:34 p.m. UTC | #5
On 20 February 2014 16:15, Andreas Färber <afaerber@suse.de> wrote:
> Am 20.02.2014 16:58, schrieb Peter Maydell:
>> On 16 February 2014 02:07, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
>>> Seeing this in TCG. The CPU gets signaled by the IO thread while the
>>> CPU is resetting itself. If the CPU looses the race, it clears its
>>> exit_request leaving the IO thread waiting for the global lock
>>> potentially forever.
>>>
>>> The CPU actually exits generated code but goes right back in because
>>> there is no exit_request pending.
>>
>> Yes, having looked at the code I agree with you, so:
>>
>> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>>
>> However, doesn't this also apply to interrupt_request ?
>
> I was wondering the same thing but didn't find time to investigate yet.
>
> Is it possible that we rather need to register some reset hook or bottom
> half to process the exit_request *before* this reset code runs?

The only way to process the exit_request is to abandon execution
of whatever instruction caused us to try to do the CPU reset.
That's in the general case impossible because we probably
got here via an emulated power control device which has
already updated its internal state and isn't capable of
rolling back.

Simply making sure we honour the exit_request as soon as we've
completed the cpu_reset is much simpler, I think.

thanks
-- PMM
Edgar E. Iglesias Feb. 20, 2014, 11:26 p.m. UTC | #6
On Thu, Feb 20, 2014 at 03:58:15PM +0000, Peter Maydell wrote:
> On 16 February 2014 02:07, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
> > On Sat, Feb 15, 2014 at 03:42:56PM +0000, Peter Maydell wrote:
> >> On 13 February 2014 05:07,  <edgar.iglesias@gmail.com> wrote:
> >> > From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
> >> >
> >> > cpu->exit_request is part of the execution environment and should
> >> > not be cleared when a CPU resets.
> >> >
> >> > Otherwise, we might deadlock QEMU if a CPU resets while there is
> >> > I/O going on.
> >> >
> >> > Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
> >> > ---
> >> >  qom/cpu.c | 1 -
> >> >  1 file changed, 1 deletion(-)
> >> >
> >> > diff --git a/qom/cpu.c b/qom/cpu.c
> >> > index 9d62479..40d82dd 100644
> >> > --- a/qom/cpu.c
> >> > +++ b/qom/cpu.c
> >> > @@ -195,7 +195,6 @@ static void cpu_common_reset(CPUState *cpu)
> >> >          log_cpu_state(cpu, cc->reset_dump_flags);
> >> >      }
> >> >
> >> > -    cpu->exit_request = 0;
> >> >      cpu->interrupt_request = 0;
> >> >      cpu->current_tb = NULL;
> >> >      cpu->halted = 0;
> >>
> >> This looks kind of odd to me. What's the situation you see where
> >> this matters -- is the CPU resetting itself, or is some other device
> >> in another thread triggering the CPU reset? TCG or KVM?
> >
> > Seeing this in TCG. The CPU gets signaled by the IO thread while the
> > CPU is resetting itself. If the CPU looses the race, it clears its
> > exit_request leaving the IO thread waiting for the global lock
> > potentially forever.
> >
> > The CPU actually exits generated code but goes right back in because
> > there is no exit_request pending.
> 
> Yes, having looked at the code I agree with you, so:
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> 
> However, doesn't this also apply to interrupt_request ?
> If we have a pending asserted interrupt on the CPU
> (ie the IRQ line into the chip is being held high)
> this should result in an interrupt as soon as the
> CPU reenables interrupts after reset, I would have
> thought. Clearing cpu->interrupt_request here will
> make us drop it on the floor.

Hi,

I'm not sure about interrupt_request, seems to be a bit of
a mix. For example, I' not sure it's safe to keep all
the CPU_INTERRUPT_TGT_INT_X bits alive across a reset?
Agree with you about the interrupt hard line though.

Cheers,
Edgar
Peter Maydell Feb. 20, 2014, 11:50 p.m. UTC | #7
On 20 February 2014 23:26, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
> On Thu, Feb 20, 2014 at 03:58:15PM +0000, Peter Maydell wrote:
>> However, doesn't this also apply to interrupt_request ?
>> If we have a pending asserted interrupt on the CPU
>> (ie the IRQ line into the chip is being held high)
>> this should result in an interrupt as soon as the
>> CPU reenables interrupts after reset, I would have
>> thought. Clearing cpu->interrupt_request here will
>> make us drop it on the floor.

> I'm not sure about interrupt_request, seems to be a bit of
> a mix. For example, I' not sure it's safe to keep all
> the CPU_INTERRUPT_TGT_INT_X bits alive across a reset?
> Agree with you about the interrupt hard line though.

Mm, could probably use an audit of the uses of interrupt_request.
I suspect that one doesn't matter so much in practice because
a guest dealing with a reset CPU is going to either (a) ensure
that any sources of interrupts are cleared before it unmasks
them or (b) have arranged that nobody sends it interrupts
during the reset in the first place. It would be nice to go through
and check we can avoid the reset of interrupt_request, but
we don't need to delay this patch for that.

thanks
-- PMM
Andreas Färber March 11, 2014, 11:58 p.m. UTC | #8
Am 20.02.2014 16:58, schrieb Peter Maydell:
> On 16 February 2014 02:07, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
>> On Sat, Feb 15, 2014 at 03:42:56PM +0000, Peter Maydell wrote:
>>> On 13 February 2014 05:07,  <edgar.iglesias@gmail.com> wrote:
>>>> From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
>>>>
>>>> cpu->exit_request is part of the execution environment and should
>>>> not be cleared when a CPU resets.
>>>>
>>>> Otherwise, we might deadlock QEMU if a CPU resets while there is
>>>> I/O going on.
>>>>
>>>> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
>>>> ---
>>>>  qom/cpu.c | 1 -
>>>>  1 file changed, 1 deletion(-)
>>>>
>>>> diff --git a/qom/cpu.c b/qom/cpu.c
>>>> index 9d62479..40d82dd 100644
>>>> --- a/qom/cpu.c
>>>> +++ b/qom/cpu.c
>>>> @@ -195,7 +195,6 @@ static void cpu_common_reset(CPUState *cpu)
>>>>          log_cpu_state(cpu, cc->reset_dump_flags);
>>>>      }
>>>>
>>>> -    cpu->exit_request = 0;
>>>>      cpu->interrupt_request = 0;
>>>>      cpu->current_tb = NULL;
>>>>      cpu->halted = 0;
>>>
>>> This looks kind of odd to me. What's the situation you see where
>>> this matters -- is the CPU resetting itself, or is some other device
>>> in another thread triggering the CPU reset? TCG or KVM?
>>
>> Seeing this in TCG. The CPU gets signaled by the IO thread while the
>> CPU is resetting itself. If the CPU looses the race, it clears its
>> exit_request leaving the IO thread waiting for the global lock
>> potentially forever.
>>
>> The CPU actually exits generated code but goes right back in because
>> there is no exit_request pending.
> 
> Yes, having looked at the code I agree with you, so:
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

Thanks, applied to qom-cpu (with clarified commit message):
https://github.com/afaerber/qemu-cpu/commits/qom-cpu

Andreas
Peter Maydell March 12, 2014, 12:03 a.m. UTC | #9
On 11 March 2014 23:58, Andreas Färber <afaerber@suse.de> wrote:
> Am 20.02.2014 16:58, schrieb Peter Maydell:
>> On 16 February 2014 02:07, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
>>> On Sat, Feb 15, 2014 at 03:42:56PM +0000, Peter Maydell wrote:
>>>> On 13 February 2014 05:07,  <edgar.iglesias@gmail.com> wrote:
>>>>> From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
>>>>>
>>>>> cpu->exit_request is part of the execution environment and should
>>>>> not be cleared when a CPU resets.
>>>>>
>>>>> Otherwise, we might deadlock QEMU if a CPU resets while there is
>>>>> I/O going on.
>>>>>
>>>>> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
>>>>> ---
>>>>>  qom/cpu.c | 1 -
>>>>>  1 file changed, 1 deletion(-)
>>>>>
>>>>> diff --git a/qom/cpu.c b/qom/cpu.c
>>>>> index 9d62479..40d82dd 100644
>>>>> --- a/qom/cpu.c
>>>>> +++ b/qom/cpu.c
>>>>> @@ -195,7 +195,6 @@ static void cpu_common_reset(CPUState *cpu)
>>>>>          log_cpu_state(cpu, cc->reset_dump_flags);
>>>>>      }
>>>>>
>>>>> -    cpu->exit_request = 0;
>>>>>      cpu->interrupt_request = 0;
>>>>>      cpu->current_tb = NULL;
>>>>>      cpu->halted = 0;
>>>>
>>>> This looks kind of odd to me. What's the situation you see where
>>>> this matters -- is the CPU resetting itself, or is some other device
>>>> in another thread triggering the CPU reset? TCG or KVM?
>>>
>>> Seeing this in TCG. The CPU gets signaled by the IO thread while the
>>> CPU is resetting itself. If the CPU looses the race, it clears its
>>> exit_request leaving the IO thread waiting for the global lock
>>> potentially forever.
>>>
>>> The CPU actually exits generated code but goes right back in because
>>> there is no exit_request pending.
>>
>> Yes, having looked at the code I agree with you, so:
>>
>> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>
> Thanks, applied to qom-cpu (with clarified commit message):
> https://github.com/afaerber/qemu-cpu/commits/qom-cpu

I'd forgotten about this, but it's a bugfix for a hang, right?
Seems to me like we ought to put it into 2.0 -- were you
planning to submit it via qom-cpu for 2.0?

thanks
-- PMM
Edgar E. Iglesias March 12, 2014, 1:51 a.m. UTC | #10
On Wed, Mar 12, 2014 at 12:03:19AM +0000, Peter Maydell wrote:
> On 11 March 2014 23:58, Andreas Färber <afaerber@suse.de> wrote:
> > Am 20.02.2014 16:58, schrieb Peter Maydell:
> >> On 16 February 2014 02:07, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
> >>> On Sat, Feb 15, 2014 at 03:42:56PM +0000, Peter Maydell wrote:
> >>>> On 13 February 2014 05:07,  <edgar.iglesias@gmail.com> wrote:
> >>>>> From: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>
> >>>>>
> >>>>> cpu->exit_request is part of the execution environment and should
> >>>>> not be cleared when a CPU resets.
> >>>>>
> >>>>> Otherwise, we might deadlock QEMU if a CPU resets while there is
> >>>>> I/O going on.
> >>>>>
> >>>>> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
> >>>>> ---
> >>>>>  qom/cpu.c | 1 -
> >>>>>  1 file changed, 1 deletion(-)
> >>>>>
> >>>>> diff --git a/qom/cpu.c b/qom/cpu.c
> >>>>> index 9d62479..40d82dd 100644
> >>>>> --- a/qom/cpu.c
> >>>>> +++ b/qom/cpu.c
> >>>>> @@ -195,7 +195,6 @@ static void cpu_common_reset(CPUState *cpu)
> >>>>>          log_cpu_state(cpu, cc->reset_dump_flags);
> >>>>>      }
> >>>>>
> >>>>> -    cpu->exit_request = 0;
> >>>>>      cpu->interrupt_request = 0;
> >>>>>      cpu->current_tb = NULL;
> >>>>>      cpu->halted = 0;
> >>>>
> >>>> This looks kind of odd to me. What's the situation you see where
> >>>> this matters -- is the CPU resetting itself, or is some other device
> >>>> in another thread triggering the CPU reset? TCG or KVM?
> >>>
> >>> Seeing this in TCG. The CPU gets signaled by the IO thread while the
> >>> CPU is resetting itself. If the CPU looses the race, it clears its
> >>> exit_request leaving the IO thread waiting for the global lock
> >>> potentially forever.
> >>>
> >>> The CPU actually exits generated code but goes right back in because
> >>> there is no exit_request pending.
> >>
> >> Yes, having looked at the code I agree with you, so:
> >>
> >> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> >
> > Thanks, applied to qom-cpu (with clarified commit message):
> > https://github.com/afaerber/qemu-cpu/commits/qom-cpu
> 
> I'd forgotten about this, but it's a bugfix for a hang, right?

Hi, yes it's a bugfix for a hang.

> Seems to me like we ought to put it into 2.0 -- were you

Makes sense to me

Thanks,
Edgar


> planning to submit it via qom-cpu for 2.0?
> 
> thanks
> -- PMM
diff mbox

Patch

diff --git a/qom/cpu.c b/qom/cpu.c
index 9d62479..40d82dd 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -195,7 +195,6 @@  static void cpu_common_reset(CPUState *cpu)
         log_cpu_state(cpu, cc->reset_dump_flags);
     }
 
-    cpu->exit_request = 0;
     cpu->interrupt_request = 0;
     cpu->current_tb = NULL;
     cpu->halted = 0;