diff mbox series

[7/7] tests/avocado: ppc64 pseries reverse debugging test

Message ID 20230623125707.323517-8-npiggin@gmail.com
State New
Headers show
Series ppc: fix larx migration, fix record-replay | expand

Commit Message

Nicholas Piggin June 23, 2023, 12:57 p.m. UTC
pseries can run reverse-debugging well enough to pass basic tests.

There is strangeness with reverse-continue possibly relating to a bp
being set on the first instruction or on a snapshot, that causes
the PC to be reported on the first instruction rather than last
breakpoint, so a workaround is added for that for now.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 tests/avocado/reverse_debugging.py | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

Comments

Pavel Dovgalyuk June 26, 2023, 7:49 a.m. UTC | #1
On 23.06.2023 15:57, Nicholas Piggin wrote:
> pseries can run reverse-debugging well enough to pass basic tests.
> 
> There is strangeness with reverse-continue possibly relating to a bp
> being set on the first instruction or on a snapshot, that causes
> the PC to be reported on the first instruction rather than last
> breakpoint, so a workaround is added for that for now.

It means that the test reveals some kind of a bug in PPC debugging 
server implementation.
In this case it is better to fix that instead of adding workaround.

> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   tests/avocado/reverse_debugging.py | 28 +++++++++++++++++++++++++++-
>   1 file changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/avocado/reverse_debugging.py b/tests/avocado/reverse_debugging.py
> index 680c314cfc..553c931994 100644
> --- a/tests/avocado/reverse_debugging.py
> +++ b/tests/avocado/reverse_debugging.py
> @@ -94,7 +94,7 @@ def gdb_bstep(g):
>       def vm_get_icount(vm):
>           return vm.qmp('query-replay')['return']['icount']
>   
> -    def reverse_debugging(self, shift=7, args=None):
> +    def reverse_debugging(self, shift=7, args=None, initial_skip=False):
>           logger = logging.getLogger('replay')
>   
>           # create qcow2 for snapshots
> @@ -135,6 +135,10 @@ def reverse_debugging(self, shift=7, args=None):
>               self.fail('Reverse continue is not supported by QEMU')
>   
>           logger.info('stepping forward')
> +
> +        if initial_skip:
> +            self.gdb_step(g)
> +
>           steps = []
>           # record first instruction addresses
>           for _ in range(self.STEPS):
> @@ -216,3 +220,25 @@ def test_aarch64_virt(self):
>   
>           self.reverse_debugging(
>               args=('-kernel', kernel_path))
> +
> +class ReverseDebugging_ppc64(ReverseDebugging):
> +    """
> +    :avocado: tags=accel:tcg
> +    """
> +
> +    REG_PC = 0x40
> +
> +    # unidentified gitlab timeout problem
> +    @skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
> +    def test_ppc64_pseries(self):
> +        """
> +        :avocado: tags=arch:ppc64
> +        :avocado: tags=machine:pseries
> +        """
> +        # start with BIOS only
> +        self.endian_is_le = False
> +        # reverse-continue fails (seems to end up at the start) if a break
> +        # is put on the first instruction. initial_skip skips one before the
> +        # first bp, and it works. Could be due to break at the same icount
> +        # as the snapshot?
> +        self.reverse_debugging(initial_skip=True)
Nicholas Piggin June 26, 2023, 9:34 a.m. UTC | #2
On Mon Jun 26, 2023 at 5:49 PM AEST, Pavel Dovgalyuk wrote:
> On 23.06.2023 15:57, Nicholas Piggin wrote:
> > pseries can run reverse-debugging well enough to pass basic tests.
> > 
> > There is strangeness with reverse-continue possibly relating to a bp
> > being set on the first instruction or on a snapshot, that causes
> > the PC to be reported on the first instruction rather than last
> > breakpoint, so a workaround is added for that for now.
>
> It means that the test reveals some kind of a bug in PPC debugging 
> server implementation.
> In this case it is better to fix that instead of adding workaround.

I agree, and I'm trying to find the cause it hasn't been easy. I
thought the test was still interesting because it otherwise seems
to work well, but hopefully I can find the issue before long.

Thanks,
Nick

>
> > 
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> >   tests/avocado/reverse_debugging.py | 28 +++++++++++++++++++++++++++-
> >   1 file changed, 27 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tests/avocado/reverse_debugging.py b/tests/avocado/reverse_debugging.py
> > index 680c314cfc..553c931994 100644
> > --- a/tests/avocado/reverse_debugging.py
> > +++ b/tests/avocado/reverse_debugging.py
> > @@ -94,7 +94,7 @@ def gdb_bstep(g):
> >       def vm_get_icount(vm):
> >           return vm.qmp('query-replay')['return']['icount']
> >   
> > -    def reverse_debugging(self, shift=7, args=None):
> > +    def reverse_debugging(self, shift=7, args=None, initial_skip=False):
> >           logger = logging.getLogger('replay')
> >   
> >           # create qcow2 for snapshots
> > @@ -135,6 +135,10 @@ def reverse_debugging(self, shift=7, args=None):
> >               self.fail('Reverse continue is not supported by QEMU')
> >   
> >           logger.info('stepping forward')
> > +
> > +        if initial_skip:
> > +            self.gdb_step(g)
> > +
> >           steps = []
> >           # record first instruction addresses
> >           for _ in range(self.STEPS):
> > @@ -216,3 +220,25 @@ def test_aarch64_virt(self):
> >   
> >           self.reverse_debugging(
> >               args=('-kernel', kernel_path))
> > +
> > +class ReverseDebugging_ppc64(ReverseDebugging):
> > +    """
> > +    :avocado: tags=accel:tcg
> > +    """
> > +
> > +    REG_PC = 0x40
> > +
> > +    # unidentified gitlab timeout problem
> > +    @skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
> > +    def test_ppc64_pseries(self):
> > +        """
> > +        :avocado: tags=arch:ppc64
> > +        :avocado: tags=machine:pseries
> > +        """
> > +        # start with BIOS only
> > +        self.endian_is_le = False
> > +        # reverse-continue fails (seems to end up at the start) if a break
> > +        # is put on the first instruction. initial_skip skips one before the
> > +        # first bp, and it works. Could be due to break at the same icount
> > +        # as the snapshot?
> > +        self.reverse_debugging(initial_skip=True)
Nicholas Piggin July 21, 2023, 1:55 p.m. UTC | #3
On Mon Jun 26, 2023 at 7:34 PM AEST, Nicholas Piggin wrote:
> On Mon Jun 26, 2023 at 5:49 PM AEST, Pavel Dovgalyuk wrote:
> > On 23.06.2023 15:57, Nicholas Piggin wrote:
> > > pseries can run reverse-debugging well enough to pass basic tests.
> > > 
> > > There is strangeness with reverse-continue possibly relating to a bp
> > > being set on the first instruction or on a snapshot, that causes
> > > the PC to be reported on the first instruction rather than last
> > > breakpoint, so a workaround is added for that for now.
> >
> > It means that the test reveals some kind of a bug in PPC debugging 
> > server implementation.
> > In this case it is better to fix that instead of adding workaround.
>
> I agree, and I'm trying to find the cause it hasn't been easy. I
> thought the test was still interesting because it otherwise seems
> to work well, but hopefully I can find the issue before long.

I found the problem after too much staring at record-replay. QEMU works
perfectly, it is the ppc pseries firmware that branches to its own entry
point address within the recorded trace, and that confuses the test
script.

I'm not sure the best way to work around it. Initial skip works okay but
also maybe(?) a good edge case to test a break on the very first
instruction of the trace even if we don't reverse continue to it.

I will send a new series and propose to add the breakpoints before
seeking to the end of the trace, so we will catch a breakpoint being
executed again.

Thanks,
Nick
diff mbox series

Patch

diff --git a/tests/avocado/reverse_debugging.py b/tests/avocado/reverse_debugging.py
index 680c314cfc..553c931994 100644
--- a/tests/avocado/reverse_debugging.py
+++ b/tests/avocado/reverse_debugging.py
@@ -94,7 +94,7 @@  def gdb_bstep(g):
     def vm_get_icount(vm):
         return vm.qmp('query-replay')['return']['icount']
 
-    def reverse_debugging(self, shift=7, args=None):
+    def reverse_debugging(self, shift=7, args=None, initial_skip=False):
         logger = logging.getLogger('replay')
 
         # create qcow2 for snapshots
@@ -135,6 +135,10 @@  def reverse_debugging(self, shift=7, args=None):
             self.fail('Reverse continue is not supported by QEMU')
 
         logger.info('stepping forward')
+
+        if initial_skip:
+            self.gdb_step(g)
+
         steps = []
         # record first instruction addresses
         for _ in range(self.STEPS):
@@ -216,3 +220,25 @@  def test_aarch64_virt(self):
 
         self.reverse_debugging(
             args=('-kernel', kernel_path))
+
+class ReverseDebugging_ppc64(ReverseDebugging):
+    """
+    :avocado: tags=accel:tcg
+    """
+
+    REG_PC = 0x40
+
+    # unidentified gitlab timeout problem
+    @skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
+    def test_ppc64_pseries(self):
+        """
+        :avocado: tags=arch:ppc64
+        :avocado: tags=machine:pseries
+        """
+        # start with BIOS only
+        self.endian_is_le = False
+        # reverse-continue fails (seems to end up at the start) if a break
+        # is put on the first instruction. initial_skip skips one before the
+        # first bp, and it works. Could be due to break at the same icount
+        # as the snapshot?
+        self.reverse_debugging(initial_skip=True)