diff mbox series

models: Add Series.is_open

Message ID 20210903155103.39719-1-stephen@that.guru
State Changes Requested
Headers show
Series models: Add Series.is_open | expand

Commit Message

Stephen Finucane Sept. 3, 2021, 3:51 p.m. UTC
Start making series more useful by tracking whether the state is open,
meaning one or more patches are in a state with action_required=True, or
not. This means that the 'git-pw series list' command will only list
series that are actually actionable.

Signed-off-by: Stephen Finucane <stephen@that.guru>
---
Open questions:
- Is there a more efficient way to do the migration than I've done?
- Should we expose an 'is_open' boolean attribute of a 'state' (open,
  closed) attribute for the series views?
- Is there a race condition possible in how I've implemented the series
  state checks?
---
 docs/api/schemas/latest/patchwork.yaml      |  8 +++
 docs/api/schemas/patchwork.j2               | 12 ++++
 docs/api/schemas/v1.3/patchwork.yaml        |  8 +++
 patchwork/api/embedded.py                   |  7 +-
 patchwork/api/filters.py                    |  7 +-
 patchwork/api/series.py                     |  9 ++-
 patchwork/migrations/0046_series_is_open.py | 58 +++++++++++++++++
 patchwork/models.py                         | 20 ++++++
 patchwork/tests/api/test_series.py          | 52 +++++++++++++++
 patchwork/tests/test_models.py              | 72 +++++++++++++++++++++
 10 files changed, 247 insertions(+), 6 deletions(-)
 create mode 100644 patchwork/migrations/0046_series_is_open.py
 create mode 100644 patchwork/tests/test_models.py

Comments

Stephen Finucane Sept. 3, 2021, 3:51 p.m. UTC | #1
On Fri, 2021-09-03 at 16:51 +0100, Stephen Finucane wrote:
> Start making series more useful by tracking whether the state is open,
> meaning one or more patches are in a state with action_required=True, or
> not. This means that the 'git-pw series list' command will only list
> series that are actually actionable.
> 
> Signed-off-by: Stephen Finucane <stephen@that.guru>

This should have been sent as an RFC. This isn't complete and I have open
questions below. Would appreciate some input!

Stephen

> ---
> Open questions:
> - Is there a more efficient way to do the migration than I've done?
> - Should we expose an 'is_open' boolean attribute of a 'state' (open,
>   closed) attribute for the series views?
> - Is there a race condition possible in how I've implemented the series
>   state checks?
> ---
>  docs/api/schemas/latest/patchwork.yaml      |  8 +++
>  docs/api/schemas/patchwork.j2               | 12 ++++
>  docs/api/schemas/v1.3/patchwork.yaml        |  8 +++
>  patchwork/api/embedded.py                   |  7 +-
>  patchwork/api/filters.py                    |  7 +-
>  patchwork/api/series.py                     |  9 ++-
>  patchwork/migrations/0046_series_is_open.py | 58 +++++++++++++++++
>  patchwork/models.py                         | 20 ++++++
>  patchwork/tests/api/test_series.py          | 52 +++++++++++++++
>  patchwork/tests/test_models.py              | 72 +++++++++++++++++++++
>  10 files changed, 247 insertions(+), 6 deletions(-)
>  create mode 100644 patchwork/migrations/0046_series_is_open.py
>  create mode 100644 patchwork/tests/test_models.py
> 
> diff --git docs/api/schemas/latest/patchwork.yaml docs/api/schemas/latest/patchwork.yaml
> index e3bff990..cb8fcdfa 100644
> --- docs/api/schemas/latest/patchwork.yaml
> +++ docs/api/schemas/latest/patchwork.yaml
> @@ -2260,6 +2260,10 @@ components:
>            description: >
>              Version of series as indicated by the subject prefix(es).
>            type: integer
> +        is_open:
> +          title: Status
> +          type: boolean
> +          readOnly: true
>          total:
>            title: Total
>            description: >
> @@ -2610,6 +2614,10 @@ components:
>              Version of series as indicated by the subject prefix(es).
>            type: integer
>            readOnly: true
> +        is_open:
> +          title: Status
> +          type: boolean
> +          readOnly: true
>          mbox:
>            title: Mbox
>            type: string
> diff --git docs/api/schemas/patchwork.j2 docs/api/schemas/patchwork.j2
> index 3b4ad2f6..13610b84 100644
> --- docs/api/schemas/patchwork.j2
> +++ docs/api/schemas/patchwork.j2
> @@ -2354,6 +2354,12 @@ components:
>            description: >
>              Version of series as indicated by the subject prefix(es).
>            type: integer
> +{% if version >= (1, 3) %}
> +        is_open:
> +          title: Status
> +          type: boolean
> +          readOnly: true
> +{% endif %}
>          total:
>            title: Total
>            description: >
> @@ -2718,6 +2724,12 @@ components:
>              Version of series as indicated by the subject prefix(es).
>            type: integer
>            readOnly: true
> +{% if version >= (1, 3) %}
> +        is_open:
> +          title: Status
> +          type: boolean
> +          readOnly: true
> +{% endif %}
>          mbox:
>            title: Mbox
>            type: string
> diff --git docs/api/schemas/v1.3/patchwork.yaml docs/api/schemas/v1.3/patchwork.yaml
> index 6cbba646..3b38799c 100644
> --- docs/api/schemas/v1.3/patchwork.yaml
> +++ docs/api/schemas/v1.3/patchwork.yaml
> @@ -2260,6 +2260,10 @@ components:
>            description: >
>              Version of series as indicated by the subject prefix(es).
>            type: integer
> +        is_open:
> +          title: Status
> +          type: boolean
> +          readOnly: true
>          total:
>            title: Total
>            description: >
> @@ -2610,6 +2614,10 @@ components:
>              Version of series as indicated by the subject prefix(es).
>            type: integer
>            readOnly: true
> +        is_open:
> +          title: Status
> +          type: boolean
> +          readOnly: true
>          mbox:
>            title: Mbox
>            type: string
> diff --git patchwork/api/embedded.py patchwork/api/embedded.py
> index 78316979..38a00ac4 100644
> --- patchwork/api/embedded.py
> +++ patchwork/api/embedded.py
> @@ -181,11 +181,14 @@ class SeriesSerializer(SerializedRelatedField):
>  
>          class Meta:
>              model = models.Series
> -            fields = ('id', 'url', 'web_url', 'date', 'name', 'version',
> -                      'mbox')
> +            fields = (
> +                'id', 'url', 'web_url', 'date', 'name', 'version', 'is_open',
> +                'mbox',
> +            )
>              read_only_fields = fields
>              versioned_fields = {
>                  '1.1': ('web_url', ),
> +                '1.3': ('is_open', ),
>              }
>              extra_kwargs = {
>                  'url': {'view_name': 'api-series-detail'},
> diff --git patchwork/api/filters.py patchwork/api/filters.py
> index d9b65a8f..38de18b6 100644
> --- patchwork/api/filters.py
> +++ patchwork/api/filters.py
> @@ -8,6 +8,7 @@ from django.core.exceptions import ValidationError
>  from django.db.models import Q
>  from django_filters import rest_framework
>  from django_filters.rest_framework import FilterSet
> +from django_filters import BooleanFilter
>  from django_filters import CharFilter
>  from django_filters import IsoDateTimeFilter
>  from django_filters import ModelMultipleChoiceFilter
> @@ -178,10 +179,14 @@ class SeriesFilterSet(TimestampMixin, BaseFilterSet):
>  
>      submitter = PersonFilter(queryset=Person.objects.all(), distinct=False)
>      project = ProjectFilter(queryset=Project.objects.all(), distinct=False)
> +    is_open = BooleanFilter()
>  
>      class Meta:
>          model = Series
> -        fields = ('submitter', 'project')
> +        fields = ('submitter', 'project', 'is_open')
> +        versioned_fields = {
> +            '1.2': ('hash', 'msgid'),
> +        }
>  
>  
>  def msgid_filter(queryset, name, value):
> diff --git patchwork/api/series.py patchwork/api/series.py
> index 106e60f0..f3a498e0 100644
> --- patchwork/api/series.py
> +++ patchwork/api/series.py
> @@ -36,13 +36,16 @@ class SeriesSerializer(BaseHyperlinkedModelSerializer):
>  
>      class Meta:
>          model = Series
> -        fields = ('id', 'url', 'web_url', 'project', 'name', 'date',
> -                  'submitter', 'version', 'total', 'received_total',
> -                  'received_all', 'mbox', 'cover_letter', 'patches')
> +        fields = (
> +            'id', 'url', 'web_url', 'project', 'name', 'date',
> +            'submitter', 'version', 'is_open', 'total', 'received_total',
> +            'received_all', 'mbox', 'cover_letter', 'patches',
> +        )
>          read_only_fields = ('date', 'submitter', 'total', 'received_total',
>                              'received_all', 'mbox', 'cover_letter', 'patches')
>          versioned_fields = {
>              '1.1': ('web_url', ),
> +            '1.3': ('is_open', ),
>          }
>          extra_kwargs = {
>              'url': {'view_name': 'api-series-detail'},
> diff --git patchwork/migrations/0046_series_is_open.py patchwork/migrations/0046_series_is_open.py
> new file mode 100644
> index 00000000..c916ed8b
> --- /dev/null
> +++ patchwork/migrations/0046_series_is_open.py
> @@ -0,0 +1,58 @@
> +from django.db import migrations, models
> +
> +
> +class Migration(migrations.Migration):
> +
> +    dependencies = [
> +        ('patchwork', '0045_addressed_fields'),
> +    ]
> +
> +    operations = [
> +        migrations.AddField(
> +            model_name='series',
> +            name='is_open',
> +            field=models.BooleanField(
> +                help_text='Are all patches in this series resolved?',
> +                null=True,
> +            ),
> +        ),
> +        # Each patch in a series is associated with a state. States have an
> +        # 'action_required' attribute. A state with 'action_required=True' is
> +        # effectively "open", since the maintainer(s) still need to do
> +        # something with it. This means the patches associated with these
> +        # states are open, and by extension any series containing patches
> +        # associated with these states are also open.
> +        #
> +        # Got that? Good. All we're doing here is setting the series' new
> +        # 'is_open' attribute by identifying if any of the patches associated
> +        # with each series have a state with 'action_required=True'. We do this
> +        # aggregation using 'MAX', since booleans are stored as integers in
> +        # most DB backends (1/True > 0/False) and ANSI SQL states True > False
> +
> +        # TODO: Which of these offers the best performance? The first is MySQL
> +        # only fwict
> +        # migrations.RunSQL(
> +        #     """
> +        #     UPDATE patchwork_series
> +        #     JOIN (
> +        #         SELECT patch.series_id AS series_id, MAX(state.action_required) AS is_open  # noqa: E501
> +        #         FROM patchwork_patch patch
> +        #         JOIN patchwork_state state ON patch.state_id = state.id
> +        #         GROUP BY series_id
> +        #     ) AS sub ON patchwork_series.id = sub.series_id
> +        #     SET patchwork_series.is_open = sub.is_open;
> +        #     """,  # noqa: E501
> +        # ),
> +        migrations.RunSQL(
> +            """
> +            UPDATE patchwork_series
> +            SET is_open = (
> +                SELECT MAX(patchwork_state.action_required)
> +                FROM patchwork_patch
> +                JOIN patchwork_state ON state_id = patchwork_state.id
> +                WHERE patchwork_patch.series_id = patchwork_series.id
> +                GROUP BY patchwork_patch.series_id
> +            );
> +            """,
> +        ),
> +    ]
> diff --git patchwork/models.py patchwork/models.py
> index 58e4c51e..7441510b 100644
> --- patchwork/models.py
> +++ patchwork/models.py
> @@ -14,6 +14,7 @@ from django.conf import settings
>  from django.contrib.auth.models import User
>  from django.core.exceptions import ValidationError
>  from django.db import models
> +from django.db import transaction
>  from django.urls import reverse
>  from django.utils.functional import cached_property
>  from django.core.validators import validate_unicode_slug
> @@ -494,6 +495,19 @@ class Patch(SubmissionMixin):
>          for tag in tags:
>              self._set_tag(tag, counter[tag])
>  
> +    def refresh_series_state(self):
> +        if not self.series:
> +            return
> +
> +        # If any of the patches in the series is in an "action required" state,
> +        # then the series is still open
> +        # TODO(stephenfin): Can this still race?
> +        with transaction.atomic():
> +            self.series.is_open = self.series.patches.filter(
> +                state__action_required=True
> +            ).exists()
> +            self.series.save()
> +
>      def save(self, *args, **kwargs):
>          if not hasattr(self, 'state') or not self.state:
>              self.state = get_default_initial_patch_state()
> @@ -503,6 +517,7 @@ class Patch(SubmissionMixin):
>  
>          super(Patch, self).save(**kwargs)
>  
> +        self.refresh_series_state()
>          self.refresh_tag_counts()
>  
>      def is_editable(self, user):
> @@ -772,6 +787,11 @@ class Series(FilenameMixin, models.Model):
>                                    'by the subject prefix(es)')
>      total = models.IntegerField(help_text='Number of patches in series as '
>                                  'indicated by the subject prefix(es)')
> +    is_open = models.BooleanField(
> +        default=True,
> +        null=True,
> +        help_text='Are all patches in this series resolved?',
> +    )
>  
>      @staticmethod
>      def _format_name(obj):
> diff --git patchwork/tests/api/test_series.py patchwork/tests/api/test_series.py
> index ef661773..830b2fdb 100644
> --- patchwork/tests/api/test_series.py
> +++ patchwork/tests/api/test_series.py
> @@ -16,6 +16,7 @@ from patchwork.tests.utils import create_patch
>  from patchwork.tests.utils import create_person
>  from patchwork.tests.utils import create_project
>  from patchwork.tests.utils import create_series
> +from patchwork.tests.utils import create_state
>  from patchwork.tests.utils import create_user
>  
>  if settings.ENABLE_REST_API:
> @@ -122,6 +123,57 @@ class TestSeriesAPI(utils.APITestCase):
>              'submitter': 'test@example.org'})
>          self.assertEqual(0, len(resp.data))
>  
> +    def test_list_filter_is_open(self):
> +        """Filter series by status."""
> +        project_obj = create_project(linkname='myproject')
> +        person_obj = create_person(email='test@example.com')
> +
> +        # create two series with a single patch in each with different states
> +        # for the two patches
> +
> +        state_a = create_state(name='New', slug='new', action_required=True)
> +        series_a = create_series(project=project_obj, submitter=person_obj)
> +        create_cover(series=series_a)
> +        create_patch(series=series_a, state=state_a)
> +        self.assertTrue(series_a.is_open)
> +
> +        state_b = create_state(
> +            name='Accepted', slug='Accepted', action_required=False,
> +        )
> +        series_b = create_series(project=project_obj, submitter=person_obj)
> +        create_cover(series=series_b)
> +        create_patch(series=series_b, state=state_b)
> +        self.assertFalse(series_b.is_open)
> +
> +        resp = self.client.get(self.api_url(), {'is_open': 'True'})
> +        self.assertEqual([series_a.id], [x['id'] for x in resp.data])
> +
> +        resp = self.client.get(self.api_url(), {'is_open': 'False'})
> +        self.assertEqual([series_b.id], [x['id'] for x in resp.data])
> +
> +        resp = self.client.get(self.api_url())
> +        self.assertEqual(
> +            [series_a.id, series_b.id], [x['id'] for x in resp.data]
> +        )
> +
> +    @utils.store_samples('series-list-1-2')
> +    def test_list_version_1_2(self):
> +        """List patches using API v1.2.
> +
> +        Validate that newer fields are dropped for older API versions.
> +        """
> +        self._create_series()
> +
> +        resp = self.client.get(self.api_url(version='1.2'))
> +        self.assertEqual(status.HTTP_200_OK, resp.status_code)
> +        self.assertEqual(1, len(resp.data))
> +        self.assertIn('url', resp.data[0])
> +        self.assertIn('web_url', resp.data[0])
> +        self.assertIn('web_url', resp.data[0]['cover_letter'])
> +        self.assertIn('mbox', resp.data[0]['cover_letter'])
> +        self.assertIn('web_url', resp.data[0]['patches'][0])
> +        self.assertNotIn('is_open', resp.data[0])
> +
>      @utils.store_samples('series-list-1-0')
>      def test_list_version_1_0(self):
>          """List patches using API v1.0.
> diff --git patchwork/tests/test_models.py patchwork/tests/test_models.py
> new file mode 100644
> index 00000000..13db3330
> --- /dev/null
> +++ patchwork/tests/test_models.py
> @@ -0,0 +1,72 @@
> +# Patchwork - automated patch tracking system
> +# Copyright (C) 2021 Stephen Finucane <stephen@that.guru>
> +#
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +
> +from django.test import TestCase
> +
> +from patchwork.tests.utils import create_patch
> +from patchwork.tests.utils import create_series
> +from patchwork.tests.utils import create_state
> +
> +
> +class TestPatch(TestCase):
> +
> +    def test_save_updates_series_state(self):
> +        state_new = create_state(
> +            name='New',
> +            slug='new',
> +            action_required=True,
> +        )
> +        state_under_review = create_state(
> +            name='Under Review',
> +            slug='under-review',
> +            action_required=True,
> +        )
> +        state_accepted = create_state(
> +            name='Accepted',
> +            slug='accepted',
> +            action_required=False,
> +        )
> +        state_rejected = create_state(
> +            name='Rejected',
> +            slug='rejected',
> +            action_required=False,
> +        )
> +        series = create_series()
> +        patches = []
> +
> +        # Create four patches - one in each state
> +        for state in (
> +            state_new, state_under_review, state_accepted, state_rejected,
> +        ):
> +            patches.append(create_patch(series=series, state=state))
> +
> +        # The patches will be in various states so the series should still be
> +        # open
> +        self.assertTrue(patches[0].state.action_required)
> +        self.assertTrue(patches[1].state.action_required)
> +        self.assertFalse(patches[2].state.action_required)
> +        self.assertFalse(patches[3].state.action_required)
> +        self.assertTrue(series.is_open)
> +
> +        # Now change the first patch to 'accepted'. The series shouldn't change
> +        # state because the second one is still 'under-review'
> +        patches[0].state = state_accepted
> +        patches[0].save()
> +        self.assertFalse(patches[0].state.action_required)
> +        self.assertTrue(patches[1].state.action_required)
> +        self.assertFalse(patches[2].state.action_required)
> +        self.assertFalse(patches[3].state.action_required)
> +        self.assertTrue(series.is_open)
> +
> +        # Now change the second patch to 'rejected'. The series should finally
> +        # change state because all patches now have a state with
> +        # action_required=False
> +        patches[1].state = state_rejected
> +        patches[1].save()
> +        self.assertFalse(patches[0].state.action_required)
> +        self.assertFalse(patches[1].state.action_required)
> +        self.assertFalse(patches[2].state.action_required)
> +        self.assertFalse(patches[3].state.action_required)
> +        self.assertFalse(series.is_open)
Daniel Axtens Sept. 5, 2021, 1:34 a.m. UTC | #2
Stephen Finucane <stephen@that.guru> writes:

> Start making series more useful by tracking whether the state is open,
> meaning one or more patches are in a state with action_required=True, or
> not. This means that the 'git-pw series list' command will only list
> series that are actually actionable.
>
> Signed-off-by: Stephen Finucane <stephen@that.guru>
> ---
> Open questions:
> - Is there a more efficient way to do the migration than I've done?
> - Should we expose an 'is_open' boolean attribute of a 'state' (open,
>   closed) attribute for the series views?
> - Is there a race condition possible in how I've implemented the series
>   state checks?

AIUI, series.is_open is a derived property (or I suppose in db terms a
'view') of series.patches.

Is the cost of computing it 'live' so bad that we need to cache it in
the db? (a 'materialised view'[1])

Is there any chance we could just always compute on demand? I'm trying
to think of all the ways we expose series:

 - on the patch list page, we already also fetch all the patches

 - on the API /series/ page, we also fetch all the patches even for just
   the list view.

so it seems to me we probably shouldn't suffer a performance hit for
doing it live.


[1] https://www.datacamp.com/community/tutorials/materialized-views-postgresql
and noting that materialised views are not supported natively in mysql -
but you've basically done https://linuxhint.com/materialized-views-mysql
with the refresh logic in the application layer rather than a stored
procedure.

> diff --git patchwork/models.py patchwork/models.py
> index 58e4c51e..7441510b 100644
> --- patchwork/models.py
> +++ patchwork/models.py
> @@ -14,6 +14,7 @@ from django.conf import settings
>  from django.contrib.auth.models import User
>  from django.core.exceptions import ValidationError
>  from django.db import models
> +from django.db import transaction
>  from django.urls import reverse
>  from django.utils.functional import cached_property
>  from django.core.validators import validate_unicode_slug
> @@ -494,6 +495,19 @@ class Patch(SubmissionMixin):
>          for tag in tags:
>              self._set_tag(tag, counter[tag])
>  
> +    def refresh_series_state(self):
> +        if not self.series:
> +            return
> +
> +        # If any of the patches in the series is in an "action required" state,
> +        # then the series is still open
> +        # TODO(stephenfin): Can this still race?
> +        with transaction.atomic():
> +            self.series.is_open = self.series.patches.filter(
> +                state__action_required=True
> +            ).exists()
> +            self.series.save()
> +

AIUI based some experimentation and my experience dealing with our
series race conditions in 2018, this can race and transactions won't
save you here.  The transaction will roll back all the changes if any
statement fails (e.g. if you had an integrityerror from a foreign key),
and other users won't see intermediate writes from your
transaction. What it won't do is preserve the data dependency you're
creating in the way a traditional mutex would.

Basically, consider a concurrent parsemail and an API client changing a
patch state and refreshing series state on the same series:


 | parsemail                    | API client                    |
 ----------------------------------------------------------------
 |                              | series.patch[2].save()        | 
 |                              | BEGIN TRANSACTION             |
 |                              | is_open = SELECT(...) = False |
 | series.patch[3].save()       |                               |
 | BEGIN TRANSACTION            |                               |
 | is_open = SELECT(...) = True |                               |
 | UPDATE series...             |                               |
 | COMMIT TRANSACTION           |                               |
 |                              | UPDATE series...              |
 |                              | COMMIT TRANSACTION            |

both transactions can complete successfully and the end result is
wrong. (You can play with this sort of thing with two concurrent
dbshells both updating a commit_ref in a transaction, which I found very
helpful in getting the details right.)

Not materialising the view avoids persisting a result where we've lost
the race.

Kind regards,
Daniel


>      def save(self, *args, **kwargs):
>          if not hasattr(self, 'state') or not self.state:
>              self.state = get_default_initial_patch_state()
> @@ -503,6 +517,7 @@ class Patch(SubmissionMixin):
>  
>          super(Patch, self).save(**kwargs)
>  
> +        self.refresh_series_state()
>          self.refresh_tag_counts()
>  
>      def is_editable(self, user):
> @@ -772,6 +787,11 @@ class Series(FilenameMixin, models.Model):
>                                    'by the subject prefix(es)')
>      total = models.IntegerField(help_text='Number of patches in series as '
>                                  'indicated by the subject prefix(es)')
> +    is_open = models.BooleanField(
> +        default=True,
> +        null=True,
> +        help_text='Are all patches in this series resolved?',
> +    )
>  
>      @staticmethod
>      def _format_name(obj):
> diff --git patchwork/tests/api/test_series.py patchwork/tests/api/test_series.py
> index ef661773..830b2fdb 100644
> --- patchwork/tests/api/test_series.py
> +++ patchwork/tests/api/test_series.py
> @@ -16,6 +16,7 @@ from patchwork.tests.utils import create_patch
>  from patchwork.tests.utils import create_person
>  from patchwork.tests.utils import create_project
>  from patchwork.tests.utils import create_series
> +from patchwork.tests.utils import create_state
>  from patchwork.tests.utils import create_user
>  
>  if settings.ENABLE_REST_API:
> @@ -122,6 +123,57 @@ class TestSeriesAPI(utils.APITestCase):
>              'submitter': 'test@example.org'})
>          self.assertEqual(0, len(resp.data))
>  
> +    def test_list_filter_is_open(self):
> +        """Filter series by status."""
> +        project_obj = create_project(linkname='myproject')
> +        person_obj = create_person(email='test@example.com')
> +
> +        # create two series with a single patch in each with different states
> +        # for the two patches
> +
> +        state_a = create_state(name='New', slug='new', action_required=True)
> +        series_a = create_series(project=project_obj, submitter=person_obj)
> +        create_cover(series=series_a)
> +        create_patch(series=series_a, state=state_a)
> +        self.assertTrue(series_a.is_open)
> +
> +        state_b = create_state(
> +            name='Accepted', slug='Accepted', action_required=False,
> +        )
> +        series_b = create_series(project=project_obj, submitter=person_obj)
> +        create_cover(series=series_b)
> +        create_patch(series=series_b, state=state_b)
> +        self.assertFalse(series_b.is_open)
> +
> +        resp = self.client.get(self.api_url(), {'is_open': 'True'})
> +        self.assertEqual([series_a.id], [x['id'] for x in resp.data])
> +
> +        resp = self.client.get(self.api_url(), {'is_open': 'False'})
> +        self.assertEqual([series_b.id], [x['id'] for x in resp.data])
> +
> +        resp = self.client.get(self.api_url())
> +        self.assertEqual(
> +            [series_a.id, series_b.id], [x['id'] for x in resp.data]
> +        )
> +
> +    @utils.store_samples('series-list-1-2')
> +    def test_list_version_1_2(self):
> +        """List patches using API v1.2.
> +
> +        Validate that newer fields are dropped for older API versions.
> +        """
> +        self._create_series()
> +
> +        resp = self.client.get(self.api_url(version='1.2'))
> +        self.assertEqual(status.HTTP_200_OK, resp.status_code)
> +        self.assertEqual(1, len(resp.data))
> +        self.assertIn('url', resp.data[0])
> +        self.assertIn('web_url', resp.data[0])
> +        self.assertIn('web_url', resp.data[0]['cover_letter'])
> +        self.assertIn('mbox', resp.data[0]['cover_letter'])
> +        self.assertIn('web_url', resp.data[0]['patches'][0])
> +        self.assertNotIn('is_open', resp.data[0])
> +
>      @utils.store_samples('series-list-1-0')
>      def test_list_version_1_0(self):
>          """List patches using API v1.0.
> diff --git patchwork/tests/test_models.py patchwork/tests/test_models.py
> new file mode 100644
> index 00000000..13db3330
> --- /dev/null
> +++ patchwork/tests/test_models.py
> @@ -0,0 +1,72 @@
> +# Patchwork - automated patch tracking system
> +# Copyright (C) 2021 Stephen Finucane <stephen@that.guru>
> +#
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +
> +from django.test import TestCase
> +
> +from patchwork.tests.utils import create_patch
> +from patchwork.tests.utils import create_series
> +from patchwork.tests.utils import create_state
> +
> +
> +class TestPatch(TestCase):
> +
> +    def test_save_updates_series_state(self):
> +        state_new = create_state(
> +            name='New',
> +            slug='new',
> +            action_required=True,
> +        )
> +        state_under_review = create_state(
> +            name='Under Review',
> +            slug='under-review',
> +            action_required=True,
> +        )
> +        state_accepted = create_state(
> +            name='Accepted',
> +            slug='accepted',
> +            action_required=False,
> +        )
> +        state_rejected = create_state(
> +            name='Rejected',
> +            slug='rejected',
> +            action_required=False,
> +        )
> +        series = create_series()
> +        patches = []
> +
> +        # Create four patches - one in each state
> +        for state in (
> +            state_new, state_under_review, state_accepted, state_rejected,
> +        ):
> +            patches.append(create_patch(series=series, state=state))
> +
> +        # The patches will be in various states so the series should still be
> +        # open
> +        self.assertTrue(patches[0].state.action_required)
> +        self.assertTrue(patches[1].state.action_required)
> +        self.assertFalse(patches[2].state.action_required)
> +        self.assertFalse(patches[3].state.action_required)
> +        self.assertTrue(series.is_open)
> +
> +        # Now change the first patch to 'accepted'. The series shouldn't change
> +        # state because the second one is still 'under-review'
> +        patches[0].state = state_accepted
> +        patches[0].save()
> +        self.assertFalse(patches[0].state.action_required)
> +        self.assertTrue(patches[1].state.action_required)
> +        self.assertFalse(patches[2].state.action_required)
> +        self.assertFalse(patches[3].state.action_required)
> +        self.assertTrue(series.is_open)
> +
> +        # Now change the second patch to 'rejected'. The series should finally
> +        # change state because all patches now have a state with
> +        # action_required=False
> +        patches[1].state = state_rejected
> +        patches[1].save()
> +        self.assertFalse(patches[0].state.action_required)
> +        self.assertFalse(patches[1].state.action_required)
> +        self.assertFalse(patches[2].state.action_required)
> +        self.assertFalse(patches[3].state.action_required)
> +        self.assertFalse(series.is_open)
> -- 
> 2.31.1
Stephen Finucane Sept. 23, 2021, 11:40 a.m. UTC | #3
On Sun, 2021-09-05 at 11:34 +1000, Daniel Axtens wrote:
> Stephen Finucane <stephen@that.guru> writes:
> 
> > Start making series more useful by tracking whether the state is open,
> > meaning one or more patches are in a state with action_required=True, or
> > not. This means that the 'git-pw series list' command will only list
> > series that are actually actionable.
> > 
> > Signed-off-by: Stephen Finucane <stephen@that.guru>
> > ---
> > Open questions:
> > - Is there a more efficient way to do the migration than I've done?
> > - Should we expose an 'is_open' boolean attribute of a 'state' (open,
> >   closed) attribute for the series views?
> > - Is there a race condition possible in how I've implemented the series
> >   state checks?
> 
> AIUI, series.is_open is a derived property (or I suppose in db terms a
> 'view') of series.patches.
> 
> Is the cost of computing it 'live' so bad that we need to cache it in
> the db? (a 'materialised view'[1])
> 
> Is there any chance we could just always compute on demand? I'm trying
> to think of all the ways we expose series:
> 
>  - on the patch list page, we already also fetch all the patches
> 
>  - on the API /series/ page, we also fetch all the patches even for just
>    the list view.
> 
> so it seems to me we probably shouldn't suffer a performance hit for
> doing it live.

The biggest impact I see is filtering. This is the main thing I want because
presently 'git pw series list' is pretty much useless without this. I don't
doing this dynamically would work because attempting to fetch, for example, the
first 100 "closed" series will require loading every series and every patch. You
couldn't just load the first 100 since they could be entirely/mostly "open",
meaning you'd need to load another 100, and another, and another, until you get
100 closed. This will need to be stored _somewhere_ fwict.

> [1] https://www.datacamp.com/community/tutorials/materialized-views-postgresql
> and noting that materialised views are not supported natively in mysql -
> but you've basically done https://linuxhint.com/materialized-views-mysql
> with the refresh logic in the application layer rather than a stored
> procedure.
> 
> > diff --git patchwork/models.py patchwork/models.py
> > index 58e4c51e..7441510b 100644
> > --- patchwork/models.py
> > +++ patchwork/models.py
> > @@ -14,6 +14,7 @@ from django.conf import settings
> >  from django.contrib.auth.models import User
> >  from django.core.exceptions import ValidationError
> >  from django.db import models
> > +from django.db import transaction
> >  from django.urls import reverse
> >  from django.utils.functional import cached_property
> >  from django.core.validators import validate_unicode_slug
> > @@ -494,6 +495,19 @@ class Patch(SubmissionMixin):
> >          for tag in tags:
> >              self._set_tag(tag, counter[tag])
> >  
> > +    def refresh_series_state(self):
> > +        if not self.series:
> > +            return
> > +
> > +        # If any of the patches in the series is in an "action required" state,
> > +        # then the series is still open
> > +        # TODO(stephenfin): Can this still race?
> > +        with transaction.atomic():
> > +            self.series.is_open = self.series.patches.filter(
> > +                state__action_required=True
> > +            ).exists()
> > +            self.series.save()
> > +
> 
> AIUI based some experimentation and my experience dealing with our
> series race conditions in 2018, this can race and transactions won't
> save you here.  The transaction will roll back all the changes if any
> statement fails (e.g. if you had an integrityerror from a foreign key),
> and other users won't see intermediate writes from your
> transaction. What it won't do is preserve the data dependency you're
> creating in the way a traditional mutex would.
> 
> Basically, consider a concurrent parsemail and an API client changing a
> patch state and refreshing series state on the same series:
> 
> 
>  | parsemail                    | API client                    |
>  ----------------------------------------------------------------
>  |                              | series.patch[2].save()        | 
>  |                              | BEGIN TRANSACTION             |
>  |                              | is_open = SELECT(...) = False |
>  | series.patch[3].save()       |                               |
>  | BEGIN TRANSACTION            |                               |
>  | is_open = SELECT(...) = True |                               |
>  | UPDATE series...             |                               |
>  | COMMIT TRANSACTION           |                               |
>  |                              | UPDATE series...              |
>  |                              | COMMIT TRANSACTION            |
> 
> both transactions can complete successfully and the end result is
> wrong. (You can play with this sort of thing with two concurrent
> dbshells both updating a commit_ref in a transaction, which I found very
> helpful in getting the details right.)
> 
> Not materialising the view avoids persisting a result where we've lost
> the race.

Ack, yeah, this makes sense. I need to investigate these materialized views
things, but assuming they're not an option with MySQL and SQLite then I guess
I'll need to find a way to lock the whole thing in a mutex.

Stephen

PS: Sorry for the delay. Holidays.

> Kind regards,
> Daniel
> 
> 
> >      def save(self, *args, **kwargs):
> >          if not hasattr(self, 'state') or not self.state:
> >              self.state = get_default_initial_patch_state()
> > @@ -503,6 +517,7 @@ class Patch(SubmissionMixin):
> >  
> >          super(Patch, self).save(**kwargs)
> >  
> > +        self.refresh_series_state()
> >          self.refresh_tag_counts()
> >  
> >      def is_editable(self, user):
> > @@ -772,6 +787,11 @@ class Series(FilenameMixin, models.Model):
> >                                    'by the subject prefix(es)')
> >      total = models.IntegerField(help_text='Number of patches in series as '
> >                                  'indicated by the subject prefix(es)')
> > +    is_open = models.BooleanField(
> > +        default=True,
> > +        null=True,
> > +        help_text='Are all patches in this series resolved?',
> > +    )
> >  
> >      @staticmethod
> >      def _format_name(obj):
> > diff --git patchwork/tests/api/test_series.py patchwork/tests/api/test_series.py
> > index ef661773..830b2fdb 100644
> > --- patchwork/tests/api/test_series.py
> > +++ patchwork/tests/api/test_series.py
> > @@ -16,6 +16,7 @@ from patchwork.tests.utils import create_patch
> >  from patchwork.tests.utils import create_person
> >  from patchwork.tests.utils import create_project
> >  from patchwork.tests.utils import create_series
> > +from patchwork.tests.utils import create_state
> >  from patchwork.tests.utils import create_user
> >  
> >  if settings.ENABLE_REST_API:
> > @@ -122,6 +123,57 @@ class TestSeriesAPI(utils.APITestCase):
> >              'submitter': 'test@example.org'})
> >          self.assertEqual(0, len(resp.data))
> >  
> > +    def test_list_filter_is_open(self):
> > +        """Filter series by status."""
> > +        project_obj = create_project(linkname='myproject')
> > +        person_obj = create_person(email='test@example.com')
> > +
> > +        # create two series with a single patch in each with different states
> > +        # for the two patches
> > +
> > +        state_a = create_state(name='New', slug='new', action_required=True)
> > +        series_a = create_series(project=project_obj, submitter=person_obj)
> > +        create_cover(series=series_a)
> > +        create_patch(series=series_a, state=state_a)
> > +        self.assertTrue(series_a.is_open)
> > +
> > +        state_b = create_state(
> > +            name='Accepted', slug='Accepted', action_required=False,
> > +        )
> > +        series_b = create_series(project=project_obj, submitter=person_obj)
> > +        create_cover(series=series_b)
> > +        create_patch(series=series_b, state=state_b)
> > +        self.assertFalse(series_b.is_open)
> > +
> > +        resp = self.client.get(self.api_url(), {'is_open': 'True'})
> > +        self.assertEqual([series_a.id], [x['id'] for x in resp.data])
> > +
> > +        resp = self.client.get(self.api_url(), {'is_open': 'False'})
> > +        self.assertEqual([series_b.id], [x['id'] for x in resp.data])
> > +
> > +        resp = self.client.get(self.api_url())
> > +        self.assertEqual(
> > +            [series_a.id, series_b.id], [x['id'] for x in resp.data]
> > +        )
> > +
> > +    @utils.store_samples('series-list-1-2')
> > +    def test_list_version_1_2(self):
> > +        """List patches using API v1.2.
> > +
> > +        Validate that newer fields are dropped for older API versions.
> > +        """
> > +        self._create_series()
> > +
> > +        resp = self.client.get(self.api_url(version='1.2'))
> > +        self.assertEqual(status.HTTP_200_OK, resp.status_code)
> > +        self.assertEqual(1, len(resp.data))
> > +        self.assertIn('url', resp.data[0])
> > +        self.assertIn('web_url', resp.data[0])
> > +        self.assertIn('web_url', resp.data[0]['cover_letter'])
> > +        self.assertIn('mbox', resp.data[0]['cover_letter'])
> > +        self.assertIn('web_url', resp.data[0]['patches'][0])
> > +        self.assertNotIn('is_open', resp.data[0])
> > +
> >      @utils.store_samples('series-list-1-0')
> >      def test_list_version_1_0(self):
> >          """List patches using API v1.0.
> > diff --git patchwork/tests/test_models.py patchwork/tests/test_models.py
> > new file mode 100644
> > index 00000000..13db3330
> > --- /dev/null
> > +++ patchwork/tests/test_models.py
> > @@ -0,0 +1,72 @@
> > +# Patchwork - automated patch tracking system
> > +# Copyright (C) 2021 Stephen Finucane <stephen@that.guru>
> > +#
> > +# SPDX-License-Identifier: GPL-2.0-or-later
> > +
> > +from django.test import TestCase
> > +
> > +from patchwork.tests.utils import create_patch
> > +from patchwork.tests.utils import create_series
> > +from patchwork.tests.utils import create_state
> > +
> > +
> > +class TestPatch(TestCase):
> > +
> > +    def test_save_updates_series_state(self):
> > +        state_new = create_state(
> > +            name='New',
> > +            slug='new',
> > +            action_required=True,
> > +        )
> > +        state_under_review = create_state(
> > +            name='Under Review',
> > +            slug='under-review',
> > +            action_required=True,
> > +        )
> > +        state_accepted = create_state(
> > +            name='Accepted',
> > +            slug='accepted',
> > +            action_required=False,
> > +        )
> > +        state_rejected = create_state(
> > +            name='Rejected',
> > +            slug='rejected',
> > +            action_required=False,
> > +        )
> > +        series = create_series()
> > +        patches = []
> > +
> > +        # Create four patches - one in each state
> > +        for state in (
> > +            state_new, state_under_review, state_accepted, state_rejected,
> > +        ):
> > +            patches.append(create_patch(series=series, state=state))
> > +
> > +        # The patches will be in various states so the series should still be
> > +        # open
> > +        self.assertTrue(patches[0].state.action_required)
> > +        self.assertTrue(patches[1].state.action_required)
> > +        self.assertFalse(patches[2].state.action_required)
> > +        self.assertFalse(patches[3].state.action_required)
> > +        self.assertTrue(series.is_open)
> > +
> > +        # Now change the first patch to 'accepted'. The series shouldn't change
> > +        # state because the second one is still 'under-review'
> > +        patches[0].state = state_accepted
> > +        patches[0].save()
> > +        self.assertFalse(patches[0].state.action_required)
> > +        self.assertTrue(patches[1].state.action_required)
> > +        self.assertFalse(patches[2].state.action_required)
> > +        self.assertFalse(patches[3].state.action_required)
> > +        self.assertTrue(series.is_open)
> > +
> > +        # Now change the second patch to 'rejected'. The series should finally
> > +        # change state because all patches now have a state with
> > +        # action_required=False
> > +        patches[1].state = state_rejected
> > +        patches[1].save()
> > +        self.assertFalse(patches[0].state.action_required)
> > +        self.assertFalse(patches[1].state.action_required)
> > +        self.assertFalse(patches[2].state.action_required)
> > +        self.assertFalse(patches[3].state.action_required)
> > +        self.assertFalse(series.is_open)
> > -- 
> > 2.31.1
Stephen Finucane Sept. 23, 2021, 11:43 a.m. UTC | #4
On Thu, 2021-09-23 at 12:40 +0100, Stephen Finucane wrote:
> On Sun, 2021-09-05 at 11:34 +1000, Daniel Axtens wrote:
> > Stephen Finucane <stephen@that.guru> writes:
> > 
> > > Start making series more useful by tracking whether the state is open,
> > > meaning one or more patches are in a state with action_required=True, or
> > > not. This means that the 'git-pw series list' command will only list
> > > series that are actually actionable.
> > > 
> > > Signed-off-by: Stephen Finucane <stephen@that.guru>
> > > ---
> > > Open questions:
> > > - Is there a more efficient way to do the migration than I've done?
> > > - Should we expose an 'is_open' boolean attribute of a 'state' (open,
> > >   closed) attribute for the series views?
> > > - Is there a race condition possible in how I've implemented the series
> > >   state checks?
> > 
> > AIUI, series.is_open is a derived property (or I suppose in db terms a
> > 'view') of series.patches.
> > 
> > Is the cost of computing it 'live' so bad that we need to cache it in
> > the db? (a 'materialised view'[1])
> > 
> > Is there any chance we could just always compute on demand? I'm trying
> > to think of all the ways we expose series:
> > 
> >  - on the patch list page, we already also fetch all the patches
> > 
> >  - on the API /series/ page, we also fetch all the patches even for just
> >    the list view.
> > 
> > so it seems to me we probably shouldn't suffer a performance hit for
> > doing it live.
> 
> The biggest impact I see is filtering. This is the main thing I want because
> presently 'git pw series list' is pretty much useless without this. I don't
> doing this dynamically would work because attempting to fetch, for example, the
> first 100 "closed" series will require loading every series and every patch. You
> couldn't just load the first 100 since they could be entirely/mostly "open",
> meaning you'd need to load another 100, and another, and another, until you get
> 100 closed. This will need to be stored _somewhere_ fwict.
> 
> > [1] https://www.datacamp.com/community/tutorials/materialized-views-postgresql
> > and noting that materialised views are not supported natively in mysql -
> > but you've basically done https://linuxhint.com/materialized-views-mysql
> > with the refresh logic in the application layer rather than a stored
> > procedure.
> > 
> > > diff --git patchwork/models.py patchwork/models.py
> > > index 58e4c51e..7441510b 100644
> > > --- patchwork/models.py
> > > +++ patchwork/models.py
> > > @@ -14,6 +14,7 @@ from django.conf import settings
> > >  from django.contrib.auth.models import User
> > >  from django.core.exceptions import ValidationError
> > >  from django.db import models
> > > +from django.db import transaction
> > >  from django.urls import reverse
> > >  from django.utils.functional import cached_property
> > >  from django.core.validators import validate_unicode_slug
> > > @@ -494,6 +495,19 @@ class Patch(SubmissionMixin):
> > >          for tag in tags:
> > >              self._set_tag(tag, counter[tag])
> > >  
> > > +    def refresh_series_state(self):
> > > +        if not self.series:
> > > +            return
> > > +
> > > +        # If any of the patches in the series is in an "action required" state,
> > > +        # then the series is still open
> > > +        # TODO(stephenfin): Can this still race?
> > > +        with transaction.atomic():
> > > +            self.series.is_open = self.series.patches.filter(
> > > +                state__action_required=True
> > > +            ).exists()
> > > +            self.series.save()
> > > +
> > 
> > AIUI based some experimentation and my experience dealing with our
> > series race conditions in 2018, this can race and transactions won't
> > save you here.  The transaction will roll back all the changes if any
> > statement fails (e.g. if you had an integrityerror from a foreign key),
> > and other users won't see intermediate writes from your
> > transaction. What it won't do is preserve the data dependency you're
> > creating in the way a traditional mutex would.
> > 
> > Basically, consider a concurrent parsemail and an API client changing a
> > patch state and refreshing series state on the same series:
> > 
> > 
> >  | parsemail                    | API client                    |
> >  ----------------------------------------------------------------
> >  |                              | series.patch[2].save()        | 
> >  |                              | BEGIN TRANSACTION             |
> >  |                              | is_open = SELECT(...) = False |
> >  | series.patch[3].save()       |                               |
> >  | BEGIN TRANSACTION            |                               |
> >  | is_open = SELECT(...) = True |                               |
> >  | UPDATE series...             |                               |
> >  | COMMIT TRANSACTION           |                               |
> >  |                              | UPDATE series...              |
> >  |                              | COMMIT TRANSACTION            |
> > 
> > both transactions can complete successfully and the end result is
> > wrong. (You can play with this sort of thing with two concurrent
> > dbshells both updating a commit_ref in a transaction, which I found very
> > helpful in getting the details right.)
> > 
> > Not materialising the view avoids persisting a result where we've lost
> > the race.
> 
> Ack, yeah, this makes sense. I need to investigate these materialized views
> things, but assuming they're not an option with MySQL and SQLite then I guess
> I'll need to find a way to lock the whole thing in a mutex.

It looks like 'select_for_update()' [1] will potentially do the trick? So long
as I grab this lock before I create or update a patch, I should be okay? Will
experiment.

Stephen

[1] https://docs.djangoproject.com/en/2.2/ref/models/querysets/#select-for-update

> 
> Stephen
> 
> PS: Sorry for the delay. Holidays.
> 
> > Kind regards,
> > Daniel
> > 
> > 
> > >      def save(self, *args, **kwargs):
> > >          if not hasattr(self, 'state') or not self.state:
> > >              self.state = get_default_initial_patch_state()
> > > @@ -503,6 +517,7 @@ class Patch(SubmissionMixin):
> > >  
> > >          super(Patch, self).save(**kwargs)
> > >  
> > > +        self.refresh_series_state()
> > >          self.refresh_tag_counts()
> > >  
> > >      def is_editable(self, user):
> > > @@ -772,6 +787,11 @@ class Series(FilenameMixin, models.Model):
> > >                                    'by the subject prefix(es)')
> > >      total = models.IntegerField(help_text='Number of patches in series as '
> > >                                  'indicated by the subject prefix(es)')
> > > +    is_open = models.BooleanField(
> > > +        default=True,
> > > +        null=True,
> > > +        help_text='Are all patches in this series resolved?',
> > > +    )
> > >  
> > >      @staticmethod
> > >      def _format_name(obj):
> > > diff --git patchwork/tests/api/test_series.py patchwork/tests/api/test_series.py
> > > index ef661773..830b2fdb 100644
> > > --- patchwork/tests/api/test_series.py
> > > +++ patchwork/tests/api/test_series.py
> > > @@ -16,6 +16,7 @@ from patchwork.tests.utils import create_patch
> > >  from patchwork.tests.utils import create_person
> > >  from patchwork.tests.utils import create_project
> > >  from patchwork.tests.utils import create_series
> > > +from patchwork.tests.utils import create_state
> > >  from patchwork.tests.utils import create_user
> > >  
> > >  if settings.ENABLE_REST_API:
> > > @@ -122,6 +123,57 @@ class TestSeriesAPI(utils.APITestCase):
> > >              'submitter': 'test@example.org'})
> > >          self.assertEqual(0, len(resp.data))
> > >  
> > > +    def test_list_filter_is_open(self):
> > > +        """Filter series by status."""
> > > +        project_obj = create_project(linkname='myproject')
> > > +        person_obj = create_person(email='test@example.com')
> > > +
> > > +        # create two series with a single patch in each with different states
> > > +        # for the two patches
> > > +
> > > +        state_a = create_state(name='New', slug='new', action_required=True)
> > > +        series_a = create_series(project=project_obj, submitter=person_obj)
> > > +        create_cover(series=series_a)
> > > +        create_patch(series=series_a, state=state_a)
> > > +        self.assertTrue(series_a.is_open)
> > > +
> > > +        state_b = create_state(
> > > +            name='Accepted', slug='Accepted', action_required=False,
> > > +        )
> > > +        series_b = create_series(project=project_obj, submitter=person_obj)
> > > +        create_cover(series=series_b)
> > > +        create_patch(series=series_b, state=state_b)
> > > +        self.assertFalse(series_b.is_open)
> > > +
> > > +        resp = self.client.get(self.api_url(), {'is_open': 'True'})
> > > +        self.assertEqual([series_a.id], [x['id'] for x in resp.data])
> > > +
> > > +        resp = self.client.get(self.api_url(), {'is_open': 'False'})
> > > +        self.assertEqual([series_b.id], [x['id'] for x in resp.data])
> > > +
> > > +        resp = self.client.get(self.api_url())
> > > +        self.assertEqual(
> > > +            [series_a.id, series_b.id], [x['id'] for x in resp.data]
> > > +        )
> > > +
> > > +    @utils.store_samples('series-list-1-2')
> > > +    def test_list_version_1_2(self):
> > > +        """List patches using API v1.2.
> > > +
> > > +        Validate that newer fields are dropped for older API versions.
> > > +        """
> > > +        self._create_series()
> > > +
> > > +        resp = self.client.get(self.api_url(version='1.2'))
> > > +        self.assertEqual(status.HTTP_200_OK, resp.status_code)
> > > +        self.assertEqual(1, len(resp.data))
> > > +        self.assertIn('url', resp.data[0])
> > > +        self.assertIn('web_url', resp.data[0])
> > > +        self.assertIn('web_url', resp.data[0]['cover_letter'])
> > > +        self.assertIn('mbox', resp.data[0]['cover_letter'])
> > > +        self.assertIn('web_url', resp.data[0]['patches'][0])
> > > +        self.assertNotIn('is_open', resp.data[0])
> > > +
> > >      @utils.store_samples('series-list-1-0')
> > >      def test_list_version_1_0(self):
> > >          """List patches using API v1.0.
> > > diff --git patchwork/tests/test_models.py patchwork/tests/test_models.py
> > > new file mode 100644
> > > index 00000000..13db3330
> > > --- /dev/null
> > > +++ patchwork/tests/test_models.py
> > > @@ -0,0 +1,72 @@
> > > +# Patchwork - automated patch tracking system
> > > +# Copyright (C) 2021 Stephen Finucane <stephen@that.guru>
> > > +#
> > > +# SPDX-License-Identifier: GPL-2.0-or-later
> > > +
> > > +from django.test import TestCase
> > > +
> > > +from patchwork.tests.utils import create_patch
> > > +from patchwork.tests.utils import create_series
> > > +from patchwork.tests.utils import create_state
> > > +
> > > +
> > > +class TestPatch(TestCase):
> > > +
> > > +    def test_save_updates_series_state(self):
> > > +        state_new = create_state(
> > > +            name='New',
> > > +            slug='new',
> > > +            action_required=True,
> > > +        )
> > > +        state_under_review = create_state(
> > > +            name='Under Review',
> > > +            slug='under-review',
> > > +            action_required=True,
> > > +        )
> > > +        state_accepted = create_state(
> > > +            name='Accepted',
> > > +            slug='accepted',
> > > +            action_required=False,
> > > +        )
> > > +        state_rejected = create_state(
> > > +            name='Rejected',
> > > +            slug='rejected',
> > > +            action_required=False,
> > > +        )
> > > +        series = create_series()
> > > +        patches = []
> > > +
> > > +        # Create four patches - one in each state
> > > +        for state in (
> > > +            state_new, state_under_review, state_accepted, state_rejected,
> > > +        ):
> > > +            patches.append(create_patch(series=series, state=state))
> > > +
> > > +        # The patches will be in various states so the series should still be
> > > +        # open
> > > +        self.assertTrue(patches[0].state.action_required)
> > > +        self.assertTrue(patches[1].state.action_required)
> > > +        self.assertFalse(patches[2].state.action_required)
> > > +        self.assertFalse(patches[3].state.action_required)
> > > +        self.assertTrue(series.is_open)
> > > +
> > > +        # Now change the first patch to 'accepted'. The series shouldn't change
> > > +        # state because the second one is still 'under-review'
> > > +        patches[0].state = state_accepted
> > > +        patches[0].save()
> > > +        self.assertFalse(patches[0].state.action_required)
> > > +        self.assertTrue(patches[1].state.action_required)
> > > +        self.assertFalse(patches[2].state.action_required)
> > > +        self.assertFalse(patches[3].state.action_required)
> > > +        self.assertTrue(series.is_open)
> > > +
> > > +        # Now change the second patch to 'rejected'. The series should finally
> > > +        # change state because all patches now have a state with
> > > +        # action_required=False
> > > +        patches[1].state = state_rejected
> > > +        patches[1].save()
> > > +        self.assertFalse(patches[0].state.action_required)
> > > +        self.assertFalse(patches[1].state.action_required)
> > > +        self.assertFalse(patches[2].state.action_required)
> > > +        self.assertFalse(patches[3].state.action_required)
> > > +        self.assertFalse(series.is_open)
> > > -- 
> > > 2.31.1
> 
> 
> _______________________________________________
> Patchwork mailing list
> Patchwork@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/patchwork
Stephen Finucane Sept. 23, 2021, 11:46 a.m. UTC | #5
On Thu, 2021-09-23 at 12:43 +0100, Stephen Finucane wrote:
> On Thu, 2021-09-23 at 12:40 +0100, Stephen Finucane wrote:
> > On Sun, 2021-09-05 at 11:34 +1000, Daniel Axtens wrote:
> > > Stephen Finucane <stephen@that.guru> writes:
> > > 
> > > > Start making series more useful by tracking whether the state is open,
> > > > meaning one or more patches are in a state with action_required=True, or
> > > > not. This means that the 'git-pw series list' command will only list
> > > > series that are actually actionable.
> > > > 
> > > > Signed-off-by: Stephen Finucane <stephen@that.guru>
> > > > ---
> > > > Open questions:
> > > > - Is there a more efficient way to do the migration than I've done?
> > > > - Should we expose an 'is_open' boolean attribute of a 'state' (open,
> > > >   closed) attribute for the series views?
> > > > - Is there a race condition possible in how I've implemented the series
> > > >   state checks?
> > > 
> > > AIUI, series.is_open is a derived property (or I suppose in db terms a
> > > 'view') of series.patches.
> > > 
> > > Is the cost of computing it 'live' so bad that we need to cache it in
> > > the db? (a 'materialised view'[1])
> > > 
> > > Is there any chance we could just always compute on demand? I'm trying
> > > to think of all the ways we expose series:
> > > 
> > >  - on the patch list page, we already also fetch all the patches
> > > 
> > >  - on the API /series/ page, we also fetch all the patches even for just
> > >    the list view.
> > > 
> > > so it seems to me we probably shouldn't suffer a performance hit for
> > > doing it live.
> > 
> > The biggest impact I see is filtering. This is the main thing I want because
> > presently 'git pw series list' is pretty much useless without this. I don't
> > doing this dynamically would work because attempting to fetch, for example, the
> > first 100 "closed" series will require loading every series and every patch. You
> > couldn't just load the first 100 since they could be entirely/mostly "open",
> > meaning you'd need to load another 100, and another, and another, until you get
> > 100 closed. This will need to be stored _somewhere_ fwict.
> > 
> > > [1] https://www.datacamp.com/community/tutorials/materialized-views-postgresql
> > > and noting that materialised views are not supported natively in mysql -
> > > but you've basically done https://linuxhint.com/materialized-views-mysql
> > > with the refresh logic in the application layer rather than a stored
> > > procedure.
> > > 
> > > > diff --git patchwork/models.py patchwork/models.py
> > > > index 58e4c51e..7441510b 100644
> > > > --- patchwork/models.py
> > > > +++ patchwork/models.py
> > > > @@ -14,6 +14,7 @@ from django.conf import settings
> > > >  from django.contrib.auth.models import User
> > > >  from django.core.exceptions import ValidationError
> > > >  from django.db import models
> > > > +from django.db import transaction
> > > >  from django.urls import reverse
> > > >  from django.utils.functional import cached_property
> > > >  from django.core.validators import validate_unicode_slug
> > > > @@ -494,6 +495,19 @@ class Patch(SubmissionMixin):
> > > >          for tag in tags:
> > > >              self._set_tag(tag, counter[tag])
> > > >  
> > > > +    def refresh_series_state(self):
> > > > +        if not self.series:
> > > > +            return
> > > > +
> > > > +        # If any of the patches in the series is in an "action required" state,
> > > > +        # then the series is still open
> > > > +        # TODO(stephenfin): Can this still race?
> > > > +        with transaction.atomic():
> > > > +            self.series.is_open = self.series.patches.filter(
> > > > +                state__action_required=True
> > > > +            ).exists()
> > > > +            self.series.save()
> > > > +
> > > 
> > > AIUI based some experimentation and my experience dealing with our
> > > series race conditions in 2018, this can race and transactions won't
> > > save you here.  The transaction will roll back all the changes if any
> > > statement fails (e.g. if you had an integrityerror from a foreign key),
> > > and other users won't see intermediate writes from your
> > > transaction. What it won't do is preserve the data dependency you're
> > > creating in the way a traditional mutex would.
> > > 
> > > Basically, consider a concurrent parsemail and an API client changing a
> > > patch state and refreshing series state on the same series:
> > > 
> > > 
> > >  | parsemail                    | API client                    |
> > >  ----------------------------------------------------------------
> > >  |                              | series.patch[2].save()        | 
> > >  |                              | BEGIN TRANSACTION             |
> > >  |                              | is_open = SELECT(...) = False |
> > >  | series.patch[3].save()       |                               |
> > >  | BEGIN TRANSACTION            |                               |
> > >  | is_open = SELECT(...) = True |                               |
> > >  | UPDATE series...             |                               |
> > >  | COMMIT TRANSACTION           |                               |
> > >  |                              | UPDATE series...              |
> > >  |                              | COMMIT TRANSACTION            |
> > > 
> > > both transactions can complete successfully and the end result is
> > > wrong. (You can play with this sort of thing with two concurrent
> > > dbshells both updating a commit_ref in a transaction, which I found very
> > > helpful in getting the details right.)
> > > 
> > > Not materialising the view avoids persisting a result where we've lost
> > > the race.
> > 
> > Ack, yeah, this makes sense. I need to investigate these materialized views
> > things, but assuming they're not an option with MySQL and SQLite then I guess
> > I'll need to find a way to lock the whole thing in a mutex.
> 
> It looks like 'select_for_update()' [1] will potentially do the trick? So long
> as I grab this lock before I create or update a patch, I should be okay? Will
> experiment.
> 
> Stephen
> 
> [1] https://docs.djangoproject.com/en/2.2/ref/models/querysets/#select-for-update

...and both MySQL and Postgres support generated columns, which is what we
effectively have here, but at least MySQL only works only on columns of the same
row of the same table [1]

Stephen

[1] https://stackoverflow.com/q/38448583/
diff mbox series

Patch

diff --git docs/api/schemas/latest/patchwork.yaml docs/api/schemas/latest/patchwork.yaml
index e3bff990..cb8fcdfa 100644
--- docs/api/schemas/latest/patchwork.yaml
+++ docs/api/schemas/latest/patchwork.yaml
@@ -2260,6 +2260,10 @@  components:
           description: >
             Version of series as indicated by the subject prefix(es).
           type: integer
+        is_open:
+          title: Status
+          type: boolean
+          readOnly: true
         total:
           title: Total
           description: >
@@ -2610,6 +2614,10 @@  components:
             Version of series as indicated by the subject prefix(es).
           type: integer
           readOnly: true
+        is_open:
+          title: Status
+          type: boolean
+          readOnly: true
         mbox:
           title: Mbox
           type: string
diff --git docs/api/schemas/patchwork.j2 docs/api/schemas/patchwork.j2
index 3b4ad2f6..13610b84 100644
--- docs/api/schemas/patchwork.j2
+++ docs/api/schemas/patchwork.j2
@@ -2354,6 +2354,12 @@  components:
           description: >
             Version of series as indicated by the subject prefix(es).
           type: integer
+{% if version >= (1, 3) %}
+        is_open:
+          title: Status
+          type: boolean
+          readOnly: true
+{% endif %}
         total:
           title: Total
           description: >
@@ -2718,6 +2724,12 @@  components:
             Version of series as indicated by the subject prefix(es).
           type: integer
           readOnly: true
+{% if version >= (1, 3) %}
+        is_open:
+          title: Status
+          type: boolean
+          readOnly: true
+{% endif %}
         mbox:
           title: Mbox
           type: string
diff --git docs/api/schemas/v1.3/patchwork.yaml docs/api/schemas/v1.3/patchwork.yaml
index 6cbba646..3b38799c 100644
--- docs/api/schemas/v1.3/patchwork.yaml
+++ docs/api/schemas/v1.3/patchwork.yaml
@@ -2260,6 +2260,10 @@  components:
           description: >
             Version of series as indicated by the subject prefix(es).
           type: integer
+        is_open:
+          title: Status
+          type: boolean
+          readOnly: true
         total:
           title: Total
           description: >
@@ -2610,6 +2614,10 @@  components:
             Version of series as indicated by the subject prefix(es).
           type: integer
           readOnly: true
+        is_open:
+          title: Status
+          type: boolean
+          readOnly: true
         mbox:
           title: Mbox
           type: string
diff --git patchwork/api/embedded.py patchwork/api/embedded.py
index 78316979..38a00ac4 100644
--- patchwork/api/embedded.py
+++ patchwork/api/embedded.py
@@ -181,11 +181,14 @@  class SeriesSerializer(SerializedRelatedField):
 
         class Meta:
             model = models.Series
-            fields = ('id', 'url', 'web_url', 'date', 'name', 'version',
-                      'mbox')
+            fields = (
+                'id', 'url', 'web_url', 'date', 'name', 'version', 'is_open',
+                'mbox',
+            )
             read_only_fields = fields
             versioned_fields = {
                 '1.1': ('web_url', ),
+                '1.3': ('is_open', ),
             }
             extra_kwargs = {
                 'url': {'view_name': 'api-series-detail'},
diff --git patchwork/api/filters.py patchwork/api/filters.py
index d9b65a8f..38de18b6 100644
--- patchwork/api/filters.py
+++ patchwork/api/filters.py
@@ -8,6 +8,7 @@  from django.core.exceptions import ValidationError
 from django.db.models import Q
 from django_filters import rest_framework
 from django_filters.rest_framework import FilterSet
+from django_filters import BooleanFilter
 from django_filters import CharFilter
 from django_filters import IsoDateTimeFilter
 from django_filters import ModelMultipleChoiceFilter
@@ -178,10 +179,14 @@  class SeriesFilterSet(TimestampMixin, BaseFilterSet):
 
     submitter = PersonFilter(queryset=Person.objects.all(), distinct=False)
     project = ProjectFilter(queryset=Project.objects.all(), distinct=False)
+    is_open = BooleanFilter()
 
     class Meta:
         model = Series
-        fields = ('submitter', 'project')
+        fields = ('submitter', 'project', 'is_open')
+        versioned_fields = {
+            '1.2': ('hash', 'msgid'),
+        }
 
 
 def msgid_filter(queryset, name, value):
diff --git patchwork/api/series.py patchwork/api/series.py
index 106e60f0..f3a498e0 100644
--- patchwork/api/series.py
+++ patchwork/api/series.py
@@ -36,13 +36,16 @@  class SeriesSerializer(BaseHyperlinkedModelSerializer):
 
     class Meta:
         model = Series
-        fields = ('id', 'url', 'web_url', 'project', 'name', 'date',
-                  'submitter', 'version', 'total', 'received_total',
-                  'received_all', 'mbox', 'cover_letter', 'patches')
+        fields = (
+            'id', 'url', 'web_url', 'project', 'name', 'date',
+            'submitter', 'version', 'is_open', 'total', 'received_total',
+            'received_all', 'mbox', 'cover_letter', 'patches',
+        )
         read_only_fields = ('date', 'submitter', 'total', 'received_total',
                             'received_all', 'mbox', 'cover_letter', 'patches')
         versioned_fields = {
             '1.1': ('web_url', ),
+            '1.3': ('is_open', ),
         }
         extra_kwargs = {
             'url': {'view_name': 'api-series-detail'},
diff --git patchwork/migrations/0046_series_is_open.py patchwork/migrations/0046_series_is_open.py
new file mode 100644
index 00000000..c916ed8b
--- /dev/null
+++ patchwork/migrations/0046_series_is_open.py
@@ -0,0 +1,58 @@ 
+from django.db import migrations, models
+
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ('patchwork', '0045_addressed_fields'),
+    ]
+
+    operations = [
+        migrations.AddField(
+            model_name='series',
+            name='is_open',
+            field=models.BooleanField(
+                help_text='Are all patches in this series resolved?',
+                null=True,
+            ),
+        ),
+        # Each patch in a series is associated with a state. States have an
+        # 'action_required' attribute. A state with 'action_required=True' is
+        # effectively "open", since the maintainer(s) still need to do
+        # something with it. This means the patches associated with these
+        # states are open, and by extension any series containing patches
+        # associated with these states are also open.
+        #
+        # Got that? Good. All we're doing here is setting the series' new
+        # 'is_open' attribute by identifying if any of the patches associated
+        # with each series have a state with 'action_required=True'. We do this
+        # aggregation using 'MAX', since booleans are stored as integers in
+        # most DB backends (1/True > 0/False) and ANSI SQL states True > False
+
+        # TODO: Which of these offers the best performance? The first is MySQL
+        # only fwict
+        # migrations.RunSQL(
+        #     """
+        #     UPDATE patchwork_series
+        #     JOIN (
+        #         SELECT patch.series_id AS series_id, MAX(state.action_required) AS is_open  # noqa: E501
+        #         FROM patchwork_patch patch
+        #         JOIN patchwork_state state ON patch.state_id = state.id
+        #         GROUP BY series_id
+        #     ) AS sub ON patchwork_series.id = sub.series_id
+        #     SET patchwork_series.is_open = sub.is_open;
+        #     """,  # noqa: E501
+        # ),
+        migrations.RunSQL(
+            """
+            UPDATE patchwork_series
+            SET is_open = (
+                SELECT MAX(patchwork_state.action_required)
+                FROM patchwork_patch
+                JOIN patchwork_state ON state_id = patchwork_state.id
+                WHERE patchwork_patch.series_id = patchwork_series.id
+                GROUP BY patchwork_patch.series_id
+            );
+            """,
+        ),
+    ]
diff --git patchwork/models.py patchwork/models.py
index 58e4c51e..7441510b 100644
--- patchwork/models.py
+++ patchwork/models.py
@@ -14,6 +14,7 @@  from django.conf import settings
 from django.contrib.auth.models import User
 from django.core.exceptions import ValidationError
 from django.db import models
+from django.db import transaction
 from django.urls import reverse
 from django.utils.functional import cached_property
 from django.core.validators import validate_unicode_slug
@@ -494,6 +495,19 @@  class Patch(SubmissionMixin):
         for tag in tags:
             self._set_tag(tag, counter[tag])
 
+    def refresh_series_state(self):
+        if not self.series:
+            return
+
+        # If any of the patches in the series is in an "action required" state,
+        # then the series is still open
+        # TODO(stephenfin): Can this still race?
+        with transaction.atomic():
+            self.series.is_open = self.series.patches.filter(
+                state__action_required=True
+            ).exists()
+            self.series.save()
+
     def save(self, *args, **kwargs):
         if not hasattr(self, 'state') or not self.state:
             self.state = get_default_initial_patch_state()
@@ -503,6 +517,7 @@  class Patch(SubmissionMixin):
 
         super(Patch, self).save(**kwargs)
 
+        self.refresh_series_state()
         self.refresh_tag_counts()
 
     def is_editable(self, user):
@@ -772,6 +787,11 @@  class Series(FilenameMixin, models.Model):
                                   'by the subject prefix(es)')
     total = models.IntegerField(help_text='Number of patches in series as '
                                 'indicated by the subject prefix(es)')
+    is_open = models.BooleanField(
+        default=True,
+        null=True,
+        help_text='Are all patches in this series resolved?',
+    )
 
     @staticmethod
     def _format_name(obj):
diff --git patchwork/tests/api/test_series.py patchwork/tests/api/test_series.py
index ef661773..830b2fdb 100644
--- patchwork/tests/api/test_series.py
+++ patchwork/tests/api/test_series.py
@@ -16,6 +16,7 @@  from patchwork.tests.utils import create_patch
 from patchwork.tests.utils import create_person
 from patchwork.tests.utils import create_project
 from patchwork.tests.utils import create_series
+from patchwork.tests.utils import create_state
 from patchwork.tests.utils import create_user
 
 if settings.ENABLE_REST_API:
@@ -122,6 +123,57 @@  class TestSeriesAPI(utils.APITestCase):
             'submitter': 'test@example.org'})
         self.assertEqual(0, len(resp.data))
 
+    def test_list_filter_is_open(self):
+        """Filter series by status."""
+        project_obj = create_project(linkname='myproject')
+        person_obj = create_person(email='test@example.com')
+
+        # create two series with a single patch in each with different states
+        # for the two patches
+
+        state_a = create_state(name='New', slug='new', action_required=True)
+        series_a = create_series(project=project_obj, submitter=person_obj)
+        create_cover(series=series_a)
+        create_patch(series=series_a, state=state_a)
+        self.assertTrue(series_a.is_open)
+
+        state_b = create_state(
+            name='Accepted', slug='Accepted', action_required=False,
+        )
+        series_b = create_series(project=project_obj, submitter=person_obj)
+        create_cover(series=series_b)
+        create_patch(series=series_b, state=state_b)
+        self.assertFalse(series_b.is_open)
+
+        resp = self.client.get(self.api_url(), {'is_open': 'True'})
+        self.assertEqual([series_a.id], [x['id'] for x in resp.data])
+
+        resp = self.client.get(self.api_url(), {'is_open': 'False'})
+        self.assertEqual([series_b.id], [x['id'] for x in resp.data])
+
+        resp = self.client.get(self.api_url())
+        self.assertEqual(
+            [series_a.id, series_b.id], [x['id'] for x in resp.data]
+        )
+
+    @utils.store_samples('series-list-1-2')
+    def test_list_version_1_2(self):
+        """List patches using API v1.2.
+
+        Validate that newer fields are dropped for older API versions.
+        """
+        self._create_series()
+
+        resp = self.client.get(self.api_url(version='1.2'))
+        self.assertEqual(status.HTTP_200_OK, resp.status_code)
+        self.assertEqual(1, len(resp.data))
+        self.assertIn('url', resp.data[0])
+        self.assertIn('web_url', resp.data[0])
+        self.assertIn('web_url', resp.data[0]['cover_letter'])
+        self.assertIn('mbox', resp.data[0]['cover_letter'])
+        self.assertIn('web_url', resp.data[0]['patches'][0])
+        self.assertNotIn('is_open', resp.data[0])
+
     @utils.store_samples('series-list-1-0')
     def test_list_version_1_0(self):
         """List patches using API v1.0.
diff --git patchwork/tests/test_models.py patchwork/tests/test_models.py
new file mode 100644
index 00000000..13db3330
--- /dev/null
+++ patchwork/tests/test_models.py
@@ -0,0 +1,72 @@ 
+# Patchwork - automated patch tracking system
+# Copyright (C) 2021 Stephen Finucane <stephen@that.guru>
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+from django.test import TestCase
+
+from patchwork.tests.utils import create_patch
+from patchwork.tests.utils import create_series
+from patchwork.tests.utils import create_state
+
+
+class TestPatch(TestCase):
+
+    def test_save_updates_series_state(self):
+        state_new = create_state(
+            name='New',
+            slug='new',
+            action_required=True,
+        )
+        state_under_review = create_state(
+            name='Under Review',
+            slug='under-review',
+            action_required=True,
+        )
+        state_accepted = create_state(
+            name='Accepted',
+            slug='accepted',
+            action_required=False,
+        )
+        state_rejected = create_state(
+            name='Rejected',
+            slug='rejected',
+            action_required=False,
+        )
+        series = create_series()
+        patches = []
+
+        # Create four patches - one in each state
+        for state in (
+            state_new, state_under_review, state_accepted, state_rejected,
+        ):
+            patches.append(create_patch(series=series, state=state))
+
+        # The patches will be in various states so the series should still be
+        # open
+        self.assertTrue(patches[0].state.action_required)
+        self.assertTrue(patches[1].state.action_required)
+        self.assertFalse(patches[2].state.action_required)
+        self.assertFalse(patches[3].state.action_required)
+        self.assertTrue(series.is_open)
+
+        # Now change the first patch to 'accepted'. The series shouldn't change
+        # state because the second one is still 'under-review'
+        patches[0].state = state_accepted
+        patches[0].save()
+        self.assertFalse(patches[0].state.action_required)
+        self.assertTrue(patches[1].state.action_required)
+        self.assertFalse(patches[2].state.action_required)
+        self.assertFalse(patches[3].state.action_required)
+        self.assertTrue(series.is_open)
+
+        # Now change the second patch to 'rejected'. The series should finally
+        # change state because all patches now have a state with
+        # action_required=False
+        patches[1].state = state_rejected
+        patches[1].save()
+        self.assertFalse(patches[0].state.action_required)
+        self.assertFalse(patches[1].state.action_required)
+        self.assertFalse(patches[2].state.action_required)
+        self.assertFalse(patches[3].state.action_required)
+        self.assertFalse(series.is_open)