[net-next,RFC,v3,00/14] Add devlink reload action option

Message ID 1598801254-27764-1-git-send-email-moshe@mellanox.com

Message

Moshe Shemesh Aug. 30, 2020, 3:27 p.m. UTC
Introduce a new option on the devlink reload API to enable the user to
select the reload action required. Complete support for all actions in mlx5.
The following reload actions are supported:
  driver_reinit: driver entities re-initialization, applying devlink-param
                 and devlink-resource values.
  fw_activate: firmware activate.
  fw_activate_no_reset: Activate new firmware image without any reset.
                        (also known as: firmware live patching).

Each driver which supports this command should expose the reload actions
it supports.
The uAPI is backward compatible: if the reload action option is omitted
from the reload command, the driver_reinit action will be used.
Note that when required to do firmware activation some drivers may need
to reload the driver. On the other hand some drivers may need to reset
the firmware to reinitialize the driver entities. Therefore, the devlink
reload command returns the actions which were actually done.
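
To make the semantics above concrete, here is a minimal sketch in Python
(not kernel code). The names ReloadAction, IMPLIED_BY_DRIVER and reload()
are hypothetical and only illustrate the intended behaviour: an omitted
action defaults to driver_reinit, and the reply lists every action that was
actually performed. The implied-action table is an assumption modelled on
the mlx5 example output in this letter, where fw_activate also reports
driver_reinit.

```python
from enum import IntFlag
from typing import Dict, Optional


class ReloadAction(IntFlag):
    """Hypothetical stand-in for the reload actions named in this series."""
    DRIVER_REINIT = 1
    FW_ACTIVATE = 2
    FW_ACTIVATE_NO_RESET = 4


# Assumption: a per-driver table of extra actions a request drags along.
# Here, activating firmware also re-initializes the driver.
IMPLIED_BY_DRIVER: Dict[ReloadAction, ReloadAction] = {
    ReloadAction.FW_ACTIVATE: ReloadAction.DRIVER_REINIT,
}


def reload(supported: ReloadAction,
           requested: Optional[ReloadAction] = None) -> ReloadAction:
    """Return the set of actions actually performed for one reload request."""
    # Backward compatibility: no action given means driver_reinit.
    action = requested if requested is not None else ReloadAction.DRIVER_REINIT
    if not (action & supported):
        raise ValueError(f"reload action not supported: {action!r}")
    # The reply reports the requested action plus anything it implied.
    return action | IMPLIED_BY_DRIVER.get(action, ReloadAction(0))
```

A caller that only asked for fw_activate can inspect the returned flags to
learn that the driver was re-initialized as well.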

Add reload action counters to hold the history per reload action type.
For example, the counters show how many times fw_activate has been done
on this device since the driver module was added, and whether firmware
activation was done with or without reset.
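
The counter idea can be sketched as follows, again in Python and with
hypothetical names (ReloadStats, record). It assumes one counter per action
type, bumped once for every action reported as performed, so the history
distinguishes activation with reset from activation without reset.

```python
from collections import Counter
from typing import Iterable

ACTIONS = ("driver_reinit", "fw_activate", "fw_activate_no_reset")


class ReloadStats:
    """Hypothetical per-device history of performed reload actions."""

    def __init__(self) -> None:
        # Start every known action at zero so all counters are always visible.
        self.counts = Counter({name: 0 for name in ACTIONS})

    def record(self, performed: Iterable[str]) -> None:
        """Bump the counter of each action one reload actually performed."""
        for name in performed:
            if name not in ACTIONS:
                raise ValueError(f"unknown reload action: {name}")
            self.counts[name] += 1
```

In this model the counters live as long as the ReloadStats object, which
stands in for "since the driver module was added".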

Patch 1 adds the new API reload action option to devlink.
Patch 2 adds reload actions counters.
Patch 3 exposes the reload actions counters on devlink dev get.
Patches 4-9 add support on mlx5 for devlink reload action fw_activate
            and handle the firmware reset events.
Patches 10-11 add the devlink enable_remote_dev_reset parameter and use
              it in mlx5.
Patches 12-13 add mlx5 support for devlink reload action
              fw_activate_no_reset and event handling.
Patch 14 adds documentation file devlink-reload.rst

Command examples:
$ devlink dev reload pci/0000:82:00.0 action driver_reinit
reload_actions_done:
  driver_reinit

$ devlink dev reload pci/0000:82:00.0 action fw_activate
reload_actions_done:
  driver_reinit fw_activate

$ devlink dev reload pci/0000:82:00.0 action fw_activate no_reset
reload_actions_done:
  fw_activate_no_reset

v2 -> v3:
- Replace the fw_live_patch action with fw_activate_no_reset
- Devlink reload returns the actions done over the netlink reply
- Add reload action counters

v1 -> v2:
- Replace the reload levels driver, fw_reset, fw_live_patch with the
  reload actions driver_reinit, fw_activate, fw_live_patch
- Remove the driver default level; the action driver_reinit is the
  default action for all drivers

Moshe Shemesh (14):
  devlink: Add reload action option to devlink reload command
  devlink: Add reload actions counters
  devlink: Add reload actions counters to dev get
  net/mlx5: Add functions to set/query MFRL register
  net/mlx5: Set cap for pci sync for fw update event
  net/mlx5: Handle sync reset request event
  net/mlx5: Handle sync reset now event
  net/mlx5: Handle sync reset abort event
  net/mlx5: Add support for devlink reload action fw activate
  devlink: Add enable_remote_dev_reset generic parameter
  net/mlx5: Add devlink param enable_remote_dev_reset support
  net/mlx5: Add support for fw live patch event
  net/mlx5: Add support for devlink reload action fw activate no reset
  devlink: Add Documentation/networking/devlink/devlink-reload.rst

 .../networking/devlink/devlink-params.rst     |   6 +
 .../networking/devlink/devlink-reload.rst     |  68 +++
 Documentation/networking/devlink/index.rst    |   1 +
 drivers/net/ethernet/mellanox/mlx4/main.c     |  14 +-
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 .../net/ethernet/mellanox/mlx5/core/devlink.c | 117 ++++-
 .../mellanox/mlx5/core/diag/fw_tracer.c       |  31 ++
 .../mellanox/mlx5/core/diag/fw_tracer.h       |   1 +
 .../ethernet/mellanox/mlx5/core/fw_reset.c    | 453 ++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/fw_reset.h    |  19 +
 .../net/ethernet/mellanox/mlx5/core/health.c  |  35 +-
 .../net/ethernet/mellanox/mlx5/core/main.c    |  13 +
 .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   2 +
 drivers/net/ethernet/mellanox/mlxsw/core.c    |  24 +-
 drivers/net/netdevsim/dev.c                   |  16 +-
 include/linux/mlx5/device.h                   |   1 +
 include/linux/mlx5/driver.h                   |   4 +
 include/net/devlink.h                         |  13 +-
 include/uapi/linux/devlink.h                  |  24 +
 net/core/devlink.c                            | 174 ++++++-
 20 files changed, 967 insertions(+), 51 deletions(-)
 create mode 100644 Documentation/networking/devlink/devlink-reload.rst
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h

Comments

Jiri Pirko Aug. 31, 2020, 10:49 a.m. UTC | #1
Sun, Aug 30, 2020 at 05:27:20PM CEST, moshe@mellanox.com wrote:
>Introduce new option on devlink reload API to enable the user to select the
>reload action required. Complete support for all actions in mlx5.
>The following reload actions are supported:
>  driver_reinit: driver entities re-initialization, applying devlink-param
>                 and devlink-resource values.
>  fw_activate: firmware activate.
>  fw_activate_no_reset: Activate new firmware image without any reset.
>                        (also known as: firmware live patching).
>
>Each driver which support this command should expose the reload actions
>supported.
>The uAPI is backward compatible, if the reload action option is omitted
>from the reload command, the driver reinit action will be used.
>Note that when required to do firmware activation some drivers may need
>to reload the driver. On the other hand some drivers may need to reset
>the firmware to reinitialize the driver entities. Therefore, the devlink
>reload command returns the actions which were actually done.
>
>Add reload actions counters to hold the history per reload action type.
>For example, the number of times fw_activate has been done on this
>device since the driver module was added or if the firmware activation
>was done with or without reset.
>
>Patch 1 adds the new API reload action option to devlink.
>Patch 2 adds reload actions counters.
>Patch 3 exposes the reload actions counters on devlink dev get.
>Patches 4-9 add support on mlx5 for devlink reload action fw_activate
>            and handle the firmware reset events.
>Patches 10-11 add devlink enable remote dev reset parameter and use it
>             in mlx5.
>Patches 12-13 mlx5 add devlink reload action fw_activate_no_reset support
>              and event handling.
>Patch 14 adds documentation file devlink-reload.rst 
>
>command examples:
>$devlink dev reload pci/0000:82:00.0 action driver_reinit
>reload_actions_done:
>  driver_reinit
>
>$devlink dev reload pci/0000:82:00.0 action fw_activate
>reload_actions_done:
>  driver_reinit fw_activate
>
>$ devlink dev reload pci/0000:82:00.0 action fw_activate no_reset

You are missing "_".


>reload_actions_done:

No need to have "reload" word here. And maybe "performed" would be
better than "done". Idk:
"actions_performed"
?


>  fw_activate_no_reset
>
>v2 -> v3:
>- Replace fw_live_patch action by fw_activate_no_reset
>- Devlink reload returns the actions done over netlink reply
>- Add reload actions counters
>
>v1 -> v2:
>- Instead of reload levels driver,fw_reset,fw_live_patch have reload
>  actions driver_reinit,fw_activate,fw_live_patch
>- Remove driver default level, the action driver_reinit is the default
>  action for all drivers 
>
>Moshe Shemesh (14):
>  devlink: Add reload action option to devlink reload command
>  devlink: Add reload actions counters
>  devlink: Add reload actions counters to dev get
>  net/mlx5: Add functions to set/query MFRL register
>  net/mlx5: Set cap for pci sync for fw update event
>  net/mlx5: Handle sync reset request event
>  net/mlx5: Handle sync reset now event
>  net/mlx5: Handle sync reset abort event
>  net/mlx5: Add support for devlink reload action fw activate
>  devlink: Add enable_remote_dev_reset generic parameter
>  net/mlx5: Add devlink param enable_remote_dev_reset support
>  net/mlx5: Add support for fw live patch event
>  net/mlx5: Add support for devlink reload action fw activate no reset
>  devlink: Add Documentation/networking/devlink/devlink-reload.rst
>
> .../networking/devlink/devlink-params.rst     |   6 +
> .../networking/devlink/devlink-reload.rst     |  68 +++
> Documentation/networking/devlink/index.rst    |   1 +
> drivers/net/ethernet/mellanox/mlx4/main.c     |  14 +-
> .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
> .../net/ethernet/mellanox/mlx5/core/devlink.c | 117 ++++-
> .../mellanox/mlx5/core/diag/fw_tracer.c       |  31 ++
> .../mellanox/mlx5/core/diag/fw_tracer.h       |   1 +
> .../ethernet/mellanox/mlx5/core/fw_reset.c    | 453 ++++++++++++++++++
> .../ethernet/mellanox/mlx5/core/fw_reset.h    |  19 +
> .../net/ethernet/mellanox/mlx5/core/health.c  |  35 +-
> .../net/ethernet/mellanox/mlx5/core/main.c    |  13 +
> .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   2 +
> drivers/net/ethernet/mellanox/mlxsw/core.c    |  24 +-
> drivers/net/netdevsim/dev.c                   |  16 +-
> include/linux/mlx5/device.h                   |   1 +
> include/linux/mlx5/driver.h                   |   4 +
> include/net/devlink.h                         |  13 +-
> include/uapi/linux/devlink.h                  |  24 +
> net/core/devlink.c                            | 174 ++++++-
> 20 files changed, 967 insertions(+), 51 deletions(-)
> create mode 100644 Documentation/networking/devlink/devlink-reload.rst
> create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
> create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
>
>-- 
>2.17.1
>
Moshe Shemesh Sept. 1, 2020, 8:05 p.m. UTC | #2
On 8/31/2020 1:49 PM, Jiri Pirko wrote:
> Sun, Aug 30, 2020 at 05:27:20PM CEST, moshe@mellanox.com wrote:
>> Introduce new option on devlink reload API to enable the user to select the
>> reload action required. Complete support for all actions in mlx5.
>> The following reload actions are supported:
>>   driver_reinit: driver entities re-initialization, applying devlink-param
>>                  and devlink-resource values.
>>   fw_activate: firmware activate.
>>   fw_activate_no_reset: Activate new firmware image without any reset.
>>                         (also known as: firmware live patching).
>>
>> Each driver which support this command should expose the reload actions
>> supported.
>> The uAPI is backward compatible, if the reload action option is omitted
>> from the reload command, the driver reinit action will be used.
>> Note that when required to do firmware activation some drivers may need
>> to reload the driver. On the other hand some drivers may need to reset
>> the firmware to reinitialize the driver entities. Therefore, the devlink
>> reload command returns the actions which were actually done.
>>
>> Add reload actions counters to hold the history per reload action type.
>> For example, the number of times fw_activate has been done on this
>> device since the driver module was added or if the firmware activation
>> was done with or without reset.
>>
>> Patch 1 adds the new API reload action option to devlink.
>> Patch 2 adds reload actions counters.
>> Patch 3 exposes the reload actions counters on devlink dev get.
>> Patches 4-9 add support on mlx5 for devlink reload action fw_activate
>>             and handle the firmware reset events.
>> Patches 10-11 add devlink enable remote dev reset parameter and use it
>>              in mlx5.
>> Patches 12-13 mlx5 add devlink reload action fw_activate_no_reset support
>>               and event handling.
>> Patch 14 adds documentation file devlink-reload.rst
>>
>> command examples:
>> $devlink dev reload pci/0000:82:00.0 action driver_reinit
>> reload_actions_done:
>>   driver_reinit
>>
>> $devlink dev reload pci/0000:82:00.0 action fw_activate
>> reload_actions_done:
>>   driver_reinit fw_activate
>>
>> $ devlink dev reload pci/0000:82:00.0 action fw_activate no_reset
> You are missing "_".
>
I meant that no_reset is an option, so the uAPI is:

$ devlink dev reload DEV [ netns { PID | NAME | ID } ]
          [ action { driver_reinit | fw_activate [no_reset] } ]

It should probably have been "--no_reset" or "-no_reset", but it seems
that all options in devlink are global rather than specific to a command.

There is probably a better way; please advise.
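
As a rough illustration of the grammar above, the following Python sketch
maps the tokens after "action" onto the three reload actions of this
series. The function name parse_reload_action is hypothetical, and this is
not how the devlink tool parses its arguments; it only shows that
"fw_activate no_reset" would resolve to the fw_activate_no_reset action.

```python
from typing import List


def parse_reload_action(tokens: List[str]) -> str:
    """Parse the tokens that follow `action` into a reload action name."""
    if not tokens:
        raise ValueError("missing reload action")
    action, rest = tokens[0], tokens[1:]
    if action == "driver_reinit" and not rest:
        return "driver_reinit"
    if action == "fw_activate":
        if not rest:
            return "fw_activate"
        # The optional trailing token selects the no-reset variant.
        if rest == ["no_reset"]:
            return "fw_activate_no_reset"
    raise ValueError("unsupported reload action: " + " ".join(tokens))
```

Note that no_reset is only accepted after fw_activate, which matches the
bracket placement in the usage line.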

>> reload_actions_done:
> No need to have "reload" word here. And maybe "performed" would be
> better than "done". Idk:
> "actions_performed"
> ?
>

Yes, that's better, I will fix.

>>   fw_activate_no_reset
>>
>> v2 -> v3:
>> - Replace fw_live_patch action by fw_activate_no_reset
>> - Devlink reload returns the actions done over netlink reply
>> - Add reload actions counters
>>
>> v1 -> v2:
>> - Instead of reload levels driver,fw_reset,fw_live_patch have reload
>>   actions driver_reinit,fw_activate,fw_live_patch
>> - Remove driver default level, the action driver_reinit is the default
>>   action for all drivers
>>
>> Moshe Shemesh (14):
>>   devlink: Add reload action option to devlink reload command
>>   devlink: Add reload actions counters
>>   devlink: Add reload actions counters to dev get
>>   net/mlx5: Add functions to set/query MFRL register
>>   net/mlx5: Set cap for pci sync for fw update event
>>   net/mlx5: Handle sync reset request event
>>   net/mlx5: Handle sync reset now event
>>   net/mlx5: Handle sync reset abort event
>>   net/mlx5: Add support for devlink reload action fw activate
>>   devlink: Add enable_remote_dev_reset generic parameter
>>   net/mlx5: Add devlink param enable_remote_dev_reset support
>>   net/mlx5: Add support for fw live patch event
>>   net/mlx5: Add support for devlink reload action fw activate no reset
>>   devlink: Add Documentation/networking/devlink/devlink-reload.rst
>>
>> .../networking/devlink/devlink-params.rst     |   6 +
>> .../networking/devlink/devlink-reload.rst     |  68 +++
>> Documentation/networking/devlink/index.rst    |   1 +
>> drivers/net/ethernet/mellanox/mlx4/main.c     |  14 +-
>> .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
>> .../net/ethernet/mellanox/mlx5/core/devlink.c | 117 ++++-
>> .../mellanox/mlx5/core/diag/fw_tracer.c       |  31 ++
>> .../mellanox/mlx5/core/diag/fw_tracer.h       |   1 +
>> .../ethernet/mellanox/mlx5/core/fw_reset.c    | 453 ++++++++++++++++++
>> .../ethernet/mellanox/mlx5/core/fw_reset.h    |  19 +
>> .../net/ethernet/mellanox/mlx5/core/health.c  |  35 +-
>> .../net/ethernet/mellanox/mlx5/core/main.c    |  13 +
>> .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   2 +
>> drivers/net/ethernet/mellanox/mlxsw/core.c    |  24 +-
>> drivers/net/netdevsim/dev.c                   |  16 +-
>> include/linux/mlx5/device.h                   |   1 +
>> include/linux/mlx5/driver.h                   |   4 +
>> include/net/devlink.h                         |  13 +-
>> include/uapi/linux/devlink.h                  |  24 +
>> net/core/devlink.c                            | 174 ++++++-
>> 20 files changed, 967 insertions(+), 51 deletions(-)
>> create mode 100644 Documentation/networking/devlink/devlink-reload.rst
>> create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
>> create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
>>
>> -- 
>> 2.17.1
>>
Jiri Pirko Sept. 2, 2020, 7:55 a.m. UTC | #3
Tue, Sep 01, 2020 at 09:16:17PM CEST, moshe@nvidia.com wrote:
>
>On 8/31/2020 1:49 PM, Jiri Pirko wrote:
>> Sun, Aug 30, 2020 at 05:27:20PM CEST, moshe@mellanox.com wrote:
>> > Introduce new option on devlink reload API to enable the user to select the
>> > reload action required. Complete support for all actions in mlx5.
>> > The following reload actions are supported:
>> >   driver_reinit: driver entities re-initialization, applying devlink-param
>> >                  and devlink-resource values.
>> >   fw_activate: firmware activate.
>> >   fw_activate_no_reset: Activate new firmware image without any reset.
>> >                         (also known as: firmware live patching).
>> > 
>> > Each driver which support this command should expose the reload actions
>> > supported.
>> > The uAPI is backward compatible, if the reload action option is omitted
>> > from the reload command, the driver reinit action will be used.
>> > Note that when required to do firmware activation some drivers may need
>> > to reload the driver. On the other hand some drivers may need to reset
>> > the firmware to reinitialize the driver entities. Therefore, the devlink
>> > reload command returns the actions which were actually done.
>> > 
>> > Add reload actions counters to hold the history per reload action type.
>> > For example, the number of times fw_activate has been done on this
>> > device since the driver module was added or if the firmware activation
>> > was done with or without reset.
>> > 
>> > Patch 1 adds the new API reload action option to devlink.
>> > Patch 2 adds reload actions counters.
>> > Patch 3 exposes the reload actions counters on devlink dev get.
>> > Patches 4-9 add support on mlx5 for devlink reload action fw_activate
>> >             and handle the firmware reset events.
>> > Patches 10-11 add devlink enable remote dev reset parameter and use it
>> >              in mlx5.
>> > Patches 12-13 mlx5 add devlink reload action fw_activate_no_reset support
>> >               and event handling.
>> > Patch 14 adds documentation file devlink-reload.rst
>> > 
>> > command examples:
>> > $devlink dev reload pci/0000:82:00.0 action driver_reinit
>> > reload_actions_done:
>> >   driver_reinit
>> > 
>> > $devlink dev reload pci/0000:82:00.0 action fw_activate
>> > reload_actions_done:
>> >   driver_reinit fw_activate
>> > 
>> > $ devlink dev reload pci/0000:82:00.0 action fw_activate no_reset
>> You are missing "_".
>
> I meant that no_reset is an option here, so the uAPI is :
>
>$ devlink dev reload DEV [ netns { PID | NAME | ID } ] [ action {
>driver_reinit | fw_activate [no_reset] } ]

In the uapi enum, it's a different value. It is desirable to follow the
uapi for things like this. I don't see why not.

>
> Should have been as "--no_reset" or "-no_reset" but it seemed that all
>options in devlink are global, not specific to command
>
>Do you see a better way, please advise

If you want to do it this way, you need a separate netlink attr. But I
don't think it is necessary. I provided a suggestion in the other email.
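
The difference between the two approaches can be sketched in Python. The
dictionary keys "action" and "no_reset" are hypothetical placeholders for
netlink attributes, not real attribute names: the first encoder carries
everything in one action value that follows the uapi enum, the second needs
an extra attribute for the option.

```python
from typing import Dict, Union

UAPI_ACTIONS = ("driver_reinit", "fw_activate", "fw_activate_no_reset")

Message = Dict[str, Union[str, bool]]


def encode_single_attr(action: str) -> Message:
    """One attribute: the CLI keyword is exactly the uapi enum value."""
    if action not in UAPI_ACTIONS:
        raise ValueError(f"unknown reload action: {action}")
    return {"action": action}


def encode_with_option(action: str, no_reset: bool = False) -> Message:
    """Two attributes: the action plus a separate flag for no_reset."""
    if action not in ("driver_reinit", "fw_activate"):
        raise ValueError(f"unknown reload action: {action}")
    if no_reset and action != "fw_activate":
        raise ValueError("no_reset only applies to fw_activate")
    msg: Message = {"action": action}
    if no_reset:
        msg["no_reset"] = True
    return msg
```

With the single-attribute form there is no combination to validate, which
is one reason to keep the command line aligned with the enum.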


>
>> 
>> > reload_actions_done:
>> No need to have "reload" word here. And maybe "performed" would be
>> better than "done". Idk:
>> "actions_performed"
>> ?
>
>
>Yes, that's better, I will fix.
>
>> 
>> >   fw_activate_no_reset
>> > 
>> > v2 -> v3:
>> > - Replace fw_live_patch action by fw_activate_no_reset
>> > - Devlink reload returns the actions done over netlink reply
>> > - Add reload actions counters
>> > 
>> > v1 -> v2:
>> > - Instead of reload levels driver,fw_reset,fw_live_patch have reload
>> >   actions driver_reinit,fw_activate,fw_live_patch
>> > - Remove driver default level, the action driver_reinit is the default
>> >   action for all drivers
>> > 
>> > Moshe Shemesh (14):
>> >   devlink: Add reload action option to devlink reload command
>> >   devlink: Add reload actions counters
>> >   devlink: Add reload actions counters to dev get
>> >   net/mlx5: Add functions to set/query MFRL register
>> >   net/mlx5: Set cap for pci sync for fw update event
>> >   net/mlx5: Handle sync reset request event
>> >   net/mlx5: Handle sync reset now event
>> >   net/mlx5: Handle sync reset abort event
>> >   net/mlx5: Add support for devlink reload action fw activate
>> >   devlink: Add enable_remote_dev_reset generic parameter
>> >   net/mlx5: Add devlink param enable_remote_dev_reset support
>> >   net/mlx5: Add support for fw live patch event
>> >   net/mlx5: Add support for devlink reload action fw activate no reset
>> >   devlink: Add Documentation/networking/devlink/devlink-reload.rst
>> > 
>> > .../networking/devlink/devlink-params.rst     |   6 +
>> > .../networking/devlink/devlink-reload.rst     |  68 +++
>> > Documentation/networking/devlink/index.rst    |   1 +
>> > drivers/net/ethernet/mellanox/mlx4/main.c     |  14 +-
>> > .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
>> > .../net/ethernet/mellanox/mlx5/core/devlink.c | 117 ++++-
>> > .../mellanox/mlx5/core/diag/fw_tracer.c       |  31 ++
>> > .../mellanox/mlx5/core/diag/fw_tracer.h       |   1 +
>> > .../ethernet/mellanox/mlx5/core/fw_reset.c    | 453 ++++++++++++++++++
>> > .../ethernet/mellanox/mlx5/core/fw_reset.h    |  19 +
>> > .../net/ethernet/mellanox/mlx5/core/health.c  |  35 +-
>> > .../net/ethernet/mellanox/mlx5/core/main.c    |  13 +
>> > .../ethernet/mellanox/mlx5/core/mlx5_core.h   |   2 +
>> > drivers/net/ethernet/mellanox/mlxsw/core.c    |  24 +-
>> > drivers/net/netdevsim/dev.c                   |  16 +-
>> > include/linux/mlx5/device.h                   |   1 +
>> > include/linux/mlx5/driver.h                   |   4 +
>> > include/net/devlink.h                         |  13 +-
>> > include/uapi/linux/devlink.h                  |  24 +
>> > net/core/devlink.c                            | 174 ++++++-
>> > 20 files changed, 967 insertions(+), 51 deletions(-)
>> > create mode 100644 Documentation/networking/devlink/devlink-reload.rst
>> > create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
>> > create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/fw_reset.h
>> > 
>> > -- 
>> > 2.17.1
>> >