mbox series

[v4,00/17] remoteproc: Add support for detaching a rproc

Message ID 20201218173228.2277032-1-mathieu.poirier@linaro.org
Headers show
Series remoteproc: Add support for detaching a rproc | expand

Message

Mathieu Poirier Dec. 18, 2020, 5:32 p.m. UTC
Following the work done here [1], this set provides support for the
remoteproc core to release resources associated with a remote processor
without having to switch it off. That way a platform driver can be removed
or the application processor power cycled while the remote processor is
still operating.

Of special interest in this series are patches 5 and 6 where getting the
address of the resource table installed by an eternal entity if moved to
the core.  This is to support scenarios where a remote process has been
booted by the core but is being detached.  To re-attach the remote
processor, the address of the resource table needs to be known at a later
time than the platform driver's probe() function.

Applies cleanly on v5.10

Thanks,
Mathieu

[1]. https://lkml.org/lkml/2020/7/14/1600

----
New for v4:
- Made binding description OS agnostic (Rob)
- Added functionality to set the external resource table in the core
- Fixed a crash when detaching (Arnaud)
- Fixed error code propagation in rproc_cdev_relase() and rproc_del() (Arnaud)
- Added RB tags

Mathieu Poirier (17):
  dt-bindings: remoteproc: Add bindind to support autonomous processors
  remoteproc: Re-check state in rproc_shutdown()
  remoteproc: Remove useless check in rproc_del()
  remoteproc: Rename function rproc_actuate()
  remoteproc: Add new get_loaded_rsc_table() remoteproc operation
  remoteproc: stm32: Move resource table setup to rproc_ops
  remoteproc: Add new RPROC_ATTACHED state
  remoteproc: Properly represent the attached state
  remoteproc: Properly deal with a kernel panic when attached
  remoteproc: Add new detach() remoteproc operation
  remoteproc: Introduce function __rproc_detach()
  remoteproc: Introduce function rproc_detach()
  remoteproc: Add return value to function rproc_shutdown()
  remoteproc: Properly deal with a stop request when attached
  remoteproc: Properly deal with a start request when attached
  remoteproc: Properly deal with detach request
  remoteproc: Refactor rproc delete and cdev release path

 .../bindings/remoteproc/remoteproc-core.yaml  |  27 +++
 drivers/remoteproc/remoteproc_cdev.c          |  32 ++-
 drivers/remoteproc/remoteproc_core.c          | 211 +++++++++++++++---
 drivers/remoteproc/remoteproc_internal.h      |   8 +
 drivers/remoteproc/remoteproc_sysfs.c         |  20 +-
 drivers/remoteproc/stm32_rproc.c              | 147 ++++++------
 include/linux/remoteproc.h                    |  24 +-
 7 files changed, 344 insertions(+), 125 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/remoteproc/remoteproc-core.yaml

Comments

Arnaud POULIQUEN Jan. 27, 2021, 8:44 a.m. UTC | #1
Hi Mathieu,

Come back on you series...

On 12/18/20 6:32 PM, Mathieu Poirier wrote:
> Add an new get_loaded_rsc_table() operation in order to support
> scenarios where the remoteproc core has booted a remote processor
> and detaches from it.  When re-attaching to the remote processor,
> the core needs to know where the resource table has been placed
> in memory.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> ---
>  drivers/remoteproc/remoteproc_core.c     | 6 ++++++
>  drivers/remoteproc/remoteproc_internal.h | 8 ++++++++
>  include/linux/remoteproc.h               | 5 ++++-
>  3 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index d0f6b39b56f9..3d87c910aca7 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1556,6 +1556,12 @@ static int rproc_attach(struct rproc *rproc)
>  		return ret;
>  	}
>  
> +	ret = rproc_get_loaded_rsc_table(rproc);
> +	if (ret) {
> +		dev_err(dev, "can't load resource table: %d\n", ret);
> +		goto disable_iommu;
> +	}
> +

This function is rather ambiguous. Without the example of stm32, it is not
obvious what the platform driver has to do in this ops. And the update of rproc
in the in the core instead of in platform driver seems to me more reliable.

Here is a suggestion considering that ->cached_table is always NULL:


struct resource_table *rproc_get_loaded_rsc_table(struct rproc *rproc,
                                                  size_t* size)
{

	if (rproc->ops->get_loaded_rsc_table) {
		return rproc->ops->get_loaded_rsc_table(rproc, size);

	*size = 0;
	return NULL;
}

then in rproc_attach:

	table_ptr = rproc_get_loaded_rsc_table(rproc, &tab_size);
	if (PTR_ERR(table_ptr) {
		dev_err(dev, "can't load resource table: %d\n", ret);
		goto disable_iommu;
	}
 	rproc->cached_table = NULL;
 	rproc->table_ptr = table_ptr;
 	rproc->table_sz = table_sz;


Thanks,
Arnaud

>  	/* reset max_notifyid */
>  	rproc->max_notifyid = -1;
>  
> diff --git a/drivers/remoteproc/remoteproc_internal.h b/drivers/remoteproc/remoteproc_internal.h
> index c34002888d2c..c48b301d6ad1 100644
> --- a/drivers/remoteproc/remoteproc_internal.h
> +++ b/drivers/remoteproc/remoteproc_internal.h
> @@ -177,6 +177,14 @@ struct resource_table *rproc_find_loaded_rsc_table(struct rproc *rproc,
>  	return NULL;
>  }
>  
> +static inline int rproc_get_loaded_rsc_table(struct rproc *rproc)
> +{
> +	if (rproc->ops->get_loaded_rsc_table)
> +		return rproc->ops->get_loaded_rsc_table(rproc);
> +
> +	return 0;
> +}
> +
>  static inline
>  bool rproc_u64_fit_in_size_t(u64 val)
>  {
> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> index 3fa3ba6498e8..571615e77e6f 100644
> --- a/include/linux/remoteproc.h
> +++ b/include/linux/remoteproc.h
> @@ -368,7 +368,9 @@ enum rsc_handling_status {
>   * RSC_HANDLED if resource was handled, RSC_IGNORED if not handled and a
>   * negative value on error
>   * @load_rsc_table:	load resource table from firmware image
> - * @find_loaded_rsc_table: find the loaded resouce table
> + * @find_loaded_rsc_table: find the loaded resource table from firmware image
> + * @get_loaded_rsc_table: get resource table installed in memory
> + *			  by external entity
>   * @load:		load firmware to memory, where the remote processor
>   *			expects to find it
>   * @sanity_check:	sanity check the fw image
> @@ -389,6 +391,7 @@ struct rproc_ops {
>  			  int offset, int avail);
>  	struct resource_table *(*find_loaded_rsc_table)(
>  				struct rproc *rproc, const struct firmware *fw);
> +	int (*get_loaded_rsc_table)(struct rproc *rproc);
>  	int (*load)(struct rproc *rproc, const struct firmware *fw);
>  	int (*sanity_check)(struct rproc *rproc, const struct firmware *fw);
>  	u64 (*get_boot_addr)(struct rproc *rproc, const struct firmware *fw);
>
Arnaud POULIQUEN Jan. 27, 2021, 8:46 a.m. UTC | #2
On 12/18/20 6:32 PM, Mathieu Poirier wrote:
> Introduce function __rproc_detach() to perform the same kind of
> operation as rproc_stop(), but instead of switching off the
> remote processor using rproc->ops->stop(), it uses
> rproc->ops->detach().  That way it is possible for the core
> to release the resources associated with a remote processor while
> the latter is kept operating.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> Reviewed-by: Peng Fan <peng.fan@nxp.com>
> ---
>  drivers/remoteproc/remoteproc_core.c | 42 ++++++++++++++++++++++++++++
>  1 file changed, 42 insertions(+)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index fc28053c7f89..e665ed4776c3 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1670,6 +1670,48 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
>  	return 0;
>  }
>  
> +/*
> + * __rproc_detach(): Does the opposite of rproc_attach()
> + */
> +static int __maybe_unused __rproc_detach(struct rproc *rproc)
> +{
> +	struct device *dev = &rproc->dev;
> +	int ret;
> +
> +	/* No need to continue if a detach() operation has not been provided */
> +	if (!rproc->ops->detach)
> +		return -EINVAL;

I wonder if this ops should be optional.

> +
> +	/* Stop any subdevices for the remote processor */
> +	rproc_stop_subdevices(rproc, false);
> +
> +	/*
> +	 * If the remote processors was started by the core then a cached_table
> +	 * is present and we must follow the same cleanup sequence as we would
> +	 * for a shutdown().  As it is in rproc_stop(), use the cached resource
> +	 * table for the rest of the detach process since ->table_ptr will
> +	 * become invalid as soon as carveouts are released in
> +	 * rproc_resource_cleanup().
> +	 */
> +	if (rproc->cached_table)
> +		rproc->table_ptr = rproc->cached_table;
> +
> +	/* Tell the remote processor the core isn't available anymore */
> +	ret = rproc->ops->detach(rproc);
> +	if (ret) {
> +		dev_err(dev, "can't detach from rproc: %d\n", ret);
> +		rproc_start_subdevices(rproc);

Not sure that this would be possible in all cases, without a unprepare and
prepare. What about having the same behavior as the rproc_stop failure?

Thanks
Arnaud.

> +		return ret;
> +	}
> +
> +	rproc_unprepare_subdevices(rproc);
> +
> +	rproc->state = RPROC_DETACHED;
> +
> +	dev_info(dev, "detached remote processor %s\n", rproc->name);
> +
> +	return 0;
> +}
>  
>  /**
>   * rproc_trigger_recovery() - recover a remoteproc
>
Arnaud POULIQUEN Jan. 27, 2021, 8:50 a.m. UTC | #3
On 12/18/20 6:32 PM, Mathieu Poirier wrote:
> Introduce function rproc_detach() to enable the remoteproc
> core to release the resources associated with a remote processor
> without stopping its operation.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> Reviewed-by: Peng Fan <peng.fan@nxp.com>
> ---
>  drivers/remoteproc/remoteproc_core.c | 71 +++++++++++++++++++++++++++-
>  include/linux/remoteproc.h           |  2 +
>  2 files changed, 72 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index e665ed4776c3..ece3f15070b9 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1673,7 +1673,7 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
>  /*
>   * __rproc_detach(): Does the opposite of rproc_attach()
>   */
> -static int __maybe_unused __rproc_detach(struct rproc *rproc)
> +static int __rproc_detach(struct rproc *rproc)
>  {
>  	struct device *dev = &rproc->dev;
>  	int ret;
> @@ -1927,6 +1927,75 @@ void rproc_shutdown(struct rproc *rproc)
>  }
>  EXPORT_SYMBOL(rproc_shutdown);
>  
> +/**
> + * rproc_detach() - Detach the remote processor from the
> + * remoteproc core
> + *
> + * @rproc: the remote processor
> + *
> + * Detach a remote processor (previously attached to with rproc_actuate()).

You rename the function to rproc_attach in you patch 04/17.

Then Reviewed-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>

Thanks,
Arnaud

> + *
> + * In case @rproc is still being used by an additional user(s), then
> + * this function will just decrement the power refcount and exit,
> + * without disconnecting the device.
> + *
> + * Function rproc_detach() calls __rproc_detach() in order to let a remote
> + * processor know that services provided by the application processor are
> + * no longer available.  From there it should be possible to remove the
> + * platform driver and even power cycle the application processor (if the HW
> + * supports it) without needing to switch off the remote processor.
> + */
> +int rproc_detach(struct rproc *rproc)
> +{
> +	struct device *dev = &rproc->dev;
> +	int ret;
> +
> +	ret = mutex_lock_interruptible(&rproc->lock);
> +	if (ret) {
> +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
> +		return ret;
> +	}
> +
> +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
> +		ret = -EPERM;
> +		goto out;
> +	}
> +
> +	/* if the remote proc is still needed, bail out */
> +	if (!atomic_dec_and_test(&rproc->power)) {
> +		ret = -EBUSY;
> +		goto out;
> +	}
> +
> +	ret = __rproc_detach(rproc);
> +	if (ret) {
> +		atomic_inc(&rproc->power);
> +		goto out;
> +	}
> +
> +	/* clean up all acquired resources */
> +	rproc_resource_cleanup(rproc);
> +
> +	rproc_disable_iommu(rproc);
> +
> +	/*
> +	 * If the remote processor was booted by the core the cached table needs
> +	 * to be freed and ->table_ptr set to NULL because it will be
> +	 * invalidated by rproc_resource_cleanup().  If the remote processor was
> +	 * attached to ->cached_table is NULL and kfree() returns right away.
> +	 *
> +	 * In either case ->table_ptr has to be set to NULL.  It will be set
> +	 * again when the remote processor is re-attached to.
> +	 */
> +	kfree(rproc->cached_table);
> +	rproc->cached_table = NULL;
> +	rproc->table_ptr = NULL;
> +out:
> +	mutex_unlock(&rproc->lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL(rproc_detach);
> +
>  /**
>   * rproc_get_by_phandle() - find a remote processor by phandle
>   * @phandle: phandle to the rproc
> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> index 9bb34c3eb847..65ece6f177b7 100644
> --- a/include/linux/remoteproc.h
> +++ b/include/linux/remoteproc.h
> @@ -659,6 +659,8 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
>  
>  int rproc_boot(struct rproc *rproc);
>  void rproc_shutdown(struct rproc *rproc);
> +int rproc_detach(struct rproc *rproc);
> +int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
>  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
>  int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
>  int rproc_coredump_add_custom_segment(struct rproc *rproc,
>
Arnaud POULIQUEN Jan. 27, 2021, 8:56 a.m. UTC | #4
look good to me
Reviewed-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>

On 12/18/20 6:32 PM, Mathieu Poirier wrote:
> Refactor function rproc_del() and rproc_cdev_release() to take
> into account the policy specified in the device tree.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> ---
>  drivers/remoteproc/remoteproc_cdev.c | 18 +++++++++++---
>  drivers/remoteproc/remoteproc_core.c | 36 ++++++++++++++++++++++++----
>  include/linux/remoteproc.h           |  4 ++++
>  3 files changed, 51 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_cdev.c b/drivers/remoteproc/remoteproc_cdev.c
> index f7645f289563..9b2fb6fbf8e7 100644
> --- a/drivers/remoteproc/remoteproc_cdev.c
> +++ b/drivers/remoteproc/remoteproc_cdev.c
> @@ -87,11 +87,23 @@ static long rproc_device_ioctl(struct file *filp, unsigned int ioctl, unsigned l
>  static int rproc_cdev_release(struct inode *inode, struct file *filp)
>  {
>  	struct rproc *rproc = container_of(inode->i_cdev, struct rproc, cdev);
> +	int ret;
>  
> -	if (rproc->cdev_put_on_release && rproc->state == RPROC_RUNNING)
> -		rproc_shutdown(rproc);
> +	if (!rproc->cdev_put_on_release)
> +		return 0;
>  
> -	return 0;
> +	/*
> +	 * The application has crashed or is releasing its file handle.  Detach
> +	 * or shutdown the remote processor based on the policy specified in the
> +	 * DT.  No need to check rproc->state right away, it will be done
> +	 * in either rproc_detach() or rproc_shutdown().
> +	 */
> +	if (rproc->autonomous_on_core_shutdown)
> +		ret = rproc_detach(rproc);
> +	else
> +		ret = rproc_shutdown(rproc);
> +
> +	return ret;
>  }
>  
>  static const struct file_operations rproc_fops = {
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index 2fe42ac7ca89..9f47a4ec0ec6 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -2254,6 +2254,22 @@ static int rproc_alloc_ops(struct rproc *rproc, const struct rproc_ops *ops)
>  	return 0;
>  }
>  
> +static void rproc_set_automation_flags(struct rproc *rproc)
> +{
> +	struct device *dev = rproc->dev.parent;
> +	struct device_node *np = dev->of_node;
> +	bool core_shutdown;
> +
> +	/*
> +	 * When function rproc_cdev_release() or rproc_del() are called and
> +	 * the remote processor has been attached to, it will be detached from
> +	 * (rather than turned off) if "autonomous-on-core-shutdown is specified
> +	 * in the DT.
> +	 */
> +	core_shutdown = of_property_read_bool(np, "autonomous-on-core-shutdown");
> +	rproc->autonomous_on_core_shutdown = core_shutdown;
> +}
> +
>  /**
>   * rproc_alloc() - allocate a remote processor handle
>   * @dev: the underlying device
> @@ -2312,6 +2328,8 @@ struct rproc *rproc_alloc(struct device *dev, const char *name,
>  	if (rproc_alloc_ops(rproc, ops))
>  		goto put_device;
>  
> +	rproc_set_automation_flags(rproc);
> +
>  	/* Assign a unique device index and name */
>  	rproc->index = ida_simple_get(&rproc_dev_index, 0, 0, GFP_KERNEL);
>  	if (rproc->index < 0) {
> @@ -2388,15 +2406,25 @@ EXPORT_SYMBOL(rproc_put);
>   * of the outstanding reference created by rproc_alloc. To decrement that
>   * one last refcount, one still needs to call rproc_free().
>   *
> - * Returns 0 on success and -EINVAL if @rproc isn't valid.
> + * Returns 0 on success and a negative error code on failure.
>   */
>  int rproc_del(struct rproc *rproc)
>  {
> +	int ret;
> +
>  	if (!rproc)
>  		return -EINVAL;
>  
> -	/* TODO: make sure this works with rproc->power > 1 */
> -	rproc_shutdown(rproc);
> +	/*
> +	 * TODO: make sure this works with rproc->power > 1
> +	 *
> +	 * No need to check rproc->state right away, it will be done in either
> +	 * rproc_detach() or rproc_shutdown().
> +	 */
> +	if (rproc->autonomous_on_core_shutdown)
> +		ret = rproc_detach(rproc);
> +	else
> +		ret = rproc_shutdown(rproc);
>  
>  	mutex_lock(&rproc->lock);
>  	rproc->state = RPROC_DELETED;
> @@ -2415,7 +2443,7 @@ int rproc_del(struct rproc *rproc)
>  
>  	device_del(&rproc->dev);
>  
> -	return 0;
> +	return ret;
>  }
>  EXPORT_SYMBOL(rproc_del);
>  
> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> index aa5bceb72015..012bebbd324b 100644
> --- a/include/linux/remoteproc.h
> +++ b/include/linux/remoteproc.h
> @@ -519,6 +519,9 @@ struct rproc_dump_segment {
>   * @nb_vdev: number of vdev currently handled by rproc
>   * @char_dev: character device of the rproc
>   * @cdev_put_on_release: flag to indicate if remoteproc should be shutdown on @char_dev release
> + * @autonomous_on_core_shutdown: true if the remote processor should be detached
> + *				 from (rather than turned off) when the remoteproc
> + *				 core goes away.
>   */
>  struct rproc {
>  	struct list_head node;
> @@ -557,6 +560,7 @@ struct rproc {
>  	u16 elf_machine;
>  	struct cdev cdev;
>  	bool cdev_put_on_release;
> +	bool autonomous_on_core_shutdown;
>  };
>  
>  /**
>
Arnaud POULIQUEN Jan. 27, 2021, 9:21 a.m. UTC | #5
Hi Mathieu

On 12/18/20 6:32 PM, Mathieu Poirier wrote:
> Following the work done here [1], this set provides support for the
> remoteproc core to release resources associated with a remote processor
> without having to switch it off. That way a platform driver can be removed
> or the application processor power cycled while the remote processor is
> still operating.
> 
> Of special interest in this series are patches 5 and 6 where getting the
> address of the resource table installed by an eternal entity if moved to
> the core.  This is to support scenarios where a remote process has been
> booted by the core but is being detached.  To re-attach the remote
> processor, the address of the resource table needs to be known at a later
> time than the platform driver's probe() function.
> 
> Applies cleanly on v5.10
> 
> Thanks,
> Mathieu
> 
> [1]. https://lkml.org/lkml/2020/7/14/1600
> 
> ----
> New for v4:
> - Made binding description OS agnostic (Rob)
> - Added functionality to set the external resource table in the core
> - Fixed a crash when detaching (Arnaud)
> - Fixed error code propagation in rproc_cdev_relase() and rproc_del() (Arnaud)
> - Added RB tags


I tested you series, attach and  detach is working well.

Then I faced issue when tried to re-attach after a detach.

But I don't know if this feature has to be supported in this step.

The 2 issues I found are:

1) memory carveouts are released on detach so need to be reinitialized.
The use of prepare/unprepare for the attach and detach would solve the issue but
probably need to add parameter to differentiate a start/stop from a attach/detach.

2) The vrings in the loaded resource table (so no cached) has to be properly
reinitialized. In rproc_free_vring  the vring da is set to 0 that is then
considered as a fixed address.

Here is a fix which works on the stm32 platform

@@ -425,7 +425,7 @@ void rproc_free_vring(struct rproc_vring *rvring)
 	 */
 	if (rproc->table_ptr) {
 		rsc = (void *)rproc->table_ptr + rvring->rvdev->rsc_offset;
-		rsc->vring[idx].da = 0;
+		rsc->vring[idx].da = FW_RSC_ADDR_ANY;
 		rsc->vring[idx].notifyid = -1;
 	}
 }

Here, perhaps a better alternative would be to make a cached copy on attach
before updating it. On the next attach, the cached copy would be copied as it is
done in rproc_start.

Thanks,
Arnaud


> 
> Mathieu Poirier (17):
>   dt-bindings: remoteproc: Add bindind to support autonomous processors
>   remoteproc: Re-check state in rproc_shutdown()
>   remoteproc: Remove useless check in rproc_del()
>   remoteproc: Rename function rproc_actuate()
>   remoteproc: Add new get_loaded_rsc_table() remoteproc operation
>   remoteproc: stm32: Move resource table setup to rproc_ops
>   remoteproc: Add new RPROC_ATTACHED state
>   remoteproc: Properly represent the attached state
>   remoteproc: Properly deal with a kernel panic when attached
>   remoteproc: Add new detach() remoteproc operation
>   remoteproc: Introduce function __rproc_detach()
>   remoteproc: Introduce function rproc_detach()
>   remoteproc: Add return value to function rproc_shutdown()
>   remoteproc: Properly deal with a stop request when attached
>   remoteproc: Properly deal with a start request when attached
>   remoteproc: Properly deal with detach request
>   remoteproc: Refactor rproc delete and cdev release path
> 
>  .../bindings/remoteproc/remoteproc-core.yaml  |  27 +++
>  drivers/remoteproc/remoteproc_cdev.c          |  32 ++-
>  drivers/remoteproc/remoteproc_core.c          | 211 +++++++++++++++---
>  drivers/remoteproc/remoteproc_internal.h      |   8 +
>  drivers/remoteproc/remoteproc_sysfs.c         |  20 +-
>  drivers/remoteproc/stm32_rproc.c              | 147 ++++++------
>  include/linux/remoteproc.h                    |  24 +-
>  7 files changed, 344 insertions(+), 125 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/remoteproc/remoteproc-core.yaml
>
Mathieu Poirier Jan. 29, 2021, 9:37 p.m. UTC | #6
On Wed, Jan 27, 2021 at 09:44:28AM +0100, Arnaud POULIQUEN wrote:
> Hi Mathieu,
> 
> Come back on you series...
> 
> On 12/18/20 6:32 PM, Mathieu Poirier wrote:
> > Add an new get_loaded_rsc_table() operation in order to support
> > scenarios where the remoteproc core has booted a remote processor
> > and detaches from it.  When re-attaching to the remote processor,
> > the core needs to know where the resource table has been placed
> > in memory.
> > 
> > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > ---
> >  drivers/remoteproc/remoteproc_core.c     | 6 ++++++
> >  drivers/remoteproc/remoteproc_internal.h | 8 ++++++++
> >  include/linux/remoteproc.h               | 5 ++++-
> >  3 files changed, 18 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > index d0f6b39b56f9..3d87c910aca7 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -1556,6 +1556,12 @@ static int rproc_attach(struct rproc *rproc)
> >  		return ret;
> >  	}
> >  
> > +	ret = rproc_get_loaded_rsc_table(rproc);
> > +	if (ret) {
> > +		dev_err(dev, "can't load resource table: %d\n", ret);
> > +		goto disable_iommu;
> > +	}
> > +
> 
> This function is rather ambiguous. Without the example of stm32, it is not
> obvious what the platform driver has to do in this ops. And the update of rproc
> in the in the core instead of in platform driver seems to me more reliable.
> 
> Here is a suggestion considering that ->cached_table is always NULL:
> 
> 
> struct resource_table *rproc_get_loaded_rsc_table(struct rproc *rproc,
>                                                   size_t* size)
> {
> 
> 	if (rproc->ops->get_loaded_rsc_table) {
> 		return rproc->ops->get_loaded_rsc_table(rproc, size);
> 
> 	*size = 0;
> 	return NULL;
> }
> 
> then in rproc_attach:
> 
> 	table_ptr = rproc_get_loaded_rsc_table(rproc, &tab_size);
> 	if (PTR_ERR(table_ptr) {
> 		dev_err(dev, "can't load resource table: %d\n", ret);
> 		goto disable_iommu;
> 	}
>  	rproc->cached_table = NULL;
>  	rproc->table_ptr = table_ptr;
>  	rproc->table_sz = table_sz;
>

Much better yes, thanks for the suggestion.
 
> 
> Thanks,
> Arnaud
> 
> >  	/* reset max_notifyid */
> >  	rproc->max_notifyid = -1;
> >  
> > diff --git a/drivers/remoteproc/remoteproc_internal.h b/drivers/remoteproc/remoteproc_internal.h
> > index c34002888d2c..c48b301d6ad1 100644
> > --- a/drivers/remoteproc/remoteproc_internal.h
> > +++ b/drivers/remoteproc/remoteproc_internal.h
> > @@ -177,6 +177,14 @@ struct resource_table *rproc_find_loaded_rsc_table(struct rproc *rproc,
> >  	return NULL;
> >  }
> >  
> > +static inline int rproc_get_loaded_rsc_table(struct rproc *rproc)
> > +{
> > +	if (rproc->ops->get_loaded_rsc_table)
> > +		return rproc->ops->get_loaded_rsc_table(rproc);
> > +
> > +	return 0;
> > +}
> > +
> >  static inline
> >  bool rproc_u64_fit_in_size_t(u64 val)
> >  {
> > diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> > index 3fa3ba6498e8..571615e77e6f 100644
> > --- a/include/linux/remoteproc.h
> > +++ b/include/linux/remoteproc.h
> > @@ -368,7 +368,9 @@ enum rsc_handling_status {
> >   * RSC_HANDLED if resource was handled, RSC_IGNORED if not handled and a
> >   * negative value on error
> >   * @load_rsc_table:	load resource table from firmware image
> > - * @find_loaded_rsc_table: find the loaded resouce table
> > + * @find_loaded_rsc_table: find the loaded resource table from firmware image
> > + * @get_loaded_rsc_table: get resource table installed in memory
> > + *			  by external entity
> >   * @load:		load firmware to memory, where the remote processor
> >   *			expects to find it
> >   * @sanity_check:	sanity check the fw image
> > @@ -389,6 +391,7 @@ struct rproc_ops {
> >  			  int offset, int avail);
> >  	struct resource_table *(*find_loaded_rsc_table)(
> >  				struct rproc *rproc, const struct firmware *fw);
> > +	int (*get_loaded_rsc_table)(struct rproc *rproc);
> >  	int (*load)(struct rproc *rproc, const struct firmware *fw);
> >  	int (*sanity_check)(struct rproc *rproc, const struct firmware *fw);
> >  	u64 (*get_boot_addr)(struct rproc *rproc, const struct firmware *fw);
> >
Mathieu Poirier Jan. 29, 2021, 10:17 p.m. UTC | #7
On Wed, Jan 27, 2021 at 09:46:58AM +0100, Arnaud POULIQUEN wrote:
> 
> 
> On 12/18/20 6:32 PM, Mathieu Poirier wrote:
> > Introduce function __rproc_detach() to perform the same kind of
> > operation as rproc_stop(), but instead of switching off the
> > remote processor using rproc->ops->stop(), it uses
> > rproc->ops->detach().  That way it is possible for the core
> > to release the resources associated with a remote processor while
> > the latter is kept operating.
> > 
> > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > Reviewed-by: Peng Fan <peng.fan@nxp.com>
> > ---
> >  drivers/remoteproc/remoteproc_core.c | 42 ++++++++++++++++++++++++++++
> >  1 file changed, 42 insertions(+)
> > 
> > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > index fc28053c7f89..e665ed4776c3 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -1670,6 +1670,48 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
> >  	return 0;
> >  }
> >  
> > +/*
> > + * __rproc_detach(): Does the opposite of rproc_attach()
> > + */
> > +static int __maybe_unused __rproc_detach(struct rproc *rproc)
> > +{
> > +	struct device *dev = &rproc->dev;
> > +	int ret;
> > +
> > +	/* No need to continue if a detach() operation has not been provided */
> > +	if (!rproc->ops->detach)
> > +		return -EINVAL;
> 
> I wonder if this ops should be optional.

Function rproc_validate() doesn't check for it so it is optional.  Returning an
error is to indicate to sysfs the operation is not supported if someone tries to
do a "detach" when rproc::ops doesn't provide it.

> 
> > +
> > +	/* Stop any subdevices for the remote processor */
> > +	rproc_stop_subdevices(rproc, false);
> > +
> > +	/*
> > +	 * If the remote processors was started by the core then a cached_table
> > +	 * is present and we must follow the same cleanup sequence as we would
> > +	 * for a shutdown().  As it is in rproc_stop(), use the cached resource
> > +	 * table for the rest of the detach process since ->table_ptr will
> > +	 * become invalid as soon as carveouts are released in
> > +	 * rproc_resource_cleanup().
> > +	 */
> > +	if (rproc->cached_table)
> > +		rproc->table_ptr = rproc->cached_table;
> > +
> > +	/* Tell the remote processor the core isn't available anymore */
> > +	ret = rproc->ops->detach(rproc);
> > +	if (ret) {
> > +		dev_err(dev, "can't detach from rproc: %d\n", ret);
> > +		rproc_start_subdevices(rproc);
> 
> Not sure that this would be possible in all cases, without a unprepare and
> prepare. What about having the same behavior as the rproc_stop failure?

I thought rproc_stop()'s failure path was buggy and could be improved but as you
say, there might be other ramifications to take into account.  I agree that it
is more prudent to follow the current behavior from rproc_stop() and leave
enhancements for another patchset.

> 
> Thanks
> Arnaud.
> 
> > +		return ret;
> > +	}
> > +
> > +	rproc_unprepare_subdevices(rproc);
> > +
> > +	rproc->state = RPROC_DETACHED;
> > +
> > +	dev_info(dev, "detached remote processor %s\n", rproc->name);
> > +
> > +	return 0;
> > +}
> >  
> >  /**
> >   * rproc_trigger_recovery() - recover a remoteproc
> >
Mathieu Poirier Jan. 29, 2021, 10:31 p.m. UTC | #8
On Wed, Jan 27, 2021 at 09:50:31AM +0100, Arnaud POULIQUEN wrote:
> 
> 
> On 12/18/20 6:32 PM, Mathieu Poirier wrote:
> > Introduce function rproc_detach() to enable the remoteproc
> > core to release the resources associated with a remote processor
> > without stopping its operation.
> > 
> > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > Reviewed-by: Peng Fan <peng.fan@nxp.com>
> > ---
> >  drivers/remoteproc/remoteproc_core.c | 71 +++++++++++++++++++++++++++-
> >  include/linux/remoteproc.h           |  2 +
> >  2 files changed, 72 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > index e665ed4776c3..ece3f15070b9 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -1673,7 +1673,7 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
> >  /*
> >   * __rproc_detach(): Does the opposite of rproc_attach()
> >   */
> > -static int __maybe_unused __rproc_detach(struct rproc *rproc)
> > +static int __rproc_detach(struct rproc *rproc)
> >  {
> >  	struct device *dev = &rproc->dev;
> >  	int ret;
> > @@ -1927,6 +1927,75 @@ void rproc_shutdown(struct rproc *rproc)
> >  }
> >  EXPORT_SYMBOL(rproc_shutdown);
> >  
> > +/**
> > + * rproc_detach() - Detach the remote processor from the
> > + * remoteproc core
> > + *
> > + * @rproc: the remote processor
> > + *
> > + * Detach a remote processor (previously attached to with rproc_actuate()).
> 
> You rename the function to rproc_attach in you patch 04/17.
> 

Yes, good catch.

> Then Reviewed-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>
> 
> Thanks,
> Arnaud
> 
> > + *
> > + * In case @rproc is still being used by an additional user(s), then
> > + * this function will just decrement the power refcount and exit,
> > + * without disconnecting the device.
> > + *
> > + * Function rproc_detach() calls __rproc_detach() in order to let a remote
> > + * processor know that services provided by the application processor are
> > + * no longer available.  From there it should be possible to remove the
> > + * platform driver and even power cycle the application processor (if the HW
> > + * supports it) without needing to switch off the remote processor.
> > + */
> > +int rproc_detach(struct rproc *rproc)
> > +{
> > +	struct device *dev = &rproc->dev;
> > +	int ret;
> > +
> > +	ret = mutex_lock_interruptible(&rproc->lock);
> > +	if (ret) {
> > +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
> > +		return ret;
> > +	}
> > +
> > +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
> > +		ret = -EPERM;
> > +		goto out;
> > +	}
> > +
> > +	/* if the remote proc is still needed, bail out */
> > +	if (!atomic_dec_and_test(&rproc->power)) {
> > +		ret = -EBUSY;
> > +		goto out;
> > +	}
> > +
> > +	ret = __rproc_detach(rproc);
> > +	if (ret) {
> > +		atomic_inc(&rproc->power);
> > +		goto out;
> > +	}
> > +
> > +	/* clean up all acquired resources */
> > +	rproc_resource_cleanup(rproc);
> > +
> > +	rproc_disable_iommu(rproc);
> > +
> > +	/*
> > +	 * If the remote processor was booted by the core the cached table needs
> > +	 * to be freed and ->table_ptr set to NULL because it will be
> > +	 * invalidated by rproc_resource_cleanup().  If the remote processor was
> > +	 * attached to ->cached_table is NULL and kfree() returns right away.
> > +	 *
> > +	 * In either case ->table_ptr has to be set to NULL.  It will be set
> > +	 * again when the remote processor is re-attached to.
> > +	 */
> > +	kfree(rproc->cached_table);
> > +	rproc->cached_table = NULL;
> > +	rproc->table_ptr = NULL;
> > +out:
> > +	mutex_unlock(&rproc->lock);
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(rproc_detach);
> > +
> >  /**
> >   * rproc_get_by_phandle() - find a remote processor by phandle
> >   * @phandle: phandle to the rproc
> > diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> > index 9bb34c3eb847..65ece6f177b7 100644
> > --- a/include/linux/remoteproc.h
> > +++ b/include/linux/remoteproc.h
> > @@ -659,6 +659,8 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
> >  
> >  int rproc_boot(struct rproc *rproc);
> >  void rproc_shutdown(struct rproc *rproc);
> > +int rproc_detach(struct rproc *rproc);
> > +int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
> >  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
> >  int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
> >  int rproc_coredump_add_custom_segment(struct rproc *rproc,
> >
Mathieu Poirier Feb. 2, 2021, 12:49 a.m. UTC | #9
On Wed, Jan 27, 2021 at 10:21:24AM +0100, Arnaud POULIQUEN wrote:
> Hi Mathieu
> 
> On 12/18/20 6:32 PM, Mathieu Poirier wrote:
> > Following the work done here [1], this set provides support for the
> > remoteproc core to release resources associated with a remote processor
> > without having to switch it off. That way a platform driver can be removed
> > or the application processor power cycled while the remote processor is
> > still operating.
> > 
> > Of special interest in this series are patches 5 and 6 where getting the
> > address of the resource table installed by an eternal entity if moved to
> > the core.  This is to support scenarios where a remote process has been
> > booted by the core but is being detached.  To re-attach the remote
> > processor, the address of the resource table needs to be known at a later
> > time than the platform driver's probe() function.
> > 
> > Applies cleanly on v5.10
> > 
> > Thanks,
> > Mathieu
> > 
> > [1]. https://lkml.org/lkml/2020/7/14/1600
> > 
> > ----
> > New for v4:
> > - Made binding description OS agnostic (Rob)
> > - Added functionality to set the external resource table in the core
> > - Fixed a crash when detaching (Arnaud)
> > - Fixed error code propagation in rproc_cdev_relase() and rproc_del() (Arnaud)
> > - Added RB tags
> 
> 
> I tested you series, attach and  detach is working well.
> 
> Then I faced issue when tried to re-attach after a detach.
>

Right, in this case don't expect the re-attach to work properly because function
stm32_rproc_detach() does not exist.  As such the M4 doesn't put itself back
in "wait-for-attach" mode as it does when booted by the boot loader.  If I
remember correctly we talked about that during an earlier conversation and we
agreed FW support would be needed to properly test the re-attach.
 
> But I don't know if this feature has to be supported in this step.
> 
> The 2 issues I found are:
> 
> 1) memory carveouts are released on detach so need to be reinitialized.
> The use of prepare/unprepare for the attach and detach would solve the issue but
> probably need to add parameter to differentiate a start/stop from a attach/detach.
> 
> 2) The vrings in the loaded resource table (so no cached) has to be properly
> reinitialized. In rproc_free_vring  the vring da is set to 0 that is then
> considered as a fixed address.
> 
> Here is a fix which works on the stm32 platform
> 
> @@ -425,7 +425,7 @@ void rproc_free_vring(struct rproc_vring *rvring)
>  	 */
>  	if (rproc->table_ptr) {
>  		rsc = (void *)rproc->table_ptr + rvring->rvdev->rsc_offset;
> -		rsc->vring[idx].da = 0;
> +		rsc->vring[idx].da = FW_RSC_ADDR_ANY;
>  		rsc->vring[idx].notifyid = -1;
>  	}
>  }

In light of the above let me know if these two issues are still relevant.  If
so I'll investigate further.

Thanks,
Mathieu

> 
> Here, perhaps a better alternative would be to make a cached copy on attach
> before updating it. On the next attach, the cached copy would be copied as it is
> done in rproc_start.
> 
> Thanks,
> Arnaud
> 
> 
> > 
> > Mathieu Poirier (17):
> >   dt-bindings: remoteproc: Add bindind to support autonomous processors
> >   remoteproc: Re-check state in rproc_shutdown()
> >   remoteproc: Remove useless check in rproc_del()
> >   remoteproc: Rename function rproc_actuate()
> >   remoteproc: Add new get_loaded_rsc_table() remoteproc operation
> >   remoteproc: stm32: Move resource table setup to rproc_ops
> >   remoteproc: Add new RPROC_ATTACHED state
> >   remoteproc: Properly represent the attached state
> >   remoteproc: Properly deal with a kernel panic when attached
> >   remoteproc: Add new detach() remoteproc operation
> >   remoteproc: Introduce function __rproc_detach()
> >   remoteproc: Introduce function rproc_detach()
> >   remoteproc: Add return value to function rproc_shutdown()
> >   remoteproc: Properly deal with a stop request when attached
> >   remoteproc: Properly deal with a start request when attached
> >   remoteproc: Properly deal with detach request
> >   remoteproc: Refactor rproc delete and cdev release path
> > 
> >  .../bindings/remoteproc/remoteproc-core.yaml  |  27 +++
> >  drivers/remoteproc/remoteproc_cdev.c          |  32 ++-
> >  drivers/remoteproc/remoteproc_core.c          | 211 +++++++++++++++---
> >  drivers/remoteproc/remoteproc_internal.h      |   8 +
> >  drivers/remoteproc/remoteproc_sysfs.c         |  20 +-
> >  drivers/remoteproc/stm32_rproc.c              | 147 ++++++------
> >  include/linux/remoteproc.h                    |  24 +-
> >  7 files changed, 344 insertions(+), 125 deletions(-)
> >  create mode 100644 Documentation/devicetree/bindings/remoteproc/remoteproc-core.yaml
> >
Mathieu Poirier Feb. 2, 2021, 10:42 p.m. UTC | #10
On Tue, Feb 02, 2021 at 09:54:13AM +0100, Arnaud POULIQUEN wrote:
> 
> 
> On 2/2/21 1:49 AM, Mathieu Poirier wrote:
> > On Wed, Jan 27, 2021 at 10:21:24AM +0100, Arnaud POULIQUEN wrote:
> >> Hi Mathieu
> >>
> >> On 12/18/20 6:32 PM, Mathieu Poirier wrote:
> >>> Following the work done here [1], this set provides support for the
> >>> remoteproc core to release resources associated with a remote processor
> >>> without having to switch it off. That way a platform driver can be removed
> >>> or the application processor power cycled while the remote processor is
> >>> still operating.
> >>>
> >>> Of special interest in this series are patches 5 and 6 where getting the
> >>> address of the resource table installed by an eternal entity if moved to
> >>> the core.  This is to support scenarios where a remote process has been
> >>> booted by the core but is being detached.  To re-attach the remote
> >>> processor, the address of the resource table needs to be known at a later
> >>> time than the platform driver's probe() function.
> >>>
> >>> Applies cleanly on v5.10
> >>>
> >>> Thanks,
> >>> Mathieu
> >>>
> >>> [1]. https://lkml.org/lkml/2020/7/14/1600
> >>>
> >>> ----
> >>> New for v4:
> >>> - Made binding description OS agnostic (Rob)
> >>> - Added functionality to set the external resource table in the core
> >>> - Fixed a crash when detaching (Arnaud)
> >>> - Fixed error code propagation in rproc_cdev_relase() and rproc_del() (Arnaud)
> >>> - Added RB tags
> >>
> >>
> >> I tested you series, attach and  detach is working well.
> >>
> >> Then I faced issue when tried to re-attach after a detach.
> >>
> > 
> > Right, in this case don't expect the re-attach to work properly because function
> > stm32_rproc_detach() does not exist.  As such the M4 doesn't put itself back
> > in "wait-for-attach" mode as it does when booted by the boot loader.  If I
> > remember correctly we talked about that during an earlier conversation and we
> > agreed FW support would be needed to properly test the re-attach.
> 
> Yes you are right the remote firmware needs to be inform about the detach, and
> this is the purpose of the detach ops.
> But also some actions are missing on local side as some resources have also to
> be reinitialized as described in my previous mail.
> For instance the resource table is handled by the remoteproc framework. The
> remote firmware should only have a read access to this table.
> 
> >  
> >> But I don't know if this feature has to be supported in this step.
> >>
> >> The 2 issues I found are:
> >>
> >> 1) memory carveouts are released on detach so need to be reinitialized.
> >> The use of prepare/unprepare for the attach and detach would solve the issue but
> >> probably need to add parameter to differentiate a start/stop from a attach/detach.

I am in agreement with you assesment and the patch you have proposed to fix it.
Right now carveouts are properly handled between start and stop operations
because their parsing and allocation is done as part for ops:parse_fw(), which
is called for each iteration.  Moving the parsing and allocation to
rproc_prepare_device() ends up doing the same thing.

I am not sure we absolutely need an extra parameter to differentiate between
start/stop and attach/detach.  Typically the rproc_prepare_device() is used to
do some kind of setup like providing power to a memory bank.  I suppose that if
a memory bank is already powered by the boot loader, asking to power it up again
won't do anything.

As such I suggest we keep the parameters as they are now.  We can revisit if a
use case comes up at a later time. 

> >>
> >> 2) The vrings in the loaded resource table (so no cached) has to be properly
> >> reinitialized. In rproc_free_vring  the vring da is set to 0 that is then
> >> considered as a fixed address.
> >>
> >> Here is a fix which works on the stm32 platform
> >>
> >> @@ -425,7 +425,7 @@ void rproc_free_vring(struct rproc_vring *rvring)
> >>  	 */
> >>  	if (rproc->table_ptr) {
> >>  		rsc = (void *)rproc->table_ptr + rvring->rvdev->rsc_offset;
> >> -		rsc->vring[idx].da = 0;
> >> +		rsc->vring[idx].da = FW_RSC_ADDR_ANY;
> >>  		rsc->vring[idx].notifyid = -1;
> >>  	}
> >>  }

I see why this could be needed.  I initially assumed the remote processor would
install a new resource table in memory upon receiving notification the core is
going away.  But upon reflection the remote processor may not even have access
to the image it is running.

Let me think about this further - I want to make sure we don't introduce a
regression for current implementations.

> > 
> > In light of the above let me know if these two issues are still relevant.  If
> > so I'll investigate further.
> 
> To highlight the issue just test attach/detach/attch  with a firmware that
> implements a RPMsg communication. On the second attach the virtio framework is
> not properly restarted.
> 
> Then please find at the end of the mail 3 patches for test I added on top of
> your series,that allow me to reattach. Of course the RPMsg channels are not
> re-created as i don't implement the remote FW part, but the Linux virtio and
> RPmsg frameworks are restarted.
> 
> - [PATCH 1/3] remoteproc: stm32: add capability to detach from the remoteproc
>   => Add a dummy function in stm32_rproc for test.
> - [PATCH 2/3] remoteproc: Add prepare/unprepare for attach detach
>   => Add prepare/unprepare on attach/detach + implement attach in stm32mp1 to
>      reinitialize the memory region that as been cleaned on detach.
> - [PATCH 3/3] remoteproc: virtio: set to vring address to FW_RSC_ADDR_ANY on free
>   => Reinitialize the vring addresses on detach. For this one a better
>      implementation would be to use a cached resource table to fully
>      reinitialize it on re-attach.
> 
> Thanks,
> Arnaud
> 
> > 
> > Thanks,
> > Mathieu
> > 
> >>
> >> Here, perhaps a better alternative would be to make a cached copy on attach
> >> before updating it. On the next attach, the cached copy would be copied as it is
> >> done in rproc_start.
> >>
> >> Thanks,
> >> Arnaud
> >>
> >>
> >>>
> >>> Mathieu Poirier (17):
> >>>   dt-bindings: remoteproc: Add bindind to support autonomous processors
> >>>   remoteproc: Re-check state in rproc_shutdown()
> >>>   remoteproc: Remove useless check in rproc_del()
> >>>   remoteproc: Rename function rproc_actuate()
> >>>   remoteproc: Add new get_loaded_rsc_table() remoteproc operation
> >>>   remoteproc: stm32: Move resource table setup to rproc_ops
> >>>   remoteproc: Add new RPROC_ATTACHED state
> >>>   remoteproc: Properly represent the attached state
> >>>   remoteproc: Properly deal with a kernel panic when attached
> >>>   remoteproc: Add new detach() remoteproc operation
> >>>   remoteproc: Introduce function __rproc_detach()
> >>>   remoteproc: Introduce function rproc_detach()
> >>>   remoteproc: Add return value to function rproc_shutdown()
> >>>   remoteproc: Properly deal with a stop request when attached
> >>>   remoteproc: Properly deal with a start request when attached
> >>>   remoteproc: Properly deal with detach request
> >>>   remoteproc: Refactor rproc delete and cdev release path
> >>>
> >>>  .../bindings/remoteproc/remoteproc-core.yaml  |  27 +++
> >>>  drivers/remoteproc/remoteproc_cdev.c          |  32 ++-
> >>>  drivers/remoteproc/remoteproc_core.c          | 211 +++++++++++++++---
> >>>  drivers/remoteproc/remoteproc_internal.h      |   8 +
> >>>  drivers/remoteproc/remoteproc_sysfs.c         |  20 +-
> >>>  drivers/remoteproc/stm32_rproc.c              | 147 ++++++------
> >>>  include/linux/remoteproc.h                    |  24 +-
> >>>  7 files changed, 344 insertions(+), 125 deletions(-)
> >>>  create mode 100644 Documentation/devicetree/bindings/remoteproc/remoteproc-core.yaml
> >>>
> 
> Subject: [PATCH 1/3] remoteproc: stm32: add capability to detach from the
>  remoteproc
> 
> Add a dummy function to allow to detach. No specific action is needed
> 
> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> ---
>  drivers/remoteproc/stm32_rproc.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/remoteproc/stm32_rproc.c b/drivers/remoteproc/stm32_rproc.c
> index 2c949725b91e..b325d28f627c 100644
> --- a/drivers/remoteproc/stm32_rproc.c
> +++ b/drivers/remoteproc/stm32_rproc.c
> @@ -590,6 +590,12 @@ static int stm32_rproc_attach(struct rproc *rproc)
>  	return reset_control_assert(ddata->hold_boot);
>  }
> 
> +static int stm32_rproc_detach(struct rproc *rproc)
> +{
> +	/* Nothing to do but ops mandatory to support the detach feature */
> +	return 0;
> +}
> +
>  static int stm32_rproc_stop(struct rproc *rproc)
>  {
>  	struct stm32_rproc *ddata = rproc->priv;
> @@ -712,6 +718,7 @@ static struct rproc_ops st_rproc_ops = {
>  	.start		= stm32_rproc_start,
>  	.stop		= stm32_rproc_stop,
>  	.attach		= stm32_rproc_attach,
> +	.detach		= stm32_rproc_detach,
>  	.kick		= stm32_rproc_kick,
>  	.load		= rproc_elf_load_segments,
>  	.parse_fw	= stm32_rproc_parse_fw,
> -- 
> 2.17.1
> 
> 
> ------------------------------------------------------------------------
> 
> 
> Subject: [PATCH 2/3] remoteproc: Add prepare/unprepare for attach detach
> 
> Some actions such as memory resources reallocation are needed when try
> to reattach. Use the prepare ops for these actions.
> 
> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
> ---
>  drivers/remoteproc/remoteproc_core.c | 14 ++++++++++++++
>  drivers/remoteproc/stm32_rproc.c     | 14 +++++++-------
>  2 files changed, 21 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c
> b/drivers/remoteproc/remoteproc_core.c
> index f1f51ad1a1d6..f177561b8863 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1557,6 +1557,13 @@ static int rproc_attach(struct rproc *rproc)
>  		return ret;
>  	}
> 
> +	/* Prepare rproc for firmware loading if needed */
> +	ret = rproc_prepare_device(rproc);
> +	if (ret) {
> +		dev_err(dev, "can't prepare rproc %s: %d\n", rproc->name, ret);
> +		goto disable_iommu;
> +	}
> +
>  	ret = rproc_get_loaded_rsc_table(rproc);
>  	if (ret) {
>  		dev_err(dev, "can't load resource table: %d\n", ret);
> @@ -1990,6 +1997,13 @@ int rproc_detach(struct rproc *rproc)
>  	/* clean up all acquired resources */
>  	rproc_resource_cleanup(rproc);
> 
> +	/* Release HW resources if needed */
> +	ret = rproc_unprepare_device(rproc);
> +	if (ret) {
> +		atomic_inc(&rproc->power);
> +		goto out;
> +	}
> +
>  	rproc_disable_iommu(rproc);
> 
>  	/*
> diff --git a/drivers/remoteproc/stm32_rproc.c b/drivers/remoteproc/stm32_rproc.c
> index b325d28f627c..bf50d79b1f09 100644
> --- a/drivers/remoteproc/stm32_rproc.c
> +++ b/drivers/remoteproc/stm32_rproc.c
> @@ -413,9 +413,6 @@ static int stm32_rproc_parse_fw(struct rproc *rproc, const
> struct firmware *fw)
>  	struct stm32_rproc *ddata = rproc->priv;
>  	int ret;
> 
> -	ret  = stm32_rproc_parse_memory_regions(rproc);
> -	if (ret)
> -		return ret;
> 
>  	if (ddata->trproc)
>  		ret = rproc_tee_get_rsc_table(ddata->trproc);
> @@ -580,6 +577,12 @@ static int stm32_rproc_start(struct rproc *rproc)
> 
>  	return reset_control_assert(ddata->hold_boot);
>  }
> +static int stm32_rproc_prepare(struct rproc *rproc)
> +{
> +	dev_err(&rproc->dev, "%s: %d\n", __func__, __LINE__);
> +
> +	return stm32_rproc_parse_memory_regions(rproc);
> +}
> 
>  static int stm32_rproc_attach(struct rproc *rproc)
>  {
> @@ -717,6 +720,7 @@ static int stm32_rproc_get_loaded_rsc_table(struct rproc *rproc)
>  static struct rproc_ops st_rproc_ops = {
>  	.start		= stm32_rproc_start,
>  	.stop		= stm32_rproc_stop,
> +	.prepare	= stm32_rproc_prepare,
>  	.attach		= stm32_rproc_attach,
>  	.detach		= stm32_rproc_detach,
>  	.kick		= stm32_rproc_kick,
> @@ -921,10 +925,6 @@ static int stm32_rproc_probe(struct platform_device *pdev)
> 
>  	if (state == M4_STATE_CRUN) {
>  		rproc->state = RPROC_DETACHED;
> -
> -		ret = stm32_rproc_parse_memory_regions(rproc);
> -		if (ret)
> -			goto free_resources;
>  	}
> 
>  	rproc->has_iommu = false;
> -- 
> 2.17.1
> 
> 
> ------------------------------------------------------------------------
> 
> Subject: [PATCH 3/3] remoteproc: virtio: set to vring address to
>  FW_RSC_ADDR_ANY on free
> 
> The resource table vring structure is cleaned on free. But value is set
> to 0. This value is considered as a valid address. Set the value
> to  FW_RSC_ADDR_ANY instead.
> This is needed to allow to reattach to an autonomous firmware.
> An alternative would be to save the resource table before updating it.
> On free the value would be reset to initial value.
> 
> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
> ---
>  drivers/remoteproc/remoteproc_core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c
> b/drivers/remoteproc/remoteproc_core.c
> index f177561b8863..5b5de4db3981 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -425,7 +425,7 @@ void rproc_free_vring(struct rproc_vring *rvring)
>  	 */
>  	if (rproc->table_ptr) {
>  		rsc = (void *)rproc->table_ptr + rvring->rvdev->rsc_offset;
> -		rsc->vring[idx].da = 0;
> +		rsc->vring[idx].da = FW_RSC_ADDR_ANY;
>  		rsc->vring[idx].notifyid = -1;
>  	}
>  }
> -- 
> 2.17.1
> 
> 
> 
> 
> 
>
Arnaud POULIQUEN Feb. 3, 2021, 7:58 a.m. UTC | #11
On 2/2/21 11:42 PM, Mathieu Poirier wrote:
> On Tue, Feb 02, 2021 at 09:54:13AM +0100, Arnaud POULIQUEN wrote:
>>
>>
>> On 2/2/21 1:49 AM, Mathieu Poirier wrote:
>>> On Wed, Jan 27, 2021 at 10:21:24AM +0100, Arnaud POULIQUEN wrote:
>>>> Hi Mathieu
>>>>
>>>> On 12/18/20 6:32 PM, Mathieu Poirier wrote:
>>>>> Following the work done here [1], this set provides support for the
>>>>> remoteproc core to release resources associated with a remote processor
>>>>> without having to switch it off. That way a platform driver can be removed
>>>>> or the application processor power cycled while the remote processor is
>>>>> still operating.
>>>>>
>>>>> Of special interest in this series are patches 5 and 6 where getting the
>>>>> address of the resource table installed by an eternal entity if moved to
>>>>> the core.  This is to support scenarios where a remote process has been
>>>>> booted by the core but is being detached.  To re-attach the remote
>>>>> processor, the address of the resource table needs to be known at a later
>>>>> time than the platform driver's probe() function.
>>>>>
>>>>> Applies cleanly on v5.10
>>>>>
>>>>> Thanks,
>>>>> Mathieu
>>>>>
>>>>> [1]. https://lkml.org/lkml/2020/7/14/1600
>>>>>
>>>>> ----
>>>>> New for v4:
>>>>> - Made binding description OS agnostic (Rob)
>>>>> - Added functionality to set the external resource table in the core
>>>>> - Fixed a crash when detaching (Arnaud)
>>>>> - Fixed error code propagation in rproc_cdev_relase() and rproc_del() (Arnaud)
>>>>> - Added RB tags
>>>>
>>>>
>>>> I tested you series, attach and  detach is working well.
>>>>
>>>> Then I faced issue when tried to re-attach after a detach.
>>>>
>>>
>>> Right, in this case don't expect the re-attach to work properly because function
>>> stm32_rproc_detach() does not exist.  As such the M4 doesn't put itself back
>>> in "wait-for-attach" mode as it does when booted by the boot loader.  If I
>>> remember correctly we talked about that during an earlier conversation and we
>>> agreed FW support would be needed to properly test the re-attach.
>>
>> Yes you are right the remote firmware needs to be inform about the detach, and
>> this is the purpose of the detach ops.
>> But also some actions are missing on local side as some resources have also to
>> be reinitialized as described in my previous mail.
>> For instance the resource table is handled by the remoteproc framework. The
>> remote firmware should only have a read access to this table.
>>
>>>  
>>>> But I don't know if this feature has to be supported in this step.
>>>>
>>>> The 2 issues I found are:
>>>>
>>>> 1) memory carveouts are released on detach so need to be reinitialized.
>>>> The use of prepare/unprepare for the attach and detach would solve the issue but
>>>> probably need to add parameter to differentiate a start/stop from a attach/detach.
> 
> I am in agreement with you assesment and the patch you have proposed to fix it.
> Right now carveouts are properly handled between start and stop operations
> because their parsing and allocation is done as part for ops:parse_fw(), which
> is called for each iteration.  Moving the parsing and allocation to
> rproc_prepare_device() ends up doing the same thing.
> 
> I am not sure we absolutely need an extra parameter to differentiate between
> start/stop and attach/detach.  Typically the rproc_prepare_device() is used to
> do some kind of setup like providing power to a memory bank.  I suppose that if
> a memory bank is already powered by the boot loader, asking to power it up again
> won't do anything.
> 
> As such I suggest we keep the parameters as they are now.  We can revisit if a
> use case comes up at a later time. 
> 

Your suggestion sound good to me.

>>>>
>>>> 2) The vrings in the loaded resource table (so no cached) has to be properly
>>>> reinitialized. In rproc_free_vring  the vring da is set to 0 that is then
>>>> considered as a fixed address.
>>>>
>>>> Here is a fix which works on the stm32 platform
>>>>
>>>> @@ -425,7 +425,7 @@ void rproc_free_vring(struct rproc_vring *rvring)
>>>>  	 */
>>>>  	if (rproc->table_ptr) {
>>>>  		rsc = (void *)rproc->table_ptr + rvring->rvdev->rsc_offset;
>>>> -		rsc->vring[idx].da = 0;
>>>> +		rsc->vring[idx].da = FW_RSC_ADDR_ANY;
>>>>  		rsc->vring[idx].notifyid = -1;
>>>>  	}
>>>>  }
> 
> I see why this could be needed.  I initially assumed the remote processor would
> install a new resource table in memory upon receiving notification the core is
> going away.  But upon reflection the remote processor may not even have access
> to the image it is running.

The risk here is that both processors try to access this table at the same time.

> 
> Let me think about this further - I want to make sure we don't introduce a
> regression for current implementations.

Just a precision: the vring DA can be fixed by the coprocessor firmware to
impose the address, my patch is not sufficent in such case for the reattachment.
That's why i suggested a copy of the resource table before updating it.

Thanks,
Arnaud

> 
>>>
>>> In light of the above let me know if these two issues are still relevant.  If
>>> so I'll investigate further.
>>
>> To highlight the issue just test attach/detach/attch  with a firmware that
>> implements a RPMsg communication. On the second attach the virtio framework is
>> not properly restarted.
>>
>> Then please find at the end of the mail 3 patches for test I added on top of
>> your series,that allow me to reattach. Of course the RPMsg channels are not
>> re-created as i don't implement the remote FW part, but the Linux virtio and
>> RPmsg frameworks are restarted.
>>
>> - [PATCH 1/3] remoteproc: stm32: add capability to detach from the remoteproc
>>   => Add a dummy function in stm32_rproc for test.
>> - [PATCH 2/3] remoteproc: Add prepare/unprepare for attach detach
>>   => Add prepare/unprepare on attach/detach + implement attach in stm32mp1 to
>>      reinitialize the memory region that as been cleaned on detach.
>> - [PATCH 3/3] remoteproc: virtio: set to vring address to FW_RSC_ADDR_ANY on free
>>   => Reinitialize the vring addresses on detach. For this one a better
>>      implementation would be to use a cached resource table to fully
>>      reinitialize it on re-attach.
>>
>> Thanks,
>> Arnaud
>>
>>>
>>> Thanks,
>>> Mathieu
>>>
>>>>
>>>> Here, perhaps a better alternative would be to make a cached copy on attach
>>>> before updating it. On the next attach, the cached copy would be copied as it is
>>>> done in rproc_start.
>>>>
>>>> Thanks,
>>>> Arnaud
>>>>
>>>>
>>>>>
>>>>> Mathieu Poirier (17):
>>>>>   dt-bindings: remoteproc: Add bindind to support autonomous processors
>>>>>   remoteproc: Re-check state in rproc_shutdown()
>>>>>   remoteproc: Remove useless check in rproc_del()
>>>>>   remoteproc: Rename function rproc_actuate()
>>>>>   remoteproc: Add new get_loaded_rsc_table() remoteproc operation
>>>>>   remoteproc: stm32: Move resource table setup to rproc_ops
>>>>>   remoteproc: Add new RPROC_ATTACHED state
>>>>>   remoteproc: Properly represent the attached state
>>>>>   remoteproc: Properly deal with a kernel panic when attached
>>>>>   remoteproc: Add new detach() remoteproc operation
>>>>>   remoteproc: Introduce function __rproc_detach()
>>>>>   remoteproc: Introduce function rproc_detach()
>>>>>   remoteproc: Add return value to function rproc_shutdown()
>>>>>   remoteproc: Properly deal with a stop request when attached
>>>>>   remoteproc: Properly deal with a start request when attached
>>>>>   remoteproc: Properly deal with detach request
>>>>>   remoteproc: Refactor rproc delete and cdev release path
>>>>>
>>>>>  .../bindings/remoteproc/remoteproc-core.yaml  |  27 +++
>>>>>  drivers/remoteproc/remoteproc_cdev.c          |  32 ++-
>>>>>  drivers/remoteproc/remoteproc_core.c          | 211 +++++++++++++++---
>>>>>  drivers/remoteproc/remoteproc_internal.h      |   8 +
>>>>>  drivers/remoteproc/remoteproc_sysfs.c         |  20 +-
>>>>>  drivers/remoteproc/stm32_rproc.c              | 147 ++++++------
>>>>>  include/linux/remoteproc.h                    |  24 +-
>>>>>  7 files changed, 344 insertions(+), 125 deletions(-)
>>>>>  create mode 100644 Documentation/devicetree/bindings/remoteproc/remoteproc-core.yaml
>>>>>
>>
>> Subject: [PATCH 1/3] remoteproc: stm32: add capability to detach from the
>>  remoteproc
>>
>> Add a dummy function to allow to detach. No specific action is needed
>>
>> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
>> ---
>>  drivers/remoteproc/stm32_rproc.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/remoteproc/stm32_rproc.c b/drivers/remoteproc/stm32_rproc.c
>> index 2c949725b91e..b325d28f627c 100644
>> --- a/drivers/remoteproc/stm32_rproc.c
>> +++ b/drivers/remoteproc/stm32_rproc.c
>> @@ -590,6 +590,12 @@ static int stm32_rproc_attach(struct rproc *rproc)
>>  	return reset_control_assert(ddata->hold_boot);
>>  }
>>
>> +static int stm32_rproc_detach(struct rproc *rproc)
>> +{
>> +	/* Nothing to do but ops mandatory to support the detach feature */
>> +	return 0;
>> +}
>> +
>>  static int stm32_rproc_stop(struct rproc *rproc)
>>  {
>>  	struct stm32_rproc *ddata = rproc->priv;
>> @@ -712,6 +718,7 @@ static struct rproc_ops st_rproc_ops = {
>>  	.start		= stm32_rproc_start,
>>  	.stop		= stm32_rproc_stop,
>>  	.attach		= stm32_rproc_attach,
>> +	.detach		= stm32_rproc_detach,
>>  	.kick		= stm32_rproc_kick,
>>  	.load		= rproc_elf_load_segments,
>>  	.parse_fw	= stm32_rproc_parse_fw,
>> -- 
>> 2.17.1
>>
>>
>> ------------------------------------------------------------------------
>>
>>
>> Subject: [PATCH 2/3] remoteproc: Add prepare/unprepare for attach detach
>>
>> Some actions such as memory resources reallocation are needed when try
>> to reattach. Use the prepare ops for these actions.
>>
>> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
>> ---
>>  drivers/remoteproc/remoteproc_core.c | 14 ++++++++++++++
>>  drivers/remoteproc/stm32_rproc.c     | 14 +++++++-------
>>  2 files changed, 21 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/remoteproc/remoteproc_core.c
>> b/drivers/remoteproc/remoteproc_core.c
>> index f1f51ad1a1d6..f177561b8863 100644
>> --- a/drivers/remoteproc/remoteproc_core.c
>> +++ b/drivers/remoteproc/remoteproc_core.c
>> @@ -1557,6 +1557,13 @@ static int rproc_attach(struct rproc *rproc)
>>  		return ret;
>>  	}
>>
>> +	/* Prepare rproc for firmware loading if needed */
>> +	ret = rproc_prepare_device(rproc);
>> +	if (ret) {
>> +		dev_err(dev, "can't prepare rproc %s: %d\n", rproc->name, ret);
>> +		goto disable_iommu;
>> +	}
>> +
>>  	ret = rproc_get_loaded_rsc_table(rproc);
>>  	if (ret) {
>>  		dev_err(dev, "can't load resource table: %d\n", ret);
>> @@ -1990,6 +1997,13 @@ int rproc_detach(struct rproc *rproc)
>>  	/* clean up all acquired resources */
>>  	rproc_resource_cleanup(rproc);
>>
>> +	/* Release HW resources if needed */
>> +	ret = rproc_unprepare_device(rproc);
>> +	if (ret) {
>> +		atomic_inc(&rproc->power);
>> +		goto out;
>> +	}
>> +
>>  	rproc_disable_iommu(rproc);
>>
>>  	/*
>> diff --git a/drivers/remoteproc/stm32_rproc.c b/drivers/remoteproc/stm32_rproc.c
>> index b325d28f627c..bf50d79b1f09 100644
>> --- a/drivers/remoteproc/stm32_rproc.c
>> +++ b/drivers/remoteproc/stm32_rproc.c
>> @@ -413,9 +413,6 @@ static int stm32_rproc_parse_fw(struct rproc *rproc, const
>> struct firmware *fw)
>>  	struct stm32_rproc *ddata = rproc->priv;
>>  	int ret;
>>
>> -	ret  = stm32_rproc_parse_memory_regions(rproc);
>> -	if (ret)
>> -		return ret;
>>
>>  	if (ddata->trproc)
>>  		ret = rproc_tee_get_rsc_table(ddata->trproc);
>> @@ -580,6 +577,12 @@ static int stm32_rproc_start(struct rproc *rproc)
>>
>>  	return reset_control_assert(ddata->hold_boot);
>>  }
>> +static int stm32_rproc_prepare(struct rproc *rproc)
>> +{
>> +	dev_err(&rproc->dev, "%s: %d\n", __func__, __LINE__);
>> +
>> +	return stm32_rproc_parse_memory_regions(rproc);
>> +}
>>
>>  static int stm32_rproc_attach(struct rproc *rproc)
>>  {
>> @@ -717,6 +720,7 @@ static int stm32_rproc_get_loaded_rsc_table(struct rproc *rproc)
>>  static struct rproc_ops st_rproc_ops = {
>>  	.start		= stm32_rproc_start,
>>  	.stop		= stm32_rproc_stop,
>> +	.prepare	= stm32_rproc_prepare,
>>  	.attach		= stm32_rproc_attach,
>>  	.detach		= stm32_rproc_detach,
>>  	.kick		= stm32_rproc_kick,
>> @@ -921,10 +925,6 @@ static int stm32_rproc_probe(struct platform_device *pdev)
>>
>>  	if (state == M4_STATE_CRUN) {
>>  		rproc->state = RPROC_DETACHED;
>> -
>> -		ret = stm32_rproc_parse_memory_regions(rproc);
>> -		if (ret)
>> -			goto free_resources;
>>  	}
>>
>>  	rproc->has_iommu = false;
>> -- 
>> 2.17.1
>>
>>
>> ------------------------------------------------------------------------
>>
>> Subject: [PATCH 3/3] remoteproc: virtio: set to vring address to
>>  FW_RSC_ADDR_ANY on free
>>
>> The resource table vring structure is cleaned on free. But value is set
>> to 0. This value is considered as a valid address. Set the value
>> to  FW_RSC_ADDR_ANY instead.
>> This is needed to allow to reattach to an autonomous firmware.
>> An alternative would be to save the resource table before updating it.
>> On free the value would be reset to initial value.
>>
>> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
>> ---
>>  drivers/remoteproc/remoteproc_core.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/remoteproc/remoteproc_core.c
>> b/drivers/remoteproc/remoteproc_core.c
>> index f177561b8863..5b5de4db3981 100644
>> --- a/drivers/remoteproc/remoteproc_core.c
>> +++ b/drivers/remoteproc/remoteproc_core.c
>> @@ -425,7 +425,7 @@ void rproc_free_vring(struct rproc_vring *rvring)
>>  	 */
>>  	if (rproc->table_ptr) {
>>  		rsc = (void *)rproc->table_ptr + rvring->rvdev->rsc_offset;
>> -		rsc->vring[idx].da = 0;
>> +		rsc->vring[idx].da = FW_RSC_ADDR_ANY;
>>  		rsc->vring[idx].notifyid = -1;
>>  	}
>>  }
>> -- 
>> 2.17.1
>>
>>
>>
>>
>>
>>
Mathieu Poirier Feb. 8, 2021, 11:43 p.m. UTC | #12
On Wed, Feb 03, 2021 at 08:58:15AM +0100, Arnaud POULIQUEN wrote:
> 
> 
> On 2/2/21 11:42 PM, Mathieu Poirier wrote:
> > On Tue, Feb 02, 2021 at 09:54:13AM +0100, Arnaud POULIQUEN wrote:
> >>
> >>
> >> On 2/2/21 1:49 AM, Mathieu Poirier wrote:
> >>> On Wed, Jan 27, 2021 at 10:21:24AM +0100, Arnaud POULIQUEN wrote:
> >>>> Hi Mathieu
> >>>>
> >>>> On 12/18/20 6:32 PM, Mathieu Poirier wrote:
> >>>>> Following the work done here [1], this set provides support for the
> >>>>> remoteproc core to release resources associated with a remote processor
> >>>>> without having to switch it off. That way a platform driver can be removed
> >>>>> or the application processor power cycled while the remote processor is
> >>>>> still operating.
> >>>>>
> >>>>> Of special interest in this series are patches 5 and 6 where getting the
> >>>>> address of the resource table installed by an eternal entity if moved to
> >>>>> the core.  This is to support scenarios where a remote process has been
> >>>>> booted by the core but is being detached.  To re-attach the remote
> >>>>> processor, the address of the resource table needs to be known at a later
> >>>>> time than the platform driver's probe() function.
> >>>>>
> >>>>> Applies cleanly on v5.10
> >>>>>
> >>>>> Thanks,
> >>>>> Mathieu
> >>>>>
> >>>>> [1]. https://lkml.org/lkml/2020/7/14/1600
> >>>>>
> >>>>> ----
> >>>>> New for v4:
> >>>>> - Made binding description OS agnostic (Rob)
> >>>>> - Added functionality to set the external resource table in the core
> >>>>> - Fixed a crash when detaching (Arnaud)
> >>>>> - Fixed error code propagation in rproc_cdev_relase() and rproc_del() (Arnaud)
> >>>>> - Added RB tags
> >>>>
> >>>>
> >>>> I tested you series, attach and  detach is working well.
> >>>>
> >>>> Then I faced issue when tried to re-attach after a detach.
> >>>>
> >>>
> >>> Right, in this case don't expect the re-attach to work properly because function
> >>> stm32_rproc_detach() does not exist.  As such the M4 doesn't put itself back
> >>> in "wait-for-attach" mode as it does when booted by the boot loader.  If I
> >>> remember correctly we talked about that during an earlier conversation and we
> >>> agreed FW support would be needed to properly test the re-attach.
> >>
> >> Yes you are right the remote firmware needs to be inform about the detach, and
> >> this is the purpose of the detach ops.
> >> But also some actions are missing on local side as some resources have also to
> >> be reinitialized as described in my previous mail.
> >> For instance the resource table is handled by the remoteproc framework. The
> >> remote firmware should only have a read access to this table.
> >>
> >>>  
> >>>> But I don't know if this feature has to be supported in this step.
> >>>>
> >>>> The 2 issues I found are:
> >>>>
> >>>> 1) memory carveouts are released on detach so need to be reinitialized.
> >>>> The use of prepare/unprepare for the attach and detach would solve the issue but
> >>>> probably need to add parameter to differentiate a start/stop from a attach/detach.
> > 
> > I am in agreement with you assesment and the patch you have proposed to fix it.
> > Right now carveouts are properly handled between start and stop operations
> > because their parsing and allocation is done as part for ops:parse_fw(), which
> > is called for each iteration.  Moving the parsing and allocation to
> > rproc_prepare_device() ends up doing the same thing.
> > 
> > I am not sure we absolutely need an extra parameter to differentiate between
> > start/stop and attach/detach.  Typically the rproc_prepare_device() is used to
> > do some kind of setup like providing power to a memory bank.  I suppose that if
> > a memory bank is already powered by the boot loader, asking to power it up again
> > won't do anything.
> > 
> > As such I suggest we keep the parameters as they are now.  We can revisit if a
> > use case comes up at a later time. 
> > 
> 
> Your suggestion sound good to me.
> 
> >>>>
> >>>> 2) The vrings in the loaded resource table (so no cached) has to be properly
> >>>> reinitialized. In rproc_free_vring  the vring da is set to 0 that is then
> >>>> considered as a fixed address.
> >>>>
> >>>> Here is a fix which works on the stm32 platform
> >>>>
> >>>> @@ -425,7 +425,7 @@ void rproc_free_vring(struct rproc_vring *rvring)
> >>>>  	 */
> >>>>  	if (rproc->table_ptr) {
> >>>>  		rsc = (void *)rproc->table_ptr + rvring->rvdev->rsc_offset;
> >>>> -		rsc->vring[idx].da = 0;
> >>>> +		rsc->vring[idx].da = FW_RSC_ADDR_ANY;
> >>>>  		rsc->vring[idx].notifyid = -1;
> >>>>  	}
> >>>>  }
> > 
> > I see why this could be needed.  I initially assumed the remote processor would
> > install a new resource table in memory upon receiving notification the core is
> > going away.  But upon reflection the remote processor may not even have access
> > to the image it is running.
> 
> The risk here is that both processors try to access this table at the same time.
> 
> > 
> > Let me think about this further - I want to make sure we don't introduce a
> > regression for current implementations.
> 
> Just a precision: the vring DA can be fixed by the coprocessor firmware to
> impose the address, my patch is not sufficent in such case for the reattachment.
> That's why i suggested a copy of the resource table before updating it.
> 
> Thanks,
> Arnaud
> 
> > 
> >>>
> >>> In light of the above let me know if these two issues are still relevant.  If
> >>> so I'll investigate further.
> >>
> >> To highlight the issue just test attach/detach/attch  with a firmware that
> >> implements a RPMsg communication. On the second attach the virtio framework is
> >> not properly restarted.
> >>
> >> Then please find at the end of the mail 3 patches for test I added on top of
> >> your series,that allow me to reattach. Of course the RPMsg channels are not
> >> re-created as i don't implement the remote FW part, but the Linux virtio and
> >> RPmsg frameworks are restarted.
> >>
> >> - [PATCH 1/3] remoteproc: stm32: add capability to detach from the remoteproc
> >>   => Add a dummy function in stm32_rproc for test.
> >> - [PATCH 2/3] remoteproc: Add prepare/unprepare for attach detach
> >>   => Add prepare/unprepare on attach/detach + implement attach in stm32mp1 to
> >>      reinitialize the memory region that as been cleaned on detach.
> >> - [PATCH 3/3] remoteproc: virtio: set to vring address to FW_RSC_ADDR_ANY on free
> >>   => Reinitialize the vring addresses on detach. For this one a better
> >>      implementation would be to use a cached resource table to fully
> >>      reinitialize it on re-attach.
> >>
> >> Thanks,
> >> Arnaud
> >>
> >>>
> >>> Thanks,
> >>> Mathieu
> >>>
> >>>>
> >>>> Here, perhaps a better alternative would be to make a cached copy on attach
> >>>> before updating it. On the next attach, the cached copy would be copied as it is
> >>>> done in rproc_start.

I've been staring long and hard at this code and I don't see another way than
keeping a clean copy of the resource table as you mentionned above.  And
managing that table isn't trivial either.  I am working on a solution, something
that I should be able to publish by the end of the week.

Thanks,
Mathieu

> >>>>
> >>>> Thanks,
> >>>> Arnaud
> >>>>
> >>>>
> >>>>>
> >>>>> Mathieu Poirier (17):
> >>>>>   dt-bindings: remoteproc: Add bindind to support autonomous processors
> >>>>>   remoteproc: Re-check state in rproc_shutdown()
> >>>>>   remoteproc: Remove useless check in rproc_del()
> >>>>>   remoteproc: Rename function rproc_actuate()
> >>>>>   remoteproc: Add new get_loaded_rsc_table() remoteproc operation
> >>>>>   remoteproc: stm32: Move resource table setup to rproc_ops
> >>>>>   remoteproc: Add new RPROC_ATTACHED state
> >>>>>   remoteproc: Properly represent the attached state
> >>>>>   remoteproc: Properly deal with a kernel panic when attached
> >>>>>   remoteproc: Add new detach() remoteproc operation
> >>>>>   remoteproc: Introduce function __rproc_detach()
> >>>>>   remoteproc: Introduce function rproc_detach()
> >>>>>   remoteproc: Add return value to function rproc_shutdown()
> >>>>>   remoteproc: Properly deal with a stop request when attached
> >>>>>   remoteproc: Properly deal with a start request when attached
> >>>>>   remoteproc: Properly deal with detach request
> >>>>>   remoteproc: Refactor rproc delete and cdev release path
> >>>>>
> >>>>>  .../bindings/remoteproc/remoteproc-core.yaml  |  27 +++
> >>>>>  drivers/remoteproc/remoteproc_cdev.c          |  32 ++-
> >>>>>  drivers/remoteproc/remoteproc_core.c          | 211 +++++++++++++++---
> >>>>>  drivers/remoteproc/remoteproc_internal.h      |   8 +
> >>>>>  drivers/remoteproc/remoteproc_sysfs.c         |  20 +-
> >>>>>  drivers/remoteproc/stm32_rproc.c              | 147 ++++++------
> >>>>>  include/linux/remoteproc.h                    |  24 +-
> >>>>>  7 files changed, 344 insertions(+), 125 deletions(-)
> >>>>>  create mode 100644 Documentation/devicetree/bindings/remoteproc/remoteproc-core.yaml
> >>>>>
> >>
> >> Subject: [PATCH 1/3] remoteproc: stm32: add capability to detach from the
> >>  remoteproc
> >>
> >> Add a dummy function to allow to detach. No specific action is needed
> >>
> >> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> >> ---
> >>  drivers/remoteproc/stm32_rproc.c | 7 +++++++
> >>  1 file changed, 7 insertions(+)
> >>
> >> diff --git a/drivers/remoteproc/stm32_rproc.c b/drivers/remoteproc/stm32_rproc.c
> >> index 2c949725b91e..b325d28f627c 100644
> >> --- a/drivers/remoteproc/stm32_rproc.c
> >> +++ b/drivers/remoteproc/stm32_rproc.c
> >> @@ -590,6 +590,12 @@ static int stm32_rproc_attach(struct rproc *rproc)
> >>  	return reset_control_assert(ddata->hold_boot);
> >>  }
> >>
> >> +static int stm32_rproc_detach(struct rproc *rproc)
> >> +{
> >> +	/* Nothing to do but ops mandatory to support the detach feature */
> >> +	return 0;
> >> +}
> >> +
> >>  static int stm32_rproc_stop(struct rproc *rproc)
> >>  {
> >>  	struct stm32_rproc *ddata = rproc->priv;
> >> @@ -712,6 +718,7 @@ static struct rproc_ops st_rproc_ops = {
> >>  	.start		= stm32_rproc_start,
> >>  	.stop		= stm32_rproc_stop,
> >>  	.attach		= stm32_rproc_attach,
> >> +	.detach		= stm32_rproc_detach,
> >>  	.kick		= stm32_rproc_kick,
> >>  	.load		= rproc_elf_load_segments,
> >>  	.parse_fw	= stm32_rproc_parse_fw,
> >> -- 
> >> 2.17.1
> >>
> >>
> >> ------------------------------------------------------------------------
> >>
> >>
> >> Subject: [PATCH 2/3] remoteproc: Add prepare/unprepare for attach detach
> >>
> >> Some actions such as memory resources reallocation are needed when try
> >> to reattach. Use the prepare ops for these actions.
> >>
> >> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
> >> ---
> >>  drivers/remoteproc/remoteproc_core.c | 14 ++++++++++++++
> >>  drivers/remoteproc/stm32_rproc.c     | 14 +++++++-------
> >>  2 files changed, 21 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/remoteproc/remoteproc_core.c
> >> b/drivers/remoteproc/remoteproc_core.c
> >> index f1f51ad1a1d6..f177561b8863 100644
> >> --- a/drivers/remoteproc/remoteproc_core.c
> >> +++ b/drivers/remoteproc/remoteproc_core.c
> >> @@ -1557,6 +1557,13 @@ static int rproc_attach(struct rproc *rproc)
> >>  		return ret;
> >>  	}
> >>
> >> +	/* Prepare rproc for firmware loading if needed */
> >> +	ret = rproc_prepare_device(rproc);
> >> +	if (ret) {
> >> +		dev_err(dev, "can't prepare rproc %s: %d\n", rproc->name, ret);
> >> +		goto disable_iommu;
> >> +	}
> >> +
> >>  	ret = rproc_get_loaded_rsc_table(rproc);
> >>  	if (ret) {
> >>  		dev_err(dev, "can't load resource table: %d\n", ret);
> >> @@ -1990,6 +1997,13 @@ int rproc_detach(struct rproc *rproc)
> >>  	/* clean up all acquired resources */
> >>  	rproc_resource_cleanup(rproc);
> >>
> >> +	/* Release HW resources if needed */
> >> +	ret = rproc_unprepare_device(rproc);
> >> +	if (ret) {
> >> +		atomic_inc(&rproc->power);
> >> +		goto out;
> >> +	}
> >> +
> >>  	rproc_disable_iommu(rproc);
> >>
> >>  	/*
> >> diff --git a/drivers/remoteproc/stm32_rproc.c b/drivers/remoteproc/stm32_rproc.c
> >> index b325d28f627c..bf50d79b1f09 100644
> >> --- a/drivers/remoteproc/stm32_rproc.c
> >> +++ b/drivers/remoteproc/stm32_rproc.c
> >> @@ -413,9 +413,6 @@ static int stm32_rproc_parse_fw(struct rproc *rproc, const
> >> struct firmware *fw)
> >>  	struct stm32_rproc *ddata = rproc->priv;
> >>  	int ret;
> >>
> >> -	ret  = stm32_rproc_parse_memory_regions(rproc);
> >> -	if (ret)
> >> -		return ret;
> >>
> >>  	if (ddata->trproc)
> >>  		ret = rproc_tee_get_rsc_table(ddata->trproc);
> >> @@ -580,6 +577,12 @@ static int stm32_rproc_start(struct rproc *rproc)
> >>
> >>  	return reset_control_assert(ddata->hold_boot);
> >>  }
> >> +static int stm32_rproc_prepare(struct rproc *rproc)
> >> +{
> >> +	dev_err(&rproc->dev, "%s: %d\n", __func__, __LINE__);
> >> +
> >> +	return stm32_rproc_parse_memory_regions(rproc);
> >> +}
> >>
> >>  static int stm32_rproc_attach(struct rproc *rproc)
> >>  {
> >> @@ -717,6 +720,7 @@ static int stm32_rproc_get_loaded_rsc_table(struct rproc *rproc)
> >>  static struct rproc_ops st_rproc_ops = {
> >>  	.start		= stm32_rproc_start,
> >>  	.stop		= stm32_rproc_stop,
> >> +	.prepare	= stm32_rproc_prepare,
> >>  	.attach		= stm32_rproc_attach,
> >>  	.detach		= stm32_rproc_detach,
> >>  	.kick		= stm32_rproc_kick,
> >> @@ -921,10 +925,6 @@ static int stm32_rproc_probe(struct platform_device *pdev)
> >>
> >>  	if (state == M4_STATE_CRUN) {
> >>  		rproc->state = RPROC_DETACHED;
> >> -
> >> -		ret = stm32_rproc_parse_memory_regions(rproc);
> >> -		if (ret)
> >> -			goto free_resources;
> >>  	}
> >>
> >>  	rproc->has_iommu = false;
> >> -- 
> >> 2.17.1
> >>
> >>
> >> ------------------------------------------------------------------------
> >>
> >> Subject: [PATCH 3/3] remoteproc: virtio: set to vring address to
> >>  FW_RSC_ADDR_ANY on free
> >>
> >> The resource table vring structure is cleaned on free. But value is set
> >> to 0. This value is considered as a valid address. Set the value
> >> to  FW_RSC_ADDR_ANY instead.
> >> This is needed to allow to reattach to an autonomous firmware.
> >> An alternative would be to save the resource table before updating it.
> >> On free the value would be reset to initial value.
> >>
> >> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
> >> ---
> >>  drivers/remoteproc/remoteproc_core.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/remoteproc/remoteproc_core.c
> >> b/drivers/remoteproc/remoteproc_core.c
> >> index f177561b8863..5b5de4db3981 100644
> >> --- a/drivers/remoteproc/remoteproc_core.c
> >> +++ b/drivers/remoteproc/remoteproc_core.c
> >> @@ -425,7 +425,7 @@ void rproc_free_vring(struct rproc_vring *rvring)
> >>  	 */
> >>  	if (rproc->table_ptr) {
> >>  		rsc = (void *)rproc->table_ptr + rvring->rvdev->rsc_offset;
> >> -		rsc->vring[idx].da = 0;
> >> +		rsc->vring[idx].da = FW_RSC_ADDR_ANY;
> >>  		rsc->vring[idx].notifyid = -1;
> >>  	}
> >>  }
> >> -- 
> >> 2.17.1
> >>
> >>
> >>
> >>
> >>
> >>