mbox series

[v3,00/15] remoteproc: Add support for detaching from rproc

Message ID 20201126210642.897302-1-mathieu.poirier@linaro.org
Headers show
Series remoteproc: Add support for detaching from rproc | expand

Message

Mathieu Poirier Nov. 26, 2020, 9:06 p.m. UTC
Following the work done here [1], this set provides support for the
remoteproc core to release resources associated with a remote processor
without having to switch it off. That way a platform driver can be removed
or the application processor power cycled while the remote processor is
still operating.

Applies cleanly on rproc-next (c3c21b356505).

Thanks,
Mathieu

[1]. https://lkml.org/lkml/2020/7/14/1600

----
New for V3:
- Added RB from Arnaud where applicable.
- Reformatted comments about "detach" operation in struct rproc_ops.
- Fixed error path in rproc_shutdown().
- Fixed processing of "start" command in state_store() and rproc_cdev_write().
- Changed binding from "autonomous-on-core-reboot" to
  "autonomous-on-core-shutdown".
- Wrote a proper YAML file for the binding.

Mathieu Poirier (15):
  dt-bindings: remoteproc: Add bindind to support autonomous processors
  remoteproc: Re-check state in rproc_shutdown()
  remoteproc: Remove useless check in rproc_del()
  remoteproc: Add new RPROC_ATTACHED state
  remoteproc: Properly represent the attached state
  remoteproc: Properly deal with a kernel panic when attached
  remoteproc: Add new detach() remoteproc operation
  remoteproc: Introduce function __rproc_detach()
  remoteproc: Introduce function rproc_detach()
  remoteproc: Rename function rproc_actuate()
  remoteproc: Add return value to function rproc_shutdown()
  remoteproc: Properly deal with a stop request when attached
  remoteproc: Properly deal with a start request when attached
  remoteproc: Properly deal with detach request
  remoteproc: Refactor rproc delete and cdev release path

 .../bindings/remoteproc/remoteproc-core.yaml  |  25 +++
 drivers/remoteproc/remoteproc_cdev.c          |  27 ++-
 drivers/remoteproc/remoteproc_core.c          | 182 +++++++++++++++---
 drivers/remoteproc/remoteproc_sysfs.c         |  20 +-
 include/linux/remoteproc.h                    |  18 +-
 5 files changed, 225 insertions(+), 47 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/remoteproc/remoteproc-core.yaml

Comments

Arnaud POULIQUEN Dec. 2, 2020, 6:13 p.m. UTC | #1
Hi Mathieu

On 11/26/20 10:06 PM, Mathieu Poirier wrote:
> Add an new detach() operation in order to support scenarios where
> the remoteproc core is going away but the remote processor is
> kept operating.  This could be the case when the system is
> rebooted or when the platform driver is removed.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> Reviewed-by: Peng Fan <peng.fan@nxp.com>

Reviewed-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>

Thanks,
Arnaud
> ---
>  include/linux/remoteproc.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> index 9be112b5c09d..da15b77583d3 100644
> --- a/include/linux/remoteproc.h
> +++ b/include/linux/remoteproc.h
> @@ -361,6 +361,7 @@ enum rsc_handling_status {
>   * @start:	power on the device and boot it
>   * @stop:	power off the device
>   * @attach:	attach to a device that his already powered up
> + * @detach:	detach from a device, leaving it powered up
>   * @kick:	kick a virtqueue (virtqueue id given as a parameter)
>   * @da_to_va:	optional platform hook to perform address translations
>   * @parse_fw:	parse firmware to extract information (e.g. resource table)
> @@ -382,6 +383,7 @@ struct rproc_ops {
>  	int (*start)(struct rproc *rproc);
>  	int (*stop)(struct rproc *rproc);
>  	int (*attach)(struct rproc *rproc);
> +	int (*detach)(struct rproc *rproc);
>  	void (*kick)(struct rproc *rproc, int vqid);
>  	void * (*da_to_va)(struct rproc *rproc, u64 da, size_t len);
>  	int (*parse_fw)(struct rproc *rproc, const struct firmware *fw);
>
Arnaud POULIQUEN Dec. 2, 2020, 6:14 p.m. UTC | #2
On 11/26/20 10:06 PM, Mathieu Poirier wrote:
> Add a return value to function rproc_shutdown() in order to
> properly deal with error conditions that may occur.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> Reviewed-by: Peng Fan <peng.fan@nxp.com>

Reviewed-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>

Thanks,
Arnaud
> ---
>  drivers/remoteproc/remoteproc_core.c | 19 ++++++++++++++-----
>  include/linux/remoteproc.h           |  2 +-
>  2 files changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index b54f60cc3cbd..51275107eb1f 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1869,7 +1869,7 @@ EXPORT_SYMBOL(rproc_boot);
>   *   returns, and users can still use it with a subsequent rproc_boot(), if
>   *   needed.
>   */
> -void rproc_shutdown(struct rproc *rproc)
> +int rproc_shutdown(struct rproc *rproc)
>  {
>  	struct device *dev = &rproc->dev;
>  	int ret;
> @@ -1877,15 +1877,19 @@ void rproc_shutdown(struct rproc *rproc)
>  	ret = mutex_lock_interruptible(&rproc->lock);
>  	if (ret) {
>  		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
> -		return;
> +		return ret;
>  	}
>  
> -	if (rproc->state != RPROC_RUNNING)
> +	if (rproc->state != RPROC_RUNNING) {
> +		ret = -EPERM;
>  		goto out;
> +	}
>  
>  	/* if the remote proc is still needed, bail out */
> -	if (!atomic_dec_and_test(&rproc->power))
> +	if (!atomic_dec_and_test(&rproc->power)) {
> +		ret = -EBUSY;
>  		goto out;
> +	}
>  
>  	ret = rproc_stop(rproc, false);
>  	if (ret) {
> @@ -1897,7 +1901,11 @@ void rproc_shutdown(struct rproc *rproc)
>  	rproc_resource_cleanup(rproc);
>  
>  	/* release HW resources if needed */
> -	rproc_unprepare_device(rproc);
> +	ret = rproc_unprepare_device(rproc);
> +	if (ret) {
> +		atomic_inc(&rproc->power);
> +		goto out;
> +	}
>  
>  	rproc_disable_iommu(rproc);
>  
> @@ -1907,6 +1915,7 @@ void rproc_shutdown(struct rproc *rproc)
>  	rproc->table_ptr = NULL;
>  out:
>  	mutex_unlock(&rproc->lock);
> +	return ret;
>  }
>  EXPORT_SYMBOL(rproc_shutdown);
>  
> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> index 329c1c071dcf..02312096d59f 100644
> --- a/include/linux/remoteproc.h
> +++ b/include/linux/remoteproc.h
> @@ -655,7 +655,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
>  			     u32 da, const char *name, ...);
>  
>  int rproc_boot(struct rproc *rproc);
> -void rproc_shutdown(struct rproc *rproc);
> +int rproc_shutdown(struct rproc *rproc);
>  int rproc_detach(struct rproc *rproc);
>  int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
>  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
>
Arnaud POULIQUEN Dec. 2, 2020, 6:14 p.m. UTC | #3
On 11/26/20 10:06 PM, Mathieu Poirier wrote:
> This patch takes into account scenarios where a remote processor
> has been attached to when receiving a "start" command from sysfs.
> 
> As with the "running" case, the command can't be carried out if the
> remote processor is already in operation.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>


Reviewed-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>

Thanks,
Arnaud

> ---
>  drivers/remoteproc/remoteproc_cdev.c  | 3 ++-
>  drivers/remoteproc/remoteproc_sysfs.c | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_cdev.c b/drivers/remoteproc/remoteproc_cdev.c
> index d06f8d4919c7..61541bc7d26c 100644
> --- a/drivers/remoteproc/remoteproc_cdev.c
> +++ b/drivers/remoteproc/remoteproc_cdev.c
> @@ -32,7 +32,8 @@ static ssize_t rproc_cdev_write(struct file *filp, const char __user *buf, size_
>  		return -EFAULT;
>  
>  	if (!strncmp(cmd, "start", len)) {
> -		if (rproc->state == RPROC_RUNNING)
> +		if (rproc->state == RPROC_RUNNING ||
> +		    rproc->state == RPROC_ATTACHED)
>  			return -EBUSY;
>  
>  		ret = rproc_boot(rproc);
> diff --git a/drivers/remoteproc/remoteproc_sysfs.c b/drivers/remoteproc/remoteproc_sysfs.c
> index 3696f2ccc785..7d281cfe3e03 100644
> --- a/drivers/remoteproc/remoteproc_sysfs.c
> +++ b/drivers/remoteproc/remoteproc_sysfs.c
> @@ -194,7 +194,8 @@ static ssize_t state_store(struct device *dev,
>  	int ret = 0;
>  
>  	if (sysfs_streq(buf, "start")) {
> -		if (rproc->state == RPROC_RUNNING)
> +		if (rproc->state == RPROC_RUNNING ||
> +		    rproc->state == RPROC_ATTACHED)
>  			return -EBUSY;
>  
>  		ret = rproc_boot(rproc);
>
Mathieu Poirier Dec. 8, 2020, 8:25 p.m. UTC | #4
On Tue, Dec 08, 2020 at 07:35:18PM +0100, Arnaud POULIQUEN wrote:
> Hi Mathieu,
> 
> 
> On 11/26/20 10:06 PM, Mathieu Poirier wrote:
> > Introduce function rproc_detach() to enable the remoteproc
> > core to release the resources associated with a remote processor
> > without stopping its operation.
> > 
> > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > Reviewed-by: Peng Fan <peng.fan@nxp.com>
> > ---
> >  drivers/remoteproc/remoteproc_core.c | 65 +++++++++++++++++++++++++++-
> >  include/linux/remoteproc.h           |  1 +
> >  2 files changed, 65 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > index 928b3f975798..f5adf05762e9 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -1667,7 +1667,7 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
> >  /*
> >   * __rproc_detach(): Does the opposite of rproc_attach()
> >   */
> > -static int __maybe_unused __rproc_detach(struct rproc *rproc)
> > +static int __rproc_detach(struct rproc *rproc)
> >  {
> >  	struct device *dev = &rproc->dev;
> >  	int ret;
> > @@ -1910,6 +1910,69 @@ void rproc_shutdown(struct rproc *rproc)
> >  }
> >  EXPORT_SYMBOL(rproc_shutdown);
> >  
> > +/**
> > + * rproc_detach() - Detach the remote processor from the
> > + * remoteproc core
> > + *
> > + * @rproc: the remote processor
> > + *
> > + * Detach a remote processor (previously attached to with rproc_actuate()).
> > + *
> > + * In case @rproc is still being used by an additional user(s), then
> > + * this function will just decrement the power refcount and exit,
> > + * without disconnecting the device.
> > + *
> > + * Function rproc_detach() calls __rproc_detach() in order to let a remote
> > + * processor know that services provided by the application processor are
> > + * no longer available.  From there it should be possible to remove the
> > + * platform driver and even power cycle the application processor (if the HW
> > + * supports it) without needing to switch off the remote processor.
> > + */
> > +int rproc_detach(struct rproc *rproc)
> > +{
> > +	struct device *dev = &rproc->dev;
> > +	int ret;
> > +
> > +	ret = mutex_lock_interruptible(&rproc->lock);
> > +	if (ret) {
> > +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
> > +		return ret;
> > +	}
> > +
> > +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
> > +		ret = -EPERM;
> > +		goto out;
> > +	}
> > +
> > +	/* if the remote proc is still needed, bail out */
> > +	if (!atomic_dec_and_test(&rproc->power)) {
> > +		ret = -EBUSY;
> > +		goto out;
> > +	}
> > +
> > +	ret = __rproc_detach(rproc);
> > +	if (ret) {
> > +		atomic_inc(&rproc->power);
> > +		goto out;
> > +	}
> > +
> > +	/* clean up all acquired resources */
> > +	rproc_resource_cleanup(rproc);
> 
> I started to test the series, I found 2 problems testing in STM32P1 board.
> 
> 1) the resource_table pointer is unmapped if the firmware has been booted by the
> Linux, generating a crash in rproc_free_vring.
> I attached a fix at the end of the mail.
> 
> 2) After the detach, the rproc state is "detached"
> but it is no longer possible to re-attach to it correctly.
> Neither if the firmware is standalone, nor if it has been booted
> by the Linux.

Thanks for the report - I thought both problems had been fixed...

> 
> I did not investigate, but the issue is probably linked to the resource
> table address which is set to NULL.
> 
> So we either have to fix the problem in order to attach or forbid the transition.
> 

Perfect timing on your side as I was contemplating sending another revision.
Let me look at things and I will get back to you.

> 
> Regards,
> Arnaud
> 
> > +
> > +	rproc_disable_iommu(rproc);
> > +
> > +	/*
> > +	 * Set the remote processor's table pointer to NULL.  Since mapping
> > +	 * of the resource table to a virtual address is done in the platform
> > +	 * driver, unmapping should also be done there.
> > +	 */
> > +	rproc->table_ptr = NULL;
> > +out:
> > +	mutex_unlock(&rproc->lock);
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(rproc_detach);
> > +
> >  /**
> >   * rproc_get_by_phandle() - find a remote processor by phandle
> >   * @phandle: phandle to the rproc
> > diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> > index da15b77583d3..329c1c071dcf 100644
> > --- a/include/linux/remoteproc.h
> > +++ b/include/linux/remoteproc.h
> > @@ -656,6 +656,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
> >  
> >  int rproc_boot(struct rproc *rproc);
> >  void rproc_shutdown(struct rproc *rproc);
> > +int rproc_detach(struct rproc *rproc);
> >  int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
> >  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
> >  int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
> > 
> 
> From: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> Date: Tue, 8 Dec 2020 18:54:51 +0100
> Subject: [PATCH] remoteproc: core: fix detach for unmapped table_ptr
> 
> If the firmware has been loaded and started by the kernel, the
> resource table has probably been mapped by the carveout allocation
> (see rproc_elf_find_loaded_rsc_table).
> In this case the memory can have been unmapped before the vrings are free.
> The result is a crash that occurs in rproc_free_vring while try to use the
> unmapped pointer.
> 
> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> ---
>  drivers/remoteproc/remoteproc_core.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c
> b/drivers/remoteproc/remoteproc_core.c
> index 2b0a52fb3398..3508ffba4a2a 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1964,6 +1964,13 @@ int rproc_detach(struct rproc *rproc)
>  		goto out;
>  	}
> 
> +	/*
> +	 * Prevent case that the installed resource table is no longer
> +	 * accessible (e.g. memory unmapped), use the cache if available
> +	 */
> +	if (rproc->cached_table)
> +		rproc->table_ptr = rproc->cached_table;
> +
>  	ret = __rproc_detach(rproc);
>  	if (ret) {
>  		atomic_inc(&rproc->power);
> @@ -1975,10 +1982,14 @@ int rproc_detach(struct rproc *rproc)
> 
>  	rproc_disable_iommu(rproc);
> 
> +	/* Free the chached table memory that can has been allocated*/
> +	kfree(rproc->cached_table);
> +	rproc->cached_table = NULL;
>  	/*
> -	 * Set the remote processor's table pointer to NULL.  Since mapping
> -	 * of the resource table to a virtual address is done in the platform
> -	 * driver, unmapping should also be done there.
> +	 * Set the remote processor's table pointer to NULL. If mapping
> +	 * of the resource table to a virtual address has been done in the
> +	 * platform driver(attachment to an existing firmware),
> +	 * unmapping should also be done there.
>  	 */
>  	rproc->table_ptr = NULL;
>  out:
> -- 
> 2.17.1
> 
> 
>
Mathieu Poirier Dec. 9, 2020, 12:53 a.m. UTC | #5
On Tue, Dec 08, 2020 at 07:35:18PM +0100, Arnaud POULIQUEN wrote:
> Hi Mathieu,
> 
> 
> On 11/26/20 10:06 PM, Mathieu Poirier wrote:
> > Introduce function rproc_detach() to enable the remoteproc
> > core to release the resources associated with a remote processor
> > without stopping its operation.
> > 
> > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > Reviewed-by: Peng Fan <peng.fan@nxp.com>
> > ---
> >  drivers/remoteproc/remoteproc_core.c | 65 +++++++++++++++++++++++++++-
> >  include/linux/remoteproc.h           |  1 +
> >  2 files changed, 65 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > index 928b3f975798..f5adf05762e9 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -1667,7 +1667,7 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
> >  /*
> >   * __rproc_detach(): Does the opposite of rproc_attach()
> >   */
> > -static int __maybe_unused __rproc_detach(struct rproc *rproc)
> > +static int __rproc_detach(struct rproc *rproc)
> >  {
> >  	struct device *dev = &rproc->dev;
> >  	int ret;
> > @@ -1910,6 +1910,69 @@ void rproc_shutdown(struct rproc *rproc)
> >  }
> >  EXPORT_SYMBOL(rproc_shutdown);
> >  
> > +/**
> > + * rproc_detach() - Detach the remote processor from the
> > + * remoteproc core
> > + *
> > + * @rproc: the remote processor
> > + *
> > + * Detach a remote processor (previously attached to with rproc_actuate()).
> > + *
> > + * In case @rproc is still being used by an additional user(s), then
> > + * this function will just decrement the power refcount and exit,
> > + * without disconnecting the device.
> > + *
> > + * Function rproc_detach() calls __rproc_detach() in order to let a remote
> > + * processor know that services provided by the application processor are
> > + * no longer available.  From there it should be possible to remove the
> > + * platform driver and even power cycle the application processor (if the HW
> > + * supports it) without needing to switch off the remote processor.
> > + */
> > +int rproc_detach(struct rproc *rproc)
> > +{
> > +	struct device *dev = &rproc->dev;
> > +	int ret;
> > +
> > +	ret = mutex_lock_interruptible(&rproc->lock);
> > +	if (ret) {
> > +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
> > +		return ret;
> > +	}
> > +
> > +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
> > +		ret = -EPERM;
> > +		goto out;
> > +	}
> > +
> > +	/* if the remote proc is still needed, bail out */
> > +	if (!atomic_dec_and_test(&rproc->power)) {
> > +		ret = -EBUSY;
> > +		goto out;
> > +	}
> > +
> > +	ret = __rproc_detach(rproc);
> > +	if (ret) {
> > +		atomic_inc(&rproc->power);
> > +		goto out;
> > +	}
> > +
> > +	/* clean up all acquired resources */
> > +	rproc_resource_cleanup(rproc);
> 
> I started to test the series, I found 2 problems testing in STM32P1 board.
> 
> 1) the resource_table pointer is unmapped if the firmware has been booted by the
> Linux, generating a crash in rproc_free_vring.
> I attached a fix at the end of the mail.
> 

I have reproduced the condition on my side and confirm that your solution is
correct.  See below for a minor comment. 

> 2) After the detach, the rproc state is "detached"
> but it is no longer possible to re-attach to it correctly.
> Neither if the firmware is standalone, nor if it has been booted
> by the Linux.
> 

Did you update your FW image?  If so, I need to run the same one.

> I did not investigate, but the issue is probably linked to the resource
> table address which is set to NULL.
> 
> So we either have to fix the problem in order to attach or forbid the transition.
> 
> 
> Regards,
> Arnaud
> 
> > +
> > +	rproc_disable_iommu(rproc);
> > +
> > +	/*
> > +	 * Set the remote processor's table pointer to NULL.  Since mapping
> > +	 * of the resource table to a virtual address is done in the platform
> > +	 * driver, unmapping should also be done there.
> > +	 */
> > +	rproc->table_ptr = NULL;
> > +out:
> > +	mutex_unlock(&rproc->lock);
> > +	return ret;
> > +}
> > +EXPORT_SYMBOL(rproc_detach);
> > +
> >  /**
> >   * rproc_get_by_phandle() - find a remote processor by phandle
> >   * @phandle: phandle to the rproc
> > diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> > index da15b77583d3..329c1c071dcf 100644
> > --- a/include/linux/remoteproc.h
> > +++ b/include/linux/remoteproc.h
> > @@ -656,6 +656,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
> >  
> >  int rproc_boot(struct rproc *rproc);
> >  void rproc_shutdown(struct rproc *rproc);
> > +int rproc_detach(struct rproc *rproc);
> >  int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
> >  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
> >  int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
> > 
> 
> From: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> Date: Tue, 8 Dec 2020 18:54:51 +0100
> Subject: [PATCH] remoteproc: core: fix detach for unmapped table_ptr
> 
> If the firmware has been loaded and started by the kernel, the
> resource table has probably been mapped by the carveout allocation
> (see rproc_elf_find_loaded_rsc_table).
> In this case the memory can have been unmapped before the vrings are free.
> The result is a crash that occurs in rproc_free_vring while try to use the
> unmapped pointer.
> 
> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> ---
>  drivers/remoteproc/remoteproc_core.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c
> b/drivers/remoteproc/remoteproc_core.c
> index 2b0a52fb3398..3508ffba4a2a 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1964,6 +1964,13 @@ int rproc_detach(struct rproc *rproc)
>  		goto out;
>  	}
> 
> +	/*
> +	 * Prevent case that the installed resource table is no longer
> +	 * accessible (e.g. memory unmapped), use the cache if available
> +	 */
> +	if (rproc->cached_table)
> +		rproc->table_ptr = rproc->cached_table;

I don't think there is an explicit need to check ->cached_table.  If the remote
processor has been started by the remoteproc core it is valid anyway.  And below
kfree() is called invariably. 

So that problem is fixed.  Let me know about your FW image and we'll pick it up
from there.

Mathieu

> +
>  	ret = __rproc_detach(rproc);
>  	if (ret) {
>  		atomic_inc(&rproc->power);
> @@ -1975,10 +1982,14 @@ int rproc_detach(struct rproc *rproc)
> 
>  	rproc_disable_iommu(rproc);
> 
> +	/* Free the chached table memory that can has been allocated*/
> +	kfree(rproc->cached_table);
> +	rproc->cached_table = NULL;
>  	/*
> -	 * Set the remote processor's table pointer to NULL.  Since mapping
> -	 * of the resource table to a virtual address is done in the platform
> -	 * driver, unmapping should also be done there.
> +	 * Set the remote processor's table pointer to NULL. If mapping
> +	 * of the resource table to a virtual address has been done in the
> +	 * platform driver(attachment to an existing firmware),
> +	 * unmapping should also be done there.
>  	 */
>  	rproc->table_ptr = NULL;
>  out:
> -- 
> 2.17.1
> 
> 
>
Arnaud POULIQUEN Dec. 9, 2020, 8:45 a.m. UTC | #6
On 12/9/20 1:53 AM, Mathieu Poirier wrote:
> On Tue, Dec 08, 2020 at 07:35:18PM +0100, Arnaud POULIQUEN wrote:
>> Hi Mathieu,
>>
>>
>> On 11/26/20 10:06 PM, Mathieu Poirier wrote:
>>> Introduce function rproc_detach() to enable the remoteproc
>>> core to release the resources associated with a remote processor
>>> without stopping its operation.
>>>
>>> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
>>> Reviewed-by: Peng Fan <peng.fan@nxp.com>
>>> ---
>>>  drivers/remoteproc/remoteproc_core.c | 65 +++++++++++++++++++++++++++-
>>>  include/linux/remoteproc.h           |  1 +
>>>  2 files changed, 65 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
>>> index 928b3f975798..f5adf05762e9 100644
>>> --- a/drivers/remoteproc/remoteproc_core.c
>>> +++ b/drivers/remoteproc/remoteproc_core.c
>>> @@ -1667,7 +1667,7 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
>>>  /*
>>>   * __rproc_detach(): Does the opposite of rproc_attach()
>>>   */
>>> -static int __maybe_unused __rproc_detach(struct rproc *rproc)
>>> +static int __rproc_detach(struct rproc *rproc)
>>>  {
>>>  	struct device *dev = &rproc->dev;
>>>  	int ret;
>>> @@ -1910,6 +1910,69 @@ void rproc_shutdown(struct rproc *rproc)
>>>  }
>>>  EXPORT_SYMBOL(rproc_shutdown);
>>>  
>>> +/**
>>> + * rproc_detach() - Detach the remote processor from the
>>> + * remoteproc core
>>> + *
>>> + * @rproc: the remote processor
>>> + *
>>> + * Detach a remote processor (previously attached to with rproc_actuate()).
>>> + *
>>> + * In case @rproc is still being used by an additional user(s), then
>>> + * this function will just decrement the power refcount and exit,
>>> + * without disconnecting the device.
>>> + *
>>> + * Function rproc_detach() calls __rproc_detach() in order to let a remote
>>> + * processor know that services provided by the application processor are
>>> + * no longer available.  From there it should be possible to remove the
>>> + * platform driver and even power cycle the application processor (if the HW
>>> + * supports it) without needing to switch off the remote processor.
>>> + */
>>> +int rproc_detach(struct rproc *rproc)
>>> +{
>>> +	struct device *dev = &rproc->dev;
>>> +	int ret;
>>> +
>>> +	ret = mutex_lock_interruptible(&rproc->lock);
>>> +	if (ret) {
>>> +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
>>> +		return ret;
>>> +	}
>>> +
>>> +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
>>> +		ret = -EPERM;
>>> +		goto out;
>>> +	}
>>> +
>>> +	/* if the remote proc is still needed, bail out */
>>> +	if (!atomic_dec_and_test(&rproc->power)) {
>>> +		ret = -EBUSY;
>>> +		goto out;
>>> +	}
>>> +
>>> +	ret = __rproc_detach(rproc);
>>> +	if (ret) {
>>> +		atomic_inc(&rproc->power);
>>> +		goto out;
>>> +	}
>>> +
>>> +	/* clean up all acquired resources */
>>> +	rproc_resource_cleanup(rproc);
>>
>> I started to test the series, I found 2 problems testing in STM32P1 board.
>>
>> 1) the resource_table pointer is unmapped if the firmware has been booted by the
>> Linux, generating a crash in rproc_free_vring.
>> I attached a fix at the end of the mail.
>>
> 
> I have reproduced the condition on my side and confirm that your solution is
> correct.  See below for a minor comment. 
> 
>> 2) After the detach, the rproc state is "detached"
>> but it is no longer possible to re-attach to it correctly.
>> Neither if the firmware is standalone, nor if it has been booted
>> by the Linux.
>>
> 
> Did you update your FW image?  If so, I need to run the same one.
> 
>> I did not investigate, but the issue is probably linked to the resource
>> table address which is set to NULL.
>>
>> So we either have to fix the problem in order to attach or forbid the transition.
>>
>>
>> Regards,
>> Arnaud
>>
>>> +
>>> +	rproc_disable_iommu(rproc);
>>> +
>>> +	/*
>>> +	 * Set the remote processor's table pointer to NULL.  Since mapping
>>> +	 * of the resource table to a virtual address is done in the platform
>>> +	 * driver, unmapping should also be done there.
>>> +	 */
>>> +	rproc->table_ptr = NULL;
>>> +out:
>>> +	mutex_unlock(&rproc->lock);
>>> +	return ret;
>>> +}
>>> +EXPORT_SYMBOL(rproc_detach);
>>> +
>>>  /**
>>>   * rproc_get_by_phandle() - find a remote processor by phandle
>>>   * @phandle: phandle to the rproc
>>> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
>>> index da15b77583d3..329c1c071dcf 100644
>>> --- a/include/linux/remoteproc.h
>>> +++ b/include/linux/remoteproc.h
>>> @@ -656,6 +656,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
>>>  
>>>  int rproc_boot(struct rproc *rproc);
>>>  void rproc_shutdown(struct rproc *rproc);
>>> +int rproc_detach(struct rproc *rproc);
>>>  int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
>>>  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
>>>  int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
>>>
>>
>> From: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
>> Date: Tue, 8 Dec 2020 18:54:51 +0100
>> Subject: [PATCH] remoteproc: core: fix detach for unmapped table_ptr
>>
>> If the firmware has been loaded and started by the kernel, the
>> resource table has probably been mapped by the carveout allocation
>> (see rproc_elf_find_loaded_rsc_table).
>> In this case the memory can have been unmapped before the vrings are free.
>> The result is a crash that occurs in rproc_free_vring while try to use the
>> unmapped pointer.
>>
>> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
>> ---
>>  drivers/remoteproc/remoteproc_core.c | 17 ++++++++++++++---
>>  1 file changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/remoteproc/remoteproc_core.c
>> b/drivers/remoteproc/remoteproc_core.c
>> index 2b0a52fb3398..3508ffba4a2a 100644
>> --- a/drivers/remoteproc/remoteproc_core.c
>> +++ b/drivers/remoteproc/remoteproc_core.c
>> @@ -1964,6 +1964,13 @@ int rproc_detach(struct rproc *rproc)
>>  		goto out;
>>  	}
>>
>> +	/*
>> +	 * Prevent case that the installed resource table is no longer
>> +	 * accessible (e.g. memory unmapped), use the cache if available
>> +	 */
>> +	if (rproc->cached_table)
>> +		rproc->table_ptr = rproc->cached_table;
> 
> I don't think there is an explicit need to check ->cached_table.  If the remote
> processor has been started by the remoteproc core it is valid anyway.  And below
> kfree() is called invariably. 

The condition is needed, the  rproc->cached_table is null if the firmware as
been preloaded and the Linux remote proc just attaches to it.
The cached is used only when Linux loads the firmware, as the resource table is
extracted from the elf file to parse resource before the load of the firmware.

> 
> So that problem is fixed.  Let me know about your FW image and we'll pick it up
> from there.

I use the following example available on the stm32mp1 image:
/usr/local/Cube-M4-examples/STM32MP157C-DK2/Applications/OpenAMP/OpenAMP_TTY_echo_wakeup/lib/firmware/
This exemple use the RPMsg and also blink a LED when while running.

Don't hesitate if you need me to send it to you by mail.

Thank,
Arnaud

> 
> Mathieu
> 
>> +
>>  	ret = __rproc_detach(rproc);
>>  	if (ret) {
>>  		atomic_inc(&rproc->power);
>> @@ -1975,10 +1982,14 @@ int rproc_detach(struct rproc *rproc)
>>
>>  	rproc_disable_iommu(rproc);
>>
>> +	/* Free the chached table memory that can has been allocated*/
>> +	kfree(rproc->cached_table);
>> +	rproc->cached_table = NULL;
>>  	/*
>> -	 * Set the remote processor's table pointer to NULL.  Since mapping
>> -	 * of the resource table to a virtual address is done in the platform
>> -	 * driver, unmapping should also be done there.
>> +	 * Set the remote processor's table pointer to NULL. If mapping
>> +	 * of the resource table to a virtual address has been done in the
>> +	 * platform driver(attachment to an existing firmware),
>> +	 * unmapping should also be done there.
>>  	 */
>>  	rproc->table_ptr = NULL;
>>  out:
>> -- 
>> 2.17.1
>>
>>
>>
Arnaud POULIQUEN Dec. 9, 2020, 8:47 a.m. UTC | #7
On 11/26/20 10:06 PM, Mathieu Poirier wrote:
> This patch introduces the capability to detach a remote processor
> that has been attached to or booted by the remoteproc core.  For
> that to happen a rproc::ops::detach() operation need to be
> available.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> Reviewed-by: Peng Fan <peng.fan@nxp.com>

Reviewed-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>

Thanks,
Arnaud
> ---
>  drivers/remoteproc/remoteproc_cdev.c  | 6 ++++++
>  drivers/remoteproc/remoteproc_sysfs.c | 6 ++++++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/drivers/remoteproc/remoteproc_cdev.c b/drivers/remoteproc/remoteproc_cdev.c
> index 61541bc7d26c..f7645f289563 100644
> --- a/drivers/remoteproc/remoteproc_cdev.c
> +++ b/drivers/remoteproc/remoteproc_cdev.c
> @@ -43,6 +43,12 @@ static ssize_t rproc_cdev_write(struct file *filp, const char __user *buf, size_
>  			return -EINVAL;
>  
>  		ret = rproc_shutdown(rproc);
> +	} else if (!strncmp(cmd, "detach", len)) {
> +		if (rproc->state != RPROC_RUNNING &&
> +		    rproc->state != RPROC_ATTACHED)
> +			return -EINVAL;
> +
> +		ret = rproc_detach(rproc);
>  	} else {
>  		dev_err(&rproc->dev, "Unrecognized option\n");
>  		ret = -EINVAL;
> diff --git a/drivers/remoteproc/remoteproc_sysfs.c b/drivers/remoteproc/remoteproc_sysfs.c
> index 7d281cfe3e03..5a239df5877e 100644
> --- a/drivers/remoteproc/remoteproc_sysfs.c
> +++ b/drivers/remoteproc/remoteproc_sysfs.c
> @@ -207,6 +207,12 @@ static ssize_t state_store(struct device *dev,
>  			return -EINVAL;
>  
>  		ret = rproc_shutdown(rproc);
> +	} else if (sysfs_streq(buf, "detach")) {
> +		if (rproc->state != RPROC_RUNNING &&
> +		    rproc->state != RPROC_ATTACHED)
> +			return -EINVAL;
> +
> +		ret = rproc_detach(rproc);
>  	} else {
>  		dev_err(&rproc->dev, "Unrecognised option: %s\n", buf);
>  		ret = -EINVAL;
>
Arnaud POULIQUEN Dec. 9, 2020, 10:13 a.m. UTC | #8
On 11/26/20 10:06 PM, Mathieu Poirier wrote:
> Refactor function rproc_del() and rproc_cdev_release() to take
> into account the policy specified in the device tree.
> 
> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> ---
>  drivers/remoteproc/remoteproc_cdev.c | 13 +++++++++++-
>  drivers/remoteproc/remoteproc_core.c | 30 ++++++++++++++++++++++++++--
>  include/linux/remoteproc.h           |  4 ++++
>  3 files changed, 44 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_cdev.c b/drivers/remoteproc/remoteproc_cdev.c
> index f7645f289563..3dfe555dfc07 100644
> --- a/drivers/remoteproc/remoteproc_cdev.c
> +++ b/drivers/remoteproc/remoteproc_cdev.c
> @@ -88,7 +88,18 @@ static int rproc_cdev_release(struct inode *inode, struct file *filp)
>  {
>  	struct rproc *rproc = container_of(inode->i_cdev, struct rproc, cdev);
>  
> -	if (rproc->cdev_put_on_release && rproc->state == RPROC_RUNNING)
> +	if (!rproc->cdev_put_on_release)
> +		return 0;
> +
> +	/*
> +	 * The application has crashed or is releasing its file handle.  Detach
> +	 * or shutdown the remote processor based on the policy specified in the
> +	 * DT.  No need to check rproc->state right away, it will be done
> +	 * in either rproc_detach() or rproc_shutdown().
> +	 */
> +	if (rproc->autonomous_on_core_shutdown)
> +		rproc_detach(rproc);
> +	else
>  		rproc_shutdown(rproc);

A reason to not propagate the return of functions?

>  
>  	return 0;
> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> index 3d7d245edc4e..1a170103bf27 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -2294,6 +2294,22 @@ static int rproc_alloc_ops(struct rproc *rproc, const struct rproc_ops *ops)
>  	return 0;
>  }
>  
> +static void rproc_set_automation_flags(struct rproc *rproc)
> +{
> +	struct device *dev = rproc->dev.parent;
> +	struct device_node *np = dev->of_node;
> +	bool core_shutdown;
> +
> +	/*
> +	 * When function rproc_cdev_release() or rproc_del() are called and
> +	 * the remote processor has been attached to, it will be detached from
> +	 * (rather than turned off) if "autonomous-on-core-shutdown is specified
> +	 * in the DT.
> +	 */
> +	core_shutdown = of_property_read_bool(np, "autonomous-on-core-shutdown");
> +	rproc->autonomous_on_core_shutdown = core_shutdown;
> +}
> +
>  /**
>   * rproc_alloc() - allocate a remote processor handle
>   * @dev: the underlying device
> @@ -2352,6 +2368,8 @@ struct rproc *rproc_alloc(struct device *dev, const char *name,
>  	if (rproc_alloc_ops(rproc, ops))
>  		goto put_device;
>  
> +	rproc_set_automation_flags(rproc);
> +
>  	/* Assign a unique device index and name */
>  	rproc->index = ida_simple_get(&rproc_dev_index, 0, 0, GFP_KERNEL);
>  	if (rproc->index < 0) {
> @@ -2435,8 +2453,16 @@ int rproc_del(struct rproc *rproc)
>  	if (!rproc)
>  		return -EINVAL;
>  
> -	/* TODO: make sure this works with rproc->power > 1 */
> -	rproc_shutdown(rproc);
> +	/*
> +	 * TODO: make sure this works with rproc->power > 1
> +	 *
> +	 * No need to check rproc->state right away, it will be done in either
> +	 * rproc_detach() or rproc_shutdown().
> +	 */
> +	if (rproc->autonomous_on_core_shutdown)
> +		rproc_detach(rproc);
> +	else
> +		rproc_shutdown(rproc);

same here

>  
>  	mutex_lock(&rproc->lock);
>  	rproc->state = RPROC_DELETED;
> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> index 02312096d59f..5702f630d810 100644
> --- a/include/linux/remoteproc.h
> +++ b/include/linux/remoteproc.h
> @@ -516,6 +516,9 @@ struct rproc_dump_segment {
>   * @nb_vdev: number of vdev currently handled by rproc
>   * @char_dev: character device of the rproc
>   * @cdev_put_on_release: flag to indicate if remoteproc should be shutdown on @char_dev release
> + * @autonomous_on_core_shutdown: true if the remote processor should be detached
> + *				 from (rather than turned off) when the remoteproc
> + *				 core goes away.
>   */
>  struct rproc {
>  	struct list_head node;
> @@ -554,6 +557,7 @@ struct rproc {
>  	u16 elf_machine;
>  	struct cdev cdev;
>  	bool cdev_put_on_release;
> +	bool autonomous_on_core_shutdown;
>  };
>  
>  /**
>
Mathieu Poirier Dec. 9, 2020, 9:18 p.m. UTC | #9
On Wed, Dec 09, 2020 at 09:45:32AM +0100, Arnaud POULIQUEN wrote:
> 
> 
> On 12/9/20 1:53 AM, Mathieu Poirier wrote:
> > On Tue, Dec 08, 2020 at 07:35:18PM +0100, Arnaud POULIQUEN wrote:
> >> Hi Mathieu,
> >>
> >>
> >> On 11/26/20 10:06 PM, Mathieu Poirier wrote:
> >>> Introduce function rproc_detach() to enable the remoteproc
> >>> core to release the resources associated with a remote processor
> >>> without stopping its operation.
> >>>
> >>> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> >>> Reviewed-by: Peng Fan <peng.fan@nxp.com>
> >>> ---
> >>>  drivers/remoteproc/remoteproc_core.c | 65 +++++++++++++++++++++++++++-
> >>>  include/linux/remoteproc.h           |  1 +
> >>>  2 files changed, 65 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> >>> index 928b3f975798..f5adf05762e9 100644
> >>> --- a/drivers/remoteproc/remoteproc_core.c
> >>> +++ b/drivers/remoteproc/remoteproc_core.c
> >>> @@ -1667,7 +1667,7 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
> >>>  /*
> >>>   * __rproc_detach(): Does the opposite of rproc_attach()
> >>>   */
> >>> -static int __maybe_unused __rproc_detach(struct rproc *rproc)
> >>> +static int __rproc_detach(struct rproc *rproc)
> >>>  {
> >>>  	struct device *dev = &rproc->dev;
> >>>  	int ret;
> >>> @@ -1910,6 +1910,69 @@ void rproc_shutdown(struct rproc *rproc)
> >>>  }
> >>>  EXPORT_SYMBOL(rproc_shutdown);
> >>>  
> >>> +/**
> >>> + * rproc_detach() - Detach the remote processor from the
> >>> + * remoteproc core
> >>> + *
> >>> + * @rproc: the remote processor
> >>> + *
> >>> + * Detach a remote processor (previously attached to with rproc_actuate()).
> >>> + *
> >>> + * In case @rproc is still being used by an additional user(s), then
> >>> + * this function will just decrement the power refcount and exit,
> >>> + * without disconnecting the device.
> >>> + *
> >>> + * Function rproc_detach() calls __rproc_detach() in order to let a remote
> >>> + * processor know that services provided by the application processor are
> >>> + * no longer available.  From there it should be possible to remove the
> >>> + * platform driver and even power cycle the application processor (if the HW
> >>> + * supports it) without needing to switch off the remote processor.
> >>> + */
> >>> +int rproc_detach(struct rproc *rproc)
> >>> +{
> >>> +	struct device *dev = &rproc->dev;
> >>> +	int ret;
> >>> +
> >>> +	ret = mutex_lock_interruptible(&rproc->lock);
> >>> +	if (ret) {
> >>> +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
> >>> +		return ret;
> >>> +	}
> >>> +
> >>> +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
> >>> +		ret = -EPERM;
> >>> +		goto out;
> >>> +	}
> >>> +
> >>> +	/* if the remote proc is still needed, bail out */
> >>> +	if (!atomic_dec_and_test(&rproc->power)) {
> >>> +		ret = -EBUSY;
> >>> +		goto out;
> >>> +	}
> >>> +
> >>> +	ret = __rproc_detach(rproc);
> >>> +	if (ret) {
> >>> +		atomic_inc(&rproc->power);
> >>> +		goto out;
> >>> +	}
> >>> +
> >>> +	/* clean up all acquired resources */
> >>> +	rproc_resource_cleanup(rproc);
> >>
> >> I started to test the series, I found 2 problems testing in STM32P1 board.
> >>
> >> 1) the resource_table pointer is unmapped if the firmware has been booted by the
> >> Linux, generating a crash in rproc_free_vring.
> >> I attached a fix at the end of the mail.
> >>
> > 
> > I have reproduced the condition on my side and confirm that your solution is
> > correct.  See below for a minor comment. 
> > 
> >> 2) After the detach, the rproc state is "detached"
> >> but it is no longer possible to re-attach to it correctly.
> >> Neither if the firmware is standalone, nor if it has been booted
> >> by the Linux.
> >>
> > 
> > Did you update your FW image?  If so, I need to run the same one.
> > 
> >> I did not investigate, but the issue is probably linked to the resource
> >> table address which is set to NULL.
> >>
> >> So we either have to fix the problem in order to attach or forbid the transition.
> >>
> >>
> >> Regards,
> >> Arnaud
> >>
> >>> +
> >>> +	rproc_disable_iommu(rproc);
> >>> +
> >>> +	/*
> >>> +	 * Set the remote processor's table pointer to NULL.  Since mapping
> >>> +	 * of the resource table to a virtual address is done in the platform
> >>> +	 * driver, unmapping should also be done there.
> >>> +	 */
> >>> +	rproc->table_ptr = NULL;
> >>> +out:
> >>> +	mutex_unlock(&rproc->lock);
> >>> +	return ret;
> >>> +}
> >>> +EXPORT_SYMBOL(rproc_detach);
> >>> +
> >>>  /**
> >>>   * rproc_get_by_phandle() - find a remote processor by phandle
> >>>   * @phandle: phandle to the rproc
> >>> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> >>> index da15b77583d3..329c1c071dcf 100644
> >>> --- a/include/linux/remoteproc.h
> >>> +++ b/include/linux/remoteproc.h
> >>> @@ -656,6 +656,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
> >>>  
> >>>  int rproc_boot(struct rproc *rproc);
> >>>  void rproc_shutdown(struct rproc *rproc);
> >>> +int rproc_detach(struct rproc *rproc);
> >>>  int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
> >>>  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
> >>>  int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
> >>>
> >>
> >> From: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> >> Date: Tue, 8 Dec 2020 18:54:51 +0100
> >> Subject: [PATCH] remoteproc: core: fix detach for unmapped table_ptr
> >>
> >> If the firmware has been loaded and started by the kernel, the
> >> resource table has probably been mapped by the carveout allocation
> >> (see rproc_elf_find_loaded_rsc_table).
> >> In this case the memory can have been unmapped before the vrings are free.
> >> The result is a crash that occurs in rproc_free_vring while try to use the
> >> unmapped pointer.
> >>
> >> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
> >> ---
> >>  drivers/remoteproc/remoteproc_core.c | 17 ++++++++++++++---
> >>  1 file changed, 14 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/remoteproc/remoteproc_core.c
> >> b/drivers/remoteproc/remoteproc_core.c
> >> index 2b0a52fb3398..3508ffba4a2a 100644
> >> --- a/drivers/remoteproc/remoteproc_core.c
> >> +++ b/drivers/remoteproc/remoteproc_core.c
> >> @@ -1964,6 +1964,13 @@ int rproc_detach(struct rproc *rproc)
> >>  		goto out;
> >>  	}
> >>
> >> +	/*
> >> +	 * Prevent case that the installed resource table is no longer
> >> +	 * accessible (e.g. memory unmapped), use the cache if available
> >> +	 */
> >> +	if (rproc->cached_table)
> >> +		rproc->table_ptr = rproc->cached_table;
> > 
> > I don't think there is an explicit need to check ->cached_table.  If the remote
> > processor has been started by the remoteproc core it is valid anyway.  And below
> > kfree() is called invariably. 
> 
> The condition is needed, the  rproc->cached_table is null if the firmware as
> been preloaded and the Linux remote proc just attaches to it.
> The cached is used only when Linux loads the firmware, as the resource table is
> extracted from the elf file to parse resource before the load of the firmware.

I have taken another look at this and you are correct. The if() condition is
needed because ->table_ptr is set only once when the platform driver is
probed.  See further down...

> 
> > 
> > So that problem is fixed.  Let me know about your FW image and we'll pick it up
> > from there.
> 
> I use the following example available on the stm32mp1 image:
> /usr/local/Cube-M4-examples/STM32MP157C-DK2/Applications/OpenAMP/OpenAMP_TTY_echo_wakeup/lib/firmware/
> This exemple use the RPMsg and also blink a LED when while running.
> 
> Don't hesitate if you need me to send it to you by mail.
> 
> Thank,
> Arnaud
> 
> > 
> > Mathieu
> > 
> >> +
> >>  	ret = __rproc_detach(rproc);
> >>  	if (ret) {
> >>  		atomic_inc(&rproc->power);
> >> @@ -1975,10 +1982,14 @@ int rproc_detach(struct rproc *rproc)
> >>
> >>  	rproc_disable_iommu(rproc);
> >>
> >> +	/* Free the chached table memory that can has been allocated*/
> >> +	kfree(rproc->cached_table);
> >> +	rproc->cached_table = NULL;
> >>  	/*
> >> -	 * Set the remote processor's table pointer to NULL.  Since mapping
> >> -	 * of the resource table to a virtual address is done in the platform
> >> -	 * driver, unmapping should also be done there.
> >> +	 * Set the remote processor's table pointer to NULL. If mapping
> >> +	 * of the resource table to a virtual address has been done in the
> >> +	 * platform driver(attachment to an existing firmware),
> >> +	 * unmapping should also be done there.
> >>  	 */
> >>  	rproc->table_ptr = NULL;

With the above in mind we can't to that, otherwise trying to re-attach with
rproc_attach() won't work because ->table_ptr will be NULL.

I wasn't able to test that code path because I didn't have the FW that supported
detaching.  Now that the feature is maturing it needs to be done.  

> >>  out:
> >> -- 
> >> 2.17.1
> >>
> >>
> >>
Mathieu Poirier Dec. 9, 2020, 9:34 p.m. UTC | #10
On Wed, Dec 09, 2020 at 11:13:07AM +0100, Arnaud POULIQUEN wrote:
> 
> 
> On 11/26/20 10:06 PM, Mathieu Poirier wrote:
> > Refactor function rproc_del() and rproc_cdev_release() to take
> > into account the policy specified in the device tree.
> > 
> > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> > ---
> >  drivers/remoteproc/remoteproc_cdev.c | 13 +++++++++++-
> >  drivers/remoteproc/remoteproc_core.c | 30 ++++++++++++++++++++++++++--
> >  include/linux/remoteproc.h           |  4 ++++
> >  3 files changed, 44 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/remoteproc/remoteproc_cdev.c b/drivers/remoteproc/remoteproc_cdev.c
> > index f7645f289563..3dfe555dfc07 100644
> > --- a/drivers/remoteproc/remoteproc_cdev.c
> > +++ b/drivers/remoteproc/remoteproc_cdev.c
> > @@ -88,7 +88,18 @@ static int rproc_cdev_release(struct inode *inode, struct file *filp)
> >  {
> >  	struct rproc *rproc = container_of(inode->i_cdev, struct rproc, cdev);
> >  
> > -	if (rproc->cdev_put_on_release && rproc->state == RPROC_RUNNING)
> > +	if (!rproc->cdev_put_on_release)
> > +		return 0;
> > +
> > +	/*
> > +	 * The application has crashed or is releasing its file handle.  Detach
> > +	 * or shutdown the remote processor based on the policy specified in the
> > +	 * DT.  No need to check rproc->state right away, it will be done
> > +	 * in either rproc_detach() or rproc_shutdown().
> > +	 */
> > +	if (rproc->autonomous_on_core_shutdown)
> > +		rproc_detach(rproc);
> > +	else
> >  		rproc_shutdown(rproc);
> 
> A reason to not propagate the return of functions?

A valid observation...  I'll fix it.

> 
> >  
> >  	return 0;
> > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
> > index 3d7d245edc4e..1a170103bf27 100644
> > --- a/drivers/remoteproc/remoteproc_core.c
> > +++ b/drivers/remoteproc/remoteproc_core.c
> > @@ -2294,6 +2294,22 @@ static int rproc_alloc_ops(struct rproc *rproc, const struct rproc_ops *ops)
> >  	return 0;
> >  }
> >  
> > +static void rproc_set_automation_flags(struct rproc *rproc)
> > +{
> > +	struct device *dev = rproc->dev.parent;
> > +	struct device_node *np = dev->of_node;
> > +	bool core_shutdown;
> > +
> > +	/*
> > +	 * When function rproc_cdev_release() or rproc_del() are called and
> > +	 * the remote processor has been attached to, it will be detached from
> > +	 * (rather than turned off) if "autonomous-on-core-shutdown is specified
> > +	 * in the DT.
> > +	 */
> > +	core_shutdown = of_property_read_bool(np, "autonomous-on-core-shutdown");
> > +	rproc->autonomous_on_core_shutdown = core_shutdown;
> > +}
> > +
> >  /**
> >   * rproc_alloc() - allocate a remote processor handle
> >   * @dev: the underlying device
> > @@ -2352,6 +2368,8 @@ struct rproc *rproc_alloc(struct device *dev, const char *name,
> >  	if (rproc_alloc_ops(rproc, ops))
> >  		goto put_device;
> >  
> > +	rproc_set_automation_flags(rproc);
> > +
> >  	/* Assign a unique device index and name */
> >  	rproc->index = ida_simple_get(&rproc_dev_index, 0, 0, GFP_KERNEL);
> >  	if (rproc->index < 0) {
> > @@ -2435,8 +2453,16 @@ int rproc_del(struct rproc *rproc)
> >  	if (!rproc)
> >  		return -EINVAL;
> >  
> > -	/* TODO: make sure this works with rproc->power > 1 */
> > -	rproc_shutdown(rproc);
> > +	/*
> > +	 * TODO: make sure this works with rproc->power > 1
> > +	 *
> > +	 * No need to check rproc->state right away, it will be done in either
> > +	 * rproc_detach() or rproc_shutdown().
> > +	 */
> > +	if (rproc->autonomous_on_core_shutdown)
> > +		rproc_detach(rproc);
> > +	else
> > +		rproc_shutdown(rproc);
> 
> same here
> 
> >  
> >  	mutex_lock(&rproc->lock);
> >  	rproc->state = RPROC_DELETED;
> > diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> > index 02312096d59f..5702f630d810 100644
> > --- a/include/linux/remoteproc.h
> > +++ b/include/linux/remoteproc.h
> > @@ -516,6 +516,9 @@ struct rproc_dump_segment {
> >   * @nb_vdev: number of vdev currently handled by rproc
> >   * @char_dev: character device of the rproc
> >   * @cdev_put_on_release: flag to indicate if remoteproc should be shutdown on @char_dev release
> > + * @autonomous_on_core_shutdown: true if the remote processor should be detached
> > + *				 from (rather than turned off) when the remoteproc
> > + *				 core goes away.
> >   */
> >  struct rproc {
> >  	struct list_head node;
> > @@ -554,6 +557,7 @@ struct rproc {
> >  	u16 elf_machine;
> >  	struct cdev cdev;
> >  	bool cdev_put_on_release;
> > +	bool autonomous_on_core_shutdown;
> >  };
> >  
> >  /**
> >
Arnaud POULIQUEN Dec. 10, 2020, 8:51 a.m. UTC | #11
On 12/9/20 10:18 PM, Mathieu Poirier wrote:
> On Wed, Dec 09, 2020 at 09:45:32AM +0100, Arnaud POULIQUEN wrote:
>>
>>
>> On 12/9/20 1:53 AM, Mathieu Poirier wrote:
>>> On Tue, Dec 08, 2020 at 07:35:18PM +0100, Arnaud POULIQUEN wrote:
>>>> Hi Mathieu,
>>>>
>>>>
>>>> On 11/26/20 10:06 PM, Mathieu Poirier wrote:
>>>>> Introduce function rproc_detach() to enable the remoteproc
>>>>> core to release the resources associated with a remote processor
>>>>> without stopping its operation.
>>>>>
>>>>> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
>>>>> Reviewed-by: Peng Fan <peng.fan@nxp.com>
>>>>> ---
>>>>>  drivers/remoteproc/remoteproc_core.c | 65 +++++++++++++++++++++++++++-
>>>>>  include/linux/remoteproc.h           |  1 +
>>>>>  2 files changed, 65 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c
>>>>> index 928b3f975798..f5adf05762e9 100644
>>>>> --- a/drivers/remoteproc/remoteproc_core.c
>>>>> +++ b/drivers/remoteproc/remoteproc_core.c
>>>>> @@ -1667,7 +1667,7 @@ static int rproc_stop(struct rproc *rproc, bool crashed)
>>>>>  /*
>>>>>   * __rproc_detach(): Does the opposite of rproc_attach()
>>>>>   */
>>>>> -static int __maybe_unused __rproc_detach(struct rproc *rproc)
>>>>> +static int __rproc_detach(struct rproc *rproc)
>>>>>  {
>>>>>  	struct device *dev = &rproc->dev;
>>>>>  	int ret;
>>>>> @@ -1910,6 +1910,69 @@ void rproc_shutdown(struct rproc *rproc)
>>>>>  }
>>>>>  EXPORT_SYMBOL(rproc_shutdown);
>>>>>  
>>>>> +/**
>>>>> + * rproc_detach() - Detach the remote processor from the
>>>>> + * remoteproc core
>>>>> + *
>>>>> + * @rproc: the remote processor
>>>>> + *
>>>>> + * Detach a remote processor (previously attached to with rproc_actuate()).
>>>>> + *
>>>>> + * In case @rproc is still being used by an additional user(s), then
>>>>> + * this function will just decrement the power refcount and exit,
>>>>> + * without disconnecting the device.
>>>>> + *
>>>>> + * Function rproc_detach() calls __rproc_detach() in order to let a remote
>>>>> + * processor know that services provided by the application processor are
>>>>> + * no longer available.  From there it should be possible to remove the
>>>>> + * platform driver and even power cycle the application processor (if the HW
>>>>> + * supports it) without needing to switch off the remote processor.
>>>>> + */
>>>>> +int rproc_detach(struct rproc *rproc)
>>>>> +{
>>>>> +	struct device *dev = &rproc->dev;
>>>>> +	int ret;
>>>>> +
>>>>> +	ret = mutex_lock_interruptible(&rproc->lock);
>>>>> +	if (ret) {
>>>>> +		dev_err(dev, "can't lock rproc %s: %d\n", rproc->name, ret);
>>>>> +		return ret;
>>>>> +	}
>>>>> +
>>>>> +	if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED) {
>>>>> +		ret = -EPERM;
>>>>> +		goto out;
>>>>> +	}
>>>>> +
>>>>> +	/* if the remote proc is still needed, bail out */
>>>>> +	if (!atomic_dec_and_test(&rproc->power)) {
>>>>> +		ret = -EBUSY;
>>>>> +		goto out;
>>>>> +	}
>>>>> +
>>>>> +	ret = __rproc_detach(rproc);
>>>>> +	if (ret) {
>>>>> +		atomic_inc(&rproc->power);
>>>>> +		goto out;
>>>>> +	}
>>>>> +
>>>>> +	/* clean up all acquired resources */
>>>>> +	rproc_resource_cleanup(rproc);
>>>>
>>>> I started to test the series, I found 2 problems testing in STM32P1 board.
>>>>
>>>> 1) the resource_table pointer is unmapped if the firmware has been booted by the
>>>> Linux, generating a crash in rproc_free_vring.
>>>> I attached a fix at the end of the mail.
>>>>
>>>
>>> I have reproduced the condition on my side and confirm that your solution is
>>> correct.  See below for a minor comment. 
>>>
>>>> 2) After the detach, the rproc state is "detached"
>>>> but it is no longer possible to re-attach to it correctly.
>>>> Neither if the firmware is standalone, nor if it has been booted
>>>> by the Linux.
>>>>
>>>
>>> Did you update your FW image?  If so, I need to run the same one.
>>>
>>>> I did not investigate, but the issue is probably linked to the resource
>>>> table address which is set to NULL.
>>>>
>>>> So we either have to fix the problem in order to attach or forbid the transition.
>>>>
>>>>
>>>> Regards,
>>>> Arnaud
>>>>
>>>>> +
>>>>> +	rproc_disable_iommu(rproc);
>>>>> +
>>>>> +	/*
>>>>> +	 * Set the remote processor's table pointer to NULL.  Since mapping
>>>>> +	 * of the resource table to a virtual address is done in the platform
>>>>> +	 * driver, unmapping should also be done there.
>>>>> +	 */
>>>>> +	rproc->table_ptr = NULL;
>>>>> +out:
>>>>> +	mutex_unlock(&rproc->lock);
>>>>> +	return ret;
>>>>> +}
>>>>> +EXPORT_SYMBOL(rproc_detach);
>>>>> +
>>>>>  /**
>>>>>   * rproc_get_by_phandle() - find a remote processor by phandle
>>>>>   * @phandle: phandle to the rproc
>>>>> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
>>>>> index da15b77583d3..329c1c071dcf 100644
>>>>> --- a/include/linux/remoteproc.h
>>>>> +++ b/include/linux/remoteproc.h
>>>>> @@ -656,6 +656,7 @@ rproc_of_resm_mem_entry_init(struct device *dev, u32 of_resm_idx, size_t len,
>>>>>  
>>>>>  int rproc_boot(struct rproc *rproc);
>>>>>  void rproc_shutdown(struct rproc *rproc);
>>>>> +int rproc_detach(struct rproc *rproc);
>>>>>  int rproc_set_firmware(struct rproc *rproc, const char *fw_name);
>>>>>  void rproc_report_crash(struct rproc *rproc, enum rproc_crash_type type);
>>>>>  int rproc_coredump_add_segment(struct rproc *rproc, dma_addr_t da, size_t size);
>>>>>
>>>>
>>>> From: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
>>>> Date: Tue, 8 Dec 2020 18:54:51 +0100
>>>> Subject: [PATCH] remoteproc: core: fix detach for unmapped table_ptr
>>>>
>>>> If the firmware has been loaded and started by the kernel, the
>>>> resource table has probably been mapped by the carveout allocation
>>>> (see rproc_elf_find_loaded_rsc_table).
>>>> In this case the memory can have been unmapped before the vrings are free.
>>>> The result is a crash that occurs in rproc_free_vring while try to use the
>>>> unmapped pointer.
>>>>
>>>> Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss-st.com>
>>>> ---
>>>>  drivers/remoteproc/remoteproc_core.c | 17 ++++++++++++++---
>>>>  1 file changed, 14 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/remoteproc/remoteproc_core.c
>>>> b/drivers/remoteproc/remoteproc_core.c
>>>> index 2b0a52fb3398..3508ffba4a2a 100644
>>>> --- a/drivers/remoteproc/remoteproc_core.c
>>>> +++ b/drivers/remoteproc/remoteproc_core.c
>>>> @@ -1964,6 +1964,13 @@ int rproc_detach(struct rproc *rproc)
>>>>  		goto out;
>>>>  	}
>>>>
>>>> +	/*
>>>> +	 * Prevent case that the installed resource table is no longer
>>>> +	 * accessible (e.g. memory unmapped), use the cache if available
>>>> +	 */
>>>> +	if (rproc->cached_table)
>>>> +		rproc->table_ptr = rproc->cached_table;
>>>
>>> I don't think there is an explicit need to check ->cached_table.  If the remote
>>> processor has been started by the remoteproc core it is valid anyway.  And below
>>> kfree() is called invariably. 
>>
>> The condition is needed, the  rproc->cached_table is null if the firmware as
>> been preloaded and the Linux remote proc just attaches to it.
>> The cached is used only when Linux loads the firmware, as the resource table is
>> extracted from the elf file to parse resource before the load of the firmware.
> 
> I have taken another look at this and you are correct. The if() condition is
> needed because ->table_ptr is set only once when the platform driver is
> probed.  See further down...
> 
>>
>>>
>>> So that problem is fixed.  Let me know about your FW image and we'll pick it up
>>> from there.
>>
>> I use the following example available on the stm32mp1 image:
>> /usr/local/Cube-M4-examples/STM32MP157C-DK2/Applications/OpenAMP/OpenAMP_TTY_echo_wakeup/lib/firmware/
>> This exemple use the RPMsg and also blink a LED when while running.
>>
>> Don't hesitate if you need me to send it to you by mail.
>>
>> Thank,
>> Arnaud
>>
>>>
>>> Mathieu
>>>
>>>> +
>>>>  	ret = __rproc_detach(rproc);
>>>>  	if (ret) {
>>>>  		atomic_inc(&rproc->power);
>>>> @@ -1975,10 +1982,14 @@ int rproc_detach(struct rproc *rproc)
>>>>
>>>>  	rproc_disable_iommu(rproc);
>>>>
>>>> +	/* Free the chached table memory that can has been allocated*/
>>>> +	kfree(rproc->cached_table);
>>>> +	rproc->cached_table = NULL;
>>>>  	/*
>>>> -	 * Set the remote processor's table pointer to NULL.  Since mapping
>>>> -	 * of the resource table to a virtual address is done in the platform
>>>> -	 * driver, unmapping should also be done there.
>>>> +	 * Set the remote processor's table pointer to NULL. If mapping
>>>> +	 * of the resource table to a virtual address has been done in the
>>>> +	 * platform driver(attachment to an existing firmware),
>>>> +	 * unmapping should also be done there.
>>>>  	 */
>>>>  	rproc->table_ptr = NULL;
> 
> With the above in mind we can't to that, otherwise trying to re-attach with
> rproc_attach() won't work because ->table_ptr will be NULL.

Yes, or an alternative would be to call find_loaded_rsc_table on attach. I did
not test it but could make sense to call the ops instead of expecting that the
platform has set table_ptr.

> 
> I wasn't able to test that code path because I didn't have the FW that supported
> detaching.  Now that the feature is maturing it needs to be done.  
> 
>>>>  out:
>>>> -- 
>>>> 2.17.1
>>>>
>>>>
>>>>