Message ID | 20210211234627.2669674-1-mathieu.poirier@linaro.org |
---|---|
Headers | show |
Series | remoteproc: Add support for detaching a remote processor | expand |
Hi Mathieu, On 2/12/21 12:46 AM, Mathieu Poirier wrote: > Add a new get_loaded_rsc_table() operation in order to support > scenarios where the remoteproc core has booted a remote processor > and detaches from it. When re-attaching to the remote processor, > the core needs to know where the resource table has been placed > in memory. > > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> > --- > New for V5: > - Added function rproc_set_loaded_rsc_table() to keep rproc_attach() clean. > - Setting ->cached_table, ->table_ptr and ->table_sz in the remoteproc core > rather than the platform drivers. > --- > drivers/remoteproc/remoteproc_core.c | 35 ++++++++++++++++++++++++ > drivers/remoteproc/remoteproc_internal.h | 10 +++++++ > include/linux/remoteproc.h | 6 +++- > 3 files changed, 50 insertions(+), 1 deletion(-) > > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c > index e6606d10a4c8..741bc20de437 100644 > --- a/drivers/remoteproc/remoteproc_core.c > +++ b/drivers/remoteproc/remoteproc_core.c > @@ -1537,6 +1537,35 @@ static int rproc_fw_boot(struct rproc *rproc, const struct firmware *fw) > return ret; > } > > +static int rproc_set_loaded_rsc_table(struct rproc *rproc) > +{ > + struct resource_table *table_ptr; > + struct device *dev = &rproc->dev; > + size_t table_sz; > + int ret; > + > + table_ptr = rproc_get_loaded_rsc_table(rproc, &table_sz); > + if (IS_ERR_OR_NULL(table_ptr)) { > + if (!table_ptr) > + ret = -EINVAL; I did few tests on this showing that this approach does not cover all use cases. The first one is a firmware without resource table. In this case table_ptr should be null, or we have to consider the -ENOENT error as a non error usecase. The second one, more tricky, is a firmware started by the remoteproc framework. In this case the resource table address is retrieved from the ELF file by the core part. So if we detach and reattach rproc_get_loaded_rsc_table cannot return the address. Look to me that we should have also an alocation of the clean_table in rproc_start and then to keep the memory allocated until a shutdown. That said regarding the complexity to re-attach, I wonder if it would not be better to focus first on a simple detach, and address the reattachment in a separate series, to move forward in stages. Regards, Arnaud > + else > + ret = PTR_ERR(table_ptr); > + > + dev_err(dev, "can't load resource table: %d\n", ret); > + return ret; > + } > + > + /* > + * The resource table is already loaded in device memory, no need > + * to work with a cached table. > + */ > + rproc->cached_table = NULL; > + rproc->table_ptr = table_ptr; > + rproc->table_sz = table_sz; > + > + return 0; > +} > + > /* > * Attach to remote processor - similar to rproc_fw_boot() but without > * the steps that deal with the firmware image. > @@ -1556,6 +1585,12 @@ static int rproc_attach(struct rproc *rproc) > return ret; > } > > + ret = rproc_set_loaded_rsc_table(rproc); > + if (ret) { > + dev_err(dev, "can't load resource table: %d\n", ret); > + goto disable_iommu; > + } > + > /* reset max_notifyid */ > rproc->max_notifyid = -1; > > diff --git a/drivers/remoteproc/remoteproc_internal.h b/drivers/remoteproc/remoteproc_internal.h > index c34002888d2c..4f73aac7e60d 100644 > --- a/drivers/remoteproc/remoteproc_internal.h > +++ b/drivers/remoteproc/remoteproc_internal.h > @@ -177,6 +177,16 @@ struct resource_table *rproc_find_loaded_rsc_table(struct rproc *rproc, > return NULL; > } > > +static inline > +struct resource_table *rproc_get_loaded_rsc_table(struct rproc *rproc, > + size_t *size) > +{ > + if (rproc->ops->get_loaded_rsc_table) > + return rproc->ops->get_loaded_rsc_table(rproc, size); > + > + return NULL; > +} > + > static inline > bool rproc_u64_fit_in_size_t(u64 val) > { > diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h > index 6b0a0ed30a03..51538a7d120d 100644 > --- a/include/linux/remoteproc.h > +++ b/include/linux/remoteproc.h > @@ -368,7 +368,9 @@ enum rsc_handling_status { > * RSC_HANDLED if resource was handled, RSC_IGNORED if not handled and a > * negative value on error > * @load_rsc_table: load resource table from firmware image > - * @find_loaded_rsc_table: find the loaded resouce table > + * @find_loaded_rsc_table: find the loaded resource table from firmware image > + * @get_loaded_rsc_table: get resource table installed in memory > + * by external entity > * @load: load firmware to memory, where the remote processor > * expects to find it > * @sanity_check: sanity check the fw image > @@ -390,6 +392,8 @@ struct rproc_ops { > int offset, int avail); > struct resource_table *(*find_loaded_rsc_table)( > struct rproc *rproc, const struct firmware *fw); > + struct resource_table *(*get_loaded_rsc_table)( > + struct rproc *rproc, size_t *size); > int (*load)(struct rproc *rproc, const struct firmware *fw); > int (*sanity_check)(struct rproc *rproc, const struct firmware *fw); > u64 (*get_boot_addr)(struct rproc *rproc, const struct firmware *fw); >
On 2/12/21 12:46 AM, Mathieu Poirier wrote: > If it is possible to detach the remote processor, keep an untouched > copy of the resource table. That way we can start from the same > resource table without having to worry about original values or what > elements the startup code has changed when re-attaching to the remote > processor. > > Reported-by: Arnaud POULIQUEN <arnaud.pouliquen@foss.st.com> > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> > --- > drivers/remoteproc/remoteproc_core.c | 70 ++++++++++++++++++++++ > drivers/remoteproc/remoteproc_elf_loader.c | 24 +++++++- > include/linux/remoteproc.h | 3 + > 3 files changed, 95 insertions(+), 2 deletions(-) > > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c > index 660dcc002ff6..9a77cb6d6470 100644 > --- a/drivers/remoteproc/remoteproc_core.c > +++ b/drivers/remoteproc/remoteproc_core.c > @@ -1527,7 +1527,9 @@ static int rproc_fw_boot(struct rproc *rproc, const struct firmware *fw) > clean_up_resources: > rproc_resource_cleanup(rproc); > kfree(rproc->cached_table); > + kfree(rproc->clean_table); > rproc->cached_table = NULL; > + rproc->clean_table = NULL; > rproc->table_ptr = NULL; > unprepare_rproc: > /* release HW resources if needed */ > @@ -1555,6 +1557,23 @@ static int rproc_set_loaded_rsc_table(struct rproc *rproc) > return ret; > } > > + /* > + * If it is possible to detach the remote processor, keep an untouched > + * copy of the resource table. That way we can start fresh again when > + * the remote processor is re-attached, that is: > + * > + * DETACHED -> ATTACHED -> DETACHED -> ATTACHED > + * > + * A clean copy of the table is also taken in rproc_elf_load_rsc_table() > + * for cases where the remote processor is booted by the remoteproc > + * core and later detached from. > + */ > + if (rproc->ops->detach) { > + rproc->clean_table = kmemdup(table_ptr, table_sz, GFP_KERNEL); > + if (!rproc->clean_table) > + return -ENOMEM; > + } > + > /* > * The resource table is already loaded in device memory, no need > * to work with a cached table. > @@ -1566,6 +1585,40 @@ static int rproc_set_loaded_rsc_table(struct rproc *rproc) > return 0; > } > > +static int rproc_reset_loaded_rsc_table(struct rproc *rproc) > +{ > + /* > + * In order to detach() from a remote processor a clean resource table > + * _must_ have been allocated at boot time, either from rproc_fw_boot() > + * or from rproc_attach(). If one isn't present something went really > + * wrong and we must complain. > + */ > + if (WARN_ON(!rproc->clean_table)) > + return -EINVAL; > + > + /* > + * Install the clean resource table where the firmware, i.e > + * rproc_get_loaded_rsc_table(), expects it. > + */ > + memcpy(rproc->table_ptr, rproc->clean_table, rproc->table_sz); > + > + /* > + * If the remote processors was started by the core then a cached_table > + * is present and we must follow the same cleanup sequence as we would > + * for a shutdown(). As it is in rproc_stop(), use the cached resource > + * table for the rest of the detach process since ->table_ptr will > + * become invalid as soon as carveouts are released in > + * rproc_resource_cleanup(). > + * > + * If the remote processor was started by an external entity the > + * cached_table is NULL and the rest of the cleanup code in > + * rproc_free_vring() can deal with that. > + */ > + rproc->table_ptr = rproc->cached_table; > + > + return 0; > +} > + > /* > * Attach to remote processor - similar to rproc_fw_boot() but without > * the steps that deal with the firmware image. > @@ -1947,7 +2000,10 @@ void rproc_shutdown(struct rproc *rproc) > > /* Free the copy of the resource table */ > kfree(rproc->cached_table); > + /* Free the clean resource table */ > + kfree(rproc->clean_table); > rproc->cached_table = NULL; > + rproc->clean_table = NULL; > rproc->table_ptr = NULL; > out: > mutex_unlock(&rproc->lock); > @@ -2000,6 +2056,16 @@ int rproc_detach(struct rproc *rproc) > goto out; > } > > + /* > + * Install a clean resource table for re-attach while > + * rproc->table_ptr is still valid. > + */ > + ret = rproc_reset_loaded_rsc_table(rproc); > + if (ret) { > + atomic_inc(&rproc->power); > + goto out; > + } > + Here you rewrite the initial values in the loaded resource table but then rproc_resource_cleanup will clean up the resource table. That can lead to an overwrite, and perhaps to unexpected memory access, as DA and PA addresses are reinitialized. (e.g call of rproc_vdev_release that will overwrite the resource table) And because the vdev release is asynchronous, probably better to reinitialize the resource table on attach or in rproc_handle_resources. Regards, Arnaud > /* clean up all acquired resources */ > rproc_resource_cleanup(rproc); > > @@ -2008,10 +2074,14 @@ int rproc_detach(struct rproc *rproc) > > rproc_disable_iommu(rproc); > > + /* Free the copy of the resource table */ > + kfree(rproc->cached_table); > /* Follow the same sequence as in rproc_shutdown() */ > kfree(rproc->cached_table); > rproc->cached_table = NULL; > + rproc->clean_table = NULL; > rproc->table_ptr = NULL; > + > out: > mutex_unlock(&rproc->lock); > return ret; > diff --git a/drivers/remoteproc/remoteproc_elf_loader.c b/drivers/remoteproc/remoteproc_elf_loader.c > index df68d87752e4..aa09782c932d 100644 > --- a/drivers/remoteproc/remoteproc_elf_loader.c > +++ b/drivers/remoteproc/remoteproc_elf_loader.c > @@ -17,10 +17,11 @@ > > #define pr_fmt(fmt) "%s: " fmt, __func__ > > -#include <linux/module.h> > +#include <linux/elf.h> > #include <linux/firmware.h> > +#include <linux/module.h> > #include <linux/remoteproc.h> > -#include <linux/elf.h> > +#include <linux/slab.h> > > #include "remoteproc_internal.h" > #include "remoteproc_elf_helpers.h" > @@ -338,6 +339,25 @@ int rproc_elf_load_rsc_table(struct rproc *rproc, const struct firmware *fw) > if (!rproc->cached_table) > return -ENOMEM; > > + /* > + * If it is possible to detach the remote processor, keep an untouched > + * copy of the resource table. That way we can start fresh again when > + * the remote processor is re-attached, that is: > + * > + * OFFLINE -> RUNNING -> DETACHED -> ATTACHED > + * > + * A clean copy of the table is also taken in > + * rproc_set_loaded_rsc_table() for cases where the remote processor is > + * booted by an external entity and later detached from. > + */ > + if (rproc->ops->detach) { > + rproc->clean_table = kmemdup(table, tablesz, GFP_KERNEL); > + if (!rproc->clean_table) { > + kfree(rproc->cached_table); > + return -ENOMEM; > + } > + } > + > rproc->table_ptr = rproc->cached_table; > rproc->table_sz = tablesz; > > diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h > index e1c843c19cc6..e5f52a12a650 100644 > --- a/include/linux/remoteproc.h > +++ b/include/linux/remoteproc.h > @@ -514,6 +514,8 @@ struct rproc_dump_segment { > * @recovery_disabled: flag that state if recovery was disabled > * @max_notifyid: largest allocated notify id. > * @table_ptr: pointer to the resource table in effect > + * @clean_table: copy of the resource table without modifications. Used > + * when a remote processor is attached or detached from the core > * @cached_table: copy of the resource table > * @table_sz: size of @cached_table > * @has_iommu: flag to indicate if remote processor is behind an MMU > @@ -550,6 +552,7 @@ struct rproc { > bool recovery_disabled; > int max_notifyid; > struct resource_table *table_ptr; > + struct resource_table *clean_table; > struct resource_table *cached_table; > size_t table_sz; > bool has_iommu; >
On Mon, Feb 15, 2021 at 02:10:10PM +0100, Arnaud POULIQUEN wrote: > Hi Mathieu, > > On 2/12/21 12:46 AM, Mathieu Poirier wrote: > > Add a new get_loaded_rsc_table() operation in order to support > > scenarios where the remoteproc core has booted a remote processor > > and detaches from it. When re-attaching to the remote processor, > > the core needs to know where the resource table has been placed > > in memory. > > > > Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> > > --- > > New for V5: > > - Added function rproc_set_loaded_rsc_table() to keep rproc_attach() clean. > > - Setting ->cached_table, ->table_ptr and ->table_sz in the remoteproc core > > rather than the platform drivers. > > --- > > drivers/remoteproc/remoteproc_core.c | 35 ++++++++++++++++++++++++ > > drivers/remoteproc/remoteproc_internal.h | 10 +++++++ > > include/linux/remoteproc.h | 6 +++- > > 3 files changed, 50 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/remoteproc/remoteproc_core.c b/drivers/remoteproc/remoteproc_core.c > > index e6606d10a4c8..741bc20de437 100644 > > --- a/drivers/remoteproc/remoteproc_core.c > > +++ b/drivers/remoteproc/remoteproc_core.c > > @@ -1537,6 +1537,35 @@ static int rproc_fw_boot(struct rproc *rproc, const struct firmware *fw) > > return ret; > > } > > > > +static int rproc_set_loaded_rsc_table(struct rproc *rproc) > > +{ > > + struct resource_table *table_ptr; > > + struct device *dev = &rproc->dev; > > + size_t table_sz; > > + int ret; > > + > > + table_ptr = rproc_get_loaded_rsc_table(rproc, &table_sz); > > + if (IS_ERR_OR_NULL(table_ptr)) { > > + if (!table_ptr) > > + ret = -EINVAL; > > I did few tests on this showing that this approach does not cover all use cases. > > The first one is a firmware without resource table. In this case table_ptr > should be null, or we have to consider the -ENOENT error as a non error usecase. > Right, I'll provision for those cases. > The second one, more tricky, is a firmware started by the remoteproc framework. > In this case the resource table address is retrieved from the ELF file by the > core part. Correct. > So if we detach and reattach rproc_get_loaded_rsc_table cannot return the > address. Look to me that we should have also an alocation of the clean_table in > rproc_start and then to keep the memory allocated until a shutdown. I assumed the address of the resource table found in the ELF image was the same as the one known by the platform driver. In hindsight I realise the platform driver may not know that address. > > That said regarding the complexity to re-attach, I wonder if it would not be > better to focus first on a simple detach, and address the reattachment in a > separate series, to move forward in stages. I agree that OFFLINE -> RUNNING -> DETACHED -> ATTACHED is introducing some complexity related to the management of the resource table that where not expected. We could concentrate on a simple detach and see where that takes us. It would also mean to get rid of the "autonomous-on-core-shutdown" DT binding. Thanks, Mathieu > > Regards, > Arnaud > > > + else > > + ret = PTR_ERR(table_ptr); > > + > > + dev_err(dev, "can't load resource table: %d\n", ret); > > + return ret; > > + } > > + > > + /* > > + * The resource table is already loaded in device memory, no need > > + * to work with a cached table. > > + */ > > + rproc->cached_table = NULL; > > + rproc->table_ptr = table_ptr; > > + rproc->table_sz = table_sz; > > + > > + return 0; > > +} > > + > > /* > > * Attach to remote processor - similar to rproc_fw_boot() but without > > * the steps that deal with the firmware image. > > @@ -1556,6 +1585,12 @@ static int rproc_attach(struct rproc *rproc) > > return ret; > > } > > > > + ret = rproc_set_loaded_rsc_table(rproc); > > + if (ret) { > > + dev_err(dev, "can't load resource table: %d\n", ret); > > + goto disable_iommu; > > + } > > + > > /* reset max_notifyid */ > > rproc->max_notifyid = -1; > > > > diff --git a/drivers/remoteproc/remoteproc_internal.h b/drivers/remoteproc/remoteproc_internal.h > > index c34002888d2c..4f73aac7e60d 100644 > > --- a/drivers/remoteproc/remoteproc_internal.h > > +++ b/drivers/remoteproc/remoteproc_internal.h > > @@ -177,6 +177,16 @@ struct resource_table *rproc_find_loaded_rsc_table(struct rproc *rproc, > > return NULL; > > } > > > > +static inline > > +struct resource_table *rproc_get_loaded_rsc_table(struct rproc *rproc, > > + size_t *size) > > +{ > > + if (rproc->ops->get_loaded_rsc_table) > > + return rproc->ops->get_loaded_rsc_table(rproc, size); > > + > > + return NULL; > > +} > > + > > static inline > > bool rproc_u64_fit_in_size_t(u64 val) > > { > > diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h > > index 6b0a0ed30a03..51538a7d120d 100644 > > --- a/include/linux/remoteproc.h > > +++ b/include/linux/remoteproc.h > > @@ -368,7 +368,9 @@ enum rsc_handling_status { > > * RSC_HANDLED if resource was handled, RSC_IGNORED if not handled and a > > * negative value on error > > * @load_rsc_table: load resource table from firmware image > > - * @find_loaded_rsc_table: find the loaded resouce table > > + * @find_loaded_rsc_table: find the loaded resource table from firmware image > > + * @get_loaded_rsc_table: get resource table installed in memory > > + * by external entity > > * @load: load firmware to memory, where the remote processor > > * expects to find it > > * @sanity_check: sanity check the fw image > > @@ -390,6 +392,8 @@ struct rproc_ops { > > int offset, int avail); > > struct resource_table *(*find_loaded_rsc_table)( > > struct rproc *rproc, const struct firmware *fw); > > + struct resource_table *(*get_loaded_rsc_table)( > > + struct rproc *rproc, size_t *size); > > int (*load)(struct rproc *rproc, const struct firmware *fw); > > int (*sanity_check)(struct rproc *rproc, const struct firmware *fw); > > u64 (*get_boot_addr)(struct rproc *rproc, const struct firmware *fw); > >