mbox series

[RFC,v2,0/8] spi: axi-spi-engine: add offload support

Message ID 20240510-dlech-mainline-spi-engine-offload-2-v2-0-8707a870c435@baylibre.com
Headers show
Series spi: axi-spi-engine: add offload support | expand

Message

David Lechner May 11, 2024, 12:44 a.m. UTC
Continuing the discussion started a few months ago [1]...

It was suggested that we were trying to do too much in one series. So
we split out part of the series into a separate series that adds generic
support for pre-preparing SPI messages [2]. That work has been merged.

[1]: https://lore.kernel.org/linux-spi/20240109-axi-spi-engine-series-3-v1-0-e42c6a986580@baylibre.com/
[2]: https://lore.kernel.org/linux-spi/20240219-mainline-spi-precook-message-v2-0-4a762c6701b9@baylibre.com/

I was hoping to also break down the remaining work into smaller parts by
first only considering more generic offload support that could be used
with the regular SPI message queue and leaving out the hardware trigger
bits. But the hardware I have at hand is not capable of doing that so
this is the best I can do for now. (Happy to take suggestions if someone
knows some other hardware that could be used for this.)

There have been significant changes since the last version so I'll
discuss what changed in each patch instead of having a very lengthy
cover letter.

A working branch complete with extra hacks can be found at [3].

[3]: https://github.com/dlech/linux/tree/axi-spi-engine-offload-v2

---

As a recap, here is the background and end goal of this series:

The AXI SPI Engine is a SPI controller that has the ability to record a
series of SPI transactions and then play them back using a hardware
trigger. This allows operations to be performed, repeating many times,
without any CPU intervention. This is needed for achieving high data
rates (millions of samples per second) from ADCs and DACs that are
connected via a SPI bus.

The offload hardware interface consists of a trigger input and a data
output for the RX data. These are connected to other hardware external
to the SPI controller.

To record one or more transactions, commands and TX data are written
to memories in the controller (RX buffer is not used since RX data gets
piped to an external sink). This sequence of transactions can then be
played back when the trigger input is asserted.

This series includes core SPI support along with the first SPI
controller (AXI SPI Engine) and SPI peripheral (AD7944 ADC) that use
them. This enables capturing analog data at 2 million samples per
second.

The hardware setup looks like this:

+-------------------------------+   +------------------+
|                               |   |                  |
|  SOC/FPGA                     |   |  AD7944 ADC      |
|  +---------------------+      |   |                  |
|  | AXI SPI Engine      |      |   |                  |
|  |             SPI Bus ============ SPI Bus          |
|  |                     |      |   |                  |
|  |  +---------------+  |      |   |                  |
|  |  | Offload 0     |  |      |   +------------------+
|  |  |   RX DATA OUT > > > >   |
|  |  |    TRIGGER IN < < <  v  |
|  |  +---------------+  | ^ v  |
|  +---------------------+ ^ v  |
|  | AXI PWM             | ^ v  |
|  |                 CH0 > ^ v  |
|  +---------------------+   v  |
|  | AXI DMA             |   v  |
|  |                 CH0 < < <  |
|  +---------------------+      |
|                               |
+-------------------------------+

To: Mark Brown <broonie@kernel.org>
To: Jonathan Cameron <jic23@kernel.org>
To: Rob Herring <robh@kernel.org>
To: Krzysztof Kozlowski <krzk+dt@kernel.org>
To: Conor Dooley <conor+dt@kernel.org>
To: Nuno Sá <nuno.sa@analog.com>
Cc: Michael Hennerich <Michael.Hennerich@analog.com>
Cc: Lars-Peter Clausen <lars@metafoo.de>
Cc: David Jander <david@protonic.nl>
Cc: Martin Sperl <kernel@martin.sperl.org>
Cc:  <linux-spi@vger.kernel.org>
Cc:  <devicetree@vger.kernel.org>
Cc:  <linux-kernel@vger.kernel.org>
Cc:  <linux-iio@vger.kernel.org>

---
David Lechner (8):
      spi: dt-bindings: spi-peripheral-props: add spi-offloads property
      spi: add basic support for SPI offloading
      spi: add support for hardware triggered offload
      spi: add offload xfer flags
      spi: dt-bindings: axi-spi-engine: document spi-offloads
      spi: axi-spi-engine: add offload support
      dt-bindings: iio: adc: adi,ad7944: add SPI offload properties
      iio: adc: ad7944: add support for SPI offload

 .../devicetree/bindings/iio/adc/adi,ad7944.yaml    |  58 +++++
 .../bindings/spi/adi,axi-spi-engine.yaml           |  14 ++
 .../bindings/spi/spi-peripheral-props.yaml         |  10 +
 drivers/iio/adc/ad7944.c                           | 147 +++++++++---
 drivers/spi/spi-axi-spi-engine.c                   | 267 ++++++++++++++++++++-
 drivers/spi/spi.c                                  | 202 +++++++++++++++-
 include/linux/spi/spi.h                            |  84 +++++++
 7 files changed, 741 insertions(+), 41 deletions(-)
---
base-commit: 14fde009028126f91c2bd72c404e425cf4f8aec3
change-id: 20240510-dlech-mainline-spi-engine-offload-2-afce3790b5ab

Best regards,

Comments

Jonathan Cameron May 11, 2024, 4:51 p.m. UTC | #1
Drive by comment inline.

> diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
> index a8fc16c6bf37..ec8d875d31ff 100644
> --- a/include/linux/spi/spi.h
> +++ b/include/linux/spi/spi.h
> @@ -632,6 +632,9 @@ struct spi_controller {
>  	/* Flag indicating that the SPI bus is locked for exclusive use */
>  	bool			bus_lock_flag;
>  
> +	/* Flag indicating the bus is reserved for use by hardware trigger */
> +	bool			offload_hw_trigger_enabled;
> +
>  	/*
>  	 * Setup mode and clock, etc (SPI driver may call many times).
>  	 *
> @@ -1594,12 +1597,26 @@ struct spi_controller_offload_ops {
>  	 * @unprepare: Callback to release any resources used by prepare().
>  	 */
>  	void (*unprepare)(struct spi_device *spi, unsigned int id);
> +
> +	/**
> +	 * @hw_trigger_enable: Callback to enable the hardware trigger for the
> +	 * given offload instance.
> +	 */
> +
Blank line in odd place..

> +	int (*hw_trigger_enable)(struct spi_device *spi, unsigned int id);
> +	/**
> +	 * @hw_trigger_disable: Callback to disable the hardware trigger for the
> +	 * given offload instance.
> +	 */
> +	void (*hw_trigger_disable)(struct spi_device *spi, unsigned int id);
>  };
>  
>  extern int spi_offload_prepare(struct spi_device *spi, unsigned int id,
>  			       struct spi_message *msg);
>  extern void spi_offload_unprepare(struct spi_device *spi, unsigned int id,
>  				  struct spi_message *msg);
> +extern int spi_offload_hw_trigger_enable(struct spi_device *spi, unsigned int id);
> +extern void spi_offload_hw_trigger_disable(struct spi_device *spi, unsigned int id);
>  
>  /*---------------------------------------------------------------------------*/
>  
>
Jonathan Cameron May 11, 2024, 4:58 p.m. UTC | #2
On Fri, 10 May 2024 19:44:31 -0500
David Lechner <dlechner@baylibre.com> wrote:

> This adds support for SPI offload to the ad7944 driver. This allows
> reading data at the max sample rate of 2.5 MSPS.
> 
> Signed-off-by: David Lechner <dlechner@baylibre.com>
> ---
> 
> v2 changes:
> 
> In the previous version, there was a new separate driver for the PWM
> trigger and DMA hardware buffer. This was deemed too complex so they
> are moved into the ad7944 driver.
> 
> It has also been reworked to accommodate for the changes described in
> the other patches.
> 
> RFC: This isn't very polished yet, just FYI. A few things to sort out:
> 
> Rather than making the buffer either triggered buffer or hardware buffer,
> I'm considering allowing both, e.g. buffer0 will always be the triggered
> buffer and buffer1 will will be the hardware buffer if connected to a SPI
> controller with offload support, otherwise buffer1 is absent. But since
> multiple buffers haven't been used much so far, more investigation is
> needed to see how that would work in practice. If we do that though, then
> we would always have the sampling_frequency attribute though even though
> it only applies to one buffer.

Why would someone who has this nice IP in the path want the conventional
triggered buffer?  I'm not against the two buffer option, but I'd like to know
the reasoning not to just provide the hardware buffer if this SPI offload
is available.

I can conjecture reasons but would like you to write them out for me :)
This feels like if someone has paid for the expensive hardware they probably
only want the best performance.

Jonathan


> ---
>  drivers/iio/adc/ad7944.c | 147 +++++++++++++++++++++++++++++++++++------------
>  1 file changed, 111 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/iio/adc/ad7944.c b/drivers/iio/adc/ad7944.c
> index 4602ab5ed2a6..6724d6c92778 100644
> --- a/drivers/iio/adc/ad7944.c
> +++ b/drivers/iio/adc/ad7944.c
> @@ -9,6 +9,7 @@
>  #include <linux/align.h>
>  #include <linux/bitfield.h>
>  #include <linux/bitops.h>
> +#include <linux/clk.h>
>  #include <linux/delay.h>
>  #include <linux/device.h>
>  #include <linux/err.h>
> @@ -21,6 +22,7 @@
>  
>  #include <linux/iio/iio.h>
>  #include <linux/iio/sysfs.h>
> +#include <linux/iio/buffer-dmaengine.h>
>  #include <linux/iio/trigger_consumer.h>
>  #include <linux/iio/triggered_buffer.h>
>  
> @@ -65,6 +67,8 @@ struct ad7944_adc {
>  	bool always_turbo;
>  	/* Reference voltage (millivolts). */
>  	unsigned int ref_mv;
> +	/* Clock that triggers SPI offload. */
> +	struct clk *trigger_clk;
>  
>  	/*
>  	 * DMA (thus cache coherency maintenance) requires the
> @@ -123,6 +127,7 @@ static const struct ad7944_chip_info _name##_chip_info = {		\
>  			.scan_type.endianness = IIO_CPU,		\
>  			.info_mask_separate = BIT(IIO_CHAN_INFO_RAW)	\
>  					| BIT(IIO_CHAN_INFO_SCALE),	\
> +			.info_mask_shared_by_type = BIT(IIO_CHAN_INFO_SAMP_FREQ),\
>  		},							\
>  		IIO_CHAN_SOFT_TIMESTAMP(1),				\
>  	},								\
> @@ -134,18 +139,12 @@ AD7944_DEFINE_CHIP_INFO(ad7985, ad7944, 16, 0);
>  /* fully differential */
>  AD7944_DEFINE_CHIP_INFO(ad7986, ad7986, 18, 1);
>  
> -static void ad7944_unoptimize_msg(void *msg)
> -{
> -	spi_unoptimize_message(msg);
> -}
> -
> -static int ad7944_3wire_cs_mode_init_msg(struct device *dev, struct ad7944_adc *adc,
> -					 const struct iio_chan_spec *chan)
> +static void ad7944_3wire_cs_mode_init_msg(struct device *dev, struct ad7944_adc *adc,
> +					  const struct iio_chan_spec *chan)
>  {
>  	unsigned int t_conv_ns = adc->always_turbo ? adc->timing_spec->turbo_conv_ns
>  						   : adc->timing_spec->conv_ns;
>  	struct spi_transfer *xfers = adc->xfers;
> -	int ret;
>  
>  	/*
>  	 * NB: can get better performance from some SPI controllers if we use
> @@ -174,21 +173,14 @@ static int ad7944_3wire_cs_mode_init_msg(struct device *dev, struct ad7944_adc *
>  	xfers[2].bits_per_word = chan->scan_type.realbits;
>  
>  	spi_message_init_with_transfers(&adc->msg, xfers, 3);
> -
> -	ret = spi_optimize_message(adc->spi, &adc->msg);
> -	if (ret)
> -		return ret;
> -
> -	return devm_add_action_or_reset(dev, ad7944_unoptimize_msg, &adc->msg);
>  }
>  
> -static int ad7944_4wire_mode_init_msg(struct device *dev, struct ad7944_adc *adc,
> -				      const struct iio_chan_spec *chan)
> +static void ad7944_4wire_mode_init_msg(struct device *dev, struct ad7944_adc *adc,
> +				       const struct iio_chan_spec *chan)
>  {
>  	unsigned int t_conv_ns = adc->always_turbo ? adc->timing_spec->turbo_conv_ns
>  						   : adc->timing_spec->conv_ns;
>  	struct spi_transfer *xfers = adc->xfers;
> -	int ret;
>  
>  	/*
>  	 * NB: can get better performance from some SPI controllers if we use
> @@ -208,12 +200,6 @@ static int ad7944_4wire_mode_init_msg(struct device *dev, struct ad7944_adc *adc
>  	xfers[1].bits_per_word = chan->scan_type.realbits;
>  
>  	spi_message_init_with_transfers(&adc->msg, xfers, 2);
> -
> -	ret = spi_optimize_message(adc->spi, &adc->msg);
> -	if (ret)
> -		return ret;
> -
> -	return devm_add_action_or_reset(dev, ad7944_unoptimize_msg, &adc->msg);
>  }
>  
>  static int ad7944_chain_mode_init_msg(struct device *dev, struct ad7944_adc *adc,
> @@ -345,6 +331,30 @@ static int ad7944_read_raw(struct iio_dev *indio_dev,
>  			return -EINVAL;
>  		}
>  
> +	case IIO_CHAN_INFO_SAMP_FREQ:
> +		if (!adc->trigger_clk)
> +			return -EOPNOTSUPP;
> +
> +		*val = clk_get_rate(adc->trigger_clk);
> +		return IIO_VAL_INT;
> +
> +	default:
> +		return -EINVAL;
> +	}
> +}
> +
> +static int ad7944_write_raw(struct iio_dev *indio_dev,
> +			    const struct iio_chan_spec *chan,
> +			    int val, int val2, long info)
> +{
> +	struct ad7944_adc *adc = iio_priv(indio_dev);
> +
> +	switch (info) {
> +	case IIO_CHAN_INFO_SAMP_FREQ:
> +		if (!adc->trigger_clk)
> +			return -EOPNOTSUPP;
> +
> +		return clk_set_rate(adc->trigger_clk, val);
>  	default:
>  		return -EINVAL;
>  	}
> @@ -352,6 +362,28 @@ static int ad7944_read_raw(struct iio_dev *indio_dev,
>  
>  static const struct iio_info ad7944_iio_info = {
>  	.read_raw = &ad7944_read_raw,
> +	.write_raw = &ad7944_write_raw,
> +};
> +
> +static int ad7944_offload_ex_buffer_postenable(struct iio_dev *indio_dev)
> +{
> +	struct ad7944_adc *adc = iio_priv(indio_dev);
> +
> +	return spi_offload_hw_trigger_enable(adc->spi, 0);
> +}
> +
> +static int ad7944_offload_ex_buffer_predisable(struct iio_dev *indio_dev)
> +{
> +	struct ad7944_adc *adc = iio_priv(indio_dev);
> +
> +	spi_offload_hw_trigger_disable(adc->spi, 0);
> +
> +	return 0;
> +}
> +
> +static const struct iio_buffer_setup_ops ad7944_offload_ex_buffer_setup_ops = {
> +	.postenable = &ad7944_offload_ex_buffer_postenable,
> +	.predisable = &ad7944_offload_ex_buffer_predisable,
>  };
>  
>  static irqreturn_t ad7944_trigger_handler(int irq, void *p)
> @@ -471,6 +503,18 @@ static void ad7944_ref_disable(void *ref)
>  	regulator_disable(ref);
>  }
>  
> +static void ad7944_offload_unprepare(void *p)
> +{
> +	struct ad7944_adc *adc = p;
> +
> +	spi_offload_unprepare(adc->spi, 0, &adc->msg);
> +}
> +
> +static void ad7944_unoptimize_msg(void *msg)
> +{
> +	spi_unoptimize_message(msg);
> +}
> +
>  static int ad7944_probe(struct spi_device *spi)
>  {
>  	const struct ad7944_chip_info *chip_info;
> @@ -603,16 +647,10 @@ static int ad7944_probe(struct spi_device *spi)
>  
>  	switch (adc->spi_mode) {
>  	case AD7944_SPI_MODE_DEFAULT:
> -		ret = ad7944_4wire_mode_init_msg(dev, adc, &chip_info->channels[0]);
> -		if (ret)
> -			return ret;
> -
> +		ad7944_4wire_mode_init_msg(dev, adc, &chip_info->channels[0]);
>  		break;
>  	case AD7944_SPI_MODE_SINGLE:
> -		ret = ad7944_3wire_cs_mode_init_msg(dev, adc, &chip_info->channels[0]);
> -		if (ret)
> -			return ret;
> -
> +		ad7944_3wire_cs_mode_init_msg(dev, adc, &chip_info->channels[0]);
>  		break;
>  	case AD7944_SPI_MODE_CHAIN:
>  		ret = device_property_read_u32(dev, "#daisy-chained-devices",
> @@ -649,11 +687,48 @@ static int ad7944_probe(struct spi_device *spi)
>  		indio_dev->num_channels = ARRAY_SIZE(chip_info->channels);
>  	}
>  
> -	ret = devm_iio_triggered_buffer_setup(dev, indio_dev,
> -					      iio_pollfunc_store_time,
> -					      ad7944_trigger_handler, NULL);
> -	if (ret)
> -		return ret;
> +	if (device_property_present(dev, "spi-offloads")) {
> +		/* TODO: make this a parameter to ad7944_3wire_cs_mode_init_msg() */
> +		/* FIXME: wrong index for 4-wire mode */
> +		adc->xfers[2].rx_buf = NULL;
> +		adc->xfers[2].offload_flags = SPI_OFFLOAD_XFER_RX_STREAM;
> +
> +		ret = spi_offload_prepare(adc->spi, 0, &adc->msg);
> +		if (ret)
> +			return dev_err_probe(dev, ret, "failed to prepare offload\n");
> +
> +		ret = devm_add_action_or_reset(dev, ad7944_offload_unprepare, adc);
> +		if (ret)
> +			return ret;
> +
> +		adc->trigger_clk = devm_clk_get_enabled(dev, "trigger");
> +		if (IS_ERR(adc->trigger_clk))
> +			return dev_err_probe(dev, PTR_ERR(adc->trigger_clk),
> +					     "failed to get trigger clk\n");
> +
> +		ret = devm_iio_dmaengine_buffer_setup(dev, indio_dev, "rx");
> +		if (ret)
> +			return ret;
> +
> +		indio_dev->setup_ops = &ad7944_offload_ex_buffer_setup_ops;
> +		/* offload can't have soft timestamp */
> +		indio_dev->num_channels--;
> +	} else {
> +		ret = spi_optimize_message(adc->spi, &adc->msg);
> +		if (ret)
> +			return ret;
> +
> +		ret = devm_add_action_or_reset(dev, ad7944_unoptimize_msg, &adc->msg);
> +		if (ret)
> +			return ret;
> +
> +		ret = devm_iio_triggered_buffer_setup(dev, indio_dev,
> +						      iio_pollfunc_store_time,
> +						      ad7944_trigger_handler,
> +						      NULL);
> +		if (ret)
> +			return ret;
> +	}
>  
>  	return devm_iio_device_register(dev, indio_dev);
>  }
>
David Lechner May 11, 2024, 6:41 p.m. UTC | #3
On Sat, May 11, 2024 at 11:58 AM Jonathan Cameron <jic23@kernel.org> wrote:
>
> On Fri, 10 May 2024 19:44:31 -0500
> David Lechner <dlechner@baylibre.com> wrote:
>
> > This adds support for SPI offload to the ad7944 driver. This allows
> > reading data at the max sample rate of 2.5 MSPS.
> >
> > Signed-off-by: David Lechner <dlechner@baylibre.com>
> > ---
> >
> > v2 changes:
> >
> > In the previous version, there was a new separate driver for the PWM
> > trigger and DMA hardware buffer. This was deemed too complex so they
> > are moved into the ad7944 driver.
> >
> > It has also been reworked to accommodate for the changes described in
> > the other patches.
> >
> > RFC: This isn't very polished yet, just FYI. A few things to sort out:
> >
> > Rather than making the buffer either triggered buffer or hardware buffer,
> > I'm considering allowing both, e.g. buffer0 will always be the triggered
> > buffer and buffer1 will will be the hardware buffer if connected to a SPI
> > controller with offload support, otherwise buffer1 is absent. But since
> > multiple buffers haven't been used much so far, more investigation is
> > needed to see how that would work in practice. If we do that though, then
> > we would always have the sampling_frequency attribute though even though
> > it only applies to one buffer.
>
> Why would someone who has this nice IP in the path want the conventional
> triggered buffer?  I'm not against the two buffer option, but I'd like to know
> the reasoning not to just provide the hardware buffer if this SPI offload
> is available.
>
> I can conjecture reasons but would like you to write them out for me :)
> This feels like if someone has paid for the expensive hardware they probably
> only want the best performance.
>

For me, it was more of a question of if we need to keep the userspace
interface consistent between both with or without offload support. But
if you are happy with it this way where we have only one or the other,
it is less work for me. :-)
Jonathan Cameron May 12, 2024, 11:52 a.m. UTC | #4
On Sat, 11 May 2024 13:41:09 -0500
David Lechner <dlechner@baylibre.com> wrote:

> On Sat, May 11, 2024 at 11:58 AM Jonathan Cameron <jic23@kernel.org> wrote:
> >
> > On Fri, 10 May 2024 19:44:31 -0500
> > David Lechner <dlechner@baylibre.com> wrote:
> >  
> > > This adds support for SPI offload to the ad7944 driver. This allows
> > > reading data at the max sample rate of 2.5 MSPS.
> > >
> > > Signed-off-by: David Lechner <dlechner@baylibre.com>
> > > ---
> > >
> > > v2 changes:
> > >
> > > In the previous version, there was a new separate driver for the PWM
> > > trigger and DMA hardware buffer. This was deemed too complex so they
> > > are moved into the ad7944 driver.
> > >
> > > It has also been reworked to accommodate for the changes described in
> > > the other patches.
> > >
> > > RFC: This isn't very polished yet, just FYI. A few things to sort out:
> > >
> > > Rather than making the buffer either triggered buffer or hardware buffer,
> > > I'm considering allowing both, e.g. buffer0 will always be the triggered
> > > buffer and buffer1 will will be the hardware buffer if connected to a SPI
> > > controller with offload support, otherwise buffer1 is absent. But since
> > > multiple buffers haven't been used much so far, more investigation is
> > > needed to see how that would work in practice. If we do that though, then
> > > we would always have the sampling_frequency attribute though even though
> > > it only applies to one buffer.  
> >
> > Why would someone who has this nice IP in the path want the conventional
> > triggered buffer?  I'm not against the two buffer option, but I'd like to know
> > the reasoning not to just provide the hardware buffer if this SPI offload
> > is available.
> >
> > I can conjecture reasons but would like you to write them out for me :)
> > This feels like if someone has paid for the expensive hardware they probably
> > only want the best performance.
> >  
> 
> For me, it was more of a question of if we need to keep the userspace
> interface consistent between both with or without offload support. But
> if you are happy with it this way where we have only one or the other,
> it is less work for me. :-)

So inconsistency in userspace interfaces can occur for many reasons like
whether the interrupt is wired or not, but in this particularly
case I guess we have ABI stability issue because there are boards out there
today and people using the driver without this offload functionality.
I'd not really thought that bit through, so I think you are correct that
we need to maintain the triggered buffer interface and 'add' the new
ABI for the offloaded case.  The multibuffer approach should work for this.
Will be interesting if any problem surface from having two very different
types of buffer on the same device.

Jonathan
David Lechner May 13, 2024, 3:15 p.m. UTC | #5
On Sun, May 12, 2024 at 6:52 AM Jonathan Cameron <jic23@kernel.org> wrote:
>
> On Sat, 11 May 2024 13:41:09 -0500
> David Lechner <dlechner@baylibre.com> wrote:
>
> > On Sat, May 11, 2024 at 11:58 AM Jonathan Cameron <jic23@kernel.org> wrote:
> > >
> > > On Fri, 10 May 2024 19:44:31 -0500
> > > David Lechner <dlechner@baylibre.com> wrote:
> > >
> > > > This adds support for SPI offload to the ad7944 driver. This allows
> > > > reading data at the max sample rate of 2.5 MSPS.
> > > >
> > > > Signed-off-by: David Lechner <dlechner@baylibre.com>
> > > > ---
> > > >
> > > > v2 changes:
> > > >
> > > > In the previous version, there was a new separate driver for the PWM
> > > > trigger and DMA hardware buffer. This was deemed too complex so they
> > > > are moved into the ad7944 driver.
> > > >
> > > > It has also been reworked to accommodate for the changes described in
> > > > the other patches.
> > > >
> > > > RFC: This isn't very polished yet, just FYI. A few things to sort out:
> > > >
> > > > Rather than making the buffer either triggered buffer or hardware buffer,
> > > > I'm considering allowing both, e.g. buffer0 will always be the triggered
> > > > buffer and buffer1 will will be the hardware buffer if connected to a SPI
> > > > controller with offload support, otherwise buffer1 is absent. But since
> > > > multiple buffers haven't been used much so far, more investigation is
> > > > needed to see how that would work in practice. If we do that though, then
> > > > we would always have the sampling_frequency attribute though even though
> > > > it only applies to one buffer.
> > >
> > > Why would someone who has this nice IP in the path want the conventional
> > > triggered buffer?  I'm not against the two buffer option, but I'd like to know
> > > the reasoning not to just provide the hardware buffer if this SPI offload
> > > is available.
> > >
> > > I can conjecture reasons but would like you to write them out for me :)
> > > This feels like if someone has paid for the expensive hardware they probably
> > > only want the best performance.
> > >
> >
> > For me, it was more of a question of if we need to keep the userspace
> > interface consistent between both with or without offload support. But
> > if you are happy with it this way where we have only one or the other,
> > it is less work for me. :-)
>
> So inconsistency in userspace interfaces can occur for many reasons like
> whether the interrupt is wired or not, but in this particularly
> case I guess we have ABI stability issue because there are boards out there
> today and people using the driver without this offload functionality.

FWIW, the ad7944 driver will be landing in 6.10, so no users to speak of yet.

> I'd not really thought that bit through, so I think you are correct that
> we need to maintain the triggered buffer interface and 'add' the new
> ABI for the offloaded case.  The multibuffer approach should work for this.
> Will be interesting if any problem surface from having two very different
> types of buffer on the same device.
>

In this particular case, I think the issues will be:
1. The hardware buffer can't allow the soft timestamp. Otherwise, the
buffer layout is the same (on other chips, we won't always be so
lucky).
2. The hardware trigger (sampling_frequency attribute) will only apply
if there is the SPI offload support.
Nuno Sá May 21, 2024, 11:57 a.m. UTC | #6
On Fri, 2024-05-10 at 19:44 -0500, David Lechner wrote:
> SPI offloading is a feature that allows the SPI controller to perform
> complex transfers without CPU intervention. This is useful, e.g. for
> high-speed data acquisition.
> 
> This patch adds the basic infrastructure to support SPI offloading. It
> introduces new callbacks that are to be implemented by controllers with
> offload capabilities.
> 
> On SPI device probe, the standard spi-offloads devicetree property is
> parsed and passed to the controller driver to reserve the resources
> requested by the peripheral via the map_channel() callback.
> 
> The peripheral driver can then use spi_offload_prepare() to load a SPI
> message into the offload hardware.
> 
> If the controller supports it, this message can then be passed to the
> SPI message queue as if it was a normal message. Future patches will
> will also implement a way to use a hardware trigger to start the message
> transfers rather than going through the message queue.
> 
> Signed-off-by: David Lechner <dlechner@baylibre.com>
> ---
> 
> v2 changes:
> 
> This is a rework of "spi: add core support for controllers with offload
> capabilities" from v1.
> 
> The spi_offload_get() function that Nuno didn't like is gone. Instead,
> there is now a mapping callback that uses the new generic devicetree
> binding to request resources automatically when a SPI device is probed.
> 

Yeah, what I disliked the most was adding the platform devices from spi-engine
driver and then the complexity in the IIO trigger part of it. I also didn't like
(as you said) for the peripheral to have to explicitly request an offload but,
IIRC, Mark was actually ok with the spi_offload_get/put() mechanism so let's see
what he has to say. He's main point (I think) was for the controllers to have a
way to know which offload instance is busy or not (but I guess controllers can
keep track of that as well with this approach and using the enable/disable
callbacks or the prepare/unprepare).

THB, I do like this approach (and it is what I had in mind) and it's simple
enough while covering what we know about this feature atm.

- Nuno Sá
Nuno Sá May 21, 2024, 12:31 p.m. UTC | #7
On Fri, 2024-05-10 at 19:44 -0500, David Lechner wrote:
> This implements SPI offload support for the AXI SPI Engine. Currently,
> the hardware only supports triggering offload transfers with a hardware
> trigger so attempting to use an offload message in the regular SPI
> message queue will fail. Also, only allows streaming rx data to an
> external sink, so attempts to use a rx_buf in the offload message will
> fail.
> 
> Signed-off-by: David Lechner <dlechner@baylibre.com>
> ---
> 
> v2 changes:
> 
> This patch has been reworked to accommodate the changes described in all
> of the other patches.
> ---
>  drivers/spi/spi-axi-spi-engine.c | 267
> ++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 264 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/spi/spi-axi-spi-engine.c b/drivers/spi/spi-axi-spi-
> engine.c
> index e358ac5b4509..95327df572a0 100644
> --- a/drivers/spi/spi-axi-spi-engine.c
> +++ b/drivers/spi/spi-axi-spi-engine.c
> @@ -2,6 +2,7 @@
>  /*
>   * SPI-Engine SPI controller driver
>   * Copyright 2015 Analog Devices Inc.
> + * Copyright 2024 BayLibre, SAS
>   *  Author: Lars-Peter Clausen <lars@metafoo.de>
>   */
>  
> @@ -16,6 +17,7 @@
>  #include <linux/platform_device.h>
>  #include <linux/spi/spi.h>
>  
> +#define SPI_ENGINE_REG_OFFLOAD_MEM_ADDR_WIDTH	0x10
>  #define SPI_ENGINE_REG_RESET			0x40
>  
>  #define SPI_ENGINE_REG_INT_ENABLE		0x80
> @@ -23,6 +25,7 @@
>  #define SPI_ENGINE_REG_INT_SOURCE		0x88
>  
>  #define SPI_ENGINE_REG_SYNC_ID			0xc0
> +#define SPI_ENGINE_REG_OFFLOAD_SYNC_ID		0xc4
>  
>  #define SPI_ENGINE_REG_CMD_FIFO_ROOM		0xd0
>  #define SPI_ENGINE_REG_SDO_FIFO_ROOM		0xd4
> @@ -33,10 +36,24 @@
>  #define SPI_ENGINE_REG_SDI_DATA_FIFO		0xe8
>  #define SPI_ENGINE_REG_SDI_DATA_FIFO_PEEK	0xec
>  
> +#define SPI_ENGINE_MAX_NUM_OFFLOADS		32
> +
> +#define SPI_ENGINE_REG_OFFLOAD_CTRL(x)		(0x100 +
> SPI_ENGINE_MAX_NUM_OFFLOADS * (x))
> +#define SPI_ENGINE_REG_OFFLOAD_STATUS(x)	(0x104 +
> SPI_ENGINE_MAX_NUM_OFFLOADS * (x))
> +#define SPI_ENGINE_REG_OFFLOAD_RESET(x)		(0x108 +
> SPI_ENGINE_MAX_NUM_OFFLOADS * (x))
> +#define SPI_ENGINE_REG_OFFLOAD_CMD_FIFO(x)	(0x110 +
> SPI_ENGINE_MAX_NUM_OFFLOADS * (x))
> +#define SPI_ENGINE_REG_OFFLOAD_SDO_FIFO(x)	(0x114 +
> SPI_ENGINE_MAX_NUM_OFFLOADS * (x))
> +
> +#define SPI_ENGINE_SPI_OFFLOAD_MEM_WIDTH_SDO	GENMASK(15, 8)
> +#define SPI_ENGINE_SPI_OFFLOAD_MEM_WIDTH_CMD	GENMASK(7, 0)
> +
>  #define SPI_ENGINE_INT_CMD_ALMOST_EMPTY		BIT(0)
>  #define SPI_ENGINE_INT_SDO_ALMOST_EMPTY		BIT(1)
>  #define SPI_ENGINE_INT_SDI_ALMOST_FULL		BIT(2)
>  #define SPI_ENGINE_INT_SYNC			BIT(3)
> +#define SPI_ENGINE_INT_OFFLOAD_SYNC		BIT(4)
> +
> +#define SPI_ENGINE_OFFLOAD_CTRL_ENABLE		BIT(0)
>  
>  #define SPI_ENGINE_CONFIG_CPHA			BIT(0)
>  #define SPI_ENGINE_CONFIG_CPOL			BIT(1)
> @@ -74,6 +91,10 @@
>  #define SPI_ENGINE_CMD_SYNC(id) \
>  	SPI_ENGINE_CMD(SPI_ENGINE_INST_MISC, SPI_ENGINE_MISC_SYNC, (id))
>  
> +/* default sizes - can be changed when SPI Engine firmware is compiled */
> +#define SPI_ENGINE_OFFLOAD_CMD_FIFO_SIZE	16
> +#define SPI_ENGINE_OFFLOAD_SDO_FIFO_SIZE	16
> +
>  struct spi_engine_program {
>  	unsigned int length;
>  	uint16_t instructions[] __counted_by(length);
> @@ -101,6 +122,12 @@ struct spi_engine_message_state {
>  	uint8_t *rx_buf;
>  };
>  
> +struct spi_engine_offload {
> +	struct spi_device *spi;
> +	unsigned int id;
> +	bool prepared;
> +};
> +
>  struct spi_engine {
>  	struct clk *clk;
>  	struct clk *ref_clk;
> @@ -111,6 +138,10 @@ struct spi_engine {
>  	struct spi_engine_message_state msg_state;
>  	struct completion msg_complete;
>  	unsigned int int_enable;
> +
> +	unsigned int offload_ctrl_mem_size;
> +	unsigned int offload_sdo_mem_size;
> +	struct spi_engine_offload offload_priv[SPI_ENGINE_MAX_NUM_OFFLOADS];
>  };
>  
>  static void spi_engine_program_add_cmd(struct spi_engine_program *p,
> @@ -154,7 +185,7 @@ static void spi_engine_gen_xfer(struct spi_engine_program
> *p, bool dry,
>  
>  		if (xfer->tx_buf)
>  			flags |= SPI_ENGINE_TRANSFER_WRITE;
> -		if (xfer->rx_buf)
> +		if (xfer->rx_buf || (xfer->offload_flags &
> SPI_OFFLOAD_XFER_RX_STREAM))
>  			flags |= SPI_ENGINE_TRANSFER_READ;
>  
>  		spi_engine_program_add_cmd(p, dry,
> @@ -202,16 +233,24 @@ static void spi_engine_gen_cs(struct spi_engine_program
> *p, bool dry,
>   *
>   * NB: This is separate from spi_engine_compile_message() because the latter
>   * is called twice and would otherwise result in double-evaluation.
> + *
> + * Returns 0 on success, -EINVAL on failure.
>   */
> -static void spi_engine_precompile_message(struct spi_message *msg)
> +static int spi_engine_precompile_message(struct spi_message *msg)
>  {
>  	unsigned int clk_div, max_hz = msg->spi->controller->max_speed_hz;
>  	struct spi_transfer *xfer;
>  
>  	list_for_each_entry(xfer, &msg->transfers, transfer_list) {
> +		/* If we have an offload transfer, we can't rx to buffer */
> +		if (msg->offload && xfer->rx_buf)
> +			return -EINVAL;
> +
>  		clk_div = DIV_ROUND_UP(max_hz, xfer->speed_hz);
>  		xfer->effective_speed_hz = max_hz / min(clk_div, 256U);
>  	}
> +
> +	return 0;
>  }
>  
>  static void spi_engine_compile_message(struct spi_message *msg, bool dry,
> @@ -503,8 +542,11 @@ static irqreturn_t spi_engine_irq(int irq, void *devid)
>  static int spi_engine_optimize_message(struct spi_message *msg)
>  {
>  	struct spi_engine_program p_dry, *p;
> +	int ret;
>  
> -	spi_engine_precompile_message(msg);
> +	ret = spi_engine_precompile_message(msg);
> +	if (ret)
> +		return ret;
>  
>  	p_dry.length = 0;
>  	spi_engine_compile_message(msg, true, &p_dry);
> @@ -539,6 +581,11 @@ static int spi_engine_transfer_one_message(struct
> spi_controller *host,
>  	unsigned int int_enable = 0;
>  	unsigned long flags;
>  
> +	if (msg->offload) {
> +		dev_err(&host->dev, "Single transfer offload not
> supported\n");
> +		return -EOPNOTSUPP;
> +	}
> +
>  	/* reinitialize message state for this transfer */
>  	memset(st, 0, sizeof(*st));
>  	st->cmd_buf = p->instructions;
> @@ -579,6 +626,204 @@ static int spi_engine_transfer_one_message(struct
> spi_controller *host,
>  	return msg->status;
>  }
>  
> +static struct spi_engine_offload *spi_engine_get_offload(struct spi_device
> *spi,
> +							 unsigned int id,
> +							 unsigned int
> *offload_num)
> +{
> +	struct spi_controller *host = spi->controller;
> +	struct spi_engine *spi_engine = spi_controller_get_devdata(host);
> +	struct spi_engine_offload *priv;
> +	int i;
> +
> +	for (i = 0; i < SPI_ENGINE_MAX_NUM_OFFLOADS; i++) {
> +		priv = &spi_engine->offload_priv[i];
> +
> +		if (priv->spi == spi && priv->id == id) {
> +			*offload_num = i;
> +			return priv;
> +		}
> +	}
> +
> +	return ERR_PTR(-ENODEV);
> +}
> +
> +static int spi_engine_offload_map_channel(struct spi_device *spi,
> +					  unsigned int id,
> +					  unsigned int channel)
> +{
> +	struct spi_controller *host = spi->controller;
> +	struct spi_engine *spi_engine = spi_controller_get_devdata(host);
> +	struct spi_engine_offload *priv;
> +
> +	if (channel >= SPI_ENGINE_MAX_NUM_OFFLOADS)
> +		return -EINVAL;
> +
> +	priv = &spi_engine->offload_priv[channel];
> +
> +	if (priv->spi)
> +		return -EBUSY;

I wonder if we need to be this strict? Is there any problem by having two
devices requesting the same offload engine? I would expect that having multiple
peripherals trying to actually use it at the same time (with the prepare()
callback) to be problematic but if they play along it could actually work,
right? In reality that may never be a realistic usecase so this is likely fine.

> +
> +	priv->spi = spi;
> +	priv->id = id;
> +
> +	return 0;
> +}
> +
> +static int spi_engine_offload_prepare(struct spi_device *spi, unsigned int
> id,
> +				      struct spi_message *msg)
> +{
> +	struct spi_controller *host = spi->controller;
> +	struct spi_engine *spi_engine = spi_controller_get_devdata(host);
> +	struct spi_engine_program *p = msg->opt_state;
> +	struct spi_engine_offload *priv;
> +	struct spi_transfer *xfer;
> +	void __iomem *cmd_addr;
> +	void __iomem *sdo_addr;
> +	size_t tx_word_count = 0;
> +	unsigned int offload_num, i;
> +
> +	priv = spi_engine_get_offload(spi, id, &offload_num);
> +	if (IS_ERR(priv))
> +		return PTR_ERR(priv);
> +
> +	if (priv->prepared)
> +		return -EBUSY;
> +
> +	if (p->length > spi_engine->offload_ctrl_mem_size)
> +		return -EINVAL;
> +
> +	/* count total number of tx words in message */
> +	list_for_each_entry(xfer, &msg->transfers, transfer_list) {
> +		if (!xfer->tx_buf)
> +			continue;
> +
> +		if (xfer->bits_per_word <= 8)
> +			tx_word_count += xfer->len;
> +		else if (xfer->bits_per_word <= 16)
> +			tx_word_count += xfer->len / 2;
> +		else
> +			tx_word_count += xfer->len / 4;
> +	}
> +
> +	if (tx_word_count > spi_engine->offload_sdo_mem_size)
> +		return -EINVAL;
> +
> +	cmd_addr = spi_engine->base + SPI_ENGINE_REG_OFFLOAD_CMD_FIFO(priv-
> >id);
> +	sdo_addr = spi_engine->base +
> SPI_ENGINE_REG_OFFLOAD_SDO_FIFO(offload_num);
> +
> +	list_for_each_entry(xfer, &msg->transfers, transfer_list) {
> +		if (!xfer->tx_buf)
> +			continue;
> +
> +		if (xfer->bits_per_word <= 8) {
> +			const u8 *buf = xfer->tx_buf;
> +
> +			for (i = 0; i < xfer->len; i++)
> +				writel_relaxed(buf[i], sdo_addr);
> +		} else if (xfer->bits_per_word <= 16) {
> +			const u16 *buf = xfer->tx_buf;
> +
> +			for (i = 0; i < xfer->len / 2; i++)
> +				writel_relaxed(buf[i], sdo_addr);
> +		} else {
> +			const u32 *buf = xfer->tx_buf;
> +
> +			for (i = 0; i < xfer->len / 4; i++)
> +				writel_relaxed(buf[i], sdo_addr);
> +		}
> +	}
> +
> +	for (i = 0; i < p->length; i++)
> +		writel_relaxed(p->instructions[i], cmd_addr);
> +
> +	msg->offload_state = (void *)(intptr_t)offload_num;
> +	priv->prepared = true;
> +
> +	return 0;
> +}
> +
> +static void spi_engine_offload_unprepare(struct spi_device *spi, unsigned int
> id)
> +{
> +	struct spi_controller *host = spi->controller;
> +	struct spi_engine *spi_engine = spi_controller_get_devdata(host);
> +	struct spi_engine_offload *priv;
> +	unsigned int offload_num;
> +
> +	priv = spi_engine_get_offload(spi, id, &offload_num);
> +	if (IS_ERR(priv)) {
> +		dev_warn(&spi->dev, "failed match offload in unprepare\n");
> +		return;
> +	}
> +
> +	writel_relaxed(1, spi_engine->base +
> SPI_ENGINE_REG_OFFLOAD_RESET(offload_num));
> +	writel_relaxed(0, spi_engine->base +
> SPI_ENGINE_REG_OFFLOAD_RESET(offload_num));
> +
> +	priv->prepared = false;
> +}
> +
> +static int spi_engine_offload_enable(struct spi_device *spi, unsigned int id)
> +{
> +	struct spi_controller *host = spi->controller;
> +	struct spi_engine *spi_engine = spi_controller_get_devdata(host);
> +	struct spi_engine_offload *priv;
> +	unsigned int offload_num, reg;
> +
> +	priv = spi_engine_get_offload(spi, id, &offload_num);
> +	if (IS_ERR(priv))
> +		return PTR_ERR(priv);
> +
> +	reg = readl_relaxed(spi_engine->base +
> +			    SPI_ENGINE_REG_OFFLOAD_CTRL(offload_num));
> +	reg |= SPI_ENGINE_OFFLOAD_CTRL_ENABLE;
> +	writel_relaxed(reg, spi_engine->base +
> +			    SPI_ENGINE_REG_OFFLOAD_CTRL(offload_num));
> +
> +	return 0;
> +}
> +
> +static void spi_engine_offload_disable(struct spi_device *spi, unsigned int
> id)
> +{
> +	struct spi_controller *host = spi->controller;
> +	struct spi_engine *spi_engine = spi_controller_get_devdata(host);
> +	struct spi_engine_offload *priv;
> +	unsigned int offload_num, reg;
> +
> +	priv = spi_engine_get_offload(spi, id, &offload_num);
> +	if (IS_ERR(priv)) {
> +		dev_warn(&spi->dev, "failed match offload in disable\n");
> +		return;
> +	}
> +
> +	reg = readl_relaxed(spi_engine->base +
> +			    SPI_ENGINE_REG_OFFLOAD_CTRL(offload_num));
> +	reg &= ~SPI_ENGINE_OFFLOAD_CTRL_ENABLE;
> +	writel_relaxed(reg, spi_engine->base +
> +			    SPI_ENGINE_REG_OFFLOAD_CTRL(offload_num));
> +}
> +
> +static const struct spi_controller_offload_ops spi_engine_offload_ops = {
> +	.map_channel = spi_engine_offload_map_channel,
> +	.prepare = spi_engine_offload_prepare,
> +	.unprepare = spi_engine_offload_unprepare,
> +	.hw_trigger_enable = spi_engine_offload_enable,
> +	.hw_trigger_disable = spi_engine_offload_disable,

I guess this is what you and Conor are already somehow discussing but I would
expect this to be the actual offload trigger to play a spi transfer. As it
stands, it looks weird (or confusing) to have the enable/disable of the engine
to act as a trigger... Maybe these callbacks could be used to enable/disable the
actual trigger of the offload engine (in our current cases, the PWM)? So this
would make it easy to move the trigger DT property where it belongs. The DMA one
(given it's tight relation with IIO DMA buffers) is another (way more difficult)
story I think.

- Nuno Sá
David Lechner May 21, 2024, 2:28 p.m. UTC | #8
On Tue, May 21, 2024 at 7:27 AM Nuno Sá <noname.nuno@gmail.com> wrote:
>
> On Fri, 2024-05-10 at 19:44 -0500, David Lechner wrote:
> > This implements SPI offload support for the AXI SPI Engine. Currently,
> > the hardware only supports triggering offload transfers with a hardware
> > trigger so attempting to use an offload message in the regular SPI
> > message queue will fail. Also, only allows streaming rx data to an
> > external sink, so attempts to use a rx_buf in the offload message will
> > fail.
> >
> > Signed-off-by: David Lechner <dlechner@baylibre.com>
> > ---
> >

...

> > +
> > +static int spi_engine_offload_map_channel(struct spi_device *spi,
> > +                                       unsigned int id,
> > +                                       unsigned int channel)
> > +{
> > +     struct spi_controller *host = spi->controller;
> > +     struct spi_engine *spi_engine = spi_controller_get_devdata(host);
> > +     struct spi_engine_offload *priv;
> > +
> > +     if (channel >= SPI_ENGINE_MAX_NUM_OFFLOADS)
> > +             return -EINVAL;
> > +
> > +     priv = &spi_engine->offload_priv[channel];
> > +
> > +     if (priv->spi)
> > +             return -EBUSY;
>
> I wonder if we need to be this strict? Is there any problem by having two
> devices requesting the same offload engine? I would expect that having multiple
> peripherals trying to actually use it at the same time (with the prepare()
> callback) to be problematic but if they play along it could actually work,
> right? In reality that may never be a realistic usecase so this is likely fine.
>

I guess not. But to keep it simple for now, yeah, let's wait until we
have an actual use case.

...

> > +
> > +static const struct spi_controller_offload_ops spi_engine_offload_ops = {
> > +     .map_channel = spi_engine_offload_map_channel,
> > +     .prepare = spi_engine_offload_prepare,
> > +     .unprepare = spi_engine_offload_unprepare,
> > +     .hw_trigger_enable = spi_engine_offload_enable,
> > +     .hw_trigger_disable = spi_engine_offload_disable,
>
> I guess this is what you and Conor are already somehow discussing but I would
> expect this to be the actual offload trigger to play a spi transfer. As it
> stands, it looks weird (or confusing) to have the enable/disable of the engine
> to act as a trigger...

It isn't acting as the trigger, just configuring the offload instance
for exclusive use by a hardware trigger.

> Maybe these callbacks could be used to enable/disable the
> actual trigger of the offload engine (in our current cases, the PWM)? So this
> would make it easy to move the trigger DT property where it belongs. The DMA one
> (given it's tight relation with IIO DMA buffers) is another (way more difficult)
> story I think.
>

One issue I have with making the actual hardware trigger part of the
SPI controller is that in some cases, the peripheral could actually be
the trigger. For example, in the case of a self-clocked ADC where
there is just a RDY signal from the ADC when sample data is ready to
be read. In this case would the peripheral have to register a trigger
enable callback with the controller so that the controller can
communicate with the peripheral to enable and disable sampling mode,
and therefore the trigger?
Nuno Sá May 22, 2024, 12:08 p.m. UTC | #9
On Tue, 2024-05-21 at 09:28 -0500, David Lechner wrote:
> On Tue, May 21, 2024 at 7:27 AM Nuno Sá <noname.nuno@gmail.com> wrote:
> > 
> > On Fri, 2024-05-10 at 19:44 -0500, David Lechner wrote:
> > > This implements SPI offload support for the AXI SPI Engine. Currently,
> > > the hardware only supports triggering offload transfers with a hardware
> > > trigger so attempting to use an offload message in the regular SPI
> > > message queue will fail. Also, only allows streaming rx data to an
> > > external sink, so attempts to use a rx_buf in the offload message will
> > > fail.
> > > 
> > > Signed-off-by: David Lechner <dlechner@baylibre.com>
> > > ---
> > > 
> 
> ...
> 
> > > +
> > > +static int spi_engine_offload_map_channel(struct spi_device *spi,
> > > +                                       unsigned int id,
> > > +                                       unsigned int channel)
> > > +{
> > > +     struct spi_controller *host = spi->controller;
> > > +     struct spi_engine *spi_engine = spi_controller_get_devdata(host);
> > > +     struct spi_engine_offload *priv;
> > > +
> > > +     if (channel >= SPI_ENGINE_MAX_NUM_OFFLOADS)
> > > +             return -EINVAL;
> > > +
> > > +     priv = &spi_engine->offload_priv[channel];
> > > +
> > > +     if (priv->spi)
> > > +             return -EBUSY;
> > 
> > I wonder if we need to be this strict? Is there any problem by having two
> > devices requesting the same offload engine? I would expect that having multiple
> > peripherals trying to actually use it at the same time (with the prepare()
> > callback) to be problematic but if they play along it could actually work,
> > right? In reality that may never be a realistic usecase so this is likely fine.
> > 
> 
> I guess not. But to keep it simple for now, yeah, let's wait until we
> have an actual use case.
> 

Agreed.

> ...
> 
> > > +
> > > +static const struct spi_controller_offload_ops spi_engine_offload_ops = {
> > > +     .map_channel = spi_engine_offload_map_channel,
> > > +     .prepare = spi_engine_offload_prepare,
> > > +     .unprepare = spi_engine_offload_unprepare,
> > > +     .hw_trigger_enable = spi_engine_offload_enable,
> > > +     .hw_trigger_disable = spi_engine_offload_disable,
> > 
> > I guess this is what you and Conor are already somehow discussing but I would
> > expect this to be the actual offload trigger to play a spi transfer. As it
> > stands, it looks weird (or confusing) to have the enable/disable of the engine
> > to act as a trigger...
> 
> It isn't acting as the trigger, just configuring the offload instance
> for exclusive use by a hardware trigger.
> 
> > Maybe these callbacks could be used to enable/disable the
> > actual trigger of the offload engine (in our current cases, the PWM)? So this
> > would make it easy to move the trigger DT property where it belongs. The DMA one
> > (given it's tight relation with IIO DMA buffers) is another (way more difficult)
> > story I think.
> > 
> 
> One issue I have with making the actual hardware trigger part of the
> SPI controller is that in some cases, the peripheral could actually be
> the trigger. For example, in the case of a self-clocked ADC where
> there is just a RDY signal from the ADC when sample data is ready to
> be read. In this case would the peripheral have to register a trigger
> enable callback with the controller so that the controller can
> communicate with the peripheral to enable and disable sampling mode,
> and therefore the trigger?

In those cases, the peripheral would not bother in doing any spi hardware triggering
enable/disable (this assumes that we would need explicit interfaces for offload
enable/disable). In DT, the engine would also have no trigger has it's not required
so the controller even conditionally register these callbacks.

As for the DMA, I have no clue how we can have it associated with the controller
given how we want the data to be exported to userspace. Maybe dma-buf could be used
but it won't be easy. TBH, I'm not sure if it's that bad having the DMA associated
with the peripheral since it's purpose is to transfer peripheral DATA (not like a spi
controller DMA used for the actual SPI transfers).

- Nuno Sá