Message ID | 20240510-dlech-mainline-spi-engine-offload-2-v2-0-8707a870c435@baylibre.com |
---|---|
Headers | show |
Series | spi: axi-spi-engine: add offload support | expand |
Drive by comment inline. > diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h > index a8fc16c6bf37..ec8d875d31ff 100644 > --- a/include/linux/spi/spi.h > +++ b/include/linux/spi/spi.h > @@ -632,6 +632,9 @@ struct spi_controller { > /* Flag indicating that the SPI bus is locked for exclusive use */ > bool bus_lock_flag; > > + /* Flag indicating the bus is reserved for use by hardware trigger */ > + bool offload_hw_trigger_enabled; > + > /* > * Setup mode and clock, etc (SPI driver may call many times). > * > @@ -1594,12 +1597,26 @@ struct spi_controller_offload_ops { > * @unprepare: Callback to release any resources used by prepare(). > */ > void (*unprepare)(struct spi_device *spi, unsigned int id); > + > + /** > + * @hw_trigger_enable: Callback to enable the hardware trigger for the > + * given offload instance. > + */ > + Blank line in odd place.. > + int (*hw_trigger_enable)(struct spi_device *spi, unsigned int id); > + /** > + * @hw_trigger_disable: Callback to disable the hardware trigger for the > + * given offload instance. > + */ > + void (*hw_trigger_disable)(struct spi_device *spi, unsigned int id); > }; > > extern int spi_offload_prepare(struct spi_device *spi, unsigned int id, > struct spi_message *msg); > extern void spi_offload_unprepare(struct spi_device *spi, unsigned int id, > struct spi_message *msg); > +extern int spi_offload_hw_trigger_enable(struct spi_device *spi, unsigned int id); > +extern void spi_offload_hw_trigger_disable(struct spi_device *spi, unsigned int id); > > /*---------------------------------------------------------------------------*/ > >
On Fri, 10 May 2024 19:44:31 -0500 David Lechner <dlechner@baylibre.com> wrote: > This adds support for SPI offload to the ad7944 driver. This allows > reading data at the max sample rate of 2.5 MSPS. > > Signed-off-by: David Lechner <dlechner@baylibre.com> > --- > > v2 changes: > > In the previous version, there was a new separate driver for the PWM > trigger and DMA hardware buffer. This was deemed too complex so they > are moved into the ad7944 driver. > > It has also been reworked to accommodate for the changes described in > the other patches. > > RFC: This isn't very polished yet, just FYI. A few things to sort out: > > Rather than making the buffer either triggered buffer or hardware buffer, > I'm considering allowing both, e.g. buffer0 will always be the triggered > buffer and buffer1 will will be the hardware buffer if connected to a SPI > controller with offload support, otherwise buffer1 is absent. But since > multiple buffers haven't been used much so far, more investigation is > needed to see how that would work in practice. If we do that though, then > we would always have the sampling_frequency attribute though even though > it only applies to one buffer. Why would someone who has this nice IP in the path want the conventional triggered buffer? I'm not against the two buffer option, but I'd like to know the reasoning not to just provide the hardware buffer if this SPI offload is available. I can conjecture reasons but would like you to write them out for me :) This feels like if someone has paid for the expensive hardware they probably only want the best performance. Jonathan > --- > drivers/iio/adc/ad7944.c | 147 +++++++++++++++++++++++++++++++++++------------ > 1 file changed, 111 insertions(+), 36 deletions(-) > > diff --git a/drivers/iio/adc/ad7944.c b/drivers/iio/adc/ad7944.c > index 4602ab5ed2a6..6724d6c92778 100644 > --- a/drivers/iio/adc/ad7944.c > +++ b/drivers/iio/adc/ad7944.c > @@ -9,6 +9,7 @@ > #include <linux/align.h> > #include <linux/bitfield.h> > #include <linux/bitops.h> > +#include <linux/clk.h> > #include <linux/delay.h> > #include <linux/device.h> > #include <linux/err.h> > @@ -21,6 +22,7 @@ > > #include <linux/iio/iio.h> > #include <linux/iio/sysfs.h> > +#include <linux/iio/buffer-dmaengine.h> > #include <linux/iio/trigger_consumer.h> > #include <linux/iio/triggered_buffer.h> > > @@ -65,6 +67,8 @@ struct ad7944_adc { > bool always_turbo; > /* Reference voltage (millivolts). */ > unsigned int ref_mv; > + /* Clock that triggers SPI offload. */ > + struct clk *trigger_clk; > > /* > * DMA (thus cache coherency maintenance) requires the > @@ -123,6 +127,7 @@ static const struct ad7944_chip_info _name##_chip_info = { \ > .scan_type.endianness = IIO_CPU, \ > .info_mask_separate = BIT(IIO_CHAN_INFO_RAW) \ > | BIT(IIO_CHAN_INFO_SCALE), \ > + .info_mask_shared_by_type = BIT(IIO_CHAN_INFO_SAMP_FREQ),\ > }, \ > IIO_CHAN_SOFT_TIMESTAMP(1), \ > }, \ > @@ -134,18 +139,12 @@ AD7944_DEFINE_CHIP_INFO(ad7985, ad7944, 16, 0); > /* fully differential */ > AD7944_DEFINE_CHIP_INFO(ad7986, ad7986, 18, 1); > > -static void ad7944_unoptimize_msg(void *msg) > -{ > - spi_unoptimize_message(msg); > -} > - > -static int ad7944_3wire_cs_mode_init_msg(struct device *dev, struct ad7944_adc *adc, > - const struct iio_chan_spec *chan) > +static void ad7944_3wire_cs_mode_init_msg(struct device *dev, struct ad7944_adc *adc, > + const struct iio_chan_spec *chan) > { > unsigned int t_conv_ns = adc->always_turbo ? adc->timing_spec->turbo_conv_ns > : adc->timing_spec->conv_ns; > struct spi_transfer *xfers = adc->xfers; > - int ret; > > /* > * NB: can get better performance from some SPI controllers if we use > @@ -174,21 +173,14 @@ static int ad7944_3wire_cs_mode_init_msg(struct device *dev, struct ad7944_adc * > xfers[2].bits_per_word = chan->scan_type.realbits; > > spi_message_init_with_transfers(&adc->msg, xfers, 3); > - > - ret = spi_optimize_message(adc->spi, &adc->msg); > - if (ret) > - return ret; > - > - return devm_add_action_or_reset(dev, ad7944_unoptimize_msg, &adc->msg); > } > > -static int ad7944_4wire_mode_init_msg(struct device *dev, struct ad7944_adc *adc, > - const struct iio_chan_spec *chan) > +static void ad7944_4wire_mode_init_msg(struct device *dev, struct ad7944_adc *adc, > + const struct iio_chan_spec *chan) > { > unsigned int t_conv_ns = adc->always_turbo ? adc->timing_spec->turbo_conv_ns > : adc->timing_spec->conv_ns; > struct spi_transfer *xfers = adc->xfers; > - int ret; > > /* > * NB: can get better performance from some SPI controllers if we use > @@ -208,12 +200,6 @@ static int ad7944_4wire_mode_init_msg(struct device *dev, struct ad7944_adc *adc > xfers[1].bits_per_word = chan->scan_type.realbits; > > spi_message_init_with_transfers(&adc->msg, xfers, 2); > - > - ret = spi_optimize_message(adc->spi, &adc->msg); > - if (ret) > - return ret; > - > - return devm_add_action_or_reset(dev, ad7944_unoptimize_msg, &adc->msg); > } > > static int ad7944_chain_mode_init_msg(struct device *dev, struct ad7944_adc *adc, > @@ -345,6 +331,30 @@ static int ad7944_read_raw(struct iio_dev *indio_dev, > return -EINVAL; > } > > + case IIO_CHAN_INFO_SAMP_FREQ: > + if (!adc->trigger_clk) > + return -EOPNOTSUPP; > + > + *val = clk_get_rate(adc->trigger_clk); > + return IIO_VAL_INT; > + > + default: > + return -EINVAL; > + } > +} > + > +static int ad7944_write_raw(struct iio_dev *indio_dev, > + const struct iio_chan_spec *chan, > + int val, int val2, long info) > +{ > + struct ad7944_adc *adc = iio_priv(indio_dev); > + > + switch (info) { > + case IIO_CHAN_INFO_SAMP_FREQ: > + if (!adc->trigger_clk) > + return -EOPNOTSUPP; > + > + return clk_set_rate(adc->trigger_clk, val); > default: > return -EINVAL; > } > @@ -352,6 +362,28 @@ static int ad7944_read_raw(struct iio_dev *indio_dev, > > static const struct iio_info ad7944_iio_info = { > .read_raw = &ad7944_read_raw, > + .write_raw = &ad7944_write_raw, > +}; > + > +static int ad7944_offload_ex_buffer_postenable(struct iio_dev *indio_dev) > +{ > + struct ad7944_adc *adc = iio_priv(indio_dev); > + > + return spi_offload_hw_trigger_enable(adc->spi, 0); > +} > + > +static int ad7944_offload_ex_buffer_predisable(struct iio_dev *indio_dev) > +{ > + struct ad7944_adc *adc = iio_priv(indio_dev); > + > + spi_offload_hw_trigger_disable(adc->spi, 0); > + > + return 0; > +} > + > +static const struct iio_buffer_setup_ops ad7944_offload_ex_buffer_setup_ops = { > + .postenable = &ad7944_offload_ex_buffer_postenable, > + .predisable = &ad7944_offload_ex_buffer_predisable, > }; > > static irqreturn_t ad7944_trigger_handler(int irq, void *p) > @@ -471,6 +503,18 @@ static void ad7944_ref_disable(void *ref) > regulator_disable(ref); > } > > +static void ad7944_offload_unprepare(void *p) > +{ > + struct ad7944_adc *adc = p; > + > + spi_offload_unprepare(adc->spi, 0, &adc->msg); > +} > + > +static void ad7944_unoptimize_msg(void *msg) > +{ > + spi_unoptimize_message(msg); > +} > + > static int ad7944_probe(struct spi_device *spi) > { > const struct ad7944_chip_info *chip_info; > @@ -603,16 +647,10 @@ static int ad7944_probe(struct spi_device *spi) > > switch (adc->spi_mode) { > case AD7944_SPI_MODE_DEFAULT: > - ret = ad7944_4wire_mode_init_msg(dev, adc, &chip_info->channels[0]); > - if (ret) > - return ret; > - > + ad7944_4wire_mode_init_msg(dev, adc, &chip_info->channels[0]); > break; > case AD7944_SPI_MODE_SINGLE: > - ret = ad7944_3wire_cs_mode_init_msg(dev, adc, &chip_info->channels[0]); > - if (ret) > - return ret; > - > + ad7944_3wire_cs_mode_init_msg(dev, adc, &chip_info->channels[0]); > break; > case AD7944_SPI_MODE_CHAIN: > ret = device_property_read_u32(dev, "#daisy-chained-devices", > @@ -649,11 +687,48 @@ static int ad7944_probe(struct spi_device *spi) > indio_dev->num_channels = ARRAY_SIZE(chip_info->channels); > } > > - ret = devm_iio_triggered_buffer_setup(dev, indio_dev, > - iio_pollfunc_store_time, > - ad7944_trigger_handler, NULL); > - if (ret) > - return ret; > + if (device_property_present(dev, "spi-offloads")) { > + /* TODO: make this a parameter to ad7944_3wire_cs_mode_init_msg() */ > + /* FIXME: wrong index for 4-wire mode */ > + adc->xfers[2].rx_buf = NULL; > + adc->xfers[2].offload_flags = SPI_OFFLOAD_XFER_RX_STREAM; > + > + ret = spi_offload_prepare(adc->spi, 0, &adc->msg); > + if (ret) > + return dev_err_probe(dev, ret, "failed to prepare offload\n"); > + > + ret = devm_add_action_or_reset(dev, ad7944_offload_unprepare, adc); > + if (ret) > + return ret; > + > + adc->trigger_clk = devm_clk_get_enabled(dev, "trigger"); > + if (IS_ERR(adc->trigger_clk)) > + return dev_err_probe(dev, PTR_ERR(adc->trigger_clk), > + "failed to get trigger clk\n"); > + > + ret = devm_iio_dmaengine_buffer_setup(dev, indio_dev, "rx"); > + if (ret) > + return ret; > + > + indio_dev->setup_ops = &ad7944_offload_ex_buffer_setup_ops; > + /* offload can't have soft timestamp */ > + indio_dev->num_channels--; > + } else { > + ret = spi_optimize_message(adc->spi, &adc->msg); > + if (ret) > + return ret; > + > + ret = devm_add_action_or_reset(dev, ad7944_unoptimize_msg, &adc->msg); > + if (ret) > + return ret; > + > + ret = devm_iio_triggered_buffer_setup(dev, indio_dev, > + iio_pollfunc_store_time, > + ad7944_trigger_handler, > + NULL); > + if (ret) > + return ret; > + } > > return devm_iio_device_register(dev, indio_dev); > } >
On Sat, May 11, 2024 at 11:58 AM Jonathan Cameron <jic23@kernel.org> wrote: > > On Fri, 10 May 2024 19:44:31 -0500 > David Lechner <dlechner@baylibre.com> wrote: > > > This adds support for SPI offload to the ad7944 driver. This allows > > reading data at the max sample rate of 2.5 MSPS. > > > > Signed-off-by: David Lechner <dlechner@baylibre.com> > > --- > > > > v2 changes: > > > > In the previous version, there was a new separate driver for the PWM > > trigger and DMA hardware buffer. This was deemed too complex so they > > are moved into the ad7944 driver. > > > > It has also been reworked to accommodate for the changes described in > > the other patches. > > > > RFC: This isn't very polished yet, just FYI. A few things to sort out: > > > > Rather than making the buffer either triggered buffer or hardware buffer, > > I'm considering allowing both, e.g. buffer0 will always be the triggered > > buffer and buffer1 will will be the hardware buffer if connected to a SPI > > controller with offload support, otherwise buffer1 is absent. But since > > multiple buffers haven't been used much so far, more investigation is > > needed to see how that would work in practice. If we do that though, then > > we would always have the sampling_frequency attribute though even though > > it only applies to one buffer. > > Why would someone who has this nice IP in the path want the conventional > triggered buffer? I'm not against the two buffer option, but I'd like to know > the reasoning not to just provide the hardware buffer if this SPI offload > is available. > > I can conjecture reasons but would like you to write them out for me :) > This feels like if someone has paid for the expensive hardware they probably > only want the best performance. > For me, it was more of a question of if we need to keep the userspace interface consistent between both with or without offload support. But if you are happy with it this way where we have only one or the other, it is less work for me. :-)
On Sat, 11 May 2024 13:41:09 -0500 David Lechner <dlechner@baylibre.com> wrote: > On Sat, May 11, 2024 at 11:58 AM Jonathan Cameron <jic23@kernel.org> wrote: > > > > On Fri, 10 May 2024 19:44:31 -0500 > > David Lechner <dlechner@baylibre.com> wrote: > > > > > This adds support for SPI offload to the ad7944 driver. This allows > > > reading data at the max sample rate of 2.5 MSPS. > > > > > > Signed-off-by: David Lechner <dlechner@baylibre.com> > > > --- > > > > > > v2 changes: > > > > > > In the previous version, there was a new separate driver for the PWM > > > trigger and DMA hardware buffer. This was deemed too complex so they > > > are moved into the ad7944 driver. > > > > > > It has also been reworked to accommodate for the changes described in > > > the other patches. > > > > > > RFC: This isn't very polished yet, just FYI. A few things to sort out: > > > > > > Rather than making the buffer either triggered buffer or hardware buffer, > > > I'm considering allowing both, e.g. buffer0 will always be the triggered > > > buffer and buffer1 will will be the hardware buffer if connected to a SPI > > > controller with offload support, otherwise buffer1 is absent. But since > > > multiple buffers haven't been used much so far, more investigation is > > > needed to see how that would work in practice. If we do that though, then > > > we would always have the sampling_frequency attribute though even though > > > it only applies to one buffer. > > > > Why would someone who has this nice IP in the path want the conventional > > triggered buffer? I'm not against the two buffer option, but I'd like to know > > the reasoning not to just provide the hardware buffer if this SPI offload > > is available. > > > > I can conjecture reasons but would like you to write them out for me :) > > This feels like if someone has paid for the expensive hardware they probably > > only want the best performance. > > > > For me, it was more of a question of if we need to keep the userspace > interface consistent between both with or without offload support. But > if you are happy with it this way where we have only one or the other, > it is less work for me. :-) So inconsistency in userspace interfaces can occur for many reasons like whether the interrupt is wired or not, but in this particularly case I guess we have ABI stability issue because there are boards out there today and people using the driver without this offload functionality. I'd not really thought that bit through, so I think you are correct that we need to maintain the triggered buffer interface and 'add' the new ABI for the offloaded case. The multibuffer approach should work for this. Will be interesting if any problem surface from having two very different types of buffer on the same device. Jonathan
On Sun, May 12, 2024 at 6:52 AM Jonathan Cameron <jic23@kernel.org> wrote: > > On Sat, 11 May 2024 13:41:09 -0500 > David Lechner <dlechner@baylibre.com> wrote: > > > On Sat, May 11, 2024 at 11:58 AM Jonathan Cameron <jic23@kernel.org> wrote: > > > > > > On Fri, 10 May 2024 19:44:31 -0500 > > > David Lechner <dlechner@baylibre.com> wrote: > > > > > > > This adds support for SPI offload to the ad7944 driver. This allows > > > > reading data at the max sample rate of 2.5 MSPS. > > > > > > > > Signed-off-by: David Lechner <dlechner@baylibre.com> > > > > --- > > > > > > > > v2 changes: > > > > > > > > In the previous version, there was a new separate driver for the PWM > > > > trigger and DMA hardware buffer. This was deemed too complex so they > > > > are moved into the ad7944 driver. > > > > > > > > It has also been reworked to accommodate for the changes described in > > > > the other patches. > > > > > > > > RFC: This isn't very polished yet, just FYI. A few things to sort out: > > > > > > > > Rather than making the buffer either triggered buffer or hardware buffer, > > > > I'm considering allowing both, e.g. buffer0 will always be the triggered > > > > buffer and buffer1 will will be the hardware buffer if connected to a SPI > > > > controller with offload support, otherwise buffer1 is absent. But since > > > > multiple buffers haven't been used much so far, more investigation is > > > > needed to see how that would work in practice. If we do that though, then > > > > we would always have the sampling_frequency attribute though even though > > > > it only applies to one buffer. > > > > > > Why would someone who has this nice IP in the path want the conventional > > > triggered buffer? I'm not against the two buffer option, but I'd like to know > > > the reasoning not to just provide the hardware buffer if this SPI offload > > > is available. > > > > > > I can conjecture reasons but would like you to write them out for me :) > > > This feels like if someone has paid for the expensive hardware they probably > > > only want the best performance. > > > > > > > For me, it was more of a question of if we need to keep the userspace > > interface consistent between both with or without offload support. But > > if you are happy with it this way where we have only one or the other, > > it is less work for me. :-) > > So inconsistency in userspace interfaces can occur for many reasons like > whether the interrupt is wired or not, but in this particularly > case I guess we have ABI stability issue because there are boards out there > today and people using the driver without this offload functionality. FWIW, the ad7944 driver will be landing in 6.10, so no users to speak of yet. > I'd not really thought that bit through, so I think you are correct that > we need to maintain the triggered buffer interface and 'add' the new > ABI for the offloaded case. The multibuffer approach should work for this. > Will be interesting if any problem surface from having two very different > types of buffer on the same device. > In this particular case, I think the issues will be: 1. The hardware buffer can't allow the soft timestamp. Otherwise, the buffer layout is the same (on other chips, we won't always be so lucky). 2. The hardware trigger (sampling_frequency attribute) will only apply if there is the SPI offload support.
On Fri, 2024-05-10 at 19:44 -0500, David Lechner wrote: > SPI offloading is a feature that allows the SPI controller to perform > complex transfers without CPU intervention. This is useful, e.g. for > high-speed data acquisition. > > This patch adds the basic infrastructure to support SPI offloading. It > introduces new callbacks that are to be implemented by controllers with > offload capabilities. > > On SPI device probe, the standard spi-offloads devicetree property is > parsed and passed to the controller driver to reserve the resources > requested by the peripheral via the map_channel() callback. > > The peripheral driver can then use spi_offload_prepare() to load a SPI > message into the offload hardware. > > If the controller supports it, this message can then be passed to the > SPI message queue as if it was a normal message. Future patches will > will also implement a way to use a hardware trigger to start the message > transfers rather than going through the message queue. > > Signed-off-by: David Lechner <dlechner@baylibre.com> > --- > > v2 changes: > > This is a rework of "spi: add core support for controllers with offload > capabilities" from v1. > > The spi_offload_get() function that Nuno didn't like is gone. Instead, > there is now a mapping callback that uses the new generic devicetree > binding to request resources automatically when a SPI device is probed. > Yeah, what I disliked the most was adding the platform devices from spi-engine driver and then the complexity in the IIO trigger part of it. I also didn't like (as you said) for the peripheral to have to explicitly request an offload but, IIRC, Mark was actually ok with the spi_offload_get/put() mechanism so let's see what he has to say. He's main point (I think) was for the controllers to have a way to know which offload instance is busy or not (but I guess controllers can keep track of that as well with this approach and using the enable/disable callbacks or the prepare/unprepare). THB, I do like this approach (and it is what I had in mind) and it's simple enough while covering what we know about this feature atm. - Nuno Sá
On Fri, 2024-05-10 at 19:44 -0500, David Lechner wrote: > This implements SPI offload support for the AXI SPI Engine. Currently, > the hardware only supports triggering offload transfers with a hardware > trigger so attempting to use an offload message in the regular SPI > message queue will fail. Also, only allows streaming rx data to an > external sink, so attempts to use a rx_buf in the offload message will > fail. > > Signed-off-by: David Lechner <dlechner@baylibre.com> > --- > > v2 changes: > > This patch has been reworked to accommodate the changes described in all > of the other patches. > --- > drivers/spi/spi-axi-spi-engine.c | 267 > ++++++++++++++++++++++++++++++++++++++- > 1 file changed, 264 insertions(+), 3 deletions(-) > > diff --git a/drivers/spi/spi-axi-spi-engine.c b/drivers/spi/spi-axi-spi- > engine.c > index e358ac5b4509..95327df572a0 100644 > --- a/drivers/spi/spi-axi-spi-engine.c > +++ b/drivers/spi/spi-axi-spi-engine.c > @@ -2,6 +2,7 @@ > /* > * SPI-Engine SPI controller driver > * Copyright 2015 Analog Devices Inc. > + * Copyright 2024 BayLibre, SAS > * Author: Lars-Peter Clausen <lars@metafoo.de> > */ > > @@ -16,6 +17,7 @@ > #include <linux/platform_device.h> > #include <linux/spi/spi.h> > > +#define SPI_ENGINE_REG_OFFLOAD_MEM_ADDR_WIDTH 0x10 > #define SPI_ENGINE_REG_RESET 0x40 > > #define SPI_ENGINE_REG_INT_ENABLE 0x80 > @@ -23,6 +25,7 @@ > #define SPI_ENGINE_REG_INT_SOURCE 0x88 > > #define SPI_ENGINE_REG_SYNC_ID 0xc0 > +#define SPI_ENGINE_REG_OFFLOAD_SYNC_ID 0xc4 > > #define SPI_ENGINE_REG_CMD_FIFO_ROOM 0xd0 > #define SPI_ENGINE_REG_SDO_FIFO_ROOM 0xd4 > @@ -33,10 +36,24 @@ > #define SPI_ENGINE_REG_SDI_DATA_FIFO 0xe8 > #define SPI_ENGINE_REG_SDI_DATA_FIFO_PEEK 0xec > > +#define SPI_ENGINE_MAX_NUM_OFFLOADS 32 > + > +#define SPI_ENGINE_REG_OFFLOAD_CTRL(x) (0x100 + > SPI_ENGINE_MAX_NUM_OFFLOADS * (x)) > +#define SPI_ENGINE_REG_OFFLOAD_STATUS(x) (0x104 + > SPI_ENGINE_MAX_NUM_OFFLOADS * (x)) > +#define SPI_ENGINE_REG_OFFLOAD_RESET(x) (0x108 + > SPI_ENGINE_MAX_NUM_OFFLOADS * (x)) > +#define SPI_ENGINE_REG_OFFLOAD_CMD_FIFO(x) (0x110 + > SPI_ENGINE_MAX_NUM_OFFLOADS * (x)) > +#define SPI_ENGINE_REG_OFFLOAD_SDO_FIFO(x) (0x114 + > SPI_ENGINE_MAX_NUM_OFFLOADS * (x)) > + > +#define SPI_ENGINE_SPI_OFFLOAD_MEM_WIDTH_SDO GENMASK(15, 8) > +#define SPI_ENGINE_SPI_OFFLOAD_MEM_WIDTH_CMD GENMASK(7, 0) > + > #define SPI_ENGINE_INT_CMD_ALMOST_EMPTY BIT(0) > #define SPI_ENGINE_INT_SDO_ALMOST_EMPTY BIT(1) > #define SPI_ENGINE_INT_SDI_ALMOST_FULL BIT(2) > #define SPI_ENGINE_INT_SYNC BIT(3) > +#define SPI_ENGINE_INT_OFFLOAD_SYNC BIT(4) > + > +#define SPI_ENGINE_OFFLOAD_CTRL_ENABLE BIT(0) > > #define SPI_ENGINE_CONFIG_CPHA BIT(0) > #define SPI_ENGINE_CONFIG_CPOL BIT(1) > @@ -74,6 +91,10 @@ > #define SPI_ENGINE_CMD_SYNC(id) \ > SPI_ENGINE_CMD(SPI_ENGINE_INST_MISC, SPI_ENGINE_MISC_SYNC, (id)) > > +/* default sizes - can be changed when SPI Engine firmware is compiled */ > +#define SPI_ENGINE_OFFLOAD_CMD_FIFO_SIZE 16 > +#define SPI_ENGINE_OFFLOAD_SDO_FIFO_SIZE 16 > + > struct spi_engine_program { > unsigned int length; > uint16_t instructions[] __counted_by(length); > @@ -101,6 +122,12 @@ struct spi_engine_message_state { > uint8_t *rx_buf; > }; > > +struct spi_engine_offload { > + struct spi_device *spi; > + unsigned int id; > + bool prepared; > +}; > + > struct spi_engine { > struct clk *clk; > struct clk *ref_clk; > @@ -111,6 +138,10 @@ struct spi_engine { > struct spi_engine_message_state msg_state; > struct completion msg_complete; > unsigned int int_enable; > + > + unsigned int offload_ctrl_mem_size; > + unsigned int offload_sdo_mem_size; > + struct spi_engine_offload offload_priv[SPI_ENGINE_MAX_NUM_OFFLOADS]; > }; > > static void spi_engine_program_add_cmd(struct spi_engine_program *p, > @@ -154,7 +185,7 @@ static void spi_engine_gen_xfer(struct spi_engine_program > *p, bool dry, > > if (xfer->tx_buf) > flags |= SPI_ENGINE_TRANSFER_WRITE; > - if (xfer->rx_buf) > + if (xfer->rx_buf || (xfer->offload_flags & > SPI_OFFLOAD_XFER_RX_STREAM)) > flags |= SPI_ENGINE_TRANSFER_READ; > > spi_engine_program_add_cmd(p, dry, > @@ -202,16 +233,24 @@ static void spi_engine_gen_cs(struct spi_engine_program > *p, bool dry, > * > * NB: This is separate from spi_engine_compile_message() because the latter > * is called twice and would otherwise result in double-evaluation. > + * > + * Returns 0 on success, -EINVAL on failure. > */ > -static void spi_engine_precompile_message(struct spi_message *msg) > +static int spi_engine_precompile_message(struct spi_message *msg) > { > unsigned int clk_div, max_hz = msg->spi->controller->max_speed_hz; > struct spi_transfer *xfer; > > list_for_each_entry(xfer, &msg->transfers, transfer_list) { > + /* If we have an offload transfer, we can't rx to buffer */ > + if (msg->offload && xfer->rx_buf) > + return -EINVAL; > + > clk_div = DIV_ROUND_UP(max_hz, xfer->speed_hz); > xfer->effective_speed_hz = max_hz / min(clk_div, 256U); > } > + > + return 0; > } > > static void spi_engine_compile_message(struct spi_message *msg, bool dry, > @@ -503,8 +542,11 @@ static irqreturn_t spi_engine_irq(int irq, void *devid) > static int spi_engine_optimize_message(struct spi_message *msg) > { > struct spi_engine_program p_dry, *p; > + int ret; > > - spi_engine_precompile_message(msg); > + ret = spi_engine_precompile_message(msg); > + if (ret) > + return ret; > > p_dry.length = 0; > spi_engine_compile_message(msg, true, &p_dry); > @@ -539,6 +581,11 @@ static int spi_engine_transfer_one_message(struct > spi_controller *host, > unsigned int int_enable = 0; > unsigned long flags; > > + if (msg->offload) { > + dev_err(&host->dev, "Single transfer offload not > supported\n"); > + return -EOPNOTSUPP; > + } > + > /* reinitialize message state for this transfer */ > memset(st, 0, sizeof(*st)); > st->cmd_buf = p->instructions; > @@ -579,6 +626,204 @@ static int spi_engine_transfer_one_message(struct > spi_controller *host, > return msg->status; > } > > +static struct spi_engine_offload *spi_engine_get_offload(struct spi_device > *spi, > + unsigned int id, > + unsigned int > *offload_num) > +{ > + struct spi_controller *host = spi->controller; > + struct spi_engine *spi_engine = spi_controller_get_devdata(host); > + struct spi_engine_offload *priv; > + int i; > + > + for (i = 0; i < SPI_ENGINE_MAX_NUM_OFFLOADS; i++) { > + priv = &spi_engine->offload_priv[i]; > + > + if (priv->spi == spi && priv->id == id) { > + *offload_num = i; > + return priv; > + } > + } > + > + return ERR_PTR(-ENODEV); > +} > + > +static int spi_engine_offload_map_channel(struct spi_device *spi, > + unsigned int id, > + unsigned int channel) > +{ > + struct spi_controller *host = spi->controller; > + struct spi_engine *spi_engine = spi_controller_get_devdata(host); > + struct spi_engine_offload *priv; > + > + if (channel >= SPI_ENGINE_MAX_NUM_OFFLOADS) > + return -EINVAL; > + > + priv = &spi_engine->offload_priv[channel]; > + > + if (priv->spi) > + return -EBUSY; I wonder if we need to be this strict? Is there any problem by having two devices requesting the same offload engine? I would expect that having multiple peripherals trying to actually use it at the same time (with the prepare() callback) to be problematic but if they play along it could actually work, right? In reality that may never be a realistic usecase so this is likely fine. > + > + priv->spi = spi; > + priv->id = id; > + > + return 0; > +} > + > +static int spi_engine_offload_prepare(struct spi_device *spi, unsigned int > id, > + struct spi_message *msg) > +{ > + struct spi_controller *host = spi->controller; > + struct spi_engine *spi_engine = spi_controller_get_devdata(host); > + struct spi_engine_program *p = msg->opt_state; > + struct spi_engine_offload *priv; > + struct spi_transfer *xfer; > + void __iomem *cmd_addr; > + void __iomem *sdo_addr; > + size_t tx_word_count = 0; > + unsigned int offload_num, i; > + > + priv = spi_engine_get_offload(spi, id, &offload_num); > + if (IS_ERR(priv)) > + return PTR_ERR(priv); > + > + if (priv->prepared) > + return -EBUSY; > + > + if (p->length > spi_engine->offload_ctrl_mem_size) > + return -EINVAL; > + > + /* count total number of tx words in message */ > + list_for_each_entry(xfer, &msg->transfers, transfer_list) { > + if (!xfer->tx_buf) > + continue; > + > + if (xfer->bits_per_word <= 8) > + tx_word_count += xfer->len; > + else if (xfer->bits_per_word <= 16) > + tx_word_count += xfer->len / 2; > + else > + tx_word_count += xfer->len / 4; > + } > + > + if (tx_word_count > spi_engine->offload_sdo_mem_size) > + return -EINVAL; > + > + cmd_addr = spi_engine->base + SPI_ENGINE_REG_OFFLOAD_CMD_FIFO(priv- > >id); > + sdo_addr = spi_engine->base + > SPI_ENGINE_REG_OFFLOAD_SDO_FIFO(offload_num); > + > + list_for_each_entry(xfer, &msg->transfers, transfer_list) { > + if (!xfer->tx_buf) > + continue; > + > + if (xfer->bits_per_word <= 8) { > + const u8 *buf = xfer->tx_buf; > + > + for (i = 0; i < xfer->len; i++) > + writel_relaxed(buf[i], sdo_addr); > + } else if (xfer->bits_per_word <= 16) { > + const u16 *buf = xfer->tx_buf; > + > + for (i = 0; i < xfer->len / 2; i++) > + writel_relaxed(buf[i], sdo_addr); > + } else { > + const u32 *buf = xfer->tx_buf; > + > + for (i = 0; i < xfer->len / 4; i++) > + writel_relaxed(buf[i], sdo_addr); > + } > + } > + > + for (i = 0; i < p->length; i++) > + writel_relaxed(p->instructions[i], cmd_addr); > + > + msg->offload_state = (void *)(intptr_t)offload_num; > + priv->prepared = true; > + > + return 0; > +} > + > +static void spi_engine_offload_unprepare(struct spi_device *spi, unsigned int > id) > +{ > + struct spi_controller *host = spi->controller; > + struct spi_engine *spi_engine = spi_controller_get_devdata(host); > + struct spi_engine_offload *priv; > + unsigned int offload_num; > + > + priv = spi_engine_get_offload(spi, id, &offload_num); > + if (IS_ERR(priv)) { > + dev_warn(&spi->dev, "failed match offload in unprepare\n"); > + return; > + } > + > + writel_relaxed(1, spi_engine->base + > SPI_ENGINE_REG_OFFLOAD_RESET(offload_num)); > + writel_relaxed(0, spi_engine->base + > SPI_ENGINE_REG_OFFLOAD_RESET(offload_num)); > + > + priv->prepared = false; > +} > + > +static int spi_engine_offload_enable(struct spi_device *spi, unsigned int id) > +{ > + struct spi_controller *host = spi->controller; > + struct spi_engine *spi_engine = spi_controller_get_devdata(host); > + struct spi_engine_offload *priv; > + unsigned int offload_num, reg; > + > + priv = spi_engine_get_offload(spi, id, &offload_num); > + if (IS_ERR(priv)) > + return PTR_ERR(priv); > + > + reg = readl_relaxed(spi_engine->base + > + SPI_ENGINE_REG_OFFLOAD_CTRL(offload_num)); > + reg |= SPI_ENGINE_OFFLOAD_CTRL_ENABLE; > + writel_relaxed(reg, spi_engine->base + > + SPI_ENGINE_REG_OFFLOAD_CTRL(offload_num)); > + > + return 0; > +} > + > +static void spi_engine_offload_disable(struct spi_device *spi, unsigned int > id) > +{ > + struct spi_controller *host = spi->controller; > + struct spi_engine *spi_engine = spi_controller_get_devdata(host); > + struct spi_engine_offload *priv; > + unsigned int offload_num, reg; > + > + priv = spi_engine_get_offload(spi, id, &offload_num); > + if (IS_ERR(priv)) { > + dev_warn(&spi->dev, "failed match offload in disable\n"); > + return; > + } > + > + reg = readl_relaxed(spi_engine->base + > + SPI_ENGINE_REG_OFFLOAD_CTRL(offload_num)); > + reg &= ~SPI_ENGINE_OFFLOAD_CTRL_ENABLE; > + writel_relaxed(reg, spi_engine->base + > + SPI_ENGINE_REG_OFFLOAD_CTRL(offload_num)); > +} > + > +static const struct spi_controller_offload_ops spi_engine_offload_ops = { > + .map_channel = spi_engine_offload_map_channel, > + .prepare = spi_engine_offload_prepare, > + .unprepare = spi_engine_offload_unprepare, > + .hw_trigger_enable = spi_engine_offload_enable, > + .hw_trigger_disable = spi_engine_offload_disable, I guess this is what you and Conor are already somehow discussing but I would expect this to be the actual offload trigger to play a spi transfer. As it stands, it looks weird (or confusing) to have the enable/disable of the engine to act as a trigger... Maybe these callbacks could be used to enable/disable the actual trigger of the offload engine (in our current cases, the PWM)? So this would make it easy to move the trigger DT property where it belongs. The DMA one (given it's tight relation with IIO DMA buffers) is another (way more difficult) story I think. - Nuno Sá
On Tue, May 21, 2024 at 7:27 AM Nuno Sá <noname.nuno@gmail.com> wrote: > > On Fri, 2024-05-10 at 19:44 -0500, David Lechner wrote: > > This implements SPI offload support for the AXI SPI Engine. Currently, > > the hardware only supports triggering offload transfers with a hardware > > trigger so attempting to use an offload message in the regular SPI > > message queue will fail. Also, only allows streaming rx data to an > > external sink, so attempts to use a rx_buf in the offload message will > > fail. > > > > Signed-off-by: David Lechner <dlechner@baylibre.com> > > --- > > ... > > + > > +static int spi_engine_offload_map_channel(struct spi_device *spi, > > + unsigned int id, > > + unsigned int channel) > > +{ > > + struct spi_controller *host = spi->controller; > > + struct spi_engine *spi_engine = spi_controller_get_devdata(host); > > + struct spi_engine_offload *priv; > > + > > + if (channel >= SPI_ENGINE_MAX_NUM_OFFLOADS) > > + return -EINVAL; > > + > > + priv = &spi_engine->offload_priv[channel]; > > + > > + if (priv->spi) > > + return -EBUSY; > > I wonder if we need to be this strict? Is there any problem by having two > devices requesting the same offload engine? I would expect that having multiple > peripherals trying to actually use it at the same time (with the prepare() > callback) to be problematic but if they play along it could actually work, > right? In reality that may never be a realistic usecase so this is likely fine. > I guess not. But to keep it simple for now, yeah, let's wait until we have an actual use case. ... > > + > > +static const struct spi_controller_offload_ops spi_engine_offload_ops = { > > + .map_channel = spi_engine_offload_map_channel, > > + .prepare = spi_engine_offload_prepare, > > + .unprepare = spi_engine_offload_unprepare, > > + .hw_trigger_enable = spi_engine_offload_enable, > > + .hw_trigger_disable = spi_engine_offload_disable, > > I guess this is what you and Conor are already somehow discussing but I would > expect this to be the actual offload trigger to play a spi transfer. As it > stands, it looks weird (or confusing) to have the enable/disable of the engine > to act as a trigger... It isn't acting as the trigger, just configuring the offload instance for exclusive use by a hardware trigger. > Maybe these callbacks could be used to enable/disable the > actual trigger of the offload engine (in our current cases, the PWM)? So this > would make it easy to move the trigger DT property where it belongs. The DMA one > (given it's tight relation with IIO DMA buffers) is another (way more difficult) > story I think. > One issue I have with making the actual hardware trigger part of the SPI controller is that in some cases, the peripheral could actually be the trigger. For example, in the case of a self-clocked ADC where there is just a RDY signal from the ADC when sample data is ready to be read. In this case would the peripheral have to register a trigger enable callback with the controller so that the controller can communicate with the peripheral to enable and disable sampling mode, and therefore the trigger?
On Tue, 2024-05-21 at 09:28 -0500, David Lechner wrote: > On Tue, May 21, 2024 at 7:27 AM Nuno Sá <noname.nuno@gmail.com> wrote: > > > > On Fri, 2024-05-10 at 19:44 -0500, David Lechner wrote: > > > This implements SPI offload support for the AXI SPI Engine. Currently, > > > the hardware only supports triggering offload transfers with a hardware > > > trigger so attempting to use an offload message in the regular SPI > > > message queue will fail. Also, only allows streaming rx data to an > > > external sink, so attempts to use a rx_buf in the offload message will > > > fail. > > > > > > Signed-off-by: David Lechner <dlechner@baylibre.com> > > > --- > > > > > ... > > > > + > > > +static int spi_engine_offload_map_channel(struct spi_device *spi, > > > + unsigned int id, > > > + unsigned int channel) > > > +{ > > > + struct spi_controller *host = spi->controller; > > > + struct spi_engine *spi_engine = spi_controller_get_devdata(host); > > > + struct spi_engine_offload *priv; > > > + > > > + if (channel >= SPI_ENGINE_MAX_NUM_OFFLOADS) > > > + return -EINVAL; > > > + > > > + priv = &spi_engine->offload_priv[channel]; > > > + > > > + if (priv->spi) > > > + return -EBUSY; > > > > I wonder if we need to be this strict? Is there any problem by having two > > devices requesting the same offload engine? I would expect that having multiple > > peripherals trying to actually use it at the same time (with the prepare() > > callback) to be problematic but if they play along it could actually work, > > right? In reality that may never be a realistic usecase so this is likely fine. > > > > I guess not. But to keep it simple for now, yeah, let's wait until we > have an actual use case. > Agreed. > ... > > > > + > > > +static const struct spi_controller_offload_ops spi_engine_offload_ops = { > > > + .map_channel = spi_engine_offload_map_channel, > > > + .prepare = spi_engine_offload_prepare, > > > + .unprepare = spi_engine_offload_unprepare, > > > + .hw_trigger_enable = spi_engine_offload_enable, > > > + .hw_trigger_disable = spi_engine_offload_disable, > > > > I guess this is what you and Conor are already somehow discussing but I would > > expect this to be the actual offload trigger to play a spi transfer. As it > > stands, it looks weird (or confusing) to have the enable/disable of the engine > > to act as a trigger... > > It isn't acting as the trigger, just configuring the offload instance > for exclusive use by a hardware trigger. > > > Maybe these callbacks could be used to enable/disable the > > actual trigger of the offload engine (in our current cases, the PWM)? So this > > would make it easy to move the trigger DT property where it belongs. The DMA one > > (given it's tight relation with IIO DMA buffers) is another (way more difficult) > > story I think. > > > > One issue I have with making the actual hardware trigger part of the > SPI controller is that in some cases, the peripheral could actually be > the trigger. For example, in the case of a self-clocked ADC where > there is just a RDY signal from the ADC when sample data is ready to > be read. In this case would the peripheral have to register a trigger > enable callback with the controller so that the controller can > communicate with the peripheral to enable and disable sampling mode, > and therefore the trigger? In those cases, the peripheral would not bother in doing any spi hardware triggering enable/disable (this assumes that we would need explicit interfaces for offload enable/disable). In DT, the engine would also have no trigger has it's not required so the controller even conditionally register these callbacks. As for the DMA, I have no clue how we can have it associated with the controller given how we want the data to be exported to userspace. Maybe dma-buf could be used but it won't be easy. TBH, I'm not sure if it's that bad having the DMA associated with the peripheral since it's purpose is to transfer peripheral DATA (not like a spi controller DMA used for the actual SPI transfers). - Nuno Sá