mbox series

[v3,0/4] Introduce GENI SE Controller Driver

Message ID 1519781889-16117-1-git-send-email-kramasub@codeaurora.org
Headers show
Series Introduce GENI SE Controller Driver | expand

Message

Karthikeyan Ramasubramanian Feb. 28, 2018, 1:38 a.m. UTC
Generic Interface (GENI) firmware based Qualcomm Universal Peripheral (QUP)
Wrapper is a next generation programmable module for supporting a wide
range of serial interfaces like UART, SPI, I2C, I3C, etc. A single QUP
module can provide upto 8 Serial Interfaces using its internal Serial
Engines (SE). The protocol supported by each interface is determined by
the firmware loaded to the Serial Engine.

This patch series introduces GENI SE Driver to manage the GENI based QUP
Wrapper and the common aspects of all SEs inside the QUP Wrapper. This
patch series also introduces the UART and I2C Controller drivers to
drive the SEs that are programmed with the respective protocols.

[v3]
 * Update the driver dependencies
 * Use the SPDX License Expression
 * Squash all the controller device tree bindings together
 * Use kernel doc format for documentation
 * Add additional documentation for packing configuration
 * Use clk_bulk_* API for related clocks
 * Remove driver references to pinctrl and their states
 * Replace magic numbers with appropriate macros
 * Update memory barrier usage and associated comments
 * Reduce interlacing of register reads/writes
 * Fix poll_get_char() operation in console UART driver under polling mode
 * Address other comments from Bjorn Andersson to improve code readability

[v2]
 * Updated device tree bindings to describe the hardware
 * Updated SE DT node as child node of QUP Wrapper DT node
 * Moved common AHB clocks to QUP Wrapper DT node
 * Use the standard "clock-frequency" I2C property
 * Update compatible field in UART Controller to reflect hardware manual
 * Addressed other device tree binding specific comments from Rob Herring

Karthikeyan Ramasubramanian (4):
  dt-bindings: soc: qcom: Add device tree binding for GENI SE
  soc: qcom: Add GENI based QUP Wrapper driver
  i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C
    controller
  tty: serial: msm_geni_serial: Add serial driver support for GENI based
    QUP

 .../devicetree/bindings/soc/qcom/qcom,geni-se.txt  |   89 ++
 drivers/i2c/busses/Kconfig                         |   11 +
 drivers/i2c/busses/Makefile                        |    1 +
 drivers/i2c/busses/i2c-qcom-geni.c                 |  626 +++++++++++
 drivers/soc/qcom/Kconfig                           |    9 +
 drivers/soc/qcom/Makefile                          |    1 +
 drivers/soc/qcom/qcom-geni-se.c                    |  971 ++++++++++++++++
 drivers/tty/serial/Kconfig                         |   11 +
 drivers/tty/serial/Makefile                        |    1 +
 drivers/tty/serial/qcom_geni_serial.c              | 1181 ++++++++++++++++++++
 include/linux/qcom-geni-se.h                       |  247 ++++
 11 files changed, 3148 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/qcom/qcom,geni-se.txt
 create mode 100644 drivers/i2c/busses/i2c-qcom-geni.c
 create mode 100644 drivers/soc/qcom/qcom-geni-se.c
 create mode 100644 drivers/tty/serial/qcom_geni_serial.c
 create mode 100644 include/linux/qcom-geni-se.h

Comments

Stephen Boyd March 2, 2018, 8:41 p.m. UTC | #1
Quoting Karthikeyan Ramasubramanian (2018-02-27 17:38:07)
> This driver manages the Generic Interface (GENI) firmware based Qualcomm
> Universal Peripheral (QUP) Wrapper. GENI based QUP is the next generation
> programmable module composed of multiple Serial Engines (SE) and supports
> a wide range of serial interfaces like UART, SPI, I2C, I3C, etc. This
> driver also enables managing the serial interface independent aspects of
> Serial Engines.
> 
> Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
> Signed-off-by: Sagar Dharia <sdharia@codeaurora.org>
> Signed-off-by: Girish Mahadevan <girishm@codeaurora.org>
> ---
>  drivers/soc/qcom/Kconfig        |   9 +
>  drivers/soc/qcom/Makefile       |   1 +
>  drivers/soc/qcom/qcom-geni-se.c | 971 ++++++++++++++++++++++++++++++++++++++++
>  include/linux/qcom-geni-se.h    | 247 ++++++++++
>  4 files changed, 1228 insertions(+)
>  create mode 100644 drivers/soc/qcom/qcom-geni-se.c
>  create mode 100644 include/linux/qcom-geni-se.h
> 
> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
> index e050eb8..cc460d0 100644
> --- a/drivers/soc/qcom/Kconfig
> +++ b/drivers/soc/qcom/Kconfig
> @@ -3,6 +3,15 @@
>  #
>  menu "Qualcomm SoC drivers"
>  
> +config QCOM_GENI_SE
> +       tristate "QCOM GENI Serial Engine Driver"
> +       depends on ARCH_QCOM

Add || COMPILE_TEST?

> +       help
> +         This module is used to manage Generic Interface (GENI) firmware based

s/module/driver?

> +         Qualcomm Technologies, Inc. Universal Peripheral (QUP) Wrapper. This
> +         module is also used to manage the common aspects of multiple Serial
s/module/driver?

> +         Engines present in the QUP.
> +
>  config QCOM_GLINK_SSR
>         tristate "Qualcomm Glink SSR driver"
>         depends on RPMSG
> diff --git a/drivers/soc/qcom/qcom-geni-se.c b/drivers/soc/qcom/qcom-geni-se.c
> new file mode 100644
> index 0000000..61335b8
> --- /dev/null
> +++ b/drivers/soc/qcom/qcom-geni-se.c
> @@ -0,0 +1,971 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2017-2018, The Linux Foundation. All rights reserved.
> +
> +#include <linux/clk.h>
> +#include <linux/slab.h>
> +#include <linux/dma-mapping.h>
> +#include <linux/io.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/of_platform.h>
> +#include <linux/qcom-geni-se.h>

#include <linux/platform_device.h>

> +
> +/**
> + * DOC: Overview
> + *
> + * Generic Interface (GENI) Serial Engine (SE) Wrapper driver is introduced
> + * to manage GENI firmware based Qualcomm Universal Peripheral (QUP) Wrapper
> + * controller. QUP Wrapper is designed to support various serial bus protocols
> + * like UART, SPI, I2C, I3C, etc.
> + */
> +
> +/**
> + * DOC: Hardware description
> + *
> + * GENI based QUP is a highly-flexible and programmable module for supporting
> + * a wide range of serial interfaces like UART, SPI, I2C, I3C, etc. A single
> + * QUP module can provide upto 8 Serial Interfaces, using its internal
> + * Serial Engines. The actual configuration is determined by the target
> + * platform configuration. The protocol supported by each interface is
> + * determined by the firmware loaded to the Serial Engine. Each SE consists
> + * of a DMA Engine and GENI sub modules which enable Serial Engines to
> + * support FIFO and DMA modes of operation.
> + *
> + *
> + *                      +-----------------------------------------+
> + *                      |QUP Wrapper                              |
> + *                      |         +----------------------------+  |
> + *   --QUP & SE Clocks-->         | Serial Engine N            |  +-IO------>
> + *                      |         | ...                        |  | Interface
> + *   <---Clock Perf.----+    +----+-----------------------+    |  |
> + *     State Interface  |    | Serial Engine 1            |    |  |
> + *                      |    |                            |    |  |
> + *                      |    |                            |    |  |
> + *   <--------AHB------->    |                            |    |  |
> + *                      |    |                            +----+  |
> + *                      |    |                            |       |
> + *                      |    |                            |       |
> + *   <------SE IRQ------+    +----------------------------+       |
> + *                      |                                         |
> + *                      +-----------------------------------------+
> + *
> + *                         Figure 1: GENI based QUP Wrapper

The code talks about primary and secondary sequencers, but this hardware
description doesn't talk about it. Can you add some more information
here about that aspect too?

> + */
> +
> +/**
> + * DOC: Software description
> + *
> + * GENI SE Wrapper driver is structured into 2 parts:
> + *
> + * geni_wrapper represents QUP Wrapper controller. This part of the driver
> + * manages QUP Wrapper information such as hardware version, clock
> + * performance table that is common to all the internal Serial Engines.
> + *
> + * geni_se represents Serial Engine. This part of the driver manages Serial
> + * Engine information such as clocks, containing QUP Wrapper etc. This part

Insert a comma here                                           ^

> + * of driver also supports operations(eg. initialize the concerned Serial

Space                                   ^

> + * Engine, select between FIFO and DMA mode of operation etc.) that are
> + * common to all the Serial Engines and are independent of Serial Interfaces.

Why are Serial Interfaces and Serial Engine always capitalized?

> + */
> +
> +#define MAX_CLK_PERF_LEVEL 32
> +#define NUM_AHB_CLKS 2
> +static const char m_ahb_clk[] = "m-ahb";
> +static const char s_ahb_clk[] = "s-ahb";

These are used in one place. Inline them?

> +
> +/**
> + * @struct geni_wrapper - Data structure to represent the QUP Wrapper Core
> + * @dev:               Device pointer of the QUP wrapper core.
> + * @base:              Base address of this instance of QUP wrapper core.
> + * @ahb_clks:          Handle to the primary & secondary AHB clocks.
> + * @lock:              Lock to protect the device elements.

What does 'device elements' mean?

> + * @num_clk_levels:    Number of valid clock levels in clk_perf_tbl.
> + * @clk_perf_tbl:      Table of clock frequency input to Serial Engine clock.

Kernel-doc normally doesn't have a full-stop on member descriptions.

> + */
> +struct geni_wrapper {
> +       struct device *dev;
> +       void __iomem *base;
> +       struct clk_bulk_data ahb_clks[NUM_AHB_CLKS];
> +       struct mutex lock;
> +       unsigned int num_clk_levels;
> +       unsigned long *clk_perf_tbl;
> +};
> +
> +/* Offset of QUP Hardware Version Register */

Useless comment?

> +#define QUP_HW_VER_REG                 0x4
> +
> +#define HW_VER_MAJOR_MASK              GENMASK(31, 28)
> +#define HW_VER_MAJOR_SHFT              28
> +#define HW_VER_MINOR_MASK              GENMASK(27, 16)
> +#define HW_VER_MINOR_SHFT              16
> +#define HW_VER_STEP_MASK               GENMASK(15, 0)
> +
> +/* Common SE registers */
> +#define GENI_INIT_CFG_REVISION         0x0
> +#define GENI_S_INIT_CFG_REVISION       0x4
> +#define GENI_OUTPUT_CTRL               0x24
> +#define GENI_CGC_CTRL                  0x28
> +#define GENI_CLK_CTRL_RO               0x60
> +#define GENI_IF_DISABLE_RO             0x64
> +#define GENI_FW_REVISION_RO            0x68
> +#define GENI_FW_S_REVISION_RO          0x6c
> +#define SE_GENI_BYTE_GRAN              0x254
> +#define SE_GENI_TX_PACKING_CFG0                0x260
> +#define SE_GENI_TX_PACKING_CFG1                0x264
> +#define SE_GENI_RX_PACKING_CFG0                0x284
> +#define SE_GENI_RX_PACKING_CFG1                0x288
> +#define SE_GENI_M_GP_LENGTH            0x910
> +#define SE_GENI_S_GP_LENGTH            0x914
> +#define SE_DMA_TX_PTR_L                        0xc30
> +#define SE_DMA_TX_PTR_H                        0xc34
> +#define SE_DMA_TX_ATTR                 0xc38
> +#define SE_DMA_TX_LEN                  0xc3c
> +#define SE_DMA_TX_IRQ_EN               0xc48
> +#define SE_DMA_TX_IRQ_EN_SET           0xc4c
> +#define SE_DMA_TX_IRQ_EN_CLR           0xc50
> +#define SE_DMA_TX_LEN_IN               0xc54
> +#define SE_DMA_TX_MAX_BURST            0xc5c
> +#define SE_DMA_RX_PTR_L                        0xd30
> +#define SE_DMA_RX_PTR_H                        0xd34
> +#define SE_DMA_RX_ATTR                 0xd38
> +#define SE_DMA_RX_LEN                  0xd3c
> +#define SE_DMA_RX_IRQ_EN               0xd48
> +#define SE_DMA_RX_IRQ_EN_SET           0xd4c
> +#define SE_DMA_RX_IRQ_EN_CLR           0xd50
> +#define SE_DMA_RX_LEN_IN               0xd54
> +#define SE_DMA_RX_MAX_BURST            0xd5c
> +#define SE_DMA_RX_FLUSH                        0xd60
> +#define SE_GSI_EVENT_EN                        0xe18
> +#define SE_IRQ_EN                      0xe1c
> +#define SE_HW_PARAM_0                  0xe24
> +#define SE_HW_PARAM_1                  0xe28
> +#define SE_DMA_GENERAL_CFG             0xe30
> +
> +/* GENI_OUTPUT_CTRL fields */
> +#define DEFAULT_IO_OUTPUT_CTRL_MSK     GENMASK(6, 0)
> +
> +/* GENI_CGC_CTRL fields */
> +#define CFG_AHB_CLK_CGC_ON             BIT(0)
> +#define CFG_AHB_WR_ACLK_CGC_ON         BIT(1)
> +#define DATA_AHB_CLK_CGC_ON            BIT(2)
> +#define SCLK_CGC_ON                    BIT(3)
> +#define TX_CLK_CGC_ON                  BIT(4)
> +#define RX_CLK_CGC_ON                  BIT(5)
> +#define EXT_CLK_CGC_ON                 BIT(6)
> +#define PROG_RAM_HCLK_OFF              BIT(8)
> +#define PROG_RAM_SCLK_OFF              BIT(9)
> +#define DEFAULT_CGC_EN                 GENMASK(6, 0)
> +
> +/* FW_REVISION_RO fields */
> +#define FW_REV_PROTOCOL_MSK            GENMASK(15, 8)
> +#define FW_REV_PROTOCOL_SHFT           8
> +
> +/* SE_GSI_EVENT_EN fields */
> +#define DMA_RX_EVENT_EN                        BIT(0)
> +#define DMA_TX_EVENT_EN                        BIT(1)
> +#define GENI_M_EVENT_EN                        BIT(2)
> +#define GENI_S_EVENT_EN                        BIT(3)
> +
> +/* SE_IRQ_EN fields */
> +#define DMA_RX_IRQ_EN                  BIT(0)
> +#define DMA_TX_IRQ_EN                  BIT(1)
> +#define GENI_M_IRQ_EN                  BIT(2)
> +#define GENI_S_IRQ_EN                  BIT(3)
> +
> +/* SE_HW_PARAM_0 fields */
> +#define TX_FIFO_WIDTH_MSK              GENMASK(29, 24)
> +#define TX_FIFO_WIDTH_SHFT             24
> +#define TX_FIFO_DEPTH_MSK              GENMASK(21, 16)
> +#define TX_FIFO_DEPTH_SHFT             16
> +
> +/* SE_HW_PARAM_1 fields */
> +#define RX_FIFO_WIDTH_MSK              GENMASK(29, 24)
> +#define RX_FIFO_WIDTH_SHFT             24
> +#define RX_FIFO_DEPTH_MSK              GENMASK(21, 16)
> +#define RX_FIFO_DEPTH_SHFT             16
> +
> +/* SE_DMA_GENERAL_CFG */
> +#define DMA_RX_CLK_CGC_ON              BIT(0)
> +#define DMA_TX_CLK_CGC_ON              BIT(1)
> +#define DMA_AHB_SLV_CFG_ON             BIT(2)
> +#define AHB_SEC_SLV_CLK_CGC_ON         BIT(3)
> +#define DUMMY_RX_NON_BUFFERABLE                BIT(4)
> +#define RX_DMA_ZERO_PADDING_EN         BIT(5)
> +#define RX_DMA_IRQ_DELAY_MSK           GENMASK(8, 6)
> +#define RX_DMA_IRQ_DELAY_SHFT          6
> +
> +/**
> + * geni_se_get_qup_hw_version() - Read the QUP wrapper Hardware version
> + * @se:                        Pointer to the corresponding Serial Engine.
> + * @major:             Buffer for Major Version field.
> + * @minor:             Buffer for Minor Version field.
> + * @step:              Buffer for Step Version field.
> + */
> +void geni_se_get_qup_hw_version(struct geni_se *se, unsigned int *major,
> +                               unsigned int *minor, unsigned int *step)
> +{
> +       unsigned int version;
> +       struct geni_wrapper *wrapper = se->wrapper;
> +
> +       version = readl_relaxed(wrapper->base + QUP_HW_VER_REG);
> +       *major = (version & HW_VER_MAJOR_MASK) >> HW_VER_MAJOR_SHFT;
> +       *minor = (version & HW_VER_MINOR_MASK) >> HW_VER_MINOR_SHFT;
> +       *step = version & HW_VER_STEP_MASK;
> +}
> +EXPORT_SYMBOL(geni_se_get_qup_hw_version);

Is this used?

> +
> +/**
> + * geni_se_read_proto() - Read the protocol configured for a Serial Engine
> + * @se:        Pointer to the concerned Serial Engine.
> + *
> + * Return: Protocol value as configured in the serial engine.
> + */
> +u32 geni_se_read_proto(struct geni_se *se)
> +{
> +       u32 val;
> +
> +       val = readl_relaxed(se->base + GENI_FW_REVISION_RO);
> +
> +       return (val & FW_REV_PROTOCOL_MSK) >> FW_REV_PROTOCOL_SHFT;
> +}
> +EXPORT_SYMBOL(geni_se_read_proto);

Is this API really needed outside of this file? It would seem like the
drivers that implement the protocol, which are child devices, would only
use this API to confirm that the protocol chosen is for their particular
protocol.


> +
> +static void geni_se_io_set_mode(void __iomem *base)
> +{
> +       u32 val;
> +
> +       val = readl_relaxed(base + SE_IRQ_EN);
> +       val |= (GENI_M_IRQ_EN | GENI_S_IRQ_EN);
> +       val |= (DMA_TX_IRQ_EN | DMA_RX_IRQ_EN);

Drop useless parenthesis please.

> +       writel_relaxed(val, base + SE_IRQ_EN);
> +
> +       val = readl_relaxed(base + SE_GENI_DMA_MODE_EN);
> +       val &= ~GENI_DMA_MODE_EN;
> +       writel_relaxed(val, base + SE_GENI_DMA_MODE_EN);
> +
> +       writel_relaxed(0, base + SE_GSI_EVENT_EN);
> +}
> +
> +static void geni_se_io_init(void __iomem *base)
> +{
> +       u32 val;
> +
> +       val = readl_relaxed(base + GENI_CGC_CTRL);
> +       val |= DEFAULT_CGC_EN;
> +       writel_relaxed(val, base + GENI_CGC_CTRL);
> +
> +       val = readl_relaxed(base + SE_DMA_GENERAL_CFG);
> +       val |= AHB_SEC_SLV_CLK_CGC_ON | DMA_AHB_SLV_CFG_ON;
> +       val |= DMA_TX_CLK_CGC_ON | DMA_RX_CLK_CGC_ON;
> +       writel_relaxed(val, base + SE_DMA_GENERAL_CFG);
> +
> +       writel_relaxed(DEFAULT_IO_OUTPUT_CTRL_MSK, base + GENI_OUTPUT_CTRL);
> +       writel_relaxed(FORCE_DEFAULT, base + GENI_FORCE_DEFAULT_REG);
> +}
> +
> +/**
> + * geni_se_init() - Initialize the GENI Serial Engine
> + * @se:                Pointer to the concerned Serial Engine.
> + * @rx_wm:     Receive watermark, in units of FIFO words.
> + * @rx_rfr_wm: Ready-for-receive watermark, in units of FIFO words.
> + *
> + * This function is used to initialize the GENI serial engine, configure
> + * receive watermark and ready-for-receive watermarks.
> + *
> + * Return: 0 on success, standard Linux error codes on failure/error.

It never returns an error. Change to void?

> + */
> +int geni_se_init(struct geni_se *se, u32 rx_wm, u32 rx_rfr)
> +{
> +       u32 val;
> +
> +       geni_se_io_init(se->base);
> +       geni_se_io_set_mode(se->base);
> +
> +       writel_relaxed(rx_wm, se->base + SE_GENI_RX_WATERMARK_REG);
> +       writel_relaxed(rx_rfr, se->base + SE_GENI_RX_RFR_WATERMARK_REG);
> +
> +       val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
> +       val |= M_COMMON_GENI_M_IRQ_EN;
> +       writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
> +
> +       val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
> +       val |= S_COMMON_GENI_S_IRQ_EN;
> +       writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
> +       return 0;
> +}
> +EXPORT_SYMBOL(geni_se_init);
> +
> +static void geni_se_select_fifo_mode(struct geni_se *se)
> +{
> +       u32 proto = geni_se_read_proto(se);
> +       u32 val;
> +
> +       writel_relaxed(0, se->base + SE_GSI_EVENT_EN);
> +       writel_relaxed(0xffffffff, se->base + SE_GENI_M_IRQ_CLEAR);
> +       writel_relaxed(0xffffffff, se->base + SE_GENI_S_IRQ_CLEAR);
> +       writel_relaxed(0xffffffff, se->base + SE_DMA_TX_IRQ_CLR);
> +       writel_relaxed(0xffffffff, se->base + SE_DMA_RX_IRQ_CLR);
> +       writel_relaxed(0xffffffff, se->base + SE_IRQ_EN);
> +
> +       val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
> +       if (proto != GENI_SE_UART) {
> +               val |= M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN;
> +               val |= M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN;
> +       }
> +       writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
> +
> +       val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
> +       if (proto != GENI_SE_UART)
> +               val |= S_CMD_DONE_EN;
> +       writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
> +
> +       val = readl_relaxed(se->base + SE_GENI_DMA_MODE_EN);
> +       val &= ~GENI_DMA_MODE_EN;
> +       writel_relaxed(val, se->base + SE_GENI_DMA_MODE_EN);
> +}
> +
> +static void geni_se_select_dma_mode(struct geni_se *se)
> +{
> +       u32 val;
> +
> +       writel_relaxed(0, se->base + SE_GSI_EVENT_EN);
> +       writel_relaxed(0xffffffff, se->base + SE_GENI_M_IRQ_CLEAR);
> +       writel_relaxed(0xffffffff, se->base + SE_GENI_S_IRQ_CLEAR);
> +       writel_relaxed(0xffffffff, se->base + SE_DMA_TX_IRQ_CLR);
> +       writel_relaxed(0xffffffff, se->base + SE_DMA_RX_IRQ_CLR);
> +       writel_relaxed(0xffffffff, se->base + SE_IRQ_EN);
> +
> +       val = readl_relaxed(se->base + SE_GENI_DMA_MODE_EN);
> +       val |= GENI_DMA_MODE_EN;
> +       writel_relaxed(val, se->base + SE_GENI_DMA_MODE_EN);
> +}
> +
> +/**
> + * geni_se_select_mode() - Select the serial engine transfer mode
> + * @se:                Pointer to the concerned Serial Engine.
> + * @mode:      Transfer mode to be selected.
> + */
> +void geni_se_select_mode(struct geni_se *se, int mode)

enum mode?

> +{
> +       WARN_ON(mode != GENI_SE_FIFO && mode != GENI_SE_DMA);
> +
> +       switch (mode) {
> +       case GENI_SE_FIFO:
> +               geni_se_select_fifo_mode(se);
> +               break;
> +       case GENI_SE_DMA:
> +               geni_se_select_dma_mode(se);
> +               break;
> +       }
> +}
> +EXPORT_SYMBOL(geni_se_select_mode);
> +
> +/**
> + * geni_se_setup_m_cmd() - Setup the primary sequencer
> + * @se:                Pointer to the concerned Serial Engine.
> + * @cmd:       Command/Operation to setup in the primary sequencer.
> + * @params:    Parameter for the sequencer command.
> + *
> + * This function is used to configure the primary sequencer with the
> + * command and its associated parameters.
> + */
> +void geni_se_setup_m_cmd(struct geni_se *se, u32 cmd, u32 params)
> +{
> +       u32 m_cmd;
> +
> +       m_cmd = (cmd << M_OPCODE_SHFT) | (params & M_PARAMS_MSK);
> +       writel_relaxed(m_cmd, se->base + SE_GENI_M_CMD0);
> +}
> +EXPORT_SYMBOL(geni_se_setup_m_cmd);
> +
> +/**
> + * geni_se_setup_s_cmd() - Setup the secondary sequencer
> + * @se:                Pointer to the concerned Serial Engine.
> + * @cmd:       Command/Operation to setup in the secondary sequencer.
> + * @params:    Parameter for the sequencer command.
> + *
> + * This function is used to configure the secondary sequencer with the
> + * command and its associated parameters.
> + */
> +void geni_se_setup_s_cmd(struct geni_se *se, u32 cmd, u32 params)
> +{
> +       u32 s_cmd;
> +
> +       s_cmd = readl_relaxed(se->base + SE_GENI_S_CMD0);
> +       s_cmd &= ~(S_OPCODE_MSK | S_PARAMS_MSK);
> +       s_cmd |= (cmd << S_OPCODE_SHFT);
> +       s_cmd |= (params & S_PARAMS_MSK);
> +       writel_relaxed(s_cmd, se->base + SE_GENI_S_CMD0);
> +}
> +EXPORT_SYMBOL(geni_se_setup_s_cmd);
> +
> +/**
> + * geni_se_cancel_m_cmd() - Cancel the command configured in the primary
> + *                          sequencer
> + * @se:        Pointer to the concerned Serial Engine.
> + *
> + * This function is used to cancel the currently configured command in the
> + * primary sequencer.
> + */
> +void geni_se_cancel_m_cmd(struct geni_se *se)
> +{
> +       writel_relaxed(M_GENI_CMD_CANCEL, se->base + SE_GENI_M_CMD_CTRL_REG);
> +}
> +EXPORT_SYMBOL(geni_se_cancel_m_cmd);
> +
> +/**
> + * geni_se_cancel_s_cmd() - Cancel the command configured in the secondary
> + *                          sequencer
> + * @se:        Pointer to the concerned Serial Engine.
> + *
> + * This function is used to cancel the currently configured command in the
> + * secondary sequencer.
> + */
> +void geni_se_cancel_s_cmd(struct geni_se *se)
> +{
> +       writel_relaxed(S_GENI_CMD_CANCEL, se->base + SE_GENI_S_CMD_CTRL_REG);
> +}
> +EXPORT_SYMBOL(geni_se_cancel_s_cmd);
> +
> +/**
> + * geni_se_abort_m_cmd() - Abort the command configured in the primary sequencer
> + * @se:        Pointer to the concerned Serial Engine.
> + *
> + * This function is used to force abort the currently configured command in the
> + * primary sequencer.
> + */
> +void geni_se_abort_m_cmd(struct geni_se *se)
> +{
> +       writel_relaxed(M_GENI_CMD_ABORT, se->base + SE_GENI_M_CMD_CTRL_REG);
> +}
> +EXPORT_SYMBOL(geni_se_abort_m_cmd);
> +
> +/**
> + * geni_se_abort_s_cmd() - Abort the command configured in the secondary
> + *                         sequencer
> + * @se:        Pointer to the concerned Serial Engine.
> + *
> + * This function is used to force abort the currently configured command in the
> + * secondary sequencer.
> + */
> +void geni_se_abort_s_cmd(struct geni_se *se)
> +{
> +       writel_relaxed(S_GENI_CMD_ABORT, se->base + SE_GENI_S_CMD_CTRL_REG);
> +}
> +EXPORT_SYMBOL(geni_se_abort_s_cmd);

Can these one-liners go into the header file and be marked static
inline? I would guess call-sites already have se->base in hand, so
registers might be reused more efficiently and it may result in a single
store instruction instead of a branch and load/store.

> +
> +/**
> + * geni_se_get_tx_fifo_depth() - Get the TX fifo depth of the serial engine
> + * @se:        Pointer to the concerned Serial Engine.
> + *
> + * This function is used to get the depth i.e. number of elements in the
> + * TX fifo of the serial engine.
> + *
> + * Return: TX fifo depth in units of FIFO words.
> + */
> +u32 geni_se_get_tx_fifo_depth(struct geni_se *se)
> +{
> +       u32 val;
> +
> +       val = readl_relaxed(se->base + SE_HW_PARAM_0);
> +
> +       return (val & TX_FIFO_DEPTH_MSK) >> TX_FIFO_DEPTH_SHFT;
> +}
> +EXPORT_SYMBOL(geni_se_get_tx_fifo_depth);
> +
> +/**
> + * geni_se_get_tx_fifo_width() - Get the TX fifo width of the serial engine
> + * @se:        Pointer to the concerned Serial Engine.
> + *
> + * This function is used to get the width i.e. word size per element in the
> + * TX fifo of the serial engine.
> + *
> + * Return: TX fifo width in bits
> + */
> +u32 geni_se_get_tx_fifo_width(struct geni_se *se)
> +{
> +       u32 val;
> +
> +       val = readl_relaxed(se->base + SE_HW_PARAM_0);
> +
> +       return (val & TX_FIFO_WIDTH_MSK) >> TX_FIFO_WIDTH_SHFT;
> +}
> +EXPORT_SYMBOL(geni_se_get_tx_fifo_width);
> +
> +/**
> + * geni_se_get_rx_fifo_depth() - Get the RX fifo depth of the serial engine
> + * @se:        Pointer to the concerned Serial Engine.
> + *
> + * This function is used to get the depth i.e. number of elements in the
> + * RX fifo of the serial engine.
> + *
> + * Return: RX fifo depth in units of FIFO words
> + */
> +u32 geni_se_get_rx_fifo_depth(struct geni_se *se)
> +{
> +       u32 val;
> +
> +       val = readl_relaxed(se->base + SE_HW_PARAM_1);
> +
> +       return (val & RX_FIFO_DEPTH_MSK) >> RX_FIFO_DEPTH_SHFT;
> +}
> +EXPORT_SYMBOL(geni_se_get_rx_fifo_depth);

These ones too, can probably just be static inline.

> +
> +/**
> + * DOC: Overview
> + *
> + * GENI FIFO packing is highly configurable. TX/RX packing/unpacking consist
> + * of up to 4 operations, each operation represented by 4 configuration vectors
> + * of 10 bits programmed in GENI_TX_PACKING_CFG0 and GENI_TX_PACKING_CFG1 for
> + * TX FIFO and in GENI_RX_PACKING_CFG0 and GENI_RX_PACKING_CFG1 for RX FIFO.
> + * Refer to below examples for detailed bit-field description.
> + *
> + * Example 1: word_size = 7, packing_mode = 4 x 8, msb_to_lsb = 1
> + *
> + *        +-----------+-------+-------+-------+-------+
> + *        |           | vec_0 | vec_1 | vec_2 | vec_3 |
> + *        +-----------+-------+-------+-------+-------+
> + *        | start     | 0x6   | 0xe   | 0x16  | 0x1e  |
> + *        | direction | 1     | 1     | 1     | 1     |
> + *        | length    | 6     | 6     | 6     | 6     |
> + *        | stop      | 0     | 0     | 0     | 1     |
> + *        +-----------+-------+-------+-------+-------+
> + *
> + * Example 2: word_size = 15, packing_mode = 2 x 16, msb_to_lsb = 0
> + *
> + *        +-----------+-------+-------+-------+-------+
> + *        |           | vec_0 | vec_1 | vec_2 | vec_3 |
> + *        +-----------+-------+-------+-------+-------+
> + *        | start     | 0x0   | 0x8   | 0x10  | 0x18  |
> + *        | direction | 0     | 0     | 0     | 0     |
> + *        | length    | 7     | 6     | 7     | 6     |
> + *        | stop      | 0     | 0     | 0     | 1     |
> + *        +-----------+-------+-------+-------+-------+
> + *
> + * Example 3: word_size = 23, packing_mode = 1 x 32, msb_to_lsb = 1
> + *
> + *        +-----------+-------+-------+-------+-------+
> + *        |           | vec_0 | vec_1 | vec_2 | vec_3 |
> + *        +-----------+-------+-------+-------+-------+
> + *        | start     | 0x16  | 0xe   | 0x6   | 0x0   |
> + *        | direction | 1     | 1     | 1     | 1     |
> + *        | length    | 7     | 7     | 6     | 0     |
> + *        | stop      | 0     | 0     | 1     | 0     |
> + *        +-----------+-------+-------+-------+-------+
> + *
> + */
> +
> +#define NUM_PACKING_VECTORS 4
> +#define PACKING_START_SHIFT 5
> +#define PACKING_DIR_SHIFT 4
> +#define PACKING_LEN_SHIFT 1
> +#define PACKING_STOP_BIT BIT(0)
> +#define PACKING_VECTOR_SHIFT 10
> +/**
> + * geni_se_config_packing() - Packing configuration of the serial engine
> + * @se:                Pointer to the concerned Serial Engine
> + * @bpw:       Bits of data per transfer word.
> + * @pack_words:        Number of words per fifo element.
> + * @msb_to_lsb:        Transfer from MSB to LSB or vice-versa.
> + * @tx_cfg:    Flag to configure the TX Packing.
> + * @rx_cfg:    Flag to configure the RX Packing.
> + *
> + * This function is used to configure the packing rules for the current
> + * transfer.
> + */
> +void geni_se_config_packing(struct geni_se *se, int bpw, int pack_words,
> +                           bool msb_to_lsb, bool tx_cfg, bool rx_cfg)
> +{
> +       u32 cfg0, cfg1, cfg[NUM_PACKING_VECTORS] = {0};
> +       int len;
> +       int temp_bpw = bpw;
> +       int idx_start = msb_to_lsb ? bpw - 1 : 0;
> +       int idx = idx_start;
> +       int idx_delta = msb_to_lsb ? -BITS_PER_BYTE : BITS_PER_BYTE;
> +       int ceil_bpw = (bpw + (BITS_PER_BYTE - 1)) & ~(BITS_PER_BYTE - 1);

ALIGN(bpw, BITS_PER_BYTE)?

> +       int iter = (ceil_bpw * pack_words) / BITS_PER_BYTE;
> +       int i;
> +
> +       if (iter <= 0 || iter > NUM_PACKING_VECTORS)
> +               return;
> +
> +       for (i = 0; i < iter; i++) {
> +               if (temp_bpw < BITS_PER_BYTE)
> +                       len = temp_bpw - 1;
> +               else
> +                       len = BITS_PER_BYTE - 1;

len = min(temp_bpw, BITS_PER_BYTE) - 1;

> +
> +               cfg[i] = idx << PACKING_START_SHIFT;
> +               cfg[i] |= msb_to_lsb << PACKING_DIR_SHIFT;
> +               cfg[i] |= len << PACKING_LEN_SHIFT;
> +
> +               if (temp_bpw <= BITS_PER_BYTE) {
> +                       idx = ((i + 1) * BITS_PER_BYTE) + idx_start;
> +                       temp_bpw = bpw;
> +               } else {
> +                       idx = idx + idx_delta;
> +                       temp_bpw = temp_bpw - BITS_PER_BYTE;
> +               }
> +       }
> +       cfg[iter - 1] |= PACKING_STOP_BIT;
> +       cfg0 = cfg[0] | (cfg[1] << PACKING_VECTOR_SHIFT);
> +       cfg1 = cfg[2] | (cfg[3] << PACKING_VECTOR_SHIFT);
> +
> +       if (tx_cfg) {
> +               writel_relaxed(cfg0, se->base + SE_GENI_TX_PACKING_CFG0);
> +               writel_relaxed(cfg1, se->base + SE_GENI_TX_PACKING_CFG1);
> +       }
> +       if (rx_cfg) {
> +               writel_relaxed(cfg0, se->base + SE_GENI_RX_PACKING_CFG0);
> +               writel_relaxed(cfg1, se->base + SE_GENI_RX_PACKING_CFG1);
> +       }
> +
> +       /*
> +        * Number of protocol words in each FIFO entry
> +        * 0 - 4x8, four words in each entry, max word size of 8 bits
> +        * 1 - 2x16, two words in each entry, max word size of 16 bits
> +        * 2 - 1x32, one word in each entry, max word size of 32 bits
> +        * 3 - undefined
> +        */
> +       if (pack_words || bpw == 32)
> +               writel_relaxed(bpw / 16, se->base + SE_GENI_BYTE_GRAN);
> +}
> +EXPORT_SYMBOL(geni_se_config_packing);
> +
> +static void geni_se_clks_off(struct geni_se *se)
> +{
> +       struct geni_wrapper *wrapper = se->wrapper;
> +
> +       clk_disable_unprepare(se->clk);
> +       clk_bulk_disable_unprepare(ARRAY_SIZE(wrapper->ahb_clks),
> +                                               wrapper->ahb_clks);
> +}
> +
> +/**
> + * geni_se_resources_off() - Turn off resources associated with the serial
> + *                           engine
> + * @se:        Pointer to the concerned Serial Engine.
> + *
> + * Return: 0 on success, standard Linux error codes on failure/error.
> + */
> +int geni_se_resources_off(struct geni_se *se)
> +{
> +       int ret;
> +
> +       ret = pinctrl_pm_select_sleep_state(se->dev);
> +       if (ret)
> +               return ret;
> +
> +       geni_se_clks_off(se);
> +       return 0;
> +}
> +EXPORT_SYMBOL(geni_se_resources_off);
> +
> +static int geni_se_clks_on(struct geni_se *se)
> +{
> +       int ret;
> +       struct geni_wrapper *wrapper = se->wrapper;
> +
> +       ret = clk_bulk_prepare_enable(ARRAY_SIZE(wrapper->ahb_clks),
> +                                               wrapper->ahb_clks);
> +       if (ret)
> +               return ret;
> +
> +       ret = clk_prepare_enable(se->clk);
> +       if (ret)
> +               clk_bulk_disable_unprepare(ARRAY_SIZE(wrapper->ahb_clks),
> +                                                       wrapper->ahb_clks);
> +       return ret;
> +}
> +
> +/**
> + * geni_se_resources_on() - Turn on resources associated with the serial
> + *                          engine
> + * @se:        Pointer to the concerned Serial Engine.
> + *
> + * Return: 0 on success, standard Linux error codes on failure/error.
> + */
> +int geni_se_resources_on(struct geni_se *se)
> +{
> +       int ret = 0;

Don't assign variables and then reassign them on the next line.

> +
> +       ret = geni_se_clks_on(se);
> +       if (ret)
> +               return ret;
> +
> +       ret = pinctrl_pm_select_default_state(se->dev);
> +       if (ret)
> +               geni_se_clks_off(se);
> +
> +       return ret;
> +}
> +EXPORT_SYMBOL(geni_se_resources_on);

IS there a reason why we can't use runtime PM or normal linux PM
infrastructure to power on the wrapper and keep it powered while the
protocol driver is active?

> +
> +/**
> + * geni_se_clk_tbl_get() - Get the clock table to program DFS
> + * @se:                Pointer to the concerned Serial Engine.
> + * @tbl:       Table in which the output is returned.
> + *
> + * This function is called by the protocol drivers to determine the different
> + * clock frequencies supported by Serial Engine Core Clock. The protocol
> + * drivers use the output to determine the clock frequency index to be
> + * programmed into DFS.
> + *
> + * Return: number of valid performance levels in the table on success,
> + *        standard Linux error codes on failure.
> + */
> +int geni_se_clk_tbl_get(struct geni_se *se, unsigned long **tbl)
> +{
> +       struct geni_wrapper *wrapper = se->wrapper;
> +       unsigned long freq = 0;
> +       int i;
> +       int ret = 0;
> +
> +       mutex_lock(&wrapper->lock);
> +       if (wrapper->clk_perf_tbl) {
> +               *tbl = wrapper->clk_perf_tbl;
> +               ret = wrapper->num_clk_levels;
> +               goto out_unlock;
> +       }
> +
> +       wrapper->clk_perf_tbl = kcalloc(MAX_CLK_PERF_LEVEL,
> +                                       sizeof(*wrapper->clk_perf_tbl),
> +                                       GFP_KERNEL);
> +       if (!wrapper->clk_perf_tbl) {
> +               ret = -ENOMEM;
> +               goto out_unlock;
> +       }
> +
> +       for (i = 0; i < MAX_CLK_PERF_LEVEL; i++) {
> +               freq = clk_round_rate(se->clk, freq + 1);
> +               if (!freq || freq == wrapper->clk_perf_tbl[i - 1])
> +                       break;
> +               wrapper->clk_perf_tbl[i] = freq;
> +       }
> +       wrapper->num_clk_levels = i;
> +       *tbl = wrapper->clk_perf_tbl;
> +       ret = wrapper->num_clk_levels;
> +out_unlock:
> +       mutex_unlock(&wrapper->lock);

Is this lock actually protecting anything? I mean to say, is any more
than one geni protocol driver calling this function at a time? Or is
the same geni protocol driver calling this from multiple threads at the
same time? The lock looks almost useless.

> +       return ret;
> +}
> +EXPORT_SYMBOL(geni_se_clk_tbl_get);
> +
> +/**
> + * geni_se_clk_freq_match() - Get the matching or closest SE clock frequency
> + * @se:                Pointer to the concerned Serial Engine.
> + * @req_freq:  Requested clock frequency.
> + * @index:     Index of the resultant frequency in the table.
> + * @res_freq:  Resultant frequency which matches or is closer to the
> + *             requested frequency.
> + * @exact:     Flag to indicate exact multiple requirement of the requested
> + *             frequency.
> + *
> + * This function is called by the protocol drivers to determine the matching
> + * or exact multiple of the requested frequency, as provided by the Serial
> + * Engine clock in order to meet the performance requirements. If there is
> + * no matching or exact multiple of the requested frequency found, then it
> + * selects the closest floor frequency, if exact flag is not set.
> + *
> + * Return: 0 on success, standard Linux error codes on failure.
> + */
> +int geni_se_clk_freq_match(struct geni_se *se, unsigned long req_freq,
> +                          unsigned int *index, unsigned long *res_freq,
> +                          bool exact)
> +{
> +       unsigned long *tbl;
> +       int num_clk_levels;
> +       int i;
> +
> +       num_clk_levels = geni_se_clk_tbl_get(se, &tbl);
> +       if (num_clk_levels < 0)
> +               return num_clk_levels;
> +
> +       if (num_clk_levels == 0)
> +               return -EFAULT;

I believe this would mean userspace thought the syscall faulted.
Perhaps -EINVAL instead?

> +
> +       *res_freq = 0;
> +       for (i = 0; i < num_clk_levels; i++) {
> +               if (!(tbl[i] % req_freq)) {
> +                       *index = i;
> +                       *res_freq = tbl[i];
> +                       return 0;
> +               }
> +
> +               if (!(*res_freq) || ((tbl[i] > *res_freq) &&
> +                                    (tbl[i] < req_freq))) {
> +                       *index = i;
> +                       *res_freq = tbl[i];
> +               }
> +       }
> +
> +       if (exact)
> +               return -ENOKEY;

Interesting error code. Doubtful this is correct because it seems to be
related to crypto keys.

> +
> +       return 0;
> +}
> +EXPORT_SYMBOL(geni_se_clk_freq_match);
> +
> +#define GENI_SE_DMA_DONE_EN BIT(0)
> +#define GENI_SE_DMA_EOT_EN BIT(1)
> +#define GENI_SE_DMA_AHB_ERR_EN BIT(2)
> +#define GENI_SE_DMA_EOT_BUF BIT(0)
> +/**
> + * geni_se_tx_dma_prep() - Prepare the Serial Engine for TX DMA transfer
> + * @se:                        Pointer to the concerned Serial Engine.
> + * @buf:               Pointer to the TX buffer.
> + * @len:               Length of the TX buffer.
> + *
> + * This function is used to prepare the buffers for DMA TX.
> + *
> + * Return: Mapped DMA Address of the buffer on success, NULL on failure.
> + */
> +dma_addr_t geni_se_tx_dma_prep(struct geni_se *se, void *buf, size_t len)
> +{
> +       dma_addr_t iova;
> +       struct geni_wrapper *wrapper = se->wrapper;
> +       u32 val;
> +
> +       iova = dma_map_single(wrapper->dev, buf, len, DMA_TO_DEVICE);
> +       if (dma_mapping_error(wrapper->dev, iova))
> +               return (dma_addr_t)NULL;
> +
> +       val = GENI_SE_DMA_DONE_EN;
> +       val |= GENI_SE_DMA_EOT_EN;
> +       val |= GENI_SE_DMA_AHB_ERR_EN;
> +       writel_relaxed(val, se->base + SE_DMA_TX_IRQ_EN_SET);
> +       writel_relaxed((u32)iova, se->base + SE_DMA_TX_PTR_L);

lower_32_bits()

> +       writel_relaxed((u32)(iova >> 32), se->base + SE_DMA_TX_PTR_H);

upper_32_bits()

> +       writel_relaxed(GENI_SE_DMA_EOT_BUF, se->base + SE_DMA_TX_ATTR);
> +       writel_relaxed((u32)len, se->base + SE_DMA_TX_LEN);

Useless cast.

> +       return iova;
> +}
> +EXPORT_SYMBOL(geni_se_tx_dma_prep);
> +
> +/**
> + * geni_se_rx_dma_prep() - Prepare the Serial Engine for RX DMA transfer
> + * @se:                        Pointer to the concerned Serial Engine.
> + * @buf:               Pointer to the RX buffer.
> + * @len:               Length of the RX buffer.
> + *
> + * This function is used to prepare the buffers for DMA RX.
> + *
> + * Return: Mapped DMA Address of the buffer on success, NULL on failure.
> + */
> +dma_addr_t geni_se_rx_dma_prep(struct geni_se *se, void *buf, size_t len)
> +{
> +       dma_addr_t iova;
> +       struct geni_wrapper *wrapper = se->wrapper;
> +       u32 val;
> +
> +       iova = dma_map_single(wrapper->dev, buf, len, DMA_FROM_DEVICE);
> +       if (dma_mapping_error(wrapper->dev, iova))
> +               return (dma_addr_t)NULL;

Can't return a dma_mapping_error address to the caller and have them
figure it out?

> +
> +       val = GENI_SE_DMA_DONE_EN;
> +       val |= GENI_SE_DMA_EOT_EN;
> +       val |= GENI_SE_DMA_AHB_ERR_EN;
> +       writel_relaxed(val, se->base + SE_DMA_RX_IRQ_EN_SET);
> +       writel_relaxed((u32)iova, se->base + SE_DMA_RX_PTR_L);
> +       writel_relaxed((u32)(iova >> 32), se->base + SE_DMA_RX_PTR_H);

upper/lower macros again.

> +       /* RX does not have EOT buffer type bit. So just reset RX_ATTR */
> +       writel_relaxed(0, se->base + SE_DMA_RX_ATTR);
> +       writel_relaxed((u32)len, se->base + SE_DMA_RX_LEN);

Drop cast?

> +       return iova;
> +}
> +EXPORT_SYMBOL(geni_se_rx_dma_prep);
> +
> +/**
> + * geni_se_tx_dma_unprep() - Unprepare the Serial Engine after TX DMA transfer
> + * @se:                        Pointer to the concerned Serial Engine.
> + * @iova:              DMA address of the TX buffer.
> + * @len:               Length of the TX buffer.
> + *
> + * This function is used to unprepare the DMA buffers after DMA TX.
> + */
> +void geni_se_tx_dma_unprep(struct geni_se *se, dma_addr_t iova, size_t len)
> +{
> +       struct geni_wrapper *wrapper = se->wrapper;
> +
> +       if (iova)
> +               dma_unmap_single(wrapper->dev, iova, len, DMA_TO_DEVICE);
> +}
> +EXPORT_SYMBOL(geni_se_tx_dma_unprep);
> +
> +/**
> + * geni_se_rx_dma_unprep() - Unprepare the Serial Engine after RX DMA transfer
> + * @se:                        Pointer to the concerned Serial Engine.
> + * @iova:              DMA address of the RX buffer.
> + * @len:               Length of the RX buffer.
> + *
> + * This function is used to unprepare the DMA buffers after DMA RX.
> + */
> +void geni_se_rx_dma_unprep(struct geni_se *se, dma_addr_t iova, size_t len)
> +{
> +       struct geni_wrapper *wrapper = se->wrapper;
> +
> +       if (iova)
> +               dma_unmap_single(wrapper->dev, iova, len, DMA_FROM_DEVICE);
> +}
> +EXPORT_SYMBOL(geni_se_rx_dma_unprep);

Instead of having the functions exported, could we set the dma_ops on
all child devices of the wrapper that this driver populates and then
implement the DMA ops for those devices here? I assume that there's
never another DMA master between the wrapper and the serial engine, so I
think it would work.

> +
> +static int geni_se_probe(struct platform_device *pdev)
> +{
> +       struct device *dev = &pdev->dev;
> +       struct resource *res;
> +       struct geni_wrapper *wrapper;
> +       int ret;
> +
> +       wrapper = devm_kzalloc(dev, sizeof(*wrapper), GFP_KERNEL);
> +       if (!wrapper)
> +               return -ENOMEM;
> +
> +       wrapper->dev = dev;
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       wrapper->base = devm_ioremap_resource(dev, res);
> +       if (IS_ERR(wrapper->base)) {
> +               dev_err(dev, "%s: Error mapping the resource\n", __func__);

Drop error message, devm_ioremap_resource() already does it.

> +               return -EFAULT;

return PTR_ERR(wrapper->base);

> +       }
> +
> +       wrapper->ahb_clks[0].id = m_ahb_clk;
> +       wrapper->ahb_clks[1].id = s_ahb_clk;
> +       ret = devm_clk_bulk_get(dev, NUM_AHB_CLKS, wrapper->ahb_clks);
> +       if (ret) {
> +               dev_err(dev, "Err getting AHB clks %d\n", ret);
> +               return ret;
> +       }
> +
> +       mutex_init(&wrapper->lock);
> +       dev_set_drvdata(dev, wrapper);
> +       dev_dbg(dev, "GENI SE Driver probed\n");
> +       return devm_of_platform_populate(dev);
> +}
> +
> +static int geni_se_remove(struct platform_device *pdev)
> +{
> +       struct device *dev = &pdev->dev;
> +       struct geni_wrapper *wrapper = dev_get_drvdata(dev);
> +
> +       kfree(wrapper->clk_perf_tbl);

Why not devm_kzalloc() this?

> +       return 0;
> +}
> +
> +static const struct of_device_id geni_se_dt_match[] = {
> +       { .compatible = "qcom,geni-se-qup", },
> +       {}
> +};
> +MODULE_DEVICE_TABLE(of, geni_se_dt_match);
> +
> +static struct platform_driver geni_se_driver = {
> +       .driver = {
> +               .name = "geni_se_qup",
> +               .of_match_table = geni_se_dt_match,
> +       },
> +       .probe = geni_se_probe,
> +       .remove = geni_se_remove,
> +};
> +module_platform_driver(geni_se_driver);
> +
> +MODULE_DESCRIPTION("GENI Serial Engine Driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/linux/qcom-geni-se.h b/include/linux/qcom-geni-se.h
> new file mode 100644
> index 0000000..4996de7
> --- /dev/null
> +++ b/include/linux/qcom-geni-se.h
> @@ -0,0 +1,247 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +/*
> + * Copyright (c) 2017-2018, The Linux Foundation. All rights reserved.
> + */
> +
> +#ifndef _LINUX_QCOM_GENI_SE
> +#define _LINUX_QCOM_GENI_SE
> +#include <linux/clk.h>

Please forward declare struct clk and drop this include here.

> +#include <linux/dma-direction.h>

Drop?

> +
> +/* Transfer mode supported by GENI Serial Engines */
> +enum geni_se_xfer_mode {
> +       GENI_SE_INVALID,
> +       GENI_SE_FIFO,
> +       GENI_SE_DMA,
> +};
> +
> +/* Protocols supported by GENI Serial Engines */
> +enum geni_se_protocol_types {
> +       GENI_SE_NONE,
> +       GENI_SE_SPI,
> +       GENI_SE_UART,
> +       GENI_SE_I2C,
> +       GENI_SE_I3C,
> +};
> +
> +/**
> + * struct geni_se - GENI Serial Engine
> + * @base:              Base Address of the Serial Engine's register block.
> + * @dev:               Pointer to the Serial Engine device.
> + * @wrapper:           Pointer to the parent QUP Wrapper core.
> + * @clk:               Handle to the core serial engine clock.
> + */
> +struct geni_se {
> +       void __iomem *base;
> +       struct device *dev;
> +       void *wrapper;

Can this get the geni_wrapper type? It could be opaque if you like.

> +       struct clk *clk;
> +};
> +
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Evan Green March 2, 2018, 8:58 p.m. UTC | #2
Hi Karthik,

On Tue, Feb 27, 2018 at 5:38 PM, Karthikeyan Ramasubramanian
<kramasub@codeaurora.org> wrote:
> This driver manages the Generic Interface (GENI) firmware based Qualcomm
> Universal Peripheral (QUP) Wrapper. GENI based QUP is the next generation
> programmable module composed of multiple Serial Engines (SE) and supports
> a wide range of serial interfaces like UART, SPI, I2C, I3C, etc. This
> driver also enables managing the serial interface independent aspects of
> Serial Engines.
>
> Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
> Signed-off-by: Sagar Dharia <sdharia@codeaurora.org>
> Signed-off-by: Girish Mahadevan <girishm@codeaurora.org>
> ---
>  drivers/soc/qcom/Kconfig        |   9 +
>  drivers/soc/qcom/Makefile       |   1 +
>  drivers/soc/qcom/qcom-geni-se.c | 971 ++++++++++++++++++++++++++++++++++++++++
>  include/linux/qcom-geni-se.h    | 247 ++++++++++
>  4 files changed, 1228 insertions(+)
>  create mode 100644 drivers/soc/qcom/qcom-geni-se.c
>  create mode 100644 include/linux/qcom-geni-se.h
>
[...]
> diff --git a/drivers/soc/qcom/qcom-geni-se.c b/drivers/soc/qcom/qcom-geni-se.c
> new file mode 100644
> index 0000000..61335b8
> --- /dev/null
> +++ b/drivers/soc/qcom/qcom-geni-se.c
> +
> +/**
> + * geni_se_clk_tbl_get() - Get the clock table to program DFS
> + * @se:                Pointer to the concerned Serial Engine.
> + * @tbl:       Table in which the output is returned.
> + *
> + * This function is called by the protocol drivers to determine the different
> + * clock frequencies supported by Serial Engine Core Clock. The protocol
> + * drivers use the output to determine the clock frequency index to be
> + * programmed into DFS.
> + *
> + * Return: number of valid performance levels in the table on success,
> + *        standard Linux error codes on failure.
> + */
> +int geni_se_clk_tbl_get(struct geni_se *se, unsigned long **tbl)
> +{
> +       struct geni_wrapper *wrapper = se->wrapper;
> +       unsigned long freq = 0;
> +       int i;
> +       int ret = 0;
> +
> +       mutex_lock(&wrapper->lock);
> +       if (wrapper->clk_perf_tbl) {
> +               *tbl = wrapper->clk_perf_tbl;
> +               ret = wrapper->num_clk_levels;
> +               goto out_unlock;
> +       }
> +
> +       wrapper->clk_perf_tbl = kcalloc(MAX_CLK_PERF_LEVEL,
> +                                       sizeof(*wrapper->clk_perf_tbl),
> +                                       GFP_KERNEL);
> +       if (!wrapper->clk_perf_tbl) {
> +               ret = -ENOMEM;
> +               goto out_unlock;
> +       }
> +
> +       for (i = 0; i < MAX_CLK_PERF_LEVEL; i++) {
> +               freq = clk_round_rate(se->clk, freq + 1);
> +               if (!freq || freq == wrapper->clk_perf_tbl[i - 1])
> +                       break;
> +               wrapper->clk_perf_tbl[i] = freq;
> +       }
> +       wrapper->num_clk_levels = i;
> +       *tbl = wrapper->clk_perf_tbl;
> +       ret = wrapper->num_clk_levels;
> +out_unlock:
> +       mutex_unlock(&wrapper->lock);
> +       return ret;
> +}
> +EXPORT_SYMBOL(geni_se_clk_tbl_get);

I think Bjorn had this feedback before, but if you did this work in
probe you could remove the mutex altogether.

> +       wrapper->ahb_clks[0].id = m_ahb_clk;
> +       wrapper->ahb_clks[1].id = s_ahb_clk;
> +       ret = devm_clk_bulk_get(dev, NUM_AHB_CLKS, wrapper->ahb_clks);
> +       if (ret) {
> +               dev_err(dev, "Err getting AHB clks %d\n", ret);
> +               return ret;
> +       }
> +
> +       mutex_init(&wrapper->lock);
> +       dev_set_drvdata(dev, wrapper);
> +       dev_dbg(dev, "GENI SE Driver probed\n");
> +       return devm_of_platform_populate(dev);
> +}
> +
> +static int geni_se_remove(struct platform_device *pdev)
> +{
> +       struct device *dev = &pdev->dev;
> +       struct geni_wrapper *wrapper = dev_get_drvdata(dev);
> +
> +       kfree(wrapper->clk_perf_tbl);

Maybe null out clk_perf_tbl for safety?

> diff --git a/include/linux/qcom-geni-se.h b/include/linux/qcom-geni-se.h
> new file mode 100644
> index 0000000..4996de7
> --- /dev/null
> +++ b/include/linux/qcom-geni-se.h
[...]
> +/* SE_DMA_RX_IRQ_STAT Register fields */
> +#define RX_DMA_DONE                    BIT(0)
> +#define RX_EOT                         BIT(1)
> +#define RX_SBE                         BIT(2)
> +#define RX_RESET_DONE                  BIT(3)
> +#define RX_FLUSH_DONE                  BIT(4)
> +#define RX_GENI_GP_IRQ                 GENMASK(10, 5)
> +#define RX_GENI_CANCEL_IRQ             BIT(11)
> +#define RX_GENI_GP_IRQ_EXT             GENMASK(13, 12)
> +
> +#ifdef CONFIG_QCOM_GENI_SE

I think this needs to be #if IS_ENABLED(CONFIG_QCOM_GENI_SE), since
the function prototypes below won't light up if this is built as a
module.

Thanks,
Evan
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Boyd March 2, 2018, 10:11 p.m. UTC | #3
Quoting Karthikeyan Ramasubramanian (2018-02-27 17:38:09)
> diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
> index 3682fd3..c6b1500 100644
> --- a/drivers/tty/serial/Kconfig
> +++ b/drivers/tty/serial/Kconfig
> @@ -1104,6 +1104,17 @@ config SERIAL_MSM_CONSOLE
>         select SERIAL_CORE_CONSOLE
>         select SERIAL_EARLYCON
>  
> +config SERIAL_QCOM_GENI
> +       bool "QCOM on-chip GENI based serial port support"

This can be tristate.

> +       depends on ARCH_QCOM

|| COMPILE_TEST
?

> +       depends on QCOM_GENI_SE
> +       select SERIAL_CORE

This can stay.

> +       select SERIAL_CORE_CONSOLE
> +       select SERIAL_EARLYCON

These two can go to a new config option, like SERIAL_QCOM_GENI_CONSOLE,
and that would be bool. Please take a look at the existing SERIAL_MSM
and SERIAL_MSM_CONSOLE setup to understand how to do it.

> +       help
> +         Serial driver for Qualcomm Technologies Inc's GENI based QUP
> +         hardware.
> +
>  config SERIAL_VT8500
>         bool "VIA VT8500 on-chip serial port support"
>         depends on ARCH_VT8500
> diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
> new file mode 100644
> index 0000000..8536b7d
> --- /dev/null
> +++ b/drivers/tty/serial/qcom_geni_serial.c
> @@ -0,0 +1,1181 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2017-2018, The Linux foundation. All rights reserved.
> +
> +#include <linux/console.h>
> +#include <linux/io.h>
> +#include <linux/iopoll.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/of_device.h>
> +#include <linux/platform_device.h>
> +#include <linux/qcom-geni-se.h>
> +#include <linux/serial.h>
> +#include <linux/serial_core.h>
> +#include <linux/slab.h>
> +#include <linux/tty.h>
> +#include <linux/tty_flip.h>
> +
> +/* UART specific GENI registers */
> +#define SE_UART_TX_TRANS_CFG           0x25c
> +#define SE_UART_TX_WORD_LEN            0x268
> +#define SE_UART_TX_STOP_BIT_LEN                0x26c
> +#define SE_UART_TX_TRANS_LEN           0x270
> +#define SE_UART_RX_TRANS_CFG           0x280
> +#define SE_UART_RX_WORD_LEN            0x28c
> +#define SE_UART_RX_STALE_CNT           0x294
> +#define SE_UART_TX_PARITY_CFG          0x2a4
> +#define SE_UART_RX_PARITY_CFG          0x2a8
> +
> +/* SE_UART_TRANS_CFG */
> +#define UART_TX_PAR_EN         BIT(0)
> +#define UART_CTS_MASK          BIT(1)
> +
> +/* SE_UART_TX_WORD_LEN */
> +#define TX_WORD_LEN_MSK                GENMASK(9, 0)
> +
> +/* SE_UART_TX_STOP_BIT_LEN */
> +#define TX_STOP_BIT_LEN_MSK    GENMASK(23, 0)
> +#define TX_STOP_BIT_LEN_1      0
> +#define TX_STOP_BIT_LEN_1_5    1
> +#define TX_STOP_BIT_LEN_2      2
> +
> +/* SE_UART_TX_TRANS_LEN */
> +#define TX_TRANS_LEN_MSK       GENMASK(23, 0)
> +
> +/* SE_UART_RX_TRANS_CFG */
> +#define UART_RX_INS_STATUS_BIT BIT(2)
> +#define UART_RX_PAR_EN         BIT(3)
> +
> +/* SE_UART_RX_WORD_LEN */
> +#define RX_WORD_LEN_MASK       GENMASK(9, 0)
> +
> +/* SE_UART_RX_STALE_CNT */
> +#define RX_STALE_CNT           GENMASK(23, 0)
> +
> +/* SE_UART_TX_PARITY_CFG/RX_PARITY_CFG */
> +#define PAR_CALC_EN            BIT(0)
> +#define PAR_MODE_MSK           GENMASK(2, 1)
> +#define PAR_MODE_SHFT          1
> +#define PAR_EVEN               0x00
> +#define PAR_ODD                        0x01
> +#define PAR_SPACE              0x10
> +#define PAR_MARK               0x11
> +
> +/* UART M_CMD OP codes */
> +#define UART_START_TX          0x1
> +#define UART_START_BREAK       0x4
> +#define UART_STOP_BREAK                0x5
> +/* UART S_CMD OP codes */
> +#define UART_START_READ                0x1
> +#define UART_PARAM             0x1
> +
> +#define UART_OVERSAMPLING      32
> +#define STALE_TIMEOUT          16
> +#define DEFAULT_BITS_PER_CHAR  10
> +#define GENI_UART_CONS_PORTS   1
> +#define DEF_FIFO_DEPTH_WORDS   16
> +#define DEF_TX_WM              2
> +#define DEF_FIFO_WIDTH_BITS    32
> +#define UART_CONSOLE_RX_WM     2
> +
> +#ifdef CONFIG_CONSOLE_POLL
> +#define RX_BYTES_PW 1
> +#else
> +#define RX_BYTES_PW 4
> +#endif
> +
> +struct qcom_geni_serial_port {
> +       struct uart_port uport;
> +       struct geni_se se;
> +       char name[20];
> +       u32 tx_fifo_depth;
> +       u32 tx_fifo_width;
> +       u32 rx_fifo_depth;
> +       u32 tx_wm;
> +       u32 rx_wm;
> +       u32 rx_rfr;
> +       int xfer_mode;

Can this be an enum?

> +       bool port_setup;

Maybe just 'setup'? Port is in the type already.

> +       int (*handle_rx)(struct uart_port *uport,
> +                               u32 rx_bytes, bool drop_rx);

s/rx_bytes/bytes/
s/drop_rx/drop/

> +       unsigned int xmit_size;
> +       unsigned int cur_baud;

s/cur//

> +       unsigned int tx_bytes_pw;
> +       unsigned int rx_bytes_pw;
> +};
> +
> +static const struct uart_ops qcom_geni_serial_pops;
> +static struct uart_driver qcom_geni_console_driver;
> +static int handle_rx_console(struct uart_port *uport,
> +                               u32 rx_bytes, bool drop_rx);

s/rx_bytes/bytes/
s/drop_rx/drop/

> +static unsigned int qcom_geni_serial_tx_empty(struct uart_port *port);
> +static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
> +                               int offset, int bit_field, bool set);

No need to forward declare this?

s/bit_// ?

> +static void qcom_geni_serial_stop_rx(struct uart_port *uport);
> +
> +static atomic_t uart_line_id = ATOMIC_INIT(0);

Do we need this? How about rely on DT to always have aliases instead?
Given we only have one port I don't actually understand how this is
supposed to work anyway.

> +static const unsigned long root_freq[] = {7372800, 14745600, 19200000, 29491200,
> +                                       32000000, 48000000, 64000000, 80000000,
> +                                       96000000, 100000000};
> +
> +#define to_dev_port(ptr, member) \
> +               container_of(ptr, struct qcom_geni_serial_port, member)
> +
> +static struct qcom_geni_serial_port qcom_geni_console_port;

Why singleton? Couldn't there be many?

> +
> +static void qcom_geni_serial_config_port(struct uart_port *uport, int cfg_flags)
> +{
> +       if (cfg_flags & UART_CONFIG_TYPE)
> +               uport->type = PORT_MSM;
> +}
> +
> +static unsigned int qcom_geni_cons_get_mctrl(struct uart_port *uport)
> +{
> +       return TIOCM_DSR | TIOCM_CAR | TIOCM_CTS;
> +}
> +
> +static void qcom_geni_cons_set_mctrl(struct uart_port *uport,
> +                                                       unsigned int mctrl)
> +{
> +}
> +
> +static const char *qcom_geni_serial_get_type(struct uart_port *uport)
> +{
> +       return "MSM";
> +}
> +
> +static struct qcom_geni_serial_port *get_port_from_line(int line)
> +{
> +       if ((line < 0) || (line >= GENI_UART_CONS_PORTS))

Drop useless parenthesis please.

> +               return ERR_PTR(-ENXIO);
> +       return &qcom_geni_console_port;
> +}
> +
> +static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
> +                               int offset, int bit_field, bool set)
> +{
> +       u32 reg;
> +       struct qcom_geni_serial_port *port;
> +       unsigned int baud;
> +       unsigned int fifo_bits;
> +       unsigned long timeout_us = 20000;
> +
> +       /* Ensure polling is not re-ordered before the prior writes/reads */
> +       mb();
> +
> +       if (uport->private_data) {
> +               port = to_dev_port(uport, uport);
> +               baud = port->cur_baud;
> +               if (!baud)
> +                       baud = 115200;
> +               fifo_bits = port->tx_fifo_depth * port->tx_fifo_width;
> +               /*
> +                * Total polling iterations based on FIFO worth of bytes to be
> +                * sent at current baud .Add a little fluff to the wait.

Bad space here                           ^

> +                */
> +               timeout_us = ((fifo_bits * USEC_PER_SEC) / baud) + 500;
> +       }
> +
> +       return !readl_poll_timeout_atomic(uport->membase + offset, reg,
> +                        (bool)(reg & bit_field) == set, 10, timeout_us);
> +}
> +
> +static void qcom_geni_serial_setup_tx(struct uart_port *uport, u32 xmit_size)
> +{
> +       u32 m_cmd;
> +
> +       writel_relaxed(xmit_size, uport->membase + SE_UART_TX_TRANS_LEN);
> +       m_cmd = UART_START_TX << M_OPCODE_SHFT;
> +       writel(m_cmd, uport->membase + SE_GENI_M_CMD0);
> +}
> +
> +static void qcom_geni_serial_poll_tx_done(struct uart_port *uport)
> +{
> +       int done;
> +       u32 irq_clear = M_CMD_DONE_EN;
> +
> +       done = qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
> +                                               M_CMD_DONE_EN, true);
> +       if (!done) {
> +               writel_relaxed(M_GENI_CMD_ABORT, uport->membase +
> +                                               SE_GENI_M_CMD_CTRL_REG);
> +               irq_clear |= M_CMD_ABORT_EN;
> +               qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
> +                                                       M_CMD_ABORT_EN, true);
> +       }
> +       writel_relaxed(irq_clear, uport->membase + SE_GENI_M_IRQ_CLEAR);
> +}
> +
> +static void qcom_geni_serial_abort_rx(struct uart_port *uport)
> +{
> +       u32 irq_clear = S_CMD_DONE_EN | S_CMD_ABORT_EN;
> +
> +       writel(S_GENI_CMD_ABORT, uport->membase + SE_GENI_S_CMD_CTRL_REG);
> +       qcom_geni_serial_poll_bit(uport, SE_GENI_S_CMD_CTRL_REG,
> +                                       S_GENI_CMD_ABORT, false);
> +       writel_relaxed(irq_clear, uport->membase + SE_GENI_S_IRQ_CLEAR);
> +       writel_relaxed(FORCE_DEFAULT, uport->membase + GENI_FORCE_DEFAULT_REG);
> +}
> +
> +#ifdef CONFIG_CONSOLE_POLL
> +static int qcom_geni_serial_get_char(struct uart_port *uport)
> +{
> +       u32 rx_fifo;
> +       u32 status;
> +
> +       status = readl_relaxed(uport->membase + SE_GENI_M_IRQ_STATUS);
> +       writel_relaxed(status, uport->membase + SE_GENI_M_IRQ_CLEAR);
> +
> +       status = readl_relaxed(uport->membase + SE_GENI_S_IRQ_STATUS);
> +       writel_relaxed(status, uport->membase + SE_GENI_S_IRQ_CLEAR);
> +
> +       /*
> +        * Ensure the writes to clear interrupts is not re-ordered after
> +        * reading the data.
> +        */
> +       mb();
> +
> +       status = readl_relaxed(uport->membase + SE_GENI_RX_FIFO_STATUS);
> +       if (!(status & RX_FIFO_WC_MSK))
> +               return NO_POLL_CHAR;
> +
> +       rx_fifo = readl(uport->membase + SE_GENI_RX_FIFOn);
> +       return rx_fifo & 0xff;
> +}
> +
> +static void qcom_geni_serial_poll_put_char(struct uart_port *uport,
> +                                                       unsigned char c)
> +{
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +
> +       writel_relaxed(port->tx_wm, uport->membase + SE_GENI_TX_WATERMARK_REG);
> +       qcom_geni_serial_setup_tx(uport, 1);
> +       WARN_ON(!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
> +                                               M_TX_FIFO_WATERMARK_EN, true));
> +       writel_relaxed((u32)c, uport->membase + SE_GENI_TX_FIFOn);

Drop useless cast.

> +       writel_relaxed(M_TX_FIFO_WATERMARK_EN, uport->membase +
> +                                                       SE_GENI_M_IRQ_CLEAR);
> +       qcom_geni_serial_poll_tx_done(uport);
> +}
> +#endif
> +
> +#if defined(CONFIG_SERIAL_CORE_CONSOLE) || defined(CONFIG_CONSOLE_POLL)
> +static void qcom_geni_serial_wr_char(struct uart_port *uport, int ch)
> +{
> +       writel_relaxed(ch, uport->membase + SE_GENI_TX_FIFOn);
> +}
> +
> +static void
> +__qcom_geni_serial_console_write(struct uart_port *uport, const char *s,
> +                                unsigned int count)
> +{
> +       int new_line = 0;

Drop

> +       int i;
> +       u32 bytes_to_send = count;
> +
> +       for (i = 0; i < count; i++) {
> +               if (s[i] == '\n')
> +                       new_line++;

bytes_to_send++;

> +       }
> +
> +       bytes_to_send += new_line;

Drop.

> +       writel_relaxed(DEF_TX_WM, uport->membase + SE_GENI_TX_WATERMARK_REG);
> +       qcom_geni_serial_setup_tx(uport, bytes_to_send);
> +       i = 0;
> +       while (i < count) {

	for (i = 0; i < count; ) {

would be more normal, but ok.

> +               size_t chars_to_write = 0;
> +               size_t avail = DEF_FIFO_DEPTH_WORDS - DEF_TX_WM;
> +
> +               /*
> +                * If the WM bit never set, then the Tx state machine is not
> +                * in a valid state, so break, cancel/abort any existing
> +                * command. Unfortunately the current data being written is
> +                * lost.
> +                */
> +               while (!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
> +                                               M_TX_FIFO_WATERMARK_EN, true))

Does this ever timeout? So many nested while loops makes it hard to
follow.

> +                       break;
> +               chars_to_write = min_t(size_t, (size_t)(count - i), avail / 2);
> +               uart_console_write(uport, (s + i), chars_to_write,

Drop useless parenthesis please.

> +                                               qcom_geni_serial_wr_char);
> +               writel_relaxed(M_TX_FIFO_WATERMARK_EN, uport->membase +
> +                                                       SE_GENI_M_IRQ_CLEAR);
> +               i += chars_to_write;
> +       }
> +       qcom_geni_serial_poll_tx_done(uport);
> +}
> +
> +static void qcom_geni_serial_console_write(struct console *co, const char *s,
> +                             unsigned int count)
> +{
> +       struct uart_port *uport;
> +       struct qcom_geni_serial_port *port;
> +       bool locked = true;
> +       unsigned long flags;
> +
> +       WARN_ON(co->index < 0 || co->index >= GENI_UART_CONS_PORTS);
> +
> +       port = get_port_from_line(co->index);
> +       if (IS_ERR(port))
> +               return;
> +
> +       uport = &port->uport;
> +       if (oops_in_progress)
> +               locked = spin_trylock_irqsave(&uport->lock, flags);
> +       else
> +               spin_lock_irqsave(&uport->lock, flags);
> +
> +       if (locked) {
> +               __qcom_geni_serial_console_write(uport, s, count);

So if oops is in progress, and we didn't lock here, we don't output
data? I'd think we would always want to write to the fifo, just make the
lock grab/release conditional.

> +               spin_unlock_irqrestore(&uport->lock, flags);
> +       }
> +}
> +
> +static int handle_rx_console(struct uart_port *uport, u32 rx_bytes, bool drop)
> +{
> +       u32 i = rx_bytes;
> +       u32 rx_fifo;
> +       unsigned char *buf;
> +       struct tty_port *tport;
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +
> +       tport = &uport->state->port;
> +       while (i > 0) {
> +               int c;
> +               int bytes = i > port->rx_bytes_pw ? port->rx_bytes_pw : i;
> +
> +               rx_fifo = readl_relaxed(uport->membase + SE_GENI_RX_FIFOn);

Please use ioread32_rep(..., 1) here.

> +               i -= bytes;
> +               if (drop)
> +                       continue;
> +               buf = (unsigned char *)&rx_fifo;

So that this cast becomes unnecessary, and endian agnostic.

> +
> +               for (c = 0; c < bytes; c++) {
> +                       int sysrq;
> +
> +                       uport->icount.rx++;
> +                       sysrq = uart_handle_sysrq_char(uport, buf[c]);

And so this does the right thing in whatever world we live in.

> +                       if (!sysrq)
> +                               tty_insert_flip_char(tport, buf[c], TTY_NORMAL);
> +               }
> +       }
> +       if (!drop)
> +               tty_flip_buffer_push(tport);
> +       return 0;
> +}
> +#else
> +static int handle_rx_console(struct uart_port *uport,
> +                       unsigned int rx_fifo_wc,
> +                       unsigned int rx_last_byte_valid,
> +                       unsigned int rx_last,
> +                       bool drop_rx)
> +{
> +       return -EPERM;
> +}
> +
> +#endif /* (CONFIG_SERIAL_CORE_CONSOLE) || defined(CONFIG_CONSOLE_POLL)) */
> +
> +static void qcom_geni_serial_start_tx(struct uart_port *uport)
> +{
> +       u32 irq_en;
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +       u32 status;
> +
> +       if (port->xfer_mode == GENI_SE_FIFO) {
> +               status = readl_relaxed(uport->membase + SE_GENI_STATUS);
> +               if (status & M_GENI_CMD_ACTIVE)
> +                       return;
> +
> +               if (!qcom_geni_serial_tx_empty(uport))
> +                       return;
> +
> +               /*
> +                * Ensure writing to IRQ_EN & watermark registers are not
> +                * re-ordered before checking the status of the Serial
> +                * Engine and TX FIFO
> +                */
> +               mb();
> +
> +               irq_en = readl_relaxed(uport->membase + SE_GENI_M_IRQ_EN);
> +               irq_en |= M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN;
> +
> +               writel_relaxed(port->tx_wm, uport->membase +
> +                                               SE_GENI_TX_WATERMARK_REG);
> +               writel_relaxed(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
> +       }
> +}
> +
> +static void qcom_geni_serial_stop_tx(struct uart_port *uport)
> +{
> +       u32 irq_en;
> +       u32 status;
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +
> +       irq_en = readl_relaxed(uport->membase + SE_GENI_M_IRQ_EN);
> +       irq_en &= ~M_CMD_DONE_EN;
> +       if (port->xfer_mode == GENI_SE_FIFO) {
> +               irq_en &= ~M_TX_FIFO_WATERMARK_EN;
> +               writel_relaxed(0, uport->membase +
> +                                    SE_GENI_TX_WATERMARK_REG);
> +       }
> +       port->xmit_size = 0;
> +       writel_relaxed(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
> +       status = readl_relaxed(uport->membase + SE_GENI_STATUS);
> +       /* Possible stop tx is called multiple times. */
> +       if (!(status & M_GENI_CMD_ACTIVE))
> +               return;
> +
> +       /*
> +        * Ensure cancel command write is not re-ordered before checking
> +        * checking the status of the Primary Sequencer.
> +        */
> +       mb();
> +
> +       geni_se_cancel_m_cmd(&port->se);
> +       if (!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
> +                                               M_CMD_CANCEL_EN, true)) {
> +               geni_se_abort_m_cmd(&port->se);
> +               qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
> +                                               M_CMD_ABORT_EN, true);
> +               writel_relaxed(M_CMD_ABORT_EN, uport->membase +
> +                                                       SE_GENI_M_IRQ_CLEAR);
> +       }
> +       writel_relaxed(M_CMD_CANCEL_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
> +}
> +
> +static void qcom_geni_serial_start_rx(struct uart_port *uport)
> +{
> +       u32 irq_en;
> +       u32 status;
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +
> +       status = readl_relaxed(uport->membase + SE_GENI_STATUS);
> +       if (status & S_GENI_CMD_ACTIVE)
> +               qcom_geni_serial_stop_rx(uport);
> +
> +       /*
> +        * Ensure setup command write is not re-ordered before checking
> +        * checking the status of the Secondary Sequencer.
> +        */
> +       mb();
> +
> +       geni_se_setup_s_cmd(&port->se, UART_START_READ, 0);
> +
> +       if (port->xfer_mode == GENI_SE_FIFO) {
> +               irq_en = readl_relaxed(uport->membase + SE_GENI_S_IRQ_EN);
> +               irq_en |= S_RX_FIFO_WATERMARK_EN | S_RX_FIFO_LAST_EN;
> +               writel_relaxed(irq_en, uport->membase + SE_GENI_S_IRQ_EN);
> +
> +               irq_en = readl_relaxed(uport->membase + SE_GENI_M_IRQ_EN);
> +               irq_en |= M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN;
> +               writel_relaxed(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
> +       }
> +}
> +
> +static void qcom_geni_serial_stop_rx(struct uart_port *uport)
> +{
> +       u32 irq_en;
> +       u32 status;
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +       u32 irq_clear = S_CMD_DONE_EN;
> +
> +       if (port->xfer_mode == GENI_SE_FIFO) {
> +               irq_en = readl_relaxed(uport->membase + SE_GENI_S_IRQ_EN);
> +               irq_en &= ~(S_RX_FIFO_WATERMARK_EN | S_RX_FIFO_LAST_EN);
> +               writel_relaxed(irq_en, uport->membase + SE_GENI_S_IRQ_EN);
> +
> +               irq_en = readl_relaxed(uport->membase + SE_GENI_M_IRQ_EN);
> +               irq_en &= ~(M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN);
> +               writel_relaxed(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
> +       }
> +
> +       status = readl_relaxed(uport->membase + SE_GENI_STATUS);
> +       /* Possible stop rx is called multiple times. */
> +       if (!(status & S_GENI_CMD_ACTIVE))
> +               return;
> +
> +       /*
> +        * Ensure cancel command write is not re-ordered before checking
> +        * checking the status of the Secondary Sequencer.

Each of these comments has 'checking' twice.

> +        */
> +       mb();
> +
> +       geni_se_cancel_s_cmd(&port->se);
> +       qcom_geni_serial_poll_bit(uport, SE_GENI_S_CMD_CTRL_REG,
> +                                       S_GENI_CMD_CANCEL, false);
> +       status = readl_relaxed(uport->membase + SE_GENI_STATUS);
> +       writel_relaxed(irq_clear, uport->membase + SE_GENI_S_IRQ_CLEAR);
> +       if (status & S_GENI_CMD_ACTIVE)
> +               qcom_geni_serial_abort_rx(uport);
> +}
> +
> +static void qcom_geni_serial_handle_rx(struct uart_port *uport, bool drop_rx)

s/drop_rx/drop/

> +{
> +       u32 status;
> +       u32 word_cnt;
> +       u32 last_word_byte_cnt;
> +       u32 last_word_partial;
> +       u32 total_bytes;
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +
> +       status = readl_relaxed(uport->membase + SE_GENI_RX_FIFO_STATUS);
> +       word_cnt = status & RX_FIFO_WC_MSK;
> +       last_word_partial = status & RX_LAST;
> +       last_word_byte_cnt = (status & RX_LAST_BYTE_VALID_MSK) >>
> +                                               RX_LAST_BYTE_VALID_SHFT;
> +
> +       if (!word_cnt)
> +               return;
> +       total_bytes = port->rx_bytes_pw * (word_cnt - 1);
> +       if (last_word_partial && last_word_byte_cnt)
> +               total_bytes += last_word_byte_cnt;
> +       else
> +               total_bytes += port->rx_bytes_pw;
> +       port->handle_rx(uport, total_bytes, drop_rx);
> +}
> +
> +static int qcom_geni_serial_handle_tx(struct uart_port *uport)
> +{
> +       int ret = 0;
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +       struct circ_buf *xmit = &uport->state->xmit;
> +       size_t avail;
> +       size_t remaining;
> +       int i = 0;
> +       u32 status;
> +       unsigned int chunk;
> +       int tail;
> +
> +       chunk = uart_circ_chars_pending(xmit);
> +       status = readl_relaxed(uport->membase + SE_GENI_TX_FIFO_STATUS);
> +       /* Both FIFO and framework buffer are drained */
> +       if ((chunk == port->xmit_size) && !status) {

Drop useless parenthesis.

> +               port->xmit_size = 0;
> +               uart_circ_clear(xmit);
> +               qcom_geni_serial_stop_tx(uport);
> +               goto out_write_wakeup;
> +       }
> +       chunk -= port->xmit_size;
> +
> +       avail = (port->tx_fifo_depth - port->tx_wm) * port->tx_bytes_pw;
> +       tail = (xmit->tail + port->xmit_size) & (UART_XMIT_SIZE - 1);
> +       if (chunk > (UART_XMIT_SIZE - tail))
> +               chunk = UART_XMIT_SIZE - tail;
> +       if (chunk > avail)
> +               chunk = avail;
> +
> +       if (!chunk)
> +               goto out_write_wakeup;
> +
> +       qcom_geni_serial_setup_tx(uport, chunk);
> +
> +       remaining = chunk;
> +       while (i < chunk) {

	for (i = 0; i < chunk; ) {

> +               unsigned int tx_bytes;
> +               unsigned int buf = 0;
> +               int c;
> +
> +               tx_bytes = min_t(size_t, remaining, (size_t)port->tx_bytes_pw);
> +               for (c = 0; c < tx_bytes ; c++)
> +                       buf |= (xmit->buf[tail + c] << (c * BITS_PER_BYTE));
> +
> +               writel_relaxed(buf, uport->membase + SE_GENI_TX_FIFOn);
> +
> +               i += tx_bytes;
> +               tail = (tail + tx_bytes) & (UART_XMIT_SIZE - 1);
> +               uport->icount.tx += tx_bytes;
> +               remaining -= tx_bytes;
> +       }
> +       qcom_geni_serial_poll_tx_done(uport);
> +       port->xmit_size += chunk;
> +out_write_wakeup:
> +       uart_write_wakeup(uport);
> +       return ret;
> +}
> +
> +static irqreturn_t qcom_geni_serial_isr(int isr, void *dev)
> +{
> +       unsigned int m_irq_status;
> +       unsigned int s_irq_status;
> +       struct uart_port *uport = dev;
> +       unsigned long flags;
> +       unsigned int m_irq_en;
> +       bool drop_rx = false;
> +       struct tty_port *tport = &uport->state->port;
> +
> +       if (uport->suspended)
> +               return IRQ_HANDLED;
> +
> +       spin_lock_irqsave(&uport->lock, flags);
> +       m_irq_status = readl_relaxed(uport->membase + SE_GENI_M_IRQ_STATUS);
> +       s_irq_status = readl_relaxed(uport->membase + SE_GENI_S_IRQ_STATUS);
> +       m_irq_en = readl_relaxed(uport->membase + SE_GENI_M_IRQ_EN);
> +       writel_relaxed(m_irq_status, uport->membase + SE_GENI_M_IRQ_CLEAR);
> +       writel_relaxed(s_irq_status, uport->membase + SE_GENI_S_IRQ_CLEAR);
> +
> +       if (WARN_ON(m_irq_status & M_ILLEGAL_CMD_EN))
> +               goto out_unlock;
> +
> +       if (s_irq_status & S_RX_FIFO_WR_ERR_EN) {
> +               uport->icount.overrun++;
> +               tty_insert_flip_char(tport, 0, TTY_OVERRUN);
> +       }
> +
> +       if (m_irq_status & (M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN) &&
> +           m_irq_en & (M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN))
> +               qcom_geni_serial_handle_tx(uport);
> +
> +       if (s_irq_status & S_GP_IRQ_0_EN || s_irq_status & S_GP_IRQ_1_EN) {
> +               if (s_irq_status & S_GP_IRQ_0_EN)
> +                       uport->icount.parity++;
> +               drop_rx = true;
> +       } else if (s_irq_status & S_GP_IRQ_2_EN ||
> +                                       s_irq_status & S_GP_IRQ_3_EN) {
> +               uport->icount.brk++;

How does break character handling work? I see the accounting here, but
don't see any uart_handle_break() call anywhere.

> +       }
> +
> +       if (s_irq_status & S_RX_FIFO_WATERMARK_EN ||
> +                                       s_irq_status & S_RX_FIFO_LAST_EN)
> +               qcom_geni_serial_handle_rx(uport, drop_rx);
> +
> +out_unlock:
> +       spin_unlock_irqrestore(&uport->lock, flags);
> +       return IRQ_HANDLED;
> +}
> +
> +static int get_tx_fifo_size(struct qcom_geni_serial_port *port)
> +{
> +       struct uart_port *uport;
> +
> +       if (!port)
> +               return -ENODEV;
> +
> +       uport = &port->uport;
> +       port->tx_fifo_depth = geni_se_get_tx_fifo_depth(&port->se);
> +       if (!port->tx_fifo_depth) {
> +               dev_err(uport->dev, "%s:Invalid TX FIFO depth read\n",
> +                                                               __func__);
> +               return -ENXIO;
> +       }
> +
> +       port->tx_fifo_width = geni_se_get_tx_fifo_width(&port->se);
> +       if (!port->tx_fifo_width) {
> +               dev_err(uport->dev, "%s:Invalid TX FIFO width read\n",
> +                                                               __func__);
> +               return -ENXIO;
> +       }
> +
> +       port->rx_fifo_depth = geni_se_get_rx_fifo_depth(&port->se);
> +       if (!port->rx_fifo_depth) {
> +               dev_err(uport->dev, "%s:Invalid RX FIFO depth read\n",
> +                                                               __func__);
> +               return -ENXIO;
> +       }

Are these checks verifying the hardware has a proper setting for fifo
depth and width? How is that possible to mess up? Do these ever fail?

> +
> +       uport->fifosize =
> +               (port->tx_fifo_depth * port->tx_fifo_width) / BITS_PER_BYTE;
> +       return 0;
> +}
> +
> +static void set_rfr_wm(struct qcom_geni_serial_port *port)
> +{
> +       /*
> +        * Set RFR (Flow off) to FIFO_DEPTH - 2.
> +        * RX WM level at 10% RX_FIFO_DEPTH.
> +        * TX WM level at 10% TX_FIFO_DEPTH.
> +        */
> +       port->rx_rfr = port->rx_fifo_depth - 2;
> +       port->rx_wm = UART_CONSOLE_RX_WM;
> +       port->tx_wm = 2;

port->tx_wm = DEF_TX_WM?

> +}
> +
> +static void qcom_geni_serial_shutdown(struct uart_port *uport)
> +{
> +       unsigned long flags;
> +
> +       /* Stop the console before stopping the current tx */
> +       console_stop(uport->cons);
> +
> +       disable_irq(uport->irq);
> +       free_irq(uport->irq, uport);
> +       spin_lock_irqsave(&uport->lock, flags);
> +       qcom_geni_serial_stop_tx(uport);
> +       qcom_geni_serial_stop_rx(uport);
> +       spin_unlock_irqrestore(&uport->lock, flags);
> +}
> +
> +static int qcom_geni_serial_port_setup(struct uart_port *uport)
> +{
> +       int ret;
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +       unsigned int rxstale = DEFAULT_BITS_PER_CHAR * STALE_TIMEOUT;
> +
> +       set_rfr_wm(port);
> +       writel_relaxed(rxstale, uport->membase + SE_UART_RX_STALE_CNT);
> +       /*
> +        * Make an unconditional cancel on the main sequencer to reset
> +        * it else we could end up in data loss scenarios.
> +        */
> +       port->xfer_mode = GENI_SE_FIFO;
> +       qcom_geni_serial_poll_tx_done(uport);
> +       geni_se_config_packing(&port->se, BITS_PER_BYTE, port->tx_bytes_pw,
> +                                               false, true, false);
> +       geni_se_config_packing(&port->se, BITS_PER_BYTE, port->rx_bytes_pw,
> +                                               false, false, true);
> +       ret = geni_se_init(&port->se, port->rx_wm, port->rx_rfr);
> +       if (ret) {
> +               dev_err(uport->dev, "%s: Fail\n", __func__);
> +               return ret;
> +       }
> +
> +       geni_se_select_mode(&port->se, port->xfer_mode);
> +       port->port_setup = true;
> +       return ret;
> +}
> +
> +static int qcom_geni_serial_startup(struct uart_port *uport)
> +{
> +       int ret;
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +
> +       scnprintf(port->name, sizeof(port->name),
> +                 "qcom_serial_geni%d", uport->line);
> +
> +       if (geni_se_read_proto(&port->se) != GENI_SE_UART) {
> +               dev_err(uport->dev, "Invalid FW %d loaded.\n",
> +                                geni_se_read_proto(&port->se));

Please don't read proto twice.

> +               return -ENXIO;
> +       }
> +
> +       get_tx_fifo_size(port);
> +       if (!port->port_setup) {
> +               ret = qcom_geni_serial_port_setup(uport);
> +               if (ret)
> +                       return ret;
> +       }
> +
> +       ret = request_irq(uport->irq, qcom_geni_serial_isr, IRQF_TRIGGER_HIGH,
> +                                                       port->name, uport);
> +       if (ret)
> +               dev_err(uport->dev, "Failed to get IRQ ret %d\n", ret);
> +       return ret;
> +}
> +
> +static unsigned long get_clk_cfg(unsigned long clk_freq)
> +{
> +       int i;
> +
> +       for (i = 0; i < ARRAY_SIZE(root_freq); i++) {
> +               if (!(root_freq[i] % clk_freq))
> +                       return root_freq[i];
> +       }
> +       return 0;
> +}
> +
> +static void geni_serial_write_term_regs(struct uart_port *uport,
> +               u32 tx_trans_cfg, u32 tx_parity_cfg, u32 rx_trans_cfg,
> +               u32 rx_parity_cfg, u32 bits_per_char, u32 stop_bit_len,
> +               u32 s_clk_cfg)
> +{
> +       writel_relaxed(tx_trans_cfg, uport->membase + SE_UART_TX_TRANS_CFG);
> +       writel_relaxed(tx_parity_cfg, uport->membase + SE_UART_TX_PARITY_CFG);
> +       writel_relaxed(rx_trans_cfg, uport->membase + SE_UART_RX_TRANS_CFG);
> +       writel_relaxed(rx_parity_cfg, uport->membase + SE_UART_RX_PARITY_CFG);
> +       writel_relaxed(bits_per_char, uport->membase + SE_UART_TX_WORD_LEN);
> +       writel_relaxed(bits_per_char, uport->membase + SE_UART_RX_WORD_LEN);
> +       writel_relaxed(stop_bit_len, uport->membase + SE_UART_TX_STOP_BIT_LEN);
> +       writel_relaxed(s_clk_cfg, uport->membase + GENI_SER_M_CLK_CFG);
> +       writel_relaxed(s_clk_cfg, uport->membase + GENI_SER_S_CLK_CFG);

Can you please inline this function into the caller and put the writels
where the values are calculated? It would reduce the mental work to keep
track of all the variables to find out that they just get written in the
end. Also, this is weirdly placed in the file when get_clk_div_rate()
calls get_clk_cfg() but this function is between them.

> +}
> +
> +static unsigned long get_clk_div_rate(unsigned int baud, unsigned int *clk_div)
> +{
> +       unsigned long ser_clk;
> +       unsigned long desired_clk;
> +
> +       desired_clk = baud * UART_OVERSAMPLING;
> +       ser_clk = get_clk_cfg(desired_clk);
> +       if (!ser_clk) {
> +               pr_err("%s: Can't find matching DFS entry for baud %d\n",
> +                                                               __func__, baud);
> +               return ser_clk;
> +       }
> +
> +       *clk_div = ser_clk / desired_clk;

How wide can clk_div be? It may be better to implement the ser_clk as an
actual clk in the common clk framework, and then have the serial driver
or the i2c driver call clk_set_rate() on that clk and have the CCF
implementation take care of determining the rate that the parent clk can
supply and how it can fit it into the frequency that the divider can
support.

> +       return ser_clk;
> +}
> +
> +static void qcom_geni_serial_set_termios(struct uart_port *uport,
> +                               struct ktermios *termios, struct ktermios *old)
> +{
> +       unsigned int baud;
> +       unsigned int bits_per_char;
> +       unsigned int tx_trans_cfg;
> +       unsigned int tx_parity_cfg;
> +       unsigned int rx_trans_cfg;
> +       unsigned int rx_parity_cfg;
> +       unsigned int stop_bit_len;
> +       unsigned int clk_div;
> +       unsigned long ser_clk_cfg;
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +       unsigned long clk_rate;
> +
> +       qcom_geni_serial_stop_rx(uport);
> +       /* baud rate */
> +       baud = uart_get_baud_rate(uport, termios, old, 300, 4000000);
> +       port->cur_baud = baud;
> +       clk_rate = get_clk_div_rate(baud, &clk_div);
> +       if (!clk_rate)
> +               goto out_restart_rx;
> +
> +       uport->uartclk = clk_rate;
> +       clk_set_rate(port->se.clk, clk_rate);
> +       ser_clk_cfg = SER_CLK_EN;
> +       ser_clk_cfg |= (clk_div << CLK_DIV_SHFT);

Drop useless cast.

> +
> +       /* parity */
> +       tx_trans_cfg = readl_relaxed(uport->membase + SE_UART_TX_TRANS_CFG);
> +       tx_parity_cfg = readl_relaxed(uport->membase + SE_UART_TX_PARITY_CFG);
> +       rx_trans_cfg = readl_relaxed(uport->membase + SE_UART_RX_TRANS_CFG);
> +       rx_parity_cfg = readl_relaxed(uport->membase + SE_UART_RX_PARITY_CFG);
> +       if (termios->c_cflag & PARENB) {
> +               tx_trans_cfg |= UART_TX_PAR_EN;
> +               rx_trans_cfg |= UART_RX_PAR_EN;
> +               tx_parity_cfg |= PAR_CALC_EN;
> +               rx_parity_cfg |= PAR_CALC_EN;
> +               if (termios->c_cflag & PARODD) {
> +                       tx_parity_cfg |= PAR_ODD;
> +                       rx_parity_cfg |= PAR_ODD;
> +               } else if (termios->c_cflag & CMSPAR) {
> +                       tx_parity_cfg |= PAR_SPACE;
> +                       rx_parity_cfg |= PAR_SPACE;
> +               } else {
> +                       tx_parity_cfg |= PAR_EVEN;
> +                       rx_parity_cfg |= PAR_EVEN;
> +               }
> +       } else {
> +               tx_trans_cfg &= ~UART_TX_PAR_EN;
> +               rx_trans_cfg &= ~UART_RX_PAR_EN;
> +               tx_parity_cfg &= ~PAR_CALC_EN;
> +               rx_parity_cfg &= ~PAR_CALC_EN;
> +       }
> +
> +       /* bits per char */
> +       switch (termios->c_cflag & CSIZE) {
> +       case CS5:
> +               bits_per_char = 5;
> +               break;
> +       case CS6:
> +               bits_per_char = 6;
> +               break;
> +       case CS7:
> +               bits_per_char = 7;
> +               break;
> +       case CS8:
> +       default:
> +               bits_per_char = 8;
> +               break;
> +       }
> +
> +       /* stop bits */
> +       if (termios->c_cflag & CSTOPB)
> +               stop_bit_len = TX_STOP_BIT_LEN_2;
> +       else
> +               stop_bit_len = TX_STOP_BIT_LEN_1;
> +
> +       /* flow control, clear the CTS_MASK bit if using flow control. */
> +       if (termios->c_cflag & CRTSCTS)
> +               tx_trans_cfg &= ~UART_CTS_MASK;
> +       else
> +               tx_trans_cfg |= UART_CTS_MASK;
> +
> +       if (baud)
> +               uart_update_timeout(uport, termios->c_cflag, baud);
> +
> +       geni_serial_write_term_regs(uport, tx_trans_cfg, tx_parity_cfg,
> +               rx_trans_cfg, rx_parity_cfg, bits_per_char, stop_bit_len,
> +                                                               ser_clk_cfg);
> +out_restart_rx:
> +       qcom_geni_serial_start_rx(uport);
> +}
> +
> +static unsigned int qcom_geni_serial_tx_empty(struct uart_port *uport)
> +{
> +       return !readl_relaxed(uport->membase + SE_GENI_TX_FIFO_STATUS);
> +}
> +
> +#if defined(CONFIG_SERIAL_CORE_CONSOLE) || defined(CONFIG_CONSOLE_POLL)
> +static int __init qcom_geni_console_setup(struct console *co, char *options)
> +{
> +       struct uart_port *uport;
> +       struct qcom_geni_serial_port *port;
> +       int baud;
> +       int bits = 8;
> +       int parity = 'n';
> +       int flow = 'n';
> +
> +       if (co->index >= GENI_UART_CONS_PORTS  || co->index < 0)
> +               return -ENXIO;
> +
> +       port = get_port_from_line(co->index);
> +       if (IS_ERR(port)) {
> +               pr_err("Invalid line %d(%d)\n", co->index, (int)PTR_ERR(port));
> +               return PTR_ERR(port);
> +       }
> +
> +       uport = &port->uport;
> +
> +       if (unlikely(!uport->membase))
> +               return -ENXIO;
> +
> +       if (geni_se_resources_on(&port->se)) {
> +               dev_err(port->se.dev, "Error turning on resources\n");
> +               return -ENXIO;
> +       }
> +
> +       if (unlikely(geni_se_read_proto(&port->se) != GENI_SE_UART)) {

Looks like we're validating the configuration of the DT here. Maybe this
can go into the wrapper code and be put behind some DEBUG_KERNEL check
so we can debug bad bootloader configurations if needed? Especially if
this is the only API that's left exposed from the wrapper to the serial
engine/protocol driver.

> +               geni_se_resources_off(&port->se);
> +               return -ENXIO;
> +       }
> +
> +       if (!port->port_setup) {
> +               port->tx_bytes_pw = 1;
> +               port->rx_bytes_pw = RX_BYTES_PW;
> +               qcom_geni_serial_stop_rx(uport);
> +               qcom_geni_serial_port_setup(uport);
> +       }
> +
> +       if (options)
> +               uart_parse_options(options, &baud, &parity, &bits, &flow);
> +
> +       return uart_set_options(uport, co, baud, parity, bits, flow);
> +}
> +
> +static int console_register(struct uart_driver *drv)

__init

> +{
> +       return uart_register_driver(drv);
> +}
> +
> +static void console_unregister(struct uart_driver *drv)
> +{
> +       uart_unregister_driver(drv);
> +}
> +
> +static struct console cons_ops = {
> +       .name = "ttyMSM",
> +       .write = qcom_geni_serial_console_write,
> +       .device = uart_console_device,
> +       .setup = qcom_geni_console_setup,
> +       .flags = CON_PRINTBUFFER,
> +       .index = -1,
> +       .data = &qcom_geni_console_driver,
> +};
> +
> +static struct uart_driver qcom_geni_console_driver = {
> +       .owner = THIS_MODULE,
> +       .driver_name = "qcom_geni_console",
> +       .dev_name = "ttyMSM",
> +       .nr =  GENI_UART_CONS_PORTS,
> +       .cons = &cons_ops,
> +};
> +#else
> +static int console_register(struct uart_driver *drv)
> +{
> +       return 0;
> +}
> +
> +static void console_unregister(struct uart_driver *drv)
> +{
> +}
> +#endif /* defined(CONFIG_SERIAL_CORE_CONSOLE) || defined(CONFIG_CONSOLE_POLL) */
> +
> +static void qcom_geni_serial_cons_pm(struct uart_port *uport,
> +               unsigned int new_state, unsigned int old_state)
> +{
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +
> +       if (unlikely(!uart_console(uport)))
> +               return;
> +
> +       if (new_state == UART_PM_STATE_ON && old_state == UART_PM_STATE_OFF)
> +               geni_se_resources_on(&port->se);
> +       else if (new_state == UART_PM_STATE_OFF &&
> +                       old_state == UART_PM_STATE_ON)
> +               geni_se_resources_off(&port->se);
> +}
> +
> +static const struct uart_ops qcom_geni_console_pops = {
> +       .tx_empty = qcom_geni_serial_tx_empty,
> +       .stop_tx = qcom_geni_serial_stop_tx,
> +       .start_tx = qcom_geni_serial_start_tx,
> +       .stop_rx = qcom_geni_serial_stop_rx,
> +       .set_termios = qcom_geni_serial_set_termios,
> +       .startup = qcom_geni_serial_startup,
> +       .config_port = qcom_geni_serial_config_port,
> +       .shutdown = qcom_geni_serial_shutdown,
> +       .type = qcom_geni_serial_get_type,
> +       .set_mctrl = qcom_geni_cons_set_mctrl,
> +       .get_mctrl = qcom_geni_cons_get_mctrl,
> +#ifdef CONFIG_CONSOLE_POLL
> +       .poll_get_char  = qcom_geni_serial_get_char,
> +       .poll_put_char  = qcom_geni_serial_poll_put_char,
> +#endif
> +       .pm = qcom_geni_serial_cons_pm,
> +};
> +
> +static int qcom_geni_serial_probe(struct platform_device *pdev)
> +{
> +       int ret = 0;
> +       int line = -1;
> +       struct qcom_geni_serial_port *port;
> +       struct uart_port *uport;
> +       struct resource *res;
> +       struct uart_driver *drv;
> +
> +       drv = (void *)of_device_get_match_data(&pdev->dev);

Useless cast.

> +       if (!drv) {
> +               dev_err(&pdev->dev, "%s: No matching device found", __func__);
> +               return -ENODEV;
> +       }
> +
> +       if (pdev->dev.of_node)
> +               line = of_alias_get_id(pdev->dev.of_node, "serial");
> +       else
> +               line = pdev->id;
> +
> +       if (line < 0)
> +               line = atomic_inc_return(&uart_line_id) - 1;
> +
> +       if ((line < 0) || (line >= GENI_UART_CONS_PORTS))

Useless parenthesis.

> +               return -ENXIO;
> +       port = get_port_from_line(line);
> +       if (IS_ERR(port)) {
> +               ret = PTR_ERR(port);
> +               dev_err(&pdev->dev, "Invalid line %d(%d)\n", line, ret);
> +               return ret;
> +       }
> +
> +       uport = &port->uport;
> +       /* Don't allow 2 drivers to access the same port */
> +       if (uport->private_data)
> +               return -ENODEV;
> +
> +       uport->dev = &pdev->dev;
> +       port->se.dev = &pdev->dev;
> +       port->se.wrapper = dev_get_drvdata(pdev->dev.parent);
> +       port->se.clk = devm_clk_get(&pdev->dev, "se");
> +       if (IS_ERR(port->se.clk)) {
> +               ret = PTR_ERR(port->se.clk);
> +               dev_err(&pdev->dev, "Err getting SE Core clk %d\n", ret);
> +               return ret;
> +       }
> +
> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> +       uport->mapbase = res->start;
> +       uport->membase = devm_ioremap_resource(&pdev->dev, res);
> +       if (!uport->membase) {

Check for IS_ERR()

> +               dev_err(&pdev->dev, "Err IO mapping serial iomem");

No need for error message with devm_ioremap_resource()

> +               return -ENOMEM;

return PTR_ERR(..)

Also, I see some serial drivers do the mapping when the port is
requested. That can't be done here?

> +       }
> +       port->se.base = uport->membase;
> +
> +       port->tx_fifo_depth = DEF_FIFO_DEPTH_WORDS;
> +       port->rx_fifo_depth = DEF_FIFO_DEPTH_WORDS;
> +       port->tx_fifo_width = DEF_FIFO_WIDTH_BITS;
> +
> +       uport->irq = platform_get_irq(pdev, 0);
> +       if (uport->irq < 0) {
> +               dev_err(&pdev->dev, "Failed to get IRQ %d\n", uport->irq);
> +               return uport->irq;
> +       }
> +
> +       uport->private_data = drv;
> +       platform_set_drvdata(pdev, port);
> +       port->handle_rx = handle_rx_console;
> +       port->port_setup = false;
> +       return uart_add_one_port(drv, uport);
> +}
> +
> +static int qcom_geni_serial_remove(struct platform_device *pdev)
> +{
> +       struct qcom_geni_serial_port *port = platform_get_drvdata(pdev);
> +       struct uart_driver *drv = port->uport.private_data;
> +
> +       uart_remove_one_port(drv, &port->uport);
> +       return 0;
> +}
> +
> +static int __maybe_unused qcom_geni_serial_sys_suspend_noirq(struct device *dev)
> +{
> +       struct platform_device *pdev = to_platform_device(dev);
> +       struct qcom_geni_serial_port *port = platform_get_drvdata(pdev);
> +       struct uart_port *uport = &port->uport;
> +
> +       uart_suspend_port(uport->private_data, uport);
> +       return 0;
> +}
> +
> +static int __maybe_unused qcom_geni_serial_sys_resume_noirq(struct device *dev)
> +{
> +       struct platform_device *pdev = to_platform_device(dev);
> +       struct qcom_geni_serial_port *port = platform_get_drvdata(pdev);
> +       struct uart_port *uport = &port->uport;
> +
> +       if (console_suspend_enabled && uport->suspended) {
> +               uart_resume_port(uport->private_data, uport);
> +               disable_irq(uport->irq);
> +       }
> +       return 0;
> +}
> +
> +static const struct dev_pm_ops qcom_geni_serial_pm_ops = {
> +       .suspend_noirq = qcom_geni_serial_sys_suspend_noirq,
> +       .resume_noirq = qcom_geni_serial_sys_resume_noirq,

Why are these noirq variants? Please add a comment.

> +};
> +
> +static const struct of_device_id qcom_geni_serial_match_table[] = {
> +       { .compatible = "qcom,geni-debug-uart",
> +                               .data = &qcom_geni_console_driver, },
> +       {}
> +};
> +MODULE_DEVICE_TABLE(of, qcom_geni_serial_match_table);
> +
> +static struct platform_driver qcom_geni_serial_platform_driver = {
> +       .remove = qcom_geni_serial_remove,
> +       .probe = qcom_geni_serial_probe,
> +       .driver = {
> +               .name = "qcom_geni_serial",
> +               .of_match_table = qcom_geni_serial_match_table,
> +               .pm = &qcom_geni_serial_pm_ops,
> +       },
> +};
> +
> +static int __init qcom_geni_serial_init(void)
> +{
> +       int ret = 0;

Drop assignment please.

> +
> +       qcom_geni_console_port.uport.iotype = UPIO_MEM;
> +       qcom_geni_console_port.uport.ops = &qcom_geni_console_pops;
> +       qcom_geni_console_port.uport.flags = UPF_BOOT_AUTOCONF;
> +       qcom_geni_console_port.uport.line = 0;
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Evan Green March 3, 2018, 12:11 a.m. UTC | #4
Hello Karthik,

On Tue, Feb 27, 2018 at 5:38 PM, Karthikeyan Ramasubramanian
<kramasub@codeaurora.org> wrote:
> This driver supports GENI based UART Controller in the Qualcomm SOCs. The
> Qualcomm Generic Interface (GENI) is a programmable module supporting a
> wide range of serial interfaces including UART. This driver support console
> operations using FIFO mode of transfer.
>
> Signed-off-by: Girish Mahadevan <girishm@codeaurora.org>
> Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
> Signed-off-by: Sagar Dharia <sdharia@codeaurora.org>
> Signed-off-by: Doug Anderson <dianders@google.com>
> ---
>  drivers/tty/serial/Kconfig            |   11 +
>  drivers/tty/serial/Makefile           |    1 +
>  drivers/tty/serial/qcom_geni_serial.c | 1181 +++++++++++++++++++++++++++++++++
>  3 files changed, 1193 insertions(+)
>  create mode 100644 drivers/tty/serial/qcom_geni_serial.c
>
> diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
> index 3682fd3..c6b1500 100644
> --- a/drivers/tty/serial/Kconfig
> +++ b/drivers/tty/serial/Kconfig
> @@ -1104,6 +1104,17 @@ config SERIAL_MSM_CONSOLE
>         select SERIAL_CORE_CONSOLE
>         select SERIAL_EARLYCON
>
> +config SERIAL_QCOM_GENI
> +       bool "QCOM on-chip GENI based serial port support"
> +       depends on ARCH_QCOM
> +       depends on QCOM_GENI_SE

My understanding is that this has to be "bool" because there's a
console in there, and consoles cannot be built as modules. Stephen is
suggesting splitting this option up into two, so you could have serial
with or without the console. That's fine, and probably the preferred
way. However, you do want to make sure that if serial (or what's soon
to be serial+console) is enabled, that QCOM_GENI_SE has to be built
=y, and not =m. I'd suggest "select QCOM_GENI_SE" in the new
SERIAL_QCOM_GENI_CONSOLE (or whatever it's called). As it is now, if
SERIAL_QCOM_GENI=y and QCOM_GENI_SE=m, there's a build failure.

GENI_SE is allowed to be built as a module if serial is not enabled
and I2C is built as a module. In order to keep the dependency arrows
going in the same direction, you might want the I2C driver to "select
QCOM_GENI_SE" as well, in order to upgrade GENI_SE to y if I2C is y.

> diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
> new file mode 100644
> index 0000000..8536b7d
> --- /dev/null
> +++ b/drivers/tty/serial/qcom_geni_serial.c
> @@ -0,0 +1,1181 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2017-2018, The Linux foundation. All rights reserved.
> +
> +#include <linux/console.h>
> +#include <linux/io.h>
> +#include <linux/iopoll.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/of_device.h>
> +#include <linux/platform_device.h>
> +#include <linux/qcom-geni-se.h>
> +#include <linux/serial.h>
> +#include <linux/serial_core.h>
> +#include <linux/slab.h>
> +#include <linux/tty.h>
> +#include <linux/tty_flip.h>
> +
> +/* UART specific GENI registers */
> +#define SE_UART_TX_TRANS_CFG           0x25c
> +#define SE_UART_TX_WORD_LEN            0x268
> +#define SE_UART_TX_STOP_BIT_LEN                0x26c
> +#define SE_UART_TX_TRANS_LEN           0x270
> +#define SE_UART_RX_TRANS_CFG           0x280
> +#define SE_UART_RX_WORD_LEN            0x28c
> +#define SE_UART_RX_STALE_CNT           0x294
> +#define SE_UART_TX_PARITY_CFG          0x2a4
> +#define SE_UART_RX_PARITY_CFG          0x2a8
> +
> +/* SE_UART_TRANS_CFG */
> +#define UART_TX_PAR_EN         BIT(0)
> +#define UART_CTS_MASK          BIT(1)
> +
> +/* SE_UART_TX_WORD_LEN */
> +#define TX_WORD_LEN_MSK                GENMASK(9, 0)
> +
> +/* SE_UART_TX_STOP_BIT_LEN */
> +#define TX_STOP_BIT_LEN_MSK    GENMASK(23, 0)
> +#define TX_STOP_BIT_LEN_1      0
> +#define TX_STOP_BIT_LEN_1_5    1
> +#define TX_STOP_BIT_LEN_2      2
> +
> +/* SE_UART_TX_TRANS_LEN */
> +#define TX_TRANS_LEN_MSK       GENMASK(23, 0)
> +
> +/* SE_UART_RX_TRANS_CFG */
> +#define UART_RX_INS_STATUS_BIT BIT(2)
> +#define UART_RX_PAR_EN         BIT(3)
> +
> +/* SE_UART_RX_WORD_LEN */
> +#define RX_WORD_LEN_MASK       GENMASK(9, 0)
> +
> +/* SE_UART_RX_STALE_CNT */
> +#define RX_STALE_CNT           GENMASK(23, 0)
> +
> +/* SE_UART_TX_PARITY_CFG/RX_PARITY_CFG */
> +#define PAR_CALC_EN            BIT(0)
> +#define PAR_MODE_MSK           GENMASK(2, 1)
> +#define PAR_MODE_SHFT          1
> +#define PAR_EVEN               0x00
> +#define PAR_ODD                        0x01
> +#define PAR_SPACE              0x10
> +#define PAR_MARK               0x11
> +
> +/* UART M_CMD OP codes */
> +#define UART_START_TX          0x1
> +#define UART_START_BREAK       0x4
> +#define UART_STOP_BREAK                0x5
> +/* UART S_CMD OP codes */
> +#define UART_START_READ                0x1
> +#define UART_PARAM             0x1
> +
> +#define UART_OVERSAMPLING      32
> +#define STALE_TIMEOUT          16
> +#define DEFAULT_BITS_PER_CHAR  10
> +#define GENI_UART_CONS_PORTS   1
> +#define DEF_FIFO_DEPTH_WORDS   16
> +#define DEF_TX_WM              2
> +#define DEF_FIFO_WIDTH_BITS    32
> +#define UART_CONSOLE_RX_WM     2
> +
> +#ifdef CONFIG_CONSOLE_POLL
> +#define RX_BYTES_PW 1
> +#else
> +#define RX_BYTES_PW 4
> +#endif

This seems fishy to me. Does either setting work? If so, why not just
have one value?

> +static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
> +                               int offset, int bit_field, bool set)
> +{
> +       u32 reg;
> +       struct qcom_geni_serial_port *port;
> +       unsigned int baud;
> +       unsigned int fifo_bits;
> +       unsigned long timeout_us = 20000;
> +
> +       /* Ensure polling is not re-ordered before the prior writes/reads */
> +       mb();
> +
> +       if (uport->private_data) {
> +               port = to_dev_port(uport, uport);
> +               baud = port->cur_baud;
> +               if (!baud)
> +                       baud = 115200;
> +               fifo_bits = port->tx_fifo_depth * port->tx_fifo_width;
> +               /*
> +                * Total polling iterations based on FIFO worth of bytes to be
> +                * sent at current baud .Add a little fluff to the wait.
> +                */
> +               timeout_us = ((fifo_bits * USEC_PER_SEC) / baud) + 500;

This fluff is a little mysterious, can it be explained at all? Do you
think the fluff factor is in units of time (as you have it) or bits?
Time makes sense I guess if we're worried about clock source
differences.

> +
> +static void qcom_geni_serial_console_write(struct console *co, const char *s,
> +                             unsigned int count)
> +{
> +       struct uart_port *uport;
> +       struct qcom_geni_serial_port *port;
> +       bool locked = true;
> +       unsigned long flags;
> +
> +       WARN_ON(co->index < 0 || co->index >= GENI_UART_CONS_PORTS);
> +
> +       port = get_port_from_line(co->index);
> +       if (IS_ERR(port))
> +               return;
> +
> +       uport = &port->uport;
> +       if (oops_in_progress)
> +               locked = spin_trylock_irqsave(&uport->lock, flags);
> +       else
> +               spin_lock_irqsave(&uport->lock, flags);
> +
> +       if (locked) {
> +               __qcom_geni_serial_console_write(uport, s, count);
> +               spin_unlock_irqrestore(&uport->lock, flags);

I too am a little lost on the locking here. What exactly is the lock
protecting? Looks like for the most part it's trying to synchronize
with the ISR? What specifically in the ISR? I just wanted to go
through and check to make sure whatever the shared resource is is
appropriately protected.

> +       }
> +}
> +
> +static int handle_rx_console(struct uart_port *uport, u32 rx_bytes, bool drop)
> +{
> +       u32 i = rx_bytes;
> +       u32 rx_fifo;
> +       unsigned char *buf;
> +       struct tty_port *tport;
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +
> +       tport = &uport->state->port;
> +       while (i > 0) {
> +               int c;
> +               int bytes = i > port->rx_bytes_pw ? port->rx_bytes_pw : i;

Please replace this with a min macro.

> +static int qcom_geni_serial_handle_tx(struct uart_port *uport)
> +{
> +       int ret = 0;
> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> +       struct circ_buf *xmit = &uport->state->xmit;
> +       size_t avail;
> +       size_t remaining;
> +       int i = 0;
> +       u32 status;
> +       unsigned int chunk;
> +       int tail;
> +
> +       chunk = uart_circ_chars_pending(xmit);
> +       status = readl_relaxed(uport->membase + SE_GENI_TX_FIFO_STATUS);
> +       /* Both FIFO and framework buffer are drained */
> +       if ((chunk == port->xmit_size) && !status) {
> +               port->xmit_size = 0;
> +               uart_circ_clear(xmit);
> +               qcom_geni_serial_stop_tx(uport);
> +               goto out_write_wakeup;
> +       }
> +       chunk -= port->xmit_size;
> +
> +       avail = (port->tx_fifo_depth - port->tx_wm) * port->tx_bytes_pw;
> +       tail = (xmit->tail + port->xmit_size) & (UART_XMIT_SIZE - 1);
> +       if (chunk > (UART_XMIT_SIZE - tail))
> +               chunk = UART_XMIT_SIZE - tail;
> +       if (chunk > avail)
> +               chunk = avail;
> +
> +       if (!chunk)
> +               goto out_write_wakeup;
> +
> +       qcom_geni_serial_setup_tx(uport, chunk);
> +
> +       remaining = chunk;
> +       while (i < chunk) {
> +               unsigned int tx_bytes;
> +               unsigned int buf = 0;
> +               int c;
> +
> +               tx_bytes = min_t(size_t, remaining, (size_t)port->tx_bytes_pw);
> +               for (c = 0; c < tx_bytes ; c++)
> +                       buf |= (xmit->buf[tail + c] << (c * BITS_PER_BYTE));
> +
> +               writel_relaxed(buf, uport->membase + SE_GENI_TX_FIFOn);
> +
> +               i += tx_bytes;
> +               tail = (tail + tx_bytes) & (UART_XMIT_SIZE - 1);
> +               uport->icount.tx += tx_bytes;
> +               remaining -= tx_bytes;
> +       }
> +       qcom_geni_serial_poll_tx_done(uport);
> +       port->xmit_size += chunk;
> +out_write_wakeup:
> +       uart_write_wakeup(uport);
> +       return ret;
> +}

This function can't fail, please change the return type to void.

> +
> +static void qcom_geni_serial_shutdown(struct uart_port *uport)
> +{
> +       unsigned long flags;
> +
> +       /* Stop the console before stopping the current tx */
> +       console_stop(uport->cons);
> +
> +       disable_irq(uport->irq);
> +       free_irq(uport->irq, uport);
> +       spin_lock_irqsave(&uport->lock, flags);
> +       qcom_geni_serial_stop_tx(uport);
> +       qcom_geni_serial_stop_rx(uport);
> +       spin_unlock_irqrestore(&uport->lock, flags);

This is one part of where I'm confused. What are we protecting here
with the lock? disable_irq waits for any pending ISRs to finish
according to its comment, so you know you're not racing with the ISR.

> +static void geni_serial_write_term_regs(struct uart_port *uport,
> +               u32 tx_trans_cfg, u32 tx_parity_cfg, u32 rx_trans_cfg,
> +               u32 rx_parity_cfg, u32 bits_per_char, u32 stop_bit_len,
> +               u32 s_clk_cfg)
> +{
> +       writel_relaxed(tx_trans_cfg, uport->membase + SE_UART_TX_TRANS_CFG);
> +       writel_relaxed(tx_parity_cfg, uport->membase + SE_UART_TX_PARITY_CFG);
> +       writel_relaxed(rx_trans_cfg, uport->membase + SE_UART_RX_TRANS_CFG);
> +       writel_relaxed(rx_parity_cfg, uport->membase + SE_UART_RX_PARITY_CFG);
> +       writel_relaxed(bits_per_char, uport->membase + SE_UART_TX_WORD_LEN);
> +       writel_relaxed(bits_per_char, uport->membase + SE_UART_RX_WORD_LEN);
> +       writel_relaxed(stop_bit_len, uport->membase + SE_UART_TX_STOP_BIT_LEN);
> +       writel_relaxed(s_clk_cfg, uport->membase + GENI_SER_M_CLK_CFG);
> +       writel_relaxed(s_clk_cfg, uport->membase + GENI_SER_S_CLK_CFG);
> +}
> +

I agree with Stephen's comment, this should be inlined into the single
place it's called from.

Thanks Karthik!
-Evan
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Karthikeyan Ramasubramanian March 3, 2018, 12:58 a.m. UTC | #5
On 3/2/2018 1:41 PM, Stephen Boyd wrote:
> Quoting Karthikeyan Ramasubramanian (2018-02-27 17:38:07)
>> This driver manages the Generic Interface (GENI) firmware based Qualcomm
>> Universal Peripheral (QUP) Wrapper. GENI based QUP is the next generation
>> programmable module composed of multiple Serial Engines (SE) and supports
>> a wide range of serial interfaces like UART, SPI, I2C, I3C, etc. This
>> driver also enables managing the serial interface independent aspects of
>> Serial Engines.
>>
>> Signed-off-by: Karthikeyan Ramasubramanian <kramasub@codeaurora.org>
>> Signed-off-by: Sagar Dharia <sdharia@codeaurora.org>
>> Signed-off-by: Girish Mahadevan <girishm@codeaurora.org>
>> ---
>>   drivers/soc/qcom/Kconfig        |   9 +
>>   drivers/soc/qcom/Makefile       |   1 +
>>   drivers/soc/qcom/qcom-geni-se.c | 971 ++++++++++++++++++++++++++++++++++++++++
>>   include/linux/qcom-geni-se.h    | 247 ++++++++++
>>   4 files changed, 1228 insertions(+)
>>   create mode 100644 drivers/soc/qcom/qcom-geni-se.c
>>   create mode 100644 include/linux/qcom-geni-se.h
>>
>> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
>> index e050eb8..cc460d0 100644
>> --- a/drivers/soc/qcom/Kconfig
>> +++ b/drivers/soc/qcom/Kconfig
>> @@ -3,6 +3,15 @@
>>   #
>>   menu "Qualcomm SoC drivers"
>>   
>> +config QCOM_GENI_SE
>> +       tristate "QCOM GENI Serial Engine Driver"
>> +       depends on ARCH_QCOM
> 
> Add || COMPILE_TEST?
Ok.
> 
>> +       help
>> +         This module is used to manage Generic Interface (GENI) firmware based
> 
> s/module/driver?
> 
>> +         Qualcomm Technologies, Inc. Universal Peripheral (QUP) Wrapper. This
>> +         module is also used to manage the common aspects of multiple Serial
> s/module/driver?
Ok.
> 
>> +         Engines present in the QUP.
>> +
>>   config QCOM_GLINK_SSR
>>          tristate "Qualcomm Glink SSR driver"
>>          depends on RPMSG
>> diff --git a/drivers/soc/qcom/qcom-geni-se.c b/drivers/soc/qcom/qcom-geni-se.c
>> new file mode 100644
>> index 0000000..61335b8
>> --- /dev/null
>> +++ b/drivers/soc/qcom/qcom-geni-se.c
>> @@ -0,0 +1,971 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright (c) 2017-2018, The Linux Foundation. All rights reserved.
>> +
>> +#include <linux/clk.h>
>> +#include <linux/slab.h>
>> +#include <linux/dma-mapping.h>
>> +#include <linux/io.h>
>> +#include <linux/module.h>
>> +#include <linux/of.h>
>> +#include <linux/of_platform.h>
>> +#include <linux/qcom-geni-se.h>
> 
> #include <linux/platform_device.h>
Ok
> 
>> +
>> +/**
>> + * DOC: Overview
>> + *
>> + * Generic Interface (GENI) Serial Engine (SE) Wrapper driver is introduced
>> + * to manage GENI firmware based Qualcomm Universal Peripheral (QUP) Wrapper
>> + * controller. QUP Wrapper is designed to support various serial bus protocols
>> + * like UART, SPI, I2C, I3C, etc.
>> + */
>> +
>> +/**
>> + * DOC: Hardware description
>> + *
>> + * GENI based QUP is a highly-flexible and programmable module for supporting
>> + * a wide range of serial interfaces like UART, SPI, I2C, I3C, etc. A single
>> + * QUP module can provide upto 8 Serial Interfaces, using its internal
>> + * Serial Engines. The actual configuration is determined by the target
>> + * platform configuration. The protocol supported by each interface is
>> + * determined by the firmware loaded to the Serial Engine. Each SE consists
>> + * of a DMA Engine and GENI sub modules which enable Serial Engines to
>> + * support FIFO and DMA modes of operation.
>> + *
>> + *
>> + *                      +-----------------------------------------+
>> + *                      |QUP Wrapper                              |
>> + *                      |         +----------------------------+  |
>> + *   --QUP & SE Clocks-->         | Serial Engine N            |  +-IO------>
>> + *                      |         | ...                        |  | Interface
>> + *   <---Clock Perf.----+    +----+-----------------------+    |  |
>> + *     State Interface  |    | Serial Engine 1            |    |  |
>> + *                      |    |                            |    |  |
>> + *                      |    |                            |    |  |
>> + *   <--------AHB------->    |                            |    |  |
>> + *                      |    |                            +----+  |
>> + *                      |    |                            |       |
>> + *                      |    |                            |       |
>> + *   <------SE IRQ------+    +----------------------------+       |
>> + *                      |                                         |
>> + *                      +-----------------------------------------+
>> + *
>> + *                         Figure 1: GENI based QUP Wrapper
> 
> The code talks about primary and secondary sequencers, but this hardware
> description doesn't talk about it. Can you add some more information
> here about that aspect too?
Ok.
> 
>> + */
>> +
>> +/**
>> + * DOC: Software description
>> + *
>> + * GENI SE Wrapper driver is structured into 2 parts:
>> + *
>> + * geni_wrapper represents QUP Wrapper controller. This part of the driver
>> + * manages QUP Wrapper information such as hardware version, clock
>> + * performance table that is common to all the internal Serial Engines.
>> + *
>> + * geni_se represents Serial Engine. This part of the driver manages Serial
>> + * Engine information such as clocks, containing QUP Wrapper etc. This part
> 
> Insert a comma here                                           ^
Ok.
> 
>> + * of driver also supports operations(eg. initialize the concerned Serial
> 
> Space
Ok.
                                    ^
> 
>> + * Engine, select between FIFO and DMA mode of operation etc.) that are
>> + * common to all the Serial Engines and are independent of Serial Interfaces.
> 
> Why are Serial Interfaces and Serial Engine always capitalized?
No special reason. I will change it to small letter.
> 
>> + */
>> +
>> +#define MAX_CLK_PERF_LEVEL 32
>> +#define NUM_AHB_CLKS 2
>> +static const char m_ahb_clk[] = "m-ahb";
>> +static const char s_ahb_clk[] = "s-ahb";
> 
> These are used in one place. Inline them?
Ok.
> 
>> +
>> +/**
>> + * @struct geni_wrapper - Data structure to represent the QUP Wrapper Core
>> + * @dev:               Device pointer of the QUP wrapper core.
>> + * @base:              Base address of this instance of QUP wrapper core.
>> + * @ahb_clks:          Handle to the primary & secondary AHB clocks.
>> + * @lock:              Lock to protect the device elements.
> 
> What does 'device elements' mean?
It means members of geni_wrapper structure. I will document that way.
> 
>> + * @num_clk_levels:    Number of valid clock levels in clk_perf_tbl.
>> + * @clk_perf_tbl:      Table of clock frequency input to Serial Engine clock.
> 
> Kernel-doc normally doesn't have a full-stop on member descriptions.
Ok. I will remove the full-stop.
> 
>> + */
>> +struct geni_wrapper {
>> +       struct device *dev;
>> +       void __iomem *base;
>> +       struct clk_bulk_data ahb_clks[NUM_AHB_CLKS];
>> +       struct mutex lock;
>> +       unsigned int num_clk_levels;
>> +       unsigned long *clk_perf_tbl;
>> +};
>> +
>> +/* Offset of QUP Hardware Version Register */
> 
> Useless comment?
I will remove it.
> 
>> +#define QUP_HW_VER_REG                 0x4
>> +
>> +#define HW_VER_MAJOR_MASK              GENMASK(31, 28)
>> +#define HW_VER_MAJOR_SHFT              28
>> +#define HW_VER_MINOR_MASK              GENMASK(27, 16)
>> +#define HW_VER_MINOR_SHFT              16
>> +#define HW_VER_STEP_MASK               GENMASK(15, 0)
>> +
>> +/* Common SE registers */
>> +#define GENI_INIT_CFG_REVISION         0x0
>> +#define GENI_S_INIT_CFG_REVISION       0x4
>> +#define GENI_OUTPUT_CTRL               0x24
>> +#define GENI_CGC_CTRL                  0x28
>> +#define GENI_CLK_CTRL_RO               0x60
>> +#define GENI_IF_DISABLE_RO             0x64
>> +#define GENI_FW_REVISION_RO            0x68
>> +#define GENI_FW_S_REVISION_RO          0x6c
>> +#define SE_GENI_BYTE_GRAN              0x254
>> +#define SE_GENI_TX_PACKING_CFG0                0x260
>> +#define SE_GENI_TX_PACKING_CFG1                0x264
>> +#define SE_GENI_RX_PACKING_CFG0                0x284
>> +#define SE_GENI_RX_PACKING_CFG1                0x288
>> +#define SE_GENI_M_GP_LENGTH            0x910
>> +#define SE_GENI_S_GP_LENGTH            0x914
>> +#define SE_DMA_TX_PTR_L                        0xc30
>> +#define SE_DMA_TX_PTR_H                        0xc34
>> +#define SE_DMA_TX_ATTR                 0xc38
>> +#define SE_DMA_TX_LEN                  0xc3c
>> +#define SE_DMA_TX_IRQ_EN               0xc48
>> +#define SE_DMA_TX_IRQ_EN_SET           0xc4c
>> +#define SE_DMA_TX_IRQ_EN_CLR           0xc50
>> +#define SE_DMA_TX_LEN_IN               0xc54
>> +#define SE_DMA_TX_MAX_BURST            0xc5c
>> +#define SE_DMA_RX_PTR_L                        0xd30
>> +#define SE_DMA_RX_PTR_H                        0xd34
>> +#define SE_DMA_RX_ATTR                 0xd38
>> +#define SE_DMA_RX_LEN                  0xd3c
>> +#define SE_DMA_RX_IRQ_EN               0xd48
>> +#define SE_DMA_RX_IRQ_EN_SET           0xd4c
>> +#define SE_DMA_RX_IRQ_EN_CLR           0xd50
>> +#define SE_DMA_RX_LEN_IN               0xd54
>> +#define SE_DMA_RX_MAX_BURST            0xd5c
>> +#define SE_DMA_RX_FLUSH                        0xd60
>> +#define SE_GSI_EVENT_EN                        0xe18
>> +#define SE_IRQ_EN                      0xe1c
>> +#define SE_HW_PARAM_0                  0xe24
>> +#define SE_HW_PARAM_1                  0xe28
>> +#define SE_DMA_GENERAL_CFG             0xe30
>> +
>> +/* GENI_OUTPUT_CTRL fields */
>> +#define DEFAULT_IO_OUTPUT_CTRL_MSK     GENMASK(6, 0)
>> +
>> +/* GENI_CGC_CTRL fields */
>> +#define CFG_AHB_CLK_CGC_ON             BIT(0)
>> +#define CFG_AHB_WR_ACLK_CGC_ON         BIT(1)
>> +#define DATA_AHB_CLK_CGC_ON            BIT(2)
>> +#define SCLK_CGC_ON                    BIT(3)
>> +#define TX_CLK_CGC_ON                  BIT(4)
>> +#define RX_CLK_CGC_ON                  BIT(5)
>> +#define EXT_CLK_CGC_ON                 BIT(6)
>> +#define PROG_RAM_HCLK_OFF              BIT(8)
>> +#define PROG_RAM_SCLK_OFF              BIT(9)
>> +#define DEFAULT_CGC_EN                 GENMASK(6, 0)
>> +
>> +/* FW_REVISION_RO fields */
>> +#define FW_REV_PROTOCOL_MSK            GENMASK(15, 8)
>> +#define FW_REV_PROTOCOL_SHFT           8
>> +
>> +/* SE_GSI_EVENT_EN fields */
>> +#define DMA_RX_EVENT_EN                        BIT(0)
>> +#define DMA_TX_EVENT_EN                        BIT(1)
>> +#define GENI_M_EVENT_EN                        BIT(2)
>> +#define GENI_S_EVENT_EN                        BIT(3)
>> +
>> +/* SE_IRQ_EN fields */
>> +#define DMA_RX_IRQ_EN                  BIT(0)
>> +#define DMA_TX_IRQ_EN                  BIT(1)
>> +#define GENI_M_IRQ_EN                  BIT(2)
>> +#define GENI_S_IRQ_EN                  BIT(3)
>> +
>> +/* SE_HW_PARAM_0 fields */
>> +#define TX_FIFO_WIDTH_MSK              GENMASK(29, 24)
>> +#define TX_FIFO_WIDTH_SHFT             24
>> +#define TX_FIFO_DEPTH_MSK              GENMASK(21, 16)
>> +#define TX_FIFO_DEPTH_SHFT             16
>> +
>> +/* SE_HW_PARAM_1 fields */
>> +#define RX_FIFO_WIDTH_MSK              GENMASK(29, 24)
>> +#define RX_FIFO_WIDTH_SHFT             24
>> +#define RX_FIFO_DEPTH_MSK              GENMASK(21, 16)
>> +#define RX_FIFO_DEPTH_SHFT             16
>> +
>> +/* SE_DMA_GENERAL_CFG */
>> +#define DMA_RX_CLK_CGC_ON              BIT(0)
>> +#define DMA_TX_CLK_CGC_ON              BIT(1)
>> +#define DMA_AHB_SLV_CFG_ON             BIT(2)
>> +#define AHB_SEC_SLV_CLK_CGC_ON         BIT(3)
>> +#define DUMMY_RX_NON_BUFFERABLE                BIT(4)
>> +#define RX_DMA_ZERO_PADDING_EN         BIT(5)
>> +#define RX_DMA_IRQ_DELAY_MSK           GENMASK(8, 6)
>> +#define RX_DMA_IRQ_DELAY_SHFT          6
>> +
>> +/**
>> + * geni_se_get_qup_hw_version() - Read the QUP wrapper Hardware version
>> + * @se:                        Pointer to the corresponding Serial Engine.
>> + * @major:             Buffer for Major Version field.
>> + * @minor:             Buffer for Minor Version field.
>> + * @step:              Buffer for Step Version field.
>> + */
>> +void geni_se_get_qup_hw_version(struct geni_se *se, unsigned int *major,
>> +                               unsigned int *minor, unsigned int *step)
>> +{
>> +       unsigned int version;
>> +       struct geni_wrapper *wrapper = se->wrapper;
>> +
>> +       version = readl_relaxed(wrapper->base + QUP_HW_VER_REG);
>> +       *major = (version & HW_VER_MAJOR_MASK) >> HW_VER_MAJOR_SHFT;
>> +       *minor = (version & HW_VER_MINOR_MASK) >> HW_VER_MINOR_SHFT;
>> +       *step = version & HW_VER_STEP_MASK;
>> +}
>> +EXPORT_SYMBOL(geni_se_get_qup_hw_version);
> 
> Is this used?
SPI controller driver uses this API and it will be uploaded sooner.
> 
>> +
>> +/**
>> + * geni_se_read_proto() - Read the protocol configured for a Serial Engine
>> + * @se:        Pointer to the concerned Serial Engine.
>> + *
>> + * Return: Protocol value as configured in the serial engine.
>> + */
>> +u32 geni_se_read_proto(struct geni_se *se)
>> +{
>> +       u32 val;
>> +
>> +       val = readl_relaxed(se->base + GENI_FW_REVISION_RO);
>> +
>> +       return (val & FW_REV_PROTOCOL_MSK) >> FW_REV_PROTOCOL_SHFT;
>> +}
>> +EXPORT_SYMBOL(geni_se_read_proto);
> 
> Is this API really needed outside of this file? It would seem like the
> drivers that implement the protocol, which are child devices, would only
> use this API to confirm that the protocol chosen is for their particular
> protocol.
No, this API is meant for the protocol drivers to confirm that the 
serial engine is programmed with the firmware for the concerned protocol 
before using the serial engine. If the check fails, the protocol drivers 
stop using the serial engine.
> 
> 
>> +
>> +static void geni_se_io_set_mode(void __iomem *base)
>> +{
>> +       u32 val;
>> +
>> +       val = readl_relaxed(base + SE_IRQ_EN);
>> +       val |= (GENI_M_IRQ_EN | GENI_S_IRQ_EN);
>> +       val |= (DMA_TX_IRQ_EN | DMA_RX_IRQ_EN);
> 
> Drop useless parenthesis please.
Ok.
> 
>> +       writel_relaxed(val, base + SE_IRQ_EN);
>> +
>> +       val = readl_relaxed(base + SE_GENI_DMA_MODE_EN);
>> +       val &= ~GENI_DMA_MODE_EN;
>> +       writel_relaxed(val, base + SE_GENI_DMA_MODE_EN);
>> +
>> +       writel_relaxed(0, base + SE_GSI_EVENT_EN);
>> +}
>> +
>> +static void geni_se_io_init(void __iomem *base)
>> +{
>> +       u32 val;
>> +
>> +       val = readl_relaxed(base + GENI_CGC_CTRL);
>> +       val |= DEFAULT_CGC_EN;
>> +       writel_relaxed(val, base + GENI_CGC_CTRL);
>> +
>> +       val = readl_relaxed(base + SE_DMA_GENERAL_CFG);
>> +       val |= AHB_SEC_SLV_CLK_CGC_ON | DMA_AHB_SLV_CFG_ON;
>> +       val |= DMA_TX_CLK_CGC_ON | DMA_RX_CLK_CGC_ON;
>> +       writel_relaxed(val, base + SE_DMA_GENERAL_CFG);
>> +
>> +       writel_relaxed(DEFAULT_IO_OUTPUT_CTRL_MSK, base + GENI_OUTPUT_CTRL);
>> +       writel_relaxed(FORCE_DEFAULT, base + GENI_FORCE_DEFAULT_REG);
>> +}
>> +
>> +/**
>> + * geni_se_init() - Initialize the GENI Serial Engine
>> + * @se:                Pointer to the concerned Serial Engine.
>> + * @rx_wm:     Receive watermark, in units of FIFO words.
>> + * @rx_rfr_wm: Ready-for-receive watermark, in units of FIFO words.
>> + *
>> + * This function is used to initialize the GENI serial engine, configure
>> + * receive watermark and ready-for-receive watermarks.
>> + *
>> + * Return: 0 on success, standard Linux error codes on failure/error.
> 
> It never returns an error. Change to void?
Ok.
> 
>> + */
>> +int geni_se_init(struct geni_se *se, u32 rx_wm, u32 rx_rfr)
>> +{
>> +       u32 val;
>> +
>> +       geni_se_io_init(se->base);
>> +       geni_se_io_set_mode(se->base);
>> +
>> +       writel_relaxed(rx_wm, se->base + SE_GENI_RX_WATERMARK_REG);
>> +       writel_relaxed(rx_rfr, se->base + SE_GENI_RX_RFR_WATERMARK_REG);
>> +
>> +       val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
>> +       val |= M_COMMON_GENI_M_IRQ_EN;
>> +       writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
>> +
>> +       val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
>> +       val |= S_COMMON_GENI_S_IRQ_EN;
>> +       writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
>> +       return 0;
>> +}
>> +EXPORT_SYMBOL(geni_se_init);
>> +
>> +static void geni_se_select_fifo_mode(struct geni_se *se)
>> +{
>> +       u32 proto = geni_se_read_proto(se);
>> +       u32 val;
>> +
>> +       writel_relaxed(0, se->base + SE_GSI_EVENT_EN);
>> +       writel_relaxed(0xffffffff, se->base + SE_GENI_M_IRQ_CLEAR);
>> +       writel_relaxed(0xffffffff, se->base + SE_GENI_S_IRQ_CLEAR);
>> +       writel_relaxed(0xffffffff, se->base + SE_DMA_TX_IRQ_CLR);
>> +       writel_relaxed(0xffffffff, se->base + SE_DMA_RX_IRQ_CLR);
>> +       writel_relaxed(0xffffffff, se->base + SE_IRQ_EN);
>> +
>> +       val = readl_relaxed(se->base + SE_GENI_M_IRQ_EN);
>> +       if (proto != GENI_SE_UART) {
>> +               val |= M_CMD_DONE_EN | M_TX_FIFO_WATERMARK_EN;
>> +               val |= M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN;
>> +       }
>> +       writel_relaxed(val, se->base + SE_GENI_M_IRQ_EN);
>> +
>> +       val = readl_relaxed(se->base + SE_GENI_S_IRQ_EN);
>> +       if (proto != GENI_SE_UART)
>> +               val |= S_CMD_DONE_EN;
>> +       writel_relaxed(val, se->base + SE_GENI_S_IRQ_EN);
>> +
>> +       val = readl_relaxed(se->base + SE_GENI_DMA_MODE_EN);
>> +       val &= ~GENI_DMA_MODE_EN;
>> +       writel_relaxed(val, se->base + SE_GENI_DMA_MODE_EN);
>> +}
>> +
>> +static void geni_se_select_dma_mode(struct geni_se *se)
>> +{
>> +       u32 val;
>> +
>> +       writel_relaxed(0, se->base + SE_GSI_EVENT_EN);
>> +       writel_relaxed(0xffffffff, se->base + SE_GENI_M_IRQ_CLEAR);
>> +       writel_relaxed(0xffffffff, se->base + SE_GENI_S_IRQ_CLEAR);
>> +       writel_relaxed(0xffffffff, se->base + SE_DMA_TX_IRQ_CLR);
>> +       writel_relaxed(0xffffffff, se->base + SE_DMA_RX_IRQ_CLR);
>> +       writel_relaxed(0xffffffff, se->base + SE_IRQ_EN);
>> +
>> +       val = readl_relaxed(se->base + SE_GENI_DMA_MODE_EN);
>> +       val |= GENI_DMA_MODE_EN;
>> +       writel_relaxed(val, se->base + SE_GENI_DMA_MODE_EN);
>> +}
>> +
>> +/**
>> + * geni_se_select_mode() - Select the serial engine transfer mode
>> + * @se:                Pointer to the concerned Serial Engine.
>> + * @mode:      Transfer mode to be selected.
>> + */
>> +void geni_se_select_mode(struct geni_se *se, int mode)
> 
> enum mode?
Ok.
> 
>> +{
>> +       WARN_ON(mode != GENI_SE_FIFO && mode != GENI_SE_DMA);
>> +
>> +       switch (mode) {
>> +       case GENI_SE_FIFO:
>> +               geni_se_select_fifo_mode(se);
>> +               break;
>> +       case GENI_SE_DMA:
>> +               geni_se_select_dma_mode(se);
>> +               break;
>> +       }
>> +}
>> +EXPORT_SYMBOL(geni_se_select_mode);
>> +
>> +/**
>> + * geni_se_setup_m_cmd() - Setup the primary sequencer
>> + * @se:                Pointer to the concerned Serial Engine.
>> + * @cmd:       Command/Operation to setup in the primary sequencer.
>> + * @params:    Parameter for the sequencer command.
>> + *
>> + * This function is used to configure the primary sequencer with the
>> + * command and its associated parameters.
>> + */
>> +void geni_se_setup_m_cmd(struct geni_se *se, u32 cmd, u32 params)
>> +{
>> +       u32 m_cmd;
>> +
>> +       m_cmd = (cmd << M_OPCODE_SHFT) | (params & M_PARAMS_MSK);
>> +       writel_relaxed(m_cmd, se->base + SE_GENI_M_CMD0);
>> +}
>> +EXPORT_SYMBOL(geni_se_setup_m_cmd);
>> +
>> +/**
>> + * geni_se_setup_s_cmd() - Setup the secondary sequencer
>> + * @se:                Pointer to the concerned Serial Engine.
>> + * @cmd:       Command/Operation to setup in the secondary sequencer.
>> + * @params:    Parameter for the sequencer command.
>> + *
>> + * This function is used to configure the secondary sequencer with the
>> + * command and its associated parameters.
>> + */
>> +void geni_se_setup_s_cmd(struct geni_se *se, u32 cmd, u32 params)
>> +{
>> +       u32 s_cmd;
>> +
>> +       s_cmd = readl_relaxed(se->base + SE_GENI_S_CMD0);
>> +       s_cmd &= ~(S_OPCODE_MSK | S_PARAMS_MSK);
>> +       s_cmd |= (cmd << S_OPCODE_SHFT);
>> +       s_cmd |= (params & S_PARAMS_MSK);
>> +       writel_relaxed(s_cmd, se->base + SE_GENI_S_CMD0);
>> +}
>> +EXPORT_SYMBOL(geni_se_setup_s_cmd);
>> +
>> +/**
>> + * geni_se_cancel_m_cmd() - Cancel the command configured in the primary
>> + *                          sequencer
>> + * @se:        Pointer to the concerned Serial Engine.
>> + *
>> + * This function is used to cancel the currently configured command in the
>> + * primary sequencer.
>> + */
>> +void geni_se_cancel_m_cmd(struct geni_se *se)
>> +{
>> +       writel_relaxed(M_GENI_CMD_CANCEL, se->base + SE_GENI_M_CMD_CTRL_REG);
>> +}
>> +EXPORT_SYMBOL(geni_se_cancel_m_cmd);
>> +
>> +/**
>> + * geni_se_cancel_s_cmd() - Cancel the command configured in the secondary
>> + *                          sequencer
>> + * @se:        Pointer to the concerned Serial Engine.
>> + *
>> + * This function is used to cancel the currently configured command in the
>> + * secondary sequencer.
>> + */
>> +void geni_se_cancel_s_cmd(struct geni_se *se)
>> +{
>> +       writel_relaxed(S_GENI_CMD_CANCEL, se->base + SE_GENI_S_CMD_CTRL_REG);
>> +}
>> +EXPORT_SYMBOL(geni_se_cancel_s_cmd);
>> +
>> +/**
>> + * geni_se_abort_m_cmd() - Abort the command configured in the primary sequencer
>> + * @se:        Pointer to the concerned Serial Engine.
>> + *
>> + * This function is used to force abort the currently configured command in the
>> + * primary sequencer.
>> + */
>> +void geni_se_abort_m_cmd(struct geni_se *se)
>> +{
>> +       writel_relaxed(M_GENI_CMD_ABORT, se->base + SE_GENI_M_CMD_CTRL_REG);
>> +}
>> +EXPORT_SYMBOL(geni_se_abort_m_cmd);
>> +
>> +/**
>> + * geni_se_abort_s_cmd() - Abort the command configured in the secondary
>> + *                         sequencer
>> + * @se:        Pointer to the concerned Serial Engine.
>> + *
>> + * This function is used to force abort the currently configured command in the
>> + * secondary sequencer.
>> + */
>> +void geni_se_abort_s_cmd(struct geni_se *se)
>> +{
>> +       writel_relaxed(S_GENI_CMD_ABORT, se->base + SE_GENI_S_CMD_CTRL_REG);
>> +}
>> +EXPORT_SYMBOL(geni_se_abort_s_cmd);
> 
> Can these one-liners go into the header file and be marked static
> inline? I would guess call-sites already have se->base in hand, so
> registers might be reused more efficiently and it may result in a single
> store instruction instead of a branch and load/store.
Ok.
> 
>> +
>> +/**
>> + * geni_se_get_tx_fifo_depth() - Get the TX fifo depth of the serial engine
>> + * @se:        Pointer to the concerned Serial Engine.
>> + *
>> + * This function is used to get the depth i.e. number of elements in the
>> + * TX fifo of the serial engine.
>> + *
>> + * Return: TX fifo depth in units of FIFO words.
>> + */
>> +u32 geni_se_get_tx_fifo_depth(struct geni_se *se)
>> +{
>> +       u32 val;
>> +
>> +       val = readl_relaxed(se->base + SE_HW_PARAM_0);
>> +
>> +       return (val & TX_FIFO_DEPTH_MSK) >> TX_FIFO_DEPTH_SHFT;
>> +}
>> +EXPORT_SYMBOL(geni_se_get_tx_fifo_depth);
>> +
>> +/**
>> + * geni_se_get_tx_fifo_width() - Get the TX fifo width of the serial engine
>> + * @se:        Pointer to the concerned Serial Engine.
>> + *
>> + * This function is used to get the width i.e. word size per element in the
>> + * TX fifo of the serial engine.
>> + *
>> + * Return: TX fifo width in bits
>> + */
>> +u32 geni_se_get_tx_fifo_width(struct geni_se *se)
>> +{
>> +       u32 val;
>> +
>> +       val = readl_relaxed(se->base + SE_HW_PARAM_0);
>> +
>> +       return (val & TX_FIFO_WIDTH_MSK) >> TX_FIFO_WIDTH_SHFT;
>> +}
>> +EXPORT_SYMBOL(geni_se_get_tx_fifo_width);
>> +
>> +/**
>> + * geni_se_get_rx_fifo_depth() - Get the RX fifo depth of the serial engine
>> + * @se:        Pointer to the concerned Serial Engine.
>> + *
>> + * This function is used to get the depth i.e. number of elements in the
>> + * RX fifo of the serial engine.
>> + *
>> + * Return: RX fifo depth in units of FIFO words
>> + */
>> +u32 geni_se_get_rx_fifo_depth(struct geni_se *se)
>> +{
>> +       u32 val;
>> +
>> +       val = readl_relaxed(se->base + SE_HW_PARAM_1);
>> +
>> +       return (val & RX_FIFO_DEPTH_MSK) >> RX_FIFO_DEPTH_SHFT;
>> +}
>> +EXPORT_SYMBOL(geni_se_get_rx_fifo_depth);
> 
> These ones too, can probably just be static inline.
Ok. Just for my knowledge - is there any reference guideline regarding 
when to use static inline myself and when to let the compiler do the 
clever thing?
> 
>> +
>> +/**
>> + * DOC: Overview
>> + *
>> + * GENI FIFO packing is highly configurable. TX/RX packing/unpacking consist
>> + * of up to 4 operations, each operation represented by 4 configuration vectors
>> + * of 10 bits programmed in GENI_TX_PACKING_CFG0 and GENI_TX_PACKING_CFG1 for
>> + * TX FIFO and in GENI_RX_PACKING_CFG0 and GENI_RX_PACKING_CFG1 for RX FIFO.
>> + * Refer to below examples for detailed bit-field description.
>> + *
>> + * Example 1: word_size = 7, packing_mode = 4 x 8, msb_to_lsb = 1
>> + *
>> + *        +-----------+-------+-------+-------+-------+
>> + *        |           | vec_0 | vec_1 | vec_2 | vec_3 |
>> + *        +-----------+-------+-------+-------+-------+
>> + *        | start     | 0x6   | 0xe   | 0x16  | 0x1e  |
>> + *        | direction | 1     | 1     | 1     | 1     |
>> + *        | length    | 6     | 6     | 6     | 6     |
>> + *        | stop      | 0     | 0     | 0     | 1     |
>> + *        +-----------+-------+-------+-------+-------+
>> + *
>> + * Example 2: word_size = 15, packing_mode = 2 x 16, msb_to_lsb = 0
>> + *
>> + *        +-----------+-------+-------+-------+-------+
>> + *        |           | vec_0 | vec_1 | vec_2 | vec_3 |
>> + *        +-----------+-------+-------+-------+-------+
>> + *        | start     | 0x0   | 0x8   | 0x10  | 0x18  |
>> + *        | direction | 0     | 0     | 0     | 0     |
>> + *        | length    | 7     | 6     | 7     | 6     |
>> + *        | stop      | 0     | 0     | 0     | 1     |
>> + *        +-----------+-------+-------+-------+-------+
>> + *
>> + * Example 3: word_size = 23, packing_mode = 1 x 32, msb_to_lsb = 1
>> + *
>> + *        +-----------+-------+-------+-------+-------+
>> + *        |           | vec_0 | vec_1 | vec_2 | vec_3 |
>> + *        +-----------+-------+-------+-------+-------+
>> + *        | start     | 0x16  | 0xe   | 0x6   | 0x0   |
>> + *        | direction | 1     | 1     | 1     | 1     |
>> + *        | length    | 7     | 7     | 6     | 0     |
>> + *        | stop      | 0     | 0     | 1     | 0     |
>> + *        +-----------+-------+-------+-------+-------+
>> + *
>> + */
>> +
>> +#define NUM_PACKING_VECTORS 4
>> +#define PACKING_START_SHIFT 5
>> +#define PACKING_DIR_SHIFT 4
>> +#define PACKING_LEN_SHIFT 1
>> +#define PACKING_STOP_BIT BIT(0)
>> +#define PACKING_VECTOR_SHIFT 10
>> +/**
>> + * geni_se_config_packing() - Packing configuration of the serial engine
>> + * @se:                Pointer to the concerned Serial Engine
>> + * @bpw:       Bits of data per transfer word.
>> + * @pack_words:        Number of words per fifo element.
>> + * @msb_to_lsb:        Transfer from MSB to LSB or vice-versa.
>> + * @tx_cfg:    Flag to configure the TX Packing.
>> + * @rx_cfg:    Flag to configure the RX Packing.
>> + *
>> + * This function is used to configure the packing rules for the current
>> + * transfer.
>> + */
>> +void geni_se_config_packing(struct geni_se *se, int bpw, int pack_words,
>> +                           bool msb_to_lsb, bool tx_cfg, bool rx_cfg)
>> +{
>> +       u32 cfg0, cfg1, cfg[NUM_PACKING_VECTORS] = {0};
>> +       int len;
>> +       int temp_bpw = bpw;
>> +       int idx_start = msb_to_lsb ? bpw - 1 : 0;
>> +       int idx = idx_start;
>> +       int idx_delta = msb_to_lsb ? -BITS_PER_BYTE : BITS_PER_BYTE;
>> +       int ceil_bpw = (bpw + (BITS_PER_BYTE - 1)) & ~(BITS_PER_BYTE - 1);
> 
> ALIGN(bpw, BITS_PER_BYTE)?
Ok.
> 
>> +       int iter = (ceil_bpw * pack_words) / BITS_PER_BYTE;
>> +       int i;
>> +
>> +       if (iter <= 0 || iter > NUM_PACKING_VECTORS)
>> +               return;
>> +
>> +       for (i = 0; i < iter; i++) {
>> +               if (temp_bpw < BITS_PER_BYTE)
>> +                       len = temp_bpw - 1;
>> +               else
>> +                       len = BITS_PER_BYTE - 1;
> 
> len = min(temp_bpw, BITS_PER_BYTE) - 1;
Ok.
> 
>> +
>> +               cfg[i] = idx << PACKING_START_SHIFT;
>> +               cfg[i] |= msb_to_lsb << PACKING_DIR_SHIFT;
>> +               cfg[i] |= len << PACKING_LEN_SHIFT;
>> +
>> +               if (temp_bpw <= BITS_PER_BYTE) {
>> +                       idx = ((i + 1) * BITS_PER_BYTE) + idx_start;
>> +                       temp_bpw = bpw;
>> +               } else {
>> +                       idx = idx + idx_delta;
>> +                       temp_bpw = temp_bpw - BITS_PER_BYTE;
>> +               }
>> +       }
>> +       cfg[iter - 1] |= PACKING_STOP_BIT;
>> +       cfg0 = cfg[0] | (cfg[1] << PACKING_VECTOR_SHIFT);
>> +       cfg1 = cfg[2] | (cfg[3] << PACKING_VECTOR_SHIFT);
>> +
>> +       if (tx_cfg) {
>> +               writel_relaxed(cfg0, se->base + SE_GENI_TX_PACKING_CFG0);
>> +               writel_relaxed(cfg1, se->base + SE_GENI_TX_PACKING_CFG1);
>> +       }
>> +       if (rx_cfg) {
>> +               writel_relaxed(cfg0, se->base + SE_GENI_RX_PACKING_CFG0);
>> +               writel_relaxed(cfg1, se->base + SE_GENI_RX_PACKING_CFG1);
>> +       }
>> +
>> +       /*
>> +        * Number of protocol words in each FIFO entry
>> +        * 0 - 4x8, four words in each entry, max word size of 8 bits
>> +        * 1 - 2x16, two words in each entry, max word size of 16 bits
>> +        * 2 - 1x32, one word in each entry, max word size of 32 bits
>> +        * 3 - undefined
>> +        */
>> +       if (pack_words || bpw == 32)
>> +               writel_relaxed(bpw / 16, se->base + SE_GENI_BYTE_GRAN);
>> +}
>> +EXPORT_SYMBOL(geni_se_config_packing);
>> +
>> +static void geni_se_clks_off(struct geni_se *se)
>> +{
>> +       struct geni_wrapper *wrapper = se->wrapper;
>> +
>> +       clk_disable_unprepare(se->clk);
>> +       clk_bulk_disable_unprepare(ARRAY_SIZE(wrapper->ahb_clks),
>> +                                               wrapper->ahb_clks);
>> +}
>> +
>> +/**
>> + * geni_se_resources_off() - Turn off resources associated with the serial
>> + *                           engine
>> + * @se:        Pointer to the concerned Serial Engine.
>> + *
>> + * Return: 0 on success, standard Linux error codes on failure/error.
>> + */
>> +int geni_se_resources_off(struct geni_se *se)
>> +{
>> +       int ret;
>> +
>> +       ret = pinctrl_pm_select_sleep_state(se->dev);
>> +       if (ret)
>> +               return ret;
>> +
>> +       geni_se_clks_off(se);
>> +       return 0;
>> +}
>> +EXPORT_SYMBOL(geni_se_resources_off);
>> +
>> +static int geni_se_clks_on(struct geni_se *se)
>> +{
>> +       int ret;
>> +       struct geni_wrapper *wrapper = se->wrapper;
>> +
>> +       ret = clk_bulk_prepare_enable(ARRAY_SIZE(wrapper->ahb_clks),
>> +                                               wrapper->ahb_clks);
>> +       if (ret)
>> +               return ret;
>> +
>> +       ret = clk_prepare_enable(se->clk);
>> +       if (ret)
>> +               clk_bulk_disable_unprepare(ARRAY_SIZE(wrapper->ahb_clks),
>> +                                                       wrapper->ahb_clks);
>> +       return ret;
>> +}
>> +
>> +/**
>> + * geni_se_resources_on() - Turn on resources associated with the serial
>> + *                          engine
>> + * @se:        Pointer to the concerned Serial Engine.
>> + *
>> + * Return: 0 on success, standard Linux error codes on failure/error.
>> + */
>> +int geni_se_resources_on(struct geni_se *se)
>> +{
>> +       int ret = 0;
> 
> Don't assign variables and then reassign them on the next line.
Ok.
> 
>> +
>> +       ret = geni_se_clks_on(se);
>> +       if (ret)
>> +               return ret;
>> +
>> +       ret = pinctrl_pm_select_default_state(se->dev);
>> +       if (ret)
>> +               geni_se_clks_off(se);
>> +
>> +       return ret;
>> +}
>> +EXPORT_SYMBOL(geni_se_resources_on);
> 
> IS there a reason why we can't use runtime PM or normal linux PM
> infrastructure to power on the wrapper and keep it powered while the
> protocol driver is active?
Besides turning on the clocks & pinctrl settings, wrapper also has to do 
the bus scaling votes. The bus scaling votes depend on the individual 
serial interface bandwidth requirements. The bus scaling votes is not 
present currently. But once the support comes in, this function enables 
adding it.
> 
>> +
>> +/**
>> + * geni_se_clk_tbl_get() - Get the clock table to program DFS
>> + * @se:                Pointer to the concerned Serial Engine.
>> + * @tbl:       Table in which the output is returned.
>> + *
>> + * This function is called by the protocol drivers to determine the different
>> + * clock frequencies supported by Serial Engine Core Clock. The protocol
>> + * drivers use the output to determine the clock frequency index to be
>> + * programmed into DFS.
>> + *
>> + * Return: number of valid performance levels in the table on success,
>> + *        standard Linux error codes on failure.
>> + */
>> +int geni_se_clk_tbl_get(struct geni_se *se, unsigned long **tbl)
>> +{
>> +       struct geni_wrapper *wrapper = se->wrapper;
>> +       unsigned long freq = 0;
>> +       int i;
>> +       int ret = 0;
>> +
>> +       mutex_lock(&wrapper->lock);
>> +       if (wrapper->clk_perf_tbl) {
>> +               *tbl = wrapper->clk_perf_tbl;
>> +               ret = wrapper->num_clk_levels;
>> +               goto out_unlock;
>> +       }
>> +
>> +       wrapper->clk_perf_tbl = kcalloc(MAX_CLK_PERF_LEVEL,
>> +                                       sizeof(*wrapper->clk_perf_tbl),
>> +                                       GFP_KERNEL);
>> +       if (!wrapper->clk_perf_tbl) {
>> +               ret = -ENOMEM;
>> +               goto out_unlock;
>> +       }
>> +
>> +       for (i = 0; i < MAX_CLK_PERF_LEVEL; i++) {
>> +               freq = clk_round_rate(se->clk, freq + 1);
>> +               if (!freq || freq == wrapper->clk_perf_tbl[i - 1])
>> +                       break;
>> +               wrapper->clk_perf_tbl[i] = freq;
>> +       }
>> +       wrapper->num_clk_levels = i;
>> +       *tbl = wrapper->clk_perf_tbl;
>> +       ret = wrapper->num_clk_levels;
>> +out_unlock:
>> +       mutex_unlock(&wrapper->lock);
> 
> Is this lock actually protecting anything? I mean to say, is any more
> than one geni protocol driver calling this function at a time? Or is
> the same geni protocol driver calling this from multiple threads at the
> same time? The lock looks almost useless.
Yes, there is a possibility of multiple I2C instances within the same 
wrapper trying to get this table simultaneously.

As Evan mentioned in the other thread, Bjorn had the comment to move it 
to the probe and remove the lock. I looked into the possibility of it. 
 From the hardware perspective, this table belongs to the wrapper and is 
shared by all the serial engines within the wrapper. But due to software 
implementation reasons, clk_round_rate can be be performed only on the 
clocks that are tagged as DFS compatible and only the serial engine 
clocks are tagged so. At least this was the understanding based on our 
earlier discussion with the concerned folks. We will revisit it and 
check if anything has changed recently.
> 
>> +       return ret;
>> +}
>> +EXPORT_SYMBOL(geni_se_clk_tbl_get);
>> +
>> +/**
>> + * geni_se_clk_freq_match() - Get the matching or closest SE clock frequency
>> + * @se:                Pointer to the concerned Serial Engine.
>> + * @req_freq:  Requested clock frequency.
>> + * @index:     Index of the resultant frequency in the table.
>> + * @res_freq:  Resultant frequency which matches or is closer to the
>> + *             requested frequency.
>> + * @exact:     Flag to indicate exact multiple requirement of the requested
>> + *             frequency.
>> + *
>> + * This function is called by the protocol drivers to determine the matching
>> + * or exact multiple of the requested frequency, as provided by the Serial
>> + * Engine clock in order to meet the performance requirements. If there is
>> + * no matching or exact multiple of the requested frequency found, then it
>> + * selects the closest floor frequency, if exact flag is not set.
>> + *
>> + * Return: 0 on success, standard Linux error codes on failure.
>> + */
>> +int geni_se_clk_freq_match(struct geni_se *se, unsigned long req_freq,
>> +                          unsigned int *index, unsigned long *res_freq,
>> +                          bool exact)
>> +{
>> +       unsigned long *tbl;
>> +       int num_clk_levels;
>> +       int i;
>> +
>> +       num_clk_levels = geni_se_clk_tbl_get(se, &tbl);
>> +       if (num_clk_levels < 0)
>> +               return num_clk_levels;
>> +
>> +       if (num_clk_levels == 0)
>> +               return -EFAULT;
> 
> I believe this would mean userspace thought the syscall faulted.
> Perhaps -EINVAL instead?
Ok.
> 
>> +
>> +       *res_freq = 0;
>> +       for (i = 0; i < num_clk_levels; i++) {
>> +               if (!(tbl[i] % req_freq)) {
>> +                       *index = i;
>> +                       *res_freq = tbl[i];
>> +                       return 0;
>> +               }
>> +
>> +               if (!(*res_freq) || ((tbl[i] > *res_freq) &&
>> +                                    (tbl[i] < req_freq))) {
>> +                       *index = i;
>> +                       *res_freq = tbl[i];
>> +               }
>> +       }
>> +
>> +       if (exact)
>> +               return -ENOKEY;
> 
> Interesting error code. Doubtful this is correct because it seems to be
> related to crypto keys.
Ok.
> 
>> +
>> +       return 0;
>> +}
>> +EXPORT_SYMBOL(geni_se_clk_freq_match);
>> +
>> +#define GENI_SE_DMA_DONE_EN BIT(0)
>> +#define GENI_SE_DMA_EOT_EN BIT(1)
>> +#define GENI_SE_DMA_AHB_ERR_EN BIT(2)
>> +#define GENI_SE_DMA_EOT_BUF BIT(0)
>> +/**
>> + * geni_se_tx_dma_prep() - Prepare the Serial Engine for TX DMA transfer
>> + * @se:                        Pointer to the concerned Serial Engine.
>> + * @buf:               Pointer to the TX buffer.
>> + * @len:               Length of the TX buffer.
>> + *
>> + * This function is used to prepare the buffers for DMA TX.
>> + *
>> + * Return: Mapped DMA Address of the buffer on success, NULL on failure.
>> + */
>> +dma_addr_t geni_se_tx_dma_prep(struct geni_se *se, void *buf, size_t len)
>> +{
>> +       dma_addr_t iova;
>> +       struct geni_wrapper *wrapper = se->wrapper;
>> +       u32 val;
>> +
>> +       iova = dma_map_single(wrapper->dev, buf, len, DMA_TO_DEVICE);
>> +       if (dma_mapping_error(wrapper->dev, iova))
>> +               return (dma_addr_t)NULL;
>> +
>> +       val = GENI_SE_DMA_DONE_EN;
>> +       val |= GENI_SE_DMA_EOT_EN;
>> +       val |= GENI_SE_DMA_AHB_ERR_EN;
>> +       writel_relaxed(val, se->base + SE_DMA_TX_IRQ_EN_SET);
>> +       writel_relaxed((u32)iova, se->base + SE_DMA_TX_PTR_L);
> 
> lower_32_bits()
Ok.
> 
>> +       writel_relaxed((u32)(iova >> 32), se->base + SE_DMA_TX_PTR_H);
> 
> upper_32_bits()
Ok.
> 
>> +       writel_relaxed(GENI_SE_DMA_EOT_BUF, se->base + SE_DMA_TX_ATTR);
>> +       writel_relaxed((u32)len, se->base + SE_DMA_TX_LEN);
> 
> Useless cast.
I will remove the casting.
> 
>> +       return iova;
>> +}
>> +EXPORT_SYMBOL(geni_se_tx_dma_prep);
>> +
>> +/**
>> + * geni_se_rx_dma_prep() - Prepare the Serial Engine for RX DMA transfer
>> + * @se:                        Pointer to the concerned Serial Engine.
>> + * @buf:               Pointer to the RX buffer.
>> + * @len:               Length of the RX buffer.
>> + *
>> + * This function is used to prepare the buffers for DMA RX.
>> + *
>> + * Return: Mapped DMA Address of the buffer on success, NULL on failure.
>> + */
>> +dma_addr_t geni_se_rx_dma_prep(struct geni_se *se, void *buf, size_t len)
>> +{
>> +       dma_addr_t iova;
>> +       struct geni_wrapper *wrapper = se->wrapper;
>> +       u32 val;
>> +
>> +       iova = dma_map_single(wrapper->dev, buf, len, DMA_FROM_DEVICE);
>> +       if (dma_mapping_error(wrapper->dev, iova))
>> +               return (dma_addr_t)NULL;
> 
> Can't return a dma_mapping_error address to the caller and have them
> figure it out?
Earlier we used to return the DMA_ERROR_CODE which has been removed 
recently in arm64 architecture. If we return the dma_mapping_error, then 
the caller also needs the device which encountered the mapping error. 
The serial interface drivers can use their parent currently to resolve 
the mapping error. Once the wrapper starts mapping using IOMMU context 
bank, then the serial interface drivers do not know which device to use 
to know if there is an error.

Having said that, the dma_ops suggestion might help with handling this 
situation. I will look into it further.
> 
>> +
>> +       val = GENI_SE_DMA_DONE_EN;
>> +       val |= GENI_SE_DMA_EOT_EN;
>> +       val |= GENI_SE_DMA_AHB_ERR_EN;
>> +       writel_relaxed(val, se->base + SE_DMA_RX_IRQ_EN_SET);
>> +       writel_relaxed((u32)iova, se->base + SE_DMA_RX_PTR_L);
>> +       writel_relaxed((u32)(iova >> 32), se->base + SE_DMA_RX_PTR_H);
> 
> upper/lower macros again.
Ok.
> 
>> +       /* RX does not have EOT buffer type bit. So just reset RX_ATTR */
>> +       writel_relaxed(0, se->base + SE_DMA_RX_ATTR);
>> +       writel_relaxed((u32)len, se->base + SE_DMA_RX_LEN);
> 
> Drop cast?
Ok.
> 
>> +       return iova;
>> +}
>> +EXPORT_SYMBOL(geni_se_rx_dma_prep);
>> +
>> +/**
>> + * geni_se_tx_dma_unprep() - Unprepare the Serial Engine after TX DMA transfer
>> + * @se:                        Pointer to the concerned Serial Engine.
>> + * @iova:              DMA address of the TX buffer.
>> + * @len:               Length of the TX buffer.
>> + *
>> + * This function is used to unprepare the DMA buffers after DMA TX.
>> + */
>> +void geni_se_tx_dma_unprep(struct geni_se *se, dma_addr_t iova, size_t len)
>> +{
>> +       struct geni_wrapper *wrapper = se->wrapper;
>> +
>> +       if (iova)
>> +               dma_unmap_single(wrapper->dev, iova, len, DMA_TO_DEVICE);
>> +}
>> +EXPORT_SYMBOL(geni_se_tx_dma_unprep);
>> +
>> +/**
>> + * geni_se_rx_dma_unprep() - Unprepare the Serial Engine after RX DMA transfer
>> + * @se:                        Pointer to the concerned Serial Engine.
>> + * @iova:              DMA address of the RX buffer.
>> + * @len:               Length of the RX buffer.
>> + *
>> + * This function is used to unprepare the DMA buffers after DMA RX.
>> + */
>> +void geni_se_rx_dma_unprep(struct geni_se *se, dma_addr_t iova, size_t len)
>> +{
>> +       struct geni_wrapper *wrapper = se->wrapper;
>> +
>> +       if (iova)
>> +               dma_unmap_single(wrapper->dev, iova, len, DMA_FROM_DEVICE);
>> +}
>> +EXPORT_SYMBOL(geni_se_rx_dma_unprep);
> 
> Instead of having the functions exported, could we set the dma_ops on
> all child devices of the wrapper that this driver populates and then
> implement the DMA ops for those devices here? I assume that there's
> never another DMA master between the wrapper and the serial engine, so I
> think it would work.
This suggestion looks like it will work.
> 
>> +
>> +static int geni_se_probe(struct platform_device *pdev)
>> +{
>> +       struct device *dev = &pdev->dev;
>> +       struct resource *res;
>> +       struct geni_wrapper *wrapper;
>> +       int ret;
>> +
>> +       wrapper = devm_kzalloc(dev, sizeof(*wrapper), GFP_KERNEL);
>> +       if (!wrapper)
>> +               return -ENOMEM;
>> +
>> +       wrapper->dev = dev;
>> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +       wrapper->base = devm_ioremap_resource(dev, res);
>> +       if (IS_ERR(wrapper->base)) {
>> +               dev_err(dev, "%s: Error mapping the resource\n", __func__);
> 
> Drop error message, devm_ioremap_resource() already does it.
Ok.
> 
>> +               return -EFAULT;
> 
> return PTR_ERR(wrapper->base);
> 
>> +       }
>> +
>> +       wrapper->ahb_clks[0].id = m_ahb_clk;
>> +       wrapper->ahb_clks[1].id = s_ahb_clk;
>> +       ret = devm_clk_bulk_get(dev, NUM_AHB_CLKS, wrapper->ahb_clks);
>> +       if (ret) {
>> +               dev_err(dev, "Err getting AHB clks %d\n", ret);
>> +               return ret;
>> +       }
>> +
>> +       mutex_init(&wrapper->lock);
>> +       dev_set_drvdata(dev, wrapper);
>> +       dev_dbg(dev, "GENI SE Driver probed\n");
>> +       return devm_of_platform_populate(dev);
>> +}
>> +
>> +static int geni_se_remove(struct platform_device *pdev)
>> +{
>> +       struct device *dev = &pdev->dev;
>> +       struct geni_wrapper *wrapper = dev_get_drvdata(dev);
>> +
>> +       kfree(wrapper->clk_perf_tbl);
> 
> Why not devm_kzalloc() this?
I will check it.
> 
>> +       return 0;
>> +}
>> +
>> +static const struct of_device_id geni_se_dt_match[] = {
>> +       { .compatible = "qcom,geni-se-qup", },
>> +       {}
>> +};
>> +MODULE_DEVICE_TABLE(of, geni_se_dt_match);
>> +
>> +static struct platform_driver geni_se_driver = {
>> +       .driver = {
>> +               .name = "geni_se_qup",
>> +               .of_match_table = geni_se_dt_match,
>> +       },
>> +       .probe = geni_se_probe,
>> +       .remove = geni_se_remove,
>> +};
>> +module_platform_driver(geni_se_driver);
>> +
>> +MODULE_DESCRIPTION("GENI Serial Engine Driver");
>> +MODULE_LICENSE("GPL v2");
>> diff --git a/include/linux/qcom-geni-se.h b/include/linux/qcom-geni-se.h
>> new file mode 100644
>> index 0000000..4996de7
>> --- /dev/null
>> +++ b/include/linux/qcom-geni-se.h
>> @@ -0,0 +1,247 @@
>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
>> +/*
>> + * Copyright (c) 2017-2018, The Linux Foundation. All rights reserved.
>> + */
>> +
>> +#ifndef _LINUX_QCOM_GENI_SE
>> +#define _LINUX_QCOM_GENI_SE
>> +#include <linux/clk.h>
> 
> Please forward declare struct clk and drop this include here.
Ok.
> 
>> +#include <linux/dma-direction.h>
> 
> Drop?
There was a map function which got dropped in v3 patch series. I will 
drop including this header file.
> 
>> +
>> +/* Transfer mode supported by GENI Serial Engines */
>> +enum geni_se_xfer_mode {
>> +       GENI_SE_INVALID,
>> +       GENI_SE_FIFO,
>> +       GENI_SE_DMA,
>> +};
>> +
>> +/* Protocols supported by GENI Serial Engines */
>> +enum geni_se_protocol_types {
>> +       GENI_SE_NONE,
>> +       GENI_SE_SPI,
>> +       GENI_SE_UART,
>> +       GENI_SE_I2C,
>> +       GENI_SE_I3C,
>> +};
>> +
>> +/**
>> + * struct geni_se - GENI Serial Engine
>> + * @base:              Base Address of the Serial Engine's register block.
>> + * @dev:               Pointer to the Serial Engine device.
>> + * @wrapper:           Pointer to the parent QUP Wrapper core.
>> + * @clk:               Handle to the core serial engine clock.
>> + */
>> +struct geni_se {
>> +       void __iomem *base;
>> +       struct device *dev;
>> +       void *wrapper;
> 
> Can this get the geni_wrapper type? It could be opaque if you like.
I am not sure if it is ok to have the children know the details of the 
parent. That is why it is kept as opaque.
> 
>> +       struct clk *clk;
>> +};
>> +
Regards,
Karthik.
Karthikeyan Ramasubramanian March 6, 2018, 12:51 a.m. UTC | #6
On 3/2/2018 3:11 PM, Stephen Boyd wrote:
> Quoting Karthikeyan Ramasubramanian (2018-02-27 17:38:09)
>> diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
>> index 3682fd3..c6b1500 100644
>> --- a/drivers/tty/serial/Kconfig
>> +++ b/drivers/tty/serial/Kconfig
>> @@ -1104,6 +1104,17 @@ config SERIAL_MSM_CONSOLE
>>          select SERIAL_CORE_CONSOLE
>>          select SERIAL_EARLYCON
>>   
>> +config SERIAL_QCOM_GENI
>> +       bool "QCOM on-chip GENI based serial port support"
> 
> This can be tristate.
> 
>> +       depends on ARCH_QCOM
> 
> || COMPILE_TEST
> ?
Ok.
> 
>> +       depends on QCOM_GENI_SE
>> +       select SERIAL_CORE
> 
> This can stay.
> 
>> +       select SERIAL_CORE_CONSOLE
>> +       select SERIAL_EARLYCON
> 
> These two can go to a new config option, like SERIAL_QCOM_GENI_CONSOLE,
> and that would be bool. Please take a look at the existing SERIAL_MSM
> and SERIAL_MSM_CONSOLE setup to understand how to do it.
Ok.
> 
>> +       help
>> +         Serial driver for Qualcomm Technologies Inc's GENI based QUP
>> +         hardware.
>> +
>>   config SERIAL_VT8500
>>          bool "VIA VT8500 on-chip serial port support"
>>          depends on ARCH_VT8500
>> diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
>> new file mode 100644
>> index 0000000..8536b7d
>> --- /dev/null
>> +++ b/drivers/tty/serial/qcom_geni_serial.c
>> @@ -0,0 +1,1181 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright (c) 2017-2018, The Linux foundation. All rights reserved.
>> +
>> +#include <linux/console.h>
>> +#include <linux/io.h>
>> +#include <linux/iopoll.h>
>> +#include <linux/module.h>
>> +#include <linux/of.h>
>> +#include <linux/of_device.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/qcom-geni-se.h>
>> +#include <linux/serial.h>
>> +#include <linux/serial_core.h>
>> +#include <linux/slab.h>
>> +#include <linux/tty.h>
>> +#include <linux/tty_flip.h>
>> +
>> +/* UART specific GENI registers */
>> +#define SE_UART_TX_TRANS_CFG           0x25c
>> +#define SE_UART_TX_WORD_LEN            0x268
>> +#define SE_UART_TX_STOP_BIT_LEN                0x26c
>> +#define SE_UART_TX_TRANS_LEN           0x270
>> +#define SE_UART_RX_TRANS_CFG           0x280
>> +#define SE_UART_RX_WORD_LEN            0x28c
>> +#define SE_UART_RX_STALE_CNT           0x294
>> +#define SE_UART_TX_PARITY_CFG          0x2a4
>> +#define SE_UART_RX_PARITY_CFG          0x2a8
>> +
>> +/* SE_UART_TRANS_CFG */
>> +#define UART_TX_PAR_EN         BIT(0)
>> +#define UART_CTS_MASK          BIT(1)
>> +
>> +/* SE_UART_TX_WORD_LEN */
>> +#define TX_WORD_LEN_MSK                GENMASK(9, 0)
>> +
>> +/* SE_UART_TX_STOP_BIT_LEN */
>> +#define TX_STOP_BIT_LEN_MSK    GENMASK(23, 0)
>> +#define TX_STOP_BIT_LEN_1      0
>> +#define TX_STOP_BIT_LEN_1_5    1
>> +#define TX_STOP_BIT_LEN_2      2
>> +
>> +/* SE_UART_TX_TRANS_LEN */
>> +#define TX_TRANS_LEN_MSK       GENMASK(23, 0)
>> +
>> +/* SE_UART_RX_TRANS_CFG */
>> +#define UART_RX_INS_STATUS_BIT BIT(2)
>> +#define UART_RX_PAR_EN         BIT(3)
>> +
>> +/* SE_UART_RX_WORD_LEN */
>> +#define RX_WORD_LEN_MASK       GENMASK(9, 0)
>> +
>> +/* SE_UART_RX_STALE_CNT */
>> +#define RX_STALE_CNT           GENMASK(23, 0)
>> +
>> +/* SE_UART_TX_PARITY_CFG/RX_PARITY_CFG */
>> +#define PAR_CALC_EN            BIT(0)
>> +#define PAR_MODE_MSK           GENMASK(2, 1)
>> +#define PAR_MODE_SHFT          1
>> +#define PAR_EVEN               0x00
>> +#define PAR_ODD                        0x01
>> +#define PAR_SPACE              0x10
>> +#define PAR_MARK               0x11
>> +
>> +/* UART M_CMD OP codes */
>> +#define UART_START_TX          0x1
>> +#define UART_START_BREAK       0x4
>> +#define UART_STOP_BREAK                0x5
>> +/* UART S_CMD OP codes */
>> +#define UART_START_READ                0x1
>> +#define UART_PARAM             0x1
>> +
>> +#define UART_OVERSAMPLING      32
>> +#define STALE_TIMEOUT          16
>> +#define DEFAULT_BITS_PER_CHAR  10
>> +#define GENI_UART_CONS_PORTS   1
>> +#define DEF_FIFO_DEPTH_WORDS   16
>> +#define DEF_TX_WM              2
>> +#define DEF_FIFO_WIDTH_BITS    32
>> +#define UART_CONSOLE_RX_WM     2
>> +
>> +#ifdef CONFIG_CONSOLE_POLL
>> +#define RX_BYTES_PW 1
>> +#else
>> +#define RX_BYTES_PW 4
>> +#endif
>> +
>> +struct qcom_geni_serial_port {
>> +       struct uart_port uport;
>> +       struct geni_se se;
>> +       char name[20];
>> +       u32 tx_fifo_depth;
>> +       u32 tx_fifo_width;
>> +       u32 rx_fifo_depth;
>> +       u32 tx_wm;
>> +       u32 rx_wm;
>> +       u32 rx_rfr;
>> +       int xfer_mode;
> 
> Can this be an enum?
Ok.
> 
>> +       bool port_setup;
> 
> Maybe just 'setup'? Port is in the type already.
Ok.
> 
>> +       int (*handle_rx)(struct uart_port *uport,
>> +                               u32 rx_bytes, bool drop_rx);
> 
> s/rx_bytes/bytes/
> s/drop_rx/drop/
Ok.
> 
>> +       unsigned int xmit_size;
>> +       unsigned int cur_baud;
> 
> s/cur//
Ok.
> 
>> +       unsigned int tx_bytes_pw;
>> +       unsigned int rx_bytes_pw;
>> +};
>> +
>> +static const struct uart_ops qcom_geni_serial_pops;
>> +static struct uart_driver qcom_geni_console_driver;
>> +static int handle_rx_console(struct uart_port *uport,
>> +                               u32 rx_bytes, bool drop_rx);
> 
> s/rx_bytes/bytes/
> s/drop_rx/drop/
Ok.
> 
>> +static unsigned int qcom_geni_serial_tx_empty(struct uart_port *port);
>> +static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
>> +                               int offset, int bit_field, bool set);
> 
> No need to forward declare this?
I will check and remove if not required.
> 
> s/bit_// ?
Ok.
> 
>> +static void qcom_geni_serial_stop_rx(struct uart_port *uport);
>> +
>> +static atomic_t uart_line_id = ATOMIC_INIT(0);
> 
> Do we need this? How about rely on DT to always have aliases instead?
> Given we only have one port I don't actually understand how this is
> supposed to work anyway.
Ok. I will remove it and rely on DT always having alias.
> 
>> +static const unsigned long root_freq[] = {7372800, 14745600, 19200000, 29491200,
>> +                                       32000000, 48000000, 64000000, 80000000,
>> +                                       96000000, 100000000};
>> +
>> +#define to_dev_port(ptr, member) \
>> +               container_of(ptr, struct qcom_geni_serial_port, member)
>> +
>> +static struct qcom_geni_serial_port qcom_geni_console_port;
> 
> Why singleton? Couldn't there be many?
Our current use-case does not need more than one instance. But more 
instances can be added if desired.
> 
>> +
>> +static void qcom_geni_serial_config_port(struct uart_port *uport, int cfg_flags)
>> +{
>> +       if (cfg_flags & UART_CONFIG_TYPE)
>> +               uport->type = PORT_MSM;
>> +}
>> +
>> +static unsigned int qcom_geni_cons_get_mctrl(struct uart_port *uport)
>> +{
>> +       return TIOCM_DSR | TIOCM_CAR | TIOCM_CTS;
>> +}
>> +
>> +static void qcom_geni_cons_set_mctrl(struct uart_port *uport,
>> +                                                       unsigned int mctrl)
>> +{
>> +}
>> +
>> +static const char *qcom_geni_serial_get_type(struct uart_port *uport)
>> +{
>> +       return "MSM";
>> +}
>> +
>> +static struct qcom_geni_serial_port *get_port_from_line(int line)
>> +{
>> +       if ((line < 0) || (line >= GENI_UART_CONS_PORTS))
> 
> Drop useless parenthesis please.
Ok.
> 
>> +               return ERR_PTR(-ENXIO);
>> +       return &qcom_geni_console_port;
>> +}
>> +
>> +static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
>> +                               int offset, int bit_field, bool set)
>> +{
>> +       u32 reg;
>> +       struct qcom_geni_serial_port *port;
>> +       unsigned int baud;
>> +       unsigned int fifo_bits;
>> +       unsigned long timeout_us = 20000;
>> +
>> +       /* Ensure polling is not re-ordered before the prior writes/reads */
>> +       mb();
>> +
>> +       if (uport->private_data) {
>> +               port = to_dev_port(uport, uport);
>> +               baud = port->cur_baud;
>> +               if (!baud)
>> +                       baud = 115200;
>> +               fifo_bits = port->tx_fifo_depth * port->tx_fifo_width;
>> +               /*
>> +                * Total polling iterations based on FIFO worth of bytes to be
>> +                * sent at current baud .Add a little fluff to the wait.
> 
> Bad space here                           ^
> 
I will fix it.
>> +                */
>> +               timeout_us = ((fifo_bits * USEC_PER_SEC) / baud) + 500;
>> +       }
>> +
>> +       return !readl_poll_timeout_atomic(uport->membase + offset, reg,
>> +                        (bool)(reg & bit_field) == set, 10, timeout_us);
>> +}
>> +
>> +static void qcom_geni_serial_setup_tx(struct uart_port *uport, u32 xmit_size)
>> +{
>> +       u32 m_cmd;
>> +
>> +       writel_relaxed(xmit_size, uport->membase + SE_UART_TX_TRANS_LEN);
>> +       m_cmd = UART_START_TX << M_OPCODE_SHFT;
>> +       writel(m_cmd, uport->membase + SE_GENI_M_CMD0);
>> +}
>> +
>> +static void qcom_geni_serial_poll_tx_done(struct uart_port *uport)
>> +{
>> +       int done;
>> +       u32 irq_clear = M_CMD_DONE_EN;
>> +
>> +       done = qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
>> +                                               M_CMD_DONE_EN, true);
>> +       if (!done) {
>> +               writel_relaxed(M_GENI_CMD_ABORT, uport->membase +
>> +                                               SE_GENI_M_CMD_CTRL_REG);
>> +               irq_clear |= M_CMD_ABORT_EN;
>> +               qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
>> +                                                       M_CMD_ABORT_EN, true);
>> +       }
>> +       writel_relaxed(irq_clear, uport->membase + SE_GENI_M_IRQ_CLEAR);
>> +}
>> +
>> +static void qcom_geni_serial_abort_rx(struct uart_port *uport)
>> +{
>> +       u32 irq_clear = S_CMD_DONE_EN | S_CMD_ABORT_EN;
>> +
>> +       writel(S_GENI_CMD_ABORT, uport->membase + SE_GENI_S_CMD_CTRL_REG);
>> +       qcom_geni_serial_poll_bit(uport, SE_GENI_S_CMD_CTRL_REG,
>> +                                       S_GENI_CMD_ABORT, false);
>> +       writel_relaxed(irq_clear, uport->membase + SE_GENI_S_IRQ_CLEAR);
>> +       writel_relaxed(FORCE_DEFAULT, uport->membase + GENI_FORCE_DEFAULT_REG);
>> +}
>> +
>> +#ifdef CONFIG_CONSOLE_POLL
>> +static int qcom_geni_serial_get_char(struct uart_port *uport)
>> +{
>> +       u32 rx_fifo;
>> +       u32 status;
>> +
>> +       status = readl_relaxed(uport->membase + SE_GENI_M_IRQ_STATUS);
>> +       writel_relaxed(status, uport->membase + SE_GENI_M_IRQ_CLEAR);
>> +
>> +       status = readl_relaxed(uport->membase + SE_GENI_S_IRQ_STATUS);
>> +       writel_relaxed(status, uport->membase + SE_GENI_S_IRQ_CLEAR);
>> +
>> +       /*
>> +        * Ensure the writes to clear interrupts is not re-ordered after
>> +        * reading the data.
>> +        */
>> +       mb();
>> +
>> +       status = readl_relaxed(uport->membase + SE_GENI_RX_FIFO_STATUS);
>> +       if (!(status & RX_FIFO_WC_MSK))
>> +               return NO_POLL_CHAR;
>> +
>> +       rx_fifo = readl(uport->membase + SE_GENI_RX_FIFOn);
>> +       return rx_fifo & 0xff;
>> +}
>> +
>> +static void qcom_geni_serial_poll_put_char(struct uart_port *uport,
>> +                                                       unsigned char c)
>> +{
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +
>> +       writel_relaxed(port->tx_wm, uport->membase + SE_GENI_TX_WATERMARK_REG);
>> +       qcom_geni_serial_setup_tx(uport, 1);
>> +       WARN_ON(!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
>> +                                               M_TX_FIFO_WATERMARK_EN, true));
>> +       writel_relaxed((u32)c, uport->membase + SE_GENI_TX_FIFOn);
> 
> Drop useless cast.
Ok.
> 
>> +       writel_relaxed(M_TX_FIFO_WATERMARK_EN, uport->membase +
>> +                                                       SE_GENI_M_IRQ_CLEAR);
>> +       qcom_geni_serial_poll_tx_done(uport);
>> +}
>> +#endif
>> +
>> +#if defined(CONFIG_SERIAL_CORE_CONSOLE) || defined(CONFIG_CONSOLE_POLL)
>> +static void qcom_geni_serial_wr_char(struct uart_port *uport, int ch)
>> +{
>> +       writel_relaxed(ch, uport->membase + SE_GENI_TX_FIFOn);
>> +}
>> +
>> +static void
>> +__qcom_geni_serial_console_write(struct uart_port *uport, const char *s,
>> +                                unsigned int count)
>> +{
>> +       int new_line = 0;
> 
> Drop
Ok.
> 
>> +       int i;
>> +       u32 bytes_to_send = count;
>> +
>> +       for (i = 0; i < count; i++) {
>> +               if (s[i] == '\n')
>> +                       new_line++;
> 
> bytes_to_send++;
Ok.
> 
>> +       }
>> +
>> +       bytes_to_send += new_line;
> 
> Drop.
Ok.
> 
>> +       writel_relaxed(DEF_TX_WM, uport->membase + SE_GENI_TX_WATERMARK_REG);
>> +       qcom_geni_serial_setup_tx(uport, bytes_to_send);
>> +       i = 0;
>> +       while (i < count) {
> 
> 	for (i = 0; i < count; ) {
> 
> would be more normal, but ok.
> 
>> +               size_t chars_to_write = 0;
>> +               size_t avail = DEF_FIFO_DEPTH_WORDS - DEF_TX_WM;
>> +
>> +               /*
>> +                * If the WM bit never set, then the Tx state machine is not
>> +                * in a valid state, so break, cancel/abort any existing
>> +                * command. Unfortunately the current data being written is
>> +                * lost.
>> +                */
>> +               while (!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
>> +                                               M_TX_FIFO_WATERMARK_EN, true))
> 
> Does this ever timeout? So many nested while loops makes it hard to
> follow.
Yes. Based on the baud rate of 115200 and the FIFO Depth & Width of (16 
* 32), the poll should not take more than 5 ms under the timeout scenario.
> 
>> +                       break;
>> +               chars_to_write = min_t(size_t, (size_t)(count - i), avail / 2);
>> +               uart_console_write(uport, (s + i), chars_to_write,
> 
> Drop useless parenthesis please.
Ok.
> 
>> +                                               qcom_geni_serial_wr_char);
>> +               writel_relaxed(M_TX_FIFO_WATERMARK_EN, uport->membase +
>> +                                                       SE_GENI_M_IRQ_CLEAR);
>> +               i += chars_to_write;
>> +       }
>> +       qcom_geni_serial_poll_tx_done(uport);
>> +}
>> +
>> +static void qcom_geni_serial_console_write(struct console *co, const char *s,
>> +                             unsigned int count)
>> +{
>> +       struct uart_port *uport;
>> +       struct qcom_geni_serial_port *port;
>> +       bool locked = true;
>> +       unsigned long flags;
>> +
>> +       WARN_ON(co->index < 0 || co->index >= GENI_UART_CONS_PORTS);
>> +
>> +       port = get_port_from_line(co->index);
>> +       if (IS_ERR(port))
>> +               return;
>> +
>> +       uport = &port->uport;
>> +       if (oops_in_progress)
>> +               locked = spin_trylock_irqsave(&uport->lock, flags);
>> +       else
>> +               spin_lock_irqsave(&uport->lock, flags);
>> +
>> +       if (locked) {
>> +               __qcom_geni_serial_console_write(uport, s, count);
> 
> So if oops is in progress, and we didn't lock here, we don't output
> data? I'd think we would always want to write to the fifo, just make the
> lock grab/release conditional.
If we fail to grab the lock, then there is another active writer. So 
trying to write to the fifo will put the hardware in bad state because 
writer has programmed the hardware to write 'x' number of words and this 
thread will over-write it with 'y' number of words. Also the data that 
you might see in the console might be garbled.
> 
>> +               spin_unlock_irqrestore(&uport->lock, flags);
>> +       }
>> +}
>> +
>> +static int handle_rx_console(struct uart_port *uport, u32 rx_bytes, bool drop)
>> +{
>> +       u32 i = rx_bytes;
>> +       u32 rx_fifo;
>> +       unsigned char *buf;
>> +       struct tty_port *tport;
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +
>> +       tport = &uport->state->port;
>> +       while (i > 0) {
>> +               int c;
>> +               int bytes = i > port->rx_bytes_pw ? port->rx_bytes_pw : i;
>> +
>> +               rx_fifo = readl_relaxed(uport->membase + SE_GENI_RX_FIFOn);
> 
> Please use ioread32_rep(..., 1) here.
Ok.
> 
>> +               i -= bytes;
>> +               if (drop)
>> +                       continue;
>> +               buf = (unsigned char *)&rx_fifo;
> 
> So that this cast becomes unnecessary, and endian agnostic.
Ok.
> 
>> +
>> +               for (c = 0; c < bytes; c++) {
>> +                       int sysrq;
>> +
>> +                       uport->icount.rx++;
>> +                       sysrq = uart_handle_sysrq_char(uport, buf[c]);
> 
> And so this does the right thing in whatever world we live in.
Ok.
> 
>> +                       if (!sysrq)
>> +                               tty_insert_flip_char(tport, buf[c], TTY_NORMAL);
>> +               }
>> +       }
>> +       if (!drop)
>> +               tty_flip_buffer_push(tport);
>> +       return 0;
>> +}
>> +#else
>> +static int handle_rx_console(struct uart_port *uport,
>> +                       unsigned int rx_fifo_wc,
>> +                       unsigned int rx_last_byte_valid,
>> +                       unsigned int rx_last,
>> +                       bool drop_rx)
>> +{
>> +       return -EPERM;
>> +}
>> +
>> +#endif /* (CONFIG_SERIAL_CORE_CONSOLE) || defined(CONFIG_CONSOLE_POLL)) */
>> +
>> +static void qcom_geni_serial_start_tx(struct uart_port *uport)
>> +{
>> +       u32 irq_en;
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +       u32 status;
>> +
>> +       if (port->xfer_mode == GENI_SE_FIFO) {
>> +               status = readl_relaxed(uport->membase + SE_GENI_STATUS);
>> +               if (status & M_GENI_CMD_ACTIVE)
>> +                       return;
>> +
>> +               if (!qcom_geni_serial_tx_empty(uport))
>> +                       return;
>> +
>> +               /*
>> +                * Ensure writing to IRQ_EN & watermark registers are not
>> +                * re-ordered before checking the status of the Serial
>> +                * Engine and TX FIFO
>> +                */
>> +               mb();
>> +
>> +               irq_en = readl_relaxed(uport->membase + SE_GENI_M_IRQ_EN);
>> +               irq_en |= M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN;
>> +
>> +               writel_relaxed(port->tx_wm, uport->membase +
>> +                                               SE_GENI_TX_WATERMARK_REG);
>> +               writel_relaxed(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
>> +       }
>> +}
>> +
>> +static void qcom_geni_serial_stop_tx(struct uart_port *uport)
>> +{
>> +       u32 irq_en;
>> +       u32 status;
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +
>> +       irq_en = readl_relaxed(uport->membase + SE_GENI_M_IRQ_EN);
>> +       irq_en &= ~M_CMD_DONE_EN;
>> +       if (port->xfer_mode == GENI_SE_FIFO) {
>> +               irq_en &= ~M_TX_FIFO_WATERMARK_EN;
>> +               writel_relaxed(0, uport->membase +
>> +                                    SE_GENI_TX_WATERMARK_REG);
>> +       }
>> +       port->xmit_size = 0;
>> +       writel_relaxed(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
>> +       status = readl_relaxed(uport->membase + SE_GENI_STATUS);
>> +       /* Possible stop tx is called multiple times. */
>> +       if (!(status & M_GENI_CMD_ACTIVE))
>> +               return;
>> +
>> +       /*
>> +        * Ensure cancel command write is not re-ordered before checking
>> +        * checking the status of the Primary Sequencer.
>> +        */
>> +       mb();
>> +
>> +       geni_se_cancel_m_cmd(&port->se);
>> +       if (!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
>> +                                               M_CMD_CANCEL_EN, true)) {
>> +               geni_se_abort_m_cmd(&port->se);
>> +               qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
>> +                                               M_CMD_ABORT_EN, true);
>> +               writel_relaxed(M_CMD_ABORT_EN, uport->membase +
>> +                                                       SE_GENI_M_IRQ_CLEAR);
>> +       }
>> +       writel_relaxed(M_CMD_CANCEL_EN, uport->membase + SE_GENI_M_IRQ_CLEAR);
>> +}
>> +
>> +static void qcom_geni_serial_start_rx(struct uart_port *uport)
>> +{
>> +       u32 irq_en;
>> +       u32 status;
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +
>> +       status = readl_relaxed(uport->membase + SE_GENI_STATUS);
>> +       if (status & S_GENI_CMD_ACTIVE)
>> +               qcom_geni_serial_stop_rx(uport);
>> +
>> +       /*
>> +        * Ensure setup command write is not re-ordered before checking
>> +        * checking the status of the Secondary Sequencer.
>> +        */
>> +       mb();
>> +
>> +       geni_se_setup_s_cmd(&port->se, UART_START_READ, 0);
>> +
>> +       if (port->xfer_mode == GENI_SE_FIFO) {
>> +               irq_en = readl_relaxed(uport->membase + SE_GENI_S_IRQ_EN);
>> +               irq_en |= S_RX_FIFO_WATERMARK_EN | S_RX_FIFO_LAST_EN;
>> +               writel_relaxed(irq_en, uport->membase + SE_GENI_S_IRQ_EN);
>> +
>> +               irq_en = readl_relaxed(uport->membase + SE_GENI_M_IRQ_EN);
>> +               irq_en |= M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN;
>> +               writel_relaxed(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
>> +       }
>> +}
>> +
>> +static void qcom_geni_serial_stop_rx(struct uart_port *uport)
>> +{
>> +       u32 irq_en;
>> +       u32 status;
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +       u32 irq_clear = S_CMD_DONE_EN;
>> +
>> +       if (port->xfer_mode == GENI_SE_FIFO) {
>> +               irq_en = readl_relaxed(uport->membase + SE_GENI_S_IRQ_EN);
>> +               irq_en &= ~(S_RX_FIFO_WATERMARK_EN | S_RX_FIFO_LAST_EN);
>> +               writel_relaxed(irq_en, uport->membase + SE_GENI_S_IRQ_EN);
>> +
>> +               irq_en = readl_relaxed(uport->membase + SE_GENI_M_IRQ_EN);
>> +               irq_en &= ~(M_RX_FIFO_WATERMARK_EN | M_RX_FIFO_LAST_EN);
>> +               writel_relaxed(irq_en, uport->membase + SE_GENI_M_IRQ_EN);
>> +       }
>> +
>> +       status = readl_relaxed(uport->membase + SE_GENI_STATUS);
>> +       /* Possible stop rx is called multiple times. */
>> +       if (!(status & S_GENI_CMD_ACTIVE))
>> +               return;
>> +
>> +       /*
>> +        * Ensure cancel command write is not re-ordered before checking
>> +        * checking the status of the Secondary Sequencer.
> 
> Each of these comments has 'checking' twice.
I will fix the comments.
> 
>> +        */
>> +       mb();
>> +
>> +       geni_se_cancel_s_cmd(&port->se);
>> +       qcom_geni_serial_poll_bit(uport, SE_GENI_S_CMD_CTRL_REG,
>> +                                       S_GENI_CMD_CANCEL, false);
>> +       status = readl_relaxed(uport->membase + SE_GENI_STATUS);
>> +       writel_relaxed(irq_clear, uport->membase + SE_GENI_S_IRQ_CLEAR);
>> +       if (status & S_GENI_CMD_ACTIVE)
>> +               qcom_geni_serial_abort_rx(uport);
>> +}
>> +
>> +static void qcom_geni_serial_handle_rx(struct uart_port *uport, bool drop_rx)
> 
> s/drop_rx/drop/
Ok.
> 
>> +{
>> +       u32 status;
>> +       u32 word_cnt;
>> +       u32 last_word_byte_cnt;
>> +       u32 last_word_partial;
>> +       u32 total_bytes;
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +
>> +       status = readl_relaxed(uport->membase + SE_GENI_RX_FIFO_STATUS);
>> +       word_cnt = status & RX_FIFO_WC_MSK;
>> +       last_word_partial = status & RX_LAST;
>> +       last_word_byte_cnt = (status & RX_LAST_BYTE_VALID_MSK) >>
>> +                                               RX_LAST_BYTE_VALID_SHFT;
>> +
>> +       if (!word_cnt)
>> +               return;
>> +       total_bytes = port->rx_bytes_pw * (word_cnt - 1);
>> +       if (last_word_partial && last_word_byte_cnt)
>> +               total_bytes += last_word_byte_cnt;
>> +       else
>> +               total_bytes += port->rx_bytes_pw;
>> +       port->handle_rx(uport, total_bytes, drop_rx);
>> +}
>> +
>> +static int qcom_geni_serial_handle_tx(struct uart_port *uport)
>> +{
>> +       int ret = 0;
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +       struct circ_buf *xmit = &uport->state->xmit;
>> +       size_t avail;
>> +       size_t remaining;
>> +       int i = 0;
>> +       u32 status;
>> +       unsigned int chunk;
>> +       int tail;
>> +
>> +       chunk = uart_circ_chars_pending(xmit);
>> +       status = readl_relaxed(uport->membase + SE_GENI_TX_FIFO_STATUS);
>> +       /* Both FIFO and framework buffer are drained */
>> +       if ((chunk == port->xmit_size) && !status) {
> 
> Drop useless parenthesis.
Ok.
> 
>> +               port->xmit_size = 0;
>> +               uart_circ_clear(xmit);
>> +               qcom_geni_serial_stop_tx(uport);
>> +               goto out_write_wakeup;
>> +       }
>> +       chunk -= port->xmit_size;
>> +
>> +       avail = (port->tx_fifo_depth - port->tx_wm) * port->tx_bytes_pw;
>> +       tail = (xmit->tail + port->xmit_size) & (UART_XMIT_SIZE - 1);
>> +       if (chunk > (UART_XMIT_SIZE - tail))
>> +               chunk = UART_XMIT_SIZE - tail;
>> +       if (chunk > avail)
>> +               chunk = avail;
>> +
>> +       if (!chunk)
>> +               goto out_write_wakeup;
>> +
>> +       qcom_geni_serial_setup_tx(uport, chunk);
>> +
>> +       remaining = chunk;
>> +       while (i < chunk) {
> 
> 	for (i = 0; i < chunk; ) {
Ok.
> 
>> +               unsigned int tx_bytes;
>> +               unsigned int buf = 0;
>> +               int c;
>> +
>> +               tx_bytes = min_t(size_t, remaining, (size_t)port->tx_bytes_pw);
>> +               for (c = 0; c < tx_bytes ; c++)
>> +                       buf |= (xmit->buf[tail + c] << (c * BITS_PER_BYTE));
>> +
>> +               writel_relaxed(buf, uport->membase + SE_GENI_TX_FIFOn);
>> +
>> +               i += tx_bytes;
>> +               tail = (tail + tx_bytes) & (UART_XMIT_SIZE - 1);
>> +               uport->icount.tx += tx_bytes;
>> +               remaining -= tx_bytes;
>> +       }
>> +       qcom_geni_serial_poll_tx_done(uport);
>> +       port->xmit_size += chunk;
>> +out_write_wakeup:
>> +       uart_write_wakeup(uport);
>> +       return ret;
>> +}
>> +
>> +static irqreturn_t qcom_geni_serial_isr(int isr, void *dev)
>> +{
>> +       unsigned int m_irq_status;
>> +       unsigned int s_irq_status;
>> +       struct uart_port *uport = dev;
>> +       unsigned long flags;
>> +       unsigned int m_irq_en;
>> +       bool drop_rx = false;
>> +       struct tty_port *tport = &uport->state->port;
>> +
>> +       if (uport->suspended)
>> +               return IRQ_HANDLED;
>> +
>> +       spin_lock_irqsave(&uport->lock, flags);
>> +       m_irq_status = readl_relaxed(uport->membase + SE_GENI_M_IRQ_STATUS);
>> +       s_irq_status = readl_relaxed(uport->membase + SE_GENI_S_IRQ_STATUS);
>> +       m_irq_en = readl_relaxed(uport->membase + SE_GENI_M_IRQ_EN);
>> +       writel_relaxed(m_irq_status, uport->membase + SE_GENI_M_IRQ_CLEAR);
>> +       writel_relaxed(s_irq_status, uport->membase + SE_GENI_S_IRQ_CLEAR);
>> +
>> +       if (WARN_ON(m_irq_status & M_ILLEGAL_CMD_EN))
>> +               goto out_unlock;
>> +
>> +       if (s_irq_status & S_RX_FIFO_WR_ERR_EN) {
>> +               uport->icount.overrun++;
>> +               tty_insert_flip_char(tport, 0, TTY_OVERRUN);
>> +       }
>> +
>> +       if (m_irq_status & (M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN) &&
>> +           m_irq_en & (M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN))
>> +               qcom_geni_serial_handle_tx(uport);
>> +
>> +       if (s_irq_status & S_GP_IRQ_0_EN || s_irq_status & S_GP_IRQ_1_EN) {
>> +               if (s_irq_status & S_GP_IRQ_0_EN)
>> +                       uport->icount.parity++;
>> +               drop_rx = true;
>> +       } else if (s_irq_status & S_GP_IRQ_2_EN ||
>> +                                       s_irq_status & S_GP_IRQ_3_EN) {
>> +               uport->icount.brk++;
> 
> How does break character handling work? I see the accounting here, but
> don't see any uart_handle_break() call anywhere.
The reason it is not added is because the hardware does not indicate 
when the break character occured. It can be any one of the FIFO words. 
The statistics is updated to give an idea that the break happened. We 
can add uart_handle_break() but it may not be at an accurate position 
for the above mentioned reason.
> 
>> +       }
>> +
>> +       if (s_irq_status & S_RX_FIFO_WATERMARK_EN ||
>> +                                       s_irq_status & S_RX_FIFO_LAST_EN)
>> +               qcom_geni_serial_handle_rx(uport, drop_rx);
>> +
>> +out_unlock:
>> +       spin_unlock_irqrestore(&uport->lock, flags);
>> +       return IRQ_HANDLED;
>> +}
>> +
>> +static int get_tx_fifo_size(struct qcom_geni_serial_port *port)
>> +{
>> +       struct uart_port *uport;
>> +
>> +       if (!port)
>> +               return -ENODEV;
>> +
>> +       uport = &port->uport;
>> +       port->tx_fifo_depth = geni_se_get_tx_fifo_depth(&port->se);
>> +       if (!port->tx_fifo_depth) {
>> +               dev_err(uport->dev, "%s:Invalid TX FIFO depth read\n",
>> +                                                               __func__);
>> +               return -ENXIO;
>> +       }
>> +
>> +       port->tx_fifo_width = geni_se_get_tx_fifo_width(&port->se);
>> +       if (!port->tx_fifo_width) {
>> +               dev_err(uport->dev, "%s:Invalid TX FIFO width read\n",
>> +                                                               __func__);
>> +               return -ENXIO;
>> +       }
>> +
>> +       port->rx_fifo_depth = geni_se_get_rx_fifo_depth(&port->se);
>> +       if (!port->rx_fifo_depth) {
>> +               dev_err(uport->dev, "%s:Invalid RX FIFO depth read\n",
>> +                                                               __func__);
>> +               return -ENXIO;
>> +       }
> 
> Are these checks verifying the hardware has a proper setting for fifo
> depth and width? How is that possible to mess up? Do these ever fail?
We haven't seen a failure yet. I can drop the check and rely on the fact 
that the hardware is programmed correctly.
> 
>> +
>> +       uport->fifosize =
>> +               (port->tx_fifo_depth * port->tx_fifo_width) / BITS_PER_BYTE;
>> +       return 0;
>> +}
>> +
>> +static void set_rfr_wm(struct qcom_geni_serial_port *port)
>> +{
>> +       /*
>> +        * Set RFR (Flow off) to FIFO_DEPTH - 2.
>> +        * RX WM level at 10% RX_FIFO_DEPTH.
>> +        * TX WM level at 10% TX_FIFO_DEPTH.
>> +        */
>> +       port->rx_rfr = port->rx_fifo_depth - 2;
>> +       port->rx_wm = UART_CONSOLE_RX_WM;
>> +       port->tx_wm = 2;
> 
> port->tx_wm = DEF_TX_WM?
Ok.
> 
>> +}
>> +
>> +static void qcom_geni_serial_shutdown(struct uart_port *uport)
>> +{
>> +       unsigned long flags;
>> +
>> +       /* Stop the console before stopping the current tx */
>> +       console_stop(uport->cons);
>> +
>> +       disable_irq(uport->irq);
>> +       free_irq(uport->irq, uport);
>> +       spin_lock_irqsave(&uport->lock, flags);
>> +       qcom_geni_serial_stop_tx(uport);
>> +       qcom_geni_serial_stop_rx(uport);
>> +       spin_unlock_irqrestore(&uport->lock, flags);
>> +}
>> +
>> +static int qcom_geni_serial_port_setup(struct uart_port *uport)
>> +{
>> +       int ret;
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +       unsigned int rxstale = DEFAULT_BITS_PER_CHAR * STALE_TIMEOUT;
>> +
>> +       set_rfr_wm(port);
>> +       writel_relaxed(rxstale, uport->membase + SE_UART_RX_STALE_CNT);
>> +       /*
>> +        * Make an unconditional cancel on the main sequencer to reset
>> +        * it else we could end up in data loss scenarios.
>> +        */
>> +       port->xfer_mode = GENI_SE_FIFO;
>> +       qcom_geni_serial_poll_tx_done(uport);
>> +       geni_se_config_packing(&port->se, BITS_PER_BYTE, port->tx_bytes_pw,
>> +                                               false, true, false);
>> +       geni_se_config_packing(&port->se, BITS_PER_BYTE, port->rx_bytes_pw,
>> +                                               false, false, true);
>> +       ret = geni_se_init(&port->se, port->rx_wm, port->rx_rfr);
>> +       if (ret) {
>> +               dev_err(uport->dev, "%s: Fail\n", __func__);
>> +               return ret;
>> +       }
>> +
>> +       geni_se_select_mode(&port->se, port->xfer_mode);
>> +       port->port_setup = true;
>> +       return ret;
>> +}
>> +
>> +static int qcom_geni_serial_startup(struct uart_port *uport)
>> +{
>> +       int ret;
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +
>> +       scnprintf(port->name, sizeof(port->name),
>> +                 "qcom_serial_geni%d", uport->line);
>> +
>> +       if (geni_se_read_proto(&port->se) != GENI_SE_UART) {
>> +               dev_err(uport->dev, "Invalid FW %d loaded.\n",
>> +                                geni_se_read_proto(&port->se));
> 
> Please don't read proto twice.
Ok.
> 
>> +               return -ENXIO;
>> +       }
>> +
>> +       get_tx_fifo_size(port);
>> +       if (!port->port_setup) {
>> +               ret = qcom_geni_serial_port_setup(uport);
>> +               if (ret)
>> +                       return ret;
>> +       }
>> +
>> +       ret = request_irq(uport->irq, qcom_geni_serial_isr, IRQF_TRIGGER_HIGH,
>> +                                                       port->name, uport);
>> +       if (ret)
>> +               dev_err(uport->dev, "Failed to get IRQ ret %d\n", ret);
>> +       return ret;
>> +}
>> +
>> +static unsigned long get_clk_cfg(unsigned long clk_freq)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < ARRAY_SIZE(root_freq); i++) {
>> +               if (!(root_freq[i] % clk_freq))
>> +                       return root_freq[i];
>> +       }
>> +       return 0;
>> +}
>> +
>> +static void geni_serial_write_term_regs(struct uart_port *uport,
>> +               u32 tx_trans_cfg, u32 tx_parity_cfg, u32 rx_trans_cfg,
>> +               u32 rx_parity_cfg, u32 bits_per_char, u32 stop_bit_len,
>> +               u32 s_clk_cfg)
>> +{
>> +       writel_relaxed(tx_trans_cfg, uport->membase + SE_UART_TX_TRANS_CFG);
>> +       writel_relaxed(tx_parity_cfg, uport->membase + SE_UART_TX_PARITY_CFG);
>> +       writel_relaxed(rx_trans_cfg, uport->membase + SE_UART_RX_TRANS_CFG);
>> +       writel_relaxed(rx_parity_cfg, uport->membase + SE_UART_RX_PARITY_CFG);
>> +       writel_relaxed(bits_per_char, uport->membase + SE_UART_TX_WORD_LEN);
>> +       writel_relaxed(bits_per_char, uport->membase + SE_UART_RX_WORD_LEN);
>> +       writel_relaxed(stop_bit_len, uport->membase + SE_UART_TX_STOP_BIT_LEN);
>> +       writel_relaxed(s_clk_cfg, uport->membase + GENI_SER_M_CLK_CFG);
>> +       writel_relaxed(s_clk_cfg, uport->membase + GENI_SER_S_CLK_CFG);
> 
> Can you please inline this function into the caller and put the writels
> where the values are calculated? It would reduce the mental work to keep
> track of all the variables to find out that they just get written in the
> end. Also, this is weirdly placed in the file when get_clk_div_rate()
> calls get_clk_cfg() but this function is between them.
Bjorn had a similar comment and there I mentioned that the writes are 
required during early console setup as well. Since the popular vote is 
towards inlining these writes, I will update it accordingly.
> 
>> +}
>> +
>> +static unsigned long get_clk_div_rate(unsigned int baud, unsigned int *clk_div)
>> +{
>> +       unsigned long ser_clk;
>> +       unsigned long desired_clk;
>> +
>> +       desired_clk = baud * UART_OVERSAMPLING;
>> +       ser_clk = get_clk_cfg(desired_clk);
>> +       if (!ser_clk) {
>> +               pr_err("%s: Can't find matching DFS entry for baud %d\n",
>> +                                                               __func__, baud);
>> +               return ser_clk;
>> +       }
>> +
>> +       *clk_div = ser_clk / desired_clk;
> 
> How wide can clk_div be? It may be better to implement the ser_clk as an
> actual clk in the common clk framework, and then have the serial driver
> or the i2c driver call clk_set_rate() on that clk and have the CCF
> implementation take care of determining the rate that the parent clk can
> supply and how it can fit it into the frequency that the divider can
> support.
Based on my current expertise with the CCF, I will address this comment 
in a later patchset than the next one.
> 
>> +       return ser_clk;
>> +}
>> +
>> +static void qcom_geni_serial_set_termios(struct uart_port *uport,
>> +                               struct ktermios *termios, struct ktermios *old)
>> +{
>> +       unsigned int baud;
>> +       unsigned int bits_per_char;
>> +       unsigned int tx_trans_cfg;
>> +       unsigned int tx_parity_cfg;
>> +       unsigned int rx_trans_cfg;
>> +       unsigned int rx_parity_cfg;
>> +       unsigned int stop_bit_len;
>> +       unsigned int clk_div;
>> +       unsigned long ser_clk_cfg;
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +       unsigned long clk_rate;
>> +
>> +       qcom_geni_serial_stop_rx(uport);
>> +       /* baud rate */
>> +       baud = uart_get_baud_rate(uport, termios, old, 300, 4000000);
>> +       port->cur_baud = baud;
>> +       clk_rate = get_clk_div_rate(baud, &clk_div);
>> +       if (!clk_rate)
>> +               goto out_restart_rx;
>> +
>> +       uport->uartclk = clk_rate;
>> +       clk_set_rate(port->se.clk, clk_rate);
>> +       ser_clk_cfg = SER_CLK_EN;
>> +       ser_clk_cfg |= (clk_div << CLK_DIV_SHFT);
> 
> Drop useless cast.
I think you mean parenthesis. I do not see casting here.
> 
>> +
>> +       /* parity */
>> +       tx_trans_cfg = readl_relaxed(uport->membase + SE_UART_TX_TRANS_CFG);
>> +       tx_parity_cfg = readl_relaxed(uport->membase + SE_UART_TX_PARITY_CFG);
>> +       rx_trans_cfg = readl_relaxed(uport->membase + SE_UART_RX_TRANS_CFG);
>> +       rx_parity_cfg = readl_relaxed(uport->membase + SE_UART_RX_PARITY_CFG);
>> +       if (termios->c_cflag & PARENB) {
>> +               tx_trans_cfg |= UART_TX_PAR_EN;
>> +               rx_trans_cfg |= UART_RX_PAR_EN;
>> +               tx_parity_cfg |= PAR_CALC_EN;
>> +               rx_parity_cfg |= PAR_CALC_EN;
>> +               if (termios->c_cflag & PARODD) {
>> +                       tx_parity_cfg |= PAR_ODD;
>> +                       rx_parity_cfg |= PAR_ODD;
>> +               } else if (termios->c_cflag & CMSPAR) {
>> +                       tx_parity_cfg |= PAR_SPACE;
>> +                       rx_parity_cfg |= PAR_SPACE;
>> +               } else {
>> +                       tx_parity_cfg |= PAR_EVEN;
>> +                       rx_parity_cfg |= PAR_EVEN;
>> +               }
>> +       } else {
>> +               tx_trans_cfg &= ~UART_TX_PAR_EN;
>> +               rx_trans_cfg &= ~UART_RX_PAR_EN;
>> +               tx_parity_cfg &= ~PAR_CALC_EN;
>> +               rx_parity_cfg &= ~PAR_CALC_EN;
>> +       }
>> +
>> +       /* bits per char */
>> +       switch (termios->c_cflag & CSIZE) {
>> +       case CS5:
>> +               bits_per_char = 5;
>> +               break;
>> +       case CS6:
>> +               bits_per_char = 6;
>> +               break;
>> +       case CS7:
>> +               bits_per_char = 7;
>> +               break;
>> +       case CS8:
>> +       default:
>> +               bits_per_char = 8;
>> +               break;
>> +       }
>> +
>> +       /* stop bits */
>> +       if (termios->c_cflag & CSTOPB)
>> +               stop_bit_len = TX_STOP_BIT_LEN_2;
>> +       else
>> +               stop_bit_len = TX_STOP_BIT_LEN_1;
>> +
>> +       /* flow control, clear the CTS_MASK bit if using flow control. */
>> +       if (termios->c_cflag & CRTSCTS)
>> +               tx_trans_cfg &= ~UART_CTS_MASK;
>> +       else
>> +               tx_trans_cfg |= UART_CTS_MASK;
>> +
>> +       if (baud)
>> +               uart_update_timeout(uport, termios->c_cflag, baud);
>> +
>> +       geni_serial_write_term_regs(uport, tx_trans_cfg, tx_parity_cfg,
>> +               rx_trans_cfg, rx_parity_cfg, bits_per_char, stop_bit_len,
>> +                                                               ser_clk_cfg);
>> +out_restart_rx:
>> +       qcom_geni_serial_start_rx(uport);
>> +}
>> +
>> +static unsigned int qcom_geni_serial_tx_empty(struct uart_port *uport)
>> +{
>> +       return !readl_relaxed(uport->membase + SE_GENI_TX_FIFO_STATUS);
>> +}
>> +
>> +#if defined(CONFIG_SERIAL_CORE_CONSOLE) || defined(CONFIG_CONSOLE_POLL)
>> +static int __init qcom_geni_console_setup(struct console *co, char *options)
>> +{
>> +       struct uart_port *uport;
>> +       struct qcom_geni_serial_port *port;
>> +       int baud;
>> +       int bits = 8;
>> +       int parity = 'n';
>> +       int flow = 'n';
>> +
>> +       if (co->index >= GENI_UART_CONS_PORTS  || co->index < 0)
>> +               return -ENXIO;
>> +
>> +       port = get_port_from_line(co->index);
>> +       if (IS_ERR(port)) {
>> +               pr_err("Invalid line %d(%d)\n", co->index, (int)PTR_ERR(port));
>> +               return PTR_ERR(port);
>> +       }
>> +
>> +       uport = &port->uport;
>> +
>> +       if (unlikely(!uport->membase))
>> +               return -ENXIO;
>> +
>> +       if (geni_se_resources_on(&port->se)) {
>> +               dev_err(port->se.dev, "Error turning on resources\n");
>> +               return -ENXIO;
>> +       }
>> +
>> +       if (unlikely(geni_se_read_proto(&port->se) != GENI_SE_UART)) {
> 
> Looks like we're validating the configuration of the DT here. Maybe this
> can go into the wrapper code and be put behind some DEBUG_KERNEL check
> so we can debug bad bootloader configurations if needed? Especially if
> this is the only API that's left exposed from the wrapper to the serial
> engine/protocol driver.
Ok.
> 
>> +               geni_se_resources_off(&port->se);
>> +               return -ENXIO;
>> +       }
>> +
>> +       if (!port->port_setup) {
>> +               port->tx_bytes_pw = 1;
>> +               port->rx_bytes_pw = RX_BYTES_PW;
>> +               qcom_geni_serial_stop_rx(uport);
>> +               qcom_geni_serial_port_setup(uport);
>> +       }
>> +
>> +       if (options)
>> +               uart_parse_options(options, &baud, &parity, &bits, &flow);
>> +
>> +       return uart_set_options(uport, co, baud, parity, bits, flow);
>> +}
>> +
>> +static int console_register(struct uart_driver *drv)
> 
> __init
Ok.
> 
>> +{
>> +       return uart_register_driver(drv);
>> +}
>> +
>> +static void console_unregister(struct uart_driver *drv)
>> +{
>> +       uart_unregister_driver(drv);
>> +}
>> +
>> +static struct console cons_ops = {
>> +       .name = "ttyMSM",
>> +       .write = qcom_geni_serial_console_write,
>> +       .device = uart_console_device,
>> +       .setup = qcom_geni_console_setup,
>> +       .flags = CON_PRINTBUFFER,
>> +       .index = -1,
>> +       .data = &qcom_geni_console_driver,
>> +};
>> +
>> +static struct uart_driver qcom_geni_console_driver = {
>> +       .owner = THIS_MODULE,
>> +       .driver_name = "qcom_geni_console",
>> +       .dev_name = "ttyMSM",
>> +       .nr =  GENI_UART_CONS_PORTS,
>> +       .cons = &cons_ops,
>> +};
>> +#else
>> +static int console_register(struct uart_driver *drv)
>> +{
>> +       return 0;
>> +}
>> +
>> +static void console_unregister(struct uart_driver *drv)
>> +{
>> +}
>> +#endif /* defined(CONFIG_SERIAL_CORE_CONSOLE) || defined(CONFIG_CONSOLE_POLL) */
>> +
>> +static void qcom_geni_serial_cons_pm(struct uart_port *uport,
>> +               unsigned int new_state, unsigned int old_state)
>> +{
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +
>> +       if (unlikely(!uart_console(uport)))
>> +               return;
>> +
>> +       if (new_state == UART_PM_STATE_ON && old_state == UART_PM_STATE_OFF)
>> +               geni_se_resources_on(&port->se);
>> +       else if (new_state == UART_PM_STATE_OFF &&
>> +                       old_state == UART_PM_STATE_ON)
>> +               geni_se_resources_off(&port->se);
>> +}
>> +
>> +static const struct uart_ops qcom_geni_console_pops = {
>> +       .tx_empty = qcom_geni_serial_tx_empty,
>> +       .stop_tx = qcom_geni_serial_stop_tx,
>> +       .start_tx = qcom_geni_serial_start_tx,
>> +       .stop_rx = qcom_geni_serial_stop_rx,
>> +       .set_termios = qcom_geni_serial_set_termios,
>> +       .startup = qcom_geni_serial_startup,
>> +       .config_port = qcom_geni_serial_config_port,
>> +       .shutdown = qcom_geni_serial_shutdown,
>> +       .type = qcom_geni_serial_get_type,
>> +       .set_mctrl = qcom_geni_cons_set_mctrl,
>> +       .get_mctrl = qcom_geni_cons_get_mctrl,
>> +#ifdef CONFIG_CONSOLE_POLL
>> +       .poll_get_char  = qcom_geni_serial_get_char,
>> +       .poll_put_char  = qcom_geni_serial_poll_put_char,
>> +#endif
>> +       .pm = qcom_geni_serial_cons_pm,
>> +};
>> +
>> +static int qcom_geni_serial_probe(struct platform_device *pdev)
>> +{
>> +       int ret = 0;
>> +       int line = -1;
>> +       struct qcom_geni_serial_port *port;
>> +       struct uart_port *uport;
>> +       struct resource *res;
>> +       struct uart_driver *drv;
>> +
>> +       drv = (void *)of_device_get_match_data(&pdev->dev);
> 
> Useless cast.
There were compiler warnings assigning a const void * to a void *. That 
is why I have the cast in place.
> 
>> +       if (!drv) {
>> +               dev_err(&pdev->dev, "%s: No matching device found", __func__);
>> +               return -ENODEV;
>> +       }
>> +
>> +       if (pdev->dev.of_node)
>> +               line = of_alias_get_id(pdev->dev.of_node, "serial");
>> +       else
>> +               line = pdev->id;
>> +
>> +       if (line < 0)
>> +               line = atomic_inc_return(&uart_line_id) - 1;
>> +
>> +       if ((line < 0) || (line >= GENI_UART_CONS_PORTS))
> 
> Useless parenthesis.
I will drop it.
> 
>> +               return -ENXIO;
>> +       port = get_port_from_line(line);
>> +       if (IS_ERR(port)) {
>> +               ret = PTR_ERR(port);
>> +               dev_err(&pdev->dev, "Invalid line %d(%d)\n", line, ret);
>> +               return ret;
>> +       }
>> +
>> +       uport = &port->uport;
>> +       /* Don't allow 2 drivers to access the same port */
>> +       if (uport->private_data)
>> +               return -ENODEV;
>> +
>> +       uport->dev = &pdev->dev;
>> +       port->se.dev = &pdev->dev;
>> +       port->se.wrapper = dev_get_drvdata(pdev->dev.parent);
>> +       port->se.clk = devm_clk_get(&pdev->dev, "se");
>> +       if (IS_ERR(port->se.clk)) {
>> +               ret = PTR_ERR(port->se.clk);
>> +               dev_err(&pdev->dev, "Err getting SE Core clk %d\n", ret);
>> +               return ret;
>> +       }
>> +
>> +       res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> +       uport->mapbase = res->start;
>> +       uport->membase = devm_ioremap_resource(&pdev->dev, res);
>> +       if (!uport->membase) {
> 
> Check for IS_ERR()
Ok.
> 
>> +               dev_err(&pdev->dev, "Err IO mapping serial iomem");
> 
> No need for error message with devm_ioremap_resource()
Ok.
> 
>> +               return -ENOMEM;
> 
> return PTR_ERR(..)
Ok.
> 
> Also, I see some serial drivers do the mapping when the port is
> requested. That can't be done here?
By "when the port is requested" do you mean console's setup operation. 
It can be done, but given that it is a one-time operation I am not sure 
if it makes any difference between mapping here and there.
> 
>> +       }
>> +       port->se.base = uport->membase;
>> +
>> +       port->tx_fifo_depth = DEF_FIFO_DEPTH_WORDS;
>> +       port->rx_fifo_depth = DEF_FIFO_DEPTH_WORDS;
>> +       port->tx_fifo_width = DEF_FIFO_WIDTH_BITS;
>> +
>> +       uport->irq = platform_get_irq(pdev, 0);
>> +       if (uport->irq < 0) {
>> +               dev_err(&pdev->dev, "Failed to get IRQ %d\n", uport->irq);
>> +               return uport->irq;
>> +       }
>> +
>> +       uport->private_data = drv;
>> +       platform_set_drvdata(pdev, port);
>> +       port->handle_rx = handle_rx_console;
>> +       port->port_setup = false;
>> +       return uart_add_one_port(drv, uport);
>> +}
>> +
>> +static int qcom_geni_serial_remove(struct platform_device *pdev)
>> +{
>> +       struct qcom_geni_serial_port *port = platform_get_drvdata(pdev);
>> +       struct uart_driver *drv = port->uport.private_data;
>> +
>> +       uart_remove_one_port(drv, &port->uport);
>> +       return 0;
>> +}
>> +
>> +static int __maybe_unused qcom_geni_serial_sys_suspend_noirq(struct device *dev)
>> +{
>> +       struct platform_device *pdev = to_platform_device(dev);
>> +       struct qcom_geni_serial_port *port = platform_get_drvdata(pdev);
>> +       struct uart_port *uport = &port->uport;
>> +
>> +       uart_suspend_port(uport->private_data, uport);
>> +       return 0;
>> +}
>> +
>> +static int __maybe_unused qcom_geni_serial_sys_resume_noirq(struct device *dev)
>> +{
>> +       struct platform_device *pdev = to_platform_device(dev);
>> +       struct qcom_geni_serial_port *port = platform_get_drvdata(pdev);
>> +       struct uart_port *uport = &port->uport;
>> +
>> +       if (console_suspend_enabled && uport->suspended) {
>> +               uart_resume_port(uport->private_data, uport);
>> +               disable_irq(uport->irq);
>> +       }
>> +       return 0;
>> +}
>> +
>> +static const struct dev_pm_ops qcom_geni_serial_pm_ops = {
>> +       .suspend_noirq = qcom_geni_serial_sys_suspend_noirq,
>> +       .resume_noirq = qcom_geni_serial_sys_resume_noirq,
> 
> Why are these noirq variants? Please add a comment.
The intention is to enable the console UART port usage as late as 
possible in the suspend cycle. Hence noirq variants. I will add this as 
a comment. Please let me know if the usage does not match the intention.
> 
>> +};
>> +
>> +static const struct of_device_id qcom_geni_serial_match_table[] = {
>> +       { .compatible = "qcom,geni-debug-uart",
>> +                               .data = &qcom_geni_console_driver, },
>> +       {}
>> +};
>> +MODULE_DEVICE_TABLE(of, qcom_geni_serial_match_table);
>> +
>> +static struct platform_driver qcom_geni_serial_platform_driver = {
>> +       .remove = qcom_geni_serial_remove,
>> +       .probe = qcom_geni_serial_probe,
>> +       .driver = {
>> +               .name = "qcom_geni_serial",
>> +               .of_match_table = qcom_geni_serial_match_table,
>> +               .pm = &qcom_geni_serial_pm_ops,
>> +       },
>> +};
>> +
>> +static int __init qcom_geni_serial_init(void)
>> +{
>> +       int ret = 0;
> 
> Drop assignment please.
Ok.
> 
>> +
>> +       qcom_geni_console_port.uport.iotype = UPIO_MEM;
>> +       qcom_geni_console_port.uport.ops = &qcom_geni_console_pops;
>> +       qcom_geni_console_port.uport.flags = UPF_BOOT_AUTOCONF;
>> +       qcom_geni_console_port.uport.line = 0;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
Regards,
Karthik.
Stephen Boyd March 6, 2018, 9:45 p.m. UTC | #7
Quoting Karthik Ramasubramanian (2018-03-05 16:51:23)
> On 3/2/2018 3:11 PM, Stephen Boyd wrote:
> > Quoting Karthikeyan Ramasubramanian (2018-02-27 17:38:09)
> > 
> >> +               size_t chars_to_write = 0;
> >> +               size_t avail = DEF_FIFO_DEPTH_WORDS - DEF_TX_WM;
> >> +
> >> +               /*
> >> +                * If the WM bit never set, then the Tx state machine is not
> >> +                * in a valid state, so break, cancel/abort any existing
> >> +                * command. Unfortunately the current data being written is
> >> +                * lost.
> >> +                */
> >> +               while (!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
> >> +                                               M_TX_FIFO_WATERMARK_EN, true))
> > 
> > Does this ever timeout? So many nested while loops makes it hard to
> > follow.
> Yes. Based on the baud rate of 115200 and the FIFO Depth & Width of (16 
> * 32), the poll should not take more than 5 ms under the timeout scenario.

Sure, but I'm asking if this has any sort of timeout associated with it.
It looks to be a while loop that could possibly go forever?

> >> +static void qcom_geni_serial_console_write(struct console *co, const char *s,
> >> +                             unsigned int count)
> >> +{
> >> +       struct uart_port *uport;
> >> +       struct qcom_geni_serial_port *port;
> >> +       bool locked = true;
> >> +       unsigned long flags;
> >> +
> >> +       WARN_ON(co->index < 0 || co->index >= GENI_UART_CONS_PORTS);
> >> +
> >> +       port = get_port_from_line(co->index);
> >> +       if (IS_ERR(port))
> >> +               return;
> >> +
> >> +       uport = &port->uport;
> >> +       if (oops_in_progress)
> >> +               locked = spin_trylock_irqsave(&uport->lock, flags);
> >> +       else
> >> +               spin_lock_irqsave(&uport->lock, flags);
> >> +
> >> +       if (locked) {
> >> +               __qcom_geni_serial_console_write(uport, s, count);
> > 
> > So if oops is in progress, and we didn't lock here, we don't output
> > data? I'd think we would always want to write to the fifo, just make the
> > lock grab/release conditional.
> If we fail to grab the lock, then there is another active writer. So 
> trying to write to the fifo will put the hardware in bad state because 
> writer has programmed the hardware to write 'x' number of words and this 
> thread will over-write it with 'y' number of words. Also the data that 
> you might see in the console might be garbled.

I suspect that because this is the serial console, and we want it to
always output stuff even when we're going down in flames, we may want to
ignore that case and just wait for the fifo to free up and then
overwrite the number of words that we want to output and push out more
characters.

I always get confused though because there seem to be lots of different
ways to do things in serial drivers and not too much clear documentation
that I've read describing what to do.

> > 
> >> +               spin_unlock_irqrestore(&uport->lock, flags);
> >> +       }
> >> +}
[...]
> >> +
> >> +       if (m_irq_status & (M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN) &&
> >> +           m_irq_en & (M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN))
> >> +               qcom_geni_serial_handle_tx(uport);
> >> +
> >> +       if (s_irq_status & S_GP_IRQ_0_EN || s_irq_status & S_GP_IRQ_1_EN) {
> >> +               if (s_irq_status & S_GP_IRQ_0_EN)
> >> +                       uport->icount.parity++;
> >> +               drop_rx = true;
> >> +       } else if (s_irq_status & S_GP_IRQ_2_EN ||
> >> +                                       s_irq_status & S_GP_IRQ_3_EN) {
> >> +               uport->icount.brk++;
> > 
> > How does break character handling work? I see the accounting here, but
> > don't see any uart_handle_break() call anywhere.
> The reason it is not added is because the hardware does not indicate 
> when the break character occured. It can be any one of the FIFO words. 
> The statistics is updated to give an idea that the break happened. We 
> can add uart_handle_break() but it may not be at an accurate position 
> for the above mentioned reason.

Sounds like the previous uart design. We would want uart_handle_break()
support for kgdb to work over serial. Do we still get an interrupt
signal that a break character is somewhere in the fifo? If we get that,
then does it also put a NUL (0) character into the fifo that we can
find? I had to do something like that before, where we detect the irq
and then we go through the fifo a character at a time to find the break
character that looks like a NUL, and then let sysrq core handle the
characters after that break character comes in.

> > 
> > 
> >> +}
> >> +
> >> +static unsigned long get_clk_div_rate(unsigned int baud, unsigned int *clk_div)
> >> +{
> >> +       unsigned long ser_clk;
> >> +       unsigned long desired_clk;
> >> +
> >> +       desired_clk = baud * UART_OVERSAMPLING;
> >> +       ser_clk = get_clk_cfg(desired_clk);
> >> +       if (!ser_clk) {
> >> +               pr_err("%s: Can't find matching DFS entry for baud %d\n",
> >> +                                                               __func__, baud);
> >> +               return ser_clk;
> >> +       }
> >> +
> >> +       *clk_div = ser_clk / desired_clk;
> > 
> > How wide can clk_div be? It may be better to implement the ser_clk as an
> > actual clk in the common clk framework, and then have the serial driver
> > or the i2c driver call clk_set_rate() on that clk and have the CCF
> > implementation take care of determining the rate that the parent clk can
> > supply and how it can fit it into the frequency that the divider can
> > support.
> Based on my current expertise with the CCF, I will address this comment 
> in a later patchset than the next one.

Ok. I've seen similar designs in some mmc drivers. For example, you can
look at drivers/mmc/host/meson-gx-mmc.c and see that they register a
clk_ops and then just start using that clk directly from the driver.
There are also some helper functions for dividers that would hopefully
make the job of calculating the best divider easier.

> >> +
> >> +       uport->uartclk = clk_rate;
> >> +       clk_set_rate(port->se.clk, clk_rate);
> >> +       ser_clk_cfg = SER_CLK_EN;
> >> +       ser_clk_cfg |= (clk_div << CLK_DIV_SHFT);
> > 
> > Drop useless cast.
> I think you mean parenthesis. I do not see casting here.

Yes! You got it.

> >> +#endif
> >> +       .pm = qcom_geni_serial_cons_pm,
> >> +};
> >> +
> >> +static int qcom_geni_serial_probe(struct platform_device *pdev)
> >> +{
> >> +       int ret = 0;
> >> +       int line = -1;
> >> +       struct qcom_geni_serial_port *port;
> >> +       struct uart_port *uport;
> >> +       struct resource *res;
> >> +       struct uart_driver *drv;
> >> +
> >> +       drv = (void *)of_device_get_match_data(&pdev->dev);
> > 
> > Useless cast.
> There were compiler warnings assigning a const void * to a void *. That 
> is why I have the cast in place.

Oh. Yes you shouldn't cast away the const. Please make the compiler
warning go away without casting it away.

> > 
> > 
> > Also, I see some serial drivers do the mapping when the port is
> > requested. That can't be done here?
> By "when the port is requested" do you mean console's setup operation. 
> It can be done, but given that it is a one-time operation I am not sure 
> if it makes any difference between mapping here and there.

No. I meant the uart_ops::uart_request() function and also
uart_ops::config_port(). Take a look at msm_config_port() and
msm_request_port() for how it was done on pre-geni qcom uarts.

I suppose we should try to probe this as a module and run a getty on it
and then remove and probe the module a couple times after that.
That should shake out some bugs.

> >> +
> >> +static const struct dev_pm_ops qcom_geni_serial_pm_ops = {
> >> +       .suspend_noirq = qcom_geni_serial_sys_suspend_noirq,
> >> +       .resume_noirq = qcom_geni_serial_sys_resume_noirq,
> > 
> > Why are these noirq variants? Please add a comment.
> The intention is to enable the console UART port usage as late as 
> possible in the suspend cycle. Hence noirq variants. I will add this as 
> a comment. Please let me know if the usage does not match the intention.

Hm.. Does no_console_suspend not work? Or not work well enough?

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Boyd March 6, 2018, 9:56 p.m. UTC | #8
Quoting Karthik Ramasubramanian (2018-03-02 16:58:23)
> 
> 
> On 3/2/2018 1:41 PM, Stephen Boyd wrote:
> > Quoting Karthikeyan Ramasubramanian (2018-02-27 17:38:07)
> >> +
> >> +/**
> >> + * geni_se_get_qup_hw_version() - Read the QUP wrapper Hardware version
> >> + * @se:                        Pointer to the corresponding Serial Engine.
> >> + * @major:             Buffer for Major Version field.
> >> + * @minor:             Buffer for Minor Version field.
> >> + * @step:              Buffer for Step Version field.
> >> + */
> >> +void geni_se_get_qup_hw_version(struct geni_se *se, unsigned int *major,
> >> +                               unsigned int *minor, unsigned int *step)
> >> +{
> >> +       unsigned int version;
> >> +       struct geni_wrapper *wrapper = se->wrapper;
> >> +
> >> +       version = readl_relaxed(wrapper->base + QUP_HW_VER_REG);
> >> +       *major = (version & HW_VER_MAJOR_MASK) >> HW_VER_MAJOR_SHFT;
> >> +       *minor = (version & HW_VER_MINOR_MASK) >> HW_VER_MINOR_SHFT;
> >> +       *step = version & HW_VER_STEP_MASK;
> >> +}
> >> +EXPORT_SYMBOL(geni_se_get_qup_hw_version);
> > 
> > Is this used?
> SPI controller driver uses this API and it will be uploaded sooner.

Ok. Maybe it can also be a macro to get the u32 and then some more
macros on top of that to pick out the major/minor/step out of the u32
that you read.

> > 
> >> +
> >> +/**
> >> + * geni_se_read_proto() - Read the protocol configured for a Serial Engine
> >> + * @se:        Pointer to the concerned Serial Engine.
> >> + *
> >> + * Return: Protocol value as configured in the serial engine.
> >> + */
> >> +u32 geni_se_read_proto(struct geni_se *se)
> >> +{
> >> +       u32 val;
> >> +
> >> +       val = readl_relaxed(se->base + GENI_FW_REVISION_RO);
> >> +
> >> +       return (val & FW_REV_PROTOCOL_MSK) >> FW_REV_PROTOCOL_SHFT;
> >> +}
> >> +EXPORT_SYMBOL(geni_se_read_proto);
> > 
> > Is this API really needed outside of this file? It would seem like the
> > drivers that implement the protocol, which are child devices, would only
> > use this API to confirm that the protocol chosen is for their particular
> > protocol.
> No, this API is meant for the protocol drivers to confirm that the 
> serial engine is programmed with the firmware for the concerned protocol 
> before using the serial engine. If the check fails, the protocol drivers 
> stop using the serial engine.

Ok maybe we don't really need it then?

> >> + * RX fifo of the serial engine.
> >> + *
> >> + * Return: RX fifo depth in units of FIFO words
> >> + */
> >> +u32 geni_se_get_rx_fifo_depth(struct geni_se *se)
> >> +{
> >> +       u32 val;
> >> +
> >> +       val = readl_relaxed(se->base + SE_HW_PARAM_1);
> >> +
> >> +       return (val & RX_FIFO_DEPTH_MSK) >> RX_FIFO_DEPTH_SHFT;
> >> +}
> >> +EXPORT_SYMBOL(geni_se_get_rx_fifo_depth);
> > 
> > These ones too, can probably just be static inline.
> Ok. Just for my knowledge - is there any reference guideline regarding 
> when to use static inline myself and when to let the compiler do the 
> clever thing?

Not that I'm aware of. It's really up to you to decide.

> > 
> >> +
> >> +       ret = geni_se_clks_on(se);
> >> +       if (ret)
> >> +               return ret;
> >> +
> >> +       ret = pinctrl_pm_select_default_state(se->dev);
> >> +       if (ret)
> >> +               geni_se_clks_off(se);
> >> +
> >> +       return ret;
> >> +}
> >> +EXPORT_SYMBOL(geni_se_resources_on);
> > 
> > IS there a reason why we can't use runtime PM or normal linux PM
> > infrastructure to power on the wrapper and keep it powered while the
> > protocol driver is active?
> Besides turning on the clocks & pinctrl settings, wrapper also has to do 
> the bus scaling votes. The bus scaling votes depend on the individual 
> serial interface bandwidth requirements. The bus scaling votes is not 
> present currently. But once the support comes in, this function enables 
> adding it.

Ok, but that would basically be some code consolidation around picking a
bandwidth and enabling/disabling? It sounds like it could go into either
the serial interface drivers or into the runtime PM path of the wrapper.

> > 
> >> +
> >> +/**
> >> + * geni_se_clk_tbl_get() - Get the clock table to program DFS
> >> + * @se:                Pointer to the concerned Serial Engine.
> >> + * @tbl:       Table in which the output is returned.
> >> + *
> >> + * This function is called by the protocol drivers to determine the different
> >> + * clock frequencies supported by Serial Engine Core Clock. The protocol
> >> + * drivers use the output to determine the clock frequency index to be
> >> + * programmed into DFS.
> >> + *
> >> + * Return: number of valid performance levels in the table on success,
> >> + *        standard Linux error codes on failure.
> >> + */
> >> +int geni_se_clk_tbl_get(struct geni_se *se, unsigned long **tbl)
> >> +{
> >> +       struct geni_wrapper *wrapper = se->wrapper;
> >> +       unsigned long freq = 0;
> >> +       int i;
> >> +       int ret = 0;
> >> +
> >> +       mutex_lock(&wrapper->lock);
> >> +       if (wrapper->clk_perf_tbl) {
> >> +               *tbl = wrapper->clk_perf_tbl;
> >> +               ret = wrapper->num_clk_levels;
> >> +               goto out_unlock;
> >> +       }
> >> +
> >> +       wrapper->clk_perf_tbl = kcalloc(MAX_CLK_PERF_LEVEL,
> >> +                                       sizeof(*wrapper->clk_perf_tbl),
> >> +                                       GFP_KERNEL);
> >> +       if (!wrapper->clk_perf_tbl) {
> >> +               ret = -ENOMEM;
> >> +               goto out_unlock;
> >> +       }
> >> +
> >> +       for (i = 0; i < MAX_CLK_PERF_LEVEL; i++) {
> >> +               freq = clk_round_rate(se->clk, freq + 1);
> >> +               if (!freq || freq == wrapper->clk_perf_tbl[i - 1])
> >> +                       break;
> >> +               wrapper->clk_perf_tbl[i] = freq;
> >> +       }
> >> +       wrapper->num_clk_levels = i;
> >> +       *tbl = wrapper->clk_perf_tbl;
> >> +       ret = wrapper->num_clk_levels;
> >> +out_unlock:
> >> +       mutex_unlock(&wrapper->lock);
> > 
> > Is this lock actually protecting anything? I mean to say, is any more
> > than one geni protocol driver calling this function at a time? Or is
> > the same geni protocol driver calling this from multiple threads at the
> > same time? The lock looks almost useless.
> Yes, there is a possibility of multiple I2C instances within the same 
> wrapper trying to get this table simultaneously.
> 
> As Evan mentioned in the other thread, Bjorn had the comment to move it 
> to the probe and remove the lock. I looked into the possibility of it. 
>  From the hardware perspective, this table belongs to the wrapper and is 
> shared by all the serial engines within the wrapper. But due to software 
> implementation reasons, clk_round_rate can be be performed only on the 
> clocks that are tagged as DFS compatible and only the serial engine 
> clocks are tagged so. At least this was the understanding based on our 
> earlier discussion with the concerned folks. We will revisit it and 
> check if anything has changed recently.

Hmm sounds like the round rate should happen on the parent of the
se_clk, and this wrapper DT binding should get the clk for the parent of
the se->clk to run round_rate() on. Then it could all be done in probe,
which sounds good.

> >> +       return iova;
> >> +}
> >> +EXPORT_SYMBOL(geni_se_tx_dma_prep);
> >> +
> >> +/**
> >> + * geni_se_rx_dma_prep() - Prepare the Serial Engine for RX DMA transfer
> >> + * @se:                        Pointer to the concerned Serial Engine.
> >> + * @buf:               Pointer to the RX buffer.
> >> + * @len:               Length of the RX buffer.
> >> + *
> >> + * This function is used to prepare the buffers for DMA RX.
> >> + *
> >> + * Return: Mapped DMA Address of the buffer on success, NULL on failure.
> >> + */
> >> +dma_addr_t geni_se_rx_dma_prep(struct geni_se *se, void *buf, size_t len)
> >> +{
> >> +       dma_addr_t iova;
> >> +       struct geni_wrapper *wrapper = se->wrapper;
> >> +       u32 val;
> >> +
> >> +       iova = dma_map_single(wrapper->dev, buf, len, DMA_FROM_DEVICE);
> >> +       if (dma_mapping_error(wrapper->dev, iova))
> >> +               return (dma_addr_t)NULL;
> > 
> > Can't return a dma_mapping_error address to the caller and have them
> > figure it out?
> Earlier we used to return the DMA_ERROR_CODE which has been removed 
> recently in arm64 architecture. If we return the dma_mapping_error, then 
> the caller also needs the device which encountered the mapping error. 
> The serial interface drivers can use their parent currently to resolve 
> the mapping error. Once the wrapper starts mapping using IOMMU context 
> bank, then the serial interface drivers do not know which device to use 
> to know if there is an error.
> 
> Having said that, the dma_ops suggestion might help with handling this 
> situation. I will look into it further.

Ok, thanks.

> >> +{
> >> +       struct geni_wrapper *wrapper = se->wrapper;
> >> +
> >> +       if (iova)
> >> +               dma_unmap_single(wrapper->dev, iova, len, DMA_FROM_DEVICE);
> >> +}
> >> +EXPORT_SYMBOL(geni_se_rx_dma_unprep);
> > 
> > Instead of having the functions exported, could we set the dma_ops on
> > all child devices of the wrapper that this driver populates and then
> > implement the DMA ops for those devices here? I assume that there's
> > never another DMA master between the wrapper and the serial engine, so I
> > think it would work.
> This suggestion looks like it will work.

It would be a good idea to check with some other people on the dma_ops
suggestion. Maybe add the DMA mapping subsystem folks to help out here

DMA MAPPING HELPERS
M:      Christoph Hellwig <hch@lst.de>
M:      Marek Szyprowski <m.szyprowski@samsung.com>
R:      Robin Murphy <robin.murphy@arm.com>
L:      iommu@lists.linux-foundation.org

> > 
> >> +
> >> +/* Transfer mode supported by GENI Serial Engines */
> >> +enum geni_se_xfer_mode {
> >> +       GENI_SE_INVALID,
> >> +       GENI_SE_FIFO,
> >> +       GENI_SE_DMA,
> >> +};
> >> +
> >> +/* Protocols supported by GENI Serial Engines */
> >> +enum geni_se_protocol_types {
> >> +       GENI_SE_NONE,
> >> +       GENI_SE_SPI,
> >> +       GENI_SE_UART,
> >> +       GENI_SE_I2C,
> >> +       GENI_SE_I3C,
> >> +};
> >> +
> >> +/**
> >> + * struct geni_se - GENI Serial Engine
> >> + * @base:              Base Address of the Serial Engine's register block.
> >> + * @dev:               Pointer to the Serial Engine device.
> >> + * @wrapper:           Pointer to the parent QUP Wrapper core.
> >> + * @clk:               Handle to the core serial engine clock.
> >> + */
> >> +struct geni_se {
> >> +       void __iomem *base;
> >> +       struct device *dev;
> >> +       void *wrapper;
> > 
> > Can this get the geni_wrapper type? It could be opaque if you like.
> I am not sure if it is ok to have the children know the details of the 
> parent. That is why it is kept as opaque.

That's fine, but I mean to have struct geni_wrapper *wrapper, and then
struct geni_wrapper; in this file. Children won't know details and we
get slightly more type safety.

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Karthikeyan Ramasubramanian March 8, 2018, 6:06 a.m. UTC | #9
On 3/6/2018 2:45 PM, Stephen Boyd wrote:
> Quoting Karthik Ramasubramanian (2018-03-05 16:51:23)
>> On 3/2/2018 3:11 PM, Stephen Boyd wrote:
>>> Quoting Karthikeyan Ramasubramanian (2018-02-27 17:38:09)
>>>
>>>> +               size_t chars_to_write = 0;
>>>> +               size_t avail = DEF_FIFO_DEPTH_WORDS - DEF_TX_WM;
>>>> +
>>>> +               /*
>>>> +                * If the WM bit never set, then the Tx state machine is not
>>>> +                * in a valid state, so break, cancel/abort any existing
>>>> +                * command. Unfortunately the current data being written is
>>>> +                * lost.
>>>> +                */
>>>> +               while (!qcom_geni_serial_poll_bit(uport, SE_GENI_M_IRQ_STATUS,
>>>> +                                               M_TX_FIFO_WATERMARK_EN, true))
>>>
>>> Does this ever timeout? So many nested while loops makes it hard to
>>> follow.
>> Yes. Based on the baud rate of 115200 and the FIFO Depth & Width of (16
>> * 32), the poll should not take more than 5 ms under the timeout scenario.
> 
> Sure, but I'm asking if this has any sort of timeout associated with it.
> It looks to be a while loop that could possibly go forever?
I will change it from a while loop to if condition to make it clear.
> 
>>>> +static void qcom_geni_serial_console_write(struct console *co, const char *s,
>>>> +                             unsigned int count)
>>>> +{
>>>> +       struct uart_port *uport;
>>>> +       struct qcom_geni_serial_port *port;
>>>> +       bool locked = true;
>>>> +       unsigned long flags;
>>>> +
>>>> +       WARN_ON(co->index < 0 || co->index >= GENI_UART_CONS_PORTS);
>>>> +
>>>> +       port = get_port_from_line(co->index);
>>>> +       if (IS_ERR(port))
>>>> +               return;
>>>> +
>>>> +       uport = &port->uport;
>>>> +       if (oops_in_progress)
>>>> +               locked = spin_trylock_irqsave(&uport->lock, flags);
>>>> +       else
>>>> +               spin_lock_irqsave(&uport->lock, flags);
>>>> +
>>>> +       if (locked) {
>>>> +               __qcom_geni_serial_console_write(uport, s, count);
>>>
>>> So if oops is in progress, and we didn't lock here, we don't output
>>> data? I'd think we would always want to write to the fifo, just make the
>>> lock grab/release conditional.
>> If we fail to grab the lock, then there is another active writer. So
>> trying to write to the fifo will put the hardware in bad state because
>> writer has programmed the hardware to write 'x' number of words and this
>> thread will over-write it with 'y' number of words. Also the data that
>> you might see in the console might be garbled.
> 
> I suspect that because this is the serial console, and we want it to
> always output stuff even when we're going down in flames, we may want to
> ignore that case and just wait for the fifo to free up and then
> overwrite the number of words that we want to output and push out more
> characters.
> 
> I always get confused though because there seem to be lots of different
> ways to do things in serial drivers and not too much clear documentation
> that I've read describing what to do.
Ok. If the active writer is interrupted due to OOPS handler, then the 
interrupted write can be cancelled and the write from OOPS handler can 
be performed.
> 
>>>
>>>> +               spin_unlock_irqrestore(&uport->lock, flags);
>>>> +       }
>>>> +}
> [...]
>>>> +
>>>> +       if (m_irq_status & (M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN) &&
>>>> +           m_irq_en & (M_TX_FIFO_WATERMARK_EN | M_CMD_DONE_EN))
>>>> +               qcom_geni_serial_handle_tx(uport);
>>>> +
>>>> +       if (s_irq_status & S_GP_IRQ_0_EN || s_irq_status & S_GP_IRQ_1_EN) {
>>>> +               if (s_irq_status & S_GP_IRQ_0_EN)
>>>> +                       uport->icount.parity++;
>>>> +               drop_rx = true;
>>>> +       } else if (s_irq_status & S_GP_IRQ_2_EN ||
>>>> +                                       s_irq_status & S_GP_IRQ_3_EN) {
>>>> +               uport->icount.brk++;
>>>
>>> How does break character handling work? I see the accounting here, but
>>> don't see any uart_handle_break() call anywhere.
>> The reason it is not added is because the hardware does not indicate
>> when the break character occured. It can be any one of the FIFO words.
>> The statistics is updated to give an idea that the break happened. We
>> can add uart_handle_break() but it may not be at an accurate position
>> for the above mentioned reason.
> 
> Sounds like the previous uart design. We would want uart_handle_break()
> support for kgdb to work over serial. Do we still get an interrupt
> signal that a break character is somewhere in the fifo? If we get that,
> then does it also put a NUL (0) character into the fifo that we can
> find? I had to do something like that before, where we detect the irq
> and then we go through the fifo a character at a time to find the break
> character that looks like a NUL, and then let sysrq core handle the
> characters after that break character comes in.
I will use the same logic as in blsp2 serial to catch the break 
character, same as NULL and push the break character to the framework.
> 
>>>
>>>
>>>> +}
>>>> +
>>>> +static unsigned long get_clk_div_rate(unsigned int baud, unsigned int *clk_div)
>>>> +{
>>>> +       unsigned long ser_clk;
>>>> +       unsigned long desired_clk;
>>>> +
>>>> +       desired_clk = baud * UART_OVERSAMPLING;
>>>> +       ser_clk = get_clk_cfg(desired_clk);
>>>> +       if (!ser_clk) {
>>>> +               pr_err("%s: Can't find matching DFS entry for baud %d\n",
>>>> +                                                               __func__, baud);
>>>> +               return ser_clk;
>>>> +       }
>>>> +
>>>> +       *clk_div = ser_clk / desired_clk;
>>>
>>> How wide can clk_div be? It may be better to implement the ser_clk as an
>>> actual clk in the common clk framework, and then have the serial driver
>>> or the i2c driver call clk_set_rate() on that clk and have the CCF
>>> implementation take care of determining the rate that the parent clk can
>>> supply and how it can fit it into the frequency that the divider can
>>> support.
>> Based on my current expertise with the CCF, I will address this comment
>> in a later patchset than the next one.
> 
> Ok. I've seen similar designs in some mmc drivers. For example, you can
> look at drivers/mmc/host/meson-gx-mmc.c and see that they register a
> clk_ops and then just start using that clk directly from the driver.
> There are also some helper functions for dividers that would hopefully
> make the job of calculating the best divider easier.
Thanks for the pointers. I will take a look at it. In the meanwhile I 
had discussions with our clock team. They pointed out that the register 
to write the divider value here is outside the scope of clock controller 
which makes it trickier to implement your suggestion. They are already 
in the mailing list and we will discuss further and get back to you in 
this regard.
> 
>>>> +
>>>> +       uport->uartclk = clk_rate;
>>>> +       clk_set_rate(port->se.clk, clk_rate);
>>>> +       ser_clk_cfg = SER_CLK_EN;
>>>> +       ser_clk_cfg |= (clk_div << CLK_DIV_SHFT);
>>>
>>> Drop useless cast.
>> I think you mean parenthesis. I do not see casting here.
> 
> Yes! You got it.
> 
>>>> +#endif
>>>> +       .pm = qcom_geni_serial_cons_pm,
>>>> +};
>>>> +
>>>> +static int qcom_geni_serial_probe(struct platform_device *pdev)
>>>> +{
>>>> +       int ret = 0;
>>>> +       int line = -1;
>>>> +       struct qcom_geni_serial_port *port;
>>>> +       struct uart_port *uport;
>>>> +       struct resource *res;
>>>> +       struct uart_driver *drv;
>>>> +
>>>> +       drv = (void *)of_device_get_match_data(&pdev->dev);
>>>
>>> Useless cast.
>> There were compiler warnings assigning a const void * to a void *. That
>> is why I have the cast in place.
> 
> Oh. Yes you shouldn't cast away the const. Please make the compiler
> warning go away without casting it away.
Ok. I will figure out an alternative to this one.
> 
>>>
>>>
>>> Also, I see some serial drivers do the mapping when the port is
>>> requested. That can't be done here?
>> By "when the port is requested" do you mean console's setup operation.
>> It can be done, but given that it is a one-time operation I am not sure
>> if it makes any difference between mapping here and there.
> 
> No. I meant the uart_ops::uart_request() function and also
> uart_ops::config_port(). Take a look at msm_config_port() and
> msm_request_port() for how it was done on pre-geni qcom uarts.
> 
I will take a look at it and update accordingly.
> I suppose we should try to probe this as a module and run a getty on it
> and then remove and probe the module a couple times after that.
> That should shake out some bugs.
> 
>>>> +
>>>> +static const struct dev_pm_ops qcom_geni_serial_pm_ops = {
>>>> +       .suspend_noirq = qcom_geni_serial_sys_suspend_noirq,
>>>> +       .resume_noirq = qcom_geni_serial_sys_resume_noirq,
>>>
>>> Why are these noirq variants? Please add a comment.
>> The intention is to enable the console UART port usage as late as
>> possible in the suspend cycle. Hence noirq variants. I will add this as
>> a comment. Please let me know if the usage does not match the intention.
> 
> Hm.. Does no_console_suspend not work? Or not work well enough?
It works. When console suspend is disabled, the suspend operation does 
not get triggered and the resume operation checks if the console suspend 
is disabled and performs the needed thing.
> 
Regards,
Karthik.
Karthikeyan Ramasubramanian March 8, 2018, 6:46 a.m. UTC | #10
On 3/6/2018 2:56 PM, Stephen Boyd wrote:
> Quoting Karthik Ramasubramanian (2018-03-02 16:58:23)
> 
>>>> +       return iova;
>>>> +}
>>>> +EXPORT_SYMBOL(geni_se_tx_dma_prep);
>>>> +
>>>> +/**
>>>> + * geni_se_rx_dma_prep() - Prepare the Serial Engine for RX DMA transfer
>>>> + * @se:                        Pointer to the concerned Serial Engine.
>>>> + * @buf:               Pointer to the RX buffer.
>>>> + * @len:               Length of the RX buffer.
>>>> + *
>>>> + * This function is used to prepare the buffers for DMA RX.
>>>> + *
>>>> + * Return: Mapped DMA Address of the buffer on success, NULL on failure.
>>>> + */
>>>> +dma_addr_t geni_se_rx_dma_prep(struct geni_se *se, void *buf, size_t len)
>>>> +{
>>>> +       dma_addr_t iova;
>>>> +       struct geni_wrapper *wrapper = se->wrapper;
>>>> +       u32 val;
>>>> +
>>>> +       iova = dma_map_single(wrapper->dev, buf, len, DMA_FROM_DEVICE);
>>>> +       if (dma_mapping_error(wrapper->dev, iova))
>>>> +               return (dma_addr_t)NULL;
>>>
>>> Can't return a dma_mapping_error address to the caller and have them
>>> figure it out?
>> Earlier we used to return the DMA_ERROR_CODE which has been removed
>> recently in arm64 architecture. If we return the dma_mapping_error, then
>> the caller also needs the device which encountered the mapping error.
>> The serial interface drivers can use their parent currently to resolve
>> the mapping error. Once the wrapper starts mapping using IOMMU context
>> bank, then the serial interface drivers do not know which device to use
>> to know if there is an error.
>>
>> Having said that, the dma_ops suggestion might help with handling this
>> situation. I will look into it further.
> 
> Ok, thanks.
> 
>>>> +{
>>>> +       struct geni_wrapper *wrapper = se->wrapper;
>>>> +
>>>> +       if (iova)
>>>> +               dma_unmap_single(wrapper->dev, iova, len, DMA_FROM_DEVICE);
>>>> +}
>>>> +EXPORT_SYMBOL(geni_se_rx_dma_unprep);
>>>
>>> Instead of having the functions exported, could we set the dma_ops on
>>> all child devices of the wrapper that this driver populates and then
>>> implement the DMA ops for those devices here? I assume that there's
>>> never another DMA master between the wrapper and the serial engine, so I
>>> think it would work.
>> This suggestion looks like it will work.
> 
> It would be a good idea to check with some other people on the dma_ops
> suggestion. Maybe add the DMA mapping subsystem folks to help out here
I have added the DMA mapping subsystem folks to help out here.

To present an overview, we have a wrapper controller which is composed 
of several serial engines. The serial engines are programmed with UART, 
I2C or SPI protocol and support DMA transfer. When the serial engines 
perform DMA transfer, the wrapper controller device is used to perform 
the mapping. The reason wrapper device is used is because of IOMMU and 
there is only one IOMMU context bank to perform the translation for the 
entire wrapper controller. So the wrapper controller exports map and 
unmap functions to the individual protocol drivers.

There is a suggestion to make the parent wrapper controller implement 
the dma_map_ops, instead of exported map/unmap functions and populate 
those dma_map_ops on all the children serial engines. Can you please 
provide your inputs regarding this suggestion?
> 
> DMA MAPPING HELPERS
> M:      Christoph Hellwig <hch@lst.de>
> M:      Marek Szyprowski <m.szyprowski@samsung.com>
> R:      Robin Murphy <robin.murphy@arm.com>
> L:      iommu@lists.linux-foundation.org
> 
> 
Regards,
Karthik.
Robin Murphy March 8, 2018, 1:24 p.m. UTC | #11
On 08/03/18 06:46, Karthik Ramasubramanian wrote:
> 
> 
> On 3/6/2018 2:56 PM, Stephen Boyd wrote:
>> Quoting Karthik Ramasubramanian (2018-03-02 16:58:23)
>>
>>>>> +       return iova;
>>>>> +}
>>>>> +EXPORT_SYMBOL(geni_se_tx_dma_prep);
>>>>> +
>>>>> +/**
>>>>> + * geni_se_rx_dma_prep() - Prepare the Serial Engine for RX DMA 
>>>>> transfer
>>>>> + * @se:                        Pointer to the concerned Serial 
>>>>> Engine.
>>>>> + * @buf:               Pointer to the RX buffer.
>>>>> + * @len:               Length of the RX buffer.
>>>>> + *
>>>>> + * This function is used to prepare the buffers for DMA RX.
>>>>> + *
>>>>> + * Return: Mapped DMA Address of the buffer on success, NULL on 
>>>>> failure.
>>>>> + */
>>>>> +dma_addr_t geni_se_rx_dma_prep(struct geni_se *se, void *buf, 
>>>>> size_t len)
>>>>> +{
>>>>> +       dma_addr_t iova;
>>>>> +       struct geni_wrapper *wrapper = se->wrapper;
>>>>> +       u32 val;
>>>>> +
>>>>> +       iova = dma_map_single(wrapper->dev, buf, len, 
>>>>> DMA_FROM_DEVICE);
>>>>> +       if (dma_mapping_error(wrapper->dev, iova))
>>>>> +               return (dma_addr_t)NULL;
>>>>
>>>> Can't return a dma_mapping_error address to the caller and have them
>>>> figure it out?
>>> Earlier we used to return the DMA_ERROR_CODE which has been removed
>>> recently in arm64 architecture. If we return the dma_mapping_error, then
>>> the caller also needs the device which encountered the mapping error.
>>> The serial interface drivers can use their parent currently to resolve
>>> the mapping error. Once the wrapper starts mapping using IOMMU context
>>> bank, then the serial interface drivers do not know which device to use
>>> to know if there is an error.
>>>
>>> Having said that, the dma_ops suggestion might help with handling this
>>> situation. I will look into it further.
>>
>> Ok, thanks.
>>
>>>>> +{
>>>>> +       struct geni_wrapper *wrapper = se->wrapper;
>>>>> +
>>>>> +       if (iova)
>>>>> +               dma_unmap_single(wrapper->dev, iova, len, 
>>>>> DMA_FROM_DEVICE);
>>>>> +}
>>>>> +EXPORT_SYMBOL(geni_se_rx_dma_unprep);
>>>>
>>>> Instead of having the functions exported, could we set the dma_ops on
>>>> all child devices of the wrapper that this driver populates and then
>>>> implement the DMA ops for those devices here? I assume that there's
>>>> never another DMA master between the wrapper and the serial engine, 
>>>> so I
>>>> think it would work.
>>> This suggestion looks like it will work.
>>
>> It would be a good idea to check with some other people on the dma_ops
>> suggestion. Maybe add the DMA mapping subsystem folks to help out here
> I have added the DMA mapping subsystem folks to help out here.
> 
> To present an overview, we have a wrapper controller which is composed 
> of several serial engines. The serial engines are programmed with UART, 
> I2C or SPI protocol and support DMA transfer. When the serial engines 
> perform DMA transfer, the wrapper controller device is used to perform 
> the mapping. The reason wrapper device is used is because of IOMMU and 
> there is only one IOMMU context bank to perform the translation for the 
> entire wrapper controller. So the wrapper controller exports map and 
> unmap functions to the individual protocol drivers.
> 
> There is a suggestion to make the parent wrapper controller implement 
> the dma_map_ops, instead of exported map/unmap functions and populate 
> those dma_map_ops on all the children serial engines. Can you please 
> provide your inputs regarding this suggestion?

Implementing dma_map_ops inside a driver for real hardware is almost 
always the wrong thing to do.

Based on what I could infer about the hardware from looking through the 
whole series in the linux-arm-msm archive, this is probably more like a 
multi-channel DMA controller where each "channel" has a configurable 
serial interface on the other end, as opposed to an actual bus where the 
serial engines are individually distinct AHB masters routed through the 
wrapper. If that's true, then using the QUP platform device for DMA API 
calls is the appropriate thing to do. Personally I'd be inclined not to 
abstract the dma_{map,unmap} calls at all, and just have the protocol 
drivers make them directly using dev->parent/wrapper->dev/whatever, but 
if you do want to abstract those then just give the abstraction a saner 
interface, i.e. pass the DMA handle by reference and return a regular 
int for error/success status.

Robin.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig March 8, 2018, 2:41 p.m. UTC | #12
On Thu, Mar 08, 2018 at 01:24:45PM +0000, Robin Murphy wrote:
> Implementing dma_map_ops inside a driver for real hardware is almost always 
> the wrong thing to do.

Agreed.  dma_map_ops should be a platform decision based on the bus.

Even our dma_virt_ops basically just works around bad driver layering.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Karthikeyan Ramasubramanian March 8, 2018, 6:18 p.m. UTC | #13
On 3/8/2018 6:24 AM, Robin Murphy wrote:
> On 08/03/18 06:46, Karthik Ramasubramanian wrote:
>>
>>
>> On 3/6/2018 2:56 PM, Stephen Boyd wrote:
>>> Quoting Karthik Ramasubramanian (2018-03-02 16:58:23)
>>>
>>>>>> +       return iova;
>>>>>> +}
>>>>>> +EXPORT_SYMBOL(geni_se_tx_dma_prep);
>>>>>> +
>>>>>> +/**
>>>>>> + * geni_se_rx_dma_prep() - Prepare the Serial Engine for RX DMA 
>>>>>> transfer
>>>>>> + * @se:                        Pointer to the concerned Serial 
>>>>>> Engine.
>>>>>> + * @buf:               Pointer to the RX buffer.
>>>>>> + * @len:               Length of the RX buffer.
>>>>>> + *
>>>>>> + * This function is used to prepare the buffers for DMA RX.
>>>>>> + *
>>>>>> + * Return: Mapped DMA Address of the buffer on success, NULL on 
>>>>>> failure.
>>>>>> + */
>>>>>> +dma_addr_t geni_se_rx_dma_prep(struct geni_se *se, void *buf, 
>>>>>> size_t len)
>>>>>> +{
>>>>>> +       dma_addr_t iova;
>>>>>> +       struct geni_wrapper *wrapper = se->wrapper;
>>>>>> +       u32 val;
>>>>>> +
>>>>>> +       iova = dma_map_single(wrapper->dev, buf, len, 
>>>>>> DMA_FROM_DEVICE);
>>>>>> +       if (dma_mapping_error(wrapper->dev, iova))
>>>>>> +               return (dma_addr_t)NULL;
>>>>>
>>>>> Can't return a dma_mapping_error address to the caller and have them
>>>>> figure it out?
>>>> Earlier we used to return the DMA_ERROR_CODE which has been removed
>>>> recently in arm64 architecture. If we return the dma_mapping_error, 
>>>> then
>>>> the caller also needs the device which encountered the mapping error.
>>>> The serial interface drivers can use their parent currently to resolve
>>>> the mapping error. Once the wrapper starts mapping using IOMMU context
>>>> bank, then the serial interface drivers do not know which device to use
>>>> to know if there is an error.
>>>>
>>>> Having said that, the dma_ops suggestion might help with handling this
>>>> situation. I will look into it further.
>>>
>>> Ok, thanks.
>>>
>>>>>> +{
>>>>>> +       struct geni_wrapper *wrapper = se->wrapper;
>>>>>> +
>>>>>> +       if (iova)
>>>>>> +               dma_unmap_single(wrapper->dev, iova, len, 
>>>>>> DMA_FROM_DEVICE);
>>>>>> +}
>>>>>> +EXPORT_SYMBOL(geni_se_rx_dma_unprep);
>>>>>
>>>>> Instead of having the functions exported, could we set the dma_ops on
>>>>> all child devices of the wrapper that this driver populates and then
>>>>> implement the DMA ops for those devices here? I assume that there's
>>>>> never another DMA master between the wrapper and the serial engine, 
>>>>> so I
>>>>> think it would work.
>>>> This suggestion looks like it will work.
>>>
>>> It would be a good idea to check with some other people on the dma_ops
>>> suggestion. Maybe add the DMA mapping subsystem folks to help out here
>> I have added the DMA mapping subsystem folks to help out here.
>>
>> To present an overview, we have a wrapper controller which is composed 
>> of several serial engines. The serial engines are programmed with 
>> UART, I2C or SPI protocol and support DMA transfer. When the serial 
>> engines perform DMA transfer, the wrapper controller device is used to 
>> perform the mapping. The reason wrapper device is used is because of 
>> IOMMU and there is only one IOMMU context bank to perform the 
>> translation for the entire wrapper controller. So the wrapper 
>> controller exports map and unmap functions to the individual protocol 
>> drivers.
>>
>> There is a suggestion to make the parent wrapper controller implement 
>> the dma_map_ops, instead of exported map/unmap functions and populate 
>> those dma_map_ops on all the children serial engines. Can you please 
>> provide your inputs regarding this suggestion?
> 
> Implementing dma_map_ops inside a driver for real hardware is almost 
> always the wrong thing to do.
> 
> Based on what I could infer about the hardware from looking through the 
> whole series in the linux-arm-msm archive, this is probably more like a 
> multi-channel DMA controller where each "channel" has a configurable 
> serial interface on the other end, as opposed to an actual bus where the 
> serial engines are individually distinct AHB masters routed through the 
> wrapper. If that's true, then using the QUP platform device for DMA API 
> calls is the appropriate thing to do. Personally I'd be inclined not to 
> abstract the dma_{map,unmap} calls at all, and just have the protocol 
> drivers make them directly using dev->parent/wrapper->dev/whatever, but 
> if you do want to abstract those then just give the abstraction a saner 
> interface, i.e. pass the DMA handle by reference and return a regular 
> int for error/success status.
> 
> Robin.
Thank you Robin & Christoph for your inputs. The wrapper driver used to 
provide the recommended abstraction until v2 of this patch series. In v3 
it was tweaked to address a comment. If there are no objections, I will 
revive it back.

Regards,
Karthik.
Stephen Boyd March 8, 2018, 10:32 p.m. UTC | #14
Quoting Karthik Ramasubramanian (2018-03-07 22:06:29)
> 
> 
> On 3/6/2018 2:45 PM, Stephen Boyd wrote:
> > Quoting Karthik Ramasubramanian (2018-03-05 16:51:23)
> >> On 3/2/2018 3:11 PM, Stephen Boyd wrote:
> > 
> > Ok. I've seen similar designs in some mmc drivers. For example, you can
> > look at drivers/mmc/host/meson-gx-mmc.c and see that they register a
> > clk_ops and then just start using that clk directly from the driver.
> > There are also some helper functions for dividers that would hopefully
> > make the job of calculating the best divider easier.
> Thanks for the pointers. I will take a look at it. In the meanwhile I 
> had discussions with our clock team. They pointed out that the register 
> to write the divider value here is outside the scope of clock controller 
> which makes it trickier to implement your suggestion. They are already 
> in the mailing list and we will discuss further and get back to you in 
> this regard.

Ok. Let me know if I can help answer any questions.

> >>>
> >>> Why are these noirq variants? Please add a comment.
> >> The intention is to enable the console UART port usage as late as
> >> possible in the suspend cycle. Hence noirq variants. I will add this as
> >> a comment. Please let me know if the usage does not match the intention.
> > 
> > Hm.. Does no_console_suspend not work? Or not work well enough?
> It works. When console suspend is disabled, the suspend operation does 
> not get triggered and the resume operation checks if the console suspend 
> is disabled and performs the needed thing.

Ok so then do we need the noirq variants? Or console suspend is special
enough for this to not matter?
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Karthikeyan Ramasubramanian March 9, 2018, 4:57 a.m. UTC | #15
On 3/8/2018 3:32 PM, Stephen Boyd wrote:
> Quoting Karthik Ramasubramanian (2018-03-07 22:06:29)
>>
>>
>> On 3/6/2018 2:45 PM, Stephen Boyd wrote:
>>> Quoting Karthik Ramasubramanian (2018-03-05 16:51:23)
>>>> On 3/2/2018 3:11 PM, Stephen Boyd wrote:
>>>
>>> Ok. I've seen similar designs in some mmc drivers. For example, you can
>>> look at drivers/mmc/host/meson-gx-mmc.c and see that they register a
>>> clk_ops and then just start using that clk directly from the driver.
>>> There are also some helper functions for dividers that would hopefully
>>> make the job of calculating the best divider easier.
>> Thanks for the pointers. I will take a look at it. In the meanwhile I
>> had discussions with our clock team. They pointed out that the register
>> to write the divider value here is outside the scope of clock controller
>> which makes it trickier to implement your suggestion. They are already
>> in the mailing list and we will discuss further and get back to you in
>> this regard.
> 
> Ok. Let me know if I can help answer any questions.
Ok.
> 
>>>>>
>>>>> Why are these noirq variants? Please add a comment.
>>>> The intention is to enable the console UART port usage as late as
>>>> possible in the suspend cycle. Hence noirq variants. I will add this as
>>>> a comment. Please let me know if the usage does not match the intention.
>>>
>>> Hm.. Does no_console_suspend not work? Or not work well enough?
>> It works. When console suspend is disabled, the suspend operation does
>> not get triggered and the resume operation checks if the console suspend
>> is disabled and performs the needed thing.
> 
> Ok so then do we need the noirq variants? Or console suspend is special
> enough for this to not matter?
I am a little confused as to whether you prefer the console to not 
suspend at all or you prefer the console suspend at an earlier stage 
than no_irq stage.

If it is former, then with the console_suspend_enabled flag set by 
default this seems the right thing to do. Atleast my understanding is 
that console is expecting the serial port to suspend as well.

If it is latter, then I will check the stage at which suspend_console() 
is initiated and can suspend the serial port after that.
> 
Regards,
Karthik.
Karthikeyan Ramasubramanian March 9, 2018, 6:18 p.m. UTC | #16
On 3/6/2018 2:56 PM, Stephen Boyd wrote:
> Quoting Karthik Ramasubramanian (2018-03-02 16:58:23)
>>
>>
>> On 3/2/2018 1:41 PM, Stephen Boyd wrote:
>>> Quoting Karthikeyan Ramasubramanian (2018-02-27 17:38:07)
>>>> +/**
>>>> + * geni_se_read_proto() - Read the protocol configured for a Serial Engine
>>>> + * @se:        Pointer to the concerned Serial Engine.
>>>> + *
>>>> + * Return: Protocol value as configured in the serial engine.
>>>> + */
>>>> +u32 geni_se_read_proto(struct geni_se *se)
>>>> +{
>>>> +       u32 val;
>>>> +
>>>> +       val = readl_relaxed(se->base + GENI_FW_REVISION_RO);
>>>> +
>>>> +       return (val & FW_REV_PROTOCOL_MSK) >> FW_REV_PROTOCOL_SHFT;
>>>> +}
>>>> +EXPORT_SYMBOL(geni_se_read_proto);
>>>
>>> Is this API really needed outside of this file? It would seem like the
>>> drivers that implement the protocol, which are child devices, would only
>>> use this API to confirm that the protocol chosen is for their particular
>>> protocol.
>> No, this API is meant for the protocol drivers to confirm that the
>> serial engine is programmed with the firmware for the concerned protocol
>> before using the serial engine. If the check fails, the protocol drivers
>> stop using the serial engine.
> 
> Ok maybe we don't really need it then?
Without this function the protocol drivers may not be able to verify if 
the serial engine is programmed with the right protocol. Operating on a 
serial engine that is not programmed with the right protocol leads to 
totally undefined behavior.
> >>>> +
>>>> +       ret = geni_se_clks_on(se);
>>>> +       if (ret)
>>>> +               return ret;
>>>> +
>>>> +       ret = pinctrl_pm_select_default_state(se->dev);
>>>> +       if (ret)
>>>> +               geni_se_clks_off(se);
>>>> +
>>>> +       return ret;
>>>> +}
>>>> +EXPORT_SYMBOL(geni_se_resources_on);
>>>
>>> IS there a reason why we can't use runtime PM or normal linux PM
>>> infrastructure to power on the wrapper and keep it powered while the
>>> protocol driver is active?
>> Besides turning on the clocks & pinctrl settings, wrapper also has to do
>> the bus scaling votes. The bus scaling votes depend on the individual
>> serial interface bandwidth requirements. The bus scaling votes is not
>> present currently. But once the support comes in, this function enables
>> adding it.
> 
> Ok, but that would basically be some code consolidation around picking a
> bandwidth and enabling/disabling? It sounds like it could go into either
> the serial interface drivers or into the runtime PM path of the wrapper.
Not really. SPI slaves, for example, can operate on different 
frequencies and therefore within a serial engine the bandwidth 
requirements can vary based on the slave. UART & I2C serial interfaces 
have different bandwidth requirements than SPI. So each serial interface 
driver has to specify their bandwidth requirements depending on their 
use-case. This function also allows for aggregation of the votes from 
the wrapper perspective, instead of constant RPMh communication
> 
>>>
>>>> +
>>>> +/**
>>>> + * geni_se_clk_tbl_get() - Get the clock table to program DFS
>>>> + * @se:                Pointer to the concerned Serial Engine.
>>>> + * @tbl:       Table in which the output is returned.
>>>> + *
>>>> + * This function is called by the protocol drivers to determine the different
>>>> + * clock frequencies supported by Serial Engine Core Clock. The protocol
>>>> + * drivers use the output to determine the clock frequency index to be
>>>> + * programmed into DFS.
>>>> + *
>>>> + * Return: number of valid performance levels in the table on success,
>>>> + *        standard Linux error codes on failure.
>>>> + */
>>>> +int geni_se_clk_tbl_get(struct geni_se *se, unsigned long **tbl)
>>>> +{
>>>> +       struct geni_wrapper *wrapper = se->wrapper;
>>>> +       unsigned long freq = 0;
>>>> +       int i;
>>>> +       int ret = 0;
>>>> +
>>>> +       mutex_lock(&wrapper->lock);
>>>> +       if (wrapper->clk_perf_tbl) {
>>>> +               *tbl = wrapper->clk_perf_tbl;
>>>> +               ret = wrapper->num_clk_levels;
>>>> +               goto out_unlock;
>>>> +       }
>>>> +
>>>> +       wrapper->clk_perf_tbl = kcalloc(MAX_CLK_PERF_LEVEL,
>>>> +                                       sizeof(*wrapper->clk_perf_tbl),
>>>> +                                       GFP_KERNEL);
>>>> +       if (!wrapper->clk_perf_tbl) {
>>>> +               ret = -ENOMEM;
>>>> +               goto out_unlock;
>>>> +       }
>>>> +
>>>> +       for (i = 0; i < MAX_CLK_PERF_LEVEL; i++) {
>>>> +               freq = clk_round_rate(se->clk, freq + 1);
>>>> +               if (!freq || freq == wrapper->clk_perf_tbl[i - 1])
>>>> +                       break;
>>>> +               wrapper->clk_perf_tbl[i] = freq;
>>>> +       }
>>>> +       wrapper->num_clk_levels = i;
>>>> +       *tbl = wrapper->clk_perf_tbl;
>>>> +       ret = wrapper->num_clk_levels;
>>>> +out_unlock:
>>>> +       mutex_unlock(&wrapper->lock);
>>>
>>> Is this lock actually protecting anything? I mean to say, is any more
>>> than one geni protocol driver calling this function at a time? Or is
>>> the same geni protocol driver calling this from multiple threads at the
>>> same time? The lock looks almost useless.
>> Yes, there is a possibility of multiple I2C instances within the same
>> wrapper trying to get this table simultaneously.
>>
>> As Evan mentioned in the other thread, Bjorn had the comment to move it
>> to the probe and remove the lock. I looked into the possibility of it.
>>   From the hardware perspective, this table belongs to the wrapper and is
>> shared by all the serial engines within the wrapper. But due to software
>> implementation reasons, clk_round_rate can be be performed only on the
>> clocks that are tagged as DFS compatible and only the serial engine
>> clocks are tagged so. At least this was the understanding based on our
>> earlier discussion with the concerned folks. We will revisit it and
>> check if anything has changed recently.
> 
> Hmm sounds like the round rate should happen on the parent of the
> se_clk, and this wrapper DT binding should get the clk for the parent of
> the se->clk to run round_rate() on. Then it could all be done in probe,
> which sounds good.
The parent of the se->clk is also specific to the serial engine itself. 
So putting that into the wrapper's DT binding does not look like a right 
location. For now, I will move the table to the individual serial engine 
themselves. Hence the lock can be removed.
> 
>>>> +
>>>> +/**
>>>> + * struct geni_se - GENI Serial Engine
>>>> + * @base:              Base Address of the Serial Engine's register block.
>>>> + * @dev:               Pointer to the Serial Engine device.
>>>> + * @wrapper:           Pointer to the parent QUP Wrapper core.
>>>> + * @clk:               Handle to the core serial engine clock.
>>>> + */
>>>> +struct geni_se {
>>>> +       void __iomem *base;
>>>> +       struct device *dev;
>>>> +       void *wrapper;
>>>
>>> Can this get the geni_wrapper type? It could be opaque if you like.
>> I am not sure if it is ok to have the children know the details of the
>> parent. That is why it is kept as opaque.
> 
> That's fine, but I mean to have struct geni_wrapper *wrapper, and then
> struct geni_wrapper; in this file. Children won't know details and we
> get slightly more type safety.
Ok.
> 
Regards,
Karthik.
Karthikeyan Ramasubramanian March 13, 2018, 8:16 p.m. UTC | #17
On 3/2/2018 5:11 PM, Evan Green wrote:
>> +
>> +#ifdef CONFIG_CONSOLE_POLL
>> +#define RX_BYTES_PW 1
>> +#else
>> +#define RX_BYTES_PW 4
>> +#endif
> 
> This seems fishy to me. Does either setting work? If so, why not just
> have one value?
Yes, either one works. In the interrupt driven mode, sometimes due to
increased interrupt latency the RX FIFO may overflow if we use only 1
byte per FIFO word - given there are no flow control lines in the debug
uart. Hence using 4 bytes in the FIFO word will help to prevent the FIFO
overflow - especially in the case where commands are executed through
scripts.

In polling mode, using 1 byte per word helps to use the hardware to
buffer the data instead of software buffering especially when the
framework keeps reading one byte at a time.
> 
>> +static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
>> +                               int offset, int bit_field, bool set)
>> +{
>> +       u32 reg;
>> +       struct qcom_geni_serial_port *port;
>> +       unsigned int baud;
>> +       unsigned int fifo_bits;
>> +       unsigned long timeout_us = 20000;
>> +
>> +       /* Ensure polling is not re-ordered before the prior writes/reads */
>> +       mb();
>> +
>> +       if (uport->private_data) {
>> +               port = to_dev_port(uport, uport);
>> +               baud = port->cur_baud;
>> +               if (!baud)
>> +                       baud = 115200;
>> +               fifo_bits = port->tx_fifo_depth * port->tx_fifo_width;
>> +               /*
>> +                * Total polling iterations based on FIFO worth of bytes to be
>> +                * sent at current baud .Add a little fluff to the wait.
>> +                */
>> +               timeout_us = ((fifo_bits * USEC_PER_SEC) / baud) + 500;
> 
> This fluff is a little mysterious, can it be explained at all? Do you
> think the fluff factor is in units of time (as you have it) or bits?
> Time makes sense I guess if we're worried about clock source
> differences.
The fluff is in micro-seconds and can help with unforeseen delays in
emulation platforms.
> 
>> +
>> +static void qcom_geni_serial_console_write(struct console *co, const char *s,
>> +                             unsigned int count)
>> +{
>> +       struct uart_port *uport;
>> +       struct qcom_geni_serial_port *port;
>> +       bool locked = true;
>> +       unsigned long flags;
>> +
>> +       WARN_ON(co->index < 0 || co->index >= GENI_UART_CONS_PORTS);
>> +
>> +       port = get_port_from_line(co->index);
>> +       if (IS_ERR(port))
>> +               return;
>> +
>> +       uport = &port->uport;
>> +       if (oops_in_progress)
>> +               locked = spin_trylock_irqsave(&uport->lock, flags);
>> +       else
>> +               spin_lock_irqsave(&uport->lock, flags);
>> +
>> +       if (locked) {
>> +               __qcom_geni_serial_console_write(uport, s, count);
>> +               spin_unlock_irqrestore(&uport->lock, flags);
> 
> I too am a little lost on the locking here. What exactly is the lock
> protecting? Looks like for the most part it's trying to synchronize
> with the ISR? What specifically in the ISR? I just wanted to go
> through and check to make sure whatever the shared resource is is
> appropriately protected.
The lock protects 2 simultaneous writers from putting the hardware in
the bad state. The output of the command entered in a shell can trigger
a write in the interrupt context while logging activity can trigger a
simultaneous write.
> 
>> +       }
>> +}
>> +
>> +static int handle_rx_console(struct uart_port *uport, u32 rx_bytes, bool drop)
>> +{
>> +       u32 i = rx_bytes;
>> +       u32 rx_fifo;
>> +       unsigned char *buf;
>> +       struct tty_port *tport;
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +
>> +       tport = &uport->state->port;
>> +       while (i > 0) {
>> +               int c;
>> +               int bytes = i > port->rx_bytes_pw ? port->rx_bytes_pw : i;
> 
> Please replace this with a min macro.
Ok.
> 
>> +static int qcom_geni_serial_handle_tx(struct uart_port *uport)
>> +{
>> +       int ret = 0;
>> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
>> +       struct circ_buf *xmit = &uport->state->xmit;
>> +       size_t avail;
>> +       size_t remaining;
>> +       int i = 0;
>> +       u32 status;
>> +       unsigned int chunk;
>> +       int tail;
>> +
>> +       chunk = uart_circ_chars_pending(xmit);
>> +       status = readl_relaxed(uport->membase + SE_GENI_TX_FIFO_STATUS);
>> +       /* Both FIFO and framework buffer are drained */
>> +       if ((chunk == port->xmit_size) && !status) {
>> +               port->xmit_size = 0;
>> +               uart_circ_clear(xmit);
>> +               qcom_geni_serial_stop_tx(uport);
>> +               goto out_write_wakeup;
>> +       }
>> +       chunk -= port->xmit_size;
>> +
>> +       avail = (port->tx_fifo_depth - port->tx_wm) * port->tx_bytes_pw;
>> +       tail = (xmit->tail + port->xmit_size) & (UART_XMIT_SIZE - 1);
>> +       if (chunk > (UART_XMIT_SIZE - tail))
>> +               chunk = UART_XMIT_SIZE - tail;
>> +       if (chunk > avail)
>> +               chunk = avail;
>> +
>> +       if (!chunk)
>> +               goto out_write_wakeup;
>> +
>> +       qcom_geni_serial_setup_tx(uport, chunk);
>> +
>> +       remaining = chunk;
>> +       while (i < chunk) {
>> +               unsigned int tx_bytes;
>> +               unsigned int buf = 0;
>> +               int c;
>> +
>> +               tx_bytes = min_t(size_t, remaining, (size_t)port->tx_bytes_pw);
>> +               for (c = 0; c < tx_bytes ; c++)
>> +                       buf |= (xmit->buf[tail + c] << (c * BITS_PER_BYTE));
>> +
>> +               writel_relaxed(buf, uport->membase + SE_GENI_TX_FIFOn);
>> +
>> +               i += tx_bytes;
>> +               tail = (tail + tx_bytes) & (UART_XMIT_SIZE - 1);
>> +               uport->icount.tx += tx_bytes;
>> +               remaining -= tx_bytes;
>> +       }
>> +       qcom_geni_serial_poll_tx_done(uport);
>> +       port->xmit_size += chunk;
>> +out_write_wakeup:
>> +       uart_write_wakeup(uport);
>> +       return ret;
>> +}
> 
> This function can't fail, please change the return type to void.
Ok.
> 
>> +
>> +static void qcom_geni_serial_shutdown(struct uart_port *uport)
>> +{
>> +       unsigned long flags;
>> +
>> +       /* Stop the console before stopping the current tx */
>> +       console_stop(uport->cons);
>> +
>> +       disable_irq(uport->irq);
>> +       free_irq(uport->irq, uport);
>> +       spin_lock_irqsave(&uport->lock, flags);
>> +       qcom_geni_serial_stop_tx(uport);
>> +       qcom_geni_serial_stop_rx(uport);
>> +       spin_unlock_irqrestore(&uport->lock, flags);
> 
> This is one part of where I'm confused. What are we protecting here
> with the lock? disable_irq waits for any pending ISRs to finish
> according to its comment, so you know you're not racing with the ISR.
In android, console shutdown can be invoked while console write happens.
This lock prevents shutdown from not interfering with the write and
vice-versa.
> 
>> +static void geni_serial_write_term_regs(struct uart_port *uport,
>> +               u32 tx_trans_cfg, u32 tx_parity_cfg, u32 rx_trans_cfg,
>> +               u32 rx_parity_cfg, u32 bits_per_char, u32 stop_bit_len,
>> +               u32 s_clk_cfg)
>> +{
>> +       writel_relaxed(tx_trans_cfg, uport->membase + SE_UART_TX_TRANS_CFG);
>> +       writel_relaxed(tx_parity_cfg, uport->membase + SE_UART_TX_PARITY_CFG);
>> +       writel_relaxed(rx_trans_cfg, uport->membase + SE_UART_RX_TRANS_CFG);
>> +       writel_relaxed(rx_parity_cfg, uport->membase + SE_UART_RX_PARITY_CFG);
>> +       writel_relaxed(bits_per_char, uport->membase + SE_UART_TX_WORD_LEN);
>> +       writel_relaxed(bits_per_char, uport->membase + SE_UART_RX_WORD_LEN);
>> +       writel_relaxed(stop_bit_len, uport->membase + SE_UART_TX_STOP_BIT_LEN);
>> +       writel_relaxed(s_clk_cfg, uport->membase + GENI_SER_M_CLK_CFG);
>> +       writel_relaxed(s_clk_cfg, uport->membase + GENI_SER_S_CLK_CFG);
>> +}
>> +
> 
> I agree with Stephen's comment, this should be inlined into the single
> place it's called from.
> 
> Thanks Karthik!
> -Evan
> 
Regards,
Karthik.
Evan Green March 16, 2018, 6:39 p.m. UTC | #18
Hi Karthik,

On Tue, Mar 13, 2018 at 1:16 PM Karthik Ramasubramanian <
kramasub@codeaurora.org> wrote:



> On 3/2/2018 5:11 PM, Evan Green wrote:
> >> +
> >> +#ifdef CONFIG_CONSOLE_POLL
> >> +#define RX_BYTES_PW 1
> >> +#else
> >> +#define RX_BYTES_PW 4
> >> +#endif
> >
> > This seems fishy to me. Does either setting work? If so, why not just
> > have one value?
> Yes, either one works. In the interrupt driven mode, sometimes due to
> increased interrupt latency the RX FIFO may overflow if we use only 1
> byte per FIFO word - given there are no flow control lines in the debug
> uart. Hence using 4 bytes in the FIFO word will help to prevent the FIFO
> overflow - especially in the case where commands are executed through
> scripts.
> In polling mode, using 1 byte per word helps to use the hardware to
> buffer the data instead of software buffering especially when the
> framework keeps reading one byte at a time.

Ok, I think I understand. Let me paraphrase in case I'm wrong: Normally,
you want all 4 bytes per word so that you use the hardware's full FIFO
capability. This works out well since on receive you tell the system how
many bytes you've received, so you're never stuck with leftover bytes.

In polling mode, however, the system asks you for one byte, and the problem
is with 4 bytes per FIFO word you may end up having read three additional
bytes that you don't know what to do with. Configuring the UART to 1 byte
per word allows you to skip coding up a couple of conditionals and an extra
couple u32s in the device struct for saving those extra bytes, but divides
the hardware FIFO size by 4.

It seems a little cheesy to me just to avoid a bit of logic, but I'm not
going to put my foot down about it. I guess it might get complicated when
the console pulls in four bytes, but then only ends up eating one of them,
and then we have to figure out how to get the other three into the normal
ISR-based path.

> >
> >> +static bool qcom_geni_serial_poll_bit(struct uart_port *uport,
> >> +                               int offset, int bit_field, bool set)
> >> +{
> >> +       u32 reg;
> >> +       struct qcom_geni_serial_port *port;
> >> +       unsigned int baud;
> >> +       unsigned int fifo_bits;
> >> +       unsigned long timeout_us = 20000;
> >> +
> >> +       /* Ensure polling is not re-ordered before the prior
writes/reads */
> >> +       mb();
> >> +
> >> +       if (uport->private_data) {
> >> +               port = to_dev_port(uport, uport);
> >> +               baud = port->cur_baud;
> >> +               if (!baud)
> >> +                       baud = 115200;
> >> +               fifo_bits = port->tx_fifo_depth * port->tx_fifo_width;
> >> +               /*
> >> +                * Total polling iterations based on FIFO worth of
bytes to be
> >> +                * sent at current baud .Add a little fluff to the
wait.
> >> +                */
> >> +               timeout_us = ((fifo_bits * USEC_PER_SEC) / baud) + 500;
> >
> > This fluff is a little mysterious, can it be explained at all? Do you
> > think the fluff factor is in units of time (as you have it) or bits?
> > Time makes sense I guess if we're worried about clock source
> > differences.
> The fluff is in micro-seconds and can help with unforeseen delays in
> emulation platforms.

So emulated platforms go out to lunch, but that generally doesn't depend on
baud rate or how many bits there are. Ok. Might be worth noting what that's
for.

> >
> >> +
> >> +static void qcom_geni_serial_console_write(struct console *co, const
char *s,
> >> +                             unsigned int count)
> >> +{
> >> +       struct uart_port *uport;
> >> +       struct qcom_geni_serial_port *port;
> >> +       bool locked = true;
> >> +       unsigned long flags;
> >> +
> >> +       WARN_ON(co->index < 0 || co->index >= GENI_UART_CONS_PORTS);
> >> +
> >> +       port = get_port_from_line(co->index);
> >> +       if (IS_ERR(port))
> >> +               return;
> >> +
> >> +       uport = &port->uport;
> >> +       if (oops_in_progress)
> >> +               locked = spin_trylock_irqsave(&uport->lock, flags);
> >> +       else
> >> +               spin_lock_irqsave(&uport->lock, flags);
> >> +
> >> +       if (locked) {
> >> +               __qcom_geni_serial_console_write(uport, s, count);
> >> +               spin_unlock_irqrestore(&uport->lock, flags);
> >
> > I too am a little lost on the locking here. What exactly is the lock
> > protecting? Looks like for the most part it's trying to synchronize
> > with the ISR? What specifically in the ISR? I just wanted to go
> > through and check to make sure whatever the shared resource is is
> > appropriately protected.
> The lock protects 2 simultaneous writers from putting the hardware in
> the bad state. The output of the command entered in a shell can trigger
> a write in the interrupt context while logging activity can trigger a
> simultaneous write.

Can you be any more specific here? What puts the hardware in a bad state?
Maybe I see what you're referring to, where both writers see the FIFO has
space and then end up overrunning it because they both add to it.
Similarly, you wouldn't want both the ISR and the polling code to read the
FIFO status and pull bytes out of the FIFO on rx.

If it's the FIFO that's being protected, then:
1. What about qcom_geni_serial_poll_put_char? It writes to the FIFO but
appears not to be locked when called directly via uart_ops. Up in
serial_core.c it looks like that lock is used for configuration changes,
but not transmit.
2. Same with qcom_geni_serial_get_char.

Based on your comment below, I'm coming to understand that the lock is
protecting the FIFO, whose state is partially in the IRQ_STATUS registers.
Shutdown and the ISR alter those bits, so that's why they're locked. This
finally makes sense to me, but took a lot of work to figure out. The lock
doesn't protect all of the bits in the IRQ registers, just a couple of
bits. Perhaps a comment somewhere indicating exactly what is being
protected (the FIFO, and the FIFO status bits specifically in the IRQ
registers) would be helpful to future folks trying to understand.

> >
> >> +       }
> >> +}
> >> +
> >> +static int handle_rx_console(struct uart_port *uport, u32 rx_bytes,
bool drop)
> >> +{
> >> +       u32 i = rx_bytes;
> >> +       u32 rx_fifo;
> >> +       unsigned char *buf;
> >> +       struct tty_port *tport;
> >> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> >> +
> >> +       tport = &uport->state->port;
> >> +       while (i > 0) {
> >> +               int c;
> >> +               int bytes = i > port->rx_bytes_pw ? port->rx_bytes_pw
: i;
> >
> > Please replace this with a min macro.
> Ok.
> >
> >> +static int qcom_geni_serial_handle_tx(struct uart_port *uport)
> >> +{
> >> +       int ret = 0;
> >> +       struct qcom_geni_serial_port *port = to_dev_port(uport, uport);
> >> +       struct circ_buf *xmit = &uport->state->xmit;
> >> +       size_t avail;
> >> +       size_t remaining;
> >> +       int i = 0;
> >> +       u32 status;
> >> +       unsigned int chunk;
> >> +       int tail;
> >> +
> >> +       chunk = uart_circ_chars_pending(xmit);
> >> +       status = readl_relaxed(uport->membase +
SE_GENI_TX_FIFO_STATUS);
> >> +       /* Both FIFO and framework buffer are drained */
> >> +       if ((chunk == port->xmit_size) && !status) {
> >> +               port->xmit_size = 0;
> >> +               uart_circ_clear(xmit);
> >> +               qcom_geni_serial_stop_tx(uport);
> >> +               goto out_write_wakeup;
> >> +       }
> >> +       chunk -= port->xmit_size;
> >> +
> >> +       avail = (port->tx_fifo_depth - port->tx_wm) *
port->tx_bytes_pw;
> >> +       tail = (xmit->tail + port->xmit_size) & (UART_XMIT_SIZE - 1);
> >> +       if (chunk > (UART_XMIT_SIZE - tail))
> >> +               chunk = UART_XMIT_SIZE - tail;
> >> +       if (chunk > avail)
> >> +               chunk = avail;
> >> +
> >> +       if (!chunk)
> >> +               goto out_write_wakeup;
> >> +
> >> +       qcom_geni_serial_setup_tx(uport, chunk);
> >> +
> >> +       remaining = chunk;
> >> +       while (i < chunk) {
> >> +               unsigned int tx_bytes;
> >> +               unsigned int buf = 0;
> >> +               int c;
> >> +
> >> +               tx_bytes = min_t(size_t, remaining,
(size_t)port->tx_bytes_pw);
> >> +               for (c = 0; c < tx_bytes ; c++)
> >> +                       buf |= (xmit->buf[tail + c] << (c *
BITS_PER_BYTE));
> >> +
> >> +               writel_relaxed(buf, uport->membase + SE_GENI_TX_FIFOn);
> >> +
> >> +               i += tx_bytes;
> >> +               tail = (tail + tx_bytes) & (UART_XMIT_SIZE - 1);
> >> +               uport->icount.tx += tx_bytes;
> >> +               remaining -= tx_bytes;
> >> +       }
> >> +       qcom_geni_serial_poll_tx_done(uport);
> >> +       port->xmit_size += chunk;
> >> +out_write_wakeup:
> >> +       uart_write_wakeup(uport);
> >> +       return ret;
> >> +}
> >
> > This function can't fail, please change the return type to void.
> Ok.
> >
> >> +
> >> +static void qcom_geni_serial_shutdown(struct uart_port *uport)
> >> +{
> >> +       unsigned long flags;
> >> +
> >> +       /* Stop the console before stopping the current tx */
> >> +       console_stop(uport->cons);
> >> +
> >> +       disable_irq(uport->irq);
> >> +       free_irq(uport->irq, uport);
> >> +       spin_lock_irqsave(&uport->lock, flags);
> >> +       qcom_geni_serial_stop_tx(uport);
> >> +       qcom_geni_serial_stop_rx(uport);
> >> +       spin_unlock_irqrestore(&uport->lock, flags);
> >
> > This is one part of where I'm confused. What are we protecting here
> > with the lock? disable_irq waits for any pending ISRs to finish
> > according to its comment, so you know you're not racing with the ISR.
> In android, console shutdown can be invoked while console write happens.
> This lock prevents shutdown from not interfering with the write and
> vice-versa.

-Evan
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html