mbox series

[v3,0/6] PM / devfreq: Add dynamic scaling for imx ddr controller

Message ID cover.1572558427.git.leonard.crestez@nxp.com
Headers show
Series PM / devfreq: Add dynamic scaling for imx ddr controller | expand

Message

Leonard Crestez Oct. 31, 2019, 9:50 p.m. UTC
This adds support for dynamic scaling of the DDR Controller (ddrc) present in
imx8m series. Actual frequency switching is implemented inside TF-A, this
driver wraps the SMC calls and synchronizes the clk tree.

DRAM frequency switching requires clock manipulation but during this operation
DRAM itself is briefly inaccessible so this operation is performed a SMC call
to by TF-A which runs from a SRAM area. Upon returning to linux the clock tree
is updated to correspond to hardware configuration.

This is handled via CLK_GET_RATE_NO_CACHE for dividers but muxes are handled
manually: the driver will prepare/enable the new parents ahead of switching (so
that the expected roots are enabled) and afterwards it will call clk_set_parent
to ensure the parents in clock framework are up-to-date.

This series is atomically useful and roughly similar to devfreq drivers for
tegra and rockchip.

Running at lower dram rates saves power but can affect the functionality of
other blocks in the chip (display, vpu etc). Support for in-kernel constraints
will some separately.

Angus/Martin: You previously attempted to test on purism boards, this updated
version should work without hacks and has no dependencies.

Changes since v2:
* Add support for entire imx8m family including imx8mq B0.
* Also mark dram PLLs as CLK_GET_RATE_NO_CACHE (required for imx8mq b0 low OPP)
* Explicitly update dram pll rate at the end of imx_ddrc_set_freq.
* Use do_div in imx-ddrc (kbuild robot)
* Improve explanations around adding CLK_GET_RATE_NO_CACHE to dram clks.
(Stephen Boyd)
* Handle ddrc devfreq-events earlier for fewer probe defers.
* Validate DDRC opp tables versus firmware: supported OPPs depend on board and
SOC revision.
* Move DDRC opp tables to board dts because they can vary based on ram type on
board.
* Verify DDRC rate is changed in clk tree and otherwise report an error.
* Change imx_ddrc_freq.rate to be measure in MT/s and round down from HZ in
imx_ddrc_find_freq instead.
* Split NOC scaling away.
Link to v2: https://patchwork.kernel.org/cover/11104113/

Changes since v1:
* bindings: Stop using "contains" for "compatible"
* bindings: Set "additionalProperties: false" and document missing stuff.
* Remove (c) from NXP copyright notice
* Fix various checkpatch issues
* Remove unused dram_alt_root clk from imx-ddrc
Link to v1: https://patchwork.kernel.org/cover/11090649/

Changes since RFC v3:
* Implement passive support and set NOC's parent to DDRC
* Drop scaling AHB/AXI for now (NOC/DDRC use most power anyway)
* Stop relying on clk_min_rate
* Split into two devreq drivers (and bindings) because the ddrc is
really a distinct piece of hardware.
* Perform DRAM frequency inside devfreq instead of clk, mostly due to
objections to earlier RFCs for imx8m-dram-clk.
* Fetch info about dram clk parents from firmware instead of
hardcoding in driver. This can more easily support additional rates.
* Link: https://patchwork.kernel.org/cover/11056779/
* Link: https://patchwork.kernel.org/patch/11049429/

Scaling buses can cause problems for devices with realtime bandwith
requirements such as display, the intention is to use the interconnect
framework to make DEV_PM_QOS_MIN_FREQUENCY to devfreq. There are
separate patches for that:

* https://patchwork.kernel.org/cover/11104055/
* https://patchwork.kernel.org/cover/11078671/


Leonard Crestez (6):
  clk: imx8m: Set CLK_GET_RATE_NOCACHE on dram clocks
  clk: imx: Mark dram pll on 8mm and 8mn with CLK_GET_RATE_NOCACHE
  dt-bindings: devfreq: Add bindings for imx ddr controller
  PM / devfreq: Add dynamic scaling for imx ddr controller
  PM / devfreq: imx-ddrc: Measure bandwidth with perf
  arm64: dts: imx8m: Add ddr controller nodes

 .../devicetree/bindings/devfreq/imx-ddrc.yaml |  60 ++
 arch/arm64/boot/dts/freescale/imx8mm-evk.dts  |  18 +
 arch/arm64/boot/dts/freescale/imx8mm.dtsi     |  17 +-
 .../boot/dts/freescale/imx8mn-ddr4-evk.dts    |  18 +
 arch/arm64/boot/dts/freescale/imx8mn.dtsi     |  16 +-
 arch/arm64/boot/dts/freescale/imx8mq-evk.dts  |  24 +
 arch/arm64/boot/dts/freescale/imx8mq.dtsi     |  16 +-
 drivers/clk/imx/clk-imx8mm.c                  |  13 +-
 drivers/clk/imx/clk-imx8mn.c                  |  14 +-
 drivers/clk/imx/clk-imx8mq.c                  |  15 +-
 drivers/clk/imx/clk-pll14xx.c                 |   7 +
 drivers/clk/imx/clk.h                         |   1 +
 drivers/devfreq/Makefile                      |   1 +
 drivers/devfreq/imx-ddrc.c                    | 570 ++++++++++++++++++
 14 files changed, 777 insertions(+), 13 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/devfreq/imx-ddrc.yaml
 create mode 100644 drivers/devfreq/imx-ddrc.c

Comments

Rob Herring Nov. 4, 2019, 10:01 p.m. UTC | #1
On Thu, Oct 31, 2019 at 11:50:27PM +0200, Leonard Crestez wrote:
> This is used by the imx-ddrc devfreq driver to implement dynamic
> frequency scaling of DRAM.
> 
> Add a devfreq-event link to the dram PMU in order to support on-demand
> scaling of ddrc based on measured dram bandwidth usage.
> 
> Support for proactive scaling via interconnect will come later. The
> high-performance bus masters which need that (display, vpu, gpu) are not
> yet enabled in upstream anyway.
> 
> Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
> ---
>  arch/arm64/boot/dts/freescale/imx8mm-evk.dts  | 18 ++++++++++++++
>  arch/arm64/boot/dts/freescale/imx8mm.dtsi     | 17 ++++++++++++-
>  .../boot/dts/freescale/imx8mn-ddr4-evk.dts    | 18 ++++++++++++++
>  arch/arm64/boot/dts/freescale/imx8mn.dtsi     | 16 ++++++++++++-
>  arch/arm64/boot/dts/freescale/imx8mq-evk.dts  | 24 +++++++++++++++++++
>  arch/arm64/boot/dts/freescale/imx8mq.dtsi     | 16 ++++++++++++-
>  6 files changed, 106 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/boot/dts/freescale/imx8mm-evk.dts b/arch/arm64/boot/dts/freescale/imx8mm-evk.dts
> index 4f5e408d6e6a..be9abd8e4478 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mm-evk.dts
> +++ b/arch/arm64/boot/dts/freescale/imx8mm-evk.dts
> @@ -69,16 +69,34 @@
>  		simple-audio-card,codec {
>  			sound-dai = <&wm8524>;
>  			clocks = <&clk IMX8MM_CLK_SAI3_ROOT>;
>  		};
>  	};
> +
> +	ddrc_opp_table: ddrc-opp-table {
> +		compatible = "operating-points-v2";
> +
> +		opp-25M {
> +			opp-hz = /bits/ 64 <25000000>;
> +		};
> +		opp-100M {
> +			opp-hz = /bits/ 64 <100000000>;
> +		};
> +		opp-750M {
> +			opp-hz = /bits/ 64 <750000000>;
> +		};
> +	};
>  };
>  
>  &A53_0 {
>  	cpu-supply = <&buck2_reg>;
>  };
>  
> +&ddrc {
> +	operating-points-v2 = <&ddrc_opp_table>;
> +};
> +
>  &fec1 {
>  	pinctrl-names = "default";
>  	pinctrl-0 = <&pinctrl_fec1>;
>  	phy-mode = "rgmii-id";
>  	phy-handle = <&ethphy0>;
> diff --git a/arch/arm64/boot/dts/freescale/imx8mm.dtsi b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> index 6edbdfe2d0d7..5404870d80d5 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> +++ b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> @@ -856,11 +856,26 @@
>  			#interrupt-cells = <3>;
>  			interrupt-controller;
>  			interrupts = <GIC_PPI 9 IRQ_TYPE_LEVEL_HIGH>;
>  		};
>  
> -		ddr-pmu@3d800000 {
> +		ddrc: dram-controller@3d400000 {
> +			compatible = "fsl,imx8mm-ddrc", "fsl,imx8m-ddrc";
> +			reg = <0x3d400000 0x400000>;

Do you really need the OS to map 4MB of register space? Virtual 
space on 64-bit doesn't matter, but it's still wasting 2KB of memory 
just to map all that if only a few pages are needed. Adds up if the 
whole DT is done this way.

> +			clock-names = "dram_core",
> +				      "dram_pll",
> +				      "dram_alt",
> +				      "dram_apb";
> +			clocks = <&clk IMX8MM_CLK_DRAM_CORE>,
> +				 <&clk IMX8MM_DRAM_PLL>,
> +				 <&clk IMX8MM_CLK_DRAM_ALT>,
> +				 <&clk IMX8MM_CLK_DRAM_APB>;
> +			devfreq-events = <&ddr_pmu>;
> +			operating-points-v2 = <&ddrc_opp_table>;
> +		};
> +
> +		ddr_pmu: ddr-pmu@3d800000 {
>  			compatible = "fsl,imx8mm-ddr-pmu", "fsl,imx8m-ddr-pmu";
>  			reg = <0x3d800000 0x400000>;
>  			interrupt-parent = <&gic>;
>  			interrupts = <GIC_SPI 98 IRQ_TYPE_LEVEL_HIGH>;
>  		};
> diff --git a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> index 071949412caf..ab2060667671 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> +++ b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> @@ -9,16 +9,34 @@
>  #include "imx8mn-evk.dtsi"
>  
>  / {
>  	model = "NXP i.MX8MNano DDR4 EVK board";
>  	compatible = "fsl,imx8mn-ddr4-evk", "fsl,imx8mn";
> +
> +	ddrc_opp_table: ddrc-opp-table {

I think it would be better to put this under the ddrc node (and named 
'opp-table'). Yes, it's kind of silly to have a phandle to a child node, 
but that still works.

> +		compatible = "operating-points-v2";
> +
> +		opp-25M {
> +			opp-hz = /bits/ 64 <25000000>;
> +		};
> +		opp-100M {
> +			opp-hz = /bits/ 64 <100000000>;
> +		};
> +		opp-600M {
> +			opp-hz = /bits/ 64 <600000000>;
> +		};
> +	};
>  };
>  
>  &A53_0 {
>  	cpu-supply = <&buck2_reg>;
>  };
>  
> +&ddrc {
> +	operating-points-v2 = <&ddrc_opp_table>;
> +};
> +
>  &i2c1 {
>  	pmic@4b {
>  		compatible = "rohm,bd71847";
>  		reg = <0x4b>;
>  		pinctrl-0 = <&pinctrl_pmic>;
> diff --git a/arch/arm64/boot/dts/freescale/imx8mn.dtsi b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
> index e91625063f8e..344dd777635f 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mn.dtsi
> +++ b/arch/arm64/boot/dts/freescale/imx8mn.dtsi
> @@ -757,11 +757,25 @@
>  			#interrupt-cells = <3>;
>  			interrupt-controller;
>  			interrupts = <GIC_PPI 9 IRQ_TYPE_LEVEL_HIGH>;
>  		};
>  
> -		ddr-pmu@3d800000 {
> +		ddrc: dram-controller@3d400000 {
> +			compatible = "fsl,imx8mn-ddrc", "fsl,imx8m-ddrc";
> +			reg = <0x3d400000 0x400000>;
> +			clock-names = "dram_core",
> +				      "dram_pll",
> +				      "dram_alt",
> +				      "dram_apb";
> +			clocks = <&clk IMX8MN_CLK_DRAM_CORE>,
> +				 <&clk IMX8MN_DRAM_PLL>,
> +				 <&clk IMX8MN_CLK_DRAM_ALT>,
> +				 <&clk IMX8MN_CLK_DRAM_APB>;
> +			devfreq-events = <&ddr_pmu>;
> +		};
> +
> +		ddr_pmu: ddr-pmu@3d800000 {
>  			compatible = "fsl,imx8mn-ddr-pmu", "fsl,imx8m-ddr-pmu";
>  			reg = <0x3d800000 0x400000>;
>  			interrupts = <GIC_SPI 98 IRQ_TYPE_LEVEL_HIGH>;
>  		};
>  	};
> diff --git a/arch/arm64/boot/dts/freescale/imx8mq-evk.dts b/arch/arm64/boot/dts/freescale/imx8mq-evk.dts
> index c36685916683..fc4c12ab8991 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mq-evk.dts
> +++ b/arch/arm64/boot/dts/freescale/imx8mq-evk.dts
> @@ -85,10 +85,30 @@
>  		link_codec: simple-audio-card,codec {
>  			sound-dai = <&wm8524>;
>  			clocks = <&clk IMX8MQ_CLK_SAI2_ROOT>;
>  		};
>  	};
> +
> +	ddrc_opp_table: ddrc-opp-table {
> +		compatible = "operating-points-v2";
> +
> +		opp-25M {
> +			opp-hz = /bits/ 64 <25000000>;
> +		};
> +		opp-100M {
> +			opp-hz = /bits/ 64 <100000000>;
> +		};
> +		/*
> +		 * On imx8mq B0 PLL can't be bypassed so low bus is 166M
> +		 */
> +		opp-166M {
> +			opp-hz = /bits/ 64 <166935483>;
> +		};
> +		opp-800M {
> +			opp-hz = /bits/ 64 <800000000>;
> +		};
> +	};
>  };
>  
>  &A53_0 {
>  	cpu-supply = <&buck2_reg>;
>  };
> @@ -103,10 +123,14 @@
>  
>  &A53_3 {
>  	cpu-supply = <&buck2_reg>;
>  };
>  
> +&ddrc {
> +	operating-points-v2 = <&ddrc_opp_table>;
> +};
> +
>  &fec1 {
>  	pinctrl-names = "default";
>  	pinctrl-0 = <&pinctrl_fec1>;
>  	phy-mode = "rgmii-id";
>  	phy-handle = <&ethphy0>;
> diff --git a/arch/arm64/boot/dts/freescale/imx8mq.dtsi b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
> index 7f9319452b58..6ef1af41ef68 100644
> --- a/arch/arm64/boot/dts/freescale/imx8mq.dtsi
> +++ b/arch/arm64/boot/dts/freescale/imx8mq.dtsi
> @@ -1111,11 +1111,25 @@
>  			interrupt-controller;
>  			interrupts = <GIC_PPI 9 IRQ_TYPE_LEVEL_HIGH>;
>  			interrupt-parent = <&gic>;
>  		};
>  
> -		ddr-pmu@3d800000 {
> +		ddrc: dram-controller@3d400000 {
> +			compatible = "fsl,imx8mq-ddrc", "fsl,imx8m-ddrc";
> +			reg = <0x3d400000 0x400000>;
> +			clock-names = "dram_core",
> +				      "dram_pll",
> +				      "dram_alt",
> +				      "dram_apb";
> +			clocks = <&clk IMX8MQ_CLK_DRAM_CORE>,
> +				 <&clk IMX8MQ_DRAM_PLL_OUT>,
> +				 <&clk IMX8MQ_CLK_DRAM_ALT>,
> +				 <&clk IMX8MQ_CLK_DRAM_APB>;
> +			devfreq-events = <&ddr_pmu>;
> +		};
> +
> +		ddr_pmu: ddr-pmu@3d800000 {
>  			compatible = "fsl,imx8mq-ddr-pmu", "fsl,imx8m-ddr-pmu";
>  			reg = <0x3d800000 0x400000>;
>  			interrupt-parent = <&gic>;
>  			interrupts = <GIC_SPI 98 IRQ_TYPE_LEVEL_HIGH>;
>  		};
> -- 
> 2.17.1
>
Leonard Crestez Nov. 11, 2019, 2:29 p.m. UTC | #2
On Mon, 2019-11-04 at 16:01 -0600, Rob Herring wrote:
> On Thu, Oct 31, 2019 at 11:50:27PM +0200, Leonard Crestez wrote:
> > This is used by the imx-ddrc devfreq driver to implement dynamic
> > frequency scaling of DRAM.
> > 
> > Add a devfreq-event link to the dram PMU in order to support on-
> > demand scaling of ddrc based on measured dram bandwidth usage.
> > 
> > Support for proactive scaling via interconnect will come later. The
> > high-performance bus masters which need that (display, vpu, gpu)
> > are not yet enabled in upstream anyway.
> > 
> > Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
> > ---
> >  arch/arm64/boot/dts/freescale/imx8mm-evk.dts  | 18 ++++++++++++++
> >  arch/arm64/boot/dts/freescale/imx8mm.dtsi     | 17 ++++++++++++-
> >  .../boot/dts/freescale/imx8mn-ddr4-evk.dts    | 18 ++++++++++++++
> >  arch/arm64/boot/dts/freescale/imx8mn.dtsi     | 16 ++++++++++++-
> >  arch/arm64/boot/dts/freescale/imx8mq-evk.dts  | 24
> > +++++++++++++++++++
> >  arch/arm64/boot/dts/freescale/imx8mq.dtsi     | 16 ++++++++++++-
> >  6 files changed, 106 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/arm64/boot/dts/freescale/imx8mm-evk.dts
> > b/arch/arm64/boot/dts/freescale/imx8mm-evk.dts
> > index 4f5e408d6e6a..be9abd8e4478 100644
> > --- a/arch/arm64/boot/dts/freescale/imx8mm-evk.dts
> > +++ b/arch/arm64/boot/dts/freescale/imx8mm-evk.dts
> > @@ -69,16 +69,34 @@
> >  		simple-audio-card,codec {
> >  			sound-dai = <&wm8524>;
> >  			clocks = <&clk IMX8MM_CLK_SAI3_ROOT>;
> >  		};
> >  	};
> > +
> > +	ddrc_opp_table: ddrc-opp-table {
> > +		compatible = "operating-points-v2";
> > +
> > +		opp-25M {
> > +			opp-hz = /bits/ 64 <25000000>;
> > +		};
> > +		opp-100M {
> > +			opp-hz = /bits/ 64 <100000000>;
> > +		};
> > +		opp-750M {
> > +			opp-hz = /bits/ 64 <750000000>;
> > +		};
> > +	};
> >  };
> >  
> >  &A53_0 {
> >  	cpu-supply = <&buck2_reg>;
> >  };
> >  
> > +&ddrc {
> > +	operating-points-v2 = <&ddrc_opp_table>;
> > +};
> > +
> >  &fec1 {
> >  	pinctrl-names = "default";
> >  	pinctrl-0 = <&pinctrl_fec1>;
> >  	phy-mode = "rgmii-id";
> >  	phy-handle = <&ethphy0>;
> > diff --git a/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> > b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> > index 6edbdfe2d0d7..5404870d80d5 100644
> > --- a/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> > +++ b/arch/arm64/boot/dts/freescale/imx8mm.dtsi
> > @@ -856,11 +856,26 @@
> >  			#interrupt-cells = <3>;
> >  			interrupt-controller;
> >  			interrupts = <GIC_PPI 9 IRQ_TYPE_LEVEL_HIGH>;
> >  		};
> >  
> > -		ddr-pmu@3d800000 {
> > +		ddrc: dram-controller@3d400000 {
> > +			compatible = "fsl,imx8mm-ddrc", "fsl,imx8m-
> > ddrc";
> > +			reg = <0x3d400000 0x400000>;
> 
> Do you really need the OS to map 4MB of register space? Virtual 
> space on 64-bit doesn't matter, but it's still wasting 2KB of memory 
> just to map all that if only a few pages are needed. Adds up if the 
> whole DT is done this way.

This driver doesn't actually map registers, they're only touched from
firmware.

Information is from memory map.

> > +			clock-names = "dram_core",
> > +				      "dram_pll",
> > +				      "dram_alt",
> > +				      "dram_apb";
> > +			clocks = <&clk IMX8MM_CLK_DRAM_CORE>,
> > +				 <&clk IMX8MM_DRAM_PLL>,
> > +				 <&clk IMX8MM_CLK_DRAM_ALT>,
> > +				 <&clk IMX8MM_CLK_DRAM_APB>;
> > +			devfreq-events = <&ddr_pmu>;
> > +			operating-points-v2 = <&ddrc_opp_table>;
> > +		};
> > +
> > +		ddr_pmu: ddr-pmu@3d800000 {
> >  			compatible = "fsl,imx8mm-ddr-pmu", "fsl,imx8m-
> > ddr-pmu";
> >  			reg = <0x3d800000 0x400000>;
> >  			interrupt-parent = <&gic>;
> >  			interrupts = <GIC_SPI 98 IRQ_TYPE_LEVEL_HIGH>;
> >  		};
> > diff --git a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> > b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> > index 071949412caf..ab2060667671 100644
> > --- a/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> > +++ b/arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
> > @@ -9,16 +9,34 @@
> >  #include "imx8mn-evk.dtsi"
> >  
> >  / {
> >  	model = "NXP i.MX8MNano DDR4 EVK board";
> >  	compatible = "fsl,imx8mn-ddr4-evk", "fsl,imx8mn";
> > +
> > +	ddrc_opp_table: ddrc-opp-table {
> 
> I think it would be better to put this under the ddrc node (and
> named 'opp-table'). Yes, it's kind of silly to have a phandle to a
> child node, but that still works.

That looks much nicer, fixed in v4.

I had to also update yaml to explicitly allow an "opp-table" child.

--
Regards,
Leonard
Shawn Guo Dec. 2, 2019, 3:12 a.m. UTC | #3
On Thu, Oct 31, 2019 at 11:50:22PM +0200, Leonard Crestez wrote:
> These clocks are only modified as part of DRAM frequency switches during
> which DRAM itself is briefly inaccessible. The switch is performed with
> a SMC call to by TF-A which runs from a SRAM area; upon returning to
> linux several clocks bits are modified and we need to update them.
> 
> For rate bits an easy solution is to just mark with
> CLK_GET_RATE_NOCACHE so that new rates are always read back from
> registers.
> 
> Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
> ---
>  drivers/clk/imx/clk-imx8mm.c | 11 +++++++++--
>  drivers/clk/imx/clk-imx8mn.c | 12 ++++++++++--
>  drivers/clk/imx/clk-imx8mq.c | 15 +++++++++++----
>  3 files changed, 30 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/clk/imx/clk-imx8mm.c b/drivers/clk/imx/clk-imx8mm.c
> index 030b15d7c0ce..c58f988191a5 100644
> --- a/drivers/clk/imx/clk-imx8mm.c
> +++ b/drivers/clk/imx/clk-imx8mm.c
> @@ -440,13 +440,20 @@ static int imx8mm_clocks_probe(struct platform_device *pdev)
>  
>  	/* IPG */
>  	clks[IMX8MM_CLK_IPG_ROOT] = imx_clk_divider2("ipg_root", "ahb", base + 0x9080, 0, 1);
>  	clks[IMX8MM_CLK_IPG_AUDIO_ROOT] = imx_clk_divider2("ipg_audio_root", "audio_ahb", base + 0x9180, 0, 1);
>  
> +	/*
> +	 * DRAM clocks are manipulated from TF-A outside clock framework.
> +	 * Mark with GET_RATE_NOCACHE to always read div value from hardware
> +	 */
> +	clks[IMX8MM_CLK_DRAM_ALT] = __imx8m_clk_composite("dram_alt", imx8mm_dram_alt_sels, base + 0xa000,
> +			CLK_GET_RATE_NOCACHE);
> +	clks[IMX8MM_CLK_DRAM_APB] = __imx8m_clk_composite("dram_apb", imx8mm_dram_apb_sels, base + 0xa080,
> +			CLK_IS_CRITICAL | CLK_GET_RATE_NOCACHE);
> +

I think we prefer to ignore over-80-column warnings for i.MX clock
drivers, because doing that improve the readability of code.

Shawn

>  	/* IP */
> -	clks[IMX8MM_CLK_DRAM_ALT] = imx8m_clk_composite("dram_alt", imx8mm_dram_alt_sels, base + 0xa000);
> -	clks[IMX8MM_CLK_DRAM_APB] = imx8m_clk_composite_critical("dram_apb", imx8mm_dram_apb_sels, base + 0xa080);
>  	clks[IMX8MM_CLK_VPU_G1] = imx8m_clk_composite("vpu_g1", imx8mm_vpu_g1_sels, base + 0xa100);
>  	clks[IMX8MM_CLK_VPU_G2] = imx8m_clk_composite("vpu_g2", imx8mm_vpu_g2_sels, base + 0xa180);
>  	clks[IMX8MM_CLK_DISP_DTRC] = imx8m_clk_composite("disp_dtrc", imx8mm_disp_dtrc_sels, base + 0xa200);
>  	clks[IMX8MM_CLK_DISP_DC8000] = imx8m_clk_composite("disp_dc8000", imx8mm_disp_dc8000_sels, base + 0xa280);
>  	clks[IMX8MM_CLK_PCIE1_CTRL] = imx8m_clk_composite("pcie1_ctrl", imx8mm_pcie1_ctrl_sels, base + 0xa300);
> diff --git a/drivers/clk/imx/clk-imx8mn.c b/drivers/clk/imx/clk-imx8mn.c
> index 9f5a5a56b45e..ca78cb1249a7 100644
> --- a/drivers/clk/imx/clk-imx8mn.c
> +++ b/drivers/clk/imx/clk-imx8mn.c
> @@ -428,12 +428,20 @@ static int imx8mn_clocks_probe(struct platform_device *pdev)
>  	clks[IMX8MN_CLK_AHB] = imx8m_clk_composite_critical("ahb", imx8mn_ahb_sels, base + 0x9000);
>  	clks[IMX8MN_CLK_AUDIO_AHB] = imx8m_clk_composite("audio_ahb", imx8mn_audio_ahb_sels, base + 0x9100);
>  	clks[IMX8MN_CLK_IPG_ROOT] = imx_clk_divider2("ipg_root", "ahb", base + 0x9080, 0, 1);
>  	clks[IMX8MN_CLK_IPG_AUDIO_ROOT] = imx_clk_divider2("ipg_audio_root", "audio_ahb", base + 0x9180, 0, 1);
>  	clks[IMX8MN_CLK_DRAM_CORE] = imx_clk_mux2_flags("dram_core_clk", base + 0x9800, 24, 1, imx8mn_dram_core_sels, ARRAY_SIZE(imx8mn_dram_core_sels), CLK_IS_CRITICAL);
> -	clks[IMX8MN_CLK_DRAM_ALT] = imx8m_clk_composite("dram_alt", imx8mn_dram_alt_sels, base + 0xa000);
> -	clks[IMX8MN_CLK_DRAM_APB] = imx8m_clk_composite_critical("dram_apb", imx8mn_dram_apb_sels, base + 0xa080);
> +
> +	/*
> +	 * DRAM clocks are manipulated from TF-A outside clock framework.
> +	 * Mark with GET_RATE_NOCACHE to always read div value from hardware
> +	 */
> +	clks[IMX8MN_CLK_DRAM_ALT] = __imx8m_clk_composite("dram_alt", imx8mn_dram_alt_sels, base + 0xa000,
> +			CLK_GET_RATE_NOCACHE);
> +	clks[IMX8MN_CLK_DRAM_APB] = __imx8m_clk_composite("dram_apb", imx8mn_dram_apb_sels, base + 0xa080,
> +			CLK_IS_CRITICAL | CLK_GET_RATE_NOCACHE);
> +
>  	clks[IMX8MN_CLK_DISP_PIXEL] = imx8m_clk_composite("disp_pixel", imx8mn_disp_pixel_sels, base + 0xa500);
>  	clks[IMX8MN_CLK_SAI2] = imx8m_clk_composite("sai2", imx8mn_sai2_sels, base + 0xa600);
>  	clks[IMX8MN_CLK_SAI3] = imx8m_clk_composite("sai3", imx8mn_sai3_sels, base + 0xa680);
>  	clks[IMX8MN_CLK_SAI5] = imx8m_clk_composite("sai5", imx8mn_sai5_sels, base + 0xa780);
>  	clks[IMX8MN_CLK_SAI6] = imx8m_clk_composite("sai6", imx8mn_sai6_sels, base + 0xa800);
> diff --git a/drivers/clk/imx/clk-imx8mq.c b/drivers/clk/imx/clk-imx8mq.c
> index 4a5dbc4366a5..ceb1e79cf2e9 100644
> --- a/drivers/clk/imx/clk-imx8mq.c
> +++ b/drivers/clk/imx/clk-imx8mq.c
> @@ -341,11 +341,12 @@ static int imx8mq_clocks_probe(struct platform_device *pdev)
>  	clks[IMX8MQ_VIDEO_PLL1_OUT] = imx_clk_gate("video_pll1_out", "video_pll1_bypass", base + 0x10, 21);
>  
>  	clks[IMX8MQ_SYS1_PLL_OUT] = imx_clk_fixed("sys1_pll_out", 800000000);
>  	clks[IMX8MQ_SYS2_PLL_OUT] = imx_clk_fixed("sys2_pll_out", 1000000000);
>  	clks[IMX8MQ_SYS3_PLL_OUT] = imx_clk_sccg_pll("sys3_pll_out", sys3_pll_out_sels, ARRAY_SIZE(sys3_pll_out_sels), 0, 0, 1, base + 0x48, CLK_IS_CRITICAL);
> -	clks[IMX8MQ_DRAM_PLL_OUT] = imx_clk_sccg_pll("dram_pll_out", dram_pll_out_sels, ARRAY_SIZE(dram_pll_out_sels), 0, 0, 0, base + 0x60, CLK_IS_CRITICAL);
> +	clks[IMX8MQ_DRAM_PLL_OUT] = imx_clk_sccg_pll("dram_pll_out", dram_pll_out_sels, ARRAY_SIZE(dram_pll_out_sels), 0, 0, 0, base + 0x60,
> +			CLK_IS_CRITICAL | CLK_GET_RATE_NOCACHE);
>  	clks[IMX8MQ_VIDEO2_PLL_OUT] = imx_clk_sccg_pll("video2_pll_out", video2_pll_out_sels, ARRAY_SIZE(video2_pll_out_sels), 0, 0, 0, base + 0x54, 0);
>  
>  	/* SYS PLL1 fixed output */
>  	clks[IMX8MQ_SYS1_PLL_40M_CG] = imx_clk_gate("sys1_pll_40m_cg", "sys1_pll_out", base + 0x30, 9);
>  	clks[IMX8MQ_SYS1_PLL_80M_CG] = imx_clk_gate("sys1_pll_80m_cg", "sys1_pll_out", base + 0x30, 11);
> @@ -433,15 +434,21 @@ static int imx8mq_clocks_probe(struct platform_device *pdev)
>  
>  	/* IPG */
>  	clks[IMX8MQ_CLK_IPG_ROOT] = imx_clk_divider2("ipg_root", "ahb", base + 0x9080, 0, 1);
>  	clks[IMX8MQ_CLK_IPG_AUDIO_ROOT] = imx_clk_divider2("ipg_audio_root", "audio_ahb", base + 0x9180, 0, 1);
>  
> -	/* IP */
> +	/*
> +	 * DRAM clocks are manipulated from TF-A outside clock framework.
> +	 * Mark with GET_RATE_NOCACHE to always read div value from hardware
> +	 */
>  	clks[IMX8MQ_CLK_DRAM_CORE] = imx_clk_mux2_flags("dram_core_clk", base + 0x9800, 24, 1, imx8mq_dram_core_sels, ARRAY_SIZE(imx8mq_dram_core_sels), CLK_IS_CRITICAL);
> +	clks[IMX8MQ_CLK_DRAM_ALT] = __imx8m_clk_composite("dram_alt", imx8mq_dram_alt_sels, base + 0xa000,
> +			CLK_GET_RATE_NOCACHE);
> +	clks[IMX8MQ_CLK_DRAM_APB] = __imx8m_clk_composite("dram_apb", imx8mq_dram_apb_sels, base + 0xa080,
> +			CLK_IS_CRITICAL | CLK_GET_RATE_NOCACHE);
>  
> -	clks[IMX8MQ_CLK_DRAM_ALT] = imx8m_clk_composite("dram_alt", imx8mq_dram_alt_sels, base + 0xa000);
> -	clks[IMX8MQ_CLK_DRAM_APB] = imx8m_clk_composite_critical("dram_apb", imx8mq_dram_apb_sels, base + 0xa080);
> +	/* IP */
>  	clks[IMX8MQ_CLK_VPU_G1] = imx8m_clk_composite("vpu_g1", imx8mq_vpu_g1_sels, base + 0xa100);
>  	clks[IMX8MQ_CLK_VPU_G2] = imx8m_clk_composite("vpu_g2", imx8mq_vpu_g2_sels, base + 0xa180);
>  	clks[IMX8MQ_CLK_DISP_DTRC] = imx8m_clk_composite("disp_dtrc", imx8mq_disp_dtrc_sels, base + 0xa200);
>  	clks[IMX8MQ_CLK_DISP_DC8000] = imx8m_clk_composite("disp_dc8000", imx8mq_disp_dc8000_sels, base + 0xa280);
>  	clks[IMX8MQ_CLK_PCIE1_CTRL] = imx8m_clk_composite("pcie1_ctrl", imx8mq_pcie1_ctrl_sels, base + 0xa300);
> -- 
> 2.17.1
>
Leonard Crestez Dec. 2, 2019, 4:19 a.m. UTC | #4
On 2019-12-02 5:12 AM, Shawn Guo wrote:
> On Thu, Oct 31, 2019 at 11:50:22PM +0200, Leonard Crestez wrote:
>> These clocks are only modified as part of DRAM frequency switches during
>> which DRAM itself is briefly inaccessible. The switch is performed with
>> a SMC call to by TF-A which runs from a SRAM area; upon returning to
>> linux several clocks bits are modified and we need to update them.
>>
>> For rate bits an easy solution is to just mark with
>> CLK_GET_RATE_NOCACHE so that new rates are always read back from
>> registers.
>>
>> Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
>> ---
>>   drivers/clk/imx/clk-imx8mm.c | 11 +++++++++--
>>   drivers/clk/imx/clk-imx8mn.c | 12 ++++++++++--
>>   drivers/clk/imx/clk-imx8mq.c | 15 +++++++++++----
>>   3 files changed, 30 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/clk/imx/clk-imx8mm.c b/drivers/clk/imx/clk-imx8mm.c
>> index 030b15d7c0ce..c58f988191a5 100644
>> --- a/drivers/clk/imx/clk-imx8mm.c
>> +++ b/drivers/clk/imx/clk-imx8mm.c
>> @@ -440,13 +440,20 @@ static int imx8mm_clocks_probe(struct platform_device *pdev)
>>   
>>   	/* IPG */
>>   	clks[IMX8MM_CLK_IPG_ROOT] = imx_clk_divider2("ipg_root", "ahb", base + 0x9080, 0, 1);
>>   	clks[IMX8MM_CLK_IPG_AUDIO_ROOT] = imx_clk_divider2("ipg_audio_root", "audio_ahb", base + 0x9180, 0, 1);
>>   
>> +	/*
>> +	 * DRAM clocks are manipulated from TF-A outside clock framework.
>> +	 * Mark with GET_RATE_NOCACHE to always read div value from hardware
>> +	 */
>> +	clks[IMX8MM_CLK_DRAM_ALT] = __imx8m_clk_composite("dram_alt", imx8mm_dram_alt_sels, base + 0xa000,
>> +			CLK_GET_RATE_NOCACHE);
>> +	clks[IMX8MM_CLK_DRAM_APB] = __imx8m_clk_composite("dram_apb", imx8mm_dram_apb_sels, base + 0xa080,
>> +			CLK_IS_CRITICAL | CLK_GET_RATE_NOCACHE);
>> +
> 
> I think we prefer to ignore over-80-column warnings for i.MX clock
> drivers, because doing that improve the readability of code.

You're replying to an old version, v7 doesn't have this issue:

* https://patchwork.kernel.org/patch/11258501/
* https://patchwork.kernel.org/patch/11258505/

It also has additional acks from Abel and Stephen and the devfreq parts 
of this series have already been accepted.

--
Regards,
Leonard
Shawn Guo Dec. 2, 2019, 5:38 a.m. UTC | #5
On Thu, Oct 31, 2019 at 11:50:25PM +0200, Leonard Crestez wrote:
> Add driver for dynamic scaling the DDR Controller on imx8m chips. Actual
> frequency switching is implemented inside TF-A, this driver wraps the
> SMC calls and synchronizes the clk tree.
> 
> The DRAM clocks on imx8m have the following structure (abridged):
> 
>  +----------+       |\            +------+
>  | dram_pll |-------|M| dram_core |      |
>  +----------+       |U|---------->| D    |
>                  /--|X|           |  D   |
>    dram_alt_root |  |/            |   R  |
>                  |                |    C |
>             +---------+           |      |
>             |FIX DIV/4|           |      |
>             +---------+           |      |
>   composite:     |                |      |
>  +----------+    |                |      |
>  | dram_alt |----/                |      |
>  +----------+                     |      |
>  | dram_apb |-------------------->|      |
>  +----------+                     +------+
> 
> The dram_pll is used for higher rates and dram_alt is used for lower
> rates. The dram_alt and dram_apb clocks are "imx composite" and their
> parent can also be modified.
> 
> This driver will prepare/enable the new parents ahead of switching (so
> that the expected roots are enabled) and afterwards it will call
> clk_set_parent to ensure the parents in clock framework are up-to-date.
> 
> The driver relies on dram_pll dram_alt and dram_apb being marked with
> CLK_GET_RATE_NOCACHE for rate updates.
> 
> Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
> ---
>  drivers/devfreq/Makefile   |   1 +
>  drivers/devfreq/imx-ddrc.c | 430 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 431 insertions(+)
>  create mode 100644 drivers/devfreq/imx-ddrc.c
> 
> diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
> index 338ae8440db6..1ac92614b6aa 100644
> --- a/drivers/devfreq/Makefile
> +++ b/drivers/devfreq/Makefile
> @@ -7,10 +7,11 @@ obj-$(CONFIG_DEVFREQ_GOV_POWERSAVE)	+= governor_powersave.o
>  obj-$(CONFIG_DEVFREQ_GOV_USERSPACE)	+= governor_userspace.o
>  obj-$(CONFIG_DEVFREQ_GOV_PASSIVE)	+= governor_passive.o
>  
>  # DEVFREQ Drivers
>  obj-$(CONFIG_ARM_EXYNOS_BUS_DEVFREQ)	+= exynos-bus.o
> +obj-$(CONFIG_ARM_IMX_DEVFREQ)		+= imx-ddrc.o
>  obj-$(CONFIG_ARM_RK3399_DMC_DEVFREQ)	+= rk3399_dmc.o
>  obj-$(CONFIG_ARM_TEGRA_DEVFREQ)		+= tegra30-devfreq.o
>  obj-$(CONFIG_ARM_TEGRA20_DEVFREQ)	+= tegra20-devfreq.o
>  
>  # DEVFREQ Event Drivers
> diff --git a/drivers/devfreq/imx-ddrc.c b/drivers/devfreq/imx-ddrc.c
> new file mode 100644
> index 000000000000..3ce51614ecab
> --- /dev/null
> +++ b/drivers/devfreq/imx-ddrc.c
> @@ -0,0 +1,430 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright 2019 NXP
> + */
> +
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/of_device.h>
> +#include <linux/platform_device.h>
> +#include <linux/devfreq.h>
> +#include <linux/pm_opp.h>
> +#include <linux/clk.h>
> +#include <linux/clk-provider.h>

This is a header that should ideally be used by clock drivers only.

> +#include <linux/arm-smccc.h>
> +
> +#define IMX_SIP_DDR_DVFS			0xc2000004
> +
> +/* Values starting from 0 switch to specific frequency */
> +#define IMX_SIP_DDR_FREQ_SET_HIGH		0x00
> +
> +/* Deprecated after moving IRQ handling to ATF */
> +#define IMX_SIP_DDR_DVFS_WAIT_CHANGE		0x0F

These two defines are not used.  Will be?

> +
> +/* Query available frequencies. */
> +#define IMX_SIP_DDR_DVFS_GET_FREQ_COUNT		0x10
> +#define IMX_SIP_DDR_DVFS_GET_FREQ_INFO		0x11
> +
> +/*
> + * This should be in a 1:1 mapping with devicetree OPPs but
> + * firmware provides additional info.
> + */
> +struct imx_ddrc_freq {
> +	unsigned long rate;
> +	unsigned long smcarg;
> +	int dram_core_parent_index;
> +	int dram_alt_parent_index;
> +	int dram_apb_parent_index;
> +};
> +
> +/* Hardware limitation */
> +#define IMX_DDRC_MAX_FREQ_COUNT 4
> +
> +/*
> + * imx DRAM controller
> + *
> + * imx DRAM controller clocks have the following structure (abridged):
> + *
> + * +----------+       |\            +------+
> + * | dram_pll |-------|M| dram_core |      |
> + * +----------+       |U|---------->| D    |
> + *                 /--|X|           |  D   |
> + *   dram_alt_root |  |/            |   R  |
> + *                 |                |    C |
> + *            +---------+           |      |
> + *            |FIX DIV/4|           |      |
> + *            +---------+           |      |
> + *  composite:     |                |      |
> + * +----------+    |                |      |
> + * | dram_alt |----/                |      |
> + * +----------+                     |      |
> + * | dram_apb |-------------------->|      |
> + * +----------+                     +------+
> + *
> + * The dram_pll is used for higher rates and dram_alt is used for lower rates.
> + *
> + * Frequency switching is implemented in TF-A (via SMC call) and can change the
> + * configuration of the clocks, including mux parents. The dram_alt and
> + * dram_apb clocks are "imx composite" and their parent can change too.
> + *
> + * We need to prepare/enable the new mux parents head of switching and update
> + * their information afterwards.
> + */
> +struct imx_ddrc {
> +	struct devfreq_dev_profile profile;
> +	struct devfreq *devfreq;
> +
> +	/* For frequency switching: */
> +	struct clk *dram_core;
> +	struct clk *dram_pll;
> +	struct clk *dram_alt;
> +	struct clk *dram_apb;
> +
> +	int freq_count;
> +	struct imx_ddrc_freq freq_table[IMX_DDRC_MAX_FREQ_COUNT];
> +};
> +
> +static struct imx_ddrc_freq *imx_ddrc_find_freq(struct imx_ddrc *priv,
> +						unsigned long rate)
> +{
> +	int i;
> +
> +	/*
> +	 * Firmware reports values in MT/s, so we round-down from Hz
> +	 * Rounding is extra generous to ensure a match.
> +	 */
> +	rate = DIV_ROUND_CLOSEST(rate, 250000);
> +	for (i = 0; i < priv->freq_count; ++i) {
> +		struct imx_ddrc_freq *freq = &priv->freq_table[i];
> +		if (freq->rate == rate ||
> +				freq->rate + 1 == rate ||
> +				freq->rate - 1 == rate)
> +			return freq;
> +	}
> +
> +	return NULL;
> +}
> +
> +static void imx_ddrc_smc_set_freq(int target_freq)
> +{
> +	struct arm_smccc_res res;
> +	u32 online_cpus = 0;
> +	int cpu;
> +
> +	local_irq_disable();
> +
> +	for_each_online_cpu(cpu)
> +		online_cpus |= (1 << (cpu * 8));

Nit: one level of unnecessary parentheses.

> +
> +	/* change the ddr freqency */
> +	arm_smccc_smc(IMX_SIP_DDR_DVFS, target_freq, online_cpus,
> +			0, 0, 0, 0, 0, &res);
> +
> +	local_irq_enable();
> +}
> +
> +struct clk *clk_get_parent_by_index(struct clk *clk, int index)
> +{
> +	struct clk_hw *hw;
> +
> +	hw = clk_hw_get_parent_by_index(__clk_get_hw(clk), index);

Okay, this is why you need clk-provider.h.  But this
clk_get_parent_by_index() function looks completely generic, and should
be proposed to clock core?

Shawn

> +
> +	return hw ? hw->clk : NULL;
> +}
> +
> +static int imx_ddrc_set_freq(struct device *dev, struct imx_ddrc_freq *freq)
> +{
> +	struct imx_ddrc *priv = dev_get_drvdata(dev);
> +	struct clk *new_dram_core_parent;
> +	struct clk *new_dram_alt_parent;
> +	struct clk *new_dram_apb_parent;
> +	int ret;
> +
> +	new_dram_core_parent = clk_get_parent_by_index(
> +			priv->dram_core, freq->dram_core_parent_index - 1);
> +	new_dram_alt_parent = clk_get_parent_by_index(
> +			priv->dram_alt, freq->dram_alt_parent_index - 1);
> +	new_dram_apb_parent = clk_get_parent_by_index(
> +			priv->dram_apb, freq->dram_apb_parent_index - 1);
> +
> +	/* increase reference counts and ensure clks are ON before switch */
> +	ret = clk_prepare_enable(new_dram_core_parent);
> +	if (ret) {
> +		dev_err(dev, "failed enable new dram_core parent: %d\n", ret);
> +		goto out;
> +	}
> +	ret = clk_prepare_enable(new_dram_alt_parent);
> +	if (ret) {
> +		dev_err(dev, "failed enable new dram_alt parent: %d\n", ret);
> +		goto out_dis_core;
> +	}
> +	ret = clk_prepare_enable(new_dram_apb_parent);
> +	if (ret) {
> +		dev_err(dev, "failed enable new dram_apb parent: %d\n", ret);
> +		goto out_dis_alt;
> +	}
> +
> +	imx_ddrc_smc_set_freq(freq->smcarg);
> +
> +	/* update parents in clk tree after switch. */
> +	ret = clk_set_parent(priv->dram_core, new_dram_core_parent);
> +	if (ret)
> +		dev_err(dev, "failed set dram_core parent: %d\n", ret);
> +	if (new_dram_alt_parent) {
> +		ret = clk_set_parent(priv->dram_alt, new_dram_alt_parent);
> +		if (ret)
> +			dev_err(dev, "failed set dram_alt parent: %d\n", ret);
> +	}
> +	if (new_dram_apb_parent) {
> +		ret = clk_set_parent(priv->dram_apb, new_dram_apb_parent);
> +		if (ret)
> +			dev_err(dev, "failed set dram_apb parent: %d\n", ret);
> +	}
> +
> +	/*
> +	 * Explicitly refresh dram PLL rate.
> +	 *
> +	 * Even if it's marked with CLK_GET_RATE_NOCACHE the rate will not be
> +	 * automatically refreshed when clk_get_rate is called on children.
> +	 */
> +	clk_get_rate(priv->dram_pll);
> +
> +	/*
> +	 * clk_set_parent transfer the reference count from old parent.
> +	 * now we drop extra reference counts used during the switch
> +	 */
> +	clk_disable_unprepare(new_dram_apb_parent);
> +out_dis_alt:
> +	clk_disable_unprepare(new_dram_alt_parent);
> +out_dis_core:
> +	clk_disable_unprepare(new_dram_core_parent);
> +out:
> +	return ret;
> +}
> +
> +static int imx_ddrc_target(struct device *dev, unsigned long *freq, u32 flags)
> +{
> +	struct imx_ddrc *priv = dev_get_drvdata(dev);
> +	struct imx_ddrc_freq *freq_info;
> +	struct dev_pm_opp *new_opp;
> +	unsigned long old_freq, new_freq;
> +	int ret;
> +
> +	new_opp = devfreq_recommended_opp(dev, freq, flags);
> +	if (IS_ERR(new_opp)) {
> +		ret = PTR_ERR(new_opp);
> +		dev_err(dev, "failed to get recommended opp: %d\n", ret);
> +		return ret;
> +	}
> +	dev_pm_opp_put(new_opp);
> +
> +	old_freq = clk_get_rate(priv->dram_core);
> +	if (*freq == old_freq)
> +		return 0;
> +
> +	freq_info = imx_ddrc_find_freq(priv, *freq);
> +	if (!freq_info)
> +		return -EINVAL;
> +	ret = imx_ddrc_set_freq(dev, freq_info);
> +
> +	/* Also read back the clk rate to verify switch was correct */
> +	new_freq = clk_get_rate(priv->dram_core);
> +	if (ret || *freq != new_freq)
> +		dev_err(dev, "ddrc failed to change freq %lu to %lu: now at %lu\n",
> +				old_freq, *freq, new_freq);
> +	else
> +		dev_dbg(dev, "ddrc changed freq %lu to %lu\n",
> +				old_freq, *freq);
> +
> +	return ret;
> +}
> +
> +static int imx_ddrc_get_cur_freq(struct device *dev, unsigned long *freq)
> +{
> +	struct imx_ddrc *priv = dev_get_drvdata(dev);
> +
> +	*freq = clk_get_rate(priv->dram_core);
> +
> +	return 0;
> +}
> +
> +static int imx_ddrc_get_dev_status(struct device *dev,
> +		struct devfreq_dev_status *stat)
> +{
> +	struct imx_ddrc *priv = dev_get_drvdata(dev);
> +
> +	stat->busy_time = 0;
> +	stat->total_time = 0;
> +	stat->current_frequency = clk_get_rate(priv->dram_core);
> +
> +	return 0;
> +}
> +
> +static int imx_ddrc_init_freq_info(struct device *dev)
> +{
> +	struct imx_ddrc *priv = dev_get_drvdata(dev);
> +	struct arm_smccc_res res;
> +	int index;
> +
> +	/*
> +	 * An error here means DDR DVFS API not supported by firmware
> +	 */
> +	arm_smccc_smc(IMX_SIP_DDR_DVFS, IMX_SIP_DDR_DVFS_GET_FREQ_COUNT,
> +			0, 0, 0, 0, 0, 0, &res);
> +	priv->freq_count = res.a0;
> +	if (priv->freq_count <= 0 || priv->freq_count > IMX_DDRC_MAX_FREQ_COUNT)
> +		return -ENODEV;
> +
> +	for (index = 0; index < priv->freq_count; ++index) {
> +		struct imx_ddrc_freq *freq = &priv->freq_table[index];
> +
> +		arm_smccc_smc(IMX_SIP_DDR_DVFS, IMX_SIP_DDR_DVFS_GET_FREQ_INFO,
> +				index, 0, 0, 0, 0, 0, &res);
> +		/* Result should be strictly positive */
> +		if ((long)res.a0 <= 0)
> +			return -ENODEV;
> +
> +		freq->rate = res.a0;
> +		freq->smcarg = index;
> +		freq->dram_core_parent_index = res.a1;
> +		freq->dram_alt_parent_index = res.a2;
> +		freq->dram_apb_parent_index = res.a3;
> +
> +		/* dram_core has 2 options: dram_pll or dram_alt_root */
> +		if (freq->dram_core_parent_index != 1 &&
> +				freq->dram_core_parent_index != 2)
> +			return -ENODEV;
> +		/* dram_apb and dram_alt have exactly 8 possible parents */
> +		if (freq->dram_alt_parent_index > 8 ||
> +				freq->dram_apb_parent_index > 8)
> +			return -ENODEV;
> +		/* dram_core from alt requires explicit dram_alt parent */
> +		if (freq->dram_core_parent_index == 2 &&
> +				freq->dram_alt_parent_index == 0)
> +			return -ENODEV;
> +	}
> +
> +	return 0;
> +}
> +
> +/* imx_ddrc_check_opps() - disable OPPs not supported by firmware */
> +static int imx_ddrc_check_opps(struct device *dev)
> +{
> +	struct imx_ddrc *priv = dev_get_drvdata(dev);
> +	struct imx_ddrc_freq *freq_info;
> +	struct dev_pm_opp *opp;
> +	unsigned long freq;
> +
> +	freq = ULONG_MAX;
> +	while (true) {
> +		opp = dev_pm_opp_find_freq_floor(dev, &freq);
> +		if (opp == ERR_PTR(-ERANGE))
> +			break;
> +		if (IS_ERR(opp)) {
> +			dev_err(dev, "Failed enumerating OPPs: %ld\n",
> +				PTR_ERR(opp));
> +			return PTR_ERR(opp);
> +		}
> +		dev_pm_opp_put(opp);
> +
> +		freq_info = imx_ddrc_find_freq(priv, freq);
> +		if (!freq_info) {
> +			dev_info(dev, "Disable unsupported OPP %luHz %luMT/s\n",
> +					freq, DIV_ROUND_CLOSEST(freq, 250000));
> +			dev_pm_opp_disable(dev, freq);
> +		}
> +
> +		freq--;
> +	}
> +
> +	return 0;
> +}
> +
> +static void imx_ddrc_exit(struct device *dev)
> +{
> +	dev_pm_opp_of_remove_table(dev);
> +}
> +
> +static int imx_ddrc_probe(struct platform_device *pdev)
> +{
> +	struct device *dev = &pdev->dev;
> +	struct imx_ddrc *priv;
> +	struct device_node *events_node;
> +	const char *gov = DEVFREQ_GOV_USERSPACE;
> +	int ret;
> +
> +	priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
> +	if (!priv)
> +		return -ENOMEM;
> +
> +	platform_set_drvdata(pdev, priv);
> +
> +	ret = imx_ddrc_init_freq_info(dev);
> +	if (ret) {
> +		dev_err(dev, "failed to init firmware freq info: %d\n", ret);
> +		return ret;
> +	}
> +
> +	priv->dram_core = devm_clk_get(dev, "dram_core");
> +	priv->dram_pll = devm_clk_get(dev, "dram_pll");
> +	priv->dram_alt = devm_clk_get(dev, "dram_alt");
> +	priv->dram_apb = devm_clk_get(dev, "dram_apb");
> +	if (IS_ERR(priv->dram_core) ||
> +		IS_ERR(priv->dram_pll) ||
> +		IS_ERR(priv->dram_alt) ||
> +		IS_ERR(priv->dram_apb)) {
> +		ret = PTR_ERR(priv->devfreq);
> +		dev_err(dev, "failed to fetch clocks: %d\n", ret);
> +		return ret;
> +	}
> +
> +	ret = dev_pm_opp_of_add_table(dev);
> +	if (ret < 0) {
> +		dev_err(dev, "failed to get OPP table\n");
> +		return ret;
> +	}
> +
> +	ret = imx_ddrc_check_opps(dev);
> +	if (ret < 0)
> +		goto err;
> +
> +	priv->profile.polling_ms = 1000;
> +	priv->profile.target = imx_ddrc_target;
> +	priv->profile.get_dev_status = imx_ddrc_get_dev_status;
> +	priv->profile.exit = imx_ddrc_exit;
> +	priv->profile.get_cur_freq = imx_ddrc_get_cur_freq;
> +	priv->profile.initial_freq = clk_get_rate(priv->dram_core);
> +
> +	priv->devfreq = devm_devfreq_add_device(dev, &priv->profile,
> +						gov, NULL);
> +	if (IS_ERR(priv->devfreq)) {
> +		ret = PTR_ERR(priv->devfreq);
> +		dev_err(dev, "failed to add devfreq device: %d\n", ret);
> +		goto err;
> +	}
> +
> +	return 0;
> +
> +err:
> +	dev_pm_opp_of_remove_table(dev);
> +	return ret;
> +}
> +
> +static const struct of_device_id imx_ddrc_of_match[] = {
> +	{ .compatible = "fsl,imx8m-ddrc", },
> +	{ /* sentinel */ },
> +};
> +MODULE_DEVICE_TABLE(of, imx_ddrc_of_match);
> +
> +static struct platform_driver imx_ddrc_platdrv = {
> +	.probe		= imx_ddrc_probe,
> +	.driver = {
> +		.name	= "imx-ddrc-devfreq",
> +		.of_match_table = of_match_ptr(imx_ddrc_of_match),
> +	},
> +};
> +module_platform_driver(imx_ddrc_platdrv);
> +
> +MODULE_DESCRIPTION("i.MX DDR controller frequency driver");
> +MODULE_AUTHOR("Leonard Crestez <leonard.crestez@nxp.com>");
> +MODULE_LICENSE("GPL v2");
> -- 
> 2.17.1
>
Leonard Crestez Dec. 2, 2019, 9:12 a.m. UTC | #6
On 2019-12-02 7:39 AM, Shawn Guo wrote:
> On Thu, Oct 31, 2019 at 11:50:25PM +0200, Leonard Crestez wrote:
>> Add driver for dynamic scaling the DDR Controller on imx8m chips. Actual
>> frequency switching is implemented inside TF-A, this driver wraps the
>> SMC calls and synchronizes the clk tree.
>>
>> The DRAM clocks on imx8m have the following structure (abridged):
>>
>>   +----------+       |\            +------+
>>   | dram_pll |-------|M| dram_core |      |
>>   +----------+       |U|---------->| D    |
>>                   /--|X|           |  D   |
>>     dram_alt_root |  |/            |   R  |
>>                   |                |    C |
>>              +---------+           |      |
>>              |FIX DIV/4|           |      |
>>              +---------+           |      |
>>    composite:     |                |      |
>>   +----------+    |                |      |
>>   | dram_alt |----/                |      |
>>   +----------+                     |      |
>>   | dram_apb |-------------------->|      |
>>   +----------+                     +------+
>>
>> The dram_pll is used for higher rates and dram_alt is used for lower
>> rates. The dram_alt and dram_apb clocks are "imx composite" and their
>> parent can also be modified.
>>
>> This driver will prepare/enable the new parents ahead of switching (so
>> that the expected roots are enabled) and afterwards it will call
>> clk_set_parent to ensure the parents in clock framework are up-to-date.
>>
>> The driver relies on dram_pll dram_alt and dram_apb being marked with
>> CLK_GET_RATE_NOCACHE for rate updates.
>>
>> Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>

A more recent version of this patch is already in next:

https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-next&id=518e99e2a22e318944d531a92aab5082fabb4d38

>>   drivers/devfreq/Makefile   |   1 +
>>   drivers/devfreq/imx-ddrc.c | 430 +++++++++++++++++++++++++++++++++++++
>>   2 files changed, 431 insertions(+)
>>   create mode 100644 drivers/devfreq/imx-ddrc.c

>> +++ b/drivers/devfreq/imx-ddrc.c
>> @@ -0,0 +1,430 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright 2019 NXP
>> + */
>> +
>> +#include <linux/module.h>
>> +#include <linux/device.h>
>> +#include <linux/of_device.h>
>> +#include <linux/platform_device.h>
>> +#include <linux/devfreq.h>
>> +#include <linux/pm_opp.h>
>> +#include <linux/clk.h>
>> +#include <linux/clk-provider.h>
> 
> This is a header that should ideally be used by clock drivers only.
> 
>> +#include <linux/arm-smccc.h>
>> +
>> +#define IMX_SIP_DDR_DVFS			0xc2000004
>> +
>> +/* Values starting from 0 switch to specific frequency */
>> +#define IMX_SIP_DDR_FREQ_SET_HIGH		0x00
>> +
>> +/* Deprecated after moving IRQ handling to ATF */
>> +#define IMX_SIP_DDR_DVFS_WAIT_CHANGE		0x0F
> 
> These two defines are not used.  Will be?

No, can post a separate patch to remove them.
> 
>> +
>> +/* Query available frequencies. */
>> +#define IMX_SIP_DDR_DVFS_GET_FREQ_COUNT		0x10
>> +#define IMX_SIP_DDR_DVFS_GET_FREQ_INFO		0x11
>> +
>> +/*
>> + * This should be in a 1:1 mapping with devicetree OPPs but
>> + * firmware provides additional info.
>> + */
>> +struct imx_ddrc_freq {
>> +	unsigned long rate;
>> +	unsigned long smcarg;
>> +	int dram_core_parent_index;
>> +	int dram_alt_parent_index;
>> +	int dram_apb_parent_index;
>> +};
>> +
>> +/* Hardware limitation */
>> +#define IMX_DDRC_MAX_FREQ_COUNT 4
>> +
>> +/*
>> + * imx DRAM controller
>> + *
>> + * imx DRAM controller clocks have the following structure (abridged):
>> + *
>> + * +----------+       |\            +------+
>> + * | dram_pll |-------|M| dram_core |      |
>> + * +----------+       |U|---------->| D    |
>> + *                 /--|X|           |  D   |
>> + *   dram_alt_root |  |/            |   R  |
>> + *                 |                |    C |
>> + *            +---------+           |      |
>> + *            |FIX DIV/4|           |      |
>> + *            +---------+           |      |
>> + *  composite:     |                |      |
>> + * +----------+    |                |      |
>> + * | dram_alt |----/                |      |
>> + * +----------+                     |      |
>> + * | dram_apb |-------------------->|      |
>> + * +----------+                     +------+
>> + *
>> + * The dram_pll is used for higher rates and dram_alt is used for lower rates.
>> + *
>> + * Frequency switching is implemented in TF-A (via SMC call) and can change the
>> + * configuration of the clocks, including mux parents. The dram_alt and
>> + * dram_apb clocks are "imx composite" and their parent can change too.
>> + *
>> + * We need to prepare/enable the new mux parents head of switching and update
>> + * their information afterwards.
>> + */
>> +struct imx_ddrc {
>> +	struct devfreq_dev_profile profile;
>> +	struct devfreq *devfreq;
>> +
>> +	/* For frequency switching: */
>> +	struct clk *dram_core;
>> +	struct clk *dram_pll;
>> +	struct clk *dram_alt;
>> +	struct clk *dram_apb;
>> +
>> +	int freq_count;
>> +	struct imx_ddrc_freq freq_table[IMX_DDRC_MAX_FREQ_COUNT];
>> +};
>> +

... snip ...

>> +static void imx_ddrc_smc_set_freq(int target_freq)
>> +{
>> +	struct arm_smccc_res res;
>> +	u32 online_cpus = 0;
>> +	int cpu;
>> +
>> +	local_irq_disable();
>> +
>> +	for_each_online_cpu(cpu)
>> +		online_cpus |= (1 << (cpu * 8));
> 
> Nit: one level of unnecessary parentheses.

Yes

>> +
>> +	/* change the ddr freqency */
>> +	arm_smccc_smc(IMX_SIP_DDR_DVFS, target_freq, online_cpus,
>> +			0, 0, 0, 0, 0, &res);
>> +
>> +	local_irq_enable();
>> +}
>> +
>> +struct clk *clk_get_parent_by_index(struct clk *clk, int index)
>> +{
>> +	struct clk_hw *hw;
>> +
>> +	hw = clk_hw_get_parent_by_index(__clk_get_hw(clk), index);
> 
> Okay, this is why you need clk-provider.h.  But this
> clk_get_parent_by_index() function looks completely generic, and should
> be proposed to clock core?

There are very few driver users of clk_hw_get_parent_by_index:

$ git grep -wl clk_hw_get_parent_by_index |grep -v drivers/clk
arch/mips/alchemy/common/clock.c
drivers/cpufreq/qoriq-cpufreq.c
drivers/devfreq/imx8m-ddrc.c
drivers/gpu/drm/sun4i/sun4i_hdmi_tmds_clk.c
drivers/gpu/drm/sun4i/sun8i_hdmi_phy_clk.c
drivers/media/platform/atmel/atmel-isc-base.c
drivers/rtc/rtc-ac100.c
include/linux/clk-provider.h

Even clk_get_parent has few users and it contains this strange comment:

/* TODO: Create a per-user clk and change callers to call clk_put */

That proposed change effectively creates a new API? I didn't want to add 
a new clk core API with unclear semantics.

--
Regards,
Leonard
Shawn Guo Dec. 2, 2019, 1:34 p.m. UTC | #7
On Mon, Dec 02, 2019 at 09:12:12AM +0000, Leonard Crestez wrote:
> >> +
> >> +	/* change the ddr freqency */
> >> +	arm_smccc_smc(IMX_SIP_DDR_DVFS, target_freq, online_cpus,
> >> +			0, 0, 0, 0, 0, &res);
> >> +
> >> +	local_irq_enable();
> >> +}
> >> +
> >> +struct clk *clk_get_parent_by_index(struct clk *clk, int index)
> >> +{
> >> +	struct clk_hw *hw;
> >> +
> >> +	hw = clk_hw_get_parent_by_index(__clk_get_hw(clk), index);
> > 
> > Okay, this is why you need clk-provider.h.  But this
> > clk_get_parent_by_index() function looks completely generic, and should
> > be proposed to clock core?
> 
> There are very few driver users of clk_hw_get_parent_by_index:
> 
> $ git grep -wl clk_hw_get_parent_by_index |grep -v drivers/clk
> arch/mips/alchemy/common/clock.c
> drivers/cpufreq/qoriq-cpufreq.c
> drivers/devfreq/imx8m-ddrc.c
> drivers/gpu/drm/sun4i/sun4i_hdmi_tmds_clk.c
> drivers/gpu/drm/sun4i/sun8i_hdmi_phy_clk.c
> drivers/media/platform/atmel/atmel-isc-base.c
> drivers/rtc/rtc-ac100.c
> include/linux/clk-provider.h
> 
> Even clk_get_parent has few users and it contains this strange comment:
> 
> /* TODO: Create a per-user clk and change callers to call clk_put */
> 
> That proposed change effectively creates a new API? I didn't want to add 
> a new clk core API with unclear semantics.

Since the merged version has 'static' added for clk_get_parent_by_index(),
I'm fine with it being a local function.  It's Stephen's call whether
we should have it at clock core level.

Shawn