
[v3,00/10] drm/panel and i2c-hid: Allow panels and touchscreens to power sequence together

Message ID 20230725203545.2260506-1-dianders@chromium.org

Message

Doug Anderson July 25, 2023, 8:34 p.m. UTC
The big motivation for this patch series is mostly described in the patch
("drm/panel: Add a way for other devices to follow panel state"), but to
quickly summarize here: for touchscreens that are connected to a panel we
need the ability to power sequence the two devices together. This is not a
new need, but so far we've managed to get by through a combination of
inefficiency, added costs, or perhaps just a little bit of brokenness.
It's time to do better. This patch series allows us to do better.

Assuming that people think this patch series looks OK, we'll have to
figure out the right way to land it. The panel patches and i2c-hid
patches will go through very different trees and so either we'll need
an Ack from one side or the other or someone to create a tag for the
other tree to pull in. This will _probably_ require the true drm-misc
maintainers to get involved, not a lowly committer. ;-)

Version 3 of this patch was a long time coming after v2. Maxime and I
had a very long discussion trying to figure out if there was a better
way and in the end we didn't find one so he was OK with the series in
general [1]. After that got resolved, I tried to resolve Benjamin's
feedback but got stuck [2]. Presumably Benjamin is busy at the moment,
so I've done my best to try to resolve things. The end result is a v3
that's not that different from v2 but that has a tiny bit more code
split out.

Version 2 of this patch series didn't change too much. At a high level:
* I added all the forgotten "static" to functions.
* I've hopefully made the bindings better.
* I've integrated into fw_devlink.
* I cleaned up a few descriptions / comments.

This still needs someone to say that the idea looks OK or to suggest
an alternative that solves the problems. ;-)

[1] https://lore.kernel.org/r/gkwymmfkdy2p2evz22wmbwgw42ii4wnvmvu64m3bghmj2jhv7x@4mbstjxnagxd
[2] https://lore.kernel.org/r/CAD=FV=VbdeomBGbWhppY+5TOSwt64GWBHga68OXFwsnO4gg4UA@mail.gmail.com

Changes in v3:
- Add is_panel_follower() as a convenience for clients.
- Add "depends on DRM || !DRM" to Kconfig to avoid randconfig error.
- Split more of the panel follower code out of the core.

Changes in v2:
- Move the description to the generic touchscreen.yaml.
- Update the desc to make it clearer it's only for integrated devices.
- Add even more text to the commit message.
- A few comment cleanups.
- ("Add a devlink for panel followers") new for v2.
- i2c_hid_core_initial_power_up() is now static.
- i2c_hid_core_panel_prepared() and ..._unpreparing() are now static.
- ihid_core_panel_prepare_work() is now static.
- Improve documentation for smp_wmb().

Douglas Anderson (10):
  dt-bindings: HID: i2c-hid: Add "panel" property to i2c-hid backed
    touchscreens
  drm/panel: Check for already prepared/enabled in drm_panel
  drm/panel: Add a way for other devices to follow panel state
  of: property: fw_devlink: Add a devlink for panel followers
  HID: i2c-hid: Switch to SYSTEM_SLEEP_PM_OPS()
  HID: i2c-hid: Rearrange probe() to power things up later
  HID: i2c-hid: Make suspend and resume into helper functions
  HID: i2c-hid: Support being a panel follower
  HID: i2c-hid: Do panel follower work on the system_wq
  arm64: dts: qcom: sc7180: Link trogdor touchscreens to the panels

 .../bindings/input/elan,ekth6915.yaml         |   5 +
 .../bindings/input/goodix,gt7375p.yaml        |   5 +
 .../bindings/input/hid-over-i2c.yaml          |   2 +
 .../input/touchscreen/touchscreen.yaml        |   7 +
 .../boot/dts/qcom/sc7180-trogdor-coachz.dtsi  |   1 +
 .../dts/qcom/sc7180-trogdor-homestar.dtsi     |   1 +
 .../boot/dts/qcom/sc7180-trogdor-lazor.dtsi   |   1 +
 .../boot/dts/qcom/sc7180-trogdor-pompom.dtsi  |   1 +
 .../qcom/sc7180-trogdor-quackingstick.dtsi    |   1 +
 .../dts/qcom/sc7180-trogdor-wormdingler.dtsi  |   1 +
 drivers/gpu/drm/drm_panel.c                   | 218 ++++++++++-
 drivers/hid/i2c-hid/Kconfig                   |   2 +
 drivers/hid/i2c-hid/i2c-hid-core.c            | 338 +++++++++++++-----
 drivers/of/property.c                         |   2 +
 include/drm/drm_panel.h                       |  94 +++++
 15 files changed, 583 insertions(+), 96 deletions(-)

Comments

Benjamin Tissoires July 26, 2023, 8:57 a.m. UTC | #1
On Jul 25 2023, Douglas Anderson wrote:
> As talked about in the patch ("drm/panel: Add a way for other devices
> to follow panel state"), we really want to keep the power states of a
> touchscreen and the panel it's attached to in sync with each other. In
> that spirit, add support to i2c-hid to be a panel follower. This will
> let the i2c-hid driver get informed when the panel is powered on and
> off. From there we can match the i2c-hid device's power state to that
> of the panel.
> 
> NOTE: this patch specifically _doesn't_ use pm_runtime to keep track
> of / manage the power state of the i2c-hid device, even though my
> first instinct said that would be the way to go. Specific problems
> with using pm_runtime():
> * The initial power up couldn't happen in a runtime resume function
>   since it create sub-devices and, apparently, that's not good to do
>   in your resume function.
> * Managing our power state with pm_runtime meant fighting to make the
>   right thing happen at system suspend to prevent the system from
>   trying to resume us only to suspend us again. While this might be
>   able to be solved, it added complexity.
> Overall the code without pm_runtime() ended up being smaller and
> easier to understand.
> 
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> 
> Changes in v3:
> - Add "depends on DRM || !DRM" to Kconfig to avoid randconfig error.
> - Split more of the panel follower code out of the core.
> 
> Changes in v2:
> - i2c_hid_core_panel_prepared() and ..._unpreparing() are now static.
> 
>  drivers/hid/i2c-hid/Kconfig        |  2 +
>  drivers/hid/i2c-hid/i2c-hid-core.c | 82 +++++++++++++++++++++++++++++-
>  2 files changed, 82 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hid/i2c-hid/Kconfig b/drivers/hid/i2c-hid/Kconfig
> index 3be17109301a..2bdb55203104 100644
> --- a/drivers/hid/i2c-hid/Kconfig
> +++ b/drivers/hid/i2c-hid/Kconfig
> @@ -70,5 +70,7 @@ config I2C_HID_OF_GOODIX
>  
>  config I2C_HID_CORE
>  	tristate
> +	# We need to call into panel code so if DRM=m, this can't be 'y'
> +	depends on DRM || !DRM
>  endif
>  
> diff --git a/drivers/hid/i2c-hid/i2c-hid-core.c b/drivers/hid/i2c-hid/i2c-hid-core.c
> index fa8a1ca43d7f..fa6d1f624342 100644
> --- a/drivers/hid/i2c-hid/i2c-hid-core.c
> +++ b/drivers/hid/i2c-hid/i2c-hid-core.c
> @@ -38,6 +38,8 @@
>  #include <linux/mutex.h>
>  #include <asm/unaligned.h>
>  
> +#include <drm/drm_panel.h>
> +
>  #include "../hid-ids.h"
>  #include "i2c-hid.h"
>  
> @@ -107,6 +109,8 @@ struct i2c_hid {
>  	struct mutex		reset_lock;
>  
>  	struct i2chid_ops	*ops;
> +	struct drm_panel_follower panel_follower;
> +	bool			is_panel_follower;
>  };
>  
>  static const struct i2c_hid_quirks {
> @@ -1058,6 +1062,59 @@ static int i2c_hid_core_initial_power_up(struct i2c_hid *ihid)
>  	return ret;
>  }
>  
> +static int i2c_hid_core_panel_prepared(struct drm_panel_follower *follower)
> +{
> +	struct i2c_hid *ihid = container_of(follower, struct i2c_hid, panel_follower);
> +	struct hid_device *hid = ihid->hid;
> +
> +	/*
> +	 * hid->version is set on the first power up. If it's still zero then
> +	 * this is the first power on so we should perform initial power up
> +	 * steps.
> +	 */
> +	if (!hid->version)
> +		return i2c_hid_core_initial_power_up(ihid);
> +
> +	return i2c_hid_core_resume(ihid);
> +}
> +
> +static int i2c_hid_core_panel_unpreparing(struct drm_panel_follower *follower)
> +{
> +	struct i2c_hid *ihid = container_of(follower, struct i2c_hid, panel_follower);
> +
> +	return i2c_hid_core_suspend(ihid);
> +}
> +
> +static const struct drm_panel_follower_funcs i2c_hid_core_panel_follower_funcs = {
> +	.panel_prepared = i2c_hid_core_panel_prepared,
> +	.panel_unpreparing = i2c_hid_core_panel_unpreparing,
> +};
> +
> +static int i2c_hid_core_register_panel_follower(struct i2c_hid *ihid)
> +{
> +	struct device *dev = &ihid->client->dev;
> +	int ret;
> +
> +	ihid->is_panel_follower = true;
> +	ihid->panel_follower.funcs = &i2c_hid_core_panel_follower_funcs;
> +
> +	/*
> +	 * If we're not in control of our own power up/power down then we can't
> +	 * do the logic to manage wakeups. Give a warning if a user thought
> +	 * that was possible then force the capability off.
> +	 */
> +	if (device_can_wakeup(dev)) {
> +		dev_warn(dev, "Can't wakeup if following panel\n");
> +		device_set_wakeup_capable(dev, false);
> +	}
> +
> +	ret = drm_panel_add_follower(dev, &ihid->panel_follower);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}
> +
>  int i2c_hid_core_probe(struct i2c_client *client, struct i2chid_ops *ops,
>  		       u16 hid_descriptor_address, u32 quirks)
>  {
> @@ -1119,7 +1176,15 @@ int i2c_hid_core_probe(struct i2c_client *client, struct i2chid_ops *ops,
>  	hid->bus = BUS_I2C;
>  	hid->initial_quirks = quirks;
>  
> -	ret = i2c_hid_core_initial_power_up(ihid);
> +	/*
> +	 * If we're a panel follower, we'll register and do our initial power
> +	 * up when the panel turns on; otherwise we do it right away.
> +	 */
> +	if (drm_is_panel_follower(&client->dev))
> +		ret = i2c_hid_core_register_panel_follower(ihid);
> +	else
> +		ret = i2c_hid_core_initial_power_up(ihid);

nitpicks, but I'm not sure I'm a big fan of having
"if (drm_is_panel_follower(&client->dev))" sprinkled everywhere in the
generic probe/resume/suspend code.

Would it be OK to define a `static int __do_i2c_hid_core_initial_power_up(struct i2c_hid *ihid)`
that does the actual powering up, and have
i2c_hid_core_initial_power_up() test whether we are a panel
follower?

The i2c_hid_core_panel_* will need to be updated to use the `__do_`
prefixed functions.
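
To make the suggested split concrete, here is a minimal user-space sketch of the pattern, not the actual i2c-hid code. `struct mock_ihid` and every `mock_` function are hypothetical stand-ins invented for illustration; `mock_do_initial_power_up()` plays the role of the proposed `__do_` helper (renamed because a leading double underscore is reserved outside the kernel).

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct i2c_hid; only what the sketch needs. */
struct mock_ihid {
	bool is_panel_follower;
	bool follower_registered;
	bool powered;
};

/* Does the actual powering up, with no panel-follower test. */
static int mock_do_initial_power_up(struct mock_ihid *ihid)
{
	ihid->powered = true;
	return 0;
}

/* Stand-in for registering with the panel; power-up is deferred. */
static int mock_register_panel_follower(struct mock_ihid *ihid)
{
	ihid->follower_registered = true;
	return 0;
}

/* The only function probe() calls; the follower test lives here. */
static int mock_initial_power_up(struct mock_ihid *ihid)
{
	if (ihid->is_panel_follower)
		return mock_register_panel_follower(ihid);

	return mock_do_initial_power_up(ihid);
}

/* The panel "prepared" callback bypasses the test and powers up directly. */
static int mock_panel_prepared(struct mock_ihid *ihid)
{
	return mock_do_initial_power_up(ihid);
}
```

With this shape the generic probe path stays free of follower checks, and only the panel callbacks need to know about the direct helper.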

> +
>  	if (ret)
>  		goto err_mem_free;
>  
> @@ -1143,7 +1208,14 @@ void i2c_hid_core_remove(struct i2c_client *client)
>  	struct i2c_hid *ihid = i2c_get_clientdata(client);
>  	struct hid_device *hid;
>  
> -	i2c_hid_core_power_down(ihid);
> +	/*
> +	 * If we're a follower, the act of unfollowing will cause us to be
> +	 * powered down. Otherwise we need to manually do it.
> +	 */
> +	if (ihid->is_panel_follower)
> +		drm_panel_remove_follower(&ihid->panel_follower);

That part is concerning, as we are now calling hid_drv->suspend() when removing
the device. It might or might not have an impact (I'm not sure of it), but we
are effectively changing the path of commands sent to the device.

hid-multitouch might issue a feature request in ->suspend, but remove
implies that the physical device is actually disconnected, so the call
would fail, and I'm not sure what happens then.

> +	else
> +		i2c_hid_core_power_down(ihid);

Same here, I *think* it would be best to have the `if (ihid->is_panel_follower)`
test in i2c_hid_core_power_down() (and have a separate
__do_i2c_hid_core_power_down()).

>  
>  	hid = ihid->hid;
>  	hid_destroy_device(hid);
> @@ -1171,6 +1243,9 @@ static int i2c_hid_core_pm_suspend(struct device *dev)
>  	struct i2c_client *client = to_i2c_client(dev);
>  	struct i2c_hid *ihid = i2c_get_clientdata(client);
>  
> +	if (ihid->is_panel_follower)
> +		return 0;

Not sure we need to split that one with _do_ prefix, it's already split
:)

> +
>  	return i2c_hid_core_suspend(ihid);
>  }
>  
> @@ -1179,6 +1254,9 @@ static int i2c_hid_core_pm_resume(struct device *dev)
>  	struct i2c_client *client = to_i2c_client(dev);
>  	struct i2c_hid *ihid = i2c_get_clientdata(client);
>  
> +	if (ihid->is_panel_follower)
> +		return 0;

Same here, no need to split.

> +
>  	return i2c_hid_core_resume(ihid);
>  }
>  
> -- 
> 2.41.0.487.g6d72f3e995-goog
> 

Cheers,
Benjamin

Maxime Ripard July 26, 2023, 12:41 p.m. UTC | #2
Hi,

On Tue, Jul 25, 2023 at 01:34:37PM -0700, Douglas Anderson wrote:
> NOTE: arguably, the right thing to do here is actually to skip this
> patch and simply remove all the extra checks from the individual
> drivers. Perhaps the checks were needed at some point in time in the
> past but maybe they no longer are? Certainly as we continue
> transitioning over to "panel_bridge" then we expect there to be much
> less variety in how these calls are made. When we're called as part of
> the bridge chain, things should be pretty simple. In fact, there was
> some discussion in the past about these checks [1], including a
> discussion about whether the checks were needed and whether the calls
> ought to be refcounted. At the time, I decided not to mess with it
> because it felt too risky.

Yeah, I'd agree here too. I've never found evidence that it was actually
needed and it really looks like cargo cult to me.

And if it was needed, then I'm not sure we need refcounting either. We
don't have refcounting for atomic_enable / disable, we have a sound API
design that makes sure we don't fall into that trap :)

> Looking closer at it now, I'm fairly certain that nothing in the
> existing codebase is expecting these calls to be refcounted. The only
> real question is whether someone is already doing something to ensure
> prepare()/unprepare() match and enabled()/disable() match. I would say
> that, even if there is something else ensuring that things match,
> there's enough complexity that adding an extra bool and an extra
> double-check here is a good idea. Let's add a drm_warn() to let people
> know that it's considered a minor error to take advantage of
> drm_panel's double-checking but we'll still make things work fine.

I'm ok with this, if we follow-up in a couple of releases and remove it
and all the calls.

Could you add a TODO item so that we can keep track of it? A follow-up
is fine if you don't send a new version of that series.

Maxime

Maxime Ripard July 26, 2023, 12:45 p.m. UTC | #3
On Tue, 25 Jul 2023 13:34:35 -0700, Douglas Anderson wrote:
> 
> The big motivation for this patch series is mostly described in the patch
> ("drm/panel: Add a way for other devices to follow panel state"), but to
> quickly summarize here: for touchscreens that are connected to a panel we
> need the ability to power sequence the two device together. This is not a
> 
> [ ... ]

Reviewed-by: Maxime Ripard <mripard@kernel.org>

Thanks!
Maxime

Doug Anderson July 26, 2023, 3:10 p.m. UTC | #4
Hi,

On Wed, Jul 26, 2023 at 5:41 AM Maxime Ripard <mripard@kernel.org> wrote:
>
> Hi,
>
> On Tue, Jul 25, 2023 at 01:34:37PM -0700, Douglas Anderson wrote:
> > NOTE: arguably, the right thing to do here is actually to skip this
> > patch and simply remove all the extra checks from the individual
> > drivers. Perhaps the checks were needed at some point in time in the
> > past but maybe they no longer are? Certainly as we continue
> > transitioning over to "panel_bridge" then we expect there to be much
> > less variety in how these calls are made. When we're called as part of
> > the bridge chain, things should be pretty simple. In fact, there was
> > some discussion in the past about these checks [1], including a
> > discussion about whether the checks were needed and whether the calls
> > ought to be refcounted. At the time, I decided not to mess with it
> > because it felt too risky.
>
> Yeah, I'd agree here too. I've never found evidence that it was actually
> needed and it really looks like cargo cult to me.
>
> And if it was needed, then I'm not sure we need refcounting either. We
> don't have refcounting for atomic_enable / disable, we have a sound API
> design that makes sure we don't fall into that trap :)
>
> > Looking closer at it now, I'm fairly certain that nothing in the
> > existing codebase is expecting these calls to be refcounted. The only
> > real question is whether someone is already doing something to ensure
> > prepare()/unprepare() match and enabled()/disable() match. I would say
> > that, even if there is something else ensuring that things match,
> > there's enough complexity that adding an extra bool and an extra
> > double-check here is a good idea. Let's add a drm_warn() to let people
> > know that it's considered a minor error to take advantage of
> > drm_panel's double-checking but we'll still make things work fine.
>
> I'm ok with this, if we follow-up in a couple of releases and remove it
> and all the calls.
>
> Could you add a TODO item so that we can keep a track of it? A follow-up
> is fine if you don't send a new version of that series.

By this, I think you mean to add a "TODO" comment inline in the code?

Also: I was thinking that we'd keep the check in "drm_panel.c" with
the warning message indefinitely. You think it should be eventually
removed? If we are truly thinking of removing it eventually, this
feels like it should be a more serious warning message like a WARN(1,
...) to make it really obvious to people that they're relying on
behavior that will eventually go away.


-Doug

Doug Anderson July 26, 2023, 4:07 p.m. UTC | #5
Hi,

On Wed, Jul 26, 2023 at 1:57 AM Benjamin Tissoires <bentiss@kernel.org> wrote:
>
> > @@ -1143,7 +1208,14 @@ void i2c_hid_core_remove(struct i2c_client *client)
> >       struct i2c_hid *ihid = i2c_get_clientdata(client);
> >       struct hid_device *hid;
> >
> > -     i2c_hid_core_power_down(ihid);
> > +     /*
> > +      * If we're a follower, the act of unfollowing will cause us to be
> > +      * powered down. Otherwise we need to manually do it.
> > +      */
> > +     if (ihid->is_panel_follower)
> > +             drm_panel_remove_follower(&ihid->panel_follower);
>
> That part is concerning, as we are now calling hid_drv->suspend() when removing
> the device. It might or not have an impact (I'm not sure of it), but we
> are effectively changing the path of commands sent to the device.
>
> hid-multitouch might call a feature in ->suspend, but the remove makes
> that the physical is actually disconnected, so the function will fail,
> and I'm not sure what is happening then.

It's not too hard to change this if we're sure we want to. I could
change how the panel follower API works, though I'd rather keep it how
it is now for symmetry. Thus, if we want to do this I'd probably just
set a boolean at the beginning of i2c_hid_core_remove() to avoid the
suspend when the panel follower API calls us back.
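
As a rough illustration of that boolean approach (a user-space sketch under assumptions, not the driver code): the `removing` flag, `struct mock_ihid`, and all `mock_` functions below are hypothetical names. The sketch assumes that unfollowing a powered panel calls back into the "unpreparing" callback, and that the guarded path still cuts power but skips the HID suspend step.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct i2c_hid. */
struct mock_ihid {
	bool is_panel_follower;
	bool removing;		/* hypothetical flag set at the start of remove */
	bool powered;
	int suspend_calls;	/* counts trips through the HID suspend path */
};

/* Stand-in for the full suspend path (would call into hid_drv->suspend()). */
static int mock_core_suspend(struct mock_ihid *ihid)
{
	ihid->suspend_calls++;
	ihid->powered = false;
	return 0;
}

/* Panel "unpreparing" callback: skip the suspend path while removing. */
static int mock_panel_unpreparing(struct mock_ihid *ihid)
{
	if (ihid->removing) {
		ihid->powered = false;	/* power down only */
		return 0;
	}

	return mock_core_suspend(ihid);
}

/* Assumed behavior: unfollowing a powered panel invokes the callback. */
static void mock_remove_follower(struct mock_ihid *ihid)
{
	if (ihid->powered)
		mock_panel_unpreparing(ihid);
}

static void mock_core_remove(struct mock_ihid *ihid)
{
	ihid->removing = true;

	if (ihid->is_panel_follower)
		mock_remove_follower(ihid);
	else
		ihid->powered = false;	/* plain power down */
}
```

The follower API itself is untouched here; only the callback consults the flag, which keeps the add/remove calls symmetric.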

That being said, are you sure you want me to do that?

1. My patch doesn't change the behavior of any existing hardware. It
will only do anything for hardware that indicates it needs the panel
follower logic. Presumably these people could confirm that the logic
is OK for them, though I'll also admit that it's likely not many of
them will test the remove() case.

2. Can you give more details about why you say that the function will
fail? The first thing that the remove() function will do is to
unfollow the panel and that can cause the suspend to happen. At the
time this code runs all the normal communications should work and so
there should be no problems calling into the suspend code.

3. You can correct me if I'm wrong, but I'd actually argue that
calling the suspend code during remove actually fixes issues and we
should probably do it for the non-panel-follower case as well. I think
there are at least two benefits. One benefit is that if the i2c-hid
device is on a power rail that can't turn off (either an always-on or
a shared power rail) that we'll at least get the device in a low power
state before we stop managing it with this driver. The second benefit
is that it implicitly disables the interrupt and that fixes a
potential crash at remove() time. The crash in the old code I'm
imagining is:

a) i2c_hid_core_remove() is called.

b) We try to power down the i2c hid device, which might not do
anything if the device is on an always-on rail.

c) We call hid_destroy_device(), which frees the hid device.

d) An interrupt comes in before the call to free_irq() and we try to
dispatch it to the already freed hid device and crash.


If you agree that my reasoning makes sense, I can add a separate patch
before this one to suspend during remove.
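
The ordering argument in steps a) to d) can be modeled with a small user-space sketch. This is only a toy model: `struct mock_dev` and the `mock_` functions are hypothetical, the interrupt is simulated by a direct call placed in the window before the IRQ is freed, and a counter stands in for the use-after-free.

```c
#include <stdbool.h>

/* Hypothetical toy model of the device state relevant to the race. */
struct mock_dev {
	bool irq_enabled;
	bool hid_alive;
	int bad_dispatches;	/* dispatches to an already freed hid device */
};

/* Simulated interrupt handler entry. */
static void mock_irq_fires(struct mock_dev *d)
{
	if (!d->irq_enabled)
		return;

	if (!d->hid_alive)
		d->bad_dispatches++;	/* real code would touch freed memory */
}

/* Suspending implicitly disables the interrupt. */
static void mock_suspend(struct mock_dev *d)
{
	d->irq_enabled = false;
}

static void mock_remove(struct mock_dev *d, bool suspend_first)
{
	if (suspend_first)
		mock_suspend(d);

	d->hid_alive = false;	/* stand-in for hid_destroy_device() */
	mock_irq_fires(d);	/* interrupt arrives before the IRQ is freed */
	d->irq_enabled = false;	/* stand-in for free_irq() */
}
```

In this model the old ordering records one bad dispatch, while suspending first records none because the interrupt is already off when the hid device goes away.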



-Doug

Benjamin Tissoires July 26, 2023, 4:45 p.m. UTC | #6
On Jul 26 2023, Doug Anderson wrote:
> Hi,
> 
> On Wed, Jul 26, 2023 at 1:57 AM Benjamin Tissoires <bentiss@kernel.org> wrote:
> >
> > > @@ -1143,7 +1208,14 @@ void i2c_hid_core_remove(struct i2c_client *client)
> > >       struct i2c_hid *ihid = i2c_get_clientdata(client);
> > >       struct hid_device *hid;
> > >
> > > -     i2c_hid_core_power_down(ihid);
> > > +     /*
> > > +      * If we're a follower, the act of unfollowing will cause us to be
> > > +      * powered down. Otherwise we need to manually do it.
> > > +      */
> > > +     if (ihid->is_panel_follower)
> > > +             drm_panel_remove_follower(&ihid->panel_follower);
> >
> > That part is concerning, as we are now calling hid_drv->suspend() when removing
> > the device. It might or not have an impact (I'm not sure of it), but we
> > are effectively changing the path of commands sent to the device.
> >
> > hid-multitouch might call a feature in ->suspend, but the remove makes
> > that the physical is actually disconnected, so the function will fail,
> > and I'm not sure what is happening then.
> 
> It's not too hard to change this if we're sure we want to. I could
> change how the panel follower API works, though I'd rather keep it how
> it is now for symmetry. Thus, if we want to do this I'd probably just
> set a boolean at the beginning of i2c_hid_core_remove() to avoid the
> suspend when the panel follower API calls us back.

I was thinking more of a boolean. No need to overload the API.

> 
> That being said, are you sure you want me to do that?
> 
> 1. My patch doesn't change the behavior of any existing hardware. It
> will only do anything for hardware that indicates it needs the panel
> follower logic. Presumably these people could confirm that the logic
> is OK for them, though I'll also admit that it's likely not many of
> them will test the remove() case.

Isn't trogdor (patch 10/10) already supported? Though you should be the
one making tests, so it should be fine ;)

> 
> 2. Can you give more details about why you say that the function will
> fail? The first thing that the remove() function will do is to
> unfollow the panel and that can cause the suspend to happen. At the
> time this code runs all the normal communications should work and so
> there should be no problems calling into the suspend code.

Now that I think about it more, maybe I am too biased by USB where the
device remove would happen *after* the device has been physically
unplugged. And this doesn't apply of course in the I2C world.

> 
> 3. You can correct me if I'm wrong, but I'd actually argue that
> calling the suspend code during remove actually fixes issues and we
> should probably do it for the non-panel-follower case as well. I think
> there are at least two benefits. One benefit is that if the i2c-hid
> device is on a power rail that can't turn off (either an always-on or
> a shared power rail) that we'll at least get the device in a low power
> state before we stop managing it with this driver. The second benefit
> is that it implicitly disables the interrupt and that fixes a
> potential crash at remove time(). The crash in the old code I'm
> imagining is:
> 
> a) i2c_hid_core_remove() is called.
> 
> b) We try to power down the i2c hid device, which might not do
> anything if the device is on an always-on rail.
> 
> c) We call hid_destroy_device(), which frees the hid device.
> 
> d) An interrupt comes in before the call to free_irq() and we try to
> dispatch it to the already freed hid device and crash.
> 
> 
> If you agree that my reasoning makes sense, I can add a separate patch
> before this one to suspend during remove.

Yep, I agree with you :)

Adding a separate patch would be nice, yes. Thanks!

Cheers,
Benjamin

Maxime Ripard July 27, 2023, 6:37 a.m. UTC | #7
Hi,

On Wed, Jul 26, 2023 at 08:10:33AM -0700, Doug Anderson wrote:
> On Wed, Jul 26, 2023 at 5:41 AM Maxime Ripard <mripard@kernel.org> wrote:
> > On Tue, Jul 25, 2023 at 01:34:37PM -0700, Douglas Anderson wrote:
> > > NOTE: arguably, the right thing to do here is actually to skip this
> > > patch and simply remove all the extra checks from the individual
> > > drivers. Perhaps the checks were needed at some point in time in the
> > > past but maybe they no longer are? Certainly as we continue
> > > transitioning over to "panel_bridge" then we expect there to be much
> > > less variety in how these calls are made. When we're called as part of
> > > the bridge chain, things should be pretty simple. In fact, there was
> > > some discussion in the past about these checks [1], including a
> > > discussion about whether the checks were needed and whether the calls
> > > ought to be refcounted. At the time, I decided not to mess with it
> > > because it felt too risky.
> >
> > Yeah, I'd agree here too. I've never found evidence that it was actually
> > needed and it really looks like cargo cult to me.
> >
> > And if it was needed, then I'm not sure we need refcounting either. We
> > don't have refcounting for atomic_enable / disable, we have a sound API
> > design that makes sure we don't fall into that trap :)
> >
> > > Looking closer at it now, I'm fairly certain that nothing in the
> > > existing codebase is expecting these calls to be refcounted. The only
> > > real question is whether someone is already doing something to ensure
> > > prepare()/unprepare() match and enabled()/disable() match. I would say
> > > that, even if there is something else ensuring that things match,
> > > there's enough complexity that adding an extra bool and an extra
> > > double-check here is a good idea. Let's add a drm_warn() to let people
> > > know that it's considered a minor error to take advantage of
> > > drm_panel's double-checking but we'll still make things work fine.
> >
> > I'm ok with this, if we follow-up in a couple of releases and remove it
> > and all the calls.
> >
> > Could you add a TODO item so that we can keep a track of it? A follow-up
> > is fine if you don't send a new version of that series.
> 
> By this, I think you mean to add a "TODO" comment inline in the code?

No, sorry, I meant an entry in our TODO list: Documentation/gpu/todo.rst

> Also: I was thinking that we'd keep the check in "drm_panel.c" with
> the warning message indefinitely. You think it should be eventually
> removed? If we are truly thinking of removing it eventually, this
> feels like it should be a more serious warning message like a WARN(1,
> ...) to make it really obvious to people that they're relying on
> behavior that will eventually go away.

Yeah, it really feels like this is cargo-cult to me. Your approach seems
like a good short-term thing to do to warn everyone but eventually we'll
want it to go away.

So promoting it to a WARN could be a good thing, or let's say we do a
drm_warn for now, WARN next release, and gone in two?

Maxime

Chris Morgan July 31, 2023, 4:33 p.m. UTC | #8
In my case a few different panel drivers disable the regulators in the
unprepare/disable routines. For at least the Rockchip DSI
implementations for some reason the panel gets unprepared more than
once, which triggers an unbalanced regulator disable. Obviously though
the correct course of action is to fix the reason why the panel is
disabled more than once, but that's at least the root cause of this
behavior on the few panels I've worked with.
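
The double unprepare described here, and the bool-plus-warning double-check discussed earlier in the thread, can be modeled in user space roughly as follows. This is a sketch under assumptions, not the drm_panel code: `struct mock_panel`, the `mock_` functions, and the warning text are hypothetical, and `regulator_refs` is a plain counter standing in for a regulator's enable count.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a panel with one regulator. */
struct mock_panel {
	bool prepared;
	int regulator_refs;	/* stand-in for the regulator enable count */
	int warnings;		/* how often the double-check fired */
};

static int mock_panel_prepare(struct mock_panel *p)
{
	if (p->prepared) {
		fprintf(stderr, "skipping prepare of already prepared panel\n");
		p->warnings++;
		return 0;
	}

	p->regulator_refs++;	/* stand-in for enabling the regulator */
	p->prepared = true;
	return 0;
}

static int mock_panel_unprepare(struct mock_panel *p)
{
	if (!p->prepared) {
		fprintf(stderr, "skipping unprepare of already unprepared panel\n");
		p->warnings++;
		return 0;
	}

	p->regulator_refs--;	/* stand-in for disabling the regulator */
	p->prepared = false;
	return 0;
}
```

A second unprepare then warns and returns early instead of driving the counter negative, which is the unbalanced disable in miniature. The caller that unprepares twice is still the thing to fix.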

Thank you.

On Thu, Jul 27, 2023 at 1:38 AM Maxime Ripard <mripard@kernel.org> wrote:
>
> Hi,
>
> On Wed, Jul 26, 2023 at 08:10:33AM -0700, Doug Anderson wrote:
> > On Wed, Jul 26, 2023 at 5:41 AM Maxime Ripard <mripard@kernel.org> wrote:
> > > On Tue, Jul 25, 2023 at 01:34:37PM -0700, Douglas Anderson wrote:
> > > > NOTE: arguably, the right thing to do here is actually to skip this
> > > > patch and simply remove all the extra checks from the individual
> > > > drivers. Perhaps the checks were needed at some point in time in the
> > > > past but maybe they no longer are? Certainly as we continue
> > > > transitioning over to "panel_bridge" then we expect there to be much
> > > > less variety in how these calls are made. When we're called as part of
> > > > the bridge chain, things should be pretty simple. In fact, there was
> > > > some discussion in the past about these checks [1], including a
> > > > discussion about whether the checks were needed and whether the calls
> > > > ought to be refcounted. At the time, I decided not to mess with it
> > > > because it felt too risky.
> > >
> > > Yeah, I'd agree here too. I've never found evidence that it was actually
> > > needed and it really looks like cargo cult to me.
> > >
> > > And if it was needed, then I'm not sure we need refcounting either. We
> > > don't have refcounting for atomic_enable / disable, we have a sound API
> > > design that makes sure we don't fall into that trap :)
> > >
> > > > Looking closer at it now, I'm fairly certain that nothing in the
> > > > existing codebase is expecting these calls to be refcounted. The only
> > > > real question is whether someone is already doing something to ensure
> > > > prepare()/unprepare() match and enabled()/disable() match. I would say
> > > > that, even if there is something else ensuring that things match,
> > > > there's enough complexity that adding an extra bool and an extra
> > > > double-check here is a good idea. Let's add a drm_warn() to let people
> > > > know that it's considered a minor error to take advantage of
> > > > drm_panel's double-checking but we'll still make things work fine.
> > >
> > > I'm ok with this, if we follow-up in a couple of releases and remove it
> > > and all the calls.
> > >
> > > Could you add a TODO item so that we can keep a track of it? A follow-up
> > > is fine if you don't send a new version of that series.
> >
> > By this, I think you mean to add a "TODO" comment inline in the code?
>
> No, sorry, I meant an entry in our TODO list: Documentation/gpu/todo.rst
>
> > Also: I was thinking that we'd keep the check in "drm_panel.c" with
> > the warning message indefinitely. You think it should be eventually
> > removed? If we are truly thinking of removing it eventually, this
> > feels like it should be a more serious warning message like a WARN(1,
> > ...) to make it really obvious to people that they're relying on
> > behavior that will eventually go away.
>
> Yeah, it really feels like this is cargo-cult to me. Your approach seems
> like a good short-term thing to do to warn everyone but eventually we'll
> want it to go away.
>
> So promoting it to a WARN could be a good thing, or let's say we do a
> drm_warn for now, WARN next release, and gone in two?
>
> Maxime

Maxime Ripard July 31, 2023, 5:03 p.m. UTC | #9
Hi,

On Mon, Jul 31, 2023 at 11:33:22AM -0500, Chris Morgan wrote:
> In my case a few different panel drivers disable the regulators in the
> unprepare/disable routines.

And that's totally fine.

> For at least the Rockchip DSI implementations for some reason the
> panel gets unprepared more than once, which triggers an unbalanced
> regulator disable.

"For some reason" being that DW-DSI apparently finds it ok to bypass any
kind of abstraction and randomly call panel functions by itself:

https://elixir.bootlin.com/linux/v6.4.7/source/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c#L868

It looks like it's fixed in current drm-misc-next though.

> Obviously though the correct course of action is to fix the reason why
> the panel is disabled more than once, but that's at least the root
> cause of this behavior on the few panels I've worked with.

Like I said we already have a commit on the way to fix that, so it
shouldn't be an issue anymore.

I stand by what I was saying earlier though, I think it's mostly
cargo-cult or drivers being very wrong. If anything, the DW-DSI stuff
made me even more convinced that we shouldn't even entertain that idea
:)

Maxime
Chris Morgan Aug. 2, 2023, 5:25 p.m. UTC | #10
On Mon, Jul 31, 2023 at 07:03:07PM +0200, Maxime Ripard wrote:
> Hi,
> 
> On Mon, Jul 31, 2023 at 11:33:22AM -0500, Chris Morgan wrote:
> > In my case a few different panel drivers disable the regulators in the
> > unprepare/disable routines.
> 
> And that's totally fine.
> 
> > For at least the Rockchip DSI implementations for some reason the
> > panel gets unprepared more than once, which triggers an unbalanced
> > regulator disable.
> 
> "For some reason" being that DW-DSI apparently finds it ok to bypass any
> kind of abstraction and randomly call panel functions by itself:
> 
> https://elixir.bootlin.com/linux/v6.4.7/source/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c#L868
> 
> It looks like it's fixed in current drm-misc-next though.

Good, when I get a chance I will test it out with the existing panels
I have at my disposal and submit some patches to clean them up.

> 
> > Obviously though the correct course of action is to fix the reason why
> > the panel is disabled more than once, but that's at least the root
> > cause of this behavior on the few panels I've worked with.
> 
> Like I said we already have a commit on the way to fix that, so it
> shouldn't be an issue anymore.
> 
> I stand by what I was saying earlier though, I think it's mostly
> cargo-cult or drivers being very wrong. If anything, the DW-DSI stuff
> made me even more convinced that we shouldn't even entertain that idea
> :)
> 
> Maxime

Thank you, and yes if a driver is doing something it shouldn't we
shouldn't be patching around that, we should be fixing things. Thanks
for providing me with the additional info.

Chris
Dave Stevenson Aug. 2, 2023, 5:50 p.m. UTC | #11
On Wed, 2 Aug 2023 at 18:26, Chris Morgan <macroalpha82@gmail.com> wrote:
>
> On Mon, Jul 31, 2023 at 07:03:07PM +0200, Maxime Ripard wrote:
> > Hi,
> >
> > On Mon, Jul 31, 2023 at 11:33:22AM -0500, Chris Morgan wrote:
> > > In my case a few different panel drivers disable the regulators in the
> > > unprepare/disable routines.
> >
> > And that's totally fine.
> >
> > > For at least the Rockchip DSI implementations for some reason the
> > > panel gets unprepared more than once, which triggers an unbalanced
> > > regulator disable.
> >
> > "For some reason" being that DW-DSI apparently finds it ok to bypass any
> > kind of abstraction and randomly call panel functions by itself:
> >
> > https://elixir.bootlin.com/linux/v6.4.7/source/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c#L868
> >
> > It looks like it's fixed in current drm-misc-next though.
>
> Good, when I get a chance I will test it out with the existing panels
> I have at my disposal and submit some patches to clean them up.
>
> >
> > > Obviously though the correct course of action is to fix the reason why
> > > the panel is disabled more than once, but that's at least the root
> > > cause of this behavior on the few panels I've worked with.
> >
> > Like I said we already have a commit on the way to fix that, so it
> > shouldn't be an issue anymore.
> >
> > I stand by what I was saying earlier though, I think it's mostly
> > cargo-cult or drivers being very wrong. If anything, the DW-DSI stuff
> > made me even more convinced that we shouldn't even entertain that idea
> > :)

DW-DSI is hacking around the fact that DSI panels may want to send DCS
commands in unprepare. However, the DSI host driver shuts down the
controller in the DSI bridge post_disable, which gets called first.

That ordering can now be reversed with the pre_enable_prev_first flag in
struct drm_bridge, or prepare_prev_first in struct drm_panel, hence no
need for the DSI controller to jump around the bridge chain.
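
To make the ordering concrete, here is a toy userspace model of it. It is
only a sketch, not the DRM core code: struct toy_bridge and
toy_post_disable_order() are hypothetical names made up for illustration,
and the only thing borrowed from the kernel is the name of the
pre_enable_prev_first flag. The assumption is that an element which asked
for its predecessor to be enabled first is also torn down before that
predecessor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one element of a bridge chain. */
struct toy_bridge {
	const char *name;
	/* Mirrors the meaning of the real flag: enable prev first. */
	int pre_enable_prev_first;
};

/*
 * Fill order[] with element names in the order a post_disable pass
 * would visit them. chain[0] is closest to the encoder, chain[n - 1]
 * is closest to the panel. order[] must have room for n entries.
 */
void toy_post_disable_order(const struct toy_bridge *chain, size_t n,
			    const char **order)
{
	size_t out = 0;
	size_t i = 0;

	assert(chain != NULL && order != NULL);

	while (i < n) {
		size_t last = i;
		size_t j;

		/* Collect followers that asked for prev to go first. */
		while (last + 1 < n && chain[last + 1].pre_enable_prev_first)
			last++;

		/* Tear the group down in reverse: followers, then chain[i]. */
		for (j = last + 1; j-- > i; )
			order[out++] = chain[j].name;

		i = last + 1;
	}
}
```

With a two-element chain of a DSI host followed by a panel, the model
visits the host first by default, which is the case where the panel can no
longer send DCS commands in unprepare. Setting the flag on the panel entry
flips the order, so the panel is torn down while the host is still up.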

  Dave

> > Maxime
>
> Thank you, and yes if a driver is doing something it shouldn't we
> shouldn't be patching around that, we should be fixing things. Thanks
> for providing me with the additional info.
>
> Chris
>