Message ID | 20201120172153.2112-1-kamal@canonical.com |
---|---|
State | New |
Headers | show |
Series | [SRU,Groovy] drm/i915: Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init | expand |
On 2020-11-20 09:21:53 , Kamal Mostafa wrote: > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > BugLink: https://bugs.launchpad.net/bugs/1903397 > > Currently we leave the cache_level of the initial fb obj > set to NONE. This means on eLLC machines the first pin_to_display() > will try to switch it to WT which requires a vma unbind+bind. > If that happens during the fbdev initialization rcu does not > seem operational which causes the unbind to get stuck. To > most appearances this looks like a dead machine on boot. > > Avoid the unbind by already marking the object cache_level > as WT when creating it. We still do an excplicit ggtt pin > which will rewrite the PTEs anyway, so they will match whatever > cache level we set. > > Cc: <stable@vger.kernel.org> # v5.7+ > Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381 > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@linux.intel.com > Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-1-chris@chris-wilson.co.uk > (cherry picked from commit d46b60a2e8d246f1f0faa38e52f4f5a73858c338) > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> > (cherry picked from commit 1664ffee760a5d98952318fdd9b198fae396d660) > Signed-off-by: Kamal Mostafa <kamal@canonical.com> > --- Clean cherry pick. Applies and builds ok. LGTM. Thank you, Kamal! Acked-by: Kelsey Skunberg <kelsey.skunberg@canonical.com> > drivers/gpu/drm/i915/display/intel_display.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c > index 26996e1839e2..4bd383ff0880 100644 > --- a/drivers/gpu/drm/i915/display/intel_display.c > +++ b/drivers/gpu/drm/i915/display/intel_display.c > @@ -3431,6 +3431,14 @@ initial_plane_vma(struct drm_i915_private *i915, > if (IS_ERR(obj)) > return NULL; > > + /* > + * Mark it WT ahead of time to avoid changing the > + * cache_level during fbdev initialization. The > + * unbind there would get stuck waiting for rcu. > + */ > + i915_gem_object_set_cache_coherency(obj, HAS_WT(i915) ? > + I915_CACHE_WT : I915_CACHE_NONE); > + > switch (plane_config->tiling) { > case I915_TILING_NONE: > break; > -- > 2.17.1 > > > -- > kernel-team mailing list > kernel-team@lists.ubuntu.com > https://lists.ubuntu.com/mailman/listinfo/kernel-team
On 20.11.20 18:21, Kamal Mostafa wrote: > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > BugLink: https://bugs.launchpad.net/bugs/1903397 > > Currently we leave the cache_level of the initial fb obj > set to NONE. This means on eLLC machines the first pin_to_display() > will try to switch it to WT which requires a vma unbind+bind. > If that happens during the fbdev initialization rcu does not > seem operational which causes the unbind to get stuck. To > most appearances this looks like a dead machine on boot. > > Avoid the unbind by already marking the object cache_level > as WT when creating it. We still do an excplicit ggtt pin > which will rewrite the PTEs anyway, so they will match whatever > cache level we set. > > Cc: <stable@vger.kernel.org> # v5.7+ > Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381 > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@linux.intel.com > Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-1-chris@chris-wilson.co.uk > (cherry picked from commit d46b60a2e8d246f1f0faa38e52f4f5a73858c338) > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> > (cherry picked from commit 1664ffee760a5d98952318fdd9b198fae396d660) > Signed-off-by: Kamal Mostafa <kamal@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> > --- Maybe the bug report (at least the justification) could be clarified to the fact that this is something appearing to be a hang but in reality it is no video output. -Stefan > drivers/gpu/drm/i915/display/intel_display.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c > index 26996e1839e2..4bd383ff0880 100644 > --- a/drivers/gpu/drm/i915/display/intel_display.c > +++ b/drivers/gpu/drm/i915/display/intel_display.c > @@ -3431,6 +3431,14 @@ initial_plane_vma(struct drm_i915_private *i915, > if (IS_ERR(obj)) > return NULL; > > + /* > + * Mark it WT ahead of time to avoid changing the > + * cache_level during fbdev initialization. The > + * unbind there would get stuck waiting for rcu. > + */ > + i915_gem_object_set_cache_coherency(obj, HAS_WT(i915) ? > + I915_CACHE_WT : I915_CACHE_NONE); > + > switch (plane_config->tiling) { > case I915_TILING_NONE: > break; >
On 20.11.20 18:21, Kamal Mostafa wrote: > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > > BugLink: https://bugs.launchpad.net/bugs/1903397 > > Currently we leave the cache_level of the initial fb obj > set to NONE. This means on eLLC machines the first pin_to_display() > will try to switch it to WT which requires a vma unbind+bind. > If that happens during the fbdev initialization rcu does not > seem operational which causes the unbind to get stuck. To > most appearances this looks like a dead machine on boot. > > Avoid the unbind by already marking the object cache_level > as WT when creating it. We still do an excplicit ggtt pin > which will rewrite the PTEs anyway, so they will match whatever > cache level we set. > > Cc: <stable@vger.kernel.org> # v5.7+ > Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381 > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> > Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Link: https://patchwork.freedesktop.org/patch/msgid/20201007120329.17076-1-ville.syrjala@linux.intel.com > Link: https://patchwork.freedesktop.org/patch/msgid/20201015122138.30161-1-chris@chris-wilson.co.uk > (cherry picked from commit d46b60a2e8d246f1f0faa38e52f4f5a73858c338) > Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> > (cherry picked from commit 1664ffee760a5d98952318fdd9b198fae396d660) > Signed-off-by: Kamal Mostafa <kamal@canonical.com> > --- > drivers/gpu/drm/i915/display/intel_display.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c > index 26996e1839e2..4bd383ff0880 100644 > --- a/drivers/gpu/drm/i915/display/intel_display.c > +++ b/drivers/gpu/drm/i915/display/intel_display.c > @@ -3431,6 +3431,14 @@ initial_plane_vma(struct drm_i915_private *i915, > if (IS_ERR(obj)) > return NULL; > > + /* > + * Mark it WT ahead of time to avoid changing the > + * cache_level during fbdev initialization. The > + * unbind there would get stuck waiting for rcu. > + */ > + i915_gem_object_set_cache_coherency(obj, HAS_WT(i915) ? > + I915_CACHE_WT : I915_CACHE_NONE); > + > switch (plane_config->tiling) { > case I915_TILING_NONE: > break; > Applied to groovy/linux. Thanks, Kleber
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 26996e1839e2..4bd383ff0880 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -3431,6 +3431,14 @@ initial_plane_vma(struct drm_i915_private *i915, if (IS_ERR(obj)) return NULL; + /* + * Mark it WT ahead of time to avoid changing the + * cache_level during fbdev initialization. The + * unbind there would get stuck waiting for rcu. + */ + i915_gem_object_set_cache_coherency(obj, HAS_WT(i915) ? + I915_CACHE_WT : I915_CACHE_NONE); + switch (plane_config->tiling) { case I915_TILING_NONE: break;