From patchwork Mon Jul 28 18:28:28 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Luis R. Rodriguez" X-Patchwork-Id: 374299 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 96C3214007D for ; Tue, 29 Jul 2014 04:30:32 +1000 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751725AbaG1S2y (ORCPT ); Mon, 28 Jul 2014 14:28:54 -0400 Received: from mail-pd0-f172.google.com ([209.85.192.172]:50886 "EHLO mail-pd0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751634AbaG1S2q (ORCPT ); Mon, 28 Jul 2014 14:28:46 -0400 Received: by mail-pd0-f172.google.com with SMTP id ft15so10365781pdb.31 for ; Mon, 28 Jul 2014 11:28:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references; bh=7nwM3+7VRI/5qdhLiDC6rGreI+k04my/A+e8w9QFy3E=; b=vWjWxYNRdEMnAk8Ga21zeNH6xOpIlXVahkAmwA48ZWZH9dVZ+9/0F9Axx1XJ/Lt1zA Gk+1hddED+zNMzpvUuo55rJkqXlwcuM+79EpGCC4/Zw6q1ez2iag4bIzczJNf90GabrL sQyitzK7I3GnFrq5fk0eTbvqU8KG5YwRaoRfDxFSNNXjrs0CxQFG0ETf1d0RgZLxNx8G Q6iS3Sr1HhQsmBiN3TUMXkuZhAJTSi6KqxWaiXSQ9IHNE2FBdnsmzi1Sf4MJUIcEzhfF Acj99PuVPsRFRFmrgfIE0aRtchCbNV8mviPBTy8AB9tpHqRTW0sEHgHUIL2WA/KTzs4b mDVQ== X-Received: by 10.66.152.171 with SMTP id uz11mr8031388pab.96.1406572125691; Mon, 28 Jul 2014 11:28:45 -0700 (PDT) Received: from mcgrof@gmail.com (c-98-234-145-61.hsd1.ca.comcast.net. [98.234.145.61]) by mx.google.com with ESMTPSA id br1sm18384659pbc.6.2014.07.28.11.28.42 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Mon, 28 Jul 2014 11:28:44 -0700 (PDT) Received: by mcgrof@gmail.com (sSMTP sendmail emulation); Mon, 28 Jul 2014 11:28:41 -0700 From: "Luis R. Rodriguez" To: gregkh@linuxfoundation.org Cc: tiwai@suse.de, linux-kernel@vger.kernel.org, "Luis R. Rodriguez" , Tetsuo Handa , Joseph Salisbury , Kay Sievers , One Thousand Gnomes , Tim Gardner , Pierre Fersing , Andrew Morton , Oleg Nesterov , Benjamin Poirier , Nagalakshmi Nandigama , Praveen Krishnamoorthy , Sreekanth Reddy , Abhijit Mahajan , Hariprasad S , Santosh Rastapur , MPT-FusionLinux.pdl@avagotech.com, linux-scsi@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH v2 2/4] driver core: enable drivers to use deferred probe from init Date: Mon, 28 Jul 2014 11:28:28 -0700 Message-Id: <1406572110-26823-3-git-send-email-mcgrof@do-not-panic.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1406572110-26823-1-git-send-email-mcgrof@do-not-panic.com> References: <1406572110-26823-1-git-send-email-mcgrof@do-not-panic.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: "Luis R. Rodriguez" Tetsuo bisected and found that commit 786235ee "kthread: make kthread_create() killable" modified kthread_create() to bail as soon as SIGKILL is received. This is causing some issues with some drivers and at times boot. Joseph then found that failures occur as the systemd-udevd process sends SIGKILL to modprobe if probe on a driver takes over 30 seconds. When this happens probe will fail on any driver, its why booting on some system will fail if the driver happens to be a storage related driver. Some folks have suggested fixing this by modifying kthread_create() to not leave upon SIGKILL [3], upon review Oleg rejected this change and the discussion was punted out to systemd to see if the default timeout could be increased from 30 seconds to 120. The opinion of the systemd maintainers is that the driver's behavior should be fixed [4]. Linus seems to agree [5], however more recently even networking drivers have been reported to fail on probe since just writing the firmware to a device and kicking it can take easy over 60 seconds [6]. Benjamim was able to trace the issues recently reported on cxgb4 down to the same systemd-udevd 30 second timeout [6]. This is an alternative solution which enables drivers that are known to take long to use deferred probe workqueue. This avoids the 30 second timeout and lets us annotate drivers with long init sequences. As drivers determine a component is not yet available and needs to defer probe you'll be notified this happen upon init for each device but now with a message such as: pci 0000:03:00.0: Driver cxgb4 requests probe deferral on init You should see one of these per struct device probed. [0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1276705 [1] https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1297248 [2] http://lists.freedesktop.org/archives/systemd-devel/2014-March/018006.html [3] http://thread.gmane.org/gmane.linux.ubuntu.devel.kernel.general/39123 [4] http://article.gmane.org/gmane.comp.sysutils.systemd.devel/17860 [5] http://article.gmane.org/gmane.linux.kernel/1671333 [6] https://bugzilla.novell.com/show_bug.cgi?id=877622 Cc: Greg Kroah-Hartman Cc: Tetsuo Handa Cc: Joseph Salisbury Cc: Kay Sievers Cc: One Thousand Gnomes Cc: Tim Gardner Cc: Pierre Fersing Cc: Andrew Morton Cc: Oleg Nesterov Cc: Benjamin Poirier Cc: Greg Kroah-Hartman Cc: Nagalakshmi Nandigama Cc: Praveen Krishnamoorthy Cc: Sreekanth Reddy Cc: Abhijit Mahajan Cc: Hariprasad S Cc: Santosh Rastapur Cc: MPT-FusionLinux.pdl@avagotech.com Cc: linux-scsi@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Luis R. Rodriguez --- drivers/base/dd.c | 18 +++++++++++++++++- include/linux/device.h | 7 +++++++ 2 files changed, 24 insertions(+), 1 deletion(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 9947c2e..d59e4b5 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -159,6 +159,9 @@ static void driver_deferred_probe_add(struct device *dev) list_add_tail(&dev->p->deferred_probe, &deferred_probe_pending_list); } mutex_unlock(&deferred_probe_mutex); + + if (driver_deferred_probe_enable) + driver_deferred_probe_trigger(); } void driver_deferred_probe_del(struct device *dev) @@ -374,6 +377,19 @@ void wait_for_device_probe(void) } EXPORT_SYMBOL_GPL(wait_for_device_probe); +static int __driver_probe_device(struct device_driver *drv, struct device *dev) +{ + if (drv->delay_probe && !dev->init_delayed_probe) { + dev_info(dev, "Driver %s requests probe deferral on init\n", + drv->name); + dev->init_delayed_probe = true; + driver_deferred_probe_add(dev); + return -EPROBE_DEFER; + } + + return really_probe(dev, drv); +} + /** * driver_probe_device - attempt to bind device & driver together * @drv: driver to bind a device to @@ -396,7 +412,7 @@ int driver_probe_device(struct device_driver *drv, struct device *dev) drv->bus->name, __func__, dev_name(dev), drv->name); pm_runtime_barrier(dev); - ret = really_probe(dev, drv); + ret = __driver_probe_device(drv, dev); pm_request_idle(dev); return ret; diff --git a/include/linux/device.h b/include/linux/device.h index af424ac..c317dfa 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -200,6 +200,9 @@ extern struct klist *bus_get_device_klist(struct bus_type *bus); * @owner: The module owner. * @mod_name: Used for built-in modules. * @suppress_bind_attrs: Disables bind/unbind via sysfs. + * @delay_probe: this driver is requesting a deferred probe since + * initialization. This can be desirable if its known the device probe + * or initialization takes more than 30 seconds. * @of_match_table: The open firmware table. * @acpi_match_table: The ACPI match table. * @probe: Called to query the existence of a specific device, @@ -233,6 +236,7 @@ struct device_driver { const char *mod_name; /* used for built-in modules */ bool suppress_bind_attrs; /* disables bind/unbind via sysfs */ + bool delay_probe; /* requests deferred probe */ const struct of_device_id *of_match_table; const struct acpi_device_id *acpi_match_table; @@ -715,6 +719,8 @@ struct acpi_dev_node { * * @offline_disabled: If set, the device is permanently online. * @offline: Set after successful invocation of bus type's .offline(). + * @init_delayed_probe: lets the coore keep track if the device has already + * gone through a delayed probe upon init. * * At the lowest level, every device in a Linux system is represented by an * instance of struct device. The device structure contains the information @@ -793,6 +799,7 @@ struct device { bool offline_disabled:1; bool offline:1; + bool init_delayed_probe:1; }; static inline struct device *kobj_to_dev(struct kobject *kobj)