From patchwork Wed Mar 9 00:24:42 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kamal Mostafa X-Patchwork-Id: 594494 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id ADA4114031D; Wed, 9 Mar 2016 11:25:46 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1adRwd-0003ZO-DP; Wed, 09 Mar 2016 00:25:43 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.76) (envelope-from ) id 1adRvj-0002zM-0B for kernel-team@lists.ubuntu.com; Wed, 09 Mar 2016 00:24:47 +0000 Received: from 1.general.kamal.us.vpn ([10.172.68.52] helo=fourier) by youngberry.canonical.com with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.76) (envelope-from ) id 1adRvi-0002qb-Gn; Wed, 09 Mar 2016 00:24:46 +0000 Received: from kamal by fourier with local (Exim 4.86) (envelope-from ) id 1adRvf-0006Q2-Rv; Tue, 08 Mar 2016 16:24:43 -0800 From: Kamal Mostafa To: James Bottomley Subject: [3.19.y-ckt stable] Patch "scsi: fix soft lockup in scsi_remove_target() on module removal" has been added to the 3.19.y-ckt tree Date: Tue, 8 Mar 2016 16:24:42 -0800 Message-Id: <1457483082-24639-1-git-send-email-kamal@canonical.com> X-Mailer: git-send-email 2.7.0 X-Extended-Stable: 3.19 Cc: Kamal Mostafa , kernel-team@lists.ubuntu.com, Sebastian Herbszt X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: kernel-team-bounces@lists.ubuntu.com This is a note to let you know that I have just added a patch titled scsi: fix soft lockup in scsi_remove_target() on module removal to the linux-3.19.y-queue branch of the 3.19.y-ckt extended stable tree which can be found at: http://kernel.ubuntu.com/git/ubuntu/linux.git/log/?h=linux-3.19.y-queue This patch is scheduled to be released in version 3.19.8-ckt16. If you, or anyone else, feels it should not be added to this tree, please reply to this email. For more information about the 3.19.y-ckt tree, see https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable Thanks. -Kamal ---8<------------------------------------------------------------ From 108c3cd4dc9487fe5d4c2074a7afc70ad453c575 Mon Sep 17 00:00:00 2001 From: James Bottomley Date: Wed, 10 Feb 2016 08:03:26 -0800 Subject: scsi: fix soft lockup in scsi_remove_target() on module removal commit 90a88d6ef88edcfc4f644dddc7eef4ea41bccf8b upstream. This softlockup is currently happening: [ 444.088002] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:1:29] [ 444.088002] Modules linked in: lpfc(-) qla2x00tgt(O) qla2xxx_scst(O) scst_vdisk(O) scsi_transport_fc libcrc32c scst(O) dlm configfs nfsd lockd grace nfs_acl auth_rpcgss sunrpc ed d snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device dm_mod iTCO_wdt snd_hda_codec_realtek snd_hda_codec_generic gpio_ich iTCO_vendor_support ppdev snd_hda_intel snd_hda_codec snd_hda _core snd_hwdep tg3 snd_pcm snd_timer libphy lpc_ich parport_pc ptp acpi_cpufreq snd pps_core fjes parport i2c_i801 ehci_pci tpm_tis tpm sr_mod cdrom soundcore floppy hwmon sg 8250_ fintek pcspkr i915 drm_kms_helper uhci_hcd ehci_hcd drm fb_sys_fops sysimgblt sysfillrect syscopyarea i2c_algo_bit usbcore button video usb_common fan ata_generic ata_piix libata th ermal [ 444.088002] CPU: 1 PID: 29 Comm: kworker/1:1 Tainted: G O 4.4.0-rc5-2.g1e923a3-default #1 [ 444.088002] Hardware name: FUJITSU SIEMENS ESPRIMO E /D2164-A1, BIOS 5.00 R1.10.2164.A1 05/08/2006 [ 444.088002] Workqueue: fc_wq_4 fc_rport_final_delete [scsi_transport_fc] [ 444.088002] task: f6266ec0 ti: f6268000 task.ti: f6268000 [ 444.088002] EIP: 0060:[] EFLAGS: 00000286 CPU: 1 [ 444.088002] EIP is at _raw_spin_unlock_irqrestore+0x14/0x20 [ 444.088002] EAX: 00000286 EBX: f20d3800 ECX: 00000002 EDX: 00000286 [ 444.088002] ESI: f50ba800 EDI: f2146848 EBP: f6269ec8 ESP: f6269ec8 [ 444.088002] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [ 444.088002] CR0: 8005003b CR2: 08f96600 CR3: 363ae000 CR4: 000006d0 [ 444.088002] Stack: [ 444.088002] f6269eec c066b0f7 00000286 f2146848 f50ba808 f50ba800 f50ba800 f2146a90 [ 444.088002] f2146848 f6269f08 f8f0a4ed f3141000 f2146800 f2146a90 f619fa00 00000040 [ 444.088002] f6269f40 c026cb25 00000001 166c6392 00000061 f6757140 f6136340 00000004 [ 444.088002] Call Trace: [ 444.088002] [] scsi_remove_target+0x167/0x1c0 [ 444.088002] [] fc_rport_final_delete+0x9d/0x1e0 [scsi_transport_fc] [ 444.088002] [] process_one_work+0x155/0x3e0 [ 444.088002] [] worker_thread+0x37/0x490 [ 444.088002] [] kthread+0x9b/0xb0 [ 444.088002] [] ret_from_kernel_thread+0x21/0x40 What appears to be happening is that something has pinned the target so it can't go into STARGET_DEL via final release and the loop in scsi_remove_target spins endlessly until that happens. The fix for this soft lockup is to not keep looping over a device that we've called remove on but which hasn't gone into DEL state. This patch will retain a simplistic memory of the last target and not keep looping over it. Reported-by: Sebastian Herbszt Tested-by: Sebastian Herbszt Fixes: 40998193560dab6c3ce8d25f4fa58a23e252ef38 Signed-off-by: James Bottomley Signed-off-by: Kamal Mostafa --- drivers/scsi/scsi_sysfs.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) -- 2.7.0 diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index e71eb8e..168a509 100644 --- a/drivers/scsi/scsi_sysfs.c +++ b/drivers/scsi/scsi_sysfs.c @@ -1148,16 +1148,18 @@ static void __scsi_remove_target(struct scsi_target *starget) void scsi_remove_target(struct device *dev) { struct Scsi_Host *shost = dev_to_shost(dev->parent); - struct scsi_target *starget; + struct scsi_target *starget, *last_target = NULL; unsigned long flags; restart: spin_lock_irqsave(shost->host_lock, flags); list_for_each_entry(starget, &shost->__targets, siblings) { - if (starget->state == STARGET_DEL) + if (starget->state == STARGET_DEL || + starget == last_target) continue; if (starget->dev.parent == dev || &starget->dev == dev) { kref_get(&starget->reap_ref); + last_target = starget; spin_unlock_irqrestore(shost->host_lock, flags); __scsi_remove_target(starget); scsi_target_reap(starget);