From patchwork Thu Jan 16 14:44:20 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Masami Hiramatsu (Google)"
X-Patchwork-Id: 1224255
Return-Path: 
X-Original-To: incoming-bpf@patchwork.ozlabs.org
Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=none (no SPF record)
 smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67;
 helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none)
 header.from=kernel.org
Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected)
 header.d=kernel.org header.i=@kernel.org header.a=rsa-sha256
 header.s=default header.b=rTnoQY8w; dkim-atps=neutral
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 47z6Qx4dxZz9sRQ
 for ; Fri, 17 Jan 2020 01:44:25 +1100 (AEDT)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1726714AbgAPOoZ (ORCPT ); Thu, 16 Jan 2020 09:44:25 -0500
Received: from mail.kernel.org ([198.145.29.99]:34978 "EHLO mail.kernel.org"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1726189AbgAPOoZ (ORCPT ); Thu, 16 Jan 2020 09:44:25 -0500
Received: from localhost.localdomain
 (NE2965lan1.rev.em-net.ne.jp [210.141.244.193])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by mail.kernel.org (Postfix) with ESMTPSA id 400312075B;
 Thu, 16 Jan 2020 14:44:22 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
 s=default; t=1579185864;
 bh=1ex58SMPt/dCwANUPh78pSMpR2tbb9x2YKco51bZMQk=;
 h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
 b=rTnoQY8wjtY/aiYo/rYUjUWBxfme7CgkCU17FOgpMhJav+b0HeZ7u+phpzVhSrTZv
  vNo2rY7rkvw3Nt/hpXGCRBjNEZl1r5jwh9D8FBr9Po9KnfuGhNSrNF4QcPjAfg2JAK
  w9hFpYwfibecuXhmuD+Vz8qne+SVmVCRW2i4irJQ=
From: Masami Hiramatsu 
To: Brendan Gregg , Steven Rostedt , Alexei Starovoitov 
Cc: mhiramat@kernel.org, Ingo Molnar , bpf@vger.kernel.org,
 linux-kernel@vger.kernel.org, Daniel Borkmann ,
 Arnaldo Carvalho de Melo , "David S . Miller" , paulmck@kernel.org,
 joel@joelfernandes.org, "Naveen N . Rao" , Anil S Keshavamurthy 
Subject: [RFT PATCH 01/13] kprobes: Fix to protect kick_kprobe_optimizer()
 by kprobe_mutex
Date: Thu, 16 Jan 2020 23:44:20 +0900
Message-Id: <157918585992.29301.13166378246753856348.stgit@devnote2>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <157918584866.29301.6941815715391411338.stgit@devnote2>
References: <157918584866.29301.6941815715391411338.stgit@devnote2>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0
Sender: bpf-owner@vger.kernel.org
Precedence: bulk
List-ID: 
X-Mailing-List: bpf@vger.kernel.org

In kprobe_optimizer(), kick_kprobe_optimizer() is called without holding
kprobe_mutex, so it can race with other callers that are protected by
kprobe_mutex. To fix this, extend the kprobe_mutex-protected region to
cover the kick_kprobe_optimizer() call.
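The bug class being fixed (checking shared lists and re-arming work outside the lock that protects them) can be sketched in a few lines of userspace C. All names below are illustrative stand-ins, not the kernel API; the point is that after the fix, the emptiness check and the re-kick happen atomically under the lock:

```c
#include <assert.h>

/* Hypothetical userspace model (not the kernel API): `lock_held`
 * stands in for kprobe_mutex, `pending` for the (un)optimizing
 * lists, and `kicked` for schedule_delayed_work(). */
static int lock_held, pending, kicked;

static void take_lock(void) { assert(!lock_held); lock_held = 1; }
static void drop_lock(void) { assert(lock_held);  lock_held = 0; }

/* After the fix, the kick is always issued under the lock. */
static void kick_optimizer(void) { assert(lock_held); kicked = 1; }

static void optimizer(void)
{
	take_lock();
	if (pending > 2)	/* steps 1-4: process one batch */
		pending -= 2;
	else
		pending = 0;
	/* Step 5 now runs before drop_lock(), so the emptiness check
	 * and the re-kick are atomic with respect to concurrent
	 * register/unregister paths. */
	if (pending > 0)
		kick_optimizer();
	drop_lock();
}
```

The assert in kick_optimizer() encodes the invariant the patch establishes: the kick only happens while the lock is held.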
Signed-off-by: Masami Hiramatsu
---
 kernel/kprobes.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 5a664f995377..52b05ab9c323 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -592,11 +592,12 @@ static void kprobe_optimizer(struct work_struct *work)
 	mutex_unlock(&module_mutex);
 	mutex_unlock(&text_mutex);
 	cpus_read_unlock();
-	mutex_unlock(&kprobe_mutex);
 
 	/* Step 5: Kick optimizer again if needed */
 	if (!list_empty(&optimizing_list) || !list_empty(&unoptimizing_list))
 		kick_kprobe_optimizer();
+
+	mutex_unlock(&kprobe_mutex);
 }
 
 /* Wait for completing optimization and unoptimization */

From patchwork Thu Jan 16 14:44:30 2020
X-Patchwork-Submitter: "Masami Hiramatsu (Google)"
X-Patchwork-Id: 1224256
From: Masami Hiramatsu
Subject: [RFT PATCH 02/13] kprobes: Remove redundant arch_disarm_kprobe() call
Date: Thu, 16 Jan 2020 23:44:30 +0900
Message-Id: <157918586979.29301.15267608912757298568.stgit@devnote2>
In-Reply-To: <157918584866.29301.6941815715391411338.stgit@devnote2>

Remove the redundant arch_disarm_kprobe() call in
force_unoptimize_kprobe(). That call is made when the kprobe is optimized
but disabled, but in that case the kprobe (optprobe) is in the unused
(unoptimizing) state: unoptimize_kprobe() has already put it on
freeing_list, and the kprobe optimizer automatically disarms it. The
arch_disarm_kprobe() call here is therefore redundant.
Signed-off-by: Masami Hiramatsu
---
 kernel/kprobes.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 52b05ab9c323..a2c755e79be7 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -674,8 +674,6 @@ static void force_unoptimize_kprobe(struct optimized_kprobe *op)
 	lockdep_assert_cpus_held();
 	arch_unoptimize_kprobe(op);
 	op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
-	if (kprobe_disabled(&op->kp))
-		arch_disarm_kprobe(&op->kp);
 }
 
 /* Unoptimize a kprobe if p is optimized */

From patchwork Thu Jan 16 14:44:42 2020
X-Patchwork-Submitter: "Masami Hiramatsu (Google)"
X-Patchwork-Id: 1224257
From: Masami Hiramatsu
Subject: [RFT PATCH 03/13] kprobes: Postpone optimizer until a bunch of
 probes (un)registered
Date: Thu, 16 Jan 2020 23:44:42 +0900
Message-Id: <157918588172.29301.12636373067838941611.stgit@devnote2>
In-Reply-To: <157918584866.29301.6941815715391411338.stgit@devnote2>

Add a counter to kick_kprobe_optimizer() so it can detect additional
kprobe registrations/unregistrations arriving during the delay period,
and postpone kprobe_optimizer() until the whole batch of probes has been
queued. This can shorten the long wait when unregistering a large batch
of kprobes.
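The batching idea can be modeled deterministically in userspace C (hypothetical names; the real code gates on kprobe_optimizer_queue_update and reschedules the delayed work): every kick bumps a counter, and the worker postpones the real pass while kicks are still arriving within one delay period.

```c
#include <assert.h>

/* Hypothetical model of the debounce logic this patch adds. */
static int queue_update;	/* models kprobe_optimizer_queue_update */
static int queued;		/* probes waiting for (un)optimization */
static int batches;		/* completed optimization passes */

static void kick(void) { queue_update++; }	/* + schedule_delayed_work() */

static void optimizer(void)	/* one delayed-work invocation */
{
	if (queue_update <= 1) {	/* queue went quiet: run the batch */
		queued = 0;
		batches++;
	}
	queue_update = 0;
	if (queued > 0)			/* work remains: wait one more delay */
		kick();
}
```

Five registrations within one delay window are folded into a single optimization pass on the following invocation, instead of five passes.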
Signed-off-by: Masami Hiramatsu
---
 kernel/kprobes.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index a2c755e79be7..0dacdcecc90f 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -465,6 +465,7 @@ static struct kprobe *get_optimized_kprobe(unsigned long addr)
 static LIST_HEAD(optimizing_list);
 static LIST_HEAD(unoptimizing_list);
 static LIST_HEAD(freeing_list);
+static int kprobe_optimizer_queue_update;
 
 static void kprobe_optimizer(struct work_struct *work);
 static DECLARE_DELAYED_WORK(optimizing_work, kprobe_optimizer);
@@ -555,12 +556,22 @@ static void do_free_cleaned_kprobes(void)
 static void kick_kprobe_optimizer(void)
 {
 	schedule_delayed_work(&optimizing_work, OPTIMIZE_DELAY);
+	kprobe_optimizer_queue_update++;
 }
 
 /* Kprobe jump optimizer */
 static void kprobe_optimizer(struct work_struct *work)
 {
 	mutex_lock(&kprobe_mutex);
+
+	/*
+	 * If new kprobe is queued in optimized/unoptimized list while
+	 * OPTIMIZE_DELAY waiting period, wait again for a series of
+	 * probes registration/unregistrations.
+	 */
+	if (kprobe_optimizer_queue_update > 1)
+		goto end;
+
 	cpus_read_lock();
 	mutex_lock(&text_mutex);
 	/* Lock modules while optimizing kprobes */
@@ -593,6 +604,8 @@ static void kprobe_optimizer(struct work_struct *work)
 	mutex_unlock(&text_mutex);
 	cpus_read_unlock();
 
+end:
+	kprobe_optimizer_queue_update = 0;
 	/* Step 5: Kick optimizer again if needed */
 	if (!list_empty(&optimizing_list) || !list_empty(&unoptimizing_list))
 		kick_kprobe_optimizer();

From patchwork Thu Jan 16 14:44:52 2020
X-Patchwork-Submitter: "Masami Hiramatsu (Google)"
X-Patchwork-Id: 1224258
From: Masami Hiramatsu
Subject: [RFT PATCH 04/13] kprobes: Make optimizer delay to 1 second
Date: Thu, 16 Jan 2020 23:44:52 +0900
Message-Id: <157918589199.29301.4419459150054220408.stgit@devnote2>
In-Reply-To: <157918584866.29301.6941815715391411338.stgit@devnote2>

Since the 5-jiffies delay for the optimizer is too short to collect
other probes, lengthen it to 1 second.
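The arithmetic behind the change, as a small sketch (a userspace stand-in, since the jiffies-to-time conversion depends on CONFIG_HZ):

```c
#include <assert.h>

/* With a tick rate of `hz` (CONFIG_HZ), a delay of `ticks` jiffies
 * lasts ticks * 1000 / hz milliseconds. OPTIMIZE_DELAY == 5 is thus
 * only 5..50 ms depending on the kernel configuration, while
 * OPTIMIZE_DELAY == HZ is one second on every configuration. */
static unsigned int delay_ms(unsigned int ticks, unsigned int hz)
{
	return ticks * 1000u / hz;
}
```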
Signed-off-by: Masami Hiramatsu
---
 kernel/kprobes.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 0dacdcecc90f..9c6e230852ad 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -469,7 +469,8 @@ static int kprobe_optimizer_queue_update;
 static void kprobe_optimizer(struct work_struct *work);
 static DECLARE_DELAYED_WORK(optimizing_work, kprobe_optimizer);
 
-#define OPTIMIZE_DELAY 5
+/* Wait 1 second for starting optimization */
+#define OPTIMIZE_DELAY HZ
 
 /*
  * Optimize (replace a breakpoint with a jump) kprobes listed on

From patchwork Thu Jan 16 14:45:02 2020
X-Patchwork-Submitter: "Masami Hiramatsu (Google)"
X-Patchwork-Id: 1224259
From: Masami Hiramatsu
Subject: [RFT PATCH 05/13] tracing/kprobe: Use call_rcu to defer freeing
 event_file_link
Date: Thu, 16 Jan 2020 23:45:02 +0900
Message-Id: <157918590192.29301.6909688694265698678.stgit@devnote2>
In-Reply-To: <157918584866.29301.6941815715391411338.stgit@devnote2>

Use call_rcu() to defer freeing the event_file_link data structure. This
removes RCU synchronization from the per-probe event disabling path.
Since unregistering a kprobe event requires all handlers to be disabled
and finished, this also introduces a gatekeeper to ensure that: if any
disabled event has not yet finished, the unregister process synchronizes
RCU once (in other words, it may sleep a while).
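A minimal userspace model of the synchronize_rcu() → call_rcu() conversion (hypothetical and single-threaded; the real grace-period machinery is replaced here by an explicit grace_period() call):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of the call_rcu() pattern: the caller queues a
 * callback and returns immediately; the object is freed later, once
 * a grace period has elapsed. */
struct rcu_head {
	void (*func)(struct rcu_head *);
	struct rcu_head *next;
};

static struct rcu_head *cb_list;
static int freed;

static void call_rcu_model(struct rcu_head *h, void (*f)(struct rcu_head *))
{
	h->func = f;		/* non-blocking, unlike synchronize_rcu() */
	h->next = cb_list;
	cb_list = h;
}

static void grace_period(void)	/* stands in for the end of a GP */
{
	while (cb_list) {
		struct rcu_head *h = cb_list;

		cb_list = h->next;
		h->func(h);	/* run the deferred callback */
	}
}

static void free_cb(struct rcu_head *h)
{
	free(h);
	freed++;
}
```

The disabling path thus becomes non-blocking, at the price of making the unregister path responsible for waiting out any still-pending callbacks, which is exactly what the gatekeeper above does.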
Signed-off-by: Masami Hiramatsu
Reported-by: kbuild test robot
---
 kernel/trace/trace_kprobe.c |   35 +++++++++++++++++++++++++++++------
 kernel/trace/trace_probe.c  |   10 ++++++++--
 kernel/trace/trace_probe.h  |    1 +
 3 files changed, 38 insertions(+), 8 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index cbdc4f4e64c7..906af1ffe2b2 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -328,10 +328,25 @@ static inline int __enable_trace_kprobe(struct trace_kprobe *tk)
 	return ret;
 }
 
+atomic_t trace_kprobe_disabled_finished;
+
+static void trace_kprobe_disabled_handlers_finish(void)
+{
+	if (atomic_read(&trace_kprobe_disabled_finished))
+		synchronize_rcu();
+}
+
+static void trace_kprobe_disabled_finish_cb(struct rcu_head *head)
+{
+	atomic_dec(&trace_kprobe_disabled_finished);
+	kfree(head);
+}
+
 static void __disable_trace_kprobe(struct trace_probe *tp)
 {
 	struct trace_probe *pos;
 	struct trace_kprobe *tk;
+	struct rcu_head *head;
 
 	list_for_each_entry(pos, trace_probe_probe_list(tp), list) {
 		tk = container_of(pos, struct trace_kprobe, tp);
@@ -342,6 +357,13 @@ static void __disable_trace_kprobe(struct trace_probe *tp)
 		else
 			disable_kprobe(&tk->rp.kp);
 	}
+
+	/* Handler exit gatekeeper */
+	head = kzalloc(sizeof(*head), GFP_KERNEL);
+	if (WARN_ON(!head))
+		return;
+	atomic_inc(&trace_kprobe_disabled_finished);
+	call_rcu(head, trace_kprobe_disabled_finish_cb);
 }
 
 /*
@@ -422,13 +444,11 @@ static int disable_trace_kprobe(struct trace_event_call *call,
 
  out:
 	if (file)
-		/*
-		 * Synchronization is done in below function. For perf event,
-		 * file == NULL and perf_trace_event_unreg() calls
-		 * tracepoint_synchronize_unregister() to ensure synchronize
-		 * event. We don't need to care about it.
-		 */
 		trace_probe_remove_file(tp, file);
+	/*
+	 * We have no RCU synchronization here. Caller must wait for the
+	 * completion of disabling.
+	 */
 
 	return 0;
 }
@@ -542,6 +562,9 @@ static int unregister_trace_kprobe(struct trace_kprobe *tk)
 	if (trace_probe_is_enabled(&tk->tp))
 		return -EBUSY;
 
+	/* Make sure all disabled trace_kprobe handlers finished */
+	trace_kprobe_disabled_handlers_finish();
+
 	/* Will fail if probe is being used by ftrace or perf */
 	if (unregister_kprobe_event(tk))
 		return -EBUSY;
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 905b10af5d5c..b18df8e1b2d6 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -1067,6 +1067,13 @@ struct event_file_link *trace_probe_get_file_link(struct trace_probe *tp,
 	return NULL;
 }
 
+static void event_file_link_free_cb(struct rcu_head *head)
+{
+	struct event_file_link *link = container_of(head, typeof(*link), rcu);
+
+	kfree(link);
+}
+
 int trace_probe_remove_file(struct trace_probe *tp,
 			    struct trace_event_file *file)
 {
@@ -1077,8 +1084,7 @@ int trace_probe_remove_file(struct trace_probe *tp,
 		return -ENOENT;
 
 	list_del_rcu(&link->list);
-	synchronize_rcu();
-	kfree(link);
+	call_rcu(&link->rcu, event_file_link_free_cb);
 
 	if (list_empty(&tp->event->files))
 		trace_probe_clear_flag(tp, TP_FLAG_TRACE);
diff --git a/kernel/trace/trace_probe.h b/kernel/trace/trace_probe.h
index 4ee703728aec..71ac01a50815 100644
--- a/kernel/trace/trace_probe.h
+++ b/kernel/trace/trace_probe.h
@@ -243,6 +243,7 @@ struct trace_probe {
 struct event_file_link {
 	struct trace_event_file *file;
 	struct list_head list;
+	struct rcu_head rcu;
 };
 
 static inline bool trace_probe_test_flag(struct trace_probe *tp,

From patchwork Thu Jan 16 14:45:12 2020
X-Patchwork-Submitter: "Masami Hiramatsu (Google)"
X-Patchwork-Id: 1224260
From: Masami Hiramatsu
Subject: [RFT PATCH 06/13] kprobes: Enable kprobe-booster with
 CONFIG_PREEMPT=y
Date: Thu, 16 Jan 2020 23:45:12 +0900
Message-Id: <157918591239.29301.2563999389420824545.stgit@devnote2>
In-Reply-To: <157918584866.29301.6941815715391411338.stgit@devnote2>

As we did in commit a30b85df7d59 ("kprobes: Use synchronize_rcu_tasks()
for optprobe with CONFIG_PREEMPT=y"), we can also enable the
kprobe-booster, which depends on a trampoline execution buffer in the
same way the optprobe does. Before releasing the trampoline buffer
(kprobe_insn_page), the garbage collector waits for all potentially
preempted tasks on the trampoline buffer using synchronize_rcu_tasks()
instead of synchronize_rcu(). This requires CONFIG_TASKS_RCU=y as well,
so this also introduces HAVE_KPROBES_BOOSTER for the architectures that
support the kprobe-booster (currently only x86 and ia64). If both
CONFIG_PREEMPTION and HAVE_KPROBES_BOOSTER are y, CONFIG_KPROBES selects
CONFIG_TASKS_RCU=y.

Signed-off-by: Masami Hiramatsu
---
 arch/Kconfig                   |    4 ++++
 arch/ia64/Kconfig              |    1 +
 arch/ia64/kernel/kprobes.c     |    3 +--
 arch/x86/Kconfig               |    1 +
 arch/x86/kernel/kprobes/core.c |    2 --
 kernel/kprobes.c               |    4 ++--
 6 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 48b5e103bdb0..ead87084c8bf 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -64,6 +64,7 @@ config KPROBES
 	depends on MODULES
 	depends on HAVE_KPROBES
 	select KALLSYMS
+	select TASKS_RCU if PREEMPTION && HAVE_KPROBES_BOOSTER
 	help
 	  Kprobes allows you to trap at almost any kernel address and
 	  execute a callback function.
 	  register_kprobe() establishes
@@ -189,6 +190,9 @@ config HAVE_KPROBES
 config HAVE_KRETPROBES
 	bool
 
+config HAVE_KPROBES_BOOSTER
+	bool
+
 config HAVE_OPTPROBES
 	bool
 
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index bab7cd878464..341f9ca8a745 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -25,6 +25,7 @@ config IA64
 	select HAVE_IDE
 	select HAVE_OPROFILE
 	select HAVE_KPROBES
+	select HAVE_KPROBES_BOOSTER
 	select HAVE_KRETPROBES
 	select HAVE_FTRACE_MCOUNT_RECORD
 	select HAVE_DYNAMIC_FTRACE if (!ITANIUM)
diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c
index a6d6a0556f08..1680a10c9f49 100644
--- a/arch/ia64/kernel/kprobes.c
+++ b/arch/ia64/kernel/kprobes.c
@@ -841,7 +841,6 @@ static int __kprobes pre_kprobes_handler(struct die_args *args)
 		return 1;
 	}
 
-#if !defined(CONFIG_PREEMPTION)
 	if (p->ainsn.inst_flag == INST_FLAG_BOOSTABLE && !p->post_handler) {
 		/* Boost up -- we can execute copied instructions directly */
 		ia64_psr(regs)->ri = p->ainsn.slot;
@@ -853,7 +852,7 @@ static int __kprobes pre_kprobes_handler(struct die_args *args)
 		preempt_enable_no_resched();
 		return 1;
 	}
-#endif
+
 	prepare_ss(p, regs);
 	kcb->kprobe_status = KPROBE_HIT_SS;
 	return 1;
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e5800e52a59a..d509578d824b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -181,6 +181,7 @@ config X86
 	select HAVE_KERNEL_LZO
 	select HAVE_KERNEL_XZ
 	select HAVE_KPROBES
+	select HAVE_KPROBES_BOOSTER
 	select HAVE_KPROBES_ON_FTRACE
 	select HAVE_FUNCTION_ERROR_INJECTION
 	select HAVE_KRETPROBES
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 4d7022a740ab..7aba45037885 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -587,7 +587,6 @@ static void setup_singlestep(struct kprobe *p, struct pt_regs *regs,
 	if (setup_detour_execution(p, regs, reenter))
 		return;
 
-#if !defined(CONFIG_PREEMPTION)
 	if (p->ainsn.boostable && !p->post_handler) {
 		/* Boost up -- we can execute copied instructions directly */
 		if (!reenter)
@@ -600,7 +599,6 @@ static void setup_singlestep(struct kprobe *p, struct pt_regs *regs,
 		regs->ip = (unsigned long)p->ainsn.insn;
 		return;
 	}
-#endif
 
 	if (reenter) {
 		save_previous_kprobe(kcb);
 		set_current_kprobe(p, regs, kcb);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 9c6e230852ad..848c14e92ccc 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -220,8 +220,8 @@ static int collect_garbage_slots(struct kprobe_insn_cache *c)
 {
 	struct kprobe_insn_page *kip, *next;
 
-	/* Ensure no-one is interrupted on the garbages */
-	synchronize_rcu();
+	/* Ensure no-one is running on the garbages. */
+	synchronize_rcu_tasks();
 	list_for_each_entry_safe(kip, next, &c->pages, list) {
 		int i;

From patchwork Thu Jan 16 14:45:23 2020
X-Patchwork-Submitter: "Masami Hiramatsu (Google)"
X-Patchwork-Id: 1224261
From: Masami Hiramatsu
Subject: [RFT PATCH 07/13] kprobes: Use normal list traversal API if a mutex
 is held
Date: Thu, 16 Jan 2020 23:45:23 +0900
Message-Id: <157918592332.29301.1564446199611592837.stgit@devnote2>
In-Reply-To: <157918584866.29301.6941815715391411338.stgit@devnote2>

Use the normal list traversal API instead of the rcu_read_lock()/RCU
list traversal/rcu_read_unlock() pattern when the mutex protecting the
list is held, as it is in the kprobe_insn_cache methods.
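The rule being applied can be sketched as follows (hypothetical types, not the kernel list API): a reader that already holds the mutex writers take needs no RCU read-side critical section, because the list cannot change underneath it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kprobe_insn_page list entry. */
struct page_node {
	int nused;		/* models kip->nused */
	struct page_node *next;
};

/* Caller must hold the mutex protecting the list (as every
 * kprobe_insn_cache method holds c->mutex), so a plain traversal
 * is sufficient and RCU protection would be redundant. */
static struct page_node *find_free(struct page_node *head, int slots)
{
	for (struct page_node *p = head; p; p = p->next)
		if (p->nused < slots)
			return p;
	return NULL;
}
```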
Signed-off-by: Masami Hiramatsu --- kernel/kprobes.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 848c14e92ccc..09b0e33bc845 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -141,8 +141,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c) /* Since the slot array is not protected by rcu, we need a mutex */ mutex_lock(&c->mutex); retry: - rcu_read_lock(); - list_for_each_entry_rcu(kip, &c->pages, list) { + list_for_each_entry(kip, &c->pages, list) { if (kip->nused < slots_per_page(c)) { int i; for (i = 0; i < slots_per_page(c); i++) { @@ -150,7 +149,6 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c) kip->slot_used[i] = SLOT_USED; kip->nused++; slot = kip->insns + (i * c->insn_size); - rcu_read_unlock(); goto out; } } @@ -159,7 +157,6 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c) WARN_ON(1); } } - rcu_read_unlock(); /* If there are any garbage slots, collect it and try again. 
 */ if (c->nr_garbage && collect_garbage_slots(c) == 0) @@ -244,8 +241,7 @@ void __free_insn_slot(struct kprobe_insn_cache *c, long idx; mutex_lock(&c->mutex); - rcu_read_lock(); - list_for_each_entry_rcu(kip, &c->pages, list) { + list_for_each_entry(kip, &c->pages, list) { idx = ((long)slot - (long)kip->insns) / (c->insn_size * sizeof(kprobe_opcode_t)); if (idx >= 0 && idx < slots_per_page(c)) @@ -255,7 +251,6 @@ void __free_insn_slot(struct kprobe_insn_cache *c, WARN_ON(1); kip = NULL; out: - rcu_read_unlock(); /* Mark and sweep: this may sleep */ if (kip) { /* Check double free */

From patchwork Thu Jan 16 14:45:33 2020
X-Patchwork-Submitter: "Masami Hiramatsu (Google)"
X-Patchwork-Id: 1224264
From: Masami Hiramatsu
To: Brendan Gregg, Steven Rostedt, Alexei Starovoitov
Subject: [RFT PATCH 08/13] kprobes: Use workqueue for reclaiming kprobe insn cache pages
Date: Thu, 16 Jan 2020 23:45:33 +0900
Message-Id: <157918593350.29301.7175144493909010321.stgit@devnote2>
In-Reply-To: <157918584866.29301.6941815715391411338.stgit@devnote2>

Use a workqueue for reclaiming kprobe insn cache pages. This splits the heaviest part out of the unregistration process.
Signed-off-by: Masami Hiramatsu --- include/linux/kprobes.h | 2 ++ kernel/kprobes.c | 29 ++++++++++++++++++----------- 2 files changed, 20 insertions(+), 11 deletions(-) diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h index 04bdaf01112c..0f832817fca3 100644 --- a/include/linux/kprobes.h +++ b/include/linux/kprobes.h @@ -245,6 +245,7 @@ struct kprobe_insn_cache { struct list_head pages; /* list of kprobe_insn_page */ size_t insn_size; /* size of instruction slot */ int nr_garbage; + struct work_struct work; }; #ifdef __ARCH_WANT_KPROBES_INSN_SLOT @@ -254,6 +255,7 @@ extern void __free_insn_slot(struct kprobe_insn_cache *c, /* sleep-less address checking routine */ extern bool __is_insn_slot_addr(struct kprobe_insn_cache *c, unsigned long addr); +void kprobe_insn_cache_gc(struct work_struct *work); #define DEFINE_INSN_CACHE_OPS(__name) \ extern struct kprobe_insn_cache kprobe_##__name##_slots; \ diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 09b0e33bc845..a9114923da4c 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -126,8 +126,15 @@ struct kprobe_insn_cache kprobe_insn_slots = { .pages = LIST_HEAD_INIT(kprobe_insn_slots.pages), .insn_size = MAX_INSN_SIZE, .nr_garbage = 0, + .work = __WORK_INITIALIZER(kprobe_insn_slots.work, + kprobe_insn_cache_gc), }; -static int collect_garbage_slots(struct kprobe_insn_cache *c); + +static void kick_kprobe_insn_cache_gc(struct kprobe_insn_cache *c) +{ + if (!work_pending(&c->work)) + schedule_work(&c->work); +} /** * __get_insn_slot() - Find a slot on an executable page for an instruction. @@ -140,7 +147,6 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c) /* Since the slot array is not protected by rcu, we need a mutex */ mutex_lock(&c->mutex); - retry: list_for_each_entry(kip, &c->pages, list) { if (kip->nused < slots_per_page(c)) { int i; @@ -158,11 +164,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c) } } - /* If there are any garbage slots, collect it and try again. 
*/ - if (c->nr_garbage && collect_garbage_slots(c) == 0) - goto retry; - - /* All out of space. Need to allocate a new page. */ + /* All out of space. Need to allocate a new page. */ kip = kmalloc(KPROBE_INSN_PAGE_SIZE(slots_per_page(c)), GFP_KERNEL); if (!kip) goto out; @@ -213,10 +215,12 @@ static int collect_one_slot(struct kprobe_insn_page *kip, int idx) return 0; } -static int collect_garbage_slots(struct kprobe_insn_cache *c) +void kprobe_insn_cache_gc(struct work_struct *work) { + struct kprobe_insn_cache *c = container_of(work, typeof(*c), work); struct kprobe_insn_page *kip, *next; + mutex_lock(&c->mutex); /* Ensure no-one is running on the garbages. */ synchronize_rcu_tasks(); @@ -226,12 +230,13 @@ static int collect_garbage_slots(struct kprobe_insn_cache *c) continue; kip->ngarbage = 0; /* we will collect all garbages */ for (i = 0; i < slots_per_page(c); i++) { - if (kip->slot_used[i] == SLOT_DIRTY && collect_one_slot(kip, i)) + if (kip->slot_used[i] == SLOT_DIRTY && + collect_one_slot(kip, i)) break; } } c->nr_garbage = 0; - return 0; + mutex_unlock(&c->mutex); } void __free_insn_slot(struct kprobe_insn_cache *c, @@ -259,7 +264,7 @@ void __free_insn_slot(struct kprobe_insn_cache *c, kip->slot_used[idx] = SLOT_DIRTY; kip->ngarbage++; if (++c->nr_garbage > slots_per_page(c)) - collect_garbage_slots(c); + kick_kprobe_insn_cache_gc(c); } else { collect_one_slot(kip, idx); } @@ -299,6 +304,8 @@ struct kprobe_insn_cache kprobe_optinsn_slots = { .pages = LIST_HEAD_INIT(kprobe_optinsn_slots.pages), /* .insn_size is initialized later */ .nr_garbage = 0, + .work = __WORK_INITIALIZER(kprobe_optinsn_slots.work, + kprobe_insn_cache_gc), }; #endif #endif From patchwork Thu Jan 16 14:45:46 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Masami Hiramatsu (Google)" X-Patchwork-Id: 1224263 Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: 
patchwork-incoming-bpf@bilbo.ozlabs.org
From: Masami Hiramatsu
To: Brendan Gregg, Steven Rostedt, Alexei Starovoitov
Cc: mhiramat@kernel.org, Ingo Molnar, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, Daniel Borkmann, Arnaldo Carvalho de Melo, "David S . Miller", paulmck@kernel.org, joel@joelfernandes.org, "Naveen N .
Rao" , Anil S Keshavamurthy Subject: [RFT PATCH 09/13] kprobes: Free kprobe_insn_page asynchronously Date: Thu, 16 Jan 2020 23:45:46 +0900 Message-Id: <157918594575.29301.16307406359272775745.stgit@devnote2> X-Mailer: git-send-email 2.20.1 In-Reply-To: <157918584866.29301.6941815715391411338.stgit@devnote2> References: <157918584866.29301.6941815715391411338.stgit@devnote2> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Free the kprobe_insn_page data structure asynchronously using call_rcu(). Signed-off-by: Masami Hiramatsu --- kernel/kprobes.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/kernel/kprobes.c b/kernel/kprobes.c index a9114923da4c..60ffc9d54d87 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -87,6 +87,7 @@ static LIST_HEAD(kprobe_blacklist); */ struct kprobe_insn_page { struct list_head list; + struct rcu_head rcu; kprobe_opcode_t *insns; /* Page of instruction slots */ struct kprobe_insn_cache *cache; int nused; @@ -192,6 +193,13 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c) return slot; } +static void free_kprobe_insn_page(struct rcu_head *head) +{ + struct kprobe_insn_page *kip = container_of(head, typeof(*kip), rcu); + + kfree(kip); +} + /* Return 1 if all garbages are collected, otherwise 0. 
 */ static int collect_one_slot(struct kprobe_insn_page *kip, int idx) { @@ -206,9 +214,8 @@ static int collect_one_slot(struct kprobe_insn_page *kip, int idx) */ if (!list_is_singular(&kip->list)) { list_del_rcu(&kip->list); - synchronize_rcu(); kip->cache->free(kip->insns); - kfree(kip); + call_rcu(&kip->rcu, free_kprobe_insn_page); } return 1; }

From patchwork Thu Jan 16 14:45:56 2020
X-Patchwork-Submitter: "Masami Hiramatsu (Google)"
X-Patchwork-Id: 1224266
From: Masami Hiramatsu
To: Brendan Gregg, Steven Rostedt, Alexei Starovoitov
Subject: [RFT PATCH 10/13] kprobes: Make free_*insn_slot() mutex-less
Date: Thu, 16 Jan 2020 23:45:56 +0900
Message-Id: <157918595628.29301.5657205433519510960.stgit@devnote2>
In-Reply-To: <157918584866.29301.6941815715391411338.stgit@devnote2>

Rewrite the kprobe_insn_cache implementation so that free_*insn_slot() does not acquire kprobe_insn_cache->mutex. This allows it to be called from a call_rcu() callback function. For this purpose, introduce a flip-flop dirty generation in kprobe_insn_cache. When __free_insn_slot() is called on a dirty slot (one that may still be in use until an RCU grace period has passed), the slot is marked "current-generation" dirty. The garbage collector (kprobe_insn_cache_gc()) flips the generation bit, waits for a long enough safe period via synchronize_rcu_tasks(), and then collects the "previous-generation" dirty slots. As a result, it collects the dirty slots that were returned by __free_insn_slot() before the GC started waiting; dirty slots returned during that waiting period are marked as new-generation dirty and left for the next pass. Since the GC never runs concurrently with itself, we do not need more than 2 generations.
So it flips the generation bit instead of counting it up. Signed-off-by: Masami Hiramatsu --- arch/powerpc/kernel/optprobes.c | 1 include/linux/kprobes.h | 2 kernel/kprobes.c | 172 ++++++++++++++++++++++----------------- 3 files changed, 96 insertions(+), 79 deletions(-) diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c index 024f7aad1952..8304f3814515 100644 --- a/arch/powerpc/kernel/optprobes.c +++ b/arch/powerpc/kernel/optprobes.c @@ -53,7 +53,6 @@ struct kprobe_insn_cache kprobe_ppc_optinsn_slots = { /* insn_size initialized later */ .alloc = __ppc_alloc_insn_page, .free = __ppc_free_insn_page, - .nr_garbage = 0, }; /* diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h index 0f832817fca3..1cd53b7b8409 100644 --- a/include/linux/kprobes.h +++ b/include/linux/kprobes.h @@ -244,7 +244,7 @@ struct kprobe_insn_cache { void (*free)(void *); /* free insn page */ struct list_head pages; /* list of kprobe_insn_page */ size_t insn_size; /* size of instruction slot */ - int nr_garbage; + int generation; /* dirty generation */ struct work_struct work; }; diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 60ffc9d54d87..5c12eb7fa8e1 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -90,8 +90,7 @@ struct kprobe_insn_page { struct rcu_head rcu; kprobe_opcode_t *insns; /* Page of instruction slots */ struct kprobe_insn_cache *cache; - int nused; - int ngarbage; + atomic_t nused; char slot_used[]; }; @@ -106,8 +105,9 @@ static int slots_per_page(struct kprobe_insn_cache *c) enum kprobe_slot_state { SLOT_CLEAN = 0, - SLOT_DIRTY = 1, - SLOT_USED = 2, + SLOT_USED = 1, + SLOT_DIRTY0 = 2, + SLOT_DIRTY1 = 3, }; void __weak *alloc_insn_page(void) @@ -126,7 +126,6 @@ struct kprobe_insn_cache kprobe_insn_slots = { .free = free_insn_page, .pages = LIST_HEAD_INIT(kprobe_insn_slots.pages), .insn_size = MAX_INSN_SIZE, - .nr_garbage = 0, .work = __WORK_INITIALIZER(kprobe_insn_slots.work, kprobe_insn_cache_gc), }; @@ -137,6 +136,22 @@ static 
void kick_kprobe_insn_cache_gc(struct kprobe_insn_cache *c) schedule_work(&c->work); } +static void *try_to_get_clean_slot(struct kprobe_insn_page *kip) +{ + struct kprobe_insn_cache *c = kip->cache; + int i; + + for (i = 0; i < slots_per_page(c); i++) { + if (kip->slot_used[i] == SLOT_CLEAN) { + kip->slot_used[i] = SLOT_USED; + atomic_inc(&kip->nused); + return kip->insns + (i * c->insn_size); + } + } + + return NULL; +} + /** * __get_insn_slot() - Find a slot on an executable page for an instruction. * We allocate an executable page if there's no room on existing ones. @@ -145,25 +160,20 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c) { struct kprobe_insn_page *kip; kprobe_opcode_t *slot = NULL; + bool reclaimable = false; /* Since the slot array is not protected by rcu, we need a mutex */ mutex_lock(&c->mutex); list_for_each_entry(kip, &c->pages, list) { - if (kip->nused < slots_per_page(c)) { - int i; - for (i = 0; i < slots_per_page(c); i++) { - if (kip->slot_used[i] == SLOT_CLEAN) { - kip->slot_used[i] = SLOT_USED; - kip->nused++; - slot = kip->insns + (i * c->insn_size); - goto out; - } - } - /* kip->nused is broken. Fix it. */ - kip->nused = slots_per_page(c); - WARN_ON(1); + if (atomic_read(&kip->nused) < slots_per_page(c)) { + slot = try_to_get_clean_slot(kip); + if (slot) + goto out; + reclaimable = true; } } + if (reclaimable) + kick_kprobe_insn_cache_gc(c); /* All out of space. Need to allocate a new page. 
*/ kip = kmalloc(KPROBE_INSN_PAGE_SIZE(slots_per_page(c)), GFP_KERNEL); @@ -183,8 +193,7 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c) INIT_LIST_HEAD(&kip->list); memset(kip->slot_used, SLOT_CLEAN, slots_per_page(c)); kip->slot_used[0] = SLOT_USED; - kip->nused = 1; - kip->ngarbage = 0; + atomic_set(&kip->nused, 1); kip->cache = c; list_add_rcu(&kip->list, &c->pages); slot = kip->insns; @@ -193,90 +202,106 @@ kprobe_opcode_t *__get_insn_slot(struct kprobe_insn_cache *c) return slot; } -static void free_kprobe_insn_page(struct rcu_head *head) +static void free_kprobe_insn_page_cb(struct rcu_head *head) { struct kprobe_insn_page *kip = container_of(head, typeof(*kip), rcu); kfree(kip); } -/* Return 1 if all garbages are collected, otherwise 0. */ -static int collect_one_slot(struct kprobe_insn_page *kip, int idx) +static void free_kprobe_insn_page(struct kprobe_insn_page *kip) { - kip->slot_used[idx] = SLOT_CLEAN; - kip->nused--; - if (kip->nused == 0) { - /* - * Page is no longer in use. Free it unless - * it's the last one. We keep the last one - * so as not to have to set it up again the - * next time somebody inserts a probe. - */ - if (!list_is_singular(&kip->list)) { - list_del_rcu(&kip->list); - kip->cache->free(kip->insns); - call_rcu(&kip->rcu, free_kprobe_insn_page); - } - return 1; + if (WARN_ON_ONCE(atomic_read(&kip->nused) != 0)) + return; + /* + * Page is no longer in use. Free it unless + * it's the last one. We keep the last one + * so as not to have to set it up again the + * next time somebody inserts a probe. + */ + if (!list_is_singular(&kip->list)) { + list_del_rcu(&kip->list); + kip->cache->free(kip->insns); + call_rcu(&kip->rcu, free_kprobe_insn_page_cb); } - return 0; } void kprobe_insn_cache_gc(struct work_struct *work) { struct kprobe_insn_cache *c = container_of(work, typeof(*c), work); struct kprobe_insn_page *kip, *next; + int dirtygen = c->generation ? 
SLOT_DIRTY1 : SLOT_DIRTY0; + int i, nr; mutex_lock(&c->mutex); + + c->generation ^= 1; /* flip generation (0->1, 1->0) */ + + /* Make sure the generation update is shown in __free_insn_slot() */ + smp_wmb(); + /* Ensure no-one is running on the garbages. */ synchronize_rcu_tasks(); list_for_each_entry_safe(kip, next, &c->pages, list) { - int i; - if (kip->ngarbage == 0) - continue; - kip->ngarbage = 0; /* we will collect all garbages */ + nr = 0; + /* Reclaim previous generation dirty slots */ for (i = 0; i < slots_per_page(c); i++) { - if (kip->slot_used[i] == SLOT_DIRTY && - collect_one_slot(kip, i)) - break; + if (kip->slot_used[i] == dirtygen) + kip->slot_used[i] = SLOT_CLEAN; + else if (kip->slot_used[i] != SLOT_CLEAN) + nr++; } + if (!nr) + free_kprobe_insn_page(kip); } - c->nr_garbage = 0; mutex_unlock(&c->mutex); } +static struct kprobe_insn_page * +find_kprobe_insn_page(struct kprobe_insn_cache *c, unsigned long addr) +{ + struct kprobe_insn_page *kip; + + list_for_each_entry_rcu(kip, &c->pages, list) { + if (addr >= (unsigned long)kip->insns && + addr < (unsigned long)kip->insns + PAGE_SIZE) + return kip; + } + return NULL; +} + void __free_insn_slot(struct kprobe_insn_cache *c, kprobe_opcode_t *slot, int dirty) { struct kprobe_insn_page *kip; + int dirtygen; long idx; - mutex_lock(&c->mutex); - list_for_each_entry(kip, &c->pages, list) { + rcu_read_lock(); + kip = find_kprobe_insn_page(c, (unsigned long)slot); + if (kip) { idx = ((long)slot - (long)kip->insns) / (c->insn_size * sizeof(kprobe_opcode_t)); - if (idx >= 0 && idx < slots_per_page(c)) + /* Check double free */ + if (WARN_ON(kip->slot_used[idx] != SLOT_USED)) goto out; - } - /* Could not find this slot. */ - WARN_ON(1); - kip = NULL; + + /* Make sure to use new generation */ + smp_rmb(); + + dirtygen = c->generation ? 
SLOT_DIRTY1 : SLOT_DIRTY0; + if (dirty) + kip->slot_used[idx] = dirtygen; + else + kip->slot_used[idx] = SLOT_CLEAN; + + if (!atomic_dec_return(&kip->nused)) + kick_kprobe_insn_cache_gc(c); + } else + WARN_ON(1); /* Not found: what happen? */ out: - /* Mark and sweep: this may sleep */ - if (kip) { - /* Check double free */ - WARN_ON(kip->slot_used[idx] != SLOT_USED); - if (dirty) { - kip->slot_used[idx] = SLOT_DIRTY; - kip->ngarbage++; - if (++c->nr_garbage > slots_per_page(c)) - kick_kprobe_insn_cache_gc(c); - } else { - collect_one_slot(kip, idx); - } - } - mutex_unlock(&c->mutex); + rcu_read_unlock(); } /* @@ -286,17 +311,11 @@ void __free_insn_slot(struct kprobe_insn_cache *c, */ bool __is_insn_slot_addr(struct kprobe_insn_cache *c, unsigned long addr) { - struct kprobe_insn_page *kip; bool ret = false; rcu_read_lock(); - list_for_each_entry_rcu(kip, &c->pages, list) { - if (addr >= (unsigned long)kip->insns && - addr < (unsigned long)kip->insns + PAGE_SIZE) { - ret = true; - break; - } - } + if (find_kprobe_insn_page(c, addr)) + ret = true; rcu_read_unlock(); return ret; @@ -310,7 +329,6 @@ struct kprobe_insn_cache kprobe_optinsn_slots = { .free = free_insn_page, .pages = LIST_HEAD_INIT(kprobe_optinsn_slots.pages), /* .insn_size is initialized later */ - .nr_garbage = 0, .work = __WORK_INITIALIZER(kprobe_optinsn_slots.work, kprobe_insn_cache_gc), }; From patchwork Thu Jan 16 14:46:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Masami Hiramatsu (Google)" X-Patchwork-Id: 1224267 Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=kernel.org 
From: Masami Hiramatsu
To: Brendan Gregg, Steven Rostedt, Alexei Starovoitov
Cc: mhiramat@kernel.org, Ingo Molnar, bpf@vger.kernel.org, linux-kernel@vger.kernel.org, Daniel Borkmann, Arnaldo Carvalho de Melo, "David S . Miller", paulmck@kernel.org, joel@joelfernandes.org, "Naveen N .
Rao" , Anil S Keshavamurthy
Subject: [RFT PATCH 11/13] kprobes: Add asynchronous unregistration APIs
Date: Thu, 16 Jan 2020 23:46:07 +0900
Message-Id: <157918596704.29301.4085897993817952679.stgit@devnote2>
In-Reply-To: <157918584866.29301.6941815715391411338.stgit@devnote2>

Add asynchronous unregistration APIs for kprobes and kretprobes. These APIs can speed up the unregistration of multiple probes because users do not need to wait for the RCU sync. However, callers must take care of the following:

- If you want to synchronize unregistration (for example, to make sure all handlers have finished running), you have to call synchronize_rcu() once at the end.
- If you need to free objects related to the kprobes, you can pass a callback, but that callback must call kprobe_free_callback() or kretprobe_free_callback() first.

Since it is easy to shoot yourself in the foot, I don't export these APIs to modules at this moment.

Signed-off-by: Masami Hiramatsu --- include/linux/kprobes.h | 9 ++++++++ kernel/kprobes.c | 56 +++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 65 insertions(+) diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h index 1cd53b7b8409..f892c3a11dac 100644 --- a/include/linux/kprobes.h +++ b/include/linux/kprobes.h @@ -98,6 +98,9 @@ struct kprobe { * Protected by kprobe_mutex after this kprobe is registered.
*/ u32 flags; + + /* For asynchronous unregistration callback */ + struct rcu_head rcu; }; /* Kprobe status flags */ @@ -364,6 +367,12 @@ void unregister_kretprobe(struct kretprobe *rp); int register_kretprobes(struct kretprobe **rps, int num); void unregister_kretprobes(struct kretprobe **rps, int num); +/* Async unregister APIs (Do not wait for rcu sync) */ +void kprobe_free_callback(struct rcu_head *head); +void kretprobe_free_callback(struct rcu_head *head); +void unregister_kprobe_async(struct kprobe *kp, rcu_callback_t free_cb); +void unregister_kretprobe_async(struct kretprobe *kp, rcu_callback_t free_cb); + void kprobe_flush_task(struct task_struct *tk); void recycle_rp_inst(struct kretprobe_instance *ri, struct hlist_head *head); diff --git a/kernel/kprobes.c b/kernel/kprobes.c index 5c12eb7fa8e1..ab57c22b64f9 100644 --- a/kernel/kprobes.c +++ b/kernel/kprobes.c @@ -1887,6 +1887,31 @@ void unregister_kprobes(struct kprobe **kps, int num) } EXPORT_SYMBOL_GPL(unregister_kprobes); +void kprobe_free_callback(struct rcu_head *head) +{ + struct kprobe *kp = container_of(head, struct kprobe, rcu); + + __unregister_kprobe_bottom(kp); +} + +/* + * If you call this function, you must call kprobe_free_callback() at first + * in your free_cb(), or set free_cb = NULL. 
+ */ +void unregister_kprobe_async(struct kprobe *kp, rcu_callback_t free_cb) +{ + mutex_lock(&kprobe_mutex); + if (__unregister_kprobe_top(kp) < 0) + kp->addr = NULL; + mutex_unlock(&kprobe_mutex); + + if (!kp->addr) + return; + if (!free_cb) + free_cb = kprobe_free_callback; + call_rcu(&kp->rcu, free_cb); +} + int __weak kprobe_exceptions_notify(struct notifier_block *self, unsigned long val, void *data) { @@ -2080,6 +2105,29 @@ void unregister_kretprobes(struct kretprobe **rps, int num) } EXPORT_SYMBOL_GPL(unregister_kretprobes); +void kretprobe_free_callback(struct rcu_head *head) +{ + struct kprobe *kp = container_of(head, struct kprobe, rcu); + struct kretprobe *rp = container_of(kp, struct kretprobe, kp); + + __unregister_kprobe_bottom(kp); + cleanup_rp_inst(rp); +} + +void unregister_kretprobe_async(struct kretprobe *rp, rcu_callback_t free_cb) +{ + mutex_lock(&kprobe_mutex); + if (__unregister_kprobe_top(&rp->kp) < 0) + rp->kp.addr = NULL; + mutex_unlock(&kprobe_mutex); + + if (!rp->kp.addr) + return; + if (!free_cb) + free_cb = kretprobe_free_callback; + call_rcu(&rp->kp.rcu, free_cb); +} + #else /* CONFIG_KRETPROBES */ int register_kretprobe(struct kretprobe *rp) { @@ -2109,6 +2157,14 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs) } NOKPROBE_SYMBOL(pre_handler_kretprobe); +void kretprobe_free_callback(struct rcu_head *head) +{ +} + +void unregister_kretprobe_async(struct kretprobe *rp, rcu_callback_t free_cb) +{ +} + #endif /* CONFIG_KRETPROBES */ /* Set the kprobe gone and remove its instruction buffer. 
*/ From patchwork Thu Jan 16 14:46:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Masami Hiramatsu (Google)" X-Patchwork-Id: 1224268 Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=kernel.org Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=kernel.org header.i=@kernel.org header.a=rsa-sha256 header.s=default header.b=qCopYxEe; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 47z6TF0Hhrz9s4Y for ; Fri, 17 Jan 2020 01:46:25 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726527AbgAPOqY (ORCPT ); Thu, 16 Jan 2020 09:46:24 -0500 Received: from mail.kernel.org ([198.145.29.99]:36976 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726362AbgAPOqY (ORCPT ); Thu, 16 Jan 2020 09:46:24 -0500 Received: from localhost.localdomain (NE2965lan1.rev.em-net.ne.jp [210.141.244.193]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 1F47820684; Thu, 16 Jan 2020 14:46:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1579185982; bh=VutGjxWwdtEPcCBYrGyCDjwJ75btz6UR1pL3UbFs08c=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=qCopYxEeBllyD83t5ggx9lLCMEqXOioXxFbSCQYI7g2LtcvQJ+qALffU9m3YOzIU3 9iV9mxRqVVBkBKJ3PY2MpI/Q1qpOXdsfillIc1IfIksCIQfudpXlGrnArkVNTJ0ceG RLw8DQq7KHGJV/3hnu7IvzrmRfI+Vw7Wqo6YhXBs= From: Masami Hiramatsu To: Brendan Gregg , Steven 
Subject: [RFT PATCH 12/13] tracing/kprobe: Free probe event asynchronously
Date: Thu, 16 Jan 2020 23:46:17 +0900
Message-Id: <157918597739.29301.3329193112465223174.stgit@devnote2>
In-Reply-To: <157918584866.29301.6941815715391411338.stgit@devnote2>
References: <157918584866.29301.6941815715391411338.stgit@devnote2>

Free each probe event data structure asynchronously when deleting
probe events. However, this still synchronizes RCU at the end so that
we can make sure that all running event handlers have finished.

Signed-off-by: Masami Hiramatsu
---
 kernel/trace/trace_dynevent.c |    5 ++++
 kernel/trace/trace_kprobe.c   |   46 +++++++++++++++++++++++++++++++++++------
 2 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/kernel/trace/trace_dynevent.c b/kernel/trace/trace_dynevent.c
index 89779eb84a07..2d5e8d457309 100644
--- a/kernel/trace/trace_dynevent.c
+++ b/kernel/trace/trace_dynevent.c
@@ -70,6 +70,9 @@ int dyn_event_release(int argc, char **argv, struct dyn_event_operations *type)
 		if (ret)
 			break;
 	}
+
+	/* Wait for running events because of async event unregistration */
+	synchronize_rcu();
 	mutex_unlock(&event_mutex);
 
 	return ret;
@@ -164,6 +167,8 @@ int dyn_events_release_all(struct dyn_event_operations *type)
 		if (ret)
 			break;
 	}
+	/* Wait for running events because of async event unregistration */
+	synchronize_rcu();
 out:
 	mutex_unlock(&event_mutex);
 
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 906af1ffe2b2..f7e0370b10ae 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -551,7 +551,35 @@ static void __unregister_trace_kprobe(struct trace_kprobe *tk)
 	}
 }
 
-/* Unregister a trace_probe and probe_event */
+static void free_trace_kprobe_cb(struct rcu_head *head)
+{
+	struct kprobe *kp = container_of(head, struct kprobe, rcu);
+	struct kretprobe *rp = container_of(kp, struct kretprobe, kp);
+	struct trace_kprobe *tk = container_of(rp, struct trace_kprobe, rp);
+
+	if (trace_kprobe_is_return(tk))
+		kretprobe_free_callback(head);
+	else
+		kprobe_free_callback(head);
+	free_trace_kprobe(tk);
+}
+
+static void __unregister_trace_kprobe_async(struct trace_kprobe *tk)
+{
+	if (trace_kprobe_is_registered(tk)) {
+		if (trace_kprobe_is_return(tk))
+			unregister_kretprobe_async(&tk->rp,
+						   free_trace_kprobe_cb);
+		else
+			unregister_kprobe_async(&tk->rp.kp,
+						free_trace_kprobe_cb);
+	}
+}
+
+/*
+ * Unregister a trace_probe and probe_event asynchronously.
+ * Caller must wait for RCU.
+ */
 static int unregister_trace_kprobe(struct trace_kprobe *tk)
 {
 	/* If other probes are on the event, just unregister kprobe */
@@ -570,9 +598,17 @@ static int unregister_trace_kprobe(struct trace_kprobe *tk)
 		return -EBUSY;
 
 unreg:
-	__unregister_trace_kprobe(tk);
 	dyn_event_remove(&tk->devent);
+	/*
+	 * This trace_probe_unlink() can free the trace_event_call linked to
+	 * this probe.
+	 * We can do this before unregistering because this probe is
+	 * already disabled and the disabling process waits enough period
+	 * for all handlers finished. IOW, the disabling process must wait
+	 * RCU sync at least once before returning to its caller.
+	 */
 	trace_probe_unlink(&tk->tp);
+	__unregister_trace_kprobe_async(tk);
 
 	return 0;
 }
 
@@ -928,11 +964,7 @@ static int create_or_delete_trace_kprobe(int argc, char **argv)
 
 static int trace_kprobe_release(struct dyn_event *ev)
 {
 	struct trace_kprobe *tk = to_trace_kprobe(ev);
-	int ret = unregister_trace_kprobe(tk);
-
-	if (!ret)
-		free_trace_kprobe(tk);
-	return ret;
+	return unregister_trace_kprobe(tk);
 }
 
 static int trace_kprobe_show(struct seq_file *m, struct dyn_event *ev)

From patchwork Thu Jan 16 14:46:28 2020
X-Patchwork-Submitter: "Masami Hiramatsu (Google)"
X-Patchwork-Id: 1224269
From: Masami Hiramatsu
Subject: [RFT PATCH 13/13] tracing/kprobe: perf_event: Remove local kprobe event asynchronously
Date: Thu, 16 Jan 2020 23:46:28 +0900
Message-Id: <157918598813.29301.14393624193409447045.stgit@devnote2>
In-Reply-To: <157918584866.29301.6941815715391411338.stgit@devnote2>
References: <157918584866.29301.6941815715391411338.stgit@devnote2>

Remove the local kprobe event asynchronously. Note that this removes
the kprobe_event part asynchronously, but perf_event still needs to
wait until all handlers have finished before removing the local kprobe
event. So from the perf_event (and eBPF) point of view, this shortens
the trace termination process a bit, but it still takes O(n) time to
finish. To fix this, we need to change the perf_event termination
process by decoupling "disable events" from "destroy events", as
ftrace does.
Signed-off-by: Masami Hiramatsu
---
 kernel/trace/trace_kprobe.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index f7e0370b10ae..e8c4828c21ae 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1707,9 +1707,7 @@ void destroy_local_trace_kprobe(struct trace_event_call *event_call)
 		return;
 	}
 
-	__unregister_trace_kprobe(tk);
-
-	free_trace_kprobe(tk);
+	__unregister_trace_kprobe_async(tk);
 }
 #endif /* CONFIG_PERF_EVENTS */