
[RFT,04/13] kprobes: Make optimizer delay to 1 second

Message ID 157918589199.29301.4419459150054220408.stgit@devnote2
State RFC
Series tracing: kprobes: Introduce async unregistration

Commit Message

Masami Hiramatsu (Google) Jan. 16, 2020, 2:44 p.m. UTC
Since the 5 jiffies delay for the optimizer is too
short to wait for other probes, make it longer,
like 1 second.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 kernel/kprobes.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Steven Rostedt Jan. 22, 2020, 12:29 a.m. UTC | #1
On Thu, 16 Jan 2020 23:44:52 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> Since the 5 jiffies delay for the optimizer is too
> short to wait for other probes, make it longer,
> like 1 second.

Hi Masami,

Can you explain more *why* 5 jiffies is too short.

Thanks!

-- Steve

> 
> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> ---
>  kernel/kprobes.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 0dacdcecc90f..9c6e230852ad 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -469,7 +469,8 @@ static int kprobe_optimizer_queue_update;
>  
>  static void kprobe_optimizer(struct work_struct *work);
>  static DECLARE_DELAYED_WORK(optimizing_work, kprobe_optimizer);
> -#define OPTIMIZE_DELAY 5
> +/* Wait 1 second for starting optimization */
> +#define OPTIMIZE_DELAY HZ
>  
>  /*
>   * Optimize (replace a breakpoint with a jump) kprobes listed on
Masami Hiramatsu (Google) Jan. 22, 2020, 7:23 a.m. UTC | #2
On Tue, 21 Jan 2020 19:29:05 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Thu, 16 Jan 2020 23:44:52 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
> 
> > Since the 5 jiffies delay for the optimizer is too
> > short to wait for other probes, make it longer,
> > like 1 second.
> 
> Hi Masami,
> 
> Can you explain more *why* 5 jiffies is too short.

Yes, I had introduced this 5 jiffies delay for multiple probe registration
and unregistration like systemtap, which will use array-based interface to
register/unregister. In that case, 5 jiffies will be enough for the delay
to wait for other kprobe registration/unregistration.

However, since perf and ftrace register/unregister probes one-by-one with
RCU synchronization interval, the optimizer will be started before
finishing to register/unregister all probes.
And the optimizer locks kprobe_mutex a while -- RCU-tasks synchronization.
Since the kprobe_mutex is also involved in disabling kprobes, this also
stops probe-event disabling.

Maybe 5 jiffies is enough for adding/removing a few probe events, but
not enough for dozens of probe events.
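
For illustration, here is a rough sketch of the two usage patterns (purely
illustrative: the array and NR_PROBES are made up, only unregister_kprobe()
and unregister_kprobes() are the real APIs):

#include <linux/kprobes.h>

static struct kprobe *kps[NR_PROBES];	/* NR_PROBES is hypothetical */

/*
 * Array-based unregistration (systemtap style): kprobes can batch the
 * work, so one short optimizer delay covers all of the probes.
 */
static void remove_batch(void)
{
	unregister_kprobes(kps, NR_PROBES);
}

/*
 * One-by-one unregistration (what perf/ftrace effectively do): each call
 * may wait for an RCU grace period, so a 5-jiffies optimizer delay
 * expires between iterations and the optimizer runs per probe.
 */
static void remove_one_by_one(void)
{
	int i;

	for (i = 0; i < NR_PROBES; i++)
		unregister_kprobe(kps[i]);
}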

Thank you,

> 
> Thanks!
> 
> -- Steve
> 
> > 
> > Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> > ---
> >  kernel/kprobes.c |    3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> > index 0dacdcecc90f..9c6e230852ad 100644
> > --- a/kernel/kprobes.c
> > +++ b/kernel/kprobes.c
> > @@ -469,7 +469,8 @@ static int kprobe_optimizer_queue_update;
> >  
> >  static void kprobe_optimizer(struct work_struct *work);
> >  static DECLARE_DELAYED_WORK(optimizing_work, kprobe_optimizer);
> > -#define OPTIMIZE_DELAY 5
> > +/* Wait 1 second for starting optimization */
> > +#define OPTIMIZE_DELAY HZ
> >  
> >  /*
> >   * Optimize (replace a breakpoint with a jump) kprobes listed on
>
Steven Rostedt Jan. 22, 2020, 12:11 p.m. UTC | #3
On Wed, 22 Jan 2020 16:23:17 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> On Tue, 21 Jan 2020 19:29:05 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Thu, 16 Jan 2020 23:44:52 +0900
> > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >   
> > > Since the 5 jiffies delay for the optimizer is too
> > > short to wait for other probes, make it longer,
> > > like 1 second.  
> > 
> > Hi Masami,
> > 
> > Can you explain more *why* 5 jiffies is too short.  
> 
> Yes, I had introduced this 5 jiffies delay for multiple probe registration
> and unregistration like systemtap, which will use array-based interface to
> register/unregister. In that case, 5 jiffies will be enough for the delay
> to wait for other kprobe registration/unregistration.
> 
> However, since perf and ftrace register/unregister probes one-by-one with
> RCU synchronization interval, the optimizer will be started before
> finishing to register/unregister all probes.
> And the optimizer locks kprobe_mutex a while -- RCU-tasks synchronization.
> Since the kprobe_mutex is also involved in disabling kprobes, this also
> stops probe-event disabling.
> 
> Maybe 5 jiffies is enough for adding/removing a few probe events, but
> not enough for dozens of probe events.
> 

Perhaps we should have a mechanism that can detect new probes being
added, and just continue to delay the optimization, instead of having
some arbitrary delay.

-- Steve
Masami Hiramatsu (Google) Jan. 22, 2020, 1:12 p.m. UTC | #4
On Wed, 22 Jan 2020 07:11:15 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 22 Jan 2020 16:23:17 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
> 
> > On Tue, 21 Jan 2020 19:29:05 -0500
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > On Thu, 16 Jan 2020 23:44:52 +0900
> > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >   
> > > > Since the 5 jiffies delay for the optimizer is too
> > > > short to wait for other probes, make it longer,
> > > > like 1 second.  
> > > 
> > > Hi Masami,
> > > 
> > > Can you explain more *why* 5 jiffies is too short.  
> > 
> > Yes, I had introduced this 5 jiffies delay for multiple probe registration
> > and unregistration like systemtap, which will use array-based interface to
> > register/unregister. In that case, 5 jiffies will be enough for the delay
> > to wait for other kprobe registration/unregistration.
> > 
> > However, since perf and ftrace register/unregister probes one-by-one with
> > RCU synchronization interval, the optimizer will be started before
> > finishing to register/unregister all probes.
> > And the optimizer locks kprobe_mutex a while -- RCU-tasks synchronization.
> > Since the kprobe_mutex is also involved in disabling kprobes, this also
> > stops probe-event disabling.
> > 
> > Maybe 5 jiffies is enough for adding/removing a few probe events, but
> > not enough for dozens of probe events.
> > 
> 
> Perhaps we should have a mechanism that can detect new probes being
> added, and just continue to delay the optimization, instead of having
> some arbitrary delay.

Yes, that is what [03/13] does :) 
Anyway, it seems that the RCU-synchronization takes more than 5 jiffies.
And in that case, [03/13] still doesn't work. That's why I added this patch
after that.

Thank you,
Paul E. McKenney Jan. 22, 2020, 4:54 p.m. UTC | #5
On Wed, Jan 22, 2020 at 10:12:40PM +0900, Masami Hiramatsu wrote:
> On Wed, 22 Jan 2020 07:11:15 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Wed, 22 Jan 2020 16:23:17 +0900
> > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > 
> > > On Tue, 21 Jan 2020 19:29:05 -0500
> > > Steven Rostedt <rostedt@goodmis.org> wrote:
> > > 
> > > > On Thu, 16 Jan 2020 23:44:52 +0900
> > > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > >   
> > > > > Since the 5 jiffies delay for the optimizer is too
> > > > > short to wait for other probes, make it longer,
> > > > > like 1 second.  
> > > > 
> > > > Hi Masami,
> > > > 
> > > > Can you explain more *why* 5 jiffies is too short.  
> > > 
> > > Yes, I had introduced this 5 jiffies delay for multiple probe registration
> > > and unregistration like systemtap, which will use array-based interface to
> > > register/unregister. In that case, 5 jiffies will be enough for the delay
> > > to wait for other kprobe registration/unregistration.
> > > 
> > > However, since perf and ftrace register/unregister probes one-by-one with
> > > RCU synchronization interval, the optimizer will be started before
> > > finishing to register/unregister all probes.
> > > And the optimizer locks kprobe_mutex a while -- RCU-tasks synchronization.
> > > Since the kprobe_mutex is also involved in disabling kprobes, this also
> > > stops probe-event disabling.
> > > 
> > > Maybe 5 jiffies is enough for adding/removing a few probe events, but
> > > not enough for dozens of probe events.
> > > 
> > 
> > Perhaps we should have a mechanism that can detect new probes being
> > added, and just continue to delay the optimization, instead of having
> > some arbitrary delay.
> 
> Yes, that is what [03/13] does :) 
> Anyway, it seems that the RCU-synchronization takes more than 5 jiffies.
> And in that case, [03/13] still doesn't work. That's why I added this patch
> after that.

If the RCU synchronization is synchronize_rcu_tasks(), then yes, it
will often take way more than 5 jiffies.  If it is synchronize_rcu(),
5 jiffies would not be unusual, especially on larger systems.
But in the case of synchronize_rcu(), one option is to instead use
synchronize_rcu_expedited().  It is not clear that this last is really
justified in this case, but figured it might be worth mentioning.
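
(A hypothetical sketch of that experiment -- not the actual call site in
the probe-disable path, just the substitution being suggested; the wrapper
name is made up:)

#include <linux/rcupdate.h>

/* Illustrative only: synchronize_rcu_expedited() completes much more
 * quickly than a normal grace period, at the cost of IPIing the other
 * CPUs, which disturbs the rest of the system.
 */
static void wait_for_kprobe_readers(bool expedited)
{
	if (expedited)
		synchronize_rcu_expedited();
	else
		synchronize_rcu();
}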

							Thanx, Paul
Masami Hiramatsu (Google) Jan. 23, 2020, 1:33 a.m. UTC | #6
On Wed, 22 Jan 2020 08:54:32 -0800
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> On Wed, Jan 22, 2020 at 10:12:40PM +0900, Masami Hiramatsu wrote:
> > On Wed, 22 Jan 2020 07:11:15 -0500
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > On Wed, 22 Jan 2020 16:23:17 +0900
> > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > 
> > > > On Tue, 21 Jan 2020 19:29:05 -0500
> > > > Steven Rostedt <rostedt@goodmis.org> wrote:
> > > > 
> > > > > On Thu, 16 Jan 2020 23:44:52 +0900
> > > > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > > >   
> > > > > > Since the 5 jiffies delay for the optimizer is too
> > > > > > short to wait for other probes, make it longer,
> > > > > > like 1 second.  
> > > > > 
> > > > > Hi Masami,
> > > > > 
> > > > > Can you explain more *why* 5 jiffies is too short.  
> > > > 
> > > > Yes, I had introduced this 5 jiffies delay for multiple probe registration
> > > > and unregistration like systemtap, which will use array-based interface to
> > > > register/unregister. In that case, 5 jiffies will be enough for the delay
> > > > to wait for other kprobe registration/unregistration.
> > > > 
> > > > However, since perf and ftrace register/unregister probes one-by-one with
> > > > RCU synchronization interval, the optimizer will be started before
> > > > finishing to register/unregister all probes.
> > > > And the optimizer locks kprobe_mutex a while -- RCU-tasks synchronization.
> > > > Since the kprobe_mutex is also involved in disabling kprobes, this also
> > > > stops probe-event disabling.
> > > > 
> > > > Maybe 5 jiffies is enough for adding/removing a few probe events, but
> > > > not enough for dozens of probe events.
> > > > 
> > > 
> > > Perhaps we should have a mechanism that can detect new probes being
> > > added, and just continue to delay the optimization, instead of having
> > > some arbitrary delay.
> > 
> > Yes, that is what [03/13] does :) 
> > Anyway, it seems that the RCU-synchronization takes more than 5 jiffies.
> > And in that case, [03/13] still doesn't work. That's why I added this patch
> > after that.
> 
> If the RCU synchronization is synchronize_rcu_tasks(), then yes, it
> will often take way more than 5 jiffies.  If it is synchronize_rcu(),
> 5 jiffies would not be unusual, especially on larger systems.
> But in the case of synchronize_rcu(), one option is to instead use
> synchronize_rcu_expedited().  It is not clear that this last is really
> justified in this case, but figured it might be worth mentioning.

It is synchronize_rcu(), but it seems it is sometimes called several
times while disabling one probe.

Anyway, without this update, I added a printk to count the optimizer
queue length and found that every optimizer call had only a single kprobe
on the queue. I think this depends on the machine, but as far as
I tested on an 8-thread qemu x86 guest, it shows this result...

Probes: 256 kprobe_events
Enable events
real	0m 0.02s
user	0m 0.00s
sys	0m 0.02s
[   17.730548] Queue-update: 180, skipped, Total Queued: 180
[   17.739445] Queue-update: 1, go, Total Queued: 180
Disable events
[   19.744634] Queue-update: 1, go, Total Queued: 1
[   19.770743] Queue-update: 1, go, Total Queued: 1
[   19.886625] Queue-update: 1, go, Total Queued: 1
[   20.006631] Queue-update: 1, go, Total Queued: 1
[   20.126628] Queue-update: 1, go, Total Queued: 1
[   20.246620] Queue-update: 1, go, Total Queued: 1
[   20.374665] Queue-update: 1, go, Total Queued: 1
[   20.486617] Queue-update: 1, go, Total Queued: 1
[   20.606608] Queue-update: 1, go, Total Queued: 1
[   20.726596] Queue-update: 1, go, Total Queued: 1
[   20.846608] Queue-update: 1, go, Total Queued: 1
[   20.966723] Queue-update: 1, go, Total Queued: 1
[   21.086611] Queue-update: 1, go, Total Queued: 1
[   21.206605] Queue-update: 1, go, Total Queued: 1
[   21.326603] Queue-update: 1, go, Total Queued: 1
[   21.462609] Queue-update: 1, go, Total Queued: 1
[   21.566755] Queue-update: 1, go, Total Queued: 1
[   21.686606] Queue-update: 1, go, Total Queued: 1
[   21.806611] Queue-update: 1, go, Total Queued: 1
[   21.926609] Queue-update: 1, go, Total Queued: 1
[   22.046621] Queue-update: 1, go, Total Queued: 1
[   22.166729] Queue-update: 1, go, Total Queued: 1
[   22.302609] Queue-update: 1, go, Total Queued: 1
[   22.510627] Queue-update: 1, go, Total Queued: 1
[   22.536638] Queue-update: 1, go, Total Queued: 1
[   22.654618] Queue-update: 1, go, Total Queued: 1
[   22.774643] Queue-update: 1, go, Total Queued: 1
[   22.902609] Queue-update: 1, go, Total Queued: 1
[   23.022608] Queue-update: 1, go, Total Queued: 1
[   23.158606] Queue-update: 1, go, Total Queued: 1
[   23.254618] Queue-update: 1, go, Total Queued: 1
[   23.374647] Queue-update: 1, go, Total Queued: 1
[   23.494605] Queue-update: 1, go, Total Queued: 1
[   23.614610] Queue-update: 1, go, Total Queued: 1
[   23.734606] Queue-update: 1, go, Total Queued: 1
[   23.854609] Queue-update: 1, go, Total Queued: 1
[   23.974615] Queue-update: 1, go, Total Queued: 1
[   24.094609] Queue-update: 1, go, Total Queued: 1
[   24.230607] Queue-update: 1, go, Total Queued: 1
[   24.342625] Queue-update: 1, go, Total Queued: 1
[   24.475478] Queue-update: 1, go, Total Queued: 1
[   24.574554] Queue-update: 1, go, Total Queued: 1
[   24.694605] Queue-update: 1, go, Total Queued: 1
[   24.814585] Queue-update: 1, go, Total Queued: 1
[   24.934591] Queue-update: 1, go, Total Queued: 1
[   25.054614] Queue-update: 1, go, Total Queued: 1
[   25.174628] Queue-update: 1, go, Total Queued: 1
[   25.294637] Queue-update: 1, go, Total Queued: 1
[   25.414620] Queue-update: 1, go, Total Queued: 1
[   25.534553] Queue-update: 1, go, Total Queued: 1
[   25.654610] Queue-update: 1, go, Total Queued: 1
[   25.774627] Queue-update: 1, go, Total Queued: 1
[   25.894609] Queue-update: 1, go, Total Queued: 1
[   26.030548] Queue-update: 1, go, Total Queued: 1
[   26.134626] Queue-update: 1, go, Total Queued: 1
[   26.254620] Queue-update: 1, go, Total Queued: 1
[   26.373603] Queue-update: 1, go, Total Queued: 1
[   26.502606] Queue-update: 1, go, Total Queued: 1
[   26.614607] Queue-update: 1, go, Total Queued: 1
[   26.734610] Queue-update: 1, go, Total Queued: 1
[   26.854620] Queue-update: 1, go, Total Queued: 1
[   26.974660] Queue-update: 1, go, Total Queued: 1
[   27.094620] Queue-update: 1, go, Total Queued: 1
[   27.214596] Queue-update: 1, go, Total Queued: 1
[   27.334640] Queue-update: 1, go, Total Queued: 1
[   27.494606] Queue-update: 1, go, Total Queued: 1
[   27.574703] Queue-update: 1, go, Total Queued: 1
[   27.694609] Queue-update: 1, go, Total Queued: 1
[   27.814607] Queue-update: 1, go, Total Queued: 1
[   27.934631] Queue-update: 1, go, Total Queued: 1
[   28.054606] Queue-update: 1, go, Total Queued: 1
[   28.182685] Queue-update: 1, go, Total Queued: 1
[   28.318568] Queue-update: 1, go, Total Queued: 1
[   28.422610] Queue-update: 1, go, Total Queued: 1
[   28.534621] Queue-update: 1, go, Total Queued: 1
[   28.654618] Queue-update: 1, go, Total Queued: 1
[   28.774622] Queue-update: 1, go, Total Queued: 1
[   28.894609] Queue-update: 1, go, Total Queued: 1
[   29.022609] Queue-update: 1, go, Total Queued: 1
[   29.150615] Queue-update: 1, go, Total Queued: 1
[   29.262610] Queue-update: 1, go, Total Queued: 1
[   29.374621] Queue-update: 1, go, Total Queued: 1
[   29.494606] Queue-update: 1, go, Total Queued: 1
[   29.614616] Queue-update: 1, go, Total Queued: 1
[   29.734607] Queue-update: 1, go, Total Queued: 1
[   29.854601] Queue-update: 1, go, Total Queued: 1
[   29.974610] Queue-update: 1, go, Total Queued: 1
[   30.094625] Queue-update: 1, go, Total Queued: 1
[   30.214606] Queue-update: 1, go, Total Queued: 1
[   30.334602] Queue-update: 1, go, Total Queued: 1
[   30.454634] Queue-update: 1, go, Total Queued: 1
[   30.574606] Queue-update: 1, go, Total Queued: 1
[   30.694589] Queue-update: 1, go, Total Queued: 1
[   30.814613] Queue-update: 1, go, Total Queued: 1
[   30.934602] Queue-update: 1, go, Total Queued: 1
[   31.054605] Queue-update: 1, go, Total Queued: 1
[   31.182596] Queue-update: 1, go, Total Queued: 1
[   31.318621] Queue-update: 1, go, Total Queued: 1
[   31.414615] Queue-update: 1, go, Total Queued: 1
[   31.534610] Queue-update: 1, go, Total Queued: 1
[   31.670608] Queue-update: 1, go, Total Queued: 1
[   31.774626] Queue-update: 1, go, Total Queued: 1
[   31.894607] Queue-update: 1, go, Total Queued: 1
[   32.014609] Queue-update: 1, go, Total Queued: 1
[   32.134607] Queue-update: 1, go, Total Queued: 1
[   32.254611] Queue-update: 1, go, Total Queued: 1
[   32.374608] Queue-update: 1, go, Total Queued: 1
[   32.494619] Queue-update: 1, go, Total Queued: 1
[   32.614607] Queue-update: 1, go, Total Queued: 1
[   32.734612] Queue-update: 1, go, Total Queued: 1
[   32.862616] Queue-update: 1, go, Total Queued: 1
[   32.974620] Queue-update: 1, go, Total Queued: 1
[   33.110609] Queue-update: 1, go, Total Queued: 1
[   33.214609] Queue-update: 1, go, Total Queued: 1
[   33.342611] Queue-update: 1, go, Total Queued: 1
[   33.454607] Queue-update: 1, go, Total Queued: 1
[   33.574611] Queue-update: 1, go, Total Queued: 1
[   33.694619] Queue-update: 1, go, Total Queued: 1
[   33.814607] Queue-update: 1, go, Total Queued: 1
[   33.950614] Queue-update: 1, go, Total Queued: 1
[   34.062609] Queue-update: 1, go, Total Queued: 1
[   34.174609] Queue-update: 1, go, Total Queued: 1
[   34.294619] Queue-update: 1, go, Total Queued: 1
[   34.430533] Queue-update: 1, go, Total Queued: 1
[   34.534594] Queue-update: 1, go, Total Queued: 1
[   34.654605] Queue-update: 1, go, Total Queued: 1
[   34.790596] Queue-update: 1, go, Total Queued: 1
[   34.902611] Queue-update: 1, go, Total Queued: 1
[   35.014630] Queue-update: 1, go, Total Queued: 1
[   35.134634] Queue-update: 1, go, Total Queued: 1
[   35.262608] Queue-update: 1, go, Total Queued: 1
[   35.374634] Queue-update: 1, go, Total Queued: 1
[   35.494617] Queue-update: 1, go, Total Queued: 1
[   35.622608] Queue-update: 1, go, Total Queued: 1
[   35.742610] Queue-update: 1, go, Total Queued: 1
[   35.854572] Queue-update: 1, go, Total Queued: 1
[   35.982596] Queue-update: 1, go, Total Queued: 1
[   36.094603] Queue-update: 1, go, Total Queued: 1
[   36.222612] Queue-update: 1, go, Total Queued: 1
[   36.334610] Queue-update: 1, go, Total Queued: 1
[   36.454619] Queue-update: 1, go, Total Queued: 1
[   36.574619] Queue-update: 1, go, Total Queued: 1
[   36.694643] Queue-update: 1, go, Total Queued: 1
[   36.814614] Queue-update: 1, go, Total Queued: 1
[   36.934610] Queue-update: 1, go, Total Queued: 1
[   37.062609] Queue-update: 1, go, Total Queued: 1
[   37.198611] Queue-update: 1, go, Total Queued: 1
[   37.294618] Queue-update: 1, go, Total Queued: 1
[   37.414618] Queue-update: 1, go, Total Queued: 1
[   37.534595] Queue-update: 1, go, Total Queued: 1
[   37.662594] Queue-update: 1, go, Total Queued: 1
[   37.774610] Queue-update: 1, go, Total Queued: 1
[   37.894618] Queue-update: 1, go, Total Queued: 1
[   38.014619] Queue-update: 1, go, Total Queued: 1
[   38.142612] Queue-update: 1, go, Total Queued: 1
[   38.254609] Queue-update: 1, go, Total Queued: 1
[   38.374619] Queue-update: 1, go, Total Queued: 1
[   38.502481] Queue-update: 1, go, Total Queued: 1
[   38.614596] Queue-update: 1, go, Total Queued: 1
[   38.734597] Queue-update: 1, go, Total Queued: 1
[   38.854606] Queue-update: 1, go, Total Queued: 1
[   38.974620] Queue-update: 1, go, Total Queued: 1
[   39.094617] Queue-update: 1, go, Total Queued: 1
[   39.222597] Queue-update: 1, go, Total Queued: 1
[   39.334610] Queue-update: 1, go, Total Queued: 1
[   39.454609] Queue-update: 1, go, Total Queued: 1
[   39.574633] Queue-update: 1, go, Total Queued: 1
[   39.694611] Queue-update: 1, go, Total Queued: 1
[   39.814608] Queue-update: 1, go, Total Queued: 1
[   39.934610] Queue-update: 1, go, Total Queued: 1
[   40.054621] Queue-update: 1, go, Total Queued: 1
[   40.174611] Queue-update: 1, go, Total Queued: 1
[   40.297471] Queue-update: 1, go, Total Queued: 1
[   40.414504] Queue-update: 1, go, Total Queued: 1
[   40.534601] Queue-update: 1, go, Total Queued: 1
[   40.654611] Queue-update: 1, go, Total Queued: 1
[   40.774609] Queue-update: 1, go, Total Queued: 1
[   40.894628] Queue-update: 1, go, Total Queued: 1
[   41.014608] Queue-update: 1, go, Total Queued: 1
[   41.135594] Queue-update: 1, go, Total Queued: 1
real	0m 21.40s
user	0m 0.00s
sys	0m 0.04s
Remove events
real	0m 2.14s
user	0m 0.00s
sys	0m 0.04s
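
(For reference, the Queue-update lines above came from a throwaway debug
printk along these lines -- a sketch only, not part of this series; the
helper name and arguments are made up:)

/* Called from the optimizer path with the number of kprobes queued since
 * the last run and whether this invocation actually runs ("go") or is
 * postponed ("skipped").
 */
static void report_queue_update(int nr_new, bool postponed, int total_queued)
{
	pr_info("Queue-update: %d, %s, Total Queued: %d\n",
		nr_new, postponed ? "skipped" : "go", total_queued);
}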


Thank you,
Paul E. McKenney Jan. 23, 2020, 2:26 a.m. UTC | #7
On Thu, Jan 23, 2020 at 10:33:34AM +0900, Masami Hiramatsu wrote:
> On Wed, 22 Jan 2020 08:54:32 -0800
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > On Wed, Jan 22, 2020 at 10:12:40PM +0900, Masami Hiramatsu wrote:
> > > On Wed, 22 Jan 2020 07:11:15 -0500
> > > Steven Rostedt <rostedt@goodmis.org> wrote:
> > > 
> > > > On Wed, 22 Jan 2020 16:23:17 +0900
> > > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > > 
> > > > > On Tue, 21 Jan 2020 19:29:05 -0500
> > > > > Steven Rostedt <rostedt@goodmis.org> wrote:
> > > > > 
> > > > > > On Thu, 16 Jan 2020 23:44:52 +0900
> > > > > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > > > >   
> > > > > > > Since the 5 jiffies delay for the optimizer is too
> > > > > > > short to wait for other probes, make it longer,
> > > > > > > like 1 second.  
> > > > > > 
> > > > > > Hi Masami,
> > > > > > 
> > > > > > Can you explain more *why* 5 jiffies is too short.  
> > > > > 
> > > > > Yes, I had introduced this 5 jiffies delay for multiple probe registration
> > > > > and unregistration like systemtap, which will use array-based interface to
> > > > > register/unregister. In that case, 5 jiffies will be enough for the delay
> > > > > to wait for other kprobe registration/unregistration.
> > > > > 
> > > > > However, since perf and ftrace register/unregister probes one-by-one with
> > > > > RCU synchronization interval, the optimizer will be started before
> > > > > finishing to register/unregister all probes.
> > > > > And the optimizer locks kprobe_mutex a while -- RCU-tasks synchronization.
> > > > > Since the kprobe_mutex is also involved in disabling kprobes, this also
> > > > > stops probe-event disabling.
> > > > > 
> > > > > Maybe 5 jiffies is enough for adding/removing a few probe events, but
> > > > > not enough for dozens of probe events.
> > > > > 
> > > > 
> > > > Perhaps we should have a mechanism that can detect new probes being
> > > > added, and just continue to delay the optimization, instead of having
> > > > some arbitrary delay.
> > > 
> > > Yes, that is what [03/13] does :) 
> > > Anyway, it seems that the RCU-synchronization takes more than 5 jiffies.
> > > And in that case, [03/13] still doesn't work. That's why I added this patch
> > > after that.
> > 
> > If the RCU synchronization is synchronize_rcu_tasks(), then yes, it
> > will often take way more than 5 jiffies.  If it is synchronize_rcu(),
> > 5 jiffies would not be unusual, especially on larger systems.
> > But in the case of synchronize_rcu(), one option is to instead use
> > synchronize_rcu_expedited().  It is not clear that this last is really
> > justified in this case, but figured it might be worth mentioning.
> 
> It is synchronize_rcu(), but it seems it is sometimes called several
> times while disabling one probe.

OK, good to know.

> Anyway, without this update, I added a printk to count the optimizer
> queue length and found that every optimizer call had only a single kprobe
> on the queue. I think this depends on the machine, but as far as
> I tested on an 8-thread qemu x86 guest, it shows this result...
> 
> Probes: 256 kprobe_events
> Enable events
> real	0m 0.02s
> user	0m 0.00s
> sys	0m 0.02s
> [   17.730548] Queue-update: 180, skipped, Total Queued: 180
> [   17.739445] Queue-update: 1, go, Total Queued: 180
> Disable events
[...]
> [   41.135594] Queue-update: 1, go, Total Queued: 1
> real	0m 21.40s
> user	0m 0.00s
> sys	0m 0.04s

So 21.4s/256 = 84 milliseconds per event disable, correct?

It might be worth trying synchronize_rcu_expedited() just as an experiment
to see if it speeds things up significantly.

							Thanx, Paul

> Remove events
> real	0m 2.14s
> user	0m 0.00s
> sys	0m 0.04s
> 
> 
> Thank you,
> 
> 
> -- 
> Masami Hiramatsu <mhiramat@kernel.org>
Masami Hiramatsu (Google) Jan. 23, 2020, 6:13 a.m. UTC | #8
On Wed, 22 Jan 2020 18:26:47 -0800
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> > Anyway, without this update, I added a printk to count the optimizer
> > queue length and found that every optimizer call had only a single kprobe
> > on the queue. I think this depends on the machine, but as far as
> > I tested on an 8-thread qemu x86 guest, it shows this result...
> > 
> > Probes: 256 kprobe_events
> > Enable events
> > real	0m 0.02s
> > user	0m 0.00s
> > sys	0m 0.02s
> > [   17.730548] Queue-update: 180, skipped, Total Queued: 180
> > [   17.739445] Queue-update: 1, go, Total Queued: 180
> > Disable events
[...]
> > [   41.135594] Queue-update: 1, go, Total Queued: 1
> > real	0m 21.40s
> > user	0m 0.00s
> > sys	0m 0.04s
> 
> So 21.4s/256 = 84 milliseconds per event disable, correct?

Actually, it seems only 172 probes are on the unoptimized list, so
the number will be a bit different.

Anyway, the elapsed time above includes the non-batch optimizer
work, as outlined below.

(1) start looping on the probe events
  (2) disable the kprobe
    (2.1) wait for kprobe_mutex if the optimizer is running
    (2.2) if the kprobe is an optimized kprobe, queue it on the unoptimizing
          list and kick the optimizer with a 5-jiffies delay
  (3) unlink the enabled event
  (4) wait for synchronize_rcu()
  (5) the optimizer starts optimization before (4) finishes
(6) goto (1)

I think the disabling performance issue comes from (5) (and (2.1)).
Thus, if we change the delay in (2.2) to HZ jiffies (1 second), the optimizer
will start only after some loops are done (and the optimizer detects "active"
queuing and postpones the process).
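
A rough sketch of that flow (illustrative only -- the event struct and the
unlink/lookup helpers are made up, only disable_kprobe() and
synchronize_rcu() are real kernel APIs):

#include <linux/kprobes.h>
#include <linux/rcupdate.h>

struct probe_event;					/* hypothetical */
void unlink_enabled_event(struct probe_event *ev);	/* hypothetical */
struct kprobe *event_kprobe(struct probe_event *ev);	/* hypothetical */

static void disable_all_probe_events(struct probe_event **events, int nr)
{
	int i;

	for (i = 0; i < nr; i++) {			/* (1) */
		/* (2) may block on kprobe_mutex held by the optimizer;
		 * (2.2) queues the probe for unoptimization and kicks the
		 * optimizer with the OPTIMIZE_DELAY delay.
		 */
		disable_kprobe(event_kprobe(events[i]));
		unlink_enabled_event(events[i]);	/* (3) */
		synchronize_rcu();			/* (4) */
		/* (5) with a 5-jiffies delay, the optimizer tends to fire
		 * here, take kprobe_mutex and do synchronize_rcu_tasks(),
		 * stalling the next iteration.
		 */
	}						/* (6) */
}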

> 
> It might be worth trying synchronize_rcu_expedited() just as an experiment
> to see if it speeds things up significantly.

Do you mean replacing synchronize_rcu() in the disabling loop, or
replacing synchronize_rcu_tasks() in the optimizer?

I think that is not the root cause of this behavior, since making
the optimizer delay 1 second seems to be enough to turn it into
a batch operation. See below; this is the result with the patched
kernel (a delay of HZ jiffies, i.e. 1 second).

Probes: 256 kprobe_events
Enable events
real	0m 0.07s
user	0m 0.00s
sys	0m 0.07s
[   19.191181] Queue-update: 180, skipped, Total Queued: 180
Disable events
[   20.214966] Queue-update: 1, go, Total Queued: 172
[   21.302924] Queue-update: 86, skipped, Total Queued: 86
real	0m 2.11s
user	0m 0.00s
sys	0m 0.03s
[   22.327173] Queue-update: 87, skipped, Total Queued: 172
[   23.350933] Queue-update: 1, go, Total Queued: 172
Remove events
real	0m 2.13s
user	0m 0.02s
sys	0m 0.02s

As you can see, the optimizer ran outside of the disabling loop.
In that case, it is OK to do the RCU-tasks synchronization in the optimizer
because it runs just *once* for multiple probe events.

From the above result, 86 probes are disabled per 1-second delay, so
each probe disabling took 11-12 msec on average. Thus
(HZ / 10) could also be a good delay. (But note that the optimizer
will retry each time until the disabling loop is
finished.)

BTW, the test kernel was built with HZ=1000; if HZ=250 or HZ=100,
the result will be different with the current code. So I think
we should use an HZ-based delay instead of a fixed number.
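
(For example -- just a sketch of how the delay stays constant in
wall-clock time; the patch itself simply uses HZ:)

/*
 * Keep the optimizer delay at roughly the same wall-clock time regardless
 * of CONFIG_HZ.  The patch uses plain HZ (1 second); a shorter delay of
 * about HZ / 10 could be written portably as msecs_to_jiffies(100).
 */
#define OPTIMIZE_DELAY	HZ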

Thank you,
Paul E. McKenney Jan. 23, 2020, 10:03 a.m. UTC | #9
On Thu, Jan 23, 2020 at 03:13:28PM +0900, Masami Hiramatsu wrote:
> On Wed, 22 Jan 2020 18:26:47 -0800
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> 
> > > Anyway, without this update, I added a printk to count the optimizer
> > > queue length and found that every optimizer call had only a single kprobe
> > > on the queue. I think this depends on the machine, but as far as
> > > I tested on an 8-thread qemu x86 guest, it shows this result...
> > > 
> > > Probes: 256 kprobe_events
> > > Enable events
> > > real	0m 0.02s
> > > user	0m 0.00s
> > > sys	0m 0.02s
> > > [   17.730548] Queue-update: 180, skipped, Total Queued: 180
> > > [   17.739445] Queue-update: 1, go, Total Queued: 180
> > > Disable events
> [...]
> > > [   41.135594] Queue-update: 1, go, Total Queued: 1
> > > real	0m 21.40s
> > > user	0m 0.00s
> > > sys	0m 0.04s
> > 
> > So 21.4s/256 = 84 milliseconds per event disable, correct?
> 
> Actually, it seems only 172 probes are on the unoptimized list, so
> the number will be a bit different.
> 
> Anyway, the elapsed time above includes the non-batch optimizer
> work, as outlined below.
> 
> (1) start looping on the probe events
>   (2) disable the kprobe
>     (2.1) wait for kprobe_mutex if the optimizer is running
>     (2.2) if the kprobe is an optimized kprobe, queue it on the unoptimizing
>           list and kick the optimizer with a 5-jiffies delay
>   (3) unlink the enabled event
>   (4) wait for synchronize_rcu()
>   (5) the optimizer starts optimization before (4) finishes
> (6) goto (1)
> 
> I think the disabling performance issue comes from (5) (and (2.1)).
> Thus, if we change the delay in (2.2) to HZ jiffies (1 second), the optimizer
> will start only after some loops are done (and the optimizer detects "active"
> queuing and postpones the process).
> 
> > 
> > It might be worth trying synchronize_rcu_expedited() just as an experiment
> > to see if it speeds things up significantly.
> 
> Do you mean replacing synchronize_rcu() in the disabling loop, or
> replacing synchronize_rcu_tasks() in the optimizer?

The former.  (There is no synchronize_rcu_tasks_expedited().)

> I think that is not the root cause of this behavior, since making
> the optimizer delay 1 second seems to be enough to turn it into
> a batch operation. See below; this is the result with the patched
> kernel (a delay of HZ jiffies, i.e. 1 second).
> 
> Probes: 256 kprobe_events
> Enable events
> real	0m 0.07s
> user	0m 0.00s
> sys	0m 0.07s
> [   19.191181] Queue-update: 180, skipped, Total Queued: 180
> Disable events
> [   20.214966] Queue-update: 1, go, Total Queued: 172
> [   21.302924] Queue-update: 86, skipped, Total Queued: 86
> real	0m 2.11s
> user	0m 0.00s
> sys	0m 0.03s
> [   22.327173] Queue-update: 87, skipped, Total Queued: 172
> [   23.350933] Queue-update: 1, go, Total Queued: 172
> Remove events
> real	0m 2.13s
> user	0m 0.02s
> sys	0m 0.02s
> 
> As you can see, the optimizer ran outside of the disabling loop.
> In that case, it is OK to do the RCU-tasks synchronization in the optimizer
> because it runs just *once* for multiple probe events.

Even better!

> From the above result, 86 probes are disabled per 1-second delay, so
> each probe disabling took 11-12 msec on average. Thus
> (HZ / 10) could also be a good delay. (But note that the optimizer
> will retry each time until the disabling loop is
> finished.)
> 
> BTW, the test kernel was built with HZ=1000; if HZ=250 or HZ=100,
> the result will be different with the current code. So I think
> we should use an HZ-based delay instead of a fixed number.

Agreed, the HZ-based delay seems worth a try.

							Thanx, Paul

Patch

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 0dacdcecc90f..9c6e230852ad 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -469,7 +469,8 @@ static int kprobe_optimizer_queue_update;
 
 static void kprobe_optimizer(struct work_struct *work);
 static DECLARE_DELAYED_WORK(optimizing_work, kprobe_optimizer);
-#define OPTIMIZE_DELAY 5
+/* Wait 1 second for starting optimization */
+#define OPTIMIZE_DELAY HZ
 
 /*
  * Optimize (replace a breakpoint with a jump) kprobes listed on