Message ID: 20200520072814.128267-6-irogers@google.com
State: Not Applicable
Delegated to: BPF Maintainers
Series: Share events between metrics
On Wed, May 20, 2020 at 12:28:12AM -0700, Ian Rogers wrote:

SNIP

> @@ -157,7 +183,7 @@ static int metricgroup__setup_events(struct list_head *groups,
>  	int i = 0;
>  	int ret = 0;
>  	struct egroup *eg;
> -	struct evsel *evsel;
> +	struct evsel *evsel, *tmp;
>  	unsigned long *evlist_used;
>
>  	evlist_used = bitmap_alloc(perf_evlist->core.nr_entries);
> @@ -173,7 +199,8 @@ static int metricgroup__setup_events(struct list_head *groups,
>  			ret = -ENOMEM;
>  			break;
>  		}
> -		evsel = find_evsel_group(perf_evlist, &eg->pctx, metric_events,
> +		evsel = find_evsel_group(perf_evlist, &eg->pctx,
> +					 eg->has_constraint, metric_events,
>  					 evlist_used);
>  		if (!evsel) {
>  			pr_debug("Cannot resolve %s: %s\n",
> @@ -200,6 +227,12 @@ static int metricgroup__setup_events(struct list_head *groups,
>  		list_add(&expr->nd, &me->head);
>  	}
>
> +	evlist__for_each_entry_safe(perf_evlist, tmp, evsel) {
> +		if (!test_bit(evsel->idx, evlist_used)) {
> +			evlist__remove(perf_evlist, evsel);
> +			evsel__delete(evsel);
> +		}

is the grouping still enabled when we merge groups? could part
of the metric (events) now be computed in different groups?

I was wondering if we could merge all the hashmaps into a single
one before we parse the evlist.. this way we won't need the removing
later.. but I did not think this through completely, so it
might not work at some point

jirka
On Wed, May 20, 2020 at 6:49 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Wed, May 20, 2020 at 12:28:12AM -0700, Ian Rogers wrote:
>
> SNIP
>
> > +	evlist__for_each_entry_safe(perf_evlist, tmp, evsel) {
> > +		if (!test_bit(evsel->idx, evlist_used)) {
> > +			evlist__remove(perf_evlist, evsel);
> > +			evsel__delete(evsel);
> > +		}
>
> is the grouping still enabled when we merge groups? could part
> of the metric (events) now be computed in different groups?

By default the change will take two metrics and allow the shorter
metric (in terms of number of events) to share the events of the
longer metric. If the events for the shorter metric aren't in the
longer metric then the shorter metric must use its own group of
events. If sharing has occurred then the bitmap is used to work out
which events and groups are no longer in use.

With --metric-no-group, any event can be used for a metric as
there is no grouping. This is why more events can be eliminated.

With --metric-no-merge, the logic to share events between
different metrics is disabled and every metric is in a group. This
preserves the current behavior.

There are some corner cases, such as metrics with constraints (which
don't group) and duration_time, which is never placed into a group.

Partial sharing, with one event in one weak event group and one in
another, is never performed. Using --metric-no-group allows something
similar. Given multiplexing, I'd be concerned about accuracy problems
if events between groups were shared - say for IPC, were you measuring
instructions and cycles at the same moment?

> I was wondering if we could merge all the hashmaps into a single
> one before we parse the evlist.. this way we won't need the removing
> later.. but I did not think this through completely, so it
> might not work at some point

This could be done in the --metric-no-group case reasonably easily
with the current hashmap. For groups you'd want something like a set
of sets of events, but then you'd only be able to share events if the
sets were the same. A directed acyclic graph could capture the events
and the sharing relationships; it may be possible to optimize cases
like {A,B,C}, {A,B,D}, {A,B} so that the small group on the end shares
events with both the {A,B,C} and {A,B,D} groups. This may be good
follow-up work. We could also solve this in the json, for example
create a "phony" group of {A,B,C,D} that all three metrics share from.
You could also use --metric-no-group to achieve that sharing now.

Thanks,
Ian

> jirka
>
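The sharing rule described above - a shorter metric may reuse a longer metric's group only when every one of its events already appears in that group - amounts to a subset check over event names. A minimal standalone sketch (hypothetical helper, not the perf implementation):

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not perf code): may a metric needing the
 * `small` events share the group carrying the `big` events?
 * Only if every small event is present in the big group.
 */
static int can_share(const char **small, int nsmall,
		     const char **big, int nbig)
{
	int i, j, found;

	for (i = 0; i < nsmall; i++) {
		found = 0;
		for (j = 0; j < nbig; j++) {
			if (!strcmp(small[i], big[j])) {
				found = 1;
				break;
			}
		}
		if (!found)
			return 0;	/* must keep its own group */
	}
	return 1;
}
```

So {A,B} can share a {A,B,C} group, while {A,D} cannot and keeps its own group - the behavior the bitmap-and-remove code above relies on.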
On Wed, May 20, 2020 at 09:50:13AM -0700, Ian Rogers wrote:

SNIP

> > is the grouping still enabled when we merge groups? could part
> > of the metric (events) now be computed in different groups?
>
> By default the change will take two metrics and allow the shorter
> metric (in terms of number of events) to share the events of the
> longer metric. If the events for the shorter metric aren't in the
> longer metric then the shorter metric must use its own group of
> events. If sharing has occurred then the bitmap is used to work out
> which events and groups are no longer in use.
>
> With --metric-no-group, any event can be used for a metric as
> there is no grouping. This is why more events can be eliminated.
>
> With --metric-no-merge, the logic to share events between
> different metrics is disabled and every metric is in a group. This
> preserves the current behavior.
>
> There are some corner cases, such as metrics with constraints (which
> don't group) and duration_time, which is never placed into a group.
>
> Partial sharing, with one event in one weak event group and one in
> another, is never performed. Using --metric-no-group allows something
> similar. Given multiplexing, I'd be concerned about accuracy problems
> if events between groups were shared - say for IPC, were you measuring
> instructions and cycles at the same moment?

hum, I think that's also a concern if you are multiplexing 2 groups and
one metric is getting events from both groups that were not measured
together

it makes sense to me to put all the merged events into a single weak
group. anything else will have the issue you described above, no?

and perhaps add a command line option for the merging to make sure it's
what the user actually wants

thanks,
jirka

SNIP
On Wed, May 20, 2020 at 3:09 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Wed, May 20, 2020 at 09:50:13AM -0700, Ian Rogers wrote:
>
> SNIP
>
> hum, I think that's also a concern if you are multiplexing 2 groups and
> one metric is getting events from both groups that were not measured
> together
>
> it makes sense to me to put all the merged events into a single weak
> group. anything else will have the issue you described above, no?
>
> and perhaps add a command line option for the merging to make sure it's
> what the user actually wants

I'm not sure I'm following. With the patch set, if we have 3 metrics
with the event groups shown:
M1: {A,B,C}:W
M2: {A,B}:W
M3: {A,B,D}:W

then what happens is we sort the metrics into M1, M3, M2 and then, when
we come to match the events:

- by default: match events, allowing sharing if all events come from
the same group. So in the example M1 will first match with {A,B,C},
then M3 will fail to match the group {A,B,C} but match {A,B,D}; M2
will succeed in matching {A,B} from M1. The events/group for M2 can
be removed as they are no longer used. This kind of sharing is
opportunistic and respects existing groupings. While it may mean a
metric is computed from a group that now multiplexes, that group will
run for more of the time as there are fewer groups to multiplex with.
In this example we've gone from 3 groups down to 2, and from 8 events
down to 6. An improvement would be to realize that A,B is in both M1
and M3, so when we print the stat we could combine these values.

- with --metric-no-merge: no events are shared by metrics; M1, M2 and
M3 have their events and computation as things currently are. There
are 3 groups and 8 events.

- with --metric-no-group: all groups are removed and so the evlist
has A,B,C,A,B,A,B,D in it. The matching will now match M1 to A,B,C at
the beginning of the list, M2 to the first A,B and M3 to the same A,B
and the D at the end of the list. We've got no groups and the events
have gone from 8 down to 4.

It is difficult to reason about which grouping is most accurate. If we
have 4 counters (no NMI watchdog) then this example will fit with no
multiplexing. The default above should achieve less multiplexing, in
the same way merging PMU events currently does - this patch is trying
to mirror the --no-merge functionality to a degree. Considering
TopDownL1, we go from metrics that never sum to 100% to metrics that
do in either the default or --metric-no-group cases.

I'm not sure what user option is missing with these combinations? The
default is trying to strike a compromise and I think user interaction
is unnecessary, just as --no-merge doesn't cause interaction. If the
existing behavior is wanted, using --metric-no-merge will give that.
The new default and --metric-no-group are hopefully going to reduce
the number of groups and events. I'm somewhat agnostic as to what the
flag functionality should be; as what I'm working with needs either the
default or --metric-no-group, I can use whatever flag is agreed upon.

Thanks,
Ian

SNIP
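The default-case walkthrough above (M1 {A,B,C}, M3 {A,B,D}, M2 {A,B}, sorted longest first, going from 3 groups/8 events to 2 groups/6 events) can be modeled as a small greedy placement. This is an illustrative model with made-up names, not the perf code:

```c
#include <assert.h>
#include <string.h>

#define MAX_EVENTS 8
#define MAX_GROUPS 8
#define NR_METRICS 3

/* Toy model (not perf code): a group is just a list of event names. */
struct group {
	const char *ev[MAX_EVENTS];
	int nr;
};

/* The thread's example, sorted longest metric first: M1, M3, M2. */
static const char *metric_ev[NR_METRICS][MAX_EVENTS] = {
	{ "A", "B", "C" },	/* M1 */
	{ "A", "B", "D" },	/* M3 */
	{ "A", "B" },		/* M2 */
};
static const int metric_nr[NR_METRICS] = { 3, 3, 2 };

static int is_subset(const char **need, int nr_need, const struct group *g)
{
	int i, j, found;

	for (i = 0; i < nr_need; i++) {
		found = 0;
		for (j = 0; j < g->nr; j++)
			if (!strcmp(need[i], g->ev[j]))
				found = 1;
		if (!found)
			return 0;
	}
	return 1;
}

/*
 * Greedy default placement: each metric reuses an existing group that
 * contains all of its events, otherwise it contributes its own group.
 * Returns the number of groups kept; *events_kept gets the event count.
 */
static int demo_groups(int *events_kept)
{
	struct group groups[MAX_GROUPS];
	int m, g, nr_groups = 0, total = 0;

	for (m = 0; m < NR_METRICS; m++) {
		int shared = 0;

		for (g = 0; g < nr_groups && !shared; g++)
			shared = is_subset(metric_ev[m], metric_nr[m],
					   &groups[g]);
		if (!shared) {
			for (g = 0; g < metric_nr[m]; g++)
				groups[nr_groups].ev[g] = metric_ev[m][g];
			groups[nr_groups++].nr = metric_nr[m];
		}
	}
	for (g = 0; g < nr_groups; g++)
		total += groups[g].nr;
	*events_kept = total;
	return nr_groups;
}
```

Running the model keeps {A,B,C} and {A,B,D} (M2 shares M1's group), i.e. 2 groups and 6 events, matching the numbers in the mail.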
On Wed, May 20, 2020 at 03:42:02PM -0700, Ian Rogers wrote:

SNIP

> I'm not sure I'm following. With the patch set, if we have 3 metrics
> with the event groups shown:
> M1: {A,B,C}:W
> M2: {A,B}:W
> M3: {A,B,D}:W
>
> then what happens is we sort the metrics into M1, M3, M2 and then, when
> we come to match the events:
>
> - by default: match events, allowing sharing if all events come from
> the same group. So in the example M1 will first match with {A,B,C},
> then M3 will fail to match the group {A,B,C} but match {A,B,D}; M2
> will succeed in matching {A,B} from M1. The events/group for M2 can
> be removed as they are no longer used. This kind of sharing is
> opportunistic and respects existing groupings. While it may mean a
> metric is computed from a group that now multiplexes, that group will
> run for more of the time as there are fewer groups to multiplex with.
> In this example we've gone from 3 groups down to 2, and from 8 events
> down to 6. An improvement would be to realize that A,B is in both M1
> and M3, so when we print the stat we could combine these values.

ok, I misunderstood and thought you would collapse M3 as well, to
have A,B computed via the M1 group with a separate D ...

thanks a lot for the explanation, it might be great to have it
in the comments/changelog or even the man page

SNIP

> I'm not sure what user option is missing with these combinations? The
> default is trying to strike a compromise and I think user interaction
> is unnecessary, just as --no-merge doesn't cause interaction. If the
> existing behavior is wanted, using --metric-no-merge will give that.
> The new default and --metric-no-group are hopefully going to reduce
> the number of groups and events. I'm somewhat agnostic as to what the
> flag functionality should be; as what I'm working with needs either the
> default or --metric-no-group, I can use whatever flag is agreed upon.

no other option is needed then

thanks,
jirka
On Thu, May 21, 2020 at 3:54 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Wed, May 20, 2020 at 03:42:02PM -0700, Ian Rogers wrote:
>
> SNIP
>
> ok, I misunderstood and thought you would collapse M3 as well, to
> have A,B computed via the M1 group with a separate D ...
>
> thanks a lot for the explanation, it might be great to have it
> in the comments/changelog or even the man page

Thanks Jiri! Arnaldo, do you want me to copy the description above into
the commit message of this change and resend?

This patch adds some description to find_evsel_group; it is expanded
by the next patch, which adds the two command line flags:
https://lore.kernel.org/lkml/20200520072814.128267-7-irogers@google.com/
When writing the patches it wasn't clear to me how much detail to
include in, say, the man pages.

Thanks,
Ian

SNIP
On Thu, May 21, 2020 at 10:26:02AM -0700, Ian Rogers wrote:
> On Thu, May 21, 2020 at 3:54 AM Jiri Olsa <jolsa@redhat.com> wrote:
> > ok, I misunderstood and thought you would collapse M3 as well, to
> > have A,B computed via the M1 group with a separate D ...
> >
> > thanks a lot for the explanation, it might be great to have it
> > in the comments/changelog or even the man page
>
> Thanks Jiri! Arnaldo, do you want me to copy the description above into
> the commit message of this change and resend?

Send it as a follow-on patch, please.

- Arnaldo

> This patch adds some description to find_evsel_group; it is expanded
> by the next patch, which adds the two command line flags:
> https://lore.kernel.org/lkml/20200520072814.128267-7-irogers@google.com/
> When writing the patches it wasn't clear to me how much detail to
> include in, say, the man pages.
>
> Thanks,
> Ian
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 52e4c3e4748a..7ed32baeb767 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -93,36 +93,72 @@ struct egroup {
 	bool has_constraint;
 };
 
+/**
+ * Find a group of events in perf_evlist that correspond to those from a parsed
+ * metric expression.
+ * @perf_evlist: a list of events something like: {metric1 leader, metric1
+ * sibling, metric1 sibling}:W,duration_time,{metric2 leader, metric2 sibling,
+ * metric2 sibling}:W,duration_time
+ * @pctx: the parse context for the metric expression.
+ * @has_constraint: is there a constraint on the group of events? In which case
+ * the events won't be grouped.
+ * @metric_events: out argument, null terminated array of evsel's associated
+ * with the metric.
+ * @evlist_used: in/out argument, bitmap tracking which evlist events are used.
+ * @return the first metric event or NULL on failure.
+ */
 static struct evsel *find_evsel_group(struct evlist *perf_evlist,
 				      struct expr_parse_ctx *pctx,
+				      bool has_constraint,
 				      struct evsel **metric_events,
 				      unsigned long *evlist_used)
 {
-	struct evsel *ev;
-	bool leader_found;
-	const size_t idnum = hashmap__size(&pctx->ids);
-	size_t i = 0;
-	int j = 0;
+	struct evsel *ev, *current_leader = NULL;
 	double *val_ptr;
+	int i = 0, matched_events = 0, events_to_match;
+	const int idnum = (int)hashmap__size(&pctx->ids);
+
+	/* duration_time is grouped separately. */
+	if (!has_constraint &&
+	    hashmap__find(&pctx->ids, "duration_time", (void **)&val_ptr))
+		events_to_match = idnum - 1;
+	else
+		events_to_match = idnum;
 
 	evlist__for_each_entry (perf_evlist, ev) {
-		if (test_bit(j++, evlist_used))
+		/*
+		 * Events with a constraint aren't grouped and match the first
+		 * events available.
+		 */
+		if (has_constraint && ev->weak_group)
 			continue;
-		if (hashmap__find(&pctx->ids, ev->name, (void **)&val_ptr)) {
-			if (!metric_events[i])
-				metric_events[i] = ev;
-			i++;
-			if (i == idnum)
-				break;
-		} else {
-			/* Discard the whole match and start again */
-			i = 0;
+		if (!has_constraint && ev->leader != current_leader) {
+			/*
+			 * Start of a new group, discard the whole match and
+			 * start again.
+			 */
+			matched_events = 0;
 			memset(metric_events, 0,
 			       sizeof(struct evsel *) * idnum);
+			current_leader = ev->leader;
+		}
+		if (hashmap__find(&pctx->ids, ev->name, (void **)&val_ptr))
+			metric_events[matched_events++] = ev;
+		if (matched_events == events_to_match)
+			break;
+	}
+
+	if (events_to_match != idnum) {
+		/* Add the first duration_time. */
+		evlist__for_each_entry(perf_evlist, ev) {
+			if (!strcmp(ev->name, "duration_time")) {
+				metric_events[matched_events++] = ev;
+				break;
+			}
 		}
 	}
 
-	if (i != idnum) {
+	if (matched_events != idnum) {
 		/* Not whole match */
 		return NULL;
 	}
@@ -130,18 +166,8 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
 	metric_events[idnum] = NULL;
 
 	for (i = 0; i < idnum; i++) {
-		leader_found = false;
-		evlist__for_each_entry(perf_evlist, ev) {
-			if (!leader_found && (ev == metric_events[i]))
-				leader_found = true;
-
-			if (leader_found &&
-			    !strcmp(ev->name, metric_events[i]->name)) {
-				ev->metric_leader = metric_events[i];
-			}
-			j++;
-		}
 		ev = metric_events[i];
+		ev->metric_leader = ev;
 		set_bit(ev->idx, evlist_used);
 	}
 
@@ -157,7 +183,7 @@ static int metricgroup__setup_events(struct list_head *groups,
 	int i = 0;
 	int ret = 0;
 	struct egroup *eg;
-	struct evsel *evsel;
+	struct evsel *evsel, *tmp;
 	unsigned long *evlist_used;
 
 	evlist_used = bitmap_alloc(perf_evlist->core.nr_entries);
@@ -173,7 +199,8 @@ static int metricgroup__setup_events(struct list_head *groups,
 			ret = -ENOMEM;
 			break;
 		}
-		evsel = find_evsel_group(perf_evlist, &eg->pctx, metric_events,
+		evsel = find_evsel_group(perf_evlist, &eg->pctx,
+					 eg->has_constraint, metric_events,
 					 evlist_used);
 		if (!evsel) {
 			pr_debug("Cannot resolve %s: %s\n",
@@ -200,6 +227,12 @@ static int metricgroup__setup_events(struct list_head *groups,
 		list_add(&expr->nd, &me->head);
 	}
 
+	evlist__for_each_entry_safe(perf_evlist, tmp, evsel) {
+		if (!test_bit(evsel->idx, evlist_used)) {
+			evlist__remove(perf_evlist, evsel);
+			evsel__delete(evsel);
+		}
+	}
 	bitmap_free(evlist_used);
 
 	return ret;
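For the non-constraint path, the scan in find_evsel_group() above restarts its partial match whenever the group leader changes, so a metric only ever matches events drawn from a single group. A toy model of that scan (simplified types and names, not the perf code):

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for struct evsel: a name plus its group leader's index. */
struct sim_evsel {
	const char *name;
	int leader;
};

/*
 * Walk the event list, discarding the partial match whenever a new
 * group (new leader) begins, and succeed only once all wanted events
 * have been found within one group. Returns the index of the first
 * matched event, or -1 on failure.
 */
static int match_in_one_group(const struct sim_evsel *evlist, int nr,
			      const char **want, int nr_want)
{
	int i, j, matched = 0, cur_leader = -1, first = -1;

	for (i = 0; i < nr; i++) {
		if (evlist[i].leader != cur_leader) {
			/* New group: discard the whole match, start again. */
			matched = 0;
			first = -1;
			cur_leader = evlist[i].leader;
		}
		for (j = 0; j < nr_want; j++) {
			if (!strcmp(evlist[i].name, want[j])) {
				if (first < 0)
					first = i;
				matched++;
				break;
			}
		}
		if (matched == nr_want)
			return first;
	}
	return -1;
}
```

With an evlist of {A,B,C}{A,B,D}{A,B}, looking for {A,B,D} skips past the first group (C breaks the match) and succeeds in the second, mirroring the M3 case discussed earlier in the thread.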
A metric group contains multiple metrics. These metrics may use the
same events. If metrics use separate events then it leads to more
multiplexing and overall metric counts fail to sum to 100%.
Modify how metrics are associated with events so that if the events in
an earlier group satisfy the current metric, the same events are used.
A record of used events is kept and at the end of processing
unnecessary events are eliminated.

Before:

$ perf stat -a -M TopDownL1 sleep 1

 Performance counter stats for 'system wide':

       920,211,343      uops_issued.any             #    0.5 Backend_Bound    (16.56%)
     1,977,733,128      idq_uops_not_delivered.core                           (16.56%)
        51,668,510      int_misc.recovery_cycles                              (16.56%)
       732,305,692      uops_retired.retire_slots                             (16.56%)
     1,497,621,849      cycles                                                (16.56%)
       721,098,274      uops_issued.any             #    0.1 Bad_Speculation  (16.79%)
     1,332,681,791      cycles                                                (16.79%)
       552,475,482      uops_retired.retire_slots                             (16.79%)
        47,708,340      int_misc.recovery_cycles                              (16.79%)
     1,383,713,292      cycles                      #    0.4 Frontend_Bound   (16.76%)
     2,013,757,701      idq_uops_not_delivered.core                           (16.76%)
     1,373,363,790      cycles                      #    0.1 Retiring         (33.54%)
       577,302,589      uops_retired.retire_slots                             (33.54%)
       392,766,987      inst_retired.any            #    0.3 IPC              (50.24%)
     1,351,873,350      cpu_clk_unhalted.thread                               (50.24%)
     1,332,510,318      cycles                      # 5330041272.0 SLOTS      (49.90%)

       1.006336145 seconds time elapsed

After:

$ perf stat -a -M TopDownL1 sleep 1

 Performance counter stats for 'system wide':

       765,949,145      uops_issued.any             #    0.1 Bad_Speculation
                                                    #    0.5 Backend_Bound    (50.09%)
     1,883,830,591      idq_uops_not_delivered.core #    0.3 Frontend_Bound   (50.09%)
        48,237,080      int_misc.recovery_cycles                              (50.09%)
       581,798,385      uops_retired.retire_slots   #    0.1 Retiring         (50.09%)
     1,361,628,527      cycles                      # 5446514108.0 SLOTS      (50.09%)
       391,415,714      inst_retired.any            #    0.3 IPC              (49.91%)
     1,336,486,781      cpu_clk_unhalted.thread                               (49.91%)

       1.005469298 seconds time elapsed

Note: Bad_Speculation + Backend_Bound + Frontend_Bound + Retiring =
100% after, whereas before it is 110%. After, there are 2 groups,
whereas before there are 6. After, the cycles event appears once,
whereas before it appeared 5 times.

Signed-off-by: Ian Rogers <irogers@google.com>
---
 tools/perf/util/metricgroup.c | 91 ++++++++++++++++++++++++-----------
 1 file changed, 62 insertions(+), 29 deletions(-)