Message ID | PAWPR08MB89825143A8C77510EB33C9D783D79@PAWPR08MB8982.eurprd08.prod.outlook.com |
---|---|
State | New |
Headers | show |
Series | patch for hp-timing.h | expand |
On 03/02/23 10:29, Wilco Dijkstra via Libc-alpha wrote: > Hi Jun, > >> hp-timing is only used on benchtest today, but it can be used on other >> measurements in the future. > > It can't as proposed below since this would overflow in ~16 seconds. > We could improve this by removing the zero bits from the multiply > and doing some shifts, maybe that will make the interval long enough > for integer multiply. Otherwise I'd suggest the floating point version since > it's not like the benchtests don't already use floating point. > > Cheers, > Wilco Do we really need to handle more than ~16 seconds for the hp-timing? It is usually used along with a accumulator and I would expect that once you need to measure more than 16s using clock_gettime works as expected. > > diff --git a/benchtests/Makefile b/benchtests/Makefile > index 292976b26b..a624614207 100644 > --- a/benchtests/Makefile > +++ b/benchtests/Makefile > @@ -499,4 +499,5 @@ $(objpfx)bench-%.c: %-inputs $(bench-deps) > cat $($*-INCLUDE); \ > fi; \ > $(PYTHON) scripts/bench.py $(patsubst %-inputs,%,$<); } > $@-tmp > + cp -f $@-tmp $@-bak > > Unintended change I guess? > > mv -f $@-tmp $@ > diff --git a/sysdeps/aarch64/hp-timing.h b/sysdeps/aarch64/hp-timing.h > index f7f7ac7cae..c699effe6a 100644 > --- a/sysdeps/aarch64/hp-timing.h > +++ b/sysdeps/aarch64/hp-timing.h > @@ -41,7 +41,7 @@ typedef uint64_t hp_timing_t; > #define HP_TIMING_DIFF(Diff, Start, End) \ > ({ hp_timing_t freq; \ > __asm__ __volatile__ ("mrs %0, cntfrq_el0" : "=r" (freq)); \ > - (Diff) = ((End) - (Start)) * (UINT64_C(1000000000) / freq); \ > + (Diff) = (((End) - (Start)) * UINT64_C(1000000000)) / freq; \ > }) > > #endif /* hp-timing.h */
Wilco, Removing zero bits from multiply and making shifts afterwards can extend the time before overflow. Does it also reduce the resolution of the timer? Regards, Jun -----Original Message----- From: Wilco Dijkstra <Wilco.Dijkstra@arm.com> Sent: Friday, February 3, 2023 7:30 AM To: Tang, Jun <juntangc@amazon.com> Cc: 'GNU C Library' <libc-alpha@sourceware.org> Subject: [EXTERNAL] patch for hp-timing.h CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. Hi Jun, > hp-timing is only used on benchtest today, but it can be used on other > measurements in the future. It can't as proposed below since this would overflow in ~16 seconds. We could improve this by removing the zero bits from the multiply and doing some shifts, maybe that will make the interval long enough for integer multiply. Otherwise I'd suggest the floating point version since it's not like the benchtests don't already use floating point. Cheers, Wilco diff --git a/benchtests/Makefile b/benchtests/Makefile index 292976b26b..a624614207 100644 --- a/benchtests/Makefile +++ b/benchtests/Makefile @@ -499,4 +499,5 @@ $(objpfx)bench-%.c: %-inputs $(bench-deps) cat $($*-INCLUDE); \ fi; \ $(PYTHON) scripts/bench.py $(patsubst %-inputs,%,$<); } > $@-tmp + cp -f $@-tmp $@-bak Unintended change I guess? mv -f $@-tmp $@ diff --git a/sysdeps/aarch64/hp-timing.h b/sysdeps/aarch64/hp-timing.h index f7f7ac7cae..c699effe6a 100644 --- a/sysdeps/aarch64/hp-timing.h +++ b/sysdeps/aarch64/hp-timing.h @@ -41,7 +41,7 @@ typedef uint64_t hp_timing_t; #define HP_TIMING_DIFF(Diff, Start, End) \ ({ hp_timing_t freq; \ __asm__ __volatile__ ("mrs %0, cntfrq_el0" : "=r" (freq)); \ - (Diff) = ((End) - (Start)) * (UINT64_C(1000000000) / freq); \ + (Diff) = (((End) - (Start)) * UINT64_C(1000000000)) / freq; \ }) #endif /* hp-timing.h */
Hi Jun, > Removing zero bits from multiply and making shifts afterwards can extend the time before > overflow. Does it also reduce the resolution of the timer? It wouldn't if we shift out zero bits. For example 1000000 has 6 zero bits at the bottom, so assuming every implementation uses a multiple of 1MHz then there is no loss of accuracy. And 6 extra bits would mean 1024 seconds before overflow at 1GHz which seems long enough. Cheers, Wilco
Wilco, Thanks for the suggestion, There are 7 zero bits at the end of freq (1050000000). So the time can be extended to 2048 second before losing resolution. Below is the updated patch for 1024 sec - - (Diff) = ((End) - (Start)) * (UINT64_C(1000000000) / freq); \ + (Diff) = (((End) - (Start)) * UINT64_C(1000000000>>6)) / (freq>>6); \ Regards, Jun -----Original Message----- From: Wilco Dijkstra <Wilco.Dijkstra@arm.com> Sent: Monday, February 6, 2023 10:54 AM To: Tang, Jun <juntangc@amazon.com> Cc: 'GNU C Library' <libc-alpha@sourceware.org> Subject: RE: [EXTERNAL]patch for hp-timing.h CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe. Hi Jun, > Removing zero bits from multiply and making shifts afterwards can > extend the time before overflow. Does it also reduce the resolution of the timer? It wouldn't if we shift out zero bits. For example 1000000 has 6 zero bits at the bottom, so assuming every implementation uses a multiple of 1MHz then there is no loss of accuracy. And 6 extra bits would mean 1024 seconds before overflow at 1GHz which seems long enough. Cheers, Wilco
Hi Jun, > There are 7 zero bits at the end of freq (1050000000). So the time can be extended to > 2048 second before losing resolution. Below is the updated patch for 1024 sec - > > - (Diff) = ((End) - (Start)) * (UINT64_C(1000000000) / freq); \ > + (Diff) = (((End) - (Start)) * UINT64_C(1000000000>>6)) / (freq>>6); \ Minor codestyle: there should be a space before/after '>>'. Apart from that it looks good and works fine. Can you resend an updated patch with comment header etc in a new email with subject like "[PATCH v2] AArch64: Fix HP_TIMING_DIFF computation" or similar? I guess this is your first patch and you don't have commit rights? Cheers, Wilco
diff --git a/benchtests/Makefile b/benchtests/Makefile index 292976b26b..a624614207 100644 --- a/benchtests/Makefile +++ b/benchtests/Makefile @@ -499,4 +499,5 @@ $(objpfx)bench-%.c: %-inputs $(bench-deps) cat $($*-INCLUDE); \ fi; \ $(PYTHON) scripts/bench.py $(patsubst %-inputs,%,$<); } > $@-tmp + cp -f $@-tmp $@-bak Unintended change I guess? mv -f $@-tmp $@ diff --git a/sysdeps/aarch64/hp-timing.h b/sysdeps/aarch64/hp-timing.h index f7f7ac7cae..c699effe6a 100644 --- a/sysdeps/aarch64/hp-timing.h +++ b/sysdeps/aarch64/hp-timing.h @@ -41,7 +41,7 @@ typedef uint64_t hp_timing_t; #define HP_TIMING_DIFF(Diff, Start, End) \ ({ hp_timing_t freq; \ __asm__ __volatile__ ("mrs %0, cntfrq_el0" : "=r" (freq)); \ - (Diff) = ((End) - (Start)) * (UINT64_C(1000000000) / freq); \ + (Diff) = (((End) - (Start)) * UINT64_C(1000000000)) / freq; \ }) #endif /* hp-timing.h */