Message ID | 20100622123003.GB15547@kam.mff.cuni.cz |
---|---|
State | New |
Headers | show |
On 06/22/2010 02:30 PM, Jan Hubicka wrote: > The patch seems to save about 10% of samples on bitmap_clear_bit, but it is > within noise factor. > To be frank, I don't understand this stance. As I see this type of issue, either you have independent reasons to believe that the patch is beneficial and just go ahead, or you should repeat the test enough times and in the correct conditions until you reach statistical significance. Otherwise, I don't believe it really makes sense to mention numbers which you don't trust yourself... Paolo.
> On 06/22/2010 02:30 PM, Jan Hubicka wrote: > > The patch seems to save about 10% of samples on bitmap_clear_bit, but it is > > within noise factor. > > > To be frank, I don't understand this stance. As I see this type of > issue, either you have independent reasons to believe that the patch is > beneficial and just go ahead, or you should repeat the test enough times > and in the correct conditions until you reach statistical significance. > Otherwise, I don't believe it really makes sense to mention numbers > which you don't trust yourself... Well, I can dig bit deeper. In this case I oprofiled cc1 binary before the change 8 times, and the numbers of samples was always above 10000 for bitmap_clear_bit (removing the peaks and averaging, it is 11480), results are +-2000 samples including the peaks. After the change, I profiled 4 times, the average is 9988. So the 10% difference is pretty clear. However with this particular patch, it seemed just good idea to microoptimize the conditional even if the overall effects on compile time are very tiny. (bitmap_clear_bit is 0.9% of compile time in my benchmark, so the overall change should be about 0.09%) Honza > > Paolo.
On Tue, 22 Jun 2010, Jan Hubicka wrote: > Hi, > this is small change I noticed while looking into bitmap_clear_bit (that is also > amonght top 10 most hot GCC functions). We test if the bit was previously > clear but then we go for bitmap_element_zerop. Since there is branch anyway > (I am still not convinced that reducing store bandwidth is worth the hard to > predict branches here), it is probably good idea to branch around the > bitmap_element_zerop too. I also added nonzero test for !ptr->bits[word_num] > since the value will be conveniently in register. > > The patch seems to save about 10% of samples on bitmap_clear_bit, but it is > within noise factor. > > Bootstrapped/regtested x86_64-linux, OK? Ok. Thanks, Richard. > * bitmap.c (bitmap_clear_bit): Micro optimize. > Index: bitmap.c > =================================================================== > --- bitmap.c (revision 161163) > +++ bitmap.c (working copy) > @@ -624,11 +624,13 @@ bitmap_clear_bit (bitmap head, int bit) > BITMAP_WORD bit_val = ((BITMAP_WORD) 1) << bit_num; > bool res = (ptr->bits[word_num] & bit_val) != 0; > if (res) > - ptr->bits[word_num] &= ~bit_val; > - > - /* If we cleared the entire word, free up the element. */ > - if (bitmap_element_zerop (ptr)) > - bitmap_element_free (head, ptr); > + { > + ptr->bits[word_num] &= ~bit_val; > + /* If we cleared the entire word, free up the element. */ > + if (!ptr->bits[word_num] > + && bitmap_element_zerop (ptr)) > + bitmap_element_free (head, ptr); > + } > > return res; > } > >
Index: bitmap.c =================================================================== --- bitmap.c (revision 161163) +++ bitmap.c (working copy) @@ -624,11 +624,13 @@ bitmap_clear_bit (bitmap head, int bit) BITMAP_WORD bit_val = ((BITMAP_WORD) 1) << bit_num; bool res = (ptr->bits[word_num] & bit_val) != 0; if (res) - ptr->bits[word_num] &= ~bit_val; - - /* If we cleared the entire word, free up the element. */ - if (bitmap_element_zerop (ptr)) - bitmap_element_free (head, ptr); + { + ptr->bits[word_num] &= ~bit_val; + /* If we cleared the entire word, free up the element. */ + if (!ptr->bits[word_num] + && bitmap_element_zerop (ptr)) + bitmap_element_free (head, ptr); + } return res; }