Message ID | 87k39n6mtn.fsf@oracle.com |
---|---|
State | New |
Headers | show |
From: jose.marchesi@oracle.com (Jose E. Marchesi) Date: Thu, 15 May 2014 18:14:44 +0200 > The following patch prevents stores made before calling memcpy to > overlap with the block stores in niagara2_memcpy. > > Tested on an Athena machine (sparc64x) running sparc64-*-linux-gnu. > No new warnings triggered by the patch. > No regressions detected. Does it really fix anything? Are these crashes or data corruptions or thread data visibilty issues that you've seen in practice and are cured by this change? The member is extremely expensive and I avoided it intentionally.
> Tested on an Athena machine (sparc64x) running sparc64-*-linux-gnu. > No new warnings triggered by the patch. > No regressions detected. Does it really fix anything? Certainly. Are these crashes or data corruptions or thread data visibilty issues that you've seen in practice and are cured by this change? Several system tests of a certain (very data intensive) program crashed as soon as it started to use multilib/niagara2_memcpy, due to memory corruption. They run just fine after this patch is applied. We also observed that jar would occasionally raise java.util.zip.ZipException. This was due to the fact the last two bytes of a buffer were not apparently being copied by memcpy. Again, this patch fixes that problem. The member is extremely expensive and I avoided it intentionally. Yes I suspected that, given how carefully that code is written. We are aware that membars are painful on the T2 (any membar, because all flavors are mapped to #Sync) but the risk of overlaps is not only theoretical in this case, unfortunately :(
From: jose.marchesi@oracle.com (Jose E. Marchesi) Date: Thu, 15 May 2014 19:31:43 +0200 > Several system tests of a certain (very data intensive) program crashed > as soon as it started to use multilib/niagara2_memcpy, due to memory > corruption. They run just fine after this patch is applied. > > We also observed that jar would occasionally raise > java.util.zip.ZipException. This was due to the fact the last two bytes > of a buffer were not apparently being copied by memcpy. Again, this > patch fixes that problem. > > The member is extremely expensive and I avoided it intentionally. > > Yes I suspected that, given how carefully that code is written. We are > aware that membars are painful on the T2 (any membar, because all > flavors are mapped to #Sync) but the risk of overlaps is not only > theoretical in this case, unfortunately :( Ok I'll install this fix after some testing and update the copy of the same code in the kernel as well, thanks.
> Several system tests of a certain (very data intensive) program crashed > as soon as it started to use multilib/niagara2_memcpy, due to memory > corruption. They run just fine after this patch is applied. > > We also observed that jar would occasionally raise > java.util.zip.ZipException. This was due to the fact the last two bytes > of a buffer were not apparently being copied by memcpy. Again, this > patch fixes that problem. > > The member is extremely expensive and I avoided it intentionally. > > Yes I suspected that, given how carefully that code is written. We are > aware that membars are painful on the T2 (any membar, because all > flavors are mapped to #Sync) but the risk of overlaps is not only > theoretical in this case, unfortunately :( Ok I'll install this fix after some testing and update the copy of the same code in the kernel as well, thanks. Thanks.
diff --git a/sysdeps/sparc/sparc64/multiarch/memcpy-niagara2.S b/sysdeps/sparc/sparc64/multiarch/memcpy-niagara2.S index b43a9e3..a1a9642 100644 --- a/sysdeps/sparc/sparc64/multiarch/memcpy-niagara2.S +++ b/sysdeps/sparc/sparc64/multiarch/memcpy-niagara2.S @@ -211,6 +211,7 @@ ENTRY(__memcpy_niagara2) */ VISEntryHalf + membar #Sync alignaddr %o1, %g0, %g0 add %o1, (64 - 1), %o4