Message ID | 1448036610-7077-1-git-send-email-ryan.barnett@rockwellcollins.com |
---|---|
State | Superseded |
Headers | show |
All, On Fri, Nov 20, 2015 at 10:23 AM, Ryan Barnett <ryan.barnett@rockwellcollins.com> wrote: > The build-shared target depends on do_crypto and link-shared, which > will be executed in parallel. do_crypto calls > link_a.linux_shared -> link_a.gnu which does SYMLINK_SO; in parallel, > link-shared calls symlink.linux_shared which also does SYMLINK_SO. > Before the symlink is created, it is rm'ed, but there is a tiny chance > that the second one is created after the rm has been called. > > Fix this by using 'ln -sf' instead of 'ln -s' so the build doesn't > error out. > > Patch submitted upstream at: > https://bugs.gentoo.org/show_bug.cgi?id=566260 So it doesn't appear that using 'ln -sf' is a valid fix as pointed out in the bug report on the gentoo forum: -- Comment #2 from SpanKY <vapier@gentoo.org> --- `ln -sf` is no more atomic than `rm; ln`. there is still a window where things can fail. you can easily check this: i=0; while [ $((i++)) -lt 500 ]; do ln -sf a b & :; done a good number of those will fail with: ln: failed to create symbolic link ‘b’: File exists the deps need to be fixed up So how do we want to proceed with this? Do we revert the patch for parallel builds for openssl? Thanks, -Ryan
On 20-11-15 17:23, Ryan Barnett wrote: > The build-shared target depends on do_crypto and link-shared, which > will be executed in parallel. do_crypto calls > link_a.linux_shared -> link_a.gnu which does SYMLINK_SO; in parallel, > link-shared calls symlink.linux_shared which also does SYMLINK_SO. > Before the symlink is created, it is rm'ed, but there is a tiny chance > that the second one is created after the rm has been called. > > Fix this by using 'ln -sf' instead of 'ln -s' so the build doesn't > error out. > > Patch submitted upstream at: > https://bugs.gentoo.org/show_bug.cgi?id=566260 Ahem, gentoo is not exactly upstream :-) From [1]: To report a bug or make an enhancement request, send email to rt@openssl.org. In the subject line, please make sure to indicate if it's a bug or a fix, and a brief description of the issue. In the body of your mail, please include the versions of the operating system and OpenSSL you are using. If you have a patch or diff, please send it as an attachment, and not inline in the message body. As far as I can see, this patch should still apply (in slightly adapted form) to the real upstream. > > Thanks to Arnout for explain the issue (wording used above). You're a native English speaker, right? explain -> explaining :-) Some more minor things below, but anyway more for the upstreams. So Reviewed-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be> > > Signed-off-by: Ryan Barnett <ryan.barnett@rockwellcollins.com> > CC: Arnout Vandecappelle <arnout@mind.be> > CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> > > --- > > Note this is a temporary fix until a gentoo maintainer weighs in and > updates their parallel build patch. > > v1 -> v2 > Remove the 'rm -f' since we are using 'ln -sf' (Arnout originally) > --- > ...ared-fix-race-condition-when-symlinking-s.patch | 48 ++++++++++++++++++++++ > 1 file changed, 48 insertions(+) > create mode 100644 package/openssl/0003-makefile.shared-fix-race-condition-when-symlinking-s.patch > > diff --git a/package/openssl/0003-makefile.shared-fix-race-condition-when-symlinking-s.patch b/package/openssl/0003-makefile.shared-fix-race-condition-when-symlinking-s.patch > new file mode 100644 > index 0000000..d6672f4 > --- /dev/null > +++ b/package/openssl/0003-makefile.shared-fix-race-condition-when-symlinking-s.patch > @@ -0,0 +1,48 @@ > +From 878793890a9b84ff313397c56478d7c6f770b2d2 Mon Sep 17 00:00:00 2001 > +From: Ryan Barnett <ryan.barnett@rockwellcollins.com> > +Date: Thu, 19 Nov 2015 10:47:00 -0600 > +Subject: [PATCH] makefile.shared: fix race condition when symlinking shared > + libs > + > +The build-shared target depends on do_crypto and link-shared, which > +will be executed in parallel. do_crypto calls > +link_a.linux_shared -> link_a.gnu which does SYMLINK_SO; in parallel, > +link-shared calls symlink.linux_shared which also does SYMLINK_SO. > +Before the symlink is created, it is rm'ed, but there is a tiny chance > +that the second one is created after the rm has been called. > + > +Fix this race condition by just using ln -sf since it will be the same > +symlink regards and not cause the build to error out. regards -> regardless > + > +Signed-off-by: Ryan Barnett <ryan.barnett@rockwellcollins.com> > +--- > + Makefile.shared | 8 ++++---- > + 1 file changed, 4 insertions(+), 4 deletions(-) > + > +diff --git a/Makefile.shared b/Makefile.shared > +index 8d57163..8ab3d18 100644 > +--- a/Makefile.shared > ++++ b/Makefile.shared > +@@ -117,15 +117,15 @@ SYMLINK_SO= \ > + prev=$$SHLIB$$SHLIB_SOVER$$SHLIB_SUFFIX; \ > + if [ -n "$$SHLIB_COMPAT" ]; then \ > + for x in $$SHLIB_COMPAT; do \ > +- ( $(SET_X); rm -f $$SHLIB$$x$$SHLIB_SUFFIX; \ > +- ln -s $$prev $$SHLIB$$x$$SHLIB_SUFFIX ); \ > ++ ( $(SET_X); \ > ++ ln -sf $$prev $$SHLIB$$x$$SHLIB_SUFFIX ); \ > + prev=$$SHLIB$$x$$SHLIB_SUFFIX; \ > + done; \ > + fi; \ > + if [ -n "$$SHLIB_SOVER" ]; then \ > + [ -e "$$SHLIB$$SHLIB_SUFFIX" ] || \ This check is not really needed anymore. Regards, Arnout [1] https://www.openssl.org/community/ > +- ( $(SET_X); rm -f $$SHLIB$$SHLIB_SUFFIX; \ > +- ln -s $$prev $$SHLIB$$SHLIB_SUFFIX ); \ > ++ ( $(SET_X); \ > ++ ln -sf $$prev $$SHLIB$$SHLIB_SUFFIX ); \ > + fi; \ > + fi > + > +-- > +1.9.1 > + >
Arnout, On Sat, Nov 21, 2015 at 7:24 AM, Arnout Vandecappelle <arnout@mind.be> wrote: > On 20-11-15 17:23, Ryan Barnett wrote: >> The build-shared target depends on do_crypto and link-shared, which >> will be executed in parallel. do_crypto calls >> link_a.linux_shared -> link_a.gnu which does SYMLINK_SO; in parallel, >> link-shared calls symlink.linux_shared which also does SYMLINK_SO. >> Before the symlink is created, it is rm'ed, but there is a tiny chance >> that the second one is created after the rm has been called. >> >> Fix this by using 'ln -sf' instead of 'ln -s' so the build doesn't >> error out. >> >> Patch submitted upstream at: >> https://bugs.gentoo.org/show_bug.cgi?id=566260 > > Ahem, gentoo is not exactly upstream :-) From [1]: > > To report a bug or make an enhancement request, send email to rt@openssl.org. In > the subject line, please make sure to indicate if it's a bug or a fix, and a > brief description of the issue. In the body of your mail, please include the > versions of the operating system and OpenSSL you are using. If you have a patch > or diff, please send it as an attachment, and not inline in the message body. > > As far as I can see, this patch should still apply (in slightly adapted form) > to the real upstream. Yes I agree to submit it upstream directly to openssl. However, I started with Gentoo since there are the ones that seem most interested in making OpenSSL work in parallel. I have also just sent an email to OpenSSL about the issue. I think this patch is pointless now since it doesn't actually fix the problem that existed before. Please see my reply to this patch earlier describing that 'ln -sf' isn't an atomic operation and the error from the build failures still will exist: http://lists.busybox.net/pipermail/buildroot/2015-November/144882.html >> >> Thanks to Arnout for explain the issue (wording used above). > > You're a native English speaker, right? explain -> explaining :-) Yes but I never claimed to be great at translating my English thoughts to properly written sentence ;) This is especially true when I type up something in a hurry which this commit message was. Thanks, -Ryan
On 21-11-15 01:55, Ryan Barnett wrote: > All, > > On Fri, Nov 20, 2015 at 10:23 AM, Ryan Barnett > <ryan.barnett@rockwellcollins.com> wrote: >> The build-shared target depends on do_crypto and link-shared, which >> will be executed in parallel. do_crypto calls >> link_a.linux_shared -> link_a.gnu which does SYMLINK_SO; in parallel, >> link-shared calls symlink.linux_shared which also does SYMLINK_SO. >> Before the symlink is created, it is rm'ed, but there is a tiny chance >> that the second one is created after the rm has been called. >> >> Fix this by using 'ln -sf' instead of 'ln -s' so the build doesn't >> error out. >> >> Patch submitted upstream at: >> https://bugs.gentoo.org/show_bug.cgi?id=566260 > > So it doesn't appear that using 'ln -sf' is a valid fix as pointed out > in the bug report on the gentoo forum: > > -- Comment #2 from SpanKY <vapier@gentoo.org> --- > `ln -sf` is no more atomic than `rm; ln`. there is still a window where things > can fail. you can easily check this: > > i=0; while [ $((i++)) -lt 500 ]; do ln -sf a b & :; done > > a good number of those will fail with: > ln: failed to create symbolic link ‘b’: File exists > > the deps need to be fixed up > > So how do we want to proceed with this? Do we revert the patch for > parallel builds for openssl? I think for 2015.11 at least we should indeed revert it. The symlink can be created atomically by going through a temp file, but that's a bit crazy... Regards, Arnout > > Thanks, > -Ryan >
On 21 Nov 2015 14:24, Arnout Vandecappelle wrote: > On 20-11-15 17:23, Ryan Barnett wrote: > > The build-shared target depends on do_crypto and link-shared, which > > will be executed in parallel. do_crypto calls > > link_a.linux_shared -> link_a.gnu which does SYMLINK_SO; in parallel, > > link-shared calls symlink.linux_shared which also does SYMLINK_SO. > > Before the symlink is created, it is rm'ed, but there is a tiny chance > > that the second one is created after the rm has been called. > > > > Fix this by using 'ln -sf' instead of 'ln -s' so the build doesn't > > error out. > > > > Patch submitted upstream at: > > https://bugs.gentoo.org/show_bug.cgi?id=566260 > > Ahem, gentoo is not exactly upstream :-) From [1]: > > To report a bug or make an enhancement request, send email to rt@openssl.org. which these have been already. they don't respond. in fact, you can see a link to the relevant upstream report in every Gentoo parallel patch. -mike
diff --git a/package/openssl/0003-makefile.shared-fix-race-condition-when-symlinking-s.patch b/package/openssl/0003-makefile.shared-fix-race-condition-when-symlinking-s.patch new file mode 100644 index 0000000..d6672f4 --- /dev/null +++ b/package/openssl/0003-makefile.shared-fix-race-condition-when-symlinking-s.patch @@ -0,0 +1,48 @@ +From 878793890a9b84ff313397c56478d7c6f770b2d2 Mon Sep 17 00:00:00 2001 +From: Ryan Barnett <ryan.barnett@rockwellcollins.com> +Date: Thu, 19 Nov 2015 10:47:00 -0600 +Subject: [PATCH] makefile.shared: fix race condition when symlinking shared + libs + +The build-shared target depends on do_crypto and link-shared, which +will be executed in parallel. do_crypto calls +link_a.linux_shared -> link_a.gnu which does SYMLINK_SO; in parallel, +link-shared calls symlink.linux_shared which also does SYMLINK_SO. +Before the symlink is created, it is rm'ed, but there is a tiny chance +that the second one is created after the rm has been called. + +Fix this race condition by just using ln -sf since it will be the same +symlink regards and not cause the build to error out. + +Signed-off-by: Ryan Barnett <ryan.barnett@rockwellcollins.com> +--- + Makefile.shared | 8 ++++---- + 1 file changed, 4 insertions(+), 4 deletions(-) + +diff --git a/Makefile.shared b/Makefile.shared +index 8d57163..8ab3d18 100644 +--- a/Makefile.shared ++++ b/Makefile.shared +@@ -117,15 +117,15 @@ SYMLINK_SO= \ + prev=$$SHLIB$$SHLIB_SOVER$$SHLIB_SUFFIX; \ + if [ -n "$$SHLIB_COMPAT" ]; then \ + for x in $$SHLIB_COMPAT; do \ +- ( $(SET_X); rm -f $$SHLIB$$x$$SHLIB_SUFFIX; \ +- ln -s $$prev $$SHLIB$$x$$SHLIB_SUFFIX ); \ ++ ( $(SET_X); \ ++ ln -sf $$prev $$SHLIB$$x$$SHLIB_SUFFIX ); \ + prev=$$SHLIB$$x$$SHLIB_SUFFIX; \ + done; \ + fi; \ + if [ -n "$$SHLIB_SOVER" ]; then \ + [ -e "$$SHLIB$$SHLIB_SUFFIX" ] || \ +- ( $(SET_X); rm -f $$SHLIB$$SHLIB_SUFFIX; \ +- ln -s $$prev $$SHLIB$$SHLIB_SUFFIX ); \ ++ ( $(SET_X); \ ++ ln -sf $$prev $$SHLIB$$SHLIB_SUFFIX ); \ + fi; \ + fi + +-- +1.9.1 +
The build-shared target depends on do_crypto and link-shared, which will be executed in parallel. do_crypto calls link_a.linux_shared -> link_a.gnu which does SYMLINK_SO; in parallel, link-shared calls symlink.linux_shared which also does SYMLINK_SO. Before the symlink is created, it is rm'ed, but there is a tiny chance that the second one is created after the rm has been called. Fix this by using 'ln -sf' instead of 'ln -s' so the build doesn't error out. Patch submitted upstream at: https://bugs.gentoo.org/show_bug.cgi?id=566260 Thanks to Arnout for explain the issue (wording used above). Signed-off-by: Ryan Barnett <ryan.barnett@rockwellcollins.com> CC: Arnout Vandecappelle <arnout@mind.be> CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> --- Note this is a temporary fix until a gentoo maintainer weighs in and updates their parallel build patch. v1 -> v2 Remove the 'rm -f' since we are using 'ln -sf' (Arnout originally) --- ...ared-fix-race-condition-when-symlinking-s.patch | 48 ++++++++++++++++++++++ 1 file changed, 48 insertions(+) create mode 100644 package/openssl/0003-makefile.shared-fix-race-condition-when-symlinking-s.patch