Fix PR 47272 to restore Altivec vec_ld/vec_st

This patch fixes bug PR target/47272, but I'm sending it out to a wider
audience to solicit feedback from other developers to resolve a sticky
situation with the PowerPC code gen.

For those of you who don't know the Power architecture, particularly with the
VSX extensions: the first main vector extension was the Altivec (VMX) vector
support.  The VSX vector support adds more floating point vector registers, and
overlaps with the Altivec support.  In particular, the vector loads and stores
on Altivec ignore the bottom 4 bits of the address, while the vector loads and
stores of the VSX instruction set do not, and will do unaligned loads and
stores.  Obviously, if the address is 16-byte aligned, either an Altivec or
a VSX memory instruction will behave the same.  If on the other hand you have
an unaligned address, you will get different bytes loaded/stored.
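
To make the difference concrete, here is a small portable C model of the two
addressing behaviours.  It is only a sketch: memory is modelled as a byte array
whose index stands in for the address, and the names model_altivec_load and
model_vsx_load are made up for this illustration, they are not compiler
builtins.  It assumes 16-byte vectors, with the Altivec-style load rounding the
address down to a 16-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16

/* Model of an Altivec-style load: the low bits of the address are
   dropped, so the load always starts on a 16-byte boundary.  */
static void model_altivec_load(const uint8_t *mem, size_t mem_size,
                               size_t addr, uint8_t out[VEC_BYTES])
{
    size_t ea = addr & ~(size_t)(VEC_BYTES - 1);
    assert(ea + VEC_BYTES <= mem_size);
    memcpy(out, mem + ea, VEC_BYTES);
}

/* Model of a VSX-style load: the address is used as given, so an
   unaligned address yields an unaligned 16-byte window.  */
static void model_vsx_load(const uint8_t *mem, size_t mem_size,
                           size_t addr, uint8_t out[VEC_BYTES])
{
    assert(addr + VEC_BYTES <= mem_size);
    memcpy(out, mem + addr, VEC_BYTES);
}
```

With a buffer whose byte i holds the value i, both models return bytes 16..31
for address 16, but for address 21 the Altivec-style model still returns bytes
16..31 while the VSX-style model returns bytes 21..36.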

The PowerPC compiler has a full set of overloaded vector intrinsic builtin
functions, including builtins for doing load and store.  When I added the VSX
support, I changed the compiler to do VSX loads/stores if the user used -mvsx
or -mcpu=power7, including changing the builtin load/store functions to use the
VSX instructions.  However, as I said, you get different results for unaligned
addresses.

Richard Henderson's change to libcpp/lex.c on August 21st, 2010 added code to
use the Altivec instruction set if the compiler supports it to speed up the
preprocessor:

2010-08-21  Richard Henderson  <rth@redhat.com>
	    Andi Kleen <ak@linux.intel.com>
	    David S. Miller  <davem@davemloft.net>

	* configure.ac (AC_C_BIGENDIAN, AC_TYPE_UINTPTR_T): New tests.
	(ssize_t): Check via AC_TYPE_SSIZE_T instead of AC_CHECK_TYPE.
	(ptrdiff_t): Check via AC_CHECK_TYPE.
	* config.in, configure: Rebuild.
	* system.h: Include stdint.h, if available.
	* lex.c (WORDS_BIGENDIAN): Provide default.
	(acc_char_mask_misalign, acc_char_replicate, acc_char_cmp,
	acc_char_index, search_line_acc_char, repl_chars, search_line_mmx,
	search_line_sse2, search_line_sse42, init_vectorized_lexer,
	search_line_fast): New.
	(_cpp_clean_line): Use search_line_fast.  Restructure the fast
	loop to make it clear when we're leaving the loop.  Stay in the
	fast loop for non-trigraph '?'.

Recently we started to look at building internal versions of the GCC 4.6
compiler configured with --with-cpu=power7, and it exposed the difference
between the two loads.

So after some debate within IBM, we've come to the conclusion that I should not
have changed the semantics of __builtin_vec_ld and __builtin_vec_st, and that
we should go back to using the Altivec form for these instructions.  However,
in doing so, it means that anybody who has written new code explicitly for
power7 since GCC 4.5 came out might now be surprised.  Unfortunately the bug
exists in GCC 4.5 as well as the Red Hat RHEL6 and SUSE SLES 11 SP1 compilers.

I realize that we are in stage 4 of the release process, but if we are going to
change the builtins back to the 4.4 semantics, we should do it as soon as
possible.

David suggested I poll the release managers and other interested parties about
which path we should take (make the builtins adhere to the 4.4 semantics, or
just keep the current situation).

I'm enclosing patches to make the load/store builtins go back to the Altivec
semantics, and to add vector double support to them.  In addition, I added
patches for libcpp/lex.c so that it will work with 4.5 compilers as well as 4.4
and future 4.6 compilers.  Whether or not we decide to change the builtin
semantics back, I feel the lex.c patch should go in.
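
As an illustration of how code can be written so that it does not care which
load semantics the compiler provides (this is a portable sketch of the general
idea, not the enclosed lex.c patch), the address can be rounded down
explicitly before each 16-byte chunk is examined, and the bytes before the
real start ignored.  The function name find_newline_chunked is hypothetical,
and the inner byte loop stands in for what would be a single vector load and
compare.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK 16

/* Hypothetical helper: return the address of the first '\n' at or
   after s.  The caller must guarantee that a '\n' exists.  */
static const unsigned char *find_newline_chunked(const unsigned char *s)
{
    assert(s != NULL);

    /* Round the address down explicitly instead of relying on the
       load instruction to do it.  */
    uintptr_t base = (uintptr_t)s & ~(uintptr_t)(CHUNK - 1);
    const unsigned char *p = (const unsigned char *)base;

    /* Bytes of the first chunk that lie before s must be ignored.  */
    size_t skip = (size_t)((uintptr_t)s - base);

    for (;;) {
        /* A vector version would load all CHUNK bytes at once and
           mask off the first 'skip' lanes.  */
        for (size_t i = skip; i < CHUNK; i++)
            if (p[i] == '\n')
                return p + i;
        p += CHUNK;
        skip = 0;
    }
}
```

Because the alignment is done in the source, an aligned load then sees the
same address under either the Altivec or the VSX behaviour.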

So far I have not added an #ifdef or -m switch to toggle to the 4.5
behaviour.  I can do this if desired (it probably is a good idea to allow code
written for 4.5 to continue to be used).  I don't know how many people write
code that directly depends on the Altivec semantics.

I should mention that Darwin users and people using the host processor in the
PS3 who might have written Altivec-specific code will not be affected, since
those machines do not have the VSX instruction set.  It is only the newer
machines being shipped by IBM that currently will have the problem.

Sorry about all this.
