Message ID | CAAs8HmyUjEf5ecGBerysqSFgOq3SC0esvN3AnEJm5O_kxzrGyQ@mail.gmail.com |
---|---|
State | New |
On Fri, Mar 15, 2013 at 2:55 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Hi,
>
> This patch is meant for google/gcc-4_7 but I want this to be
> considered for trunk when it opens again. This patch makes it easy to
> test for code coverage of multiversioned functions. Here is a
> motivating example:
>
> __attribute__((target ("default"))) int foo () { ... return 0; }
> __attribute__((target ("sse"))) int foo () { ... return 1; }
> __attribute__((target ("popcnt"))) int foo () { ... return 2; }
>
> int main ()
> {
>   return foo();
> }
>
> Let's say your test CPU supports popcnt. A run of this program will
> invoke the popcnt version of foo (). Then, how do we test the sse
> version of foo()? To do that for the above example, we need to run
> this code on a CPU that has sse support but no popcnt support.
> Otherwise, we need to comment out the popcnt version and run this
> example. This can get painful when there are many versions. The same
> argument applies to testing the default version of foo.
>
> So, I am introducing the ability to mock a CPU. If the CPU you are
> testing on supports sse, you should be able to test the sse version.
>
> First, I have introduced a new flag called -fmv-debug. With this flag,
> the function version dispatcher is invoked every time a call to foo ()
> is made. Without that flag, the version dispatch happens once at
> startup time via the IFUNC mechanism.
>
> Also, with -fmv-debug, the version dispatcher uses the two new
> builtins "__builtin_mock_cpu_is" and "__builtin_mock_cpu_supports" to
> check the cpu type and cpu isa.

With this option, the compiler could probably also define some macros
that users can use to write overriding hooks.

> Then, I plan to add the following hooks to libgcc (in a different patch) :
>
> int set_mock_cpu_is (const char *cpu);
> int set_mock_cpu_supports (const char *isa);
> int init_mock_cpu (); // Clear the values of the mock cpu.
>
> With this support, here is how you can test for code coverage of the
> "sse" version and "default" version of foo in the above example:
>
> int main ()
> {
>   // Test SSE version.
>   if (__builtin_cpu_supports ("sse"))
>     {
>       init_mock_cpu();
>       set_mock_cpu_supports ("sse");
>       assert (foo () == 1);
>     }
>   // Test default version.
>   init_mock_cpu();
>   assert (foo () == 0);
> }
>
> Invoking a multiversioned binary several times with appropriate mock
> cpu values for the various ISAs and CPUs will give the complete code
> coverage desired. Of course, the underlying platform should be able to
> support the various features.

It is the other way around -- it simplifies unit test writing and
running -- one unit test just needs to be run on the same hardware
(with the most hw features) *ONCE* and all the versions can be
covered.

> Note that the above test will work only with -fmv-debug as the
> dispatcher must be invoked on every multiversioned call to be able to
> dynamically change the version.
>
> Multiple ISA features can be set in the mock cpu by calling
> "set_mock_cpu_supports" several times with different ISA names.
> Calling "init_mock_cpu" will clear all the values. "set_mock_cpu_is"
> will set the CPU type.

Just thought of another idea. Is it possible for the compiler to create
an alias for each version so that they can be accessed explicitly,
just like the use of :: ?

  if (__builtin_cpu_supports ("sse"))
    CHECK_RESULT (foo_sse (...));

  CHECK_RESULT (foo_default(...));

...

David

> This patch only includes the gcc changes. I will separately prepare a
> patch for the libgcc changes. Right now, since the libgcc changes are
> not available the two new mock cpu builtins check the real CPU like
> "__builtin_cpu_is" and "__builtin_cpu_supports".
>
> Patch attached. Please look at mv14_debug_code_coverage.C for an
> exhaustive example of testing for code coverage in the presence of
> multiple versions.
>
> Comments please.
>
> Thanks
> Sri
On Fri, Mar 15, 2013 at 3:37 PM, Xinliang David Li <davidxl@google.com> wrote:
> On Fri, Mar 15, 2013 at 2:55 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Invoking a multiversioned binary several times with appropriate mock
>> cpu values for the various ISAs and CPUs will give the complete code
>> coverage desired. Of course, the underlying platform should be able to
>> support the various features.
>
> It is the other way around -- it simplifies unit test writing and
> running -- one unit test just needs to be run on the same hardware
> (with the most hw features) *ONCE* and all the versions can be
> covered.

Yes, the test needs to run just once, potentially, if the test
platform can support all of the features.

> Just thought of another idea. Is it possible for the compiler to create
> an alias for each version so that they can be accessed explicitly,
> just like the use of :: ?
>
>   if (__builtin_cpu_supports ("sse"))
>     CHECK_RESULT (foo_sse (...));
>
>   CHECK_RESULT (foo_default(...));

This will work for this example. But, in general, this means changing
the call site of every multiversioned call and that can become
infeasible.

Thanks
Sri
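The exchange above contrasts two ways of reaching a specific version: mocking the CPU so that the per-call dispatcher picks it, or calling a per-version name directly. The following is only a rough sketch of both in plain C, not the patch's implementation. It assumes the proposed hook names `init_mock_cpu` and `set_mock_cpu_supports` (which did not exist in libgcc at the time), uses a hypothetical helper `mock_cpu_supports` in place of `__builtin_mock_cpu_supports`, and spells the versions with the hypothetical names `foo_default`, `foo_sse` and `foo_popcnt` because real multiversioning needs compiler support.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the proposed libgcc hooks.  The table size
   is an arbitrary choice made for this sketch.  */
#define MAX_MOCK_ISAS 8
static const char *mock_isas[MAX_MOCK_ISAS];
static int mock_isa_count;

/* Clear the values of the mock cpu.  */
int init_mock_cpu (void)
{
  mock_isa_count = 0;
  return 0;
}

/* Add one ISA feature to the mock cpu.  Returns -1 if the table is full.  */
int set_mock_cpu_supports (const char *isa)
{
  if (mock_isa_count == MAX_MOCK_ISAS)
    return -1;
  mock_isas[mock_isa_count++] = isa;
  return 0;
}

/* Hypothetical replacement for __builtin_mock_cpu_supports.  */
static int mock_cpu_supports (const char *isa)
{
  for (int i = 0; i < mock_isa_count; i++)
    if (strcmp (mock_isas[i], isa) == 0)
      return 1;
  return 0;
}

/* Explicitly named versions, in the spirit of the alias idea.  */
static int foo_default (void) { return 0; }
static int foo_sse (void) { return 1; }
static int foo_popcnt (void) { return 2; }

/* Dispatcher that is re-evaluated on every call, which is the behaviour
   described for -fmv-debug.  popcnt is checked before sse, matching the
   motivating example where a popcnt-capable CPU runs the popcnt version.  */
int foo (void)
{
  if (mock_cpu_supports ("popcnt"))
    return foo_popcnt ();
  if (mock_cpu_supports ("sse"))
    return foo_sse ();
  return foo_default ();
}
```

With the mock hooks the call site stays `foo ()` and only the test setup changes; with explicit names every call site under test has to name the version, which is the cost Sri points out.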
Ok. If the use case is to enable the test of the same application
binary (not the per function unit test) with CPU mocking at runtime
(via environment variable or application specific flags), the proposed
changes make sense.

David

On Fri, Mar 15, 2013 at 3:49 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> This will work for this example. But, in general, this means changing
> the call site of every multiversioned call and that can become
> infeasible.
>
> Thanks
> Sri
On Fri, Mar 15, 2013 at 10:55 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Hi,
>
> This patch is meant for google/gcc-4_7 but I want this to be
> considered for trunk when it opens again. This patch makes it easy to
> test for code coverage of multiversioned functions.
>
> First, I have introduced a new flag called -fmv-debug. With this flag,
> the function version dispatcher is invoked every time a call to foo ()
> is made. Without that flag, the version dispatch happens once at
> startup time via the IFUNC mechanism.
>
> Comments please.

Err. As we are using IFUNCs isn't it simply possible to do this in
the dynamic loader - for example by simply pre-loading a library
with the IFUNC relocators implemented differently? Thus, shouldn't
we simply provide such a library as a convenience?

Thanks,
Richard.

> Thanks
> Sri
Interesting idea about lazy IFUNC relocation.

David

On Mon, Mar 18, 2013 at 2:02 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> Err. As we are using IFUNCs isn't it simply possible to do this in
> the dynamic loader - for example by simply pre-loading a library
> with the IFUNC relocators implemented differently? Thus, shouldn't
> we simply provide such a library as a convenience?
>
> Thanks,
> Richard.
+cc libc-alpha

On Mon, Mar 18, 2013 at 9:05 AM, Xinliang David Li <davidxl@google.com> wrote:
> Interesting idea about lazy IFUNC relocation.

> On Mon, Mar 18, 2013 at 2:02 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:

>> On Fri, Mar 15, 2013 at 10:55 PM, Sriraman Tallam <tmsriram@google.com> wrote:

>>> This patch is meant for google/gcc-4_7 but I want this to be
>>> considered for trunk when it opens again. This patch makes it easy to
>>> test for code coverage of multiversioned functions. Here is a
>>> motivating example:

>> Err. As we are using IFUNCs isn't it simply possible to do this in
>> the dynamic loader - for example by simply pre-loading a library
>> with the IFUNC relocators implemented differently? Thus, shouldn't
>> we simply provide such a library as a convenience?

A similar need exists in glibc itself: it too has multiversioned functions,
and lack of testing has led to recent bugs in some of them.

HJ added a framework to test IFUNCs to glibc late last year, but it
would be nice to have a more general IFUNC control, so I could e.g. run
a binary on SSE4-capable machine A as that binary would run on SSE2-only
capable machine B.

(We've had a few bugs recently, where the crash would only show on machine
B and not A. These are a pain to debug, as I may not have access to B.)

If such a controller is implemented, I'd think it would have to be part
of GLIBC (or part of ld-linux itself), and not of libgcc.

  LD_CPU_FEATURES=sse,sse2 ./a.out # run as if only sse and sse2 are available

Thanks,
On Mon, Mar 18, 2013 at 10:02 AM, Paul Pluzhnikov <ppluzhnikov@google.com> wrote:
> If such a controller is implemented, I'd think it would have to be part
> of GLIBC (or part of ld-linux itself), and not of libgcc.
>
>   LD_CPU_FEATURES=sse,sse2 ./a.out # run as if only sse and sse2 are available
>

We can pass environment variables to the IFUNC selector. Maybe we can
enable it for debug builds.

"H.J. Lu" <hjl.tools@gmail.com> wrote:
>We can pass environment variables to the IFUNC selector. Maybe we can
>enable it for debug builds.

I was asking for the ifunc selector to be
overridable by LD_PRELOAD or a similar mechanism at dynamic load time.

Richard.
On Mon, Mar 18, 2013 at 10:18 AM, Richard Biener <richard.guenther@gmail.com> wrote:
> "H.J. Lu" <hjl.tools@gmail.com> wrote:

>>We can pass environment variables to IFUNC selector. Maybe we can
>>enable it for debug build.

Enabling this for just debug builds would not cover my use case.

If the environment variable is used at loader initialization time to
override CPUID output, then the runtime cost of that code would be
minuscule, and it can be available in production glibc builds.

> I was asking for the ifunc selector to be
> Overridable by ld_preload or a similar mechanism at dynamic load time.

Yes, that's how I understood you. I don't believe it would be easy to
implement such an interposer (if possible at all), and it would be very
much tied to glibc internals.

Overriding CPUID at loader initialization time sounds simpler (but I
haven't looked at the code yet :-).
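To make the environment-variable idea concrete, here is a minimal sketch of how a comma-separated feature list could be turned into a mask that restricts the detected features. It is not glibc code. The variable name `LD_CPU_FEATURES` is Paul's hypothetical from this thread, and the feature bits, the table, and the functions `parse_feature_list` and `effective_features` are invented for this illustration. Intersecting the override with the detected features is also an assumption, chosen so that a feature the hardware lacks can never be switched on.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature bits; a real loader would use its own CPUID data.  */
enum
{
  FEAT_SSE = 1 << 0,
  FEAT_SSE2 = 1 << 1,
  FEAT_SSE4_2 = 1 << 2,
  FEAT_POPCNT = 1 << 3
};

static const struct { const char *name; unsigned bit; } feature_table[] =
{
  { "sse", FEAT_SSE },
  { "sse2", FEAT_SSE2 },
  { "sse4.2", FEAT_SSE4_2 },
  { "popcnt", FEAT_POPCNT }
};

/* Parse a list such as "sse,sse2" into a bit mask.  Unknown names are
   ignored.  */
unsigned parse_feature_list (const char *list)
{
  unsigned mask = 0;
  while (*list != '\0')
    {
      size_t len = strcspn (list, ",");
      for (size_t i = 0; i < sizeof feature_table / sizeof feature_table[0]; i++)
        if (strlen (feature_table[i].name) == len
            && strncmp (feature_table[i].name, list, len) == 0)
          mask |= feature_table[i].bit;
      list += len;
      if (*list == ',')
        list++;
    }
  return mask;
}

/* Features the selectors should see: everything detected when there is no
   override, otherwise only the detected features that the override names.  */
unsigned effective_features (unsigned detected, const char *override_list)
{
  if (override_list == NULL)
    return detected;
  return detected & parse_feature_list (override_list);
}
```

A loader-side caller would do this once at initialization, for example `effective_features (detected, getenv ("LD_CPU_FEATURES"))` with `<stdlib.h>` included, so the cost is paid only at startup.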
On Mon, Mar 18, 2013 at 06:18:58PM +0100, Richard Biener wrote:
> I was asking for the ifunc selector to be
> Overridable by ld_preload or a similar mechanism at dynamic load time.

Please don't. Calling an ifunc resolver function in another library
is just asking for trouble with current glibc. Why? Well, the other
library containing the resolver function may not have had any dynamic
relocations applied. So if the resolver makes use of the GOT (to read
some variable), it will use unrelocated addresses. You'll segfault if
you're lucky.

For anyone playing with ifunc, please test out your great ideas on
i386, ppc32, mips, arm, etc. *NOT* x86_64 or powerpc64 which both
avoid the GOT in many cases.
Hi,

On Mon, Mar 18, 2013 at 10:44 PM, Alan Modra <amodra@gmail.com> wrote:
> On Mon, Mar 18, 2013 at 06:18:58PM +0100, Richard Biener wrote:
>> I was asking for the ifunc selector to be
>> Overridable by ld_preload or a similar mechanism at dynamic load time.
>
> Please don't. Calling an ifunc resolver function in another library
> is just asking for trouble with current glibc. Why? Well, the other
> library containing the resolver function may not have had any dynamic
> relocations applied. So if the resolver makes use of the GOT (to read
> some variable), it will use unrelocated addresses. You'll segfault if
> you're lucky.

Does this also mean that Paul's idea of doing:

  LD_CPU_FEATURES=sse,sse2 ./a.out # run as if only sse and sse2 are available

is fraught with risk when used with IFUNC, particularly on x86_64?

Shouldn't the IFUNC resolver go through the GOT even in this case?
This could work well for the MV testing problem I explained earlier,
but if this is not feasible with IFUNC in play I would like my
original proposal reconsidered.

Thanks
Sri

>
> For anyone playing with ifunc, please test out your great ideas on
> i386, ppc32, mips, arm, etc. *NOT* x86_64 or powerpc64 which both
> avoid the GOT in many cases.
>
> --
> Alan Modra
> Australia Development Lab, IBM
On Mon, Mar 25, 2013 at 02:24:21PM -0700, Sriraman Tallam wrote:
> Does this also mean that Paul's idea of doing:
>
> LD_CPU_FEATURES=sse,sse2 ./a.out # run as if only sse and sse2 are available
>
> is fraught with risk when used with IFUNC, particularly on x86_64?
>
> Shouldn't the IFUNC resolver go through the GOT even in this case.
> This could work well for the MV testing problem I explained earlier,
> but if this is not feasible with IFUNC in play I would like my
> original proposal reconsidered.

I haven't been following the thread so can't comment, sorry. I jumped
in on seeing Richard's suggestion re LD_PRELOAD, which is a bad idea
given glibc's current support for STT_GNU_IFUNC.

IFUNC as it stands is not a general purpose feature and interacts
badly with other features of ELF shared libraries. Trivial testcases
can easily be created that
1) won't work on any architecture.
   eg. shared library takes address of ifunc, ifunc resolver in main
   app, resolver uses variable in shared library.
2) only work on x86_64 and powerpc64.
   eg. shared library takes address of ifunc, ifunc resolver in main
   app which is PIC, resolver uses variable in main app.
3) won't work with LD_BIND_NOW=1.
   eg. either of the above examples, but the shared library doesn't
   take the address, it just calls the ifunc.

The reason for these problems is that ld.so makes a single pass over
dynamic relocations. In the simple case of LD_BIND_NOW=1 and an
application that uses a single shared library, relocations for the
library will be applied first, then relocations for the main app. So
if the shared library has relocations against symbols that turn out to
be ifunc, and the ifunc resolver lives in the main app, then the
resolver will run *before* the main app has been relocated. The
resolver had better not have code that requires relocation!

Of course, the obvious fix of making ld.so do two passes over
relocations, applying ifunc relocations on the second pass, is
somewhat counterproductive. Mostly ifunc is used to gain a speedup
when running on particular hardware. Two passes would have to slow
down application startup. Nonetheless, I believe that is the correct
solution if we want to make ifunc generally useful.

What we have at the moment requires quite a lot of care when using
ifunc. Accidentally writing code that only works on x86_64 or
powerpc64 is very easy, and might lead people to think you own shares
in Intel or IBM.
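The ordering problem described above can be illustrated with a toy model. This is only a simulation in plain C, not ld.so code: the types `struct reloc` and `struct object` and the functions `single_pass` and `two_pass` are invented for the sketch, an object counts as "relocated" only once all of its relocations have been processed, and resolvers living in the same object as the relocation are deliberately left out of the model.

```c
#include <assert.h>

/* One dynamic relocation.  For an ifunc relocation, resolver_home is the
   index of the object that contains the resolver function.  */
struct reloc
{
  int is_ifunc;
  int resolver_home;
};

struct object
{
  const struct reloc *relocs;
  int nrelocs;
  int relocated;
};

/* Single pass: objects are processed in order, library first.  Returns how
   many resolvers would run inside an object that is not yet relocated.  */
int single_pass (struct object *objs, int n)
{
  int unsafe = 0;
  for (int i = 0; i < n; i++)
    {
      for (int r = 0; r < objs[i].nrelocs; r++)
        {
          const struct reloc *rel = &objs[i].relocs[r];
          /* Same-object resolvers are not modeled here.  */
          if (rel->is_ifunc && rel->resolver_home != i
              && !objs[rel->resolver_home].relocated)
            unsafe++;
        }
      objs[i].relocated = 1;
    }
  return unsafe;
}

/* Two passes: ordinary relocations everywhere first, ifunc relocations
   afterwards, so every resolver runs in a relocated object.  */
int two_pass (struct object *objs, int n)
{
  int unsafe = 0;
  for (int i = 0; i < n; i++)
    objs[i].relocated = 1;
  for (int i = 0; i < n; i++)
    for (int r = 0; r < objs[i].nrelocs; r++)
      {
        const struct reloc *rel = &objs[i].relocs[r];
        if (rel->is_ifunc && !objs[rel->resolver_home].relocated)
          unsafe++;
      }
  return unsafe;
}
```

With a library at index 0 holding one ifunc relocation whose resolver lives in the main app at index 1, the single-pass walk reports one resolver running too early, while the two-pass walk reports none, at the price of walking every object twice.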
Index: cgraphunit.c =================================================================== --- cgraphunit.c (revision 196618) +++ cgraphunit.c (working copy) @@ -942,7 +942,12 @@ cgraph_analyze_function (struct cgraph_node *node) { tree resolver = NULL_TREE; gcc_assert (targetm.generate_version_dispatcher_body); - resolver = targetm.generate_version_dispatcher_body (node); + /* flag_mv_debug is 0 means that the dispatcher should be invoked + optimally (once using ifunc support). When flag_mv_debug is 1, + the dispatcher should be invoked every time a call to the + multiversioned function is made. */ + resolver + = targetm.generate_version_dispatcher_body (node, flag_mv_debug); gcc_assert (resolver != NULL_TREE); } } Index: common.opt =================================================================== --- common.opt (revision 196618) +++ common.opt (working copy) @@ -1600,6 +1600,10 @@ fmove-loop-invariants Common Report Var(flag_move_loop_invariants) Init(1) Optimization Move loop invariant computations out of loops +fmv-debug +Common RejectNegative Report Var(flag_mv_debug) Init(0) +Invoke the function version dispatcher for every multiversioned function call. + ftsan Common RejectNegative Report Var(flag_tsan) Add ThreadSanitizer instrumentation Index: doc/tm.texi =================================================================== --- doc/tm.texi (revision 196618) +++ doc/tm.texi (working copy) @@ -11032,11 +11032,13 @@ version at run-time. @var{decl} is one version fro identical versions. @end deftypefn -@deftypefn {Target Hook} tree TARGET_GENERATE_VERSION_DISPATCHER_BODY (void *@var{arg}) +@deftypefn {Target Hook} tree TARGET_GENERATE_VERSION_DISPATCHER_BODY (void *@var{arg}, int @var{debug_mode}) This hook is used to generate the dispatcher logic to invoke the right function version at run-time for a given set of function versions. @var{arg} points to the callgraph node of the dispatcher function whose -body must be generated. +body must be generated. 
When @var{debug_mode} is 1, the dispatcher +logic is invoked on every call. Otherwise, the dispatcher is invoked +only at start up to minimize call overhead. @end deftypefn @deftypefn {Target Hook} {const char *} TARGET_INVALID_WITHIN_DOLOOP (const_rtx @var{insn}) Index: doc/tm.texi.in =================================================================== --- doc/tm.texi.in (revision 196618) +++ doc/tm.texi.in (working copy) @@ -10908,7 +10908,9 @@ identical versions. This hook is used to generate the dispatcher logic to invoke the right function version at run-time for a given set of function versions. @var{arg} points to the callgraph node of the dispatcher function whose -body must be generated. +body must be generated. When @var{debug_mode} is 1, the dispatcher +logic is invoked on every call. Otherwise, the dispatcher is invoked +only at start up to minimize call overhead. @end deftypefn @hook TARGET_INVALID_WITHIN_DOLOOP Index: testsuite/g++.dg/ext/mv14_debug_code_coverage.C =================================================================== --- testsuite/g++.dg/ext/mv14_debug_code_coverage.C (revision 0) +++ testsuite/g++.dg/ext/mv14_debug_code_coverage.C (revision 0) @@ -0,0 +1,213 @@ +/* Test case to show how code coverage testing of a multiversioned function + can be done using cpu mocks. */ +/* { dg-do run { target i?86-*-* x86_64-*-* } } */ +/* { dg-options "-O2 -fmv-debug" } */ + +#include <assert.h> +#include <string.h> + +/* Temporary code till the libgcc hooks for this are checked in. Override + __builtin_mock_cpu_* builtins to change the mock cpu. */ +const char *mock_cpu = NULL; +int __builtin_mock_cpu_is (const char *cpu) +{ + if (strcmp (cpu, mock_cpu) == 0) + return 1; + return 0; +} + +/* Only mock one ISA type. The libgcc hooks will allow mocking multiple + ISA features together, like popcnt and avx2. 
*/ +const char *mock_isa = NULL; +int __builtin_mock_cpu_supports (const char *isa) +{ + if (strcmp (isa, mock_isa) == 0) + return 1; + return 0; +} +/* End of temporary code. */ + + +/* Default version. */ +int foo () __attribute__ ((target ("default"))); + +int foo () __attribute__ ((target ("mmx"))); +int foo () __attribute__ ((target ("sse"))); +int foo () __attribute__ ((target ("sse2"))); +int foo () __attribute__ ((target ("sse3"))); +int foo () __attribute__ ((target ("ssse3"))); +int foo () __attribute__ ((target ("sse4.1"))); +int foo () __attribute__ ((target ("sse4.2"))); +int foo () __attribute__ ((target ("popcnt"))); +int foo () __attribute__ ((target ("avx"))); +int foo () __attribute__ ((target ("avx2"))); + +int foo () __attribute__ ((target ("arch=corei7"))); + +int main () +{ + /* Using CPU mocks run each version of foo() when possible and + check the return value. */ + + /* Run Intel corei7 version if possible. Test if this + CPU can mock corei7. It should support SSE4.2 and + below, SSSE3 and MMX. */ + if (__builtin_cpu_supports ("sse4.2") + && __builtin_cpu_supports ("ssse3") + && __builtin_cpu_supports ("mmx")) + { + mock_cpu = "corei7"; + mock_isa = ""; + assert (foo () == 11); + } + + /* Run avx2 version if possible. */ + if (__builtin_cpu_supports ("avx2")) + { + mock_cpu = ""; + mock_isa = "avx2"; + assert (foo () == 1); + } + /* Run avx version if possible. */ + if (__builtin_cpu_supports ("avx")) + { + mock_cpu = ""; + mock_isa = "avx"; + assert (foo () == 2); + } + /* Run popcnt version if possible. */ + if (__builtin_cpu_supports ("popcnt")) + { + mock_cpu = ""; + mock_isa = "popcnt"; + assert (foo () == 3); + } + /* Run sse4.2 version if possible. */ + if (__builtin_cpu_supports ("sse4.2")) + { + mock_cpu = ""; + mock_isa = "sse4.2"; + assert (foo () == 4); + } + /* Run sse4.1 version if possible. 
*/ + if (__builtin_cpu_supports ("sse4.1")) + { + mock_cpu = ""; + mock_isa = "sse4.1"; + assert (foo () == 5); + } + /* Run ssse3 version if possible. */ + if (__builtin_cpu_supports ("ssse3")) + { + mock_cpu = ""; + mock_isa = "ssse3"; + assert (foo () == 6); + } + /* Run sse3 version if possible. */ + if (__builtin_cpu_supports ("sse3")) + { + mock_cpu = ""; + mock_isa = "sse3"; + assert (foo () == 7); + } + /* Run sse2 version if possible. */ + if (__builtin_cpu_supports ("sse2")) + { + mock_cpu = ""; + mock_isa = "sse2"; + assert (foo () == 8); + } + /* Run sse version if possible. */ + if (__builtin_cpu_supports ("sse")) + { + mock_cpu = ""; + mock_isa = "sse"; + assert (foo () == 9); + } + /* Run mmx version if possible. */ + if (__builtin_cpu_supports ("mmx")) + { + mock_cpu = ""; + mock_isa = "mmx"; + assert (foo () == 10); + } + + /* Run the default version. */ + mock_cpu = ""; + mock_isa = ""; + assert (foo () == 0); + + return 0; +} + +int __attribute__ ((target("default"))) +foo () +{ + return 0; +} + +int __attribute__ ((target("arch=corei7"))) +foo () +{ + return 11; +} + +int __attribute__ ((target("mmx"))) +foo () +{ + return 10; +} + +int __attribute__ ((target("sse"))) +foo () +{ + return 9; +} + +int __attribute__ ((target("sse2"))) +foo () +{ + return 8; +} + +int __attribute__ ((target("sse3"))) +foo () +{ + return 7; +} + +int __attribute__ ((target("ssse3"))) +foo () +{ + return 6; +} + +int __attribute__ ((target("sse4.1"))) +foo () +{ + return 5; +} + +int __attribute__ ((target("sse4.2"))) +foo () +{ + return 4; +} + +int __attribute__ ((target("popcnt"))) +foo () +{ + return 3; +} + +int __attribute__ ((target("avx"))) +foo () +{ + return 2; +} + +int __attribute__ ((target("avx2"))) +foo () +{ + return 1; +} Index: testsuite/g++.dg/ext/mv2_debug.C =================================================================== --- testsuite/g++.dg/ext/mv2_debug.C (revision 0) +++ testsuite/g++.dg/ext/mv2_debug.C (revision 0) @@ -0,0 +1,4 @@ +/* Test 
case to check if mv2.C works with -fmv-debug additionally added. */ +/* { dg-do run { target i?86-*-* x86_64-*-* } } */ +/* { dg-options "-O2 -fmv-debug" } */ +/* { dg-additional-sources "mv2.C" } */ Index: testsuite/g++.dg/ext/mv6_debug.C =================================================================== --- testsuite/g++.dg/ext/mv6_debug.C (revision 0) +++ testsuite/g++.dg/ext/mv6_debug.C (revision 0) @@ -0,0 +1,4 @@ +/* Test case to check if mv6.C works with -fmv-debug additionally added. */ +/* { dg-do run { target i?86-*-* x86_64-*-* } } */ +/* { dg-options "-march=x86-64 -fmv-debug" } */ +/* { dg-additional-sources "mv6.C" } */ Index: testsuite/g++.dg/ext/mv1_debug.C =================================================================== --- testsuite/g++.dg/ext/mv1_debug.C (revision 0) +++ testsuite/g++.dg/ext/mv1_debug.C (revision 0) @@ -0,0 +1,4 @@ +/* Test case to check if mv1.C works with -fmv-debug additionally added. */ +/* { dg-do run { target i?86-*-* x86_64-*-* } } */ +/* { dg-options "-O2 -fPIC -fmv-debug" } */ +/* { dg-additional-sources "mv1.C" } */ Index: config/i386/i386.c =================================================================== --- config/i386/i386.c (revision 196618) +++ config/i386/i386.c (working copy) @@ -26173,6 +26173,11 @@ enum ix86_builtins IX86_BUILTIN_CPU_IS, IX86_BUILTIN_CPU_SUPPORTS, + /* Builtins to mock CPU and ISA features, for + testing multiversioned functions. */ + IX86_BUILTIN_MOCK_CPU_IS, + IX86_BUILTIN_MOCK_CPU_SUPPORTS, + IX86_BUILTIN_MAX }; @@ -28001,11 +28006,14 @@ ix86_slow_unaligned_vector_memop (void) to return a pointer to VERSION_DECL if the outcome of the expression formed by PREDICATE_CHAIN is true. This function will be called during version dispatch to decide which function version to execute. It returns - the basic block at the end, to which more conditions can be added. */ + the basic block at the end, to which more conditions can be added. 
When + DEBUG_MODE is 1, the version dispatcher is invoked for every call + to the multiversioned function. */ static basic_block add_condition_to_bb (tree function_decl, tree version_decl, - tree predicate_chain, basic_block new_bb) + tree predicate_chain, basic_block new_bb, + int debug_mode) { gimple return_stmt; tree convert_expr, result_var; @@ -28026,11 +28034,43 @@ add_condition_to_bb (tree function_decl, tree vers gcc_assert (new_bb != NULL); gseq = bb_seq (new_bb); + /* If debug_mode is true, generate a call to the versioned function + and return the output of the call. Otherwise, return a pointer to + the versioned function. */ - convert_expr = build1 (CONVERT_EXPR, ptr_type_node, - build_fold_addr_expr (version_decl)); - result_var = create_tmp_var (ptr_type_node, NULL); - convert_stmt = gimple_build_assign (result_var, convert_expr); + if (debug_mode) + { + tree arg; + tree ret_type = TREE_TYPE (TREE_TYPE (function_decl)); + VEC (tree, heap) *vec = NULL; + vec = VEC_alloc (tree, heap, 2); + + arg = DECL_ARGUMENTS (function_decl); + + while (arg) + { + VEC_safe_push (tree, heap, vec, arg); + arg = DECL_CHAIN (arg); + } + + convert_stmt = gimple_build_call_vec (version_decl, vec); + VEC_free (tree, heap, vec); + result_var = NULL; + + if (ret_type != void_type_node) + { + result_var = DECL_RESULT (function_decl); + gimple_call_set_lhs (convert_stmt, result_var); + } + } + else + { + convert_expr = build1 (CONVERT_EXPR, ptr_type_node, + build_fold_addr_expr (version_decl)); + result_var = DECL_RESULT (function_decl); + convert_stmt = gimple_build_assign (result_var, convert_expr); + } + return_stmt = gimple_build_return (result_var); if (predicate_chain == NULL_TREE) @@ -28112,10 +28152,11 @@ add_condition_to_bb (tree function_decl, tree vers the right builtin to use to match the platform specification. It returns the priority value for this version decl. 
If PREDICATE_LIST is not NULL, it stores the list of cpu features that need to be checked - before dispatching this function. */ + before dispatching this function. When debug_mode is 1, use the mock + cpu check builtins to do the dispatch. */ static unsigned int -get_builtin_code_for_version (tree decl, tree *predicate_list) +get_builtin_code_for_version (tree decl, tree *predicate_list, int debug_mode) { tree attrs; struct cl_target_option cur_target; @@ -28254,7 +28295,10 @@ static unsigned int if (predicate_list) { - predicate_decl = ix86_builtins [(int) IX86_BUILTIN_CPU_IS]; + if (debug_mode) + predicate_decl = ix86_builtins [(int) IX86_BUILTIN_MOCK_CPU_IS]; + else + predicate_decl = ix86_builtins [(int) IX86_BUILTIN_CPU_IS]; /* For a C string literal the length includes the trailing NULL. */ predicate_arg = build_string_literal (strlen (arg_str) + 1, arg_str); predicate_chain = tree_cons (predicate_decl, predicate_arg, @@ -28266,8 +28310,12 @@ static unsigned int tok_str = (char *) xmalloc (strlen (attrs_str) + 1); strcpy (tok_str, attrs_str); token = strtok (tok_str, ","); - predicate_decl = ix86_builtins [(int) IX86_BUILTIN_CPU_SUPPORTS]; + if (debug_mode) + predicate_decl = ix86_builtins [(int) IX86_BUILTIN_MOCK_CPU_SUPPORTS]; + else + predicate_decl = ix86_builtins [(int) IX86_BUILTIN_CPU_SUPPORTS]; + while (token != NULL) { /* Do not process "arch=" */ @@ -28329,8 +28377,8 @@ static unsigned int static int ix86_compare_version_priority (tree decl1, tree decl2) { - unsigned int priority1 = get_builtin_code_for_version (decl1, NULL); - unsigned int priority2 = get_builtin_code_for_version (decl2, NULL); + unsigned int priority1 = get_builtin_code_for_version (decl1, NULL, false); + unsigned int priority2 = get_builtin_code_for_version (decl2, NULL, false); return (int)priority1 - (int)priority2; } @@ -28357,12 +28405,15 @@ feature_compare (const void *v1, const void *v2) multi-versioned functions. 
DISPATCH_DECL is the function which will contain the dispatch logic. FNDECLS are the function choices for dispatch, and is a tree chain. EMPTY_BB is the basic block pointer - in DISPATCH_DECL in which the dispatch code is generated. */ + in DISPATCH_DECL in which the dispatch code is generated. When + DEBUG_MODE is 1, the version dispatcher is invoked for every call + to the multiversioned function. */ static int dispatch_function_versions (tree dispatch_decl, void *fndecls_p, - basic_block *empty_bb) + basic_block *empty_bb, + int debug_mode) { tree default_decl; gimple ifunc_cpu_init_stmt; @@ -28420,8 +28471,8 @@ dispatch_function_versions (tree dispatch_decl, /* Get attribute string, parse it and find the right predicate decl. The predicate function could be a lengthy combination of many features, like arch-type and various isa-variants. */ - priority = get_builtin_code_for_version (version_decl, - &predicate_chain); + priority = get_builtin_code_for_version (version_decl, &predicate_chain, + debug_mode); if (predicate_chain == NULL_TREE) continue; @@ -28444,11 +28495,11 @@ dispatch_function_versions (tree dispatch_decl, *empty_bb = add_condition_to_bb (dispatch_decl, function_version_info[i].version_decl, function_version_info[i].predicate_chain, - *empty_bb); + *empty_bb, debug_mode); /* dispatch default version at the end. */ *empty_bb = add_condition_to_bb (dispatch_decl, default_decl, - NULL, *empty_bb); + NULL, *empty_bb, debug_mode); free (function_version_info); return 0; @@ -28813,8 +28864,19 @@ ix86_get_function_versions_dispatcher (void *decl) default_node = default_version_info->this_node; + + /* Right now, the dispatching at startup non-debug mode is done via ifunc. */ #if defined (ASM_OUTPUT_TYPE_DIRECTIVE) && HAVE_GNU_INDIRECT_FUNCTION - /* Right now, the dispatching is done via ifunc. 
*/ +#else + if (!debug_mode) + { + error_at (DECL_SOURCE_LOCATION (default_node->symbol.decl), + "multiversioning needs ifunc which is not supported " + "in this configuration"); + return NULL; + } +#endif + dispatch_decl = make_dispatcher_decl (default_node->decl); dispatcher_node = cgraph_get_create_node (dispatch_decl); @@ -28832,11 +28894,7 @@ ix86_get_function_versions_dispatcher (void *decl) it_v->dispatcher_resolver = dispatch_decl; it_v = it_v->next; } -#else - error_at (DECL_SOURCE_LOCATION (default_node->symbol.decl), - "multiversioning needs ifunc which is not supported " - "in this configuration"); -#endif + return dispatch_decl; } @@ -28861,15 +28919,19 @@ make_attribute (const char *name, const char *arg_ /* Make the resolver function decl to dispatch the versions of a multi-versioned function, DEFAULT_DECL. Create an empty basic block in the resolver and store the pointer in - EMPTY_BB. Return the decl of the resolver function. */ + EMPTY_BB. Return the decl of the resolver function. When + DEBUG_MODE is 1, the resolver function body is not an + ifunc resolver; it simply calls the appropriate function + version and returns the call output. */ static tree make_resolver_func (const tree default_decl, const tree dispatch_decl, - basic_block *empty_bb) + basic_block *empty_bb, + int debug_mode) { char *resolver_name; - tree decl, type, decl_name, t; + tree decl, type, decl_name, t = NULL; bool is_uniq = false; /* IFUNC's have to be globally visible. So, if the default_decl is @@ -28884,8 +28946,19 @@ make_resolver_func (const tree default_decl, another module which is based on the same version name. */ resolver_name = make_name (default_decl, "resolver", is_uniq); - /* The resolver function should return a (void *). */ - type = build_function_type_list (ptr_type_node, NULL_TREE); + if (debug_mode) + { + /* In debug_mode, the resolver function calls the appropriate + function version. Its type is same as dispatch_decl. 
*/ + tree fn_type = TREE_TYPE (dispatch_decl); + type = build_function_type (TREE_TYPE (fn_type), + TYPE_ARG_TYPES (fn_type)); + } + else + { + /* The resolver function should return a (void *). */ + type = build_function_type_list (ptr_type_node, NULL_TREE); + } decl = build_fn_decl (resolver_name, type); decl_name = get_identifier (resolver_name); @@ -28907,6 +28980,16 @@ make_resolver_func (const tree default_decl, DECL_INITIAL (decl) = make_node (BLOCK); DECL_STATIC_CONSTRUCTOR (decl) = 0; + /* In debug_mode, the resolver function is not an ifunc resolver. Its + signature is the same as the dispatch_decl or default_decl. */ + if (debug_mode) + { + tree arg; + DECL_ARGUMENTS (decl) = copy_list (DECL_ARGUMENTS (default_decl)); + for (arg = DECL_ARGUMENTS (decl); arg ; arg = DECL_CHAIN (arg)) + DECL_CONTEXT (arg) = decl; + } + if (DECL_COMDAT_GROUP (default_decl) || TREE_PUBLIC (default_decl)) { @@ -28917,7 +29000,9 @@ make_resolver_func (const tree default_decl, make_decl_one_only (decl, DECL_ASSEMBLER_NAME (decl)); } /* Build result decl and add to function_decl. */ - t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, ptr_type_node); + t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, + TREE_TYPE (TREE_TYPE (decl))); + DECL_ARTIFICIAL (t) = 1; DECL_IGNORED_P (t) = 1; DECL_RESULT (decl) = t; @@ -28932,9 +29017,17 @@ make_resolver_func (const tree default_decl, pop_cfun (); gcc_assert (dispatch_decl != NULL); - /* Mark dispatch_decl as "ifunc" with resolver as resolver_name. */ - DECL_ATTRIBUTES (dispatch_decl) - = make_attribute ("ifunc", resolver_name, DECL_ATTRIBUTES (dispatch_decl)); + + /* Mark dispatch_decl as "alias" or "ifunc" with resolver as + resolver_name. 
*/ + if (debug_mode) + DECL_ATTRIBUTES (dispatch_decl) + = make_attribute ("alias", resolver_name, + DECL_ATTRIBUTES (dispatch_decl)); + else + DECL_ATTRIBUTES (dispatch_decl) + = make_attribute ("ifunc", resolver_name, + DECL_ATTRIBUTES (dispatch_decl)); /* Create the alias for dispatch to resolver here. */ /*cgraph_create_function_alias (dispatch_decl, decl);*/ @@ -28946,10 +29039,13 @@ make_resolver_func (const tree default_decl, /* Generate the dispatching code body to dispatch multi-versioned function DECL. The target hook is called to process the "target" attributes and provide the code to dispatch the right function at run-time. NODE points - to the dispatcher decl whose body will be created. */ + to the dispatcher decl whose body will be created. When DEBUG_MODE is + 1, the dispatch checks should be made during every call to the versioned + function. When DEBUG_MODE is 0, ifunc based dispatching is used to + keep the call overhead small. */ static tree -ix86_generate_version_dispatcher_body (void *node_p) +ix86_generate_version_dispatcher_body (void *node_p, int debug_mode) { tree resolver_decl; basic_block empty_bb; @@ -28976,8 +29072,8 @@ static tree /* node is going to be an alias, so remove the finalized bit. 
*/ node->local.finalized = false; - resolver_decl = make_resolver_func (default_ver_decl, - node->decl, &empty_bb); + resolver_decl = make_resolver_func (default_ver_decl, node->decl, + &empty_bb, debug_mode); node_version_info->dispatcher_resolver = resolver_decl; @@ -29000,7 +29096,8 @@ static tree VEC_safe_push (tree, heap, fn_ver_vec, versn->decl); } - dispatch_function_versions (resolver_decl, fn_ver_vec, &empty_bb); + dispatch_function_versions (resolver_decl, fn_ver_vec, + &empty_bb, debug_mode); VEC_free (tree, heap, fn_ver_vec); rebuild_cgraph_edges (); pop_cfun (); @@ -29185,7 +29282,8 @@ fold_builtin_cpu (tree fndecl, tree *args) gcc_assert (param_string_cst); - if (fn_code == IX86_BUILTIN_CPU_IS) + if (fn_code == IX86_BUILTIN_CPU_IS + || fn_code == IX86_BUILTIN_MOCK_CPU_IS) { tree ref; tree field; @@ -29234,7 +29332,8 @@ fold_builtin_cpu (tree fndecl, tree *args) build_int_cstu (unsigned_type_node, field_val)); return build1 (CONVERT_EXPR, integer_type_node, final); } - else if (fn_code == IX86_BUILTIN_CPU_SUPPORTS) + else if (fn_code == IX86_BUILTIN_CPU_SUPPORTS + || fn_code == IX86_BUILTIN_MOCK_CPU_SUPPORTS) { tree ref; tree array_elt; @@ -29288,7 +29387,9 @@ ix86_fold_builtin (tree fndecl, int n_args, enum ix86_builtins fn_code = (enum ix86_builtins) DECL_FUNCTION_CODE (fndecl); if (fn_code == IX86_BUILTIN_CPU_IS - || fn_code == IX86_BUILTIN_CPU_SUPPORTS) + || fn_code == IX86_BUILTIN_CPU_SUPPORTS + || fn_code == IX86_BUILTIN_MOCK_CPU_IS + || fn_code == IX86_BUILTIN_MOCK_CPU_SUPPORTS) { gcc_assert (n_args == 1); return fold_builtin_cpu (fndecl, args); @@ -29334,6 +29435,13 @@ ix86_init_platform_type_builtins (void) INT_FTYPE_PCCHAR, true); make_cpu_type_builtin ("__builtin_cpu_supports", IX86_BUILTIN_CPU_SUPPORTS, INT_FTYPE_PCCHAR, true); + /* Create builtins that mock cpu type and isa features. This is meant to + be used for code coverage testing of multiversioned functions. 
*/ + make_cpu_type_builtin ("__builtin_mock_cpu_is", IX86_BUILTIN_MOCK_CPU_IS, + INT_FTYPE_PCCHAR, false); + make_cpu_type_builtin ("__builtin_mock_cpu_supports", + IX86_BUILTIN_MOCK_CPU_SUPPORTS, + INT_FTYPE_PCCHAR, false); } /* Internal method for ix86_init_builtins. */ @@ -31050,6 +31158,8 @@ ix86_expand_builtin (tree exp, rtx target, rtx sub call_expr = build_call_expr (fndecl, 0); return expand_expr (call_expr, target, mode, EXPAND_NORMAL); } + case IX86_BUILTIN_MOCK_CPU_IS: + case IX86_BUILTIN_MOCK_CPU_SUPPORTS: case IX86_BUILTIN_CPU_IS: case IX86_BUILTIN_CPU_SUPPORTS: { Index: target.def =================================================================== --- target.def (revision 196618) +++ target.def (working copy) @@ -1271,11 +1271,12 @@ DEFHOOK /* Target hook is used to generate the dispatcher logic to invoke the right function version at run-time for a given set of function versions. ARG points to the callgraph node of the dispatcher function whose body - must be generated. */ + must be generated. The version dispatcher is invoked on every call when + debug_mode is 1. */ DEFHOOK (generate_version_dispatcher_body, "", - tree, (void *arg), NULL) + tree, (void *arg, int debug_mode), NULL) /* Target hook is used to get the dispatcher function for a set of function versions. The dispatcher function is called to invoke the right function