Message ID | 20211109190137.1107736-1-siddhesh@gotplt.org |
---|---|
Headers | show |
Series | __builtin_dynamic_object_size | expand |
On Wed, Nov 10, 2021 at 12:31:26AM +0530, Siddhesh Poyarekar wrote: > - Instead of bailing out on non-constant sizes with > __builtin_object_size, it should be possible to use ranger to > get an upper and lower bound on the size expression and use that to > implement __builtin_object_size. I'd prefer not to. One thing is that VRP heavily relies on UB not happening in the program, while __bos is typically used to catch UB in those programs. And, I'm afraid fairly often VRP would result in runtime tests for limits that aren't useful for security at all. Say VRP figures out that some integer isn't negative or doesn't have MSB set etc., suddenly we have range of [0, INT_MAX] or similar and making that imply __builtin_object_size INT_MAX rather than ~(size_t) 0 would mean we need to use __*_chk and compare at runtime, even when it is very unlikely an object would be that big... The compiler computes some range, but there is not information on how much the range actually maps to the actual range the program is using, or when it is some much larger superset of the actual range (same problem is with Martin's warnings BTW). Some unrelated inlined function can perform some comparison just in case, perhaps some jump threading is done and all of sudden there is non-VARYING range. Jakub
On Tue, Nov 9, 2021 at 11:02 AM Siddhesh Poyarekar <siddhesh@gotplt.org> wrote: > > > This patchset implements the __builtin_dynamic_object_size builtin for > gcc. The primary motivation to have this builtin in gcc is to enable > _FORTIFY_SOURCE=3 support with gcc, thus allowing greater fortification > in use cases where the potential performance tradeoff is acceptable. > > Semantics: > ---------- > > __builtin_dynamic_object_size has the same signature as > __builtin_object_size; it accepts a pointer and type ranging from 0 to 3 > and it returns an object size estimate for the pointer based on an > analysis of which objects the pointer could point to. The actual > properties of the object size estimate are different: > > - In the best case __builtin_dynamic_object_size evaluates to an > expression that represents a precise size of the object being pointed > to. > > - In case a precise object size expression cannot be evaluated, > __builtin_dynamic_object_size attempts to evaluate an estimate size > expression based on the object size type. > > - In what situations the builtin returns an estimate vs a precise > expression is an implementation detail and may change in future. > Users must always assume, as in the case of __builtin_object_size, that > the returned value is the maximum or minimum based on the object size > type they have provided. > > - In the worst case of failure, __builtin_dynamic_object_size returns a > constant (size_t)-1 or (size_t)0. > > Implementation: > --------------- > > - The __builtin_dynamic_object_size support is implemented in > tree-object-size. In most cases the first pass (early_objsz) is able > to evaluate object size expressions. The second phase mainly ends up > simplifying the __builtin_dynamic_object_size to > __builtin_object_size. > > - The patchset begins with structural modification of the > tree-object-size pass, followed by enhancement to return size > expressions. I have split the implementation into one feature per > patch (calls, function parameters, PHI, etc.) to hopefully ease > review. > > Performance: > ------------ > > Expressions generated by this pass in theory could be arbitrarily > complex. I have not made an attempt to limit nesting of objects since > it seemed too early to do that. In practice based on the few > applications I built, most of the complexity of the expressions got > folded away. Even so, the performance overhead is likely to be > non-zero. If we find performance degradation to be significant, we > could later add nesting limits to bail out if a size expression gets too > complex. > > I have also not implemented simplification of __*_chk to their normal > variants if we can determine at compile time that it is safe, which > still depends on the object size to be constant. I hope to do this as a > minor performance improvement in stage 3. > > Build time performance doesn't seem to be affected much based on an > unscientific check to time > `make check-gcc RUNTESTFLAGS="dg.exp=builtin*"`. It only increases by > about a couple of seconds when the dynamic tests are added and remains > more or less in the same ballpark otherwise. > > Testing: > -------- > > I have added tests for dynamic object sizes as well as wrappers for all > __builtin_object_size tests to provide wide coverage. With that in > place I have run full bootstrap builds on Fedora rawhide by backporting the > patches to the gcc11 branch and x86_64 and i686 have no failures in any > of the builtin-*object-size* tests and no new failures. > > I have also built bash, cmake, zfs-fuse and systemtap with > _FORTIFY_SOURCE=3 (with a hacked up glibc to make sure it works) and saw > no issues in any of those builds. I did some rudimentary analysis of > the generated binaries to see if there was any difference in coverage > and found that there was. In terms of pure numbers, there were far more > _chk variants of calls than the regular ones due to _FORTIFY_SOURCE > (from about 4% to 70% in bash), but that could well be due to the _chk > variants not being folded into regular variants when safe. However, on > manual inspection of some of these sites, it was clear that coverage was > increasing significantly where __builtin_object_size was previously > bailing out. > > Specifically for bash, the coverage went from 30.81% to 86.87% (it was > 84.5% with the v1 patch). I actually hope to reduce this a bit with > folding improvements for __builtin___memcpy_chk, etc. > > A bootstrap build is in progress on x86_64. > > Limitations/Future work: > ------------------------ > > - The most immediate work is to fold _chk variants of builtins into > regular ones when it can be proven at compile time that the object > size will alwasy be less than the length of the write. I am working > on it right now. > > - I need to enable _FORTIFY_SOURCE=3 for gcc in glibc; currently it is > llvm-only. I've started working on these patches too on the side. > > - Instead of bailing out on non-constant sizes with > __builtin_object_size, it should be possible to use ranger to > get an upper and lower bound on the size expression and use that to > implement __builtin_object_size. When I was implementing improvements into phiopt I ran into case where objsz would fail now because we get: tmp = PHI <CST0, CST1> ptr = ptr + tmp where before the pointer plus was inside each branch instead. So my question is there any progress on implementing objsz with ranger or has that work been put off? I filed https://gcc.gnu.org/PR116556 for this. Do you have any start of patches for this so it maybe it could be taken to finish? If not then I am going to try to implement it since it blocks my work on phiopt. Thanks, Andrew Pinski > > - More work could to be done to reduce the performance impact of the > computation. One way could be to add a heuristic where the pass keeps > track of nesting in the expression and either bail out or compute an > estimate if nesting crosses a threshold. I'll take this up once we > have more data on the nature of the bottlenecks. > > > Siddhesh Poyarekar (10): > tree-object-size: Replace magic numbers with enums > tree-object-size: Abstract object_sizes array > tree-object-size: Use tree instead of HOST_WIDE_INT > tree-object-size: Single pass dependency loop resolution > __builtin_dynamic_object_size: Recognize builtin > tree-object-size: Support dynamic sizes in conditions > tree-object-size: Handle function parameters > tree-object-size: Handle GIMPLE_CALL > tree-object-size: Dynamic sizes for ADDR_EXPR > tree-object-size: Handle dynamic offsets > > gcc/builtins.c | 22 +- > gcc/builtins.def | 1 + > gcc/doc/extend.texi | 13 + > gcc/gimple-fold.c | 9 +- > .../g++.dg/ext/builtin-dynamic-object-size1.C | 5 + > .../g++.dg/ext/builtin-dynamic-object-size2.C | 5 + > .../gcc.dg/builtin-dynamic-alloc-size.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-0.c | 463 +++++ > .../gcc.dg/builtin-dynamic-object-size-1.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-10.c | 9 + > .../gcc.dg/builtin-dynamic-object-size-11.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-12.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-13.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-14.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-15.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-16.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-17.c | 8 + > .../gcc.dg/builtin-dynamic-object-size-18.c | 8 + > .../gcc.dg/builtin-dynamic-object-size-19.c | 104 + > .../gcc.dg/builtin-dynamic-object-size-2.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-3.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-4.c | 7 + > .../gcc.dg/builtin-dynamic-object-size-5.c | 8 + > .../gcc.dg/builtin-dynamic-object-size-6.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-7.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-8.c | 5 + > .../gcc.dg/builtin-dynamic-object-size-9.c | 5 + > gcc/testsuite/gcc.dg/builtin-object-size-1.c | 160 +- > gcc/testsuite/gcc.dg/builtin-object-size-16.c | 2 + > gcc/testsuite/gcc.dg/builtin-object-size-17.c | 2 + > gcc/testsuite/gcc.dg/builtin-object-size-2.c | 134 ++ > gcc/testsuite/gcc.dg/builtin-object-size-3.c | 151 ++ > gcc/testsuite/gcc.dg/builtin-object-size-4.c | 99 + > gcc/testsuite/gcc.dg/builtin-object-size-5.c | 12 + > gcc/tree-object-size.c | 1766 +++++++++++------ > gcc/tree-object-size.h | 3 +- > gcc/ubsan.c | 46 +- > 37 files changed, 2499 insertions(+), 620 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/ext/builtin-dynamic-object-size1.C > create mode 100644 gcc/testsuite/g++.dg/ext/builtin-dynamic-object-size2.C > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-alloc-size.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-1.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-10.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-11.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-12.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-13.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-14.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-15.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-16.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-17.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-18.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-19.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-2.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-3.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-4.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-5.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-6.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-7.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-8.c > create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-9.c > > -- > 2.31.1 >
On 2024-09-08 18:01, Andrew Pinski wrote: > When I was implementing improvements into phiopt I ran into case where > objsz would fail now because we get: > tmp = PHI <CST0, CST1> > ptr = ptr + tmp > > where before the pointer plus was inside each branch instead. So my > question is there any progress on implementing objsz with ranger or > has that work been put off? > I filed https://gcc.gnu.org/PR116556 for this. Do you have any start Sorry I noticed it but couldn't get time to work on it. __bos should trivially handle this I think, I'm surprised that it doesn't. As far as using ranger in __bos is concerned, I believe there was strong opposition to the idea and instead to continue evaluating sizes from first principles like we currently do in the tree-object-size pass. I think the concern was that ranger can be more aggressive in evaluation and assume no undefined behaviour whereas __bos tries to detect undefined behaviour whenever it can and is more conservative about computations. > of patches for this so it maybe it could be taken to finish? If not > then I am going to try to implement it since it blocks my work on > phiopt. I'll be happy to review any changes you make to tree-object-size to fix this. Thanks, Sid