Message ID | 20220702113331.2003820-6-afaria@redhat.com |
---|---|
State | New |
Headers | show |
Series | Introduce an extensible static analyzer | expand |
On 02/07/2022 08:33, Alberto Faria wrote: Alberto, hello. I was testing this patch as follows: ./static-analyzer.py build target/ppc/mmu-hash64.c <snip> > @@ -627,9 +744,31 @@ def is_coroutine_fn(node: Cursor) -> bool: > else: > break > > - return node.kind == CursorKind.FUNCTION_DECL and is_annotated_with( > - node, "coroutine_fn" > - ) > + if node.kind in [CursorKind.DECL_REF_EXPR, CursorKind.MEMBER_REF_EXPR]: > + node = node.referenced > + > + # --- > + > + if node.kind == CursorKind.FUNCTION_DECL: And I receive an exception on the line above saying that node is of type NoneType. Seems that `node = node.referenced` is setting `node` to None in this case. I was unable to understand the root cause of it. Is this an incorrect usage of the tool from my part? Full error message below Traceback (most recent call last): File "./static-analyzer.py", line 327, in analyze_translation_unit checker(tu, context.absolute_path, log) File "./static-analyzer.py", line 613, in check_coroutine_pointers and is_coroutine_fn(right) File "./static-analyzer.py", line 781, in is_coroutine_fn if node.kind == CursorKind.FUNCTION_DECL: AttributeError: 'NoneType' object has no attribute 'kind' The above exception was the direct cause of the following exception: Traceback (most recent call last): File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker result = (True, func(*args, **kwds)) File "/usr/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar return list(map(*args)) File "./static-analyzer.py", line 329, in analyze_translation_unit raise RuntimeError(f"Error analyzing {relative_path}") from e RuntimeError: Error analyzing target/ppc/mmu-hash64.c """ The above exception was the direct cause of the following exception: Traceback (most recent call last): File "./static-analyzer.py", line 893, in <module> main() File "./static-analyzer.py", line 123, in main analyze_translation_units(args, contexts) File "./static-analyzer.py", line 240, in analyze_translation_units results = pool.map(analyze_translation_unit, contexts) File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map return self._map_async(func, iterable, mapstar, chunksize).get() File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get raise self._value RuntimeError: Error analyzing target/ppc/mmu-hash64.c > + return is_annotated_with(node, "coroutine_fn") > + > + if node.kind in [ > + CursorKind.FIELD_DECL, > + CursorKind.VAR_DECL, > + CursorKind.PARM_DECL, > + ]: > + > + if is_annotated_with(node, "coroutine_fn"): > + return True > + > + # TODO: If type is typedef or pointer to typedef, follow typedef. > + > + return False > + > + if node.kind == CursorKind.TYPEDEF_DECL: > + return is_annotated_with(node, "coroutine_fn") > + > + return False Best regards,
Hi Víctor, On Mon, Jul 4, 2022 at 3:18 PM Víctor Colombo <victor.colombo@eldorado.org.br> wrote: > And I receive an exception on the line above saying that node is of type > NoneType. Seems that `node = node.referenced` is setting `node` to None > in this case. > > I was unable to understand the root cause of it. Is this an incorrect > usage of the tool from my part? Full error message below Unfortunately there seem to be a lot of corner cases that libclang can throw at us. I hadn't come across this one before. I expected that DECL_REF_EXPR/MEMBER_REF_EXPR would always reference something. This may be due to some build error -- libclang tries to continue processing a translation unit by dropping subtrees or nodes that have problems. Is there a "too many errors emitted, stopping now; this may lead to false positives and negatives" line at the top of the script's output? Thanks, Alberto
On 04/07/2022 13:57, Alberto Faria wrote: > Hi Víctor, > > On Mon, Jul 4, 2022 at 3:18 PM Víctor Colombo > <victor.colombo@eldorado.org.br> wrote: >> And I receive an exception on the line above saying that node is of type >> NoneType. Seems that `node = node.referenced` is setting `node` to None >> in this case. >> >> I was unable to understand the root cause of it. Is this an incorrect >> usage of the tool from my part? Full error message below > > Unfortunately there seem to be a lot of corner cases that libclang can > throw at us. I hadn't come across this one before. I expected that > DECL_REF_EXPR/MEMBER_REF_EXPR would always reference something. > > This may be due to some build error -- libclang tries to continue > processing a translation unit by dropping subtrees or nodes that have > problems. Is there a "too many errors emitted, stopping now; this may > lead to false positives and negatives" line at the top of the script's > output? > Yes, this line is present at the beginning of the output Is this caused by problems with the code being analyzed or is it because libclang is getting confused with something that is outside of our control?
On Mon, Jul 4, 2022 at 6:46 PM Víctor Colombo <victor.colombo@eldorado.org.br> wrote: > Yes, this line is present at the beginning of the output > Is this caused by problems with the code being analyzed or is it because > libclang is getting confused with something that is outside of our > control? I think I found the problem: the commands in the compilation database weren't being parsed properly. I switched to shlex.split() and it seems to be working now. The WIP v2 is available at [1], if you want to give it a try. Thanks for reporting this! Alberto [1] https://gitlab.com/albertofaria/qemu/-/tree/static-analysis
On 04/07/2022 15:04, Alberto Faria wrote: > On Mon, Jul 4, 2022 at 6:46 PM Víctor Colombo > <victor.colombo@eldorado.org.br> wrote: >> Yes, this line is present at the beginning of the output >> Is this caused by problems with the code being analyzed or is it because >> libclang is getting confused with something that is outside of our >> control? > > I think I found the problem: the commands in the compilation database > weren't being parsed properly. I switched to shlex.split() and it > seems to be working now. The WIP v2 is available at [1], if you want > to give it a try. > > Thanks for reporting this! > > Alberto > > [1] https://gitlab.com/albertofaria/qemu/-/tree/static-analysis > I tested the version from the WIP v2 and seems to be working now. Thanks!
diff --git a/static-analyzer.py b/static-analyzer.py index 8abc552b82..485d054052 100755 --- a/static-analyzer.py +++ b/static-analyzer.py @@ -36,6 +36,8 @@ TranslationUnit, TranslationUnitLoadError, TypeKind, + SourceRange, + TokenGroup, ) Cursor.__hash__ = lambda self: self.hash # so `Cursor`s can be dict keys @@ -515,6 +517,90 @@ def check_coroutine_calls( log(call, "non-coroutine_fn function calls coroutine_fn") +@check("coroutine-pointers") +def check_coroutine_pointers( + translation_unit: TranslationUnit, + translation_unit_path: str, + log: Callable[[Cursor, str], None], +) -> None: + + # What we would really like is to associate annotation attributes with types + # directly, but that doesn't seem possible. Instead, we have to look at the + # relevant variable/field/parameter declarations, and follow typedefs. + + # This doesn't check all possible ways of assigning to a coroutine_fn + # field/variable/parameter. That would probably be too much work. + + # TODO: Check struct/union/array initialization. + # TODO: Check assignment to struct/union/array fields. + + for [*_, node] in all_paths(translation_unit.cursor): + + # check initialization of variables using coroutine_fn values + + if node.kind == CursorKind.VAR_DECL: + + children = [ + c + for c in node.get_children() + if c.kind + not in [ + CursorKind.ANNOTATE_ATTR, + CursorKind.INIT_LIST_EXPR, + CursorKind.TYPE_REF, + CursorKind.UNEXPOSED_ATTR, + ] + ] + + if ( + len(children) == 1 + and not is_coroutine_fn(node) + and is_coroutine_fn(children[0]) + ): + log(node, "assigning coroutine_fn to non-coroutine_fn") + + # check initialization of fields using coroutine_fn values + + # TODO: This only checks designator initializers. + + if node.kind == CursorKind.INIT_LIST_EXPR: + + for initializer in filter( + lambda n: n.kind == CursorKind.UNEXPOSED_EXPR, + node.get_children(), + ): + + children = list(initializer.get_children()) + + if ( + len(children) == 2 + and children[0].kind == CursorKind.MEMBER_REF + and not is_coroutine_fn(children[0].referenced) + and is_coroutine_fn(children[1]) + ): + log( + initializer, + "assigning coroutine_fn to non-coroutine_fn", + ) + + # check assignments of coroutine_fn values to variables or fields + + if is_binary_operator(node, "="): + + [left, right] = node.get_children() + + if ( + left.kind + in [ + CursorKind.DECL_REF_EXPR, + CursorKind.MEMBER_REF_EXPR, + ] + and not is_coroutine_fn(left.referenced) + and is_coroutine_fn(right) + ): + log(node, "assigning coroutine_fn to non-coroutine_fn") + + # ---------------------------------------------------------------------------- # # Traversal @@ -555,6 +641,31 @@ def subtrees_matching( # Node predicates +def is_binary_operator(node: Cursor, operator: str) -> bool: + return ( + node.kind == CursorKind.BINARY_OPERATOR + and get_binary_operator_spelling(node) == operator + ) + + +def get_binary_operator_spelling(node: Cursor) -> Optional[str]: + + assert node.kind == CursorKind.BINARY_OPERATOR + + [left, right] = node.get_children() + + op_range = SourceRange.from_locations(left.extent.end, right.extent.start) + + tokens = list(TokenGroup.get_tokens(node.translation_unit, op_range)) + if not tokens: + # Can occur when left and right children extents overlap due to + # misparsing. + return None + + [op_token, *_] = tokens + return op_token.spelling + + def is_valid_coroutine_fn_usage(parent: Cursor) -> bool: """ Check if an occurrence of `coroutine_fn` represented by a node with parent @@ -599,7 +710,13 @@ def is_valid_allow_coroutine_fn_call_usage(node: Cursor) -> bool: `node` appears at a valid point in the AST. This is the case if its right operand is a call to: - - A function declared with the `coroutine_fn` annotation. + - A function declared with the `coroutine_fn` annotation, OR + - A field/variable/parameter whose declaration has the `coroutine_fn` + annotation, and of function pointer type, OR + - [TODO] A field/variable/parameter of a typedef function pointer type, + and the typedef has the `coroutine_fn` annotation, OR + - [TODO] A field/variable/parameter of a pointer to typedef function type, + and the typedef has the `coroutine_fn` annotation. TODO: Ensure that `__allow_coroutine_fn_call()` is in the body of a non-`coroutine_fn` function. @@ -627,9 +744,31 @@ def is_coroutine_fn(node: Cursor) -> bool: else: break - return node.kind == CursorKind.FUNCTION_DECL and is_annotated_with( - node, "coroutine_fn" - ) + if node.kind in [CursorKind.DECL_REF_EXPR, CursorKind.MEMBER_REF_EXPR]: + node = node.referenced + + # --- + + if node.kind == CursorKind.FUNCTION_DECL: + return is_annotated_with(node, "coroutine_fn") + + if node.kind in [ + CursorKind.FIELD_DECL, + CursorKind.VAR_DECL, + CursorKind.PARM_DECL, + ]: + + if is_annotated_with(node, "coroutine_fn"): + return True + + # TODO: If type is typedef or pointer to typedef, follow typedef. + + return False + + if node.kind == CursorKind.TYPEDEF_DECL: + return is_annotated_with(node, "coroutine_fn") + + return False def is_annotated_with(node: Cursor, annotation: str) -> bool:
Extend static-analyzer.py to enforce coroutine_fn restrictions on function pointer operations. Invalid operations include assigning a coroutine_fn value to a non-coroutine_fn function pointer, and invoking a coroutine_fn function pointer from a non-coroutine_fn function. Signed-off-by: Alberto Faria <afaria@redhat.com> --- static-analyzer.py | 147 +++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 143 insertions(+), 4 deletions(-)