Given the existence of macros, doesn’t this let package maintainers run arbitrary code in the painter sandbox?
I’m afraid this question doesn’t make a lot of sense. You seem to be confused about the purpose of the painter tool, or how macros work, probably both.
Neither is painter a sandbox tool, nor do macros have the ability to “run code”, arbitrary or otherwise, anywhere.
painter is just a call graph analysis tool for the crates.io ecosystem. It does the analysis based on generated LLVM IR code (which is not “runnable” as is) from all versions of all crates.
Its security application is to reliably find which crate releases are vulnerable if a vulnerability is found in releases of crate Foo.
Note that we already have cargo-audit and advisory-db. painter’s goal is to confirm via call graph analysis that “Yes, your crate is vulnerable. This part of your crate calls this vulnerable part of crate Foo”.
No crate code is actually run by or in painter, except the code written to run the tool itself, of course ;)
As for Rust macros, they get expanded in the parsing stage after lexing. You can see what’s literally expanded with cargo expand. Macros are long gone by the time you get to code generation.
Incidentally, painter has this current limitation listed in the README:
- LTO and optimizers are disabled to prevent inlining, but many cases exist in which the invocation is lost at a bytecode level. Source analysis can improve this. Examples of cases where an invoke is likely lost:
  - Dynamic function calls (pointers, vtables, etc.)
  - Inlining
That’s real source/expanded code lost by the time we got to the generated IR code stage. For macros to “run arbitrary code” at that stage would be quite something ;)
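To make the "dynamic function calls" item from the README list concrete, here is a minimal sketch in Rust. All names (`vulnerable_parse`, `FooParser`, the `*_caller` functions) are hypothetical and not taken from painter or any real crate. It only illustrates why a call-site-based analysis can see the direct call but will typically not see the target of a call made through a function pointer or a trait object.

```rust
// Hypothetical stand-in for a vulnerable function in crate Foo.
fn vulnerable_parse(input: &str) -> usize {
    input.len()
}

// Direct call: the callee is named at the call site, so a call graph built
// from the generated code can record an edge direct_caller -> vulnerable_parse.
fn direct_caller(input: &str) -> usize {
    vulnerable_parse(input)
}

// Indirect call through a function pointer: this body only contains a call
// through `f`, so the target is generally not named at the call site.
fn pointer_caller(f: fn(&str) -> usize, input: &str) -> usize {
    f(input)
}

trait Parser {
    fn parse(&self, input: &str) -> usize;
}

struct FooParser;

impl Parser for FooParser {
    fn parse(&self, input: &str) -> usize {
        vulnerable_parse(input)
    }
}

// Call through a trait object: the method is looked up in the vtable at run time.
fn vtable_caller(p: &dyn Parser, input: &str) -> usize {
    p.parse(input)
}

fn main() {
    println!("{}", direct_caller("abc"));
    println!("{}", pointer_caller(vulnerable_parse, "abc"));
    println!("{}", vtable_caller(&FooParser, "abc"));
}
```

All three callers end up in `vulnerable_parse` at run time, but only `direct_caller` says so at its own call site. How painter actually treats the other two cases is not described beyond the README note quoted above.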
To generate the LLVM code correctly you need to run build.rs if there is one, and run proc macros, which are natively compiled compiler plugins, currently running without any sandbox.
The final code isn’t run, but the build process of Cargo crates can involve running arbitrary code.
The compilation process can be sandboxed as a whole, but if it runs arbitrary code, a malicious crate could take over the build process and falsify the LLVM output.
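As a rough illustration of the build.rs point, here is a minimal sketch of a build script. The file name `generated.rs` and the function `generated_source` are hypothetical, and the temp-dir fallback is an assumption added only so the sketch also runs outside Cargo. The point is that this is an ordinary Rust program that Cargo compiles and executes on the build machine, and that whatever it writes becomes part of what gets compiled.

```rust
// build.rs (hypothetical example): Cargo compiles and runs this program on
// the build machine before compiling the crate itself.
use std::env;
use std::fs;
use std::path::PathBuf;

// Source text produced by the build script. A crate could pull it in with
// include!(concat!(env!("OUT_DIR"), "/generated.rs")).
fn generated_source() -> String {
    String::from("pub fn generated() -> u32 { 42 }\n")
}

fn main() {
    // Cargo sets OUT_DIR for build scripts; the temp-dir fallback exists only
    // so this sketch also runs as a standalone program.
    let out_dir = env::var_os("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(env::temp_dir);
    let dest = out_dir.join("generated.rs");

    // Ordinary, unsandboxed code: it could just as well read other files,
    // spawn processes, or open network connections.
    fs::write(&dest, generated_source()).expect("failed to write generated source");

    // Tell Cargo to re-run this script only when build.rs itself changes.
    println!("cargo:rerun-if-changed=build.rs");
}
```

Because the generated file feeds into compilation, a malicious script in this position could emit different code than the published source suggests, which is the "falsify the LLVM output" concern.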