Important: At the moment, we record results in a "nop" mode. BorrowSanitizer's run-time checks are inserted, but they do nothing. This is helpful for debugging the LLVM components of our tool. However, you can expect fully instrumented execution to be orders of magnitude slower.
We measure performance as a multiple of the execution time of an uninstrumented program, that is, the instrumented execution time divided by the uninstrumented execution time. To summarize the performance of a crate, we calculate this metric for every unit test and report the median value. For each test, we use hyperfine to record wall time as the mean of 10 runs after 3 "warm-up" runs. We execute benchmarks on standard GitHub Actions runners, so these results are noisy. The data is not scientifically rigorous, but it is still useful for gauging performance.
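
The per-crate summary described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual tooling: the function names, the test names, and the timing values are all made up, and the inputs are assumed to be the mean wall times (in seconds) that hyperfine reports for each test.

```python
from statistics import median


def slowdown(baseline_mean_s: float, instrumented_mean_s: float) -> float:
    """Instrumented time as a multiple of uninstrumented time for one test."""
    if baseline_mean_s <= 0:
        raise ValueError("baseline time must be positive")
    return instrumented_mean_s / baseline_mean_s


def crate_slowdown(timings: dict[str, tuple[float, float]]) -> float:
    """Median per-test slowdown for a crate.

    `timings` maps a test name to (baseline_mean_s, instrumented_mean_s).
    """
    if not timings:
        raise ValueError("no tests to summarize")
    ratios = [slowdown(base, instr) for base, instr in timings.values()]
    return median(ratios)


# Hypothetical mean wall times in seconds; not real measurements.
example = {
    "tests::parse": (0.25, 0.5),         # 2.0x
    "tests::roundtrip": (0.5, 0.75),     # 1.5x
    "tests::large_input": (0.125, 1.0),  # 8.0x
}

print(f"median slowdown: {crate_slowdown(example)}x")  # median slowdown: 2.0x
```

Reporting the median rather than the mean keeps a single slow test (the 8.0x case here) from dominating the crate's summary, which matters when the underlying timings are noisy.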