Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: remove some useless cases in the test #1485

Merged
merged 1 commit into from
May 28, 2024

Conversation

yihau
Copy link
Member

@yihau yihau commented May 24, 2024

Problem

some changes for bumping Rust version: #1309

test_new_from_file_crafted_executable is failed on nightly-2024-05-02 (run on linux). I have bisected it. it looks like everything works fine before nightly-2024-03-27.

the reproducible command is:

RUST_BACKTRACE=all cargo +nightly-2024-05-02 test --package solana-accounts-db --lib -- append_vec::tests::test_new_from_file_crafted_executable::storageaccess_mmap_expects --exact --show-output

the error I got:

thread 'append_vec::tests::test_new_from_file_crafted_executable::storageaccess_mmap_expects' panicked at accounts-db/src/append_vec.rs:1416:29:
This didn't occur if this test passed.
stack backtrace:
   0: rust_begin_unwind
             at /rustc/c987ad527540e8f1565f57c31204bde33f63df76/library/std/src/panicking.rs:652:5
   1: core::panicking::panic_fmt
             at /rustc/c987ad527540e8f1565f57c31204bde33f63df76/library/core/src/panicking.rs:72:14
   2: solana_accounts_db::append_vec::tests::test_new_from_file_crafted_executable::{{closure}}
             at ./src/append_vec.rs:1416:29
   3: solana_accounts_db::append_vec::AppendVec::get_stored_account_meta_callback
             at ./src/append_vec.rs:549:22
   4: solana_accounts_db::append_vec::tests::test_new_from_file_crafted_executable
             at ./src/append_vec.rs:1398:13
   5: solana_accounts_db::append_vec::tests::test_new_from_file_crafted_executable::storageaccess_mmap_expects
             at ./src/append_vec.rs:1366:5
   6: solana_accounts_db::append_vec::tests::test_new_from_file_crafted_executable::storageaccess_mmap_expects::{{closure}}
             at ./src/append_vec.rs:1366:38
   7: core::ops::function::FnOnce::call_once
             at /rustc/c987ad527540e8f1565f57c31204bde33f63df76/library/core/src/ops/function.rs:250:5
   8: core::ops::function::FnOnce::call_once
             at /rustc/c987ad527540e8f1565f57c31204bde33f63df76/library/core/src/ops/function.rs:250:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Summary of Changes

if I understand correctly, we tried to convert a byte to bool here. I think assert!(!account.sanitize_executable()); has already covered the case and catch it. do we want to keep those undefined behaviors checking? I lean to remove them if we don't have similar use cases in the real world. (or maybe get them like a doc and don't need to run them)

(thanks @brooksprumo for sharing the original issue solana-labs#26009)

Comment on lines 1405 to 1427

// we can observe crafted value by ref
{
let executable_bool: &bool = &account.account_meta.executable;
// Depending on use, *executable_bool can be truthy or falsy due to direct memory manipulation
// assert_eq! thinks *executable_bool is equal to false but the if condition thinks it's not, contradictorily.
assert!(!*executable_bool);
#[cfg(not(target_arch = "aarch64"))]
{
const FALSE: bool = false; // keep clippy happy
if *executable_bool == FALSE {
panic!("This didn't occur if this test passed.");
}
}
assert_eq!(*account.ref_executable_byte(), crafted_executable);
}

// we can NOT observe crafted value by value
{
let executable_bool: bool = account.executable();
assert!(!executable_bool);
assert_eq!(account.get_executable_byte(), 0); // Wow, not crafted_executable!
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should remove this whole thing. At most, only the part that's protected by the target_arch:

                    #[cfg(not(target_arch = "aarch64"))]
                    {
                        const FALSE: bool = false; // keep clippy happy
                        if *executable_bool == FALSE {
                            panic!("This didn't occur if this test passed.");
                        }
                    }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I understand correctly, the executable_bool has been corrupted because we forcibly write a byte into a bool. If we can catch it by sanitize_executable, I guess we don't need to care about the value 🤔

Copy link
Member

@ryoqun ryoqun May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hehe, let me explain the motive of these seemingly meaningless assertions.

Firstly, this is added here by me: solana-labs#7464 (comment) note that sanitization is run inside the following code (av.set_file(path);).

so, from very the start, these assertions are worthless in strict sense as @yihau pointed out. The reason i added them is that i hoped i can detect UB behavior changes in the distant future for my pure curiosity.

so the dream has come true today after 4.5 years.

it was kind of fun to conclude to this: #1485 (comment) it was a good exercise for me to hunt down to the specific rust commit.

and the result of investigation indicates there's no way to observe this specifc UB anymore. so, I'd prefer removing the specific part @brooksprumo has narrowed down.

i'd like to see more ub changes in yet distant future and to dig in by leaving the others as-is as much as possible. thanks for understanding. :)

@brooksprumo
Copy link

brooksprumo commented May 24, 2024

It probably would be good to root-cause what happened with nightly-2024-03-26 that caused this to start failing.

@yihau
Copy link
Member Author

yihau commented May 25, 2024

It probably would be good to root-cause what happened with nightly-2024-03-26 that caused this to start failing.

ah! sorry, the first error version is nightly-2024-03-27. just updated the PR body!

@ryoqun
Copy link
Member

ryoqun commented May 27, 2024

I'll look around this today. so, this test is failing only in coverage with recent nightly as seen at #1309

@ryoqun
Copy link
Member

ryoqun commented May 27, 2024

I'll look around this today. so, this test is failing only in coverage with recent nightly as seen at #1309

never mind, i remember nightly is used only for coverage, not other ci steps...

@ryoqun
Copy link
Member

ryoqun commented May 27, 2024

I have bisected it. it looks like everything works fine before nightly-2024-03-27.

thanks for bisecting.

$ rustup install nightly-2024-03-27
info: syncing channel updates for 'nightly-2024-03-27-x86_64-unknown-linux-gnu'
info: latest update on 2024-03-27, rust version 1.79.0-nightly (47ecded35 2024-03-26)
info: downloading component 'cargo'
info: downloading component 'clippy'


$ rustup install nightly-2024-03-26
info: syncing channel updates for 'nightly-2024-03-26-x86_64-unknown-linux-gnu'
info: latest update on 2024-03-26, rust version 1.79.0-nightly (5f2c7d2bf 2024-03-25)
info: downloading component 'cargo'
info: downloading component 'clippy'

... so, this is the commit range of the nightly:

rust-lang/rust@5f2c7d2...47ecded

also rust-lang/stdarch@56087ea...967e7af

@ryoqun
Copy link
Member

ryoqun commented May 28, 2024

rust-lang/rust@5f2c7d2...47ecded

I'm now pretty sure this is the commit which changed this test behavior: rust-lang/rust#122849

@ryoqun
Copy link
Member

ryoqun commented May 28, 2024

I'm now pretty sure this is the commit which changed this test behavior: rust-lang/rust#122849

the pr started to omit emitting noundef attribute for -O0 build (ie. our nighty coverage build is affected)

and lack of the attribute inhibits optimizations based on the assumption that there should be no undefined values

https://llvm.org/docs/LangRef.html#parameter-attributes:

noundef

This attribute applies to parameters and return values. If the value representation contains any undefined or poison bits, the behavior is undefined. Note that this does not refer to padding introduced by the type’s storage representation.

https://llvm.org/docs/LangRef.html#undefined-values:

Note

A ‘poison’ value (described in the next section) should be used instead of ‘undef’ whenever possible. Poison values are stronger than undef, and enable more optimizations. Just the existence of ‘undef’ blocks certain optimizations (see the examples below).

i haven't actually dumped llvm ir but I think 0b1111_1110 for bool definitely falls into the category of undefined or poison here.

With this all background info, the conclusion is that: the test has been testing the existence of UB. And that UB was caused by optimizations leveraging noundef. But newer nightly is now preventing these optimizations by omitting noundef. so the problematic optimization has gone, so the ub has. ;)

@brooksprumo
Copy link

Thanks for root-causing this, @ryoqun!

@yihau
Copy link
Member Author

yihau commented May 28, 2024

okay! thank you! I just pushed a new update. only removed the platform specific part 🫡

Copy link

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

@yihau yihau merged commit 12b6665 into anza-xyz:master May 28, 2024
42 checks passed
@yihau yihau deleted the remove-ub-test branch May 28, 2024 17:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants