Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bubble up errors in bank_fork_utils instead of exiting process #34277

Merged
merged 4 commits into from
Nov 30, 2023

Conversation

steviez
Copy link
Contributor

@steviez steviez commented Nov 30, 2023

Problem

There are operations in bank_fork_utils that may fail; we explicitly call std::process::exit() on several of these. Granted we may end up exiting the process higher up the callstack, bubbling the errors up allow a caller that could handle the error to do so.

Summary of Changes

Make several functions return a Result<...> instead of exiting, and properly handle the result up the callstack

There are operations in bank_fork_utils that may fail; we explicitly
call std::process::exit() on several of these. Granted we may end up
exiting the process higher up the callstack, bubbling the errors up
allow a caller that could handle the error to do so.
Copy link

codecov bot commented Nov 30, 2023

Codecov Report

Merging #34277 (2280c5b) into master (935e06f) will decrease coverage by 0.1%.
Report is 6 commits behind head on master.
The diff coverage is 5.5%.

Additional details and impacted files
@@            Coverage Diff            @@
##           master   #34277     +/-   ##
=========================================
- Coverage    81.9%    81.9%   -0.1%     
=========================================
  Files         819      819             
  Lines      219765   219783     +18     
=========================================
- Hits       180094   180085      -9     
- Misses      39671    39698     +27     

@steviez steviez marked this pull request as ready for review November 30, 2023 02:34
Copy link
Contributor

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this idea. And also, let no good deed go unpunished...

I'm most against the idea of use String for the Error type on the Result, though. My preference is to use cascading/tiered error enums with thiserror::Error. Here's an example that I did when replacing all the unwraps in AccountsBackgroundService's snapshot handling with real errors:

/// Errors that can happen in `add_bank_snapshot()`
#[derive(Error, Debug)]
pub enum AddBankSnapshotError {
#[error("bank snapshot dir already exists: {0}")]
SnapshotDirAlreadyExists(PathBuf),
#[error("failed to create snapshot dir: {0}")]
CreateSnapshotDir(#[source] std::io::Error),
#[error("failed to hard link storages: {0}")]
HardLinkStorages(#[source] HardLinkStoragesToSnapshotError),
#[error("failed to serialize bank: {0}")]
SerializeBank(#[source] Box<SnapshotError>),
#[error("failed to serialize status cache: {0}")]
SerializeStatusCache(#[source] Box<SnapshotError>),
#[error("failed to write snapshot version file: {0}")]
WriteSnapshotVersionFile(#[source] std::io::Error),
#[error("failed to mark snapshot as 'complete': {0}")]
CreateStateCompleteFile(#[source] std::io::Error),
}
/// Errors that can happen in `hard_link_storages_to_snapshot()`
#[derive(Error, Debug)]
pub enum HardLinkStoragesToSnapshotError {
#[error("failed to create accounts hard links dir: {0}")]
CreateAccountsHardLinksDir(#[source] std::io::Error),
#[error("failed to flush storage: {0}")]
FlushStorage(#[source] AccountsFileError),
#[error("failed to get the snapshot's accounts hard link dir: {0}")]
GetSnapshotHardLinksDir(#[from] GetSnapshotAccountsHardLinkDirError),
#[error("failed to hard link storage: {0}")]
HardLinkStorage(#[source] std::io::Error),
}
/// Errors that can happen in `get_snapshot_accounts_hardlink_dir()`
#[derive(Error, Debug)]
pub enum GetSnapshotAccountsHardLinkDirError {
#[error("invalid account storage path: {0}")]
GetAccountPath(PathBuf),
#[error("failed to create the snapshot hard link dir: {0}")]
CreateSnapshotHardLinkDir(#[source] std::io::Error),
#[error("failed to symlink snapshot hard link dir {link} to {original}: {source}")]
SymlinkSnapshotHardLinkDir {
source: std::io::Error,
original: PathBuf,
link: PathBuf,
},
}

Would something like that work here too?

@steviez
Copy link
Contributor Author

steviez commented Nov 30, 2023

I like this idea. And also, let no good deed go unpunished...

😬

...
Would something like that work here too?

I added something similar, bound to bank_fork_utils.rs. I'll call your attention to the fact that I dropped newlines from the error message. I went back and forth a little; having the nicer display is nice. On the other hand, if someone is grep'ing logs by ERROR, they might miss the additional lines. I'm a little on the fence about this aspect, so let me know what you think.

I initially went with String as that is what Validator::new() returns; that (and ledger-tool) could arguably use some similar cleanup but I think that is out of the scope of this PR. As such, I left String in ledger-tool to remain generic; I can circle back with some similar cleanup down the road in there

Copy link
Contributor

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it!

I don't love the change to a String error type for load_and_process_ledger(), but I think adding more tiered errors here to resolve the circular dependencies of which errors are from/sources of others (1) can be done later, and (2) likely has diminishing returns w.r.t. time investment currently.

How does the text of the errors look now in comparison to what's currently in master?

ledger/src/bank_forks_utils.rs Outdated Show resolved Hide resolved
ledger/src/bank_forks_utils.rs Outdated Show resolved Hide resolved
@steviez
Copy link
Contributor Author

steviez commented Nov 30, 2023

How does the text of the errors look now in comparison to what's currently in master?

Here is an old instance of the error:

[2023-11-30T17:53:33.031850614Z ERROR solana_ledger::bank_forks_utils] Account paths not present when booting from snapshot

and here is a new instance of the same error:

[2023-11-30T17:50:22.513117854Z ERROR solana_validator] Failed to start validator: "accounts path(s) not present when booting from snapshot"

It is nice that these will now have the Failed to start validator text in common with other errors

brooksprumo
brooksprumo previously approved these changes Nov 30, 2023
@steviez steviez merged commit 479b7ee into solana-labs:master Nov 30, 2023
33 checks passed
@steviez steviez deleted the bf_utils_err_handling branch November 30, 2023 22:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants