Feature: move partition #1326

Conversation

@zhiqiangxu (Contributor) commented Jun 30, 2023:

This PR implements this FIP.

@zhiqiangxu force-pushed the feature/move_partition_verify branch from 23d838a to 181dc2e on July 4, 2023 02:52
@anorth (Member) left a comment:

Thanks for this. I haven't looked at the tests much yet; I'll wait until we align on the implementation.

The big outstanding question is whether this should more literally just move the partition, intact, to a new deadline, or remove a partition and add the sectors individually to the new deadline. You've done the latter, but I think the former will be better, and definitely cheaper. But it does require adding a new method on Deadline state to accept a new partition.

Also, there seems to be some confusion over which partitions to move: only prove and move the ones the caller specified.

deadline_distance(policy, current_deadline, to_deadline)
< deadline_distance(policy, current_deadline, from_deadline)
}
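For context, a minimal sketch of the distance helper this comparison relies on (the signature, the Policy import path, and the wpost_period_deadlines field are assumptions, not copied from the PR): it measures the clockwise distance between deadline indices, so the comparison above accepts a move only when the destination deadline is closer to the current deadline than the source is.

```rust
use fil_actors_runtime::runtime::Policy; // path assumed

// Sketch only: clockwise distance (in deadline indices) from `from` to `to`,
// wrapping around the proving period.
fn deadline_distance(policy: &Policy, from: u64, to: u64) -> u64 {
    if to >= from {
        to - from
    } else {
        policy.wpost_period_deadlines - from + to
    }
}
```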

Member:

These look correct, but please add tests to demonstrate all the cases.

Comment on lines 3229 to 3231
state.delete_sectors(store, &dead).map_err(|e| {
e.downcast_default(ExitCode::USR_ILLEGAL_STATE, "failed to delete dead sectors")
})?;
Member:

We don't want to do this if keeping the partition intact (including its reference to terminated sectors) while moving. I can see below that's not what you're doing, but please leave this comment until we resolve that.

@zhiqiangxu (Contributor, author) commented Jul 21, 2023:

I'm looking into the "moving partitions intact" approach, but Deadline.expirations_epochs seems quite complicated to maintain while moving. I guess that's the main reason the current compact_partitions was implemented by removing and then re-adding sectors.

@anorth I'm not sure whether it's worth loading sector infos just to calculate an accurate epoch for Deadline.expirations_epochs. Does it matter if the epoch is a period later? If that's acceptable, we can simply use the source epoch to calculate the target epoch with a different quant.

Member:

Hmm, I see. I think we should avoid loading sector info if at all possible.

If we say that a partition with faulty sectors can't be moved, then I think we only need to worry about on-time expirations. In that case I suggest rounding up, i.e. finding the next epoch, quantized to the new deadline, that is not sooner than the epoch in the queue. Then the SP might need to maintain a sector up to one proving period longer than committed.
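A minimal sketch of that rounding rule, using plain i64 epochs rather than the actors' ChainEpoch/QuantSpec types: round the queued expiration up to the next epoch on the target deadline's (unit, offset) grid, so a sector is never scheduled to expire earlier than committed; at worst it lands roughly one proving period later.

```rust
// Sketch only (the actors code exposes this as QuantSpec::quantize_up):
// return the smallest epoch >= `epoch` lying on the grid offset + k * unit.
fn quantize_up(epoch: i64, unit: i64, offset: i64) -> i64 {
    let shifted = epoch - offset;
    // Ceiling-divide by `unit`, then shift back onto the grid.
    (shifted + unit - 1).div_euclid(unit) * unit + offset
}

// Example: with unit = 60 and offset = 20 (grid 20, 80, 140, ...), an
// expiration queued at epoch 105 is re-queued at 140, never earlier.
// quantize_up(105, 60, 20) == 140
```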

Member:

Don't worry about removing anything from the old deadline's queue. It will have redundant entries, but they're harmless.

@zhiqiangxu (Contributor, author) commented Jul 21, 2023:

What about other sectors: faulty, unproven, terminated? I was considering allowing all sectors for this "moving partitions intact" approach.

Also, I found there is a test case which requires the epoch to be exactly quant.quantize_up(sector.expiration); should we modify this check?

I've temporarily pushed the latest commit here in case you want to try the test cases like ok_to_move_partitions_from_safe_epoch.

Member:

I would like to consider allowing any partition to be moved too, but I don't think it's worth the work/risk of changing how remove_partitions works at the moment. Let's require non-faulty, fully proven sectors for now.

Yes, I think we should probably remove that test case. The quantisation is mentioned elsewhere too, but I think we can probably remove it there as well and update the associated code to handle arbitrary quantisation. This can be an independent change ahead of what you're doing here; for now just comment out that state assertion.

Member:

Please do push that commit to this branch - I think it's the right approach.

Member:

> Also, I found there is a test case which requires the epoch to be exactly quant.quantize_up(sector.expiration); should we modify this check?

I looked into this and I think we can just make a simple change to make it `epoch >= target`.

@zhiqiangxu (Contributor, author) commented Jul 25, 2023:

I made it a bit more restrictive: `epoch >= target && (epoch - target) % quant.unit == 0`.
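For illustration, the relaxed check as a small predicate (plain i64 epochs stand in for ChainEpoch, and quant_unit for QuantSpec::unit):

```rust
// Sketch of the relaxed expiration-queue check: the queued epoch must be no
// earlier than the quantized target and must stay on the same quantization grid.
fn queued_epoch_ok(epoch: i64, target: i64, quant_unit: i64) -> bool {
    epoch >= target && (epoch - target) % quant_unit == 0
}
```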

@jennijuju (Member) commented Sep 28, 2023:

Could you please point me to where we check that the number of partitions in the destination deadline doesn't exceed max_partitions_per_deadline after the move?

@jennijuju (Member):

Can we add some tests for dispute scenarios as well, please?

@Stebalien (Member):

> Could you please point me to where we check that the number of partitions in the destination deadline doesn't exceed max_partitions_per_deadline after the move?

That's a good point, I don't think we're doing that.

2. add check for `max_partitions_per_deadline`
@zhiqiangxu (Contributor, author):

> Could you please point me to where we check that the number of partitions in the destination deadline doesn't exceed max_partitions_per_deadline after the move?

Good point, just added this check!
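A minimal sketch of that guard (names are illustrative, not the exact PR code): after the move, the destination deadline's partition count must not exceed the policy cap.

```rust
// Sketch only: `existing` is the destination deadline's current partition
// count, `moving` is how many partitions the move adds, and the cap comes
// from the policy's max_partitions_per_deadline.
fn check_dest_partition_cap(existing: u64, moving: u64, max_per_deadline: u64) -> Result<(), String> {
    let after = existing + moving;
    if after > max_per_deadline {
        return Err(format!(
            "too many partitions in destination deadline after move: {} > {}",
            after, max_per_deadline
        ));
    }
    Ok(())
}
```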

@zhiqiangxu (Contributor, author) commented Sep 28, 2023:

> Can we add some tests for dispute scenarios as well, please?

OK, this may take a bit longer though.

@jennijuju (Member):

> Can we add some tests for dispute scenarios as well, please?

> OK, this may take a bit longer though.

Can be added in a follow-up PR too!

@zhiqiangxu (Contributor, author):

> Can we add some tests for dispute scenarios as well, please?

> OK, this may take a bit longer though.

> Can be added in a follow-up PR too!

I'll add it in a follow-up PR then, and we'll also manually test this case!

@anorth (Member) left a comment:

LGTM with more tests to come in follow-up

let mut orig_deadline =
deadlines.load_deadline(store, params.orig_deadline).context_code(
ExitCode::USR_ILLEGAL_STATE,
format!("failed to load deadline {}", params.orig_deadline),
Member:

Nit: use `with_context_code` when doing a `format!` operation for the message.
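For illustration, the suggested form of the excerpt above (assuming the fil_actors_runtime with_context_code helper takes a closure, so the format! string is only built on the error path):

```rust
// Sketch of the lazy-message variant; helper name and signature assumed.
let mut orig_deadline = deadlines
    .load_deadline(store, params.orig_deadline)
    .with_context_code(ExitCode::USR_ILLEGAL_STATE, || {
        format!("failed to load deadline {}", params.orig_deadline)
    })?;
```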

@zhiqiangxu (Contributor, author):

Done. I guess this is to avoid an allocation when no error happens?

Member:

Yes.

@Stebalien added this pull request to the merge queue on Sep 29, 2023
Merged via the queue into filecoin-project:master with commit 5ec2a6b Sep 29, 2023
11 checks passed
Comment on lines +296 to +331
|| {
let current_deadline = h.current_deadline(&rt);

let from_deadline = new_deadline_info(
rt.policy(),
if current_deadline.index < orig_deadline_id {
current_deadline.period_start - rt.policy().wpost_proving_period
} else {
current_deadline.period_start
},
orig_deadline_id,
*rt.epoch.borrow(),
);

let from_ddl = h.get_deadline(&rt, orig_deadline_id);

let entropy = RawBytes::serialize(h.receiver).unwrap();
rt.expect_get_randomness_from_beacon(
DomainSeparationTag::WindowedPoStChallengeSeed,
from_deadline.challenge,
entropy.to_vec(),
TEST_RANDOMNESS_ARRAY_FROM_ONE,
);

let post = h.get_submitted_proof(&rt, &from_ddl, 0);

let all_ignored = BitField::new();
let vi = h.make_window_post_verify_info(
&sectors_info,
&all_ignored,
sectors_info[1].clone(),
Randomness(TEST_RANDOMNESS_ARRAY_FROM_ONE.into()),
post.proofs,
);
rt.expect_verify_post(vi, ExitCode::OK);
},
Member:

This WindowPoSt expectation-setting is long and duplicated in other tests. Can you follow up to factor it out, either as a flag on move_partitions or as another method on the harness that returns this closure?
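A rough sketch of that refactor (all names hypothetical, not the actual harness API): a harness method that builds and returns the expectation closure, so each test can pass it straight to move_partitions instead of repeating the inline block above.

```rust
// Sketch only: the closure body would set the same beacon-randomness and
// verify-PoSt expectations as the inline closure above.
impl ActorHarness {
    fn expect_window_post_for_move<'a>(
        &'a self,
        rt: &'a MockRuntime,
        orig_deadline_id: u64,
        sectors_info: &'a [SectorOnChainInfo],
    ) -> impl Fn() + 'a {
        move || {
            // expect_get_randomness_from_beacon(...) and expect_verify_post(...)
            // for `orig_deadline_id` over `sectors_info`, exactly as above.
        }
    }
}
```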

@zhiqiangxu (Contributor, author):

Just saw the comments here; I'll do it tomorrow.

use vm_api::util::{apply_ok, get_state, DynBlockstore};
use vm_api::VM;

#[test]
Member:

In order to export tests to be run in other frameworks, test entry points no longer live here. Instead, they go in the test_vm package with a simple wrapper that invokes a method here with a single VM parameter. Please follow up to use the same pattern as other integration tests.
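A rough sketch of that pattern (the VM constructor and module layout here are assumptions): the test body takes a VM and lives alongside the other integration tests, while the #[test] entry point in the test_vm package just builds a concrete VM and delegates.

```rust
// In the shared integration-test crate (sketch):
use vm_api::VM;

pub fn move_partitions_success_test(v: &dyn VM) {
    // ... drive the miner actor through a partition move against `v` ...
}

// In the test_vm package (sketch; constructor name assumed):
#[test]
fn move_partitions_success() {
    let store = MemoryBlockstore::new();
    let v = TestVM::new_with_singletons(&store);
    move_partitions_success_test(&v);
}
```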

@zhiqiangxu (Contributor, author) commented Oct 11, 2023:

@anorth I just pushed the suggested modifications to the follow-up PR: #1430

aarshkshah1992 added a commit that referenced this pull request Nov 10, 2023
aarshkshah1992 added a commit that referenced this pull request Nov 10, 2023
arajasek pushed a commit that referenced this pull request Nov 14, 2023
* Revert "Feature: move partition  (#1326)"

This reverts commit 5ec2a6b.

* ci

* run CI
github-merge-queue bot pushed a commit that referenced this pull request Dec 12, 2023
* Revert "Feature: move partition  (#1326)"

This reverts commit 5ec2a6b.

* update amt deps