-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include merge dim positions in group keys emitted by split_fragments
#521
Include merge dim positions in group keys emitted by split_fragments
#521
Conversation
@norlandrhagen thanks for getting this started. Would it be okay with you if I take a shot at finishing this PR? |
@cisaacstern that would be great! |
As of the last commit, the test added here now replicates the bug reported in #517: pytest -vx tests/test_rechunking.py -k test_split_and_combine_fragments_with_merge_dim
Thanks again to @norlandrhagen for kicking this off, and @rabernat for pairing on this yesterday. 🙏 Now that the test is in place, I'll work on fixing the bug! |
Summary to date:
|
Correction... the IndexedPositions look correct, but the subfragments generated by
Note that |
Super exciting progress @cisaacstern ! |
Thanks to @rabernat for a thoughtful offline critique of this PR. In brief, Ryan pointed out that we parallelize writes horizontally across merge dimensions, therefore combining merge dimensions in the |
I believe this is now quite close to the intended behavior. The relevant integration tests appear to all be passing, and most of the unit tests are as well. There are two unit test cases which are failing, I just need to figure out if that's a problem with some assumption in the unit test itself, or if I'm actually catching a meaningful corner case. |
Great progress Charles! Let me know if I can be helpful here. |
Thanks so much to @jbusecke for pairing on this, which helped me understand the specific failure mode of the failing test cases (which do appear to be a meaningful bug, and not a testing mistake...):
|
@rabernat, the test added in f0d3159 reproduces the behavior documented in #521 (comment) above. To your eye, is this indeed a bug, or have I misunderstood what group keys to expect in this situation? |
Trying to parse this example scenario, which is indeed pretty complicated. But the crux of it seems to be
Based on my reading, this conclusion seems correct. |
The bug in question seems purely related to the rechunking logic along the concat dim. So it's suspicious that it only arises once you bring in a merge dim. That seems like an important clue. |
tests/test_rechunking.py
Outdated
offset = 1 | ||
index = Index([(dimension, IndexedPosition(offset, dimsize=nt_total))]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't quite seem right to me. If each dataset has two days in it (nt=2
), the only possible offsets are even (0, 2, 4, 6, and 8). So it feels like we are creating inconsistent test data here.
tests/test_rechunking.py
Outdated
# replicates indexes created by IndexItems transform. | ||
unique_times = np.unique([ds.time[0].values for ds in dsets]) | ||
time_positions = {t: i for i, t in enumerate(unique_times)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic seems wrong. The index in IndexedPosition
needs to be an actual offset from the beginning of the array.
@rabernat thanks so much for the review, your two inline comments were a serious a-ha moment for me. I thought I must've been misunderstanding something, and it turns out I was. To recap (for my future self and any others reading this), there are actually 3 indexing spaces in play here:
I had misunderstood the I'll now remove the possible bug test, and fix this issue in the unit test. 🚀 |
I am not super happy about how these types look, but I do believe this is documented. pangeo-forge-recipes/pangeo_forge_recipes/types.py Lines 25 to 30 in 5edf5ec
|
Cannot Combine Fragments
Error - Issue 517 - Testingsplit_fragments
@rabernat AFAICT this is good to go. Would love your final review, in case there's anything I've overlooked. Two other gut checks in process:
IMHO the testing here is pretty robust, so I don't think we should wait on either of these items to merge, but whenever they both complete, they'll be great further verification of these changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
🥳 |
Incredible @cisaacstern! Super exciting to have this PR merged in. |
Awesome @cisaacstern! |
So great to see the release within reach!! |
Opening up a PR to start working on a bugfix for the error described in Issue #517 .
The error from #517 happens in L185 of
rechunking.py
in thecombine_fragments
function.As @rabernat pointed out in #517, the error might be upstream in the
split_fragments
function.So if I'm understanding this correctly,
split_fragments
does not take theMergeDim
into account and outputs PCollections/Fragments that contain multiple variables. When fed intocombine_fragments
, we get the error:Expected a hypercube of shape [1] but got 2 fragments
There is the begining of a test named
test_split_fragment_merge_dim
intest_rechunking.py
cc @cisaacstern