Use stacked ambiguous date checking #1072

Merged: victorlin merged 4 commits into master on Oct 26, 2022

@victorlin (Member) commented Oct 24, 2022

Description of proposed changes

Previously, the ambiguous date checks were 1:1 with the generated date columns. This is incorrect: grouping by month must also check for an ambiguous year, and grouping by week must check for ambiguity in any part of the date (year, month, or day).

Separate the ambiguous date checking from the column generation, and update the conditions for the former to be "stacking" (i.e. year is always checked, month is checked for month/week, and day is checked for week only).
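
A minimal sketch of the stacking rule described above, written in plain Python rather than the pandas code in augur/filter.py. The function names `parts_to_check` and `is_ambiguous`, the `YYYY-MM-DD` input format, and the use of `X` as the placeholder for an unknown digit are assumptions made for illustration, not the actual implementation.

```python
def parts_to_check(group_by):
    """Return the date parts that must be unambiguous for the given grouping.

    The checks stack: year is always checked, month is added when grouping
    by month or week, and day is added when grouping by week.
    """
    parts = ["year"]
    if "month" in group_by or "week" in group_by:
        parts.append("month")
    if "week" in group_by:
        parts.append("day")
    return parts


def is_ambiguous(date, group_by):
    """Return True if any date part required by the grouping contains 'X'."""
    year, month, day = date.split("-")
    values = {"year": year, "month": month, "day": day}
    return any("X" in values[part] for part in parts_to_check(group_by))


# An ambiguous year is rejected when grouping by month, even though the
# month itself is known.
print(is_ambiguous("20XX-05-01", ["month"]))
# An ambiguous day only matters when grouping by week.
print(is_ambiguous("2020-05-XX", ["month"]))
print(is_ambiguous("2020-05-XX", ["week"]))
```

Keeping the checks in one place, separate from column generation, means a coarser grouping never has to know which columns a finer grouping would create.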

Related issue(s)

Fixes #1071

Testing

  • Existing test updated
  • Checks pass

Checklist

  • Add a message in CHANGES.md summarizing the changes in this PR. Keep headers and formatting consistent with the rest of the file.

@victorlin self-assigned this Oct 24, 2022
@victorlin changed the base branch from master to victorlin/filter/use-temporary-date-columns October 24, 2022 22:18
@victorlin force-pushed the victorlin/filter/fix-ambiguous-date-check branch from b1373b6 to 964febd October 24, 2022 22:28
@victorlin marked this pull request as ready for review October 24, 2022 22:36
@victorlin requested a review from a team October 24, 2022 22:36
codecov bot commented Oct 24, 2022

Codecov Report

Attention: Patch coverage is 90.90909% with 2 lines in your changes missing coverage. Please review.

Project coverage is 61.83%. Comparing base (0826806) to head (0a8760c).
Report is 1171 commits behind head on master.

Files | Patch % | Lines
augur/filter.py | 90.90% | 0 Missing and 2 partials
@@            Coverage Diff             @@
##           master    #1072      +/-   ##
==========================================
+ Coverage   61.80%   61.83%   +0.02%     
==========================================
  Files          52       52              
  Lines        6321     6331      +10     
  Branches     1551     1558       +7     
==========================================
+ Hits         3907     3915       +8     
  Misses       2141     2141              
- Partials      273      275       +2     

@joverlee521 (Contributor) left a comment

Thank you for tackling this! The changes make sense to me, but I'm just going to play devil's advocate here: Is there any legitimate reason to group by month regardless of year value?

@victorlin force-pushed the victorlin/filter/use-temporary-date-columns branch from 8375de9 to 7eacd26 October 25, 2022 20:22
@victorlin (Member, Author) commented

> Is there any legitimate reason to group by month regardless of year value?

There could be, but that would be a separate discussion / feature request since month is already converted to (year, month):

augur/augur/filter.py

Lines 1065 to 1066 in 0a5bfc1

# month = (year, month)
metadata['month'] = list(zip(metadata['year'], metadata['month']))
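
To illustrate why the tuple matters, here is a small sketch with plain lists standing in for the `metadata` columns. The sample values are made up; only the `zip` pattern comes from the snippet above.

```python
from collections import Counter

# Hypothetical year and month values for three records.
years = [2020, 2021, 2021]
months = [1, 1, 2]

# Same pattern as the snippet above: month becomes (year, month).
month_keys = list(zip(years, months))

# January 2020 and January 2021 land in different groups.
groups_by_year_month = Counter(month_keys)
# Grouping on the bare month number would merge them.
groups_by_month_only = Counter(months)

print(groups_by_year_month)
print(groups_by_month_only)
```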

Base automatically changed from victorlin/filter/use-temporary-date-columns to master October 25, 2022 20:37
Both Series.isnull() and DataFrame.dropna() accomplish the same thing,
but use just one to improve readability.
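
A short sketch of the equivalence this commit message refers to. The column name and sample data are hypothetical, and the actual call sites in augur/filter.py are not shown in this thread.

```python
import pandas as pd

# Hypothetical metadata with one missing year value.
df = pd.DataFrame({
    "strain": ["A", "B", "C"],
    "year": [2020, None, 2021],
})

# Option 1: build a boolean mask with Series.isnull() and invert it.
via_isnull = df[~df["year"].isnull()]

# Option 2: let DataFrame.dropna() drop rows with a missing year.
via_dropna = df.dropna(subset=["year"])

# Both approaches keep the same rows.
print(via_isnull.equals(via_dropna))
```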
@victorlin force-pushed the victorlin/filter/fix-ambiguous-date-check branch from 6e90865 to 07881f4 October 25, 2022 20:39
@tsibley (Member) commented Oct 25, 2022

> There could be, but that would be a separate discussion / feature request since month is already converted to (year, month):

Ah, the difference between "month" (a specific one out of an ~infinite set) and "month of the year" (one of the 12).

@victorlin force-pushed the victorlin/filter/fix-ambiguous-date-check branch from 07881f4 to 6ec0105 October 25, 2022 23:41
@victorlin merged commit 2891ef0 into master Oct 26, 2022
@victorlin deleted the victorlin/filter/fix-ambiguous-date-check branch October 26, 2022 21:44
Successfully merging this pull request may close these issues.

filter: Ambiguous years should not be allowed when grouping by month