-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
filter: Restore automatic boolean conversion #1410
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1410 +/- ##
==========================================
+ Coverage 67.38% 67.46% +0.08%
==========================================
Files 69 69
Lines 7465 7484 +19
Branches 1836 1840 +4
==========================================
+ Hits 5030 5049 +19
Misses 2161 2161
Partials 274 274 ☔ View full report in Codecov by Sentry. |
a3fada4
to
df73174
Compare
Use one comment to explain both the example file and command. This will make the file more readable with multiple commands.
Previous wording was wordy. Make it more concise.
df73174
to
6765b8b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick fix @victorlin!
I think there's a potential bug if user specifies --query-column
with a numeric
type. Otherwise, changes look good to me.
Since filter can now automatically infer boolean types, should bool
be added to the ACCEPTED_TYPES
for the --query-columns
flag? Maybe outside of the scope of this PR, but just wondering if there are plans to add it later.
Automatic conversion is applied by default. Reserve the 'numeric' type for automatic conversions and keep the accepted values to --query-columns strict (for numeric, either 'int' or 'float').
Boolean conversion was not considered when automatic nullable numeric conversion was applied¹, but it continued to work because augur filter relied on pandas.read_csv's automatic type inference up until it was disabled in favor of reading all columns as string². A note to include boolean was added³ but removed inadvertently in another big change⁴. ¹ b325b97: Try converting all columns to numerical type ² 9f9be3a: Read all metadata as string type ³ 725e1b4: Expand comment on numeric conversion ⁴ b0a0d11: Add --query-columns option
This was not previously supported by pandas.read_csv's built-in type inference, but it aligns with the existing support for nullable numeric columns.
6765b8b
to
d44506d
Compare
Description of proposed changes
Boolean conversion was not considered when automatic nullable numeric conversion was applied¹, but it continued to work because augur filter relied on pandas.read_csv's automatic type inference up until it was disabled in favor of reading all columns as string².
A note to include boolean was added³ but removed inadvertently in another big change⁴.
¹ b325b97: Try converting all columns to numerical type
² 9f9be3a: Read all metadata as string type
³ 725e1b4: Expand comment on numeric conversion
⁴ b0a0d11: Add --query-columns option
Related issue(s)
Checklist