Refactor filters into separate functions
Refactors filter logic into separate functions that share the signature
`func(metadata, **kwargs)` and return a `set` of strain names that
pass the filter. Although this work does not reduce the complexity of
the code by itself, it sets up a pattern that will allow us to apply all
user-requested filters in a single loop. This
change should simplify the main logic and also allow us to short-circuit
evaluation when filters remove all possible strains (e.g.,
`--exclude-all`), avoiding unnecessary checks.
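
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function names, the `apply_filters` helper, and the use of a plain dict keyed by strain name in place of the metadata data frame are all assumptions made to keep the example self-contained.

```python
# Hypothetical sketch: a dict keyed by strain name stands in for the
# metadata data frame, and all function names are illustrative only.

def filter_by_exclude_all(metadata, **kwargs):
    """Exclude every strain, so no strain passes."""
    return set()


def filter_by_exclude(metadata, exclude_strains=(), **kwargs):
    """Pass every strain that is not explicitly excluded."""
    return set(metadata) - set(exclude_strains)


def filter_by_min_length(metadata, min_length=0, **kwargs):
    """Pass strains whose recorded length is at least min_length."""
    return {
        strain
        for strain, record in metadata.items()
        if record["length"] >= min_length
    }


def apply_filters(metadata, filters):
    """Apply (function, kwargs) pairs in order in a single loop.

    Returns the strains that pass every applied filter and the names of
    the filters that actually ran. The loop stops early once no strains
    remain, so later filters are never evaluated.
    """
    passing = set(metadata)
    applied = []
    for filter_function, kwargs in filters:
        passing &= filter_function(metadata, **kwargs)
        applied.append(filter_function.__name__)
        if not passing:
            break
    return passing, applied


metadata = {
    "A": {"length": 100},
    "B": {"length": 50},
    "C": {"length": 200},
}
passing, applied = apply_filters(
    metadata,
    [
        (filter_by_exclude, {"exclude_strains": ["A"]}),
        (filter_by_min_length, {"min_length": 80}),
    ],
)
```

Because every filter accepts the same arguments and returns a set, the main loop needs no filter-specific branches, and a filter such as the hypothetical `filter_by_exclude_all` ends evaluation immediately.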

This refactoring also includes new functions for sequence-based filters.
As part of these sequence-based functions, we update the sequence index
data frame to be indexed by strain name to be consistent with the
metadata data frame.

One side effect of this refactoring is the addition of a functional
test for both `--include-where` and `--exclude-where` filters to make
sure these are properly implemented and no regressions occur during
refactoring. The lack of this test initially allowed the refactoring of
`--exclude-where` logic to introduce a bug.

Finally, we also define a new function to include strains by a query.
Note that this implementation relies on the same query parser used by
the `--exclude-where` argument, so include queries now accept the
negation operator and have their strings lowercased before comparison.
This change is backward compatible, however, and only adds functionality
that is consistent with the `--exclude-where` functionality.
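
As a rough illustration of the behavior described above, the sketch below parses a query, handles negation, and compares lowercased strings. The `column=value` / `column!=value` query syntax, the function names, and the dict-based metadata are assumptions for the example and may not match the actual parser.

```python
# Hypothetical sketch of a shared where-query parser; the names and the
# exact query syntax are assumptions, not the implementation in this commit.

def parse_where_query(query):
    """Split 'column=value' or 'column!=value' into (column, negate, value).

    The value is lowercased so later comparisons are case-insensitive.
    """
    if "!=" in query:
        column, value = query.split("!=", 1)
        negate = True
    elif "=" in query:
        column, value = query.split("=", 1)
        negate = False
    else:
        raise ValueError(f"Expected 'column=value' or 'column!=value': {query!r}")
    return column.strip(), negate, value.strip().lower()


def include_where(metadata, include_where_query, **kwargs):
    """Return the set of strain names that match the include query.

    Records that lack the queried column never match in this sketch.
    """
    column, negate, value = parse_where_query(include_where_query)
    matches = set()
    for strain, record in metadata.items():
        if column not in record:
            continue
        is_equal = str(record[column]).lower() == value
        # With negation, keep the strains whose value differs.
        if is_equal != negate:
            matches.add(strain)
    return matches


metadata = {
    "A": {"region": "Europe"},
    "B": {"region": "Asia"},
}
included = include_where(metadata, "region=europe")
```

Sharing one parser between the include and exclude paths means both arguments interpret a query the same way, which is the consistency the commit message refers to.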
huddlej committed Jul 9, 2021
1 parent 76b2c86 commit dbf000f
Showing 4 changed files with 369 additions and 83 deletions.
3 changes: 2 additions & 1 deletion .pylintrc
@@ -142,7 +142,8 @@ disable=print-statement,
 multiple-imports,
 no-else-return,
 unscriptable-object,
-relative-beyond-top-level
+relative-beyond-top-level,
+no-member

 # Enable the message, report, category or checker with the given id(s). You can
 # either give multiple identifier separated by comma (,) or put this option