read_metadata lacks standard error handling #576
Labels
breaking
Makes a backwards incompatible change and should wait for major release
bug
Something isn't working
easy problem
Requires less work than most issues
good first issue
A relatively isolated issue appropriate for first-time contributors
priority: moderate
To be resolved after high priority issues
Current Behavior
The read_metadata function in augur.utils allows users to pass invalid filenames or existing files with missing columns and returns an empty metadata set in each of these cases.
Expected behavior
read_metadata should raise appropriate errors when it receives invalid inputs. These invalid inputs most likely represent issues with the input data or the way the user is calling the function. By silently accepting these errors, read_metadata masks real issues with the data.
How to reproduce
See the unit tests for missing and invalid file names and for files with missing strain/name fields.
Possible solution
Empty, invalid, or missing filenames should raise a FileNotFoundError exception and not return an empty result set. pandas already raises this exception when read_csv is called on a path that does not exist.
An existing file that doesn't have a strain or name column should raise a KeyError exception instead of returning an empty set. This exception corresponds to the case where a user tries to access a dictionary key that does not exist. Since a strain or name column is required for downstream analyses, we need to raise an exception when both are missing.
Additional context
This issue was prompted by the unit tests contributed in #564, which document the unexpected behavior.
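For discussion, the behavior proposed under "Possible solution" could look roughly like the sketch below. This is not augur's actual implementation: read_metadata_strict is a hypothetical name, the (metadata, columns) return shape is an assumption, and the sketch uses the standard library csv module instead of pandas so that it runs on its own. It also assumes files are tab-separated unless the name ends in .csv.

```python
import csv


def read_metadata_strict(fname):
    """Hypothetical strict reader; a sketch, not augur's read_metadata."""
    # Treat an empty or missing filename like a nonexistent file
    # instead of returning an empty result set.
    if not fname:
        raise FileNotFoundError("No metadata file name was provided")

    # open() raises FileNotFoundError on its own when the path does not exist.
    with open(fname, newline="") as handle:
        # Assumption: comma-separated for .csv, tab-separated otherwise.
        sep = "," if str(fname).endswith(".csv") else "\t"
        reader = csv.DictReader(handle, delimiter=sep)
        columns = reader.fieldnames or []

        # Prefer "strain" and fall back to "name"; fail loudly if neither exists.
        key = next((c for c in ("strain", "name") if c in columns), None)
        if key is None:
            raise KeyError(f"{fname} has no 'strain' or 'name' column")

        # Index each record by its strain/name value.
        metadata = {row[key]: row for row in reader}

    return metadata, columns
```

With this shape, callers that previously received an empty result for a bad path or a file without identifiers would see an exception instead, which is why the change is labeled as breaking.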