Releases: nextstrain/augur

26.0.0

17 Sep 17:35

These release notes are automatically extracted from the full changelog.

Major Changes

  • filter: Duplicate header names in the FASTA file (--sequences) will now result in an error. #1613 (@victorlin)
  • parse: When both strain and name fields are present, the strain field will now be used as the sequence ID field. #1629 (@victorlin)
  • merge: Generated source columns (e.g. __source_metadata_{NAME}) are now omitted by default. They may be explicitly included with --source-columns=TEMPLATE or explicitly omitted with --no-source-columns. This may be a breaking change for any existing uses of augur merge relying on the generated columns, though as augur merge is relatively new we believe usage to be scant if extant at all. #1625 #1632 (@tsibley)
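
Since duplicate FASTA headers are now an error in augur filter, it may be worth checking inputs before upgrading. The following is a minimal sketch, not Augur's implementation: the function name find_duplicate_fasta_ids is hypothetical, and it assumes the sequence ID is the first whitespace-delimited token after ">".

```python
from collections import Counter
from typing import Iterable, List


def find_duplicate_fasta_ids(lines: Iterable[str]) -> List[str]:
    """Return sequence IDs that appear in more than one FASTA header line."""
    counts = Counter()
    for line in lines:
        if line.startswith(">"):
            # Assumption: the ID is the first whitespace-delimited token.
            header = line[1:].strip()
            if header:
                counts[header.split()[0]] += 1
    return sorted(seq_id for seq_id, n in counts.items() if n > 1)


example = [">A some description", "ACGT", ">B", "GGCC", ">A", "TTAA"]
print(find_duplicate_fasta_ids(example))
```

An open file handle can be passed in place of the example list, since the function only iterates over lines.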

Bug Fixes

  • filter: Previously, when --subsample-max-sequences was slightly lower than the number of groups, it was possible to fail with an uncaught AssertionError. Internal calculations have been adjusted to prevent this from happening. #1588 #1598 (@victorlin)

25.4.0

03 Sep 19:06

Features

  • merge: Table-specific id columns and delimiters may now be specified, e.g. --metadata-id-columns X=id Y=strain and --metadata-delimiters X=, Y=';', to allow more precise behaviour and avoid ordering issues. #1594 (@tsibley)
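
The NAME=VALUE form used by these options can be illustrated with a short sketch. This is an assumption about how such arguments might be split, not Augur's actual parser; parse_named_values is a hypothetical helper. Splitting on the first "=" only lets the value itself contain "=".

```python
from typing import Dict, List


def parse_named_values(pairs: List[str]) -> Dict[str, str]:
    """Map table names to values from NAME=VALUE arguments."""
    result: Dict[str, str] = {}
    for pair in pairs:
        # partition() splits on the first "=" only.
        name, sep, value = pair.partition("=")
        if not sep or not name:
            raise ValueError(f"expected NAME=VALUE, got {pair!r}")
        result[name] = value
    return result


# The shell removes the quotes around ';' before the program sees it.
print(parse_named_values(["X=id", "Y=strain"]))
print(parse_named_values(["X=,", "Y=;"]))
```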

Bug Fixes

  • filter: Improved warning and error messages in the case of missing columns. #1604 (@victorlin)
  • merge: Any user-customized ~/.sqliterc file is now ignored so it doesn't break augur merge's internal use of SQLite. #1608 (@tsibley)
  • merge: Non-id columns in metadata inputs that would conflict with the output id column are now forbidden and will cause an error if present. Previously they would overwrite values in the output id column, causing incorrect output. #1593 (@tsibley)
  • import: Spaces in BEAST MCC tree annotations (for example, from a discrete state reconstruction) no longer break augur import beast's parsing. #1610 (@watronfire)

25.3.0

22 Aug 17:39

Features

  • A new command, augur merge, now allows for generalized merging of two or more metadata tables. #1563 (@tsibley)
  • Two new commands, augur read-file and augur write-file, now allow external programs to do i/o like Augur by piping from/to these new commands. They provide handling of compression formats and newlines consistent with the rest of Augur. #1562 (@tsibley)
  • A new debugging mode can be enabled by setting the AUGUR_DEBUG environment variable to 1 (or any non-empty value). Currently the only effect is to print more information about handled (i.e. anticipated) errors. For example, stack traces and parent exceptions in an exception chain are normally omitted for handled errors, but setting this env var includes them. Future debugging and troubleshooting features, like verbose operation logging, will likely also condition on this new debugging mode. #1577 (@tsibley)
  • filter: Added the ability to use weights in subsampling. See help text of --group-by-weights and the updated Filtering and Subsampling guide for more information. #1454 (@victorlin)
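
The AUGUR_DEBUG behaviour described above (enabled by any non-empty value, with stack traces shown only in debug mode) can be sketched roughly as follows. The function names are hypothetical and the message formatting is an assumption; only the environment variable name comes from the release notes.

```python
import os
import traceback
from typing import Mapping


def debug_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """True when AUGUR_DEBUG is set to any non-empty value."""
    return bool(environ.get("AUGUR_DEBUG"))


def format_handled_error(error: Exception, environ: Mapping[str, str] = os.environ) -> str:
    """Short message normally; full traceback text in debug mode."""
    if debug_enabled(environ):
        return "".join(
            traceback.format_exception(type(error), error, error.__traceback__)
        )
    return str(error)


print(format_handled_error(ValueError("bad input"), {}))
print(format_handled_error(ValueError("bad input"), {"AUGUR_DEBUG": "1"}))
```

Under this reading of "any non-empty value", setting AUGUR_DEBUG=0 would also enable debugging; unsetting the variable or setting it to an empty string disables it.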

Bug Fixes

  • Embedded newlines in quoted field values of metadata files read/written by many commands, annotation files read by augur curate apply-record-annotations, and index files written by augur index are now properly handled. #1561 #1564 (@tsibley)
  • Output written to stderr (e.g. informational messages, warnings, errors, etc.) is now always line-buffered regardless of the Python version in use. This helps with interleaved stderr and stdout. Previously, stderr was block-buffered on Python 3.8 and line-buffered on 3.9 and higher. #1563 (@tsibley)
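
To see why embedded newlines in quoted fields need care, the sketch below reads a small TSV with Python's standard csv module. It illustrates the general issue only and does not reproduce Augur's reader; the column names and values are made up.

```python
import csv
import io

# One record has a quoted field that spans two physical lines.
tsv_text = 'strain\tnote\nA\t"line one\nline two"\nB\tplain\n'


def read_tsv(text: str):
    """Parse TSV text, keeping newlines that occur inside quoted fields."""
    # newline="" leaves line endings untouched so the csv module can handle them.
    handle = io.StringIO(text, newline="")
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_tsv(tsv_text)
print(len(rows))              # number of logical records
print(repr(rows[0]["note"]))  # the embedded newline is preserved
```

A naive split on newlines would see three data lines here rather than two records, which is the kind of mismatch this fix addresses.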

25.2.0

24 Jul 16:41

Features

  • export v2: we now limit numerical precision on floats in the JSON. This should not change how a dataset is displayed / interpreted in Auspice but allows the gzipped & minimised JSON filesize to be reduced by around 30% (dataset-dependent). #1512 (@jameshadfield)
  • traits, export v2: augur traits now reports all confidence values above 0.1% rather than limiting them to the top 4 results. There is no change in the eventual Auspice dataset as augur export v2 will still only consider the top 4. #1512 (@jameshadfield)
  • curate: Excel (.xlsx and .xls) and OpenOffice (.ods) spreadsheet files are now also supported as metadata inputs (--metadata). The first sheet in the workbook is read as tabular data. #1550 (@tsibley)
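
The idea behind limiting float precision in the exported JSON can be sketched as below. This is an illustration only: the release notes do not state which precision augur export v2 uses, so the 6 significant digits here are an arbitrary assumption, and round_floats is a hypothetical helper.

```python
import json


def round_floats(value, significant_digits: int = 6):
    """Recursively round floats in nested dicts and lists."""
    if isinstance(value, float):
        # Format to N significant digits, then convert back to float.
        return float(f"{value:.{significant_digits}g}")
    if isinstance(value, dict):
        return {k: round_floats(v, significant_digits) for k, v in value.items()}
    if isinstance(value, list):
        return [round_floats(v, significant_digits) for v in value]
    return value  # ints, strings, booleans, None are left unchanged


node = {"div": 0.123456789, "num_date": 2020.123456789, "count": 3}
print(json.dumps(round_floats(node)))
```

Shorter float representations compress well, which is consistent with the file-size reduction described above.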

Bug Fixes

  • titers sub: Fixes a bug where antigenic weights were assigned to branches for substitutions in the incorrect order of <derived allele><position><ancestral allele> instead of <ancestral allele><position><derived allele>. #1555 (@huddlej)
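
The corrected order, <ancestral allele><position><derived allele>, can be made concrete with a small parser. This is a sketch with hypothetical names; it assumes single-character alleles and ignores any gene prefix a real substitution label might carry.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    ancestral: str
    position: int
    derived: str


# One allele character, a position, one allele character.
_PATTERN = re.compile(r"^([A-Za-z*-])(\d+)([A-Za-z*-])$")


def parse_substitution(text: str) -> Substitution:
    """Parse a label such as 'K144N' as ancestral K, position 144, derived N."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a substitution: {text!r}")
    ancestral, position, derived = match.groups()
    return Substitution(ancestral, int(position), derived)


print(parse_substitution("K144N"))
```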

25.1.1

15 Jul 17:55

Bug Fixes

  • curate parse-genbank-location: Fix a bug where a mix of empty and populated location-field values would result in inconsistent fields in the output NDJSON. #1531 (@genehack)

25.1.0

11 Jul 23:39

Features

25.0.0

10 Jul 21:39

Major Changes

  • curate format-dates: Raises an error if provided date field does not exist in records. #1509 (@joverlee521)
  • All curate subcommands: Verifies all input records have the same fields and raises an error if a record does not have matching fields. #1518 (@joverlee521)
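
The field-consistency check described above can be sketched as a generator that passes records through and fails on the first mismatch. This is an assumption about the behaviour, not Augur's code: check_consistent_fields is a hypothetical name, and field order is ignored here by comparing sets.

```python
from typing import Dict, Iterable, Iterator


def check_consistent_fields(records: Iterable[Dict]) -> Iterator[Dict]:
    """Yield records unchanged; raise if a record's fields differ from the first."""
    expected = None
    for index, record in enumerate(records):
        fields = set(record)
        if expected is None:
            expected = fields
        elif fields != expected:
            raise ValueError(
                f"record {index} has fields {sorted(fields)}, "
                f"expected {sorted(expected)}"
            )
        yield record


records = [{"strain": "A", "date": "2020"}, {"date": "2021", "strain": "B"}]
print(len(list(check_consistent_fields(records))))
```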

Features

  • Added a new sub-command augur curate apply-geolocation-rules to apply user-curated geolocation rules to the geolocation fields in a metadata file. Previously, this was available as a script within the nextstrain/ingest repo. #1491 (@victorlin)
  • Added a default color for the "Asia" region that will be used in augur export if no custom colors are provided. #1490 (@joverlee521)
  • Added a new sub-command augur curate apply-record-annotations to apply user-curated annotations to existing fields in a metadata file. Previously, this was available as the merge-user-metadata script in the nextstrain/ingest repo. #1495 (@joverlee521)
  • Added a new sub-command augur curate abbreviate-authors to abbreviate lists of authors to "<first author> et al." Previously, this was available as the transform-authors script within the nextstrain/ingest repo. #1483 (@genehack)
  • Added a new sub-command augur curate parse-genbank-location to parse the geo_loc_name field from GenBank records. Previously, this was available as the translate-genbank-location script within the nextstrain/ingest repo. #1485 (@genehack)
  • curate format-dates: Added defaults to --expected-date-formats so that ISO 8601 dates (%Y-%m-%d) and its various masked forms (e.g. %Y-XX-XX) are automatically parsed by the command. #1501 (@joverlee521)
  • Added a new sub-command augur curate transform-strain-name to filter strain names based on matching a regular expression. Previously, this was available as the transform-strain-names script within the nextstrain/ingest repo. #1514 (@genehack)
  • Added a new sub-command augur curate rename to rename field / column names. Previously, a similar version was available as the transform-field-names script within the nextstrain/ingest repo; however, the behaviour is slightly changed here. #1506 (@jameshadfield)

Bug Fixes

  • filter: Improve speed of checking duplicates in metadata, especially for large files. #1466 (@victorlin)
  • curate: Stop adding double quotes to the metadata TSV output when field values have internal quotes. #1493 (@joverlee521)
  • curate format-dates: Mask empty date values as XXXX-XX-XX to represent unknown dates. #1509 (@joverlee521)
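
The two curate format-dates entries in this release (default expected formats for ISO 8601 dates and their masked forms, and masking of empty values as XXXX-XX-XX) can be sketched together. This is a simplified illustration, not Augur's implementation: the function name and the exact list of default formats are assumptions based on the examples given in the notes.

```python
from datetime import datetime
from typing import Sequence

# Assumed defaults: full ISO 8601 date plus masked day and masked month/day.
DEFAULT_FORMATS = ("%Y-%m-%d", "%Y-%m-XX", "%Y-XX-XX")


def format_date(value: str, expected_formats: Sequence[str] = DEFAULT_FORMATS) -> str:
    """Return an ISO 8601 date string, using XX for unknown parts."""
    if value.strip() == "":
        return "XXXX-XX-XX"  # empty dates are fully masked
    for fmt in expected_formats:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue  # try the next expected format
        year = f"{parsed.year:04d}"
        month = f"{parsed.month:02d}" if "%m" in fmt else "XX"
        day = f"{parsed.day:02d}" if "%d" in fmt else "XX"
        return f"{year}-{month}-{day}"
    raise ValueError(f"unable to parse date {value!r}")


for raw in ["2020-03-15", "2020-03-XX", "2020-XX-XX", ""]:
    print(repr(raw), "->", format_date(raw))
```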

24.4.0

15 May 23:20

Features

  • All commands: Allow repeating an option that takes multiple values. Previously, if multiple option flags were specified (e.g. --exclude-where 'region=A' --exclude-where 'region=B'), only the last one was used. Now, all values are used. #1445 (@victorlin)
  • ancestral, translate: output node data files are now validated. The argument --validation-mode is added which controls this behaviour (default: error). This argument also controls validation of the input node-data file (ancestral only). #1440 (@jameshadfield)
  • export: Updated default latitudes and longitudes for geography traits. This only applies if you are not using --lat-longs to override the built in mappings. #1449 (@trvrb)
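
The difference between "last flag wins" and "all values are used" for a repeated multi-value option can be demonstrated with Python's argparse. This sketch uses the standard "extend" action (Python 3.8 and later) and is not necessarily how Augur implements the feature; build_parser is a hypothetical helper.

```python
import argparse


def build_parser(action: str) -> argparse.ArgumentParser:
    """Build a parser whose --exclude-where option uses the given action."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--exclude-where", nargs="+", action=action, default=[])
    return parser


argv = ["--exclude-where", "region=A", "--exclude-where", "region=B", "region=C"]

# "store" keeps only the values from the last occurrence of the flag.
print(build_parser("store").parse_args(argv).exclude_where)

# "extend" accumulates values across every occurrence.
print(build_parser("extend").parse_args(argv).exclude_where)
```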

Bug Fixes

  • validation: we no longer exit with a non-zero exit code when the requested validation mode is "warn". #1440 (@jameshadfield)
  • validation: we no longer perform any validation when the requested validation mode is "skip". #1440 (@jameshadfield)
  • filter: Send all log messages to stderr. This allows output to be written to stdout (e.g. --output-strains /dev/stdout). #1459 (@victorlin)
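
The intended semantics of the three validation modes (error, warn, skip) can be summarised in a short sketch. The names validate and ValidationFailed are hypothetical, and the warning text format is an assumption; only the mode names and their described effects come from the release notes.

```python
import sys
from typing import Callable, List


class ValidationFailed(Exception):
    """Hypothetical error type used only in this sketch."""


def validate(check: Callable[[], List[str]], mode: str = "error") -> bool:
    """Run `check` according to `mode`; return True if no problems were reported."""
    if mode == "skip":
        return True  # the check is never called
    problems = check()
    if not problems:
        return True
    if mode == "warn":
        for problem in problems:
            print(f"WARNING: {problem}", file=sys.stderr)
        return False  # reported, but the caller carries on and exits normally
    if mode == "error":
        raise ValidationFailed("; ".join(problems))
    raise ValueError(f"unknown validation mode: {mode!r}")


print(validate(lambda: ["missing key"], mode="warn"))
print(validate(lambda: ["missing key"], mode="skip"))
```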

24.3.0

18 Mar 17:20

Features

  • filter: Added a new option --max-length to filter out sequences that are longer than a certain number of base pairs. #1429 (@victorlin)
  • parse: Added support for environments that use pandas 2.x. #1436 (@emollier, @victorlin)

Bug Fixes

  • filter: Updated docs with an example of tiered subsampling. #1425 (@victorlin)
  • export: Fixes bug #1433, introduced in v23.1.0, that caused validation to fail when gene names start with nuc, e.g. nucleocapsid. #1434 (@corneliusroemer)
  • import: Fixes bug introduced in v24.2.0 that prevented import beast from running. #1439 (@tomkinsc)
  • translate, ancestral: Compound CDS are now exported as segmented CDS and are now viewable in Auspice. #1438 (@jameshadfield)

24.2.3

23 Feb 22:08

Bug Fixes

  • filter: Updated the help and report text of --min-length to explicitly state that the minimum length filter only counts standard nucleotide characters A, C, G, or T (case-insensitive). This has been the behavior since version 3.0.3.dev1, but has never been explicitly documented. #1422 (@joverlee521)
  • frequencies: Fixed a bug introduced in 24.2.0 and 24.1.0 that prevented --regions from working when providing regions other than the default "global" region. #1424
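
The --min-length clarification in 24.2.3 (only A, C, G, and T are counted, case-insensitively) can be illustrated as follows. This is a sketch of the documented counting rule with hypothetical function names, not Augur's code.

```python
def count_standard_nucleotides(sequence: str) -> int:
    """Count A, C, G, and T characters, ignoring case."""
    return sum(1 for base in sequence.upper() if base in "ACGT")


def passes_min_length(sequence: str, min_length: int) -> bool:
    """True when the sequence has at least `min_length` standard nucleotides."""
    return count_standard_nucleotides(sequence) >= min_length


# Ns, gaps, and ambiguity codes such as R and Y are not counted.
print(count_standard_nucleotides("acgtNN-RY"))
print(passes_min_length("acgtNN-RY", 5))
```

A sequence can therefore be longer than the threshold in raw characters and still be filtered out if too many of its characters are ambiguous or gaps.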