potential bugs from combining v5 + v6 augur #379

Closed
jameshadfield opened this issue Oct 2, 2019 · 3 comments

@jameshadfield

Issue moved from slack to here.

  • v5 augur translate is different to v6 augur translate (off-by-one changes 😱 )
  • v5 augur translate + v5 augur export is the same as v6 augur translate + v6 augur export v1 (by design, so that v1 JSONs don't change)

This leads to the following potential bugs. I'm not sure whether it's worth it to try to catch them.

  • v5 augur translate + v6 augur export v1 will produce v1 JSONs whose genes have off-by-one starts and are all on the positive strand.
  • v6 augur translate + v5 augur export will produce similarly buggy JSONs (I think)
@emmahodcroft

v5 translate + v5 export and v6 translate + v6 export v1 look like this:

  "annotations": {
    "3D": {
      "end": 1386,
      "start": 0,
      "strand": 1
    },
    "nuc": {
      "end": 1386,
      "start": 0,
      "strand": 1
    }
  }

v6 translate + v5 export looks like this: (incorrect)

  "annotations": {
    "3D": {
      "end": 1386,
      "seqid": "config/echo30-3D-ref.gb",
      "start": 1,
      "strand": "+",
      "type": "CDS"
    },
    "nuc": {
      "end": 1386,
      "seqid": "config/echo30-3D-ref.gb",
      "start": 1,
      "strand": "+",
      "type": "source"
    }
  },

I think the only way to address this in 'real life' is to put a check in auspice - a v1-style JSON should never have seqid, type, etc - if it does, numbers should be adjusted accordingly, as it's a sign a v6 translate output has been put in a v5 export. But the risk of this happening is probably small - once people go to augur v6 it's unlikely they go back - possibly not worth putting in auspice?

We can add a check in v5 export for ourselves, and anyone else who actually is going back and forth for some reason, with the hopes that they are updating both versions.
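
A check like that could be sketched roughly as follows. This is only a minimal Python sketch, not code taken from augur: the function name `looks_like_v6_annotations` and the set `V6_ONLY_KEYS` are hypothetical, and the choice of marker keys is an assumption based on the two outputs shown above.

```python
# Keys seen only in the v6-style annotations above (assumed markers, not an official list).
V6_ONLY_KEYS = {"seqid", "type"}


def looks_like_v6_annotations(annotations):
    """Return True if any entry has v6-only keys or a "+"/"-" strand."""
    for feature in annotations.values():
        if V6_ONLY_KEYS.intersection(feature):
            return True
        if feature.get("strand") in ("+", "-"):
            return True
    return False


v5_style = {"3D": {"end": 1386, "start": 0, "strand": 1}}
v6_style = {
    "3D": {
        "end": 1386,
        "seqid": "config/echo30-3D-ref.gb",
        "start": 1,
        "strand": "+",
        "type": "CDS",
    }
}

print(looks_like_v6_annotations(v5_style))  # False
print(looks_like_v6_annotations(v6_style))  # True
```

A v5 export could warn or stop when this returns True instead of writing the annotations through unchanged.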

v5 translate + v6 export v1 looks like this: (incorrect)

  "annotations": {
    "3D": {
      "end": 1386,
      "start": -1,
      "strand": 1
    },
    "nuc": {
      "end": 1386,
      "start": -1,
      "strand": 1
    }
  },

For this, we should add a check in v6 export v1.
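
One possible shape for that check, as a hedged Python sketch: the helper name `find_negative_starts` is hypothetical, and it assumes the check runs on the annotations after export v1 has shifted the coordinates. It only catches features that began at the first position, as `3D` and `nuc` do in the example above.

```python
def find_negative_starts(annotations):
    """Return names of annotation entries whose start coordinate is below zero."""
    return [
        name
        for name, feature in annotations.items()
        if feature.get("start", 0) < 0
    ]


# The incorrect output shown above.
bad = {
    "3D": {"end": 1386, "start": -1, "strand": 1},
    "nuc": {"end": 1386, "start": -1, "strand": 1},
}

print(find_negative_starts(bad))  # ['3D', 'nuc']
```

A non-empty result would be the signal that the translate output probably came from v5.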

I'll propose some PRs for the above.

Finally, v5 translate + v6 export v2 fails validation (because strand is 0 or 1 instead of + or -), and is also off-by-one. However, it's not very intuitive that the problem is the 'version' of the translate file. Should we add a more explicit check for this in export v2?
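
A more explicit check might look something like this. It is a Python sketch under assumptions: `check_strands` is a hypothetical helper, not an existing augur function, and the error wording is only a suggestion.

```python
def check_strands(annotations):
    """Raise ValueError naming the likely cause if a strand is not "+" or "-"."""
    for name, feature in annotations.items():
        strand = feature.get("strand")
        if strand not in ("+", "-"):
            raise ValueError(
                f"Annotation {name!r} has strand {strand!r}, expected '+' or '-'. "
                "The annotations may have been produced by an older (v5) "
                "augur translate; try re-running translate with the current version."
            )


v5_style = {"3D": {"end": 1386, "start": 0, "strand": 1}}
v6_style = {"3D": {"end": 1386, "start": 1, "strand": "+", "type": "CDS"}}

check_strands(v6_style)  # passes silently
try:
    check_strands(v5_style)
except ValueError as err:
    print(err)
```

The point is that the message names the translate version as the probable cause, which the schema validation error alone does not.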

@tsibley

tsibley commented Nov 4, 2019

One simple solution here might be to include an Augur major version in the intermediate JSON files produced by Augur and check that for sanity during augur export or any other step which combines the intermediate files.
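
As a rough illustration of that idea, here is a Python sketch. The key name `augur_major_version` and both helper names are hypothetical; nothing here is an existing augur field or function.

```python
VERSION_KEY = "augur_major_version"  # hypothetical key name


def stamp_with_major_version(data, major_version):
    """Return a copy of an intermediate JSON dict with the major version recorded."""
    return {**data, VERSION_KEY: major_version}


def check_major_version(data, expected_major):
    """Raise ValueError if the file was written by a different (or unknown) major version."""
    found = data.get(VERSION_KEY)
    if found != expected_major:
        raise ValueError(
            f"Intermediate file was written by augur major version {found!r}, "
            f"but this step expects {expected_major!r}."
        )


translate_output = stamp_with_major_version({"annotations": {}}, 6)
check_major_version(translate_output, 6)  # passes silently
```

Files with no version key at all (anything written before such a field existed) would fail the check too, which seems like the safer default.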

@jameshadfield

Closed by #396 & #392
