Proteins with no genes #31

stuppie · 2018-03-05T23:04:36Z

I think it would be useful for mygene to also store information about proteins with no associated Entrez record. For example:
http://www.uniprot.org/uniprot/A2NXD2
http://www.uniprot.org/uniprot/Q5NV61

sirloon · 2018-05-10T17:37:45Z

@newgene this issue would require adjusting the ID conversion in the uniprot parser. Currently it tries to convert uniprot_acc to an Entrez ID or, if that is not possible, an Ensembl ID. If neither is available, the document is skipped. The fix is probably somewhere around here: https://github.com/biothings/mygene.info/blob/master/src/hub/dataload/sources/uniprot/parser.py#L53. What do you think?

newgene · 2018-05-10T22:23:23Z

We need to give this one more thought. MyGene.info is supposed to be all about genes: if it is not a gene, there is no record in MyGene.info. But I agree that including those UniProt IDs is useful, as genes and proteins are so closely tied together. A protein with no associated gene ID just means the corresponding gene has not been identified yet; there should still be a gene somewhere in the genome encoding this protein.

With this in mind, I am not against the idea of giving a document a "fake" gene ID placeholder and putting the corresponding UniProt ID within this document (so that this UniProt ID will be searchable).

One way of making this "fake" gene id is like this:

"_id": "NO_GENE_ID_FOR_A2NXD2"

This expands the gene _id priority list to three tiers: NCBI Gene ID --> Ensembl Gene ID --> NO_GENE_ID for UniProt-only genes.
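
A rough sketch of that priority list, just to make the idea concrete. This is not the actual parser code: `choose_doc_id`, its arguments, and the `NO_GENE_PREFIX` constant are hypothetical names, and the IDs in the usage lines are made-up placeholders.

```python
from typing import Optional

# Hypothetical prefix for placeholder IDs, following the "_id" example above.
NO_GENE_PREFIX = "NO_GENE_ID_FOR_"


def choose_doc_id(
    uniprot_acc: str,
    entrez_id: Optional[str] = None,
    ensembl_id: Optional[str] = None,
) -> str:
    """Pick a document _id using the proposed three-tier priority."""
    if entrez_id:
        # Tier 1: NCBI (Entrez) Gene ID
        return str(entrez_id)
    if ensembl_id:
        # Tier 2: Ensembl Gene ID
        return str(ensembl_id)
    # Tier 3: placeholder, instead of skipping the document
    return f"{NO_GENE_PREFIX}{uniprot_acc}"


# Placeholder inputs, not real ID mappings
print(choose_doc_id("A2NXD2"))
print(choose_doc_id("ACC_X", entrez_id="12345"))
print(choose_doc_id("ACC_Y", ensembl_id="ENSG_PLACEHOLDER"))
```

With a fixed prefix like this, UniProt-only documents would also be easy to recognize (and filter out if needed) by their _id alone.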

Your opinions? @stuppie @sirloon @cyrus0824 @andrewsu
