ARAX synonym returning unexpected Biolink prefixes #1920

Open
gaurav opened this issue Oct 10, 2022 · 1 comment

gaurav commented Oct 10, 2022

The NodeNorm team received an issue from a user who was receiving unusual identifiers in the ARAX synonym search for Interleukin 2: TranslatorSRI/NodeNormalization#151

Oddly enough, the primary identifier it returns is LOINC:MTHU015779, and LOINC is not a valid identifier prefix for biolink:Protein, biolink:SmallMolecule or biolink:Gene. By contrast, the SRI NameRes service returns a number of results including PUBCHEM.COMPOUND:130168 and NCBIGene:3558 (https://name-lookup.transltr.io/lookup?string=interleukin%202&offset=0&limit=10), which refer to the small molecule and gene forms of interleukin-2 respectively (https://nodenorm.transltr.io/1.3/get_normalized_nodes?curie=PUBCHEM.COMPOUND%3A130168&curie=NCBIGene%3A3558&conflate=false).
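
For anyone who wants to reproduce the comparison, here is a minimal sketch that rebuilds the two query URLs quoted above and pulls the prefix out of a CURIE. The function names (`nameres_url`, `nodenorm_url`, `curie_prefix`) are made up for illustration and are not part of any SRI or ARAX client; only the endpoints and query parameters are taken from the links in this comment.

```python
from urllib.parse import quote, urlencode

# Endpoints copied from the URLs quoted in this issue.
NAMERES_LOOKUP = "https://name-lookup.transltr.io/lookup"
NODENORM_GET = "https://nodenorm.transltr.io/1.3/get_normalized_nodes"


def nameres_url(term, offset=0, limit=10):
    """Build a NameRes lookup URL for a free-text term (hypothetical helper)."""
    params = [("string", term), ("offset", offset), ("limit", limit)]
    # quote_via=quote encodes spaces as %20 rather than '+', matching the link above.
    return NAMERES_LOOKUP + "?" + urlencode(params, quote_via=quote)


def nodenorm_url(curies, conflate=False):
    """Build a NodeNorm URL with one 'curie' parameter per CURIE (hypothetical helper)."""
    params = [("curie", c) for c in curies]
    params.append(("conflate", "true" if conflate else "false"))
    return NODENORM_GET + "?" + urlencode(params, quote_via=quote)


def curie_prefix(curie):
    """Return the part of a CURIE before the first colon."""
    return curie.split(":", 1)[0]


if __name__ == "__main__":
    print(nameres_url("interleukin 2"))
    print(nodenorm_url(["PUBCHEM.COMPOUND:130168", "NCBIGene:3558"]))
    print(curie_prefix("LOINC:MTHU015779"))
```

The resulting URLs can be fetched with any HTTP client; the sketch deliberately stops at building them so it makes no assumptions about the shape of either service's response.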

Could you please check how this identifier is getting into the ARAX synonym search? If it's coming from an SRI tool, please let us know!

amykglen (Member) commented

Hi @gaurav - so in our latest ARAX Node Synonymizer, there are a few clusters for Interleukin 2. The first two correspond to the gene/protein vs. small molecule versions of IL2 and align with the SRI's clusters for those concepts:

https://arax.ncats.io/devLM/?term=NCBIGene:3558
https://arax.ncats.io/devLM/?term=PUBCHEM.COMPOUND:130168

And the third cluster includes that LOINC identifier you reported:

https://arax.ncats.io/devLM/?term=LOINC:MTHU015779

This third cluster includes only nodes from RTX-KG2 that are not recognized by the SRI NodeNormalizer. They're not merged into the first two clusters because their category is NamedThing (RTX-KG2 could probably assign something more specific, but that is the level of detail it currently provides; I just reported this in RTXteam/RTX-KG2#277).

Do you have any concerns remaining here?
