Should we treat biolink:Metabolite as a drug for DTD? #1323

Closed
chunyuma opened this issue Mar 24, 2021 · 3 comments
@chunyuma
Collaborator

chunyuma commented Mar 24, 2021

This question arose from issue #1321.

biolink:Metabolite is a direct child of biolink:ChemicalSubstance, but biolink:ChemicalSubstance has many other children, such as FoodAdditive, Nutrient, ProcessedMaterial, etc.
See Children at https://biolink.github.io/biolink-model/docs/ChemicalSubstance.html

Here is an example:
Glycyl-Histidine (HMDB0028843) is a metabolite, so it should be a chemical too. Although it might not be used directly as a drug, it does have some biological actions, such as physiological or cell-signaling effects.

[Screenshot, 2021-03-24]

@dkoslicki
Member

Personally, I think we should treat any child of chemical substance as a drug. Recall the example of cyclic vomiting being treated with ethanol; so I would assume that we shouldn’t exclude food additives, nutrients, etc. from drug repurposing targets. Better to cast a wide net in the beginning, and then after potential repurposing targets are found, figure out if it would actually be viable to use as a drug.
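
As a rough illustration of "any child of chemical substance", here is a minimal sketch of an ancestry check. The `PARENT` map is a hypothetical, hand-written slice of the hierarchy containing only the children named in this issue; it is not loaded from the Biolink model, and `is_a` is an illustrative helper, not an existing API.

```python
# Hypothetical, hand-written slice of the Biolink hierarchy (child -> parent).
# Only the children named in this issue are listed; the real model has more.
PARENT = {
    "biolink:Metabolite": "biolink:ChemicalSubstance",
    "biolink:FoodAdditive": "biolink:ChemicalSubstance",
    "biolink:Nutrient": "biolink:ChemicalSubstance",
    "biolink:ProcessedMaterial": "biolink:ChemicalSubstance",
}


def is_a(category, ancestor, parent=PARENT):
    """Return True if `category` equals `ancestor` or descends from it."""
    seen = set()  # guards against accidental cycles in the map
    while category is not None and category not in seen:
        if category == ancestor:
            return True
        seen.add(category)
        category = parent.get(category)
    return False


print(is_a("biolink:Nutrient", "biolink:ChemicalSubstance"))
print(is_a("biolink:Disease", "biolink:ChemicalSubstance"))
```

With a check like this, casting a wide net amounts to accepting every node whose category passes `is_a(..., "biolink:ChemicalSubstance")`.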

@chunyuma
Collaborator Author

Hi @dkoslicki, in theory, all children of chemical substance can be treated as drugs. In practice, though, the training data may not allow this. Currently, the training data for the DTD model come mainly from MyChem, semeddb and NDF. The MyChem training data are mainly CURIEs with the prefix CHEMBL.COMPOUND; the semeddb data are mainly CURIEs with the prefixes CHEBI, CHEMBL.COMPOUND and MESH; the NDF data are also CHEMBL.COMPOUND. If we want to treat food additives, nutrients, etc. as drugs, we also have to include them in our training data. Otherwise, the model can't learn features for these children of chemical substance.

I don't know whether our training data (MyChem, semeddb and NDF) already contain pairs of food additives or nutrients with diseases, since kg2.5.2 doesn't have these categories yet. But I checked the provider source distribution of biolink:Metabolite in kg2.5.2c, and the nodes come mainly from KEGG. Our training data don't include KEGG.

[Screenshot, 2021-03-24: provider source distribution of biolink:Metabolite in kg2.5.2c]
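
The prefix mismatch described above can be checked with a short script. This is only a sketch: the training prefixes are the ones listed in this comment, while the sample CURIEs, the exact spelling of the `KEGG` and `HMDB` prefixes, and the helper names `curie_prefix` and `prefix_coverage` are assumptions made up for illustration, not taken from KG2 or the DTD code.

```python
from collections import Counter

# Prefixes that dominate the DTD training data, per the comment above.
TRAINING_PREFIXES = {"CHEMBL.COMPOUND", "CHEBI", "MESH"}


def curie_prefix(curie):
    """Return the part of a CURIE before the first colon."""
    return curie.split(":", 1)[0]


def prefix_coverage(curies, training_prefixes=TRAINING_PREFIXES):
    """Count CURIEs per prefix and the fraction covered by training prefixes."""
    counts = Counter(curie_prefix(c) for c in curies)
    total = sum(counts.values())
    covered = sum(n for p, n in counts.items() if p in training_prefixes)
    return counts, (covered / total if total else 0.0)


# Made-up sample of metabolite identifiers, for illustration only.
sample = ["KEGG:C00031", "KEGG:C00022", "CHEBI:17234", "HMDB:HMDB0028843"]
counts, fraction = prefix_coverage(sample)
print(counts.most_common())
print(f"fraction covered by training prefixes: {fraction:.2f}")
```

A low covered fraction for biolink:Metabolite nodes would suggest that the model has seen few identifiers of that kind during training.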

To validate whether the model has predictive power for CURIEs outside biolink:Drug and biolink:ChemicalSubstance, we need to add a set of drug-disease pairs for such CURIEs to this plot:

[Image: 90923608-f9597380-e3bb-11ea-8abe-ee3bdc1e84aa]

@chunyuma
Collaborator Author

Following @dkoslicki's suggestion, we have decided to treat biolink:Metabolite, biolink:ChemicalSubstance and biolink:Drug as general drugs in the next version of the DTD model, so this issue can be closed.
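
For illustration, the decision could be expressed as a simple category filter. This is a sketch under assumptions, not the actual DTD code: the node dictionaries, their `id` and `category` keys, and the name `is_drug_candidate` are hypothetical.

```python
# Categories treated as "general drugs", per the decision above.
DRUG_CATEGORIES = frozenset({
    "biolink:Metabolite",
    "biolink:ChemicalSubstance",
    "biolink:Drug",
})


def is_drug_candidate(node):
    """Return True if the node's category is one of the general drug categories."""
    return node.get("category") in DRUG_CATEGORIES


# Hypothetical nodes; the dict layout is an assumption for this example.
nodes = [
    {"id": "HMDB:HMDB0028843", "category": "biolink:Metabolite"},
    {"id": "EXAMPLE:0001", "category": "biolink:Disease"},
]
candidates = [n["id"] for n in nodes if is_drug_candidate(n)]
print(candidates)
```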
