treats refactor and xDTD #2253

Closed
dkoslicki opened this issue Mar 20, 2024 · 3 comments

@dkoslicki
Member

Here is a description of the treats refactor, which is also listed in the biolink repo here. We need to determine what, if any, changes are needed in xDTD. This could be as simple as choosing a different predicate, or as complex as inspecting answers to determine which new treats predicate is appropriate.

@chunyuma
Collaborator

Hi @dkoslicki, let's discuss this further. I think it is not as simple as choosing a different predicate, because the current xDTD training does not depend on any "treat" predicates in RTX-KG2. As you know, the ground-truth data (i.e., TP and TN) we used for training comes from MyChem, SemMedDB, NDF-RT, and RepoDB. These sources only provide drug-disease pairs without specific predicates, so the xDTD predictions are all defined as "treat" or "no treat" based on probability.
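To illustrate what "defined based on probability" could look like, here is a minimal sketch. The function name `label_prediction` and the 0.5 cutoff are assumptions made for illustration only; the thread does not state how xDTD actually turns a probability into a label.

```python
# Hypothetical helper: maps a model probability to a binary label.
# The 0.5 threshold is an assumption, not a value taken from xDTD.
def label_prediction(probability_treats: float, threshold: float = 0.5) -> str:
    if not 0.0 <= probability_treats <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return "treat" if probability_treats >= threshold else "no treat"


# Drug-disease pairs carry no predicate of their own, only a score.
labels = [label_prediction(p) for p in (0.89, 0.12)]
```

The point of the sketch is that the output is a single binary decision per drug-disease pair, so there is no signal from which to pick among several more specific predicates.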

@chunyuma
Collaborator

If we need xDTD to predict the predicate as well, we might need to consider which data sources we can use, such as the existing "treat" predicates in KG2.

chunyuma added a commit that referenced this issue Mar 29, 2024
@chunyuma
Collaborator

After discussing with David, we decided to change the predicate name from the original biolink:treats to biolink:treats_or_applied_or_studied_to_treat to resolve this treats refactor issue. This change applies to all xDTD-based 'treat' predictions between drugs and diseases in ARAX.
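A minimal sketch of the rename described above, assuming edges are plain dicts with a `predicate` key. The helper `rename_xdtd_predicate` is hypothetical and is not the code from the referenced commit.

```python
OLD_PREDICATE = "biolink:treats"
NEW_PREDICATE = "biolink:treats_or_applied_or_studied_to_treat"


def rename_xdtd_predicate(edge: dict) -> dict:
    """Return a copy of the edge with the old predicate replaced (hypothetical helper)."""
    updated = dict(edge)  # shallow copy so the caller's edge is left untouched
    if updated.get("predicate") == OLD_PREDICATE:
        updated["predicate"] = NEW_PREDICATE
    return updated


edge = {
    "subject": "PUBCHEM.COMPOUND:10150081",
    "object": "MONDO:0015564",
    "predicate": "biolink:treats",
}
renamed = rename_xdtd_predicate(edge)
```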

Here is an example. For the query:

    query = {
        "nodes": {
            "disease": {
                "ids": ["MONDO:0015564"]
            },
            "chemical": {
                "categories": ["biolink:ChemicalEntity"]
            }
        },
        "edges": {
            "t_edge": {
                "object": "disease",
                "subject": "chemical",
                "predicates": ["biolink:treats"],
                "knowledge_type": "inferred"
            }
        }
    }

We now have biolink:treats_or_applied_or_studied_to_treat:

{'attributes': [{'attribute_source': None,
                 'attribute_type_id': 'metatype:Datetime',
                 'attributes': None,
                 'description': None,
                 'original_attribute_name': 'defined_datetime',
                 'value': '2024-03-29 13:33:17',
                 'value_type_id': None,
                 'value_url': None},
                {'attribute_source': 'infores:arax',
                 'attribute_type_id': 'EDAM-DATA:1772',
                 'attributes': None,
                 'description': 'This edge is a container for a computed value '
                                'between two nodes that is not directly '
                                'attachable to other edges.',
                 'original_attribute_name': None,
                 'value': True,
                 'value_type_id': 'metatype:Boolean',
                 'value_url': None},
                {'attribute_source': None,
                 'attribute_type_id': 'EDAM-DATA:0951',
                 'attributes': None,
                 'description': None,
                 'original_attribute_name': 'probability_treats',
                 'value': '0.8930130705825227',
                 'value_type_id': None,
                 'value_url': None},
                {'attribute_source': 'infores:arax',
                 'attribute_type_id': 'biolink:support_graphs',
                 'attributes': None,
                 'description': None,
                 'original_attribute_name': None,
                 'value': ['aux_graph_5_creative_DTD_option_group_0'],
                 'value_type_id': None,
                 'value_url': None}],
 'object': 'MONDO:0015564',
 'predicate': 'biolink:treats_or_applied_or_studied_to_treat',
 'qualifiers': None,
 'sources': [{'resource_id': 'infores:arax',
              'resource_role': 'primary_knowledge_source',
              'source_record_urls': None,
              'upstream_resource_ids': None}],
 'subject': 'PUBCHEM.COMPOUND:10150081'}
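For anyone consuming edges shaped like the one above, here is a minimal sketch of reading the predicate and the `probability_treats` score. The helper `get_probability_treats` is hypothetical, and the edge below is a trimmed copy of the example output, not a full response.

```python
from typing import Optional


def get_probability_treats(edge: dict) -> Optional[float]:
    """Return the probability_treats score of an edge, or None if absent (hypothetical helper)."""
    for attr in edge.get("attributes") or []:
        if attr.get("original_attribute_name") == "probability_treats":
            # The value is serialized as a string in the example output.
            return float(attr["value"])
    return None


# Trimmed copy of the example edge; only the fields used here are kept.
edge = {
    "attributes": [
        {"attribute_type_id": "EDAM-DATA:0951",
         "original_attribute_name": "probability_treats",
         "value": "0.8930130705825227"},
    ],
    "object": "MONDO:0015564",
    "predicate": "biolink:treats_or_applied_or_studied_to_treat",
    "subject": "PUBCHEM.COMPOUND:10150081",
}

probability = get_probability_treats(edge)
uses_new_predicate = edge["predicate"] == "biolink:treats_or_applied_or_studied_to_treat"
```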

The updated code has passed all local tests. Once it clears CI/CD, can this issue be closed? @dkoslicki
