Some protein nodes are missing descriptions #20

Closed
dkoslicki opened this issue Mar 10, 2018 · 13 comments

Comments

@dkoslicki
Member

E.g., P08563

dkoslicki added a commit that referenced this issue Mar 10, 2018
saramsey added a commit that referenced this issue Mar 26, 2018
@dkoslicki
Member Author

Found another such node:

match (n) where n.name="DOID:10923" return n

@saramsey
Member

Confirmed this bug in the current dev KG (in container rtxsteve on rtxdev.saramsey.org).

saramsey added a commit that referenced this issue Mar 29, 2018
@saramsey
Member

Confirmed fixed in http://rtxsteve.saramsey.org:7474

[Screenshot, 2018-03-29, 2:56:14 PM]

@saramsey
Member

Working on copying the Neo4j database to the less expensive "rtxdev" EC2 instance now.

@saramsey
Member

Updated database has been pushed to http://rtxdev.saramsey.org:7674

[Screenshot, 2018-03-29, 3:02:02 PM]

@saramsey
Member

All drugs now have descriptions:

[Screenshot, 2018-03-29, 3:06:30 PM]

@saramsey
Member

It looks like 303 protein nodes lack descriptions. It is not clear from a search of the codebase where these come from. More investigation needed:

[Screenshot, 2018-03-29, 3:09:52 PM]
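
A count like the one above can be reproduced per node label with a query of the following shape. This is only a sketch: the helper `missing_property_query` is hypothetical, and it assumes that the labels (`uniprot_protein`, `pharos_drug`) and the `description` property are named as in the queries quoted elsewhere in this thread.

```python
import re

# Assumption: labels and property names are plain identifiers, as in the
# queries quoted in this thread (e.g. uniprot_protein, description).
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def missing_property_query(label: str, prop: str = "description") -> str:
    """Build a Cypher query that counts nodes of `label` lacking `prop`."""
    for name in (label, prop):
        # Labels and property keys cannot be passed as Cypher parameters,
        # so reject anything that is not a simple identifier.
        if not _IDENT.match(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    return (
        f"MATCH (n:{label}) "
        f"WHERE n.{prop} IS NULL "
        f"RETURN count(n) AS missing"
    )


if __name__ == "__main__":
    # Print one query per label; paste into the Neo4j browser to run.
    for label in ("uniprot_protein", "pharos_drug"):
        print(missing_property_query(label))
```

Running the generated query for each label makes it easy to see which node types still have gaps after a rebuild.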

@saramsey saramsey changed the title Some nodes are missing descriptions Some protein nodes are missing descriptions Mar 29, 2018
@dkoslicki
Member Author

Note also that some of the target edges are missing probabilities:

match p=(n:pharos_drug)-[t:targets]-(:uniprot_protein) return t.probability limit 10

t.probability
--
null
null
null
0.03231776208
0.00615026285
7.3254e-7
1
0.00000596881
0.00000803035
0.00000190193


@saramsey
Member

saramsey commented Mar 29, 2018 via email

@dkoslicki
Member Author

MATCH p=(s:pharos_drug)-[r:targets]->(t:uniprot_protein) where r.probability is Null return s.name, t.name, r.probability limit 10


s.name | t.name | r.probability
-- | -- | --
"CHEMBL1166" | "Q9H244" | null
"CHEMBL1166" | "Q9BQB6" | null
"CHEMBL1166" | "P03952" | null
"CHEMBL1166" | "P38435" | null
"CHEMBL1201244" | "P02708" | null
"CHEMBL1201244" | "P11230" | null
"CHEMBL1201244" | "Q07001" | null
"CHEMBL1201244" | "P08913" | null
"CHEMBL602" | "P13569" | null
"CHEMBL602" | "P37231" | null
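
To see whether the null probabilities cluster on particular drugs, the rows above can be tallied per drug. The sketch below is illustrative only: `null_edges_per_drug` is a hypothetical helper, the rows are copied from the result shown here, and because the query used `limit 10` this is a sample rather than the full picture.

```python
from collections import Counter

# Rows copied from the query result above: (drug, protein, probability).
rows = [
    ("CHEMBL1166", "Q9H244", None),
    ("CHEMBL1166", "Q9BQB6", None),
    ("CHEMBL1166", "P03952", None),
    ("CHEMBL1166", "P38435", None),
    ("CHEMBL1201244", "P02708", None),
    ("CHEMBL1201244", "P11230", None),
    ("CHEMBL1201244", "Q07001", None),
    ("CHEMBL1201244", "P08913", None),
    ("CHEMBL602", "P13569", None),
    ("CHEMBL602", "P37231", None),
]


def null_edges_per_drug(rows):
    """Count edges that have no probability, keyed by drug name."""
    return Counter(drug for drug, _protein, prob in rows if prob is None)


if __name__ == "__main__":
    # List drugs from most to fewest null-probability edges.
    for drug, n in null_edges_per_drug(rows).most_common():
        print(drug, n)
```

If every edge of a given drug is null, that would be consistent with the upstream source lacking the value for that drug, though it does not prove it.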

@dkoslicki
Member Author

@saramsey Closing the loop on this issue: some of the probabilities are still not populated:

match p=(n:chemical_substance)-[t:directly_interacts_with]-(m:protein) where not exists(t.probability) return t.probability, n.description, m.description limit 100

E.g., the edge connecting triclofos and GABRE.

If this is due to Pharos not having the info, then it's fine and we can close this issue. I just wanted to make sure it wasn't a problem with the orangeboard construction itself.

@saramsey
Member

saramsey commented Apr 16, 2018 via email

@dkoslicki
Member Author

Fair enough: let's chalk it up to a shortcoming of the KSs. I guess we can close this issue then.
