Investigate how PubMed references are being processed #166
Hi @ireneisdoomed, @DSuveges, I've looked into this question, which was originally asked by Asier. As I mentioned before, ClinVar stores three types of literature references: disease specific, variant specific, and evidence support (“observed in”). I have investigated each of those types separately.

Evidence support (“observed in”) references

In the example Asier provided, the one reference displayed on the ClinVar website is the evidence support publication: 17886299 “Molecular consequences of dominant Bethlem myopathy collagen VI mutations”. This paper is about an observation of this specific variant (among others) in a specific disease.

Disease specific references

The three other references in that record, which are not displayed on the website but are stored in the XML, are disease specific. They are either reviews which summarise the knowledge on the disease, or clinical practice guidelines. In this example, the three publications are:
Variant specific references

References of this type are not present in this record, but I collected several examples from other records. These also appear to be large-scale reviews and recommendations, but focusing on genetics rather than disease classes:
Operation of pipeline v2.0.0+

Our pipeline only includes evidence support (“observed in”) literature references in the evidence strings. If you would like to see this changed, please let me know.
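To make the distinction between the three reference types concrete, here is a minimal sketch of how they could be separated. It is not the pipeline's actual code. It assumes a heavily simplified XML fragment whose element names (`ObservedIn`, `MeasureSet`, `TraitSet`, `Citation`, `ID`) are modelled on the ClinVar XML; the function names are hypothetical, and the three disease specific PubMed IDs are placeholders, not the real ones from RCV000018694.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment. Only 17886299 comes from the discussion
# above; the PLACEHOLDER_* values stand in for the disease specific references.
RECORD_XML = """
<ReferenceClinVarAssertion>
  <ObservedIn>
    <ObservedData>
      <Citation><ID Source="PubMed">17886299</ID></Citation>
    </ObservedData>
  </ObservedIn>
  <MeasureSet>
    <Measure/>
  </MeasureSet>
  <TraitSet>
    <Trait>
      <Citation><ID Source="PubMed">PLACEHOLDER_1</ID></Citation>
      <Citation><ID Source="PubMed">PLACEHOLDER_2</ID></Citation>
      <Citation><ID Source="PubMed">PLACEHOLDER_3</ID></Citation>
    </Trait>
  </TraitSet>
</ReferenceClinVarAssertion>
"""

# Assumed location of each reference type inside the simplified record.
REFERENCE_PATHS = {
    "observed_in": "ObservedIn/ObservedData",
    "variant": "MeasureSet/Measure",
    "disease": "TraitSet/Trait",
}


def classify_references(record):
    """Return PubMed IDs grouped by reference type."""
    grouped = {}
    for ref_type, path in REFERENCE_PATHS.items():
        id_elements = record.findall(f"{path}/Citation/ID[@Source='PubMed']")
        grouped[ref_type] = [el.text for el in id_elements]
    return grouped


def evidence_string_references(record):
    """Keep only the evidence support ("observed in") references."""
    return classify_references(record)["observed_in"]


record = ET.fromstring(RECORD_XML)
print(classify_references(record))
print(evidence_string_references(record))
```

Under these assumptions, the record yields one “observed in” reference and three disease specific ones, and only the former would end up in the evidence string.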
Reported by @AsierGonzalez via Slack
Hi Kirill, I’d like to ask you where the literature references you include in the evidence strings under evidence.variant2disease.provenance_type.literature and literature come from. I would assume that they are extracted from the ClinVar XML, but we have found a case (RCV000018694) where the ClinVar website and dbSNP list one publication but there are four in the evidence string (see the OT website).

As a side note, in the future we should get rid of the .literature field, as it’s a duplication of evidence.variant2disease.provenance_type.literature that we don’t use and just adds to the file size.
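For illustration, removing the duplicated top-level field could look roughly like the sketch below. The two key paths are the ones named above; the inner shape of the `literature` object and the function name `drop_duplicate_literature` are assumptions made for this example only.

```python
import copy


def drop_duplicate_literature(evidence_string):
    """Return a copy without the top-level 'literature' field when it merely
    repeats evidence.variant2disease.provenance_type.literature."""
    cleaned = copy.deepcopy(evidence_string)
    nested = (
        cleaned.get("evidence", {})
        .get("variant2disease", {})
        .get("provenance_type", {})
        .get("literature")
    )
    # Only drop the field if it really is a duplicate of the nested one.
    if "literature" in cleaned and cleaned["literature"] == nested:
        del cleaned["literature"]
    return cleaned


# Hypothetical evidence string; the 'references' layout is assumed.
example = {
    "literature": {"references": ["17886299"]},
    "evidence": {
        "variant2disease": {
            "provenance_type": {
                "literature": {"references": ["17886299"]},
            }
        }
    },
}

cleaned = drop_duplicate_literature(example)
print(sorted(cleaned.keys()))
```

The equality check is a deliberately cautious choice: if the two fields ever diverge, the top-level one is kept so that nothing is lost silently.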