RFC: Representing sequencing library preparations in the HCA DCP metadata standard #87
Merged
36 commits
860687d First draft library prep RFC. (malloryfreeberg)
a6fab3b Fixed image links. (malloryfreeberg)
e189bf7 Fixed formatting of things. (malloryfreeberg)
e514cd4 Fixed formatting of things. (malloryfreeberg)
ad190ca Tested resizing image. (malloryfreeberg)
eab833c Tested resizing image. (malloryfreeberg)
7489fbb Tested resizing image. (malloryfreeberg)
2cbc221 Tested resizing image. (malloryfreeberg)
cdfd533 Tested resizing image. (malloryfreeberg)
8c59981 Fixed formatting of things. (malloryfreeberg)
36de394 Fixed formatting of things. (malloryfreeberg)
b8fb931 Fixed formatting of things. (malloryfreeberg)
21bc457 Debugging. (malloryfreeberg)
7f23eba Debugging. (malloryfreeberg)
05dcbab Debugging. (malloryfreeberg)
17db6f8 Debugging. (malloryfreeberg)
42bfd26 Debugging. (malloryfreeberg)
40cc466 Debugging. (malloryfreeberg)
17b0b6c Debugging. (malloryfreeberg)
8bbcb85 Final formatting fixes. (malloryfreeberg)
14b7200 Fixed author formatting. (malloryfreeberg)
66dac51 Added graffle image files. (malloryfreeberg)
d1d3e3c Table formatting fixes. (malloryfreeberg)
9106e72 Added section about removing old field. (malloryfreeberg)
8e05dba Fixed table formatting. (malloryfreeberg)
7ffd25b Updated graffle file. (malloryfreeberg)
e6f5036 Updated ES query. (malloryfreeberg)
4fb9d46 Updated definition of LP biomaterial. (malloryfreeberg)
f1cd7fb Added caveat about ncbi_taxon_id. (malloryfreeberg)
694201e Added Justin as Shepherd. (malloryfreeberg)
96a8317 Cleaned up author/shepherd formatting. (malloryfreeberg)
6e5c9ae Updated logical unit definition. (malloryfreeberg)
884346c spelling fix in surname (lauraclarke)
2073de7 add last call for oversight (justincc)
41c6957 Rename 0000-rfc-library-preparation.md to 0010-rfc-library-preparatio… (justincc)
68f0c42 add link (justincc)
Fixed image links.
commit a6fab3b42a8207dd6bcac53daca49e1fcaf95244
I find it concerning that we are solving a downstream processing problem by changing the storage structure of the data. The lack of any separation between storage and processing is severely limiting. This is not sustainable if we have other or conflicting analysis needs that require being notified in a different manner.
This is the only short-term hack currently available, so I am not suggesting that it not be done, only that it shouldn't be considered a general solution.
It also adds complexity, because it creates bundles that look different to the user.
I agree that implementing this RFC means bundles might technically look different to the user: some bundles might have a single triplet of fastq files if an LP was sequenced once, while others might have multiple triplets if an LP was sequenced more than once. However, the overwhelming opinion I get from the comp bios I've talked to (e.g. @barkasn @kishorikonwar) is that "all fastq files from the same library preparation need to be processed together", not just in the DCP but as a general rule for processing single cell sequencing data anywhere.
Thus, from my perspective, providing an efficient way to get all fastq files per LP is preferable to keeping the exact same structure within every bundle (e.g. 1 triplet per bundle, which is what we've been doing up until now). We already know the structure within a bundle is going to be different for imaging data, anyway.
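To make the "all fastq files per LP" idea concrete, here is a minimal sketch of grouping file records by library preparation. The record layout, the `library_prep_id` field, and the file names are hypothetical and chosen only for illustration; they are not the HCA metadata schema or the DCP bundle format.

```python
from collections import defaultdict

# Hypothetical file records: lp1 was sequenced twice, lp2 once.
# Field names and file names are assumptions for illustration only.
READS = ("R1", "R2", "I1")  # one "triplet" of fastq files per sequencing run
files = (
    [{"name": f"lp1_run{run}_{read}.fastq.gz", "library_prep_id": "lp1"}
     for run in (1, 2) for read in READS]
    + [{"name": f"lp2_run1_{read}.fastq.gz", "library_prep_id": "lp2"}
       for read in READS]
)


def group_by_library_prep(file_records):
    """Collect file names under the library preparation they came from."""
    groups = defaultdict(list)
    for record in file_records:
        groups[record["library_prep_id"]].append(record["name"])
    return dict(groups)


bundles = group_by_library_prep(files)
for lp_id, names in bundles.items():
    # Each group holds every fastq file for one LP, however many runs it had.
    print(lp_id, len(names) // len(READS), "triplet(s)")
```

Under these assumptions, one group ends up with two triplets and the other with one, which is the difference in bundle shape discussed above.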
I went ahead and created a scratch DCP terminology page. There I've written that a bundle is just a collection of file references, so if you don't agree, please edit/comment :)
We also have the Bundle Definition RFC! #93
I doubt that an RFC was the right mechanism for that. Why is Andrey the only reviewer?
And in the latest part of this comedy, it turns out there is already a Google doc.
@justincc I believe that RFC is still in Draft mode. Re. Google docs: we should be slowly migrating the important info from Google docs to RFCs to avoid the Google Drive morass.
This isn't suitable for an RFC, since it's something that changes in increments over time. You can't start an RFC every time you want to add a definition; that just means no one ever writes anything because the process barrier is too high.