Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EVA-3563 remove the option of providing csv mapping through command line #37

Merged
merged 11 commits into from
May 22, 2024

Conversation

nitin-ebi
Copy link
Collaborator

@nitin-ebi nitin-ebi commented May 20, 2024

No description provided.

@nitin-ebi nitin-ebi self-assigned this May 20, 2024
Copy link
Member

@tcezard tcezard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestions and comments.
Main point would be around catching exceptions at the top level to provide clear error message

@@ -53,6 +53,7 @@ Analysis:
Experiment Type: experimentType
Reference: referenceGenome
optional:
Assembly Fasta: assemblyFasta
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to make the name explicit.

Suggested change
Assembly Fasta: assemblyFasta
Assembly Fasta Path: assemblyFasta

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 94 to 95
if not os.path.exists(os.path.abspath(metadata_file)):
raise Exception(f'The provided metadata file {metadata_file} does not exist')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we create a specific exception for command line error that can be caught in the main binary so that a clean error message is provided and not a code exception that looks like a crash ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to FileNotFoundError

tests/resources/EVA_Submission_test_with_asm_report.json Outdated Show resolved Hide resolved
tests/resources/EVA_Submission_test_with_asm_report.json Outdated Show resolved Hide resolved
tests/resources/EVA_Submission_test_with_asm_report.json Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think Assembly Fasta should be in bold since it's optional.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

},
"assemblyFasta": {
"type": "string",
"description": "The assembly in fasta format that was used to derive the VCF"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"description": "The assembly in fasta format that was used to derive the VCF"
"description": "The path to the assembly in fasta format that was used to create the VCF. (This is a local path)"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

},
"assemblyReport": {
"type": "string",
"description": "The assembly report associated with the assembly as found in NCBI assemblies"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"description": "The assembly report associated with the assembly as found in NCBI assemblies"
"description": "The path to the assembly report associated with the assembly as found in NCBI assemblies. (This is a local path)"

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@@ -58,8 +53,6 @@ def get_version():
vcf_group.add_argument('--vcf_files', nargs='+', help="One or several vcf files to validate")
vcf_group.add_argument('--assembly_fasta',
help="The fasta file containing the reference genome from which the variants were derived")
vcf_group.add_argument("--vcf_files_mapping",
help="csv file with the mappings for vcf files, fasta and assembly report")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the readme as well? It's just easier to keep the documentation up-to-date when we do it in parallel with the code updates.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@@ -165,6 +165,14 @@
"referenceGenome": {
"type": "string",
"description": "Reference against which the analysis was performed. ENA or NCBI accession (starting with GCA) accepted."
},
"assemblyFasta": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the spreadsheet we request "Reference" and state it can be "GCA accession for assemblies or associated INSDC accession for sequences", and then ask for "Assembly Fasta Path" which sounds like it should be for an assembly only.
Meanwhile here in the JSON we request "referenceGenome" and say it has to start with GCA.

I think we should try to be consistent with our terminology (and also whether we state we accept single INSDC sequences or not...)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excel does this dumb thing where it opens the workbook on whatever tab it was last saved, which is a bit disorienting if it's not the "Please read first" tab.... doesn't really matter now since it's just us, but just to keep in mind once we have external users.

@nitin-ebi nitin-ebi merged commit cca9835 into EBIvariation:main May 22, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants