Bakta update to 1.5.0 #4787

Merged (19 commits) on Sep 16, 2022
add a new tool BAKTA for genome annotation
pimarin committed Aug 17, 2022
commit fe1cdf884df206d842be4f0768acb06b0bbcf56f
13 changes: 13 additions & 0 deletions tools/bakta/.shed.yml
@@ -0,0 +1,13 @@
name: "bakta"
owner: "iuc"
long_description: |
  "Bakta is a tool for the rapid & standardized annotation of bacterial genomes and plasmids from both isolates and MAGs.
  It provides dbxref-rich and sORF-including annotations in machine-readable JSON & bioinformatics standard file formats for automatic downstream analysis."
categories:
- Sequence Analysis
remote_repository_url: "https://github.com/mesocentre-clermont-auvergne/galaxy-tools/tree/master/tools/bakta"
homepage_url: "https://github.com/oschwengers/bakta"
type: unrestricted
auto_tool_repositories:
  name_template: "{{ tool_id }}"
  description_template: "{{ tool_name }}: rapid and standardized annotation of bacterial genomes, MAGs and plasmids"
483 changes: 483 additions & 0 deletions tools/bakta/bakta.xml

Large diffs are not rendered by default.

66 changes: 66 additions & 0 deletions tools/bakta/macro.xml
@@ -0,0 +1,66 @@
<?xml version="1.0"?>
<macros>
<token name="@TOOL_VERSION@">1.4.2</token>
<token name="@VERSION_SUFFIX@">0</token>
<token name="@PROFILE@">21.05</token>
<xml name="version_command">
<version_command><![CDATA[bakta --version]]></version_command>
</xml>
<xml name="edam">
<edam_topics>
<edam_topic>topic_3174</edam_topic>
</edam_topics>
</xml>
<xml name="xrefs">
<xrefs>
<xref type='bio.tools'>Bakta</xref>
</xrefs>
</xml>
<xml name="requirements">
<requirements>
<requirement type="package" version="@TOOL_VERSION@">bakta</requirement>
</requirements>
</xml>
<xml name="citations">
<citations>
<citation type="bibtex">
@article{mbs:/content/journal/mgen/10.1099/mgen.0.000685,
author = "Schwengers, Oliver and Jelonek, Lukas and Dieckmann, Marius Alfred and Beyvers, Sebastian and Blom, Jochen and Goesmann, Alexander",
title = "Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification",
journal= "Microbial Genomics",
year = "2021",
volume = "7",
number = "11",
pages = "",
doi = "10.1099/mgen.0.000685",
url = "https://www.microbiologyresearch.org/content/journal/mgen/10.1099/mgen.0.000685",
publisher = "Microbiology Society",
issn = "2057-5858",
type = "Journal Article",
keywords = "whole-genome sequencing",
keywords = "bacteria",
keywords = "metagenome-assembled genomes",
keywords = "plasmids",
keywords = "genome annotation",
eid = "000685",
abstract = "Command-line annotation software tools have continuously gained popularity compared to
centralized online services due to the worldwide increase of sequenced bacterial genomes.
However, results of existing command-line software pipelines heavily depend on taxon-specific
databases or sufficiently well annotated reference genomes. Here, we introduce Bakta, a new command-line
software tool for the robust, taxon-independent, thorough and, nonetheless, fast annotation of bacterial genomes.
Bakta conducts a comprehensive annotation workflow including the detection of small proteins taking into account
replicon metadata. The annotation of coding sequences is accelerated via an alignment-free sequence identification
approach that in addition facilitates the precise assignment of public database cross-references.
Annotation results are exported in GFF3 and International Nucleotide Sequence Database Collaboration (INSDC)-compliant flat files,
as well as comprehensive JSON files, facilitating automated downstream analysis.
We compared Bakta to other rapid contemporary command-line annotation software tools in both targeted and taxonomically broad benchmarks
including isolates and metagenomic-assembled genomes. We demonstrated that Bakta outperforms other tools in terms of functional annotations,
the assignment of functional categories and database cross-references, whilst providing comparable wall-clock runtimes.
Bakta is implemented in Python 3 and runs on MacOS and Linux systems.
It is freely available under a GPLv3 license at https://github.com/oschwengers/bakta.
An accompanying web version is available at https://bakta.computational.bio.",
}
</citation>
</citations>
</xml>
</macros>
49 changes: 49 additions & 0 deletions tools/bakta/test-data/NC_002127.1.fna
@@ -0,0 +1,49 @@
>NC_002127.1 Escherichia coli O157:H7 str. Sakai plasmid pOSAK1, complete sequence
TTCTTCTGCGAGTTCGTGCAGCTTCTCACACATGGTGGCCTGCTCGTCAGCATCGAGTGCGTCCAGTTTT
TCGAGCAGCGTCAGGCTCTGGCTTTTTATGAATCCCGCCATGTTGAGTGCAGTTTGCTGCTGCTTGTTCA
TCTTTCTGTTTTCTCCGTTCTGTCTGTCATCTGCGTCGTGTGATTATATCGCGCACCACTTTTCGACCGT
CTTACCGCCGGTATTCTGCCGACGGACATTTCAGTCAGACAACACTGTCACTGCCAAAAAACAGCAGTGC
TTTGTTGGTAATTCGAACTTGCAGACAGGACAGGATGTGCAATTGTTATACCGCGCATACATGCACGCTA
TTACAATTACCCTGGTCAGGGCTTCGCCCCGACACCCCATGTCAGATACGGAGCCATGTTTTATGACAAA
ACGAAGTGGAAGTAATACGCGCAGGCGGGCTATCAGTCGCCCTGTTCGTCTGACGGCAGAAGAAGACCAG
GAAATCAGAAAAAGGGCTGCTGAATGCGGCAAGACCGTTTCTGGTTTTTTACGGGCGGCAGCTCTCGGTA
AGAAAGTTAACTCACTGACTGATGACCGGGTGCTGAAAGAAGTTATGCGACTGGGGGCGTTGCAGAAAAA
ACTCTTTATCGACGGCAAGCGTGTCGGGGACAGAGAGTATGCGGAGGTGCTGATCGCTATTACGGAGTAT
CACCGTGCCCTGTTATCCAGGCTTATGGCAGATTAGCTTCCCGGAGAGAAACTGTCGAAAACAGACGGTA
TGAACGCCGTAAGCCCCCAAACCGATCGCCATTCACTTTCATGCATAGCTATGCAGTGAGCTGAAAGCGA
TCCTGACGCATTTTTCCGGTTTACCCCGGGGAAAACATCTCTTTTTGCGGTGTCTGCGTCAGAATCGCGT
TCAGCGCGTTTTGGCGGTGCGCGTAATGAGACGTTATGGTAAATGTCTTCTGGCTTGATATTATATTGGA
ATGCCTTTTTTCAAAGCAAATGATGTGGCTTTGGATAGAAGGTTTACGTTGATCTTATCAAAGTTTTTTT
TAAAGAACGAAGCCGAGAGCTCAGATAAATCATTATATTCATCAGTTTTCGTAACTTTGTTTAATGTGTA
ACTTGAAAACTTCTCGCCATTAAATGACGTATAGACGTAACGATCTTTTTTTCCACCGTTAGGAATTATT
AAATCAAAAAAAACATCACCCTTGCTTTTCTTTTTCTTCAAGTCGGATTCGATTTTTGAGAAAAATTCGC
TCGGGCTATAAATATCAGTAGCATAGACAATAAATAAAGTTTTATCTTTATTTTTTATTGCTTCTATTTG
ATATTTTTTATCTTTTTTCATAATTTCAACCTAGCTACGCCACCATCTATTAATTGGCAAACGGTATCGA
TGATTGCGATTGATTCTAATTTGTTAATTGTCTTCGTGTCAGCTATTCCTGGTTTCATATGAAACAAACC
ATGCCTGTTCTCATGCCAGTAAGTGTAGCATTCACACAAAACTTCCGCTATTTCACCATTTATAGTTTCT
TGGTGTATTTCTCTGATTATATATTTGGGTTTGTTTTCAGTGAAGTATTCGCCAAGGTTCTTTGATGATG
ATGGATTGCAAACATCATTAAGTATTTGATATATAAAGCCTTCTATGGCTCTTAATGCAGAGAAGCAGTA
TGTTGAGTAATCTTCCATTTCGACATCTATTTTTTTCATTATTAGCGAGCATGATAGCTGTTTTTTGATA
TCTTCATGGATTTTATCGATGCTTTTTGGTAGTTTGCTATGCAACTCGGACTCAATAGTTTCTTTTTTTA
TGTCAACATTAAATTCTTTATTTTTTTGTTCGACAATCTCTTTCATGTTTAGTATTGAGCACATGAAATC
GTTAATCAAACTCGCGATTTGAAGGTATTTTCCTTGGAATTGAATAGAGCCGCGCTTGTAAATTTTTGCC
CTGACCCTGTCACCATTGCTGGTGGTCATAATATATTGGTGTTTACAATTAGGATCGTTATTATTATCTT
CTGTTATTGTTATCCCCTCTTCAGAAAGAAATTCAAATAGATTTGCCCTGTCATCATCACTGAATTTTGG
AATGGTGTATTCAAAGTTCTTTGTGTCTGAATACAAACAGTTTTCTTTTATAATCAAGGCGATTTCATCA
AAGTAAGTGTTATTTTGCCCAGACGCTCTTCCGATAGTGGTATTTCCGCCTGATGGCATTAGTTTTATTA
AGAAGTCTATTCCTTTATATGTGCCAGATATGTGAGTTTCTCTTTCGTTTTTTACATTAGAGGAATAGTT
TGTGACGCCATTCTGCGTCAGAGCAGACTCAATCTTGTCAATATTGATATTTAGTGCTTTAAACGGGTTC
TGTGCCATTGGGTCAATCCGTTGTTTTTTTTGAATATGTACAGATCTTGTTTTTTTGTCAACGGAATAGC
TGTTCGTTGACTTGATAGACCGATTGATTCATCATCTCATAAATAAAGAAAAACCACCGCTACCAACGGT
GGTTTTCTCAAGGTTCGCTGAGCTACCAACTCTTTGAACCAAGGTAACTGGCTTGGAGGAGCGCAGTCAC
CAAAATCTGTTCTTTCAGTTTAGCCTTAACAGGTGCATAACTTCAAGACAAACTCCTCTAAATCAGTTAC
CAATGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAA
GGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAA
CTGAGATACCAACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATC
CGGTAAGTGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCGGGGGGAAACGCCTGGTATCTTTA
TAGTCCTGTCGGGTTTCGCCACCTCTGGCTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGC
CTATGGAAAAACGCCTGCGGTGCTGGCTTCTTCCGGTGCTTTGCTTTTTGCTCACATGTTCTTTCCGGCT
TTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGACACCGCTCGCCGCAGTCGAA
CGACCGAGCGTAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGCCTTATGTGACATTTTCTCCTTACGCT
CTGTTGTGCCGTTCGGCATCCTGCCCTGAGCGTTATATCTCTGTGCTATTTTCTACTTCAAAGCGTGTCT
GTATGCTGTTCTGGAG
137 changes: 137 additions & 0 deletions tools/bakta/test-data/TEST_1/TEST_1.embl
@@ -0,0 +1,137 @@
ID contig_1; ; circular; DNA; ; PRO; 3306 BP.
XX
AC contig_1;
XX
DE plasmid unnamed1, complete sequence
XX
OS .
OC .
XX
CC Annotated with Bakta
CC Software: v1.4.2
CC Database: v3.0
CC DOI: 10.1099/mgen.0.000685
CC URL: github.com/oschwengers/bakta
CC
CC ##Genome Annotation Summary:##
CC Annotation Date :: 08/17/2022, 09:35:24
CC Annotation Pipeline :: Bakta
CC Annotation Software version :: v1.4.2
CC Annotation Database version :: v3.0
CC CDSs :: 3
CC tRNAs :: 0
CC tmRNAs :: 0
CC rRNAs :: 0
CC ncRNAs :: 0
CC regulatory ncRNAs :: 0
CC CRISPR Arrays :: 0
CC oriCs/oriVs :: 0
CC oriTs :: 0
CC gaps :: 0
XX
FH Key Location/Qualifiers
FH
FT source 1..3306
FT /mol_type="genomic DNA"
FT /plasmid="unnamed1"
FT gene <2..736
FT /locus_tag="DOGAIA_00005"
FT CDS <2..736
FT /product="hypothetical protein"
FT /locus_tag="DOGAIA_00005"
FT /translation="SSASSCSFSHMVACSSASSASSFSSSVRLWLFMNPAMLSAVCCCL
FT FIFLFSPFCLSSASCDYIAHHFSTVLPPVFCRRTFQSDNTVTAKKQQCFVGNSNLQTGQ
FT DVQLLYRAYMHAITITLVRASPRHPMSDTEPCFMTKRSGSNTRRRAISRPVRLTAEEDQ
FT EIRKRAAECGKTVSGFLRAAALGKKVNSLTDDRVLKEVMRLGALQKKLFIDGKRVGDRE
FT YAEVLIAITEYHRALLSRLMAD"
FT /codon_start=1
FT /transl_table=11
FT /protein_id="gnl|Bakta|DOGAIA_00005"
FT /inference="ab initio prediction:Prodigal:2.6"
FT gene complement(971..1351)
FT /locus_tag="DOGAIA_00010"
FT CDS complement(971..1351)
FT /product="hypothetical protein"
FT /locus_tag="DOGAIA_00010"
FT /translation="MKKDKKYQIEAIKNKDKTLFIVYATDIYSPSEFFSKIESDLKKKK
FT SKGDVFFDLIIPNGGKKDRYVYTSFNGEKFSSYTLNKVTKTDEYNDLSELSASFFKKNF
FT DKINVNLLSKATSFALKKGIPI"
FT /codon_start=1
FT /transl_table=11
FT /protein_id="gnl|Bakta|DOGAIA_00010"
FT /inference="ab initio prediction:Prodigal:2.6"
FT gene complement(1348..2388)
FT /locus_tag="DOGAIA_00015"
FT CDS complement(1348..2388)
FT /product="hypothetical protein"
FT /locus_tag="DOGAIA_00015"
FT /translation="MAQNPFKALNINIDKIESALTQNGVTNYSSNVKNERETHISGTYK
FT GIDFLIKLMPSGGNTTIGRASGQNNTYFDEIALIIKENCLYSDTKNFEYTIPKFSDDDR
FT ANLFEFLSEEGITITEDNNNDPNCKHQYIMTTSNGDRVRAKIYKRGSIQFQGKYLQIAS
FT LINDFMCSILNMKEIVEQKNKEFNVDIKKETIESELHSKLPKSIDKIHEDIKKQLSCSL
FT IMKKIDVEMEDYSTYCFSALRAIEGFIYQILNDVCNPSSSKNLGEYFTENKPKYIIREI
FT HQETINGEIAEVLCECYTYWHENRHGLFHMKPGIADTKTINKLESIAIIDTVCQLIDGG
FT VARLKL"
FT /codon_start=1
FT /transl_table=11
FT /protein_id="gnl|Bakta|DOGAIA_00015"
FT /inference="ab initio prediction:Prodigal:2.6"
XX
SQ Sequence 3306 BP; 797 A; 692 C; 743 G; 1074 T; 0 other;
ttcttctgcg agttcgtgca gcttctcaca catggtggcc tgctcgtcag catcgagtgc 60
gtccagtttt tcgagcagcg tcaggctctg gctttttatg aatcccgcca tgttgagtgc 120
agtttgctgc tgcttgttca tctttctgtt ttctccgttc tgtctgtcat ctgcgtcgtg 180
tgattatatc gcgcaccact tttcgaccgt cttaccgccg gtattctgcc gacggacatt 240
tcagtcagac aacactgtca ctgccaaaaa acagcagtgc tttgttggta attcgaactt 300
gcagacagga caggatgtgc aattgttata ccgcgcatac atgcacgcta ttacaattac 360
cctggtcagg gcttcgcccc gacaccccat gtcagatacg gagccatgtt ttatgacaaa 420
acgaagtgga agtaatacgc gcaggcgggc tatcagtcgc cctgttcgtc tgacggcaga 480
agaagaccag gaaatcagaa aaagggctgc tgaatgcggc aagaccgttt ctggtttttt 540
acgggcggca gctctcggta agaaagttaa ctcactgact gatgaccggg tgctgaaaga 600
agttatgcga ctgggggcgt tgcagaaaaa actctttatc gacggcaagc gtgtcgggga 660
cagagagtat gcggaggtgc tgatcgctat tacggagtat caccgtgccc tgttatccag 720
gcttatggca gattagcttc ccggagagaa actgtcgaaa acagacggta tgaacgccgt 780
aagcccccaa accgatcgcc attcactttc atgcatagct atgcagtgag ctgaaagcga 840
tcctgacgca tttttccggt ttaccccggg gaaaacatct ctttttgcgg tgtctgcgtc 900
agaatcgcgt tcagcgcgtt ttggcggtgc gcgtaatgag acgttatggt aaatgtcttc 960
tggcttgata ttatattgga atgccttttt tcaaagcaaa tgatgtggct ttggatagaa 1020
ggtttacgtt gatcttatca aagttttttt taaagaacga agccgagagc tcagataaat 1080
cattatattc atcagttttc gtaactttgt ttaatgtgta acttgaaaac ttctcgccat 1140
taaatgacgt atagacgtaa cgatcttttt ttccaccgtt aggaattatt aaatcaaaaa 1200
aaacatcacc cttgcttttc tttttcttca agtcggattc gatttttgag aaaaattcgc 1260
tcgggctata aatatcagta gcatagacaa taaataaagt tttatcttta ttttttattg 1320
cttctatttg atatttttta tcttttttca taatttcaac ctagctacgc caccatctat 1380
taattggcaa acggtatcga tgattgcgat tgattctaat ttgttaattg tcttcgtgtc 1440
agctattcct ggtttcatat gaaacaaacc atgcctgttc tcatgccagt aagtgtagca 1500
ttcacacaaa acttccgcta tttcaccatt tatagtttct tggtgtattt ctctgattat 1560
atatttgggt ttgttttcag tgaagtattc gccaaggttc tttgatgatg atggattgca 1620
aacatcatta agtatttgat atataaagcc ttctatggct cttaatgcag agaagcagta 1680
tgttgagtaa tcttccattt cgacatctat ttttttcatt attagcgagc atgatagctg 1740
ttttttgata tcttcatgga ttttatcgat gctttttggt agtttgctat gcaactcgga 1800
ctcaatagtt tcttttttta tgtcaacatt aaattcttta tttttttgtt cgacaatctc 1860
tttcatgttt agtattgagc acatgaaatc gttaatcaaa ctcgcgattt gaaggtattt 1920
tccttggaat tgaatagagc cgcgcttgta aatttttgcc ctgaccctgt caccattgct 1980
ggtggtcata atatattggt gtttacaatt aggatcgtta ttattatctt ctgttattgt 2040
tatcccctct tcagaaagaa attcaaatag atttgccctg tcatcatcac tgaattttgg 2100
aatggtgtat tcaaagttct ttgtgtctga atacaaacag ttttctttta taatcaaggc 2160
gatttcatca aagtaagtgt tattttgccc agacgctctt ccgatagtgg tatttccgcc 2220
tgatggcatt agttttatta agaagtctat tcctttatat gtgccagata tgtgagtttc 2280
tctttcgttt tttacattag aggaatagtt tgtgacgcca ttctgcgtca gagcagactc 2340
aatcttgtca atattgatat ttagtgcttt aaacgggttc tgtgccattg ggtcaatccg 2400
ttgttttttt tgaatatgta cagatcttgt ttttttgtca acggaatagc tgttcgttga 2460
cttgatagac cgattgattc atcatctcat aaataaagaa aaaccaccgc taccaacggt 2520
ggttttctca aggttcgctg agctaccaac tctttgaacc aaggtaactg gcttggagga 2580
gcgcagtcac caaaatctgt tctttcagtt tagccttaac aggtgcataa cttcaagaca 2640
aactcctcta aatcagttac caatggctgc tgccagtggc gataagtcgt gtcttaccgg 2700
gttggactca agacgatagt taccggataa ggcgcagcgg tcgggctgaa cggggggttc 2760
gtgcacacag cccagcttgg agcgaacgac ctacaccgaa ctgagatacc aacagcgtga 2820
gctatgagaa agcgccacgc ttcccgaagg gagaaaggcg gacaggtatc cggtaagtgg 2880
cagggtcgga acaggagagc gcacgaggga gcttccgggg ggaaacgcct ggtatcttta 2940
tagtcctgtc gggtttcgcc acctctggct tgagcgtcga tttttgtgat gctcgtcagg 3000
ggggcggagc ctatggaaaa acgcctgcgg tgctggcttc ttccggtgct ttgctttttg 3060
ctcacatgtt ctttccggct ttatcccctg attctgtgga taaccgtatt accgcctttg 3120
agtgagctga caccgctcgc cgcagtcgaa cgaccgagcg tagcgagtca gtgagcgagg 3180
aagcggaaga gcgccttatg tgacattttc tccttacgct ctgttgtgcc gttcggcatc 3240
ctgccctgag cgttatatct ctgtgctatt ttctacttca aagcgtgtct gtatgctgtt 3300
ctggag 3306
//
6 changes: 6 additions & 0 deletions tools/bakta/test-data/TEST_1/TEST_1.faa
@@ -0,0 +1,6 @@
>DOGAIA_00005 hypothetical protein
SSASSCSFSHMVACSSASSASSFSSSVRLWLFMNPAMLSAVCCCLFIFLFSPFCLSSASCDYIAHHFSTVLPPVFCRRTFQSDNTVTAKKQQCFVGNSNLQTGQDVQLLYRAYMHAITITLVRASPRHPMSDTEPCFMTKRSGSNTRRRAISRPVRLTAEEDQEIRKRAAECGKTVSGFLRAAALGKKVNSLTDDRVLKEVMRLGALQKKLFIDGKRVGDREYAEVLIAITEYHRALLSRLMAD
>DOGAIA_00010 hypothetical protein
MKKDKKYQIEAIKNKDKTLFIVYATDIYSPSEFFSKIESDLKKKKSKGDVFFDLIIPNGGKKDRYVYTSFNGEKFSSYTLNKVTKTDEYNDLSELSASFFKKNFDKINVNLLSKATSFALKKGIPI
>DOGAIA_00015 hypothetical protein
MAQNPFKALNINIDKIESALTQNGVTNYSSNVKNERETHISGTYKGIDFLIKLMPSGGNTTIGRASGQNNTYFDEIALIIKENCLYSDTKNFEYTIPKFSDDDRANLFEFLSEEGITITEDNNNDPNCKHQYIMTTSNGDRVRAKIYKRGSIQFQGKYLQIASLINDFMCSILNMKEIVEQKNKEFNVDIKKETIESELHSKLPKSIDKIHEDIKKQLSCSLIMKKIDVEMEDYSTYCFSALRAIEGFIYQILNDVCNPSSSKNLGEYFTENKPKYIIREIHQETINGEIAEVLCECYTYWHENRHGLFHMKPGIADTKTINKLESIAIIDTVCQLIDGGVARLKL
6 changes: 6 additions & 0 deletions tools/bakta/test-data/TEST_1/TEST_1.ffn
@@ -0,0 +1,6 @@
>DOGAIA_00005 hypothetical protein
TCTTCTGCGAGTTCGTGCAGCTTCTCACACATGGTGGCCTGCTCGTCAGCATCGAGTGCGTCCAGTTTTTCGAGCAGCGTCAGGCTCTGGCTTTTTATGAATCCCGCCATGTTGAGTGCAGTTTGCTGCTGCTTGTTCATCTTTCTGTTTTCTCCGTTCTGTCTGTCATCTGCGTCGTGTGATTATATCGCGCACCACTTTTCGACCGTCTTACCGCCGGTATTCTGCCGACGGACATTTCAGTCAGACAACACTGTCACTGCCAAAAAACAGCAGTGCTTTGTTGGTAATTCGAACTTGCAGACAGGACAGGATGTGCAATTGTTATACCGCGCATACATGCACGCTATTACAATTACCCTGGTCAGGGCTTCGCCCCGACACCCCATGTCAGATACGGAGCCATGTTTTATGACAAAACGAAGTGGAAGTAATACGCGCAGGCGGGCTATCAGTCGCCCTGTTCGTCTGACGGCAGAAGAAGACCAGGAAATCAGAAAAAGGGCTGCTGAATGCGGCAAGACCGTTTCTGGTTTTTTACGGGCGGCAGCTCTCGGTAAGAAAGTTAACTCACTGACTGATGACCGGGTGCTGAAAGAAGTTATGCGACTGGGGGCGTTGCAGAAAAAACTCTTTATCGACGGCAAGCGTGTCGGGGACAGAGAGTATGCGGAGGTGCTGATCGCTATTACGGAGTATCACCGTGCCCTGTTATCCAGGCTTATGGCAGATTAG
>DOGAIA_00010 hypothetical protein
ATGAAAAAAGATAAAAAATATCAAATAGAAGCAATAAAAAATAAAGATAAAACTTTATTTATTGTCTATGCTACTGATATTTATAGCCCGAGCGAATTTTTCTCAAAAATCGAATCCGACTTGAAGAAAAAGAAAAGCAAGGGTGATGTTTTTTTTGATTTAATAATTCCTAACGGTGGAAAAAAAGATCGTTACGTCTATACGTCATTTAATGGCGAGAAGTTTTCAAGTTACACATTAAACAAAGTTACGAAAACTGATGAATATAATGATTTATCTGAGCTCTCGGCTTCGTTCTTTAAAAAAAACTTTGATAAGATCAACGTAAACCTTCTATCCAAAGCCACATCATTTGCTTTGAAAAAAGGCATTCCAATATAA
>DOGAIA_00015 hypothetical protein
ATGGCACAGAACCCGTTTAAAGCACTAAATATCAATATTGACAAGATTGAGTCTGCTCTGACGCAGAATGGCGTCACAAACTATTCCTCTAATGTAAAAAACGAAAGAGAAACTCACATATCTGGCACATATAAAGGAATAGACTTCTTAATAAAACTAATGCCATCAGGCGGAAATACCACTATCGGAAGAGCGTCTGGGCAAAATAACACTTACTTTGATGAAATCGCCTTGATTATAAAAGAAAACTGTTTGTATTCAGACACAAAGAACTTTGAATACACCATTCCAAAATTCAGTGATGATGACAGGGCAAATCTATTTGAATTTCTTTCTGAAGAGGGGATAACAATAACAGAAGATAATAATAACGATCCTAATTGTAAACACCAATATATTATGACCACCAGCAATGGTGACAGGGTCAGGGCAAAAATTTACAAGCGCGGCTCTATTCAATTCCAAGGAAAATACCTTCAAATCGCGAGTTTGATTAACGATTTCATGTGCTCAATACTAAACATGAAAGAGATTGTCGAACAAAAAAATAAAGAATTTAATGTTGACATAAAAAAAGAAACTATTGAGTCCGAGTTGCATAGCAAACTACCAAAAAGCATCGATAAAATCCATGAAGATATCAAAAAACAGCTATCATGCTCGCTAATAATGAAAAAAATAGATGTCGAAATGGAAGATTACTCAACATACTGCTTCTCTGCATTAAGAGCCATAGAAGGCTTTATATATCAAATACTTAATGATGTTTGCAATCCATCATCATCAAAGAACCTTGGCGAATACTTCACTGAAAACAAACCCAAATATATAATCAGAGAAATACACCAAGAAACTATAAATGGTGAAATAGCGGAAGTTTTGTGTGAATGCTACACTTACTGGCATGAGAACAGGCATGGTTTGTTTCATATGAAACCAGGAATAGCTGACACGAAGACAATTAACAAATTAGAATCAATCGCAATCATCGATACCGTTTGCCAATTAATAGATGGTGGCGTAGCTAGGTTGAAATTATGA