Steve Bond edited this page Oct 7, 2015 · 8 revisions

--blast, -bl

Description

BLAST is a local alignment algorithm commonly used to search large collections of sequences for likely homologs of a query sequence. The SeqBuddy blast tool searches a pre-existing BLAST database with all input sequences and returns the matches as a new sequence file. The database must be built with the makeblastdb program from the NCBI C++ toolkit, using the -parse_seqids option.

To make a BLAST database from the command line (set -dbtype to either nucl or prot):

$: makeblastdb -in path/to/fasta_file -out db_name -dbtype {nucl, prot} -parse_seqids 

At the moment, the SeqBuddy blast tool uses hard-coded parameters when it calls the BLAST executable; support for custom parameters is on the to-do list. The following command is an example of how SeqBuddy would call blastn:

$: blastn -db database -query in_file.fa -out temp.txt -num_threads 4 -evalue 0.01 -outfmt 6
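The tabular report written by -outfmt 6 has one row per hit, with the matched database sequence ID in the second tab-separated column. This page does not show how SeqBuddy turns that report into a sequence file, so the following is only a sketch of one plausible route: collect each matched ID once, then ask blastdbcmd for those records. The helper name `unique_hit_ids` and the file names `hit_ids.txt` and `hits.fa` are hypothetical.

```shell
# Hypothetical helper: list each matched database sequence ID once.
# In -outfmt 6 the columns are tab-separated and column 2 is the subject ID.
unique_hit_ids() {
    cut -f 2 "$1" | sort -u
}

# Only runs when a report from the blastn call above exists and
# blastdbcmd is installed; otherwise this step is skipped.
if [ -f temp.txt ] && command -v blastdbcmd >/dev/null 2>&1; then
    unique_hit_ids temp.txt > hit_ids.txt
    # Fetch the matching records from the database by ID
    # (ID lookup relies on the database having been built with -parse_seqids).
    blastdbcmd -db database -entry_batch hit_ids.txt -out hits.fa
fi
```

Deduplicating first keeps a database sequence from being written more than once when several queries hit it.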

Dependencies

The blastn, blastp, and blastdbcmd binaries from the [NCBI C++ toolkit](http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/) must be present in your system path.
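A quick way to confirm the three binaries are reachable before running the tool is to ask the shell to resolve each name. This is a minimal sketch; `missing_tools` is a hypothetical helper, not part of SeqBuddy.

```shell
# Hypothetical helper: print the names of any programs that cannot be found in PATH.
missing_tools() {
    missing=""
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Drop the leading space before printing; prints nothing if all were found
    printf '%s' "${missing# }"
}

not_found=$(missing_tools blastn blastp blastdbcmd)
if [ -n "$not_found" ]; then
    echo "Not found in PATH: $not_found"
fi
```

If nothing is printed, all three programs were found.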

Argument

Path to BLAST database ( str )

A BLAST database consists of six separate files. Provide a relative or absolute path to any one of these files, or to the base name they all share.
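To illustrate why either form of the path can work, the sketch below reduces a path to the shared base name by stripping a database file extension if one is present. The helper `blast_db_base` is hypothetical, and the extension list assumes the standard makeblastdb output for nucleotide (.n*) and protein (.p*) databases built with -parse_seqids; SeqBuddy's own handling may differ.

```shell
# Hypothetical helper: map a path to any BLAST database file onto the base name.
blast_db_base() {
    case "$1" in
        *.nhr|*.nin|*.nog|*.nsd|*.nsi|*.nsq|*.phr|*.pin|*.pog|*.psd|*.psi|*.psq)
            # Strip only the final extension, e.g. ".nsq"
            printf '%s\n' "${1%.*}" ;;
        *)
            # Already a base name (or an unrecognized file): leave unchanged
            printf '%s\n' "$1" ;;
    esac
}

# Both calls print /path/to/blastdb/Abacion_magnum
blast_db_base /path/to/blastdb/Abacion_magnum.nsq
blast_db_base /path/to/blastdb/Abacion_magnum
```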

Example

Input file: Drosophila.nex

#NEXUS
begin data;
    dimensions ntax=8 nchar=316;
    format datatype=protein missing=? gap=-;
matrix
'Dme-Panxδ3' -----GFI---K----IDNMVFRCHYRITAILFTC-CIIVTANNLIGDPISCI--IPMHVINTFCWITYTYTV---A--GPGLE-K--HSYYQWVPFVLFFQGLMFYVPHWVWKM-D-GKIRMITG--VDDRDRIL-KYFVNNT--HNGYSFYFFCELLNFINVIVNIFMVDKFLGGAFMSYGTDVLKFSNMDQ-DRFDPMIEIFPRLTKCTFHKFGPSGSVQKHDTLCVLALNILNEKIYIFLWFWFIILATISGVAVLYSVVI---TR-TIR----------K--EGDFLILHFLSQNLSTRSYSDML-Q----
'Dme-Panxδ7' --L--SV----R-Q-RIDNIVFKLHYRWTVILLVA-TLLITSRQYIGEHIQCL--VVSPVINTFCFFTPTF-VD--P---PGI--D-RHAYYQWVPFVLFFQALCFYIPHALWKW-EGGRIKALVK--LG-MERVKD---IRDM--RLNWG-HVFAEVLNLINLLLQITWTNRFLGGQFLTLG------HALKN-RSDEVV---FPKITKCKFHKFGDSGSIQMHDALCVMALNIMNEKIYIILWFWYAFLLIVTVLGLLWRLCF---VR-WSL----------P-LASNWMFLFFLRSNLS-----E-L----DN
'Dme-Panxδ2' MDVFGSVKGLLKIDQV-DNNVFRMHYKATVIILIAFSLLVTSRQYIGDPIDCIVEIPLGVMDTYCWIYSTFTVPEGRDVQP--GSEKYHKYYQWVCFVLFFQAILFYVPRYLWKSWEGGRLKMLVDLSVNDKDRKIVDYFG-NLNRHNFYAFFFVCEALNFVNVIGQIYFVDFFLDGEFSTYGSDVLKFTELEPDERIDPMARVFPKVTKCTFHKYGPSGSVQTHDGLCVLPLNIVNEKIYVFLWFWFIILSIMSI-SLIYRIAVAPKLRHLLLRARSRAESEVEVAIGDWFLLYQLGKNIDPLIYKEVISDLEMG
'Dme-Panxδ5' MSAVKPLSKYLQFKIRIYDSVFTIHSRCTVVILLTCSLLLSARQYFGDPIQCI-S-EEKNIESYCWTMGTYYNEASIAE--GVEIRQYLRYYQWVIILLLFQSFVFYFPSCLWKVWEGRRLKQLCEVDNTRRM--LVKYFDMHFC----YMAYVFCEVLNFLISVVNIIVLEVFLNGFWSKYLRALW-------DRWV-SV---FPKIAKCELKF-GGSGTANVMDNLCILPLNILNEKIFVFLWAWFL-LALMSGLNLLCRLAICSRLREQMIRTKRHVKRALDLTIGDWFLMMKVSVNVNPMLFRDLMQEL---
'Dme-Panxδ6' MAAVKPLSNYLRLKVRIYDPIFTLHSKCTIVILLTCTFLLSAKQYFGEPILCL-S-SERQADSYCWTMGTYWNEQSIAE--GVETRMYLRYYQWVFMILLFQSLLFYFPSFLWKVWEGQRMEQLCEVDRTRQM--LTRYFPIHWC----YSIYAFCELLNVFISILNFWLMDVVFNGFWYKYIHALW-------NLWM-RV---FPKVAKCEFVY-GPSGTPNIMDILCVLPLNILNEKIFAVLYVWFL-FALLAIMNILYRLLICCPLRLQLLNPKSHVREVLSAGYGDWFVLMCVSINVNPTLFRELLEQL--D
'Dme-Panxδ4' MAAVKPLSKYLQFKVHIYDAIFTLHSKVTVALLLACTFLLSSKQYFGDPIQCF-G-D-KDMDAFCWIYGAYL-QCAVSK--VVEN--YITYYQWVVLVLLLESFVFYMPAFLWKIWEGGRLKHLCDFKRTHRV--LVNYFETHFR----YFVYVFCEILNLSISILNFLLLDVFFGGFWGRYRNALY-------NQWI-AV---FPKCAKCEYKG-GPSGSSNIYDYLCLLPLNILNEKIFAFLWIWFI-LAMLISLKFLYRLAVLYPMRLQLLRPKKHLQVALNCSFGDWFVLMRVGNNISPELFRKLLEEL---
'Dme-Panxδ1' YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPISCIVGVP-HVVNTFCWIHSTFTMPDRREVHPGVDF-KYYTYYQWVCFVLFFQAMACYTPKFLWNKFEGGLMRMIVGLNITRKRDALLDYLIKHVKRHKLY-AYWACEFLCCINIIVQMYLMNRFFDGEFLSYGTNIMKLSDVPQEQRVDPMVYVFPRVTKCTFHKYGPSGSLQKHDSLCILPLNIVNEKTYVFIWFWFWILLVLLGL--VFRCIIFPKFRPRLLNASNRIPMECRLDIGDWWLIYMLGRNLDPVIYKDVMSEFQVP
'Dme-Panxδ8' LDIFRGLKNLVKVSVKTDSIVFRLHYSITVMILMSFSLIITTRQYVGNPIDCVTDIP-DVLNTYCWIQSTYTLKSLVSVYPGIGNKKHYKYYQWVCFCLFFQAILFYTPRWLWKSWEGGKIHALIDLDISEKKKLLLDYLWENLRYHNWW-AYYVCELLALINVIGQMFLMNRFFDGEFITFGLKVIDYMETDQEDRMDPMIYIFPRMTKCTFFKYGSSGEVEKHDAICILPLNVVNEKIYIFLWFWFILLTFLTLLTLIYRVIIFPRMRVYLFRMRFRVRRDIEIKMGDWFLLYLLGENIDTVIFRDVVQDLRL-
;
end;    

Database directory

$: ls path/to/blastdb
>>> Abacion_magnum.nex  Abacion_magnum.nhr  Abacion_magnum.nin  Abacion_magnum.nog
>>> Abacion_magnum.nsd  Abacion_magnum.nsi  Abacion_magnum.nsq

Usage example

$: sb Drosophila.nex -bl /path/to/blastdb/Abacion_magnum

Output

>4086 comp4411_c0_seq1|m.4086
MFDVLGSLKSVFLRLKTISVDNSIFKLHYRLTTIILAVFSILVTSKQYLGDPIDCTTSST
TIRAELLDQYCWVSSTYSLPKAFDQKVGRFGHVSHPGIATYHEGDQVIYHQYYQWVCFVL
FLQSMMFYLPHYLWKIWECGRLKALADDIQGPLTSDETKKGKLAAISAYFSTSLFHHNFY
ATRYSICEVLNFANVVGQMFLTNRFLGGTFLTYGTEVIEFSESNQLNRTDPMIKVFPRVT
KCSFFTYGSSGDMQNHDALCVLPVNIINEKIYIVLWFWFIILAVLSGLAIIYRLIVTFSV
RARYLALRSRANSVSRSEIEKIAYNTEFGDWFVLYLLSKNVNSYVFKEVVDVVVKQLDNS
DYVPKEKHGLFKKLPL*
>5440 comp6054_c0_seq1|m.5440
FVLFFQAMLFYIPRFLWKMWEGKRLETIVLGMHVGILTEEEKNNRKKVLLEYLTRHFRRH
TFYAIKYYICELLCLVNVIGQMYLMNKFLGGEFMDYGSRVLEFSEQNQDSRTDPMIYVFP
RMTKCTFHKFGTSGDIQRHDALCVLPLNIVNEKIYIFLWFWFIILATLTALVLCYRILII
AFPKFRPQILHARCRLTPMKTINSVLRNADLGDWFLFYLLGKNMDPCIFREVCIELSKKL
ETAESNNP*
