Skip to content

SB Average sequence length

Steve Bond edited this page Oct 8, 2015 · 4 revisions

--ave_seq_length, -asl

Description

Return the average length of all sequences.

Argument

'clean'

Optional. Pass in the word 'clean' to run the clean_seq tool to remove all non-sequence characters (e.g., gaps) from the sequence before calculating average length.

Examples

Input file: Drosophila.nex

#NEXUS
begin data;
    dimensions ntax=8 nchar=316;
    format datatype=protein missing=? gap=-;
matrix
'Dme-Panxδ1' YKLLGSLKSYLKWQIQTDNAVFRLHNSFTTVLLLTCSLIITATQYVGQPISCIVGVP-HVVNTFCWIHSTFTMPDRREVHPGVDF-KYYTYYQWVCFVLFFQAMACYTPKFLWNKFEGGLMRMIVGLNITRKRDALLDYLIKHVKRHKLY-AYWACEFLCCINIIVQMYLMNRFFDGEFLSYGTNIMKLSDVPQEQRVDPMVYVFPRVTKCTFHKYGPSGSLQKHDSLCILPLNIVNEKTYVFIWFWFWILLVLLGL--VFRCIIFPKFRPRLLNASNRIPMECRLDIGDWWLIYMLGRNLDPVIYKDVMSEFQVP
'Dme-Panxδ2' MDVFGSVKGLLKIDQV-DNNVFRMHYKATVIILIAFSLLVTSRQYIGDPIDCIVEIPLGVMDTYCWIYSTFTVPEGRDVQP--GSEKYHKYYQWVCFVLFFQAILFYVPRYLWKSWEGGRLKMLVDLSVNDKDRKIVDYFG-NLNRHNFYAFFFVCEALNFVNVIGQIYFVDFFLDGEFSTYGSDVLKFTELEPDERIDPMARVFPKVTKCTFHKYGPSGSVQTHDGLCVLPLNIVNEKIYVFLWFWFIILSIMSI-SLIYRIAVAPKLRHLLLRARSRAESEVEVAIGDWFLLYQLGKNIDPLIYKEVISDLEMG
'Dme-Panxδ3' -----GFI---K----IDNMVFRCHYRITAILFTC-CIIVTANNLIGDPISCI--IPMHVINTFCWITYTYTV---A--GPGLE-K--HSYYQWVPFVLFFQGLMFYVPHWVWKM-D-GKIRMITG--VDDRDRIL-KYFVNNT--HNGYSFYFFCELLNFINVIVNIFMVDKFLGGAFMSYGTDVLKFSNMDQ-DRFDPMIEIFPRLTKCTFHKFGPSGSVQKHDTLCVLALNILNEKIYIFLWFWFIILATISGVAVLYSVVI---TR-TIR----------K--EGDFLILHFLSQNLSTRSYSDML-Q----
'Dme-Panxδ4' MAAVKPLSKYLQFKVHIYDAIFTLHSKVTVALLLACTFLLSSKQYFGDPIQCF-G-D-KDMDAFCWIYGAYL-QCAVSK--VVEN--YITYYQWVVLVLLLESFVFYMPAFLWKIWEGGRLKHLCDFKRTHRV--LVNYFETHFR----YFVYVFCEILNLSISILNFLLLDVFFGGFWGRYRNALY-------NQWI-AV---FPKCAKCEYKG-GPSGSSNIYDYLCLLPLNILNEKIFAFLWIWFI-LAMLISLKFLYRLAVLYPMRLQLLRPKKHLQVALNCSFGDWFVLMRVGNNISPELFRKLLEEL---
'Dme-Panxδ5' MSAVKPLSKYLQFKIRIYDSVFTIHSRCTVVILLTCSLLLSARQYFGDPIQCI-S-EEKNIESYCWTMGTYYNEASIAE--GVEIRQYLRYYQWVIILLLFQSFVFYFPSCLWKVWEGRRLKQLCEVDNTRRM--LVKYFDMHFC----YMAYVFCEVLNFLISVVNIIVLEVFLNGFWSKYLRALW-------DRWV-SV---FPKIAKCELKF-GGSGTANVMDNLCILPLNILNEKIFVFLWAWFL-LALMSGLNLLCRLAICSRLREQMIRTKRHVKRALDLTIGDWFLMMKVSVNVNPMLFRDLMQEL---
'Dme-Panxδ6' MAAVKPLSNYLRLKVRIYDPIFTLHSKCTIVILLTCTFLLSAKQYFGEPILCL-S-SERQADSYCWTMGTYWNEQSIAE--GVETRMYLRYYQWVFMILLFQSLLFYFPSFLWKVWEGQRMEQLCEVDRTRQM--LTRYFPIHWC----YSIYAFCELLNVFISILNFWLMDVVFNGFWYKYIHALW-------NLWM-RV---FPKVAKCEFVY-GPSGTPNIMDILCVLPLNILNEKIFAVLYVWFL-FALLAIMNILYRLLICCPLRLQLLNPKSHVREVLSAGYGDWFVLMCVSINVNPTLFRELLEQL--D
'Dme-Panxδ7' --L--SV----R-Q-RIDNIVFKLHYRWTVILLVA-TLLITSRQYIGEHIQCL--VVSPVINTFCFFTPTF-VD--P---PGI--D-RHAYYQWVPFVLFFQALCFYIPHALWKW-EGGRIKALVK--LG-MERVKD---IRDM--RLNWG-HVFAEVLNLINLLLQITWTNRFLGGQFLTLG------HALKN-RSDEVV---FPKITKCKFHKFGDSGSIQMHDALCVMALNIMNEKIYIILWFWYAFLLIVTVLGLLWRLCF---VR-WSL----------P-LASNWMFLFFLRSNLS-----E-L----DN
'Dme-Panxδ8' LDIFRGLKNLVKVSVKTDSIVFRLHYSITVMILMSFSLIITTRQYVGNPIDCVTDIP-DVLNTYCWIQSTYTLKSLVSVYPGIGNKKHYKYYQWVCFCLFFQAILFYTPRWLWKSWEGGKIHALIDLDISEKKKLLLDYLWENLRYHNWW-AYYVCELLALINVIGQMFLMNRFFDGEFITFGLKVIDYMETDQEDRMDPMIYIFPRMTKCTFFKYGSSGEVEKHDAICILPLNVVNEKIYIFLWFWFILLTFLTLLTLIYRVIIFPRMRVYLFRMRFRVRRDIEIKMGDWFLLYLLGENIDTVIFRDVVQDLRL-
;
end;  

Usage example 1

$: sb Drosophila.nex -asl

Output

316.0

Usage example 2

$: sb Drosophila.nex -asl clean

Output

289.38

Main Toolkit Pages





Further Reading

Clone this wiki locally