SB Delete recs with feature

Steve Bond edited this page Aug 11, 2017 · 2 revisions

--delete_recs_with_feature, -drf

Implemented in version 1.3

Description

Remove every record that has at least one feature whose name/type matches a regular expression pattern.
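The filtering logic can be pictured with a minimal Python sketch. This is not SeqBuddy's actual implementation: the `records` dictionary is a hypothetical stand-in for parsed GenBank records (it lists only the feature types from the example input file further down this page), and `delete_recs_with_feature` is an illustrative helper name. It also assumes a record is dropped as soon as any one pattern matches any one feature type.

```python
import re

# Hypothetical stand-in for parsed records: record ID -> list of feature types.
records = {
    "Mle-Panxα7A": ["CDS", "TMD1", "TMD2", "TMD3", "splice_donor",
                    "splice_acceptor", "TMD4", "splice_donor"],
    "Mle-Panxα1": ["CDS", "TMD1", "TMD2", "TMD3", "TMD4"],
    "Mle-Panxα3": ["CDS", "enzyme_acceptor", "TMD1", "TMD2", "TMD3", "TMD4"],
}


def delete_recs_with_feature(records, patterns):
    """Return the IDs of records in which no feature type matches any pattern."""
    compiled = [re.compile(p) for p in patterns]
    return [
        rec_id
        for rec_id, feature_types in records.items()
        # Assumption: a single match anywhere in the record removes it.
        if not any(rx.search(ft) for rx in compiled for ft in feature_types)
    ]


kept = delete_recs_with_feature(records, ["splice"])
print(kept)
```

Under these assumptions, searching for `splice` leaves only the two records that carry no splice feature, which agrees with usage example 1.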

Argument

One or more search strings (regex)

Supply as many plain strings or regular expressions as you like. To keep the shell from interpreting special characters, make a habit of wrapping each search term in 'single quotes'.
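Why the quotes matter can be illustrated with Python's standard `shlex` module, which approximates (but does not fully reproduce) how a POSIX shell splits a command line into tokens. The `tokenize` helper and the file name `seqs.gb` are hypothetical and exist only for this sketch; no command is actually run.

```python
import shlex


def tokenize(command):
    """Split a command line roughly the way a POSIX shell would."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    return list(lexer)


# Quoted: the whole pattern survives as a single argument.
quoted = tokenize("sb seqs.gb -drf '(splice|enzyme)_acceptor'")

# Unquoted: the parentheses and the pipe are split off as shell operators,
# so the pattern never reaches the program in one piece.
unquoted = tokenize("sb seqs.gb -drf (splice|enzyme)_acceptor")

print(quoted)
print(unquoted)
```

In a real shell the unquoted form would typically fail with a syntax error or be split into a pipeline, rather than pass the pattern through intact.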

Examples

Input file: Mnemiopsis_cds.gb

LOCUS       Mle-Panxα7A             1560 bp    DNA              UNA 02-JAN-2015
DEFINITION  cDNA - ML218922a-1.
ACCESSION   Mle-Panxα7A
VERSION     Mle-Panxα7A
KEYWORDS    .
SOURCE      
  ORGANISM  . .
            .
FEATURES             Location/Qualifiers
     CDS             order(1..143,144..292,293..407,408..528,529..675,676..862,
                     863..1560)
                     /label="ML218922a"
                     /modified_by="User"
                     /created_by="User"
     TMD1            88..150
     TMD2            403..465
     TMD3            631..723
     splice_donor    853..862
                     /label="Donor"
                     /created_by="User"
     splice_acceptor complement(863..872)
                     /label="Acceptor"
                     /created_by="User"
     TMD4            916..1008
     splice_donor    1541..1550
                     /label="Donor"
                     /created_by="User"
ORIGIN
        1 atgggggtgg aaattctgtt tcccataatc aacagagcca ccgctccgat caagtctgtt
       61 aacatcgacg atttgagtag tcagctcaac cgaactttta tgttttactt atcgctgact
      121 ttcgccatca ctatcaccat caggcagcag ctaggcggag cgtacattgc ttgtgacgga
      181 ttctccagag acgaggaata tgaacggttt gcagaggagt ggtgctggag tagtggaatc
      241 tacactatca aggaggctta tgagatgagc aacagagtca gcccttatcc gggaataatc
      301 ccagaaaatc taccagcctg tatagagatg gagctgatat ctgggggcag agtagagtgt
      361 cctgaagaga aagacgtcaa gcctttcacc aggatatacc aatcttggta cccatttgtg
      421 atgttttact attggctgac tgctttgatg tttttcttgc cgtaccagct ttacaaggtc
      481 tttggttttg aagacgtcaa agcggttgta gctatgttgc agaacccggt agaggatggc
      541 tttgagaaga aggagctgat aaaaaggggt tcagtgtggc tgtatcttaa atctacaatg
      601 accctttcca acccttcagt ttactcgagc ttcatcgtga aacacagcct agctttctac
      661 gctcttactg tcaaggttat gtatttgggg aacacacttc ttatgtactg gctgactcac
      721 aaaatgttca agtttggatc gtttgcggag tacggtcttc tctgggacac aagaaaccca
      781 cttaacaacg tccagagcct tgttcaagag aaattgttcc caaaagttgc agcgtgcgaa
      841 gtaaagcgct ttggtgcatc gggacttgag gaggaccaag gtatgtgcat gctggctcta
      901 aacgtcctta accagtacct cttcttaatc ttctggttct gtctcctttt cgtgacaata
      961 gtcaacacca tatccctcct cctcaccctc cttaacatca tctctccttg ctttatgctc
     1021 caacagtttc tcctggcctc ctctcttgat aggagtccag ctgtcggtgt catatccaag
     1081 ctgtatcttg actgtggttc ctcgctaagg ttcatcatga ctatatttgc ttggaatgta
     1141 gacccaaaat tgtttgggga gattttagta cagctcaact ctctccttgc caaggatgaa
     1201 tcgcccaggg ctgaagtgct gaagcggcgt tcaaaaaaag tgaaagttcc aagtccaaga
     1261 aagcctaaac tcttatttca tgaagaaatt aagaaaaagt taataaagag gactgaacga
     1321 aaagatgata acctaaccaa tttcacgaat atgtcgaaaa ttagcaagaa gtttgaaggc
     1381 ctcaaaaaaa gaaatctcct tcagacgaaa agcatcatta atgtaagcgt tcctaagaaa
     1441 atgagcgagt tagaagtaga agaagatttc attcttactc ctactgaaga aagtggtatc
     1501 caaaacaacc ctgacaccaa gtatgctcaa gaggatgtac tggactcaga gtacgtagtg
//
LOCUS       Mle-Panxα1              960 bp    DNA              UNA 02-JAN-2015
DEFINITION  cDNA - ML078817.
ACCESSION   Mle-Panxα1
VERSION     Mle-Panxα1
KEYWORDS    .
SOURCE      
  ORGANISM  . .
            .
FEATURES             Location/Qualifiers
     CDS             order(1..572,573..960)
                     /label="ML078817"
                     /created_by="User"
     TMD1            91..153
     TMD2            373..435
     TMD3            625..687
     TMD4            871..960
ORIGIN
        1 atgtactgga tatttgagat ttgtcaagag ataaagcgag ctcaatcctg ccgaaagttc
       61 gcgatagacg gaccattcga ctggacgaac cggattatca tgccaacact catggtaatc
      121 tgctgctttc tccaaacctt caccttcatg ttcggcagca acatcagctg tatcggcttc
      181 gagaagttgg aaaggaactt tgtggaggag tactgctgga cccagggtat ctatacaagc
      241 aaggctgcgt ataacatgcc attacatact ccctacccgg ggattgcccc ctgtgtgccc
      301 gagtatgatc ccgtgactca gaagtattgg ttaccctgtg gggtggagga agaagacaag
      361 gcttatcatt tgtggtatca gtgggttccg ttttactttc tcgctgtggc cgtgggttat
      421 tatttgccat ttcttatctt gaagggttca aagctgcatc aggtgaagcc gctgattacg
      481 tatttgatga accagaggaa cctggagact gatcctaacc atttggtagg aaagctatcg
      541 cattggatct tcagacagct tgtttattca aggtttgcgg ccacctctac aatcagaatg
      601 tactggcacg actgggggct tgtcctcctt gtttgctctg taaagatcct ctaccttacc
      661 gtctctctta tccacctctt tgccactgcc aagatgttcc acatcggcaa ctggtttacg
      721 tacgggatca tgttcgcgcg gcgcagcaac agtcacacta cccacgttaa ggatgtgttc
      781 ttcccgaaga tggtggcctg taagatcgag acatggagtt tcacagggaa gaatcatctt
      841 cacgggatgt gtgttttagc tctgaacgtg atgaaccaat atttgttttt gatcgtgtgg
      901 tacgtcaacg taatcatcat cttcctcaac agtatcagct gtatttacac tatagtcaag
//
LOCUS       Mle-Panxα3              1020 bp    DNA              UNA 02-JAN-2015
DEFINITION  cDNA - ML036514a.
ACCESSION   Mle-Panxα3
VERSION     Mle-Panxα3
KEYWORDS    .
SOURCE      
  ORGANISM  . .
            .
FEATURES             Location/Qualifiers
     CDS             order(1..151,152..334,335..456,457..549,550..726,727..845,
                     846..1020)
                     /label="ML036514a"
                     /modified_by="User"
                     /created_by="User"
     enzyme_acceptor 14..46
     TMD1            85..147
     TMD2            394..456
     TMD3            652..714
     TMD4            904..996
ORIGIN
        1 atgttgttgc tcggctcact cggaacgatc aagaacttga gcatcttcaa agacctgtcc
       61 ttggacgact ggctggatca gatgaacagg accttcatgt ttctactgct ctgtttcatg
      121 ggaacaattg tcgccgttag tcagtacact ggtaaaaaca tatcttgcga tggctttacg
      181 aagttcggag aagatttctc gcaagactac tgctggaccc agggcttgta cacgattaaa
      241 gaagcgtacg acttgcccga gtcccagatc ccgtatcctg ggattatccc tgaaaacgtg
      301 ccggcatgta gagagcacgc tctgaaaaac ggaggaaaga tagtctgccc tcctgaagat
      361 caagtgaagc ccctgacccg ggctcgacat ctctggtacc agtggatacc tttctacttc
      421 tgggtgatag ctccagtctt ctatctccct tacatgtttg tgaaaaggat gggacttgac
      481 agaatgaaac ctctgttgaa gatcatgagc gactactacc actgcactac agagacacct
      541 tcagaggaga taatagtgaa gtgtgcagac tgggtataca acagtatagt agacaggctg
      601 tcagagggca gcagctggac aagctggaga aacagacacg gtcttggtct ggctgtcttg
      661 gtcagcaagt tcatgtatct cggaggtagt gtcctcgtca tgatgatgac cactctcatg
      721 ttccaggttg gtgatttcaa gacgtacggt atagagtggt tgaggcagtt ccctaatcca
      781 gaaaactatt cgacctcagt taaacacaaa ctattcccca aaatggtagc ctgtgagata
      841 aaacgatggg gcactaccgg gctagaagag gagaatggaa tgtgtgtcct tgccccgaat
      901 gtcatctacc agtacatttt tctaatcatg tggttcgctc tagccatcac catatgcacc
      961 aacttcggca acatattttt ctatctcttc aagctgacag ccactagata cacttacaac
//

Usage example 1

$: sb Mnemiopsis_cds.gb -drf 'splice'

Output

LOCUS       Mle-Panxα1               960 bp    DNA              UNA 02-JAN-2015
DEFINITION  cDNA - ML078817.
ACCESSION   Mle-Panxα1
VERSION     Mle-Panxα1
KEYWORDS    .
SOURCE
  ORGANISM  .
            .
FEATURES             Location/Qualifiers
     CDS             order(1..572,573..960)
                     /label="ML078817"
                     /created_by="User"
     TMD1            91..153
     TMD2            373..435
     TMD3            625..687
     TMD4            871..960
ORIGIN
        1 atgtactgga tatttgagat ttgtcaagag ataaagcgag ctcaatcctg ccgaaagttc
       61 gcgatagacg gaccattcga ctggacgaac cggattatca tgccaacact catggtaatc
      121 tgctgctttc tccaaacctt caccttcatg ttcggcagca acatcagctg tatcggcttc
      181 gagaagttgg aaaggaactt tgtggaggag tactgctgga cccagggtat ctatacaagc
      241 aaggctgcgt ataacatgcc attacatact ccctacccgg ggattgcccc ctgtgtgccc
      301 gagtatgatc ccgtgactca gaagtattgg ttaccctgtg gggtggagga agaagacaag
      361 gcttatcatt tgtggtatca gtgggttccg ttttactttc tcgctgtggc cgtgggttat
      421 tatttgccat ttcttatctt gaagggttca aagctgcatc aggtgaagcc gctgattacg
      481 tatttgatga accagaggaa cctggagact gatcctaacc atttggtagg aaagctatcg
      541 cattggatct tcagacagct tgtttattca aggtttgcgg ccacctctac aatcagaatg
      601 tactggcacg actgggggct tgtcctcctt gtttgctctg taaagatcct ctaccttacc
      661 gtctctctta tccacctctt tgccactgcc aagatgttcc acatcggcaa ctggtttacg
      721 tacgggatca tgttcgcgcg gcgcagcaac agtcacacta cccacgttaa ggatgtgttc
      781 ttcccgaaga tggtggcctg taagatcgag acatggagtt tcacagggaa gaatcatctt
      841 cacgggatgt gtgttttagc tctgaacgtg atgaaccaat atttgttttt gatcgtgtgg
      901 tacgtcaacg taatcatcat cttcctcaac agtatcagct gtatttacac tatagtcaag
//
LOCUS       Mle-Panxα3              1020 bp    DNA              UNA 02-JAN-2015
DEFINITION  cDNA - ML036514a.
ACCESSION   Mle-Panxα3
VERSION     Mle-Panxα3
KEYWORDS    .
SOURCE
  ORGANISM  .
            .
FEATURES             Location/Qualifiers
     CDS             order(1..151,152..334,335..456,457..549,550..726,727..845,
                     846..1020)
                     /label="ML036514a"
                     /modified_by="User"
                     /created_by="User"
     enzyme_acceptor 14..46
     TMD1            85..147
     TMD2            394..456
     TMD3            652..714
     TMD4            904..996
ORIGIN
        1 atgttgttgc tcggctcact cggaacgatc aagaacttga gcatcttcaa agacctgtcc
       61 ttggacgact ggctggatca gatgaacagg accttcatgt ttctactgct ctgtttcatg
      121 ggaacaattg tcgccgttag tcagtacact ggtaaaaaca tatcttgcga tggctttacg
      181 aagttcggag aagatttctc gcaagactac tgctggaccc agggcttgta cacgattaaa
      241 gaagcgtacg acttgcccga gtcccagatc ccgtatcctg ggattatccc tgaaaacgtg
      301 ccggcatgta gagagcacgc tctgaaaaac ggaggaaaga tagtctgccc tcctgaagat
      361 caagtgaagc ccctgacccg ggctcgacat ctctggtacc agtggatacc tttctacttc
      421 tgggtgatag ctccagtctt ctatctccct tacatgtttg tgaaaaggat gggacttgac
      481 agaatgaaac ctctgttgaa gatcatgagc gactactacc actgcactac agagacacct
      541 tcagaggaga taatagtgaa gtgtgcagac tgggtataca acagtatagt agacaggctg
      601 tcagagggca gcagctggac aagctggaga aacagacacg gtcttggtct ggctgtcttg
      661 gtcagcaagt tcatgtatct cggaggtagt gtcctcgtca tgatgatgac cactctcatg
      721 ttccaggttg gtgatttcaa gacgtacggt atagagtggt tgaggcagtt ccctaatcca
      781 gaaaactatt cgacctcagt taaacacaaa ctattcccca aaatggtagc ctgtgagata
      841 aaacgatggg gcactaccgg gctagaagag gagaatggaa tgtgtgtcct tgccccgaat
      901 gtcatctacc agtacatttt tctaatcatg tggttcgctc tagccatcac catatgcacc
      961 aacttcggca acatattttt ctatctcttc aagctgacag ccactagata cacttacaac
//

Usage example 2

$: sb Mnemiopsis_cds.gb -drf '(splice|enzyme)_acceptor'

Output

LOCUS       Mle-Panxα1               960 bp    DNA              UNA 02-JAN-2015
DEFINITION  cDNA - ML078817.
ACCESSION   Mle-Panxα1
VERSION     Mle-Panxα1
KEYWORDS    .
SOURCE
  ORGANISM  .
            .
FEATURES             Location/Qualifiers
     CDS             order(1..572,573..960)
                     /label="ML078817"
                     /created_by="User"
     TMD1            91..153
     TMD2            373..435
     TMD3            625..687
     TMD4            871..960
ORIGIN
        1 atgtactgga tatttgagat ttgtcaagag ataaagcgag ctcaatcctg ccgaaagttc
       61 gcgatagacg gaccattcga ctggacgaac cggattatca tgccaacact catggtaatc
      121 tgctgctttc tccaaacctt caccttcatg ttcggcagca acatcagctg tatcggcttc
      181 gagaagttgg aaaggaactt tgtggaggag tactgctgga cccagggtat ctatacaagc
      241 aaggctgcgt ataacatgcc attacatact ccctacccgg ggattgcccc ctgtgtgccc
      301 gagtatgatc ccgtgactca gaagtattgg ttaccctgtg gggtggagga agaagacaag
      361 gcttatcatt tgtggtatca gtgggttccg ttttactttc tcgctgtggc cgtgggttat
      421 tatttgccat ttcttatctt gaagggttca aagctgcatc aggtgaagcc gctgattacg
      481 tatttgatga accagaggaa cctggagact gatcctaacc atttggtagg aaagctatcg
      541 cattggatct tcagacagct tgtttattca aggtttgcgg ccacctctac aatcagaatg
      601 tactggcacg actgggggct tgtcctcctt gtttgctctg taaagatcct ctaccttacc
      661 gtctctctta tccacctctt tgccactgcc aagatgttcc acatcggcaa ctggtttacg
      721 tacgggatca tgttcgcgcg gcgcagcaac agtcacacta cccacgttaa ggatgtgttc
      781 ttcccgaaga tggtggcctg taagatcgag acatggagtt tcacagggaa gaatcatctt
      841 cacgggatgt gtgttttagc tctgaacgtg atgaaccaat atttgttttt gatcgtgtgg
      901 tacgtcaacg taatcatcat cttcctcaac agtatcagct gtatttacac tatagtcaag
//
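
To see why only Mle-Panxα1 survives in usage example 2, the pattern can be tried against the feature types of each input record with Python's `re` module. This is a sketch for illustration: the `feature_types` dictionary is hand-copied from the input file above, and it assumes the pattern is applied as an unanchored search against each feature type, which may differ in detail from what SeqBuddy does internally.

```python
import re

pattern = re.compile(r"(splice|enzyme)_acceptor")

# Feature types copied by hand from the three records in Mnemiopsis_cds.gb.
feature_types = {
    "Mle-Panxα7A": ["CDS", "TMD1", "TMD2", "TMD3", "splice_donor",
                    "splice_acceptor", "TMD4", "splice_donor"],
    "Mle-Panxα1": ["CDS", "TMD1", "TMD2", "TMD3", "TMD4"],
    "Mle-Panxα3": ["CDS", "enzyme_acceptor", "TMD1", "TMD2", "TMD3", "TMD4"],
}

# Collect the feature types in each record that the pattern matches.
hits = {
    rec_id: [ft for ft in fts if pattern.search(ft)]
    for rec_id, fts in feature_types.items()
}

for rec_id, matched in hits.items():
    status = "removed" if matched else "kept"
    print(f"{rec_id}: {status} {matched}")
```

Note that `splice_donor` does not match this pattern; Mle-Panxα7A is removed because of its `splice_acceptor` feature, and Mle-Panxα3 because of `enzyme_acceptor`.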
