Illegal character "'" at 1:35518 after LexToken(COMMA,',',1,35516) #57

Open
fletchowns opened this issue Sep 3, 2013 · 3 comments

@fletchowns

slimit seems to be having trouble minifying bootstrap-datepicker.js

$ wget https://raw.github.com/eternicode/bootstrap-datepicker/511c1b0241eb9804892df6f9388e0afd00107253/js/bootstrap-datepicker.js
$ slimit bootstrap-datepicker.js

Results in:

Illegal character "'" at 1:35518 after LexToken(COMMA,',',1,35516)
Illegal character '\\' at 1:35519 after LexToken(COMMA,',',1,35516)
Illegal character '\\' at 1:35531 after LexToken(STRING,"').split('",1,35521)
Traceback (most recent call last):
  File "/home/fletch/my-venv/bin/slimit", line 9, in <module>
    load_entry_point('slimit==0.8.1', 'console_scripts', 'slimit')()
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/slimit/minifier.py", line 69, in main
    text, mangle=options.mangle, mangle_toplevel=options.mangle_toplevel)
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/slimit/minifier.py", line 38, in minify
    tree = parser.parse(text)
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/slimit/parser.py", line 93, in parse
    return self.parser.parse(text, lexer=self.lexer, debug=debug)
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/ply/yacc.py", line 265, in parse
    return self.parseopt_notrack(input,lexer,debug,tracking,tokenfunc)
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/ply/yacc.py", line 1047, in parseopt_notrack
    tok = self.errorfunc(errtoken)
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/slimit/parser.py", line 116, in p_error
    self._raise_syntax_error(token)
  File "/home/fletch/my-venv/local/lib/python2.7/site-packages/slimit/parser.py", line 89, in _raise_syntax_error
    self.lexer.prev_token, self.lexer.token())
SyntaxError: Unexpected token (STRING, "').split('") at 1:35521 between LexToken(NUMBER,'0',1,35520) and LexToken(NUMBER,'0',1,35532)
ghost assigned rspivak on Sep 21, 2013
@Arvi3d

Arvi3d commented Mar 27, 2014

I agree and support this request.

@fomojola

Just looked at this: slimit appears to be unable to handle the single-character escape sequence '\0' when it is part of a function call. If you replace

var separators = format.replace(this.validParts, '\0').split('\0'), parts = format.match(this.validParts);

with

// Build the NUL character at runtime so the source contains no '\0' literal.
var escape_null = String.fromCharCode(0);
var separators = format.replace(this.validParts, escape_null), parts = format.match(this.validParts);
separators = separators.split(escape_null);

Then it does the right thing. Seems like an easy enough bug, but my Python isn't good enough to figure out where in the lexer things are going wrong.
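
One way to narrow this down is to probe the escaped-character class from slimit's string regex (the class is copied from the regex quoted further down in this thread). This is only a sketch: the names ESCAPED_CHAR and uncovered_escapes are made up for illustration and are not part of slimit.

```python
import re

# Escaped-character alternative, copied from the string regex quoted in this thread.
ESCAPED_CHAR = re.compile(r"\\[a-zA-Z!-\/:-@\[-`{-~]")


def uncovered_escapes():
    """Return the printable ASCII characters c for which backslash + c is not matched."""
    return [
        chr(code)
        for code in range(0x21, 0x7F)  # printable ASCII, excluding space
        if ESCAPED_CHAR.fullmatch("\\" + chr(code)) is None
    ]


if __name__ == "__main__":
    print("".join(uncovered_escapes()))  # prints 0123456789
```

If the class is as quoted, the digits are the only printable ASCII characters it leaves out, which would suggest that the '\0' escape itself is what trips the lexer, with the function call being incidental.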

@redapple

For what it's worth, I use slimit in js2xml, and I had to add one more alternative, \\\d{1,}, to each of the four escape groups in the lexer's string literal regex. It's probably not accurate (see, for example, http://mathiasbynens.be/notes/javascript-escapes#octal), but it works for me. The original regex:

    string = r"""
    (?:
        # double quoted string
        (?:"                               # opening double quote
            (?: [^"\\\n\r]                 # no \, line terminators or "
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
            )*?                            # zero or many times
            (?: \\\n                       # multiline ?
              (?:
                [^"\\\n\r]                 # no \, line terminators or "
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
              )*?                          # zero or many times
            )*
        ")                                 # closing double quote
        |
        # single quoted string
        (?:'                               # opening single quote
            (?: [^'\\\n\r]                 # no \, line terminators or '
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
            )*?                            # zero or many times
            (?: \\\n                       # multiline ?
              (?:
                [^'\\\n\r]                 # no \, line terminators or '
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
              )*?                          # zero or many times
            )*
        ')                                 # closing single quote
    )
    """ 

became:

    string = r"""
    (?:
        # double quoted string
        (?:"                               # opening double quote
            (?: [^"\\\n\r]                 # no \, line terminators or "
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\\d{1,}                 # or digit escape (added)
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
            )*?                            # zero or many times
            (?: \\\n                       # multiline ?
              (?:
                [^"\\\n\r]                 # no \, line terminators or "
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\\d{1,}                 # or digit escape (added)
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
              )*?                          # zero or many times
            )*
        ")                                 # closing double quote
        |
        # single quoted string
        (?:'                               # opening single quote
            (?: [^'\\\n\r]                 # no \, line terminators or '
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\\d{1,}                 # or digit escape (added)
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
            )*?                            # zero or many times
            (?: \\\n                       # multiline ?
              (?:
                [^'\\\n\r]                 # no \, line terminators or '
                | \\[a-zA-Z!-\/:-@\[-`{-~] # or escaped characters
                | \\\d{1,}                 # or digit escape (added)
                | \\x[0-9a-fA-F]{2}        # or hex_escape_sequence
                | \\u[0-9a-fA-F]{4}        # or unicode_escape_sequence
              )*?                          # zero or many times
            )*
        ')                                 # closing single quote
    )
    """ 

https://github.com/redapple/js2xml/blob/master/js2xml/lexer.py
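
A quick way to check the change is to compare the two sets of escape alternatives on the offending line from bootstrap-datepicker.js. The sketch below is a simplification under stated assumptions: it only models single-quoted strings, drops the line-continuation branch, and the helper name single_quoted is hypothetical, so it is not the actual slimit or js2xml lexer.

```python
import re

# Escape alternatives as they appear in the original regex above.
ORIGINAL_ESCAPES = [
    r"\\[a-zA-Z!-\/:-@\[-`{-~]",  # escaped characters
    r"\\x[0-9a-fA-F]{2}",         # hex escape sequence
    r"\\u[0-9a-fA-F]{4}",         # unicode escape sequence
]
# The patched regex adds one alternative: a backslash followed by digits.
PATCHED_ESCAPES = ORIGINAL_ESCAPES + [r"\\\d{1,}"]


def single_quoted(escapes):
    """Build a simplified single-quoted string pattern (hypothetical helper)."""
    body = "|".join([r"[^'\\\n\r]"] + escapes)
    return re.compile("'(?:" + body + ")*?'")


original = single_quoted(ORIGINAL_ESCAPES)
patched = single_quoted(PATCHED_ESCAPES)

# The line quoted earlier in this thread.
LINE = (
    r"var separators = format.replace(this.validParts, '\0').split('\0'), "
    r"parts = format.match(this.validParts);"
)

if __name__ == "__main__":
    print(original.findall(LINE))
    print(patched.findall(LINE))
```

With the original alternatives, the only string the scan finds is ').split(', the same bogus STRING token that shows up in the traceback above. With the extra alternative, both '\0' literals are found.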

metatoaster added a commit to calmjs/calmjs.parse that referenced this issue Jun 8, 2017
- Also note that the following issues were addressed, where applicable
  to the lexer or parser.

  - rspivak/slimit#52
  - rspivak/slimit#54
  - rspivak/slimit#57
  - rspivak/slimit#59
  - rspivak/slimit#62
  - rspivak/slimit#65
  - rspivak/slimit#70
  - rspivak/slimit#73
  - rspivak/slimit#79
  - rspivak/slimit#81
  - rspivak/slimit#82
  - rspivak/slimit#90

- Will get the release out when I get some sleep.
metatoaster added a commit to metatoaster/calmjs.parse that referenced this issue Sep 18, 2017
- Turns out the fix provided didn't actually fix this exact case, only
  the standard octal was included.
- Demonstrate with the code reported by rspivak/slimit#57, and provide
  the fix.
metatoaster added a commit to metatoaster/calmjs.parse that referenced this issue Sep 19, 2017
- Turns out the fix provided didn't actually fix this exact case, only
  the standard octal was included.
- Demonstrate with the code reported by rspivak/slimit#57, and provide
  the fix.