Migrate regex linter rules to use parsed patterns #5416

rzvxa · 2024-09-02T23:30:12Z

Merging #5256 adds support for optional parsed regex literals. We can now enhance our regex-related linter rules to use the parsed pattern whenever it is available.

Here is the list of rules that operate on regex patterns and were written before the regex parser was introduced.

Boshen · 2024-09-03T00:54:27Z

I'll add "good first issue" when the required PRs are merged.

Part of #5416. Paves the way for upcoming refactors by adding the `oxc_regular_expression` dependency and a helper method for ease of access.

…ace-all` rule (#5943) - part of #5416 Replaces the `is_simple_string` method with a more robust check against the parsed terms from the regular expression.

…s-ends-with` (#5949) - part of #5416

This change enhances the accuracy of the `prefer_string_starts_ends_with` rule by using the parsed regex patterns for analysis. It allows for more precise detection of patterns that can be replaced with `startsWith()` and `endsWith()` methods, reducing false positives and improving the overall effectiveness of the linter.

### What changed?

- Replaced the simple string-based regex analysis with a more robust AST-based approach.
- Removed the `is_simple_string` function as it's no longer needed.
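To illustrate what an AST-based check of this kind could look like, here is a minimal sketch. It does not use the real `oxc_regular_expression` types: `Term`, `Suggestion`, and `suggest` are hypothetical names made up for this example, and the actual rule may handle more cases.

```rust
// Hypothetical, heavily simplified regex AST.
// The real `oxc_regular_expression` AST is richer and named differently.
#[derive(Debug, PartialEq)]
enum Term {
    StartAnchor,   // `^`
    EndAnchor,     // `$`
    Literal(char), // a plain character
    Other,         // classes, groups, quantifiers, ...
}

#[derive(Debug, PartialEq)]
enum Suggestion {
    StartsWith(String),
    EndsWith(String),
}

// Returns a suggestion only when the pattern is an anchor plus
// a non-empty run of plain literal characters.
fn suggest(terms: &[Term]) -> Option<Suggestion> {
    // Collects the terms into a string, or None if any term is not a literal.
    fn literal_text(terms: &[Term]) -> Option<String> {
        if terms.is_empty() {
            return None;
        }
        terms
            .iter()
            .map(|t| match t {
                Term::Literal(c) => Some(*c),
                _ => None,
            })
            .collect()
    }

    match terms {
        [Term::StartAnchor, rest @ ..] => literal_text(rest).map(Suggestion::StartsWith),
        [rest @ .., Term::EndAnchor] => literal_text(rest).map(Suggestion::EndsWith),
        _ => None,
    }
}
```

Because the check works on terms rather than on the raw pattern text, a pattern such as `/^a.b/` (where `.` would be an `Other` term in this sketch) is rejected without any string inspection.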

- part of #5416 Use the `oxc_regular_expression` parser to make these checks more robust. A few snapshots are updated because they now output more accurate diagnostics based on the regex AST. For example, `/   ?/` now correctly highlights only two spaces rather than three (because the last one is part of a quantifier).
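The quantifier case described above can be sketched as follows. This is only an illustration under assumptions: `Term`, `plain`, `quantified`, and `consecutive_space_runs` are hypothetical names, and the pattern is represented as an already-parsed list of terms rather than through the real parser.

```rust
// Hypothetical flattened view of a parsed pattern: one entry per character term,
// with a flag saying whether a quantifier (`?`, `*`, `+`, `{n}`) applies to it.
struct Term {
    ch: char,
    quantified: bool,
}

fn plain(ch: char) -> Term {
    Term { ch, quantified: false }
}

fn quantified(ch: char) -> Term {
    Term { ch, quantified: true }
}

// Returns (start index, length) for every run of two or more consecutive
// spaces that are not themselves the target of a quantifier.
fn consecutive_space_runs(terms: &[Term]) -> Vec<(usize, usize)> {
    let mut runs = Vec::new();
    let mut run_start: Option<usize> = None;

    for (i, term) in terms.iter().enumerate() {
        let plain_space = term.ch == ' ' && !term.quantified;
        match (plain_space, run_start) {
            (true, None) => run_start = Some(i),
            (false, Some(start)) => {
                if i - start >= 2 {
                    runs.push((start, i - start));
                }
                run_start = None;
            }
            _ => {}
        }
    }

    // Close a run that extends to the end of the pattern.
    if let Some(start) = run_start {
        if terms.len() - start >= 2 {
            runs.push((start, terms.len() - start));
        }
    }
    runs
}
```

For three spaces followed by `?`, the terms are two plain spaces and one quantified space, so the only reported run covers the first two spaces.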

…5974) - part of #5416 Replaces the handwritten regex parsing logic with the `oxc_regular_expression` parser, which should be more accurate and enables support for unicode sets.

…ule (#5980) - part of #5416 Uses the parsed regular expression patterns for detecting empty character classes. This is more robust than the handwritten pattern matching from before and allows us to provide more accurate diagnostics and actually point to the empty character class in the literal.
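As a rough sketch of why a parsed pattern makes this easier, the example below walks a toy AST and collects the span of each empty character class. The types and the function `empty_class_spans` are hypothetical and not the real `oxc_regular_expression` API. It also assumes, as ESLint's version of the rule does, that a negated empty class (`[^]`) is not reported.

```rust
// Hypothetical, simplified regex AST with byte spans (start, end-exclusive).
#[derive(Debug)]
enum Term {
    Char(char),
    Class {
        negated: bool,
        members: Vec<char>,
        span: (usize, usize),
    },
    Group(Vec<Term>),
}

// Recursively collects the spans of non-negated character classes with no members.
fn empty_class_spans(terms: &[Term], out: &mut Vec<(usize, usize)>) {
    for term in terms {
        match term {
            Term::Class { negated: false, members, span } if members.is_empty() => {
                out.push(*span);
            }
            Term::Group(inner) => empty_class_spans(inner, out),
            _ => {}
        }
    }
}
```

Since each class node carries its own span, the diagnostic can point at the `[]` itself instead of the whole literal.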

) - part of #5416

This pull request includes significant improvements to the `no_hex_escape` rule in the `oxc_linter` crate. The changes enhance the detection and replacement of hexadecimal escapes within regular expressions by introducing a more comprehensive AST traversal.

- Implemented a new `visit_terms` function and its helper functions to traverse the regex AST and apply checks on individual terms.
- Introduced the `check_character` function to replace hexadecimal escapes with Unicode escapes within regex patterns.
- Updated snapshots to reflect the new diagnostic messages and replacements for hexadecimal escapes in regex patterns.
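The traversal described above might look roughly like the sketch below. The function names `visit_terms` and `check_character` come from the PR description, but their signatures here, the `Term` type, and the `hex_escape_fixes` helper are assumptions made for illustration and not the actual `oxc_linter` code.

```rust
// Hypothetical, simplified regex AST: each character keeps its raw source text.
enum Term {
    Character { raw: String },
    Group(Vec<Term>),
    Class(Vec<Term>),
}

// Walks every term, descending into groups and classes, and calls `on_char`
// with the raw text of each character term. (Assumed signature.)
fn visit_terms(terms: &[Term], on_char: &mut dyn FnMut(&str)) {
    for term in terms {
        match term {
            Term::Character { raw } => on_char(raw.as_str()),
            Term::Group(inner) | Term::Class(inner) => visit_terms(inner, on_char),
        }
    }
}

// If `raw` is a `\xNN` escape, returns the equivalent `\u00NN` escape.
// (Assumed signature.)
fn check_character(raw: &str) -> Option<String> {
    let digits = raw.strip_prefix("\\x")?;
    if digits.len() == 2 && digits.chars().all(|c| c.is_ascii_hexdigit()) {
        Some(format!("\\u00{}", digits))
    } else {
        None
    }
}

// Hypothetical helper: collects the replacement text for every hex escape found.
fn hex_escape_fixes(terms: &[Term]) -> Vec<String> {
    let mut fixes = Vec::new();
    visit_terms(terms, &mut |raw: &str| {
        if let Some(fix) = check_character(raw) {
            fixes.push(fix);
        }
    });
    fixes
}
```

Separating the walk from the per-character check keeps the check itself trivial to test, and the same walk reaches escapes nested inside groups and character classes.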

- closes #5416

Rewrites the `no-control-regex` rule to use a regular expression AST visitor instead of the `regex` crate and parsing by hand. This change simplifies the code and makes it easier to maintain.

One notable change in the snapshots is the printing of the control characters. Previously, we always printed from the source text. Now, we print a representation of the control character itself based on its numeric value. This resulted in the nonprintable chars being printed, which are invisible.

The other reason for this change is that the spans output by the regex parser for unicode escapes do not match 1:1 when raw strings and escapes are involved. This resulted in goofy looking spans in the output:

```
  ⚠ eslint(no-control-regex): Unexpected control character: '*\\x'
   ╭─[no_control_regex.tsx:1:22]
 1 │ new RegExp('\\u{1111}*\\x1F', 'u')
   ·                      ────
   ╰────
```

Not sure where the bug lies there yet.

rzvxa added the C-enhancement Category - New feature or request label Sep 2, 2024

This was referenced Sep 2, 2024

feat(ast, parser): add oxc_regular_expression types to the parser and AST. #5256

Merged

refactor(linter): use "parsed pattern" in no_div_regex rule. #5417

Merged

rzvxa added a commit that referenced this issue Sep 4, 2024

refactor(linter): use "parsed pattern" in no_div_regex rule. (#5417)

df7eddd


rzvxa added a commit that referenced this issue Sep 4, 2024

refactor(linter): use "parsed pattern" in no_div_regex rule. (#5417)

fdb8857


Boshen added the good first issue Experience Level - Good for newcomers label Sep 4, 2024

This was referenced Sep 21, 2024

refactor(linter): Use parsed patterns for unicorn/prefer-string-replace-all rule #5943

Merged

refactor(linter): Use parsed patterns in unicorn/prefer-string-starts-ends-with #5949

Merged

camchenry mentioned this issue Sep 21, 2024

refactor(linter): use regex parser in eslint/no-regex-spaces #5952

Merged

camchenry mentioned this issue Sep 22, 2024

feat(linter): add unicode sets support to no-useless-escape rule #5974

Merged

camchenry mentioned this issue Sep 22, 2024

refactor(linter): use parsed patterns in no-empty-character-class rule #5980

Merged

camchenry mentioned this issue Sep 23, 2024

refactor(linter): use parsed patterns for unicorn/no-hex-escape #5985

Merged

camchenry self-assigned this Sep 26, 2024

camchenry mentioned this issue Sep 27, 2024

refactor(linter): use regexp AST visitor in no-control-regex #6129

Merged

camchenry closed this as completed Sep 28, 2024
