Cleanup regex diagnostic output #1947

Merged 10 commits on Jan 21, 2020

danmoseley
Member

This updates the diagnostic output produced when you pass RegexOptions.Debug to a debug build of System.Text.RegularExpressions.dll:

  • Log the pattern itself
  • Fix a real bug in the char class output. It was introduced long ago when the implementation changed but the dumping code did not. The table is now generated from the implementation instead.
  • Condense the Boyer-Moore logging into fewer lines; it is probably only interesting when there is a bug in Boyer-Moore.
  • Improve the indenting and spacing of the node and code trees.

Example:

[8340] Pattern:     (abc|de){2,11}
[8340] Capture                  index = 0 
[8340]   Loop                     min = 2, max = 11 
[8340]     Capture                  index = 1 
[8340]       Alternate                 
[8340]         Multi                    abc 
[8340]         Multi                    de 
[8340] Direction:  left-to-right 
[8340] Firstchars: [ad] 
[8340] Prefix:     n/a 
[8340] Anchors:    None 
[8340]  
[8340] 000000 *Lazybranch       addr = 23 
[8340] 000002 *Setmark           
[8340] 000003 *Setcount         value = -1 
[8340] 000005 *Setmark           
[8340] 000006 *Lazybranch       addr = 12 
[8340] 000008  Multi            abc 
[8340] 000010 *Goto             addr = 14 
[8340] 000012  Multi            de 
[8340] 000014 *Capturemark      index = 1 
[8340] 000017 *Branchcount      addr = 5, limit = 9 
[8340] 000020 *Capturemark      index = 0 
[8340] 000023  Stop              
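
For readers who want to sanity-check the dump above against the pattern's behaviour, here is a minimal sketch. It uses Python's `re` module as a stand-in, which is an assumption: it is not the System.Text.RegularExpressions engine, although this particular pattern has the same meaning in both. The helper names `first_chars` and `matches_whole` are hypothetical and exist only for illustration.

```python
import re

# The pattern from the example dump, compiled with Python's re module.
# Assumption: Python's engine stands in for .NET's; the syntax used here
# (alternation, capture group, bounded repeat) means the same in both.
PATTERN = re.compile(r"(abc|de){2,11}")


def first_chars(alternatives):
    """Return the sorted characters that the given literal branches start with.

    For the branches "abc" and "de" this mirrors the "Firstchars: [ad]" line.
    """
    return sorted({alt[0] for alt in alternatives})


def matches_whole(text):
    """Return True if the entire string is 2 to 11 repetitions of abc or de."""
    return PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    print(first_chars(["abc", "de"]))
    print(matches_whole("abcde"))    # two repetitions: abc, de
    print(matches_whole("abc"))      # one repetition, below the minimum of 2
    print(matches_whole("de" * 12))  # twelve repetitions, above the maximum of 11
```

The loop bounds `min = 2, max = 11` in the node tree correspond to the `{2,11}` quantifier exercised here.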

I did not change the execution logging, although that's very verbose and possibly ought to have a separate toggle.

And boo to all apps that ship without removing their debug output! I've opened bugs against VS for this in the past; they fixed them, and now I have another to open. That said, VS is not the worst culprit.

@danmoseley
Member Author

cc @pgovind

danmoseley and others added 3 commits January 20, 2020 19:31
@danmoseley
Member Author

OK now?

@danmoseley
Member Author

Also noticed RegexParser.RightChar could be private.

danmoseley and others added 2 commits January 20, 2020 20:43
…egularExpressions/RegexBoyerMoore.cs

Co-Authored-By: Stephen Toub <stoub@microsoft.com>
@stephentoub merged commit 539d9e6 into dotnet:master Jan 21, 2020
@danmoseley deleted the regex.output branch January 21, 2020 17:38
@ghost locked as resolved and limited conversation to collaborators Dec 11, 2020