Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: swap order of escape regexes to avoid lookahead #546

Merged
merged 3 commits into from
Mar 8, 2023
Merged

perf: swap order of escape regexes to avoid lookahead #546

merged 3 commits into from
Mar 8, 2023

Conversation

ngavalas
Copy link
Contributor

@ngavalas ngavalas commented Mar 8, 2023

When looking at some CPU profiles of metric -> string serialization, I noticed that escaping all of the strings and labels was showing up in the flamegraph more than I would expect. This isn't a massive win by any means, but it's pretty consistently ~20% faster on a microbenchmark: https://jsbench.me/qklez5ombn/1. I'll take 20%.

Some notes:

  1. The benchmarks literally don't test this case, as far as I can tell. None of the registry benchmarks have quotes or newlines or anything in the labels? An A/A test (master vs. 14.2.0) was producing inconsistent results, so I'm not quite sure I trust that benchmark anyway.
  2. Correctness-wise, this should work. Replacing the \ before replacing the \n fixes the case that made you need the .replace(/\\(?!n)/g, '\\\\') regex in the first place, so now we can just do \\ -> \\\\ and \n -> \\n and " -> \\" in that order without lookaheads.

Relevant to #543.

@zbjornson
Copy link
Collaborator

zbjornson commented Mar 8, 2023

Nice. prom-client still support Node.js 10 though, and String.prototype.replaceAll wasn't added until Node.js 15.0.0. I think the other maintainers are fine with bumping the minimum supported version, but we should maybe wait until Apr 30 when Node.js v14.x is EOL.

Did you benchmark just reversing the two regexps so that a lookbehindahead isn't needed? I'm curious if that would give a similar speedup.

@ngavalas
Copy link
Contributor Author

ngavalas commented Mar 8, 2023

Totally forgot that we used to live in a world without replaceAll. Dark times.

From the microbenchmark, it seems like the swapped regex is as fast, if not faster (!!!). Somewhat surprising given that replaceAll is like the simplest case of regex matching, so I'd expect it to be as fast as the equivalent regex, but...

https://jsbench.me/qklez5ombn/1

I'm happy to just swap the regex order and call it a day given that it seems to be the winner anyway. The thing that really matters is that these operations (original regex vs. swapped w/o lookahead) are equivalent; am I missing anything?

FWIW I did make sure that on my test string (const test = 'askjdskajldbhflkashfdkja\nsakljdalsjdaslkdjalsdjlkasd""""""sadjklasd\adqweojkioe23mac\n\n\nasdlkjqodq\qasjkdadasdaskdjqlk;j'), removing the lookahead but retaining the original regex order leads to the wrong results (expected) and swapped them without the lookahead gives the same results as the original code (which is what we wanted).

@ngavalas ngavalas changed the title perf: replace escape regexes with simpler string.replaceAll perf: swap order of escape regexes to avoid lookahead Mar 8, 2023
@ngavalas
Copy link
Contributor Author

ngavalas commented Mar 8, 2023

I'll update the changelog after all the checks come back green to avoid an extra roundtrip if something's still wrong. My bad. :)

@zbjornson
Copy link
Collaborator

Awesome, easy win.

Yes, the behavior from just swapping the regexps looks correct.

Copy link
Collaborator

@zbjornson zbjornson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@zbjornson zbjornson merged commit 199b7d1 into siimon:master Mar 8, 2023
SimenB pushed a commit that referenced this pull request Mar 9, 2023
* swap regex order to avoid lookahead

* changelog

* changelog formatting apparently
SimenB pushed a commit that referenced this pull request Mar 9, 2023
* swap regex order to avoid lookahead

* changelog

* changelog formatting apparently
SimenB pushed a commit that referenced this pull request Mar 9, 2023
* swap regex order to avoid lookahead

* changelog

* changelog formatting apparently
SimenB pushed a commit that referenced this pull request Mar 9, 2023
* swap regex order to avoid lookahead

* changelog

* changelog formatting apparently
SimenB pushed a commit that referenced this pull request Mar 9, 2023
* swap regex order to avoid lookahead

* changelog

* changelog formatting apparently
SimenB pushed a commit that referenced this pull request Mar 9, 2023
* swap regex order to avoid lookahead

* changelog

* changelog formatting apparently
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants