searcher: use naive line counting on big-endian
This patches out bytecount's "fast" vectorized algorithm on big-endian machines, where it has been observed to fail. Going forward, bytecount should probably fix this on their end, but for now, we take a small performance hit on big-endian machines. Fixes #1144
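The approach described in the commit message can be sketched roughly as follows. This is a minimal, dependency-free illustration rather than the actual patch: the function names `naive_line_count`, `fast_line_count`, and `count_lines` are hypothetical, and `fast_line_count` is only a stand-in for bytecount's vectorized counter. The real change presumably chooses between bytecount's own fast and naive routines instead.

```rust
/// Count line terminators one byte at a time. This path has no
/// dependence on byte order, so it is safe on any target.
pub fn naive_line_count(bytes: &[u8], line_term: u8) -> u64 {
    bytes.iter().filter(|&&b| b == line_term).count() as u64
}

/// Hypothetical stand-in for a "fast" counter. A real implementation
/// would use SIMD or word-at-a-time tricks; this one merely walks the
/// input in fixed-size chunks so the sketch stays self-contained.
pub fn fast_line_count(bytes: &[u8], line_term: u8) -> u64 {
    let mut total = 0u64;
    for chunk in bytes.chunks(8) {
        total += chunk.iter().filter(|&&b| b == line_term).count() as u64;
    }
    total
}

/// On big-endian targets, always take the naive path.
#[cfg(target_endian = "big")]
pub fn count_lines(bytes: &[u8], line_term: u8) -> u64 {
    naive_line_count(bytes, line_term)
}

/// Everywhere else, keep using the fast path.
#[cfg(not(target_endian = "big"))]
pub fn count_lines(bytes: &[u8], line_term: u8) -> u64 {
    fast_line_count(bytes, line_term)
}
```

Selecting the implementation with `#[cfg(target_endian = "big")]` happens at compile time, so little-endian builds pay no cost for the workaround, and it can be dropped once the upstream fix is available.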
a4868b8
Running the ripgrep 0.10.0 test suite with grep-searcher 0.1.2 (to address this bug; it also required updating encoding-rs-io) causes two test failures:
Does this ring a bell? Should I report an issue, or cherry-pick a fix from ripgrep?
Hmm. Yeah, you'll probably want to cherry pick. The grep-searcher crate has a crlf related bug fix that I think required updating ripgrep's integration tests, and also appears to require an update in grep-regex as well. I'm beyond the point of being able to do a patch release of ripgrep though, so you might need to manually apply the big endian patch? Alternatively, it looks like a fix in bytecount is coming soon.
This was the commit that fixed the crlf related bugs: 9d70311
OK, maybe it will be easier to wait for a new version of ripgrep & bytecount.
bytecount 0.5.1 is out: https://crates.io/crates/bytecount
I confirmed that it fixes the issue, and master has been updated to use it. I think if you just bump bytecount on your end, then you should be good.
@BurntSushi do you have an ETA for new releases of the various ripgrep crates?
9d70311 will be hard to backport, as it touches several different crates.
Thanks
I could do a release of the various crates, but I don't have an ETA on the next proper release of ripgrep. Those are a lot of work.
Also, I don't think you need to backport 9d70311 unless you're specifically looking to patch the CRLF fix, which is really mostly only relevant on Windows. At this point, you should just be able to bump bytecount and the big endian bug should be fixed. What am I missing?