From 2effc28495811b59af3bba7a6f83ca787817b763 Mon Sep 17 00:00:00 2001
From: Andrew Gallant
Date: Tue, 30 Jul 2024 14:52:43 -0400
Subject: [PATCH] ffi: pass non-empty slice when haystack is empty

To work around likely bugs in (older versions of) PCRE2.

Namely, at one point, PCRE2 would dereference the haystack pointer even
when the length was zero. This was reported in #10 and we worked around
this in #11 by passing a pointer to a const `&[]`, with the (erroneous)
presumption that this would be a valid pointer to dereference. In
retrospect though, this was a little silly, because you should never be
dereferencing a pointer to an empty slice. It's not valid.

Alas, at that time, Rust did actually hand you a valid pointer that
could be dereferenced. But [this PR][rust-pull] changed that. And thus,
we're back to where we started: handing buggy versions of PCRE2 a zero
length haystack with a dangling pointer.

So we fix this once and for all by passing a slice of length 1, but
with a haystack length of 0, to the PCRE2 search routine when searching
an empty haystack. This will guarantee the provision of a
dereferenceable pointer should PCRE2 decide to dereference it.

Fixes #42

[rust-pull]: https://github.com/rust-lang/rust/pull/123936
---
 src/ffi.rs | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/src/ffi.rs b/src/ffi.rs
index bef2ab8..aaabf74 100644
--- a/src/ffi.rs
+++ b/src/ffi.rs
@@ -433,20 +433,35 @@ impl MatchData {
         start: usize,
         options: u32,
     ) -> Result {
-        // When the subject is empty, we use an empty slice
-        // with a known valid pointer. Otherwise, slices derived
-        // from, e.g., an empty `Vec` may not have a valid
-        // pointer, since creating an empty `Vec` is guaranteed
-        // to not allocate.
-        const EMPTY: &[u8] = &[];
+        // When the subject is empty, we use a NON-empty slice with a known
+        // valid pointer. Otherwise, slices derived from, e.g., an empty
+        // `Vec` may not have a valid pointer, since creating an empty
+        // `Vec` is guaranteed to not allocate.
+        //
+        // We use a non-empty slice since it is otherwise difficult
+        // to guarantee getting a dereferenceable pointer. Which makes
+        // sense, because the slice is empty, the pointer should never be
+        // dereferenced!
+        //
+        // Alas, older versions of PCRE2 did exactly this. While that bug has
+        // been fixed a while ago, it still seems to pop up[1]. So we try
+        // harder.
+        //
+        // Note that even though we pass a non-empty slice in this case, we
+        // still pass a length of zero. This just provides a pointer that won't
+        // explode if you try to dereference it.
+        //
+        // [1]: https://github.com/BurntSushi/rust-pcre2/issues/42
+        static SINGLETON: &[u8] = &[0];
+        let len = subject.len();
         if subject.is_empty() {
-            subject = EMPTY;
+            subject = SINGLETON;
         }
 
         let rc = pcre2_match_8(
             code.as_ptr(),
-            subject.as_ptr(),
-            subject.len(),
+            subject.as_ptr(),
+            len,
             start,
             options,
             self.as_mut_ptr(),
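
The workaround can be tried outside this crate with a minimal sketch like the one below. It is illustrative only and not code from rust-pcre2 or PCRE2. `ptr_and_len` and `buggy_first_byte` are hypothetical names, and `buggy_first_byte` merely simulates a C routine that reads through the haystack pointer even when the length is zero. Only the `SINGLETON` name is borrowed from the patch.

```rust
/// One-byte backing store. Only its address matters; the byte itself is
/// never part of any haystack.
static SINGLETON: &[u8] = &[0];

/// Hypothetical helper: returns a (pointer, length) pair for a C API that
/// might read through the pointer even when the length is zero.
fn ptr_and_len(subject: &[u8]) -> (*const u8, usize) {
    // Record the real length first, so an empty haystack is still
    // reported as having length 0.
    let len = subject.len();
    // Swap in the one-byte slice only to obtain a dereferenceable pointer.
    let slice = if subject.is_empty() { SINGLETON } else { subject };
    (slice.as_ptr(), len)
}

/// Hypothetical stand-in for a buggy C routine that reads the first byte
/// unconditionally, ignoring the length.
///
/// Safety: `ptr` must point to at least one readable byte.
unsafe fn buggy_first_byte(ptr: *const u8, _len: usize) -> u8 {
    unsafe { std::ptr::read(ptr) }
}

fn main() {
    // An empty `Vec` does not allocate, so its own pointer is dangling.
    let empty: Vec<u8> = Vec::new();
    let (ptr, len) = ptr_and_len(&empty);
    assert_eq!(len, 0);

    // Reading through `ptr` is fine because it points at SINGLETON's byte.
    let byte = unsafe { buggy_first_byte(ptr, len) };
    println!("len = {len}, byte read = {byte}");
}
```

The one detail that matters is the ordering: the length is captured before the slice is swapped, which is what the `let len = subject.len();` line in the hunk above does.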