Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OTHER] Improve internal safety comments and architecture #100

Closed
Manishearth opened this issue Jul 10, 2023 · 9 comments · Fixed by #135 or #142
Closed

[OTHER] Improve internal safety comments and architecture #100

Manishearth opened this issue Jul 10, 2023 · 9 comments · Fixed by #135 or #142
Assignees
Labels
A-sec Issues with potential security implications. enhancement New feature or request high priority High priority
Milestone

Comments

@Manishearth
Copy link
Contributor

I was reviewing the lexical-write-integer crate and I discovered a bunch of things that could be improved wrt safety.

The main issue was the one already filed as #95. That's actual UB, this issue is more about the safety comments and how the invariants are upheld. The safe mode doesn't disable this code that has actual UB, either.

Besides the issue that already exists, there are a whole bunch of functions (e.g. write_digits()) that unsafely index from table are documented with "This is safe as long as the buffer is large enough to hold T::MAX digits in radix N.",but they do not include the safety requirement on table`.

Furthermore, some functions (also write_digits()) take a mut index: usize argument that is used to directly index. These safety requirements are also not mentioned.

Besides safety comments, I found a couple areas where it was hard to review the unsafe code:

A bunch of code relies on the radix being from amongst a valid set (e.g. get_table()). May benefit from being an enum; it's hard to tell if this invariant is being upheld.

Other code which is calling the write_*() functions (e.g. unsigned()) declare they are safe as long as the buffer can hold FORMATTED_SIZE elements, but that invariant is not really clear from the functions used, it's kinda hidden away.

Might be cool to have a ValidBuffer<FORMAT> wrapper that allows calling these write functions in an encapsulated way.

Anyway, hope this helps. I don't have time to devote to doing this work myself but I figured I'd share my notes in case someone else does.

@Alexhuszagh
Copy link
Owner

I'll be looking comprehensively over this later today.

@Alexhuszagh Alexhuszagh added enhancement New feature or request high priority High priority A-sec Issues with potential security implications. labels Sep 9, 2024
@Alexhuszagh Alexhuszagh added this to the 1.0 milestone Sep 9, 2024
Alexhuszagh added a commit that referenced this issue Sep 9, 2024
Closes #104.

Part of fixes to addrss #100.
Alexhuszagh added a commit that referenced this issue Sep 9, 2024
Closes #104

Part of fixes to address #100
Alexhuszagh added a commit that referenced this issue Sep 9, 2024
Closes #104

Part of fixes to address #100
@Alexhuszagh
Copy link
Owner

I've made some major progress on this on the safety branch, benchmarking every significant change I've made to avoid any regressions while slowly removing unsafety. I had high-level "unchecked" functions which I removed to allow more localized safety invariants.

Alexhuszagh added a commit that referenced this issue Sep 11, 2024
Initial Littany of Safety Enhancements.

This removes a lot of unsafe code, documents the cases where removing it would have significant performance impacts but the safety invariants can be easily guaranteed, and likewise makes other enhancements to remove potentially unsafe behavior. This also redoes some architecture to make more code wrapped into safe variants, where rather than say if x.get(0) == b'0'. then do an unchecked index, instead it just has a peek and step in a single function, where applicable.. This also simplifies the code base a lot.

Part of many commits to address #100.
@Alexhuszagh
Copy link
Owner

I'll be also adding in documentation at the top of each module that uses unsafe code what the general purpose guarantees are and then better # SAFETY invariant documentation internally. There's a few cases where I wasn't able to remove unsafety (this is performance related, often with a 20%+ hit in performance) despite relative easy of proving that the code was safe so I'm also commenting on why the unsafety remains.

@Alexhuszagh
Copy link
Owner

Alexhuszagh commented Sep 14, 2024

This has now been closed: ~90% of all unsafety has been removed and all that remains is trivial to verify with major documentation on the vulnerable parts. The crate documentation also now clearly describes the possible sources of unsoundness, why the code is sound, and any other requirements.

Once #135 I'll remove this and publish this to crates.io.

Finally, a security policy has been created.

Alexhuszagh added a commit that referenced this issue Sep 14, 2024
This introduces numerous different layers of security enhancements:

1. Removal of most unsafe code (according to count-unsafe, the code went from 160 unsafe functions and 3088 unsafe exprs to 8 unsafe functions and 1248 unsafe exprs, However, all the remaining unsafe code has much clearly documented safety guarantees and is isolated into safe abstractions.
2. Clear documentation of the locations where unsafe code is used and at the crate-level documentation so it's clearly visible.

A security policy has also been added, with stricter requirements for soundness with PRs.

Closes #100.
@Manishearth
Copy link
Contributor Author

@Alexhuszagh looking briefly through the remaining unsafe, I still feel like a lot of it can still be reduced. E.g. there are safe parser patterns that avoid peek/unchecked semantics. There ought to be ways to encapsulate the unsafety better.

1000+ unsafe blocks, even if documented, is still a lot to review, so I don't suspect anyone will perform a new audit any time soon for this.

(There are basically two ways you can have 1000 unsafe blocks: either they're all very different, in which case that's a huge task to review, or there are a lot of similarities, in which case there can probably be some safe abstractions built)

@Alexhuszagh
Copy link
Owner

Alexhuszagh commented Sep 15, 2024

(There are basically two ways you can have 1000 unsafe blocks: either they're all very different, in which case that's a huge task to review, or there are a lot of similarities, in which case there can probably be some safe abstractions built)

I tried this a few ways, but one of them was using read_if and encapsulating everything there, which unfortunately didn't have close to the performance I was looking for.

    /// Peek the next value and consume it if the read value matches the
    /// expected one.
    #[inline(always)]
    fn read_if<Pred: FnOnce(&u8) -> bool>(&mut self, pred: Pred) -> Option<Self::Item> {
        if let Some(peeked) = self.peek() {
            if pred(peeked) {
                // SAFETY: the slice cannot be empty because we peeked a value.
                unsafe { self.step_unchecked() };
                Some(peeked)
            } else {
                None
            }
        } else {
            None
        }
    }

However, a lot of the remaining unsafety would be in the implementation of the stack vector and the iterator API:

pub unsafe trait BytesIter<'a>: Iterator<Item = &'a u8> + Buffer<'a> {

https://github.com/Alexhuszagh/rust-lexical/blob/aeab32205c18c4569402f83817eeb9fb983ee64b/lexical-parse-float/src/bigint.rs

Both of which are almost entirely encapsulated. We do have a fair bit that could be removed from the stack vector in Bigint since a lot of methods for that code could be removed.

I'll see if there are locations I can still fit read_if in where it doesn't have noticeable on performance.

EDIT: Ok this really only affects match-cases where I do a read_if for 2 cases to handle the steps. Which makes sense. Benchmarking the rest now.

@Alexhuszagh Alexhuszagh reopened this Sep 15, 2024
Alexhuszagh added a commit that referenced this issue Sep 15, 2024
These used to have safety considerations although none do anymore, but
they require validation or there could be runtime panics.

- Addresses #100
- Addresses #138
Alexhuszagh added a commit that referenced this issue Sep 15, 2024
These used to have safety considerations although none do anymore, but
they require validation or there could be runtime panics.

- Addresses #100
- Addresses #138
Alexhuszagh added a commit that referenced this issue Sep 15, 2024
These used to have safety considerations although none do anymore, but
they require validation or there could be runtime panics.

- Addresses #100
- Addresses #138
@Alexhuszagh
Copy link
Owner

Alexhuszagh commented Sep 15, 2024

A large percentage of it remaining was the Options API, which originally had only 2 functions in it that were actually unsafe (set_nan_string, set_inf_string), but the rest were reserved as unsafe since any use without validating our format specifications with is_valid was breaking our guarantees. However, none of those are unsafe and after removing them, we're at 315 total unsafe expressions in lexical-core, of which the majority are in 1 or 2 encapsulated classes (the 417 includes the test suite, which is used to test the validity of any unsafe logic for soundness, and therefore is a misleading number).

I've created #138 to track this.

@Manishearth
Copy link
Contributor Author

Thanks. Yeah I find it incredibly unlikely that this amount of unsafe is necessary for performance, a lot of these encapsulations can be zero cost. 315 is still a lot but more manageable.

@Alexhuszagh Alexhuszagh modified the milestones: 1.0, 1.0.1 Sep 15, 2024
@Alexhuszagh
Copy link
Owner

We're down to less than 250 unsafe expressions and almost all of it is within 2 abstractions: Iter and Bigint. Since these abstractions are fully encapsulated except for a few invocations where the guarantees are clearly written, there is practically no non-encapsulated unsafety left, so I feel comfortable reclosing this.

We have <60 expressions of unsafe code outside of those 2 abstractions. A lot of the remaining unsafety is code like this:

    // Compute xi and zi.
    // SAFETY: safe, since value must be finite and therefore in the correct range.
    // `-324 <= exponent <= 308`, so `x * log10(2) - log10(4 / 3)` must be in
    // `-98 <= x <= 93`, so the final value must be in [-93, 98] (for f64). We have
    // precomputed powers for [-292, 326] for f64 (same logic applies for f32) so
    // this is **ALWAYS** safe.
    let pow5 = unsafe { F::dragonbox_power(-minus_k) };

Overall, this introduces 8 unsafe expressions, which out of almost the entire remainder is in the integer writer, which is a well-known algorithm and has a major performance hit if it uses checked indexing.

I think this is acceptable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sec Issues with potential security implications. enhancement New feature or request high priority High priority
Projects
None yet
2 participants