Performance degrades if stdlib function uses io::Error::new() with &'static str #82812

Frago9876543210 · 2021-03-05T19:55:28Z

I am facing a problem while reading different kinds of binary formats. If you are going to read primitives like i32 too often, then performance degrades. This is mainly due to the fact that Error::new(kind, &'static str) returns box. Also it does boxing TWICE!

//stdlib code
enum Repr {
    Os(i32),
    Simple(ErrorKind),
    Custom(Box<Custom>),
}

#[derive(Debug)]
struct Custom {
    kind: ErrorKind,
    error: Box<dyn error::Error + Send + Sync>,
}

doc link: https://doc.rust-lang.org/std/io/struct.Error.html#method.new

I wrote simple bench to show pref problem.

cargo new --vcs=none io-perf
cd io-perf
mkdir benches
cat /dev/null > benches/read.rs
# now you have to modify this file e.g. `subl benches/read.rs`

read.rs:

#![feature(test)]

extern crate test;

use std::hint::black_box;
use std::io::{ErrorKind, Read, Result};
use test::Bencher;

pub trait CustomRead {
    fn read_exact_fast(&mut self, buf: &mut [u8]) -> Result<()>;
}

impl CustomRead for &[u8] {
    /// Clone of `&[u8]::read_exact` without boxing `&'static str`
    #[inline]
    fn read_exact_fast(&mut self, buf: &mut [u8]) -> Result<()> {
        if buf.len() > self.len() {
            // [!] no message here:
            return Err(ErrorKind::UnexpectedEof.into());
        }
        let (a, b) = self.split_at(buf.len());

        if buf.len() == 1 {
            buf[0] = a[0];
        } else {
            buf.copy_from_slice(a);
        }

        *self = b;
        Ok(())
    }
}

impl<R: CustomRead + ?Sized> CustomRead for &mut R {
    #[inline]
    fn read_exact_fast(&mut self, buf: &mut [u8]) -> Result<()> {
        (**self).read_exact_fast(buf)
    }
}

const READ_SIZE: usize = 4; //e.g. reading int x1000 times
const LOOPS: usize = 1000;

#[bench]
fn bench_read_exact_std_err(b: &mut Bencher) {
    b.iter(|| {
        let mut buf = [0u8; READ_SIZE];
        let mut r: &[u8] = &[]; //always would be error

        for _ in 0..LOOPS {
            black_box(Read::read_exact(&mut &*r, &mut buf).is_err());
        }
    });
}

#[bench]
fn bench_read_exact_fast_err(b: &mut Bencher) {
    b.iter(|| {
        let mut buf = [0u8; READ_SIZE];
        let mut r: &[u8] = &[]; //always would be error

        for _ in 0..LOOPS {
            black_box(CustomRead::read_exact_fast(&mut &*r, &mut buf).is_err());
        }
    });
}

#[bench]
fn bench_read_exact_std_ok(b: &mut Bencher) {
    b.iter(|| {
        let mut buf = [0u8; READ_SIZE];
        let mut r: &[u8] = &[0u8; READ_SIZE]; //always would be ok

        for _ in 0..LOOPS {
            black_box(Read::read_exact(&mut &*r, &mut buf).is_ok());
        }
    });
}

#[bench]
fn bench_read_exact_fast_ok(b: &mut Bencher) {
    b.iter(|| {
        let mut buf = [0u8; READ_SIZE];
        let mut r: &[u8] = &[0u8; READ_SIZE]; //always would be ok

        for _ in 0..LOOPS {
            black_box(CustomRead::read_exact_fast(&mut &*r, &mut buf).is_ok());
        }
    });
}

Results

Use cargo bench to get it

running 4 tests
test bench_read_exact_fast_err ... bench:         255 ns/iter (+/- 239)
test bench_read_exact_fast_ok  ... bench:         256 ns/iter (+/- 9)
test bench_read_exact_std_err  ... bench:     164,789 ns/iter (+/- 11,572)
test bench_read_exact_std_ok   ... bench:         256 ns/iter (+/- 11)

We can clearly see that performance is degrading. The more calls it takes to read, the worse the results will be.

Affected crates and stdlib itself

The first very popular crate that comes to mind is byteorder.
https://github.com/BurntSushi/byteorder/blob/5d9d0386488d0763954f14a5cb02b1c0afccbdcb/src/io.rs#L84-L89

It mainly depends on the data format. But what if you need to receive data over the network? The attacker may abuse wrong data to make ton of pointless memory allocations.

stdlib is also use Error::new() often enough. Let's run a simple bash to make sure of this

cd ~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src
grep -r --include="*.rs" -E "ErrorKind::\w+," -A 2 -B 2

In many cases, you will see that it is used &'static str. stdlib sometimes create io::Error with format!() macro, however it doesn't happen very often.

rust/library/std/src/sys/unix/os.rs-            if path_len <= 1 {
rust/library/std/src/sys/unix/os.rs-                return Err(io::Error::new(
rust/library/std/src/sys/unix/os.rs:                    io::ErrorKind::Other,
rust/library/std/src/sys/unix/os.rs-                    "KERN_PROC_PATHNAME sysctl returned zero-length string",
rust/library/std/src/sys/unix/os.rs-                ));
--
rust/library/std/src/sys/unix/os.rs-        }
rust/library/std/src/sys/unix/os.rs-        Err(io::Error::new(
rust/library/std/src/sys/unix/os.rs:            io::ErrorKind::Other,
rust/library/std/src/sys/unix/os.rs-            "/proc/curproc/exe doesn't point to regular file.",
rust/library/std/src/sys/unix/os.rs-        ))

Possible solutions

Consider adding a new static method such as io::ErrorKind::with_message(_: &'static str) and new internal Repr member of that's enum.
How about using something like Cow, but with static lifetime?

Meta

rustc --version --verbose:

rustc 1.52.0-nightly (94736c434 2021-02-27)
binary: rustc
commit-hash: 94736c434ee154b30e2ec22ec112b79e3f6c5884
commit-date: 2021-02-27
host: x86_64-unknown-linux-gnu
release: 1.52.0-nightly
LLVM version: 11.0.1

Update 07.03.2021: Benchmarks have been improved as per the comment. This should show the problem better.

The text was updated successfully, but these errors were encountered:

SkiFire13 · 2021-03-07T15:55:26Z

fn bench_read_exact_std_ok(b: &mut Bencher) {
    b.iter(|| {
        let mut buf = [0u8; READ_SIZE];
        let mut r: &[u8] = &[0u8; READ_SIZE]; //always would be ok

        for _ in 0..LOOPS {
            black_box(Read::read_exact(&mut r, &mut buf).is_ok());
        }
    });
}

Note that the implementation of Read for &[u8] will mutate the slice reference to point to the remainder, so here the read_exact will succeed only once because after that r will be empty. This means that you're actually testing the error path in both "ok" and "err" benches. To actually measure the happy path you'll need for example to reborrow r (i.e. call Read::read_exact(&mut &*r, &mut buf). Applying this change to every bench function leads to the following results on my computer:

test bench_read_exact_fast_err ... bench:         237 ns/iter (+/- 5)
test bench_read_exact_fast_ok  ... bench:         234 ns/iter (+/- 2)
test bench_read_exact_std_err  ... bench:     126,251 ns/iter (+/- 2,466)
test bench_read_exact_std_ok   ... bench:         233 ns/iter (+/- 5)

The "std_err" bench is still pretty slow though

Frago9876543210 · 2021-03-07T16:06:09Z

@SkiFire13 Yes, I probably made a mistake while copying the code for the benchmark of bench_read_exact_fast_ok. It should be .is_ok(). However, this shows that io::Error::new() not zero-cost.

m-ou-se · 2021-03-27T10:50:30Z

This is fixed by #83353

m-ou-se · 2021-03-27T10:51:41Z

TWICE!

Turns out it was three times, if you also count the String that the &str is copied into.

m-ou-se · 2021-03-27T11:06:43Z

Running your benchmark (thanks!):

Before:

test bench_read_exact_fast_err ... bench:         256 ns/iter (+/- 6)
test bench_read_exact_fast_ok  ... bench:         257 ns/iter (+/- 7)
test bench_read_exact_std_err  ... bench:      34,733 ns/iter (+/- 2,409)
test bench_read_exact_std_ok   ... bench:         256 ns/iter (+/- 8)

After:

test bench_read_exact_fast_err ... bench:         256 ns/iter (+/- 17)
test bench_read_exact_fast_ok  ... bench:         256 ns/iter (+/- 7)
test bench_read_exact_std_err  ... bench:       1,508 ns/iter (+/- 39)
test bench_read_exact_std_ok   ... bench:         256 ns/iter (+/- 7)

Looks like it isn't completely solved yet. Adding #[inline] to io::Error::new_const now... :)

m-ou-se · 2021-03-27T11:13:24Z

That fixed it.

test bench_read_exact_fast_err ... bench:         258 ns/iter (+/- 10)
test bench_read_exact_fast_ok  ... bench:         256 ns/iter (+/- 9)
test bench_read_exact_std_err  ... bench:         257 ns/iter (+/- 6)
test bench_read_exact_std_ok   ... bench:         256 ns/iter (+/- 5)

…r=jackh726 Add #[inline] to io::Error methods Fixes rust-lang#82812

Frago9876543210 added the C-bug Category: This is a bug. label Mar 5, 2021

jyn514 mentioned this issue Mar 22, 2021

io::Error::new(_, string literal) allocates Box<Box<String>> and copies the string #83352

Open

m-ou-se mentioned this issue Mar 27, 2021

Add #[inline] to io::Error methods #83555

Merged

Frago9876543210 closed this as completed Mar 27, 2021

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Mar 27, 2021

Rollup merge of rust-lang#83555 - m-ou-se:inline-io-error-new-const, …

7d6af67

…r=jackh726 Add #[inline] to io::Error methods Fixes rust-lang#82812

Frago9876543210 mentioned this issue Aug 25, 2021

Use const generics to implement Io trait for [T; N] Frago9876543210/endiannezz#12

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance degrades if stdlib function uses io::Error::new() with &'static str #82812

Performance degrades if stdlib function uses io::Error::new() with &'static str #82812

Frago9876543210 commented Mar 5, 2021 •

edited

Loading

SkiFire13 commented Mar 7, 2021

Frago9876543210 commented Mar 7, 2021

m-ou-se commented Mar 27, 2021

m-ou-se commented Mar 27, 2021

m-ou-se commented Mar 27, 2021 •

edited

Loading

m-ou-se commented Mar 27, 2021 •

edited

Loading

Performance degrades if stdlib function uses io::Error::new() with &'static str #82812

Performance degrades if stdlib function uses io::Error::new() with &'static str #82812

Comments

Frago9876543210 commented Mar 5, 2021 • edited Loading

Results

Affected crates and stdlib itself

Possible solutions

Meta

SkiFire13 commented Mar 7, 2021

Frago9876543210 commented Mar 7, 2021

m-ou-se commented Mar 27, 2021

m-ou-se commented Mar 27, 2021

m-ou-se commented Mar 27, 2021 • edited Loading

m-ou-se commented Mar 27, 2021 • edited Loading

Frago9876543210 commented Mar 5, 2021 •

edited

Loading

m-ou-se commented Mar 27, 2021 •

edited

Loading

m-ou-se commented Mar 27, 2021 •

edited

Loading