Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define non-panicking UTF encoding methods on char #52580

Closed
wants to merge 5 commits into from

Conversation

strake
Copy link
Contributor

@strake strake commented Jul 20, 2018

Resolves #52579.

@rust-highfive
Copy link
Collaborator

r? @sfackler

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 20, 2018
@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
    100% |████████████████████████████████| 61kB 7.7MB/s 
Collecting botocore==1.10.62 (from awscli)
/usr/local/lib/python2.7/dist-packages/pip/_vendor/requests/packages/urllib3/util/ssl_.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.io/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
  Downloading https://files.pythonhosted.org/packages/24/ec/95df2edaa21e426581f41745e0de355170b8cb6eed2a2a5641c0c348df33/botocore-1.10.62-py2.py3-none-any.whl (4.4MB)
    0% |                                | 10kB 39.4MB/s eta 0:00:01
    0% |▏                               | 20kB 16.8MB/s eta 0:00:01
    0% |▎                               | 30kB 22.0MB/s eta 0:00:01
    0% |▎                               | 40kB 15.0MB/s eta 0:00:01
---
[01:07:54]    Doc-tests core
[01:08:00] 
[01:08:00] running 2100 tests
[01:08:08] ....................................................................................................
[01:08:16] .........................F...FFF....................................................................
[01:08:36] .....................................i..............................................................
[01:08:43] ....................................................................................................
[01:08:51] ....................................................................................................
[01:08:59] ....................................................................................................
---
[01:10:52] ............................i...............................................i.......................
[01:10:52] 
[01:10:52] failures:
[01:10:52] 
[01:10:52] ---- char/methods.rs - char::methods::char::try_encode_utf16 (line 567) stdout ----
[01:10:52] error[E0599]: no method named `unwrap` found for type `&mut [u16]` in the current scope
[01:10:52]  --> char/methods.rs:570:39
[01:10:52]   |
[01:10:52] 6 | let result = '𝕊'.encode_utf16(&mut b).unwrap();
[01:10:52] 
[01:10:52] thread 'char/methods.rs - char::methods::char::try_encode_utf16 (line 567)' panicked at 'couldn't compile the test', librustdoc/test.rs:327:17
[01:10:52] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[01:10:52] 
[01:10:52] 
[01:10:52] ---- char/methods.rs - char::methods::char::try_encode_utf8 (line 478) stdout ----
[01:10:52] error[E0599]: no method named `unwrap` found for type `&mut str` in the current scope
[01:10:52]  --> char/methods.rs:481:38
[01:10:52]   |
[01:10:52] 6 | let result = 'ß'.encode_utf8(&mut b).unwrap();
[01:10:52] 
[01:10:52] thread 'char/methods.rs - char::methods::char::try_encode_utf8 (line 478)' panicked at 'couldn't compile the test', librustdoc/test.rs:327:17
[01:10:52] 
[01:10:52] ---- char/methods.rs - char::methods::char::try_encode_utf16 (line 579) stdout ----
[01:10:52] ---- char/methods.rs - char::methods::char::try_encode_utf16 (line 579) stdout ----
[01:10:52] error[E0308]: mismatched types
[01:10:52]  --> char/methods.rs:582:1
[01:10:52]   |
[01:10:52] 6 | assert_eq!(None, '𝕊'.encode_utf16(&mut b));
[01:10:52]   | |
[01:10:52]   | |
[01:10:52]   | expected enum `std::option::Option`, found &mut [u16]
[01:10:52]   | help: try using a variant of the expected type: `Some(*right_val)`
[01:10:52]   = note: expected type `std::option::Option<_>`
[01:10:52]              found type `&mut [u16]`
[01:10:52]   = note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
[01:10:52] 
[01:10:52] 
[01:10:52] thread 'char/methods.rs - char::methods::char::try_encode_utf16 (line 579)' panicked at 'couldn't compile the test', librustdoc/test.rs:327:17
[01:10:52] 
[01:10:52] ---- char/methods.rs - char::methods::char::try_encode_utf8 (line 490) stdout ----
[01:10:52] error[E0308]: mismatched types
[01:10:52]  --> char/methods.rs:493:1
[01:10:52]   |
[01:10:52] 6 | assert_eq!(None, 'ß'.encode_utf8(&mut b));
[01:10:52]   | |
[01:10:52]   | |
[01:10:52]   | expected enum `std::option::Option`, found &mut str
[01:10:52]   | help: try using a variant of the expected type: `Some(*right_val)`
[01:10:52]   = note: expected type `std::option::Option<_>`
[01:10:52]              found type `&mut str`
[01:10:52]   = note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
[01:10:52] 
---
[01:10:52] 
[01:10:52] error: test failed, to rerun pass '--doc'
[01:10:52] 
[01:10:52] 
[01:10:52] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "core" "--" "--quiet"
[01:10:52] 
[01:10:52] 
[01:10:52] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:10:52] Build completed unsuccessfully in 0:23:06
[01:10:52] Build completed unsuccessfully in 0:23:06
[01:10:52] make: *** [check] Error 1
[01:10:52] Makefile:58: recipe for target 'check' failed

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:017def04
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@oli-obk
Copy link
Contributor

oli-obk commented Jul 21, 2018

Since one needs to check the return value for None anyway, what's the motivation for this change?

You can write

if buf.len() >= c.utf8_len() {
    let s = c.encode_utf8(buf);
    // work with str
} else {
    // alternate action
}

instead of

if let Some(s) = c.try_encode_utf8(buf) {
    // work with str
} else {
    // alternate action, but cannot access `buf` here until nll are a thing
}

Do you have a real world usage example that can be used as motivation for this change?

Your PR message should also reference the corresponding issue that this PR is closing

@sfackler
Copy link
Member

I'm also interested in the motivation for these methods.

@strake
Copy link
Contributor Author

strake commented Jul 21, 2018

In general, i think the try_encode API (as defined in this PR) is more difficult to mess up and panic with; i needn't remember to check any lengths. It seems analogous to [_]::index vs [_]::get.

My use case is passing an encoding function to another function which fills a buffer and writes it out when full:

extern crate io;

use core::mem;
pub use self::io::*;

fn fst2<S, T>((x, _): (S, T)) -> S { x }

pub fn writeCode<Codon: Copy, F: FnMut(T, &mut [Codon]) -> Option<usize>, I: Iterator<Item = T>, T, W: Write<Codon>>
                (mut encode: F, w: &mut W, xs : I) -> Result<usize, W::Err> where W::Err: From<EndOfFile> {
    let mut buf: [Codon; 4096] = unsafe { mem::uninitialized() };
    let mut pos = 0;
    let mut nBytesWritten = 0;
    for x in xs {
        match encode(x, &mut buf[pos..]) {
            Some(n) => { pos += n; },
            None => {
                try!(w.write_all(&buf[0..pos]).map_err(fst2));
                nBytesWritten += pos;
                pos = 0;
            }
        }
    }
    try!(w.write_all(&buf[0..pos]).map_err(fst2));
    nBytesWritten += pos;
    Ok(nBytesWritten)
}

It is much more convenient to take a function encode which returns Option (or Result) rather than separate functions to encode and compute length.

(It actually wants Option<usize> but it's much easier to map(len) rather than check the length of the buffer.)

#[unstable(feature = "try_unicode_encode_char", issue = "52579")]
#[inline]
pub fn try_encode_utf16(self, dst: &mut [u16]) -> Option<&mut [u16]> {
if dst.len() < self.len_utf16() { None } else { Some(self.encode_utf16(dst)) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this is essentially doing the checks twice in the happy path, maybe move the unsafe code into this function and have the panicky function match on the result and panic on None?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

#[unstable(feature = "try_unicode_encode_char", issue = "52579")]
#[inline]
pub fn try_encode_utf8(self, dst: &mut [u8]) -> Option<&mut str> {
if dst.len() < self.len_utf8() { None } else { Some(self.encode_utf8(dst)) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for this function

@oli-obk oli-obk added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Jul 21, 2018
@strake
Copy link
Contributor Author

strake commented Jul 22, 2018

Bikeshed: ought this to return Result<_, UtfError> rather than Option (for some hypothetical type UtfError)? It might simplify the panicking methods, but i'm not sure whether it's more idiomatic.

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
[00:57:15] 
[00:57:15]    Doc-tests core
[00:57:19] 
[00:57:19] running 2100 tests
[00:57:27] ....................................................................F.F.............................
[00:57:34] .........................F..FF.F....................................................................
[00:57:51] .....................................i..............................................................
[00:57:57] ....................................................................................................
[00:58:04] ....................................................................................................
[00:58:11] ....................................................................................................
---
[00:59:50] 
[00:59:50] failures:
[00:59:50] 
[00:59:50] ---- char/methods.rs - char::methods::char::encode_utf16 (line 515) stdout ----
[00:59:50] error[E0658]: use of unstable library feature 'try_unicode_encode_char' (see issue #52579)
[00:59:50]  --> char/methods.rs:518:18
[00:59:50]   |
[00:59:50] 6 | let result = '𝕊'.try_encode_utf16(&mut b).unwrap();
[00:59:50]   |
[00:59:50]   |
[00:59:50]   = help: add #![feature(try_unicode_encode_char)] to the crate attributes to enable
[00:59:50] thread 'char/methods.rs - char::methods::char::encode_utf16 (line 515)' panicked at 'couldn't compile the test', librustdoc/test.rs:327:17
[00:59:50] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[00:59:50] 
[00:59:50] ---- char/methods.rs - char::methods::char::encode_utf16 (line 525) stdout ----
[00:59:50] ---- char/methods.rs - char::methods::char::encode_utf16 (line 525) stdout ----
[00:59:50] error[E0658]: use of unstable library feature 'try_unicode_encode_char' (see issue #52579)
[00:59:50]  --> char/methods.rs:528:22
[00:59:50]   |
[00:59:50] 6 | assert_eq!(None, '𝕊'.try_encode_utf16(&mut b));
[00:59:50]   |
[00:59:50]   |
[00:59:50]   = help: add #![feature(try_unicode_encode_char)] to the crate attributes to enable
[00:59:50] thread 'char/methods.rs - char::methods::char::encode_utf16 (line 525)' panicked at 'couldn't compile the test', librustdoc/test.rs:327:17
[00:59:50] 
[00:59:50] ---- char/methods.rs - char::methods::char::try_encode_utf16 (line 549) stdout ----
[00:59:50] ---- char/methods.rs - char::methods::char::try_encode_utf16 (line 549) stdout ----
[00:59:50] error[E0599]: no method named `unwrap` found for type `&mut [u16]` in the current scope
[00:59:50]  --> char/methods.rs:552:39
[00:59:50]   |
[00:59:50] 6 | let result = '𝕊'.encode_utf16(&mut b).unwrap();
[00:59:50] 
[00:59:50] thread 'char/methods.rs - char::methods::char::try_encode_utf16 (line 549)' panicked at 'couldn't compile the test', librustdoc/test.rs:327:17
[00:59:50] 
[00:59:50] ---- char/methods.rs - char::methods::char::try_encode_utf8 (line 455) stdout ----
[00:59:50] ---- char/methods.rs - char::methods::char::try_encode_utf8 (line 455) stdout ----
[00:59:50] error[E0658]: use of unstable library feature 'try_unicode_encode_char' (see issue #52579)
[00:59:50]  --> char/methods.rs:458:18
[00:59:50]   |
[00:59:50] 6 | let result = 'ß'.try_encode_utf8(&mut b).unwrap();
[00:59:50]   |
[00:59:50]   |
[00:59:50]   = help: add #![feature(try_unicode_encode_char)] to the crate attributes to enable
[00:59:50] thread 'char/methods.rs - char::methods::char::try_encode_utf8 (line 455)' panicked at 'couldn't compile the test', librustdoc/test.rs:327:17
[00:59:50] 
[00:59:50] ---- char/methods.rs - char::methods::char::try_encode_utf8 (line 467) stdout ----
[00:59:50] ---- char/methods.rs - char::methods::char::try_encode_utf8 (line 467) stdout ----
[00:59:50] error[E0658]: use of unstable library feature 'try_unicode_encode_char' (see issue #52579)
[00:59:50]  --> char/methods.rs:470:22
[00:59:50]   |
[00:59:50] 6 | assert_eq!(None, 'ß'.try_encode_utf8(&mut b));
[00:59:50]   |
[00:59:50]   |
[00:59:50]   = help: add #![feature(try_unicode_encode_char)] to the crate attributes to enable
[00:59:50] thread 'char/methods.rs -6_64-unknown-linux-gnu/stage0-bootstrap-tools/x86_64-unknown-linux-gnu/release
122320 ./obj/build/x86_64-unknown-linux-gnu/stage1-rustc/x86_64-unknown-linux-gnu/release/deps
121692 ./obj/build/x86_64-unknown-linux-gnu/stage0-bootstrap-tools/x86_64-unknown-linux-gnu/release/deps
119000 ./obj/build/x86_64-unknown-linux-gnu/stage1-std

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
[01:04:12] 
[01:04:12]    Doc-tests core
[01:04:17] 
[01:04:17] running 2100 tests
[01:04:24] .....................................................................FF.............................
[01:04:32] ...........................F.FFF....................................................................
[01:04:51] .....................................i..............................................................
[01:04:57] ....................................................................................................
[01:05:04] ....................................................................................................
[01:05:11] ....................................................................................................
---
[01:06:57] 
[01:06:57] failures:
[01:06:57] 
[01:06:57] ---- char/methods.rs - char::methods::char::encode_utf16 (line 515) stdout ----
[01:06:57] error[E0658]: use of unstable library feature 'try_unicode_encode_char' (see issue #52579)
[01:06:57]  --> char/methods.rs:518:18
[01:06:57]   |
[01:06:57] 6 | let result = '𝕊'.try_encode_utf16(&mut b).unwrap();
[01:06:57]   |
[01:06:57]   |
[01:06:57]   = help: add #![feature(try_unicode_encode_char)] to the crate attributes to enable
[01:06:57] thread 'char/methods.rs - char::methods::char::encode_utf16 (line 515)' panicked at 'couldn't compile the test', librustdoc/test.rs:327:17
[01:06:57] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[01:06:57] 
[01:06:57] ---- char/methods.rs - char::methods::char::encode_utf16 (line 525) stdout ----
[01:06:57] ---- char/methods.rs - char::methods::char::encode_utf16 (line 525) stdout ----
[01:06:57] error[E0658]: use of unstable library feature 'try_unicode_encode_char' (see issue #52579)
[01:06:57]  --> char/methods.rs:528:22
[01:06:57]   |
[01:06:57] 6 | assert_eq!(None, '𝕊'.try_encode_utf16(&mut b));
[01:06:57]   |
[01:06:57]   |
[01:06:57]   = help: add #![feature(try_unicode_encode_char)] to the crate attributes to enable
[01:06:57] thread 'char/methods.rs - char::methods::char::encode_utf16 (line 525)' panicked at 'couldn't compile the test', librustdoc/test.rs:327:17
[01:06:57] 
[01:06:57] ---- char/methods.rs - char::methods::char::try_encode_utf16 (line 549) stdout ----
[01:06:57] ---- char/methods.rs - char::methods::char::try_encode_utf16 (line 549) stdout ----
[01:06:57] error[E0599]: no method named `unwrap` found for type `&mut [u16]` in the current scope
[01:06:57]  --> char/methods.rs:552:39
[01:06:57]   |
[01:06:57] 6 | let result = '𝕊'.encode_utf16(&mut b).unwrap();
[01:06:57] 
[01:06:57] thread 'char/methods.rs - char::methods::char::try_encode_utf16 (line 549)' panicked at 'couldn't compile the test', librustdoc/test.rs:327:17
[01:06:57] 
[01:06:57] ---- char/methods.rs - char::methods::char::try_encode_utf8 (line 467) stdout ----
[01:06:57] ---- char/methods.rs - char::methods::char::try_encode_utf8 (line 467) stdout ----
[01:06:57] error[E0658]: use of unstable library feature 'try_unicode_encode_char' (see issue #52579)
[01:06:57]  --> char/methods.rs:470:22
[01:06:57]   |
[01:06:57] 6 | assert_eq!(None, 'ß'.try_encode_utf8(&mut b));
[01:06:57]   |
[01:06:57]   |
[01:06:57]   = help: add #![feature(try_unicode_encode_char)] to the crate attributes to enable
[01:06:57] thread 'char/methods.rs - char::methods::char::try_encode_utf8 (line 467)' panicked at 'couldn't compile the test', librustdoc/test.rs:327:17
[01:06:57] 
[01:06:57] ---- char/methods.rs - char::methods::char::try_encode_utf8 (line 455) stdout ----
[01:06:57] ---- char/methods.rs - char::methods::char::try_encode_utf8 (line 455) stdout ----
[01:06:57] error[E0658]: use of unstable library feature 'try_unicode_encode_char' (see issue #52579)
[01:06:57]  --> char/methods.rs:458:18
[01:06:57]   |
[01:06:57] 6 | let result = 'ß'.try_encode_utf8(&mut b).unwrap();
[01:06:57]   |
[01:06:57]   |
[01:06:57]   = help: add #![feature(try_unicode_encode_char)] to the crate attributes to enable
[01:06:57] thread 'char/methods.rs -1 passed; 6 failed; 3 ignored; 0 measured; 0 filtered out
[01:06:57] 
[01:06:57] error: test failed, to rerun pass '--doc'
[01:06:57] 
[01:06:57] 
[01:06:57] 
[01:06:57] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "core" "--" "--quiet"
[01:06:57] 
[01:06:57] 
[01:06:57] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:06:57] Build completed unsuccessfully in 0:22:02
[01:06:57] Build completed unsuccessfully in 0:22:02
[01:06:57] Makefile:58: recipe for target 'check' failed
[01:06:57] make: *** [check] Error 1

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:03604264
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
[01:08:22] 
[01:08:22]    Doc-tests core
[01:08:27] 
[01:08:27] running 2102 tests
[01:08:34] .......................................................................F..F.........................
[01:08:42] ...........................F..F.FF..................................................................
[01:09:02] .......................................i............................................................
[01:09:09] ....................................................................................................
[01:09:16] ....................................................................................................
[01:09:23] ....................................................................................................
---
[01:11:10]   = note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
[01:11:10] 
[01:11:10] thread 'char/methods.rs - char::methods::char::encode_utf16 (line 527)' panicked at 'couldn't compile the test', librustdoc/test.rs:332:13
[01:11:10] 
[01:11:10] ---- char/methods.rs - char::methods::char::try_encode_utf16 (line 551) stdout ----
[01:11:10] error[E0658]: use of unstable library feature 'try_unicode_encode_char' (see issue #52579)
[01:11:10]  --> char/methods.rs:554:18
[01:11:10]   |
[01:11:10] 6 | let result = '𝕊'.try_encode_utf16(&mut b).unwrap();
[01:11:10]   |
[01:11:10]   |
[01:11:10]   = help: add #![feature(try_unicode_encode_char)] to the crate attributes to enable
[01:11:10] thread 'char/methods.rs - char::methods::char::try_encode_utf16 (line 551)' panicked at 'couldn't compile the test', librustdoc/test.rs:332:13
[01:11:10] 
[01:11:10] ---- char/methods.rs - char::methods::char::try_encode_utf16 (line 561) stdout ----
[01:11:10] ---- char/methods.rs - char::methods::char::try_encode_utf16 (line 561) stdout ----
[01:11:10] error[E0658]: use of unstable library feature 'try_unicode_encode_char' (see issue #52579)
[01:11:10]  --> char/methods.rs:564:22
[01:11:10]   |
[01:11:10] 6 | assert_eq!(None, '𝕊'.try_encode_utf16(&mut b));
[01:11:10]   |
[01:11:10]   |
[01:11:10]   = help: add #![feature(try_unicode_encode_char)] to the crate attributes to enable
[01:11:10] 
[01:11:10] thread 'char/methods.rs - char::methods::char::try_encode_utf16 (linnknown-linux-gnu/stage0-bootstrap-tools/x86_64-unknown-linux-gnu/release
121804 ./obj/build/x86_64-unknown-linux-gnu/stage0-bootstrap-tools/x86_64-unknown-linux-gnu/release/deps
118740 ./obj/build/x86_64-unknown-linux-gnu/stage1-std
113396 ./obj/build/x86_64-unknown-linux-gnu/test/mir-opt
107624 ./src/llvm/test/CodeGen

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@TimNN
Copy link
Contributor

TimNN commented Aug 7, 2018

Ping from triage, @sfackler. This PR requires your review.

Also, @strake it looks like your PR needs to update some doc tests.

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-5.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
[00:45:25] ....................................................................................................
[00:45:28] ....................................................................................................
[00:45:30] ....................................................................................................
[00:45:33] ....................................................................................................
[00:45:36] .............iiiiiiiii..............................................................................
[00:45:42] ....................................................................................................
[00:45:45] ...................i................................................................................
[00:45:48] ............................i.......................................................................
[00:45:51] ....................................................................................................
---
[01:01:36] 
[01:01:36]    Doc-tests core
[01:01:41] 
[01:01:41] running 2124 tests
[01:01:48] .......................................................................F..F.........................
[01:02:04] ....................................................................................................
[01:02:14] .......................................i............................................................
[01:02:21] ....................................................................................................
[01:02:28] ....................................................................................................

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@SimonSapin
Copy link
Contributor

I agree with @oli-obk that the motivation doesn’t feel enough to add new methods. It seems easy enough to check the buffer length against c.utf8_len(). And if you need to interoperate with some existing Option- or Result-based API, it seems easy enough to write a three-lines wrapper that does the utf8_len check.

You can even omit the length check entirely and use a 4-bytes buffer:

something.write_str(c.encode_utf8(&mut [0_u8; 4]))

So unless new arguments come up I’m inclined to not accept this:

@rfcbot fcp close

@rfcbot
Copy link

rfcbot commented Aug 8, 2018

Team member @SimonSapin has proposed to close this. The next step is review by the rest of the tagged teams:

No concerns currently listed.

Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-close This PR / issue is in PFCP or FCP with a disposition to close it. labels Aug 8, 2018
@strake
Copy link
Contributor Author

strake commented Aug 9, 2018

@SimonSapin Unfortunately merely checking the length emits all the panicking code, which i'd prefer to avoid.

Edit: but maybe this is an optimizer issue? is it reasonable to expect the optimizer to recognize this and elide the panicking case?

@SimonSapin
Copy link
Contributor

I’d certainly be open to tweaking the internals of encode_utf8 (without changing the API surface) so that some patterns optimize better. I’m not sure how to make it happen in this case however.

@strake
Copy link
Contributor Author

strake commented Aug 9, 2018

It's not a huge deal — i can rather define an extension trait (for example) — but extension traits are cumbersome, and it seems unfortunate to reinvent this functionality.

@rfcbot rfcbot added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label Aug 13, 2018
@rfcbot
Copy link

rfcbot commented Aug 13, 2018

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Aug 13, 2018
@rfcbot
Copy link

rfcbot commented Aug 23, 2018

The final comment period, with a disposition to close, as per the review above, is now complete.

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Aug 23, 2018
Mark-Simulacrum added a commit to Mark-Simulacrum/rust that referenced this pull request Dec 24, 2019
…oli-obk

Clean up unsafety in char::encode_utf8

This originally started as an attempt to allow LLVM to optimize through
encode_utf8 to detect the try_encode_utf8 case (rust-lang#52579, rust-lang#52580), but due to a
typo my conclusion that my optimizations were successful was incorrect.

Furthermore, as far as I can tell, this optimization is probably just not
possible with LLVM today.  This [code](https://rust.godbolt.org/z/JggRj4)
compiles down to a long series of compares, notably, two identical series of
compares. That essentially means that LLVM is today unable to see that these two
ifs are identical and as such can be merged and then realize that no value of
the if condition can result in a call to `please_delete`. As such, for now, we
do not attempt to specifically optimize for that case.
Centril added a commit to Centril/rust that referenced this pull request Dec 24, 2019
…oli-obk

Clean up unsafety in char::encode_utf8

This originally started as an attempt to allow LLVM to optimize through
encode_utf8 to detect the try_encode_utf8 case (rust-lang#52579, rust-lang#52580), but due to a
typo my conclusion that my optimizations were successful was incorrect.

Furthermore, as far as I can tell, this optimization is probably just not
possible with LLVM today.  This [code](https://rust.godbolt.org/z/JggRj4)
compiles down to a long series of compares, notably, two identical series of
compares. That essentially means that LLVM is today unable to see that these two
ifs are identical and as such can be merged and then realize that no value of
the if condition can result in a call to `please_delete`. As such, for now, we
do not attempt to specifically optimize for that case.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-close This PR / issue is in PFCP or FCP with a disposition to close it. finished-final-comment-period The final comment period is finished for this PR / Issue. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants