Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sparc issue when calling C++ functions which returns >16bytes structure #46679

Closed
psumbera opened this issue Dec 12, 2017 · 7 comments · Fixed by #47541
Closed

sparc issue when calling C++ functions which returns >16bytes structure #46679

psumbera opened this issue Dec 12, 2017 · 7 comments · Fixed by #47541
Labels
C-bug Category: This is a bug. I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. O-SPARC Target: SPARC processors

Comments

@psumbera
Copy link
Contributor

I have an issue with rust (now version 1.22.1) on Solaris sparc (64bit). On intel I don't see it. I originally see it as part of Firefox build (https://bugzilla.mozilla.org/show_bug.cgi?id=1413887)

I now believe it's rustc issue (though it could be still issue with clang-sys). Any help would be greatly appreciated.

mkdir -p test-clang/src
cat > test-clang/Cargo.toml <<EOF
[package]
name = "test-clang"
version = "0.1.0"
authors = ["Petr Sumbera <petr.sumbera@oracle.com>"]

[dependencies]
clang-sys = "0.21.1"
libc = "0.2.14"
EOF

cat > test-clang/src/main.rs <<EOF
extern crate libc;
use libc::{c_char};
extern crate clang_sys;
use clang_sys::*;

fn main() {
    unsafe {
      let index = clang_createIndex(0, 0);
      let tu = clang_createTranslationUnit(index, "tests/header.h".as_ptr() as *const c_char);
      println!("tu is: {:p}", tu);
      clang_getTranslationUnitCursor(tu);
     }
}
EOF

cd test-clang
LIBCLANG_PATH=/usr/lib/64 cargo build
LIBCLANG_PATH=/usr/lib/64 cargo run

Run output looks like:

LIBCLANG_PATH=/usr/lib/64 cargo run
    Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
     Running `target/debug/test-clang`
tu is: 0x0
Segmentation Fault (core dumped)

Stack:

test-clang:core> $C
ffffffff7fffe0d1 libclang.so.5.0`clang_getTranslationUnitCursor+0x44(ffffffff7fffeb20, 0, 0, 1, 1001f4800, ffffffff7c604438)
ffffffff7fffe1e1 test_clang::main::h39d0f3581df1cb39+0xfc(ffffffff7fffeac8, 100071bc8, 10, 1001f4800, 0, 0)
ffffffff7fffe351 std::sys_common::backtrace::__rust_begin_short_backtrace::h6a22156fb88cf7c0+4(100071f34, 0, bc00, ffffffff7c227948, ffffffff7c226000, 0)
ffffffff7fffe401 std::panicking::try::do_call::h10d3852066fc0682+8(ffffffff7fffee50, 0, 1, 0, 0, 0)
ffffffff7fffe4b1 __rust_maybe_catch_panic+0x14(1000b93ec, ffffffff7fffee88, ffffffff7fffeef0, ffffffff7fffeea8, 1001f4800, 0)
ffffffff7fffe591 std::rt::lang_start::h6e897821881fc078+0x458(ffffffff7fffeea8, 10020dec8, 1, 1, 1, 10022e440)
ffffffff7fffe701 main+0x48(8, ffffffff7ffff088, 1001f4800, 1, ffffffff7ffff088, 1)
ffffffff7fffe7c1 _start+0x64(0, 0, 0, 0, 0, 0)

With tracing:

$ apptrace -o log ./target/debug/test-clang
tu is: 0x0

apptrace: ./target/debug/test-clang: Segmentation Fault(Core dump)

tu is: 0x0
Segmentation Fault (core dumped)

$ grep clang_ log
-> test-clang -> libclang.so.5:clang_createIndex(0x0, 0x0, 0x2) ** NR
-> test-clang -> libclang.so.5:clang_createTranslationUnit(0x100230f40, 0x1000577b4, 0xe) ** NR
-> test-clang -> libclang.so.5:clang_getTranslationUnitCursor(0xffffffff7fffeea0, 0x0, 0x0) ** NR

Note that clang_getTranslationUnitCursor isn't called with 'tu' set to 0.

I have find out that the issue seems to be with the size of CXCursor structure which is returned:

CXCursor clang_getTranslationUnitCursor(CXTranslationUnit TU);

CXCursor is 32 bytes long. But if I tell clang-sys it's 16 bytes everything is perfect (when it's 17 bytes or more it causes the trouble):

diff --git a/src/lib.rs b/src/lib.rs
index b750b9d..05aaa88 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -1103,9 +1103,10 @@ default!(CXCompletionResult);
 #[derive(Copy, Clone, Debug)]
 #[repr(C)]
 pub struct CXCursor {
-    pub kind: CXCursorKind,
-    pub xdata: c_int,
-    pub data: [*const c_void; 3],
+    pub x1: [u8; 16],
+//    pub kind: CXCursorKind,
+//    pub xdata: c_int,
+//    pub data: [*const c_void; 3],
 }

Following is disassembly when CXCursor is pretended to be just 16 bytes:

_ZN10test_clang4main17heae3a929f5bb6356E+0x58:  stx       %o0, [%fp + 0x7b7]
_ZN10test_clang4main17heae3a929f5bb6356E+0x5c:  ldx       [%fp + 0x7bf], %o0
_ZN10test_clang4main17heae3a929f5bb6356E+0x60:  call      +0x18553c     <PLT=libclang.so.5.0`clang_createTranslationUnit>
_ZN10test_clang4main17heae3a929f5bb6356E+0x64:  ldx       [%fp + 0x7b7], %o1
_ZN10test_clang4main17heae3a929f5bb6356E+0x68:  ba        +0x8          <_ZN10test_clang4main17heae3a929f5bb6356E+0x70>
_ZN10test_clang4main17heae3a929f5bb6356E+0x6c:  stx       %o0, [%fp + 0x7d7]   // (*)
_ZN10test_clang4main17heae3a929f5bb6356E+0x70:  call      +0x18554c     <PLT=libclang.so.5.0`clang_getTranslationUnitCursor>
_ZN10test_clang4main17heae3a929f5bb6356E+0x74:  ldx       [%fp + 0x7d7], %o0   // corectly sets what was returned from clang_createTranslationUnit (*)
_ZN10test_clang4main17heae3a929f5bb6356E+0x78:  stx       %o1, [%fp + 0x7f7]
_ZN10test_clang4main17heae3a929f5bb6356E+0x7c:  stx       %o0, [%fp + 0x7ef]

And now with 17 bytes:

_ZN10test_clang4main17heae3a929f5bb6356E+0x58:  stx       %o0, [%fp + 0x7bf]
_ZN10test_clang4main17heae3a929f5bb6356E+0x5c:  ldx       [%fp + 0x7c7], %o0
_ZN10test_clang4main17heae3a929f5bb6356E+0x60:  call      +0x18553c     <PLT:clang_createTranslationUnit>
_ZN10test_clang4main17heae3a929f5bb6356E+0x64:  ldx       [%fp + 0x7bf], %o1
_ZN10test_clang4main17heae3a929f5bb6356E+0x68:  ba        +0x8          <_ZN10test_clang4main17heae3a929f5bb6356E+0x70>
_ZN10test_clang4main17heae3a929f5bb6356E+0x6c:  stx       %o0, [%fp + 0x7df]
_ZN10test_clang4main17heae3a929f5bb6356E+0x70:  ldx       [%fp + 0x7df], %o1
_ZN10test_clang4main17heae3a929f5bb6356E+0x74:  call      +0x185548     <PLT:clang_getTranslationUnitCursor>
_ZN10test_clang4main17heae3a929f5bb6356E+0x78:  add       %fp, 0x7e7, %o0
_ZN10test_clang4main17heae3a929f5bb6356E+0x7c:  ba        +0x8          <_ZN10test_clang4main17heae3a929f5bb6356E+0x84>

@kennytm kennytm added C-bug Category: This is a bug. I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. O-SPARC Target: SPARC processors labels Dec 12, 2017
@psumbera
Copy link
Contributor Author

I wrote more minimalistic version of test case which rules out use of clang-sys.

mkdir -p rust-test/src
cd rust-test
cat > Cargo.toml <<EOF
[package]
name = "rust-test"
version = "0.1.0"
authors = ["Petr Sumbera <petr.sumbera@oracle.com>"]
build = "build.rs"

[dependencies]
libc = "0.2.33"

[build-dependencies]
cc = "1.0"
EOF

cat > build.rs <<EOF
extern crate cc;

fn main() {
    cc::Build::new()
        .file("src/myfunc.c")
        .compile("libmyfunc.a");
}
EOF

cat > src/myfunc.c <<EOF
#define LEN 32

struct MyObj {
    char a[LEN];
};

struct MyObj MyFunc() {
  struct MyObj obj;
  for (int i=0; i<LEN; i++) obj.a[i]=i;
  return obj;
}
EOF

cat >> src/main.rs << EOF
extern crate libc;

#[derive(Copy, Clone, Debug)]
#[repr(C)]
pub struct MyObj {
  pub a: [u8; 32],
}

extern {
  fn MyFunc() -> MyObj;
}

fn main() {
  unsafe {
    let obj = MyFunc();
    for i in 0..16 {
      println!("obj.a[{}]={} ", i, obj.a[i]);
    }
  }
}
EOF
cargo run

On intel it works as expected:

    Updating registry `https://github.com/rust-lang/crates.io-index`
   Compiling cc v1.0.3
   Compiling libc v0.2.34
   Compiling rust-test v0.1.0 (file:///scratch/rust-test)
    Finished dev [unoptimized + debuginfo] target(s) in 8.77 secs
     Running `target/debug/rust-test`
obj.a[0]=0
obj.a[1]=1
obj.a[2]=2
obj.a[3]=3
obj.a[4]=4
obj.a[5]=5
obj.a[6]=6
obj.a[7]=7
obj.a[8]=8
obj.a[9]=9
obj.a[10]=10
obj.a[11]=11
obj.a[12]=12
obj.a[13]=13
obj.a[14]=14
obj.a[15]=15

But on sparc it looks weird:

    Updating registry `https://github.com/rust-lang/crates.io-index`
   Compiling libc v0.2.34
   Compiling cc v1.0.3
   Compiling rust-test v0.1.0 (file:///scratch/rust-ffi-examples/rust-test)
    Finished dev [unoptimized + debuginfo] target(s) in 8.64 secs
     Running `target/debug/rust-test`
obj.a[0]=255
obj.a[1]=255
obj.a[2]=255
obj.a[3]=255
obj.a[4]=255
obj.a[5]=255
obj.a[6]=255
obj.a[7]=255
obj.a[8]=255
obj.a[9]=255
obj.a[10]=255
obj.a[11]=255
obj.a[12]=255
obj.a[13]=255
obj.a[14]=255
obj.a[15]=255

But with this "hack" it looks better:


--- src/main.rs
+++ src/main.rs
@@ -3,7 +3,7 @@
 #[derive(Copy, Clone, Debug)]
 #[repr(C)]
 pub struct MyObj {
-  pub a: [u8; 32],
+  pub a: [u8; 16],
 }

 extern { 
Compiling rust-test v0.1.0 (file:///scratch/rust-ffi-examples/rust-test)
 Finished dev [unoptimized + debuginfo] target(s) in 1.26 secs
  Running `target/debug/rust-test`
obj.a[0]=0
obj.a[1]=1
obj.a[2]=2
obj.a[3]=3
obj.a[4]=4
obj.a[5]=5
obj.a[6]=6
obj.a[7]=7
obj.a[8]=8
obj.a[9]=9
obj.a[10]=10
obj.a[11]=11
obj.a[12]=12
obj.a[13]=13
obj.a[14]=14
obj.a[15]=15

@psumbera
Copy link
Contributor Author

Ok. I'm new to both Rust and LLVM. I tried to create for above example llvm-ir data on intel and used llc to build it on sparc. The same result could be seen. Does it mean that that issue is rather in LLVM?!

On intel I did:

rustc --crate-name rust_test src/main.rs --crate-type bin --emit=llvm-ir -C debuginfo=2 -C metadata=e19299b369c7fab4 -C extra-filename=-e19299b369c7fab4 --out-dir /scratch/rust-test/target/debug/deps -L dependency=/scratch/rust-test/target/debug/deps --extern libc=/scratch/rust-test/target/debug/deps/liblibc-4242d2235e1c985c.rlib -L native=/scratch/rust-test/target/debug/build/rust-test-1c2036192ff415b2/out -l static=myfunc
cp target/debug/deps/rust_test-e19299b369c7fab4.ll ~/rust-intel.ll

On Sparc I did:

cp ~/rust-intel.ll .

# Following allows to build llvm-ir from intel on sparc (e.g. hashes differs)
gsed -i 's/target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"/target datalayout = "E-m:e-i64:64-n32:64-S128"/' rust-intel.ll
gsed -i 's/target triple = "x86_64-pc-solaris"/target triple = "sparcv9-sun-solaris"/' rust-intel.ll
gsed -i 's/h25e88cdc4b742d37/868f8880dc2af3fb/' rust-intel.ll
gsed -i 's/h62ae80c5f7549c57/h695fc56fad203862/' rust-intel.ll 
gsed -i 's/hb1a77afe5cf34a1f/h12114874ef566064/' rust-intel.ll
gsed -i 's/h01dfd331a3ddfd23/hde26e71a5da81bd3/' rust-intel.ll
gsed -i 's/h92300f5a5dcbe9fc/h6e897821881fc078/' rust-intel.ll
gsed -i 's/h260bdf0389a41d93/he0edee3f5f00176d/' rust-intel.ll
gsed -i 's/h68259f12b9557c97/hb3a671a76eb88d18/' rust-intel.ll
gsed -i 's/h93039a1a453ca6fa/h89b991e782a34d3c/' rust-intel.ll
gsed -i 's/h1cedb2ea3dec2dd2/h565f072da92eba4a/' rust-intel.ll
gsed -i 's/ha4d058251336ce5a/hce1c61317f4621e6/' rust-intel.ll
gsed -i 's/hd203f6187fbbf08f/h23743788f08f09ac/' rust-intel.ll
gsed -i 's/h887a1a8a37a961fb/h92ddcfd6dccbb247/' rust-intel.ll
gsed -i 's/868f8880dc2af3fb/h868f8880dc2af3fb/' rust-intel.ll

# Without this I'm getting error [1]
gsed -i 's/, file: !6)/)/' rust-intel.ll

llc --filetype=obj -o test.o rust-intel.ll
gcc -c -o myfunc.o src/myfunc.c
gcc -o test test.o myfunc.o -L /usr/lib/rustlib/sparcv9-sun-solaris/lib/ -lstd-899f7b6f82885664
LD_LIBRARY_PATH=/usr/lib/rustlib/sparcv9-sun-solaris/lib/ ./test

The output is bad. See comment above (obj.a[?]=255).

[1]

$ llc --filetype=obj -o test.o rust-intel.ll
llc: rust-intel.ll:2187:46: error: invalid field 'file'
!7 = !DINamespace(name: "result", scope: !8, file: !6)
                                             ^

@jrtc27
Copy link

jrtc27 commented Dec 22, 2017

https://github.com/rust-lang/rust/blob/master/src/librustc_trans/cabi_sparc64.rs#L27 and https://github.com/rust-lang/rust/blob/master/src/librustc_trans/cabi_sparc64.rs#L53 are wrong; structs up to 32 bytes in size, i.e. 256 bits, are returned in registers. I was also suspicious of this code in general when it was first merged as it doesn't seem very long, and handling mixed float and int structs is a little fiddly.

EDIT: Well, actually, is_homogeneous_aggregate should vary based on whether it's an argument (which can be up to 16 bytes) or return value (up to 32 bytes).

@psumbera
Copy link
Contributor Author

psumbera commented Jan 2, 2018

EDIT: Well, actually, is_homogeneous_aggregate should vary based on whether it's an argument (which can be up to 16 bytes) or return value (up to 32 bytes).

Can you please suggest how to implement this?

I can confirm that when modifying cabi_sparc64.rs to contain 256 my above test examples passes as expected (though Firefox still wouldn't build).

@psumbera
Copy link
Contributor Author

As mentioned above. With partial fix Firefox build dies later when it calls clang_visitChildren(). Which takes three arguments and the problematic first one is again 32 bytes long structure (CXCursor).

@glaubitz
Copy link
Contributor

@psumbera Maybe you can suggest a preliminary patch to at least improve the code generation?

@eddyb
Copy link
Member

eddyb commented Jan 18, 2018

EDIT: Well, actually, is_homogeneous_aggregate should vary based on whether it's an argument (which can be up to 16 bytes) or return value (up to 32 bytes).

Are you sure? That doesn't entirely control whether a struct is passed by value or indirectly, but specifically it handles structs which the ABI treats like arrays of primitives (incl. SIMD vectors).

kennytm added a commit to kennytm/rust that referenced this issue Jan 23, 2018
Fixes sparc64 cabi fixes.

Argument up to 16 bytes size is provided in registers.
Return value up to 32 bytes size is stored in registers.

Fixes: rust-lang#46679

---

Firefox now (almost) build on sparc. Original rust issue seems to be gone. Note that I'm not rust expert and the fix was suggested in bug.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. I-crash Issue: The compiler crashes (SIGSEGV, SIGABRT, etc). Use I-ICE instead when the compiler panics. O-SPARC Target: SPARC processors
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants