This repository has been archived by the owner on Jan 10, 2023. It is now read-only.

Importing third-party packages before TensorFlow causes a runtime error #4

Open
zachgrayio opened this issue Apr 27, 2018 · 7 comments

@zachgrayio

zachgrayio commented Apr 27, 2018

Continuing our discussion from the group here.

Full background - I've just copied my comment directly from the group:

I've had some success using third-party SPM packages by building a dynamic library and linking it when launching the REPL. However, the import order of TensorFlow relative to other packages seems to matter: importing the third-party library first causes a C++ runtime error in TensorFlow.

Here are some snippets:

Package.swift

// swift-tools-version:4.0
import PackageDescription

let package = Package(
    name: "TFExample",
    products: [
        .library(
            name: "TFExample",
            type: .dynamic,    // allow use of this package and its deps from the REPL
            targets: ["TFExample"]
        )
    ],
    dependencies: [
        .package(url: "https://github.com/ReactiveX/RxSwift.git", "4.0.0" ..< "5.0.0")
    ],
    targets: [
        .target(
            name: "TFExample",
            dependencies: ["RxSwift"]),
        .testTarget(
            name: "TFExampleTests",
            dependencies: ["TFExample"]),
    ]
)

... then we just fetch dependencies and build with vanilla commands, then invoke the REPL:

Invocation

swift -I/usr/lib/swift/clang/include -I/usr/src/TFExample/.build/debug -L/usr/src/TFExample/.build/debug -lTFExample

At this point, I'm able to import RxSwift and TensorFlow in the REPL without errors in any order; however, when I actually interact with the packages, the incorrect import order does result in a runtime error:

Scenario 1 (OK)

  1> import TensorFlow
  2> import RxSwift
  3>  _ = Observable.from([1,2]).subscribe(onNext: { print($0) })
1
2
  4> var x = Tensor([[1, 2], [3, 4]])
2018-04-27 17:13:12.514107: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
x: TensorFlow.Tensor<Double> = [[1.0, 2.0], [3.0, 4.0]]

Scenario 2 (runtime error)

  1> import RxSwift
  2> import TensorFlow
  3> _ = Observable.from([1,2]).subscribe(onNext: { print($0) })
1
2
  4> var x = Tensor([[1, 2], [3, 4]])
x: TensorFlow.Tensor<Double> =terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid

The full process is outlined here if more detail is necessary: https://github.com/zachgrayio/swift-tensorflow/blob/example/package/README.md#run-with-dependencies-advanced

@abl abl added the tensorflow label Apr 27, 2018
@dan-zheng

Thanks for providing so much detail!
I'm looking into this now.

@dan-zheng

I was able to replicate the issue:

$ docker run --rm --privileged --cap-add sys_ptrace -it -v ${PWD}:/usr/src zachgray/swift-tensorflow:4.2 swift -I/usr/lib/swift/clang/include -I/usr/src/TFExample/.build/debug -L/usr/src/TFExample/.build/debug -lTFExample
Welcome to Swift version 4.2-dev (LLVM 04bdb56f3d, Clang b44dbbdf44). Type :help for assistance.
  1> import TensorFlow
  2> import RxSwift
  3> Tensor(1)
error: Couldn't lookup symbols:
  protocol witness table for Swift.Double : TensorFlow.AccelerableByTensorFlow in TensorFlow
  _swift_tfc_StartTensorComputation
  _swift_tfc_FinishTensorComputation
  direct field offset for TensorFlow.TensorHandle.cTensorHandle : Swift.OpaquePointer
  type metadata accessor for TensorFlow.TensorHandle

  3> var x = Tensor([[1, 2], [3, 4]])
x: TensorFlow.Tensor<Double> =terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid

The solution is to add an extra -lswiftTensorFlow flag:

$ docker run --rm --privileged --cap-add sys_ptrace -it -v ${PWD}:/usr/src zachgray/swift-tensorflow:4.2 swift -I/usr/lib/swift/clang/include -I/usr/src/TFExample/.build/debug -L/usr/src/TFExample/.build/debug -lTFExample -lswiftTensorFlow
Welcome to Swift version 4.2-dev (LLVM 04bdb56f3d, Clang b44dbbdf44). Type :help for assistance.
  1> import RxSwift
  2> import TensorFlow
  3> _ = Observable.from([1,2]).subscribe(onNext: { print($0) })
1
2
  4> var x = Tensor([[1, 2], [3, 4]])
2018-04-27 23:07:35.467557: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
x: TensorFlow.Tensor<Double> = [[1.0, 2.0], [3.0, 4.0]]

I tested the Swift interpreter by putting the code into test.swift, then running:
docker run --rm --privileged --cap-add sys_ptrace -it -v ${PWD}:/usr/src zachgray/swift-tensorflow:4.2 swift -I/usr/lib/swift/clang/include -I/usr/src/TFExample/.build/debug -L/usr/src/TFExample/.build/debug -lTFExample -O /usr/src/test.swift

This worked without specifying -lswiftTensorFlow, suggesting the problem is probably REPL-specific and involves linker flags.

On Linux, the Swift shared runtime library path is found at <path_to_toolchain>/usr/lib/swift/linux. It contains shared libraries like libswiftCore.so, libswiftTensorFlow.so, libswiftPython.so, etc.

In lib/Driver/Toolchains.cpp (used by the interpreter/compiler), toolchains::GenericUnix::constructInvocation automatically adds flags that add the Swift shared runtime library path and link libswiftCore.so. Presumably there is separate logic for handling the other libraries in the same path (like libswiftPython.so), but I couldn't find it.

The REPL uses entirely separate logic for linking libraries (somewhere in google/swift-lldb). I'll do some digging and try to fix this.

This linking is probably related to #5.
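
A rough shell sketch of the manual workaround discussed in this thread: add `-lswiftPython` and `-lswiftTensorFlow` to the REPL invocation only when the matching shared library is present in the Swift runtime library directory. The function name `repl_link_flags` and the variable `SWIFT_RUNTIME_DIR` are hypothetical, chosen for illustration; this is not the logic used by the compiler or by swift-lldb.

```shell
#!/bin/sh
# Sketch only: build the extra -l flags for the REPL from the libraries that
# actually exist in the toolchain's runtime directory.
# SWIFT_RUNTIME_DIR is a hypothetical variable; point it at
# <path_to_toolchain>/usr/lib/swift/linux.

repl_link_flags() {
    runtime_dir=$1
    flags=""
    # Order follows the workaround in this thread: Python before TensorFlow.
    for lib in swiftPython swiftTensorFlow; do
        if [ -e "$runtime_dir/lib$lib.so" ]; then
            flags="$flags -l$lib"
        fi
    done
    # Strip the leading space before printing.
    printf '%s\n' "${flags# }"
}

# Example use (paths taken from the invocation in this thread):
#   swift -I/usr/lib/swift/clang/include \
#         -I/usr/src/TFExample/.build/debug \
#         -L/usr/src/TFExample/.build/debug \
#         $(repl_link_flags "$SWIFT_RUNTIME_DIR") -lTFExample
```

If neither library is found, the function prints an empty line and the invocation falls back to the original flags.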

@zachgrayio
Author

zachgrayio commented Apr 28, 2018

@dan-zheng - nice work man. This is exactly what I was missing. See the following:

docker run --rm --privileged --cap-add sys_ptrace -itv ${PWD}:/usr/src \
    zachgray/swift-tensorflow:4.2 \
    swift \
    -I/usr/lib/swift/clang/include \
    -I/usr/src/TFExample/.build/debug \
    -L/usr/src/TFExample/.build/debug \
    -lswiftPython \
    -lswiftTensorFlow \
    -lTFExample

Welcome to Swift version 4.2-dev (LLVM 04bdb56f3d, Clang b44dbbdf44). Type :help for assistance.
  1> import RxSwift
  2> import Python
  3> import TensorFlow
  4> var x = Tensor([[1, 2], [3, 4]])
2018-04-28 00:11:10.828554: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
x: TensorFlow.Tensor<Double> = [[1.0, 2.0], [3.0, 4.0]]
  5> _ = Observable.from([1,2]).subscribe(onNext: { print($0) })
1
2
  6> var x: PyValue = [1, "hello", 3.14]
x: Python.PyValue = [1, 'hello', 3.14]
  7> :exit

** edited formatting

@dan-zheng dan-zheng self-assigned this Apr 29, 2018
@dan-zheng

I'm working on a simple fix now.
Regarding import order: I didn't notice errors when importing Python before TensorFlow so that's the order I'll use.

@dan-zheng

I believe this is fixed in 1969380.
You can try the pre-built packages from 05-10 to verify.

marcrasi pushed a commit to swiftlang/swift that referenced this issue Jun 14, 2018
…ries on Linux.

Addresses google#4.

On Linux, the REPL fails with "Couldn't lookup symbols" errors when:
- Linking an external Swift shared library (produced by `swift build`) and
  importing the corresponding Swift module.
- Importing the `TensorFlow` and `Python` modules, without manually linking
  `libswiftPython.so` and `libswiftTensorFlow.so`.

A manual workaround involves specifying the `-lswiftPython` and
`-lswiftTensorFlow` flags (in that specific order) when invoking the REPL.
Also, `Python` and `TensorFlow` must be imported before the external Swift
module to avoid the error.

Conditionally adding the linker flags here seems to solve the issue. This is
robust assuming that toolchain artifacts are not manipulated (so that somehow
`Python.swiftmodule` exists while `libswiftPython.so` doesn't).

PiperOrigin-RevId: 195572160
marcrasi pushed a commit to swiftlang/swift that referenced this issue Jun 22, 2018
marcrasi pushed a commit to swiftlang/swift that referenced this issue Jun 28, 2018
@pschuh

pschuh commented May 11, 2019

This should be solvable with -module-link-name, which avoids this hack in the compiler.

@pschuh pschuh reopened this May 11, 2019
@pschuh

pschuh commented May 11, 2019

I've reproduced linking this way outside of the Swift compiler. This is also how Foundation and XCTest work. It avoids linking these libs into every binary whether they are needed or not.
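
For readers unfamiliar with `-module-link-name`: it records a library name in the emitted Swift module, so that code importing the module links that library automatically instead of needing an explicit `-l` flag. The sketch below shows what such a build command might look like. The helper `build_with_link_name`, the `DRY_RUN` switch, and the `Greeter` module are hypothetical names for illustration, and whether the REPL honors the recorded link name in this setup is the proposal in the comment above, not something verified here.

```shell
#!/bin/sh
# Sketch only: print (or run) a swiftc invocation that builds a shared library
# whose module records its own link name, so importers autolink it.
# "Greeter" and Greeter.swift are hypothetical names for illustration.

build_with_link_name() {
    module=$1
    source_file=$2
    # Reuse the function's positional parameters to hold the command line.
    set -- swiftc -emit-library -emit-module \
        -module-name "$module" \
        -module-link-name "$module" \
        "$source_file" -o "lib$module.so"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show the command without requiring a Swift toolchain.
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# DRY_RUN=1 build_with_link_name Greeter Greeter.swift
```

With `DRY_RUN=1` the helper only prints the command, which makes it easy to inspect the flags before running them against a real toolchain.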
