Non-ascii filenames cause InvalidPathException in java_test #15106

Closed
mpdn opened this issue Mar 23, 2022 · 5 comments
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Rules-Java Issues for Java rules

Comments


mpdn commented Mar 23, 2022

It seems that the environment in which Bazel executes the JVM on Linux does not allow non-ASCII filenames.

Example:

Test.java:

import java.nio.file.Files;
import java.io.IOException;

class Test {
    public static void main(String[] args) throws IOException {
        Files.createTempFile("æøå", null);
    }
}

BUILD:

java_test(
    name = "test",
    srcs = ["Test.java"],
    main_class = "Test",
    test_class = "Test",
)

Executing this (on Linux):

bazel run --java_runtime_version=remotejdk_11 //:test

Fails with this:

Exception in thread "main" java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: ???5404569911566851009.tmp
	at java.base/sun.nio.fs.UnixPath.encode(UnixPath.java:145)
	at java.base/sun.nio.fs.UnixPath.<init>(UnixPath.java:69)
	at java.base/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:279)
	at java.base/java.nio.file.TempFileHelper.generatePath(TempFileHelper.java:59)
	at java.base/java.nio.file.TempFileHelper.create(TempFileHelper.java:126)
	at java.base/java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:160)
	at java.base/java.nio.file.Files.createTempFile(Files.java:913)
	at Test.main(Test.java:6)

This issue seems to only happen on Linux (but I've only tried Ubuntu 20.04.3 LTS). On Mac, it works without issues. I have not tried on Windows.

Bazel version:

$ bazel info release
release 5.0.0

I have found some issues of people getting InvalidPathException, but they seem to be exceptions in Bazel itself. This is an exception from the application being tested instead.

I have also tried setting LANG via --test_env and --action_env, and setting the sun.jnu.encoding and file.encoding JVM parameters, but to no avail.
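For anyone trying to reproduce this, a diagnostic variant of the test above may help show what the JVM actually sees when the failure occurs. This is only a sketch: the class name `TempFileDiagnostics` and the helper `describeEnvironment` are made up for illustration, and the prefix is written with Unicode escapes so that the source compiles regardless of the compiler's default encoding.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

public class TempFileDiagnostics {
    /** Summarizes the encoding-related properties and variables visible to this JVM. */
    static String describeEnvironment() {
        StringBuilder sb = new StringBuilder();
        for (String prop : new String[] {"sun.jnu.encoding", "file.encoding"}) {
            sb.append(prop).append('=').append(System.getProperty(prop)).append('\n');
        }
        for (String var : new String[] {"LC_ALL", "LC_CTYPE", "LANG"}) {
            // Prints "null" for variables that are not set in the environment.
            sb.append(var).append('=').append(System.getenv(var)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        try {
            // Same call as in Test.java; "\u00e6\u00f8\u00e5" is "æøå".
            Path tmp = Files.createTempFile("\u00e6\u00f8\u00e5", null);
            Files.delete(tmp);
            System.out.println("created and removed a temp file with a non-ASCII prefix");
        } catch (InvalidPathException e) {
            System.err.println("InvalidPathException: " + e.getMessage());
        }
        System.out.print(describeEnvironment());
    }
}
```

Running it once directly and once through Bazel should make any difference in the reported values visible.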

@mpdn mpdn changed the title Non-ascii filenames cause InvalidPathException in java_test Non-ascii filenames cause InvalidPathException in java_test Mar 23, 2022
@ckolli5 ckolli5 added team-Rules-Java Issues for Java rules untriaged labels Mar 23, 2022
comius (Contributor) commented Mar 29, 2022

I don't see any InvalidPathExceptions thrown from Bazel (except in Android rules, but this isn't invoked here). https://cs.opensource.google/search?q=InvalidPathException&sq=&ss=bazel%2Fbazel

For now I'll assume this is a problem with the underlying filesystem/OS, unless I'm missing something.

@comius comius added P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) and removed untriaged labels Mar 29, 2022
mpdn (Author) commented Mar 30, 2022

It's not thrown from Bazel; it's thrown from the JVM that Bazel launches as part of the test, so there is no Bazel code running in that Java application. I think that, due to the environment defined by Bazel or sandboxing or something else, the JVM believes that the filesystem does not support Unicode paths on Linux, which it does.

It seems like surprising behavior to me, as the JVM on Linux usually has no problem dealing with Unicode paths.

@comius comius added P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) labels Apr 1, 2022
comius (Contributor) commented Apr 1, 2022

> due to the environment defined by Bazel or sandboxing or something else, the JVM believes that the filesystem does not support Unicode paths on Linux

If that's the case, further investigation, or even a fix, is welcome.

fmeum (Collaborator) commented Apr 1, 2022

In SystemProps, the value of sun.jnu.encoding is set explicitly without taking command-line overrides into account. It is then read in a static initializer in sun.nio.fs.Util, so there isn't much time for programmatic changes to the property to have any effect. This explains why the workarounds don't apply. On macOS, the encoding is forced to UTF-8 here, which explains why this is not an issue on a Mac.
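To illustrate why the chosen encoding matters, here is a minimal sketch that checks whether the filename prefix from the report can be encoded in a given charset. It uses only the public `java.nio.charset` API and does not reproduce the JDK's internal path encoding; the class name `JnuEncodingCheck` and the helper `canEncode` are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class JnuEncodingCheck {
    // "æøå" written with Unicode escapes so the source compiles under any locale.
    static final String NAME = "\u00e6\u00f8\u00e5";

    /** Returns true if every character of name can be encoded in the given charset. */
    static boolean canEncode(String name, Charset charset) {
        return charset.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        // Only read the property: as noted above, changing it at runtime is too late.
        String jnu = System.getProperty("sun.jnu.encoding");
        System.out.println("sun.jnu.encoding = " + jnu);
        System.out.println("US-ASCII can encode: " + canEncode(NAME, StandardCharsets.US_ASCII));
        System.out.println("UTF-8 can encode:    " + canEncode(NAME, StandardCharsets.UTF_8));
        if (jnu != null && Charset.isSupported(jnu)) {
            System.out.println(jnu + " can encode: " + canEncode(NAME, Charset.forName(jnu)));
        }
    }
}
```

If the JVM ends up with an ASCII-only value for sun.jnu.encoding, the last line should report false, which would be consistent with the "unmappable characters" message in the stack trace.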

On Linux, the JVM parses the current locale to determine the value of sun.jnu.encoding. This usually amounts to reading LC_ALL, LC_CTYPE and LANG, none of which are set in a test action by default (regardless of whether the sandbox is used or not). For me on Ubuntu 21.04, I get:

# Passes since it inherits the full environment:
$ bazel run --java_runtime_version=remotejdk_11 //:test
# Fails, since the test action doesn't inherit any of `LC_CTYPE`, `LC_ALL` and `LANG` and thus `setlocale` falls back to some legacy encoding (ANSI_X3.4-1968 on my machine):
$ bazel test --java_runtime_version=remotejdk_11 //:test
# Passes:
$ bazel test --java_runtime_version=remotejdk_11 //:test --test_env=LANG

In general, test actions on Linux would probably have to either inherit LC_ALL and LANG or set them to fixed values, such as C.utf-8.
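As a rough sketch of the lookup order involved, the following helper returns the locale name that would apply to the character-type category given a set of environment variables, using the POSIX precedence LC_ALL, then LC_CTYPE, then LANG. The class `LocaleEnv` and the method `effectiveCtype` are hypothetical names for illustration; the real resolution happens in `setlocale` in libc, not in Java, and the fallback is written here as "C".

```java
import java.util.Map;

public class LocaleEnv {
    /**
     * Returns the first non-empty value among LC_ALL, LC_CTYPE and LANG,
     * or "C" if none of them is set (assumed fallback for this sketch).
     */
    static String effectiveCtype(Map<String, String> env) {
        for (String key : new String[] {"LC_ALL", "LC_CTYPE", "LANG"}) {
            String value = env.get(key);
            if (value != null && !value.isEmpty()) {
                return value;
            }
        }
        return "C";
    }

    public static void main(String[] args) {
        // Shows what this process inherited, e.g. inside a test action.
        System.out.println(effectiveCtype(System.getenv()));
    }
}
```

With an empty environment, as in a default test action, the helper falls through to "C", which matches the legacy-encoding behavior described above.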

@comius Do you have a preference how to deal with this host-dependent configuration?

comius (Contributor) commented Apr 1, 2022

Either set the --test_env flag, or make LANG inherited by default here: https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/bazel/rules/BazelRuleClassProvider.java;l=184;drc=dbb6e9954b6e4423f727feb2719ffc75a93b514b

The former is slightly better because you can set a fixed value.
The latter might result in different behaviours based on the user's installation.
