[BUG] test_parquet_read_encryption fails #5959

Closed
revans2 opened this issue Jul 6, 2022 · 1 comment · Fixed by #5994
Labels: bug (Something isn't working), P1 (Nice to have for release)

Comments

Collaborator

revans2 commented Jul 6, 2022

Describe the bug
When I run the Python integration tests against Spark 3.2.0 or Spark 3.3.0 with

env -u SPARK_CONF_DIR SPARK_HOME=~/spark_3.2.0/ ./run_pyspark_from_build.sh -k 'parquet and encryption'

most of them fail with an error like this:

Caused by: java.lang.ClassNotFoundException: Class org.apache.parquet.crypto.keytools.mocks.InMemoryKMS not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2571)
	at org.apache.parquet.hadoop.util.ConfigurationUtil.getClassFromConfig(ConfigurationUtil.java:33)

All I did was to run buildall.sh and then cd into the integration tests directory and run the tests.
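
One way to narrow this down is to check whether any jar that the build produced or downloaded actually contains the class named in the exception. The following is only a rough sketch: the `target/dependency` directory is a hypothetical location, not something confirmed by this issue, so point it at wherever your build places dependency jars.

```python
import glob
import zipfile

# Class named in the ClassNotFoundException above, written as a path inside a jar.
KMS_CLASS_ENTRY = "org/apache/parquet/crypto/keytools/mocks/InMemoryKMS.class"


def jars_containing(entry, jar_paths):
    """Return the jars from jar_paths that contain the given class entry."""
    found = []
    for path in jar_paths:
        try:
            with zipfile.ZipFile(path) as jar:
                if entry in jar.namelist():
                    found.append(path)
        except (zipfile.BadZipFile, OSError):
            # Missing file or not a readable jar; skip it.
            continue
    return found


if __name__ == "__main__":
    # Hypothetical directory (an assumption); adjust to your build layout.
    candidates = glob.glob("target/dependency/*.jar")
    print(jars_containing(KMS_CLASS_ENTRY, candidates))
```

An empty list would suggest that no jar in that directory provides `InMemoryKMS`, which would be consistent with the failure above.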

It looks like

@pytest.mark.skipif(os.environ.get('INCLUDE_PARQUET_HADOOP_TEST_JAR', 'false') == 'false', reason='INCLUDE_PARQUET_HADOOP_TEST_JAR is disabled')

is the culprit. INCLUDE_PARQUET_HADOOP_TEST_JAR is set to true in run_pyspark_from_build.sh without ever checking whether the Hadoop dependency was placed where it is expected to be. But honestly I have no idea, because I don't know which jar is expected to be on the classpath. It could be that I built all of the versions and somehow the wrong dependency was downloaded. Either way, this is a bad user experience.
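
As an illustration of the kind of guard that would avoid this, the skip condition could require both the environment flag and the presence of the jar. This is a minimal sketch, not the project's actual fix: the helper name `encryption_tests_enabled`, the `jar_dir` argument, and the `parquet-hadoop-*-tests.jar` file name pattern are all assumptions made for the example, since the issue does not say which jar is expected.

```python
import glob
import os


def encryption_tests_enabled(environ, jar_dir,
                             jar_pattern="parquet-hadoop-*-tests.jar"):
    """Return True only if the flag is on AND a matching jar exists in jar_dir.

    jar_dir and jar_pattern are hypothetical; the real jar name and location
    are not stated in this issue.
    """
    # Mirror the existing check: anything other than 'false' counts as enabled.
    flag_on = environ.get('INCLUDE_PARQUET_HADOOP_TEST_JAR', 'false') != 'false'
    # Additionally require that at least one matching jar is actually present.
    jar_present = len(glob.glob(os.path.join(jar_dir, jar_pattern))) > 0
    return flag_on and jar_present
```

A test could then use `@pytest.mark.skipif(not encryption_tests_enabled(os.environ, jar_dir), reason='parquet-hadoop test jar not found or INCLUDE_PARQUET_HADOOP_TEST_JAR is disabled')`, so a missing jar produces a skip with a clear reason instead of a ClassNotFoundException.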

revans2 added the labels bug (Something isn't working) and ? - Needs Triage (Need team to review and classify) on Jul 6, 2022
Collaborator Author

revans2 commented Jul 6, 2022

FAILED ../../src/main/python/parquet_test.py::test_parquet_read_encryption[reader_confs3-] - py4j.protocol.Py4JJavaError: An error occurred while calling o310.parquet.
FAILED ../../src/main/python/parquet_test.py::test_parquet_read_encryption[reader_confs0-] - py4j.protocol.Py4JJavaError: An error occurred while calling o310.parquet.
FAILED ../../src/main/python/parquet_test.py::test_parquet_read_encryption[reader_confs1-parquet] - py4j.protocol.Py4JJavaError: An error occurred while calling o310.parquet.
FAILED ../../src/main/python/parquet_test.py::test_parquet_read_encryption[reader_confs2-parquet] - py4j.protocol.Py4JJavaError: An error occurred while calling o310.parquet.
FAILED ../../src/main/python/parquet_test.py::test_parquet_read_encryption[reader_confs5-] - py4j.protocol.Py4JJavaError: An error occurred while calling o310.parquet.
FAILED ../../src/main/python/parquet_test.py::test_parquet_read_encryption[reader_confs2-] - py4j.protocol.Py4JJavaError: An error occurred while calling o310.parquet.
FAILED ../../src/main/python/parquet_test.py::test_parquet_read_encryption[reader_confs4-] - py4j.protocol.Py4JJavaError: An error occurred while calling o310.parquet.
FAILED ../../src/main/python/parquet_test.py::test_parquet_read_encryption[reader_confs3-parquet] - py4j.protocol.Py4JJavaError: An error occurred while calling o310.parquet.
FAILED ../../src/main/python/parquet_test.py::test_parquet_read_encryption[reader_confs0-parquet] - py4j.protocol.Py4JJavaError: An error occurred while calling o310.parquet.
FAILED ../../src/main/python/parquet_test.py::test_parquet_read_encryption[reader_confs1-] - py4j.protocol.Py4JJavaError: An error occurred while calling o310.parquet.
FAILED ../../src/main/python/parquet_test.py::test_parquet_read_encryption[reader_confs4-parquet] - py4j.protocol.Py4JJavaError: An error occurred while calling o310.parquet.
FAILED ../../src/main/python/parquet_test.py::test_parquet_read_encryption[reader_confs5-parquet] - py4j.protocol.Py4JJavaError: An error occurred while calling o400.parquet.

sameerz added the label P1 (Nice to have for release) and removed the label ? - Needs Triage (Need team to review and classify) on Jul 12, 2022