Commit
Fix fastparquet tests to work with HDFS
Fixes NVIDIA#9545.

This commit fixes the `fastparquet` tests to run on Spark clusters where `fs.default.name` does not point to the local filesystem.

Before this commit, the `fastparquet` tests assumed that the parquet files generated for the tests were written to the local filesystem, and could be read by both `fastparquet` and Spark from the same location. However, this fails when run against clusters whose default filesystem is HDFS. `fastparquet` can only read from the local filesystem.

This commit changes the tests as follows:
1. For tests where data is generated by Spark, the data is copied to the local filesystem before it is read by `fastparquet`.
2. For tests where data is generated by `fastparquet`, the data is copied to the default Hadoop filesystem before it is read through Spark.

Signed-off-by: MithunR <mythrocks@gmail.com>
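The two copy directions described above can be illustrated with a minimal sketch. It is not the code from this commit: the helper names (`copy_to_local`, `copy_from_local`) and the use of the `hadoop fs` command line are assumptions made for illustration, and the actual tests may copy files through a different mechanism. The sketch assumes a `hadoop` executable is on the `PATH` and configured for the cluster's default filesystem.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical runner type: takes a command line and executes it.
Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    # Raise CalledProcessError if the hadoop command exits non-zero.
    subprocess.run(cmd, check=True)


def copy_to_local_cmd(hadoop_path: str, local_path: str) -> List[str]:
    # Default Hadoop filesystem -> local filesystem.
    return ["hadoop", "fs", "-copyToLocal", hadoop_path, local_path]


def copy_from_local_cmd(local_path: str, hadoop_path: str) -> List[str]:
    # Local filesystem -> default Hadoop filesystem.
    return ["hadoop", "fs", "-copyFromLocal", local_path, hadoop_path]


def copy_to_local(hadoop_path: str, local_path: str, run: Runner = _run) -> None:
    # Case 1: Spark wrote the data; make it readable by fastparquet.
    run(copy_to_local_cmd(hadoop_path, local_path))


def copy_from_local(local_path: str, hadoop_path: str, run: Runner = _run) -> None:
    # Case 2: fastparquet wrote the data; make it readable by Spark.
    run(copy_from_local_cmd(local_path, hadoop_path))
```

In a test, `copy_to_local` would be called after Spark writes its output and before `fastparquet` reads it, and `copy_from_local` after `fastparquet` writes and before Spark reads. The `run` parameter exists only so the command can be checked without a cluster.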