Throws error if subfolder not found or empty and aborts the table merge #20

premkaliap · 2022-10-10T23:02:49Z

I'm getting the error below when the client encounters a missing path; the download for that table then aborts.

[ForkJoinPool-1-worker-3] WARN com.guidewire.cda.TableReader - Copy Job FAILED for 'cc_claim' for fingerprint '4e588b71e9a149148b623a22da443314': org.apache.spark.sql.AnalysisException: Path does not exist: s3a://tenant-xxx/cc/4e588b71e9a149148b623a22da443314/1664585730000/*.parquet;

I was told by GW that it's normal for the .cda/batch-metrics.json file to reference an empty folder whose actual path doesn't exist. How should this scenario be handled?

"The failures are related to timestamp folders that don’t contain any Parquet files in them. This is standard behavior where, if all records processed in a batch for a given table were deemed as duplicates (previously seen), CDA would correctly not write them out to S3. But it would still write out the reconciliation stats to the .cda/batch-metrics.json file. It’s the presence of this file that’s causing the folder to show up in S3, even if there are no Parquet files inside of it."
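
One possible way to handle this, going by the explanation above, is to check which timestamp folders actually contain Parquet files before asking Spark to read them, and skip the rest. The sketch below is not the cda-client implementation. It is a minimal Python illustration that assumes you already have a flat list of S3 object keys for a table fingerprint. The function name `folders_with_parquet`, the shortened `fp` fingerprint, and the `part-0000.parquet` file name are all made up for the example.

```python
from typing import Iterable, List


def folders_with_parquet(keys: Iterable[str]) -> List[str]:
    """Return the folders that directly contain at least one .parquet object.

    Folders that only show up because of a .cda/batch-metrics.json file
    (no Parquet output for that batch) are left out.
    """
    folders = set()
    for key in keys:
        folder, _, name = key.rpartition("/")
        # Only objects that are Parquet files mark their parent folder as readable.
        if folder and name.endswith(".parquet"):
            folders.add(folder)
    return sorted(folders)


# Hypothetical listing: the second timestamp folder holds only the metrics file.
keys = [
    "cc/fp/1664585700000/part-0000.parquet",
    "cc/fp/1664585700000/.cda/batch-metrics.json",
    "cc/fp/1664585730000/.cda/batch-metrics.json",
]
readable = folders_with_parquet(keys)
```

Here `readable` contains only the first timestamp folder. Passing just those folders to the reader, instead of a `*.parquet` wildcard on every timestamp listed in the metrics, should avoid the `Path does not exist` failure. Whether that fits the client's merge logic is an assumption to verify.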

LearnerEnabler added a commit to LearnerEnabler/cda-client that referenced this issue Dec 28, 2022

temp fix in issue Guidewire#20

f9cd446

LearnerEnabler mentioned this issue Dec 28, 2022

temp fix in issue https://github.com/Guidewire/cda-client/issues/20 #26

Open
