Commit
Criteo HugeCTR Inference Configuration Fix (#1522)
* API Overhaul
  First draft of the API overhaul changes. Adds most core functionality, including defining workflow graphs with a ColumnGroup class, the workflow and dataset changes, most operators converted to use the new API, etc.
* remove debug print statement
* Fix test_io unittest
  Also partially fix some tests inside test_workflow
* Handle multi-column joint/combo categorify
* Update JoinGroupby
* Fix differencelag
* add dependencies method (#498)
* Convert TargetEncoding op
* Update nvtabular/workflow.py
  Co-authored-by: Richard (Rick) Zamora <rzamora217@gmail.com>
* Update nvtabular/workflow.py
  Co-authored-by: Richard (Rick) Zamora <rzamora217@gmail.com>
* Remove workflow code from dataloaders
  We should now be doing online transforms like `KerasSequenceLoader(workflow.transform(dataset), ...` instead of `KerasSequenceLoader(dataset, workflows=[workflow], ...`
* Unittest ops + bugfix in Bucketize (#496)
* test_minmix
* updates test
* unittest ops
* First draft get_embedding_sizes support
  Re-add get_embedding_sizes. Note that this changes how we support multi-hot columns here (sizes are returned the same as for single-hot, and we don't use this method to distinguish between multi-hot and single-hot columns)
* isort
* Remove groupbystatistics
* implement serialization of statistics
  Add save_stats/load_stats/clear_stats methods to the workflow, with each stat operator getting called as appropriate
* Fix TF dataloader unittests
* test_torch_dataloader fixes
* doc strings
* add comma to ps.json

Co-authored-by: Ben Frederickson <github@benfrederickson.com>
Co-authored-by: rnyak <ronayak@hotmail.com>
Co-authored-by: Richard (Rick) Zamora <rzamora217@gmail.com>
Co-authored-by: root <root@dgx06.aselab.nvidia.com>
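The "Remove workflow code from dataloaders" note in the message above describes a change in calling pattern: the workflow transforms the dataset, and the loader receives the already-transformed result instead of a `workflows` argument. The following is a minimal sketch of that pattern only. `Dataset`, `Workflow`, and `SequenceLoader` here are hypothetical stand-ins written for illustration, not the NVTabular classes, and the lazy per-row transform is an assumption about what "online" means in this context.

```python
# Hypothetical stand-ins that illustrate the calling pattern only.
# These are NOT the real NVTabular classes.

class Dataset:
    """Toy dataset that is re-iterable via a factory of row iterators."""

    def __init__(self, make_iter):
        self._make_iter = make_iter

    def __iter__(self):
        return self._make_iter()


class Workflow:
    """Toy workflow that applies one function to each row."""

    def __init__(self, fn):
        self.fn = fn

    def transform(self, dataset):
        # Lazy: rows are transformed only when the result is iterated.
        return Dataset(lambda: (self.fn(row) for row in dataset))


class SequenceLoader:
    """Toy loader that only batches; it has no knowledge of workflows."""

    def __init__(self, dataset, batch_size):
        self.dataset = dataset
        self.batch_size = batch_size

    def __iter__(self):
        batch = []
        for row in self.dataset:
            batch.append(row)
            if len(batch) == self.batch_size:
                yield batch
                batch = []
        if batch:  # emit the final partial batch
            yield batch


dataset = Dataset(lambda: iter([1, 2, 3, 4, 5]))
workflow = Workflow(lambda x: x * 10)

# New style: transform first, then hand the result to the loader.
loader = SequenceLoader(workflow.transform(dataset), batch_size=2)
batches = list(loader)
```

Keeping the loader unaware of workflows means the same loader works on raw or transformed data, which is one plausible motivation for the change described in the commit.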