Wrapper for TagAs did not work (#1462)
* API Overhaul

First draft of the API overhaul changes. Adds most core functionality, including
defining workflow graphs with a ColumnGroup class, the workflow and dataset changes,
most operators converted to use the new API, etc.

* remove debug print statement

* Fix test_io unittest

Also partially fix some tests inside test_workflow

* Handle multi-column joint/combo categorify

* Update JoinGroupby

* Fix differencelag

* add dependencies method (#498)

* Convert TargetEncoding op

* Update nvtabular/workflow.py

Co-authored-by: Richard (Rick) Zamora <rzamora217@gmail.com>

* Update nvtabular/workflow.py

Co-authored-by: Richard (Rick) Zamora <rzamora217@gmail.com>

* Remove workflow code from dataloaders

Online transforms should now be done with
`KerasSequenceLoader(workflow.transform(dataset), ...)` instead of
`KerasSequenceLoader(dataset, workflows=[workflow], ...)`.

* Unittest ops + bugfix in Bucketize (#496)

* test_minmix

* updates test

* unittest ops

* First draft get_embedding_sizes support

Re-add get_embedding_sizes. Note that this changes how we support multi-hot columns here
(sizes are returned the same way as for single-hot columns, and we don't use this method to
distinguish between multi-hot and single-hot columns).

* isort

* Remove groupbystatistics

* implement serialization of statistics

Add save_stats/load_stats/clear_stats methods to the workflow, with each StatOperator getting
called as appropriate.

* Fix TF dataloader unittests

* test_torch_dataloader fixes

* doc strings

* fix tagas

Co-authored-by: Ben Frederickson <github@benfrederickson.com>
Co-authored-by: rnyak <ronayak@hotmail.com>
Co-authored-by: Richard (Rick) Zamora <rzamora217@gmail.com>
Co-authored-by: root <root@dgx06.aselab.nvidia.com>
Co-authored-by: Karl Higley <kmhigley@gmail.com>
6 people authored Mar 18, 2022
1 parent 7e64ef1 commit 154ed10
Showing 1 changed file with 12 additions and 16 deletions.
28 changes: 12 additions & 16 deletions nvtabular/ops/add_metadata.py
@@ -53,25 +53,21 @@ def __init__(self, properties=None):


 # Wrappers for common features
-class TagAsUserID(Operator):
-    @property
-    def output_tags(self):
-        return [Tags.USER_ID]
+class TagAsUserID(AddTags):
+    def __init__(self, tags=None):
+        super().__init__(tags=[Tags.USER_ID, Tags.USER])


-class TagAsItemID(Operator):
-    @property
-    def output_tags(self):
-        return [Tags.ITEM_ID]
+class TagAsItemID(AddTags):
+    def __init__(self, tags=None):
+        super().__init__(tags=[Tags.ITEM_ID, Tags.ITEM])


-class TagAsUserFeatures(Operator):
-    @property
-    def output_tags(self):
-        return [Tags.USER]
+class TagAsUserFeatures(AddTags):
+    def __init__(self, tags=None):
+        super().__init__(tags=[Tags.USER])


-class TagAsItemFeatures(Operator):
-    @property
-    def output_tags(self):
-        return [Tags.ITEM]
+class TagAsItemFeatures(AddTags):
+    def __init__(self, tags=None):
+        super().__init__(tags=[Tags.ITEM])
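
The diff above moves each wrapper from overriding `output_tags` on `Operator` to subclassing `AddTags` and passing a fixed tag list to its constructor. A minimal standalone sketch of that pattern follows. The `Tags` enum and `AddTags` base here are simplified stand-ins written for illustration and are assumptions, not the library's actual implementations; only the wrapper bodies mirror the diff.

```python
from enum import Enum


class Tags(Enum):
    # Stand-in for the library's tag enum (assumption); only the members
    # that appear in the diff are defined here.
    USER_ID = "user_id"
    ITEM_ID = "item_id"
    USER = "user"
    ITEM = "item"


class AddTags:
    # Hypothetical minimal base class: stores the tags it is given and
    # exposes them through output_tags. The real AddTags is not shown in
    # the diff, so this behavior is an assumption.
    def __init__(self, tags=None):
        self.tags = list(tags or [])

    @property
    def output_tags(self):
        return self.tags


class TagAsUserID(AddTags):
    def __init__(self, tags=None):
        # As in the diff: the `tags` argument is accepted but the fixed
        # list is what gets passed to the base class.
        super().__init__(tags=[Tags.USER_ID, Tags.USER])


class TagAsItemID(AddTags):
    def __init__(self, tags=None):
        super().__init__(tags=[Tags.ITEM_ID, Tags.ITEM])


class TagAsUserFeatures(AddTags):
    def __init__(self, tags=None):
        super().__init__(tags=[Tags.USER])


class TagAsItemFeatures(AddTags):
    def __init__(self, tags=None):
        super().__init__(tags=[Tags.ITEM])
```

One thing the sketch makes visible: because each wrapper ignores its `tags` parameter, a caller-supplied list has no effect on the resulting tags. Whether that is intended is not stated in the commit.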
