How I Can Apply Inverse Transform #1858

mustfkeskin · 2023-08-04T06:39:30Z

How can I get back to my original feature set?
What is the equivalent of scikit-learn's inverse_transform function here?

After training a retrieval model, I want to recover my original user and item IDs :)

rnyak · 2023-08-07T19:08:57Z

@mustfkeskin your mapped values are stored in the categories folder when you run the Categorify op. For your item_id column you will see a unique.item_id.parquet file under the categories folder. The Categorify op does the mapping as follows:

0 is reserved for padding, so you should not have any 0 in your transformed data.
1 is reserved for nulls, so any nulls in a categorical column are mapped to 1.
OOVs (out-of-vocabulary values) are mapped to 2.
The regular encoding starts from 3: the most frequent item in a categorical column is encoded as 3, the second most frequent as 4, and so on.

The index column holds your encoded values. From there you can write a simple pandas mapping script to convert the encoded IDs back to the original IDs.
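
A minimal sketch of such a mapping script, using a small in-memory table in place of the real parquet file. The column name item_id, the sample values, and the helper name inverse_transform are hypothetical, chosen only for illustration. It assumes, as described above, that the table's index holds the encoded ID. It makes no assumption about whether the real file contains rows for the reserved IDs 0 to 2. Any encoded ID with no matching row simply comes back as missing.

```python
import pandas as pd

# Toy stand-in for a categories parquet file (hypothetical values).
# Assumption: the index is the encoded ID, and the column named after
# the feature holds the original value. Regular encodings start at 3.
mapping_df = pd.DataFrame(
    {"item_id": ["sku_a", "sku_b", "sku_c"]},
    index=[3, 4, 5],
)


def inverse_transform(encoded: pd.Series, mapping_df: pd.DataFrame, col: str) -> pd.Series:
    """Map encoded IDs back to original values.

    Encoded IDs without a row in mapping_df (for example the reserved
    padding/null/OOV IDs) are returned as missing values.
    """
    lookup = mapping_df[col]  # Series: encoded ID (index) -> original value
    return encoded.map(lookup)


encoded_ids = pd.Series([4, 3, 2, 5])  # 2 is the OOV bucket
decoded_ids = inverse_transform(encoded_ids, mapping_df, "item_id")
print(decoded_ids)
```

With a real dataset, mapping_df would instead be loaded with pd.read_parquet from the file in the categories folder.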

mustfkeskin · 2023-08-08T06:28:22Z

Thank you @rnyak
This solved my problem.
This is a stumbling block for newcomers like me, since the tutorials have no example of it.

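import pandas as pd

# Load the Categorify mapping for the query_sku column.
# The row index of this table is the encoded ID.
unique_query_sku_df = pd.read_parquet("../data/categories/categories/unique.query_sku.parquet")
unique_query_sku_df["index"] = unique_query_sku_df.index
unique_query_sku_df.head()

# query_embs_df is expected to already have an "index" column of encoded IDs.
# The inner join keeps only the rows whose encoded ID appears in the mapping.
query_embs_df = pd.merge(query_embs_df,
                         unique_query_sku_df,
                         how="inner",
                         on="index")
# Keep the original ID and the embedding, then rename the ID column.
query_embs_df = query_embs_df[["query_sku", "embeddings"]]
query_embs_df.columns = ["id", "embeddings"]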

hkristof03 · 2024-06-11T09:28:28Z

@rnyak would it be possible to include the reverse transformation of the item IDs as a built-in utility mechanism?

CarloNicolini · 2024-06-11T15:28:01Z

@rnyak would it be possible to include the reverse transformation of the item IDs as a built-in utility mechanism?

I also take advantage of the categories/unique.feature.parquet file

mustfkeskin added the "question" (Further information is requested) label Aug 4, 2023

mustfkeskin closed this as completed Aug 8, 2023

hkristof03 mentioned this issue Jun 11, 2024

[QST] How to use pretrained embeddings as features in DLRM? NVIDIA-Merlin/models#1013

Open
