
Feature/transf date #29

Merged
merged 3 commits into from
Sep 23, 2022
Conversation

armgilles
Contributor

Allow datetime column in Eurybia

Reference Issues/PRs : #28

Signed-off-by: Gillesa <arm.gilles@gmail.com>
@ThomasBouche
Collaborator

Great!

Maybe your transformation should not live in the _analyze_consistency method, but run before it, to make the dataset transformation clearer.
You could add a method that performs all your transformations and execute it in compile.

Signed-off-by: Gillesa <arm.gilles@gmail.com>
@armgilles
Contributor Author

  • Create a method to check columns datetime before _analyze_consistency
  • Fix duplicate name in test.
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from eurybia.data.data_loader import data_loading
from eurybia import SmartDrift


house_df, house_dict = data_loading('house_prices')
house_df_learning = house_df.loc[house_df['YrSold'] == 2006]
house_df_2007 = house_df.loc[house_df['YrSold'] == 2007]

y_df_learning = house_df_learning['SalePrice'].to_frame()
X_df_learning = house_df_learning[house_df_learning.columns.difference(['SalePrice', 'YrSold'])]
y_df_2007 = house_df_2007['SalePrice'].to_frame()
X_df_2007 = house_df_2007[house_df_2007.columns.difference(['SalePrice', 'YrSold'])]

# Create random datetime columns
X_df_learning['random_col_date'] = np.random.choice(pd.date_range(start='2000-01-01', end='2006-12-31'), size=len(X_df_learning))
X_df_learning['other_random_col_date'] = np.random.choice(pd.date_range(start='2000-01-01', end='2006-12-31'), size=len(X_df_learning))

X_df_2007['random_col_date'] = np.random.choice(pd.date_range(start='2007-01-01', end='2007-12-31'), size=len(X_df_2007))
X_df_2007['other_random_col_date'] = np.random.choice(pd.date_range(start='2007-01-01', end='2007-12-31'), size=len(X_df_2007))

# Just a random model
regressor = RandomForestRegressor(n_estimators=2).fit(X_df_learning[['1stFlrSF', '2ndFlrSF']],
                                                      y_df_learning['SalePrice'])

# Should be OK & inform the user of the transformation
SD = SmartDrift(df_current=X_df_2007,
                df_baseline=X_df_learning,
                dataset_names={"df_current": "2007 dataset", "df_baseline": "Learning dataset"}
               )
SD.compile()
# Column random_col_date will be dropped and transformed in df_current by : random_col_date_year, random_col_date_month, random_col_date_day
# Column other_random_col_date will be dropped and transformed in df_current by : other_random_col_date_year, other_random_col_date_month, other_random_col_date_day
# Column random_col_date will be dropped and transformed in df_baseline by : random_col_date_year, random_col_date_month, random_col_date_day
# Column other_random_col_date will be dropped and transformed in df_baseline by : other_random_col_date_year, other_random_col_date_month, other_random_col_date_day

# Should raise an error when a deployed_model is provided
SD = SmartDrift(df_current=X_df_2007,
                df_baseline=X_df_learning,
                deployed_model=regressor,
                dataset_names={"df_current": "2007 dataset", "df_baseline": "Learning dataset"}
               )
SD.compile()
# TypeError: df_current have datetime column. You should drop it
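The transformation logged by compile() above (drop each datetime column and replace it with _year/_month/_day columns) can be sketched in plain pandas; this is an illustrative stand-alone helper, not Eurybia's actual implementation:

```python
import pandas as pd

def expand_datetime_columns(df):
    # Hypothetical helper mirroring the behaviour described in the
    # compile() log: split every datetime column into year/month/day
    # columns and drop the original
    df = df.copy()
    for col in df.select_dtypes(include=["datetime64[ns]"]).columns:
        df[f"{col}_year"] = df[col].dt.year
        df[f"{col}_month"] = df[col].dt.month
        df[f"{col}_day"] = df[col].dt.day
        df = df.drop(columns=[col])
    return df

df = pd.DataFrame({"d": pd.to_datetime(["2007-01-15", "2007-06-30"]), "x": [1, 2]})
out = expand_datetime_columns(df)
print(sorted(out.columns))  # ['d_day', 'd_month', 'd_year', 'x']
```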

@ThomasBouche
Collaborator

Great, thanks for your contribution!

@ThomasBouche ThomasBouche merged commit f60be26 into MAIF:master Sep 23, 2022