
A tool for CTR (click-through rate) prediction, including FM, FFM, NFFM and so on.


guoday/ctrNet-tool


1.Introduction

This is a tool for CTR (click-through rate) prediction, covering FM, FFM, NFFM, xDeepFM and so on.

Note: only FM, FFM and NFFM are implemented so far. More details and other models will be added later.

2.Requirements

  • python3
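  • scikit-learn (imported as sklearn)
  • TensorFlow>=1.6 (1.x only: the Quick Start uses tf.contrib, which was removed in TensorFlow 2.0)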

3.Kernel for NFFM

A Kaggle kernel for NFFM is available at: https://www.kaggle.com/guoday/nffm-baseline-0-690-on-lb

4.Kernel for xDeepFM

A Kaggle kernel for xDeepFM is available at: https://www.kaggle.com/guoday/xdeepfm-baseline

5.Quick Start

Loading dataset

import os

import pandas as pd
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

import ctrNet
from src import misc_utils as utils

# Tab-separated file without a header row: one label column followed by 39 feature columns.
features=['f'+str(i) for i in range(39)]
train_df=pd.read_csv('data/train_small.txt',header=None,sep='\t')
train_df.columns=['label']+features
# Hold out 10% of the rows, then split that holdout evenly into dev and test sets.
train_df, dev_df = train_test_split(train_df,test_size=0.1, random_state=2019)
dev_df, test_df = train_test_split(dev_df,test_size=0.5, random_state=2019)
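
The two split calls above leave roughly 90% of the rows for training and 5% each for dev and test. A minimal standard-library sketch of that two-stage split follows, for illustration only: two_stage_split and its parameters are hypothetical names, and its shuffle does not reproduce the exact rows that scikit-learn selects.

```python
import math
import random


def two_stage_split(rows, holdout_frac=0.1, test_frac=0.5, seed=2019):
    """Shuffle rows, hold out a fraction, then split the holdout into dev/test.

    Hypothetical helper: it mirrors the proportions of the Quick Start split,
    not scikit-learn's actual shuffling.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)

    # Stage 1: carve the holdout off the front of the shuffled rows.
    n_holdout = math.ceil(len(shuffled) * holdout_frac)
    holdout, train = shuffled[:n_holdout], shuffled[n_holdout:]

    # Stage 2: split the holdout into test and dev parts.
    n_test = math.ceil(len(holdout) * test_frac)
    test, dev = holdout[:n_test], holdout[n_test:]
    return train, dev, test


train, dev, test = two_stage_split(range(1000))
print(len(train), len(dev), len(test))  # 900 50 50
```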

Creating hparams

hparam=tf.contrib.training.HParams(
            model='ffm', #['fm','ffm','nffm']
            k=16,
            hash_ids=int(1e5),
            batch_size=64,
            optimizer="adam", #['adadelta','adagrad','sgd','adam','ftrl','gd','padagrad','pgd','rmsprop']
            learning_rate=0.0002,
            num_display_steps=100,
            num_eval_steps=1000,
            epoch=3,
            metric='auc', #['auc','logloss']
            init_method='uniform', #['tnormal','uniform','normal','xavier_normal','xavier_uniform','he_normal','he_uniform']
            init_value=0.1,
            feature_nums=len(features))
utils.print_hparams(hparam)
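
The hash_ids setting suggests that feature values are hashed into a fixed number of buckets (the hashing trick), so that arbitrary categorical values map to a bounded ID space. The sketch below shows that idea in isolation. It is an assumption about what hash_ids means: hash_feature, its arguments, and the use of zlib.crc32 are hypothetical and are not taken from ctrNet's code.

```python
import zlib


def hash_feature(field, value, num_buckets=int(1e5)):
    """Map a (field, value) pair to a bucket ID in [0, num_buckets).

    Hypothetical helper illustrating the hashing trick; ctrNet may hash
    features differently.
    """
    # Prefix the value with its field name so equal values in different
    # fields are hashed as different keys (they may still collide).
    key = "{}={}".format(field, value).encode("utf-8")
    # crc32 is deterministic across runs, unlike Python's built-in hash().
    return zlib.crc32(key) % num_buckets


bucket = hash_feature("f0", "05db9164")
print(0 <= bucket < int(1e5))  # True
```

A larger bucket count reduces collisions between distinct values at the cost of a larger embedding table.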

Building model

os.environ["CUDA_DEVICE_ORDER"]='PCI_BUS_ID'
os.environ["CUDA_VISIBLE_DEVICES"]='0'
model=ctrNet.build_model(hparam)
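
For orientation, the score that a factorization machine (FM) computes can be sketched in plain Python. This follows the standard FM formulation (a bias, a linear term, and pairwise interactions through k-dimensional latent vectors), not ctrNet's source; fm_score and its argument names are hypothetical, and reading the k hyperparameter as the latent dimension is an assumption.

```python
def fm_score(x, w0, w, v):
    """Standard FM score for one example.

    x:  feature values, length n
    w0: global bias
    w:  linear weights, length n
    v:  latent vectors, n rows of length k
    """
    n, k = len(x), len(v[0])
    linear = w0 + sum(w[i] * x[i] for i in range(n))

    # Pairwise term via the O(n*k) identity:
    # sum_{i<j} <v_i, v_j> x_i x_j
    #   = 0.5 * sum_f ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)
    pairwise = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise


# Features 0 and 2 are active, so the only interaction is <v0, v2> = 1*5 + 2*6 = 17.
score = fm_score(x=[1.0, 0.0, 1.0], w0=0.1, w=[0.2, 0.3, 0.4],
                 v=[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(score)  # approximately 17.7
```

FFM extends this by giving each feature a separate latent vector per field of the feature it interacts with.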

Training model

# Press Ctrl-C to stop training early if the model stops improving.
model.train(train_data=(train_df[features],train_df['label']),
            dev_data=(dev_df[features],dev_df['label']))

Testing model

from sklearn import metrics
# infer() takes the evaluation set through its dev_data argument.
preds=model.infer(dev_data=(test_df[features],test_df['label']))
# AUC on the test set, with label 1 as the positive class.
auc=metrics.roc_auc_score(test_df['label'], preds)
print(auc)
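
The AUC printed above can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. A minimal pure-Python sketch of that definition is shown below; auc_score is a hypothetical helper for illustration, and it is quadratic in the number of examples, so scikit-learn remains the better choice for real data.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 0/1; ties between a positive and a negative count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Three of the four positive/negative pairs are ordered correctly.
print(auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```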
