This repository has been archived by the owner on Jun 22, 2022. It is now read-only.

LightGBM with row aggregations

Kamil A. Kaczmarek edited this page Jul 10, 2018 · 2 revisions

dromedary camel 🐪

Features

The recipe for row aggregations is simple: take a single row from the dataset and calculate several summary statistics of that row, for example mean, max, min, std, count_non_zero and fraction_non_zero. These aggregations are implemented in feature_extraction.py:L111.
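As an illustration only, the statistics listed above can be sketched for a single row with the Python standard library. This is not the code from feature_extraction.py: the function name `row_aggregations` and the choice of population standard deviation are assumptions made for the example.

```python
import statistics


def row_aggregations(row):
    """Return summary statistics for one row of numeric values.

    Hypothetical helper for illustration; the project's own
    implementation lives in feature_extraction.py.
    """
    non_zero = sum(1 for value in row if value != 0)
    return {
        "mean": statistics.fmean(row),
        "max": max(row),
        "min": min(row),
        # Assumption: population standard deviation (ddof=0).
        "std": statistics.pstdev(row),
        "count_non_zero": non_zero,
        "fraction_non_zero": non_zero / len(row),
    }


row = [0.0, 2.0, 0.0, 4.0]
features = row_aggregations(row)
print(features)
```

On a full dataset the same statistics would more likely be computed in a vectorized way across all rows at once; the per-row loop here is only meant to make the recipe explicit.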

In future solutions we will add many more aggregations.

Model

LightGBM with our steppy-style wrapper.

Results

  • LightGBM on row aggregations data: 1.36 CV and 1.48 LB
  • LightGBM with both raw features and row aggregations: 1.35 CV and 1.41 LB 🏆

Combining raw features with row aggregations improved both results: CV went from 1.36 to 1.35 and LB from 1.48 to 1.41.
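A minimal sketch of what "raw features plus row aggregations" could look like as a feature matrix, assuming the aggregates are simply appended as extra columns to each row. The helper `add_row_aggregations` and the reduced set of aggregates are hypothetical and chosen only to keep the example short.

```python
def add_row_aggregations(rows):
    """Append a few row-level aggregates to each row's raw values.

    Hypothetical helper: column order and the choice of aggregates
    are assumptions, not the project's actual pipeline.
    """
    combined = []
    for row in rows:
        non_zero = sum(1 for value in row if value != 0)
        aggregates = [sum(row) / len(row), max(row), min(row), non_zero]
        # Keep the raw values and add the aggregates as new columns.
        combined.append(list(row) + aggregates)
    return combined


raw = [[0.0, 2.0, 4.0], [1.0, 0.0, 0.0]]
combined = add_row_aggregations(raw)
print(len(raw[0]), "raw columns ->", len(combined[0]), "columns in total")
```

The resulting matrix is what would then be passed to the model in place of the raw features alone.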

Pipeline diagram

pipeline-solution-3