Sigma Bug in Development Estimator #463

Closed
chrisbooden opened this issue Sep 18, 2023 · 3 comments

chrisbooden commented Sep 18, 2023

Describe the bug
When estimating standard errors, it is common to exclude historical years or link ratios that are extreme outliers. When link ratios are excluded through the Development estimator's drop parameter, the excluded ratios are correctly left out of the sigma calculation, but they still seem to be counted in n for the 1/(n-1) factor.

In our company data we often exclude a large number of historical years because they are not reflective of current business, so this bug leaves the resulting sigma values too low.
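To make the effect concrete, here is a minimal NumPy-only sketch of the behaviour described above. It is not chainladder's actual code: the function `mack_sigma`, its `count_dropped` flag, and the claim amounts are all hypothetical and only illustrate a volume-weighted Mack-style sigma for a single development period, where the only difference between the two results is whether dropped link ratios are still counted in n.

```python
import numpy as np


def mack_sigma(c_prev, c_next, keep, count_dropped=False):
    """Volume-weighted Mack-style sigma for one development period.

    keep          -- boolean mask of the link ratios included in the fit
    count_dropped -- hypothetical flag: if True, dropped ratios still count
                     towards n in the 1/(n-1) factor (the reported behaviour)
    """
    c_prev = np.asarray(c_prev, dtype=float)
    c_next = np.asarray(c_next, dtype=float)
    keep = np.asarray(keep, dtype=bool)

    ratios = c_next / c_prev
    weights = c_prev * keep  # dropped ratios get zero weight
    f_hat = (weights * ratios).sum() / weights.sum()
    squared_dev = (weights * (ratios - f_hat) ** 2).sum()

    n = len(ratios) if count_dropped else int(keep.sum())
    return np.sqrt(squared_dev / (n - 1))


# Made-up cumulative amounts for five origin years (illustration only)
c_prev = np.array([1000.0, 1200.0, 900.0, 1100.0, 1050.0])
c_next = np.array([1500.0, 1740.0, 1440.0, 1600.0, 1620.0])
keep = np.array([False, True, True, True, True])  # drop the first origin year

sigma_buggy = mack_sigma(c_prev, c_next, keep, count_dropped=True)
sigma_correct = mack_sigma(c_prev, c_next, keep)

# Same calculation after physically removing the first origin year
sigma_subset = mack_sigma(c_prev[1:], c_next[1:], keep[1:])

print("n includes dropped ratio:", sigma_buggy)
print("n excludes dropped ratio:", sigma_correct)
print("first year removed from data:", sigma_subset)
```

In this sketch the sum of squared deviations is identical in both cases, so the understatement is purely the factor sqrt((n_kept - 1) / (n_all - 1)), and it grows as more ratios are dropped.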

To Reproduce

import chainladder as cl

tri = cl.load_sample('raa')

tri_dev = cl.Development(drop = [('1981',12),('1981',24),('1981',36),('1981',48),('1981',60),('1981',72),('1981',84),('1981',96),('1981',108)]).fit_transform(tri)

display(tri_dev.sigma_)


Expected behavior

import chainladder as cl
import pandas as pd

# Path is relative to the root of the chainladder repository
df = pd.read_csv('chainladder/utils/data/raa.csv')
df = df[df["origin"] > 1981]

tri_2 = cl.Triangle(
    df,
    origin="origin",
    development="development",
    columns="values",
    cumulative=True
)

tri_dev_2 = cl.Development().fit_transform(tri_2)

display(tri_dev_2.sigma_)


Or as a unit test:

# Unit test for checking sigma values in the Development estimator
import chainladder as cl
import pandas as pd  # needed for pd.read_csv below

def test_dev_sigma():
    # Method 1 for estimating sigma by excluding the first origin year 
    tri = cl.load_sample('raa')
    tri_dev = cl.Development(drop = [('1981',12),('1981',24),('1981',36),('1981',48),('1981',60),('1981',72),('1981',84),('1981',96),('1981',108)]).fit_transform(tri)

    # Remove the interpolated last value and the prev value (as this will be interpolated by the next method)
    sigma_1 = tri_dev.sigma_.iloc[0,0,0,:-2].values

    # Method 2 for estimating sigma by excluding the first origin year from the original data set
    df = pd.read_csv('chainladder/utils/data/raa.csv')
    df = df[df["origin"] > 1981]

    tri_2 = cl.Triangle(
        df,
        origin="origin",
        development="development",
        columns="values",
        cumulative=True
    )

    tri_dev_2 = cl.Development().fit_transform(tri_2)
    
    # Remove the interpolated last value (now array will have same length as method 1)
    sigma_2 = tri_dev_2.sigma_.iloc[0,0,0,:-1].values

    # Take difference and convert to a single list, round to a suitable value. Expected diffs are zero
    diff_sigma = [round(y,4) for y in (sigma_1 - sigma_2)[0][0][0]]

    # Expected diffs
    zeros = [0 for i in range(len(diff_sigma))]
    
    assert diff_sigma == zeros

Desktop (please complete the following information):

  • Numpy Version 1.21
  • Pandas Version 2.03
  • Chainladder Version 0.9.0
chrisbooden changed the title from "[BUG]" to "Sigma Bug in Development Estimator" on Sep 18, 2023
jbogaardt (Collaborator) commented

Wow, I'm surprised that this defect exists. This is a super helpful bug report and unit test. Thank you for identifying it, @chrisbooden. We'll prioritize it for the next release.

jbogaardt pushed a commit that referenced this issue Sep 19, 2023
jbogaardt (Collaborator) commented

Released in v0.8.18

chrisbooden (Author) commented

Top man. Just re-tested this in the new release and can confirm it's resolved.
