#864 development constant tail bug#868
Conversation
…eds to be reduced by the tail
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #868 +/- ##
==========================================
+ Coverage 88.28% 88.39% +0.10%
==========================================
Files 88 88
Lines 5029 5075 +46
Branches 642 652 +10
==========================================
+ Hits 4440 4486 +46
Misses 444 444
Partials 145 145
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
…nto #864_DevelopmentConstant_Tail_Bug
| else: | ||
| raise ValueError("callable axis needs to be 0 or 1") | ||
|
|
||
| patterns = self.patterns(rows.iloc[0]) |
There was a problem hiding this comment.
Callable path determines shape from first row only
Low Severity
When patterns is callable, self.patterns(rows.iloc[0]) is called to determine include_last and dev_slice based on the first row's pattern length. Each subsequent row is then independently processed in _callable_row, which may produce a different row_tail_cdf. If different rows return patterns of different lengths, the obj skeleton shape (determined solely by the first row) may be inappropriate for other rows — for example, if the first row's pattern is short (include_last=False) but another row's is long, the obj.ddims will have one fewer period than that row needs.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 6f28c21. Configure here.
| ) | ||
| assert np.all( | ||
| np.round(result.ldf_.to_frame().values.flatten(), 6) | ||
| == np.array([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.1]) |
There was a problem hiding this comment.
this doesn't match "reported_pattern"
There was a problem hiding this comment.
It's not supposed to be, the reported_pattern is in CDF form, but the LHS/RHS check here is in LDF form.
There was a problem hiding this comment.
Let me know if you agree lol.
I had to be super careful, made many mistakes before. Just double check to see if you are aligned.
There was a problem hiding this comment.
the 1.1 comes at the 10th in the returned ldf_, instead of the 11th element implied by the original pattern
There was a problem hiding this comment.
Because the data object doesn't have the 11th origin period?
There was a problem hiding this comment.
right. so we have an issue here because the ldf_ returns changes, beyond filling with additional 1.0's, depending on how large the triangle is. in theory, we would want DevelopmentConstant().fit(trI_9x9).ldf_[:-1] to be equal to DevelopmentConstant().fit(trI_8x8).ldf_ but that's not happening
There was a problem hiding this comment.
Let me reply you on the main thread.
| 1.1, | ||
| 1.1, | ||
| 1.1, | ||
| 1.1**2, |
There was a problem hiding this comment.
this doesn't match reported_patterns
There was a problem hiding this comment.
It does? The only thing that's different is the last LDF, those need to be grouped, or it will be incorrectly discarded.
There was a problem hiding this comment.
there's no 1.21 in the original ldf pattern.
There was a problem hiding this comment.
Yes because the pattern extends beyond what is needed by the data object. Are you suggesting that you just discard the last one?
There was a problem hiding this comment.
no, the PR is what's discarding the last ldf. i'm saying we need to keep the last ldf, that would make it consistent with Development
raa = cl.load_sample('raa')
dev = cl.Development().fit(raa)
raa_1987 = raa[(raa.valuation <= '1988-01-01')]
print(dev.transform(raa).ldf_)
print(dev.transform(raa_1987).ldf_) 12-24 24-36 36-48 48-60 60-72 72-84 84-96 96-108 108-120
(All) 2.999359 1.623523 1.270888 1.171675 1.113385 1.041935 1.033264 1.016936 1.009217
12-24 24-36 36-48 48-60 60-72 72-84 84-96 96-108 108-120
(All) 2.999359 1.623523 1.270888 1.171675 1.113385 1.041935 1.033264 1.016936 1.009217
So I think the above is only true if the pattern supplied is shorter than a length of 7. Try this example, and comment/uncomment the pattern length, everything is working as expected to me. When pattern is raa10x10 = cl.load_sample("raa")
raa9x9 = raa10x10[raa10x10.valuation<'1990']
reported_patterns = {
12: 1.1,
24: 1.1,
36: 1.1,
48: 1.1,
60: 1.1,
72: 1.1,
84: 1.1,
96: 1.1,
108: 1.1,
# 120: 1.1,
# 132: 1.1,
}
raa10x10_ldf_ = cl.DevelopmentConstant(
patterns=reported_patterns, style="ldf"
).fit_transform(raa10x10).ldf_
raa9x9_ldf_ = cl.DevelopmentConstant(
patterns=reported_patterns, style="ldf"
).fit_transform(raa9x9).ldf_
print(raa10x10_ldf_)
print(raa9x9_ldf_)In raa10x10 = cl.load_sample("raa")
raa9x9 = raa10x10[raa10x10.valuation<'1990']
reported_patterns = {
12: 1.1,
24: 1.1,
36: 1.1,
48: 1.1,
60: 1.1,
72: 1.1,
84: 1.1,
96: 1.1,
108: 1.1,
# 120: 1.1,
# 132: 1.1,
}
raa10x10_ldf_ = cl.DevelopmentConstant(
patterns=reported_patterns, style="cdf"
).fit_transform(raa10x10).ldf_
raa9x9_ldf_ = cl.DevelopmentConstant(
patterns=reported_patterns, style="cdf"
).fit_transform(raa9x9).ldf_
print(raa10x10_ldf_)
print(raa9x9_ldf_) |
|
Sorry, but I am not following your question/feedback, but can you give me an example of where you think my implementation doesn't give the correct answer, and what the correct answer should be? |
right. when the supplied ldf or cdf is longer, the resulting ldf_ is distorted. right now you are forcing those tests to pass by comparing the resulting ldf_ to something other than what was supplied. |
this implementation doesn't give the right answer when the supplied pattern is longer. the correct answer should replicate the supplied ldf or the implied ldf of the suppled cdf exactly (not counting factors of 1 to fill the space). basically, if elsewhere in the package we want stuff like raa = cl.load_sample('raa')
dev = cl.Development().fit(raa)
raa_1987 = raa[(raa.valuation <= '1988-01-01')]
print(dev.transform(raa).ldf_)
print(dev.transform(raa_1987).ldf_) |
I don't think I agree with this. You are saying to just discard the extra LDF pattern beyond the triangle object? Look at this example: raa10x10 = cl.load_sample("raa")
raa9x9 = raa10x10[raa10x10.valuation<'1990']
reported_patterns = {
12: 1.1,
24: 1.1,
36: 1.1,
48: 1.1,
60: 1.1,
72: 1.1,
84: 1.1,
96: 1.1,
108: 1.1, #9th LDF, or if this is a CDF, it would've been 1.21
120: 1.1, #10th's LDF, not CDF
# 132: 1.1,
}
raa10x10_ldf_ = cl.DevelopmentConstant(
patterns=reported_patterns, style="ldf"
).fit_transform(raa10x10).ldf_
raa9x9_ldf_ = cl.DevelopmentConstant(
patterns=reported_patterns, style="ldf"
).fit_transform(raa9x9).ldf_
print(raa10x10_ldf_)
print(raa9x9_ldf_)Returns: 12-24 24-36 36-48 48-60 60-72 72-84 84-96 96-108 108-120 120-132
(All) 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1
12-24 24-36 36-48 48-60 60-72 72-84 84-96 96-108 108-120
(All) 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.21
# ^ this 9th's LDF should
# be 1.1 instead of 1.21You are saying, we should just discard the remaining
This is different, in this example, you are estimating patterns using a set of data, and estimating another pattern using a subset of that data, it should be clear that the pattern estimated using the subset lacks something (i.e. the tail). Let me try to se if I can convince you with another example: Do you agree that these two patterns are the same? DC_LDF = cl.DevelopmentConstant(
patterns={
12: 1.1,
24: 1.1,
36: 1.1,
48: 1.1,
60: 1.1,
72: 1.1,
84: 1.1,
96: 1.1,
108: 1.1,
120: 1.1,
}, style="ldf"
)
DC_CDF = cl.DevelopmentConstant(
patterns={
12: 1.1**10,
24: 1.1**9,
36: 1.1**8,
48: 1.1**7,
60: 1.1**6,
72: 1.1**5,
84: 1.1**4,
96: 1.1**3,
108: 1.1**2,
120: 1.1,
}, style="cdf"
)If so, then you should get the same CDF, not LDF, no matter what triangle object you fit on. Again, in my opinion, one of the current implementation flaw that I think is that if the pattern provided is shorter, and it is in LDF form, the extra pattern is basically discarded. My PR fixes that. |
…nto #864_DevelopmentConstant_Tail_Bug
|
@henrydingliu do you agree with my latest post? I am trying to resolve the last bug bot comment. |
|
@henrydingliu do you have notifications muted? 😜 |
|
sorry, everything sends a notification and this got buried in the pile. |
i think we both want the same outcome. it's actually this implementation that isn't following your latest post. this implementation is discarding LDFs . it's discarding LDFs here and here. i'm trying to say that it shouldn't be discarding anything. |
…nto #864_DevelopmentConstant_Tail_Bug
|
Ok I think I finally understood where the misalignment is after re-reading many times because I definitely said some weird stuff from misreading lol... Sorry! But let me restate the problem first, and we can decide where to go. I actually don't think there's ANY bug in this implementation, but the question is about how it's implemented. Let me know if you disagree. The different view is actually simple, it's how we treat My proposal and the current PR is that everything that goes beyond the triangle object are consolidated into a single tail CDF, and this CDF is assigned to the last age. @henrydingliu's proposal is that we just assign the Consider this example: raa10x10 = cl.load_sample("raa")
raa5x5 = raa10x10[raa10x10.valuation<'1986']
patterns_len_10 = {
12: 1.1,
24: 1.1,
36: 1.1,
48: 1.1,
60: 1.1,
72: 1.1,
84: 1.1,
96: 1.1,
108: 1.1,
120: 1.1,
}
raa10x10_factors = cl.DevelopmentConstant(
patterns=patterns_len_10, style="ldf"
).fit_transform(raa10x10)
raa5x5_factors = cl.DevelopmentConstant(
patterns=patterns_len_10, style="ldf"
).fit_transform(raa5x5)
print("Full tri")
print(raa10x10_factors.cdf_)
print(raa10x10_factors.ldf_)
print("\nSmall tri")
print(raa5x5_factors.cdf_)
print(raa5x5_factors.ldf_)My implemntation returns: Full tri
12-Ult 24-Ult 36-Ult 48-Ult 60-Ult 72-Ult 84-Ult 96-Ult 108-Ult 120-Ult
(All) 2.593742 2.357948 2.143589 1.948717 1.771561 1.61051 1.4641 1.331 1.21 1.1
12-24 24-36 36-48 48-60 60-72 72-84 84-96 96-108 108-120 120-132
(All) 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1
Small tri
12-Ult 24-Ult 36-Ult 48-Ult 60-Ult
(All) 2.593742 2.357948 2.143589 1.948717 1.771561
12-24 24-36 36-48 48-60 60-72
(All) 1.1 1.1 1.1 1.1 1.771561Note the But I think @henrydingliu you wanted Full tri
12-Ult 24-Ult 36-Ult 48-Ult 60-Ult 72-Ult 84-Ult 96-Ult 108-Ult 120-Ult
(All) 2.593742 2.357948 2.143589 1.948717 1.771561 1.61051 1.4641 1.331 1.21 1.1
12-24 24-36 36-48 48-60 60-72 72-84 84-96 96-108 108-120 120-132
(All) 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1
Small tri
12-Ult 24-Ult 36-Ult 48-Ult 60-Ult 72-Ult 84-Ult 96-Ult 108-Ult 120-Ult
(All) 2.593742 2.357948 2.143589 1.948717 1.771561 1.61051 1.4641 1.331 1.21 1.1
12-24 24-36 36-48 48-60 60-72 72-84 84-96 96-108 108-120 120-132
(All) 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1 1.1Basically, the I don't think there's any other disagreement except this one. @henrydingliu do you agree with the summary above? Once we are aligned here, we can discuss which one to go with. I don't think there's any other misalignment. By the way, I think it is time we move forward with test based development or whatever it's called. We can break PRs into two parts, the tests, and the implementation. That way we don't have misalignment on the implementation and having to go back and redo. This is basically the problem of not having good spec sheet. I'll add this to |
yessir
i do think test-based development is a good idea in general. a couple of concerns based on our current scenario
|
Ya all good point, I might just try to do this personally myself. It's something I learned in school but never practiced. I might give it a try.
Ok perfect!! So my implementation, even though it looks complex in terms of the algorithm, it's actually simpler in implementation. Your suggestion that we extend beyond the raa = cl.load_sample('raa')
dev = cl.Development().fit(raa)
raa_1987 = raa[(raa.valuation <= '1988-01-01')]
print(dev.transform(raa).ldf_)
print(dev.transform(raa_1987).ldf_)In the above example, the "triangle" (pattern) is already built on the full grid, and we are just bringing that full grid into the smaller triangle, so the ddim is already there. I do think your implementation has one benefit though, is that we can get the next 12 month expected emergence. But even with that, I still think we should merge this as is given that there's not a bug here (so I can unblock my Friedland recon work) and we can consider opening a new ticket for this feature. Thoughts? |
|
you know, this whole thing started when you said that you weren't entirely happy with this, and it's probably introducing future tech debt.
my preference is to not use the result as the rationale to compromise on a principle. i provided this example to illustrate that the existing behavior within the development estimators is to let the full pattern ride. we can definitely deviate from that behavior if that behavior doesn't make sense in a given situation. but we don't seem to be making an argument for that. so do you feel good justifying a deviation in package behavior with "well cuz it'd be hard otherwise..."
unfortunately this is the one place I disagree with you on. to me, this truncation of the LDF is indeed a bug, because it's altering the input data. _you cannot get back the original input pattern using the newly created you can proceed with your friedland work by adding a tailconstant to the existing developmentconstant.
that was just to show that it's possible. my preferred implementation would be a complete rewrite, and pivoting to creating dummy triangles using the supplied pattern and strategically placed 1's to leverage all the robustness this package has around |
|
Ya I think I agree with you now, you convinced me. What's the idea of using the fake triangles? So you create a new triangle with the right shape with the latest diagonal just being 1s? And then the 2nd to last is whatever works out to the correct LDF/CDF? |
…nto #864_DevelopmentConstant_Tail_Bug
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 3 total unresolved issues (including 1 from previous review).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 5fdf5d8. Configure here.
| for k in sorted_keys[:n_dev_periods]: | ||
| cdf_patterns[int(k)] /= tail_cdf | ||
|
|
||
| return cdf_patterns, tail_cdf |
There was a problem hiding this comment.
Tail index depends on triangle
High Severity
When the external pattern has more ages than the triangle, _prepare_cdf_patterns picks the tail using sorted_keys[n_dev_periods] and rebases only the first n_dev_periods pattern keys, then multiplies the last fitted ldf_ by that tail. Changing tri_dev_periods moves which pattern age is treated as tail, so the same pattern can yield different ldf_ (including the last link ratio) across triangle shapes instead of preserving supplied LDF/CDF values beyond filling with 1.0.
Additional Locations (2)
Reviewed by Cursor Bugbot for commit 5fdf5d8. Configure here.
| dev_slice = slice(None) if include_last else slice(None, -1) | ||
|
|
||
| # this is the object to fill out the patterns, skeleton frame | ||
| obj = obj.iloc[..., :1, dev_slice] * 0 + 1 |
There was a problem hiding this comment.
Short pattern drops dev column
Medium Severity
If the supplied pattern has fewer ages than the triangle, the code warns that missing ages will be filled with 1.0, but sets include_last=False and uses dev_slice=slice(None, -1), shrinking the skeleton so ldf_ has one fewer development column than the triangle. Missing ages are not represented on the full triangle grain.
Reviewed by Cursor Bugbot for commit 5fdf5d8. Configure here.


Summary of Changes
Addressed two bugs in the DevelopmentConstant()
Related GitHub Issue(s)
Fixes #864
Additional Context for Reviewers
This PR fixes both bugs, even though only 1 is reported on #864.
There was also an old bug(?) in
test_constant_callable_axis1, not sure whypatterns.valueshadpatterns.values[:, :-1]dropped the last value. This is corrected.uv run pytest) and documentation changes (uv run jb build docs --builder=custom --custom-builder=doctest)Note
Medium Risk
Changes actuarial development-pattern fitting used before reserving; behavior shifts for long/short/LDF patterns but impact is confined to DevelopmentConstant with broad new tests.
Overview
DevelopmentConstant.fitis reworked so external CDF/LDF patterns align correctly with triangle development length, including tails beyond the triangle.A new
_prepare_cdf_patternsconverts LDF inputs to CDFs, splits an extra-age tail factor, and rebases in-triangle CDFs.fituses that tail on the last LDF, chooses whether to keep the final dev period, fills short patterns with 1.0 (with aUserWarning), and applies the same preparation per row for callable patterns on index or columns.Tests add coverage for no-tail vs tail, exact/short/long patterns (CDF and LDF), incremental triangles, and fix
test_constant_callable_axis1to compare full CDF values.Reviewed by Cursor Bugbot for commit 5fdf5d8. Bugbot is set up for automated code reviews on this repo. Configure here.