fix: VarLookupDict __iter__ + DataFrame dtype compat (fixes #259, #232)#260
kouyouqi123 wants to merge 3 commits into
VarLookupDict is a dict-like object but does not implement __iter__, __len__, or keys(). This causes a CPython crash (via a KeyError: 0) when Python's C-level error display code tries to iterate over the locals dict during traceback formatting. Fixes pydata#259
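One plausible reading of the `KeyError: 0` above: when a class defines `__getitem__` but not `__iter__`, Python falls back to the legacy sequence protocol and calls `__getitem__(0)`, `__getitem__(1)`, and so on. A lookup that raises `KeyError` rather than `IndexError` then fails on the first step. The sketch below reproduces that behaviour with a hypothetical stand-in class (`LayeredLookup` and `IterableLayeredLookup` are illustrative names, not patsy's actual code), and shows how adding `keys()`, `__iter__`, and `__len__` avoids it.

```python
class LayeredLookup:
    """Hypothetical stand-in for a VarLookupDict-style object (not patsy's code)."""

    def __init__(self, dicts):
        self._dicts = list(dicts)

    def __getitem__(self, key):
        # Return the value from the first dict that contains the key.
        for d in self._dicts:
            if key in d:
                return d[key]
        raise KeyError(key)


class IterableLayeredLookup(LayeredLookup):
    """Same lookup, plus the mapping methods needed for safe iteration."""

    def keys(self):
        # Collect each key once, preserving first-seen order.
        seen = {}
        for d in self._dicts:
            for key in d:
                seen.setdefault(key, None)
        return list(seen)

    def __iter__(self):
        return iter(self.keys())

    def __len__(self):
        return len(self.keys())


def try_iterate(mapping):
    """Iterate over the object, reporting a KeyError instead of propagating it."""
    try:
        return list(mapping)
    except KeyError as exc:
        return "KeyError: {!r}".format(exc.args[0])


layers = [{"x": 1}, {"y": 2, "x": 3}]
broken_result = try_iterate(LayeredLookup(layers))
fixed = IterableLayeredLookup(layers)
fixed_result = try_iterate(fixed)

print(broken_result)  # the fallback protocol asks for index 0
print(fixed_result)   # the keys of all layers
```

Whether CPython's traceback display takes exactly this path is an assumption here; the sketch only demonstrates the general fallback rule.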
…me compat When _eval_factor receives a pandas DataFrame as result and tries to raise a PatsyError, it accesses result.dtype which doesn't exist on DataFrames (only .dtypes). This causes a confusing secondary AttributeError instead of the intended error message. Fixes pydata#232
for more information, see https://pre-commit.ci
Changes of this type are considered hard in patsy. Could you explain in more detail why this is experienced here when it does not come up in standard CPython?
Also, does this happen in standard CPython and simply go untested in patsy?
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #260 +/- ##
==========================================
+ Coverage 97.60% 98.20% +0.59%
==========================================
Files 30 30
Lines 3096 3114 +18
Branches 591 679 +88
==========================================
+ Hits 3022 3058 +36
+ Misses 39 35 -4
+ Partials 35 21 -14
I think I understand this issue now. Shouldn't the fix go either into the notebook environment or into CPython directly? These appear to be fully upstream of patsy. My hesitation comes from patsy being in maintenance mode, which has been interpreted to mean only fixes for changes in CPython that are essential for patsy to run. That said, statsmodels needs to get out the version where formulaic is available.
For this to have a realistic prospect (given the status of the bug and the maintenance mode of patsy), we will need 100% coverage. Ideally you could add a single test that triggers the error through the same API that caused this issue in the first place, so we can see that everything that has been added is hit.
Fixes #259: VarLookupDict missing __iter__
Added __iter__, __len__, and keys() to VarLookupDict to prevent a CPython crash during traceback formatting.

Fixes #232: DataFrame .dtype access in _eval_factor
Changed result.dtype to np.asarray(result).dtype so the error message works with both numpy arrays and pandas DataFrames.
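To illustrate the second fix, here is a minimal sketch showing why `result.dtype` fails on a pandas DataFrame while `np.asarray(result).dtype` works for both input types. The helper name `describe_dtype` is hypothetical and stands in for whatever code builds the error message; it is not a patsy function.

```python
import numpy as np
import pandas as pd


def describe_dtype(result):
    """Return a single dtype for array-likes, including DataFrames.

    Hypothetical helper: a DataFrame exposes per-column `.dtypes` but has no
    `.dtype` attribute, so the value is converted to an ndarray first.
    """
    return np.asarray(result).dtype


frame = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5]})
array = np.array([1.0, 2.0])

frame_has_dtype = hasattr(frame, "dtype")  # no column is named "dtype"
frame_dtype = describe_dtype(frame)        # int and float columns are upcast
array_dtype = describe_dtype(array)

print(frame_has_dtype, frame_dtype, array_dtype)
```

Note that the conversion reports the common dtype of all columns, so a frame mixing numeric and string columns would be reported as `object`.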