[ntuple][python][ATLAS experiment] Re-Implement context management pr…#22432
[ntuple][python][ATLAS experiment] Re-Implement context management pr…#22432rybkine wants to merge 1 commit into
Conversation
fcff6da to
c4e56cb
Compare
…otocol for RNTupleReader/Writer bindings/pyroot/pythonizations/python/ROOT/_pythonization/_rntuple.py: add __enter__ method - returns self (an instance of RNTupleReader/RNTupleWriter), __exit__ method - calls RNTupleReader/RNTupleWriter destructor (if not destructed yet). tree/ntuple/test/ntuple_basics.py: update tests
c4e56cb to
52bc377
Compare
|
@vepadulano @silverweed Your reviews would be useful |
|
Perhaps, to say the obvious - the proposed implementation is virtually the same as that of the Python file object. And this is exactly what is needed here. |
vepadulano
left a comment
There was a problem hiding this comment.
I understand the feature request, but I'm not sure deleting the current implementation is the right approach. This PR is modifying quite a few tests which were present before, which is a sign of major changes and thus need to be carefully evaluated. I believe @jblomer and ultimately @silverweed should say whether/how the feature request should be addressed.
|
It is not a feature request - it is rather an alternative implementation proposed, which closely follows the implementation in the Python file object. Hence, the changes made. |
I am not sure what you mean with "Python file object". I think at the current stage the changes are too drastic. I would be happy to review a PR that proposes to keep the functionality of current context manager as-is, while also adding the iterator capabilities on top of that. In such case, the current test suite should not be changed. New tests would then need to be added. |
https://docs.python.org/3/glossary.html#term-file-object
This is what an alternative implementation means - replacement. The current implementation is not needed.
The current context manager does not provide a useful functionality. Quite on the contrary. What it does is counterproductive - it interferes with the available RNTupleReder functionality, e.g., breaks iterability. The only sensible approach is to get rid of it altogether and simply add the context manager functionality in a transparent way. That is what the PR proposes. And that is how it is done for the Python file object for that matter https://github.com/python/cpython/blob/629da5c914b4407e01c1dc06cbcbd8dce825fef3/Lib/_pyio.py#L473-L490. root/bindings/pyroot/pythonizations/python/ROOT/_pythonization/_tfile.py Lines 98 to 136 in caf87c6
First of all, the changes to the tests reflect the crucial improvement of context management protocol implementation that the proposed PR brings. But they also do the testing in a more useful and natural way, e.g., they fill and read an ntuple with several events (rather than one) and test the iterability as well. They demonstrate in detail that NTupleReader/RNTupleWriter context managers are single use context managers - see test_singleuse_ctxmanager, using the standard Python terminology (rather than "weird", for example). They make use of more specific exceptions, e.g., |
|
Thanks for the pointer, this is a glossary term, there is no "Python file object" in general.
I disagree on both parts of this sentence.
I understand that is your opinion, but it cannot be used to claim that this PR is fixing an existing bug.
I disagree.
Just because the TFile context manager was implemented in a certain way it does not mean that the RNTupleReader/Writer context managers must be implemented in the same way. After all, TFile and RNTupleReader/Writer are different classes.
As I have already argued, this PR is not bringing any crucial improvements, rather an opinionated dismissal of existing implementation.
I agree that in general we should have tests on the Python side of reading/writing more than one event via RNTupleReader/Writer.
The proposed test changes are wrong in general. I appreciate that the stile of the testing can be improved, e.g. by using terms like "single use context manager" instead of weird and by adding a bit more context to the assertion messages. Everything else needs to be seriously reconsidered. Once more, this PR should be reviewed by the original author of the Pythonization of RNTupleReader/Writer just to evaluate if in principle the idea of providing the iterator protocol on the Python side is desirable or not. Every other decision will derive from this first one. Until then, there's nothing else to discuss. |
This term is as specific as we need here.
There is - anything returned by the Python built-in function
This PR restores iterability and thus does fix a bug, why can it not be affirmed?
Without arguments to support this disagreement what is it worth?
This does mean the RNTupleReader/Writer context managers must be implemented in way not inferior to the known implementations. This PR does so by virtually borrowing one (the Python file object implementation).
This PR is an improvement as it fixes an issue and has every right to dismiss the existing implementation simply because it proposes a superior implementation.
We are expected to be very specific when making this sort of statements. All the test changes in this PR are correct and relevant until shown otherwise.
The iterator protocol and anything else available on the C++ side (with very rare exceptions) are supposed to be available on the Python side. This is what the Python bindings to the ROOT framework written in C++ are about. |
…otocol for RNTupleReader/Writer
bindings/pyroot/pythonizations/python/ROOT/_pythonization/_rntuple.py: add
__enter__method - returns self (an instance of RNTupleReader/RNTupleWriter),__exit__method - calls RNTupleReader/RNTupleWriter destructor (if not destructed yet).tree/ntuple/test/ntuple_basics.py: update tests
This Pull request:
Changes or fixes:
Checklist:
This PR fixes #22431