
[ntuple][python][ATLAS experiment] Re-Implement context management pr…#22432

Open
rybkine wants to merge 1 commit into root-project:master from rybkine:master-ntuple-python-ctx-mgr

Conversation

@rybkine

@rybkine rybkine commented May 29, 2026

…otocol for RNTupleReader/Writer

bindings/pyroot/pythonizations/python/ROOT/_pythonization/_rntuple.py: add __enter__ method - returns self (an instance of RNTupleReader/RNTupleWriter), __exit__ method - calls RNTupleReader/RNTupleWriter destructor (if not destructed yet).
tree/ntuple/test/ntuple_basics.py: update tests
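
The protocol described above can be sketched without ROOT. `FakeReader` and its `release` method are hypothetical stand-ins for RNTupleReader/RNTupleWriter and their destructor call. This is not the code in `_rntuple.py`, only a minimal illustration of the pattern under that assumption.

```python
class FakeReader:
    """Hypothetical stand-in for RNTupleReader/RNTupleWriter (not ROOT code)."""

    def __init__(self, entries):
        self._entries = list(entries)
        self.released = False

    def release(self):
        # Stand-in for the destructor call; safe to call more than once.
        if not self.released:
            self._entries = None
            self.released = True

    def __enter__(self):
        # Return the object itself, so the `with` target is the reader.
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Release the resource if that has not happened yet.
        self.release()
        # Returning False lets any exception raised in the block propagate.
        return False


with FakeReader([1, 2, 3]) as reader:
    inside_is_released = reader.released
after_is_released = reader.released
```

Because `__enter__` returns `self`, every method of the object remains reachable through the `with` target, and `__exit__` only adds the cleanup step.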

This Pull request:

Changes or fixes:

Checklist:

  • tested changes locally
  • updated the docs (if necessary)

This PR fixes #22431

@rybkine rybkine force-pushed the master-ntuple-python-ctx-mgr branch from fcff6da to c4e56cb Compare May 29, 2026 21:08
@ferdymercury ferdymercury requested a review from silverweed May 30, 2026 13:05
@rybkine rybkine force-pushed the master-ntuple-python-ctx-mgr branch from c4e56cb to 52bc377 Compare May 30, 2026 17:36
@jblomer
Contributor

jblomer commented Jun 1, 2026

@vepadulano @silverweed Your reviews would be useful

@rybkine
Author

rybkine commented Jun 1, 2026

Perhaps, to state the obvious - the proposed implementation is virtually the same as that of the Python file object. And this is exactly what is needed here.

Member

@vepadulano vepadulano left a comment

I understand the feature request, but I'm not sure deleting the current implementation is the right approach. This PR modifies quite a few tests that were present before, which is a sign of major changes that need to be carefully evaluated. I believe @jblomer and ultimately @silverweed should say whether/how the feature request should be addressed.

@rybkine
Author

rybkine commented Jun 1, 2026

It is not a feature request - it is rather an alternative implementation proposed, which closely follows the implementation in the Python file object. Hence, the changes made.

@vepadulano
Member

closely follows the implementation in the Python file object

I am not sure what you mean by "Python file object".

I think at the current stage the changes are too drastic. I would be happy to review a PR that proposes to keep the functionality of the current context manager as-is, while also adding the iterator capabilities on top of that. In such a case, the current test suite should not be changed. New tests would then need to be added.

@rybkine
Author

rybkine commented Jun 2, 2026

I am not sure what you mean by "Python file object".

https://docs.python.org/3/glossary.html#term-file-object

I think at the current stage the changes are too drastic.

This is what an alternative implementation means - replacement. The current implementation is not needed.

I would be happy to review a PR that proposes to keep the functionality of the current context manager as-is, while also adding the iterator capabilities on top of that.

The current context manager does not provide any useful functionality. Quite the contrary: what it does is counterproductive - it interferes with the available RNTupleReader functionality, e.g., it breaks iterability. The only sensible approach is to get rid of it altogether and simply add the context manager functionality in a transparent way. That is what the PR proposes. And that is how it is done for the Python file object, for that matter: https://github.com/python/cpython/blob/629da5c914b4407e01c1dc06cbcbd8dce825fef3/Lib/_pyio.py#L473-L490.
Incidentally, the context manager is implemented almost the same way for the ROOT TFile as well:

def _TFileExit(obj, exc_type, exc_val, exc_tb):
    """
    Close the TFile object.
    Signature and return value are imposed by Python, see
    https://docs.python.org/3/library/stdtypes.html#typecontextmanager.
    """
    # A TFile might be storing references to objects retrieved by the user in
    # a cache. Make sure the cache is cleaned at exit time rather than having
    # to wait for the garbage collector.
    try:
        delattr(obj, "_cached_items")
    except AttributeError:
        pass
    obj.Close()
    return False


@pythonization('TFile')
def pythonize_tfile(klass):
    """
    TFile inherits from
    - TDirectory the pythonized attr syntax (__getattr__) and WriteObject method.
    - TDirectoryFile the pythonized Get method (pythonized only in Python)
    and defines the __enter__ and __exit__ methods to work as a context manager.
    """
    # Pythonizations for TFile::Open
    klass.Open.__creates__ = True
    klass._OriginalOpen = klass.Open
    klass.Open = classmethod(_TFileOpen)

    # Pythonization for TFile constructor
    klass._OriginalConstructor = klass.__init__
    klass.__init__ = _TFileConstructor

    # Pythonization for __enter__ and __exit__ methods
    # These make TFile usable in a `with` statement as a context manager
    klass.__enter__ = lambda tfile: tfile
    klass.__exit__ = _TFileExit

In such a case, the current test suite should not be changed. New tests would then need to be added.

First of all, the changes to the tests reflect the crucial improvement of the context management protocol implementation that the proposed PR brings. But they also do the testing in a more useful and natural way, e.g., they fill and read an ntuple with several events (rather than one) and test the iterability as well. They demonstrate in detail that the RNTupleReader/RNTupleWriter context managers are single use context managers - see test_singleuse_ctxmanager, which uses the standard Python terminology (rather than "weird", for example). They make use of more specific exceptions, e.g., ROOT.RException instead of Exception, or ReferenceError instead of RuntimeError. They also make better use of the unittest functionality, in particular by moving all the code comments into the test assertion messages (which will be displayed in case of an error or failure). All in all, the proposed changes to the tests are also a major improvement. And they are indispensable as they are.
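
The testing style described above (a single use context manager, a specific exception type, comments moved into assertion messages) can be sketched as follows. `SingleUseResource` is a hypothetical class, and the body of the test is an illustration only; it is not the `test_singleuse_ctxmanager` from this PR and does not use ROOT.

```python
import unittest


class SingleUseResource:
    """Hypothetical single use context manager (not a ROOT class)."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        # Entering an already closed resource is an error.
        if self.closed:
            raise ValueError("resource already closed")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.closed = True
        return False


class SingleUseTest(unittest.TestCase):
    def test_singleuse_ctxmanager(self):
        res = SingleUseResource()
        with res as entered:
            self.assertIs(entered, res, "__enter__ should return the object itself")
        self.assertTrue(res.closed, "leaving the block should close the resource")
        # A specific exception type is asserted rather than a bare Exception.
        with self.assertRaises(ValueError, msg="a closed resource must not be re-entered"):
            with res:
                pass


# Run the test case programmatically and keep the result for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SingleUseTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The `msg` strings are printed only when an assertion fails, which is what makes them a possible replacement for explanatory code comments.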

@vepadulano
Member

https://docs.python.org/3/glossary.html#term-file-object

Thanks for the pointer, this is a glossary term, there is no "Python file object" in general.

This is what an alternative implementation means - replacement. The current implementation is not needed.

I disagree on both parts of this sentence.

The current context manager does not provide a useful functionality. Quite on the contrary. What it does is counterproductive - it interferes with the available RNTupleReader functionality, e.g., breaks iterability.

I understand that is your opinion, but it cannot be used to claim that this PR is fixing an existing bug.

The only sensible approach is to get rid of it altogether and simply add the context manager functionality in a transparent way.

I disagree.

def _TFileExit(obj, exc_type, exc_val, exc_tb):
    """
    Close the TFile object.
    Signature and return value are imposed by Python, see
    https://docs.python.org/3/library/stdtypes.html#typecontextmanager.
    """
    # A TFile might be storing references to objects retrieved by the user in
    # a cache. Make sure the cache is cleaned at exit time rather than having
    # to wait for the garbage collector.
    try:
        delattr(obj, "_cached_items")
    except AttributeError:
        pass
    obj.Close()
    return False


@pythonization('TFile')
def pythonize_tfile(klass):
    """
    TFile inherits from
    - TDirectory the pythonized attr syntax (__getattr__) and WriteObject method.
    - TDirectoryFile the pythonized Get method (pythonized only in Python)
    and defines the __enter__ and __exit__ methods to work as a context manager.
    """
    # Pythonizations for TFile::Open
    klass.Open.__creates__ = True
    klass._OriginalOpen = klass.Open
    klass.Open = classmethod(_TFileOpen)

    # Pythonization for TFile constructor
    klass._OriginalConstructor = klass.__init__
    klass.__init__ = _TFileConstructor

    # Pythonization for __enter__ and __exit__ methods
    # These make TFile usable in a `with` statement as a context manager
    klass.__enter__ = lambda tfile: tfile
    klass.__exit__ = _TFileExit

Just because the TFile context manager was implemented in a certain way does not mean that the RNTupleReader/Writer context managers must be implemented in the same way. After all, TFile and RNTupleReader/Writer are different classes.

First of all, the changes to the tests reflect the crucial improvement of context management protocol implementation that the proposed PR brings.

As I have already argued, this PR is not bringing any crucial improvements, but rather an opinionated dismissal of the existing implementation.

they fill and read an ntuple with several events (rather than one)

I agree that in general we should have tests on the Python side of reading/writing more than one event via RNTupleReader/Writer.

All in all, the proposed changes to the tests are also a major improvement. And they are indispensable as they are.

The proposed test changes are wrong in general. I appreciate that the style of the testing can be improved, e.g. by using terms like "single use context manager" instead of "weird" and by adding a bit more context to the assertion messages. Everything else needs to be seriously reconsidered.

Once more, this PR should be reviewed by the original author of the Pythonization of RNTupleReader/Writer just to evaluate if in principle the idea of providing the iterator protocol on the Python side is desirable or not. Every other decision will derive from this first one. Until then, there's nothing else to discuss.

@rybkine
Author

rybkine commented Jun 3, 2026

https://docs.python.org/3/glossary.html#term-file-object

Thanks for the pointer, this is a glossary term

This term is as specific as we need here.

there is no "Python file object" in general.

There is - anything returned by the Python built-in function open (on success).
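
The behaviour referred to here can be checked with the standard library alone. The sketch below uses a temporary file created only for the demonstration; the variable names are illustrative. It shows that the object returned by `open` is bound unchanged by `with`, stays iterable inside the block, is closed on exit, and cannot be entered again.

```python
import io
import os
import tempfile

# Write a small text file to read back.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("a\nb\nc\n")

f = open(path)
is_file_object = isinstance(f, io.IOBase)

with f as g:
    # __enter__ returns the file object itself.
    same_object = g is f
    # The object is still iterable inside the block.
    lines = [line.rstrip("\n") for line in g]

# __exit__ closed the file.
closed_after = f.closed

# A closed file object cannot be entered again: it is single use.
try:
    with f:
        pass
except ValueError:
    reentry_failed = True
else:
    reentry_failed = False

os.remove(path)
```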

The current context manager does not provide a useful functionality. Quite on the contrary. What it does is counterproductive - it interferes with the available RNTupleReader functionality, e.g., breaks iterability.

I understand that is your opinion, but it cannot be used to claim that this PR is fixing an existing bug.

This PR restores iterability and thus does fix a bug. Why can that not be affirmed?

The only sensible approach is to get rid of it altogether and simply add the context manager functionality in a transparent way.

I disagree.

Without arguments to support this disagreement, what is it worth?

Just because the TFile context manager was implemented in a certain way it does not mean that the RNTupleReader/Writer context managers must be implemented in the same way. After all, TFile and RNTupleReader/Writer are different classes.

This does mean the RNTupleReader/Writer context managers must be implemented in a way not inferior to the known implementations. This PR does so by virtually borrowing one (the Python file object implementation).

First of all, the changes to the tests reflect the crucial improvement of context management protocol implementation that the proposed PR brings.

As I have already argued, this PR is not bringing any crucial improvements, rather an opinionated dismissal of existing implementation.

This PR is an improvement as it fixes an issue and has every right to dismiss the existing implementation simply because it proposes a superior implementation.

The proposed test changes are wrong in general.

We are expected to be very specific when making this sort of statement. All the test changes in this PR are correct and relevant until shown otherwise.

evaluate if in principle the idea of providing the iterator protocol on the Python side is desirable or not.

The iterator protocol and anything else available on the C++ side (with very rare exceptions) are supposed to be available on the Python side. This is what the Python bindings to the ROOT framework written in C++ are about.


Development

Successfully merging this pull request may close these issues.

In Python, RNTupleReader no longer iterable

3 participants