Skip to content

gh-150107: Fix asyncio sendfile fallback ignoring non-zero offset#150270

Open
grantlouisherman wants to merge 3 commits into
python:mainfrom
grantlouisherman:bugfix-150107-asyncio-offset
Open

gh-150107: Fix asyncio sendfile fallback ignoring non-zero offset#150270
grantlouisherman wants to merge 3 commits into
python:mainfrom
grantlouisherman:bugfix-150107-asyncio-offset

Conversation

@grantlouisherman
Copy link
Copy Markdown

Original Bug Report:

Bug report

Bug description:

BaseEventLoop._sock_sendfile_fallback (https://github.com/python/cpython/blob/main/Lib/asyncio/base_events.py#L971) erroneously doesn't seek when passed the offset 0.

        if offset:
            file.seek(offset)

This check in the function doesn't call file.seek if offset is falsey. If the file in question has a file pointer that is not currently at the 0 offset then this will erroneously send from the current offset and not the start of the file as desired. This can happen when sendfile is called with the destination socket is an SSL socket.

Repro code (generated by Claude), because of needing to create an SSL socket it's a bit involved - including making a self signed cert with the openssl command.

import asyncio
import os
import ssl
import subprocess
import sys
import tempfile
from pathlib import Path

CONTENT = b"X" * 1000


def make_throwaway_cert(tmp: Path) -> tuple[Path, Path]:
    cert = tmp / "cert.pem"
    key = tmp / "key.pem"
    subprocess.run(
        [
            "openssl", "req", "-x509", "-newkey", "rsa:2048", "-nodes",
            "-keyout", str(key), "-out", str(cert),
            "-days", "1", "-subj", "/CN=localhost",
        ],
        check=True,
        capture_output=True,
    )
    return cert, key


async def main() -> int:
    with tempfile.TemporaryDirectory() as tmp_:
        tmp = Path(tmp_)
        cert, key = make_throwaway_cert(tmp)

        data_path = tmp / "data"
        data_path.write_bytes(CONTENT)

        # SSL server that drains whatever it receives.
        server_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
        server_ctx.load_cert_chain(cert, key)
        received = 0
        done = asyncio.Event()

        async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
            nonlocal received
            try:
                while chunk := await reader.read(4096):
                    received += len(chunk)
            finally:
                writer.close()
                done.set()

        server = await asyncio.start_server(handle, "127.0.0.1", 0, ssl=server_ctx)
        port = server.sockets[0].getsockname()[1]

        # SSL client → transport's `_sendfile_compatible == UNSUPPORTED` → fallback path.
        client_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
        client_ctx.check_hostname = False
        client_ctx.verify_mode = ssl.CERT_NONE
        _, writer = await asyncio.open_connection(
            "127.0.0.1", port, ssl=client_ctx, server_hostname="localhost"
        )

        # Open the file and pre-advance its position. In the real-world bug
        # this happens because the calling code previously read through the
        # same fh (e.g. to parse an HTTP header).
        f = open(data_path, "rb", buffering=0)
        f.seek(len(CONTENT))  # file position is at EOF

        # Send from offset=0. Expectation: 1000 bytes transferred.
        loop = asyncio.get_running_loop()
        sent = await loop.sendfile(writer.transport, f, offset=0)

        writer.close()
        await writer.wait_closed()
        await done.wait()
        server.close()
        await server.wait_closed()
        f.close()

        print(f"Python:       {sys.version}")
        print(f"file size:    {len(CONTENT)}")
        print(f"file.tell() before sendfile: {len(CONTENT)} (EOF)")
        print(f"loop.sendfile(..., offset=0) returned: {sent}")
        print(f"server received: {received} bytes")
        print()

        if sent == 0 and received == 0:
            print("BUG REPRODUCED: offset=0 was ignored; the fallback read from "
                  "file.tell() instead of from the requested offset.")
            return 1
        else:
            print("Bug NOT reproduced on this Python build.")
            return 0


sys.exit(asyncio.run(main()))

The relevant lines are the line f.seek(len(CONTENT)) and the line sent = await loop.sendfile(writer.transport, f, offset=0). Most of the rest is setup and teardown.

CPython versions tested on:

3.12, 3.13

Operating systems tested on:

Linux

Fix

I think what I did was correct or understood the issue, but essentially as far as I could understand 0 is always falsey so if offset was ever set to zero than the file.seek wasent happening even though 0 offset is a legitimate offset. I simply just did an instance on offset to make sure it is an int and then the f.seek goes through. I validated with the repro command and passing unit tests.

@vstinner
Copy link
Copy Markdown
Member

We should also fix the issue in sendfile(). See my draft adding tests: #150142 (comment). The problem is that sock_sendfile() and sendfile() don't have a consistent seek() behavior. Depending on the code path, seek() may be called or not. We should make asyncio more consistent for that.

@grantlouisherman
Copy link
Copy Markdown
Author

Sure @vstinner I must have missed that comment.

@grantlouisherman grantlouisherman force-pushed the bugfix-150107-asyncio-offset branch 2 times, most recently from 688d182 to 3b33313 Compare May 26, 2026 03:00
@grantlouisherman
Copy link
Copy Markdown
Author

So in @vstinner from doing some digging and playing around with this I essentially updated the unix event to always update the file pointers based on the current offset. I think the previous guard was buggy, obviously exposed by the test you drafted, where if total sent returned 0 which was the case when we were EOF a file pointer wouldnt get updated because total sent would == 0. Let me know your thoughts

Signed-off-by: grantlouisherman <grantlouisherman041@gmail.com>
@grantlouisherman grantlouisherman force-pushed the bugfix-150107-asyncio-offset branch from dc70850 to 2a884c8 Compare May 26, 2026 14:14
@grantlouisherman
Copy link
Copy Markdown
Author

@1st1 is there anything Im missing from this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants