Skip to content

Fix HTTPS pooled client crash on unexpected SSL EOF (#3307)#3316

Open
Felix-Gong wants to merge 2 commits into
apache:masterfrom
Felix-Gong:fix-https-pooled-error-code
Open

Fix HTTPS pooled client crash on unexpected SSL EOF (#3307)#3316
Felix-Gong wants to merge 2 commits into
apache:masterfrom
Felix-Gong:fix-https-pooled-error-code

Conversation

@Felix-Gong
Copy link
Copy Markdown
Contributor

Summary

  • Fix crash when HTTPS pooled connection encounters unexpected SSL EOF (error_code=0)
  • Properly handle SSL_ERROR_SSL in socket.cpp DoRead() to return ESSL instead of 0
  • Discovered during RISC-V porting and integration testing

Root Cause

When OpenSSL 3.x detects unexpected EOF (peer closed without close_notify), SSL_read returns 0 with SSL_ERROR_SSL. The code didn't return -1, causing error_code=0 to propagate to Controller::SetFailed() which triggers CHECK(false).

Fix

In the default branch of DoRead(), return -1 with errno=ESSL when SSL errors are detected, instead of falling through and returning nr.

Fixes #3307

Comment thread src/brpc/socket.cpp Outdated
Comment thread src/brpc/socket.cpp Outdated
BIO_fd_non_fatal_error(saved_errno) != 0 ||
nr < 0;
PLOG_IF(WARNING, is_fatal_error) << "Fail to read from ssl_fd=" << fd();
if (is_fatal_error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why modify this branch?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch is changing errno from saved_errno to ESSL, which is a minor semantic improvement but not a bug fix.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you change this behavior, you'd better check the callers' code and make sure they don't depend on the value of errno.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The caller only checks errno == EAGAIN and errno == EINTR for retry logic. ESSL matches neither, so control flow is unchanged — it still enters SetFailed().
The only difference is the error message. But I'll revert this change to keep the fix minimal.

@Felix-Gong Felix-Gong force-pushed the fix-https-pooled-error-code branch from 9b8eff6 to f11c336 Compare May 30, 2026 05:30
@wwbmmm
Copy link
Copy Markdown
Contributor

wwbmmm commented May 30, 2026

BTW, Can you add some unit tests to verify this fix?

When OpenSSL 3.x detects unexpected EOF (peer closed without
close_notify), SSL_read returns 0 with SSL_ERROR_SSL. The code
didn't return -1, causing error_code=0 to propagate to
Controller::SetFailed() which triggers CHECK(false).

Fix by returning -1 with errno=ESSL when SSL errors are detected
in DoRead(), instead of falling through and returning nr.

Discovered during RISC-V porting and integration testing.

Fixes apache#3307
Signed-off-by: Felix Gong <gongxiaofei24@iscas.ac.cn>
@Felix-Gong Felix-Gong force-pushed the fix-https-pooled-error-code branch from f11c336 to 7a1dfe7 Compare May 31, 2026 07:39
@Felix-Gong
Copy link
Copy Markdown
Contributor Author

BTW, Can you add some unit tests to verify this fix?

I'll think about how to add a test that covers the full DoRead() path for the unexpected SSL EOF scenario, and submit it later.

@Felix-Gong
Copy link
Copy Markdown
Contributor Author

BTW, Can you add some unit tests to verify this fix?

@wwbmmm I have added unit test ssl_unexpected_eof in PR #3321 to verify this fix. The test creates an SSL server that abruptly closes the connection without close_notify, then
verifies DoRead() returns -1 with errno=ESSL. CI of #3321 will fail until this fix is merged; after merging, rebase #3321 and CI should pass.

@chenBright
Copy link
Copy Markdown
Contributor

I think you should add the unit test ssl_unexpected_eof to this PR, rather than creating a new PR.

@Felix-Gong
Copy link
Copy Markdown
Contributor Author

I think you should add the unit test ssl_unexpected_eof to this PR, rather than creating a new PR.

I'd prefer to keep the test in a separate PR (#3321) because the fix and the test are independent changes. This way the fix can be reviewed and merged first, and the test PR can be rebased to verify CI passes. If you think it's better to combine them, I can merge them into one PR.

@chenBright
Copy link
Copy Markdown
Contributor

Add unit test ssl_unexpected_eof to verify the fix in #3316

I think it would be more appropriate to merge them into one PR.

Test that Socket::DoRead() returns -1 with errno=ESSL when the remote
side closes the TCP connection without sending close_notify, verifying
the fix in commit 7a1dfe7.
@Felix-Gong
Copy link
Copy Markdown
Contributor Author

Done. The unit test ssl_unexpected_eof has been added to this PR (commit dc09eaa). Please review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

brpc HTTPS pooled client 可能因 Controller::SetFailed(error_code=0) 崩溃

3 participants