fix: unsynchronized reads of OSIdentityModel state (jwt) by nan-li · Pull Request #1665 · OneSignal/OneSignal-iOS-SDK

nan-li · 2026-05-28T17:48:33Z

Description

One Line Summary

Fix production crashes on Identity Verification beta caused by unsynchronized access to OSIdentityModel.jwtBearerToken and two adjacent JWT-lifecycle hazards.

Details

Motivation

8 production crashes reported

Two adjacent issues were exposed while fixing the first and are addressed in the same PR:

OSIdentityModelRepo.updateJwtToken fired the model's change notifier (→ onModelUpdated → onJwtTokenChanged → executor listeners) synchronously while holding the repo's NSLock. Nothing re-enters today, so it deadlocks-by-luck, but it's a trap for any future listener.
invalidateJwtForExternalId had a TOCTOU between its "already invalid?" read and the "set to invalid" write. A concurrent valid-token write landing between them would be overwritten with OS_JWT_TOKEN_INVALID and trigger a needless re-auth.

Scope

Affects the JWT (Identity Verification) hot paths only: how OSIdentityModel.jwtBearerToken is read/written, how the model repo propagates token updates to listeners, and how invalidation transitions. No public API changes. Network payloads, request shapes, and persisted UserDefaults schema are unchanged (encode/decode continues to write/read the same OS_JWT_BEARER_TOKEN key — just through a renamed private backing storage).

What changed

jwtBearerToken is backed by a private jwtBearerTokenLocked (commented "only read/write under self.lock") with a locked getter and a locked setter. The change notifier fires outside the lock — NSRecursiveLock only saves us from same-thread re-entry, so firing the notifier under the lock would be a deadlock waiting to happen as soon as a listener grabs another lock. encode(with:) and init?(coder:) access the backing storage directly (already inside the lock / no race during init).
isJwtValid() -> Bool → getValidJwt() -> String?. Snapshots once and returns the usable value, eliminating the read-then-check race callers were prone to. addJWTHeaderIsValid and getFullPushHeader updated to consume it.
OSIdentityModelRepo.updateJwtToken now snapshots matching models under the repo lock and mutates them outside, so the model's change notifier fires lock-free.
New OSIdentityModel.invalidateJwtBearerToken() performs an atomic compare-and-set to OS_JWT_TOKEN_INVALID and returns whether the transition occurred. invalidateJwtForExternalId only fires fireJwtExpired when it wins the transition.

Testing

Unit testing

Existing OneSignalUserTests continue to pass; they exercise the JWT setter, invalidation, and re-queue paths covered by this change. No new unit tests added in this PR — a TSan stress test that races login(externalId:token:) against executor flushing is planned as a follow-up.

Manual testing

Tight loop of login(externalId, token:) on main + addTag bursts on a background queue against a JWT-enabled app: no crashes, no spurious JwtExpired events.
Cold start with cached updateRequestQueue in UserDefaults: decode succeeds and queued requests drain normally (encode/decode now uses jwtBearerTokenLocked directly).
Manually replayed the customer's flow shape (token rotation → login with same external ID → property update burst) on the dev app; no reproduction of the 8 crash signatures.

Affected code checklist

Checklist

Overview

I have filled out all REQUIRED sections above
PR does one thing
- If it is hard to explain how any codes changes are related to each other then it most likely needs to be more than one PR
Any Public API changes are explained in the PR details and conform to existing APIs

Testing

I have included test coverage for these changes, or explained why they are not needed
All automated tests pass, or I explained why that is not possible
I have personally tested this on my device, or explained why that is not possible

Final pass

Code is as readable as possible.
- Simplify with less code, followed by splitting up code into well named functions and variables, followed by adding comments to the code.
I have reviewed this PR myself, ensuring it meets each checklist item
- WIP (Work In Progress) is ok, but explain what is still in progress and what you would like feedback on. Start the PR title with "WIP" to indicate this.

fix unsynchronized reads of OSIdentityModel state (jwtBearerToken in particular)

Two follow-on JWT concurrency issues exposed while reviewing the prior fix. 1. OSIdentityModelRepo.updateJwtToken fired the model's change notifier synchronously (→ onModelUpdated → onJwtTokenChanged → executor listeners) while still holding the repo's NSLock. Today nothing re-enters the repo lock so it doesn't deadlock by luck, but it's a trap for any future listener. The fix collects matching models under the lock and mutates them outside, so the notifier fires lock-free. 2. invalidateJwtForExternalId had a TOCTOU between its "is it already invalid?" read and the "set to invalid" write. A concurrent valid-token write landing between them would be overwritten with INVALID and trigger a needless re-auth. The transition is now an atomic compare-and-set on the model (invalidateJwtBearerToken); only the thread that wins the transition fires fireJwtExpired. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

renaming a variable and cleaning up comments

claude

This PR touches JWT/auth code with subtle concurrency reasoning (lock consolidation, TOCTOU fix, notifier-outside-lock pattern). No bugs were found, but the security-sensitive nature warrants a human review.

Extended reasoning...

Overview

The PR consolidates aliasesLock into a single lock that now also guards jwtBearerToken, converts jwtBearerToken into a computed property backed by a locked private field, adds two new APIs (getValidJwt() for atomic read+validate and invalidateJwtBearerToken() for atomic compare-and-set), and updates all call sites in OneSignalUserManagerImpl and OSUserRequest to use them. OSIdentityModelRepo.updateJwtToken is restructured to snapshot matching models under the repo lock and only call the setter outside the lock to avoid deadlock against the change-notifier → onJwtTokenChanged chain.

Security risks

This touches authentication-adjacent code (JWT bearer token lifecycle). The risks are not in injection/exposure shape but in correctness: if locking is wrong, a stale or already-invalidated token could be used on a request, or invalidateJwtForExternalId could double-fire / fail to fire the JWT-invalidated listener. The new getValidJwt() correctly snapshots once, and invalidateJwtBearerToken() does the check-and-set atomically and only returns true on actual transition, which fixes the prior TOCTOU between jwtBearerToken != OS_JWT_TOKEN_INVALID and the subsequent write. The setter intentionally invokes self.set(property:...) outside the lock — the inline comment explains this is to avoid deadlocks with listeners that take other locks, which matches the deadlock pattern that updateJwtToken was also restructured to avoid.

Level of scrutiny

High. Concurrency + auth code changes deserve careful human eyes even when the diff is small. The reasoning is sound on inspection, but assessing reentrancy and deadlock correctness fully requires tracing every listener attached to the identity models changeNotifier (including OSIdentityModelRepo.onModelUpdated → jwtConfig.onJwtTokenChanged) and confirming none take a lock that any caller of the setter already holds. The PR description is also the unfilled template (no motivation/scope/testing notes), and the affected-code and testing checklists are unchecked, which makes it harder to assess intended scope and test coverage from the PR alone.\n\n### Other factors\nNo bugs were found by the bug hunting system and the changes look internally consistent. However, the title ("identity verification crashes") implies this is a crash fix that should be validated against the original crash repro — thats something a human owner of this area is best positioned to confirm.

nan-li and others added 3 commits May 28, 2026 10:46

fix(jwt): make OSIdentityModel.jwtBearerToken thread-safe

714d53d

fix unsynchronized reads of OSIdentityModel state (jwtBearerToken in particular)

cleanup

ea74380

renaming a variable and cleaning up comments

nan-li changed the title ~~Fix/identity verification crashes~~ fix: unsynchronized reads of OSIdentityModel state (jwt) May 28, 2026

claude Bot reviewed May 28, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: unsynchronized reads of OSIdentityModel state (jwt)#1665

fix: unsynchronized reads of OSIdentityModel state (jwt)#1665
nan-li wants to merge 3 commits into
identity_verification_rebase_on_5.5.1from
fix/identity_verification_crashes

nan-li commented May 28, 2026 •

edited

Loading

Uh oh!

claude Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

nan-li commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

One Line Summary

Details

Motivation

Scope

What changed

Testing

Unit testing

Manual testing

Affected code checklist

Checklist

Overview

Testing

Final pass

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Overview

Security risks

Level of scrutiny

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

nan-li commented May 28, 2026 •

edited

Loading