feat: opt sort_array into codegen dispatch under strict floating-point mode #4637

Open
andygrove wants to merge 1 commit into apache:main from andygrove:feat/codegen-dispatch-sort-array

@andygrove

Member

Which issue does this PR close?

Part of #4596 (the sort_array candidate).

Rationale for this change

CometSortArray reports Incompatible in exactly one situation: when spark.comet.exec.strictFloatingPoint=true and the array element type contains a float or double. Strict mode flags float ordering (for example, NaN and signed zero) as not bit-identical to Spark. With allowIncompatible unset, that case causes the whole projection to fall back to Spark.
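To illustrate why float ordering can fail to be bit-identical, here is a minimal Java sketch. It is not Comet's or Spark's actual comparison code: the class name `FloatOrderingSketch` and both comparators are hypothetical, chosen only to show that two reasonable orderings of the same doubles can produce arrays whose bit patterns differ once signed zeros are involved.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only; not taken from Comet or Spark.
class FloatOrderingSketch {
    // Total order from the JDK: -0.0 sorts before 0.0, NaN sorts after +Infinity.
    static final Comparator<Double> TOTAL_ORDER = Double::compare;

    // Assumed alternative: treat -0.0 and 0.0 as equal, otherwise use the total order.
    // NaN == NaN is false, so NaN pairs fall through to Double.compare, which returns 0.
    static final Comparator<Double> ZEROS_EQUAL =
        (x, y) -> (x.doubleValue() == y.doubleValue()) ? 0 : Double.compare(x, y);

    // Sort a copy of the input and return the raw bit pattern of each element.
    static long[] sortedBits(Double[] input, Comparator<Double> cmp) {
        Double[] copy = input.clone();
        Arrays.sort(copy, cmp); // stable for object arrays, so equal elements keep input order
        return Arrays.stream(copy).mapToLong(Double::doubleToRawLongBits).toArray();
    }

    public static void main(String[] args) {
        Double[] input = {0.0, Double.NaN, -0.0, 1.0, Double.NEGATIVE_INFINITY};
        long[] total = sortedBits(input, TOTAL_ORDER);      // -Inf, -0.0, 0.0, 1.0, NaN
        long[] zerosEqual = sortedBits(input, ZEROS_EQUAL); // -Inf, 0.0, -0.0, 1.0, NaN
        System.out.println(Arrays.equals(total, zerosEqual)); // prints "false"
    }
}
```

Both results are numerically sorted, yet the two zeros land in different positions, which is the kind of difference a bit-for-bit comparison would flag.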

The issue flagged sort_array as needing an eligibility check because it might be CodegenFallback. It is not: Spark's SortArray has a real doGenCode, so the JVM codegen dispatcher accepts it. (And the dispatcher admits CodegenFallback expressions anyway.)

What changes are included in this PR?

  • CometSortArray mixes in CodegenDispatchFallback, so its strict-floating-point Incompatible case routes through the JVM codegen dispatcher (Spark's own doGenCode inside the Comet pipeline) and matches Spark exactly instead of falling back. The Unsupported nested-struct/null element-type case is unchanged (still falls back), and default (non-strict) behavior is unchanged.
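The routing described above can be modelled with a toy Java sketch. Everything here is hypothetical (the `DispatchSketch` class, the enums, and the method names are not Comet APIs), and it covers only the case where allowIncompatible is unset. It also assumes the Unsupported check takes precedence over the strict-floating-point check, which the PR description does not state.

```java
// Toy model of the routing decision; names are illustrative, not Comet's API.
class DispatchSketch {
    enum SupportLevel { COMPATIBLE, INCOMPATIBLE, UNSUPPORTED }
    enum Route { NATIVE, JVM_CODEGEN_DISPATCH, SPARK_FALLBACK }

    // Support level for sort_array as described in the PR text.
    static SupportLevel supportLevel(boolean strictFloatingPoint,
                                     boolean elementHasFloat,
                                     boolean elementTypeUnsupported) {
        // Assumption: unsupported element types are checked first.
        if (elementTypeUnsupported) return SupportLevel.UNSUPPORTED;
        if (strictFloatingPoint && elementHasFloat) return SupportLevel.INCOMPATIBLE;
        return SupportLevel.COMPATIBLE;
    }

    // Route taken when allowIncompatible is unset.
    static Route route(SupportLevel level, boolean mixesInDispatchFallback) {
        switch (level) {
            case COMPATIBLE:
                return Route.NATIVE;
            case INCOMPATIBLE:
                // With the mix-in, the Incompatible case goes to the JVM codegen dispatcher.
                return mixesInDispatchFallback ? Route.JVM_CODEGEN_DISPATCH : Route.SPARK_FALLBACK;
            default:
                // Unsupported still falls back to Spark, with or without the mix-in.
                return Route.SPARK_FALLBACK;
        }
    }
}
```

In this model, flipping `mixesInDispatchFallback` from false to true changes only the Incompatible row, which mirrors the claim that the Unsupported and default paths are unchanged.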

How are these changes tested?

New sort_array_strict_fp.sql runs with spark.comet.exec.strictFloatingPoint=true over double and float arrays containing NaN, +/-Infinity, +/-0.0, and nulls, asserting that execution stays native via the dispatcher and matches Spark. The existing comprehensive sort_array.sql (default mode, all element types, including the nested-struct expect_fallback cases) still passes. Both run with CometSqlFileTestSuite on Spark 3.5.

…t mode

CometSortArray reports Incompatible only when spark.comet.exec.strictFloatingPoint
is enabled and the array contains floats. Mixing in CodegenDispatchFallback routes
that case through the JVM codegen dispatcher (Spark's own doGenCode) so it stays
native and matches Spark instead of falling back. SortArray has a real doGenCode
(not CodegenFallback), so the dispatcher accepts it.

Part of apache#4596.
@andygrove andygrove added this to the 0.17.0 milestone Jun 12, 2026

@mbutrovich mbutrovich left a comment

Contributor

Thanks @andygrove!
