chore: fallback for spark.sql.legacy.castComplexTypesToString.enabled = true #4630

Open
comphead wants to merge 4 commits into apache:main from comphead:struct_legacy

@comphead comphead commented Jun 12, 2026

Contributor

Which issue does this PR close?

Closes #4492

Rationale for this change

Spark's spark.sql.legacy.castComplexTypesToString.enabled flag changes how
array, map, and struct are formatted when cast to string:

| Flag | Wrapping | NULL elements |
|---|---|---|
| false (default, Comet implements this) | {...} for maps/structs, [...] for arrays | rendered as the literal "null" |
| true (legacy) | [...] for maps/structs | omitted from output |

In Spark 4.0 the flag is marked internal and defaults to false, but it is
still honored when set. Comet only implements the default formatting, so
running with the legacy flag on produces results that diverge from Spark.
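
To make the divergence concrete, here is a small self-contained toy model of the two formatting modes described in the table above. It does not call Spark or Comet: the `Value` types, `render`, and the exact separator handling are hypothetical, and the spacing around omitted NULLs is an assumption that should be checked against Spark's actual output. Maps are left out for brevity.

```scala
// Toy model only: none of these names exist in Spark or Comet.
object LegacyCastModel {
  sealed trait Value
  final case class Scalar(text: String) extends Value
  // None stands for a NULL element or field.
  final case class ArrayV(items: Seq[Option[Value]]) extends Value
  final case class StructV(fields: Seq[Option[Value]]) extends Value

  /** Render a non-null value as a string in default or legacy mode. */
  def render(v: Value, legacy: Boolean): String = v match {
    case Scalar(text) => text
    // Arrays use [...] in both modes.
    case ArrayV(items) => join(items.map(elem(_, legacy)), "[", "]")
    // Structs switch from {...} to [...] when the legacy flag is on.
    case StructV(fields) =>
      val (open, close) = if (legacy) ("[", "]") else ("{", "}")
      join(fields.map(elem(_, legacy)), open, close)
  }

  // A NULL element becomes the literal "null" by default and is omitted in legacy mode.
  private def elem(v: Option[Value], legacy: Boolean): Option[String] = v match {
    case Some(inner) => Some(render(inner, legacy))
    case None => if (legacy) None else Some("null")
  }

  // Assumed separator rule: a comma between slots, and a space only before a rendered element.
  private def join(parts: Seq[Option[String]], open: String, close: String): String = {
    val sb = new StringBuilder(open)
    parts.zipWithIndex.foreach { case (part, i) =>
      if (i > 0) sb.append(",")
      part.foreach { text =>
        if (i > 0) sb.append(" ")
        sb.append(text)
      }
    }
    sb.append(close).toString
  }
}
```

In this model, a struct holding `1` and a NULL field renders as `{1, null}` with the flag off and `[1,]` with it on, which is the kind of mismatch the fallback avoids.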

What changes are included in this PR?

  • CometCast.isSupported: when spark.sql.legacy.castComplexTypesToString.enabled=true,
    return Unsupported for any ArrayType/StructType/MapType → StringType
    cast so the plan falls back to Spark.
  • Refactor several near-duplicate type-walking helpers into one generic
    SupportLevel.containsType(dt, classOf[T1], classOf[T2], ...) (replaces
    containsFloatingPoint, containsMapType, two local containsBinary).
  • Extract the strict-floating-point gate into
    SupportLevel.strictFloatingPointReason(dt, what) so the five repeated
    STRICT_FP + containsFloatingPoint + boilerplate message blocks collapse to
    one call.
  • SQL file tests:
    • cast_complex_types_to_string_legacy.sql — asserts struct/array/map and
      nested complex types fall back when the legacy flag is on.
    • cast_complex_types_to_string.sql — exhaustive coverage of struct → string
      and array → string across every supported field/element type
      (named/anonymous, ints at min/max, decimals at 38-precision, dates,
      timestamps, binary with non-printable bytes, NaN/±0/±Infinity, NULL at
      every depth, empty containers, deep nesting). Map → string queries assert
      expect_fallback(Cast from MapType) since map → string is not implemented
      natively.
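
The fallback gate and the generic type-walking helper from the list above could look roughly like the following sketch. It is self-contained and uses a toy `DType` hierarchy rather than Spark's `DataType`; the `SupportLevel` shape, the `Unsupported(reason)` constructor, the `legacy` parameter, and the message text are all assumptions for illustration and may not match the code in this PR.

```scala
// Illustrative sketch: toy stand-ins for Spark's DataType and Comet's SupportLevel.
object CastFallbackSketch {
  sealed trait DType
  final case class IntT() extends DType
  final case class FloatT() extends DType
  final case class BinaryT() extends DType
  final case class StringT() extends DType
  final case class ArrayT(element: DType) extends DType
  final case class MapT(key: DType, value: DType) extends DType
  final case class StructT(fields: Seq[DType]) extends DType

  sealed trait SupportLevel
  case object Compatible extends SupportLevel
  final case class Unsupported(reason: String) extends SupportLevel

  /** True if dt, or any type nested inside it, is an instance of one of the target classes. */
  def containsType(dt: DType, targets: Class[_ <: DType]*): Boolean =
    if (targets.exists(_.isInstance(dt))) true
    else dt match {
      case ArrayT(e) => containsType(e, targets: _*)
      case MapT(k, v) => containsType(k, targets: _*) || containsType(v, targets: _*)
      case StructT(fs) => fs.exists(f => containsType(f, targets: _*))
      case _ => false
    }

  private def isComplex(dt: DType): Boolean = dt match {
    case _: ArrayT | _: MapT | _: StructT => true
    case _ => false
  }

  /** Gate: with the legacy flag on, any complex-to-string cast is reported as unsupported. */
  def isSupported(from: DType, to: DType, legacy: Boolean): SupportLevel =
    if (legacy && isComplex(from) && to.isInstanceOf[StringT])
      Unsupported("legacy complex-type-to-string formatting is not implemented natively")
    else Compatible
}
```

One helper with a varargs list of classes replaces a family of single-purpose checks: `containsType(dt, classOf[FloatT])` plays the role of `containsFloatingPoint`, and `containsType(dt, classOf[MapT])` the role of `containsMapType`.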

@andygrove andygrove left a comment

Member

LGTM pending CI. Thanks @comphead.

SELECT cast(map('a', X'616263', 'b', X'', 'c', cast(null as binary)) as string)

-- Map with float / double values: NaN / ±0 / ±Infinity / NULL.
query expect_fallback(Cast from MapType)

Contributor Author

cast from maptype to string needs to be improved

Development

Successfully merging this pull request may close these issues.

[Bug] CAST(complex AS STRING) does not honour spark.sql.legacy.castComplexTypesToString.enabled