Describe the bug
A scalar subquery inside a RepartitionByExpression (e.g. DISTRIBUTE BY) crashes natively: the subquery is not registered on the native side for this plan shape. Reproduces over a plain Parquet scan.
Steps to reproduce
Add to a suite extending CometTestBase:
test("scalar subquery in repartition") {
  withParquetTable((0 until 10).map(i => (i, i)), "t") {
    val df = sql("SELECT * FROM t DISTRIBUTE BY (_1 + (SELECT max(_2) FROM t))")
    checkSparkAnswer(df)
  }
}
The test fails with:
org.apache.spark.SparkException: Job aborted due to stage failure: ... org.apache.comet.CometNativeException: Error inserting batch: External error: org.apache.comet.CometRuntimeException: Subquery 75 not found for plan 12.
    at org.apache.comet.Native.executePlan(Native Method)
    at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$2(CometExecIterator.scala:155)
    at org.apache.comet.vector.NativeUtil.getNextBatch(NativeUtil.scala:173)
    at org.apache.comet.CometExecIterator.$anonfun$getNextBatch$1(CometExecIterator.scala:154)
Expected behavior
The query runs and returns the same results as Spark, with no native crash.
Additional context
Found while enabling CometLocalTableScanExec by default (#4393), but reproduces over a plain Parquet scan. Upstream test: subquery in repartition.