Use datafusion-spark SparkArrayContains for three-valued NULL semantics #3630

Draft
n0r0shi wants to merge 1 commit into apache:main from n0r0shi:spark-array-contains-null-semantics

n0r0shi commented Mar 4, 2026

  • Replace the CASE WHEN wrapper around array_has with a direct call to datafusion-spark's
    SparkArrayContains, which handles Spark's three-valued NULL semantics natively
  • When an element is not found and the array contains NULL elements, the result is now correctly NULL
    instead of false
  • Net reduction of ~25 lines by removing the manual NULL handling logic
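
The three-valued behavior described above can be modeled in a few lines of plain Rust. This is only a sketch over nullable `i32` values, with `None` standing in for SQL NULL. The function name `spark_array_contains` is hypothetical and is not the datafusion-spark API, and the handling of a NULL array or NULL search value (both yield NULL) is an assumption based on Spark's documented behavior rather than something stated in this PR.

```rust
/// Simplified model of Spark's `array_contains` over nullable i32 values.
/// `None` stands for SQL NULL. Hypothetical helper for illustration only;
/// it is not the datafusion-spark implementation.
fn spark_array_contains(array: Option<&[Option<i32>]>, value: Option<i32>) -> Option<bool> {
    // Assumption: a NULL array or a NULL search value yields NULL.
    let array = array?;
    let value = value?;

    let mut saw_null = false;
    for element in array {
        match element {
            // Definite match: the result is true regardless of any NULLs.
            Some(v) if *v == value => return Some(true),
            // Non-NULL element that does not match: keep scanning.
            Some(_) => {}
            // NULL element: it might have been equal to `value`, so remember it.
            None => saw_null = true,
        }
    }

    // No match found: unknown (NULL) if any element was NULL, otherwise false.
    if saw_null { None } else { Some(false) }
}
```

Under this model, the case from the test plan, `array_contains(array(1, NULL, 3), 2)`, corresponds to `spark_array_contains(Some(&[Some(1), None, Some(3)][..]), Some(2))` and evaluates to `None`, whereas a two-valued lookup would report `false`.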

Depends on

Test plan

  • Existing CometArrayExpressionSuite tests pass
  • Verify three-valued NULL semantics: array_contains(array(1, NULL, 3), 2) returns NULL

…emantics

Replace the CASE WHEN wrapper around array_has with a direct call to
datafusion-spark's SparkArrayContains, which handles Spark's three-valued
NULL semantics natively: when an element is not found and the array
contains NULL elements, the result is NULL instead of false.