Thanks @rafafrdz! Kicking off CI.
This looks good so far, @rafafrdz. CI shows all three On the compatibility side, I noticed that the PR doesn't distinguish between ANSI and legacy mode for invalid URL handling. DataFusion's
Thanks for the suggestion, @andygrove. What I did:
I also registered
- Resolve conflict in jni_api.rs: add ParseUrl/TryParseUrl registrations
  alongside SparkSpace and SparkBitCount from upstream
- Update datafusion-spark dependency to 52.2.0 (via upstream Cargo.toml)
- Simplify CometParseUrl serde:
  - Spark 3.5 path reads failOnError directly from ParseUrl.failOnError
    (no reflection needed)
  - Spark 4.0 Invoke path uses reflection only on ParseUrlEvaluator
  - Remove dead 'invoke' branch from convertExpression; rename
    entry point to convertFromInvoke for clarity
- Expand parse_url.sql test: add PATH/FILE/REF/AUTHORITY/USERINFO
  queries, ftp URL row, and try_parse_url coverage
- Add CometStringExpressionSuite tests: REF/AUTHORITY/USERINFO parts,
  ANSI mode (Spark 3.5), and a dedicated try_parse_url test
- Remove try_parse_url SQL queries: try_parse_url is a Comet-internal
  DataFusion function name used when serializing parse_url with
  failOnError=false. It is not a registered Spark SQL function, and calling
  it directly via SQL raises UNRESOLVED_ROUTINE on any Spark version. The
  NULL-on-invalid-URL behaviour is covered by the 'parse_url with invalid
  URL in legacy mode' Scala test.
- Replace ftp port 21 with 2121 in test URLs: the Rust url crate omits
  well-known default ports (ftp=21) when serialising authority(), while
  Java URI.getRawAuthority() preserves them verbatim. Using a non-default
  port avoids this pre-existing semantic gap and keeps the AUTHORITY test
  case meaningful on both Spark 3.5 and 4.0.
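The JVM half of the port mismatch described in the commit above can be checked with a few lines of plain Java. This is only a sketch: the class name `AuthorityPortDemo`, the helper `rawAuthority`, and the example URLs are made up for illustration and are not taken from the PR's tests. It shows that `java.net.URI` reports the authority verbatim, including a default ftp port. The Rust side (the url crate dropping `:21`) is as stated in the commit message and is not exercised here.

```java
import java.net.URI;

public class AuthorityPortDemo {
    // Hypothetical helper: returns the raw authority exactly as java.net.URI reports it.
    static String rawAuthority(String url) {
        return URI.create(url).getRawAuthority();
    }

    public static void main(String[] args) {
        // Default ftp port: the JVM keeps ":21" in the authority.
        System.out.println(rawAuthority("ftp://user@example.com:21/pub/file.txt"));
        // prints "user@example.com:21"

        // Non-default port: nothing for a normalising parser to drop,
        // so both sides should agree on the AUTHORITY part.
        System.out.println(rawAuthority("ftp://user@example.com:2121/pub/file.txt"));
        // prints "user@example.com:2121"
    }
}
```

With port 2121 there is no default port to elide, which is why the test URLs can compare AUTHORITY output across engines without tripping over the normalisation difference.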
Force-pushed from 7c8a8d9 to 9d7720f
Use Expression directly instead of a generic type parameter T <: Expression. This eliminates the asInstanceOf cast in convertFromInvokeUnchecked and the unchecked method itself, since QueryPlanSerde can now call convertFromInvoke directly without existential type issues.
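For readers following the ANSI versus legacy discussion in this thread, the failOnError distinction mentioned in the commits above can be sketched in plain Java. This is not Comet's or Spark's implementation: the class `ParseUrlModes` and the method `parseUrlHost` are hypothetical names, only the HOST part is handled, and the exception type is an assumption chosen for the sketch. With failOnError=true an invalid URL raises an error; with failOnError=false it yields null, which is the behaviour the commits associate with try_parse_url.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class ParseUrlModes {
    // Hypothetical helper: extract the HOST part of a URL.
    // failOnError=true  -> invalid input raises an exception (ANSI-like).
    // failOnError=false -> invalid input returns null (legacy-like).
    static String parseUrlHost(String url, boolean failOnError) {
        try {
            return new URI(url).getHost();
        } catch (URISyntaxException e) {
            if (failOnError) {
                throw new IllegalArgumentException("invalid URL: " + url, e);
            }
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseUrlHost("https://spark.apache.org/path?query=1", true));
        // prints "spark.apache.org"

        // A space is not a legal URI character, so this input is invalid.
        System.out.println(parseUrlHost("http://exa mple.com/path", false));
        // prints "null"
    }
}
```

Calling `parseUrlHost("http://exa mple.com/path", true)` throws instead of returning null, which is the split a serde layer would need to preserve when choosing between the two native function names.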
Summary
- Add support for parse_url by mapping ParseUrl to the native scalar function parse_url in expression serde.
- Mark parse_url as supported.
Why
This closes one of the missing DataFusion 50 migration functions from issue #2443.
Part of #2443