Skip to content

feat: support factorial, pmod, and rint expressions#3586

Open
n0r0shi wants to merge 3 commits intoapache:mainfrom
n0r0shi:math-funcs
Open

feat: support factorial, pmod, and rint expressions#3586
n0r0shi wants to merge 3 commits intoapache:mainfrom
n0r0shi:math-funcs

Conversation

@n0r0shi
Copy link
Contributor

@n0r0shi n0r0shi commented Feb 25, 2026

Summary

  • Wire three math functions from the datafusion-spark crate (SparkFactorial, SparkPmod, SparkRint) to Comet
  • Register in jni_api.rs and add serde mappings via CometScalarFunction in QueryPlanSerde.scala

Test plan

  • New tests in CometMathExpressionSuite (factorial, pmod, rint) — all pass
  • Spotless formatting check passes
  • Rust native build passes

Register datafusion-spark's SparkFactorial, SparkPmod, and SparkRint
UDFs and add serde mappings in mathExpressions.
Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Serde wiring and UDF registrations look correct for all three expressions.

For the tests, the SQL file framework in spark/src/test/resources/sql-tests/expressions/math/ would be a natural home for these. There are already many math expression SQL files there.

On pmod, it might be worth testing the divide-by-zero case. Spark returns null for pmod(10, 0) with integer types and NaN for floating-point types. Floating-point inputs would also be good to cover since SparkPmod supports them.

On factorial, I like the coverage of negative and boundary values. One small thing: factorial(21) is the first value that returns null due to overflow protection. It could be worth adding that as a test case.

@n0r0shi
Copy link
Contributor Author

n0r0shi commented Mar 5, 2026

@andygrove Thanks for the review! I've addressed your feedback. Also merged with latest main to resolve conflicts.

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending CI. Thanks @n0r0shi

@coderfender
Copy link
Contributor

@andygrove , @n0r0shi there are test failures here

@coderfender
Copy link
Contributor

sql-file: expressions/math/pmod.sql [parquet.enable.dictionary=false] *** FAILED *** (149 milliseconds)
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1871.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1871.0 (TID 7089) (localhost executor driver): org.apache.comet.CometNativeException: Divide by zero error

@coderfender
Copy link
Contributor

coderfender commented Mar 12, 2026

@n0r0shi , I just saw that a PR got merged (03/09/26) which fixes this error in upstream datafusion-spark crate. We might want to wait till we upgrade comet to DF53

@coderfender
Copy link
Contributor

coderfender commented Mar 12, 2026

Original PR : apache/datafusion#16829
Patch (which got merged recently) : apache/datafusion#20461

@coderfender
Copy link
Contributor

@n0r0shi , perhaps you could refactor your code to drop support for pmod support so that we can merge this PR and and then come back once comet is upgraded to DF53 ?

@n0r0shi
Copy link
Contributor Author

n0r0shi commented Mar 12, 2026

@coderfender Sure, thank you for the advice. I have other PRs that also depend on DF 53, so I can drop pmod and update this PR for now. Thanks!

@andygrove
Copy link
Member

CI failure:

- sql-file: expressions/math/pmod.sql [parquet.enable.dictionary=false] *** FAILED *** (107 milliseconds)
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1872.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1872.0 (TID 7094) (localhost executor driver): org.apache.comet.CometNativeException: Divide by zero error

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants