chore(deps): upgrade to DataFusion 52#1997
Conversation
|
The audit failure (RUSTSEC-2026-0001 for rkyv) is unrelated to this PR - it's being addressed in #1994. Will rebase once that lands. |
Fix for Python Bindings CI FailureThe initial PR failed the Root cause: Fix (commit 33d5608):
The core iceberg-rust crates were already compatible with DataFusion 52 - only the Python bindings needed this update. |
33d5608 to
6e77511
Compare
6a154d0 to
f4cf8da
Compare
|
Please let me know if you run into difficulties with this PR also regarding the FFI change. I think that my approach in apache/datafusion-python#1337 will help resolve the missing elements here. |
ae7b70e to
723e3a6
Compare
|
Is there any progress now? |
723e3a6 to
a19062d
Compare
I rebased and got CI cleaned up. DF 52 Python wheels should be landing in the next day or so - https://lists.apache.org/thread/76v9pmqh7cflgjwx4wnqsmdzw00v62bl. To limit an additional follow up, I am waiting to include that and will open the PR. |
|
Let me know if you need any help with this PR. |
Thanks Tim - it's been a hot minute since I worked on Iceberg and got myself into a CI pickle. I think we should be good after this most recent push. I'll let you know if I run into any trouble. |
19690bf to
3309d6c
Compare
| assert ( | ||
| datafusion.__version__ >= "45" | ||
| ) # iceberg table provider only works for datafusion >= 45 | ||
| if Version(datafusion.__version__) < Version("52.0.0"): |
There was a problem hiding this comment.
Not related to this pr, but is it possible to extract the version from workspace Cargo.toml?
There was a problem hiding this comment.
yes - I can do this on a follow up PR
There was a problem hiding this comment.
+1 should take from datafusion-ffi
3309d6c to
196c528
Compare
kevinjqliu
left a comment
There was a problem hiding this comment.
mostly lgtm, a couple nits
| arrow-schema = "57.0" | ||
| arrow-select = "57.0" | ||
| arrow-string = "57.0" | ||
| arrow-arith = "57.1" |
There was a problem hiding this comment.
👍
datafusion 51.1.0 uses arrow-* 57.1
https://crates.io/crates/datafusion/52.1.0/dependencies
There was a problem hiding this comment.
i think we usually bump these during release.
Previous upgrade PR didnt include this #1899
|
|
||
| # monkey patch the __datafusion_table_provider__ method to the iceberg table | ||
| def __datafusion_table_provider__(self): | ||
| def __datafusion_table_provider__(self, session): |
There was a problem hiding this comment.
|
|
||
| use crate::runtime::runtime; | ||
|
|
||
| fn validate_pycapsule(capsule: &Bound<PyCapsule>, name: &str) -> PyResult<()> { |
There was a problem hiding this comment.
👍
these are from the 52.0.0 upgrade guide
and matches the example https://github.com/apache/datafusion-python/blob/22086650c8df8b7bc382130c9560f76762dbe6c0/examples/datafusion-ffi-example/src/utils.rs
| assert ( | ||
| datafusion.__version__ >= "45" | ||
| ) # iceberg table provider only works for datafusion >= 45 | ||
| if Version(datafusion.__version__) < Version("52.0.0"): |
There was a problem hiding this comment.
+1 should take from datafusion-ffi
Signed-off-by: Ethan Urbanski <ethan@urbanskitech.com>
| datafusion = "51.0" | ||
| datafusion-cli = "51.0" | ||
| datafusion-sqllogictest = "51.0" | ||
| datafusion = "52.1" |
There was a problem hiding this comment.
Don't know if it needs to block but 52.2 is close to release apache/datafusion#20287. It should be a trivial bump from 51.1 so we can always do it later?
There was a problem hiding this comment.
voting just opened - https://lists.apache.org/thread/8g5ozm5vxfnzf1ows028vfrbxl1sz21z
## Which issue does this PR close? - Closes #. ## What changes are included in this PR? While reviewing #1997, i noticed a couple of improvements we can make 1. `make install` should install the local editable `pyiceberg-core` so that i can do `make install && make test` 2. CI should fail on warnings 3. test should not read local env files (we made a similar fix in pyiceberg apache/iceberg-python#3006). Otherwise, tests were reading `~/.pyiceberg.yaml` and polluting the runs ## Are these changes tested? Yes. For (2), see that [it fails in CI](https://github.com/apache/iceberg-rust/actions/runs/22410058261/job/64880767855?pr=2178) before removing the use of deprecated `register_table_provider`
Signed-off-by: Ethan Urbanski <ethan@urbanskitech.com>
|
Thanks for the PR @ethan-tyler lets move this forward and follow up with Thanks everyone for the review |
Which issue does this PR close?
Validates and adopts DataFusion 52
What changes are included in this PR?
__datafusion_table_provider__(session)integrationDataFusion FFI API change
DataFusion 52 expanded table provider FFI construction to include task context and optional logical codec parameters.
Updated the Rust/Python bridge accordingly allowing filter/logical expression serialization remains compatible across the FFI boundary.
Are these changes tested?
Yes