-
Notifications
You must be signed in to change notification settings - Fork 260
feat: support arrow dictionary in schema conversion #1293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support arrow dictionary in schema conversion #1293
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great to me -- thank you @jdockerty
(though note I am not an expert in this code nor a committer on the iceberg project)
Looks like @phillipleblanc made a similar PR here: #1283 |
@liurenjie1024 or @Xuanwo, do you have time to review this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jdockerty for this pr, LGTM!
## Which issue does this PR close? <!-- We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123. --> Helps #1277 ## What changes are included in this PR? <!-- Provide a summary of the modifications in this PR. List the main changes such as new features, bug fixes, refactoring, or any other updates. --> This expands the `test_type_conversion` test with the specific `DataType::Dictionary` type. Props to @phillipleblanc for this, as I noticed the extra coverage on his PR which I believe is very useful to capture[^1]. [^1]: I was looking to add similar to #1293, but it was merged just before I pushed the changes 😆
Which issue does this PR close?
DataType::Dictionary
for arrow schema conversion #1277What changes are included in this PR?
An expansion to the current schema conversion for arrow to an iceberg schema.
This uses the suggestion provided by @alamb (here) for the schema conversion work.
Are these changes tested?
The current test for schema conversion has been expanded.
I've also taken the liberty of altering the
assert_eq!
call to use thepretty_assertions
version, as it makes viewing the failure much simpler with a large JSON structure - I can revert this though if it is problematic.