Add support for complex nested types in List Arrays and Struct Arrays in avro_to_arrow #11342

Open
Tracked by #14096
ameyc opened this issue Jul 8, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@ameyc
Contributor

ameyc commented Jul 8, 2024

Is your feature request related to a problem or challenge?

We are currently building a stream processing system on top of DataFusion, and Avro is a major format for us given its ubiquity in the Kafka world. We tried using the existing Avro reader in DataFusion, but found it lacking in some critical ways that make it not terribly useful for us in its present state.

The reader currently does not support complex nested data types:

  1. List arrays only support primitive types.
  2. Dictionary arrays only support Utf8 as their value type.
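
To make the first limitation concrete, here is a minimal sketch of the kind of shape involved. `AvroType`, `is_nested_list`, and `example_order_items` are hypothetical names made up for illustration; this is a simplified model, not the apache-avro or DataFusion API, and the `sku`/`qty` record is an assumed example.

```rust
/// Hypothetical, simplified model of Avro types (illustration only).
#[derive(Debug, Clone, PartialEq)]
enum AvroType {
    Int,
    Long,
    String,
    Array(Box<AvroType>),
    Record(Vec<(String, AvroType)>),
}

/// True for types that have no child types.
fn is_primitive(t: &AvroType) -> bool {
    !matches!(t, AvroType::Array(_) | AvroType::Record(_))
}

/// True when `t` is an array whose item type is itself nested
/// (a record or another array), i.e. the case from item 1 above.
fn is_nested_list(t: &AvroType) -> bool {
    match t {
        AvroType::Array(item) => !is_primitive(item),
        _ => false,
    }
}

/// Assumed example: array<record { sku: string, qty: int }>.
fn example_order_items() -> AvroType {
    AvroType::Array(Box::new(AvroType::Record(vec![
        ("sku".to_string(), AvroType::String),
        ("qty".to_string(), AvroType::Int),
    ])))
}
```

In this model, `array<long>` is a list of primitives, while `example_order_items()` is a list of records, which is the kind of field this request is about.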

Lastly, the reader seems to rely on the decode_internal method in the apache-avro crate and to implement some of the Avro decoding "by hand". We ended up rolling our own reader to support these types, and were able to use decode_from_avro datum and hand the Avro decoding responsibility entirely to the avro package.

Would love to work with @tustvold, who seems to have contributed the most here, to address the existing limitations.

Describe the solution you'd like

Addition of support for parsing complex datatypes.

Describe alternatives you've considered

Convert Avro to JSON and then rely on the json_to_arrow conversion, but this inevitably loses type information.
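
A small sketch of why the JSON detour loses type information. `AvroValue` and `to_json` are hypothetical helpers, not part of any crate; they only imitate the general shape of Avro's JSON encoding, where `int` and `long` both become JSON numbers and `bytes` and `string` both become JSON strings. Escaping is omitted for brevity, so this assumes plain ASCII content.

```rust
/// Hypothetical, simplified model of a few Avro values (illustration only).
#[derive(Debug, Clone, PartialEq)]
enum AvroValue {
    Int(i32),
    Long(i64),
    Bytes(Vec<u8>),
    String(String),
}

/// Render a value roughly as Avro's JSON encoding would:
/// numbers as bare JSON numbers, bytes and strings as JSON strings.
/// No escaping is performed (assumes plain ASCII content).
fn to_json(v: &AvroValue) -> String {
    match v {
        AvroValue::Int(n) => n.to_string(),
        AvroValue::Long(n) => n.to_string(),
        // Each byte is mapped to the Unicode code point of the same value.
        AvroValue::Bytes(b) => {
            let s: String = b.iter().map(|&x| x as char).collect();
            format!("\"{}\"", s)
        }
        AvroValue::String(s) => format!("\"{}\"", s),
    }
}
```

Here `to_json(&AvroValue::Int(7))` and `to_json(&AvroValue::Long(7))` produce the same text, as do a `Bytes` value holding `b"ab"` and a `String` holding `"ab"`, so a JSON-to-Arrow step has to guess the original type unless the Avro schema is consulted again.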

Additional context

No response

@ameyc ameyc added the enhancement New feature or request label Jul 8, 2024
@alamb
Contributor

alamb commented Jan 12, 2025

Started collecting this and other Avro-related items in #14096
