1. Let users choose to show nested types when writing to CSV 2. Add a feature to show struct-type data in JSON format #6950

Draft
wants to merge 1 commit into
base: main

Conversation

awol2005ex

  1. Let users choose to show nested types when writing to CSV 2. Add a feature to show struct-type data in JSON format

@github-actions github-actions bot added the arrow Changes to the arrow crate label Jan 7, 2025
@tustvold
Contributor

tustvold commented Jan 7, 2025

I'm not sure about this. Don't we already have arrow_json, which provides this functionality?

Perhaps you could file a ticket explaining what you want to do, and we can go from there?

@awol2005ex
Author

> I'm not sure about this. Don't we already have arrow_json, which provides this functionality?
>
> Perhaps you could file a ticket explaining what you want to do, and we can go from there?

I just want to export an ORC file from a Hive ACID table to CSV, but the file has a struct-type field and can't be exported.

@tustvold
Contributor

tustvold commented Jan 7, 2025

CSV cannot typically represent structured data; have you considered using arrow_json?

To be clear, we could add a way to serialize JSON into CSV. It's a little odd, and I suspect it will confuse some readers because CSV has non-standardized escaping rules, but it would need to be done using the arrow_json machinery - we can't maintain two JSON serializers.
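
To illustrate the escaping concern, here is a minimal sketch of what embedding a JSON string in a CSV field involves. It assumes RFC 4180-style quoting (wrap the field in double quotes and double any embedded quotes), which is only one of several conventions in use; other tools expect backslash escapes instead. The helper names `quote_csv_field` and `write_csv_row` are hypothetical and are not part of arrow_csv or arrow_json.

```rust
/// Quote one CSV field using RFC 4180-style rules (an assumed convention,
/// not necessarily what any particular CSV reader expects).
fn quote_csv_field(field: &str) -> String {
    // Only fields containing a delimiter, a quote, or a line break need quoting.
    let needs_quotes = field.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r'));
    if !needs_quotes {
        return field.to_string();
    }
    let mut out = String::with_capacity(field.len() + 2);
    out.push('"');
    for c in field.chars() {
        if c == '"' {
            // An embedded quote is written as two quotes.
            out.push('"');
        }
        out.push(c);
    }
    out.push('"');
    out
}

/// Join already-stringified cells into one CSV line (no trailing newline).
fn write_csv_row(fields: &[&str]) -> String {
    fields
        .iter()
        .map(|f| quote_csv_field(f))
        .collect::<Vec<_>>()
        .join(",")
}
```

With these helpers, the JSON text `{"id":1,"name":"a,b"}` becomes the CSV field `"{""id"":1,""name"":""a,b""}"`. A reader that expects backslash-escaped quotes would not recover the original JSON from that field, which is the kind of confusion described above.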

@awol2005ex
Author

> CSV cannot typically represent structured data; have you considered using arrow_json?
>
> To be clear, we could add a way to serialize JSON into CSV. It's a little odd, and I suspect it will confuse some readers because CSV has non-standardized escaping rules, but it would need to be done using the arrow_json machinery - we can't maintain two JSON serializers.

ok

@alamb
Contributor

alamb commented Jan 8, 2025

Thank you for this contribution @awol2005ex

I think this PR would also need some tests that show what it is attempting to do.

But in general I agree with @tustvold that writing structured data to CSV is likely not a good addition to this crate: there is no standard (either a document or a reference implementation) to refer to, and it seems like a very specialized (one-off) use case.

It does seem reasonable to me to allow user-defined formatting for customizing output.
If you wanted to add some API to allow a custom formatter, that would likely be a better fit for this crate.
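
As a rough sketch of what a custom-formatter hook could look like, the example below uses a toy value model instead of Arrow arrays. Everything in it (`Value`, `CellFormatter`, `ScalarOnly`, `StructAsJson`, `format_row`) is hypothetical and is not an existing arrow-rs API. The JSON rendering is hand-rolled only to keep the sketch self-contained; per the discussion above, a real implementation would be expected to reuse the arrow_json machinery.

```rust
/// Hypothetical cell value; a stand-in for elements of Arrow arrays.
#[derive(Debug, Clone)]
enum Value {
    Int(i64),
    Text(String),
    Struct(Vec<(String, Value)>),
}

/// Hypothetical hook: the writer asks this for the text of each cell.
trait CellFormatter {
    fn format(&self, value: &Value) -> Result<String, String>;
}

/// Default behavior in this sketch: scalars only, nested values are rejected.
struct ScalarOnly;

impl CellFormatter for ScalarOnly {
    fn format(&self, value: &Value) -> Result<String, String> {
        match value {
            Value::Int(i) => Ok(i.to_string()),
            Value::Text(s) => Ok(s.clone()),
            Value::Struct(_) => Err("nested types are not supported".to_string()),
        }
    }
}

/// User-supplied formatter that renders structs as compact JSON-like text.
struct StructAsJson;

impl StructAsJson {
    // Naive rendering: escapes only backslashes and quotes in string values,
    // and does not escape field names.
    fn render(value: &Value) -> String {
        match value {
            Value::Int(i) => i.to_string(),
            Value::Text(s) => {
                format!("\"{}\"", s.replace('\\', "\\\\").replace('"', "\\\""))
            }
            Value::Struct(fields) => {
                let body: Vec<String> = fields
                    .iter()
                    .map(|(k, v)| format!("\"{}\":{}", k, Self::render(v)))
                    .collect();
                format!("{{{}}}", body.join(","))
            }
        }
    }
}

impl CellFormatter for StructAsJson {
    fn format(&self, value: &Value) -> Result<String, String> {
        match value {
            Value::Struct(_) => Ok(Self::render(value)),
            // Top-level scalars keep the default formatting.
            other => ScalarOnly.format(other),
        }
    }
}

/// Format one row with whichever formatter the caller supplies.
fn format_row(row: &[Value], fmt: &dyn CellFormatter) -> Result<Vec<String>, String> {
    row.iter().map(|v| fmt.format(v)).collect()
}
```

The point of the design is that the writer itself stays unaware of nested types: a caller who wants structs in a CSV cell passes `&StructAsJson` to `format_row`, while the default `&ScalarOnly` keeps returning an error for them.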

@alamb alamb marked this pull request as draft January 8, 2025 22:44
@alamb
Contributor

alamb commented Jan 8, 2025

Marking as draft, as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look.
