Add arrow-ipc benchmarks for the IPC reader and writer #6968

Open
alamb opened this issue Jan 12, 2025 · 0 comments
Labels
enhancement Any new improvement worthy of an entry in the changelog

alamb commented Jan 12, 2025

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

We are contemplating making the arrow IPC reader/writer faster by allowing the user to opt out of validation, but we currently have no way to measure how much the validation costs.

Describe the solution you'd like
To make sure this actually improves performance, we should have benchmarks for the IPC reader/writer.

Describe alternatives you've considered

For benchmarks, I would recommend adding two new benches:

  • arrow-rs/arrow-ipc/benches/ipc_reader.rs
  • arrow-rs/arrow-ipc/benches/ipc_writer.rs

We can follow the existing example from parquet like this:

Someone would then run them like this:

cargo bench --bench ipc_reader

For the actual benchmarks, I would recommend starting with two sets of data: for example, a record batch with primitive arrays (Int32Array, UInt64Array and Float64Array)

Then add tests for:

  1. StreamWriter (how fast can the data be serialized to a stream)
  2. FileWriter
  3. StreamReader (how fast can serialized data be read back)
  4. FileReader
    With the basic foundation, we can then

Additional context
Inspired by @totoroyyb
