Add support for run containers #12
Comments
I would like to take this on; run containers provide very good wins for my workload. I propose adding a new
Not 100% sure (will verify) but I think the struct takes the same space as the tuple I think storing |
@josephglanville sounds good, let me know if you have any questions. |
Picking this up. I'm not clear on what equality / cmp semantics we want for roaring bitmaps. What do the roaring libs in other languages do?
let mut input: RoaringBitmap = (0..100).collect();
// input is not run compressed
assert!(!input.remove_run_compression());
let mut compressed = input.clone();
// compressed is run compressed
assert!(compressed.run_compress());
// Should this assertion succeed or panic?
assert!(input == compressed); |
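One possible answer to the question above is that equality compares the stored values and ignores how they are encoded, so the final assertion would succeed. The sketch below illustrates that choice on a toy container type. `Container`, its variants, and `iter` are hypothetical names for illustration only, not the crate's internals, and the thread itself does not settle which semantics to adopt.

```rust
/// Hypothetical container type, for illustration only.
#[derive(Clone, Debug)]
pub enum Container {
    /// Sorted, deduplicated values.
    Array(Vec<u16>),
    /// Sorted, non-overlapping runs stored as (start, last), both inclusive.
    Run(Vec<(u16, u16)>),
}

impl Container {
    /// Yields the stored values in ascending order, whatever the encoding.
    pub fn iter(&self) -> Box<dyn Iterator<Item = u16> + '_> {
        match self {
            Container::Array(values) => Box::new(values.iter().copied()),
            Container::Run(runs) => {
                Box::new(runs.iter().flat_map(|&(start, last)| start..=last))
            }
        }
    }
}

/// Equality on logical contents: two containers are equal when they hold
/// the same values, even if one is run encoded and the other is not.
impl PartialEq for Container {
    fn eq(&self, other: &Self) -> bool {
        self.iter().eq(other.iter())
    }
}
```

Under this definition, `Container::Array((0..100).collect())` and `Container::Run(vec![(0, 99)])` compare equal. Comparing representations instead would make equality depend on whether a caller happened to run-compress a bitmap.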
Another problem I'm running into when implementing this: most of the integration tests make assumptions about the implementation details of roaring's containers. In the following test, for example, the optimal container for every bitmap is a run container. Adding run containers is going to require a large refactoring of our existing tests.
#[test]
fn array_and_bitmap() {
    let mut bitmap1 = (0..2000).collect::<RoaringBitmap>();
    let bitmap2 = (1000..8000).collect::<RoaringBitmap>();
    let bitmap3 = (0..8000).collect::<RoaringBitmap>();
    bitmap1 |= bitmap2;
    assert_eq!(bitmap1, bitmap3);
} |
Do you think that it could be a good way to fix that by introducing an iterator type that wraps another iterator i.e. Or should we just use a set of bitmaps from the official datasets? |
I think the underlying problem is more fundamental. Integration tests are supposed to be black-box tests, but these rely on knowledge of implementation details. Containers are non-public, so these tests should have no knowledge of container types. IMO they should be moved to unit tests. |
Just as an FYI and to provide a possible reason for this to happen :) I ran into this while trying to deserialise bitmaps generated by the Java version (which failed spectacularly). For my use-case the serialization/deserialization interoperability is more important than being able to generate run containers. I tried the https://github.com/Kerollmops/roaring-rs/tree/run-containers branch which seems to work, but it's quite far behind master. I tried to make it up to date, but there had been some moving around of code that I ran out of time trying to fix. Being able to easily compile to wasm is also important to me, which is why I wanted to use the native Rust version. |
Hi there |
Hey @aersam,
We worked on that but didn't finish the job. However, it could be really simple to accept run containers and convert them into either array or bitmap containers when deserializing 🤔 |
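A minimal sketch of that conversion, assuming runs arrive as `(start, length - 1)` pairs as in the Roaring serialization format and that the input is well formed (sorted, non-overlapping runs). `Store`, `ARRAY_LIMIT`, and `runs_to_store` are hypothetical names for illustration, not the crate's API:

```rust
/// Hypothetical decoded container; not the crate's actual internal type.
#[derive(Debug, PartialEq)]
pub enum Store {
    /// Sorted list of values, used for sparse containers.
    Array(Vec<u16>),
    /// 65536-bit bitmap, used for dense containers.
    Bitmap(Box<[u64; 1024]>),
}

/// Roaring array containers hold at most this many values.
const ARRAY_LIMIT: u64 = 4096;

/// Expands runs into an array or bitmap container.
/// Each run is (start, length - 1); runs are assumed sorted and disjoint.
pub fn runs_to_store(runs: &[(u16, u16)]) -> Store {
    let cardinality: u64 = runs.iter().map(|&(_, len)| u64::from(len) + 1).sum();
    if cardinality <= ARRAY_LIMIT {
        let mut values = Vec::with_capacity(cardinality as usize);
        for &(start, len) in runs {
            // saturating_add avoids a panic on malformed input
            values.extend(start..=start.saturating_add(len));
        }
        Store::Array(values)
    } else {
        let mut bits = Box::new([0u64; 1024]);
        for &(start, len) in runs {
            for value in start..=start.saturating_add(len) {
                // word index is value / 64, bit index is value % 64
                bits[usize::from(value >> 6)] |= 1u64 << (value & 63);
            }
        }
        Store::Bitmap(bits)
    }
}
```

With this approach the rest of the library never sees a run container, so existing operations keep working unchanged, at the cost of losing the compact encoding after a read.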
Well, as long as I can read them I don't care about implementation details. Writing in particular is not important for me (at least for some time). Others might have different requirements, but I think reading them would be a first step. |
Hey @aersam, would it be possible for you to try this PR in your project and tell me whether everything works fine, please? This PR makes it possible to read bitmaps with run containers. It doesn't perform any operations on run containers, as it converts them into array or bitmap containers. |
It compiles and it runs. Testing whether the indexes are correct is a bit more complicated, but they seem very reasonable. |
That would be great. I added a test that makes sure that bitmaps are correctly deserialized, but it would be better with more tests in the wild. |
I can confirm that it works properly. I checked it against the Java implementation and all indexes are equal. |
As of version 0.5, the Java version of Roaring bitmaps supports run containers...
https://github.com/lemire/RoaringBitmap/blob/master/src/main/java/org/roaringbitmap/RunContainer.java
These can be created by the "runOptimize" function...
https://github.com/lemire/RoaringBitmap/blob/master/src/main/java/org/roaringbitmap/RoaringBitmap.java#L1286
These containers can improve the compression ratio in some cases.
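To illustrate where the gain comes from, here is a rough sketch that collapses sorted values into runs and compares approximate encoded sizes. The helper names `to_runs`, `array_bytes`, and `run_bytes` are hypothetical, and the byte counts are estimates (2 bytes per array value; 4 bytes per run plus a 2-byte run count), not measurements from any library:

```rust
/// Collapses sorted, deduplicated values into (start, last) inclusive runs.
pub fn to_runs(values: &[u16]) -> Vec<(u16, u16)> {
    let mut runs: Vec<(u16, u16)> = Vec::new();
    for &value in values {
        match runs.last_mut() {
            // Extend the current run when `value` directly follows its end.
            Some((_, last)) if value.checked_sub(1) == Some(*last) => *last = value,
            // Otherwise start a new single-value run.
            _ => runs.push((value, value)),
        }
    }
    runs
}

/// Approximate size of an array container: one u16 per value.
pub fn array_bytes(values: &[u16]) -> usize {
    2 * values.len()
}

/// Approximate size of a run container: a u16 run count plus two u16 per run.
pub fn run_bytes(runs: &[(u16, u16)]) -> usize {
    2 + 4 * runs.len()
}
```

For the contiguous range `0..8000` this yields a single run, so the run encoding needs a handful of bytes where an array would need 16000 and a bitmap 8192. For data with no consecutive values, every value becomes its own run and the run encoding is the larger one, which is why the gain only appears "in some cases".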