Is it possible to call SZ_Finalize in a task-based manner? Or provide a thread-safe API? #86
Comments
SZ_Init and SZ_Finalize are supposed to be called at program startup and shutdown. In LibPressio we provide two solutions to this. The first is external to SZ but internal to LibPressio: an atomic reference-counting solution with reader-writer locks to ensure the integrity of the global metadata. It also detects if a user has called SZ_Init without us and won't deallocate if the reference count hits zero; unfortunately, it can't prevent the user from calling SZ_Finalize on their own. The second: very new versions of SZ2 added a thread-safe API that supports a small subset of functionality. We wrap the latter as sz_threadsafe. I suggest referring to the compressor plugins for LibPressio if you feel you must implement this version checking and complexity yourself. You can find the threadsafe API here: https://github.com/robertu94/libpressio/blob/master/src/plugins/compressors/sz_threadsafe_plugin.cc

SZ3 removes the equivalent of SZ_Init and SZ_Finalize and is threadsafe from the start. It also implements a non-overlapping subset of SZ2 and is a breaking change to the API, as suggested by the major version change. It will eventually support decompressing data compressed with SZ2; that functionality will come in a minor release.
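The reference-counting approach described above can be sketched roughly as follows. This is a minimal illustration, not LibPressio's actual code: the function names are hypothetical, the two stubs stand in for the real SZ_Init/SZ_Finalize calls, and a plain mutex is used instead of a reader-writer lock for brevity.

```c
#include <pthread.h>

/* Stand-ins for the real SZ calls (illustrative only). */
static int sz_inited = 0;
static void stub_SZ_Init(void)     { sz_inited = 1; }
static void stub_SZ_Finalize(void) { sz_inited = 0; }

static pthread_mutex_t guard_lock = PTHREAD_MUTEX_INITIALIZER;
static int refcount = 0;

/* Each user of the library calls this instead of SZ_Init. */
void sz_guard_init(void) {
    pthread_mutex_lock(&guard_lock);
    if (refcount++ == 0)
        stub_SZ_Init();          /* only the first user pays the init cost */
    pthread_mutex_unlock(&guard_lock);
}

/* Each user calls this instead of SZ_Finalize. */
void sz_guard_finalize(void) {
    pthread_mutex_lock(&guard_lock);
    if (refcount > 0 && --refcount == 0)
        stub_SZ_Finalize();      /* only the last user releases global state */
    pthread_mutex_unlock(&guard_lock);
}

int sz_guard_active(void) { return sz_inited; }
```

With this wrapper, two users can init and finalize independently and the underlying library is torn down exactly once, when the last reference drops.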
One last comment: WITHOUT the threadsafe API, it is undefined behavior to use any mode except ABS concurrently, and this can cause crashes at best or silent errors at worst. The ABS error bound can only safely be used from multiple threads provided that the SAME error bound is passed to all threads. It is also undefined behavior to mix error bounds concurrently with the non-threadsafe API. The threadsafe API alleviates these problems, but covers only a proper subset of the functionality.
This is what I thought initially, but without calling SZ_Finalize there is a memory leak, causing memory consumption to go all the way up. We have workflows that need to compress and decompress very large data for a large number of time steps. With this memory leak, there is no way to make any of these intensive workflows happen. I am wondering where I can get the SZ3 code? It sounds like that's the ultimate solution.
This is a bug, and we should probably fix it if practical. It probably won't happen before the SC deadline. Can you send us an example of the data buffer that triggers the bug, and the metadata (size, type, etc.) so we can reproduce it?
Here is where you can get the SZ3 code. You can find a usage example here: https://github.com/robertu94/libpressio/blob/master/src/plugins/compressors/sz3.cc . Alternatively, I would like to work with you and your team to make the LibPressio integration easier to use. If we can get someone from your team to offer to maintain MGARD in Spack, I would like to submit a PR to Spack to add it, and it could then be used from ADIOS when built by Spack.
It doesn't need specific code to reproduce. You can simply take any SZ test or hello-world code that doesn't call SZ_Finalize, put all the compress/decompress operations in a while(true) loop, and watch the OS memory monitor.
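To make the reported behavior concrete, here is a toy model of the pattern described: each "init" allocates global metadata that only "finalize" frees, so repeated cycles without finalize accumulate memory. The stubs and sizes are hypothetical and only illustrate the shape of the leak, not SZ's internals.

```c
#include <stdlib.h>

static size_t live_bytes = 0;        /* bytes currently allocated */
static void  *global_meta = NULL;    /* models SZ's global metadata */

static void stub_init(void)     { global_meta = malloc(4096); live_bytes += 4096; }
static void stub_finalize(void) { free(global_meta); global_meta = NULL; live_bytes -= 4096; }

/* Run n compress cycles; returns the bytes still live afterwards.
 * Without finalize, each cycle orphans the previous allocation. */
size_t run_cycles(int n, int call_finalize) {
    for (int i = 0; i < n; ++i) {
        stub_init();
        /* ... compress / decompress would happen here ... */
        if (call_finalize)
            stub_finalize();
    }
    return live_bytes;
}
```

With finalize, memory use stays flat; without it, usage grows linearly with the number of cycles, which matches the monitor behavior described above.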
@disheng222 Can you try the following code? Let it run for 10 minutes and watch an operating-system resource monitor. Then uncomment the two SZ_Finalize() calls and watch it again. Thanks.
@JasonRuonanWang It is expected that the code you sent will leak. It calls SZ_Init multiple times without calling SZ_Finalize. In the non-threadsafe version of SZ you can access the parameters of the compressor via the …
Okay I see. In this case, what if I modify the parameters in one thread while another thread is doing compression or decompression? |
It will do evil things to you 😈 😢 . This isn't defined behavior in SZ2 without the threadsafe API. If you are lucky it will segfault. |
Right. Then it seems like SZ3 is the only way to go. Is SZ3 backward compatible with SZ1 compressed data? |
@disheng222 ☝🏻 I believe this is what Franck wants? I know that SZ2 backwards compatibility for decompression is planned, I wasn't sure if this extended back to SZ1? |
For compression, I understand that SZ_Init and SZ_Finalize must be called in pairs. But if you look at my code above, for the decompression routine I didn't call SZ_Init at all, and it still worked. In this case, should I call SZ_Finalize after decompression? If I still need to call SZ_Finalize, then there isn't any SZ_Init matching it, which makes the code look very weird. Generally, in a library, Init() and Finalize() should always be called in pairs; if some functionality can work without calling Init(), then it should not require Finalize() either. But in the case of SZ, if I decide that SZ_Finalize() should not be called because SZ_Init() was never called, then there is going to be a memory leak. I guess this is what I was trying to ask initially.
In our original implementation in ADIOS2, what we did was
Hi Jason,
I tested the compression using a loop, but I didn't find the memory leakage issue. FYI, I am using the latest version on github.com/szcompressor/SZ (master branch). I am using valgrind to check if there are memory leakage issues, as shown below:

[sdi@localhost example]$ valgrind --leak-check=yes ./testfloat_compress sz.config ./testdata/x86/testfloat_8_8_128.dat 8 8 128
==300870== Memcheck, a memory error detector
==300870== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==300870== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==300870== Command: ./testfloat_compress sz.config ./testdata/x86/testfloat_8_8_128.dat 8 8 128
==300870==
cfgFile=sz.config
timecost=0.246747
timecost=0.131899
timecost=0.133552
timecost=0.132713
timecost=0.130486
timecost=0.134804
timecost=0.132354
timecost=0.130603
timecost=0.132210
timecost=0.132958
done
==300870==
==300870== HEAP SUMMARY:
==300870==     in use at exit: 0 bytes in 0 blocks
==300870==   total heap usage: 1,568 allocs, 1,568 frees, 979,064,155 bytes allocated
==300870==
==300870== All heap blocks were freed -- no leaks are possible
==300870==
==300870== For lists of detected and suppressed errors, rerun with: -s
==300870== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

I attached the testing code (testfloat_compress2.c) and the sz.config if you want to reproduce my result.
Best,
Sheng
We are currently running into segfaults when compressing and decompressing in multiple threads that call the SZ API. When a compression/decompression task in one thread finishes, if we call SZ_Finalize, it tries to release all the memory buffers that SZ allocates, including those allocated in other threads, which then causes a segfault. If we don't call SZ_Finalize, there is going to be a memory leak. Any suggestions on handling this kind of multi-threaded workflow? Thanks.
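One pragmatic workaround, assuming SZ3 or the threadsafe API is not yet an option, is to funnel every Init/compress/Finalize triple through a single process-wide mutex, so no two threads ever touch SZ's global state at once — at the cost of serializing all (de)compression. The sketch below uses stub functions in place of the real SZ API and hypothetical names throughout.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t sz_big_lock = PTHREAD_MUTEX_INITIALIZER;
static void *sz_state = NULL;            /* models SZ's global buffers */

static void stub_SZ_Init(void)     { sz_state = malloc(1024); }
static void stub_SZ_compress(void) { /* would use sz_state here */ }
static void stub_SZ_Finalize(void) { free(sz_state); sz_state = NULL; }

static void *worker(void *arg) {
    int iters = *(int *)arg;
    for (int i = 0; i < iters; ++i) {
        pthread_mutex_lock(&sz_big_lock);    /* the whole triple is atomic */
        stub_SZ_Init();
        stub_SZ_compress();
        stub_SZ_Finalize();                  /* frees only this cycle's state */
        pthread_mutex_unlock(&sz_big_lock);
    }
    return NULL;
}

/* Runs nthreads workers; returns 0 if all global state was released cleanly. */
int run_workers(int nthreads, int iters) {
    pthread_t tid[64];
    if (nthreads > 64) return -1;
    for (int t = 0; t < nthreads; ++t)
        pthread_create(&tid[t], NULL, worker, &iters);
    for (int t = 0; t < nthreads; ++t)
        pthread_join(tid[t], NULL);
    return sz_state == NULL ? 0 : 1;
}
```

Because each Finalize runs under the same lock as its matching Init, it can never free buffers that another thread is still using, which avoids both the segfault and the leak described above — though throughput is limited to one compression at a time.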