Running out of local storage space while creating SigMF archive on external SSD #71

Open
bhorsfield opened this issue Sep 13, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@bhorsfield

Hi Folks,

I have run into a problem when trying to convert a large IQ recording into a SigMF archive.

The problem starts when I call the archive() method. When the sigmf-data file is very large (10-20 GB), the local storage on my small computing device fills up until there is no space left, at which point my Python script crashes. This occurs even if the target location for the archive is on an external SSD.

My suspicion is that the tar utility that generates the SigMF archive is creating a temp file on the local drive. Unfortunately my local drive is an eMMC card with only 5GB of free storage, which greatly limits the size of the recordings I can archive.

Details of my system configuration are as follows:

  • NVIDIA Jetson TX2 with 8GB of RAM & 32GB eMMC drive
  • Ubuntu 18.04
  • SigMF version 1.2.1

Can anyone suggest a workaround for this problem?

Thanks,
Brendan.
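
One way to test the temp-file suspicion above is to ask Python where its temporary files go and how much space is free there. This is only a diagnostic sketch: `temp_space_report` is a hypothetical helper, not part of the SigMF package, and it assumes the archiving code relies on Python's default temporary directory.

```python
import shutil
import tempfile


def temp_space_report(required_bytes):
    """Report Python's default temp directory and whether it can hold required_bytes.

    Hypothetical helper for diagnosis only; not part of the sigmf package.
    """
    tmp = tempfile.gettempdir()           # directory used by mkdtemp() when no dir is given
    free = shutil.disk_usage(tmp).free    # free bytes on the filesystem holding that directory
    return {"tempdir": tmp, "free_bytes": free, "fits": free >= required_bytes}


if __name__ == "__main__":
    # Example: would a 15 GB sigmf-data file fit in the default temp location?
    print(temp_space_report(15 * 1024**3))
```

If `fits` is false for the size of the sigmf-data file, any step that first copies the recording into the temp directory would be expected to fill the local drive.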

@gmabey
Contributor

gmabey commented Sep 13, 2024

Yes, the current implementation is extremely inefficient and wasteful. I had hoped to rewrite it someday using the tarfile module, but I ended up writing C++ classes that do it instead. That's what I currently use in my day job, and I'm hoping to release that code "soon".
It seems to me that you appreciate working with archives over loose files, just like me.
Do you have any inclination towards contributing to this project and taking on a rewrite of that functionality?
If so, I would be happy to give you suggestions and pointers along the way.

The basic idea is to write directly into a tarball instead of copying the files into a temporary directory first.
I would suggest working to make sure that the tarfile.PAX_FORMAT variation is always used.
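
A minimal sketch of that idea, using only the standard-library tarfile module. The function name `write_archive_streaming`, its parameters, and the `name/name.sigmf-*` layout inside the archive are assumptions for illustration; this is not the SigMF package's current API or implementation.

```python
import io
import json
import tarfile


def write_archive_streaming(data_path, metadata, archive_path, name):
    """Write a tar archive directly at archive_path without a temporary copy.

    Hypothetical sketch: `metadata` is a plain dict, and the member layout
    (name/name.sigmf-meta, name/name.sigmf-data) is an assumption.
    """
    meta_bytes = json.dumps(metadata, indent=2).encode("utf-8")
    with tarfile.open(archive_path, mode="w", format=tarfile.PAX_FORMAT) as tar:
        # The metadata is small, so it is added straight from memory.
        info = tarfile.TarInfo(name=f"{name}/{name}.sigmf-meta")
        info.size = len(meta_bytes)
        tar.addfile(info, io.BytesIO(meta_bytes))
        # tar.add() reads the data file in chunks and writes it into the
        # archive, so the large recording is never copied to a temp directory.
        tar.add(data_path, arcname=f"{name}/{name}.sigmf-data")
    return archive_path
```

With this approach the only large write goes to `archive_path` itself, so the archive can be placed on the external SSD without touching local storage.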

@Teque5 Teque5 added the enhancement New feature or request label Sep 13, 2024
@bhorsfield
Author

> It seems to me that you appreciate working with archives over loose files, just like me.

Yes, this is usually my preference, especially in cases where the end user must manually transfer recordings from one device to another, or upload them to a cloud storage drive. With multiple loose files there is higher risk of a file getting overlooked or misplaced during the transfer process.

> Do you have any inclination towards contributing to this project and taking on a rewrite of that functionality?

Sure, I would be happy to make a contribution if I can. I am not a qualified SW engineer (my background is mostly in RF engineering), but I've been writing software on and off for many years as part of my job, so I should be able to make at least some progress on this task. If you have any tips to get me started, please let me know. Otherwise, I will start by familiarising myself with the tarfile module and take it from there.

Regarding the development timeframe, I should be able to devote some time to this towards the end of September. The SigMF archive feature directly affects my current project, so I can justify a few days of full-time effort.

Cheers,
Brendan.

@Deschain

Deschain commented Sep 27, 2024

It seems that archive() uses mkdtemp() from the tempfile module to create a temporary directory:

tmpdir = tempfile.mkdtemp()

If I'm not mistaken, you can set the TMPDIR environment variable to specify where the temporary directory will be created. I am not sure this works on every system, so you should check which environment variable yours uses.
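
As a rough sketch, the same redirection can be done from inside the Python script before archiving. `redirect_tempdir` and the SSD path in the usage lines are hypothetical; the sketch assumes the archiving code calls `tempfile.mkdtemp()` without an explicit `dir` argument, as in the line quoted above.

```python
import os
import tempfile


def redirect_tempdir(path):
    """Point Python's default temporary directory at `path` (hypothetical helper)."""
    os.makedirs(path, exist_ok=True)
    os.environ["TMPDIR"] = path   # consulted by tempfile when choosing a default directory
    tempfile.tempdir = None       # clear the cached default so TMPDIR is re-read
    return tempfile.gettempdir()  # the directory mkdtemp() will use from now on


if __name__ == "__main__":
    # Example path only; substitute the mount point of the external drive.
    print(redirect_tempdir("/mnt/external_ssd/sigmf_tmp"))
    print(tempfile.mkdtemp())     # now created under the redirected directory
```

Setting `TMPDIR` in the shell before launching the script should have the same effect, since tempfile reads that variable the first time it picks a default directory.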

@bhorsfield
Author

@Deschain, thanks for the suggestion.

I have tried this on my target platform and achieved good results. During the archiving process I can see that my internal storage remains unchanged, while the sigmf-meta & sigmf-data files appear in the new temp directory I created on my external SSD (and are automatically deleted once the archiving process is complete, thankfully).

I think this will be a satisfactory "short-to-medium-term" solution to my original problem.

It remains to be seen whether further improvements can be achieved by writing directly into a tarball instead of copying the files into a temporary directory first, as suggested by @gmabey. I will perform some benchmarking experiments with different tar generating utilities when time permits.
