Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple Recordings in an Archive #21

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

jhazentia
Copy link

This pull request is intended to add support for multiple recordings in archives. Fixes #11. Please let me know if you have any suggested improvements. Below is a summary of the changes:

  • Added README examples for loading a SigMF archive with multiple recordings, creating a SigMF archive, and creating SigMF archives with multiple recordings
  • Modified SigMFArchive to accept multiple SigMFFiles
  • Renamed SigMFArchive name to path
  • Modified SigMFArchive to append to existing tarfile when a fileobj is passed that references an existing, open tarfile. Note that, if desired, this can be changed back to always overwrite existing file.
  • Added pretty parameter to SigMFArchive to control pretty printing of SigMF metadata in archives
  • Added name parameter to SigMFFile. SigMFArchive will use the SigMFFile name parameter to create recording parent directories/file names in archive
  • Renamed SigMFArchiveReader name to path
  • Modified SigMFArchiveReader to read archives containing multiple recordings. The __len__(), __iter__(), and __getitem__() operate on the list of SigMFFiles instead of individual SigMFFile
  • Modified SigMFMetafile dump() and dumps() methods to append newline to the JSON to make this behavior consistent throughout the code.
  • Added __eq__() method to SigMFFile for testing
  • Misc other supporting changes and docstring updates.
  • Added supporting tests

Note that this PR does not include any changes to include a collection in archive, read collection from archive, or to update gui.py.

These changes are based on my colleagues' (Douglas Anderson @djanderson and Todd Schumann @ToddSchumann) changes: https://github.com/ntia/sigmf/tree/multi-recording-archive.

@jhazentia
Copy link
Author

Also note that this fixes a bug where SigMFFile offset_and_size was ignored by archive.py

@777arc
Copy link
Member

777arc commented Dec 12, 2023

@gmabey what are your thoughts on this one?

Copy link
Contributor

@gmabey gmabey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like most of the changes, but I'm surprised if all the new features are supported in Python 3.6 -- if they are then that's great!
I will try to use this version in my projects and see how well it works in the next couple of days, but in general I'm concerned about a few points of backward compatibility.
@jhazentia could you please try running the tests (maybe you already have) in python 3.6 just to make sure it's good?

@@ -890,14 +944,23 @@ def get_dataset_filename_from_metadata(meta_fn, metadata=None):
return None


def fromarchive(archive_path, dir=None):
"""Extract an archive and return a SigMFFile.
def fromarchive(archive_path):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think it would be better (as it's less likely to break someone's existing code that specifies dir) to leave it as an argument, even if it's not used.


The `dir` parameter is no longer used as this function has been changed to
access SigMF archives without extracting them.
If the archive contains a single recording, a single SigMFFile object will
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems "squishy" to me -- I don't think I'm very comfortable with the prospect of the return type changing based on the contents of the SigMF Archive.
However I do acknowledge that it is an attempt to provide some level of backward compatibility with existing code.

@@ -511,13 +547,33 @@ def validate(self):
version = self.get_global_field(self.VERSION_KEY)
validate.validate(self._metadata, self.get_schema())

def archive(self, name=None, fileobj=None):
def archive(self, file_path=None, fileobj=None, pretty=True):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that file_path is truly a better parameter name, but I'm not immediately convinced that it's worth breaking code for someone who provided an explicit parameter name in the method call ...

@gmabey
Copy link
Contributor

gmabey commented Dec 14, 2023

@gmabey what are your thoughts on this one?

Thanks for tagging me on this -- this PR had fallen off my radar, and I think I'm currently unsubcribed from changes to this project.

@Teque5
Copy link
Collaborator

Teque5 commented Dec 14, 2023

If we want this feature then I'll probably need to dive in and resolve these merge conflicts. Lately been spending free coding time on AoC.

@jhazentia
Copy link
Author

@gmabey Thanks for the feedback! I plan to address your feedback, but it will probably be after the new year. Is Python 3.6 support still needed? It looks like Python 3.6 and 3.7 are end of life now.

@Teque5 I can work on the merge conflicts when I address the other feedback.

@gmabey
Copy link
Contributor

gmabey commented Dec 22, 2023

@jhazentia yeah so the reason why the SigMF project has persisted with python 3.6 support is (only because of) RedHat 8 ... that's just the reality of many users (including me), even though it's less than ideal.

@Teque5 Teque5 assigned jhazentia and unassigned Teque5 Jan 4, 2024
@Teque5
Copy link
Collaborator

Teque5 commented Jan 10, 2025

As I close all outstanding PRs, this one has been open for more than a year. As such I'm going to break it up into multiple smaller PRs and use @jhazentia's additions when possible starting with #98.

Since archive.py got mostly rewritten in #94 I'll wait to merge that before implementing the ability to archive one or more files, which was the original purpose of this PR.

@Teque5 Teque5 self-assigned this Jan 10, 2025
@jhazentia
Copy link
Author

Thanks @Teque5! I apologize I have not been able get back to this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to create an archive with multiple recordings
4 participants