Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EDF open fails when physical max and min are both 0 #256

Open
adammj opened this issue Jun 17, 2024 · 7 comments
Open

EDF open fails when physical max and min are both 0 #256

adammj opened this issue Jun 17, 2024 · 7 comments

Comments

@adammj
Copy link

adammj commented Jun 17, 2024

Line 214 of edflib.c tests for the following condition edfhdr->edfparam[i].phys_max==edfhdr->edfparam[i].phys_min), which then throws an error.

I understand in a technical sense that this shouldn't be possible. However, this is the real world, and I have thousands of EDF files (recorded by various groups) that forgot to set the physical max+min for 1 or more channels (they are both set to 0). This leads to the condition that this EDF library will refuse to open those files, even though only 1 or 2 channels are "bad". However, some others, like the native one in MATLAB and some EDF viewers, will gladly open these same files.

I'd like to propose that instead the file still be opened (maybe in all cases, but certainly in the case that both physical min and max are 0). Looking over the code elsewhere, I don't see any instances where this would cause new issues. It's just that reading back the physical values for the "bad" channel would all be 0s. Whereas reading the digital values will still report whatever was actually stored.

Another option, which I think would require more work, is to modify the initial load to act like those channels don't exist.

If this is acceptable, I can certainly make the changes.

@skjerns
Copy link
Collaborator

skjerns commented Jun 18, 2024

That's a good idea, thanks!

Yes, there are quite a couple of things that the original C library is quite strict on, not only this, see e.g. #114 , however we are only a wrapper for edflib

However, the more additions we include and monkey-patch in edflib.c, the harder it get's to keep up with upstream changes. It was already a pain to update to the most recent version, which is already a few years out of date.

On the long run it would be great to keep the edflib.c as is from upstream and insert all additions as patches elsewhere, however, I haven't looked into what would be the best approach for that.

Do you have suggestions on how to solve this?

@adammj
Copy link
Author

adammj commented Jun 19, 2024

I noticed the upstream library after I submitted this, and have already heard back from the developer that they won't allow these sorts of compromises. I guess they haven't heard of Postel's law ("be conservative in what you do, be liberal in what you accept from others").

Regardless, now that I understand your desire to minimize work in keeping up with upstream changes, I'll mull it over and see if I can come up with a better solution than the one I currently implemented on my local copy.

I'll get back with you.

@skjerns
Copy link
Collaborator

skjerns commented Jun 19, 2024

Yes, teuniz is quite strict with implementation specifics, but I guess it has it's advantages.

If you find a way to implement changes without altering the C code that would be great! I'm not very proficient at C unfortunately and I haven't gotten around to look into options on how to monkey-patch or overload certain functions. That would be very useful for other things as well and for long-term maintainability.

@adammj
Copy link
Author

adammj commented Jun 21, 2024

Unfortunately, I don't see a way around altering the C code (which I have experience with). Looking at the teuniz's C and Python code closer, and testing it against my collection of 30k EDF files, I've found an additional half dozen+ cases distributed over 4k files which also fail. It's obvious that the code is way too strict in what it allows (and their "solution" of providing helper utilities to rewrite the EDF files is not acceptable where strict provenance is required—which should be all of science).

@skjerns
Copy link
Collaborator

skjerns commented Jun 21, 2024

Thanks!

As far as my research goes, using macros to redirect the function calls could work?

Eg


#define foo my_foo  // Redirect foo to my_foo

#include "edflib.h"  // Include the original header

#undef foo  // Remove the macro redirection to avoid pollution

// Your code that calls foo()
int main() {
    int result = foo(10);  // This will call my_foo(10)
    return 0;
}

@TrianecT-Wouter
Copy link

Could you not just set the phys_min to 0 and (if phys_max is also 0), the phys_max to 0.000001? Then the loading will continue and the data will still be all 0.

In most of the cases these channels probably contain 0's anyway? (reason for faulty phys_min and phys_max)

@adammj
Copy link
Author

adammj commented Dec 17, 2024

Then the loading will continue and the data will still be all 0.

In most of the cases these channels probably contain 0's anyway? (reason for faulty phys_min and phys_max)

No, this is not the case. The "problem" is whatever tool wrote the EDF files set the physical min and max as 0, when there is still valid data, probably because the conversion factor from digital to physical wasn't set. This is a shortcoming of the EDF format.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants