job crashes early in hdfio #1422
Comments
Hm, there's not a single line coming from Sphinx in the error message. Do you have a small code snippet to reproduce the error?
Could it be that there's a stray entry in the database from a time when you deleted the job files manually outside of pyiron?
Can you also maybe try to see whether a different version of pyiron helps? It might help us figure out which changes could have caused the problem.
Changing to pyiron/2024-05-20 seemed to help. I was on pyiron/latest before, which apparently is NOT the latest. Is it possible that the pyiron version used on the cluster is incompatible with the pyiron/latest on the login node? This is a VERY frustrating experience I am having here. Loads of incomprehensible warnings. Error messages with zero information value. 'Objects can be only recovered from hdf5 if TYPE is given' is essentially a 'Some error occurred'. I am closing the ticket; there is nothing to gain here any more.
@niklassiemer Can you comment on this?
Hmmm, to my taste the issue got closed a bit too early. If there are updates, I would appreciate it if you posted them here.
pyiron/latest is indeed the hand-updated version with python3.10 after all, which was somewhat older than the docker-stack build from yesterday. However, the version on the cluster and the one on the login node should not differ! Actually, the kernel chosen in the notebook should also be loaded on the compute node by preserving the environment. If this is not the case, I need to know and find a solution!
Got the problem again, with the new kernel. So it's not about the python kernel. I solved the problem again, this time by avoiding a minus sign in the job name. I may have done this last time, too. Is it possible that a minus sign in the job name causes issues? It seems reproducible.
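A minimal sketch of the workaround described in this comment, assuming the only requirement is that the job name contains letters, digits, and underscores. `safe_job_name` is a hypothetical helper and the example name is made up; neither is part of pyiron.

```python
import re


def safe_job_name(name: str) -> str:
    """Hypothetical helper (not pyiron API): keep only letters, digits, '_'."""
    # Replace every other character, including '-', with an underscore.
    cleaned = re.sub(r"[^0-9A-Za-z_]", "_", name)
    # Avoid empty names and names that start with a digit.
    if not cleaned or cleaned[0].isdigit():
        cleaned = "job_" + cleaned
    return cleaned


print(safe_job_name("sx-restart-1"))
```

Passing the sanitized string wherever the job name is chosen keeps the name identical everywhere it is stored, so no later normalization step has anything to change.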
Another thought: there could be some inconsistency in the name normalization. For the hdf5 file, '-' seems to be replaced by m; in the job table, the '-' is still there. In the working directory, it becomes
Thanks for coming back to this! This could indeed be a reason! I opened an issue on
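The suspected mismatch can be illustrated with a toy model. It assumes only what was observed in this thread: '-' shows up as 'm' in the hdf5 file name while the job table keeps the raw name. The function names and the example job names are hypothetical, and this is not how pyiron implements the lookup.

```python
def hdf_name(job_name: str) -> str:
    # Assumption taken from the thread: '-' appears as 'm' in the hdf5 file name.
    return job_name.replace("-", "m")


def find_mismatches(job_table_names, hdf_file_stems):
    """Return job-table names that only match an hdf5 stem after normalization."""
    stems = set(hdf_file_stems)
    return [
        name
        for name in job_table_names
        if name not in stems and hdf_name(name) in stems
    ]


# Hypothetical job names: one with a minus sign, one without.
print(find_mismatches(["sx-restart", "sx_ref"], ["sxmrestart", "sx_ref"]))
```

Any name the function returns is one where a lookup by the raw job-table name would miss the file that was actually written, which would be consistent with the failure only appearing for names containing a minus sign.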
Summary
A SPHInX (restart) job fails to run due to failures in hdf5io. The error message is "ValueError: Objects can be only recovered from hdf5 if TYPE is given".
I cannot tell if this is related to restart.
pyiron Version and Platform
cmti
Expected Behavior
Job runs.
Actual Behavior
Job crashes.
Job execution crashes with the following error.out
Steps to Reproduce
Unknown. Deleting the job and setting it up again reproduces the error.