-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
move gs.RETURNN_PYTHON_EXE, gs.RETURNN_ROOT and maybe others #254
Comments
Yes, in my opinion the default should not be changed in general but rather overwritten by explicitly providing an argument to the job.
No, also the env (I think you are referring to Same as:
I think we should be able to remove all these settings if we decide we want to. The Users would then have to adapt their configs If they are using a specific version via |
No, I mean I don't really see why the versions of the files in
So you mean, you would want to break old Sisyphus setups for this? |
then you also should not use this to provide any executables to the recipes but provide everything explicitly. At AppTek we have implemented a means to run Jobs inside a singularity container, thereby also making the installed binaries and even the OS version explicit.
I don't see a possibility to stop (ab)using the sisyphus global settings and not break the old setups. |
Well, my argument is actually that it's fine to have some global defaults. Like things you have installed in
Yes, this sounds good. But this is also not an argument to the jobs, or is it? So, following your argumentation, you would say that this also should be an argument to the jobs? Also, it would be an argument to basically every single job, so we would need to rewrite all of i6_core? But my argument basically is that there is some default via a settings file (in this case via the Sis global settings file + system installation) and this is fine. My argument is that this should be allowed and up to the user.
Yes but still we maybe could clean this up somehow. Or maybe not really? Asked more generally, where to make the distinction? E.g. FFmpeg. There the default in related jobs would look up in the Or I see My argument is that it is fine to have global settings for such things, and it's also fine that the user can overwrite the global settings (e.g. control whatever is installed in the system). Or TensorFlow version? CUDA version? |
Ah, you mean that instead of using I like this solution. I would prefer the |
The motivation from my side to have DEFAULT_ENVIRONMENT_SET and DEFAULT_ENVIRONMENT_KEEP is to make sure that the environment where each job is executed is the same not matter how a user configured his environment and not to influence the workflow. In an ideal world I should be able to just copy the recipe and config folder from someone else and be able to create the same results. In my opinion basic stuff like tr, wget, sed, etc. should be set globally, and specialized tools be set inside the workflow/config. The line between basic stuff not specialized tools is as so many things in life blurry... e.g. is a self compiled ffmpeg version basic or special? Having default values for everything inside a workflow makes it much easier to get started and ideally you only need to overwrite stuff which you are currently working on. So I think default values should be given and they should be inside the recipes. Overwriting default values for all cases could be done inside the recipes or a special file inside the recipes. If you want to mix versions inside the same setup you can also do it inside the actual config file which is called to start the workflow and pass it down to where ever you need it, but this make the code more complex and is, in my experience, not needed in most cases. |
My argument is just that I don't quite see why we think it is ok that wget, Python env, TensorFlow version etc are treated as global settings (not job arguments) while we handle RETURNN/RASR etc differently. Or I see why sometimes you want to have some tools as optional explicit arguments (although I don't see that for RETURNN, as we always keep backward compatibility very strictly...) but the default fallback, why can't it be treated in a same way as the system env (wget, ffmpeg, etc) and Python env, which you configure via the Sis global settings file. This is basically my argument, or how I would like to have it. When you don't provide an explicit argument for some tool (e.g. RETURNN), I think the default fallback should be handled in the same way as we do for other system tools. I don't mean that we should make RETURNN available in Basically, also as we already have it right now, by "abusing" the Sis user global settings file for also other tools. I'm basically fine with how it is, except that @critias and @michelwi argue that the Sis user global settings file should only be about Sis things, so we would want a separate user global settings file for other tools. In case of RETURNN specifically, it always should be safe to use the latest master version. Maybe you disagree. But there are no reported issues at all on this. If you ever encounter any issue with backward compatibility, we always treated that very seriously and fixed it, and made sure via adding a test that this would not break again. Also we never really made any behavioral changes, even if the defaults were badly chosen. For that reason, we introduced explicit behavior versions in RETURNN, so that you can explicitly configure the behavior version. But anyway, this is maybe off topic here now, as |
In many cases this is up to the user. In my setups I am not treating the python env for Returnn or TensorFlow as global setting. Also, there is really no consistent decision on this. So in the end this just evolved naturally as needed. This is why you can set e.g. RASR, RETURNN, Tensorflow, ffmpeg specifically for jobs, but other things not. It was just added where this was needed at some point. In the "beginning" (e.g. the original setup from years ago) everything was global. I personally would discourage people from using the global settings, but I understand that for many users this okay to use (for me it is not, because I have many different systems under one Sisyphus experiment folder). |
Yes, exactly. This is also what I'm saying. It is up to the user, i.e. the user should have such possibility to still use it as global settings. Or even mix both, i.e. have one global setting (used by default, by This issue here is just about the technical question on how to allow this. Specifically, the argument has been made that (Despite, I still don't really see the need to ever use a custom version of RETURNN. If you need that, you might have messed sth up, or have your own RETURNN fork, which is highly discouraged because you cannot share your work anymore with others. But again, this is a separate discussion, and please post any issues in the RETURNN GitHub issue tracker if there are any issues, or talk to me. On Python or TensorFlow version, yes, maybe that can make a small difference in some cases. If we maybe can do anything about this, and are not doing it yet, then also open an issue in the RETURNN tracker. But in any case, also here I would always recommend to use the most recent version which works in your env. Definitely Python >=3.6 and TF >=2. It is important that your setup works with the latest Python/TF version again such that you can share it with others and that it also works in the future.) |
sisyphus.global_settings was not meant to be used inside recipes. It's there to define sisyphus specific settings, e.g. timeouts, which engines to use or which environment should be set during job execution.
Workflow related things should be defined inside the recipes or config directory. If you want to define all external paths in one place you could create a file either inside the recipe folder or, especially if you don't want to cleanly share it across multiple recipes, inside your config folder. This file can then be imported in all relevant places. It would be possible to define the used path inside a tk.Path object there, or use the output of a job which sets everything up.
Originally posted by @critias in rwth-i6/sisyphus#82 (comment)
Of course, we need to keep backward compatibility now, so
gs.RETURNN_PYTHON_EXE
,gs.RETURNN_ROOT
still needs to stay, or is maybe automatically used as default fallback in the new settings specific location.So, does anyone has a suggestion how this should look like?
From the followup discussion in rwth-i6/sisyphus#82 (comment), it could look like:
We have a
recipes.i6_core.settings
orrecipes.i6_core.defaults
file. And similarly as insisyphus.global_settings
, at the end of the file, we use a mechanism similar toupdate_global_settings_from_file
which loadsconfig.i6_core_settings
if it exists, which potentially overwrites any values fromrecipes.i6_core.settings
.The text was updated successfully, but these errors were encountered: