Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trying to understand manual setup for new split package version of moment kinetics #177

Closed
mrhardman opened this issue Feb 1, 2024 · 22 comments · Fixed by #174
Closed

Comments

@mrhardman
Copy link
Collaborator

Using https://github.com/mabarnes/moment_kinetics/tree/geometry-upgrade-separate-packages with Julia 1.10 I am trying to understand the manual set up steps on a Linux machine to understand the machine setup scripts.

I fall down at the first hurdle https://mabarnes.github.io/moment_kinetics/previews/PR174/manual_setup/#Manual-setup

pkg> develop ./moment_kinetics/
ERROR: MethodError: no method matching get(::Pair{String,Any}, ::String, ::Nothing)
Closest candidates are:
  get(::Base.EnvDict, ::AbstractString, ::Any) at env.jl:77
  get(::REPL.Terminals.TTYTerminal, ::Any, ::Any) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/REPL/src/Terminals.jl:176
  get(::IdDict{K,V}, ::Any, ::Any) where {K, V} at abstractdict.jl:594

Is this a problem with Julia 1.1.0, or have I broken something locally in the repo?

@johnomotani
Copy link
Collaborator

Sorry, no idea what's happening there. Have you tried pointing julia at a fresh .julia directory? Packages left over from old versions might confuse things...

@mrhardman
Copy link
Collaborator Author

I already tried that, so I think a complete new git clone might be required.

@mrhardman
Copy link
Collaborator Author

Indeed, making a fresh new git clone of the repo seems to fix this, which seems to suggest that these install choices can be made once, and once only? So if I want to switch between working without post processing, to working with, I need to make a new repo?

@mrhardman
Copy link
Collaborator Author

Now we have this:

 pkg> develop ./moment_kinetics/
   Cloning default registries into `~/excalibur/moment_kinetics_newgeo/.julia`
   Cloning registry from "https://github.com/JuliaRegistries/General.git"
     Added registry `General` to `~/excalibur/moment_kinetics_newgeo/.julia/registries/General`
 Resolving package versions...
┌ Warning: julia version requirement for package moment_kinetics not satisfied
└ @ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Operations.jl:154
ERROR: Unsatisfiable requirements detected for package MPIPreferences [3da0fdf6]:
 MPIPreferences [3da0fdf6] log:
 ├─possible versions are: 0.1.0-0.1.10 or uninstalled
 ├─restricted to versions * by moment_kinetics [b5ff72cc], leaving only versions 0.1.0-0.1.10
 │ └─moment_kinetics [b5ff72cc] log:
 │   ├─possible versions are: 0.1.0 or uninstalled
 │   └─moment_kinetics [b5ff72cc] is fixed to version 0.1.0
 └─restricted by julia compatibility requirements to versions: uninstalled — no versions left

@johnomotani
Copy link
Collaborator

Indeed, making a fresh new git clone of the repo seems to fix this, which seems to suggest that these install choices can be made once, and once only? So if I want to switch between working without post processing, to working with, I need to make a new repo?

If you had an old Manifest.toml file in the directory, that could also be causing julia to try to use wrong/old versions of dependencies. You shouldn't ever need to make a new repo.

The options for setting up post-processing packages are described here: https://mabarnes.github.io/moment_kinetics/previews/PR174/developing/#Post-processing-packages

@johnomotani
Copy link
Collaborator

johnomotani commented Feb 1, 2024

Not sure what's going on with your error. There is some explanation of this kind of error here
https://discourse.julialang.org/t/a-guide-how-to-handle-error-unsatisfiable-requirements-detected/43406
but we don't pin any versions of MPIPreferences, so unless you have done something odd, that shouldn't be a problem.

In a fresh clone of the separate-postprocessing-packages-merge-geometry-upgrade, with an empty .julia, I can run

$ touch Project.toml
$ julia --project -O3
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.0 (2023-12-25)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> ]
(moment_kinetics-clean-test) pkg> dev ./moment_kinetics

and it completes successfully.

When you're in pkg> mode, you should see the name of the directory in brackets before the pkg> like I have above? If you have something like (@v1.10) pkg> you're in some global environment, which might have old packages installed.

@mrhardman
Copy link
Collaborator Author

Looking at these instructions https://mabarnes.github.io/moment_kinetics/previews/PR174/manual_setup/#Manual-setup

should there be a -- project in this call to julia if we have already emptied the Project.toml file by touch Project.toml?

$ julia
julia> ]
pkg> develop ./moment_kinetics/

@mrhardman
Copy link
Collaborator Author

More explicitly, after cloning the repo, I did this

hardman:~/excalibur/moment_kinetics_newgeo$ rm -rf .julia
hardman:~/excalibur/moment_kinetics_newgeo$ touch Project.toml
hardman:~/excalibur/moment_kinetics_newgeo$ julia --project
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.1.0 (2019-01-21)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

(moment_kinetics_newgeo) pkg> develop ./moment_kinetics/
   Cloning default registries into `~/excalibur/moment_kinetics_newgeo/.julia`
   Cloning registry from "https://github.com/JuliaRegistries/General.git"
     Added registry `General` to `~/excalibur/moment_kinetics_newgeo/.julia/registries/General`
 Resolving package versions...
┌ Warning: julia version requirement for package moment_kinetics not satisfied
└ @ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.1/Pkg/src/Operations.jl:154
ERROR: Unsatisfiable requirements detected for package MPIPreferences [3da0fdf6]:
 MPIPreferences [3da0fdf6] log:
 ├─possible versions are: 0.1.0-0.1.10 or uninstalled
 ├─restricted to versions * by moment_kinetics [b5ff72cc], leaving only versions 0.1.0-0.1.10
 │ └─moment_kinetics [b5ff72cc] log:
 │   ├─possible versions are: 0.1.0 or uninstalled
 │   └─moment_kinetics [b5ff72cc] is fixed to version 0.1.0
 └─restricted by julia compatibility requirements to versions: uninstalled — no versions left

should the --project be there? why did we always using --project before if we should start dropping it now?

@johnomotani
Copy link
Collaborator

Sorry, yes absolutely the --project should be there.

  • Ah, you've got julia version 1.1.0, not 1.10.0! That's probably the problem!

@mrhardman
Copy link
Collaborator Author

Sorry, yes absolutely the --project should be there.

  • Ah, you've got julia version 1.1.0, not 1.10.0! That's probably the problem!

OK, that was a bad typo on my part.

Trying now with Julia 1.10.0 in https://github.com/mabarnes/moment_kinetics/tree/geometry-upgrade-separate-packages I find a problem trying to add Symbolics. Is this error message telling me that I have broken our source, or is something else going on? I am trying to work on this in my version of the merged branch because I have gone through by hand here in the manufactured solns modules already.

(moment_kinetics_newgeo) pkg> add Symbolics
...
Precompiling project...
  ✗ moment_kinetics → manufactured_solns_ext
  49 dependencies successfully precompiled in 55 seconds. 120 already precompiled.
  2 dependencies had output during precompilation:
┌ Symbolics → SymbolicsForwardDiffExt
│  ┌ Warning: Module SymbolicsForwardDiffExt with build ID ffffffff-ffff-ffff-0117-d79ae0d5d96a is missing from the cache.
│  │ This may mean SymbolicsForwardDiffExt [4a213a23-c09c-5cde-9712-b631ad2c72df] does not support precompilation but is imported by a module that does.
│  └ @ Base loading.jl:1942
│  ┌ Error: Error during loading of extension SymbolicsForwardDiffExt of Symbolics, use `Base.retry_load_extensions()` to retry.
│  │   exception =
│  │    1-element ExceptionStack:
│  │    Declaring __precompile__(false) is not allowed in files that are being precompiled.
│  │    Stacktrace:
│  │      [1] _require(pkg::Base.PkgId, env::Nothing)
│  │        @ Base ./loading.jl:1946
│  │      [2] __require_prelocked(uuidkey::Base.PkgId, env::Nothing)
│  │        @ Base ./loading.jl:1806
│  │      [3] #invoke_in_world#3
│  │        @ Base ./essentials.jl:921 [inlined]
│  │      [4] invoke_in_world
│  │        @ Base ./essentials.jl:918 [inlined]
│  │      [5] _require_prelocked
│  │        @ Base ./loading.jl:1797 [inlined]
│  │      [6] _require_prelocked
│  │        @ Base ./loading.jl:1796 [inlined]
│  │      [7] run_extension_callbacks(extid::Base.ExtensionId)
│  │        @ Base ./loading.jl:1289
│  │      [8] run_extension_callbacks(pkgid::Base.PkgId)
│  │        @ Base ./loading.jl:1324
│  │      [9] run_package_callbacks(modkey::Base.PkgId)
│  │        @ Base ./loading.jl:1158
│  │     [10] __require_prelocked(uuidkey::Base.PkgId, env::String)
│  │        @ Base ./loading.jl:1813
│  │     [11] #invoke_in_world#3
│  │        @ Base ./essentials.jl:921 [inlined]
│  │     [12] invoke_in_world
│  │        @ Base ./essentials.jl:918 [inlined]
│  │     [13] _require_prelocked(uuidkey::Base.PkgId, env::String)
│  │        @ Base ./loading.jl:1797
│  │     [14] macro expansion
│  │        @ Base ./loading.jl:1784 [inlined]
│  │     [15] macro expansion
│  │        @ Base ./lock.jl:267 [inlined]
│  │     [16] __require(into::Module, mod::Symbol)
│  │        @ Base ./loading.jl:1747
│  │     [17] #invoke_in_world#3
│  │        @ Base ./essentials.jl:921 [inlined]
│  │     [18] invoke_in_world
│  │        @ Base ./essentials.jl:918 [inlined]
│  │     [19] require(into::Module, mod::Symbol)
│  │        @ Base ./loading.jl:1740
│  │     [20] include
│  │        @ Base ./Base.jl:495 [inlined]
│  │     [21] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt128}}, source::String)
│  │        @ Base ./loading.jl:2216
│  │     [22] top-level scope
│  │        @ stdin:3
│  │     [23] eval
│  │        @ Core ./boot.jl:385 [inlined]
│  │     [24] include_string(mapexpr::typeof(identity), mod::Module, code::String, filename::String)
│  │        @ Base ./loading.jl:2070
│  │     [25] include_string
│  │        @ Base ./loading.jl:2080 [inlined]
│  │     [26] exec_options(opts::Base.JLOptions)
│  │        @ Base ./client.jl:316
│  │     [27] _start()
│  │        @ Base ./client.jl:552
│  └ @ Base loading.jl:1295
└
┌ Symbolics → SymbolicsPreallocationToolsExt
│  ┌ Warning: Module SymbolicsPreallocationToolsExt with build ID ffffffff-ffff-ffff-0117-d7994e827fe2 is missing from the cache.
│  │ This may mean SymbolicsPreallocationToolsExt [d479e226-fb54-5ebe-a75e-a7af7f39127f] does not support precompilation but is imported by a module that does.
│  └ @ Base loading.jl:1942
│  ┌ Error: Error during loading of extension SymbolicsPreallocationToolsExt of Symbolics, use `Base.retry_load_extensions()` to retry.
│  │   exception =
│  │    1-element ExceptionStack:
│  │    Declaring __precompile__(false) is not allowed in files that are being precompiled.
│  │    Stacktrace:
│  │      [1] _require(pkg::Base.PkgId, env::Nothing)
│  │        @ Base ./loading.jl:1946
│  │      [2] __require_prelocked(uuidkey::Base.PkgId, env::Nothing)
│  │        @ Base ./loading.jl:1806
│  │      [3] #invoke_in_world#3
│  │        @ Base ./essentials.jl:921 [inlined]
│  │      [4] invoke_in_world
│  │        @ Base ./essentials.jl:918 [inlined]
│  │      [5] _require_prelocked
│  │        @ Base ./loading.jl:1797 [inlined]
│  │      [6] _require_prelocked
│  │        @ Base ./loading.jl:1796 [inlined]
│  │      [7] run_extension_callbacks(extid::Base.ExtensionId)
│  │        @ Base ./loading.jl:1289
│  │      [8] run_extension_callbacks(pkgid::Base.PkgId)
│  │        @ Base ./loading.jl:1324
│  │      [9] run_package_callbacks(modkey::Base.PkgId)
│  │        @ Base ./loading.jl:1158
│  │     [10] __require_prelocked(uuidkey::Base.PkgId, env::String)
│  │        @ Base ./loading.jl:1813
│  │     [11] #invoke_in_world#3
│  │        @ Base ./essentials.jl:921 [inlined]
│  │     [12] invoke_in_world
│  │        @ Base ./essentials.jl:918 [inlined]
│  │     [13] _require_prelocked(uuidkey::Base.PkgId, env::String)
│  │        @ Base ./loading.jl:1797
│  │     [14] macro expansion
│  │        @ Base ./loading.jl:1784 [inlined]
│  │     [15] macro expansion
│  │        @ Base ./lock.jl:267 [inlined]
│  │     [16] __require(into::Module, mod::Symbol)
│  │        @ Base ./loading.jl:1747
│  │     [17] #invoke_in_world#3
│  │        @ Base ./essentials.jl:921 [inlined]
│  │     [18] invoke_in_world
│  │        @ Base ./essentials.jl:918 [inlined]
│  │     [19] require(into::Module, mod::Symbol)
│  │        @ Base ./loading.jl:1740
│  │     [20] include
│  │        @ Base ./Base.jl:495 [inlined]
│  │     [21] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt128}}, source::String)
│  │        @ Base ./loading.jl:2216
│  │     [22] top-level scope
│  │        @ stdin:3
│  │     [23] eval
│  │        @ Core ./boot.jl:385 [inlined]
│  │     [24] include_string(mapexpr::typeof(identity), mod::Module, code::String, filename::String)
│  │        @ Base ./loading.jl:2070
│  │     [25] include_string
│  │        @ Base ./loading.jl:2080 [inlined]
│  │     [26] exec_options(opts::Base.JLOptions)
│  │        @ Base ./client.jl:316
│  │     [27] _start()
│  │        @ Base ./client.jl:552
│  └ @ Base loading.jl:1295
└

@johnomotani
Copy link
Collaborator

I seem to see errors like this fairly often at the moment. The package seems to work anyway - I don't know what's going on, but currently hoping that Symbolics (or something?) will update and stop this happening without us having to worry about it.

@johnomotani
Copy link
Collaborator

PS on the branch in #178, the 1D-mirror_MMS_ngrid_9_nel_r_1_z_4_vpa_4_vperp_2_diss.toml example does at least run...

@mrhardman
Copy link
Collaborator Author

Indeed, the last error was a result of unreported errors in the compilation of the external manufactured solutions library, as fortunately indicated in a comment by @johnomotani.

@mrhardman
Copy link
Collaborator Author

PS on the branch in #178, the 1D-mirror_MMS_ngrid_9_nel_r_1_z_4_vpa_4_vperp_2_diss.toml example does at least run...

Ah yes, but did you catch all my annoying geometry changes ? ; )

I'll compare our branches after mine runs, I think I am almost there.

@mrhardman
Copy link
Collaborator Author

The manual startup instructions (https://mabarnes.github.io/moment_kinetics/previews/PR174/manual_setup/#Manual-setup) seem to have a few missing details. For my manual setup of https://github.com/mabarnes/moment_kinetics/tree/geometry-upgrade-separate-packages, I had to change a few details.

  1. MPI was not automatically installed, therefore we need a
pkg > add MPI
  1. Symbolics has to be installed by choice with
pkg> add Symbolics
  1. A typo in the MPIPreferences discussion (missing using)
pkg> add MPIPreferences
pkg> <press 'backspace'>
julia> using MPIPreferences; MPIPreferences.use_system_binary()
  1. No statements about setting the ENV["PYTHON"] variable and rebuilding PyCall with the correct system python.
$ which python
/your/python/location
$ julia -O3 --project
julia> ENV["PYTHON"]="/your/python/location"
julia> using Pkg; Pkg.build("PyCall")

@mrhardman
Copy link
Collaborator Author

PS on the branch in #178, the 1D-mirror_MMS_ngrid_9_nel_r_1_z_4_vpa_4_vperp_2_diss.toml example does at least run...

I am now able to confirm that the simulations run on my merge branch, but it seems like there are merge mistakes which make the MMS tests fail. Just trying to check if this is a problem with the number of cores used in the simulation.

I suspect that if there are problems it will come from the splitting of the manufactured solutions module, which I am not sure is worth the pain, since we will almost always want to have that module enabled for future higher dimensional development.

@mrhardman
Copy link
Collaborator Author

mrhardman commented Feb 2, 2024

I have now fixed the merge branch https://github.com/mabarnes/moment_kinetics/tree/geometry-upgrade-separate-packages and made a PR to the other merge branch by @johnomotani for comparison. I have verified that my merge branch behaves properly on one core for both simulations and diagnostics, but I still have to understand what is wrong for the multi-core jobs. I suspect it could be down to the compilation problems with MPI, but TBC. I may look in the creation of the .so files to fix this.

EDIT: It seems that the problem with MPI is either intermittent, or my last changes somehow fixed it.

@johnomotani
Copy link
Collaborator

Something to watch out for - the diff on the github.com PRs (or the .../compare page) seems to compare the 'new' branch to the last common commit in the base branch, ignoring any newer changes on the base branch. In other words, it's not the same as the output of git diff. I don't know how to do a git diff on github.com, it's probably easier to just run it in your local repo. I did a git diff, and it looks like the differences between our merged branches are whitespace, print statements, and a couple of very small fixes/improvements that I made to machines/shared/machine_setup.jl

@mrhardman
Copy link
Collaborator Author

Should I do a local merge then and re-run my tests, and let you know how it goes?

@johnomotani
Copy link
Collaborator

Should I do a local merge then and re-run my tests, and let you know how it goes?

Sounds good.

@mrhardman
Copy link
Collaborator Author

P.S. Thanks for taking a look!

General potentially unrelated question: I am still running with this new branch without making an .so precompiled file. I seem to notice that for large core counts (>32), precompile time (time before "starting setup") tends to scale linearly with number of cores. Is this expected? Does the "precompile time" scale with core count if the .so is created in advance?

@johnomotani
Copy link
Collaborator

General potentially unrelated question: I am still running with this new branch without making an .so precompiled file. I seem to notice that for large core counts (>32), precompile time (time before "starting setup") tends to scale linearly with number of cores. Is this expected?

Not sure. With julia-1.10.0 it should be better than it was before. I wouldn't have expected the precompile/compile time to depend on number of cores, but I guess it's possible that e.g. the disk accesses compiling functions, etc. cause competition between the processes.

Does the "precompile time" scale with core count if the .so is created in advance?

No, at least not as far as I've noticed on ARCHER2 - using a .so, the time to start the time advance is well under a minute.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants