
Hanging with mpi4py.future #106

Open

jychoi-hpc opened this issue Jun 26, 2020 · 7 comments

Comments

@jychoi-hpc

I am trying to use the Score-P Python bindings with mpi4py.futures, without success. My code just hangs without making progress.

Here is my example code (future_hello.py):

from mpi4py import MPI
from mpi4py.futures import MPICommExecutor

def hello(x):
    name = MPI.Get_processor_name()
    me = MPI.COMM_WORLD.Get_rank()
    nproc = MPI.COMM_WORLD.Get_size()
    print (me, nproc, name, 'arg', x)

if __name__ == '__main__':
    nproc = MPI.COMM_WORLD.Get_size()
    with MPICommExecutor(MPI.COMM_WORLD, root=0) as executor:
        if executor is not None:
            for i in range(nproc*2):
                future = executor.submit(hello, i)
            print ("Done.")

I am trying to run as follows:

srun -n 4 python -m scorep --mpp=mpi --thread=pthread future_hello.py 

Without Score-P tracing, I expect output something like:

1 4 nid02401 arg 0
2 4 nid02401 arg 1
3 4 nid02401 arg 2
2 4 nid02401 arg 3
3 4 nid02401 arg 4
1 4 nid02401 arg 5
2 4 nid02401 arg 6
3 4 nid02401 arg 7

I am wondering if this is an expected error or if there is any fix I can try.

I appreciate any advice in advance.

@AndreasGocht
Collaborator

Hey,

I can reproduce the issue. Which MPI implementation do you use?

Best,

Andreas

@jychoi-hpc
Author

Thank you for looking at this.

The above example is from Cori at NERSC, which uses MPICH. I also tested with Open MPI on my local desktop and saw the same problem.

@AndreasGocht
Collaborator

It took me a while to dig through mpi4py and understand what happens. It looks like a Score-P bug, as I was able to reproduce the same behaviour with pure MPI C code.

I'll raise a ticket with the Score-P developers, but I am not sure how fast this issue can be solved.

Basically it seems to relate to the way MPI_Comm_create and then MPI_Intercomm_create are used. The related lines are (a simplified sketch of the pattern follows the links):
https://bitbucket.org/mpi4py/mpi4py/src/6ad13434227f5afcdeb1d733e4eb121d17b50ed1/src/mpi4py/futures/_lib.py#lines-242
https://bitbucket.org/mpi4py/mpi4py/src/6ad13434227f5afcdeb1d733e4eb121d17b50ed1/src/mpi4py/MPI/Comm.pyx#lines-165
https://bitbucket.org/mpi4py/mpi4py/src/6ad13434227f5afcdeb1d733e4eb121d17b50ed1/src/mpi4py/MPI/Comm.pyx#lines-1394
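
For reference, the pattern those lines boil down to is roughly the following minimal sketch (not the actual mpi4py.futures code; run with three or more ranks to match the reproducer below):

from mpi4py import MPI

world = MPI.COMM_WORLD
rank = world.Get_rank()
world_group = world.Get_group()

# Rank 0 forms the "root" group, all other ranks form the "non-root" group.
if rank == 0:
    group = world_group.Incl([0])
    remote_leader = 1  # world rank of the other group's leader
else:
    group = world_group.Excl([0])
    remote_leader = 0

groupcomm = world.Create(group)  # MPI_Comm_create
intercomm = groupcomm.Create_intercomm(0, world, remote_leader, 0)  # MPI_Intercomm_create

print(rank, 'remote group size:', intercomm.Get_remote_size())

intercomm.Free()
groupcomm.Free()
group.Free()
world_group.Free()

This follows the same MPI_Comm_create / MPI_Intercomm_create sequence as the C reproducer posted further below.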

Best,

Andreas

@AndreasGocht
Collaborator

It turns out that this is a known open issue in Score-P:

  • If an application uses MPI inter-communicators, Score-P
    measurement will hang during the creation of the communicator.

There is nothing I can do from the Score-P Python bindings side. It might be possible to patch the mpi4py code. However, I'll document my findings here and leave the ticket open until a solution is found.

Sorry that I do not have more positive news.

Best,

Andreas

@AndreasGocht
Collaborator

Example source code

Below is a basic C example which reproduces the issue. It only happens with three or more processes. The code works without Score-P.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) 
{ 
    MPI_Group MPI_GROUP_WORLD, group; 
    MPI_Comm groupcomm;
    MPI_Comm intercomm;
    
    static int list_a[] = {0}; 
    int global_rank = -1;     
    int size_list_a = sizeof(list_a)/sizeof(int); 
    
    MPI_Init(&argc, &argv);     
    MPI_Comm_rank(MPI_COMM_WORLD, &global_rank); 
    MPI_Comm_group(MPI_COMM_WORLD, &MPI_GROUP_WORLD); 

    if (global_rank == 0)
    {    
        MPI_Group_incl(MPI_GROUP_WORLD, size_list_a, list_a, &group);
    }
    else
    {
        MPI_Group_excl(MPI_GROUP_WORLD, size_list_a, list_a, &group);
    }
    
    int remote_leader;
    if (global_rank == 0)
    {
        remote_leader = 1;
    }
    else
    {
        remote_leader = 0;
    }
    
    fprintf(stderr,"RANK %d begin MPI_Comm_create\n", global_rank);
    MPI_Comm_create(MPI_COMM_WORLD, group, &groupcomm); 
    fprintf(stderr,"RANK %d end MPI_Comm_create\n", global_rank);
    
    fprintf(stderr,"RANK %d begin MPI_Intercomm_create\n", global_rank);
    MPI_Intercomm_create(groupcomm, 0, MPI_COMM_WORLD, remote_leader, 0, &intercomm);
    fprintf(stderr,"RANK %d end MPI_Intercomm_create\n", global_rank);
    
    int local_rank = -1;
    MPI_Comm_rank(groupcomm, &local_rank); /* rank within this rank's group communicator */
        
    printf("my_global_rank %d, my_local_rank %d \n",global_rank, local_rank);
    
    MPI_Comm_free(&groupcomm); 
    MPI_Comm_free(&intercomm); 
    
    MPI_Group_free(&group); 
    MPI_Group_free(&MPI_GROUP_WORLD); 
    MPI_Finalize(); 
} 

Execute

scorep mpicc mpi_comm_create_example.c -o test
mpiexec -n 3 ./test

Program description

The program does the following:

  • Two communicators are created with MPI_Comm_create: one containing only rank 0 (the root group) and one containing everything except rank 0 (the non-root group).
  • An intercommunicator is established with MPI_Intercomm_create:
    • between the local rank 0 of the non-root group and world rank 0
    • between the local rank 0 of the root group and world rank 1
  • With Score-P the program deadlocks; without it, the code finishes.

Analysis

Looking into the Score-P (6.0) code with DDT, it turns out that rank 0 stops at the PMPI_Barrier() in MPI_Finalize() (SCOREP_Mpi_Env.c:314), while the other ranks (the non-root group) wait for a PMPI_Bcast() in scorep_mpi_comm_create_id() (scorep_mpi_communicator_mgmt.c:267) that never completes.

@maximilian-tech

Tracing of MPI intercommunicators is now supported in Score-P 8.0, so the code no longer hangs.

However, the following code

from mpi4py import MPI
from mpi4py.futures import MPICommExecutor

def hello(x):
    name = MPI.Get_processor_name()
    me = MPI.COMM_WORLD.Get_rank()
    nproc = MPI.COMM_WORLD.Get_size()
    print (me, nproc, name, 'arg', x)

if __name__ == '__main__':
    nproc = MPI.COMM_WORLD.Get_size()
    with MPICommExecutor(MPI.COMM_WORLD, root=0) as executor:
        if executor is not None:
            futures = []
            for i in range(nproc*2):
                future = executor.submit(hello, i)
                futures.append(future)
            for future in futures:
                future.result()
            print ("Done.")

leads to

$ mpirun -np 3 python  -m scorep --mpp=mpi --thread=pthread future_hello.py
...
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/maxi/.local/lib/python3.10/site-packages/scorep/__main__.py", line 142, in <module>
    scorep_main()
  File "/home/maxi/.local/lib/python3.10/site-packages/scorep/__main__.py", line 119, in scorep_main
    tracer.run(code, globs, globs)
  File "/home/maxi/.local/lib/python3.10/site-packages/scorep/_instrumenters/scorep_instrumenter.py", line 55, in run
    exec(cmd, globals, locals)
  File "future_hello.py", line 19, in <module>
    future.result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
_pickle.PicklingError: Can't pickle <function hello at 0x7f5ede003400>: attribute lookup hello on __main__ failed

This only occurs when using Score-P.
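
A possible workaround (not verified here under Score-P) is to move the task function out of __main__ into a separate, importable module so that pickle can resolve it by reference; the module name tasks.py below is just an illustrative assumption:

# tasks.py -- hypothetical helper module holding the task function
from mpi4py import MPI

def hello(x):
    name = MPI.Get_processor_name()
    me = MPI.COMM_WORLD.Get_rank()
    nproc = MPI.COMM_WORLD.Get_size()
    print(me, nproc, name, 'arg', x)

# future_hello.py -- imports hello instead of defining it in __main__
from mpi4py import MPI
from mpi4py.futures import MPICommExecutor
from tasks import hello

if __name__ == '__main__':
    nproc = MPI.COMM_WORLD.Get_size()
    with MPICommExecutor(MPI.COMM_WORLD, root=0) as executor:
        if executor is not None:
            futures = [executor.submit(hello, i) for i in range(nproc * 2)]
            for future in futures:
                future.result()
            print("Done.")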

@NanoNabla
Collaborator

This seems to be a different issue. Moreover, the code does not hang anymore but raises an exception instead. I can reproduce the error without using an MPI environment.

I will take a look at this issue in #157.
