reading in many best tracks... #124

Open · krober10nd opened this issue Aug 30, 2021 · 7 comments

@krober10nd
Collaborator

  • Creating 10 or so BestTrack objects in a loop leads to a jam-up.
  • For example, something like...
       storms = [f"{track}R1.trk" for track in track_numbers]]
       files = [pathlib.Path(f"{pold}/{storm}") for storm in storms]
       mesh = AdcircMesh.open(mesh_path, crs=4326)
       bbox = mesh.get_bbox(output_type="bbox")
       for k, (storm_name, storm) in enumerate(zip( files, storms)):                                     
           t1 = time.time()                      
           print("about to read it in...")                                                                           
           bt = BestTrackForcing(storm_name, nws=8)
           try:                                   
               print("about to clip...")          
               bt.clip_to_bbox(bbox, mesh.crs)     
           except:                                                                       
               print(f"failure on storm {storm}")        
               pass                                                           
           filenew = pathlib.Path(f"{pnew}/{storm}")
           print("about to write")
           bt.write(filenew, overwrite=True)
           print(f"Processed storm {storm}")
           print(f"Elapsed time is: {time.time()-t1}")
@krober10nd
Collaborator Author

It hangs on BestTrackForcing when the class constructor is called.
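
A minimal way to double-check that it really is the constructor that stalls (a sketch, not from the original run; "1000R1.trk" is a placeholder path and nws=8 mirrors the snippet above) is to call it in a child process with a timeout:

    # Sketch: run the constructor in a child process and give up after a timeout,
    # so the parent can report whether __init__ ever returned.
    from multiprocessing import Process

    from adcircpy.forcing.winds import BestTrackForcing


    def _construct(path):
        BestTrackForcing(path, nws=8)


    if __name__ == "__main__":
        p = Process(target=_construct, args=("1000R1.trk",))
        p.start()
        p.join(timeout=60)  # allow the constructor one minute
        if p.is_alive():
            print("constructor still hanging after 60 s")
            p.terminate()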

@jreniel
Collaborator

jreniel commented Aug 31, 2021

Could you try running this with htop open and take a look at the RAM usage?
It's weird that it would hang on __init__ because as far as I know, no heavy computation should be happening there.
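
The resident set size can also be logged from inside the loop itself (a sketch; psutil is an extra dependency, not part of adcircpy, and the storm file name is a placeholder standing in for the loop variable above):

    # Sketch: print the process RSS before and after a constructor call,
    # assuming psutil is installed (pip install psutil).
    import psutil
    from adcircpy.forcing.winds import BestTrackForcing

    proc = psutil.Process()

    def rss_mb():
        # resident set size of the current process, in megabytes
        return proc.memory_info().rss / 1e6

    storm_name = "1000R1.trk"  # placeholder path
    print(f"RSS before: {rss_mb():.1f} MB")
    bt = BestTrackForcing(storm_name, nws=8)
    print(f"RSS after:  {rss_mb():.1f} MB")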

@jreniel
Collaborator

jreniel commented Aug 31, 2021

Otherwise, can you send me a bunch of sample files (fake data or random numbers are fine) so I can try to reproduce?

@krober10nd
Collaborator Author

krober10nd commented Aug 31, 2021

Yea, memory isn't the issue. I suspect it may have to do with hyperthreading somehow. On our client's cloud computer, when I deactivated hyperthreading, I was able to plough through. I'm batch processing another 4000 tracks right now without hyperthreading; let's see if it jams. If it does, I'll send you some files by email.
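
A quick way to see whether hyperthreading/SMT is in play on a given machine (a sketch; psutil is an assumption here, since os.cpu_count() alone only reports logical CPUs):

    # Sketch: compare logical CPUs (hyperthreads included) with physical cores.
    import os

    import psutil

    logical = os.cpu_count()                    # logical CPUs
    physical = psutil.cpu_count(logical=False)  # physical cores only
    print(f"logical CPUs: {logical}, physical cores: {physical}")
    if logical and physical and logical > physical:
        print("hyperthreading/SMT appears to be enabled")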

@krober10nd
Collaborator Author

Previously I was doing this with processes=cpu_count(); now I'm setting it to the number of physical cores.

"""Run through all the events and clip the tracks associated
   with each event to the model domain.

   Format to the desired NWS format as well
"""
import pathlib
import time

import numpy as np
from adcircpy import AdcircMesh
from adcircpy.forcing.winds import BestTrackForcing
from multiprocess import Pool, cpu_count

mesh_path = pathlib.Path("/work/meshes/SIRA25m_P_NO_OCEAN_BOUNDARY.14")

mesh = AdcircMesh.open(mesh_path, crs=4326)

NWS = 8  # or 19 or 20

events = [
    "Battery_AL_ncep_reanal",
]

track_numbers = range(1000, 8001)  # all tracks

def my_best_track_processing_function(storm, bbox):
    print(storm, flush=True)
    bt = BestTrackForcing(storm, nws=NWS)
    try:
        bt.clip_to_bbox(bbox, mesh.crs)
    except Exception:
        print(f"failure on track {storm}")
    return bt


bbox = mesh.get_bbox(output_type="bbox")


for event in events:
    print(f"Processing event: {event}")

    # set up new and old path. Files are taken from pold and written to pnew
    pold = f"/work/forced/ncep/wind_data/unclipped/"
    pnew = f"/work/forced/ncep/wind_data/clipped/"

    pathlib.Path(pnew).mkdir(parents=True, exist_ok=True)

    storms = [f"{track}R1.trk" for track in track_numbers[1000:4000]]

    files = [pathlib.Path(f"{pold}/{storm}") for storm in storms]

    t1 = time.time()
    # 48 = the number of physical cores here; previously this was cpu_count(),
    # which also counts hyperthreads
    with Pool(processes=48) as p:
        results = p.starmap(
            my_best_track_processing_function,
            [(storm_name, bbox) for storm_name in files],
        )
    # the with-block already closes down the pool, so no extra join() is needed
    print(f"Elapsed time is: {time.time()-t1}")

    # write each best track to disk (in serial)
    for bt, name in zip(results, storms):
        print(f"writing storm {name}")
        filenew = pathlib.Path(f"{pnew}/{name}")
        bt.write(filenew, overwrite=True)
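
For what it's worth, the hard-coded processes=48 could also be derived from the physical core count (a sketch; psutil is an assumption, and maxtasksperchild assumes multiprocess mirrors the standard multiprocessing Pool signature, which it normally does):

    # Sketch: size the pool to physical cores and recycle workers periodically,
    # which can help if the hang is a per-process resource issue.
    import psutil
    from multiprocess import Pool

    n_physical = psutil.cpu_count(logical=False) or 1

    with Pool(processes=n_physical, maxtasksperchild=100) as p:
        results = p.starmap(
            my_best_track_processing_function,
            [(storm_name, bbox) for storm_name in files],
        )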

@jreniel
Collaborator

jreniel commented Aug 31, 2021

Normally, a cloud computer doesn't have "dual threads". That is, each "virtual core" is a single core.
I have never tested this package in those types of environments, but it sounds like this issue might be related to that.
Feel free to follow up or close whenever you feel this is resolved, or requires more attention.

@krober10nd
Collaborator Author

Yea, this is an Amazon cloud machine, and hyperthreading is enabled on it by default. Not sure how to turn it off; it's the first time I've used one too. The workaround is to use fewer threads, but that doesn't seem to guarantee it works either...
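
If turning SMT off at the instance level isn't practical (on EC2 it can normally only be set at launch, via the instance CPU options, e.g. one thread per core), a Linux-only workaround is to pin the parent process to a single logical CPU per physical core before the pool forks its workers (a sketch; the /sys topology files and os.sched_setaffinity are OS facilities, not part of adcircpy):

    # Sketch: keep one logical CPU per physical core and drop the SMT siblings.
    import glob
    import os

    keep = set()
    seen_cores = set()
    for path in sorted(glob.glob("/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list")):
        with open(path) as f:
            siblings = f.read().strip()  # e.g. "0,48" or "0-1"
        if siblings not in seen_cores:
            seen_cores.add(siblings)
            # keep the first logical CPU listed for this core
            keep.add(int(siblings.replace("-", ",").split(",")[0]))

    os.sched_setaffinity(0, keep)  # applies to this process and future forks
    print(f"pinned to {len(keep)} logical CPUs")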
