Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uproot cannot write root files over xrootd. #1252

Open
ikrommyd opened this issue Jul 19, 2024 · 10 comments · May be fixed by scikit-hep/fsspec-xrootd#76
Open

uproot cannot write root files over xrootd. #1252

ikrommyd opened this issue Jul 19, 2024 · 10 comments · May be fixed by scikit-hep/fsspec-xrootd#76
Labels
bug The problem described is something that must be fixed

Comments

@ikrommyd
Copy link

ikrommyd commented Jul 19, 2024

To reproduce:

import uproot

file = uproot.recreate("root://cmseos.fnal.gov//store/user/ikrommyd/test/dummy.root")
file["tree"] = {"branch": [1,2,3]}
file.close()

fails with:

OSError                                   Traceback (most recent call last)
Cell In[2], line 1
----> 1 uproot.recreate("root://cmseos.fnal.gov//store/user/ikrommyd/test/dummy.root")

File ~/miniforge3/envs/work/lib/python3.11/site-packages/uproot/writing/writable.py:110, in recreate(file_path, **options)
    106 file_path = uproot._util.regularize_path(file_path)
    107 storage_options = {
    108     key: value for key, value in options.items() if key not in recreate.defaults
    109 }
--> 110 sink = uproot.sink.file.FileSink(file_path, **storage_options)
    111 compression = options.pop("compression", create.defaults["compression"])
    113 initial_directory_bytes = options.pop(
    114     "initial_directory_bytes", create.defaults["initial_directory_bytes"]
    115 )

File ~/miniforge3/envs/work/lib/python3.11/site-packages/uproot/sink/file.py:53, in FileSink.__init__(self, urlpath_or_file_like, **storage_options)
     51 else:
     52     if not self._file_exists(urlpath_or_file_like, **storage_options):
---> 53         self._truncate_file(urlpath_or_file_like, **storage_options)
     55     self._open_file = fsspec.open(
     56         urlpath_or_file_like, mode="r+b", **storage_options
     57     )

File ~/miniforge3/envs/work/lib/python3.11/site-packages/uproot/sink/file.py:81, in FileSink._truncate_file(cls, urlpath, **storage_options)
     79 parent_directory = fs.sep.join(local_path.split(fs.sep)[:-1])
     80 fs.mkdirs(parent_directory, exist_ok=True)
---> 81 fs.touch(local_path, truncate=True)

File ~/miniforge3/envs/work/lib/python3.11/site-packages/fsspec/asyn.py:118, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
    115 @functools.wraps(func)
    116 def wrapper(*args, **kwargs):
    117     self = obj or args[0]
--> 118     return sync(self.loop, func, *args, **kwargs)

File ~/miniforge3/envs/work/lib/python3.11/site-packages/fsspec/asyn.py:103, in sync(loop, func, timeout, *args, **kwargs)
    101     raise FSTimeoutError from return_result
    102 elif isinstance(return_result, BaseException):
--> 103     raise return_result
    104 else:
    105     return return_result

File ~/miniforge3/envs/work/lib/python3.11/site-packages/fsspec/asyn.py:56, in _runner(event, coro, result, timeout)
     54     coro = asyncio.wait_for(coro, timeout=timeout)
     55 try:
---> 56     result[0] = await coro
     57 except Exception as ex:
     58     result[0] = ex

File ~/miniforge3/envs/work/lib/python3.11/site-packages/fsspec_xrootd/xrootd.py:380, in XRootDFileSystem._touch(self, path, truncate, **kwargs)
    376     status, _ = await _async_wrap(self._myclient.truncate)(
    377         path, size=0, timeout=self.timeout
    378     )
    379     if not status.ok:
--> 380         raise OSError(f"File not touched properly: {status.message}")
    381 else:
    382     len = await self._info(path)

OSError: File not touched properly: [ERROR] Server responded with an error: [3013] Unable to truncate (null); Operation not supported

The below reproducers work fine locally but will fail if "dummy" becomes a remote xrootd path.

import awkward as ak
import dask_awkward as dak
import uproot

# Eager
array = ak.Array({"x":[1,2,3], "y":[4,5,6]})
file = uproot.recreate("dummy.root")
file["tree"] = array

# Lazy
x = dak.from_awkward(array, 3)
uproot.dask_write(x, "dummy", prefix="whatever", compute=True)
@ikrommyd ikrommyd added the bug (unverified) The problem described would be a bug, but needs to be triaged label Jul 19, 2024
@ikrommyd
Copy link
Author

@jpivarski when you find some time, please look into this as well.
Also mentioning @martindurant cause I don't know if fsspec is relevant to this.

@martindurant
Copy link

The exception seems to be coming from xrootd, not fsspec. I can't say why the file didn't truncate.

@jpivarski
Copy link
Member

@amadio gave us an alternative in xrootd/xrootd#2304 (comment), which can replace

@classmethod
def _truncate_file(cls, urlpath: str, **storage_options) -> None:
"""
Args:
urlpath (str): A filesystem URL that specifies the file to truncate by fsspec.
Truncates the file to zero bytes. Creates parent directories if necessary.
"""
fs, local_path = fsspec.core.url_to_fs(urlpath, **storage_options)
parent_directory = fs.sep.join(local_path.split(fs.sep)[:-1])
fs.mkdirs(parent_directory, exist_ok=True)
fs.touch(local_path, truncate=True)

(which calls truncate in fsspec-xrootd). It is only ever called from

if not self._file_exists(urlpath_or_file_like, **storage_options):
self._truncate_file(urlpath_or_file_like, **storage_options)

@amadio's alternative involves opening the file with a different set of flags.

@maxgalli, do you think that can be implemented in Uproot, or does that fix need to go into fsspec-xrootd? I was at first thinking that we could do it, but after looking at the code references, it doesn't look like we have direct access to the XRootD.client.File object (anymore, now that we've adopted fsspec).

@maxgalli
Copy link
Collaborator

@amadio gave us an alternative in xrootd/xrootd#2304 (comment), which can replace

@classmethod
def _truncate_file(cls, urlpath: str, **storage_options) -> None:
"""
Args:
urlpath (str): A filesystem URL that specifies the file to truncate by fsspec.
Truncates the file to zero bytes. Creates parent directories if necessary.
"""
fs, local_path = fsspec.core.url_to_fs(urlpath, **storage_options)
parent_directory = fs.sep.join(local_path.split(fs.sep)[:-1])
fs.mkdirs(parent_directory, exist_ok=True)
fs.touch(local_path, truncate=True)

(which calls truncate in fsspec-xrootd). It is only ever called from

if not self._file_exists(urlpath_or_file_like, **storage_options):
self._truncate_file(urlpath_or_file_like, **storage_options)

@amadio's alternative involves opening the file with a different set of flags.

@maxgalli, do you think that can be implemented in Uproot, or does that fix need to go into fsspec-xrootd? I was at first thinking that we could do it, but after looking at the code references, it doesn't look like we have direct access to the XRootD.client.File object (anymore, now that we've adopted fsspec).

Hi @jpivarski, sorry for the late reply. It seems to me that implementing Guillerme's suggestion into fsspec-xrootd might be the way to go, but I'm not sure as I'm not familiar with the code yet. I will look into this in more detail in a few weeks from now, after thesis submission (I would thus keep the issue open for now if it is ok).

@jpivarski jpivarski added bug The problem described is something that must be fixed and removed bug (unverified) The problem described would be a bug, but needs to be triaged labels Aug 30, 2024
@ikrommyd ikrommyd changed the title dask_write cannot write over xrootd. uproot cannot write root files over xrootd. Sep 17, 2024
@nsmith-
Copy link
Member

nsmith- commented Dec 19, 2024

I'm confused, why do we need to have truncate=True in the touch command if it is only called when the file doesn't exist?

@nsmith-
Copy link
Member

nsmith- commented Dec 19, 2024

Since you later want to open a file with truncate on open, I would suggest to use that interface in fsspec-xrootd. Open the file with mode w once to touch/create/truncate it, then open it for r+b later.

@nsmith-
Copy link
Member

nsmith- commented Dec 19, 2024

Furthermore, I suspect interacting in read-write mode with a remote filesystem is not a great idea. We should encourage users to add the simplecache:: (see fsspec docs) prefix to the destination so a local temporary file can be opened, edited and finalized, before it is written once to the remote server on close.

@maxgalli
Copy link
Collaborator

Hi Nick, I came back to this, doing some investigation and trying to follow your suggestion. However, there are a few things I don't understand.
If I try to first touch the file and then re-open it in r+b mode, I get a raise NotImplementedError("File mode not supported") error rising from here. This makes sense, as r+b is not considered in this set.
Moreover, I would have naively expected r+b to be assigned OpenFlags.UPDATE in these lines, but instead it seems to fall into the last category. Is this behavior intended?

@nsmith-
Copy link
Member

nsmith- commented Jan 22, 2025

Ah, this might be a mistake, perhaps we can allow a mode? Though, with a we can only append.
I'm not sure what the comment update "+" mode removed for now since seek() is read only refers to. Does xrootd support rw seek?

Did you try the simplecache::? Then we would only write to the remote storage once.

@maxgalli
Copy link
Collaborator

Ah, this might be a mistake, perhaps we can allow a mode? Though, with a we can only append. I'm not sure what the comment update "+" mode removed for now since seek() is read only refers to. Does xrootd support rw seek?

I think the comment might refer to this: the inherited seek function raises an error if not in rb mode. However, I thought it could maybe be implemented in fsspec-xrootd, in order to change the behavior?
As for xrootd, I looked around and it should support rw seek.

Did you try the simplecache::? Then we would only write to the remote storage once.

Not yet, I'm still trying to find a combination of changes in uproot/fsspec/fsspec-xrootd that fixes the bug - will add it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The problem described is something that must be fixed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants