ros3 driver breaks randomly


#1

Hi,

I am trying to use the ros3 driver via h5py. This normally works pretty smoothly, but it will randomly fail with a curl error (traceback below). It appears that this is an AWS issue: sometimes reads fail, and best practice is to retry a few times. However, retrying is hard to do on our side because the failure can occur any time we access any data from the file. Is there a way to configure the ros3 driver to automatically retry a few times when the curl command fails? I noticed fsspec does not have this issue. Would it be better to switch to that mode of reading HDF5 files?


File ~/.conda/envs/nwb_explorer/lib/python3.9/site-packages/pynwb/__init__.py:246, in NWBHDF5IO.__init__(self, **kwargs)
    244     elif manager is None:
    245         manager = get_manager()
--> 246 super(NWBHDF5IO, self).__init__(path, manager=manager, mode=mode, file=file_obj, comm=comm, driver=driver)

File ~/.conda/envs/nwb_explorer/lib/python3.9/site-packages/hdmf/utils.py:593, in docval.<locals>.dec.<locals>.func_call(*args, **kwargs)
    591 def func_call(*args, **kwargs):
    592     pargs = _check_args(args, kwargs)
--> 593     return func(args[0], **pargs)

File ~/.conda/envs/nwb_explorer/lib/python3.9/site-packages/hdmf/backends/hdf5/h5tools.py:81, in HDF5IO.__init__(self, **kwargs)
     79 self.__mode = mode
     80 self.__file = file_obj
---> 81 super().__init__(manager, source=path)
     82 self.__built = dict()       # keep track of each builder for each dataset/group/link for each file
     83 self.__read = dict()        # keep track of which files have been read. Key is the filename value is the builder

File ~/.conda/envs/nwb_explorer/lib/python3.9/site-packages/hdmf/utils.py:593, in docval.<locals>.dec.<locals>.func_call(*args, **kwargs)
    591 def func_call(*args, **kwargs):
    592     pargs = _check_args(args, kwargs)
--> 593     return func(args[0], **pargs)

File ~/.conda/envs/nwb_explorer/lib/python3.9/site-packages/hdmf/backends/io.py:23, in HDMFIO.__init__(self, **kwargs)
     21 self.__built = dict()
     22 self.__source = source
---> 23 self.open()

File ~/.conda/envs/nwb_explorer/lib/python3.9/site-packages/hdmf/backends/hdf5/h5tools.py:725, in HDF5IO.open(self)
    722 if self.driver is not None:
    723     kwargs.update(driver=self.driver)
--> 725 self.__file = File(self.source, open_flag, **kwargs)

File ~/.conda/envs/nwb_explorer/lib/python3.9/site-packages/h5py/_hl/files.py:507, in File.__init__(self, name, mode, driver, libver, userblock_size, swmr, rdcc_nslots, rdcc_nbytes, rdcc_w0, track_order, fs_strategy, fs_persist, fs_threshold, fs_page_size, page_buf_size, min_meta_keep, min_raw_keep, locking, **kwds)
    502     fapl = make_fapl(driver, libver, rdcc_nslots, rdcc_nbytes, rdcc_w0,
    503                      locking, page_buf_size, min_meta_keep, min_raw_keep, **kwds)
    504     fcpl = make_fcpl(track_order=track_order, fs_strategy=fs_strategy,
    505                      fs_persist=fs_persist, fs_threshold=fs_threshold,
    506                      fs_page_size=fs_page_size)
--> 507     fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
    509 if isinstance(libver, tuple):
    510     self._libver = libver

File ~/.conda/envs/nwb_explorer/lib/python3.9/site-packages/h5py/_hl/files.py:220, in make_fid(name, mode, userblock_size, fapl, fcpl, swmr)
    218     if swmr and swmr_support:
    219         flags |= h5f.ACC_SWMR_READ
--> 220     fid = h5f.open(name, flags, fapl=fapl)
    221 elif mode == 'r+':
    222     fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)

File h5py/_objects.pyx:54, in h5py._objects.with_phil.wrapper()

File h5py/_objects.pyx:55, in h5py._objects.with_phil.wrapper()

File h5py/h5f.pyx:106, in h5py.h5f.open()

OSError: Unable to open file (curl cannot perform request)

#2

Hi Ben,

I can't promise this is going to get fixed imminently, but I've been looking to go over the ros3 VFD code this fall and when I do I will see about making the VFD more tolerant of read fails.
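
In the meantime, a caller-side retry is one possible stopgap. The sketch below is only an illustration: `retry_on_oserror` is a hypothetical helper (not part of h5py or HDF5), and the retry count and backoff values are arbitrary assumptions. It retries only on `OSError`, which is the exception type shown in the traceback above.

```python
import time


def retry_on_oserror(func, *args, retries=3, delay=1.0, **kwargs):
    """Call func(*args, **kwargs), retrying when it raises OSError.

    Hypothetical helper: `retries` is the total number of attempts and
    `delay` is the initial wait in seconds, doubled after each failure.
    """
    for attempt in range(1, retries + 1):
        try:
            return func(*args, **kwargs)
        except OSError:
            if attempt == retries:
                # Out of attempts: let the original error propagate.
                raise
            time.sleep(delay * 2 ** (attempt - 1))


# Possible usage (assumes h5py was built with ros3 support; the URL is a placeholder):
#   import h5py
#   f = retry_on_oserror(h5py.File, "https://example-bucket.s3.amazonaws.com/file.nwb",
#                        "r", driver="ros3")
```

Note that this only protects the call it wraps. Since the failure can happen on any later read, each dataset access would need the same treatment, which is why a retry inside the VFD would be the cleaner fix.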


#3

Yes, I think fsspec is more useful. It supports Google and Azure cloud object stores, any HTTP server that accepts range GET requests, and temporary credentials.
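
For reference, opening a remote HDF5 file through fsspec instead of ros3 might look roughly like the sketch below. It assumes `fsspec` (with its HTTP dependencies) and `h5py` are installed; `open_remote_hdf5` is a hypothetical helper name and the URL in the usage comment is a placeholder.

```python
def open_remote_hdf5(url):
    """Open an HDF5 file over HTTP(S) via fsspec and hand it to h5py.

    Hypothetical helper. Imports are done inside the function so the
    module can be loaded even where fsspec or h5py is missing.
    """
    import fsspec
    import h5py

    # The "http" filesystem issues range GET requests against the server.
    fs = fsspec.filesystem("http")
    remote_file = fs.open(url, "rb")

    # h5py accepts a Python file-like object in place of a file name.
    h5_file = h5py.File(remote_file, "r")

    # Return both so the caller can close them when finished.
    return h5_file, remote_file


# Possible usage (placeholder URL):
#   h5_file, remote_file = open_remote_hdf5("https://example.org/data/file.nwb")
#   print(list(h5_file.keys()))
#   h5_file.close()
#   remote_file.close()
```

The resulting `h5py.File` object can then be passed on through the `file` argument that appears in the `NWBHDF5IO.__init__` frame of the traceback above, rather than passing `driver="ros3"`.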

Aleksandar