Accessing attributes of VDS

Hello!

I’m trying to find out whether I can access the attributes of a virtual dataset. A simple example is below:

from pathlib import Path

import h5py
import numpy as np

def main() -> None:
    base_path = Path(".")

    source_path = base_path / "source_data.hdf5"
    data = np.array([1, 2, 3, 4, 5], dtype=np.float32)
    with h5py.File(source_path, mode="w") as h5file:
        dataset = h5file.create_dataset("data", data=data)
        dataset.attrs["dataset_attr"] = "data_attr"

    virtual_layout = h5py.VirtualLayout(shape=(5,), dtype=np.float32)
    virtual_source = h5py.VirtualSource(source_path, "data", shape=(5,))
    virtual_layout[:] = virtual_source

    dst_path = base_path / "dst.hdf5"
    with h5py.File(dst_path, mode="w") as h5file:
        h5file.create_virtual_dataset("virtual_data", virtual_layout)

if __name__ == "__main__":
    main()

If I try to access the attribute “dataset_attr” of the created virtual dataset, I get an error:

with h5py.File(dst_path, mode="r") as h5file:
    h5file["virtual_data"].attrs["dataset_attr"]

KeyError: "Unable to synchronously open attribute (can't locate attribute: 'dataset_attr')"

So, is there a convenient way to access the attributes of the source datasets through a virtual dataset? Or must I look up the source files with the virtual_sources() method and read the attributes directly from them?

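Your virtual dataset virtual_data has no attributes. Attributes belong to the object they were set on, and a virtual dataset only maps the data elements of its sources. The attribute was set on the source dataset, data in the source_data.hdf5 file, so you can only read it from there.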

For reference, almost exactly the same question was asked in the h5py repo: Access to virtual dataset attributes · Issue #2668 · h5py/h5py · GitHub.
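Reading the attribute from the source can be sketched roughly as follows. This is only an illustration, not an official h5py recipe: read_source_attrs and demo are hypothetical helper names, and the handling of relative source file names (resolved against the directory of the file holding the virtual dataset) is an assumption that may not match every setup. It relies on Dataset.is_virtual and Dataset.virtual_sources(), which return the source file name and source dataset name for each mapping.

```python
import tempfile
from pathlib import Path

import h5py
import numpy as np


def read_source_attrs(vds_path: Path, vds_name: str) -> dict:
    """Return {(source file, source dataset): attrs} for a virtual dataset.

    Hypothetical helper: not part of h5py.
    """
    vds_path = Path(vds_path)
    with h5py.File(vds_path, mode="r") as h5file:
        dataset = h5file[vds_name]
        if not dataset.is_virtual:
            raise ValueError(f"{vds_name!r} is not a virtual dataset")
        # Each mapping records the source file name and the source dataset name.
        sources = [(m.file_name, m.dset_name) for m in dataset.virtual_sources()]

    result = {}
    for file_name, dset_name in sources:
        if file_name == ".":
            # "." means the source dataset lives in the same file as the VDS.
            source_path = vds_path
        else:
            source_path = Path(file_name)
            if not source_path.is_absolute():
                # Assumption: relative names are resolved against the VDS file's directory.
                source_path = vds_path.parent / source_path
        with h5py.File(source_path, mode="r") as source_file:
            result[(file_name, dset_name)] = dict(source_file[dset_name].attrs)
    return result


def demo() -> dict:
    """Rebuild the example from the question in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        source_path = Path(tmp) / "source_data.hdf5"
        dst_path = Path(tmp) / "dst.hdf5"
        with h5py.File(source_path, mode="w") as h5file:
            dataset = h5file.create_dataset("data", data=np.arange(1, 6, dtype=np.float32))
            dataset.attrs["dataset_attr"] = "data_attr"

        layout = h5py.VirtualLayout(shape=(5,), dtype=np.float32)
        layout[:] = h5py.VirtualSource(str(source_path), "data", shape=(5,))
        with h5py.File(dst_path, mode="w") as h5file:
            h5file.create_virtual_dataset("virtual_data", layout)

        return read_source_attrs(dst_path, "virtual_data")


if __name__ == "__main__":
    print(demo())
```

If the attributes should be visible on the virtual dataset itself, another option is to copy them onto it when the virtual dataset is created, at the cost of keeping the copies in sync with the sources by hand.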


Thanks a lot for the answer!