Hi,
While the C++ interface allows it, the Python h5py interface does not seem to allow the creation of a 1D unlimited array.
Moreover, all the examples of unlimited arrays are given with at least two dimensions, such as maxshape = (None, 3).
I could use (None, 1), but that is still a declaration of a two-dimensional array.
I cannot use this workaround because it does not match the format described for VTK (VTK File Formats - VTK documentation).
Did I miss something? Is there another way to declare an unlimited 1D array?
Thank you in advance for any help you can give me.
Did you try maxshape=(None,)?
Yes.
parent_add_child = parent.create_dataset(name=child, dtype='i8', shape=(1),
                                         maxshape=(None), chunks=True)
parent_add_child.resize(2)
>> RuntimeError: Unable to set dataset extent (dimension cannot exceed the existing maximal size (new: 2 max: 1))
I think I understand the general logic (since it works in 2D with maxshape=(None, 1)).
I think the problem is in how line 64 of h5py's dataset.py interprets the value:
tmp_shape = maxshape if maxshape is not None else shape
When maxshape=(None) is passed, it is equivalent to maxshape=None (undefined), so the shape value is used instead:
>>> m=(None)
>>> m is None
True
>>> m=(None,1)
>>> m is None
False
>>> m=[None]
>>> m is None
False
>>> m=[None,1]
>>> m is None
False
The documentation (book, examples, and so on) never addresses the case of an unlimited 1D array, hence my question here. ;(
After looking at the h5py module code, the solution could be this:
parent_add_child = parent.create_dataset(name=child, dtype='i8', shape=(1),
                                         maxshape=(h5py.UNLIMITED), chunks=True)
parent_add_child.resize(2)
I will check whether this solution works throughout my code.
Thanks, @ajelenak
The code below:
import h5py
with h5py.File('test.h5', mode='w') as h5f:
    h5f.create_dataset(
        name='child', dtype='i8', shape=(2,), maxshape=(None,), chunks=True)
produces a 1D dataset with unlimited size.
$ h5dump -p test.h5
HDF5 "test.h5" {
GROUP "/" {
   DATASET "child" {
      DATATYPE H5T_STD_I64LE
      DATASPACE SIMPLE { ( 2 ) / ( H5S_UNLIMITED ) }
      STORAGE_LAYOUT {
         CHUNKED ( 2 )
         SIZE 0
      }
      FILTERS {
         NONE
      }
      FILLVALUE {
         FILL_TIME H5D_FILL_TIME_ALLOC
         VALUE H5D_FILL_VALUE_DEFAULT
      }
      ALLOCATION_TIME {
         H5D_ALLOC_TIME_INCR
      }
      DATA {
      (0): 0, 0
      }
   }
}
}
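For readers without h5dump at hand, the same properties can probably be checked from Python. The following is only a minimal sketch: the file name 'sketch.h5', the in-memory core driver, and the sample values are assumptions chosen for illustration, not part of the original example. It reads back shape and maxshape, then appends three values by resizing the single axis.

```python
import h5py
import numpy as np

# In-memory file (core driver, no backing store), so nothing is written to disk.
# The file name is a placeholder chosen for this sketch.
with h5py.File('sketch.h5', mode='w', driver='core', backing_store=False) as h5f:
    dset = h5f.create_dataset(
        name='child', dtype='i8', shape=(2,), maxshape=(None,), chunks=True)

    # h5py reports an unlimited dimension as None in maxshape.
    initial_shape = dset.shape
    max_shape = dset.maxshape

    # Append: grow the single axis, then write into the new tail.
    new_values = np.array([10, 20, 30], dtype='i8')
    old_size = dset.shape[0]
    dset.resize((old_size + len(new_values),))
    dset[old_size:] = new_values

    final_shape = dset.shape
    contents = dset[:].tolist()
```

Note that shape, maxshape, and the argument to resize are all written as one-element tuples with a trailing comma.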
Hi @ajelenak,
Oops!
Indeed, my mistake is a Python one:
(2,) and (None,)
are not the same as
(2) and (None).
If I modify your example:
>>> with h5py.File('test.h5', mode='w') as h5f:
...     dset = h5f.create_dataset(
...         name='child', dtype='i8', shape=(1), maxshape=(None), chunks=True)
...     dset.resize([4])
...     dset[3] = 2
it fails:
RuntimeError: Unable to set dataset extent (dimension cannot exceed the existing maximal size (new: 4 max: 1))
But this:
>>> with h5py.File('test.h5', mode='w') as h5f:
...     dset = h5f.create_dataset(
...         name='child', dtype='i8', shape=(1,), maxshape=(None,), chunks=True)
...     dset.resize([4])
...     dset[3] = 2
works!
A Python pitfall that I didn’t know about.
I was complaining that None and (None) were treated as the same thing, whereas with the trailing comma, (None,) is a one-element tuple and is no longer the same. Well spotted.
I’m really sorry, I didn’t understand your comment before.
Thank you for your help.
PS: This also explains why, when I passed what I thought was a single-element tuple without the comma, I got an error saying that the object was not iterable. I couldn’t understand this subtlety. I find it tricky that Python silently treats a parenthesized single element as the element itself rather than as a tuple!
Many thanks for the explanation.
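The subtlety discussed in this thread can be reproduced in plain Python, without h5py. The variable names below are made up for this sketch; it only illustrates that parentheses alone group an expression, while the trailing comma is what creates a one-element tuple.

```python
# Parentheses alone only group an expression; the trailing comma makes the tuple.
grouped = (None)
one_tuple = (None,)

grouped_is_none = grouped is None        # (None) is just None
one_tuple_is_none = one_tuple is None    # (None,) is a tuple, not None
one_tuple_len = len(one_tuple)

# The same holds for integers: (1) is an int, (1,) is a tuple.
type_names = (type((1)).__name__, type((1,)).__name__)

# Trying to iterate over the bare int raises the "not iterable" TypeError.
try:
    list((1))
    iter_error = None
except TypeError as exc:
    iter_error = str(exc)
```

This is why maxshape=(None) behaves like maxshape=None, while maxshape=(None,) declares one unlimited dimension.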