Hi Tuan,
it might be more efficient to store such a time array as a dataset in the root group rather than as an attribute. Datasets have no practical size limit, whereas attributes are meant to be small and are limited in size. I'm not sure what the exact limit is; it could be around 64 KB.
This dataset could be a one-dimensional array of a compound type, with each element holding the floating-point time value and a string with the corresponding group name. That way you can read the array quickly and access the group associated with each time, independent of whatever naming convention is used for the groups. The names could even be random combinations of letters. Still, it might be good to treat this time dataset as a "cache" for attributes that are stored in the groups, so generating it could also be a postprocessing step that scans the groups in the file for time attributes. This might be more efficient than re-creating or appending to the time dataset each time a new dataset is added to the file, but that would need to be explored in practice. Iterating over groups is also pretty fast, even for large files, though how well it performs depends on your use case.
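A minimal sketch of such a cache, assuming Python with h5py and NumPy (the thread does not name a language). The function name `build_time_index` and the dataset name `time_index` are hypothetical; the attribute name "time" follows the earlier suggestion in this thread.

```python
import h5py
import numpy as np

def build_time_index(f, index_name="time_index"):
    """Scan root-level groups carrying a 'time' attribute and cache them as a
    1-D compound dataset of (time, group name) in the root group."""
    entries = [
        (float(obj.attrs["time"]), name)
        for name, obj in f.items()
        if isinstance(obj, h5py.Group) and "time" in obj.attrs
    ]
    entries.sort()  # order by time value, whatever the groups are called

    # Compound type: a double for the time, a variable-length string for the name.
    dtype = np.dtype([("time", "f8"), ("group", h5py.string_dtype())])

    if index_name in f:
        del f[index_name]  # the index is only a cache, so rebuild it from scratch
    return f.create_dataset(index_name, data=np.array(entries, dtype=dtype), dtype=dtype)
```

Reading `f["time_index"][...]` then gives all times and group names in one call. Depending on the h5py version, the names may come back as `bytes` and need decoding.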
Werner
On Fri, 16 Dec 2011 00:44:52 -0600, Hoang Trong Minh Tuan <hoangtrongminhtuan@gmail.com> wrote:
Dr. Werner,
Thanks a lot for your advice. Right now, each HDF file has some
groups, and each group has 2 datasets, both corresponding to the same
time step. So, based on your suggestion, I think the attributes
(holding the time step information) should be attached to the group.
However, I want to quickly read the time information into an
array, so I'm thinking of putting the time points into an array
stored as an attribute of the root group. So, if the array
is a[...] and each group has 10 datasets, then
a[1] is the time for dataset1 in group 1
a[2] is the time for dataset2 in group 1
...
a[11] is the time for dataset1 in group 2
Do you think that would be fine? Also, is there a limit on the size
of data contained in attributes, or at least a good threshold?
Thanks,
Tuan
On Thu, Dec 15, 2011 at 12:54 PM, Werner Benger <werner@cct.lsu.edu> wrote:
Hi Tuan,
if you have multiple datasets for the same time then it would
be better to attach the time information to the common group
that contains them.
Using a double-valued attribute called "time" would do in the
simplest case. If you need a more advanced specification of
time, for instance using units on the time scale, you could
use a named type for this time unit where such global
properties are defined. This named type would best go in a
group independent of those time groups, for instance a group
without a time attribute, or the group which contains those time
groups.
You might also want some "reverse lookup" for each
dataset's name, like a table of the time values at which this
dataset is available, in case this changes and you don't have all
datasets defined at all times. This could be done with another
group containing a subgroup for each dataset, and then using
symbolic links to the actual data, or via some dataset that
provides the same information as a table. Symbolic links are
more elegant, though. I don't think it's possible to make a dataset
containing symbolic links, at most object references, but
that's not the same.
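The reverse lookup via symbolic links could be sketched as follows, assuming Python with h5py. The lookup group name `by_dataset` and the function name are hypothetical, and time groups are recognized here by their "time" attribute.

```python
import h5py
import numpy as np

def add_reverse_lookup(f, lookup_name="by_dataset"):
    """For every dataset found in a time group, add a soft link at
    /<lookup_name>/<dataset name>/<time group name> pointing to it."""
    lookup = f.require_group(lookup_name)
    for step_name, step in f.items():
        # Only groups carrying a 'time' attribute count as time groups.
        if not isinstance(step, h5py.Group) or "time" not in step.attrs:
            continue
        for ds_name, obj in step.items():
            if isinstance(obj, h5py.Dataset):
                per_dataset = lookup.require_group(ds_name)
                if step_name not in per_dataset:
                    # A soft link stores only the path; no data is copied.
                    per_dataset[step_name] = h5py.SoftLink(obj.name)
    return lookup
```

Listing the keys of `f["by_dataset/<name>"]` then tells you in which time groups a given dataset exists, and following a link opens the actual data.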
Werner
On Thu, 15 Dec 2011 10:28:28 -0600, Hoang Trong Minh Tuan <hoangtrongminhtuan@gmail.com> wrote:
Hi Werner,
I've just successfully created an HDF5 file with multiple groups
and multiple datasets. I have another question: what is the
best way to attach the time information (or maybe some
other metadata) to each dataset?
Tuan
On Thu, Dec 8, 2011 at 2:31 PM, Werner Benger <werner@cct.lsu.edu> wrote:
Hi Tuan,
with that many time steps, you might want to
organize the time hierarchically, like having a group
of a hundred time groups, so that 100 x 100 time groups cover
the 10,000 timesteps. It's probably inefficient to
have 10,000 timesteps or more in the same group,
though I don't have experience (yet) with that
scenario. It would also be inefficient if all your
datasets per time step are pretty small. In that case it
might be better to use a multidimensional dataset
with one varying dimension, this dimension being
the time, so that you can append new data as it
arrives.
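The appendable layout might look like the following sketch, assuming Python with h5py. The helper names `create_series` and `append_frame` are hypothetical, and the chunk size of one frame per chunk is only an illustrative choice.

```python
import h5py
import numpy as np

def create_series(f, name, frame_shape, dtype="f8"):
    """Create an empty dataset whose first axis is time and can grow without limit."""
    return f.create_dataset(
        name,
        shape=(0, *frame_shape),
        maxshape=(None, *frame_shape),  # None marks the unlimited (time) dimension
        chunks=(1, *frame_shape),       # resizable datasets must be chunked
        dtype=dtype,
    )

def append_frame(ds, frame):
    """Grow the dataset by one step along the time axis and store the new frame."""
    n = ds.shape[0]
    ds.resize(n + 1, axis=0)
    ds[n] = frame
```

A parallel one-dimensional series created with `create_series(f, "times", ())` can hold the physical time of each frame, so that frame `i` and time `i` always belong together.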
I don't use IDL, so I don't know which constraints
IDL would give on the HDF5 layout. If IDL is your
primary target, it might be best to investigate what
data layout IDL can handle best.
Werner
On Thu, 08 Dec 2011 07:01:36 -0600, Hoang Trong Minh Tuan <hoangtrongminhtuan@gmail.com> wrote:
Hi Dr. Werner,
I'm running a simulation of cells. In this
case, one group is a snapshot of the system at a
single time point. As such, I will have tens of
thousands of such groups in a file; or maybe
multiple files, each containing thousands of
groups. Also, I want to generate a video from
these snapshots using IDL. Would your suggestion
still be a reasonable approach, or should I do it
in a different way? Thank you!
Best,
Tuan
On Thu, Dec 8, 2011 at 2:28 AM, Werner Benger <werner@cct.lsu.edu> wrote:
Hi Tuan,
why don't you put all datasets which belong
to a specific time into a group, one group for
each timestep, and attach the time information
(physical time, seconds, float attribute) as
an attribute to this group?
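As a rough sketch of this layout, assuming Python with h5py: the function name `write_timestep` and the zero-padded group naming are hypothetical choices, not something the layout requires.

```python
import h5py
import numpy as np

def write_timestep(f, index, time, arrays):
    """Store all arrays of one timestep in their own group and attach the
    physical time (in seconds) to that group as a float attribute."""
    group = f.create_group(f"step_{index:05d}")
    group.attrs["time"] = float(time)
    for name, data in arrays.items():
        group.create_dataset(name, data=data)
    return group
```

A call such as `write_timestep(f, 0, 0.0, {"density": rho, "velocity": v})` then writes one snapshot, where `rho` and `v` stand for whatever 2d/3d arrays the simulation produces.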
Werner
On Thu, 08 Dec 2011 00:40:34 -0600, Hoang Trong Minh Tuan <hoangtrongminhtuan@gmail.com> wrote:
Hi all,
I am running a simulation in which I need
to keep track of time information and
2d/3d data at each time step (I may have
more than one array). My question is: what
is the best way to store such data? Should
I keep 2 separate datasets, one to store
time and one to store 2d/3d data, or can
I combine them into a special dataset
(of a kind I don't know about)?
Thanks a lot,
Tuan
-- ___________________________________________________________________________
Dr. Werner Benger Visualization Research
Laboratory for Creative Arts and Technology (LCAT)
Center for Computation & Technology at
Louisiana State University (CCT/LSU)
211 Johnston Hall, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809 Fax.: +1 225 578-5362
_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org