Memory management and HDF I/O

Hello all,

     I've been looking at the tutorial and examples a bit, and I'm
still fuzzy on something. To demonstrate, consider the
HDF5 1.8 C example on arrays:

http://www.hdfgroup.org/ftp/HDF5/examples/examples-by-api/hdf5-examples/1_8/C/H5T/h5ex_t_array.c

In the second half, the previously created HDF5 file is read
and the associated data is printed to the screen. Judging by
the deallocation at the bottom, the H5Dread() call copies the
data from the dataset in the HDF5 file into the user-allocated
integer buffer, "rdata". My guess is that if I modified the data
that rdata points to and then called H5Dwrite(), that call would
in turn copy the buffer contents back to the dataset.

     The question, then, is this: must I always work with
copies of the data contained in the dataset, or can I somehow
get a "pointer" to the actual data? Copying back and forth
seems wasteful for large datasets, but maybe the contents of
the HDF5 file are not held in memory at all, and H5Dread()
simply copies them directly from the file into the user's
buffer?

Thanks,
Andrew

Hi Andrew,

  in general, you always get, and want, a copy of the dataset, because
the layout in the file may differ from the layout in memory. For instance,
the dataset could be compressed in the file and occupy a fraction of the
space that the full dataset requires in your application. The byte
order of the data on disk and in memory might also differ.
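
  To make the byte-order point concrete, here is a minimal sketch in
plain C. It is not HDF5 code and not how the library is implemented; the
helper name be32_to_host_copy is made up for illustration, and it assumes
the file stores 32-bit integers in big-endian order. The point is only
that the bytes in the file cannot be used in place when the layouts
differ, so a converting copy is unavoidable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of HDF5): decode n big-endian 32-bit
   integers from a raw file buffer into native-order integers in dst.
   The result is the same whatever the host byte order is. */
void be32_to_host_copy(const unsigned char *src, int32_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const unsigned char *p = src + 4 * i;
        /* Assemble the value from its bytes, most significant first. */
        uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                     ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
        /* Copy the bit pattern into the signed destination. */
        memcpy(&dst[i], &v, sizeof v);
    }
}
```

  The caller supplies both buffers, much as the user supplies "rdata"
in the example; the source bytes are left untouched.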

  For the special case where the data are exactly the same on disk
and in memory, it would indeed be nice to have an mmap()-style interface
that maps the disk data directly into memory. As far as I know, HDF5
does not support this.
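
  For comparison, this is roughly what direct mapping looks like for a
plain binary file of native ints, using POSIX mmap(). It is only a sketch
under those assumptions: it is not an HDF5 API, map_ints is a made-up
name, and it does nothing about compression, byte order, or the HDF5
file structure.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not part of HDF5): map a raw file of native ints
   into memory for reading and writing. Returns NULL on failure; on
   success *count receives the number of ints in the file. */
int *map_ints(const char *path, size_t *count)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    /* MAP_SHARED: stores through the pointer go to the file itself,
       so there is no separate user-side copy to write back. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    *count = (size_t)st.st_size / sizeof(int);
    return p;
}
```

  The caller releases the mapping with munmap(p, count * sizeof(int))
when done.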

  Werner

On Mon, 08 Feb 2010 16:16:32 +0100, Andrew W. Steiner <awsteiner@gmail.com> wrote:

[quoted message trimmed; it appears in full above]

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

--
___________________________________________________________________________
Dr. Werner Benger Visualization Research
Laboratory for Creative Arts and Technology (LCAT)
Center for Computation & Technology at Louisiana State University (CCT/LSU)
211 Johnston Hall, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809 Fax.: +1 225 578-5362