Manipulating memory for VFD

I haven't looked very closely at how memory (file offsets) is allocated, but:

If chunking is used, HDF5 maintains a table of chunks. The addresses of those chunks are available and, I believe, can be generated by the VFD, so that chunks can be placed in some special manner (I'm sure I saw code responsible for this).
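Roughly what I have in mind — a hypothetical sketch only (my_vfd_t, its fields, and RAW_REGION_BASE are invented here; the callback signature is the 'alloc' member of H5FD_class_t in the 1.8 series):

#include "hdf5.h"

typedef struct my_vfd_t {
    H5FD_t  pub;        /* public VFD fields; must come first           */
    haddr_t meta_eoa;   /* next free offset for metadata allocations    */
    haddr_t raw_eoa;    /* next free offset for raw (chunk) allocations */
} my_vfd_t;

#define RAW_REGION_BASE ((haddr_t)1 << 32)   /* park raw data high up */

static haddr_t
my_vfd_alloc(H5FD_t *_file, H5FD_mem_t type, hid_t dxpl_id, hsize_t size)
{
    my_vfd_t *file = (my_vfd_t *)_file;
    haddr_t   addr;

    (void)dxpl_id;   /* unused in this sketch */

    if (type == H5FD_MEM_DRAW) {
        /* Raw dataset data, i.e. the chunks: place them in our own region. */
        if (file->raw_eoa == 0)
            file->raw_eoa = RAW_REGION_BASE;
        addr = file->raw_eoa;
        file->raw_eoa += size;
    } else {
        /* Everything else: superblock, B-trees, heaps, object headers, ... */
        addr = file->meta_eoa;
        file->meta_eoa += size;
    }
    return addr;
}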

If chunking is not used and a 'flat' file is created (in memory), is it still possible for the VFD to break datasets into pieces, or must they be linear? Presumably, if N datasets are created, the VFD can rearrange them somewhat to fit free space in the file, but I'm wondering whether a single dataset has to be contiguous or not. To make the question concrete, see the snippet below.
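These are the two cases I mean, written out with the standard HDF5 1.8 API (nothing invented here):

#include "hdf5.h"

int main(void)
{
    hid_t   file, space, dcpl, dset;
    hsize_t dims[1]  = {1024};
    hsize_t chunk[1] = {128};

    file  = H5Fcreate("layout.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    space = H5Screate_simple(1, dims, NULL);

    /* Contiguous ('flat'): the library asks the VFD for one linear extent. */
    dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_layout(dcpl, H5D_CONTIGUOUS);
    dset = H5Dcreate2(file, "flat", H5T_NATIVE_INT, space,
                      H5P_DEFAULT, dcpl, H5P_DEFAULT);
    H5Dclose(dset);
    H5Pclose(dcpl);

    /* Chunked: each 128-element chunk is allocated separately, so the
       VFD sees many small allocations it could place as it likes. */
    dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 1, chunk);
    dset = H5Dcreate2(file, "chunked", H5T_NATIVE_INT, space,
                      H5P_DEFAULT, dcpl, H5P_DEFAULT);
    H5Dclose(dset);
    H5Pclose(dcpl);

    H5Sclose(space);
    H5Fclose(file);
    return 0;
}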

thanks

JB

···

--
John Biddiscombe, email:biddisco @ cscs.ch

CSCS, Swiss National Supercomputing Centre | Tel: +41 (91) 610.82.07
Via Cantonale, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82

As far as I know, each chunk has to be flat (contiguous) and is retrieved by HDF5 with a single read() call, and this would be the case for an unchunked dataset as well.

However, the only thing the VFD needs to do is deliver the data for a single read() call, and that call can be split up internally into something more complex that collects data from different locations. The VFD would just need to merge them.
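Something along these lines — a rough sketch, where segment_t and my_segment_lookup() are made up for illustration, and only the callback signature is the real one from H5FD_class_t:

#include <string.h>
#include "hdf5.h"

typedef struct segment_t {
    haddr_t     logical;  /* address the HDF5 library believes in */
    const char *storage;  /* where the bytes actually live        */
    size_t      len;
} segment_t;

/* Invented helper: find the segment covering a logical address. */
extern const segment_t *my_segment_lookup(haddr_t addr);

static herr_t
my_vfd_read(H5FD_t *file, H5FD_mem_t type, hid_t dxpl_id,
            haddr_t addr, size_t size, void *buf)
{
    char *out = (char *)buf;

    (void)file; (void)type; (void)dxpl_id;

    /* Walk the scattered pieces until the whole request is merged. */
    while (size > 0) {
        const segment_t *seg = my_segment_lookup(addr);
        size_t off, n;

        if (!seg)
            return -1;                 /* hole in the map: fail the read */
        off = (size_t)(addr - seg->logical);
        n   = seg->len - off;
        if (n > size)
            n = size;
        memcpy(out, seg->storage + off, n);   /* merge this piece */
        out  += n;
        addr += n;
        size -= n;
    }
    return 0;
}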

Whether that is easy to do is another question, since as of now the VFD does not know whether it is being called to read data or metadata (AFAIK). There are plans to equip the VFD with more meta-information, so that it knows which part of an HDF5 file it is supposed to read, and to eventually bundle multiple seek()/read() calls together into a single VFD call, though this won't be there in the near future.

  Werner

···

On Wed, 14 Apr 2010 04:56:42 -0400, Biddiscombe, John A. <biddisco@cscs.ch> wrote:


--
___________________________________________________________________________
Dr. Werner Benger Visualization Research
Laboratory for Creative Arts and Technology (LCAT)
Center for Computation & Technology at Louisiana State University (CCT/LSU)
211 Johnston Hall, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809 Fax.: +1 225 578-5362


  The "type" of the memory to retrieve is part of the read()/write() VFD callbacks, so it's technically possible. The VFD could build a map of "physical" (what the HDF5 library thought was happening) to "virtual" (what the VFD knows about where things are really stored) addresses and just use that to feed data back to the library. Complicated, but doable...

  Quincey

···

On Apr 14, 2010, at 5:52 AM, Werner Benger wrote:

