Thank you, Gerd and Dave.
________________________________
From: Hdf-forum <hdf-forum-bounces@lists.hdfgroup.org> on behalf of Dave Allured - NOAA Affiliate <dave.allured@noaa.gov>
Sent: Thursday, April 20, 2017 10:16 AM
To: hdf-forum@lists.hdfgroup.org
Subject: Re: [Hdf-forum] [**EXTERNAL**] Re: first non-fill-value in the sparse chunked dataset
Efim,
Can you simply add a scalar integer attribute that keeps track of the lower bound index of the slower dimension? Just update this attribute every time you write to the dataset, or at least every time the lower bound goes lower. This would be an application-level solution, rather than something provided by the library.
This resembles a minimal version of Gerd's suggestion #1.
--Dave
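The bookkeeping Dave describes can be sketched as follows. This is only an illustration of the update rule, not HDF5 code: the class `BoundedDataset`, the attribute name `lower_bound`, and the use of plain dictionaries in place of a real dataset and its attributes are all assumptions made for the example.

```python
class BoundedDataset:
    """Toy stand-in for an HDF5 dataset plus one scalar integer attribute."""

    def __init__(self):
        self.rows = {}   # row index -> row data (stands in for the dataset)
        self.attrs = {}  # stands in for the dataset's attributes

    def write_row(self, index, values):
        self.rows[index] = list(values)
        # Touch the attribute only when nothing has been written yet
        # or when the lower bound actually moves lower.
        current = self.attrs.get("lower_bound")
        if current is None or index < current:
            self.attrs["lower_bound"] = index


ds = BoundedDataset()
# Data starts in the "middle" and grows in both directions.
for i in (500, 501, 499, 650, 498):
    ds.write_row(i, [1.0, 2.0])
print(ds.attrs["lower_bound"])  # prints 498
```

In a real application the two dictionaries would be replaced by the dataset write and an attribute write in whatever HDF5 binding is in use; the comparison logic stays the same.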
On Thu, Apr 20, 2017 at 8:39 AM, Efim Dyadkin <Efim.Dyadkin@pdgm.com> wrote:
Sorry, I should have specified what "first" means. I have a 2D dataset whose slower dimension is sparse and unlimited
and whose fast dimension is non-sparse and of fixed length. Typically for my data, information is first written
in the "middle" of the slower dimension of the dataset and then grows incrementally in either direction (to the left and to the right).
I need to keep track of the current bounding box in order to access only the populated part of the dataset.
The upper boundary of the slower dimension is basically the extent of the dataset, so I do not need to store it
on my own. As for the lower boundary, I hoped I could find it by getting access to the first available chunk, the one with
the smallest index along the slower dimension.
I think exposing at least a boolean grid of existing chunks could be helpful for sparse data handling.
Thanks,
Efim
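To make the "boolean grid of existing chunks" idea concrete, here is a small sketch of what such a grid would look like along the slower dimension. It is built from the application's own record of written row indices, not from anything the HDF5 library reports; the function names and parameters (`chunk_occupancy`, `first_populated_chunk`, `chunk_rows`, `extent`) are hypothetical.

```python
def chunk_occupancy(written_rows, chunk_rows, extent):
    """Return one boolean per chunk along the slower dimension.

    written_rows: row indices the application knows it has written
    chunk_rows:   chunk size along the slower dimension
    extent:       current extent of the slower dimension
    """
    n_chunks = -(-extent // chunk_rows)  # ceiling division
    grid = [False] * n_chunks
    for r in written_rows:
        if not 0 <= r < extent:
            raise IndexError(f"row {r} is outside the extent {extent}")
        grid[r // chunk_rows] = True
    return grid


def first_populated_chunk(grid):
    """Index of the first True entry, or None if nothing is populated."""
    for i, used in enumerate(grid):
        if used:
            return i
    return None


grid = chunk_occupancy([45, 52, 38], chunk_rows=10, extent=60)
print(grid)                         # [False, False, False, True, True, True]
print(first_populated_chunk(grid))  # 3
```

With such a grid, a search for the first non-fill-value element could start at row `first_populated_chunk(grid) * chunk_rows` instead of row 0.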
From: Hdf-forum <hdf-forum-bounces@lists.hdfgroup.org> on behalf of Gerd Heber <gheber@hdfgroup.org>
Sent: Thursday, April 20, 2017 7:20 AM
To: HDF Users Discussion List
Subject: [**EXTERNAL**] Re: [Hdf-forum] first non-fill-value in the sparse chunked dataset
The “first non-fill-value” in which order? (chronological, C-order, …)
Short answer: No chance.
Slightly longer: (Apart from H5DOwrite_chunk…) There is currently no API that
gives you direct control over/introspection into chunks. You can control certain
aspects of chunk allocation time and policy (via dataset creation properties),
but the rest is pretty opaque and a side-effect of H5D[read,write].
I think you have at least two options:
1. Create an auxiliary structure where you maintain that type of log information.
(This is dangerous/illusionary because you’ll be making assumptions about how the
HDF5 library writes/updates chunks, and what happens in the underlying storage.)
2. Create a proper sparse structure and don’t use chunking to mimic one.
(You might still struggle with the definition of ‘first.’)
G.
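As a rough illustration of option 2, the sketch below keeps populated rows in an explicit sparse structure and defines "first" as the smallest populated index along the slower dimension. It is an in-memory toy, not an HDF5 layout; the class `SparseRows` and its methods are invented for the example, and how such a structure would be persisted is left open.

```python
import bisect


class SparseRows:
    """Explicit sparse store: only populated rows are kept."""

    def __init__(self, row_length, fill_value=0.0):
        self.row_length = row_length
        self.fill_value = fill_value
        self._rows = {}  # row index -> list of values
        self._keys = []  # populated row indices, kept sorted

    def write(self, index, values):
        values = list(values)
        if len(values) != self.row_length:
            raise ValueError("row has the wrong length")
        if index not in self._rows:
            bisect.insort(self._keys, index)
        self._rows[index] = values

    def read(self, index):
        # Unpopulated rows read back as fill values.
        if index in self._rows:
            return list(self._rows[index])
        return [self.fill_value] * self.row_length

    def bounds(self):
        """(lowest, highest) populated row index, or None if empty."""
        if not self._keys:
            return None
        return self._keys[0], self._keys[-1]


s = SparseRows(row_length=3, fill_value=-1.0)
s.write(10, [1.0, 2.0, 3.0])
s.write(7, [4.0, 5.0, 6.0])
s.write(12, [7.0, 8.0, 9.0])
print(s.bounds())  # (7, 12)
print(s.read(8))   # [-1.0, -1.0, -1.0]
```

Because the populated indices are stored explicitly, the bounding box no longer depends on how the library allocates chunks.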
From: Hdf-forum [mailto:hdf-forum-bounces@lists.hdfgroup.org] On Behalf Of Efim Dyadkin
Sent: Wednesday, April 19, 2017 5:04 PM
To: hdf-forum@lists.hdfgroup.org
Subject: [Hdf-forum] first non-fill-value in the sparse chunked dataset
Hi,
I am using a sparse chunked dataset with a certain fill value. I’d like to find the first non-fill-value element in the dataset. Can I narrow down my search to the first available chunk? How can I do that?
Thank you,
Efim Dyadkin