swizzle on split file driver

Hi All,

Anyone using the split file driver?

As I understand it, it seems to allow an HDF5 application to 'spread'
data across one or more files, each one potentially using a different
vfd. For example, I could have RAW data use the sec2 vfd and all other
'meta' data (b-trees, super-block, metadata, etc.) use a core vfd,
right?

But, can I somehow have RAW datasets BELOW a certain size, say smaller
than 16 kilobytes, go into the core vfd also but have all other raw data
go into the sec2 vfd?

Based on what I've read, I don't think this is possible with HDF5 out of
the box as it appears we can target RAW data to only one vfd in a split
mode. Am I right?

If so, I wonder if the following makes sense...

Create a new sec2 vfd, call it 'sec2_wcore', which would essentially act
like a mixture of the core and sec2 vfds: small writes get handled in
memory, as the core vfd does, and large writes get handled as 'normal',
the way sec2 already handles them. What would the difficulties be in
creating such a beast?
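
To make the idea concrete, here is a rough sketch of the routing decision only. It is plain C, not an HDF5 VFD: `hybrid_t`, `hybrid_open`, `hybrid_write`, and `hybrid_close` are hypothetical names, the 16 KiB cutoff is just the figure used above, and the in-memory side is a simple growable buffer rather than anything the core vfd actually does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cutoff: writes smaller than this stay in memory. */
#define SMALL_WRITE_LIMIT ((size_t)16 * 1024)

typedef struct {
    FILE          *fp;           /* large writes go here (sec2-like path)   */
    unsigned char *mem;          /* small writes land here (core-like path) */
    size_t         mem_size;     /* bytes currently held in mem             */
    size_t         small_writes; /* counters, for illustration only         */
    size_t         large_writes;
} hybrid_t;

/* Open the backing file and start with an empty in-memory image. */
int hybrid_open(hybrid_t *h, const char *path)
{
    assert(h != NULL && path != NULL);
    memset(h, 0, sizeof *h);
    h->fp = fopen(path, "w+b");
    return h->fp ? 0 : -1;
}

/* Route one write by size. Returns 0 on success, -1 on failure. */
int hybrid_write(hybrid_t *h, size_t addr, const void *buf, size_t len)
{
    assert(h != NULL && h->fp != NULL && buf != NULL);
    if (len == 0)
        return 0;

    if (len < SMALL_WRITE_LIMIT) {
        size_t end = addr + len;
        if (end > h->mem_size) {
            /* Grow the image and zero the new bytes; a real driver would
             * want a sparse extent map, not one contiguous buffer. */
            unsigned char *p = realloc(h->mem, end);
            if (p == NULL)
                return -1;
            memset(p + h->mem_size, 0, end - h->mem_size);
            h->mem = p;
            h->mem_size = end;
        }
        memcpy(h->mem + addr, buf, len);
        h->small_writes++;
        return 0;
    }

    /* Large write: pass it straight through to the file. */
    if (fseek(h->fp, (long)addr, SEEK_SET) != 0)
        return -1;
    if (fwrite(buf, 1, len, h->fp) != len)
        return -1;
    h->large_writes++;
    return 0;
}

/* Release both stores. The in-memory data is simply discarded here. */
void hybrid_close(hybrid_t *h)
{
    if (h->fp != NULL)
        fclose(h->fp);
    free(h->mem);
    memset(h, 0, sizeof *h);
}
```

Even in this toy form one difficulty shows up: a small write and a later large write to the same address leave two copies of the data, so a real driver would have to track which store holds the current bytes for each range, both for reads and at flush time.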

Mark

--
Mark C. Miller, Lawrence Livermore National Laboratory
================!!LLNL BUSINESS ONLY!!================
miller86@llnl.gov urgent: miller86@pager.llnl.gov
T:8-6 (925)-423-5901 M/W/Th:7-12,2-7 (530)-753-851

Hi Mark,

  Seems like a sensible idea. We have prototyped the idea of "stacking" VFDs, but it didn't get far enough to put into production. Finishing that work so you could have a split VFD on top of the core VFD (for metadata) and the sec2 VFD (for raw data) would be really interesting.

  Quincey

Hi Quincey,

Just to make sure I understand...

it was my impression from talking with Richard Hedges here at LLNL that
the split file driver already supported the first part of what I
described. That is having some datastreams go to core vfd and raw go to
sec2. That does NOT require the new 'stacking' feature you mention, does
it?

Mark

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

Hi Quincey,

So, I wasn't sure if you saw my other query on this. But, I've got
conflicting information regarding the 'multi' (or 'split') vfd.

Is it possible to use the multi (or split) vfd in such a way that RAW
data goes to sec2 vfd and all other data streams go to core vfd?

I was not sure if your response meant that that is currently not
possible or that just having a single vfd act like sort of a core-sec2
hybrid is what is not possible.

Mark

Hi Mark,

> Hi Quincey,
>
> Just to make sure I understand...
>
> it was my impression from talking with Richard Hedges here at LLNL that
> the split file driver already supported the first part of what I
> described. That is having some datastreams go to core vfd and raw go to
> sec2. That does NOT require the new 'stacking' feature you mention, does
> it?

  Yes, the current split VFD is working and supported (and doesn't require the stacking I mentioned).

    Quincey

Hi Mark,

> Hi Quincey,
>
> So, I wasn't sure if you saw my other query on this. But, I've got
> conflicting information regarding the 'multi' (or 'split') vfd.
>
> Is it possible to use the multi (or split) vfd in such a way that RAW
> data goes to sec2 vfd and all other data streams go to core vfd?

  Hmm, sorry, I was mistaken earlier - the split (and multi) VFDs access FAPLs for their underlying files, which should allow you to use the core VFD for metadata and the sec2 VFD for raw data.

  Quincey

Ok, super! That is what Richard and I had believed, but I was not
completely sure one way or the other. Thanks for the info.

Mark
