best remote protocol

Hi,
I'm new to HDF5. What is the best remote protocol for accessing
HDF5 files without using HTTP? GridFTP?
Is there any study on HDF5 and SRB?
Thanks,
Lana

Hi Lana,

You need to define what you mean by "best" in order to get a sensible answer. What are your requirements?

Cheers,
--dan

--
Daniel Kahn
Science Systems and Applications Inc.
301-867-2162

Hi Lana,

We did a study on SRB a few years ago. Last year, The HDF Group and
the SRB team did some work on HDF5/iRODS. If you are using Windows or Linux,
you can try it with HDFView 2.5 (not the patch).

See the details of the project at
http://www.hdfgroup.org/projects/irods/

The demo server info is at
http://www.hdfgroup.org/projects/irods/irods_download.html

Since the funding ended a while ago, the product/feature is not
officially supported. Also, the demo server at SDSC may go offline at
any time.

Thanks
--pc


_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org

Hi Dan,
I was wondering what the best access protocol is for transferring HDF5 files over
the network (fast and reliable). I think the API allows creating and
opening files locally, no? So I thought of GridFTP. I need to know how long it
takes to transfer big files.
I also read somewhere that an HDF5 file can reach one TB, so transferring
it requires a good protocol.
I was also wondering whether you have any papers on HDF5/SRB, as I know SRB is
an efficient MSS and allows remote access.
Let me know if you need more information, and thanks for your feedback.
Lana
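
As a rough guide to the "how long does it take" question, here is a minimal back-of-the-envelope sketch. It only divides file size by link speed; the link rates and the `efficiency` factor are illustrative assumptions, not measurements of FTP or GridFTP, and real transfers will be slower because of protocol overhead, latency, and disk speed.

```python
def transfer_time_seconds(size_bytes, link_bits_per_second, efficiency=1.0):
    """Ideal time to move size_bytes over a link.

    efficiency is an assumed fraction (0 < efficiency <= 1) of the nominal
    link rate that the transfer actually achieves.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return size_bytes * 8 / (link_bits_per_second * efficiency)


ONE_TB = 10**12  # one decimal terabyte, the file size mentioned above

# Hypothetical link speeds, chosen only for illustration.
for label, rate in [("100 Mb/s", 100e6), ("1 Gb/s", 1e9), ("10 Gb/s", 10e9)]:
    hours = transfer_time_seconds(ONE_TB, rate) / 3600
    print(f"{label}: {hours:.2f} h")
```

At a fully used 1 Gb/s link this gives 8000 seconds, a little over two hours, for one TB. Passing a lower `efficiency` shows how quickly that grows when the link is shared or the protocol cannot fill it.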


Lana,

If you want to transfer HDF5 files, then FTP will work fine.
HDF5 files are no different from other files in this respect. What most
people worry about is reducing bandwidth requirements, and they do this
by transferring only a subset of an HDF5 file. The fact that HDF5 data
is stored in a computer file is something of a secondary issue.

An easy way to do this is to store the HDF5 files on a computer’s disk and
mount that disk remotely, e.g. using the Network File System (NFS). Your
application opens the file as usual, and the HDF5 library reads only the
parts of the file it needs, so the subsetting happens automatically. This
method has some limitations with regard to security: all NFS installations
I’ve seen are under the control of a single system administrator for this
reason. I think it would be difficult to distribute data widely using this
approach.

I have used an SSH/FUSE file system in place of NFS. The HDF Group has
conducted a study on this and summarized the results on page 11 of
http://hdfeos.org/briefing/2009/2008-ESDIS-task-status.ppt.
One of the notable things about this method (other than being
incredibly simple) is that it doesn’t combine the transport protocol
with a subsetting one. The application just opens the HDF5 file as a normal
file. It doesn’t need to be rewritten or recompiled. Users need only
an SSH account, and the system supplying the data must enable SFTP.
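
To make the subsetting idea concrete, here is a minimal sketch in Python. It assumes the h5py and NumPy packages are installed (neither is mentioned in this thread), and the file name `example.h5` and dataset name `temperature` are made up for illustration. A temporary directory stands in for the remotely mounted disk so the sketch runs anywhere; on a real NFS or SSH/FUSE mount only `path` would change.

```python
import os
import tempfile

import h5py
import numpy as np

# Stand-in for a path on a remotely mounted disk (NFS or SSH/FUSE).
path = os.path.join(tempfile.mkdtemp(), "example.h5")

# Write a chunked 1000 x 1000 dataset (about 4 MB of float32 values).
# Chunking lets the library fetch only the chunks a request overlaps.
with h5py.File(path, "w") as f:
    data = np.arange(1000 * 1000, dtype="f4").reshape(1000, 1000)
    f.create_dataset("temperature", data=data, chunks=(100, 100))

# Read a 10 x 20 window. The application code is identical whether
# `path` is on a local disk or on a mounted remote file system.
with h5py.File(path, "r") as f:
    window = f["temperature"][500:510, 200:220]

print(window.shape, window.nbytes)
```

The requested window falls inside a single 100 x 100 chunk, so the library needs that chunk plus some file metadata rather than the whole dataset. How much actually crosses the network also depends on the mount's own caching and read-ahead, which this sketch does not measure.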

I’ve yet to determine if the sftp daemon, and thus SSH/FUSE, can
support anonymous connections.

Note that this was a distinct effort from HDF5/FUSE, which Francesc
Alted recently posted about on this forum. I do not entirely understand
the problem he is proposing to solve, but I think it is different from
the idea of using subsetting to reduce bandwidth requirements.

--dan


-- Daniel Kahn
Science Systems and Applications Inc.
301-867-2162

Lana --

If you do indeed have to transfer the entire file across a wide-area network and have GridFTP available, it should work, and will probably be faster than FTP.

Regarding your question about HDF5 and SRB, you might want to look at http://www.hdfgroup.org/projects/ncsa_srb/ and the more recent http://www.hdfgroup.org/projects/irods/ -- Look at the tabs on the left of these pages for links to detailed information for the projects. Unfortunately, these were prototype projects with no long-term support, so they likely aren't suitable for production efforts.

-Ruth


Lana,

In case it helps: in the past I used the no-longer-supported network
driver over TCP sockets, and I really liked it. Currently I use a
special version of the core driver: I create a file in memory, attach it to an
XML message, and send it through standard XML messaging interfaces. The
disadvantage of these approaches compared to the arguably powerful ones
already proposed is that you have to write some code yourself. The advantage
is that if you need to deploy on client platforms where IT is reluctant
to install or support more than it has to, this is the easiest way to get
through. It also lets you manage the amount of information you
transfer depending on your application. For example, using the VFL allows me
to copy only essential data from disk to memory, zip it, zap it, and process
it. If I need the whole thing, I can schedule it to come in at low-traffic times
of the day.
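
A minimal sketch of the in-memory approach, for readers who want to try it. This is not the special core driver described above: it uses the standard core driver as exposed by the h5py Python package (assumed installed, along with NumPy), and the dataset name `essential` is hypothetical. The XML messaging layer is left out; the compressed `payload` bytes are what would be attached to a message.

```python
import gzip
import io

import h5py
import numpy as np

# Sender: build an HDF5 file entirely in memory with the core driver.
# backing_store=False means nothing is written to disk on close.
with h5py.File("in_memory.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset("essential", data=np.arange(10, dtype="i4"))
    f.flush()
    image = f.id.get_file_image()  # bytes of the complete HDF5 file

# Compress the image before attaching it to a message.
payload = gzip.compress(image)

# Receiver: decompress and open the bytes as an HDF5 file, again
# without touching the disk.
received = gzip.decompress(payload)
with h5py.File(io.BytesIO(received), "r") as g:
    values = g["essential"][:]

print(values.tolist())
```

Copying only the datasets a client needs into the in-memory file before taking the image is what keeps the payload small; the transport can then be anything that carries bytes.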

HTH

-- dimitris


Dimitris,

  you might enjoy hearing that we recently got the HDF streaming VFD functional
and running again. It took some effort to make it a standalone add-on to the
HDF5 library rather than part of the library, but it now works again.

It still requires some code cleanup, and we'll soon make it available via

  http://hdf5-addons.origo.ethz.ch/

(currently this repository has nearly nothing, just an old version of the
streaming VFD which needs to be compiled within the library).

The new version will also come with the feature that its stream can be
embedded into another protocol, for instance HTTP.

  Werner


--
___________________________________________________________________________
Dr. Werner Benger <werner@cct.lsu.edu> Visualization Research
Laboratory for Creative Arts and Technology (LCAT)
Center for Computation & Technology at Louisiana State University (CCT/LSU)
239 Johnston Hall, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809 Fax.: +1 225 578-5362

Werner,

this is certainly good news. I know you have been working on this since it was
dropped from the official release. I believe that such drivers are perhaps one
of the most useful features of HDF5, because they allow much more flexible
use of existing architectures, infrastructure, and ubiquitous protocols. I
don't mean to diminish the importance of solutions like the drivers for
GridFTP, Globus, or SRB, but I think the streaming driver allows more
flexibility to use any protocol at the transport or application level.

Thanks for the good news!

-- dimitris


Hi all,
Thanks a lot for all your feedback. I have some questions related to your
comments:
1. Dan, I looked at the presentation you pointed me to. How big is the file?
The units were seconds, I guess? Which version of NFS was used (v4?)? I know that
one of the main issues with NFS is concurrent access (many concurrent readers are
not good from a performance point of view). Are there any plans to use pNFS
(I read somewhere that it is very efficient)? Is there any issue in the case
of someone in Australia connecting to England?
What about write performance?
2. Dimitris and Werner: is there any documentation on the HDF5/VFD that
describes the solution?
Thanks,
Lana

···

2009/10/9 Werner Benger <werner@cct.lsu.edu>

Dimitris,

you might enjoy to hear that we recently got the HDF streaming VFD
functional
and running again; it was some effort to make it standalone as an addon to
the
HDF5 library rather than it being part of the library, but it now works
again.

It will yet require some code cleanup, and we'll make it soon available via

       http://hdf5-addons.origo.ethz.ch/

(currently this repository has nearly nothing, just an old version of the
streaming VFD which needs to be compiled within the library).

The new version will also come with the feature that its stream can be
embedded into another protocol, for instance HTTP.

       Werner

On Fri, 09 Oct 2009 00:21:01 +0200, Dimitris Servis <servisster@gmail.com> > wrote:

Lana,

in case that helps: I have used in the past the no longer supported
network
driver over TCP sockets and I really liked it. Currently I am using a
special version of the core driver, create a file in memory, attach it to
an
XML message and use standard XML message interfaces thereof. The
disadvantage of these approaches compared to the arguably powerful ones
already proposed is that you have to write some code yourself. The
advantage
is that if you need to deploy on client platforms where the IT is
reluctant
to install or support more than it has to, it's the easiest way to go
through. Also it allows you to manage the amount of information you need
to
transfer depending on your application, for example using the VFL allows
me
to copy only essential data from disk to memory, zip it, zap it, and
process
it. If I need the whole thing I can schedule it to come in low-traffic
times
of the day.

HTH

-- dimitris

2009/10/9 Ruth Aydt <aydt@hdfgroup.org>

Lana --

If you do indeed have to transfer the entire file across wide-area, and
have GridFTP available, it should work (and will probably be faster) than
FTP.

Regarding your question about HDF5 and SRB, you might want to look at
http://www.hdfgroup.org/projects/ncsa_srb/ and the more recent
http://www.hdfgroup.org/projects/irods/ -- Look at the tabs on the left
of these pages for links to detailed information for the projects.
Unfortunately, these were prototype projects with no long-term support,
so
they likely aren't suitable for production efforts.

-Ruth

On Oct 8, 2009, at 3:49 PM, Daniel Kahn wrote:

lana abadie wrote:

hi Dan,
i was wondering what is the best access protocol to transfer HDF5 files
over the network (fast and reliable). i think that the API allows
creating
and opening locally, no? so i thought og gridftp? i need to know how long
it
takes to trasnfer big files...

Lana,

If you want to transfer HDF5 *files* then FTP will work fine. HDF5 files
are no different from others in this respect. What most people worry
about
is reducing bandwidth requirements, and they do this by transferring only
a
subset of an HDF5 file. The fact that HDF5 data is stored in a computer
file is something of a secondary issue. An easy way to do this is to
store
the HDF5 files on a computer's disk and mount that disk remotely, e.g.
using
Network File System or NFS. Your application opens the file as usual and
the HDF5 library does the subsetting automatically. This method has some
limitations with regard to security--all NFS installations I've seen are
under the control of a single system administrator for this reason. I
think
it would be difficult to distribute data widely using this approach. I
have
used SSH/FUSE file system in place of NFS. The HDF Group has conducted a
study on this. They summarized their results here on page 11 (
http://hdfeos.org/briefing/2009/2008-ESDIS-task-status.ppt). One of the
notable things about this method (other than being incredibly simple) is
that it doesn't combine the transport protocol with a subsetting one.
The
application just opens the HDF5 as a normal file. It doesn't need to be
rewritten or recompiled. Users need only an ssh account and the system
supplying the data must turn on sftp.

I've yet to determine if the sftp daemon, and thus SSH/FUSE, can support
anonymous connections.

Note that this was a distinct effort from HDF5/FUSE which Francesc Alted
recently post about on this forum. I do not entirely understand the
problem
he is proposing to solve, but I think it is different from the idea using
subsetting to reduce bandwidth requirements.

--dan

--
___________________________________________________________________________
Dr. Werner Benger <werner@cct.lsu.edu> Visualization Research
Laboratory for Creative Arts and Technology (LCAT)
Center for Computation & Technology at Louisiana State University (CCT/LSU)
239 Johnston Hall, Baton Rouge, Louisiana 70803
Tel.: +1 225 578 4809 Fax.: +1 225 578-5362

Hi Lana,

Unless I've missed it, you still have not described your requirements to the forum. What are you trying to do?

lana abadie wrote:

Hi all
thanks a lot for all your feedback. i have questions related to your comments:
1. Dan, i looked at the presentation you pointed me to: how big is the file? the units were seconds, i guess? which version of NFS (v4?) I know that one of the main issues of NFS is concurrent access (many concurrent readers are not good from a performance point of view). Are there any plans to use pNFS (i read somewhere that it was very efficient)?

I suggest you contact the report's author, Kent Yang (ymuqun_at_hdfgroup.org), for those details.

is there any issue in case of one guy in Australia connected to England?

I am unaware of any issues between Australia and England.

--dan

--
Daniel Kahn
Science Systems and Applications Inc.
301-867-2162

Hi Dan
for the moment, i'd like to know if i can use HDF5 to format raw data
(coming from an experiment) arriving at high speed (1Gb/sec) before it is
transferred to storage, i.e. DAQ -> data formatting + compression ->
transfer -> permanent storage.
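
For a rough sense of scale, the sustained data volume implied by that rate can be worked out directly. This sketch assumes "1Gb/sec" means one gigabit (10**9 bits) per second and uses decimal units; if gigabytes were meant, every figure is eight times larger.

```python
def volume_bytes(rate_bits_per_s, seconds):
    """Bytes accumulated at a constant bit rate over the given time."""
    return rate_bits_per_s / 8 * seconds

RATE = 10**9  # assumption: 1 Gb/s = 10**9 bits per second

per_second = volume_bytes(RATE, 1)
per_hour = volume_bytes(RATE, 3600)
per_day = volume_bytes(RATE, 86400)

print(f"{per_second / 1e6:.0f} MB/s, "
      f"{per_hour / 1e9:.0f} GB/hour, "
      f"{per_day / 1e12:.1f} TB/day")
# prints: 125 MB/s, 450 GB/hour, 10.8 TB/day
```

So the formatting and compression stage has to keep up with about 125 MB/s before any transfer or storage overhead is counted.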

Voila
Lana

Hi lana,

On Fri, 09 Oct 2009 18:29:59 +0200, lana abadie <lana.abadie@gmail.com> wrote:

Hi all
thanks a lot for all your feedback. i have questions related to your
comments:
2. Dimitris and Werner: is there any documentation about HDF5/VFD which
describes the solution?

I'm not sure how much documentation there was about it before HDF5 1.6.5;
probably it's mostly the source code. In the resurrected version, the API
is somewhat more powerful and consequently a bit more complex, so
documentation will still have to be written for it (after code cleanup,
providing some examples, and equipping it with a suitable build system).

The basic idea has been laid out in this article (which I wouldn't consider
a very good paper, but it's a writeup of early ideas at least):

http://www.zib.de/Publications/Reports/SC-99-43.pdf

It should be noted that the streaming VFD is not primarily intended
to be used for file transfer. It can be used for that, and it should
be as fast as the network connectivity, as it's just sending a file
over a raw TCP socket, so there's no protocol overhead (besides TCP
itself; some say UDP might be more efficient under certain conditions).

The primary intention for the stream VFD is that one can create
data on demand, or send them from an application's memory over
some socket to some client, without ever creating a file
on the server side. It basically sends a memory image of an
HDF5 file over the network, which also limits the "file size"
(no sequential access).
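
As an illustration of the idea only (this is not the stream VFD's API), here is a minimal sketch of sending an in-memory byte image over a raw TCP socket without ever writing a file on the sending side. The placeholder bytes stand in for the memory image of an HDF5 file, and the helper names are made up for this example.

```python
import socket
import threading

def serve_image(server_sock, image):
    """Accept one client and stream the in-memory image to it."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(image)   # no file is ever created on the sender's side

def receive_image(host, port):
    """Read from the socket until the sender closes the connection."""
    chunks = []
    with socket.create_connection((host, port)) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)

# Placeholder bytes standing in for the memory image of an HDF5 file.
image = bytes(range(256)) * 1024

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

sender = threading.Thread(target=serve_image, args=(server, image))
sender.start()
received = receive_image("127.0.0.1", port)
sender.join()
server.close()

print(len(received), received == image)
```

The whole image has to sit in the sender's memory before it goes out, which is one way to read the "file size" limit mentioned above.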

  Werner

lana abadie wrote:

Hi Dan
for the moment, i'd like to know if i can use HDF5 to format raw data (coming from an experiment) arriving at high speed (1Gb/sec) before it is transferred to storage, i.e. DAQ -> data formatting + compression -> transfer -> permanent storage.

Hi Lana,

This sounds interesting. Have you looked at the Packet Table interface? The HDF Group developed it for someone who wanted to do pretty much what you want to do. I assume it works, but I haven't used it myself.

You can find a slide show about it here: http://www.google.com/search?q=packet+table+hdf5&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

Check the HDF Group web site for documentation, technical details, etc.
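
To give a feel for the "formatting + compression" step, here is a minimal sketch in plain Python of appending fixed-size records and compressing them chunk by chunk. It is not the Packet Table (H5PT) API and writes no HDF5 file; the record layout, chunk size and class name are all hypothetical, and zlib is used only as a stand-in for a chunk-wise deflate-style compression filter.

```python
import struct
import zlib

RECORD = struct.Struct("<IH")   # hypothetical packet: uint32 timestamp, uint16 sample
CHUNK_RECORDS = 1024            # records buffered before a chunk is compressed

class RecordWriter:
    """Append fixed-size records; compress and store them chunk by chunk."""

    def __init__(self):
        self.pending = bytearray()   # records not yet compressed
        self.chunks = []             # compressed chunks, in arrival order
        self.count = 0

    def append(self, timestamp, sample):
        self.pending += RECORD.pack(timestamp, sample)
        self.count += 1
        if len(self.pending) >= CHUNK_RECORDS * RECORD.size:
            self.flush()

    def flush(self):
        """Compress whatever is buffered and store it as one chunk."""
        if self.pending:
            self.chunks.append(zlib.compress(bytes(self.pending)))
            self.pending.clear()

    def read_all(self):
        """Decompress every stored chunk and unpack the records."""
        raw = b"".join(zlib.decompress(c) for c in self.chunks)
        return [RECORD.unpack_from(raw, i)
                for i in range(0, len(raw), RECORD.size)]

writer = RecordWriter()
for t in range(3000):
    writer.append(t, t % 100)
writer.flush()   # store the final, partially filled chunk

records = writer.read_all()
print(len(writer.chunks), len(records), records[-1])
# prints: 3 3000 (2999, 99)
```

Whether a scheme like this keeps up with the incoming rate depends on the record size, the compression level and the hardware, so it would need to be measured.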

Cheers,
--dan

--
Daniel Kahn
Science Systems and Applications Inc.
301-867-2162

Hi all
thanks for your feedback. This idea of the packet table is interesting... i
have to run some tests.
i may see some of you at SC09.
cheers
Lana
