symmetric encryption filters?

Hello - has anyone used a symmetric encryption filter with HDF5? I would like to introduce encryption (AES, DES, 3DES) in the pipeline after zlib compression to encrypt some datasets.

Any examples, starting points, or suggestions would help.

Thanks!
--Jim

While it is possible to perform some encryption in a filter, the filter
mechanism is not designed for encryption. The problem is the key:
filters don't get arbitrary data from the calling application to do the
decryption; they get only data that is stored in the file. Otherwise,
the HDF5 library would not be able to do the decoding in a completely
transparent way. And if you put the key into the file (as filter
options, or similar), the NSA will be happy.

To use the filter mechanism for encryption, you would need to get the
key via a side-channel. This is possible, but it will be hard to do this
in a usable and portable fashion. For instance, you cannot just pop up a
dialog asking for a key, because many programs using HDF5 don't even
have a text terminal connected to them while they run.
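
One non-interactive side channel is an environment variable set by whoever launches the job. The following is only a sketch of that idea in plain Python: the variable name `HDF5_FILTER_KEY` and the helper `load_key_from_env` are made up for illustration and are not part of HDF5.

```python
import os

KEY_ENV_VAR = "HDF5_FILTER_KEY"  # hypothetical name, not defined by HDF5


def load_key_from_env(environ=os.environ):
    """Return the key bytes from the environment, or raise if unavailable."""
    hex_key = environ.get(KEY_ENV_VAR)
    if hex_key is None:
        raise KeyError(f"{KEY_ENV_VAR} is not set; cannot decrypt")
    try:
        key = bytes.fromhex(hex_key)
    except ValueError:
        raise ValueError(f"{KEY_ENV_VAR} is not valid hex") from None
    if len(key) not in (16, 24, 32):  # AES key sizes in bytes
        raise ValueError(f"{KEY_ENV_VAR} must decode to 16, 24, or 32 bytes")
    return key
```

This works without a terminal, but the key is then only as protected as the process environment, so it addresses usability more than portability.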

Also note that filtering does not touch the metadata in the file. I.e.,
the NSA will be able to see the entire description of what is encoded in
the file; they will just not have the actual data.

If you want security, just use gpg to encrypt the entire file.

Cheers,
Nathanael Hübbe

···

On 03/21/2014 12:44 AM, Rowe, Jim wrote:

Hello – has anyone used a symmetric encryption filter with HDF5? I
would like to introduce encryption (AES, DES, 3DES) in the pipeline
after zlib compression to encrypt some datasets.

Any examples, starting points, or suggestions would help.

Thanks!

--Jim

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org

--
Please be aware that the enemies of your civil rights and your freedom
are on CC of all unencrypted communication. Protect yourself.

Hi Jim

I have written a non-terminal VFL driver that segregates the metadata and
encrypts it.

Cheers

Dimitris

···

2014-03-21 11:23 GMT+01:00 huebbe < nathanael.huebbe@informatik.uni-hamburg.de>:

Hmm. This is an interesting discussion. Let me see if I can add two cents...

The HDF5 library allows you to define your own 'filters' which operate on
the data in-transit as it is written to and read from the file. The filters
are just callbacks made from the HDF5 library to your user-defined code to
operate on chunks of the dataset as they are emitted underneath the
H5Dwrite and H5Dread calls.
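
The callback arrangement can be imitated in a few lines of plain Python, without HDF5. In this sketch each "filter" is an (encode, decode) pair: writing applies the encoders in order, reading applies the decoders in reverse. The names `toy_cipher`, `write_chunk` and `read_chunk` are invented for illustration, and the XOR keystream is a toy stand-in for AES, not real cryptography.

```python
import hashlib
import zlib


def toy_cipher(key):
    """Return (encode, decode) for a toy XOR keystream. NOT real cryptography."""
    def xor(chunk):
        stream = b""
        counter = 0
        while len(stream) < len(chunk):
            stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return bytes(a ^ b for a, b in zip(chunk, stream))
    return xor, xor  # XOR with the same keystream is its own inverse


# Filters in write order: compress first, then encrypt.
PIPELINE = [(zlib.compress, zlib.decompress), toy_cipher(b"demo key")]


def write_chunk(chunk):
    """Run one chunk through every encoder, first filter first."""
    for encode, _ in PIPELINE:
        chunk = encode(chunk)
    return chunk


def read_chunk(stored):
    """Undo the pipeline: run the decoders in reverse order."""
    for _, decode in reversed(PIPELINE):
        stored = decode(stored)
    return stored
```

Calling `read_chunk(write_chunk(data))` returns the original bytes; a reader without the cipher's decode step (or its key) only sees the stored form.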

If you write data via some user-defined filter, then any reader will need
access to the code that does the reverse operation (decryption in your
case) and, of course, to any decryption keys. So, it is already implied
that if you define some 'weird' filter, none of the existing HDF5 tools
will be able to read your data (hdfview, h5dump, or third party
applications that read HDF5 like IDL, MATLAB, VisIt, etc.). But, given that
you are talking about encryption here, I suspect that such an outcome is
actually perfectly fine.

So, only applications that have access to your reader code (decryption
filters) will be able to read the data.

And why not handle the key the way something like ssh does it now? Your
reader 'filter' would have to acquire the key from ~/.ssh/id_rsa and then
use what it gets to decrypt the chunks being read during H5Dread. Failure
to acquire the key would result in a filter error and ultimately a read
error in H5Dread's error stack. You could do some work to detect this case
and report a useful error message (e.g. "no appropriate key to read
encrypted data").

Would you have a single HDF5 file with datasets encrypted for different
ids? If so, I think the ssh-like mechanism still works.
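
A minimal sketch of such an ssh-like lookup, assuming one key file per id in a per-user directory. The directory `~/.hdf5_keys`, the `<id>.key` naming and the `load_key` helper are all hypothetical; only the error message comes from the suggestion above.

```python
from pathlib import Path


class KeyNotFound(Exception):
    """Raised when no key is available for the requested key id."""


def load_key(key_id, key_dir=None):
    """Read the raw key for key_id from key_dir/<key_id>.key."""
    # "~/.hdf5_keys" is a made-up location, chosen by analogy with ~/.ssh.
    key_dir = Path(key_dir) if key_dir is not None else Path.home() / ".hdf5_keys"
    key_path = key_dir / f"{key_id}.key"
    try:
        return key_path.read_bytes()
    except OSError as exc:
        raise KeyNotFound(
            f"no appropriate key to read encrypted data (id={key_id!r})"
        ) from exc
```

A decrypting filter would call `load_key` with the id recorded for the dataset and turn `KeyNotFound` into a filter failure, so a user holding only some of the keys can read only the matching datasets.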

Because 'filter' operations apply only to the raw data of a dataset, the
metadata is not encrypted. This means things like the names, dimensions,
datatypes, etc. (and any attributes defined on the datasets) cannot be
encrypted via the 'filter' approach. Perhaps this is why another responder
mentioned the introduction of a Virtual File Driver that collects metadata
together and encrypts that separately. I could see how that could be
important in certain circumstances.

Another issue is that 'filters' can be applied only when datasets are
'chunked', and the filters are then applied independently to each chunk.
So, what you get for a single dataset is a bunch of chunks, each chunk
independently encrypted. You don't have the whole dataset encrypted in one
fell swoop. I don't think that would cause problems but thought I would
mention it.

HDF5 can be 'smart' about applying filters and wind up NOT applying a
requested filter in circumstances where you tell it the filter is optional.
So, you have to take care that your filter won't be treated by HDF5 that
way, or it may wind up skipping an encryption filter it should not have
skipped. Just be sure to set up the filters correctly when you define them
to HDF5.
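
In the HDF5 C API this choice is the `flags` argument of `H5Pset_filter`, which takes `H5Z_FLAG_MANDATORY` or `H5Z_FLAG_OPTIONAL`. The sketch below only models the difference in plain Python; `apply_filters`, `broken_encrypt` and the two flag strings are invented for illustration and do not call HDF5.

```python
MANDATORY = "mandatory"
OPTIONAL = "optional"


def apply_filters(chunk, filters):
    """Apply (name, flag, func) filters in order.

    Returns the filtered chunk and the names of the filters that were
    skipped. A failing optional filter is skipped; a failing mandatory
    filter aborts the write by re-raising the error.
    """
    skipped = []
    for name, flag, func in filters:
        try:
            chunk = func(chunk)
        except Exception:
            if flag == MANDATORY:
                raise
            skipped.append(name)
    return chunk, skipped


def broken_encrypt(chunk):
    """Stand-in for an encryption filter that cannot get its key."""
    raise RuntimeError("key unavailable")
```

With the optional flag, a failing encryption step leaves the chunk in plaintext without stopping the write, which is why an encryption filter would normally be set up as mandatory.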

Will encryption *increase* the size of the data being written? I don't
think it does, but I guess it's always possible depending on what you are
doing. If so, HDF5 may not be able to tolerate that. It may expect filtered
chunks to be no larger than the un-filtered chunks and error out (or skip
such a filter) if that is not the case. So, just be sure to review the
documentation on these details.

I guess this is a long-winded way of saying I think you could make it work
within the limitations of some of the issues I mention above. And, I think
you can invent a way to handle the keys that can probably be made to work.

Hope that was helpful.

Mark

Hi Dimitris,

  Your work sounds very interesting. Is it possible to share your code
as an open source project to the community? Or is it already available
somewhere for download and test?

···

--
Are your big (Earth) data in HDF(-EOS)?

On Fri, Mar 21, 2014 at 5:28 AM, Dimitris Servis <servisster@gmail.com> wrote:

How about

http://www.hdfgroup.uiuc.edu/HDF5/projects/boeing/encryption/

or

doi:10.1117/12.919736

?

G.

Hi Hyoklee

No :) It is unfortunately proprietary code for the moment; maybe at some
point we will share it, as soon as we have a project with THG (cc Quincey).
However, I can tell you that compared to using filters it is by far the
most efficient method to encrypt the file.

Cheers

Dimitris

···

2014-03-21 14:05 GMT+01:00 H. Joe Lee <hyoklee@hdfgroup.org>:

Thanks, Mark. The Boeing encryption handles the problem nicely by letting you pass in your encryption "key" when you register the filter with HDF5. It also supports having multiple keys, so conceivably you could allow someone access to parts of the data, but not others. Anyone interested should take a look at the link Gerd sent (http://www.hdfgroup.uiuc.edu/HDF5/projects/boeing/encryption/).

Block encryption itself--such as AES--doesn't change the size of the data at all. However, paired with compression, the order is critical. You absolutely need to compress it first, then encrypt it.
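
Both points can be checked with the Python standard library. This is a sketch under stated assumptions: `toy_encrypt` is a SHA-256-based XOR keystream standing in for AES (not secure), and `pkcs7_padded_len` assumes a 16-byte block with PKCS#7 padding. A raw block cipher maps each full block to a block of the same size, but a padded mode adds between 1 and 16 bytes per chunk, so the stored size depends on the mode chosen.

```python
import hashlib
import zlib


def toy_encrypt(data, key=b"demo key"):
    """XOR with a SHA-256 keystream. Stand-in for AES; NOT secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def pkcs7_padded_len(n, block=16):
    """Ciphertext length for n plaintext bytes under PKCS#7 padding."""
    return (n // block + 1) * block


chunk = b"0.000,0.001,0.002," * 500  # highly compressible sample data

# Compressing first leaves zlib real redundancy to remove.
compress_then_encrypt = toy_encrypt(zlib.compress(chunk))
# Encrypting first hands zlib random-looking bytes it cannot shrink.
encrypt_then_compress = zlib.compress(toy_encrypt(chunk))
```

Comparing `len(compress_then_encrypt)` with `len(encrypt_then_compress)` shows the first is a small fraction of the input while the second is about as large as the input, which is why the encryption filter belongs after zlib in the pipeline.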

Warm Regards,
Jim

···

-----Original Message-----
From: Hdf-forum [mailto:hdf-forum-bounces@lists.hdfgroup.org] On Behalf Of Miller, Mark C.
Sent: Friday, March 21, 2014 11:17 AM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] symmetric encryption filters?

Thanks for the updates. Encrypting the entire file is not practical and doesn't achieve our goals. It seems that the ideal would be a combination of what Dimitris has done and what Gerd points to with the Boeing encryption project. Dimitris, if your code became non-proprietary, it would be great to be able to use it. Our shop doesn't have the bandwidth (or skills) to roll our own VFL.

We are using HDF5 as the backing store for a proprietary application and are not worried about the NSA or complete portability to be read by other tools. We just need reasonably robust obfuscation of our underlying data.

Warm Regards,
Jim

···

-----Original Message-----
From: Hdf-forum [mailto:hdf-forum-bounces@lists.hdfgroup.org] On Behalf Of Gerd Heber
Sent: Friday, March 21, 2014 7:00 AM
To: HDF Users Discussion List
Subject: Re: [Hdf-forum] symmetric encryption filters?

Just a hint:

In my VFL approach I use stream encryption as the metadata blocks are saved
and loaded in arbitrary sizes...
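
To illustrate why a stream cipher suits blocks of arbitrary size: if the keystream is addressable by byte offset, any range of the file can be encrypted or decrypted on its own, with no padding. The proprietary driver is not shown here; this is only a toy counter-style construction built from SHA-256, and `keystream` and `crypt_at` are hypothetical names.

```python
import hashlib

BLOCK = 32  # bytes produced per SHA-256 call


def keystream(key, offset, length):
    """Keystream bytes covering file offsets [offset, offset + length)."""
    first = offset // BLOCK
    last = (offset + length + BLOCK - 1) // BLOCK
    stream = b"".join(
        hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        for i in range(first, last)
    )
    start = offset - first * BLOCK
    return stream[start:start + length]


def crypt_at(key, offset, data):
    """Encrypt or decrypt data located at the given file offset."""
    # Toy only: rewriting different plaintext at the same offset reuses
    # the same keystream bytes, which a real design must avoid.
    return bytes(a ^ b for a, b in zip(data, keystream(key, offset, len(data))))
```

Encrypting a buffer in one call or in several pieces at their respective offsets gives the same bytes, and the output is exactly as long as the input.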

Cheers

Dimitris

···

2014-03-21 19:22 GMT+01:00 Rowe, Jim <J.Rowe@questintegrity.com>:
