HDF5DotNet strings

Hello all. I'm new to the HDF5 world, so I seek the wisdom of the elders.

I am trying to write a tool that will scan through an HDF file and pull out
basic information (group and dataset names, data names, data values (if
string), etc.). I decided to write it in C#, since I am the most familiar
with that, and since I hoped to one day wrap this tool into a set of tools
that will all be accessed in one Windows application. I downloaded and
installed the HDF5DotNet package, and in a few hours was up and running,
getting dataset names and group names and value names.

My problem is trying to read in strings. I came across this group through a
Google search on my issue, and found a recent thread by Scott Mitchell
(http://mail.hdfgroup.org/pipermail/hdf-forum_hdfgroup.org/2009-June/000216.html)
that described his issues with reading strings with HDF5DotNet, but neither
my own attempts to solve this problem nor the solutions provided to him
have been 100% effective.

Firstly, my initial stab at trying to read a string (in this case, an
attribute value):

char [] data = new char [H5T.getSize(dataType)];
H5A.read(attributeId, dataType, new H5Array<char>(data));
Console.WriteLine(" - Value: "+new string(data));

This is probably wrong, but I did notice that while the output was mostly
garbage, the LAST character would print out fine. The output would look
like:
   - Value: ??E

when it was supposed to output "IMAGE". Or whatever.
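
One plausible explanation for that exact output, sketched under the
assumption that the attribute is a 5-byte fixed-length ASCII string: a C#
char is 2 bytes (UTF-16), while each stored character is 1 byte, so copying
5 bytes into a char[5] packs two characters into each char and leaves only
the odd last one readable. The class and method names below are hypothetical,
and the sketch imitates the copy with Buffer.BlockCopy rather than calling
HDF5DotNet:

```csharp
using System;
using System.Text;

static class FixedStringDemo
{
    // Imitates a native read that copies 1-byte characters into a char[]
    // buffer. Buffer.BlockCopy counts in bytes, not in array elements.
    public static string ReinterpretAsUtf16(byte[] raw, int charCount)
    {
        char[] chars = new char[charCount];
        Buffer.BlockCopy(raw, 0, chars, 0, raw.Length);
        // For "IMAGE" on a little-endian machine:
        //   chars[0] = 0x4D49 ('I','M' packed), chars[1] = 0x4741 ('A','G'),
        //   chars[2] = 0x0045 ('E'), chars[3] and chars[4] stay '\0'.
        return new string(chars);
    }

    // Reads the same bytes as 1-byte characters and strips null or space
    // padding, which fixed-length HDF5 strings may carry.
    public static string DecodeFixedLength(byte[] raw)
    {
        return Encoding.ASCII.GetString(raw).TrimEnd('\0', ' ');
    }
}
```

If this is what is happening, reading into a byte[] of H5T.getSize(dataType)
elements and decoding it afterwards, as DecodeFixedLength does, should avoid
the garbling. Whether that fits the files in question is untested here.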

So, I tried the solution proposed to Scott Mitchell, and used the code from
the example HDF5DotNet project linked to him, plus his comments after he
says he solved his problem. For the sample HDF5 'strings.h5', the solution
worked great. I was able to output string values left and right. However,
if I tried to modify the strings.h5 to add fixed-length string attributes,
the values of the attributes would print out blank. And when I tried to
open other sample HDF5 files (hdf5_test.h5, and a few of the files I will
eventually be scanning), the values output were either blank or garbage.

Help! What am I doing wrong? The datatype I am passing to the read
function I am getting off of H5T.getType() against the dataset/attribute I
am trying to read. Do I need to modify this somehow? What am I missing?

Thanks,
Eliot

For completeness' sake, here's the code I am trying from Scott Mitchell's
thread:

from inside my attributes iterate function:

            if (H5T.getClass(dataType) == H5T.H5TClass.STRING)
            {
                unsafe
                {
                    Chararray[] data = new Chararray[1];

                    H5A.read(attributeId, dataType, new H5Array<Chararray>(data));

                    Console.WriteLine(" - Value: " + data[0].GetString((int)H5T.getSize(dataType)));
                }
            }

        //this defines the struct used to read in the string data
        //since it is unsafe we need to use the layout to define the struct
        [StructLayout(LayoutKind.Sequential)]
        public unsafe struct Chararray
        {
            private double* recordedText;

            //an initializer to get and set the char* since it is unsafe
            public double* RecordedText
            {
                get
                {
                    return recordedText;
                }
                set
                {
                    recordedText = value;
                }
            }

            public override string ToString()
            {
                string s = "";
                //the HDF5 STRING is not a string but in fact a char *
                //since it is we need to translate the return into a pointer address

                IntPtr ipp = (IntPtr)this.recordedText;
                //This call is used to transform the pointer into the value of the pointer.

                //NOTE: this only works with null-terminated strings.
                s = System.Runtime.InteropServices.Marshal.PtrToStringAnsi(ipp);

                return s;
            }
        }
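
The struct above treats each element of the read buffer as a char * pointer.
In the HDF5 C library that matches variable-length strings, where the buffer
receives pointers; for fixed-length strings the buffer receives the
characters themselves. Assuming strings.h5 holds variable-length strings
(the thread does not confirm this), that difference could explain why the
struct works there and fails on fixed-length attributes. The sketch below
uses hypothetical names and plain .NET interop, not HDF5DotNet, to show the
two layouts side by side:

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class StringLayoutDemo
{
    // Variable-length layout: the buffer element is a pointer to a
    // null-terminated string that lives elsewhere in memory.
    public static string ReadViaPointer(IntPtr element)
    {
        return Marshal.PtrToStringAnsi(element);
    }

    // Fixed-length layout: the buffer element is the character data itself.
    // Stop at the first null byte, if any, and drop trailing space padding.
    public static string ReadInline(byte[] element)
    {
        int end = Array.IndexOf(element, (byte)0);
        if (end < 0) end = element.Length;
        return Encoding.ASCII.GetString(element, 0, end).TrimEnd(' ');
    }

    // Shows the value a pointer-sized field would hold if inline character
    // data were read into it: the first 8 bytes of text taken as an address.
    public static long InlineBytesAsAddress(byte[] element)
    {
        byte[] padded = new byte[8];
        Array.Copy(element, padded, Math.Min(element.Length, 8));
        return BitConverter.ToInt64(padded, 0);
    }
}
```

With the bytes of "IMAGE", InlineBytesAsAddress returns a number built from
the text rather than a real address, so passing it to PtrToStringAnsi would
not be expected to yield the original string.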

Hi Eliot,

I never did get fixed length strings to work. I could live with flexible length strings, so I didn't pursue it anymore. But I'd like to see this resolved.

Keep in touch & good luck,
Scott

···

From: hdf-forum-bounces@hdfgroup.org [mailto:hdf-forum-bounces@hdfgroup.org] On Behalf Of Eliot Stone
Sent: Tuesday, September 01, 2009 11:02 AM
To: hdf-forum@hdfgroup.org
Subject: [Hdf-forum] HDF5DotNet strings



Hi Eliot

I was able to get both fixed-length and variable-length strings working
for regular attributes. We did have some trouble with variable-length
strings in a compound datatype, so we stuck with fixed-length strings
there.

Are the strings you are trying to read fixed or variable length? If
you can post an example file that has at least one dataset with a string
attribute you want to read, I’ll try to take a look at it to see if I
can write up a quick example on how to read it.

Jesse
