Trim of strings on read in HDF-Java

I ran into an unexpected feature in the HDF-Java implementation. When a string (including each string within a string array) is read from a file, any trailing characters <= '\u0020' are trimmed, including newlines, tabs, and carriage returns. I thought this was strange.
The byteToString() method in Dataset.java has this code (jhdfobj-2.11.0):

            // trim only the end
            int end = str.length();
            while (end > 0 && str.charAt(end - 1) <= '\u0020')
                end--;

The full string is read from the file and converted from bytes before this code runs; the loop then moves the end index back until it sits just past the last character above '\u0020'.
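
To make the effect concrete, here is a minimal sketch that wraps the quoted loop in a standalone method. The class name TrailingTrimDemo, the method name trimEnd, and the final substring(0, end) call are assumptions for illustration; only the loop itself is taken from Dataset.java.

```java
public class TrailingTrimDemo {
    // Hypothetical helper: applies the loop quoted from byteToString().
    // The substring(0, end) step is an assumption about how 'end' is used.
    static String trimEnd(String str) {
        int end = str.length();
        while (end > 0 && str.charAt(end - 1) <= '\u0020')
            end--;
        return str.substring(0, end);
    }

    public static void main(String[] args) {
        // Leading spaces and the embedded newline survive; the trailing
        // tab, newline, and null characters are all removed.
        String stored = "  line one\nline two\t\n\0\0";
        String read = trimEnd(stored);
        System.out.println("stored length: " + stored.length());
        System.out.println("read length:   " + read.length());
        System.out.println("ends with newline: " + read.endsWith("\n"));
    }
}
```

A value written with a deliberate trailing newline or tab would therefore not round-trip through a write followed by a read.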

What's the reason behind this? My guess is that it ensures there are no null characters at the end of the string, but it seems overly aggressive.
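
If removing trailing null characters is indeed the goal, a narrower loop would be enough. The following is only a sketch of that alternative, not a proposed patch to HDF-Java; the class name NullOnlyTrimDemo and the method name trimTrailingNulls are hypothetical.

```java
public class NullOnlyTrimDemo {
    // Hypothetical alternative: strip only trailing '\0' characters and
    // leave trailing spaces, tabs, and newlines untouched.
    static String trimTrailingNulls(String str) {
        int end = str.length();
        while (end > 0 && str.charAt(end - 1) == '\0')
            end--;
        return str.substring(0, end);
    }

    public static void main(String[] args) {
        String stored = "line two\t\n\0\0";
        String read = trimTrailingNulls(stored);
        // The trailing tab and newline are still present after the read.
        System.out.println("ends with tab+newline: " + read.endsWith("\t\n"));
        System.out.println("read length: " + read.length());
    }
}
```

With this variant, whitespace that was written on purpose survives the read, while null padding is still dropped.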

Jarom Nelson; x33953
Computer Scientist, NIF, LLNL

Hi Jarom,

Offhand, we are not sure of the reason for this. I entered issue JAVA-1959 so that we can investigate it further.
Thank you!

-Barbara
help@hdfgroup.org

From: Hdf-forum [mailto:hdf-forum-bounces@lists.hdfgroup.org] On Behalf Of Nelson, Jarom
Sent: Thursday, March 02, 2017 11:06 AM
To: hdf-forum@lists.hdfgroup.org
Subject: [Hdf-forum] Trim of strings on read in HDF-Java
