Cross-platform Unicode filename support?

Hi all,

Anyone know what happened with

http://hdf-forum.184993.n3.nabble.com/hdf-forum-Unicode-filenames-on-Windows-td194309.html

? That thread ended with Quincey saying

    "Seems like a reasonable idea. I've filed a bug in our bug tracker
    and it'll get prioritized with the other things there, but we'd be
    happy to accept a well-tested patch from the community also."

Was it ever worked on / brought up within The HDF Group?

Thanks in advance,
Elvis

PS. Would love if there was a public bug tracker. DS.

Hi Elvis,

We'd love to do the Unicode work but it hasn't moved to the head of the queue yet with all the new features that were introduced in 1.10 and that are coming in the future. If a sponsor comes forth with money that would change, but right now Unicode work is unfunded and would have to be done on our sustained engineering budget, which gets lower priority than paid-for work.

And we're working on making a public bug tracker available. Hopefully that will be available soon.

Cheers,

Dana Robinson
Software Engineer
The HDF Group

···

From: Hdf-forum [mailto:hdf-forum-bounces@lists.hdfgroup.org] On Behalf Of Elvis Stansvik
Sent: Monday, July 4, 2016 4:40 AM
To: hdf-forum@lists.hdfgroup.org
Subject: [Hdf-forum] Cross-platform Unicode filename support?

To get Unicode path support on Windows, I make a custom HDF5 build. I make two changes to the H5win32defs.h file.

First, I add a static function that converts UTF-8 to the UTF-16 that the Windows wide-character APIs expect:
static int utf8open(const char* filename, int oflag, int pmode)
{
    // In order to open files with Unicode paths, Windows requires that I call the _wopen function
    // rather than pass UTF-8 to _open. Therefore, this function first translates the filename
    // to UTF-16.
    WCHAR wfilename[1024];
    int cch = MultiByteToWideChar(
        CP_UTF8,
        MB_ERR_INVALID_CHARS,
        filename,
        -1,
        wfilename,
        ARRAYSIZE(wfilename));
    if (0 == cch)
    {
        // File or path not found is the most reasonable of the documented error codes to return
        // from this function if there is a text encoding error in the file name.
        errno = ENOENT;
        return -1;
    }

    // _O_BINARY must be set in Windows to avoid CR-LF <-> LF EOL transformations when performing
    // I/O. _O_NOINHERIT must be set to prevent child processes inheriting the handle.
    return _wopen(wfilename, oflag | _O_BINARY | _O_NOINHERIT, pmode);
}

Second, I change the #define HDopen in that file to use my function:
#define HDopen(S,F,M) utf8open((S), (F) | _O_BINARY, (M))

Before calling HDF5 functions that take file paths, I also have to convert the UTF-16 paths I get from Windows API calls to UTF-8. It’s kind of inconvenient, but so far I haven’t had any problems passing the UTF-8-encoded paths through HDF5 using this technique, and the required changes are not very invasive.
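
For readers who want to see what that reverse conversion involves, here is a minimal sketch. On Windows the usual tool is WideCharToMultiByte with CP_UTF8; the version below hand-rolls the encoding only so that it compiles anywhere. The helper name utf16_to_utf8 and its return convention are assumptions made for illustration; they are not part of HDF5 or the Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an HDF5 or Windows function): encode a
 * NUL-terminated UTF-16 string as NUL-terminated UTF-8.
 * Returns the number of bytes written, excluding the terminating NUL,
 * or -1 if the input has an unpaired surrogate or dst is too small. */
static int utf16_to_utf8(const uint16_t *src, char *dst, size_t dst_size)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return -1;

    while (*src != 0) {
        uint32_t cp = *src++;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            uint32_t lo = *src;
            if (lo < 0xDC00 || lo > 0xDFFF)
                return -1;
            src++;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return -1; /* Low surrogate without a preceding high surrogate. */
        }

        /* Number of UTF-8 bytes needed for this code point. */
        size_t len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (n + len + 1 > dst_size)
            return -1; /* Always keep room for the terminating NUL. */

        if (len == 1) {
            dst[n++] = (char)cp;
        } else if (len == 2) {
            dst[n++] = (char)(0xC0 | (cp >> 6));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (len == 3) {
            dst[n++] = (char)(0xE0 | (cp >> 12));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[n++] = (char)(0xF0 | (cp >> 18));
            dst[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }

    dst[n] = '\0';
    return (int)n;
}
```

With an HDF5 build patched as described above, the resulting buffer would be what gets passed to calls such as H5Fopen() or H5Fcreate(). Returning -1 on malformed input mirrors the way utf8open refuses names it cannot convert.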

Matthew

Alright, thanks for the info Dana.

Elvis

···

2016-07-05 15:37 GMT+02:00 Dana Robinson <derobins@hdfgroup.org>:

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5

Thanks to both of you for the workarounds/advice.

At the moment we're only on *NIX platforms, where this is less of a
problem. But it's very good to see that it's easy to patch HDF5 to make it
work on Windows as well, should we start supporting it.

Elvis

···

2016-07-05 18:03 GMT+02:00 Xavier, Matthew <Matthew.Xavier@mts.com>:

The function H5FD_sec2_open() in H5FDsec2.c does the actual open operation.
You can pass a UTF-8 string to H5Fopen() and it should end up there. Then
convert the filename back to UTF-16 at that point and call _wopen where HDopen
is being called. I think this will work.

···

On Tue, Jul 5, 2016 at 7:25 AM, Elvis Stansvik <elvis.stansvik@orexplore.com > wrote:
