Packet Table with Variable Length Character String SegFaults



I’ve searched the group and found a few references to writing compound data types with variable-length character strings to a packet table. Taking those into account, I think I’ve implemented a test case correctly, but I keep getting a segfault when appending to the packet table. Below is a simple main() that fails at H5PTappend(). I’m using the stock version of HDF5 (v1.10.5-4) that ships with CentOS 8.x, so I think this version should allow variable-length data in packet tables. Any help or insight as to why it is failing would be appreciated.


#include <stdio.h>
#include "hdf5.h"
#include "hdf5_hl.h"

typedef struct v_moddata_t {
    hvl_t cLogHandle;
} v_moddata_t;

//  Test program
int main(int argc, char **argv) {
    herr_t iErr = 0;

    char cFileName[] = "Testing.hd5";
    printf("Test HDF5 File: %s\n", cFileName);
    //  Open HDF5 file for export
    hid_t hidFileID = H5Fcreate(cFileName, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    printf(" Dumping vpacket table\n");
    //  Attempt to open the revision history packet table
    hid_t hidPTableID = H5PTopen(hidFileID, "VRevision History");
    //  If there is an error attempt to create a fixed length packet table
    if( hidPTableID == H5I_BADID ) {
        //  Create copy of native character type
        hid_t hidCharLenVarTypeID = H5Tcopy(H5T_C_S1);
        //  Set size of character type
        H5Tset_size(hidCharLenVarTypeID, H5T_VARIABLE);
        //  Create memory data type for compound data
        hid_t hidModDataTypeID = H5Tcreate(H5T_COMPOUND, sizeof(v_moddata_t));
        H5Tinsert(hidModDataTypeID, "log", HOFFSET(v_moddata_t, cLogHandle), hidCharLenVarTypeID);
        //  Create packet table
        hidPTableID = H5PTcreate(hidFileID, "VRevision History", hidModDataTypeID, 100, H5P_DEFAULT);
//      hidPTableID = H5PTcreate_fl(hidFileID, "VRevision History", hidModDataTypeID, (hid_t) 100, -1);
        //  Free resources
        //  Check for error and return
        if( hidPTableID == H5I_INVALID_HID ) return 1;
//      if( hidPTableID == H5I_BADID ) return 1;
    }
    //  Fill data type
    v_moddata_t modDat;
    modDat.cLogHandle.len = 16;
    modDat.cLogHandle.p = (void *) "Some stuff here\0";
    printf("String %i - '%s'\n", (int) modDat.cLogHandle.len, (char *) modDat.cLogHandle.p);
    //  Append data to packet table (this is the call that segfaults)
    iErr = H5PTappend(hidPTableID, (size_t) 1, &modDat);
    //  Close packet table
    iErr = H5PTclose(hidPTableID);
    //  Close file
    iErr = H5Fclose(hidFileID);

    return 0;
}


Answering my own question…

If I remove:

//  Create copy of native character type
hid_t hidCharLenVarTypeID = H5Tcopy(H5T_C_S1);
//  Set size of character type
H5Tset_size(hidCharLenVarTypeID, H5T_VARIABLE);

and replace with:

 hid_t hidCharLenVarTypeID = H5Tvlen_create(H5T_C_S1);

Then it seems to work. However, the documentation at this link says:

Creating variable-length string datatypes

As the term implies, variable-length strings are strings of varying lengths; they can be arbitrarily long, anywhere from 1 character to thousands of characters.

HDF5 provides the ability to create a variable-length string datatype. Like all string datatypes, this type is based on the atomic string datatype: H5T_C_S1 in C or H5T_FORTRAN_S1 in Fortran. While these datatypes default to one character in size, they can be resized to specific fixed lengths or to variable length.

Variable-length strings will transparently accommodate ASCII strings or UTF-8 strings. This characteristic is set with H5Tset_cset in the process of creating the datatype.

The following HDF5 calls create a C-style variable-length string datatype, vls_type_c_id:

vls_type_c_id = H5Tcopy(H5T_C_S1)
status = H5Tset_size(vls_type_c_id, H5T_VARIABLE)

In a C environment, variable-length strings will always be NULL-terminated, so the buffer to hold such a string must be one byte larger than the string itself to accommodate the NULL terminator.

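So there seems to be some confusion here: the documentation describes the H5Tcopy/H5Tset_size approach, yet only the H5Tvlen_create version runs without crashing for me.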



In HDF5, there are two kinds of strings: fixed-length strings and variable-length strings. It might be better to call the latter “non-fixed-length strings”, both for contrast with “fixed-length” and to avoid the impression that they are variable-length sequences in the sense of an HDF5 datatype. They are not! This is also stated in the H5Tvlen_create documentation:

H5T_VLEN_CREATE cannot be used to create a variable-length string datatype.

I didn’t have a chance to study your code, but the most common mistake with non-fixed-length strings in compounds is in the corresponding C struct: it must have a char* field, not the char[] field used for fixed-length strings. G.
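To make the char* versus char[] distinction concrete, here is a minimal sketch of the two record layouts in plain C, with no HDF5 calls. The type names (fixed_rec_t, vl_rec_t), the helper functions, and the 16-byte size are made up for illustration. The comments note which string datatype each layout would normally be paired with.

```c
#include <assert.h>
#include <string.h>

/* Fixed-length string: the characters live inside the record itself.
 * Would pair with an H5T_C_S1 copy resized via H5Tset_size(type, 16). */
typedef struct {
    char log[16];
} fixed_rec_t;

/* Non-fixed-length string: the record holds only a pointer to the text.
 * Would pair with an H5T_C_S1 copy resized via H5Tset_size(type, H5T_VARIABLE). */
typedef struct {
    char *log;
} vl_rec_t;

/* Copy the text into the record, truncating if needed and always
 * leaving a terminating NUL. */
static fixed_rec_t make_fixed(const char *text)
{
    fixed_rec_t r;
    assert(text != NULL);
    memset(r.log, 0, sizeof r.log);
    strncpy(r.log, text, sizeof r.log - 1);
    return r;
}

/* Store only the pointer; the caller keeps ownership of the text. */
static vl_rec_t make_vl(char *text)
{
    vl_rec_t r;
    r.log = text;
    return r;
}
```

The point of the sketch is only that the first record is 16 bytes of characters while the second is a single pointer, so the memory datatype has to describe whichever layout the struct really has.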



What you are saying may be true, but doing it the ‘correct’ way doesn’t work for Packet Tables. With a debug build of HDF5 1.10.5, gdb gives the following backtrace for the segfault:

Program received signal SIGSEGV, Segmentation fault.
0x00000000007210cd in H5T_vlen_str_mem_getlen (_vl=) at /root/Software/hdf5-1.10.5/src/H5Tvlen.c:546
546 /root/Software/hdf5-1.10
(gdb) where
#0 0x00000000007210cd in H5T_vlen_str_mem_getlen (_vl=) at /root/Software/hdf5-1.10.5/src/H5Tvlen.c:546
#1 0x00000000006a4093 in H5T__conv_vlen (src_id=, dst_id=, cdata=, nelmts=1, buf_stride=,
bkg_stride=, buf=0x7ffff7e53018, bkg=0x7ffff5cd5018) at /root/Software/hdf5-1.10.5/src/H5Tconv.c:3193
#2 0x0000000000693e64 in H5T_convert (tpath=0xadda10, src_id=216172782113784133, dst_id=216172782113784134, nelmts=nelmts@entry=1, buf_stride=buf_stride@entry=0,
bkg_stride=bkg_stride@entry=0, buf=0x7ffff7e53018, bkg=0x7ffff5cd5018) at /root/Software/hdf5-1.10.5/src/H5T.c:5024
#3 0x00000000006a1d6c in H5T__conv_struct (src_id=, dst_id=, cdata=, nelmts=1, buf_stride=0,
bkg_stride=, _buf=0x7ffff7e53018, _bkg=0x7ffff5cd5018) at /root/Software/hdf5-1.10.5/src/H5Tconv.c:2263
#4 0x0000000000693e64 in H5T_convert (tpath=0xad5df0, src_id=216172782113784124, dst_id=216172782113784123, nelmts=nelmts@entry=1, buf_stride=buf_stride@entry=0,
bkg_stride=bkg_stride@entry=0, buf=0x7ffff7e53018, bkg=0x7ffff5cd5018) at /root/Software/hdf5-1.10.5/src/H5T.c:5024
#5 0x00000000004c2d55 in H5D__scatgath_write (io_info=0x7fffffffd8c0, type_info=0x7fffffffdac0, nelmts=1, file_space=0xae4cd0, mem_space=0xad45b0)
at /root/Software/hdf5-1.10.5/src/H5Dscatgath.c:701
#6 0x0000000000498499 in H5D__chunk_write (io_info=0x7fffffffdb40, type_info=0x7fffffffdac0, nelmts=, file_space=,
mem_space=, fm=0xae42f0) at /root/Software/hdf5-1.10.5/src/H5Dchunk.c:2436
#7 0x00000000004bca0b in H5D__write (dataset=dataset@entry=0xad4df0, mem_type_id=mem_type_id@entry=216172782113784124, mem_space=0xad45b0, file_space=0xad5c30,
buf=, buf@entry=0x7fffffffdd10) at /root/Software/hdf5-1.10.5/src/H5Dio.c:817
#8 0x00000000004bd3d4 in H5Dwrite (dset_id=dset_id@entry=360287970189639680, mem_type_id=mem_type_id@entry=216172782113784124,
mem_space_id=mem_space_id@entry=288230376151711747, file_space_id=file_space_id@entry=288230376151711748, dxpl_id=720575940379279368, dxpl_id@entry=0,
buf=buf@entry=0x7fffffffdd10) at /root/Software/hdf5-1.10.5/src/H5Dio.c:335
#9 0x00000000004092da in H5TB_common_append_records (dataset_id=360287970189639680, mem_type_id=216172782113784124, nrecords=nrecords@entry=1, orig_table_size=0,
buf=buf@entry=0x7fffffffdd10) at /root/Software/hdf5-1.10.5/hl/src/H5TB.c:3492
#10 0x0000000000402e44 in H5PTappend (table_id=, nrecords=1, data=0x7fffffffdd10) at /root/Software/hdf5-1.10.5/hl/src/H5PT.c:558
#11 0x00000000004024f0 in main (argc=1, argv=0x7fffffffde48) at …/src/main_c.c:60

The source code for the error is:

 * Function:    H5T_vlen_str_mem_getlen
 * Purpose:     Retrieves the length of a memory based VL string.
 * Return:      Non-negative on success/Negative on failure
 * Programmer:  Quincey Koziol
 *              Wednesday, June 2, 1999
static ssize_t
H5T_vlen_str_mem_getlen(const void *_vl)
    const char *s=*(const char * const *)_vl;   /* Pointer to the user's string information */
    const char *s;      /* Pointer to the user's string information */


    /* check parameters */
    HDmemcpy(&s, _vl, sizeof(char *));

    FUNC_LEAVE_NOAPI((ssize_t)HDstrlen(s))            //--------------------Line # 546
}   /* end H5T_vlen_str_mem_getlen() */

I’ll try adding a char * placeholder in the struct to see if that lets me use the ‘correct’ way to declare a non-fixed-length character string. I’ve seen that in some example code, but only for reading packet table data, not for writing it.
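One reading of the backtrace that fits the source excerpt: the HDmemcpy takes the first sizeof(char *) bytes of the record as a string pointer, but in my struct those bytes are the len member of the hvl_t, not a pointer. The sketch below reproduces just that step in plain C without HDF5. fake_hvl_t is a stand-in with the same member order as hvl_t, and the helper name is made up; it assumes size_t and char * have the same size, as on a typical 64-bit Linux build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in with the same member order as HDF5's hvl_t (hypothetical name). */
typedef struct {
    size_t len;  /* number of elements */
    void  *p;    /* pointer to the data */
} fake_hvl_t;

/* Record that stores the string as a plain pointer instead. */
typedef struct {
    char *log;
} str_rec_t;

/* Mimics the HDmemcpy step: copy the first sizeof(char *) bytes of the
 * record into a char pointer. Returns the pointer's value as an integer
 * and never dereferences it. */
static uintptr_t leading_bytes_as_pointer(const void *record)
{
    const char *s;
    assert(record != NULL);
    memcpy(&s, record, sizeof(char *));
    return (uintptr_t)s;
}
```

With len set to 16, the helper returns 16 for the hvl_t-style record, so a strlen on that "pointer" would read from address 0x10. For the char * record it returns the real address of the text.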